reference-finder
Reference Finder
A professional OpenClaw skill that uses Gemini AI to analyze research text, extract research domains and key concepts, then generate comprehensive reference lists with summaries.
Quick Start
- Configure config.yaml with your Gemini API key
- Run: python main.py --input your_research_text.txt
- Review extracted domains and confirm
- Get structured Markdown output with references
Configuration
Edit config.yaml:
model:
  name: "gemini-2.0-flash-exp"  # Model to use
  api_key: "${GEMINI_API_KEY}"  # API key (env var or hardcode)
  api_base: "https://generativelanguage.googleapis.com/v1beta"
proxy:
  enabled: false
  http: "http://127.0.0.1:7890"
  https: "http://127.0.0.1:7890"
defaults:
  min_papers_per_domain: 20  # Minimum papers per domain
  max_papers_per_domain: 30  # Maximum papers per domain
  output_dir: "./references"  # Output directory
Usage
Basic Usage
python main.py --input research_idea.txt
With Options
python main.py \
--input research_idea.txt \
--output-dir ./my_references \
--min-papers 15 \
--max-papers 25
Interactive Mode
python main.py --interactive
Then paste your research text when prompted.
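The flags shown above could be parsed with argparse roughly as follows. This is a minimal sketch, not the actual contents of main.py: the function names `build_parser` and `parse_args` are hypothetical, the defaults are borrowed from the config.yaml defaults, and treating --input and --interactive as mutually exclusive is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented command-line flags."""
    parser = argparse.ArgumentParser(prog="main.py")
    # Assumption: exactly one input source must be chosen.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--input", help="Path to a text file with the research idea")
    source.add_argument("--interactive", action="store_true",
                        help="Paste the research text at a prompt")
    parser.add_argument("--output-dir", default="./references")
    parser.add_argument("--min-papers", type=int, default=20)
    parser.add_argument("--max-papers", type=int, default=30)
    return parser


def parse_args(argv=None) -> argparse.Namespace:
    """Parse argv and reject an inverted min/max range."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.min_papers > args.max_papers:
        parser.error("--min-papers must not exceed --max-papers")
    return args
```

Calling `parse_args(["--input", "research_idea.txt", "--min-papers", "15", "--max-papers", "25"])` yields a namespace with `input`, `output_dir`, `min_papers`, and `max_papers` attributes.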
Workflow
- Domain Extraction: Gemini analyzes input text and extracts 3-7 research domains with key concepts
- User Confirmation: Displays extracted domains and proposed paper counts for approval
- Literature Generation: Generates relevant references for each domain (20 to 30 per domain by default; both bounds are configurable)
- Output: Creates structured Markdown file with all references
Output Format
The skill generates a Markdown file with:
- Header: Analysis timestamp and source
- Domains Section: Each domain as a section
  - Domain name and key concepts
  - Paper list with:
    - Title
    - Authors
    - Year
    - Venue
    - Abstract
    - Relevance explanation
File Structure
reference-finder/
├── SKILL.md # This file
├── config.yaml # User configuration
├── main.py # Entry point
├── requirements.txt # Python dependencies
├── prompts/
│ ├── extraction.txt # Domain extraction prompt
│ └── literature.txt # Paper generation prompt
├── src/
│ ├── __init__.py
│ ├── config.py # Configuration management
│ ├── gemini_client.py # API client
│ ├── extractor.py # Domain extraction
│ ├── generator.py # Literature generation
│ └── reporter.py # Markdown output
└── tests/
    ├── test_config.py
    ├── test_gemini_client.py
    ├── test_extractor.py
    ├── test_generator.py
    └── test_reporter.py
API Reference
Main Classes
Config (src/config.py)
- Loads configuration from YAML
- Supports environment variable substitution
- Provides typed accessors
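Environment variable substitution of the `${GEMINI_API_KEY}` kind can be sketched with the standard library alone. The helper name `substitute_env` and the choice to raise on an unset variable are assumptions; the real src/config.py may behave differently.

```python
import os
import re

# Matches ${NAME} placeholders such as ${GEMINI_API_KEY}.
_ENV_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def substitute_env(value, env=None):
    """Recursively replace ${NAME} placeholders in a parsed config structure."""
    env = os.environ if env is None else env
    if isinstance(value, str):
        def _lookup(match):
            name = match.group(1)
            if name not in env:
                # Assumption: a missing variable is an error, not an empty string.
                raise KeyError(f"Environment variable {name} is not set")
            return env[name]
        return _ENV_PATTERN.sub(_lookup, value)
    if isinstance(value, dict):
        return {key: substitute_env(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [substitute_env(item, env) for item in value]
    # Booleans, numbers, and None pass through unchanged.
    return value
```

In practice the function would be applied to the dictionary returned by `yaml.safe_load` on config.yaml, so that every string value in the file is expanded in one pass.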
GeminiClient (src/gemini_client.py)
- Handles API communication with Gemini
- Configurable model, API key, proxy settings
- JSON response parsing
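As an illustration of what the client has to do, the sketch below assembles a `generateContent` request from the configuration and decodes a JSON reply. It makes no network call. The helper names `build_request` and `parse_json_reply`, the 60-second timeout, and the fence-stripping step are assumptions, not a description of src/gemini_client.py.

```python
import json

FENCE = "`" * 3  # Markdown code fence marker


def build_request(config: dict, prompt: str) -> dict:
    """Assemble keyword arguments suitable for requests.post (hypothetical helper)."""
    model = config["model"]
    url = f"{model['api_base']}/models/{model['name']}:generateContent"
    proxy = config.get("proxy", {})
    proxies = None
    if proxy.get("enabled"):
        proxies = {"http": proxy["http"], "https": proxy["https"]}
    return {
        "url": url,
        "headers": {"x-goog-api-key": model["api_key"],
                    "Content-Type": "application/json"},
        "json": {"contents": [{"parts": [{"text": prompt}]}]},
        "proxies": proxies,
        "timeout": 60,  # assumed value
    }


def parse_json_reply(response_body: dict):
    """Extract the first candidate's text and decode it as JSON."""
    try:
        text = response_body["candidates"][0]["content"]["parts"][0]["text"]
    except (KeyError, IndexError) as exc:
        raise ValueError("Response contains no usable candidate") from exc
    text = text.strip()
    if text.startswith(FENCE):
        # Models often wrap JSON in a Markdown fence; drop the fence lines.
        lines = text.splitlines()
        text = "\n".join(lines[1:-1] if lines[-1].startswith(FENCE) else lines[1:])
    return json.loads(text)
```

With the requests library installed, the call would look like `requests.post(**build_request(config, prompt))`, followed by `parse_json_reply(response.json())`.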
DomainExtractor (src/extractor.py)
- Extracts research domains from text
- Returns structured domain data with concepts
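A possible shape for the structured domain data, together with a check of the 3-7 domain range, is sketched below. The `Domain` dataclass, the `validate_domains` helper, and the JSON field names `name` and `key_concepts` are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Domain:
    """Hypothetical container for one extracted research domain."""
    name: str
    key_concepts: List[str]


def validate_domains(raw, min_domains: int = 3, max_domains: int = 7) -> List[Domain]:
    """Check the decoded JSON array and convert it to Domain objects."""
    if not isinstance(raw, list):
        raise ValueError("expected a JSON array of domains")
    if not min_domains <= len(raw) <= max_domains:
        raise ValueError(
            f"expected {min_domains}-{max_domains} domains, got {len(raw)}")
    domains = []
    for index, item in enumerate(raw):
        if not isinstance(item, dict):
            raise ValueError(f"domain {index} is not an object")
        name = item.get("name")
        concepts = item.get("key_concepts")
        if not isinstance(name, str) or not name.strip():
            raise ValueError(f"domain {index} has no name")
        if not isinstance(concepts, list) or not all(
                isinstance(concept, str) for concept in concepts):
            raise ValueError(f"domain {index} has invalid key_concepts")
        domains.append(Domain(name.strip(), concepts))
    return domains
```

Failing early on a malformed response keeps the confirmation step from showing the user partial or mistyped domains.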
LiteratureGenerator (src/generator.py)
- Generates paper references for each domain
- Validates paper counts
- Randomizes paper count within min/max bounds
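The count selection and validation described above might look like the following sketch. The function names and the exact validation rules are assumptions; the required field names are taken from the literature prompt's documented output (title, authors, year, venue, abstract, relevance).

```python
import random

REQUIRED_FIELDS = ("title", "authors", "year", "venue", "abstract", "relevance")


def pick_paper_count(min_papers: int = 20, max_papers: int = 30, rng=None) -> int:
    """Choose how many papers to request, uniformly within [min, max]."""
    if min_papers < 1 or min_papers > max_papers:
        raise ValueError("require 1 <= min_papers <= max_papers")
    rng = random if rng is None else rng
    return rng.randint(min_papers, max_papers)  # both ends inclusive


def validate_papers(papers, min_papers: int, max_papers: int) -> None:
    """Raise ValueError if the count or any paper's fields are off."""
    if not isinstance(papers, list):
        raise ValueError("expected a JSON array of papers")
    if not min_papers <= len(papers) <= max_papers:
        raise ValueError(
            f"expected {min_papers}-{max_papers} papers, got {len(papers)}")
    for index, paper in enumerate(papers):
        missing = [field for field in REQUIRED_FIELDS if field not in paper]
        if missing:
            raise ValueError(f"paper {index} is missing fields: {missing}")
```

Passing a seeded `random.Random` instance as `rng` makes the chosen count reproducible in tests.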
MarkdownReporter (src/reporter.py)
- Formats output as Markdown
- Groups papers by domain
- Includes summary statistics
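To make the output structure concrete, here is a minimal sketch of a renderer that groups papers by domain and prepends summary statistics. The function name `render_report`, the input dictionary layout, and the exact heading levels are assumptions rather than the actual behavior of src/reporter.py.

```python
from datetime import datetime


def render_report(domains, source, timestamp=None):
    """Render domains and their papers as one Markdown document (sketch)."""
    timestamp = timestamp or datetime.now()
    total = sum(len(domain["papers"]) for domain in domains)
    lines = [
        "# Reference List",
        "",
        f"- Generated: {timestamp:%Y-%m-%d %H:%M}",
        f"- Source: {source}",
        f"- Domains: {len(domains)}, papers: {total}",
    ]
    for domain in domains:
        lines += ["", f"## {domain['name']}", "",
                  "Key concepts: " + ", ".join(domain["key_concepts"])]
        for number, paper in enumerate(domain["papers"], start=1):
            lines += [
                "",
                f"### {number}. {paper['title']}",
                "",
                f"- Authors: {', '.join(paper['authors'])}",
                f"- Year: {paper['year']}",
                f"- Venue: {paper['venue']}",
                f"- Abstract: {paper['abstract']}",
                f"- Relevance: {paper['relevance']}",
            ]
    return "\n".join(lines) + "\n"
```

The returned string can be written straight to a file inside the configured output directory.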
Prompts
Domain Extraction (prompts/extraction.txt)
Prompt template for extracting 3-7 research domains from input text. Returns JSON array with domain name and key concepts.
Literature Generation (prompts/literature.txt)
Prompt template for generating paper references for a domain. Returns JSON array with paper details (title, authors, year, venue, abstract, relevance).
Testing
Install test dependencies:
pip install pytest pytest-cov
Run tests:
python -m pytest tests/
Run with coverage:
python -m pytest tests/ --cov=src --cov-report=html
Error Handling
The modules handle the following categories of errors:
- Config errors: File not found, invalid YAML
- API errors: Connection issues, rate limits, blocked content
- Validation errors: Invalid response format, missing fields
- I/O errors: File read/write issues
Dependencies
- requests>=2.28.0 - HTTP client for API calls
- pyyaml>=6.0 - YAML configuration parsing
- pytest>=7.0.0 - Testing framework (dev)
Install all:
pip install -r requirements.txt