An experimental research tool for fabricating GitHub personas with AI-generated repositories
Fabricate creates synthetic GitHub activity by generating:
- Multiple repositories with realistic code
- Varied commit histories spanning configurable time periods
- Code across different programming languages
- Projects of varying complexity and scope
All code is generated using Anthropic's Claude API, creating unique and realistic-looking projects.
# Clone the repository
git clone https://github.com/yourusername/fabricate.git
cd fabricate
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -e .

- Anthropic API Key: Get from console.anthropic.com
- GitHub Personal Access Token: Create at github.com/settings/tokens
  - Required scopes: repo; delete_repo (optional, for cleanup)
Set these environment variables or pass them as CLI arguments:
export FABRICATE_ANTHROPIC_API_KEY="sk-ant-..."
export FABRICATE_GITHUB_TOKEN="ghp_..."
export FABRICATE_GITHUB_USERNAME="your-username" # Optional, auto-detected

Or create a .env file in the project root:
FABRICATE_ANTHROPIC_API_KEY=sk-ant-...
FABRICATE_GITHUB_TOKEN=ghp_...
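
As a rough illustration of how these variables might be consumed, the sketch below reads them from the environment and fails early when a required one is missing. The `FabricateSettings` class and `load_settings` function are hypothetical names used only for this example; the project's actual settings live in `config.py` and may be structured differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class FabricateSettings:
    """Hypothetical container; the real models in config.py may differ."""
    anthropic_api_key: str
    github_token: str
    github_username: Optional[str] = None  # optional, auto-detected when unset


def load_settings(env: Mapping[str, str] = os.environ) -> FabricateSettings:
    """Read the FABRICATE_* variables and fail early if a required one is missing."""
    required = ("FABRICATE_ANTHROPIC_API_KEY", "FABRICATE_GITHUB_TOKEN")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))
    return FabricateSettings(
        anthropic_api_key=env["FABRICATE_ANTHROPIC_API_KEY"],
        github_token=env["FABRICATE_GITHUB_TOKEN"],
        # An empty string is treated the same as an unset variable.
        github_username=env.get("FABRICATE_GITHUB_USERNAME") or None,
    )
```

Passing a plain dictionary instead of `os.environ` makes the function easy to exercise in tests without touching the real environment.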
# Generate 5 repositories with default settings
fabricate generate
# Specify languages and count
fabricate generate -l python -l javascript -l rust -r 10
# Custom history depth (commits spread over 2 years)
fabricate generate -d 730
# Custom commit range per repository
fabricate generate --min-commits 10 --max-commits 50

fabricate generate [OPTIONS]
Options:
  -a, --anthropic-key TEXT      Anthropic API key
  -g, --github-token TEXT       GitHub personal access token
  -l, --languages TEXT          Languages to use (can repeat)
  -r, --repos INTEGER           Number of repos to create (1-50)
  -d, --history-days INTEGER    History depth in days (30-3650)
      --min-commits INTEGER     Min commits per repo (1-100)
      --max-commits INTEGER     Max commits per repo (1-100)
  -u, --github-username TEXT    GitHub username
  -w, --work-dir TEXT           Local work directory
      --no-push                 Don't push to GitHub (local only)
      --cleanup                 Remove local files after pushing
      --dry-run                 Show what would be created
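
The numeric ranges in the option list can be read as a small set of validation rules. The following is a minimal sketch of such a check, not the CLI's actual implementation: `GenerateOptions` and `validate` are hypothetical names, and the rule that the minimum commit count must not exceed the maximum is an assumption the help text does not state.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class GenerateOptions:
    """Hypothetical mirror of the numeric `fabricate generate` options."""
    repos: int
    history_days: int
    min_commits: int
    max_commits: int


def validate(opts: GenerateOptions) -> List[str]:
    """Return human-readable problems; an empty list means the options pass."""
    problems = []
    # Bounds copied from the option descriptions above.
    bounds = {
        "repos": (1, 50),
        "history_days": (30, 3650),
        "min_commits": (1, 100),
        "max_commits": (1, 100),
    }
    for name, (low, high) in bounds.items():
        value = getattr(opts, name)
        if not low <= value <= high:
            problems.append(f"{name}={value} is outside {low}-{high}")
    # Assumption: min must not exceed max; the help text does not say so.
    if opts.min_commits > opts.max_commits:
        problems.append("min_commits must not exceed max_commits")
    return problems
```

For example, `validate(GenerateOptions(repos=5, history_days=365, min_commits=10, max_commits=40))` returns an empty list, while out-of-range values each add one message.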
# Check GitHub connection status
fabricate status
# List existing repositories
fabricate list-repos
fabricate list-repos --prefix my-project
# Delete repositories (use with caution!)
fabricate delete repo-name-1 repo-name-2
fabricate delete --force repo-name

fabricate generate \
  -l python \
  -r 8 \
  -d 365 \
  --min-commits 10 \
  --max-commits 40

fabricate generate \
  -l python \
  -l typescript \
  -l go \
  -l rust \
  -r 15 \
  -d 730

fabricate generate \
  -l python \
  -r 2 \
  --no-push \
  -w ./test-repos

- Configuration: Parse input parameters for languages, repo count, and history depth
- Concept Generation: Claude generates unique project concepts with names, descriptions, and features
- Code Generation: For each repository:
  - Generate initial project structure (README, config files, source code)
  - Generate 5-37 subsequent commits with incremental changes
  - Each commit includes realistic commit messages
- Git Operations: Create local repos with properly timestamped commits
- GitHub Push: Create remote repos and push with full history preserved
The system creates varied projects including:
- CLI tools
- Web APIs
- Libraries/packages
- Data processing utilities
- Automation scripts
- Games
- Visualization tools
- DevOps utilities
- Machine learning projects
- Low: 2-5 files, simple utilities or scripts
- Medium: 5-15 files, small libraries or tools
- High: 10-30 files, complex applications or frameworks
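
One way to picture these complexity levels is as a lookup table from level to file-count range. The sketch below is illustrative only: `FILE_RANGES` and `pick_file_count` are hypothetical names, and the source does not say how the tool actually chooses a file count within a range.

```python
import random
from typing import Optional

# File-count ranges copied from the complexity levels above (inclusive).
FILE_RANGES = {
    "low": (2, 5),
    "medium": (5, 15),
    "high": (10, 30),
}


def pick_file_count(level: str, rng: Optional[random.Random] = None) -> int:
    """Pick a file count inside the range for the given complexity level."""
    if level not in FILE_RANGES:
        raise ValueError(f"unknown complexity level: {level!r}")
    rng = rng or random.Random()
    low, high = FILE_RANGES[level]
    # Assumption: a uniform choice; the real selection strategy is not documented.
    return rng.randint(low, high)
```

Note that the ranges overlap (a 12-file project fits both medium and high), so a file count alone does not identify a level.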
fabricate/
├── fabricate/
│ ├── __init__.py
│ ├── cli.py # Click-based CLI
│ ├── config.py # Pydantic models and settings
│ ├── generator.py # Anthropic code generation
│ ├── git_ops.py # Local git operations
│ ├── github_client.py # GitHub API client
│ └── persona.py # Main orchestrator
├── main.py
├── pyproject.toml
├── requirements.txt
└── README.md
- API Costs: Code generation uses Anthropic API tokens
- Rate Limits: GitHub has rate limits for repo creation
- Quality: Generated code may not always compile/run correctly
- Detection: Patterns may be detectable with analysis
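
The source does not say which patterns are detectable. As one hypothetical signal for detection research, the sketch below measures how regular the gaps between commits are, using the coefficient of variation of inter-commit intervals. The function name `interval_variation` is an assumption made for this example, and no threshold is implied: genuine activity varies widely, so a single number like this is at most a starting point for closer analysis.

```python
from datetime import datetime
from statistics import mean, pstdev
from typing import Sequence


def interval_variation(timestamps: Sequence[datetime]) -> float:
    """Coefficient of variation of the gaps between consecutive commits."""
    ordered = sorted(timestamps)
    # Gap in seconds between each commit and the next one.
    gaps = [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]
    if len(gaps) < 2:
        raise ValueError("need at least three timestamps")
    avg = mean(gaps)
    if avg == 0:
        raise ValueError("all commits share one timestamp")
    # Population standard deviation divided by the mean gap.
    return pstdev(gaps) / avg
```

A value near zero means commits arrive at almost constant intervals, which might merit a closer look; larger values indicate burstier activity.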
This tool is intended for:
- Research into AI-generated code detection
- Understanding GitHub activity patterns
- Educational purposes about code generation
Do NOT use for:
- Fraudulent job applications
- Deceiving others about your experience
- Any malicious purposes
MIT License - See LICENSE file
This is an experimental research project. Issues and PRs welcome for improvements.