tl;dr: here is something to pull down a bunch of my GitHub repositories, and eventually weed out the rest. The filtering and cloning work, but automatic syncing of forks from their upstream originals may have an issue right now.
After churning this out so quickly with Grok, I am realizing that I'll probably never go back and use, or even look at, the code in practically all of these repos.
... the sands continue to shift in this strange new world with "AI".
GitHub fork-sync filtered cloner
upper-downer.v is a command-line utility written in the V programming language to manage and analyze GitHub repositories. It fetches all repositories for a specified GitHub user, displays their details in a Markdown table, and optionally clones or updates them locally. The tool supports filtering by license and programming language, and for forked repositories, it can sync with the upstream parent before cloning or updating.
Features
- Fetch All Repositories: Retrieves all public and private (if authorized) repositories for a GitHub user, handling pagination.
- Markdown Table Output: Displays repository details (Name, License, Language, Forked From, Description) in a clean Markdown table.
- Clone or Update Repositories: Clones new repositories or updates existing ones locally with the -c or --clone switch.
- Fork Syncing: Automatically syncs forked repositories with their upstream parent before cloning or updating (requires -c).
- Flexible Filtering: Filters repositories by license (e.g., MIT, GPL-3.0) and language (e.g., C, Go) using the -f switch, with support for inclusion and exclusion.
- Command-Line Interface: Takes GitHub username and token as arguments, with intuitive switches for customization.
Prerequisites
- V Language: Install V from vlang.io. Verify with v --version.
- Git: Install Git and ensure it's in your PATH. Verify with git --version.
- GitHub Personal Access Token: Generate a token with repo scope from GitHub (Settings > Developer settings > Personal access tokens) to access private repositories and sync forks.
Installation
Clone or Download the Script:
fossil clone https://refaqtory.net/github-repo-processor repo-proc.fossil
mkdir repo-proc
cd repo-proc
fossil open ../repo-proc.fossil
Alternatively, download the script directly.
Verify V Installation: Ensure V is installed:
v --version
Usage
Run the script using V, providing your GitHub username and personal access token as arguments. Use optional switches to customize behavior.
Basic Syntax
v run github_repo_processor.v <github_username> <github_token> [-c | --clone] [-f filter_terms...]
Alternatively, build it once and run the resulting executable.
Options
- <github_username>: Your GitHub username (e.g., refaqtor).
- <github_token>: Your GitHub personal access token with repo scope.
- -c, --clone: Clone new repositories or update existing ones locally in the ./repos directory. For forks, syncs with the upstream parent first.
- -f filter_terms...: Filter repositories by license or language. Prefix a term with ! to exclude it. In an interactive bash or zsh session, quote exclusion terms (e.g., '!GPL') so the shell does not treat ! as history expansion. Examples:
  - MIT: Include MIT-licensed repositories.
  - !GPL: Exclude GPL-licensed repositories.
  - C Go: Include repositories in C or Go.
  - !Python !Javascript: Exclude Python or JavaScript repositories.
Examples
Print All Repositories in a Markdown Table:
v run github_repo_processor.v github_user your_token
Output:
Found 150 repositories
| Name | License | Language | Forked From | Description |
|---|---|---|---|---|
| repo1 | mit | go | owner/orig_repo | A Go project for testing |
| repo2 | no license | c | N/A | No description |
Repository information printed successfully
Print MIT-Licensed C or Go Repositories:
v run github_repo_processor.v github_user your_token -f MIT C Go
Output:
Found 150 repositories
| Name | License | Language | Forked From | Description |
|---|---|---|---|---|
| repo1 | mit | go | owner/orig_repo | A Go project for testing |
| repo2 | mit | c | N/A | No description |
Repository information printed successfully
Clone MIT-Licensed C or Go Repositories, Excluding GPL, Python, JavaScript:
v run github_repo_processor.v github_user your_token -c -f MIT !GPL C Go !Python !Javascript
Output:
Found 150 repositories
Syncing fork: repo1
Cloning repository: repo1
Updating repository: repo2
All selected repositories cloned or updated successfully
Handle No Matching Repositories:
v run github_repo_processor.v github_user your_token -f NonExistent
Output:
Found 150 repositories
No repositories match the specified filters
How It Works
Fetching Repositories:
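Uses the GitHub API (GET /users/{username}/repos) to fetch all repositories, handling pagination with per_page=100. Requires a token with repo scope for private repositories and fork syncing. Note that GitHub documents GET /users/{username}/repos as returning only public repositories; private repositories are listed by GET /user/repos for the authenticated user.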
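The pagination step can be sketched roughly as follows. This is an illustrative Python sketch, not the tool's V source: the names fetch_all_repos and http_get_json are hypothetical, and stopping on the first short page is an assumed strategy (the tool may instead follow the Link header or stop on an empty page).

```python
import json
import urllib.request

API = "https://api.github.com"


def http_get_json(url, token):
    """Perform an authenticated GET and decode the JSON response body."""
    req = urllib.request.Request(url, headers={
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
    })
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def fetch_all_repos(username, token, get_json=http_get_json, per_page=100):
    """Collect every page of GET /users/{username}/repos.

    Assumption: a page shorter than per_page marks the last page.
    get_json is injectable so the loop can be exercised without network access.
    """
    repos, page = [], 1
    while True:
        url = f"{API}/users/{username}/repos?per_page={per_page}&page={page}"
        batch = get_json(url, token)
        repos.extend(batch)
        if len(batch) < per_page:
            return repos
        page += 1
```

Passing a fake get_json makes it easy to check the loop offline; with the default fetcher, each page costs one API request against the rate limit.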
Filtering:
Filters repositories by license (e.g., mit, gpl-3.0) and language (e.g., C, Go). Supports inclusion (e.g., MIT) and exclusion (e.g., !GPL) in a case-insensitive manner. If no filters are specified, all repositories are included.
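One way to read these rules, sketched in Python rather than the tool's V: the source does not say how a term is recognised as a license or a language, so this sketch assumes a term is a language term when it equals the language of at least one fetched repository, and a license term otherwise. It also assumes licenses match by prefix (so GPL matches gpl-3.0) and languages match exactly. The repository dicts are simplified stand-ins for the API objects, and filter_repos is a hypothetical name.

```python
def split_terms(terms):
    """Separate include terms from '!'-prefixed exclude terms, lowercased."""
    include, exclude = [], []
    for term in terms:
        if term.startswith("!"):
            exclude.append(term[1:].lower())
        else:
            include.append(term.lower())
    return include, exclude


def filter_repos(repos, terms):
    """Keep repos that match the include terms and hit no exclude term.

    Assumed semantics: include terms are OR-ed within a category (license or
    language) and AND-ed across categories; no terms means keep everything.
    """
    include, exclude = split_terms(terms)
    # Assumption: a term is a language term if some repo uses that language.
    languages = {(r.get("language") or "").lower() for r in repos}
    languages.discard("")
    inc_lang = [t for t in include if t in languages]
    inc_lic = [t for t in include if t not in languages]

    selected = []
    for repo in repos:
        lic = (repo.get("license") or "no license").lower()
        lang = (repo.get("language") or "").lower()
        excluded = any(
            lang == t if t in languages else lic.startswith(t) for t in exclude
        )
        if excluded:
            continue
        if inc_lang and lang not in inc_lang:
            continue
        if inc_lic and not any(lic.startswith(t) for t in inc_lic):
            continue
        selected.append(repo)
    return selected
```

Under these assumptions, the terms MIT !GPL C Go !Python !Javascript keep only MIT-licensed repositories written in C or Go.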
Output:
By default, prints a Markdown table with columns: Name, License, Language, Forked From (parent repository for forks, N/A otherwise), and Description. Handles missing data (e.g., No license, No description).
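A minimal Python sketch of the table rendering, again illustrative rather than the tool's own code. The dict keys (including forked_from), the leading pipe on each row, the N/A fallback for a missing language, and the escaping of | inside cells are all assumptions of this sketch.

```python
def cell(value, fallback):
    """Return a table-safe cell, substituting the fallback for missing data."""
    text = str(value).strip() if value else fallback
    # A literal '|' or a newline inside a cell would break the Markdown row.
    return text.replace("|", "\\|").replace("\n", " ")


def to_markdown_table(repos):
    """Render simplified repo dicts as a five-column Markdown table."""
    header = ["Name", "License", "Language", "Forked From", "Description"]
    lines = [
        "| " + " | ".join(header) + " |",
        "|" + "---|" * len(header),
    ]
    for repo in repos:
        row = [
            cell(repo.get("name"), "N/A"),
            cell(repo.get("license"), "No license"),
            cell(repo.get("language"), "N/A"),
            cell(repo.get("forked_from"), "N/A"),
            cell(repo.get("description"), "No description"),
        ]
        lines.append("| " + " | ".join(row) + " |")
    return "\n".join(lines)
```

Escaping pipes matters because repository descriptions are free text, and an unescaped | would shift every later column in that row.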
Cloning and Syncing:
With -c or --clone, clones repositories to ./repos/<repo_name> or updates existing ones with git pull. For forked repositories, syncs with the upstream parent using POST /repos/{owner}/{repo}/merge-upstream and waits for completion (polling updated_at with a 300-second timeout).
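The three pieces of this step (the merge-upstream request, the clone-or-pull decision, and the polling loop) might look like the Python sketch below. It is not the tool's V implementation: the function names, the 5-second polling interval, and the use of a .git directory to detect an existing clone are assumptions. The endpoint and its branch body field are from the GitHub REST API.

```python
import json
import os
import time
import urllib.request

API = "https://api.github.com"


def merge_upstream_request(owner, repo, token, branch="main"):
    """Build the POST /repos/{owner}/{repo}/merge-upstream request (not sent here)."""
    body = json.dumps({"branch": branch}).encode("utf-8")
    return urllib.request.Request(
        f"{API}/repos/{owner}/{repo}/merge-upstream",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            "Content-Type": "application/json",
        },
    )


def git_command(name, clone_url, base_dir="repos", exists=os.path.isdir):
    """Choose 'git pull' for an existing clone, 'git clone' otherwise."""
    path = os.path.join(base_dir, name)
    # Assumption: a .git directory means the repository was cloned before.
    if exists(os.path.join(path, ".git")):
        return ["git", "-C", path, "pull"]
    return ["git", "clone", clone_url, path]


def wait_for_change(read_value, baseline, timeout=300.0, interval=5.0,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll read_value() until it differs from baseline or the timeout expires."""
    deadline = clock() + timeout
    while clock() < deadline:
        if read_value() != baseline:
            return True
        sleep(interval)
    return False
```

Here read_value would be a small function that re-fetches the fork and returns its updated_at field. Passing the fork's default_branch from the API response as branch, instead of relying on the main default, is one way to handle forks whose default branch has another name.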
Notes
- Token Security: Store your GitHub token securely. Avoid committing it to version control. Consider using environment variables or a secure vault.
- Rate Limits: The GitHub API allows 5,000 requests per hour with a token. Syncing forks and polling use additional requests but typically stay within limits.
- Fork Syncing: Assumes main as the default branch for syncing. To support other branches, modify the sync_fork function.
- Markdown Table: Renders best in Markdown viewers (e.g., GitHub, VS Code). Long descriptions may need truncation for display.
- Error Handling: Gracefully handles API errors, Git failures, and sync timeouts, skipping problematic repositories.
Troubleshooting
- Compilation Errors: Ensure V is installed (v --version) and the script is saved correctly.
- API Errors: Verify your token has repo scope and check rate limits (X-RateLimit-Remaining header).
- No Matches: Ensure filter terms match GitHub’s license keys (e.g., mit, not MIT License) and language names (e.g., Go, not Golang).
- Sync Failures: Check if the fork’s default branch matches the upstream branch (main by default).
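For the rate-limit check mentioned above, a small Python sketch of reading the response headers. The helper name rate_limit_status and its return shape are hypothetical; X-RateLimit-Remaining and X-RateLimit-Reset (a Unix timestamp in seconds) are the headers GitHub sends.

```python
import time


def rate_limit_status(headers, now=None):
    """Summarise GitHub rate-limit headers, or return None if they are absent."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    if "x-ratelimit-remaining" not in lowered:
        return None
    now = time.time() if now is None else now
    remaining = int(lowered["x-ratelimit-remaining"])
    reset_at = int(lowered.get("x-ratelimit-reset", 0))
    return {
        "remaining": remaining,
        "seconds_until_reset": max(0, reset_at - int(now)),
        "exhausted": remaining == 0,
    }
```

If exhausted is true, waiting seconds_until_reset before retrying avoids further failed requests.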