Codecov Report
Coverage diff, master vs. #510:
| Metric | master | #510 | +/- |
|---|---|---|---|
| Coverage | 81.54% | 82.10% | +0.55% |
| Files | 16 | 27 | +11 |
| Lines | 2254 | 2900 | +646 |
| Branches | 473 | 581 | +108 |
| Hits | 1838 | 2381 | +543 |
| Misses | 266 | 333 | +67 |
| Partials | 150 | 186 | +36 |
why: Document the new import feature for the changelog.
what:
- Add New features section for v1.51.x unreleased
- Document vcspull import command with usage examples
- List supported services, aliases, and filtering options
Pull request overview
Adds a new vcspull import CLI command and supporting “remote service importer” implementations to discover repositories from hosted services and write them into a vcspull config.
Changes:
- Introduces `vcspull import` command with service selection, filtering, output modes (human/json/ndjson), confirmation, and dry-run.
- Adds a new internal `remotes` package implementing GitHub/GitLab/Gitea(Codeberg/Forgejo)/CodeCommit importers plus shared HTTP/filtering primitives.
- Adds comprehensive unit tests for the CLI command and each importer, plus changelog and logger name coverage updates.
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| tests/test_log.py | Adds the new CLI module logger name to the logger discovery test. |
| tests/cli/test_import_repos.py | Adds CLI-level tests for importer selection, config resolution, output modes, dry-run/confirmation, and error handling. |
| tests/_internal/remotes/test_gitlab.py | Adds GitLab importer tests including auth-required search behavior. |
| tests/_internal/remotes/test_github.py | Adds GitHub importer tests including filtering and limit handling. |
| tests/_internal/remotes/test_gitea.py | Adds Gitea/Codeberg importer tests including search response variants. |
| tests/_internal/remotes/test_base.py | Adds tests for shared base models/utilities (RemoteRepo, ImportOptions, filter_repo). |
| tests/_internal/remotes/conftest.py | Adds shared HTTP mocking helpers and sample API payload fixtures for remotes tests. |
| tests/_internal/remotes/__init__.py | Marks the remotes tests package. |
| src/vcspull/cli/import_repos.py | Implements the vcspull import command handler and argument parsing. |
| src/vcspull/cli/__init__.py | Registers the new import subcommand and help text/examples. |
| src/vcspull/_internal/remotes/gitlab.py | Implements GitLab repository discovery (user/org/search) via GitLab REST API. |
| src/vcspull/_internal/remotes/github.py | Implements GitHub repository discovery (user/org/search) via GitHub REST API. |
| src/vcspull/_internal/remotes/gitea.py | Implements Gitea/Forgejo/Codeberg discovery via Gitea-compatible REST API. |
| src/vcspull/_internal/remotes/codecommit.py | Implements CodeCommit discovery via AWS CLI subprocess calls. |
| src/vcspull/_internal/remotes/base.py | Adds shared dataclasses, filtering logic, error hierarchy, and a small urllib-based HTTP client. |
| src/vcspull/_internal/remotes/__init__.py | Exposes the remotes package public API (__all__). |
| CHANGES | Documents the new vcspull import feature and usage examples. |
Code review
No issues found. Checked for bugs and CLAUDE.md compliance. The new code follows existing patterns in the codebase for docstrings, imports, and test structure.
🤖 Generated with Claude Code
Code review
Found 1 issue:
vcspull/src/vcspull/_internal/remotes/codecommit.py Lines 237 to 239 in 25d16e2
Compare to GitHub importer which correctly filters:
vcspull/src/vcspull/_internal/remotes/github.py Lines 189 to 191 in 25d16e2
🤖 Generated with Claude Code
| The import itself seems to be working, but it imports the repositories with |
| The SSH-based URLs work now, but I noticed that it flattens the structure. So basically I run |
| @aschleifer Yep - right now the GitLab importer flattens everything under a single workspace key (e.g. To implement “preserve GitLab structure” correctly, can you paste a small PII-redacted example of:
For the YAML, should this be represented as multiple workspace roots that mirror namespaces, e.g.
    ~/tmp/a/b:
      repo1: ...
    ~/tmp/a/b/subgroup:
      repo2: ...
or keep a single Also confirm the desired on-disk layout relative to |
| Example structure: Command: Expected content of So basically the full tree structure under the targeted group. |
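For illustration only, here is a minimal sketch of the "multiple workspace roots" option discussed above. It assumes each repository is identified by its full GitLab path (for example `a/b/subgroup/repo2`) and that workspace roots are derived relative to the targeted group. The function name `group_by_namespace` and the output shape are hypothetical, not the importer's actual behavior.

```python
from __future__ import annotations

from collections import defaultdict
from pathlib import PurePosixPath


def group_by_namespace(
    workspace: str,
    target_group: str,
    repo_paths: list[str],
) -> dict[str, dict[str, str]]:
    """Map each repo to a workspace root that mirrors its subgroup path.

    Hypothetical helper: keys are workspace roots, values map repo name
    to the repo's full GitLab path.
    """
    sections: dict[str, dict[str, str]] = defaultdict(dict)
    for full_path in repo_paths:
        # Path of the repo relative to the targeted group, e.g. "subgroup/repo2".
        # relative_to() raises ValueError if the repo is outside the group.
        relative = PurePosixPath(full_path).relative_to(target_group)
        # Subgroups become nested directories under the workspace root.
        root = PurePosixPath(workspace) / relative.parent
        sections[f"{root}/"][relative.name] = full_path
    return dict(sections)
```

With `workspace="~/tmp"` and `target_group="a/b"`, a repo at `a/b/subgroup/repo2` would land under the `~/tmp/subgroup/` section. Whether the namespace prefix itself should also appear under the workspace is exactly the open question in this thread.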
a8915f5 to 26d91ae
why: Enable importing repositories from GitHub, GitLab, Codeberg/Gitea/Forgejo, and AWS CodeCommit into vcspull configuration.
what:
- Add base.py with RemoteRepo dataclass, ImportOptions, ImportMode enum
- Add HTTPClient for stdlib-only HTTP requests (urllib)
- Add error hierarchy: AuthenticationError, RateLimitError, NotFoundError, etc.
- Add GitHubImporter with user/org/search modes
- Add GitLabImporter with group/search support (auth required for search)
- Add GiteaImporter supporting Codeberg, Gitea, Forgejo instances
- Add CodeCommitImporter using AWS CLI subprocess calls
- Add filter_repo() for client-side filtering by language, topics, stars
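As a rough illustration of the client-side filtering described in this commit, the sketch below filters by language, topics, and stars. `RepoInfo`, `FilterOptions`, and `keep_repo` are hypothetical stand-ins. The real `RemoteRepo`, `ImportOptions`, and `filter_repo()` may use different field names and matching rules.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class RepoInfo:
    """Hypothetical stand-in for RemoteRepo; field names are assumptions."""

    name: str
    language: str | None = None
    topics: tuple[str, ...] = ()
    stars: int = 0


@dataclass(frozen=True)
class FilterOptions:
    """Hypothetical stand-in for ImportOptions; field names are assumptions."""

    language: str | None = None
    topics: tuple[str, ...] = ()
    min_stars: int = 0


def keep_repo(repo: RepoInfo, options: FilterOptions) -> bool:
    """Return True when the repo passes every configured filter."""
    if options.language is not None:
        # Case-insensitive comparison; repos without a language never match.
        if (repo.language or "").lower() != options.language.lower():
            return False
    if options.topics:
        # Every requested topic must be present on the repo.
        repo_topics = {topic.lower() for topic in repo.topics}
        if not all(topic.lower() in repo_topics for topic in options.topics):
            return False
    return repo.stars >= options.min_stars
```

Filtering on the client keeps the per-service API calls simple, at the cost of fetching repositories that are later discarded.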
why: Allow users to import repositories from remote services directly into their vcspull configuration without manual entry.
what:
- Add create_import_subparser() for CLI argument handling
- Add import_repos() main function with full import workflow
- Support services: github, gitlab, codeberg, gitea, forgejo, codecommit
- Add service aliases (gh, gl, cb, cc, aws)
- Add filtering: --language, --topics, --min-stars, --archived, --forks
- Add output modes: human-readable, --json, --ndjson
- Add --dry-run and --yes options for confirmation control
- Require --workspace flag (no default guessing)
why: Make the import command accessible via vcspull CLI.
what:
- Import create_import_subparser, import_repos from import_repos module
- Add IMPORT_DESCRIPTION with usage examples
- Add import subparser to CLI
- Add handler for import subparser in cli() function
why: GitHub Enterprise requires /api/v3 path prefix but the importer used the base URL as-is, unlike Gitea which correctly appends /api/v1.
what:
- Auto-append /api/v3 when base_url is provided and lacks /api/ path
- Skip normalization for default api.github.com and pre-suffixed URLs
- Add tests for GHE normalization, idempotency, and public URL
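The normalization rule in this commit could look roughly like the sketch below. The function name `normalize_github_api_url` and the exact "already has an /api/ path" check are assumptions for illustration and are not taken from the importer.

```python
from __future__ import annotations

from urllib.parse import urlsplit

DEFAULT_API_URL = "https://api.github.com"


def normalize_github_api_url(base_url: str | None) -> str:
    """Return an API base URL, appending /api/v3 for GitHub Enterprise hosts."""
    if not base_url:
        return DEFAULT_API_URL
    url = base_url.rstrip("/")
    if url == DEFAULT_API_URL:
        # Public GitHub serves the API at the host root.
        return url
    path = urlsplit(url).path
    if "/api/" in f"{path}/":
        # Already suffixed (e.g. /api/v3), so leave it alone.
        return url
    return f"{url}/api/v3"
```

Because an already-suffixed URL is returned unchanged, applying the function twice gives the same result as applying it once.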
…pace
why: When a workspace section in the config is not a dict, the import loop logged an error but returned exit 0 with a misleading success message ("All repositories already exist").
what:
- Track workspace sections that fail validation in error_labels set
- Return exit 1 before the "all exist" message when errors occurred
- Add test asserting non-mapping workspace returns exit code 1
…bort
why: When stdin is not a TTY and --yes is not provided, _run_import returned 0 (success) even though no import occurred. CI/automation scripts chaining on exit codes would incorrectly proceed.
what:
- Change return 0 to return 1 at the non-interactive abort path
- Add return value assertion to test_import_repos_non_tty_aborts
…sponses
why: dict.get("key", {}) returns None when the key exists with JSON null value, causing AttributeError on subsequent .get() calls. APIs may return null for deleted accounts, system repos, or self-hosted edge cases.
what:
- Change data.get("namespace", {}) to data.get("namespace") or {} in gitlab.py
- Change data.get("owner", {}) to (data.get("owner") or {}) in github.py
- Change data.get("owner", {}) to data.get("owner") or {} in gitea.py
- Add test_github_parse_repo_null_owner
- Add test_gitlab_parse_repo_null_namespace
- Add test_gitea_parse_repo_null_owner
…g filter
why: Help said "prefix filter" but the implementation uses substring matching (the `in` operator), which matches anywhere in the name.
what:
- Change help text from "prefix" to "substring" at codecommit.py
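To make the null-handling fix concrete, the sketch below shows why the default argument of `dict.get` does not help when the key is present with a null value. The helper name `owner_login` is hypothetical. `owner` and `login` follow the shape of GitHub's repository payload.

```python
from __future__ import annotations

from typing import Any


def owner_login(data: dict[str, Any]) -> str | None:
    """Return the owner's login, tolerating a missing or null ``owner``."""
    # data.get("owner", {}) would return None for {"owner": None}, because the
    # default is only used when the key is absent, not when its value is None.
    owner = data.get("owner") or {}
    return owner.get("login")
```

The `or {}` form treats every falsy value (None, an empty dict) the same way, which is acceptable here because an empty owner carries no fields anyway.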
…ging
why: Naive f"{url}?{urlencode(params)}" would produce a malformed URL with double question marks if the endpoint already contained query parameters.
what:
- Replace string concatenation with urllib.parse.urlsplit/urlunsplit to properly merge existing and new query parameters
- Add test_http_client_get_merges_query_params to verify correct behavior
why: Authorization tokens sent via HTTP are visible to network observers. Users who provide http:// URLs with --url should be warned about the security risk.
what:
- Add warning log in HTTPClient.__init__ when token + non-HTTPS base URL
- Add test_http_client_warns_on_non_https_with_token
- Add test_http_client_no_warning_on_https_with_token
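The query-merging change can be sketched with the standard library alone. `merge_query_params` is a hypothetical name and this is one possible way to combine `urlsplit`/`urlunsplit`, which is not necessarily how `HTTPClient` does it.

```python
from __future__ import annotations

from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def merge_query_params(url: str, params: dict[str, str | int]) -> str:
    """Append params to url, keeping any query string it already has."""
    parts = urlsplit(url)
    # Existing parameters first, then the new ones, in insertion order.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.extend((key, str(value)) for key, value in params.items())
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Note that this sketch appends rather than overrides: a key that appears both in the URL and in `params` ends up in the result twice.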
why: save_config_json had zero test coverage, and no integration test exercised the JSON config write path through _run_import.
what:
- Add test_save_config_json_write_and_readback
- Add test_save_config_json_atomic_write
- Add test_save_config_json_atomic_preserves_permissions
- Add test_import_repos_json_config_write integration test
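The test names above suggest an atomic write that preserves file permissions. A minimal sketch of that general technique follows, writing to a temporary file in the same directory and then calling `os.replace`. `write_json_atomic` is a hypothetical helper and may differ from what `save_config_json` actually does.

```python
from __future__ import annotations

import json
import os
import tempfile
from pathlib import Path
from typing import Any


def write_json_atomic(path: Path, data: dict[str, Any]) -> None:
    """Write data as JSON so readers never observe a half-written file."""
    # Remember the existing permission bits so the replacement keeps them.
    mode = (path.stat().st_mode & 0o777) if path.exists() else None
    # The temp file must live on the same filesystem for os.replace to be atomic.
    fd, tmp_name = tempfile.mkstemp(
        dir=path.parent, prefix=f".{path.name}.", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(data, handle, indent=2)
            handle.write("\n")
        if mode is not None:
            os.chmod(tmp_name, mode)
        os.replace(tmp_name, path)
    except BaseException:
        # Do not leave the temp file behind if anything failed before the swap.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```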
why: GitHub search API returns HTTP 422 when requesting results beyond offset 1000. Without a guard, the pagination loop would crash after partial progress when --limit exceeds 1000.
what:
- Add SEARCH_MAX_RESULTS = 1000 constant
- Break pagination when page * DEFAULT_PER_PAGE >= SEARCH_MAX_RESULTS
- Add test_github_search_caps_at_1000_results
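The pagination guard described in this commit can be sketched as follows. The constant names come from the commit message. The value `DEFAULT_PER_PAGE = 100`, the `fetch_page` callback, and the generator shape are assumptions made for the example.

```python
from __future__ import annotations

from typing import Any, Callable, Iterator

SEARCH_MAX_RESULTS = 1000
DEFAULT_PER_PAGE = 100  # assumed page size for this sketch


def iter_search_results(
    fetch_page: Callable[[int, int], list[dict[str, Any]]],
    limit: int,
) -> Iterator[dict[str, Any]]:
    """Yield up to ``limit`` items without paging past the search cap."""
    yielded = 0
    page = 1
    while yielded < limit:
        items = fetch_page(page, DEFAULT_PER_PAGE)
        if not items:
            break
        for item in items:
            yield item
            yielded += 1
            if yielded >= limit:
                return
        # Requesting the next page would start at or beyond the cap, so stop.
        if page * DEFAULT_PER_PAGE >= SEARCH_MAX_RESULTS:
            break
        page += 1
```

With a limit above 1000, the loop stops after page 10 instead of requesting page 11.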
why: subprocess.run without timeout blocks indefinitely if the AWS CLI hangs due to network issues or broken credential providers. HTTP-based importers already have a 30-second timeout via HTTPClient.
what:
- Add timeout=60 to subprocess.run in _run_aws_command
- Catch subprocess.TimeoutExpired and raise ServiceUnavailableError
- Add ServiceUnavailableError to imports
- Add test_codecommit_timeout_raises_service_unavailable
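A minimal sketch of the timeout pattern from this commit is shown below. `run_command` and the locally defined `ServiceUnavailableError` are stand-ins for `_run_aws_command` and the project's own exception class. The example runs the Python interpreter rather than the AWS CLI so that it has no external dependency.

```python
from __future__ import annotations

import subprocess
import sys


class ServiceUnavailableError(RuntimeError):
    """Stand-in for the project's error type (assumption for this sketch)."""


def run_command(args: list[str], timeout: float = 60.0) -> str:
    """Run a command and return its stdout, failing fast if it hangs."""
    try:
        result = subprocess.run(
            args,
            capture_output=True,
            text=True,
            timeout=timeout,  # subprocess.run kills the child when this expires
            check=False,
        )
    except subprocess.TimeoutExpired as exc:
        msg = f"command timed out after {timeout} seconds"
        raise ServiceUnavailableError(msg) from exc
    return result.stdout


if __name__ == "__main__":
    print(run_command([sys.executable, "-c", "print('ok')"]).strip())
```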
… files
why: Each file defined log = logging.getLogger(__name__) but never used it. The logging import and log variable are dead code.
what:
- Remove import logging and log variable from github.py, gitlab.py, codeberg.py, forgejo.py, and gitea.py CLI handlers
why: yaml.safe_load was used for all config files regardless of extension. While YAML is a superset of JSON, dispatching on file extension is semantically correct and produces more specific error messages for JSON parse failures.
what:
- Dispatch on config file suffix: json.loads for .json, yaml.safe_load for .yaml/.yml
- Use broad except to catch both json.JSONDecodeError and yaml.YAMLError
why: The lambda-based mock caused a mypy type inference error.
what:
- Replace inline io.BytesIO mock with shared MockHTTPResponse fixture
…onfig loading
why: The inline JSON/YAML dispatch duplicated what ConfigReader._from_file() already provides, creating an asymmetry with the save path that already uses ConfigReader._dump() via save_config_yaml/save_config_json.
what:
- Replace 12-line inline JSON/YAML dispatch block with ConfigReader._from_file()
- Remove lazy imports of json and yaml that were only needed for inline dispatch
why: Project style guide requires one command per code block for copyability.
what:
- Split combined auth+import code blocks into separate blocks in 6 files
- Add explanatory text between the blocks (github, gitlab, codeberg, gitea, forgejo, codecommit)
why: Consecutive code blocks without explanatory text leave the reader guessing.
what:
- Add "SSH (default):" label before the first block
- Add "Use --https for HTTPS clone URLs:" before the second block
why: README omitted vcspull import despite it being a major v1.55 feature.
what:
- Add vcspull import to the config-creation sentence at line 71
- Add "Import from remote services" subsection with example commands
…commands
why: Shortform flags are cryptic in user-facing docs; multi-flag one-liners are hard to scan and copy-paste.
what:
- Add "Prefer longform flags" rule to Documentation Standards
- Add "Split multi-flag commands" rule with \-continuation style
- Include Good/Bad examples showing both rules together
why: Shortform flags (-w, -f, -S, -v) are cryptic in user-facing docs; multi-flag one-liners are hard to scan and copy-paste.
what:
- Replace -w with --workspace in all doc code blocks
- Replace -f with --file in all doc code blocks
- Replace -S with --smart-case and -v with --invert-match in search docs
- Split multi-flag commands onto \-continuation lines
- Update prose references to prefer longform names
- Remove redundant "Short form" examples from fmt.md
| Addressed in 12899cf — all 6 service pages now have separate code blocks with prose between them. |
| Feasible doctests were added in f84a7a5. The remaining methods without doctests ( |
why: `vcspull import gitlab` (PR #510) fully replaces these community scripts with built-in pagination, dry-run, filtering, and config merging.
what:
- Remove scripts/generate_gitlab.py
- Remove scripts/generate_gitlab.sh
…edirect
why: The generation page referenced the now-removed gitlab scripts. Redirect readers to `vcspull import` which is the supported approach.
what:
- Replace generation.md content with stub pointing to {ref}cli-import
- Update quickstart.md seealso to reference cli-import instead of config-generation
- Remove generation toctree entry from configuration/index.md
vcspull import
Summary
Adds a new `vcspull import` command to search and import repositories from remote services into vcspull configuration.
Closes #416
Features
- Service aliases: `gh`, `gl`, `cb`, `cc`, `aws` for convenience
- Modes: `user`, `org`, `search`
- Filtering: `--language`, `--topics`, `--min-stars`, `--archived`, `--forks`
- Output modes: `--json`, `--ndjson`
- `--dry-run` preview, `--yes` to skip confirmation
Usage Examples
Import a user's repositories:
$ vcspull import github torvalds -w ~/repos/linux --mode user
Import an organization's repositories:
$ vcspull import github django -w ~/study/python --mode org
Search and import repositories:
$ vcspull import github "machine learning" -w ~/ml-repos --mode search --min-stars 1000
Use with self-hosted GitLab:
$ vcspull import gitlab myuser -w ~/work --url https://gitlab.company.com
Preview without writing (dry run):
$ vcspull import codeberg user -w ~/oss --dry-run
Import from AWS CodeCommit:
$ vcspull import codecommit -w ~/work/aws --region us-east-1
Architecture
- `src/vcspull/_internal/remotes/`: New package with service importers
  - `base.py`: `RemoteRepo` dataclass, `ImportOptions`, `HTTPClient`, error hierarchy
  - `github.py`, `gitlab.py`, `gitea.py`, `codecommit.py`: Service-specific implementations
- `src/vcspull/cli/import_repos.py`: CLI command handler
- `urllib` for HTTP, `subprocess` for AWS CLI
Test Plan
Automated Tests
Authentication Requirements
GITLAB_TOKEN)
CODEBERG_TOKEN/GITEA_TOKEN)
Setup for Testing
Option A: Test via uvx (no clone required)
Option B: Test from cloned branch
Manual Test Commands
Show help:
uvx --with typing_extensions --from "git+https://github.com/vcs-python/vcspull@scraper" vcspull import --help
uv run vcspull import --help
Show help (no args is equivalent to --help):
uvx --with typing_extensions --from "git+https://github.com/vcs-python/vcspull@scraper" vcspull import
uv run vcspull import
GitHub - user repos:
uvx --with typing_extensions --from "git+https://github.com/vcs-python/vcspull@scraper" vcspull import github torvalds -w ~/test --mode user --dry-run --limit 10
uv run vcspull import github torvalds -w ~/test --mode user --dry-run --limit 10
GitHub - org repos:
uvx --with typing_extensions --from "git+https://github.com/vcs-python/vcspull@scraper" vcspull import github django -w ~/test --mode org --dry-run --limit 10
uv run vcspull import github django -w ~/test --mode org --dry-run --limit 10
GitHub - search with min-stars filter:
uvx --with typing_extensions --from "git+https://github.com/vcs-python/vcspull@scraper" vcspull import github "machine learning" -w ~/test --mode search --dry-run --limit 5 --min-stars 1000
uv run vcspull import github "machine learning" -w ~/test --mode search --dry-run --limit 5 --min-stars 1000
Codeberg - org repos:
uvx --with typing_extensions --from "git+https://github.com/vcs-python/vcspull@scraper" vcspull import codeberg forgejo -w ~/test --mode org --dry-run --limit 10
uv run vcspull import codeberg forgejo -w ~/test --mode org --dry-run --limit 10
GitLab - org/group (requires token):
GitLab - subgroup with slash notation (requires token):
uvx --with typing_extensions --from "git+https://github.com/vcs-python/vcspull@scraper" vcspull import gitlab gitlab-org/ci-cd -w ~/test --mode org --dry-run --limit 10
uv run vcspull import gitlab gitlab-org/ci-cd -w ~/test --mode org --dry-run --limit 10
JSON output:
uvx --with typing_extensions --from "git+https://github.com/vcs-python/vcspull@scraper" vcspull import github torvalds -w ~/test --dry-run --limit 3 --json
uv run vcspull import github torvalds -w ~/test --dry-run --limit 3 --json
NDJSON output:
uvx --with typing_extensions --from "git+https://github.com/vcs-python/vcspull@scraper" vcspull import github torvalds -w ~/test --dry-run --limit 3 --ndjson
uv run vcspull import github torvalds -w ~/test --dry-run --limit 3 --ndjson
Language filter:
uvx --with typing_extensions --from "git+https://github.com/vcs-python/vcspull@scraper" vcspull import github tony -w ~/test --dry-run --limit 5 --language Python
uv run vcspull import github tony -w ~/test --dry-run --limit 5 --language Python