fix: Improve Python project detection and entrypoint resolution #1010

Open
vdusek wants to merge 4 commits into master from feat/improve-python-project-detection

@vdusek vdusek commented Jan 27, 2026

Summary

  • Simplified Python project detection to discover packages purely by looking for directories with __init__.py files.
  • Derived default Actor name from the Python package name.
  • Added near-miss detection with actionable fix suggestions for broken Python packages.
  • Added subpackage detection: when src/ is itself a package, nested directories are treated as subpackages.
  • Added 29 unit tests covering the full detection matrix.

Python project detection

  • Detect Python projects purely via package structure: directories with valid Python identifier names containing __init__.py.
  • Skip hidden directories (.venv) and underscore-prefixed directories (__pycache__, _internal) as they shouldn't be main entrypoints.
  • When src/ is itself a package (has __init__.py), treat nested directories as subpackages, not separate top-level packages.
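
The three rules above can be sketched roughly as follows. This is only an illustration, not the code from this PR: the `Dir` in-memory tree, `isCandidateName`, `packagesIn`, and `findTopLevelPackages` are hypothetical names, and the identifier check is an ASCII-only approximation (Python also accepts Unicode identifiers).

```typescript
// Hypothetical in-memory stand-in for a directory listing; real code would read the filesystem.
interface Dir {
  files: string[];
  dirs: Record<string, Dir>;
}

// ASCII-only approximation of a valid Python identifier (assumption of this sketch).
const IDENTIFIER = /^[A-Za-z_][A-Za-z0-9_]*$/;

function isCandidateName(name: string): boolean {
  // Hidden (.venv) and underscore-prefixed (__pycache__, _internal) directories are skipped.
  if (name.startsWith(".") || name.startsWith("_")) return false;
  return IDENTIFIER.test(name);
}

function isPackage(dir: Dir): boolean {
  return dir.files.includes("__init__.py");
}

// Direct children of `dir` that look like packages, optionally prefixed (e.g. "src.").
function packagesIn(dir: Dir, prefix = ""): string[] {
  return Object.entries(dir.dirs)
    .filter(([name, child]) => isCandidateName(name) && isPackage(child))
    .map(([name]) => prefix + name)
    .sort();
}

function findTopLevelPackages(root: Dir): string[] {
  const found = packagesIn(root);
  const src = root.dirs["src"];
  // src/ without __init__.py is a plain container, so look one level inside it.
  // src/ with __init__.py is already in `found`; its children count as subpackages.
  if (src && !isPackage(src)) found.push(...packagesIn(src, "src."));
  return found;
}
```

With a tree containing `.venv/`, `__pycache__/`, `my-package/`, and `src/my_package/__init__.py`, this sketch reports only `src.my_package`.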

Actor name derivation

  • For Python projects, the default Actor name is derived from the package name.
  • Entrypoint src.my_package → extracts my_package → sanitizes to my-package.
  • Entrypoint my_package → sanitizes to my-package.
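
A minimal sketch of the derivation described above, assuming the package name is the last dotted segment of the entrypoint and that sanitization means lowercasing and turning underscores into dashes. The function name `actorNameFromEntrypoint` is hypothetical, and the real sanitizer may apply further rules.

```typescript
function actorNameFromEntrypoint(entrypoint: string): string {
  // "src.my_package" -> "my_package"; "my_package" stays as it is.
  const packageName = entrypoint.split(".").pop() ?? entrypoint;
  // Assumed sanitization: lowercase, underscores become dashes, no leading or trailing dash.
  return packageName
    .toLowerCase()
    .replace(/_+/g, "-")
    .replace(/^-+|-+$/g, "");
}
```

Both entrypoints listed above, `src.my_package` and `my_package`, map to `my-package` under these assumptions.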

Runtime precedence

  • Node.js (package.json) takes precedence over Python indicators when both exist.
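
The precedence rule amounts to checking for package.json before looking at Python packages. A small sketch, with hypothetical type and function names and with the multiple-package error case left out:

```typescript
type Runtime = "nodejs" | "python" | "unknown";

interface Indicators {
  hasPackageJson: boolean;
  pythonPackages: string[];
}

function pickRuntime(indicators: Indicators): Runtime {
  // Checked first, so package.json wins even when Python packages are also present.
  if (indicators.hasPackageJson) return "nodejs";
  if (indicators.pythonPackages.length > 0) return "python";
  return "unknown";
}
```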

Error handling

  • Near-miss: invalid name + __init__.py — suggests renaming the directory (e.g., my-package/ → my_package/)
  • Near-miss: valid name + .py files but no __init__.py — suggests adding __init__.py
  • Near-miss: invalid name + .py files but no __init__.py — suggests both renaming and adding __init__.py
  • Multiple packages — lists all found packages and guides user to ensure only one top-level package exists
  • No package and no Python files — returns Unknown (not a Python project)
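
The three near-miss cases can be sketched as one function that collects the fix steps. Everything here is illustrative: `DirFacts`, `suggestName`, and `nearMissHint` are hypothetical names, the message wording is made up, and the identifier check is an ASCII-only approximation.

```typescript
interface DirFacts {
  name: string;
  hasInit: boolean;    // directory contains __init__.py
  hasPyFiles: boolean; // directory contains at least one .py file
}

const IDENTIFIER = /^[A-Za-z_][A-Za-z0-9_]*$/;

// Replace every character that cannot appear in an identifier, not only hyphens.
// A leading digit would still be invalid; this sketch does not handle that case.
function suggestName(name: string): string {
  return name.replace(/[^A-Za-z0-9_]/g, "_");
}

// Returns a fix suggestion for a near-miss directory, or null when there is nothing to suggest.
function nearMissHint(dir: DirFacts): string | null {
  const validName = IDENTIFIER.test(dir.name);
  if (validName && dir.hasInit) return null;            // already a valid package
  if (!dir.hasInit && !dir.hasPyFiles) return null;     // not Python-related at all
  const fixedName = validName ? dir.name : suggestName(dir.name);
  const steps: string[] = [];
  if (!validName) steps.push(`rename ${dir.name}/ to ${fixedName}/`);
  if (!dir.hasInit) steps.push(`add ${fixedName}/__init__.py`);
  return steps.join(" and ");
}
```

For a `my-package/` directory holding only loose .py files, this returns both steps joined into one suggestion.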

Test plan

  • 29 tests covering the full parametrized matrix: {CWD with dashes | underscores} × {flat | src container} × {valid | invalid pkg name} × {__init__.py + .py | .py only | no .py} = 24 cases, plus 5 individual cases:
    • JavaScript and Python coexistence (JS takes precedence)
    • Multiple flat packages error
    • Multiple packages in src/ error
    • Loose .py files without package structure
    • No Python project at all (Unknown)
  • All existing local tests pass
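
The matrix count can be checked by enumerating the four axes: 2 × 2 × 2 × 3 = 24, and the 5 individual cases bring the total to 29. The axis values and the `buildMatrix` helper below are placeholders chosen for illustration, not the fixtures used in the actual test file.

```typescript
const cwdNames = ["my-project", "my_project"] as const; // CWD with dashes | underscores
const layouts = ["flat", "src-container"] as const;
const pkgNames = ["my_package", "my-package"] as const; // valid | invalid package name
const contents = ["init-and-py", "py-only", "no-py"] as const;

interface MatrixCase {
  cwdName: (typeof cwdNames)[number];
  layout: (typeof layouts)[number];
  pkgName: (typeof pkgNames)[number];
  content: (typeof contents)[number];
}

// Cartesian product of the four axes; each entry is one parametrized test case.
function buildMatrix(): MatrixCase[] {
  const cases: MatrixCase[] = [];
  for (const cwdName of cwdNames) {
    for (const layout of layouts) {
      for (const pkgName of pkgNames) {
        for (const content of contents) {
          cases.push({ cwdName, layout, pkgName, content });
        }
      }
    }
  }
  return cases;
}
```

An array like this could be passed to a test runner's parametrized-test helper so that each combination becomes its own named test.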

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@vdusek vdusek added this to the 133rd sprint - Tooling team milestone Jan 27, 2026
@vdusek vdusek self-assigned this Jan 27, 2026
@vdusek vdusek added the adhoc ("Ad-hoc unplanned task added during the sprint.") and t-tooling ("Issues with this label are in the ownership of the tooling team.") labels Jan 27, 2026
@github-actions github-actions bot added the tested ("Temporary label used only programmatically for some analytics.") label Jan 27, 2026
@vdusek vdusek changed the title feat: improve Python project detection and entrypoint resolution fix: Improve Python project detection and entrypoint resolution Jan 27, 2026
@vdusek vdusek force-pushed the feat/improve-python-project-detection branch from b1954c2 to 27ca7f5 Compare February 5, 2026 09:59
@vdusek vdusek requested a review from Copilot February 5, 2026 10:09
Copilot AI left a comment

Pull request overview

This pull request enhances Python project detection in the Apify CLI to support standard Python package layouts without requiring specific directory names. The changes enable automatic package discovery and provide better error messages for common configuration issues.

Changes:

  • Implemented flexible Python project detection based on pyproject.toml, requirements.txt, or presence of .py files
  • Added automatic package discovery that searches for valid Python packages in the current directory and src/ subdirectory
  • Introduced smart entrypoint resolution that automatically selects the entrypoint when exactly one package is found
  • Enhanced error messages with clear guidance for scenarios including no packages found, multiple packages found, and mixed Python/Node.js projects
  • Maintained backwards compatibility with existing projects using src/__main__.py
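
The "exactly one package" rule for entrypoint resolution might look roughly like this. `resolveEntrypoint` is a hypothetical name and the error wording is invented for this sketch. Since no --entrypoint flag exists, the messages only point at fixing the project structure.

```typescript
function resolveEntrypoint(packages: string[]): string {
  const [only] = packages;
  // Exactly one package: it becomes the entrypoint without further input.
  if (packages.length === 1 && only !== undefined) return only;
  if (packages.length === 0) {
    throw new Error("No Python package found. Add a directory containing __init__.py.");
  }
  // Several candidates: list them all and ask for a single top-level package.
  throw new Error(
    `Multiple Python packages found: ${packages.join(", ")}. ` +
      "Keep only one top-level package in the project.",
  );
}
```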

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/lib/hooks/useCwdProject.ts | Core implementation of enhanced Python project detection including helper functions for package discovery, validation, and entrypoint resolution; added mixed project detection |
| test/lib/hooks/useCwdProject.test.ts | Comprehensive new test suite covering flat packages, src container structures, subpackages, multiple packages error cases, edge cases, and mixed project detection |
| test/local/__fixtures__/commands/run/python/prints-error-message-on-project-with-no-detected-start.test.ts | Updated to remove requirements.txt to properly test the "no detection" scenario with the new detection logic |

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@vdusek vdusek force-pushed the feat/improve-python-project-detection branch from 09d051e to 9a23d04 Compare February 5, 2026 10:42
fix: remove references to unsupported --entrypoint flag

The --entrypoint flag doesn't exist yet, so remove mentions of it from
error messages. Updated messages now guide users to fix their project
structure instead.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

fix: prefer Node.js over Python when both indicators exist

When a project has both package.json and Python indicators (like
requirements.txt), prefer Node.js detection instead of throwing an error.
This simplifies the user experience for mixed projects.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@vdusek vdusek marked this pull request as ready for review February 5, 2026 10:55
@vdusek vdusek requested a review from vladfrangu as a code owner February 5, 2026 10:55
vdusek and others added 2 commits February 6, 2026 15:46
- dirExists now actually checks for directories using stat().isDirectory()
- Warn when detected package is missing __main__.py (python -m will fail)
- Rename suggestion handles all invalid chars, not just hyphens
- Add tests for src/ as a package and __main__.py warning
- Document JS precedence over Python with detailed comment
- Fix tests to reflect that requirements.txt no longer affects detection

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
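
For the first item in the commit notes above, a directory check based on stat().isDirectory() could be sketched as below. This assumes the async `stat` from `node:fs/promises`; the helper in the PR may differ in signature and error handling.

```typescript
import { stat } from "node:fs/promises";

// Resolves to true only when `path` exists and is a directory (a regular file yields false).
async function dirExists(path: string): Promise<boolean> {
  try {
    return (await stat(path)).isDirectory();
  } catch {
    // stat() rejects (for example with ENOENT) when the path is missing or unreadable.
    return false;
  }
}
```

A check that only tests for existence would also return true for a regular file named like a package, which is the case the isDirectory() call rules out.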