DIARCHERS-1396: MCP-dedicated API layer with rate limiting, auth, and audit by sap-yuan · Pull Request #610 · SAP/InfraBox

sap-yuan · 2026-06-05T04:08:29Z

Summary

Implements DIARCHERS-1396: isolates all MCP/AI-agent API calls under a dedicated /api/v1/mcp/* namespace, with independent rate limiting, scoped bearer tokens, and audit logging — mirroring the dhaas-control-center pattern.

New DB migration (00047.sql): mcp_token and mcp_access_log tables
src/api/handlers/mcp/auth.py: ib_mcp_* bearer token validation; check_project_access_mcp and check_trigger_access_mcp helpers
src/api/handlers/mcp/rate_limit.py: Redis sliding-window rate limiter (per-user per-endpoint, fail-open on Redis outage)
src/api/handlers/mcp/audit.py: fire-and-forget audit logging into mcp_access_log
src/api/handlers/mcp/token_routes.py: token CRUD at POST/GET/PATCH/DELETE /api/v1/mcp/tokens/* (session auth)
src/api/handlers/mcp/routes/: /api/v1/mcp/* endpoints for projects, builds, jobs, logs, artifacts, trigger
infrabox/test/api/mcp_test.py: 21 unit tests covering token hash, access checks, rate limiter (allow/deny/fail-open)

Global token OPA policy fix

Global tokens were introduced to allow AI agents to access multiple projects with a single token. However, the OPA policies only covered project list/detail — all project-scoped read endpoints returned 401.

This commit adds the missing allow rules:

projects_build.rego — builds list, single build, sub-paths; writes (restart, abort, cache-clear) gated on scope_push=true
projects_jobs.rego — all job sub-resources (console, output, testresults, testruns, tests/history, tabs, archive, badges, stats); writes gated on scope_push=true
projects_commits.rego — read commits for collaborators; fixes latent bug where helper referenced undefined project instead of project_id
projects_cronjobs.rego — read-only cron job list for collaborators with administrator role
trigger.rego — trigger gated on scope_push=true; fixes same variable name bug
global_viewer_test.rego — 9 new test cases: allowed reads, denied non-collaborator, denied writes without scope_push, allowed writes with scope_push

Authorization model: a global token may read any endpoint under /api/v1/projects/<id>/... if and only if its user_id is a collaborator on that project. Writes additionally require scope_push=true. Admin, secrets, SSH keys, and project tokens remain inaccessible. No Python or DB changes are needed.
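The decision the Rego rules encode can be restated in Python as a rough sketch. The actual policies are written in Rego and are not reproduced here; GlobalToken, is_allowed, the collaborators mapping, and the blocked path segment names are all assumptions made for illustration, and the sketch ignores the administrator-role condition on cron jobs.

```python
from dataclasses import dataclass

# Resources the PR says stay inaccessible to global tokens.
# The exact path segment names are assumptions.
BLOCKED_SEGMENTS = {"admin", "secrets", "sshkeys", "tokens"}
READ_METHODS = {"GET", "HEAD"}


@dataclass(frozen=True)
class GlobalToken:
    user_id: str
    scope_push: bool = False


def is_allowed(token, method, project_id, sub_path, collaborators):
    """Return True if the token may call `method` on /projects/<project_id>/<sub_path>.

    `collaborators` maps a project id to the set of user ids collaborating on it.
    """
    segments = [s for s in sub_path.split("/") if s]
    if segments and segments[0] in BLOCKED_SEGMENTS:
        return False  # never reachable with a global token
    if token.user_id not in collaborators.get(project_id, set()):
        return False  # not a collaborator: deny reads and writes alike
    if method.upper() in READ_METHODS:
        return True   # collaborator reads are allowed
    return token.scope_push  # writes need the explicit push scope
```

For example, a collaborator's token without scope_push can GET builds but cannot POST to trigger.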

Changes

Zero impact on existing /api/v1/* endpoints — MCP tokens are blocked from all non-MCP paths
Per-user Redis sliding window with configurable RPM limits per endpoint (log/artifact: 10, trigger: 5, default: 30)
Project-scoped tokens with per-project expiry in JSONB; trigger requires explicit allow_trigger=true
Middleware order: auth → rate limit → project access → trigger access → handler → audit
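The per-user, per-endpoint sliding window with fail-open behaviour can be illustrated with an in-memory stand-in. This is a sketch only: rate_limit.py uses Redis, whereas this example keeps timestamps in a dict of deques so it runs without a server. The RPM values come from the list above; the class name, the endpoint keys, and the `store` parameter are assumptions.

```python
import time
from collections import defaultdict, deque

# RPM limits taken from the PR description; the endpoint keys are assumptions.
LIMITS_RPM = {"log": 10, "artifact": 10, "trigger": 5}
DEFAULT_RPM = 30
WINDOW_SECONDS = 60.0


class SlidingWindowLimiter:
    """In-memory stand-in for a Redis-backed sliding-window limiter."""

    def __init__(self, store=None):
        # `store` maps (user_id, endpoint) to a deque of request timestamps.
        self._store = store if store is not None else defaultdict(deque)

    def allow(self, user_id, endpoint, now=None):
        now = time.monotonic() if now is None else now
        limit = LIMITS_RPM.get(endpoint, DEFAULT_RPM)
        try:
            hits = self._store[(user_id, endpoint)]
            # Drop timestamps that have slid out of the 60-second window.
            while hits and hits[0] <= now - WINDOW_SECONDS:
                hits.popleft()
            if len(hits) >= limit:
                return False
            hits.append(now)
            return True
        except Exception:
            # Fail open: a broken backend must not block API traffic.
            return True
```

Failing open trades strict enforcement for availability: during a backend outage requests go through unthrottled rather than being rejected.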

Testing

cd infrabox/test/api
PYTHONPATH=../../src python -m pytest mcp_test.py -v   # 21 passed

OPA unit tests:

opa test src/openpolicyagent/policies/

JIRA

DIARCHERS-1396

… and audit

- DB migration 00046: mcp_token and mcp_access_log tables
- api/handlers/mcp/auth.py: ib_mcp_* bearer token validation, project/trigger access checks
- api/handlers/mcp/rate_limit.py: Redis sliding-window per-user per-endpoint rate limiter (fail-open)
- api/handlers/mcp/audit.py: fire-and-forget audit logging to mcp_access_log
- api/handlers/mcp/token_routes.py: token CRUD at /api/v1/mcp/tokens/*
- api/handlers/mcp/routes/: /api/v1/mcp/* endpoints for projects, builds, jobs, artifacts, trigger
- infrabox/test/api/mcp_test.py: 21 unit tests covering hash, access checks, rate limiter

- ibflask.py: recognize ib_mcp_ prefix in get_token() to skip JWT decode; normalize_token() passes mcp type through unchanged
- policies/mcp.rego: add OPA rules allowing mcp tokens on /api/v1/mcp/* data paths and session users on tokens management path
- auth.py: fix per-project expiry check to handle naive vs aware datetime; treat malformed expiry as denied instead of granted
- routes/builds.py: add LOCK TABLE before MAX(build_number)+1 INSERT to prevent duplicate build numbers under concurrency; move uuid import to top; drop premature audit 'attempt' before lock
- routes/projects.py: add collaborator JOIN on MCP token path so revoked collaborators cannot enumerate project metadata
- rate_limit.py: reset _redis_client to None on pipeline error so reconnect is attempted on the next request
- token_routes.py: replace inline hashlib.sha256 with _hash_token(); move json/hashlib imports to top; fix MCPTokenTrigger.post docstring (permanent grant, not time-limited); remove dead body variable; add try/except around int(expires_days) conversion
- audit.py: remove unused threading import; fix misleading docstring; move json import to top
- 00046.sql: add retention comment for mcp_access_log pruning
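The expiry fix in auth.py (naive vs aware datetimes, malformed values denied) could look roughly like the sketch below. It is not the PR's code: the function name, the ISO-8601 string representation, treating naive timestamps as UTC, and treating a missing expiry as non-expiring are all assumptions.

```python
from datetime import datetime, timezone


def _as_utc(dt):
    """Treat naive datetimes as UTC (assumption); convert aware ones to UTC."""
    if dt.tzinfo is None:
        return dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)


def is_project_grant_active(expires_at, now=None):
    """Return True while a per-project grant has not yet expired.

    `expires_at` is an ISO-8601 string, or None for a grant without expiry
    (assumption). Anything unparseable is denied rather than granted.
    """
    now = _as_utc(now) if now is not None else datetime.now(timezone.utc)
    if expires_at is None:
        return True
    try:
        expiry = datetime.fromisoformat(expires_at)
    except (TypeError, ValueError):
        return False  # malformed expiry: deny
    return now < _as_utc(expiry)
```

Normalizing both sides to aware UTC values avoids the TypeError Python raises when a naive datetime is compared with an aware one.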

…onify()

flask-restx's make_response wrapper cannot accept a Flask Response object as data: it tries to JSON-serialize it and raises TypeError. All mcp handler methods must return plain dict/list (+ optional status code), not jsonify() Response objects.

flask-restx make_response cannot accept Flask Response objects; _reject() must raise via abort() so the error handler formats the response, not the Resource wrapper.

The test environment DB already has schema_version=46 from a prior development iteration (global_token columns) that was later consolidated into 00045.sql. Keeping mcp_token/mcp_access_log as 00046 causes the migration runner to skip it. Renaming to 00047 ensures it runs on any environment where 00046 is already marked as applied.

mcp.rego is automatically included in the OPA image via the existing Dockerfile COPY instruction (src/openpolicyagent/policies → /policies), so no separate deployment step is needed.

…gger

Global tokens were only authorized for project list/detail endpoints. All project-scoped read endpoints (builds, jobs, console, testresults, archive, stats, etc.) returned 401 because no OPA allow rules existed for token.type = "global".

This commit adds the missing allow rules across all affected policy files, following the same collaborator-check pattern already used for user tokens:

- Read operations (builds, jobs and all sub-resources): allowed when global_token.user_id is a collaborator on the target project.
- Write operations (restart, abort, rerun, cache-clear, trigger): additionally require global_token.scope_push = true.

Also fixes a latent bug in projects_commits.rego and trigger.rego where the helper functions referenced an undefined `project` variable instead of the correct `project_id` parameter.

New OPA test cases in global_viewer_test.rego cover:

- builds list and single build allowed for collaborator
- job console and stats allowed for collaborator
- read denied for non-collaborator project
- restart/trigger denied without scope_push
- restart/trigger allowed with scope_push

sap-yuan self-assigned this Jun 5, 2026

Yuan Huang added 2 commits June 9, 2026 12:07

fix: add mcp_token and mcp_access_log to test teardown TRUNCATE

051d279

sap-yuan force-pushed the diarchers-1396-mcp-api-layer branch from 5244b9e to 051d279 Compare June 9, 2026 04:07

Yuan Huang added 5 commits June 9, 2026 12:35

fix: use abort() instead of jsonify() in mcp_auth_required _reject()

ccbc3f0

flask-restx make_response cannot accept Flask Response objects; _reject() must raise via abort() so the error handler formats the response, not the Resource wrapper.

sap-yuan wants to merge 7 commits into master from diarchers-1396-mcp-api-layer
