fix(sonarqube): bypass 10K result cap by fetching issues per-rule #448

Open
TheAuditorTool wants to merge 1 commit into OWASP-Benchmark:master from
TheAuditorTool:fix/sonarqube-10k-result-cap-33


Closes #33
Refs #187

Summary

The SonarQube /api/issues/search endpoint enforces a hard server-side limit:
p * ps <= 10000 (page number times page size). With PAGE_SIZE = 500, page 21
gives p * ps = 10,500, which exceeds the limit, so the request receives
HTTP 400. For instances reporting >10K vulnerabilities, results are either
silently truncated or the script crashes.
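
The cap arithmetic can be checked in a few lines. This is a minimal sketch that assumes only the p * ps <= 10000 rule stated above; the class and method names (ResultCap, withinCap, lastAllowedPage) are hypothetical and are not taken from SonarReport.java.

```java
class ResultCap {
    // Server-side limit on page number times page size, as stated in the summary.
    static final int RESULT_CAP = 10_000;

    /** True if a request for this page and page size satisfies p * ps <= 10000. */
    static boolean withinCap(int page, int pageSize) {
        return (long) page * pageSize <= RESULT_CAP;
    }

    /** Highest page number that can still be requested for a given page size. */
    static int lastAllowedPage(int pageSize) {
        return RESULT_CAP / pageSize;
    }

    public static void main(String[] args) {
        System.out.println(withinCap(20, 500));   // 20 * 500 = 10,000
        System.out.println(withinCap(21, 500));   // 21 * 500 = 10,500
        System.out.println(lastAllowedPage(500));
    }
}
```

With a page size of 500, page 20 is the last one that satisfies the limit, so any single query returning more than 10,000 issues cannot be read to the end.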

This was reported in #33 (March 2017) and deferred during the #187 cleanup.
The original reporter's suggestion -- filter by individual rules to keep each
query under the cap -- is exactly what this PR implements.


Problem Chain

| Date | State |
|---|---|
| 2017-03 | #33 opened: SonarQube API caps at 10K issues. Suggestion: use &rules=squid:XXX per-rule. |
| 2025-02 | 32933c4e6 added SonarReport.java with native Java pagination. Passes ALL rules in one &rules= query -- still hits 10K cap for large result sets. |
| 2025-02 | dc9abba63 fixed hostname. 283e0e61f applied Spotless formatting. |
| This PR | Iterates per-rule instead of all-at-once, bypassing the 10K cap entirely. |

What Changed

Single file changed: src/main/java/org/owasp/benchmark/report/sonarqube/SonarReport.java

1. Per-rule issue fetching (main())

Before: All ~600 Java rules are comma-joined into a single &rules= query
parameter. If the aggregate result exceeds 10K issues, the API returns HTTP 400
or silently truncates.

After: Each rule is queried individually. A single rule produces far fewer
issues (typically 0-200 for Benchmark's 2,740 test cases), staying well under
the 10K cap.

This also fixes:

  • A double && in the URL (&&rules= -> &rules=)
  • A ~9KB URL from comma-joining 600+ rule IDs, which risks server URL-length limits

Why no duplicates: Each SonarQube issue belongs to exactly one rule.
Per-rule iteration produces disjoint result sets.

Performance: ~600 rules, at least one lightweight API call each. Most rules
return 0 results (1 page). Total overhead: 1-5 minutes. The scan itself takes
>= 1 hour.
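
The per-rule loop can be sketched as follows. This is an illustration under stated assumptions, not the code in SonarReport.java: PerRuleFetch, Page, IssuePageSource, and fetchAllIssues are hypothetical names, issues are represented as plain strings, and the HTTP call is hidden behind an interface so the loop can be exercised without a SonarQube server.

```java
import java.util.ArrayList;
import java.util.List;

class PerRuleFetch {
    static final int PAGE_SIZE = 500;

    /** One page of search results: the server-reported total plus this page's issues. */
    static final class Page {
        final int total;
        final List<String> issues;

        Page(int total, List<String> issues) {
            this.total = total;
            this.issues = issues;
        }
    }

    /** Hypothetical stand-in for one issues/search request restricted to a single rule. */
    interface IssuePageSource {
        Page fetch(String ruleKey, int page, int pageSize);
    }

    /** Queries each rule separately and concatenates the per-rule results. */
    static List<String> fetchAllIssues(List<String> ruleKeys, IssuePageSource source) {
        List<String> all = new ArrayList<>();
        for (String rule : ruleKeys) {
            // The first page also reports how many issues this rule has in total.
            Page first = source.fetch(rule, 1, PAGE_SIZE);
            all.addAll(first.issues);
            // Ceiling division: pages needed for this rule alone.
            int pages = (first.total + PAGE_SIZE - 1) / PAGE_SIZE;
            for (int p = 2; p <= pages; p++) {
                all.addAll(source.fetch(rule, p, PAGE_SIZE).issues);
            }
        }
        return all;
    }
}
```

Because each issue belongs to exactly one rule, concatenating the per-rule lists needs no de-duplication. The sketch does not handle a single rule that itself exceeds 10K issues; such a rule would still run into the cap.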

2. Off-by-one in page count (forAllPagesAt())

Before: (total / PAGE_SIZE) + 1 -- floor division plus one. This gives the
right count when total is not a multiple of PAGE_SIZE, but when total is an
exact multiple (e.g., 500 issues) it fires one unnecessary empty-page request.
At exactly 10K results, this triggers a forbidden page 21.

After: (total + PAGE_SIZE - 1) / PAGE_SIZE -- standard ceiling division.
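
The two formulas can be compared side by side. This is a small sketch using the expressions quoted above; the class and method names (PageCount, pagesBefore, pagesAfter) are hypothetical, and PAGE_SIZE = 500 is taken from the summary.

```java
class PageCount {
    static final int PAGE_SIZE = 500;

    /** The "before" formula: floor quotient plus one. */
    static int pagesBefore(int total) {
        return (total / PAGE_SIZE) + 1;
    }

    /** The "after" formula: ceiling division on non-negative integers. */
    static int pagesAfter(int total) {
        return (total + PAGE_SIZE - 1) / PAGE_SIZE;
    }

    public static void main(String[] args) {
        int[] totals = {499, 500, 501, 10_000};
        for (int total : totals) {
            System.out.printf("total=%d before=%d after=%d%n",
                    total, pagesBefore(total), pagesAfter(total));
        }
    }
}
```

The two formulas agree for 499 and 501 (1 and 2 pages) and differ only at exact multiples: for 500 the old formula yields 2 pages instead of 1, and for 10,000 it yields 21 instead of 20.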

3. HTTP error handling (apiCall())

Before: getInputStream() on an error response (e.g., HTTP 400) throws a
generic IOException with "Server returned HTTP response code: 400 for URL: ...".
For some status codes, it may silently read an error body that breaks Jackson
deserialization downstream.

After: getResponseCode() is checked before reading the body. Non-200
throws IOException("SonarQube API returned HTTP " + status + " for " + apiPath)
with clear context.
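
A status check of this kind might look like the sketch below. It is an assumption-laden illustration rather than the PR's apiCall(): StatusCheckedCall, requireOk, and get are hypothetical names, and the base URL handling is invented for the example. Only the exception message format is taken from the description above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

class StatusCheckedCall {
    /** Throws with the API path in the message unless the status is 200. */
    static void requireOk(int status, String apiPath) throws IOException {
        if (status != HttpURLConnection.HTTP_OK) {
            throw new IOException("SonarQube API returned HTTP " + status + " for " + apiPath);
        }
    }

    /** Performs a GET and returns the body, checking the status before reading it. */
    static String get(String baseUrl, String apiPath) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(baseUrl + apiPath).toURL().openConnection();
        conn.setRequestMethod("GET");
        try {
            // getResponseCode() sends the request; the body is only read on 200.
            requireOk(conn.getResponseCode(), apiPath);
            try (InputStream in = conn.getInputStream()) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        } finally {
            conn.disconnect();
        }
    }
}
```

Keeping the check in its own method makes the error path testable without a running server.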


What Was NOT Changed

| Item | Reason |
|---|---|
| testcode/ (2,740 files) | Benchmark test cases -- not related to this fix |
| SonarQubeResult.java | DTO structure is unchanged |
| KeepAsJsonDeserializer.java | Unchanged |
| runSonarQube.sh | Cleanup handled separately in #187 |
| pom.xml | No new dependencies |
| setDoOutput(true) in apiCall() | Technically incorrect for GET requests, but it is what the original author shipped and tested. Removing it is a style fix with nonzero regression risk and zero benefit. |

Known Limitations (future work)

  1. hotspots/search has the same 10K cap but does not support per-rule
    filtering. Benchmark's 2,740 test cases are unlikely to produce 10K+
    hotspots, but an Enterprise instance could. Addressing this requires a
    different partitioning strategy (by file or security category).

  2. rules/search also paginates with the same mechanism. Default SonarQube
    has ~3,000 rules total, well under 10K. Only relevant for instances with
    10K+ custom rules.


Test Plan

  • Run scripts/runSonarQube.sh on a machine with Docker
  • Verify results/Benchmark_*-sonarqube-v*.json contains both issues and hotspots arrays
  • Confirm issue count matches SonarQube UI (no silent truncation)
  • For instances with >10K total issues: verify all issues are captured (the core fix)
  • Verify no orphaned sonarqube-benchmark container after script completes

TheAuditorTool force-pushed the fix/sonarqube-10k-result-cap-33 branch 2 times, most recently from 6286d30 to 698ae83 (April 13, 2026 17:23)

Closes OWASP-Benchmark#33

The issues/search API enforces p*ps <= 10000. With PAGE_SIZE=500, page
21 returns HTTP 400. Fix: iterate per-rule instead of passing all ~600
rules in one query. Each single-rule query stays well under 10K.

Also fixes off-by-one in page count (ceiling division) and adds HTTP
status checking before reading response body.

TheAuditorTool force-pushed the fix/sonarqube-10k-result-cap-33 branch from 698ae83 to adef7bc (April 13, 2026 17:59)
Development

Successfully merging this pull request may close these issues.

SonarQube Result-Writer fails: SonarQube-WebApi allows to request for a maximum of 10000 issues only
