fix #18 - Add GitHub Action to enforce Apache 2.0 license headers #27
@fantonangeli you must sign your commits with a key that matches your email.
Signed-off-by: fantonangeli <fabrizio.antonangeli@gmail.com>
5dd02f0 to 40a08ea
Pull request overview
Adds automated license-header enforcement using Apache RAT so pull requests fail if files without approved licensing are introduced, supporting the repository’s Apache 2.0 compliance goal.
Changes:
- Add a CI workflow that runs Apache RAT on pushes to `main` and on PR events.
- Introduce `.rat-excludes` to exclude certain non-source/config/generated files from RAT scanning.
- Add Apache 2.0 license headers to `README.md` and `.gitignore`.
Reviewed changes
Copilot reviewed 3 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| README.md | Adds an Apache 2.0 license header in an HTML comment. |
| .rat-excludes | Defines files to exclude from Apache RAT scanning. |
| .gitignore | Adds an Apache 2.0 license header comment block. |
| .github/workflows/ci_check_license_headers.yaml | New workflow that downloads and runs Apache RAT, failing on unapproved files. |
# Redirect output to .rat-reports file, continue even if RAT returns non-zero exit code because we want to print Unapproved documents
java -jar apache-rat-${APACHE_RAT_VERSION}.jar --input-exclude-file .rat-excludes -- . > .rat-reports 2>&1 || true
if grep -q "^! Unapproved:" .rat-reports; then
  echo "❌ Apache RAT check FAILED - Files with unapproved licenses found:"
  echo ""
  grep "^! /" .rat-reports
  exit 1
The workflow unconditionally swallows Apache RAT's exit code (`|| true`) and only greps for one specific report line. If `java` is missing, the JAR download fails or is corrupted, or RAT errors out for any other reason, this step can incorrectly report success because the report won't contain `! Unapproved:`. Capture and check RAT's exit status (or explicitly detect tool/runtime errors) and fail the job when RAT cannot run successfully; you can still print the report for debugging before exiting non-zero.
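One way to act on this comment, sketched under assumptions: keep the command's exit status instead of discarding it, then fail on either the unapproved-file marker or a non-zero status. The helper name `run_license_check` and the return codes 1 and 2 are hypothetical; the `! Unapproved:` marker is taken from the workflow under review, and nothing is assumed about RAT's own exit-code semantics.

```shell
# Hypothetical helper: run_license_check REPORT_FILE COMMAND [ARGS...]
# Returns 0 on success, 1 if the report flags unapproved files,
# 2 if the command itself failed (missing java, corrupt JAR, ...).
run_license_check() {
  report="$1"
  shift
  rat_status=0
  # Keep the exit status instead of discarding it with `|| true`.
  "$@" > "$report" 2>&1 || rat_status=$?

  if grep -q '^! Unapproved:' "$report"; then
    echo "Apache RAT check FAILED - files with unapproved licenses found:"
    grep '^! ' "$report"
    return 1
  fi
  if [ "$rat_status" -ne 0 ]; then
    echo "Apache RAT could not run (exit status ${rat_status}); report follows:" >&2
    cat "$report" >&2
    return 2
  fi
  echo "Apache RAT check passed"
}
```

In the workflow this might be invoked as `run_license_check .rat-reports java -jar "$APACHE_RAT_JAR" --input-exclude-file .rat-excludes -- .`, reusing the flags already present in the step.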
  uses: actions/checkout@v3

- name: Download Apache RAT
  run: |
    curl -LO "https://repo1.maven.org/maven2/org/apache/rat/apache-rat/${APACHE_RAT_VERSION}/apache-rat-${APACHE_RAT_VERSION}.jar"
This downloads and executes a remote JAR via `curl` without any integrity verification (checksum or signature) and uses an unpinned `actions/checkout@v3`. For supply-chain safety, pin actions to a commit SHA (or at least update to the current major version) and verify the RAT artifact, e.g. download the corresponding `.sha512` file or signature and validate it before running.
Suggested change:
-   uses: actions/checkout@v3
- - name: Download Apache RAT
-   run: |
-     curl -LO "https://repo1.maven.org/maven2/org/apache/rat/apache-rat/${APACHE_RAT_VERSION}/apache-rat-${APACHE_RAT_VERSION}.jar"
+   uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # actions/checkout@v4
+ - name: Download Apache RAT
+   run: |
+     set -e
+     BASE_URL="https://repo1.maven.org/maven2/org/apache/rat/apache-rat/${APACHE_RAT_VERSION}"
+     # Download JAR and corresponding SHA-512 checksum
+     curl -L -o "apache-rat-${APACHE_RAT_VERSION}.jar" "${BASE_URL}/apache-rat-${APACHE_RAT_VERSION}.jar"
+     curl -L -o "apache-rat-${APACHE_RAT_VERSION}.jar.sha512" "${BASE_URL}/apache-rat-${APACHE_RAT_VERSION}.jar.sha512"
+     # Verify the downloaded JAR against the published checksum
+     EXPECTED_SHA512="$(awk '{print $1}' "apache-rat-${APACHE_RAT_VERSION}.jar.sha512")"
+     ACTUAL_SHA512="$(sha512sum "apache-rat-${APACHE_RAT_VERSION}.jar" | awk '{print $1}')"
+     if [ "${EXPECTED_SHA512}" != "${ACTUAL_SHA512}" ]; then
+       echo "Checksum verification FAILED for apache-rat-${APACHE_RAT_VERSION}.jar" >&2
+       exit 1
+     fi
if grep -q "^! Unapproved:" .rat-reports; then
  echo "❌ Apache RAT check FAILED - Files with unapproved licenses found:"
  echo ""
  grep "^! /" .rat-reports
`grep "^! /" .rat-reports` is very likely to miss the file list because RAT typically reports paths as relative (e.g., `! ./path` or `! path`) rather than absolute (`/path`). This makes failures hard to diagnose. Broaden the match (e.g., `^! `) or parse the report section that lists offending files so the workflow reliably prints the unapproved paths.
Suggested change:
-   grep "^! /" .rat-reports
+   grep "^! " .rat-reports
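The difference between the two patterns can be checked against a small sample. The report excerpt below is hypothetical (real RAT output may be laid out differently), and the variable names are illustrative only. It also shows that the broadened pattern picks up the `! Unapproved:` summary line, which may be worth filtering out of the printed file list.

```shell
# Hypothetical report excerpt; real RAT output may differ.
report="$(mktemp)"
printf '%s\n' \
  '! Unapproved: 2' \
  '! ./src/index.ts' \
  '! docs/guide.md' > "$report"

# Original pattern: only matches paths starting with "/", so nothing is listed here.
narrow_count="$(grep -c '^! /' "$report" || true)"

# Broadened pattern, minus the summary line that also starts with "! ".
offending="$(grep '^! ' "$report" | grep -v '^! Unapproved:')"
offending_count="$(printf '%s\n' "$offending" | wc -l | tr -d ' ')"

echo "narrow pattern matched: ${narrow_count}"
echo "broad pattern matched:  ${offending_count}"
printf '%s\n' "$offending"
rm -f "$report"
```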
Signed-off-by: fantonangeli <fabrizio.antonangeli@gmail.com>
Signed-off-by: fantonangeli <fabrizio.antonangeli@gmail.com>
2947dc9 to 68eca73
Signed-off-by: fantonangeli <fabrizio.antonangeli@gmail.com>
Pull request overview
Copilot reviewed 3 out of 4 changed files in this pull request and generated 4 comments.
    # Download JAR and corresponding SHA-1 checksum
    curl -LO "${BASE_URL}/${APACHE_RAT_JAR}"
    curl -LO "${BASE_URL}/${APACHE_RAT_SHA}"

    # Verify the downloaded JAR against the published checksum
    EXPECTED_SHA1="$(awk '{print $1}' $APACHE_RAT_SHA)"
    ACTUAL_SHA1="$(sha1sum $APACHE_RAT_JAR| awk '{print $1}')"
    if [ "${EXPECTED_SHA1}" != "${ACTUAL_SHA1}" ]; then
      echo "Checksum verification FAILED for ${APACHE_RAT_JAR}" >&2
      exit 1
    fi
    rm $APACHE_RAT_SHA

- name: Run Apache RAT
  run: |
    APACHE_RAT_JAR="apache-rat-${APACHE_RAT_VERSION}.jar"

    # Redirect output to .rat-reports file, continue even if RAT returns non-zero exit code because we want to print Unapproved documents
    java -jar $APACHE_RAT_JAR --input-exclude-file .rat-excludes -- . > .rat-reports 2>&1 || true
The workflow downloads `apache-rat-${APACHE_RAT_VERSION}.jar` into the repository root and then runs RAT over `.`. This means the freshly downloaded JAR itself is included in the scan, which can cause RAT to report it as an unapproved/binary file and fail the job. Download the JAR into a temp directory outside the scan root (e.g., `$RUNNER_TEMP`) or add an exclude pattern for `apache-rat-*.jar` and scan only tracked repo files.
Suggested change:
-     # Download JAR and corresponding SHA-1 checksum
-     curl -LO "${BASE_URL}/${APACHE_RAT_JAR}"
-     curl -LO "${BASE_URL}/${APACHE_RAT_SHA}"
-     # Verify the downloaded JAR against the published checksum
-     EXPECTED_SHA1="$(awk '{print $1}' $APACHE_RAT_SHA)"
-     ACTUAL_SHA1="$(sha1sum $APACHE_RAT_JAR| awk '{print $1}')"
-     if [ "${EXPECTED_SHA1}" != "${ACTUAL_SHA1}" ]; then
-       echo "Checksum verification FAILED for ${APACHE_RAT_JAR}" >&2
-       exit 1
-     fi
-     rm $APACHE_RAT_SHA
- - name: Run Apache RAT
-   run: |
-     APACHE_RAT_JAR="apache-rat-${APACHE_RAT_VERSION}.jar"
-     # Redirect output to .rat-reports file, continue even if RAT returns non-zero exit code because we want to print Unapproved documents
-     java -jar $APACHE_RAT_JAR --input-exclude-file .rat-excludes -- . > .rat-reports 2>&1 || true
+     APACHE_RAT_DIR="${RUNNER_TEMP:-/tmp}"
+     mkdir -p "${APACHE_RAT_DIR}"
+     cd "${APACHE_RAT_DIR}"
+     # Download JAR and corresponding SHA-1 checksum
+     curl -LO "${BASE_URL}/${APACHE_RAT_JAR}"
+     curl -LO "${BASE_URL}/${APACHE_RAT_SHA}"
+     # Verify the downloaded JAR against the published checksum
+     EXPECTED_SHA1="$(awk '{print $1}' "${APACHE_RAT_SHA}")"
+     ACTUAL_SHA1="$(sha1sum "${APACHE_RAT_JAR}" | awk '{print $1}')"
+     if [ "${EXPECTED_SHA1}" != "${ACTUAL_SHA1}" ]; then
+       echo "Checksum verification FAILED for ${APACHE_RAT_JAR}" >&2
+       exit 1
+     fi
+     rm "${APACHE_RAT_SHA}"
+ - name: Run Apache RAT
+   run: |
+     APACHE_RAT_JAR="${RUNNER_TEMP:-/tmp}/apache-rat-${APACHE_RAT_VERSION}.jar"
+     # Redirect output to .rat-reports file, continue even if RAT returns non-zero exit code because we want to print Unapproved documents
+     java -jar "$APACHE_RAT_JAR" --input-exclude-file .rat-excludes -- . > .rat-reports 2>&1 || true
    APACHE_RAT_SHA="apache-rat-${APACHE_RAT_VERSION}.jar.sha1"

    # Download JAR and corresponding SHA-1 checksum
    curl -LO "${BASE_URL}/${APACHE_RAT_JAR}"
    curl -LO "${BASE_URL}/${APACHE_RAT_SHA}"

    # Verify the downloaded JAR against the published checksum
    EXPECTED_SHA1="$(awk '{print $1}' $APACHE_RAT_SHA)"
    ACTUAL_SHA1="$(sha1sum $APACHE_RAT_JAR| awk '{print $1}')"
    if [ "${EXPECTED_SHA1}" != "${ACTUAL_SHA1}" ]; then
      echo "Checksum verification FAILED for ${APACHE_RAT_JAR}" >&2
      exit 1
    fi
    rm $APACHE_RAT_SHA
Checksum verification is currently based on Maven Central's `.sha1` file. SHA-1 is considered weak; Maven Central typically provides stronger digests (e.g., `.sha512`) and/or signature files. Prefer verifying with SHA-512 (or GPG signature verification) to improve supply-chain integrity.
Suggested change:
-     APACHE_RAT_SHA="apache-rat-${APACHE_RAT_VERSION}.jar.sha1"
-     # Download JAR and corresponding SHA-1 checksum
-     curl -LO "${BASE_URL}/${APACHE_RAT_JAR}"
-     curl -LO "${BASE_URL}/${APACHE_RAT_SHA}"
-     # Verify the downloaded JAR against the published checksum
-     EXPECTED_SHA1="$(awk '{print $1}' $APACHE_RAT_SHA)"
-     ACTUAL_SHA1="$(sha1sum $APACHE_RAT_JAR| awk '{print $1}')"
-     if [ "${EXPECTED_SHA1}" != "${ACTUAL_SHA1}" ]; then
-       echo "Checksum verification FAILED for ${APACHE_RAT_JAR}" >&2
-       exit 1
-     fi
-     rm $APACHE_RAT_SHA
+     APACHE_RAT_SHA512="apache-rat-${APACHE_RAT_VERSION}.jar.sha512"
+     # Download JAR and corresponding SHA-512 checksum
+     curl -LO "${BASE_URL}/${APACHE_RAT_JAR}"
+     curl -LO "${BASE_URL}/${APACHE_RAT_SHA512}"
+     # Verify the downloaded JAR against the published checksum
+     EXPECTED_SHA512="$(awk '{print $1}' $APACHE_RAT_SHA512)"
+     ACTUAL_SHA512="$(sha512sum $APACHE_RAT_JAR | awk '{print $1}')"
+     if [ "${EXPECTED_SHA512}" != "${ACTUAL_SHA512}" ]; then
+       echo "Checksum verification FAILED for ${APACHE_RAT_JAR}" >&2
+       exit 1
+     fi
+     rm $APACHE_RAT_SHA512
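As an alternative to comparing the two digests by hand, the check can be delegated to `sha512sum -c`. The sketch below is illustrative: `verify_sha512` is a hypothetical helper, it assumes the checksum file carries the hex digest as its first field, and it uses a local stand-in file rather than the real JAR so it runs without network access.

```shell
# Hypothetical helper: verify_sha512 FILE CHECKSUM_FILE
# Assumes CHECKSUM_FILE holds the hex digest as its first field.
verify_sha512() {
  file="$1"
  expected="$(awk '{print $1}' "$2")"
  # sha512sum -c expects "<digest>  <path>" lines; build one on the fly.
  printf '%s  %s\n' "$expected" "$file" | sha512sum -c - > /dev/null 2>&1
}

# Self-contained demonstration with a local stand-in for the JAR.
workdir="$(mktemp -d)"
printf 'stand-in payload' > "${workdir}/tool.jar"
sha512sum "${workdir}/tool.jar" | awk '{print $1}' > "${workdir}/tool.jar.sha512"

if verify_sha512 "${workdir}/tool.jar" "${workdir}/tool.jar.sha512"; then
  intact_result="ok"
else
  intact_result="mismatch"
fi

# Tamper with the file; verification should now fail.
printf 'x' >> "${workdir}/tool.jar"
if verify_sha512 "${workdir}/tool.jar" "${workdir}/tool.jar.sha512"; then
  tampered_result="ok"
else
  tampered_result="mismatch"
fi

echo "intact: ${intact_result}, tampered: ${tampered_result}"
rm -rf "$workdir"
```

An empty or malformed checksum file makes `sha512sum -c` fail as well, so this form fails closed if the checksum download goes wrong.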
    set -e
    BASE_URL="https://repo1.maven.org/maven2/org/apache/rat/apache-rat/${APACHE_RAT_VERSION}"
    APACHE_RAT_JAR="apache-rat-${APACHE_RAT_VERSION}.jar"
    APACHE_RAT_SHA="apache-rat-${APACHE_RAT_VERSION}.jar.sha1"

    # Download JAR and corresponding SHA-1 checksum
    curl -LO "${BASE_URL}/${APACHE_RAT_JAR}"
    curl -LO "${BASE_URL}/${APACHE_RAT_SHA}"

    # Verify the downloaded JAR against the published checksum
    EXPECTED_SHA1="$(awk '{print $1}' $APACHE_RAT_SHA)"
    ACTUAL_SHA1="$(sha1sum $APACHE_RAT_JAR| awk '{print $1}')"
    if [ "${EXPECTED_SHA1}" != "${ACTUAL_SHA1}" ]; then
      echo "Checksum verification FAILED for ${APACHE_RAT_JAR}" >&2
      exit 1
    fi
    rm $APACHE_RAT_SHA
The download step uses `set -e` but relies on pipelines (`sha1sum ... | awk ...`) without `pipefail`, and `curl` is invoked without `--fail`. A failed download or checksum command can be masked by pipeline behavior, making failures harder to diagnose. Consider `set -euo pipefail` and `curl -fsSLO` for more reliable failure handling.
Suggested change:
-     set -e
-     BASE_URL="https://repo1.maven.org/maven2/org/apache/rat/apache-rat/${APACHE_RAT_VERSION}"
-     APACHE_RAT_JAR="apache-rat-${APACHE_RAT_VERSION}.jar"
-     APACHE_RAT_SHA="apache-rat-${APACHE_RAT_VERSION}.jar.sha1"
-     # Download JAR and corresponding SHA-1 checksum
-     curl -LO "${BASE_URL}/${APACHE_RAT_JAR}"
-     curl -LO "${BASE_URL}/${APACHE_RAT_SHA}"
-     # Verify the downloaded JAR against the published checksum
-     EXPECTED_SHA1="$(awk '{print $1}' $APACHE_RAT_SHA)"
-     ACTUAL_SHA1="$(sha1sum $APACHE_RAT_JAR| awk '{print $1}')"
-     if [ "${EXPECTED_SHA1}" != "${ACTUAL_SHA1}" ]; then
-       echo "Checksum verification FAILED for ${APACHE_RAT_JAR}" >&2
-       exit 1
-     fi
-     rm $APACHE_RAT_SHA
+     set -euo pipefail
+     BASE_URL="https://repo1.maven.org/maven2/org/apache/rat/apache-rat/${APACHE_RAT_VERSION}"
+     APACHE_RAT_JAR="apache-rat-${APACHE_RAT_VERSION}.jar"
+     APACHE_RAT_SHA="apache-rat-${APACHE_RAT_VERSION}.jar.sha1"
+     # Download JAR and corresponding SHA-1 checksum
+     curl -fsSLO "${BASE_URL}/${APACHE_RAT_JAR}"
+     curl -fsSLO "${BASE_URL}/${APACHE_RAT_SHA}"
+     # Verify the downloaded JAR against the published checksum
+     EXPECTED_SHA1="$(awk '{print $1}' "${APACHE_RAT_SHA}")"
+     ACTUAL_SHA1="$(sha1sum "${APACHE_RAT_JAR}" | awk '{print $1}')"
+     if [ "${EXPECTED_SHA1}" != "${ACTUAL_SHA1}" ]; then
+       echo "Checksum verification FAILED for ${APACHE_RAT_JAR}" >&2
+       exit 1
+     fi
+     rm "${APACHE_RAT_SHA}"
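The pipeline-masking behavior behind this suggestion is easy to reproduce in isolation. The snippet below is a minimal demonstration, assuming `bash` is available; `false | cat` stands in for a failing command piped into `awk`, and the variable names are illustrative. The `curl --fail` part of the suggestion is not exercised here because it would need network access.

```shell
# Without pipefail, a pipeline's status is that of its last command,
# so the failure of `false` is hidden by the successful `cat`.
without_pipefail="$(bash -c 'false | cat; echo $?')"

# With pipefail, the pipeline reports the failing command's status.
with_pipefail="$(bash -c 'set -o pipefail; false | cat; echo $?')"

echo "without pipefail: ${without_pipefail}"
echo "with pipefail:    ${with_pipefail}"
```

Combined with `set -e`, the second form is what lets a failed `sha1sum` or `sha512sum` inside a command substitution stop the step instead of silently producing an empty digest.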
.gitattributes
.npmrc
.prettierignore
.rat-excludes
.rat-reports
pnpm-lock.yaml
If the RAT JAR continues to be downloaded into the workspace, consider adding an exclude entry for `apache-rat-*.jar` here. Otherwise the CI job may end up scanning artifacts created during the workflow run rather than only repository content.
    java-version: 17
    distribution: "temurin"

- name: Download Apache RAT
Closes #18
Summary
Add a GitHub Actions workflow that validates Apache 2.0 license headers are present in all applicable source files and fails the build if headers are missing. The check must execute on every Pull Request.
Goals
Ensure all source files comply with Apache 2.0 licensing requirements.
Non-Goals