fix #18 - Add GitHub Action to enforce Apache 2.0 license headers #27

Open
fantonangeli wants to merge 4 commits into serverlessworkflow:main from fantonangeli:issue-18-Add-GitHub-Action-to-enforce-Apache-20-license-headers-2

Conversation

@fantonangeli
Contributor

Closes #18

Summary

Add a GitHub Actions workflow that validates that Apache 2.0 license headers are present in all applicable source files and fails the build if any are missing. The check must run on every pull request.

Goals

  • Ensure all source files comply with Apache 2.0 licensing requirements.
  • Prevent merging code without proper license headers.
  • Automate enforcement instead of relying on manual review.
  • Maintain CNCF/open-source governance standards.

Non-Goals

  • Complex legal compliance tooling beyond header validation.
  • Checking third-party dependency licenses.

@fantonangeli fantonangeli requested review from Copilot, lornakelly and ricardozanini and removed request for Copilot March 5, 2026 16:31
@ricardozanini
Member

@fantonangeli you must sign your commits with a key that matches your email.

Signed-off-by: fantonangeli <fabrizio.antonangeli@gmail.com>
@fantonangeli fantonangeli force-pushed the issue-18-Add-GitHub-Action-to-enforce-Apache-20-license-headers-2 branch from 5dd02f0 to 40a08ea March 6, 2026 16:42
fantonangeli added a commit to fantonangeli/serverlessworkflow-editor that referenced this pull request Mar 6, 2026
Copilot AI review requested due to automatic review settings March 6, 2026 17:19
Contributor

Copilot AI left a comment

Pull request overview

Adds automated license-header enforcement using Apache RAT so pull requests fail if files without approved licensing are introduced, supporting the repository’s Apache 2.0 compliance goal.

Changes:

  • Add a CI workflow that runs Apache RAT on pushes to main and on PR events.
  • Introduce .rat-excludes to exclude certain non-source/config/generated files from RAT scanning.
  • Add Apache 2.0 license headers to README.md and .gitignore.

Reviewed changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated 3 comments.

File                                             Description
README.md                                        Adds an Apache 2.0 license header in an HTML comment.
.rat-excludes                                    Defines files to exclude from Apache RAT scanning.
.gitignore                                       Adds an Apache 2.0 license header comment block.
.github/workflows/ci_check_license_headers.yaml  New workflow that downloads and runs Apache RAT, failing on unapproved files.

Comment on lines +43 to +49
# Redirect output to .rat-reports file, continue even if RAT returns non-zero exit code because we want to print Unapproved documents
java -jar apache-rat-${APACHE_RAT_VERSION}.jar --input-exclude-file .rat-excludes -- . > .rat-reports 2>&1 || true
if grep -q "^! Unapproved:" .rat-reports; then
echo "❌ Apache RAT check FAILED - Files with unapproved licenses found:"
echo ""
grep "^! /" .rat-reports
exit 1

Copilot AI Mar 6, 2026

The workflow unconditionally swallows Apache RAT’s exit code (|| true) and only greps for a specific report line. If java is missing, the JAR download fails/corrupts, or RAT errors for any other reason, this step can incorrectly report success because the report won’t contain ! Unapproved:. Capture and check RAT’s exit status (or explicitly detect tool/runtime errors) and fail the job when RAT cannot run successfully; you can still print the report for debugging before exiting non-zero.
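
One way to act on this, sketched under assumptions: record the tool's exit status instead of discarding it, always print the report, and fail when either the report contains the ! Unapproved: marker that the workflow already greps for or the tool exited non-zero for any other reason. The helper name check_report is hypothetical, and the sketch does not assume anything about which exit status RAT itself returns when it finds unapproved files, since that may vary by RAT version.

```shell
#!/usr/bin/env bash
# Sketch: run a license-check command, keep its exit status, decide pass/fail.
# check_report is a hypothetical helper name, not part of Apache RAT.
check_report() {
  local report="$1"
  shift
  local rc=0

  # Run the command, capturing stdout and stderr in the report file.
  # "|| rc=$?" records the exit code without aborting under "set -e".
  "$@" > "$report" 2>&1 || rc=$?

  # Always show the report so failures can be diagnosed from the job log.
  cat "$report"

  # Case 1: the report lists unapproved files (marker taken from the workflow).
  if grep -q '^! Unapproved:' "$report"; then
    echo "License check FAILED: unapproved files found" >&2
    return 1
  fi

  # Case 2: the tool itself failed (missing java, corrupt JAR, bad arguments).
  if [ "$rc" -ne 0 ]; then
    echo "License check ERROR: command exited with status $rc" >&2
    return 2
  fi

  return 0
}
```

In the workflow this might be invoked as check_report .rat-reports java -jar "$APACHE_RAT_JAR" --input-exclude-file .rat-excludes -- . so that a missing runtime or a broken JAR fails the job instead of passing silently.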

Comment on lines +35 to +40
uses: actions/checkout@v3

- name: Download Apache RAT
run: |
curl -LO "https://repo1.maven.org/maven2/org/apache/rat/apache-rat/${APACHE_RAT_VERSION}/apache-rat-${APACHE_RAT_VERSION}.jar"


Copilot AI Mar 6, 2026

This downloads and executes a remote JAR via curl without any integrity verification (checksum/signature) and uses an unpinned actions/checkout@v3. For supply-chain safety, pin actions to a commit SHA (or at least update to the current major) and verify the RAT artifact (e.g., download the corresponding .sha512/signature and validate before running).

Suggested change
-   uses: actions/checkout@v3
- - name: Download Apache RAT
-   run: |
-     curl -LO "https://repo1.maven.org/maven2/org/apache/rat/apache-rat/${APACHE_RAT_VERSION}/apache-rat-${APACHE_RAT_VERSION}.jar"
+   uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # actions/checkout@v4
+ - name: Download Apache RAT
+   run: |
+     set -e
+     BASE_URL="https://repo1.maven.org/maven2/org/apache/rat/apache-rat/${APACHE_RAT_VERSION}"
+     # Download JAR and corresponding SHA-512 checksum
+     curl -L -o "apache-rat-${APACHE_RAT_VERSION}.jar" "${BASE_URL}/apache-rat-${APACHE_RAT_VERSION}.jar"
+     curl -L -o "apache-rat-${APACHE_RAT_VERSION}.jar.sha512" "${BASE_URL}/apache-rat-${APACHE_RAT_VERSION}.jar.sha512"
+     # Verify the downloaded JAR against the published checksum
+     EXPECTED_SHA512="$(awk '{print $1}' "apache-rat-${APACHE_RAT_VERSION}.jar.sha512")"
+     ACTUAL_SHA512="$(sha512sum "apache-rat-${APACHE_RAT_VERSION}.jar" | awk '{print $1}')"
+     if [ "${EXPECTED_SHA512}" != "${ACTUAL_SHA512}" ]; then
+       echo "Checksum verification FAILED for apache-rat-${APACHE_RAT_VERSION}.jar" >&2
+       exit 1
+     fi

if grep -q "^! Unapproved:" .rat-reports; then
echo "❌ Apache RAT check FAILED - Files with unapproved licenses found:"
echo ""
grep "^! /" .rat-reports

Copilot AI Mar 6, 2026

grep "^! /" .rat-reports is very likely to miss the file list because RAT typically reports paths as relative (e.g., ! ./path or ! path) rather than absolute (/path). This makes failures hard to diagnose. Broaden the match (e.g., ^! ) or parse the report section that lists offending files so the workflow reliably prints the unapproved paths.

Suggested change
- grep "^! /" .rat-reports
+ grep "^! " .rat-reports
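
To illustrate the broadened match, here is a small sketch run against a made-up report. The report contents and the helper name list_unapproved are hypothetical and only mimic the line shapes discussed in this comment; real RAT output may differ by version. Because the broader pattern also matches the ! Unapproved: summary line that the workflow greps for, the sketch filters that line out.

```shell
#!/usr/bin/env bash
# list_unapproved is a hypothetical helper: print every line starting with "! "
# except the "! Unapproved:" summary line.
list_unapproved() {
  grep '^! ' "$1" | grep -v '^! Unapproved:'
}

# Hypothetical report covering absolute, ./relative, and bare relative paths.
report="$(mktemp)"
cat > "$report" <<'EOF'
! Unapproved: 3
! /abs/path/File.java
! ./relative/File.ts
! plain/File.yaml
  Approved: 10
EOF

# Prints the three path lines regardless of how the path is written.
list_unapproved "$report"
rm -f "$report"
```

With the original pattern ^! / only the first of the three paths in this sample would be printed.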

Signed-off-by: fantonangeli <fabrizio.antonangeli@gmail.com>
Signed-off-by: fantonangeli <fabrizio.antonangeli@gmail.com>
@fantonangeli fantonangeli force-pushed the issue-18-Add-GitHub-Action-to-enforce-Apache-20-license-headers-2 branch from 2947dc9 to 68eca73 March 6, 2026 17:29
Signed-off-by: fantonangeli <fabrizio.antonangeli@gmail.com>
Copilot AI review requested due to automatic review settings March 6, 2026 18:39
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 4 changed files in this pull request and generated 4 comments.


Comment on lines +49 to +68

# Download JAR and corresponding SHA-1 checksum
curl -LO "${BASE_URL}/${APACHE_RAT_JAR}"
curl -LO "${BASE_URL}/${APACHE_RAT_SHA}"

# Verify the downloaded JAR against the published checksum
EXPECTED_SHA1="$(awk '{print $1}' $APACHE_RAT_SHA)"
ACTUAL_SHA1="$(sha1sum $APACHE_RAT_JAR| awk '{print $1}')"
if [ "${EXPECTED_SHA1}" != "${ACTUAL_SHA1}" ]; then
echo "Checksum verification FAILED for ${APACHE_RAT_JAR}" >&2
exit 1
fi
rm $APACHE_RAT_SHA

- name: Run Apache RAT
run: |
APACHE_RAT_JAR="apache-rat-${APACHE_RAT_VERSION}.jar"

# Redirect output to .rat-reports file, continue even if RAT returns non-zero exit code because we want to print Unapproved documents
java -jar $APACHE_RAT_JAR --input-exclude-file .rat-excludes -- . > .rat-reports 2>&1 || true

Copilot AI Mar 6, 2026

The workflow downloads apache-rat-${APACHE_RAT_VERSION}.jar into the repository root and then runs RAT over the current directory (.). This means the freshly downloaded JAR itself is included in the scan, which can cause RAT to report it as an unapproved/binary file and fail the job. Download the JAR into a temp directory outside the scan root (e.g., $RUNNER_TEMP) or add an exclude pattern for apache-rat-*.jar and scan only tracked repo files.

Suggested change
-     # Download JAR and corresponding SHA-1 checksum
-     curl -LO "${BASE_URL}/${APACHE_RAT_JAR}"
-     curl -LO "${BASE_URL}/${APACHE_RAT_SHA}"
-     # Verify the downloaded JAR against the published checksum
-     EXPECTED_SHA1="$(awk '{print $1}' $APACHE_RAT_SHA)"
-     ACTUAL_SHA1="$(sha1sum $APACHE_RAT_JAR| awk '{print $1}')"
-     if [ "${EXPECTED_SHA1}" != "${ACTUAL_SHA1}" ]; then
-       echo "Checksum verification FAILED for ${APACHE_RAT_JAR}" >&2
-       exit 1
-     fi
-     rm $APACHE_RAT_SHA
- - name: Run Apache RAT
-   run: |
-     APACHE_RAT_JAR="apache-rat-${APACHE_RAT_VERSION}.jar"
-     # Redirect output to .rat-reports file, continue even if RAT returns non-zero exit code because we want to print Unapproved documents
-     java -jar $APACHE_RAT_JAR --input-exclude-file .rat-excludes -- . > .rat-reports 2>&1 || true
+     APACHE_RAT_DIR="${RUNNER_TEMP:-/tmp}"
+     mkdir -p "${APACHE_RAT_DIR}"
+     cd "${APACHE_RAT_DIR}"
+     # Download JAR and corresponding SHA-1 checksum
+     curl -LO "${BASE_URL}/${APACHE_RAT_JAR}"
+     curl -LO "${BASE_URL}/${APACHE_RAT_SHA}"
+     # Verify the downloaded JAR against the published checksum
+     EXPECTED_SHA1="$(awk '{print $1}' "${APACHE_RAT_SHA}")"
+     ACTUAL_SHA1="$(sha1sum "${APACHE_RAT_JAR}" | awk '{print $1}')"
+     if [ "${EXPECTED_SHA1}" != "${ACTUAL_SHA1}" ]; then
+       echo "Checksum verification FAILED for ${APACHE_RAT_JAR}" >&2
+       exit 1
+     fi
+     rm "${APACHE_RAT_SHA}"
+ - name: Run Apache RAT
+   run: |
+     APACHE_RAT_JAR="${RUNNER_TEMP:-/tmp}/apache-rat-${APACHE_RAT_VERSION}.jar"
+     # Redirect output to .rat-reports file, continue even if RAT returns non-zero exit code because we want to print Unapproved documents
+     java -jar "$APACHE_RAT_JAR" --input-exclude-file .rat-excludes -- . > .rat-reports 2>&1 || true

Comment on lines +48 to +61
APACHE_RAT_SHA="apache-rat-${APACHE_RAT_VERSION}.jar.sha1"

# Download JAR and corresponding SHA-1 checksum
curl -LO "${BASE_URL}/${APACHE_RAT_JAR}"
curl -LO "${BASE_URL}/${APACHE_RAT_SHA}"

# Verify the downloaded JAR against the published checksum
EXPECTED_SHA1="$(awk '{print $1}' $APACHE_RAT_SHA)"
ACTUAL_SHA1="$(sha1sum $APACHE_RAT_JAR| awk '{print $1}')"
if [ "${EXPECTED_SHA1}" != "${ACTUAL_SHA1}" ]; then
echo "Checksum verification FAILED for ${APACHE_RAT_JAR}" >&2
exit 1
fi
rm $APACHE_RAT_SHA

Copilot AI Mar 6, 2026

Checksum verification is currently based on Maven Central's .sha1 file. SHA-1 is considered weak; Maven Central typically provides stronger digests (e.g., .sha512) and/or signature files. Prefer verifying with SHA-512 (or GPG signature verification) to improve supply-chain integrity.

Suggested change
- APACHE_RAT_SHA="apache-rat-${APACHE_RAT_VERSION}.jar.sha1"
- # Download JAR and corresponding SHA-1 checksum
- curl -LO "${BASE_URL}/${APACHE_RAT_JAR}"
- curl -LO "${BASE_URL}/${APACHE_RAT_SHA}"
- # Verify the downloaded JAR against the published checksum
- EXPECTED_SHA1="$(awk '{print $1}' $APACHE_RAT_SHA)"
- ACTUAL_SHA1="$(sha1sum $APACHE_RAT_JAR| awk '{print $1}')"
- if [ "${EXPECTED_SHA1}" != "${ACTUAL_SHA1}" ]; then
-   echo "Checksum verification FAILED for ${APACHE_RAT_JAR}" >&2
-   exit 1
- fi
- rm $APACHE_RAT_SHA
+ APACHE_RAT_SHA512="apache-rat-${APACHE_RAT_VERSION}.jar.sha512"
+ # Download JAR and corresponding SHA-512 checksum
+ curl -LO "${BASE_URL}/${APACHE_RAT_JAR}"
+ curl -LO "${BASE_URL}/${APACHE_RAT_SHA512}"
+ # Verify the downloaded JAR against the published checksum
+ EXPECTED_SHA512="$(awk '{print $1}' $APACHE_RAT_SHA512)"
+ ACTUAL_SHA512="$(sha512sum $APACHE_RAT_JAR | awk '{print $1}')"
+ if [ "${EXPECTED_SHA512}" != "${ACTUAL_SHA512}" ]; then
+   echo "Checksum verification FAILED for ${APACHE_RAT_JAR}" >&2
+   exit 1
+ fi
+ rm $APACHE_RAT_SHA512
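
As an alternative to comparing two strings by hand, the digest check can be delegated to sha512sum -c. This is a minimal sketch, assuming a sha512sum that accepts a checksum list on standard input (as GNU coreutils does) and that the published .sha512 file holds the hex digest as its first field, which the awk '{print $1}' calls in this thread also assume. The helper name verify_sha512 is hypothetical.

```shell
#!/usr/bin/env bash
# verify_sha512 <file> <digest_file>
# Hypothetical helper: returns 0 only if <file> matches the digest stored as
# the first field of <digest_file>.
verify_sha512() {
  local file="$1" digest_file="$2"
  local expected

  # Take the first whitespace-separated field of the first line: the hex digest.
  expected="$(awk '{print $1; exit}' "$digest_file")"
  if [ -z "$expected" ]; then
    echo "No digest found in $digest_file" >&2
    return 1
  fi

  # sha512sum -c expects lines of the form "<digest>  <filename>".
  if ! printf '%s  %s\n' "$expected" "$file" | sha512sum -c - >/dev/null 2>&1; then
    echo "Checksum verification FAILED for $file" >&2
    return 1
  fi
}
```

A call such as verify_sha512 "$APACHE_RAT_JAR" "$APACHE_RAT_SHA512" would then replace the EXPECTED/ACTUAL comparison, and an empty or missing digest file is rejected instead of being compared as an empty string.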

Comment on lines +45 to +61
set -e
BASE_URL="https://repo1.maven.org/maven2/org/apache/rat/apache-rat/${APACHE_RAT_VERSION}"
APACHE_RAT_JAR="apache-rat-${APACHE_RAT_VERSION}.jar"
APACHE_RAT_SHA="apache-rat-${APACHE_RAT_VERSION}.jar.sha1"

# Download JAR and corresponding SHA-1 checksum
curl -LO "${BASE_URL}/${APACHE_RAT_JAR}"
curl -LO "${BASE_URL}/${APACHE_RAT_SHA}"

# Verify the downloaded JAR against the published checksum
EXPECTED_SHA1="$(awk '{print $1}' $APACHE_RAT_SHA)"
ACTUAL_SHA1="$(sha1sum $APACHE_RAT_JAR| awk '{print $1}')"
if [ "${EXPECTED_SHA1}" != "${ACTUAL_SHA1}" ]; then
echo "Checksum verification FAILED for ${APACHE_RAT_JAR}" >&2
exit 1
fi
rm $APACHE_RAT_SHA

Copilot AI Mar 6, 2026

The download step uses set -e but relies on pipelines (sha1sum ... | awk ...) without pipefail, and curl is invoked without --fail. A failed download or checksum command can be masked by pipeline behavior, making failures harder to diagnose. Consider set -euo pipefail and curl -fsSLO for more reliable failure handling.

Suggested change
- set -e
- BASE_URL="https://repo1.maven.org/maven2/org/apache/rat/apache-rat/${APACHE_RAT_VERSION}"
- APACHE_RAT_JAR="apache-rat-${APACHE_RAT_VERSION}.jar"
- APACHE_RAT_SHA="apache-rat-${APACHE_RAT_VERSION}.jar.sha1"
- # Download JAR and corresponding SHA-1 checksum
- curl -LO "${BASE_URL}/${APACHE_RAT_JAR}"
- curl -LO "${BASE_URL}/${APACHE_RAT_SHA}"
- # Verify the downloaded JAR against the published checksum
- EXPECTED_SHA1="$(awk '{print $1}' $APACHE_RAT_SHA)"
- ACTUAL_SHA1="$(sha1sum $APACHE_RAT_JAR| awk '{print $1}')"
- if [ "${EXPECTED_SHA1}" != "${ACTUAL_SHA1}" ]; then
-   echo "Checksum verification FAILED for ${APACHE_RAT_JAR}" >&2
-   exit 1
- fi
- rm $APACHE_RAT_SHA
+ set -euo pipefail
+ BASE_URL="https://repo1.maven.org/maven2/org/apache/rat/apache-rat/${APACHE_RAT_VERSION}"
+ APACHE_RAT_JAR="apache-rat-${APACHE_RAT_VERSION}.jar"
+ APACHE_RAT_SHA="apache-rat-${APACHE_RAT_VERSION}.jar.sha1"
+ # Download JAR and corresponding SHA-1 checksum
+ curl -fsSLO "${BASE_URL}/${APACHE_RAT_JAR}"
+ curl -fsSLO "${BASE_URL}/${APACHE_RAT_SHA}"
+ # Verify the downloaded JAR against the published checksum
+ EXPECTED_SHA1="$(awk '{print $1}' "${APACHE_RAT_SHA}")"
+ ACTUAL_SHA1="$(sha1sum "${APACHE_RAT_JAR}" | awk '{print $1}')"
+ if [ "${EXPECTED_SHA1}" != "${ACTUAL_SHA1}" ]; then
+   echo "Checksum verification FAILED for ${APACHE_RAT_JAR}" >&2
+   exit 1
+ fi
+ rm "${APACHE_RAT_SHA}"
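
The effect of pipefail can be seen in isolation. In the sketch below, false stands in for a failing checksum command and cat for the awk stage; neither line is taken from the workflow, and bash is assumed as the shell.

```shell
#!/usr/bin/env bash
# Without pipefail: the pipeline's status is that of its last command (cat),
# so the failure of `false` is masked and the script continues under set -e.
bash -c 'set -e; digest="$(false | cat)"; echo "continued without pipefail"'

# With pipefail: the pipeline reports the failure, the assignment inherits
# that status, and set -e stops the script before the echo runs.
bash -c 'set -euo pipefail; digest="$(false | cat)"; echo "not reached"' \
  || echo "stopped with pipefail (exit status $?)"
```

The first command prints its message even though a stage of the pipeline failed; the second exits before its echo, which is the behavior the review comment asks for in the checksum step.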

Comment on lines +1 to +6
.gitattributes
.npmrc
.prettierignore
.rat-excludes
.rat-reports
pnpm-lock.yaml

Copilot AI Mar 6, 2026

If the RAT JAR continues to be downloaded into the workspace, consider adding an exclude entry for apache-rat-*.jar here. Otherwise the CI job may end up scanning artifacts created during the workflow run rather than only repository content.

java-version: 17
distribution: "temurin"

- name: Download Apache RAT
Member

Can't we cache this jar?
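
One possible shape for this, sketched in shell: keep the JAR in a directory that a cache step (for example actions/cache) restores between runs, and download only when the cached copy is missing or fails its checksum. The helper names sha512_of and ensure_jar, the idea of passing a pinned digest, and the fetch command are all hypothetical; the PR does not say how caching would be implemented.

```shell
#!/usr/bin/env bash
# sha512_of <file>: print the hex SHA-512 digest of a file.
sha512_of() {
  sha512sum "$1" | awk '{print $1}'
}

# ensure_jar <jar_path> <expected_sha512> <fetch_command...>
# Reuses jar_path when its digest matches; otherwise runs the fetch command
# (which must create jar_path) and verifies the result before trusting it.
ensure_jar() {
  local jar="$1" expected="$2"
  shift 2

  # Cache hit: the file exists and its digest matches, so skip the download.
  if [ -f "$jar" ] && [ "$(sha512_of "$jar")" = "$expected" ]; then
    echo "cache hit: $jar"
    return 0
  fi

  # Cache miss or corrupt copy: fetch, then verify.
  echo "cache miss: fetching $jar"
  "$@" || return 1
  if [ "$(sha512_of "$jar")" != "$expected" ]; then
    echo "Checksum verification FAILED for $jar" >&2
    rm -f "$jar"
    return 1
  fi
}
```

A call could look like ensure_jar "$CACHE_DIR/$APACHE_RAT_JAR" "$EXPECTED_SHA512" curl -fsSL -o "$CACHE_DIR/$APACHE_RAT_JAR" "${BASE_URL}/${APACHE_RAT_JAR}", where CACHE_DIR and EXPECTED_SHA512 are placeholders. Verifying the cached copy on every run keeps the integrity check in place even when no download happens.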
