HDDS-15211. Automatically create a github discussion for junit failures on master#10238

Draft
errose28 wants to merge 1 commit into apache:master from errose28:junit-failure-discussion

@errose28
Contributor

What changes were proposed in this pull request?

Draft generated with AI as an example to see if this is helpful.

We often do not check the master branch for flaky tests, so they are not proactively tracked under HDDS-5626 and tagged with @Flaky. Usually tests are not tagged until they disrupt PRs, and even then they are frequently just rerun without being tagged. This can cause the reliability of master to degrade slowly until it gets bad enough that we inspect many past runs and add lots of @Flaky tags all at once.

To help proactively tag these flaky tests, this PR adds a GitHub Actions job that creates a new GitHub discussion for each test run on master that has JUnit failures. The current draft posts to the General category, but we could create a dedicated category for these.
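As a rough illustration of what creating the discussion involves, the sketch below builds a request for the `createDiscussion` mutation in GitHub's GraphQL API. It is not the code in this PR. The helper name `build_request` and the ID values in the usage note are hypothetical, and the real repository and category IDs would have to be looked up first.

```python
import json

# GraphQL mutation for creating a discussion. The variable names are chosen
# for this sketch; the input fields are those of GitHub's createDiscussion.
CREATE_DISCUSSION = """
mutation($repositoryId: ID!, $categoryId: ID!, $title: String!, $body: String!) {
  createDiscussion(input: {
    repositoryId: $repositoryId,
    categoryId: $categoryId,
    title: $title,
    body: $body
  }) {
    discussion { url }
  }
}
"""


def build_request(repository_id, category_id, title, body):
    """Return the JSON payload to POST to https://api.github.com/graphql.

    Hypothetical helper: it only assembles the payload and sends nothing.
    """
    return json.dumps({
        "query": CREATE_DISCUSSION,
        "variables": {
            "repositoryId": repository_id,
            "categoryId": category_id,
            "title": title,
            "body": body,
        },
    })
```

For example, `build_request("R_placeholder", "DIC_placeholder", "[CI] JUnit failure on master (072f758)", "...")` returns a string that could be sent with an `Authorization: Bearer <token>` header, assuming the token is allowed to create discussions.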

Format of the discussion would look like this:

[CI] JUnit failure on master (072f758)

**Workflow run:** https://github.com/errose28/ozone/actions/runs/25561938156
**Commit:** 072f758268491dd2b8945089f916f4421755741e
Flaky test Jiras should be filed as **subtasks of [HDDS-5626](https://issues.apache.org/jira/browse/HDDS-5626)**.
---
## integration-hdds
org.apache.hadoop.hdds.upgrade.TestDNDataDistributionFinalization
org.apache.hadoop.hdds.upgrade.TestScmDataDistributionFinalization
org.apache.hadoop.hdds.upgrade.TestScmHAFinalization
Error: Process completed with exit code 1.
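One way to assemble a title and body in the format shown above is sketched here. The function name `format_discussion` and the shape of its `failures_by_module` argument (module name mapped to failing test classes) are assumptions made for illustration, not the PR's implementation. The sketch leaves out the trailing `Error:` line from the example.

```python
HDDS_5626_NOTE = (
    "Flaky test Jiras should be filed as **subtasks of "
    "[HDDS-5626](https://issues.apache.org/jira/browse/HDDS-5626)**."
)


def format_discussion(run_url, commit, failures_by_module):
    """Build (title, body) for a discussion about JUnit failures on master.

    failures_by_module is a hypothetical input: a dict mapping a module
    name to an iterable of fully qualified failing test class names.
    """
    # The title uses the abbreviated 7-character commit hash.
    title = f"[CI] JUnit failure on master ({commit[:7]})"
    lines = [
        f"**Workflow run:** {run_url}",
        f"**Commit:** {commit}",
        HDDS_5626_NOTE,
        "---",
    ]
    # One section per module, with its failing test classes listed below it.
    for module in sorted(failures_by_module):
        lines.append(f"## {module}")
        lines.extend(sorted(failures_by_module[module]))
    return title, "\n".join(lines)
```

Sorting modules and class names keeps the output stable between runs, which would make repeated discussions easier to compare by eye.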

Other alternatives were considered but dropped due to complexity:

  • Automatically filing Jira issues:
    • Requires a new Jira token to be added by ASF Infra. I recall from past testing that our existing GitHub token is enough to create discussions.
    • Requires deduplication of test failures. Each run would need to determine whether a Jira has already been filed for a failing test that has not been tagged yet.
    • Jira's notification system is not as good as GitHub's. It would be hard to subscribe to notifications about new flaky tests without also pushing to a different channel.
  • Automatically creating a PR that adds the @Flaky annotation to the test. The corresponding Jira would then be filed by whoever reviews and merges the change.
    • This was my initial approach, but there is a lot of nuance in mapping the output of a Maven Surefire XML report to the line where the annotation belongs.
    • This also has the same deduplication problem as automatically creating Jiras.
  • Using failed run archives from https://github.com/adoroszlai/ozone-build-results
    • This creates a dependency from an Apache project on a personal repo, which is not ideal.
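For context on the Surefire point above, listing which test classes failed is the simpler half of the problem. A minimal sketch follows, assuming the conventional report layout of `<testcase classname="...">` elements with `<failure>` or `<error>` children. The function name `failed_test_classes` is hypothetical. Mapping a class to the source line for an annotation is the hard part and is not attempted here.

```python
import xml.etree.ElementTree as ET


def failed_test_classes(report_xml):
    """Return the sorted, de-duplicated class names with a failure or error.

    Assumes the usual Surefire layout: <testcase classname=...> elements
    that contain a <failure> or <error> child when the test did not pass.
    """
    root = ET.fromstring(report_xml)
    failed = set()
    # iter() walks the whole tree, so this works whether the root is a
    # single <testsuite> or a <testsuites> wrapper.
    for case in root.iter("testcase"):
        if case.find("failure") is not None or case.find("error") is not None:
            failed.add(case.get("classname"))
    return sorted(failed)
```

The result is a set of class names rather than individual methods, which matches the granularity of the example discussion body.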

What is the link to the Apache JIRA

HDDS-15211

How was this patch tested?

I haven't done a dry run of this on my fork yet; this PR is mostly to gauge whether the community is interested. If so, I can proceed with testing.

@errose28 errose28 added the CI label May 11, 2026
@adoroszlai adoroszlai changed the title HDDS-15211. Automatically create a github discussion when there are junit failures on the master branch HDDS-15211. Automatically create a github discussion for junit failures on master May 11, 2026