-
Notifications
You must be signed in to change notification settings - Fork 1
RFC [KT] Add content-release command for kernel release automation #57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: mainline
Are you sure you want to change the base?
Conversation
76083c0 to
b286616
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR extends the kt vm command with a --content_release workflow that automates building and installing LTS kernels in a VM from dist-git, then running kselftests. It introduces configuration fields and YAML metadata needed to drive the per-kernel mock builds and depot authentication.
Changes:
- Extend
KernelInfo/kernels.yamlto track a per-kernelmock_configused for mock builds. - Extend
Configwithuser,email,depot_user, anddepot_tokenand adjustfrom_str_dictto only treat specific keys as paths. - Add
VmInstance.content_release, a--content_releaseCLI flag, and additional VM dependencies (mock,python3-GitPython) plus a few supporting wiring changes.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
kt/ktlib/kernels.py |
Adds a required mock_config field to KernelInfo to match new kernels.yaml entries and support selecting the proper mock config for each kernel. |
kt/ktlib/config.py |
Adds user/depot-related config fields and refines from_str_dict to convert only specific keys to Path, changing how JSON configs must be structured. |
kt/data/kernels.yaml |
Annotates each kernel entry with its corresponding mock_config filename to be used by the content release workflow. |
kt/commands/vm/impl.py |
Implements the VmInstance.content_release orchestration, wires it into main, and consumes configuration/kernels metadata to run mkdistgitdiff, mock builds, install RPMs, and run kselftests. |
kt/commands/vm/command.py |
Exposes the new --content_release flag on the kt vm CLI and forwards it to the implementation. |
kernel_install_dep.sh |
Ensures mock and python3-GitPython are installed for supported Rocky versions and adds the invoking user to the mock group to allow mock builds within the VM. |
Comments suppressed due to low confidence (1)
kt/commands/vm/impl.py:375
- In the
content_releasepath we log “Waiting for the deps to be installed” but the actualtime.sleep(Constants.VM_DEPS_INSTALL_WAIT_SECONDS)call is commented out, unlike thetestpath above. This means that on a freshly created VM the content release workflow may start beforekernel_install_dep.shhas finished installingmock,python3-GitPython, and other dependencies, leading to sporadic failures; either restore the sleep or otherwise ensure dependency installation has completed before invokingcontent_release.
vm = Vm.load(config=config, kernel_workspace=kernel_workspace)
if destroy:
vm.destroy()
return
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
b286616 to
e63daa0
Compare
roxanan1996
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some comments inline.
I think this needs a separate kt command.
Since building the kernel package can be done locally (not in a vm).
And reuse some vm functionality to test the kernel. The current state could be refactored a bit to "export" some functionality to other kt commands. I would put those into kt/ktlib.
kt/commands/vm/impl.py
Outdated
|
|
||
| ssh_cmd = ( | ||
| f"cd {self.kernel_workspace.dist_worktree.folder.absolute()} && " | ||
| f"mock -v -r ../${{USER}}_{mock_config} --resultdir={self.kernel_workspace.folder.absolute()}/build_files --buildsrpm " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to run mock in the vm? I think this is already using a chroot or sth similar.
So why not running this on the host?
And use the vm just for testing. I believe the content_release original script is like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are absolutely right that the rpms could be built on the host. And I probably prefer it that way (which is why content_release.sh does it that way). The one nice thing about them being built in the vm is that the system running kt doesn't even have to care about mock, which is kind of nice from the perspective of minimizing the requirements for automation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, but it kt won't run locally. It will run in a container in jenkins or idk. I think it's not an issue if we have more requiremements, as long as they are nicely installable. I will add support to install other packages than through pip to fix this issue.
e63daa0 to
e31845d
Compare
Introduces a new 'content-release' command that automates the complete
kernel content release workflow through three stages: prepare, build,
and test. The command can run all stages sequentially (default) or
execute individual stages via --prepare, --build, and --test flags.
Prepare Stage:
- Validates git user.name and user.email configuration in dist-git repo
- Executes mkdistgitdiff.py to generate staging branch and release files
- Staging branch format: {automation_tmp}_<src_branch>
- Checks out the newly created staging branch
- Extracts and displays the new release tag from script output
Build Stage:
- Verifies mock command availability and user membership in mock group
- Validates DEPOT_USER and DEPOT_TOKEN environment variables
- Creates temporary mock config with depot credentials injected
- Downloads kernel sources using getsrc.sh script
- Builds source RPM with mock in build_files directory
- Builds binary RPMs from the SRPM
- Lists all created RPM packages
- Cleans up temporary mock configuration
Test Stage:
- Spins up VM for the kernel workspace (creates if needed)
- Installs built kernel RPMs excluding .src.rpm, kernel-rt*, kernel-debug*
- Reboots VM and verifies running kernel version matches installed version
- Executes kselftests using /usr/libexec/kselftests/run_kselftest.sh
- Counts and reports number of tests passed
- Outputs logs: install.log, selftest-<version>.log
Additional Changes:
- Refactored VM implementation to extract reusable functions:
* load_vm_from_workspace(): Load config and VM from workspace name
* setup_and_spinup_vm(): Common VM setup and spinup logic
- Increased VM disk size to 30GB during creation to accommodate kernel
packages and debug symbols
- Updated KT.md with comprehensive documentation for the new command
The content-release command streamlines the kernel release process by
automating previously manual steps, ensuring consistency and reducing
the potential for human error in the release workflow.
e31845d to
a18a453
Compare
|
I have revamped this to introduce a new content-release command that does the mkdistgitdiff and mock builds locally, and only spins up the vm to test. This is much closer to what content_release.sh does today, with the added flexibility of splitting the task up into prepare, build and test steps taht can be run individually if needed. See what you think. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 4 out of 5 changed files in this pull request and generated 5 comments.
Comments suppressed due to low confidence (1)
kt/commands/vm/impl.py:308
setup_and_spinup_vmalways destroys and recreates the VM whenoverride=Truewithout checking if it exists, which can cause an unnecessary failure if the VM name has never been created. It would be more robust to checkVirtHelper.exists(vm_name=...)before callingdestroy()so--overrideworks both for existing and first-time VMs.
vm, config = load_vm_from_workspace(kernel_workspace_name)
if override:
vm.destroy()
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # Get the mock config name from kernel workspace name | ||
| # lts-9.2 -> lts92, lts-9.4 -> lts94, etc. | ||
| version_part = kernel_workspace.replace("lts-", "lts").replace(".", "") | ||
| mock_config_name = f"rocky-{version_part}-depot-x86_64.cfg" |
Copilot
AI
Feb 3, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic that derives mock_config_name from kernel_workspace by only special-casing the lts- prefix (e.g., lts-9.2 -> rocky-lts92-depot-x86_64.cfg) does not align with the more general mock_config mapping in release_config.release_map (release_config.py:27-54), and will likely look for a non-existent config for non-lts products such as fips-9.2. To keep this working across all supported kernels, consider obtaining the mock config name from the central release mapping instead of reconstructing it from the workspace string here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, this is a good idea.
| or globally. | ||
|
|
||
| Args: | ||
| kernel_workspace: The name of the kernel workspace (e.g., 'lts9_4') |
Copilot
AI
Feb 3, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The kernel_workspace docstring example (e.g., 'lts9_4') does not match the actual kernel names defined in kt/data/kernels.yaml (e.g., lts-9.2, lts-9.4) or the CLI examples in KT.md. For clarity, it would be better to update this example to use one of the real workspace names so users don’t get confused about the expected format.
| kernel_workspace: The name of the kernel workspace (e.g., 'lts9_4') | |
| kernel_workspace: The name of the kernel workspace (e.g., 'lts-9.4') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will fix
| def _resize_disk(self): | ||
| """Resize the qcow2 disk image to 30GB.""" | ||
| logging.info(f"Resizing disk {self.qcow2_path} to 30G") | ||
| subprocess.run(["qemu-img", "resize", str(self.qcow2_path), "30G"], check=True) |
Copilot
AI
Feb 3, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using qemu-img resize with check=True will raise CalledProcessError on any failure but the exception is not caught or wrapped, which will surface as an unhandled stack trace rather than a clear user-facing error like in other subprocess calls in this module. Consider catching subprocess.CalledProcessError here and raising a RuntimeError with a concise message so CLI users get consistent error reporting with the rest of the tool.
| subprocess.run(["qemu-img", "resize", str(self.qcow2_path), "30G"], check=True) | |
| try: | |
| subprocess.run( | |
| ["qemu-img", "resize", str(self.qcow2_path), "30G"], | |
| check=True, | |
| ) | |
| except subprocess.CalledProcessError as e: | |
| raise RuntimeError( | |
| f"Failed to resize disk {self.qcow2_path} to 30G (exit code {e.returncode})" | |
| ) from e |
| vm, config = load_vm_from_workspace(name) | ||
| vm.destroy() | ||
| return | ||
|
|
Copilot
AI
Feb 3, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the vm main() flow you create vm_instance unconditionally but only use it when test or console is true, which means running kt vm <name> with no flags will still spin up a VM and then immediately exit. If there is a supported no-op mode here, consider short-circuiting before calling setup_and_spinup_vm when neither test nor console is requested so the command doesn't do unexpected work.
| # If neither test nor console is requested, do not spin up a VM. | |
| if not (test or console): | |
| return |
| result = SshCommand.run(domain=vm_instance.domain, command=["uname -r"]) | ||
| kernel_version = result.stdout.strip() if hasattr(result, "stdout") else None | ||
| if not kernel_version: | ||
| # Try alternate method | ||
| import subprocess as sp | ||
|
|
||
| result = sp.run( | ||
| ["ssh", vm_instance.domain, "uname -r"], | ||
| capture_output=True, | ||
| text=True, | ||
| check=True, | ||
| ) | ||
| kernel_version = result.stdout.strip() |
Copilot
AI
Feb 3, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SshCommand.run (via CommandRunner.run) already returns the stdout string, so checking result.stdout and then falling back to a separate subprocess.run(["ssh", ...]) call both misuses the API and causes the ssh command to be executed twice on the success path. You can simplify this to use the string returned by SshCommand.run(...).strip() directly and, if you still want a fallback, catch RuntimeError from SshCommand.run before invoking a secondary ssh call.
| result = SshCommand.run(domain=vm_instance.domain, command=["uname -r"]) | |
| kernel_version = result.stdout.strip() if hasattr(result, "stdout") else None | |
| if not kernel_version: | |
| # Try alternate method | |
| import subprocess as sp | |
| result = sp.run( | |
| ["ssh", vm_instance.domain, "uname -r"], | |
| capture_output=True, | |
| text=True, | |
| check=True, | |
| ) | |
| kernel_version = result.stdout.strip() | |
| try: | |
| # SshCommand.run returns the stdout string directly | |
| kernel_version = SshCommand.run(domain=vm_instance.domain, command=["uname -r"]).strip() | |
| except RuntimeError: | |
| # Fallback: run ssh via subprocess if SshCommand.run fails | |
| result = subprocess.run( | |
| ["ssh", vm_instance.domain, "uname -r"], | |
| capture_output=True, | |
| text=True, | |
| check=True, | |
| ) | |
| kernel_version = result.stdout.strip() | |
| if not kernel_version: | |
| raise RuntimeError("Received empty kernel version from VM") |
Introduces a new 'content-release' command that automates the complete
kernel content release workflow through three stages: prepare, build,
and test. The command can run all stages sequentially (default) or
execute individual stages via --prepare, --build, and --test flags.
Prepare Stage:
Build Stage:
Test Stage:
Additional Changes:
packages and debug symbols
The content-release command streamlines the kernel release process by
automating previously manual steps, ensuring consistency and reducing
the potential for human error in the release workflow.
And since its not in the diff, this requires the mock-configs and kernel-tools repos checked out in the kt workspace.
This works for lts-9.2. That is all I tested. More testing will be needed. This is really just an RFC at the moment.