
iOS device startup: upload .logarchive and pull crash reports on cleanup failure #5232

Open
matouskozak wants to merge 2 commits into main from lm-ios-diagnostic-extraction-pr



@matouskozak
Member

Summary

When the post-iteration iOS app cleanup fails because devicectl returns 'No such process', the app had already exited before we could terminate it. This PR adds diagnostic data collection to help investigate why: a crash, an iOS-initiated termination, or a natural exit.

Changes

src/scenarios/shared/runner.py:

  • Added make_archive to the shutil import
  • Wrapped killCmdCommand.run() in a try/except CalledProcessError block that:
    1. Zips and uploads the iOS device's .logarchive to HELIX_WORKITEM_UPLOAD_ROOT
    2. Copies device-side crash reports (.ips) via xcrun devicectl device copy from --domain-type systemCrashLogs
    3. Re-raises the original exception after diagnostics are saved
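
The control flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in runner.py: `run_kill_with_diagnostics`, `kill`, and `collectors` are hypothetical names, and the real change calls killCmdCommand.run() inline. The point it shows is that each diagnostic step is best-effort, so a failure while collecting diagnostics cannot replace the original CalledProcessError.

```python
import logging
import subprocess

log = logging.getLogger("runner-sketch")  # hypothetical logger name


def run_kill_with_diagnostics(kill, collectors):
    """Run kill(); on CalledProcessError run each collector, then re-raise.

    `kill` and every item in `collectors` are zero-argument callables
    (hypothetical stand-ins for the kill command and the diagnostic steps).
    """
    try:
        kill()
    except subprocess.CalledProcessError:
        for collect in collectors:
            try:
                collect()
            except Exception as diag_ex:
                # Diagnostics are best-effort: log and continue.
                log.warning("Diagnostic step failed: %s", diag_ex)
        raise  # bare raise re-raises the original kill failure
```

With this shape, a successful kill never touches the collectors, which matches the "no behavior change for successful runs" goal.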

Why

  • Adds observability for iOS device startup/cleanup failures — a daily triage pain point
  • Investigative-only: triggers only on the cleanup failure that already terminates the iteration
  • No behavior change for successful runs

Origin

Cherry-picked from matouskozak/ios-sdk-jobs-macos-26 (commits 71bfd4a1, 3101d30c) which were left unmerged after PR #5231 was merged.

Verification

  • ✅ No conflict markers
  • python3 -m py_compile src/scenarios/shared/runner.py passes

…nup failure

When post-iteration app cleanup fails (devicectl 'No such process'), the app
already exited before we could terminate it. To diagnose why — crash, iOS-
initiated termination, or natural exit:

1. Zip and upload the iOS device's .logarchive to the Helix results container
2. Pull device-side crash reports (.ips) via xcrun devicectl
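
As a rough sketch of step 1, zipping a .logarchive directory with shutil.make_archive could look like the following. The helper name `zip_logarchive` and its parameters are assumptions made for illustration; the actual change in runner.py may build the paths differently.

```python
import os
import shutil


def zip_logarchive(logarchive_dir, upload_root, name):
    """Zip a .logarchive directory into upload_root and return the zip path.

    All three parameters are hypothetical; `name` is the archive file name
    without the ".zip" suffix, which make_archive appends itself.
    """
    src = os.path.abspath(logarchive_dir)  # also drops any trailing slash
    base = os.path.join(os.path.abspath(upload_root), name)
    # root_dir/base_dir keep the top-level "<something>.logarchive" folder
    # inside the zip, so it unpacks as a bundle rather than as loose files.
    return shutil.make_archive(
        base, "zip",
        root_dir=os.path.dirname(src),
        base_dir=os.path.basename(src),
    )
```

Keeping the .logarchive folder as the top-level entry is a design choice in this sketch, on the assumption that the archive will be unpacked and opened as a single bundle.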

Investigative-only: triggers only on the cleanup failure that already
terminates the iteration.

Cherry-picked from matouskozak/ios-sdk-jobs-macos-26 (71bfd4a1, 3101d30c).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 28, 2026 15:11
Contributor

Copilot AI left a comment


Pull request overview

This PR adds diagnostic collection for iOS device startup cleanup failures, helping investigate cases where the app process is already gone when the runner tries to kill it.

Changes:

  • Wraps the iOS app kill step in CalledProcessError handling.
  • Uploads the collected .logarchive and attempts to copy device crash logs before re-raising.
  • Adds make_archive import for zipping the log archive.


Comment thread src/scenarios/shared/runner.py Outdated
Comment on lines +864 to +879
# Pull any iOS crash reports (.ips) for our bundle from the device into the upload root.
# The systemCrashLogs domain on devicectl exposes /var/mobile/Library/Logs/CrashReporter/.
crash_dest = os.path.join(upload_root, f'iteration{i}_crashlogs')
os.makedirs(crash_dest, exist_ok=True)
crashCopyCmd = [
    'xcrun', 'devicectl', 'device', 'copy', 'from',
    '--device', deviceUDID,
    '--domain-type', 'systemCrashLogs',
    '--source', '/',
    '--destination', crash_dest,
]
try:
    getLogger().info(f"Copying device crash logs to {crash_dest} for diagnosis.")
    RunCommand(crashCopyCmd, verbose=True).run()
except Exception as crash_ex:
    getLogger().warning(f"Failed to copy device crash logs for diagnosis: {crash_ex}")

- Wrap os.makedirs into the same try/except as the crash copy so a
  diagnostic-path failure cannot mask the original CalledProcessError
  from the failed app kill.
- Prune the systemCrashLogs copy after devicectl to entries relevant
  to this iteration: keep files matching the bundle name or with
  mtime >= the iteration's log-collect start. Drops dozens of unrelated
  .ips files (WiFiLQMMetrics, wifip2pd, old crashes, etc.) per upload.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
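
The pruning rule from the second commit (keep a file if its name matches the bundle or its mtime is at or after the log-collect start) might be sketched like this. `prune_crash_logs`, `bundle_name`, and `since_epoch` are hypothetical names, and the sketch assumes the copied files carry mtimes that are meaningful for this comparison, which the PR text does not confirm.

```python
import os


def prune_crash_logs(crash_dir, bundle_name, since_epoch):
    """Delete files under crash_dir that are neither named after the bundle
    nor modified at/after since_epoch. Returns the sorted list of kept paths.

    Names and signature are illustrative assumptions, not the PR's code.
    """
    kept = []
    for root, _dirs, files in os.walk(crash_dir):
        for fname in files:
            path = os.path.join(root, fname)
            matches_bundle = bundle_name in fname
            is_recent = os.path.getmtime(path) >= since_epoch
            if matches_bundle or is_recent:
                kept.append(path)
            else:
                os.remove(path)  # unrelated, older report: drop from upload
    return sorted(kept)
```

In keeping with the first bullet above, a call like this would sit inside the same try/except as the crash copy, so a pruning error is only logged.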