Skip to content

Add support for using pre-built images#6737

Open
arrrnas wants to merge 14 commits intotilt-dev:masterfrom
arrrnas:feature/live_update_initial_sync
Open

Add support for using pre-built images#6737
arrrnas wants to merge 14 commits intotilt-dev:masterfrom
arrrnas:feature/live_update_initial_sync

Conversation

@arrrnas
Copy link
Copy Markdown
Contributor

@arrrnas arrrnas commented Apr 4, 2026

Adds initial_sync() as a new live_update step that syncs all local files to the container on start or restart. This enables live_update with pre-built images that weren't built locally.

live_update=[
    initial_sync(ignore=['node_modules', '.git', '**/test']),
    sync('./src', '/app/src'),
    run('npm install', trigger=['package.json']),
]

Addresses #2582. Partially addresses #3961 and #4776. Improves on file_sync_only extension (tilt-dev/tilt-extensions#77).

What it does

  • initial_sync() — new step, must be first in the list. On container start/restart, walks all sync paths and transfers files to the container
  • ignore — accepts dockerignore-style glob patterns to exclude files
  • dockerignore — optional path to a directory containing a .dockerignore to reuse its patterns
  • run steps respect their triggers during initial sync — steps with unmatched triggers are skipped, steps without triggers always run

What it doesn't do

  • No compression — tested with gzip but CPU overhead exceeded network savings for typical workloads (~20k files, ~90MB in under 20s to remote EKS)
  • No changes to existing live_update behavior — initial_sync is opt-in

@arrrnas
Copy link
Copy Markdown
Contributor Author

arrrnas commented Apr 4, 2026

here's an example of using a generic nginx:alpine image with a static website that get's synced when pod starts:

tilt-initial-sync-example cat Tiltfile 
# Example: using initial_sync to serve local files with a stock nginx image.
# No Dockerfile needed — just sync local files into the running container.

allow_k8s_contexts(k8s_context())

k8s_custom_deploy(
    'nginx',
    apply_cmd='kubectl apply -f k8s.yaml -o yaml',
    delete_cmd='kubectl delete -f k8s.yaml --ignore-not-found',
    deps=['site'],
    container_selector='nginx',
    live_update=[
        initial_sync(),
        sync('site', '/usr/share/nginx/html'),
    ],
)

k8s_resource('nginx', port_forwards='8080:80')
➜  tilt-initial-sync-example cat k8s.yaml 
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:alpine
        ports:
        - containerPort: 80
➜  tilt-initial-sync-example cat site/index.html 
<!DOCTYPE html>
<html>
<head><title>Hello from Tilt</title></head>
<body>
  <h1>Hello from Tilt initial_sync!</h1>
  <p>Edit this file and watch it update in the container.</p>
</body>
</html>

@nicks
Copy link
Copy Markdown
Member

nicks commented Apr 10, 2026

high-level comments:

overall i like the approach!

i don't understand what problem you're trying to solve with the ignore arguments to initial_sync. in all the cases i can think of, this will put users in weird and inconsistent states? for example, initial_sync(ignore=['node_modules']), sync('.', '/app') will put you in states where directories get half-synced if they're modified post-deploy.

it looks like the tests are broken?

is that why the pr is still in draft? or are there other things you want to iterate on before review?

@arrrnas
Copy link
Copy Markdown
Contributor Author

arrrnas commented Apr 10, 2026

hey @nicks and thanks for looking into this, apologies for the broken state of the PR, didn't have enough time to update it yet...

regarding the ignore filter - our use case might be a bit contrived but the reason to ignore node_modules: the pre-built image the we use already has the correct node_modules folder baked in (at least with the dependencies that existed at that commit), so there's 2 scenarios:

  • developer legitimately has changes to the packages and needs them installed
    • live_update trigger handles running npm -i (or equivalent command) and node_modules inside the pod is up-to-date
  • developer has changes in node_modules due to working on the project locally/natively
    • we don't want to sync those just because there's a huge number of files to sync and our main bottle neck is number of files

this was the main use case for us with the ignore filter, but think there might be more - mono-repo where you might want just a subset of the repo to get synced when using a generic image (like in the nginx example).

There's one main optimization that's missing from this, reconciler.go changes can be optimized I think, to simplify collecting all of the files that need to be synced, otherwise the structure of the PR shouldn't change much.

arrrnas added 9 commits April 10, 2026 02:02
Signed-off-by: arnas dundulis <arnas@vinted.com>
Signed-off-by: arnas dundulis <arnas@vinted.com>
Signed-off-by: arnas dundulis <arnas@vinted.com>
Without this, maybeSync is gated behind hasChangesToSync which requires
file change events or container state changes. On a fresh tilt up, the
container starts with no file changes detected, so maybeSync is never
entered and isInitialSync is never checked. This means initial_sync
silently does nothing.

Fix: when InitialSync is configured on the spec, always set
hasChangesToSync=true so the reconciler enters maybeSync. The
isInitialSync check inside maybeSync still correctly gates on whether
the container is new (lastFileTimeSynced is zero).

Signed-off-by: arnas dundulis <arnas@vinted.com>
Two pre-existing test bugs:

1. Test containers use Image "frontend-image" which doesn't match the
   ImageMap selector (expects "local-registry:12345/frontend-image:my-tag").
   Tests were actually syncing to the setupFrontend fixture's "main-id"
   container, not the test's "container-1". Assertions only checked call
   count so this was hidden.

2. f.Update() and f.kdUpdateStatus() auto-reconcile via MustReconcile,
   so initial sync fires during setup before the test's explicit
   MustReconcile call. Tests now clear f.cu.Calls before the assertion
   window and account for the correct reconcile sequence.

Signed-off-by: arnas dundulis <arnas@vinted.com>
During initial sync, instead of collecting all individual file paths,
converting them to PathMappings via FilesToPathMappings (O(files x syncs)),
statting each for existence via MissingLocalPaths, and then creating a tar
from individual file mappings — build the tar archive directly from
directory-level sync mappings with the ignore filter passed to
TarArchiveForPaths.

This means a single directory walk produces the tar stream, skipping
the redundant FilesToPathMappings conversion and MissingLocalPaths stat
calls. collectAllSyncedFiles still runs for BoilRuns trigger matching.

Benchmarks at 14k files:
  old (individual files): 189ms, 33MB allocs
  new (tar batched):      164ms, 21MB allocs  (14% faster, 37% less memory)

Also adds:
- buildInitialSyncFilter: composes per-sync-path ignore matchers into
  a single filter for TarArchiveForPaths
- Warnings when sync paths don't exist or dockerignore path has no
  .dockerignore file during initial sync

Signed-off-by: arnas dundulis <arnas@vinted.com>
Add 6 tests that inspect the actual tar archive produced by initial_sync:
- TarBatching_IgnoresFilteredFiles: IgnorePaths exclude files from tar
- TarBatching_DockerignoreExcludesFromTar: .dockerignore patterns applied
- TarBatching_MultipleSyncPathsInSingleTar: files from all syncs in one tar
- TarBatching_NoDeletesDuringInitialSync: ToDelete is always empty
- TarBatching_GlobPatternExcludesFromTar: **/testdata glob works
- TarBatching_LargeFileTree: 1000 files with 100 ignored, correct count

Signed-off-by: arnas dundulis <arnas@vinted.com>
- Rename noMoreInitialSync → seenInitialSync in liveUpdateFromSteps
  for clarity (the variable tracks whether we've seen an initial_sync
  step, not whether more are allowed)

- Update IgnorePaths field comment to accurately describe dockerignore
  glob syntax support (**/ patterns, *.ext wildcards) instead of just
  "exact matches and directory prefixes"

Signed-off-by: arnas dundulis <arnas@vinted.com>
Signed-off-by: arnas dundulis <arnas@vinted.com>
@arrrnas arrrnas force-pushed the feature/live_update_initial_sync branch from 6582bb3 to e1d6198 Compare April 10, 2026 02:03
@arrrnas
Copy link
Copy Markdown
Contributor Author

arrrnas commented Apr 10, 2026

so the main change to figuring out which files to sync:
before:

  1. collectAllSyncedFiles — walk all sync dirs, collect every file path into a []string
  2. FilesToPathMappings — loop through each file × each sync rule to compute container paths
  3. MissingLocalPaths — os.Stat() every file again to check if it still exists
  4. TarArchiveForPaths(individualFileMappings, nil) — build tar from individual file PathMappings (which internally walks/stats each file again)

after:

  1. SyncsToPathMappings — build directory-level mappings directly from sync spec (no per-file work)
  2. TarArchiveForPaths(directoryMappings, ignoreFilter) — single directory walk builds the tar with filtering

benchmarked this to save ~14% time and ~37% memory at 14k files

arrrnas added 4 commits April 10, 2026 02:13
The previous approach set hasChangesToSync=true on every reconcile when
InitialSync was configured, causing unnecessary work entering and
exiting maybeSync after all containers have already been synced.

Now uses monitor.needsInitialSync() which returns true only when no
containers are tracked yet (first reconcile) or when any tracked
container has never been synced (lastFileTimeSynced is zero). Once all
containers complete their initial sync, the normal hasChangesToSync
gate applies.

Signed-off-by: arnas dundulis <arnas@vinted.com>
The err2 rename was an artifact of the tar batching refactor. Since err
is not used in the outer scope at that point, use err directly with var.

Signed-off-by: arnas dundulis <arnas@vinted.com>
filepath.IsAbs("/absolute/path") returns false on Windows since it's
not a Windows absolute path (no drive letter). Use path.IsAbs (POSIX
semantics) instead so the validation rejects Unix-style absolute paths
on all platforms.

Fixes TestLiveUpdate_InitialSync_AbsolutePathError on Windows.

Signed-off-by: arnas dundulis <arnas@vinted.com>
Signed-off-by: arnas dundulis <arnas@vinted.com>
@arrrnas arrrnas marked this pull request as ready for review April 10, 2026 05:02
@arrrnas
Copy link
Copy Markdown
Contributor Author

arrrnas commented Apr 10, 2026

I think I'm done making changes, sorry for the messy PR :{

@nicks
Copy link
Copy Markdown
Member

nicks commented Apr 15, 2026

i thought about it a while. i don't think i'm comfortable accepting the change with the ignore_paths/dockerignore arguments to initial_sync.

i understand the problem you're trying to solve. But the feature only works if you hold it in a very specific way. And will lead to weird/broken behavior otherwise.

I'd be comfortable accepting the initial_sync without the ignore bits. I'm also open to discussing other solutions to this problem.

@nicks nicks self-requested a review April 15, 2026 15:57
@arrrnas
Copy link
Copy Markdown
Contributor Author

arrrnas commented Apr 16, 2026

hey @nicks fair enough, thinking about this, the explicit ignore might not even be needed for us, if I recall correctly tilt already takes into account .dockerignore and doesn't watch paths that are ignored by it, this makes sense since the files won't end up in the image if built locally, so no need to sync if they change, in our case the images are pre-built, but from the same dockerfile that could be used locally, so .dockerignore works the same and we still get the same result of node_modules on host machine being ignored and pods rebuilding their own node_modules when triggered by changes to packages.json, etc..

I'll update the PR to omit the ignore param.

@arrrnas
Copy link
Copy Markdown
Contributor Author

arrrnas commented Apr 16, 2026

looking at the code my claim

if I recall correctly tilt already takes into account .dockerignore and doesn't watch paths that are ignored by it,

is not correct, that applies to the normal live_update file syncing flow but doesn't apply to initial_sync, this leads to a situation where initial_sync could be able to copy files to a pod that will not get live_updated/synced later due to being ignored because of .dockerignore, which sounds pretty bad..

My proposal would be: initial_sync() takes no arguments, but respects .dockerignore the same way sync() does so that a file that will not get synced by sync() would also not get copied by initial_sync(), does that make sense?

Signed-off-by: arnas dundulis <arnas@vinted.com>
@arrrnas arrrnas force-pushed the feature/live_update_initial_sync branch from 9bdeedc to df2adb5 Compare April 16, 2026 13:15
@arrrnas
Copy link
Copy Markdown
Contributor Author

arrrnas commented Apr 16, 2026

I've modified the implementation to leverage the ignore paths that live_update already has in order to make initial_sync and sync behave the same way - files ignored by sync() will also be ignored by initial_sync()

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants