feat: Dynamic memory snapshots #1715

Open
Pijukatel wants to merge 3 commits into master from dynamic-snapshotter

Conversation

Collaborator

@Pijukatel Pijukatel commented Feb 5, 2026

Description

  • Add a Ratio type that expresses the memory limit as a fraction of the system's total memory.
  • Allow Snapshotter.max_memory_size and MemorySnapshot.max_memory_size to be initialized with either Ratio (dynamic memory) or ByteSize (fixed memory).
  • When Ratio is used, MemorySnapshot.is_overloaded takes the current available memory into account. (Previously, it took into account only the initial available memory.)

Top level usage in Crawlers:
Fixed memory

BasicCrawler(configuration=Configuration(memory_mbytes=1024))

Dynamic memory

BasicCrawler(configuration=Configuration(available_memory_ratio=0.5))
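
As a rough illustration of the difference between the two modes, the sketch below resolves a memory limit from either a fixed size or a ratio. The `Ratio` class and `resolve_limit_bytes` function here are illustrative stand-ins rather than Crawlee's actual implementation, and the total memory is passed in explicitly instead of being read from the system.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Ratio:
    """Illustrative stand-in for the PR's `Ratio` type."""

    value: float


def resolve_limit_bytes(limit: int | Ratio, total_memory_bytes: int) -> int:
    """Return the memory limit in bytes.

    A plain int is treated as a fixed limit. A Ratio is resolved against
    the total memory reported at the time of the call.
    """
    if isinstance(limit, Ratio):
        return int(total_memory_bytes * limit.value)
    return limit


GIB = 1024**3

# Fixed limit: unaffected by how much memory the system has.
fixed = resolve_limit_bytes(GIB, total_memory_bytes=4 * GIB)

# Dynamic limit: follows the system memory when it is scaled up.
before = resolve_limit_bytes(Ratio(0.5), total_memory_bytes=4 * GIB)
after = resolve_limit_bytes(Ratio(0.5), total_memory_bytes=8 * GIB)
```

With a fixed limit the result stays at 1 GiB, while the ratio-based limit doubles when the (assumed) total memory doubles.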

Issues

Testing

  • Unit test

Checklist

  • CI passed

@github-actions github-actions bot added this to the 133rd sprint - Tooling team milestone Feb 5, 2026
@github-actions github-actions bot added t-tooling Issues with this label are in the ownership of the tooling team. tested Temporary label used only programatically for some analytics. labels Feb 5, 2026
codecov bot commented Feb 5, 2026

Codecov Report

❌ Patch coverage is 71.87500% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 92.42%. Comparing base (8c0dae6) to head (3f27a14).
⚠️ Report is 3 commits behind head on master.

Files with missing lines | Patch % | Lines
src/crawlee/_autoscaling/snapshotter.py | 65.38% | 9 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1715      +/-   ##
==========================================
- Coverage   92.47%   92.42%   -0.06%     
==========================================
  Files         156      156              
  Lines       10602    10621      +19     
==========================================
+ Hits         9804     9816      +12     
- Misses        798      805       +7     
Flag | Coverage Δ
unit | 92.42% <71.87%> (-0.06%) ⬇️

@Pijukatel Pijukatel marked this pull request as ready for review February 10, 2026 15:53
Contributor

Copilot AI left a comment

Pull request overview

This PR introduces dynamic memory snapshot support to address autoscaling limitations in environments with variable memory allocations (e.g., Kubernetes burstable QoS). It adds a Ratio type that allows the autoscaler to dynamically query available system memory rather than being locked to an initial baseline.

Changes:

  • Introduced Ratio type for representing dynamic memory as a proportion of total system memory
  • Modified Snapshotter and MemorySnapshot to accept either ByteSize (fixed) or Ratio (dynamic) for memory limits
  • Added logic to dynamically evaluate memory overload based on current available memory when using Ratio

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File | Description
src/crawlee/_utils/byte_size.py | Adds Ratio Pydantic model with validation for memory ratios (0.0 < value ≤ 1.0)
src/crawlee/_autoscaling/snapshotter.py | Updates max_memory_size parameter to accept ByteSize | Ratio and dynamically calculates memory limits when using Ratio
src/crawlee/_autoscaling/_types.py | Modifies MemorySnapshot.is_overloaded to dynamically query system memory when max_memory_size is a Ratio
tests/unit/_autoscaling/test_snapshotter.py | Adds comprehensive test simulating memory scale-up/scale-down scenarios with mocked memory info

Collaborator

@Mantisus Mantisus left a comment

Looks good! I have a few comments below

"""The maximum memory that can be used by `AutoscaledPool`.

When of type `ByteSize` then it is used as fixed memory size. When of type `Ratio` then it allows for dynamic memory
scaling based on the available system memory.
Collaborator

I believe the available_memory_ratio docstring in Configuration should also mention this dynamic scaling behavior.

Collaborator Author

Updated

class Ratio(BaseModel):
    """Represents ratio of memory."""

    value: Annotated[float, Field(gt=0.0, le=1.0)]
Collaborator

I believe we could add gt=0.0 and le=1.0 constraints to available_memory_ratio in Configuration for consistency.

Collaborator Author

Added

# The snapshot overload is decided not when the snapshot was taken, but when the `is_overloaded` property is
# accessed. This allows for dynamic memory scaling. The same memory snapshot that used to be overloaded in
# the past can become non-overloaded if the available memory was increased.
max_memory_size = ByteSize(int(get_memory_info().total_size.bytes * self.max_memory_size.value))
Collaborator

I believe get_memory_info() should be wrapped in asyncio.to_thread since it uses psutil.

Also, I don't think historical snapshots should be affected by memory scaling - the overload was real at the moment the snapshot was taken. Once it's older than _SNAPSHOT_HISTORY, it won't influence the Snapshotter anyway. The 30-second inertia from old memory values seems reasonable to me.
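
A minimal sketch of the `asyncio.to_thread` suggestion, assuming the memory query is a blocking call. The `read_total_memory_bytes` function is a hypothetical stand-in for the psutil-backed `get_memory_info()`, and `resolve_limit_bytes` is likewise made up for illustration.

```python
import asyncio
import time


def read_total_memory_bytes() -> int:
    """Hypothetical stand-in for a blocking, psutil-backed memory query."""
    time.sleep(0.01)  # Simulate a blocking system call.
    return 8 * 1024**3


async def resolve_limit_bytes(ratio: float) -> int:
    # Run the blocking call in a worker thread so the event loop stays responsive.
    total = await asyncio.to_thread(read_total_memory_bytes)
    return int(total * ratio)


limit = asyncio.run(resolve_limit_bytes(0.5))
```

Resolving the limit once, when the snapshot is taken, also matches the second point above: the resolved value is stored with the snapshot and older snapshots are not re-evaluated.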

Collaborator Author

Good point. No need to be so "realtime-ish"

self._evaluate_memory_load(event_data.memory_info.current_size, event_data.memory_info.created_at)

if isinstance(self._max_memory_size, Ratio):
    max_memory_size = ByteSize(int(get_memory_info().total_size.bytes * self._max_memory_size.value))
Collaborator

I think we could skip calling get_memory_info() when event_data.memory_info is MemoryInfo and just use memory_info.total_size.bytes instead.

Collaborator Author

@Pijukatel Pijukatel Feb 11, 2026

Updated

Collaborator

@Mantisus Mantisus left a comment

LGTM!


max_memory_size: ByteSize
"""The maximum memory that can be used by `AutoscaledPool`."""
max_memory_size: ByteSize | Ratio
Collaborator

In _snapshot_memory, we now always resolve Ratio to ByteSize, so I believe there should be only ByteSize. Also, don't forget to update the docstring.

Suggested change
- max_memory_size: ByteSize | Ratio
+ max_memory_size: ByteSize

@@ -1,14 +1,22 @@
from __future__ import annotations
Collaborator

I don't think Ratio belongs here. A ratio is not a byte size, it's a proportion of memory. We should place Ratio in _autoscaling/_types.py, the only place where it is used.

_BYTES_PER_TB = _BYTES_PER_KB**4


class Ratio(BaseModel):
Collaborator

This should probably be only a dataclass instead of a Pydantic model
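
One way the dataclass variant could look, as a sketch only: the range check in `__post_init__` is an assumption about how the `gt=0.0, le=1.0` constraint of the Pydantic field might be preserved, not code from this PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ratio:
    """Hypothetical dataclass variant of `Ratio` (a fraction of total memory)."""

    value: float

    def __post_init__(self) -> None:
        # Mirrors the `gt=0.0, le=1.0` constraint of the Pydantic field.
        if not 0.0 < self.value <= 1.0:
            raise ValueError(f'Ratio must be in the interval (0, 1], got {self.value}')
```

A plain dataclass does not validate on its own, so the explicit check is what keeps invalid values such as `Ratio(0.0)` or `Ratio(1.5)` from being constructed.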

Comment on lines +292 to +294
# This is just a hypothetical case that should not happen in practice.
# `LocalEventManager` should always provide `MemoryInfo` in the event data.
# When running on Apify, `self._max_memory_size` is always `ByteSize`, not `Ratio`.
Collaborator

Please log a warning then.
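
A sketch of how the fallback branch could log a warning, combined with the earlier suggestion to reuse the total reported in the event data. The `MemoryInfo` class, `read_total_memory_bytes`, and `resolve_limit_bytes` below are simplified, hypothetical stand-ins and do not reflect the actual Crawlee types or signatures.

```python
from __future__ import annotations

import logging
from dataclasses import dataclass

logger = logging.getLogger(__name__)


@dataclass
class MemoryInfo:
    """Hypothetical, simplified stand-in for the event's memory info."""

    total_size_bytes: int
    current_size_bytes: int


def read_total_memory_bytes() -> int:
    """Hypothetical stand-in for a fresh system query such as `get_memory_info()`."""
    return 8 * 1024**3


def resolve_limit_bytes(ratio: float, memory_info: object) -> int:
    if isinstance(memory_info, MemoryInfo):
        # Common path: reuse the total reported by the event, no extra query.
        total = memory_info.total_size_bytes
    else:
        # Not expected in practice, so make it visible when it happens.
        logger.warning('Event data has no MemoryInfo, querying system memory directly.')
        total = read_total_memory_bytes()
    return int(total * ratio)


from_event = resolve_limit_bytes(0.5, MemoryInfo(4 * 1024**3, 1024**3))
from_fallback = resolve_limit_bytes(0.5, None)
```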

assert prev_time <= curr_time, f'Items at indices {i - 1} and {i} are not in chronological order'


_initial_memory_info = get_memory_info()
Collaborator

Can we use a fixture so that this does not run at import time?

@janbuchar
Collaborator

Once you folks are done here, we need to make sure that this is also implemented in the JS port

Labels

t-tooling Issues with this label are in the ownership of the tooling team. tested Temporary label used only programatically for some analytics.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Snapshotter does not account for dynamic memory scaling (e.g., K8s burstable QoS)

4 participants