Skip to content

Conversation

@wooorm-arcjet
Copy link
Member

@wooorm-arcjet wooorm-arcjet commented Dec 4, 2025

related to PRD-8

@wooorm-arcjet wooorm-arcjet requested a review from a team as a code owner December 4, 2025 13:45
@trunk-io
Copy link

trunk-io bot commented Dec 4, 2025

Merging to main in this repository is managed by Trunk.

  • To merge this pull request, check the box to the left or comment /trunk merge below.

Copy link
Contributor

@arcjet-rei arcjet-rei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat! Good job turning this around quickly. I'll need to review arcjet/arcjet#6412 as well, but the other piece of this is figuring out where we want to shim the remaining work of making this available to users into the roadmap. For now, I'll come back to approve this after looking at the monorepo PR.

analyze/index.ts Outdated
): Promise<FilterResult> {
const coreImports = createCoreImports();
const analyze = await initializeWasm(coreImports);
const body = await context.getBody();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I understanding correctly that we always read the request body if a filter rule is configured?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I looked at using the existing lookups (like ip_lookup), but those are global functions.
I do not know of an an alternative way to optionally pass things into, per call

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it not feasible to statically analyze the wirefilter body at say startup and determine if the body is needed? I worry this could maybe have a perf impact for folks with existing filters?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it not feasible to statically analyze the wirefilter body at say startup and determine if the body is needed?

It is not efficient. Optional data is implemented as external functions that are imported. However, that doesn't allow asynchronicity so this will take extra thought.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The simplest solution that comes to mind (note that I'm not advocating for it, I'm just contributing it for brainstorming purposes) would be to create a new rule type that is the same as a filter rule only it adds reading the body (e.g. FilterWithBody). It's not the most ergonomic solution from a DX perspective, because it requires people to notice that they're adding matching on a body, but if we tailored the error logs with this in mind it wouldn't be too bad.

@blaine-arcjet
Copy link
Contributor

I'm going to convert this to a draft while we consider how to resolve the concerns @qw-in and I have about the performance hit on each filter rule (even without using body)

@blaine-arcjet blaine-arcjet marked this pull request as draft December 5, 2025 19:13
@wooorm-arcjet
Copy link
Member Author

wooorm-arcjet commented Dec 8, 2025

Yes, it is possible to only read the body if needed, by:

  • a) calling into WebAssembly once when filter() is called, analyzing all expressions, this I just implemented
  • b) asking users to define this, either with localFilter() or so which is like what @arcjet-rei proposes, or some other flag, which would not be difficult to implement
  • c) for the WebAssembly to be able to callback into JavaScript to read the body if needed, which may be difficult to implement.

Although b) would theoretically be fastest (only one call into webassembly and no calls out of it), I think c) is the ideal scenario, and a) may be good enough until c) is feasible.

Real world benchmarks are difficult, because giant bodies will be slow to read and take much more time than calling into/out of WebAssembly, if for example this CVE WAF idea would be automatically applied to every Arcjet user.

If 1 out of 10 filter()s in the wild would use body, then analyzing makes sense. If 9 out of 10 filters() use body, then not analyzing and just reading body makes more sense.

Running the benchmark I commited, the results on my machine are:

bench (simple): `10000` iterations, total time: `3947.35ms`, time per iteration: `0.39ms`
bench (read the body every 10th iteration): `10000` iterations, total time: `4685.44ms`, time per iteration: `0.47ms`
bench (analyze every expression, read the body every 10th time): `10000` iterations, total time: `8884.36ms`, time per iteration: `0.89ms`
bench (always read the body): `10000` iterations, total time: `16073.67ms`, time per iteration: `1.61ms`
  • “Simple” is the baseline, the current state. No body support.
  • The second is like @arcjet-rei’s idea, b) above. The real performance depends on how slow bodies are and how often body is needed, this case is 1 in 10.
  • The third is the current code, a) above. Same real perf notes apply.
  • Last would be idea b) if every expression needs the body

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants