Skip to content

[wip] js2py to quickjs#38473

Draft
derrickaw wants to merge 3 commits into
apache:masterfrom
derrickaw:20260512_js2py2wasmer
Draft

[wip] js2py to quickjs#38473
derrickaw wants to merge 3 commits into
apache:masterfrom
derrickaw:20260512_js2py2wasmer

Conversation

@derrickaw
Copy link
Copy Markdown
Collaborator

  1. Try converting js2py to quickjs

Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

  • Mention the appropriate issue in your description (for example: addresses #123), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment fixes #<ISSUE NUMBER> instead.
  • Update CHANGES.md with noteworthy changes.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

See the Contributor Guide for more tips on how to make review process smoother.

To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md

GitHub Actions Tests Status (on master branch)

Build python source distribution and wheels
Python tests
Java tests
Go tests

See CI.md for more information about GitHub Actions CI or the workflows README to see a list of phrases to trigger workflows.

@derrickaw
Copy link
Copy Markdown
Collaborator Author

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request replaces the js2py library with quickjs-ng for executing JavaScript UDFs in Beam YAML, addressing compatibility issues with Python 3.12. The implementation introduces a _JsWrapper class that utilizes thread-local caching for JavaScript functions and handles data transfer via JSON. Feedback includes suggestions to move the thread-local cache to the module level for better idiomaticity, removing a redundant local import of quickjs, and improving the regex used to identify JavaScript function names to support async declarations.

import itertools
import json
import re
import threading
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Define the thread-local cache at the module level to avoid redundant global lookups and non-idiomatic lazy initialization within the class method.

Suggested change
import threading
import threading
_THREAD_LOCAL_JS_CACHE = threading.local()

Comment on lines +187 to +193
global _THREAD_LOCAL_JS_CACHE
if '_THREAD_LOCAL_JS_CACHE' not in globals():
globals()['_THREAD_LOCAL_JS_CACHE'] = threading.local()

def __setstate__(self, state):
self.__dict__.update(state)
cache = globals()['_THREAD_LOCAL_JS_CACHE']
if not hasattr(cache, 'functions'):
cache.functions = {}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Simplify the cache access by using the module-level variable defined above. This avoids the use of globals() and the global keyword, which is cleaner and more idiomatic.

Suggested change
global _THREAD_LOCAL_JS_CACHE
if '_THREAD_LOCAL_JS_CACHE' not in globals():
globals()['_THREAD_LOCAL_JS_CACHE'] = threading.local()
def __setstate__(self, state):
self.__dict__.update(state)
cache = globals()['_THREAD_LOCAL_JS_CACHE']
if not hasattr(cache, 'functions'):
cache.functions = {}
if not hasattr(_THREAD_LOCAL_JS_CACHE, 'functions'):
_THREAD_LOCAL_JS_CACHE.functions = {}
cache = _THREAD_LOCAL_JS_CACHE


cache_key = (self.source_code, self.entrypoint_name)
if cache_key not in cache.functions:
import quickjs
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The quickjs module is already imported at the top of the file (line 60). This local import is redundant.


elif callable:
js_func = _CustomJsObjectWrapper(js2py.eval_js(callable))
match = re.search(r'function\s+([a-zA-Z0-9_]+)\s*\(', callable)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The current regex for finding the function name is quite restrictive. It won't match async functions or functions with different spacing. A more robust regex would improve compatibility with various JavaScript function declaration styles.

Suggested change
match = re.search(r'function\s+([a-zA-Z0-9_]+)\s*\(', callable)
match = re.search(r'(?:async\s+)?function\s+([a-zA-Z0-9_]+)', callable)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant