[wip] js2py to quickjs#38473
Conversation
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request replaces the js2py library with quickjs-ng for executing JavaScript UDFs in Beam YAML, addressing compatibility issues with Python 3.12. The implementation introduces a _JsWrapper class that utilizes thread-local caching for JavaScript functions and handles data transfer via JSON. Feedback includes suggestions to move the thread-local cache to the module level for better idiomaticity, removing a redundant local import of quickjs, and improving the regex used to identify JavaScript function names to support async declarations.
| import itertools | ||
| import json | ||
| import re | ||
| import threading |
| global _THREAD_LOCAL_JS_CACHE | ||
| if '_THREAD_LOCAL_JS_CACHE' not in globals(): | ||
| globals()['_THREAD_LOCAL_JS_CACHE'] = threading.local() | ||
|
|
||
| def __setstate__(self, state): | ||
| self.__dict__.update(state) | ||
| cache = globals()['_THREAD_LOCAL_JS_CACHE'] | ||
| if not hasattr(cache, 'functions'): | ||
| cache.functions = {} |
There was a problem hiding this comment.
Simplify the cache access by using the module-level variable defined above. This avoids the use of globals() and the global keyword, which is cleaner and more idiomatic.
| global _THREAD_LOCAL_JS_CACHE | |
| if '_THREAD_LOCAL_JS_CACHE' not in globals(): | |
| globals()['_THREAD_LOCAL_JS_CACHE'] = threading.local() | |
| def __setstate__(self, state): | |
| self.__dict__.update(state) | |
| cache = globals()['_THREAD_LOCAL_JS_CACHE'] | |
| if not hasattr(cache, 'functions'): | |
| cache.functions = {} | |
| if not hasattr(_THREAD_LOCAL_JS_CACHE, 'functions'): | |
| _THREAD_LOCAL_JS_CACHE.functions = {} | |
| cache = _THREAD_LOCAL_JS_CACHE |
|
|
||
| cache_key = (self.source_code, self.entrypoint_name) | ||
| if cache_key not in cache.functions: | ||
| import quickjs |
|
|
||
| elif callable: | ||
| js_func = _CustomJsObjectWrapper(js2py.eval_js(callable)) | ||
| match = re.search(r'function\s+([a-zA-Z0-9_]+)\s*\(', callable) |
There was a problem hiding this comment.
The current regex for finding the function name is quite restrictive. It won't match async functions or functions with different spacing. A more robust regex would improve compatibility with various JavaScript function declaration styles.
| match = re.search(r'function\s+([a-zA-Z0-9_]+)\s*\(', callable) | |
| match = re.search(r'(?:async\s+)?function\s+([a-zA-Z0-9_]+)', callable) |
Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
addresses #123), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, commentfixes #<ISSUE NUMBER>instead.CHANGES.mdwith noteworthy changes.See the Contributor Guide for more tips on how to make review process smoother.
To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md
GitHub Actions Tests Status (on master branch)
See CI.md for more information about GitHub Actions CI or the workflows README to see a list of phrases to trigger workflows.