Ts native parser #361
Draft
CaelmBleidd wants to merge 16 commits into
…checker, minimal CLI
Start of the native TypeScript frontend that replaces the ArkAnalyzer dependency: TS interfaces exactly mirroring the Kotlin EtsIR DTOs (wire format of EtsFileDto.loadFromJson), a strict-JSON serializer, an invariant checker enforcing the contracts required by Convert.kt (block ids, if-successor order, Local-only instances, reserved names), and a CLI stub emitting a valid %dflt skeleton. Covered by vitest unit tests and a Kotlin-side smoke test (EtsTsFrontendTest) that runs the CLI via node and deserializes/converts its output.
TypeConverter maps both syntactic annotations (ts.TypeNode) and inferred checker types (ts.Type) into the EtsIR type DTOs: primitives, literal types (bare JSON primitives), arrays with folded dimensions, tuples, unions/intersections, project classes/interfaces/enums as ClassType with namespace chains, enum members as EnumValueType, function types as FunctionType, generics, alias following, and UnclearReferenceType / UnknownType degradation for ambient or exotic types. Conventions checked against ArkAnalyzer ground truth. Plus an in-memory compiler-host test harness used by all upcoming lowering tests.
Real parsing pipeline: fileBuilder lowers a source file into the %dflt class (loose top-level statements -> %dflt method, top-level functions -> methods), methodBuilder tracks locals/%N temps and emits the ArkAnalyzer prologue (params, then this := ThisRef), exprLowering translates expressions with strict value discipline (immediate operands/args, Local-only call instances, NewExpr + constructor-call pairs, NewArrayExpr with indexed stores, ConditionExpr for comparisons, template-literal concat chains), and a CFG builder with symbolic labels/terminators lays the groundwork for control flow. Unsupported constructs degrade to Raw* fallbacks with diagnostics. Verified by 20 new lowering specs and a Kotlin end-to-end test converting and linearizing the produced IR.
if/else, ternary diamonds, while/do-while/for loops with proper back edges and continue-to-update semantics, for-of via the iterator protocol (Symbol.iterator/next/done/value — the ArkAnalyzer shape), for-in via Object.keys indexing, switch as a === comparison chain with fallthrough bodies, break/continue including labeled variants, unreachable-code elimination. Truthiness normalized the ArkAnalyzer way (bool != false / value != 0), negation swaps branch targets, logical &&/||/?? stay value-level binops. Kotlin end-to-end test linearizes a program exercising every construct.
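The truthiness normalization and target-swapping negation described above can be sketched roughly as follows. This is a minimal illustration, not the frontend's code: the `Operand`, `Branch`, and `Test` shapes and the `lowerTest` helper are hypothetical names invented for this example, and the real EtsIR DTOs carry more detail.

```typescript
// Hypothetical, simplified shapes (assumptions for illustration only).
type Operand = { kind: "local"; name: string; type: "boolean" | "number" };
type Condition = {
  op: "!=";
  left: Operand;
  right: { kind: "const"; value: "false" | "0" };
};
interface Branch {
  cond: Condition;
  ifTrue: string;  // symbolic label taken when the condition holds
  ifFalse: string; // symbolic label taken otherwise
}

// A condition is either a plain value or the negation of another condition.
type Test = { kind: "value"; operand: Operand } | { kind: "not"; inner: Test };

function lowerTest(test: Test, ifTrue: string, ifFalse: string): Branch {
  if (test.kind === "not") {
    // Negation emits no instruction of its own: it only swaps the targets.
    return lowerTest(test.inner, ifFalse, ifTrue);
  }
  // Booleans compare against false, everything else against 0.
  const zero = test.operand.type === "boolean" ? "false" : "0";
  return {
    cond: { op: "!=", left: test.operand, right: { kind: "const", value: zero } },
    ifTrue,
    ifFalse,
  };
}
```

Under these assumptions, `!flag` with targets `then`/`else` yields the condition `flag != false` with the two labels exchanged, so no separate negation statement appears in the IR.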
ClassBuilder lowers class-like declarations with full ArkAnalyzer conventions: synthesized %instInit/%statInit initializer methods, constructors shaped as `this; %instInit(); body; return this` (synthesized when absent), parameter properties as fields + ctor assignments, static blocks folded into %statInit, modifiers bitmask, decorators, heritage clauses. Interfaces become category-2 classes with bodyless methods; enums become category-3 classes with STATIC EnumValueType fields valued in %statInit (auto-increment and string enums via checker constants); namespaces recurse into NamespaceDto with their own %dflt classes. Covered by 12 TS specs and a Kotlin end-to-end conversion test.
The DTO has no trap tables, so exception edges cannot be expressed; ArkAnalyzer simply dropped catch blocks. We instead keep handlers analyzable: a synthetic nondeterministic branch guards try vs catch, catch entry binds `e := CaughtExceptionRef`, and both paths join on the shared finally block.
Import infos cover default/named(as)/namespace/side-effect imports; export infos cover exported declarations (typed as CLASS/METHOD/TYPE/LOCAL/NAMESPACE), named re-exports with nameBeforeAs + exportFrom, star re-exports, and export default (checker-resolved).
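The shape of the try/catch/finally lowering described above can be pictured as a four-block diamond. The sketch below is only an illustration under assumptions: the `Block` interface, the `lowerTryCatchFinally` helper, the string statements, and the choice to list the try successor before the catch successor are all hypothetical.

```typescript
// Hypothetical, simplified basic block (not the real EtsIR block DTO).
interface Block {
  id: number;
  stmts: string[];
  successors: number[];
}

function lowerTryCatchFinally(
  tryStmts: string[],
  catchVar: string,
  catchStmts: string[],
  finallyStmts: string[],
): Block[] {
  // Synthetic guard: a nondeterministic branch picks the try or the catch path.
  // Successor order (try first, catch second) is an assumption of this sketch.
  const guard: Block = { id: 0, stmts: ["if <nondet>"], successors: [1, 2] };
  const tryBlock: Block = { id: 1, stmts: [...tryStmts], successors: [3] };
  // Catch entry binds the exception variable before the handler body runs.
  const catchBlock: Block = {
    id: 2,
    stmts: [`${catchVar} := CaughtExceptionRef`, ...catchStmts],
    successors: [3],
  };
  // Both paths join on one shared finally block.
  const finallyBlock: Block = { id: 3, stmts: [...finallyStmts], successors: [] };
  return [guard, tryBlock, catchBlock, finallyBlock];
}
```

The point of the design is that the handler stays reachable for analyses even though no exception edge exists in the DTO; the price is that the guard over-approximates when the catch path can actually be taken.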
…chaining, corpus
Closures lift into %AM<n>$<method> methods on the file's %dflt class with FunctionType-typed use-site locals (ArkAnalyzer shape); object literals become %AC<n> anonymous classes (category OBJECT) with fields, literal methods and per-property stores; object/array destructuring (renames, defaults via ===undefined diamonds, nesting, holes, for-of patterns) unpacks through field/array refs; optional chaining guards accesses and calls with null-check diamonds; super()/super.m() lower as instance calls on `this`; yield lowers to YieldExpr. New corpus of realistic plain TS/JS fixtures must lower with zero invariant violations and zero raw fallbacks.
… integration
EtsIrProvider (TS_FRONTEND default, ARKANALYZER legacy) selects the IR generator in LoadEtsFile.kt; both providers stay fully supported and are switchable via the ETS_IR_PROVIDER env var or an explicit parameter. The CLI gains -p/--multi directory modes (one shared ts.Program, per-file JSONs mirroring the input tree, cross-file project signatures, .ets parsed as TS). Gradle wires npm: installTsFrontend/buildTsFrontend/testTsFrontend tasks, :jacodb-ets:test depends on the frontend build, and generateTestResources now defaults to ts-frontend (arkanalyzer via ETS_IR_PROVIDER=arkanalyzer). Static-method `this.f` now lowers to StaticFieldRef and `export * as ns` carries nameBeforeAs "*". With these fixes the ENTIRE existing jacodb-ets test suite (EtsFileTest, EtsFromJsonTest, EnumTest, EtsImport/ExportTest) passes on the native frontend with no ArkAnalyzer installed.
ci-ets now sets up Node 20 (npm-cached), builds and unit-tests the ts-frontend before running the Gradle suite; generateTestResources and all ETS tests run on the native frontend by default, while the retained ArkAnalyzer setup feeds a new provider-parity test that keeps the legacy provider selectable and verified. ts-frontend gets a README covering the CLI, architecture, IR conventions and provider env vars; ARKANALYZER.md is marked as the legacy provider with switching instructions.
Prerequisites, quick start (single-file and project modes), Kotlin API usage examples (loadEts*AutoConvert, generateEtsIR, provider selection, frontend location knobs), complete CLI reference with exit codes and robustness guarantees, supported-features / degradation lists, IR conventions, development workflow and a troubleshooting table.
…effort
The ArkAnalyzer checkout installs a floating latest @types/node whose new d.ts files (ffi.d.ts) are unparseable by ArkAnalyzer's bundled TypeScript, failing `npm run build` and killing the whole job before any Gradle test runs. Pin @types/node@18 for the legacy build, mark the step continue-on-error (the default provider is the bundled ts-frontend, so upstream breakage must not block CI), and export ARKANALYZER_DIR only after a successful build so the parity test skips cleanly instead of using a broken checkout.
The follow-up `npm install --no-save @types/node@18` re-resolved the npm tree and pruned ohos-typescript (added by ArkAnalyzer's postinstall with --no-save, so absent from package.json), failing the legacy build with TS2307 and skipping the provider-parity test. Pin @types/node with `npm pkg set` before the single `npm install` instead — verified locally to produce a working serializeArkIR.js.
…tics, this-params, nested functions
Address the review comment and self-review findings:
- validate.ts rejects expr-kind PtrCallExpr.ptr (Kotlin Convert casts it to EtsValue, so an expr would throw a ClassCastException);
- prefix ++/-- on field/array targets returned the OLD value; now lowered as `%old := ref; %new := %old ++; ref := %new` returning %new for prefix and %old for postfix;
- astUtils.buildParameters skips TS fake `this` parameters so they no longer shift ParameterRef indices or shadow the `this` local;
- nested function declarations were silently dropped; they are now lifted onto the file's %dflt class under their real name, matching how call sites already resolve them;
- generateTestResources fails the build on generator timeout/non-zero exit instead of silently producing partial resources.
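The corrected ++/-- lowering on field/array targets can be illustrated with a small sketch. The three-statement sequence follows the commit message; the `TempAllocator` class, the `lowerIncDec` helper, and the use of plain strings as statements are hypothetical simplifications for this example.

```typescript
interface Lowered {
  stmts: string[]; // statements emitted for the update
  result: string;  // the temp that holds the value of the whole expression
}

// Hypothetical %N temp allocator (assumption: temps are numbered from 0).
class TempAllocator {
  private next = 0;
  fresh(): string {
    return `%${this.next++}`;
  }
}

function lowerIncDec(
  ref: string,
  op: "++" | "--",
  isPrefix: boolean,
  temps: TempAllocator,
): Lowered {
  const oldTemp = temps.fresh();
  const newTemp = temps.fresh();
  const stmts = [
    `${oldTemp} := ${ref}`,           // %old := ref
    `${newTemp} := ${oldTemp} ${op}`, // %new := %old ++
    `${ref} := ${newTemp}`,           // ref := %new
  ];
  // Prefix forms evaluate to the updated value, postfix forms to the old one.
  return { stmts, result: isPrefix ? newTemp : oldTemp };
}
```

With a fresh allocator, `++this.count` would evaluate to `%1` (the new value) and `this.count++` to `%0` (the old value), while the emitted store sequence is identical in both cases.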
Found by running the frontend on vuejs/core (465 files, ~12k methods): Kotlin's ensureOneAddress accepts only exprs/refs/values and fails with "Expected EtsValue, but got EtsRawEntity" when a raw fallback value sits in an operand position (ref-store RHS, call argument, nested operand) — 179 such placements across the vue corpus, e.g. spread elements stored straight into ArrayRefs. The only safe position for a raw value is the RHS of a Local assignment, so lowerExpr/spreadFallback now hoist every raw fallback into a `%N := <raw>` temp, and validate.ts enforces the invariant. With this fix the whole vue-core IR converts and linearizes on the JVM side with zero failures (204k stmts).
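The hoisting rule above (a raw fallback may only appear as the RHS of a Local assignment) can be sketched as follows. The `Value` and `Assign` shapes and the `MethodContext.asOperand` helper are hypothetical names for this illustration; the real lowerExpr/spreadFallback code is not shown in this conversation.

```typescript
// Hypothetical, simplified value model (assumption for illustration).
type Value =
  | { kind: "local"; name: string }
  | { kind: "const"; text: string }
  | { kind: "raw"; text: string }; // unsupported construct kept as raw text

interface Assign {
  lhs: { kind: "local"; name: string };
  rhs: Value;
}

class MethodContext {
  readonly stmts: Assign[] = [];
  private nextTemp = 0;

  // Returns a value that is safe in an operand position
  // (ref-store RHS, call argument, nested operand).
  asOperand(value: Value): Value {
    if (value.kind !== "raw") return value;
    // Hoist the raw fallback into `%N := <raw>` and use the temp instead.
    const temp = { kind: "local" as const, name: `%${this.nextTemp++}` };
    this.stmts.push({ lhs: temp, rhs: value });
    return temp;
  }
}
```

Routing every operand through one such helper keeps the invariant local to a single place, which is also what makes it cheap to enforce again in a validator.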
Previously finally code was emitted only on the fall-through path, so a
try body that always exits abruptly (e.g. `try { return this.fn() }
finally { ... }` in vue's ReactiveEffect.run) lost its finally entirely.
Track a stack of enclosing finally scopes and emit a copy (innermost
first) before every return/throw and before break/continue that leaves
the try; jumps staying inside the try (e.g. breaking a loop nested in
it) do not duplicate. Return values are snapshotted into a temp before
the finally runs so its mutations cannot change what is returned, and a
finally body being emitted masks its own scope so nested abrupt exits
only re-run outer finallies. Verified on vue-core: finally code is back
in the IR and the whole project still converts and linearizes on the
JVM (204k stmts, 0 failures).
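A minimal sketch of the finally-scope stack described above, restricted to straight-line code and `return` exits. The `Lowering` class, its method names, and the string statements are hypothetical; break/continue handling and real CFG construction are left out.

```typescript
type Emit = (lowering: Lowering) => void;

class Lowering {
  readonly out: string[] = [];
  private finallyStack: Emit[] = []; // enclosing finally bodies, outermost first
  private nextTemp = 0;
  private reachable = true;

  withFinally(finallyBody: Emit, tryBody: Emit): void {
    this.finallyStack.push(finallyBody);
    tryBody(this);
    this.finallyStack.pop();
    // Fall-through copy, only if the try body did not always exit abruptly.
    if (this.reachable) finallyBody(this);
  }

  emitReturn(value: string): void {
    // Snapshot first so the finally code cannot change what is returned.
    const snapshot = `%${this.nextTemp++}`;
    this.out.push(`${snapshot} := ${value}`);
    this.runFinallies();
    this.out.push(`return ${snapshot}`);
    this.reachable = false;
  }

  private runFinallies(): void {
    const saved = this.finallyStack;
    for (let i = saved.length - 1; i >= 0; i--) {
      // Mask scope i (and everything inside it) while its body is emitted,
      // so an abrupt exit inside that body re-runs only outer finallies.
      this.finallyStack = saved.slice(0, i);
      const body = saved[i];
      if (body) body(this);
    }
    this.finallyStack = saved;
  }
}
```

For a return nested in two try/finally statements, this sketch emits the snapshot, then the inner finally copy, then the outer one, and only then the `return` of the snapshotted temp.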
…yml ref typo
The raw-value whitelist in validate.ts now strips a CastExpr on the AssignStmt LHS before checking for a Local target, mirroring Kotlin Convert: `CastExpr(Local) := <raw>` is a legal Local assignment and must not be flagged. Also fix the `'ref/heads/neo'` typo (missing `s`) in the cache-read-only conditions of ci-core/ci-ets so Gradle cache writes are actually enabled on the neo branch.
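The whitelist check can be pictured with the sketch below. The `Lhs` union and both helper names are hypothetical, and stripping exactly one cast level is an assumption of this example rather than something the commit message specifies.

```typescript
// Hypothetical, simplified assignment targets (assumption for illustration).
type Lhs =
  | { kind: "local"; name: string }
  | { kind: "cast"; inner: Lhs; type: string }
  | { kind: "fieldRef"; instance: string; field: string };

// Looks through a single CastExpr wrapper, if present.
function stripCast(lhs: Lhs): Lhs {
  return lhs.kind === "cast" ? lhs.inner : lhs;
}

// A raw fallback is only legal when the (cast-stripped) target is a Local.
function isLegalRawTarget(lhs: Lhs): boolean {
  return stripCast(lhs).kind === "local";
}
```

Under these assumptions, `CastExpr(Local) := <raw>` passes the check while a raw value stored into a field reference is still reported.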
A new native TS parser frontend that replaces ArkAnalyzer's.