Ts native parser #361

Draft
CaelmBleidd wants to merge 16 commits into neo from caelmbleidd/ts_native_parser

@CaelmBleidd

Member

A new native TS parser frontend that replaces the ArkAnalyzer-based one.

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@github-actions

github-actions Bot commented Jul 2, 2026

Contributor

Test Results

  219 files  +1    219 suites  +1   16m 42s ⏱️ +38s
  751 tests +7    739 ✅ +7  12 💤 ±0  0 ❌ ±0 
2 021 runs  +7  1 971 ✅ +7  50 💤 ±0  0 ❌ ±0 

Results for commit 525637f. ± Comparison against base commit 15728ba.

♻️ This comment has been updated with latest results.

Copilot AI left a comment

Pull request overview

Copilot reviewed 46 out of 47 changed files in this pull request and generated 1 comment.

Files not reviewed (1)
  • jacodb-ets/ts-frontend/package-lock.json: Generated file

Comment thread jacodb-ets/ts-frontend/src/validate.ts
@CaelmBleidd CaelmBleidd force-pushed the caelmbleidd/ts_native_parser branch from dab6123 to 8257858 Compare July 3, 2026 12:29
…checker, minimal CLI

Start of the native TypeScript frontend that replaces the ArkAnalyzer
dependency: TS interfaces exactly mirroring the Kotlin EtsIR DTOs
(wire format of EtsFileDto.loadFromJson), a strict-JSON serializer,
an invariant checker enforcing the contracts required by Convert.kt
(block ids, if-successor order, Local-only instances, reserved names),
and a CLI stub emitting a valid %dflt skeleton. Covered by vitest unit
tests and a Kotlin-side smoke test (EtsTsFrontendTest) that runs the
CLI via node and deserializes/converts its output.
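
The kind of contract the invariant checker enforces can be sketched roughly as below. This is an illustration only, not the code in validate.ts: the `BlockSketch` and `StmtSketch` shapes and the `checkCfg` function are hypothetical, and the concrete rules shown (unique block ids, known successors, exactly two successors after an `if`) are assumptions about what "block ids" and "if-successor order" cover.

```typescript
// Hypothetical, simplified shapes; the real EtsIR DTOs are richer.
interface StmtSketch {
  kind: "assign" | "if" | "return" | "goto";
}
interface BlockSketch {
  id: number;
  successors: number[];
  stmts: StmtSketch[];
}

// Collects human-readable violations instead of throwing,
// so a caller can report all of them at once.
function checkCfg(blocks: BlockSketch[]): string[] {
  const violations: string[] = [];
  const ids = new Set<number>();
  for (const b of blocks) {
    if (ids.has(b.id)) violations.push(`duplicate block id ${b.id}`);
    ids.add(b.id);
  }
  for (const b of blocks) {
    for (const s of b.successors) {
      if (!ids.has(s)) violations.push(`block ${b.id}: unknown successor ${s}`);
    }
    const last = b.stmts[b.stmts.length - 1];
    // Assumed rule: a block ending in `if` has exactly two successors,
    // because the consumer relies on their order.
    if (last !== undefined && last.kind === "if" && b.successors.length !== 2) {
      violations.push(`block ${b.id}: if-terminated block must have 2 successors`);
    }
  }
  return violations;
}
```

Returning a list rather than failing on the first problem makes it easier to assert "zero invariant violations" over a whole corpus.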
TypeConverter maps both syntactic annotations (ts.TypeNode) and inferred
checker types (ts.Type) into the EtsIR type DTOs: primitives, literal
types (bare JSON primitives), arrays with folded dimensions, tuples,
unions/intersections, project classes/interfaces/enums as ClassType with
namespace chains, enum members as EnumValueType, function types as
FunctionType, generics, alias following, and UnclearReferenceType /
UnknownType degradation for ambient or exotic types. Conventions checked
against ArkAnalyzer ground truth. Plus an in-memory compiler-host test
harness used by all upcoming lowering tests.
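
As a rough illustration of "arrays with folded dimensions": instead of nesting array types, an array of arrays keeps the innermost element type and bumps a dimension counter. The `TypeSketch` shape and the `elementType`/`dimensions` field names below are hypothetical stand-ins for the real DTOs.

```typescript
// Hypothetical type model; only what is needed to show the folding.
type TypeSketch =
  | { kind: "number" }
  | { kind: "string" }
  | { kind: "array"; elementType: TypeSketch; dimensions: number };

// Wraps `element` in an array type. If `element` is already an array,
// the result reuses its element type and increments the dimension count.
function arrayOf(element: TypeSketch): TypeSketch {
  if (element.kind === "array") {
    return {
      kind: "array",
      elementType: element.elementType,
      dimensions: element.dimensions + 1,
    };
  }
  return { kind: "array", elementType: element, dimensions: 1 };
}
```

Under these assumptions, `number[][]` becomes a single array type over `number` with `dimensions` equal to 2.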
Real parsing pipeline: fileBuilder lowers a source file into the %dflt
class (loose top-level statements -> %dflt method, top-level functions ->
methods), methodBuilder tracks locals/%N temps and emits the ArkAnalyzer
prologue (params, then this := ThisRef), exprLowering translates
expressions with strict value discipline (immediate operands/args,
Local-only call instances, NewExpr + constructor-call pairs, NewArrayExpr
with indexed stores, ConditionExpr for comparisons, template-literal
concat chains), and a CFG builder with symbolic labels/terminators lays
the groundwork for control flow. Unsupported constructs degrade to Raw*
fallbacks with diagnostics. Verified by 20 new lowering specs and a
Kotlin end-to-end test converting and linearizing the produced IR.
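
The "strict value discipline" can be pictured with a small sketch: every nested expression is evaluated into a `%N` temp so that operands are always immediate. The `Expr` model, the `MethodSketch` class and the string form of the statements are hypothetical and much simpler than the real methodBuilder/exprLowering.

```typescript
// Hypothetical expression model for illustration.
type Expr =
  | { kind: "local"; name: string }
  | { kind: "const"; value: number }
  | { kind: "binop"; op: string; left: Expr; right: Expr };

class MethodSketch {
  readonly stmts: string[] = [];
  private nextTemp = 0;

  // Returns an immediate operand (local name, constant, or %N temp)
  // that holds the value of `e`, emitting assignments as needed.
  lower(e: Expr): string {
    if (e.kind === "local") return e.name;
    if (e.kind === "const") return String(e.value);
    const left = this.lower(e.left);
    const right = this.lower(e.right);
    const temp = `%${this.nextTemp++}`;
    this.stmts.push(`${temp} := ${left} ${e.op} ${right}`);
    return temp;
  }
}
```

For `a + b * c` this sketch emits `%0 := b * c` followed by `%1 := a + %0`, so no statement ever has a compound operand.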
if/else, ternary diamonds, while/do-while/for loops with proper back
edges and continue-to-update semantics, for-of via the iterator protocol
(Symbol.iterator/next/done/value — the ArkAnalyzer shape), for-in via
Object.keys indexing, switch as a === comparison chain with fallthrough
bodies, break/continue including labeled variants, unreachable-code
elimination. Truthiness normalized the ArkAnalyzer way (bool != false /
value != 0), negation swaps branch targets, logical &&/||/?? stay
value-level binops. Kotlin end-to-end test linearizes a program
exercising every construct.
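
A minimal sketch of the two condition rules mentioned above, truthiness normalization and negation by swapping branch targets. The `Cond` and `Branch` shapes and the `lowerCondition` name are hypothetical; the real CFG builder works on symbolic labels and DTOs rather than strings.

```typescript
// Hypothetical condition model: a plain value or a logical negation.
type Cond =
  | { kind: "value"; name: string; isBoolean: boolean }
  | { kind: "not"; operand: Cond };

interface Branch {
  condition: string;
  thenLabel: string;
  elseLabel: string;
}

function lowerCondition(c: Cond, thenLabel: string, elseLabel: string): Branch {
  // `!x` needs no extra statement: lower `x` with the targets swapped.
  if (c.kind === "not") return lowerCondition(c.operand, elseLabel, thenLabel);
  // Truthiness test: booleans compare against false, other values against 0.
  const rhs = c.isBoolean ? "false" : "0";
  return { condition: `${c.name} != ${rhs}`, thenLabel, elseLabel };
}
```

With this scheme `if (!n)` for a numeric `n` yields the condition `n != 0` with the then/else targets exchanged.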
ClassBuilder lowers class-like declarations with full ArkAnalyzer
conventions: synthesized %instInit/%statInit initializer methods,
constructors shaped as `this; %instInit(); body; return this`
(synthesized when absent), parameter properties as fields + ctor
assignments, static blocks folded into %statInit, modifiers bitmask,
decorators, heritage clauses. Interfaces become category-2 classes with
bodyless methods; enums become category-3 classes with STATIC
EnumValueType fields valued in %statInit (auto-increment and string
enums via checker constants); namespaces recurse into NamespaceDto with
their own %dflt classes. Covered by 12 TS specs and a Kotlin end-to-end
conversion test.
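
The constructor shape described above can be sketched as a simple statement template. Statements are plain strings here and `buildConstructor` is a hypothetical helper; reading the leading `this` as the `this := ThisRef` prologue and rendering the initializer call as `this.%instInit()` are assumptions.

```typescript
// Assembles a constructor body in the `this; %instInit(); body; return this` shape.
// An omitted `userBody` models the synthesized constructor of a class without one.
function buildConstructor(userBody: string[] = []): string[] {
  return [
    "this := ThisRef", // prologue: bind the receiver (assumed rendering)
    "this.%instInit()", // run field initializers before user code
    ...userBody, // statements lowered from the written constructor, if any
    "return this",
  ];
}
```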
The DTO has no trap tables, so exception edges cannot be expressed;
ArkAnalyzer simply dropped catch blocks. We instead keep handlers
analyzable: a synthetic nondeterministic branch guards try vs catch,
catch entry binds `e := CaughtExceptionRef`, and both paths join on the
shared finally block. Import infos cover default/named(as)/namespace/
side-effect imports; export infos cover exported declarations (typed as
CLASS/METHOD/TYPE/LOCAL/NAMESPACE), named re-exports with nameBeforeAs +
exportFrom, star re-exports, and export default (checker-resolved).
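
The try/catch/finally modelling can be pictured as four blocks. This sketch is hypothetical in its names (`BlockSketch`, `lowerTry`, the labels, the `<nondet>` condition text) and only mirrors the description: a nondeterministic guard chooses between try and catch, the catch entry binds the exception, and both paths join on the finally block.

```typescript
interface BlockSketch {
  label: string;
  stmts: string[];
  successors: string[];
}

function lowerTry(
  tryBody: string[],
  catchVar: string,
  catchBody: string[],
  finallyBody: string[],
): BlockSketch[] {
  return [
    // Synthetic branch: an analysis may take either edge.
    { label: "guard", stmts: ["if <nondet>"], successors: ["try", "catch"] },
    { label: "try", stmts: tryBody, successors: ["finally"] },
    {
      label: "catch",
      stmts: [`${catchVar} := CaughtExceptionRef`, ...catchBody],
      successors: ["finally"],
    },
    { label: "finally", stmts: finallyBody, successors: [] },
  ];
}
```

Without trap tables this loses the precise points at which an exception can be raised, but it keeps the handler code reachable for analyses.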
…chaining, corpus

Closures lift into %AM<n>$<method> methods on the file's %dflt class
with FunctionType-typed use-site locals (ArkAnalyzer shape); object
literals become %AC<n> anonymous classes (category OBJECT) with fields,
literal methods and per-property stores; object/array destructuring
(renames, defaults via ===undefined diamonds, nesting, holes, for-of
patterns) unpacks through field/array refs; optional chaining guards
accesses and calls with null-check diamonds; super()/super.m() lower as
instance calls on `this`; yield lowers to YieldExpr. New corpus of
realistic plain TS/JS fixtures must lower with zero invariant violations
and zero raw fallbacks.
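
To illustrate a "null-check diamond" for optional chaining, here is a small sketch that lowers `receiver?.field` into linear statements with labels. The function name, the label names, the `== null` test and the `undefined` result on the null path are assumptions made for illustration.

```typescript
// Lowers `receiver?.field` into a diamond: one path reads the field,
// the other yields undefined, and both assign the same temp.
function lowerOptionalAccess(receiver: string, field: string, temp: string): string[] {
  return [
    `if ${receiver} == null goto nullish`,
    `${temp} := ${receiver}.${field}`,
    "goto join",
    "nullish:",
    `${temp} := undefined`,
    "join:",
  ];
}
```

After the join, later statements can use the temp as an ordinary local regardless of which path was taken.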
… integration

EtsIrProvider (TS_FRONTEND default, ARKANALYZER legacy) selects the IR
generator in LoadEtsFile.kt; both providers stay fully supported and are
switchable via the ETS_IR_PROVIDER env var or an explicit parameter.
The CLI gains -p/--multi directory modes (one shared ts.Program,
per-file JSONs mirroring the input tree, cross-file project signatures,
.ets parsed as TS). Gradle wires npm: installTsFrontend/buildTsFrontend/
testTsFrontend tasks, :jacodb-ets:test depends on the frontend build,
and generateTestResources now defaults to ts-frontend (arkanalyzer via
ETS_IR_PROVIDER=arkanalyzer). Static-method `this.f` now lowers to
StaticFieldRef and `export * as ns` carries nameBeforeAs "*" — with
these fixes the ENTIRE existing jacodb-ets test suite (EtsFileTest,
EtsFromJsonTest, EnumTest, EtsImport/ExportTest) passes on the native
frontend with no ArkAnalyzer installed.
ci-ets now sets up Node 20 (npm-cached), builds and unit-tests the
ts-frontend before running the Gradle suite; generateTestResources and
all ETS tests run on the native frontend by default, while the retained
ArkAnalyzer setup feeds a new provider-parity test that keeps the legacy
provider selectable and verified. ts-frontend gets a README covering the
CLI, architecture, IR conventions and provider env vars; ARKANALYZER.md
is marked as the legacy provider with switching instructions.
Prerequisites, quick start (single-file and project modes), Kotlin API
usage examples (loadEts*AutoConvert, generateEtsIR, provider selection,
frontend location knobs), complete CLI reference with exit codes and
robustness guarantees, supported-features / degradation lists, IR
conventions, development workflow and a troubleshooting table.
…effort

The ArkAnalyzer checkout installs a floating latest @types/node whose new
d.ts files (ffi.d.ts) are unparseable by ArkAnalyzer's bundled TypeScript,
failing `npm run build` and killing the whole job before any Gradle test
runs. Pin @types/node@18 for the legacy build, mark the step
continue-on-error (the default provider is the bundled ts-frontend, so
upstream breakage must not block CI), and export ARKANALYZER_DIR only
after a successful build so the parity test skips cleanly instead of
using a broken checkout.
The follow-up `npm install --no-save @types/node@18` re-resolved the npm
tree and pruned ohos-typescript (added by ArkAnalyzer's postinstall with
--no-save, so absent from package.json), failing the legacy build with
TS2307 and skipping the provider-parity test. Pin @types/node with
`npm pkg set` before the single `npm install` instead — verified locally
to produce a working serializeArkIR.js.
…tics, this-params, nested functions

Address the review comment and self-review findings:
- validate.ts rejects expr-kind PtrCallExpr.ptr (Kotlin Convert casts it
  to EtsValue, so an expr would throw a ClassCastException);
- prefix ++/-- on field/array targets returned the OLD value; now
  lowered as `%old := ref; %new := %old ++; ref := %new` returning %new
  for prefix and %old for postfix;
- astUtils.buildParameters skips TS fake `this` parameters so they no
  longer shift ParameterRef indices or shadow the `this` local;
- nested function declarations were silently dropped; they are now
  lifted onto the file's %dflt class under their real name, matching how
  call sites already resolve them;
- generateTestResources fails the build on generator timeout/non-zero
  exit instead of silently producing partial resources.
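
The increment fix from the list above can be sketched as follows. `lowerIncrement` and `freshTemp` are hypothetical names, statements are plain strings, and the update is rendered as `+ 1` / `- 1` for readability, which may differ from how the frontend encodes it.

```typescript
interface IncrementLowering {
  stmts: string[];
  result: string; // operand that represents the value of the whole expression
}

function lowerIncrement(
  ref: string,
  op: "++" | "--",
  isPrefix: boolean,
  freshTemp: () => string,
): IncrementLowering {
  const oldTemp = freshTemp();
  const newTemp = freshTemp();
  const arith = op === "++" ? "+" : "-";
  return {
    stmts: [
      `${oldTemp} := ${ref}`, // read the field/array element once
      `${newTemp} := ${oldTemp} ${arith} 1`,
      `${ref} := ${newTemp}`, // write the updated value back
    ],
    // Prefix yields the updated value, postfix the original one.
    result: isPrefix ? newTemp : oldTemp,
  };
}
```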
@CaelmBleidd CaelmBleidd force-pushed the caelmbleidd/ts_native_parser branch from 8257858 to cbc8e5e Compare July 3, 2026 12:33
Found by running the frontend on vuejs/core (465 files, ~12k methods):
Kotlin's ensureOneAddress accepts only exprs/refs/values and fails with
"Expected EtsValue, but got EtsRawEntity" when a raw fallback value sits
in an operand position (ref-store RHS, call argument, nested operand) —
179 such placements across the vue corpus, e.g. spread elements stored
straight into ArrayRefs. The only safe position for a raw value is the
RHS of a Local assignment, so lowerExpr/spreadFallback now hoist every
raw fallback into a `%N := <raw>` temp, and validate.ts enforces the
invariant. With this fix the whole vue-core IR converts and linearizes
on the JVM side with zero failures (204k stmts).
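
The hoisting rule can be sketched like this. The `Value` model, the `hoistRaw` name and the `<raw ...>` rendering are hypothetical; the point is only that a raw fallback is first assigned to a temp Local and the Local is what ends up in operand position.

```typescript
type Value =
  | { kind: "local"; name: string }
  | { kind: "raw"; text: string };

// Returns a value that is safe in any operand position. For a raw fallback
// it appends `%N := <raw>` to `stmts` and returns the temp instead.
function hoistRaw(value: Value, stmts: string[], freshTemp: () => string): Value {
  if (value.kind !== "raw") return value;
  const temp = freshTemp();
  stmts.push(`${temp} := <raw ${value.text}>`);
  return { kind: "local", name: temp };
}
```

A caller lowering a spread element into an array store would pass the element through such a helper before building the store statement.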
Previously finally code was emitted only on the fall-through path, so a
try body that always exits abruptly (e.g. `try { return this.fn() }
finally { ... }` in vue's ReactiveEffect.run) lost its finally entirely.
Track a stack of enclosing finally scopes and emit a copy (innermost
first) before every return/throw and before break/continue that leaves
the try; jumps staying inside the try (e.g. breaking a loop nested in
it) do not duplicate. Return values are snapshotted into a temp before
the finally runs so its mutations cannot change what is returned, and a
finally body being emitted masks its own scope so nested abrupt exits
only re-run outer finallies. Verified on vue-core: finally code is back
in the IR and the whole project still converts and linearizes on the
JVM (204k stmts, 0 failures).
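
A minimal sketch of the finally-scope stack, assuming finally bodies are modelled as callbacks that emit statements. `BuilderSketch`, `withFinally` and `emitReturn` are hypothetical names, only `return` is shown, and the normal fall-through copy of the finally body is left out to keep the example short.

```typescript
type Emit = (b: BuilderSketch) => void;

class BuilderSketch {
  readonly stmts: string[] = [];
  private finallyStack: Emit[] = [];
  private nextTemp = 0;

  // Lowers `try { body } finally { finalizer }` (abrupt exits only).
  withFinally(finalizer: Emit, body: Emit): void {
    this.finallyStack.push(finalizer);
    body(this);
    this.finallyStack.pop();
  }

  emitReturn(value: string): void {
    // Snapshot first, so the finally code cannot change what is returned.
    const snapshot = `%${this.nextTemp++}`;
    this.stmts.push(`${snapshot} := ${value}`);
    const saved = this.finallyStack;
    for (let i = saved.length - 1; i >= 0; i--) {
      // Mask scope i and everything inside it while its body is emitted,
      // so an abrupt exit in that body only re-runs outer finallies.
      this.finallyStack = saved.slice(0, i);
      const finalizer = saved[i];
      if (finalizer) finalizer(this);
    }
    this.finallyStack = saved;
    this.stmts.push(`return ${snapshot}`);
  }
}
```

For a return nested in two try/finally scopes this emits the inner finally copy, then the outer one, then the return of the snapshotted value.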

Copilot AI left a comment

Pull request overview

Copilot reviewed 46 out of 47 changed files in this pull request and generated 2 comments.

Files not reviewed (1)
  • jacodb-ets/ts-frontend/package-lock.json: Generated file

Comment thread jacodb-ets/ts-frontend/src/validate.ts
Comment thread .github/workflows/ci.yml Outdated
…yml ref typo

The raw-value whitelist in validate.ts now strips a CastExpr on the
AssignStmt LHS before checking for a Local target, mirroring Kotlin
Convert — `CastExpr(Local) := <raw>` is a legal Local assignment and
must not be flagged. Also fix the `'ref/heads/neo'` typo (missing `s`)
in the cache-read-only conditions of ci-core/ci-ets so Gradle cache
writes are actually enabled on the neo branch.
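
The CastExpr part of this change can be sketched as below. The `Target` model and both function names are hypothetical, and stripping exactly one cast level is an assumption based on the wording above.

```typescript
type Target =
  | { kind: "local"; name: string }
  | { kind: "cast"; operand: Target; type: string }
  | { kind: "fieldRef"; instance: string; field: string };

// Removes one CastExpr wrapper from an assignment target, if present.
function stripCast(target: Target): Target {
  return target.kind === "cast" ? target.operand : target;
}

// A raw value is only allowed as the RHS of an assignment whose
// (cast-stripped) LHS is a Local.
function isLegalRawAssignment(lhs: Target): boolean {
  return stripCast(lhs).kind === "local";
}
```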
@CaelmBleidd CaelmBleidd requested a review from Damtev July 3, 2026 15:19
2 participants