String escaping fuzz bug with wasm-metadce #8482

@tlively

This is a fuzz bug where the fuzzer runs the CtorEval handler with test/lit/basic/name-high-bytes.wast as its initial contents.

test.wast (reduced):

(module
 (type $0 (func))
 (export "test\\c3\\a9_invoker" (func $0))
 (func $0 (type $0)
  (unreachable)
 )
)

graph.json produced by fuzz_opt.py's filter_exports (reduced):

[
  {
    "name": "outside",
    "reaches": ["export-test\\\\c3\\\\a9_invoker"],
    "root": true
  },
  {
    "name": "export-test\\\\c3\\\\a9_invoker",
    "export": "test\\\\c3\\\\a9_invoker"
  }
]

Here we have an export name containing backslashes. Note that they are escaped in the Wasm text format, so the actual unescaped byte content of the export name is test\c3\a9_invoker. But fuzz_opt.py is not unescaping the string it reads from the disassembly, and then it is JSON-encoding the escaped name, so graph.json ends up with doubly escaped backslashes.
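The double escaping described above can be reproduced in a few lines. This is only an illustrative sketch: the variable names are hypothetical and it does not use fuzz_opt.py's actual code, just Python's standard json module applied to the name as it appears in test.wast.

```python
import json

# Export name exactly as it appears between the quotes in test.wast (still escaped).
name_in_text = r"test\\c3\\a9_invoker"
# Actual export name once the text-format escape \\ is resolved to a single backslash.
actual_name = r"test\c3\a9_invoker"

# JSON-encoding the still-escaped name doubles every backslash a second time.
encoded = json.dumps(name_in_text)
print(encoded)  # "test\\\\c3\\\\a9_invoker"

# A reader that strips the quotes but skips JSON unescaping sees this name,
# which cannot match the real export name.
seen_without_unescaping = encoded[1:-1]
print(seen_without_unescaping == actual_name)  # False
```

Note that even a reader that does perform JSON unescaping would only get back the text-format-escaped name here, not the real one, which is why both halves of the pipeline matter.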

To make matters worse, wasm-metadce parses the input JSON in "ASCII" mode, which does not do any unescaping either. As a result, wasm-metadce roots an export named test\\\\c3\\\\a9_invoker, but the export's actual name is test\c3\a9_invoker, so the export is removed and the filtered module is empty.

This causes the fuzzer to fail when it later runs wasm-ctor-eval and passes it --kept-exports=test\c3\a9_invoker (note that it has unescaped the string for this step). This errors out because that export no longer exists.

IIUC, the proper fix would be to 1) perform Wasm text format unescaping immediately when extracting export names in get_exports, and 2) perform JSON unescaping when parsing the JSON in wasm-metadce.
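A rough sketch of what step 1 could look like on the Python side, assuming the escape forms defined for strings in the Wasm text format (\t, \n, \r, \", \', \\, \u{...}, and two-digit hex bytes). The helper name unescape_wat_string is hypothetical and is not an existing function in fuzz_opt.py; the JSON round trip at the end uses Python's json module only to illustrate what step 2 should achieve, since wasm-metadce's own parser is separate code.

```python
import json
import string

# Single-character escapes in Wasm text format strings.
_SIMPLE_ESCAPES = {"t": b"\t", "n": b"\n", "r": b"\r",
                   '"': b'"', "'": b"'", "\\": b"\\"}


def unescape_wat_string(s: str) -> bytes:
    """Hypothetical helper: decode the contents of a Wasm text string literal."""
    out = bytearray()
    i = 0
    while i < len(s):
        ch = s[i]
        if ch != "\\":
            out += ch.encode("utf-8")
            i += 1
            continue
        if i + 1 >= len(s):
            raise ValueError("dangling backslash")
        nxt = s[i + 1]
        if nxt in _SIMPLE_ESCAPES:
            out += _SIMPLE_ESCAPES[nxt]
            i += 2
        elif nxt == "u" and s[i + 2:i + 3] == "{":
            # \u{hex}: a Unicode scalar value, emitted as UTF-8.
            end = s.index("}", i + 3)
            out += chr(int(s[i + 3:end].replace("_", ""), 16)).encode("utf-8")
            i = end + 1
        else:
            # \hh: one raw byte given as two hex digits.
            pair = s[i + 1:i + 3]
            if len(pair) != 2 or any(c not in string.hexdigits for c in pair):
                raise ValueError(f"bad escape at offset {i}")
            out.append(int(pair, 16))
            i += 3
    return bytes(out)


# The export from test.wast: \\ collapses to one literal backslash.
name = unescape_wat_string(r"test\\c3\\a9_invoker").decode("utf-8")
print(name)  # test\c3\a9_invoker

# With the real name in hand, a JSON encode/decode round trip preserves it.
print(json.loads(json.dumps(name)) == name)  # True
```

The helper returns bytes because hex escapes can produce arbitrary byte sequences; decoding to UTF-8 afterwards is an assumption that holds for valid export names.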
