# ES Module Lexer

[![Build Status][actions-image]][actions-url]

A JS module syntax lexer used in [es-module-shims](https://github.com/guybedford/es-module-shims).

Outputs the list of exports and locations of import specifiers, including dynamic import and import meta handling.

Supports new syntax features including import attributes and source phase imports.

A very small single JS file (4KiB gzipped) that includes inlined Web Assembly for very fast source analysis of ECMAScript module syntax only.

As an example of the performance, Angular 1 (720KiB) is fully parsed in 5ms, whereas the fastest JS parser, Acorn, takes over 100ms.

_Comprehensively handles the JS language grammar while remaining small and fast: ~10ms per MB of JS cold and ~5ms per MB of JS warm, [see benchmarks](#benchmarks) for more info._

> [Built with](https://github.com/guybedford/es-module-lexer/blob/main/chompfile.toml) [Chomp](https://chompbuild.com/)

### Usage

```
npm install es-module-lexer
```

See [src/lexer.ts](src/lexer.ts) for the type definitions.

For use in CommonJS:

```js
const { init, parse } = require('es-module-lexer');

(async () => {
  // either await init, or call parse asynchronously
  // this is necessary for the Web Assembly boot
  await init;

  const source = 'export var p = 5';
  const [imports, exports] = parse(source);

  // Returns "p"
  source.slice(exports[0].s, exports[0].e);
  // Returns "p"
  source.slice(exports[0].ls, exports[0].le);
})();
```

An ES module version is also available:

```js
import { init, parse } from 'es-module-lexer';

(async () => {
  await init;

  const source = `
    import { name } from 'mod\\u1011';
    import json from './json.json' assert { type: 'json' }
    export var p = 5;
    export function q () {

    };
    export { x as 'external name' } from 'external';

    // Comments provided to demonstrate edge cases
    import /*comment!*/ ( 'asdf', { assert: { type: 'json' } });
    import /*comment!*/.meta.asdf;

    // Source phase imports:
    import source mod from './mod.wasm';
    import.source('./mod.wasm');
  `;

  const [imports, exports] = parse(source, 'optional-sourcename');

  // Returns "modထ"
  imports[0].n
  // Returns "mod\u1011"
  source.slice(imports[0].s, imports[0].e);
  // "s" = start
  // "e" = end

  // Returns "import { name } from 'mod\u1011'"
  source.slice(imports[0].ss, imports[0].se);
  // "ss" = statement start
  // "se" = statement end

  // Returns "{ type: 'json' }"
  source.slice(imports[1].a, imports[1].se);
  // "a" = assert, -1 for no assertion

  // Returns "external"
  source.slice(imports[2].s, imports[2].e);

  // Returns "p"
  source.slice(exports[0].s, exports[0].e);
  // Returns "p"
  source.slice(exports[0].ls, exports[0].le);
  // Returns "q"
  source.slice(exports[1].s, exports[1].e);
  // Returns "q"
  source.slice(exports[1].ls, exports[1].le);
  // Returns "'external name'"
  source.slice(exports[2].s, exports[2].e);
  // Returns -1
  exports[2].ls;
  // Returns -1
  exports[2].le;

  // Import type is provided by `t` value
  // (1 for static, 2 for dynamic)
  // Returns true
  imports[3].t === 2;

  // Returns "asdf" (only for string literal dynamic imports)
  imports[3].n
  // Returns "import /*comment!*/ ( 'asdf', { assert: { type: 'json' } })"
  source.slice(imports[3].ss, imports[3].se);
  // Returns "'asdf'"
  source.slice(imports[3].s, imports[3].e);
  // Returns "( 'asdf', { assert: { type: 'json' } })"
  source.slice(imports[3].d, imports[3].se);
  // Returns "{ assert: { type: 'json' } }"
  source.slice(imports[3].a, imports[3].se - 1);

  // For non-string dynamic import expressions:
  // - n will be undefined
  // - a is currently -1 even if there is an assertion
  // - e is currently the character before the closing )

  // For nested dynamic imports, the se value of the outer import is -1 as end tracking does not
  // currently support nested dynamic imports

  // import.meta is indicated by imports[4].d === -2
  // Returns true
  imports[4].d === -2;
  // Returns "import /*comment!*/.meta"
  source.slice(imports[4].s, imports[4].e);
  // ss and se are the same for import meta

  // Returns "./mod.wasm"
  source.slice(imports[5].s, imports[5].e);

  // Import type 4 and 5 for static and dynamic source phase
  imports[5].t === 4;
  imports[6].t === 5;
})();
```

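Because `s` and `e` are plain offsets into the source string, one common use is rewriting specifiers in place. The following is a minimal sketch only: `rewriteSpecifiers`, `record` and the `mapping` table are hypothetical names, and the import records are hand-written in the shape shown above rather than produced by `parse`, so the snippet runs without the package installed. It only touches static imports (`t === 1`), where `s`/`e` exclude the quotes.

```js
// Hypothetical helper: replace static import specifiers using the
// `s`/`e` offsets reported for each import record.
function rewriteSpecifiers(source, imports, mapping) {
  let out = source;
  // Walk backwards so that replacing a later specifier does not shift
  // the offsets of the records before it.
  for (let i = imports.length - 1; i >= 0; i--) {
    const { s, e, n, t } = imports[i];
    if (t !== 1) continue; // skip dynamic imports, import.meta, source phase
    const replacement = mapping[n];
    if (replacement === undefined) continue;
    out = out.slice(0, s) + replacement + out.slice(e);
  }
  return out;
}

const source = "import { a } from 'lib';\nimport b from './b.js';\n";

// Hand-written stand-in for a parse() record (assumption, not real output).
const record = (text) => {
  const s = source.indexOf(text);
  return { s, e: s + text.length, n: text, t: 1 };
};
const imports = [record('lib'), record('./b.js')];

const result = rewriteSpecifiers(source, imports, {
  lib: 'https://cdn.example/lib.js',
});
console.log(result);
```

In real use the `imports` array would come from `parse(source)`; applying the edits from the end of the source backwards avoids having to track an offset delta.
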
### CSP asm.js Build

The default version of the library uses Wasm and (safe) eval for performance and a minimal footprint.

Neither of these represents a security escalation risk, since there are no execution string injection vectors, but they can still violate existing CSP policies for applications.

For a version that works with CSP eval disabled, use the `es-module-lexer/js` build:

```js
import { parse } from 'es-module-lexer/js';
```

Instead of Web Assembly, this uses an asm.js build which is almost as fast as the Wasm version ([see benchmarks below](#benchmarks)).

### Escape Sequences

To handle escape sequences in specifier strings, the `.n` field of imported specifiers will be provided where possible.

For dynamic import expressions, this field will be empty if not a valid JS string.

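One way to consume this is to prefer `.n` and fall back to the raw source slice when it is not available. The `specifierText` helper below is a hypothetical sketch, and the two records are hand-written in the documented shape rather than produced by `parse`; it assumes a missing name is reported as `undefined`.

```js
// Hypothetical helper: return the unescaped specifier when the lexer
// provided one in `n`, otherwise the raw text between `s` and `e`.
function specifierText(source, imp) {
  return imp.n !== undefined ? imp.n : source.slice(imp.s, imp.e);
}

// String.raw keeps the backslash escape as literal source text.
const source = String.raw`import 'mod\u1011'; import(getName());`;

// Hand-written stand-ins for parse() records (assumption, not real output).
const rawStart = source.indexOf('mod');
const staticImport = { s: rawStart, e: rawStart + 9, n: 'mod\u1011' };
const exprStart = source.indexOf('getName()');
const dynamicImport = { s: exprStart, e: exprStart + 9, n: undefined };

const staticName = specifierText(source, staticImport);
const dynamicName = specifierText(source, dynamicImport);
console.log(staticName, dynamicName);
```
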
### Facade Detection

Facade modules that only use import / export syntax can be detected via the third return value:

```js
const [,, facade] = parse(`
  export * from 'external';
  import * as ns from 'external2';
  export { a as b } from 'external3';
  export { ns };
`);
facade === true;
```

### ESM Detection

Modules that use ESM syntax can be detected via the fourth return value:

```js
const [,,, hasModuleSyntax] = parse(`
  export {}
`);
hasModuleSyntax === true;
```

Dynamic imports are ignored since they can be used in non-ESM files.

```js
const [,,, hasModuleSyntax] = parse(`
  import('./foo.js')
`);
hasModuleSyntax === false;
```

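The third and fourth return values can be combined into a single label. The sketch below is illustrative only: `moduleKind` and its label strings are hypothetical, and the tuples passed in are hand-written stand-ins for what `parse` returns.

```js
// Hypothetical helper: label a module from the tuple returned by parse().
// Positions follow this README: [imports, exports, facade, hasModuleSyntax].
function moduleKind([imports, exports, facade, hasModuleSyntax]) {
  // No static import/export syntax was seen (dynamic import does not count).
  if (!hasModuleSyntax) return 'no-module-syntax';
  // Only import / export statements, no other code.
  return facade ? 'esm-facade' : 'esm';
}

// Hand-written tuples (assumption, not real parse() output).
const kinds = [
  moduleKind([[], [], false, false]),
  moduleKind([[{}], [{}], true, true]),
  moduleKind([[{}], [], false, true]),
];
console.log(kinds);
```
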
### Environment Support

Node.js 10+, and [all browsers with Web Assembly support](https://caniuse.com/#feat=wasm).

### Grammar Support

* Token state parses all line comments, block comments, strings, template strings, blocks, parens and punctuators.
* Division operator / regex token ambiguity is handled via backtracking checks against punctuator prefixes, including closing brace or paren backtracking.
* Always correctly parses valid JS source, but may parse invalid JS source without errors.

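To make the division / regex ambiguity concrete: a `/` starts a regular expression only where an expression is expected, which depends on what precedes it. The sketch below is a deliberately simplified previous-token heuristic, not the backtracking the lexer performs; `slashStartsRegex` and `EXPRESSION_KEYWORDS` are hypothetical names. It does not skip comments, mishandles postfix `++` / `--`, and always treats `)` and `}` as ending an expression, which is wrong for cases such as `if (x) /re/.test(y)` and is the reason paren and brace backtracking is needed.

```js
// Keywords after which an expression (and so a regex) may follow.
const EXPRESSION_KEYWORDS = new Set([
  'return', 'typeof', 'instanceof', 'in', 'of', 'new', 'delete',
  'void', 'throw', 'case', 'do', 'else', 'yield', 'await',
]);

// Simplified guess: does the `/` at slashIndex start a regex literal?
function slashStartsRegex(source, slashIndex) {
  let i = slashIndex - 1;
  // Skip whitespace before the slash.
  while (i >= 0 && /\s/.test(source[i])) i--;
  if (i < 0) return true; // start of input: an expression is expected
  const prev = source[i];
  if (/[\w$]/.test(prev)) {
    // Preceded by a word: regex only if that word is an expression keyword.
    let start = i;
    while (start > 0 && /[\w$]/.test(source[start - 1])) start--;
    return EXPRESSION_KEYWORDS.has(source.slice(start, i + 1));
  }
  // Closers and string ends are assumed to finish an operand: division.
  return !/[)\]}'"`]/.test(prev);
}

const samples = ['a / b', 'x = /ab+c/.test(s)', 'return /re/', '(a + b) / 2'];
const guesses = samples.map((src) => slashStartsRegex(src, src.indexOf('/')));
console.log(guesses);
```
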
### Limitations

The lexing approach is designed to deal with the full language grammar including RegEx / division operator ambiguity through backtracking and paren / brace tracking.

The only limitation to the reduced parser is that the "exports" list may not correctly gather all export identifiers in the following edge cases:

```js
// Only "a" is detected as an export, "q" isn't
export var a = 'asdf', q = z;

// "b" is not detected as an export
export var { a: b } = asdf;
```

The above cases are handled gracefully: the lexer continues without error, but it will not detect the export names noted above.

### Benchmarks

Benchmarks can be run with `npm run bench`.

Current results for a high spec machine:

#### Wasm Build

```
Module load time
> 5ms
Cold Run, All Samples
test/samples/*.js (3123 KiB)
> 18ms

Warm Runs (average of 25 runs)
test/samples/angular.js (739 KiB)
> 3ms
test/samples/angular.min.js (188 KiB)
> 1ms
test/samples/d3.js (508 KiB)
> 3ms
test/samples/d3.min.js (274 KiB)
> 2ms
test/samples/magic-string.js (35 KiB)
> 0ms
test/samples/magic-string.min.js (20 KiB)
> 0ms
test/samples/rollup.js (929 KiB)
> 4.32ms
test/samples/rollup.min.js (429 KiB)
> 2.16ms

Warm Runs, All Samples (average of 25 runs)
test/samples/*.js (3123 KiB)
> 14.16ms
```

#### JS Build (asm.js)

```
Module load time
> 2ms
Cold Run, All Samples
test/samples/*.js (3123 KiB)
> 34ms

Warm Runs (average of 25 runs)
test/samples/angular.js (739 KiB)
> 3ms
test/samples/angular.min.js (188 KiB)
> 1ms
test/samples/d3.js (508 KiB)
> 3ms
test/samples/d3.min.js (274 KiB)
> 2ms
test/samples/magic-string.js (35 KiB)
> 0ms
test/samples/magic-string.min.js (20 KiB)
> 0ms
test/samples/rollup.js (929 KiB)
> 5ms
test/samples/rollup.min.js (429 KiB)
> 3.04ms

Warm Runs, All Samples (average of 25 runs)
test/samples/*.js (3123 KiB)
> 17.12ms
```

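To compare these runs on a per-size basis, the "All Samples" lines can be converted to milliseconds per MiB. This is only a small worked calculation over the figures listed above; `msPerMiB` is a hypothetical helper name, not part of the benchmark script.

```js
// Convert a benchmark line (size in KiB, time in ms) to ms per MiB.
function msPerMiB(sizeKiB, timeMs) {
  return timeMs / (sizeKiB / 1024);
}

// "All Samples" figures copied from the results above.
const results = {
  wasmCold: msPerMiB(3123, 18),
  wasmWarm: msPerMiB(3123, 14.16),
  asmCold: msPerMiB(3123, 34),
  asmWarm: msPerMiB(3123, 17.12),
};

for (const [name, value] of Object.entries(results)) {
  console.log(`${name}: ${value.toFixed(1)} ms/MiB`);
}
```

With the numbers above this works out to roughly 5.9 and 4.6 ms/MiB for the Wasm build (cold, warm) and 11.1 and 5.6 ms/MiB for the asm.js build.
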
### Building

This project uses [Chomp](https://chompbuild.com) for building.

With Chomp installed, download the WASI SDK 12.0 from https://github.com/WebAssembly/wasi-sdk/releases/tag/wasi-sdk-12.

- [Linux](https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-sdk-12/wasi-sdk-12.0-linux.tar.gz)
- [Windows (MinGW)](https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-sdk-12/wasi-sdk-12.0-mingw.tar.gz)
- [macOS](https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-sdk-12/wasi-sdk-12.0-macos.tar.gz)

Place the WASI SDK in a sibling folder, or customize the path via the `WASI_PATH` environment variable.

The Emscripten emsdk is also assumed to be in a sibling folder, unless its path is set via the `EMSDK_PATH` environment variable.

Example setup:

```
git clone https://github.com/guybedford/es-module-lexer
git clone https://github.com/emscripten-core/emsdk
cd emsdk
git checkout 1.40.1-fastcomp
./emsdk install 1.40.1-fastcomp
cd ..
wget https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-sdk-12/wasi-sdk-12.0-linux.tar.gz
gunzip wasi-sdk-12.0-linux.tar.gz
tar -xf wasi-sdk-12.0-linux.tar
mv wasi-sdk-12.0-linux.tar wasi-sdk-12.0
cargo install chompbuild
cd es-module-lexer
chomp test
```

For the `asm.js` build, the `emsdk` git clone is assumed to be a sibling folder as well.

### License

MIT

[actions-image]: https://github.com/guybedford/es-module-lexer/actions/workflows/build.yml/badge.svg
[actions-url]: https://github.com/guybedford/es-module-lexer/actions/workflows/build.yml