# ES Module Lexer

[![Build Status][actions-image]][actions-url]

A JS module syntax lexer used in [es-module-shims](https://github.com/guybedford/es-module-shims).

Outputs the list of exports and locations of import specifiers, including dynamic import and import meta handling.

Supports new syntax features including import attributes and source phase imports.

A very small single JS file (4KiB gzipped) that includes inlined Web Assembly for very fast source analysis of ECMAScript module syntax only.

For an example of the performance, Angular 1 (720KiB) is fully parsed in 5ms, in comparison to the fastest JS parser, Acorn, which takes over 100ms.

_Comprehensively handles the JS language grammar while remaining small and fast: ~10ms per MB of JS cold and ~5ms per MB of JS warm, [see benchmarks](#benchmarks) for more info._

> [Built with](https://github.com/guybedford/es-module-lexer/blob/main/chompfile.toml) [Chomp](https://chompbuild.com/)

### Usage

```
npm install es-module-lexer
```

See [src/lexer.ts](src/lexer.ts) for the type definitions.

For use in CommonJS:

```js
const { init, parse } = require('es-module-lexer');

(async () => {
  // either await init, or call parse asynchronously
  // this is necessary for the Web Assembly boot
  await init;

  const source = 'export var p = 5';
  const [imports, exports] = parse(source);

  // Returns "p"
  source.slice(exports[0].s, exports[0].e);
  // Returns "p"
  source.slice(exports[0].ls, exports[0].le);
})();
```

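The `s`/`e` and `ls`/`le` pairs are plain offsets into the source string, so they can be turned into names with `String.prototype.slice`. The following is a minimal sketch of that idea, not part of the library: `describeExports` and `sampleExports` are hypothetical names, and the export entry is written by hand to mirror the example above rather than produced by `parse`.

```js
// Hypothetical helper (not part of es-module-lexer): maps the offset pairs
// of an exports array to readable names.
function describeExports(source, exports) {
  return exports.map((exp) => ({
    // s/e delimit the exported name
    exported: source.slice(exp.s, exp.e),
    // ls/le delimit the local name; treated here as absent when ls is -1
    local: exp.ls === -1 ? null : source.slice(exp.ls, exp.le),
  }));
}

// Hand-written entry for 'export var p = 5' ("p" sits at offsets 11..12).
// A real run would take this array from parse(source).
const source = 'export var p = 5';
const sampleExports = [{ s: 11, e: 12, ls: 11, le: 12 }];

console.log(describeExports(source, sampleExports));
// [ { exported: 'p', local: 'p' } ]
```

Working from offsets rather than copied strings keeps the result tied to the original source text, which is convenient when the same offsets are later used to edit that text.
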
An ES module version is also available:

```js
import { init, parse } from 'es-module-lexer';

(async () => {
  await init;

  const source = `
    import { name } from 'mod\\u1011';
    import json from './json.json' assert { type: 'json' }
    export var p = 5;
    export function q () {

    };
    export { x as 'external name' } from 'external';

    // Comments provided to demonstrate edge cases
    import /*comment!*/ ( 'asdf', { assert: { type: 'json' }});
    import /*comment!*/.meta.asdf;

    // Source phase imports:
    import source mod from './mod.wasm';
    import.source('./mod.wasm');
  `;

  const [imports, exports] = parse(source, 'optional-sourcename');

  // Returns "modထ"
  imports[0].n
  // Returns "mod\u1011"
  source.slice(imports[0].s, imports[0].e);
  // "s" = start
  // "e" = end

  // Returns "import { name } from 'mod\u1011'"
  source.slice(imports[0].ss, imports[0].se);
  // "ss" = statement start
  // "se" = statement end

  // Returns "{ type: 'json' }"
  source.slice(imports[1].a, imports[1].se);
  // "a" = assert, -1 for no assertion

  // Returns "external"
  source.slice(imports[2].s, imports[2].e);

  // Returns "p"
  source.slice(exports[0].s, exports[0].e);
  // Returns "p"
  source.slice(exports[0].ls, exports[0].le);
  // Returns "q"
  source.slice(exports[1].s, exports[1].e);
  // Returns "q"
  source.slice(exports[1].ls, exports[1].le);
  // Returns "'external name'"
  source.slice(exports[2].s, exports[2].e);
  // Returns -1
  exports[2].ls;
  // Returns -1
  exports[2].le;

  // Import type is provided by `t` value
  // (1 for static, 2 for dynamic)
  // Returns true
  imports[3].t == 2;

  // Returns "asdf" (only for string literal dynamic imports)
  imports[3].n
  // Returns "import /*comment!*/ ( 'asdf', { assert: { type: 'json' }})"
  source.slice(imports[3].ss, imports[3].se);
  // Returns "'asdf'"
  source.slice(imports[3].s, imports[3].e);
  // Returns "( 'asdf', { assert: { type: 'json' }})"
  source.slice(imports[3].d, imports[3].se);
  // Returns "{ assert: { type: 'json' }}"
  source.slice(imports[3].a, imports[3].se - 1);

  // For non-string dynamic import expressions:
  // - n will be undefined
  // - a is currently -1 even if there is an assertion
  // - e is currently the character before the closing )

  // For nested dynamic imports, the se value of the outer import is -1 as end tracking does not
  // currently support nested dynamic imports

  // import.meta is indicated by imports[4].d === -2
  // Returns true
  imports[4].d === -2;
  // Returns "import /*comment!*/.meta"
  source.slice(imports[4].s, imports[4].e);
  // ss and se are the same for import meta

  // Returns "'./mod.wasm'"
  source.slice(imports[5].s, imports[5].e);

  // Import type 4 and 5 for static and dynamic source phase
  imports[5].t === 4;
  imports[6].t === 5;
})();
```

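A common use of the `s` and `e` offsets is rewriting specifiers without re-serializing the module. The sketch below illustrates one way to do that under stated assumptions: `rewriteStaticSpecifiers`, `sampleImports` and the `/lib/` mapping are hypothetical, the import entries are written by hand in the shape described above instead of coming from `parse`, and entries are assumed to be in source order.

```js
// Hypothetical helper (not part of es-module-lexer): replaces the specifier
// of each static import using its s/e offsets.
function rewriteStaticSpecifiers(source, imports, resolve) {
  let output = '';
  let last = 0;
  for (const imp of imports) {
    // Only static imports (t 1) and static source phase imports (t 4);
    // dynamic imports and import.meta entries are left untouched.
    if (imp.t !== 1 && imp.t !== 4) continue;
    output += source.slice(last, imp.s) + resolve(imp.n);
    last = imp.e;
  }
  return output + source.slice(last);
}

const source = "import a from './a.js';\nimport('./b.js');";
// Hand-written entries; a real run would take them from parse(source).
const sampleImports = [
  { t: 1, n: './a.js', s: 15, e: 21 },
  { t: 2, n: './b.js', s: 31, e: 39 },
];

console.log(
  rewriteStaticSpecifiers(source, sampleImports, (n) => n.replace('./', '/lib/'))
);
// import a from '/lib/a.js';
// import('./b.js');
```

The sketch copies the untouched spans of the source verbatim, so comments and formatting survive. It does not escape the resolved specifier, so a resolver that can return quotes or backslashes would need extra handling.
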
### CSP asm.js Build

The default version of the library uses Wasm and (safe) eval usage for performance and a minimal footprint.

Neither of these represents a security escalation possibility, since there are no execution string injection vectors, but they can still violate existing CSP policies for applications.

For a version that works with CSP eval disabled, use the `es-module-lexer/js` build:

```js
import { parse } from 'es-module-lexer/js';
```

Instead of Web Assembly, this uses an asm.js build which is almost as fast as the Wasm version ([see benchmarks below](#benchmarks)).

### Escape Sequences

To handle escape sequences in specifier strings, the `.n` field of imported specifiers will be provided where possible.

For dynamic import expressions, this field will be empty if not a valid JS string.

### Facade Detection

Facade modules that only use import / export syntax can be detected via the third return value:

```js
const [,, facade] = parse(`
  export * from 'external';
  import * as ns from 'external2';
  export { a as b } from 'external3';
  export { ns };
`);
facade === true;
```

### ESM Detection

Modules that use ESM syntax can be detected via the fourth return value:

```js
const [,,, hasModuleSyntax] = parse(`
  export {}
`);
hasModuleSyntax === true;
```

Dynamic imports are ignored since they can be used in non-ESM files.

```js
const [,,, hasModuleSyntax] = parse(`
  import('./foo.js')
`);
hasModuleSyntax === false;
```

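The third and fourth return values can be combined into a single label when a tool needs to branch on the kind of file it is looking at. This is only an illustrative sketch: `classifyParseResult` and its label strings are hypothetical, and the tuples are written by hand rather than returned by `parse`.

```js
// Hypothetical helper (not part of es-module-lexer): labels a file from the
// four-element result array described above.
function classifyParseResult([, , facade, hasModuleSyntax]) {
  // Without static import / export syntax the file may be a script or CommonJS
  if (!hasModuleSyntax) return 'no-module-syntax';
  // A facade contains nothing but import / export statements
  return facade ? 'facade' : 'module';
}

// Hand-written tuples; a real run would pass the result of parse(source).
console.log(classifyParseResult([[], [], true, true]));   // facade
console.log(classifyParseResult([[], [], false, true]));  // module
console.log(classifyParseResult([[], [], false, false])); // no-module-syntax
```
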
### Environment Support

Node.js 10+, and [all browsers with Web Assembly support](https://caniuse.com/#feat=wasm).

### Grammar Support

* Token state parses all line comments, block comments, strings, template strings, blocks, parens and punctuators.
* Division operator / regex token ambiguity is handled via backtracking checks against punctuator prefixes, including closing brace or paren backtracking.
* Always correctly parses valid JS source, but may parse invalid JS source without errors.

### Limitations

The lexing approach is designed to deal with the full language grammar including RegEx / division operator ambiguity through backtracking and paren / brace tracking.

The only limitation to the reduced parser is that the "exports" list may not correctly gather all export identifiers in the following edge cases:

```js
// Only "a" is detected as an export, "q" isn't
export var a = 'asdf', q = z;

// "b" is not detected as an export
export var { a: b } = asdf;
```

The above cases are handled gracefully: the lexer keeps going without errors, but it does not detect the export names noted above.

### Benchmarks

Benchmarks can be run with `npm run bench`.

Current results for a high spec machine:

#### Wasm Build

```
Module load time
> 5ms
Cold Run, All Samples
test/samples/*.js (3123 KiB)
> 18ms

Warm Runs (average of 25 runs)
test/samples/angular.js (739 KiB)
> 3ms
test/samples/angular.min.js (188 KiB)
> 1ms
test/samples/d3.js (508 KiB)
> 3ms
test/samples/d3.min.js (274 KiB)
> 2ms
test/samples/magic-string.js (35 KiB)
> 0ms
test/samples/magic-string.min.js (20 KiB)
> 0ms
test/samples/rollup.js (929 KiB)
> 4.32ms
test/samples/rollup.min.js (429 KiB)
> 2.16ms

Warm Runs, All Samples (average of 25 runs)
test/samples/*.js (3123 KiB)
> 14.16ms
```

#### JS Build (asm.js)

```
Module load time
> 2ms
Cold Run, All Samples
test/samples/*.js (3123 KiB)
> 34ms

Warm Runs (average of 25 runs)
test/samples/angular.js (739 KiB)
> 3ms
test/samples/angular.min.js (188 KiB)
> 1ms
test/samples/d3.js (508 KiB)
> 3ms
test/samples/d3.min.js (274 KiB)
> 2ms
test/samples/magic-string.js (35 KiB)
> 0ms
test/samples/magic-string.min.js (20 KiB)
> 0ms
test/samples/rollup.js (929 KiB)
> 5ms
test/samples/rollup.min.js (429 KiB)
> 3.04ms

Warm Runs, All Samples (average of 25 runs)
test/samples/*.js (3123 KiB)
> 17.12ms
```

### Building

This project uses [Chomp](https://chompbuild.com) for building.

With Chomp installed, download the WASI SDK 12.0 from https://github.com/WebAssembly/wasi-sdk/releases/tag/wasi-sdk-12.

- [Linux](https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-sdk-12/wasi-sdk-12.0-linux.tar.gz)
- [Windows (MinGW)](https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-sdk-12/wasi-sdk-12.0-mingw.tar.gz)
- [macOS](https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-sdk-12/wasi-sdk-12.0-macos.tar.gz)

Locate the WASI-SDK as a sibling folder, or customize the path via the `WASI_PATH` environment variable.

Emscripten emsdk is also assumed to be a sibling folder, or its path can be set via the `EMSDK_PATH` environment variable.

Example setup:

```
git clone https://github.com/guybedford/es-module-lexer
git clone https://github.com/emscripten-core/emsdk
cd emsdk
git checkout 1.40.1-fastcomp
./emsdk install 1.40.1-fastcomp
cd ..
wget https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-sdk-12/wasi-sdk-12.0-linux.tar.gz
gunzip wasi-sdk-12.0-linux.tar.gz
tar -xf wasi-sdk-12.0-linux.tar
mv wasi-sdk-12.0-linux.tar wasi-sdk-12.0
cargo install chompbuild
cd es-module-lexer
chomp test
```

For the `asm.js` build, a git clone of `emsdk` is assumed to be a sibling folder as well.

### License

MIT

[actions-image]: https://github.com/guybedford/es-module-lexer/actions/workflows/build.yml/badge.svg
[actions-url]: https://github.com/guybedford/es-module-lexer/actions/workflows/build.yml