stream-json is a micro-library of Node.js stream components for creating custom JSON processing pipelines with a minimal memory footprint. It can parse JSON files far exceeding available memory. Even individual data items (keys, strings, and numbers) can be streamed piece-wise. A SAX-inspired event-based API is included.
Components:
- Parser — streaming JSON parser producing a SAX-like token stream.
  - Optionally packs keys, strings, and numbers (controlled separately).
  - The main module creates a parser decorated with `emit()`.
- Filters edit a token stream:
  - Pick — picks out a subobject for processing.
  - Ignore — removes subobjects.
  - Filter — filters tokens, maintaining valid JSON output.
  - Replace — replaces subobjects with other values.
- Streamers assemble tokens into JavaScript objects:
  - StreamValues — streams successive JSON values (for JSON Streaming or after `pick()`).
  - StreamArray — streams elements of a top-level array.
  - StreamObject — streams top-level properties of an object.
- Essentials:
  - Assembler — reconstructs JavaScript objects from tokens (EventEmitter).
  - Disassembler — converts JavaScript objects into a token stream.
  - Stringer — converts a token stream back into JSON text.
  - Emitter — re-emits tokens as named events.
- Utilities:
  - emit() — attaches token events to any stream.
  - withParser() — creates parser + component pipelines.
  - Batch — groups items into arrays.
  - Verifier — validates JSON text, pinpoints errors.
  - FlexAssembler — Assembler with custom containers (Map, Set, etc.) at specific paths.
  - Utf8Stream — sanitizes multibyte UTF-8 input.
- JSONL (JSON Lines / NDJSON):
  - jsonl/Parser — parses JSONL into `{key, value}` objects. Faster than `parser({jsonStreaming: true})` + `streamValues()` when items fit in memory.
  - jsonl/Stringer — serializes objects to JSONL text. Faster than `disassembler()` + `stringer()`.
- JSONC (JSON with Comments):
  - jsonc/Parser — streaming JSONC parser with comment and whitespace tokens.
  - jsonc/Stringer — converts JSONC token streams back to text.
All components are building blocks for custom data processing pipelines. They can be combined with each other and with custom code via stream-chain.
Distributed under the New BSD license.
```js
const {chain} = require('stream-chain');
const {parser} = require('stream-json');
const {pick} = require('stream-json/filters/pick.js');
const {ignore} = require('stream-json/filters/ignore.js');
const {streamValues} = require('stream-json/streamers/stream-values.js');

const fs = require('fs');
const zlib = require('zlib');

const pipeline = chain([
  fs.createReadStream('sample.json.gz'),
  zlib.createGunzip(),
  parser(),
  pick({filter: 'data'}),
  ignore({filter: /\b_meta\b/i}),
  streamValues(),
  data => {
    const value = data.value;
    // keep data only for the accounting department
    return value && value.department === 'accounting' ? data : null;
  }
]);

let counter = 0;
pipeline.on('data', () => ++counter);
pipeline.on('end', () =>
  console.log(`The accounting department has ${counter} employees.`));
```

See the full documentation in the Wiki.
Companion projects:
- stream-csv-as-json streams huge CSV files in a format compatible with stream-json: rows as arrays of string values. If a header row is used, it can stream rows as objects with named fields.
```bash
npm install --save stream-json
# or: yarn add stream-json
```

The library is organized as small composable components based on Node.js streams and events. The source code is compact — read it to understand how things work and to build your own components.
Bug reports, simplifications, and new generic components are welcome — open a ticket or pull request.
- 2.0.0 major rewrite: functional API based on stream-chain 3.x, bundled TypeScript definitions. New: JSONC parser/stringer, FlexAssembler. See Migrating from 1.x to 2.x.
- 1.9.1 fixed a race condition in the Disassembler stream implementation. Thx, Noam Okman.
- 1.9.0 fixed a slight deviation from the JSON standard. Thx, Peter Burns.
- 1.8.0 added an option to indicate/ignore JSONL errors. Thx, AK.
- 1.7.5 fixed a stringer bug with ASCII control symbols. Thx, Kraicheck.
The full history is in the wiki: Release history.