Experimental JSONPath engine for querying massive streamed datasets.
The rsonpath crate provides a JSONPath parser and a query execution engine rq, which utilizes SIMD instructions to provide massive throughput improvements over conventional engines.
Benchmarks of rsonpath against a reference no-SIMD engine on the Pison dataset. NOTE: Scale is logarithmic!
To run a JSONPath query on a file, execute:
rq '$..a.b' ./file.json
If the file is omitted, the engine reads standard input. JSON can also be passed inline:
$ rq '$..a.b' --json '{"c":{"a":{"b":42}}}'
42
For details, consult rq --help or the rsonbook.
The result of running a query is a sequence of matched values, delimited by newlines.
Alternatively, passing --result count returns only the number of matches, which might be much faster.
For other result modes, consult the --help usage page.
See Releases for precompiled binaries for all first-class support targets.
The easiest way to install is via cargo.
$ cargo install rsonpath
...
If maximum speed is paramount, you should install rsonpath with native CPU instructions support.
This will result in a binary that is not portable and might work incorrectly on any other machine, but will squeeze out every last bit of throughput.
To do this, run the following cargo install variant:
$ RUSTFLAGS="-C target-cpu=native" cargo install rsonpath
...
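To make the portability trade-off concrete: `-C target-cpu=native` lets the compiler assume every instruction set extension of the build machine, and that assumption is baked into the binary. The sketch below is not part of rsonpath; `compiled_features` is a hypothetical helper that uses the standard `cfg!(target_feature = ...)` macro to report which of the extensions mentioned in this README were assumed at compile time.

```rust
/// Reports which CPU features were assumed when this binary was compiled.
/// `cfg!` is resolved at compile time, so the answer never changes at runtime,
/// even if the binary is later copied to a CPU that lacks the feature.
fn compiled_features() -> [(&'static str, bool); 3] {
    [
        ("avx2", cfg!(target_feature = "avx2")),
        ("pclmulqdq", cfg!(target_feature = "pclmulqdq")),
        ("popcnt", cfg!(target_feature = "popcnt")),
    ]
}

fn main() {
    for (name, enabled) in compiled_features() {
        let status = if enabled { "compiled in" } else { "not compiled in" };
        println!("{name}: {status}");
    }
}
```

Building this with and without `RUSTFLAGS="-C target-cpu=native"` on an AVX2-capable machine should change the reported values, which is the same mechanism that makes a natively compiled binary unsafe to move to an older CPU.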
Check out the relevant chapter in the rsonbook.
The project is actively developed and currently supports only a subset of the JSONPath query language. A query is a sequence of segments, each containing one or more selectors.
Segment | Syntax | Supported | Since | Tracking Issue |
---|---|---|---|---|
Child segment (single) | [<selector>] | ✔️ | v0.1.0 | |
Child segment (multiple) | [<selector1>,...,<selectorN>] | ❌ | | |
Descendant segment (single) | ..[<selector>] | ✔️ | v0.1.0 | |
Descendant segment (multiple) | ..[<selector1>,...,<selectorN>] | ❌ | | |
Selector | Syntax | Supported | Since | Tracking Issue |
---|---|---|---|---|
Root | $ | ✔️ | v0.1.0 | |
Name | .<member> , [<member>] | ✔️ | v0.1.0 | |
Wildcard | .* , ..* , [*] | ✔️ | v0.4.0 | |
Index (array index) | [<index>] | ✔️ | v0.5.0 | |
Index (array index from end) | [-<index>] | ❌ | | |
Array slice (forward, positive bounds) | [<start>:<end>:<step>] | ✔️ | v0.9.0 | #152 |
Array slice (forward, arbitrary bounds) | [<start>:<end>:<step>] | ❌ | | |
Array slice (backward, arbitrary bounds) | [<start>:<end>:-<step>] | ❌ | | |
Filters - existential tests | [?<path>] | ❌ | | #154 |
Filters - const atom comparisons | [?<path> <binop> <atom>] | ❌ | | #156 |
Filters - logical expressions | && , \|\| , ! | ❌ | | |
Filters - nesting | [?<expr>[?<expr>]...] | ❌ | | |
Filters - arbitrary comparisons | [?<path> <binop> <path>] | ❌ | | |
Filters - function extensions | [?func(<path>)] | ❌ | | |
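The "sequence of segments, each containing selectors" structure can be sketched as a small data model. The types below are hypothetical and exist only for illustration; they are not the types exposed by rsonpath-syntax, and they cover just the name, wildcard, and index selectors. `render` prints a query in normalized bracket notation.

```rust
/// Hypothetical selector type covering a few of the selector kinds.
#[derive(Debug, Clone, PartialEq)]
enum Selector {
    Name(String),
    Wildcard,
    Index(u64),
}

/// Hypothetical segment type: a child or descendant step holding its selectors.
#[derive(Debug, Clone, PartialEq)]
enum Segment {
    Child(Vec<Selector>),
    Descendant(Vec<Selector>),
}

/// The query `$..a.b`: a descendant segment selecting `a`,
/// followed by a child segment selecting `b`.
fn example_query() -> Vec<Segment> {
    vec![
        Segment::Descendant(vec![Selector::Name("a".to_string())]),
        Segment::Child(vec![Selector::Name("b".to_string())]),
    ]
}

/// Renders a query in bracket notation, starting from the root `$`.
fn render(query: &[Segment]) -> String {
    let mut out = String::from("$");
    for segment in query {
        let (prefix, selectors) = match segment {
            Segment::Child(s) => ("", s),
            Segment::Descendant(s) => ("..", s),
        };
        let parts: Vec<String> = selectors
            .iter()
            .map(|selector| match selector {
                Selector::Name(name) => format!("'{name}'"),
                Selector::Wildcard => "*".to_string(),
                Selector::Index(index) => index.to_string(),
            })
            .collect();
        out.push_str(prefix);
        out.push('[');
        out.push_str(&parts.join(","));
        out.push(']');
    }
    out
}

fn main() {
    println!("{}", render(&example_query()));
}
```

In this model, a segment with more than one selector in its `Vec` corresponds to the "multiple" rows of the segment table, which the engine does not support yet.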
The crate is continuously built for all Tier 1 Rust targets, and tests are continuously run for targets that can be run with GitHub action images. SIMD is supported only on x86/x86_64 platforms.
Target triple | nosimd build | SIMD support | Continuous testing | Tracking issues |
---|---|---|---|---|
aarch64-unknown-linux-gnu | ✔️ | ❌ | ✔️ | #21, #115 |
i686-unknown-linux-gnu | ✔️ | ✔️ | ✔️ | |
x86_64-unknown-linux-gnu | ✔️ | ✔️ | ✔️ | |
x86_64-apple-darwin | ✔️ | ✔️ | ✔️ | |
i686-pc-windows-gnu | ✔️ | ✔️ | ✔️ | |
i686-pc-windows-msvc | ✔️ | ✔️ | ✔️ | |
x86_64-pc-windows-gnu | ✔️ | ✔️ | ✔️ | |
x86_64-pc-windows-msvc | ✔️ | ✔️ | ✔️ | |
SIMD support is enabled on a module-by-module basis. Generally, any CPU released in the past decade supports AVX2, which enables all available optimizations.
Older CPUs with SSE2 or higher get partial support. You can check what exactly is enabled with rq --version; check the SIMD support field:
$ rq --version
rq 0.9.1
Commit SHA: c024e1bab89610455537b77aed249d2a05a81ed6
Features: default,simd
Opt level: 3
Target triple: x86_64-unknown-linux-gnu
Codegen flags: link-arg=-fuse-ld=lld
SIMD support: avx2;fast_quotes;fast_popcnt
The fast_quotes capability depends on the pclmulqdq instruction, and fast_popcnt on the popcnt instruction.
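To check ahead of time whether a given machine offers these instructions, the Rust standard library provides runtime detection. The following is a minimal sketch, not the detection code rsonpath itself uses; `detect_features` is a hypothetical helper built on the standard `std::is_x86_feature_detected!` macro, with a fallback for non-x86 targets where none of these features exist.

```rust
/// Runtime detection on x86/x86_64: asks the CPU which features it supports.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn detect_features() -> Vec<(&'static str, bool)> {
    vec![
        ("avx2", std::is_x86_feature_detected!("avx2")),
        ("pclmulqdq", std::is_x86_feature_detected!("pclmulqdq")),
        ("popcnt", std::is_x86_feature_detected!("popcnt")),
    ]
}

/// Fallback for other architectures, where these x86 features are unavailable.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn detect_features() -> Vec<(&'static str, bool)> {
    vec![("avx2", false), ("pclmulqdq", false), ("popcnt", false)]
}

fn main() {
    for (name, available) in detect_features() {
        println!("{name}: {available}");
    }
}
```

Unlike a compile-time check, this reflects the CPU the program is currently running on, so it is a reasonable way to predict which capabilities a portable build could use on that machine.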
Not all selectors are supported; see the support table above.
The engine assumes that every object in the input JSON has no duplicate keys. Behavior on duplicate keys is not guaranteed to be stable, but currently the engine will simply match the first such key.
$ rq '$.key' --json '{"key":"value","key":"other value"}'
"value"
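The first-match behavior can be pictured as a linear scan over an object's members in document order. The sketch below is illustrative only and is not the engine's code: `first_match` and `last_wins` are hypothetical helpers, and the members are given as already-split key/value pairs. The contrast with a `HashMap`, where a later duplicate overwrites an earlier one, shows why other JSON tools may report a different value for the same input.

```rust
use std::collections::HashMap;

/// Returns the value of the first member whose key equals `key`,
/// scanning members in document order.
fn first_match<'a>(members: &[(&'a str, &'a str)], key: &str) -> Option<&'a str> {
    members.iter().find(|(k, _)| *k == key).map(|(_, v)| *v)
}

/// Collects members into a map first; a later duplicate key
/// overwrites the earlier one, so the last value wins.
fn last_wins<'a>(members: &[(&'a str, &'a str)], key: &str) -> Option<&'a str> {
    let map: HashMap<&str, &str> = members.iter().copied().collect();
    map.get(key).copied()
}

fn main() {
    let members = [("key", "\"value\""), ("key", "\"other value\"")];
    println!("first match: {:?}", first_match(&members, "key"));
    println!("last wins:   {:?}", last_wins(&members, "key"));
}
```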
The engine does not parse unicode escape sequences in member names.
This means that a key "a" is different from a key "\u0061", even though semantically they represent the same string.
This is actually as-designed with respect to the current JSONPath spec.
Parsing unicode sequences is costly, so the support for this was postponed in favour of high performance. This is tracked as #117.
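The difference between comparing raw key text and comparing decoded strings can be shown in a few lines. This is a simplified sketch, not rsonpath code: `decode_unicode_escapes` is a hypothetical helper that handles only `\uXXXX` escapes in the basic multilingual plane and ignores surrogate pairs and escaped backslashes.

```rust
/// Decodes `\uXXXX` escapes in `raw`. Returns `None` on a malformed escape.
/// Simplification: no surrogate pairs, and `\\u` is not treated specially.
fn decode_unicode_escapes(raw: &str) -> Option<String> {
    let mut out = String::new();
    let mut rest = raw;
    while let Some(pos) = rest.find("\\u") {
        // Copy everything before the escape unchanged.
        out.push_str(&rest[..pos]);
        // The four hex digits that follow `\u`.
        let hex = rest.get(pos + 2..pos + 6)?;
        let code = u32::from_str_radix(hex, 16).ok()?;
        out.push(char::from_u32(code)?);
        rest = &rest[pos + 6..];
    }
    out.push_str(rest);
    Some(out)
}

fn main() {
    let raw_key = r"\u0061";
    // Comparing the raw text, as a non-decoding matcher would: no match.
    println!("raw comparison:     {}", raw_key == "a");
    // Comparing after decoding the escape: match.
    let decoded = decode_unicode_escapes(raw_key);
    println!("decoded comparison: {}", decoded.as_deref() == Some("a"));
}
```

The decoding step requires inspecting and rewriting every key that contains a backslash, which gives a sense of the extra work that comparing raw text avoids.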
The gist is: fork, implement, make a PR back here. More details are in the CONTRIBUTING doc.
The dev workflow utilizes just.
Use the included Justfile. It will automatically install Rust for you using the rustup tool if it detects there is no Cargo in your environment.
$ just build
...
$ just test
...
Benchmarks for rsonpath are located in a separate repository, included as a git submodule in this main repository.
The easiest way to run all the benchmarks is just bench. For details, look at the README in the submodule.
We have a paper on rsonpath to be published at ASPLOS '24! You can read it here.
This project was conceived as my thesis. You can read it for details on the theoretical background on the engine and details of its implementation.
Showing direct dependencies, for full graph see below.
cargo tree --package rsonpath --edges normal --depth 1
rsonpath v0.9.3 (/home/mat/src/rsonpath/crates/rsonpath)
├── clap v4.5.23
├── color-eyre v0.6.3
├── eyre v0.6.12
├── log v0.4.22
├── rsonpath-lib v0.9.3 (/home/mat/src/rsonpath/crates/rsonpath-lib)
├── rsonpath-syntax v0.3.2 (/home/mat/src/rsonpath/crates/rsonpath-syntax)
└── simple_logger v5.0.0
[build-dependencies]
├── rustflags v0.1.6
├── vergen v9.0.2
│   [build-dependencies]
├── vergen-git2 v1.0.2
│   [build-dependencies]
└── vergen-gitcl v1.0.2
    [build-dependencies]
cargo tree --package rsonpath-lib --edges normal --depth 1
rsonpath-lib v0.9.3 (/home/mat/src/rsonpath/crates/rsonpath-lib)
├── arbitrary v1.4.1
├── cfg-if v1.0.0
├── log v0.4.22
├── memmap2 v0.9.5
├── rsonpath-syntax v0.3.2 (/home/mat/src/rsonpath/crates/rsonpath-syntax)
├── smallvec v1.13.2
├── static_assertions v1.1.0
├── thiserror v2.0.9
└── vector-map v1.0.1
clap - standard crate to provide the CLI.
color-eyre, eyre - more accessible error messages for the parser.
log, simple-logger - diagnostic logs during compilation and execution.
cfg-if - used to support SIMD and no-SIMD versions.
memmap2 - for fast reading of source files via a memory map instead of buffered copies.
nom - for parser implementation.
smallvec - crucial for small-stack performance.
static_assertions - additional reliability by some constant assumptions validated at compile time.
thiserror - idiomatic Error implementations.
vector_map - used in the query compiler for measurably better performance.
cargo tree --package rsonpath --edges normal
rsonpath v0.9.3 (/home/mat/src/rsonpath/crates/rsonpath)
├── clap v4.5.23
│   ├── clap_builder v4.5.23
│   │   ├── anstream v0.6.18
│   │   │   ├── anstyle v1.0.10
│   │   │   ├── anstyle-parse v0.2.6
│   │   │   │   └── utf8parse v0.2.2
│   │   │   ├── anstyle-query v1.1.2
│   │   │   │   └── windows-sys v0.59.0
│   │   │   │       └── windows-targets v0.52.6
│   │   │   │           ├── windows_aarch64_gnullvm v0.52.6
│   │   │   │           ├── windows_aarch64_msvc v0.52.6
│   │   │   │           ├── windows_i686_gnu v0.52.6
│   │   │   │           ├── windows_i686_gnullvm v0.52.6
│   │   │   │           ├── windows_i686_msvc v0.52.6
│   │   │   │           ├── windows_x86_64_gnu v0.52.6
│   │   │   │           ├── windows_x86_64_gnullvm v0.52.6
│   │   │   │           └── windows_x86_64_msvc v0.52.6
│   │   │   ├── anstyle-wincon v3.0.6
│   │   │   │   ├── anstyle v1.0.10
│   │   │   │   └── windows-sys v0.59.0 (*)
│   │   │   ├── colorchoice v1.0.3
│   │   │   ├── is_terminal_polyfill v1.70.1
│   │   │   └── utf8parse v0.2.2
│   │   ├── anstyle v1.0.10
│   │   ├── clap_lex v0.7.4
│   │   ├── strsim v0.11.1
│   │   └── terminal_size v0.4.1
│   │       ├── rustix v0.38.42
│   │       │   ├── bitflags v2.6.0
│   │       │   ├── errno v0.3.10
│   │       │   │   ├── libc v0.2.169
│   │       │   │   └── windows-sys v0.59.0 (*)
│   │       │   ├── libc v0.2.169
│   │       │   ├── linux-raw-sys v0.4.14
│   │       │   └── windows-sys v0.59.0 (*)
│   │       └── windows-sys v0.59.0 (*)
│   └── clap_derive v4.5.18 (proc-macro)
│       ├── heck v0.5.0
│       ├── proc-macro2 v1.0.92
│       │   └── unicode-ident v1.0.14
│       ├── quote v1.0.37
│       │   └── proc-macro2 v1.0.92 (*)
│       └── syn v2.0.91
│           ├── proc-macro2 v1.0.92 (*)
│           ├── quote v1.0.37 (*)
│           └── unicode-ident v1.0.14
├── color-eyre v0.6.3
│   ├── backtrace v0.3.71
│   │   ├── addr2line v0.21.0
│   │   │   └── gimli v0.28.1
│   │   ├── cfg-if v1.0.0
│   │   ├── libc v0.2.169
│   │   ├── miniz_oxide v0.7.4
│   │   │   └── adler v1.0.2
│   │   ├── object v0.32.2
│   │   │   └── memchr v2.7.4
│   │   └── rustc-demangle v0.1.24
│   │   [build-dependencies]
│   │   └── cc v1.2.5
│   │       ├── jobserver v0.1.32
│   │       │   └── libc v0.2.169
│   │       ├── libc v0.2.169
│   │       └── shlex v1.3.0
│   ├── eyre v0.6.12
│   │   ├── indenter v0.3.3
│   │   └── once_cell v1.20.2
│   ├── indenter v0.3.3
│   ├── once_cell v1.20.2
│   └── owo-colors v3.5.0
├── eyre v0.6.12 (*)
├── log v0.4.22
├── rsonpath-lib v0.9.3 (/home/mat/src/rsonpath/crates/rsonpath-lib)
│   ├── cfg-if v1.0.0
│   ├── log v0.4.22
│   ├── memmap2 v0.9.5
│   │   └── libc v0.2.169
│   ├── rsonpath-syntax v0.3.2 (/home/mat/src/rsonpath/crates/rsonpath-syntax)
│   │   ├── nom v7.1.3
│   │   │   ├── memchr v2.7.4
│   │   │   └── minimal-lexical v0.2.1
│   │   ├── owo-colors v4.1.0
│   │   ├── thiserror v2.0.9
│   │   │   └── thiserror-impl v2.0.9 (proc-macro)
│   │   │       ├── proc-macro2 v1.0.92 (*)
│   │   │       ├── quote v1.0.37 (*)
│   │   │       └── syn v2.0.91 (*)
│   │   └── unicode-width v0.2.0
│   ├── smallvec v1.13.2
│   ├── static_assertions v1.1.0
│   ├── thiserror v2.0.9 (*)
│   └── vector-map v1.0.1
│       ├── contracts v0.4.0 (proc-macro)
│       │   ├── proc-macro2 v1.0.92 (*)
│       │   ├── quote v1.0.37 (*)
│       │   └── syn v1.0.109
│       │       ├── proc-macro2 v1.0.92 (*)
│       │       ├── quote v1.0.37 (*)
│       │       └── unicode-ident v1.0.14
│       └── rand v0.7.3
│           ├── getrandom v0.1.16
│           │   ├── cfg-if v1.0.0
│           │   ├── libc v0.2.169
│           │   └── wasi v0.9.0+wasi-snapshot-preview1
│           ├── libc v0.2.169
│           ├── rand_chacha v0.2.2
│           │   ├── ppv-lite86 v0.2.20
│           │   │   └── zerocopy v0.7.35
│           │   │       ├── byteorder v1.5.0
│           │   │       └── zerocopy-derive v0.7.35 (proc-macro)
│           │   │           ├── proc-macro2 v1.0.92 (*)
│           │   │           ├── quote v1.0.37 (*)
│           │   │           └── syn v2.0.91 (*)
│           │   └── rand_core v0.5.1
│           │       └── getrandom v0.1.16 (*)
│           ├── rand_core v0.5.1 (*)
│           └── rand_hc v0.2.0
│               └── rand_core v0.5.1 (*)
├── rsonpath-syntax v0.3.2 (/home/mat/src/rsonpath/crates/rsonpath-syntax) (*)
└── simple_logger v5.0.0
    ├── colored v2.2.0
    │   ├── lazy_static v1.5.0
    │   └── windows-sys v0.59.0 (*)
    ├── log v0.4.22
    ├── time v0.3.37
    │   ├── deranged v0.3.11
    │   │   └── powerfmt v0.2.0
    │   ├── itoa v1.0.14
    │   ├── libc v0.2.169
    │   ├── num-conv v0.1.0
    │   ├── num_threads v0.1.7
    │   │   └── libc v0.2.169
    │   ├── powerfmt v0.2.0
    │   ├── time-core v0.1.2
    │   └── time-macros v0.2.19 (proc-macro)
    │       ├── num-conv v0.1.0
    │       └── time-core v0.1.2
    └── windows-sys v0.48.0
        └── windows-targets v0.48.5
            ├── windows_aarch64_gnullvm v0.48.5
            ├── windows_aarch64_msvc v0.48.5
            ├── windows_i686_gnu v0.48.5
            ├── windows_i686_msvc v0.48.5
            ├── windows_x86_64_gnu v0.48.5
            ├── windows_x86_64_gnullvm v0.48.5
            └── windows_x86_64_msvc v0.48.5
[build-dependencies]
├── rustflags v0.1.6
├── vergen v9.0.2
│   ├── anyhow v1.0.95
│   ├── cargo_metadata v0.19.1
│   │   ├── camino v1.1.9
│   │   │   └── serde v1.0.216
│   │   │       └── serde_derive v1.0.216 (proc-macro)
│   │   │           ├── proc-macro2 v1.0.92 (*)
│   │   │           ├── quote v1.0.37 (*)
│   │   │           └── syn v2.0.91 (*)
│   │   ├── cargo-platform v0.1.9
│   │   │   └── serde v1.0.216 (*)
│   │   ├── semver v1.0.24
│   │   │   └── serde v1.0.216 (*)
│   │   ├── serde v1.0.216 (*)
│   │   ├── serde_json v1.0.134
│   │   │   ├── itoa v1.0.14
│   │   │   ├── memchr v2.7.4
│   │   │   ├── ryu v1.0.18
│   │   │   └── serde v1.0.216 (*)
│   │   └── thiserror v2.0.9 (*)
│   ├── derive_builder v0.20.2
│   │   └── derive_builder_macro v0.20.2 (proc-macro)
│   │       ├── derive_builder_core v0.20.2
│   │       │   ├── darling v0.20.10
│   │       │   │   ├── darling_core v0.20.10
│   │       │   │   │   ├── fnv v1.0.7
│   │       │   │   │   ├── ident_case v1.0.1
│   │       │   │   │   ├── proc-macro2 v1.0.92 (*)
│   │       │   │   │   ├── quote v1.0.37 (*)
│   │       │   │   │   ├── strsim v0.11.1
│   │       │   │   │   └── syn v2.0.91 (*)
│   │       │   │   └── darling_macro v0.20.10 (proc-macro)
│   │       │   │       ├── darling_core v0.20.10 (*)
│   │       │   │       ├── quote v1.0.37 (*)
│   │       │   │       └── syn v2.0.91 (*)
│   │       │   ├── proc-macro2 v1.0.92 (*)
│   │       │   ├── quote v1.0.37 (*)
│   │       │   └── syn v2.0.91 (*)
│   │       └── syn v2.0.91 (*)
│   ├── regex v1.11.1
│   │   ├── aho-corasick v1.1.3
│   │   │   └── memchr v2.7.4
│   │   ├── memchr v2.7.4
│   │   ├── regex-automata v0.4.9
│   │   │   ├── aho-corasick v1.1.3 (*)
│   │   │   ├── memchr v2.7.4
│   │   │   └── regex-syntax v0.8.5
│   │   └── regex-syntax v0.8.5
│   ├── rustc_version v0.4.1
│   │   └── semver v1.0.24 (*)
│   └── vergen-lib v0.1.5
│       ├── anyhow v1.0.95
│       └── derive_builder v0.20.2 (*)
│       [build-dependencies]
│       └── rustversion v1.0.18 (proc-macro)
│   [build-dependencies]
│   └── rustversion v1.0.18 (proc-macro)
├── vergen-git2 v1.0.2
│   ├── anyhow v1.0.95
│   ├── derive_builder v0.20.2 (*)
│   ├── git2 v0.19.0
│   │   ├── bitflags v2.6.0
│   │   ├── libc v0.2.169
│   │   ├── libgit2-sys v0.17.0+1.8.1
│   │   │   ├── libc v0.2.169
│   │   │   └── libz-sys v1.1.20
│   │   │       └── libc v0.2.169
│   │   │       [build-dependencies]
│   │   │       ├── cc v1.2.5 (*)
│   │   │       ├── pkg-config v0.3.31
│   │   │       └── vcpkg v0.2.15
│   │   │   [build-dependencies]
│   │   │   ├── cc v1.2.5 (*)
│   │   │   └── pkg-config v0.3.31
│   │   ├── log v0.4.22
│   │   └── url v2.5.4
│   │       ├── form_urlencoded v1.2.1
│   │       │   └── percent-encoding v2.3.1
│   │       ├── idna v1.0.3
│   │       │   ├── idna_adapter v1.2.0
│   │       │   │   ├── icu_normalizer v1.5.0
│   │       │   │   │   ├── displaydoc v0.2.5 (proc-macro)
│   │       │   │   │   │   ├── proc-macro2 v1.0.92 (*)
│   │       │   │   │   │   ├── quote v1.0.37 (*)
│   │       │   │   │   │   └── syn v2.0.91 (*)
│   │       │   │   │   ├── icu_collections v1.5.0
│   │       │   │   │   │   ├── displaydoc v0.2.5 (proc-macro) (*)
│   │       │   │   │   │   ├── yoke v0.7.5
│   │       │   │   │   │   │   ├── stable_deref_trait v1.2.0
│   │       │   │   │   │   │   ├── yoke-derive v0.7.5 (proc-macro)
│   │       │   │   │   │   │   │   ├── proc-macro2 v1.0.92 (*)
│   │       │   │   │   │   │   │   ├── quote v1.0.37 (*)
│   │       │   │   │   │   │   │   ├── syn v2.0.91 (*)
│   │       │   │   │   │   │   │   └── synstructure v0.13.1
│   │       │   │   │   │   │   │       ├── proc-macro2 v1.0.92 (*)
│   │       │   │   │   │   │   │       ├── quote v1.0.37 (*)
│   │       │   │   │   │   │   │       └── syn v2.0.91 (*)
│   │       │   │   │   │   │   └── zerofrom v0.1.5
│   │       │   │   │   │   │       └── zerofrom-derive v0.1.5 (proc-macro)
│   │       │   │   │   │   │           ├── proc-macro2 v1.0.92 (*)
│   │       │   │   │   │   │           ├── quote v1.0.37 (*)
│   │       │   │   │   │   │           ├── syn v2.0.91 (*)
│   │       │   │   │   │   │           └── synstructure v0.13.1 (*)
│   │       │   │   │   │   ├── zerofrom v0.1.5 (*)
│   │       │   │   │   │   └── zerovec v0.10.4
│   │       │   │   │   │       ├── yoke v0.7.5 (*)
│   │       │   │   │   │       ├── zerofrom v0.1.5 (*)
│   │       │   │   │   │       └── zerovec-derive v0.10.3 (proc-macro)
│   │       │   │   │   │           ├── proc-macro2 v1.0.92 (*)
│   │       │   │   │   │           ├── quote v1.0.37 (*)
│   │       │   │   │   │           └── syn v2.0.91 (*)
│   │       │   │   │   ├── icu_normalizer_data v1.5.0
│   │       │   │   │   ├── icu_properties v1.5.1
│   │       │   │   │   │   ├── displaydoc v0.2.5 (proc-macro) (*)
│   │       │   │   │   │   ├── icu_collections v1.5.0 (*)
│   │       │   │   │   │   ├── icu_locid_transform v1.5.0
│   │       │   │   │   │   │   ├── displaydoc v0.2.5 (proc-macro) (*)
│   │       │   │   │   │   │   ├── icu_locid v1.5.0
│   │       │   │   │   │   │   │   ├── displaydoc v0.2.5 (proc-macro) (*)
│   │       │   │   │   │   │   │   ├── litemap v0.7.4
│   │       │   │   │   │   │   │   ├── tinystr v0.7.6
│   │       │   │   │   │   │   │   │   ├── displaydoc v0.2.5 (proc-macro) (*)
│   │       │   │   │   │   │   │   │   └── zerovec v0.10.4 (*)
│   │       │   │   │   │   │   │   ├── writeable v0.5.5
│   │       │   │   │   │   │   │   └── zerovec v0.10.4 (*)
│   │       │   │   │   │   │   ├── icu_locid_transform_data v1.5.0
│   │       │   │   │   │   │   ├── icu_provider v1.5.0
│   │       │   │   │   │   │   │   ├── displaydoc v0.2.5 (proc-macro) (*)
│   │       │   │   │   │   │   │   ├── icu_locid v1.5.0 (*)
│   │       │   │   │   │   │   │   ├── icu_provider_macros v1.5.0 (proc-macro)
│   │       │   │   │   │   │   │   │   ├── proc-macro2 v1.0.92 (*)
│   │       │   │   │   │   │   │   │   ├── quote v1.0.37 (*)
│   │       │   │   │   │   │   │   │   └── syn v2.0.91 (*)
│   │       │   │   │   │   │   │   ├── stable_deref_trait v1.2.0
│   │       │   │   │   │   │   │   ├── tinystr v0.7.6 (*)
│   │       │   │   │   │   │   │   ├── writeable v0.5.5
│   │       │   │   │   │   │   │   ├── yoke v0.7.5 (*)
│   │       │   │   │   │   │   │   ├── zerofrom v0.1.5 (*)
│   │       │   │   │   │   │   │   └── zerovec v0.10.4 (*)
│   │       │   │   │   │   │   ├── tinystr v0.7.6 (*)
│   │       │   │   │   │   │   └── zerovec v0.10.4 (*)
│   │       │   │   │   │   ├── icu_properties_data v1.5.0
│   │       │   │   │   │   ├── icu_provider v1.5.0 (*)
│   │       │   │   │   │   ├── tinystr v0.7.6 (*)
│   │       │   │   │   │   └── zerovec v0.10.4 (*)
│   │       │   │   │   ├── icu_provider v1.5.0 (*)
│   │       │   │   │   ├── smallvec v1.13.2
│   │       │   │   │   ├── utf16_iter v1.0.5
│   │       │   │   │   ├── utf8_iter v1.0.4
│   │       │   │   │   ├── write16 v1.0.0
│   │       │   │   │   └── zerovec v0.10.4 (*)
│   │       │   │   └── icu_properties v1.5.1 (*)
│   │       │   ├── smallvec v1.13.2
│   │       │   └── utf8_iter v1.0.4
│   │       └── percent-encoding v2.3.1
│   ├── time v0.3.37
│   │   ├── deranged v0.3.11 (*)
│   │   ├── itoa v1.0.14
│   │   ├── libc v0.2.169
│   │   ├── num-conv v0.1.0
│   │   ├── num_threads v0.1.7 (*)
│   │   ├── powerfmt v0.2.0
│   │   └── time-core v0.1.2
│   ├── vergen v9.0.2 (*)
│   └── vergen-lib v0.1.5 (*)
│   [build-dependencies]
│   └── rustversion v1.0.18 (proc-macro)
└── vergen-gitcl v1.0.2
    ├── anyhow v1.0.95
    ├── derive_builder v0.20.2 (*)
    ├── time v0.3.37 (*)
    ├── vergen v9.0.2 (*)
    └── vergen-lib v0.1.5 (*)
    [build-dependencies]
    └── rustversion v1.0.18 (proc-macro)