You may know the jq JSON command line processor. You may also know the ammonite REPL. Now imagine combining these two awesome tools into a single one: that's sjq.
sjq is a small tool written in Scala that compiles your input JSON into Scala case classes, so you can use Scala expressions and the collections API to transform your data, with no need to remember any special syntax! And thanks to the embedded ammonite REPL, you get all the goodies such as syntax highlighting, auto completion on TAB and much more! Manipulating JSON data has never been easier :-)
🚧 Work in Progress 🚧 - This software is under active development and hasn't even reached its initial public release. Please bear in mind that documentation is likely missing and that the API and/or command line interface can change unexpectedly. For now, below is a small demo of how you can use sjq today.
- 1. Example of Use
- 2. Table of Contents
- 3. Installation
- 4. Usage
- 5. How it Works?
- 6. Known Issues / Future Work
Pre-built JAR files and binary distribution coming soon!
To build sjq from source code, you need to install sbt first.
$ git clone https://github.com/vaclavsvejcar/sjq.git
$ cd sjq/
$ sbt assembly
Then grab the built JAR file from ./target/scala-2.13/sjq-assembly-<VERSION>.jar and you're ready to go!
sjq offers two different modes, the Interactive (REPL) Mode which uses the ammonite REPL for interactive work, and Non-interactive (CLI) Mode which is useful when you need to call sjq from another shell script, etc.
Probably the biggest advantage over similar tools is the interactive mode, powered by the awesome ammonite REPL. In this mode, sjq takes your input JSON, compiles it into Scala code and then exposes the result through the ammonite REPL. You can then work with your data as with regular Scala code, with all the goodies such as syntax highlighting and auto completion on TAB:
$ java -jar sjq.jar repl -f path/to/file.json
--- Welcome to sjq REPL mode ---
Compiling type definitions from input JSON (this may take a while)...
[i] Variable 'root' holds Scala representation of parsed JSON
[i] Variable 'json' holds parsed JSON
[i] Variable 'ast' holds internal AST representation of data (for debugging purposes)
[i] Variable 'defs' holds generated Scala definitions (for debugging purposes)
[i] To serialize data back to JSON use '.asJson.spaces2'
Welcome to the Ammonite Repl 2.1.1 (Scala 2.13.2 Java 1.8.0_222)
@ root.users
res0: Seq[root0.users] = List(users("John Smith", 42.0), users("Peter Taylor", 67.0), users("Lucy Snow", 21.0))
@ root.users.sortBy(_.age)
res1: Seq[root0.users] = List(users("Lucy Snow", 21.0), users("John Smith", 42.0), users("Peter Taylor", 67.0))
Interactive mode is started with java -jar sjq.jar repl, and you need to specify the source of the input JSON either as a local file (-f|--file=PATH) or as an inline value (-j|--json=JSON).
The following variables are exposed to the ammonite REPL, so you can access them as needed:
- root: Scala representation of the JSON data; this is probably what you'll use the most
- json: Circe's representation of the parsed JSON data
- ast: sjq's internal representation of the parsed JSON data
- defs: Scala definitions and types generated from the JSON data
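To give an idea of what working with root feels like, here is a minimal, self-contained sketch in plain Scala. The case classes below are hand-written stand-ins modelled on the REPL output above (a users case class with a name and an age); the wrapper name Root and the object RootDemo are assumptions made for this illustration, and the definitions sjq actually generates may be named and structured differently.

```scala
// Hypothetical stand-ins for the definitions sjq generates from the
// example JSON; the real generated code may differ.
object root0 {
  final case class users(name: String, age: Double)
}
final case class Root(users: Seq[root0.users])

object RootDemo {
  // Same data as in the REPL session above
  val root: Root = Root(
    Seq(
      root0.users("John Smith", 42.0),
      root0.users("Peter Taylor", 67.0),
      root0.users("Lucy Snow", 21.0)
    )
  )

  // Names of users older than 30, youngest first
  val namesOver30: Seq[String] =
    root.users.filter(_.age > 30).sortBy(_.age).map(_.name)

  // Average age across all users
  val averageAge: Double = root.users.map(_.age).sum / root.users.size

  def main(args: Array[String]): Unit = {
    println(namesOver30)
    println(averageAge)
  }
}
```

Because the data is just case classes inside a Seq, anything from the standard collections API (filter, map, groupBy, sortBy and so on) can be used without learning a query language.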
If you don't want to use the interactive mode, the non-interactive mode is here for you. It is useful, for example, in a shell script, where sjq can perform JSON transformations as part of some larger task.
$ cat /tmp/example.json | java -jar sjq.jar cli -a "root.users.sortBy(_.age).asJson.noSpaces"
"[{\"name\":\"Lucy Snow\",\"age\":21.0},{\"name\":\"John Smith\",\"age\":42.0},{\"name\":\"Peter Taylor\",\"age\":67.0}]"
Non-interactive mode is started with java -jar sjq.jar cli. You need to specify the transformation code with the -a|--access=CODE argument and provide the input JSON either on STDIN or with the -j|--json=JSON argument.
For the curious ones, here's how sjq works under the hood:
- The first step is to parse the input JSON into an internal AST representation (see dev.svejcar.sjq.core.Parser).
- The next step is to emit valid Scala code (case classes and objects) matching the input JSON (see dev.svejcar.sjq.core.Emitter). This code is then compiled at runtime (this is the part that may take a long time).
- The last step is to read the input JSON data into the generated Scala representation. This is done using Circe's automatic derivation mechanism.
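The emit step can be illustrated with a small sketch. This is not sjq's actual Parser or Emitter: the Node AST, the EmitterSketch object and its methods are hypothetical names invented here, and the sketch ignores details a real emitter has to handle (nulls, optional fields, name clashes, field names that are Scala keywords). It only shows the general idea of turning a description of the JSON shape into case class source text.

```scala
// Hypothetical, simplified AST describing the shape of a JSON document.
// These are NOT sjq's real internal types.
sealed trait Node
case object NString extends Node
case object NNumber extends Node
case object NBoolean extends Node
final case class NArray(element: Node) extends Node
final case class NObject(fields: List[(String, Node)]) extends Node

object EmitterSketch {
  // Scala type used for a field; object fields are typed by the field name
  def typeOf(name: String, node: Node): String = node match {
    case NString      => "String"
    case NNumber      => "Double"
    case NBoolean     => "Boolean"
    case NArray(elem) => s"Seq[${typeOf(name, elem)}]"
    case NObject(_)   => name
  }

  // Finds an object node sitting directly under a field or inside arrays
  def nestedObject(node: Node): Option[NObject] = node match {
    case o: NObject   => Some(o)
    case NArray(elem) => nestedObject(elem)
    case _            => None
  }

  // Emits case class source for an object node, nested objects first
  def emit(name: String, obj: NObject): String = {
    val params = obj.fields
      .map { case (field, node) => s"$field: ${typeOf(field, node)}" }
      .mkString(", ")
    val nested = obj.fields.flatMap { case (field, node) =>
      nestedObject(node).map(emit(field, _))
    }
    (nested :+ s"case class $name($params)").mkString("\n")
  }

  def main(args: Array[String]): Unit = {
    // Shape of the example JSON: { "users": [ { "name": ..., "age": ... } ] }
    val shape = NObject(
      List("users" -> NArray(NObject(List("name" -> NString, "age" -> NNumber))))
    )
    println(emit("root", shape))
  }
}
```

For the example shape, the sketch produces one case class for users and one for root. In sjq, source text of this kind is what gets compiled at runtime before Circe decodes the input into it.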
sjq is under heavy development and things might not be perfect yet. Here is the list of known issues and/or limitations that should be targeted in future releases:
- Performance: compiling more complex JSON can be very slow, because sjq has to both compile the generated Scala code at runtime and perform generic derivation of Circe's decoders/encoders. Performance improvements should be one of the main targets for future releases.
- Occasional crashes: more complex JSON structures might cause runtime crashes, mostly due to incorrectly generated Scala code and/or mismatched derived JSON encoders/decoders. If that happens to you, please report it as a new issue.