Internal change
PiperOrigin-RevId: 459800928
achoum authored and copybara-github committed Jul 8, 2022
1 parent 06a1fff commit 449a15f
Showing 10 changed files with 986 additions and 2 deletions.
5 changes: 3 additions & 2 deletions README.md
@@ -5,9 +5,9 @@
**Yggdrasil Decision Forests** (**YDF**) is a collection of state-of-the-art
algorithms for the training, serving and interpretation of **Decision Forest**
models. The library is developed in C++ and available in C++, CLI
-(command-line-interface, i.e. shell commands) and in TensorFlow under the name
+(command-line-interface, i.e. shell commands), in TensorFlow under the name
[TensorFlow Decision Forests](https://github.com/tensorflow/decision-forests)
-(TF-DF).
+(TF-DF), and in Javascript (inference only).

Developing models in TF-DF and productionizing them (possibly including
re-training) in C++ with YDF allows both for a flexible and fast development and
@@ -88,6 +88,7 @@ The following resources are available:
- [Known issues](documentation/known_issues.md)
- [Changelog](CHANGELOG.md)
- [TensorFlow Decision Forest](https://github.com/tensorflow/decision-forests)
- [Javascript port](port/javascript)

## Installation from pre-compiled binaries

105 changes: 105 additions & 0 deletions yggdrasil_decision_forests/port/javascript/BUILD
@@ -0,0 +1,105 @@
load("@emsdk//emscripten_toolchain:wasm_rules.bzl", "wasm_cc_binary")

package(
default_visibility = ["//yggdrasil_decision_forests/port/javascript:users"],
licenses = ["notice"],
)

package_group(
name = "users",
packages = [
"//learning/lib/ami/simple_ml/ports/javascript/...",
"//yggdrasil_decision_forests/port/javascript/...",
],
)

exports_files(["wrapper.js"])

# Change the extension of the wrapper js file. This is necessary for Emscripten.
genrule(
name = "wrapper",
srcs = ["wrapper.js"],
outs = ["wrapper.lds"],
cmd = "cat $(SRCS) > $@",
visibility = ["//visibility:private"],
)

# Web assembly logic (part 1).
#
# See https://github.com/emscripten-core/emscripten/blob/main/src/settings.js for the description
# of the linkopts.
cc_binary(
name = "inference",
srcs = ["inference.cc"],
defines = [],
linkopts = [
"--bind",
"-s EXPORTED_RUNTIME_METHODS=FS", # To access YDF output file from JS.
"-s ALLOW_MEMORY_GROWTH=1",
"-s EXIT_RUNTIME=0",
"-s MALLOC=emmalloc",
"-s MODULARIZE=1",
"-s DYNAMIC_EXECUTION=0",
"-s EXPORT_NAME=YggdrasilDecisionForests",
"-s FILESYSTEM=1", # Link filesystem (should be automatic in some cases).
# "-s -g", # Function names in stack trace.
# "-s ASSERTIONS=2", # Runtime checks for common memory allocation errors.
"-s DEMANGLE_SUPPORT=1", # Better function names in stack traces.
# fetchSettings is included to bypass CORS issues during development
"-s INCOMING_MODULE_JS_API=onRuntimeInitialized,fetchSettings,print,printErr",
"--post-js yggdrasil_decision_forests/port/javascript/wrapper.js",
],
tags = [
"manual",
"nobuilder",
"notap",
],
visibility = ["//visibility:private"],
deps = [
"//yggdrasil_decision_forests/learner:learner_library",
"//yggdrasil_decision_forests/learner/cart",
"//yggdrasil_decision_forests/learner/gradient_boosted_trees",
"//yggdrasil_decision_forests/learner/random_forest",
"//yggdrasil_decision_forests/model:model_library",
"//yggdrasil_decision_forests/utils:logging",
],
)

# Web assembly logic (part 2).
wasm_cc_binary(
name = "inference_wasm",
cc_target = ":inference",
tags = [
"manual",
"notap",
],
)

# Extract the Emscripten wasm file.
genrule(
name = "extract_wasm_file",
srcs = [":inference_wasm"],
outs = ["inference.wasm"],
cmd = "cp $(BINDIR)/yggdrasil_decision_forests/port/javascript/inference_wasm/inference.wasm $(@D)/",
)

# Extract the merged Emscripten js + wrapper file.
genrule(
name = "extract_js_file",
srcs = [":inference_wasm"],
outs = ["inference.js"],
cmd = "cp $(BINDIR)/yggdrasil_decision_forests/port/javascript/inference_wasm/inference.js $(@D)/",
)

# Zip the library.
genrule(
name = "create_release",
srcs = [
":extract_wasm_file",
":extract_js_file",
],
outs = ["ydf.zip"],
cmd = "zip -j $@ $(locations :extract_wasm_file) $(locations :extract_js_file) && " +
"echo Zipfile information: && zipinfo $@ && " +
"echo Zipfile ls: && ls -lh $@",
)
177 changes: 177 additions & 0 deletions yggdrasil_decision_forests/port/javascript/README.md
@@ -0,0 +1,177 @@
# Yggdrasil / TensorFlow Decision Forests in Javascript

The Yggdrasil Decision Forests Javascript port is a Javascript + WebAssembly
library to run (i.e. generate the predictions of) Yggdrasil Decision Forests
models on the web.

This library is compatible with TensorFlow Decision Forests models.

## Usage example

The [example/](example) directory (i.e.
`yggdrasil_decision_forests/port/javascript/example/example.html`) contains a
running example of YDF model inference in a webpage. The following section
details how this example works.

**Note:** The shell commands below should be run from the Yggdrasil root
directory i.e. the directory containing `CHANGELOG.md`.

**Step 1**

First, train and save to disk a Yggdrasil Decision Forests model using one of
the available APIs. See the
[Yggdrasil TensorFlow Decision Forests page](../../README.md) for more details.

- C++:
[user manual](https://github.com/google/yggdrasil-decision-forests/blob/main/documentation/user_manual.md),
[example](https://source.corp.google.com/piper///depot/google3/third_party/yggdrasil_decision_forests/examples/beginner.cc).
- CLI:
[user manual](https://github.com/google/yggdrasil-decision-forests/blob/main/documentation/user_manual.md),
[example](https://source.corp.google.com/piper///depot/google3/third_party/yggdrasil_decision_forests/examples/beginner.sh).
- Python (with TensorFlow Decision Forests):
[website](https://www.tensorflow.org/decision_forests),
[examples](https://www.tensorflow.org/decision_forests/tutorials). The
Yggdrasil Decision Forests model is located in the `assets` subdirectory of
the TensorFlow model.

Note: A Yggdrasil Decision Forests model is a directory containing a
`data_spec.pb` file.
[Here](https://github.com/google/yggdrasil-decision-forests/tree/main/yggdrasil_decision_forests/test_data/model/adult_binary_class_gbdt)
is an example of a model.

If the size of the model or its inference speed is important to you, the
following suggestions can help optimize it:

1. Gradient Boosted Trees models are both smaller and faster than Random Forest
models. If both have the same quality, prefer a Gradient Boosted Trees
model.

2. The number of trees (controlled with the `num_trees` parameter) impacts the
size of Random Forest models.

3. If you don't expect to interpret the model, always use the
`keep_non_leaf_label_distribution=False` advanced hyperparameter.

**Step 2**

Archive the model in a zip file.

```shell
zip -jr model.zip /path/to/my/model
```

For this example, we can use one of the unit test models:

```shell
zip -jr model.zip yggdrasil_decision_forests/test_data/model/adult_binary_class_gbdt
```
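Because `-j` stores the files without their directory paths, `data_spec.pb` should end up at the root of the archive. The sketch below is a minimal sanity check on the archive's entry names. `hasRootDataSpec` is a hypothetical helper, not part of the library, and obtaining the entry names (for example with a zip reader such as JSZip) is left to the caller.

```js
// Hypothetical helper: returns true if the list of archive entry names
// contains "data_spec.pb" at the root (i.e. without any directory prefix).
function hasRootDataSpec(entryNames) {
  return entryNames.includes("data_spec.pb");
}

// Entry names as they would look with and without `zip -j` (illustrative values).
console.log(hasRootDataSpec(["data_spec.pb", "header.pb"])); // true
console.log(hasRootDataSpec(["model/data_spec.pb"])); // false
```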

**Step 3**

Download the Yggdrasil Decision Forest Javascript port library using one of the
following options:

- Download the [pre-compiled binaries]() (not yet available).
- Compile the library from source by running:

```shell
# Compile Yggdrasil Decision Forest Webassembly inference
# The result is available at "bazel-bin/yggdrasil_decision_forests/port/javascript/ydf.zip"
bazel build -c opt --config=wasm //yggdrasil_decision_forests/port/javascript:create_release
```

Then, decompress the library:

```shell
unzip bazel-bin/yggdrasil_decision_forests/port/javascript/ydf.zip \
-d yggdrasil_decision_forests/port/javascript/example/ydf
```

The library is composed of two files:

```
ydf/inference.js
ydf/inference.wasm
```

**Step 4**

Add the library to the HTML header of your webpage. Also add
[JSZip](https://stuk.github.io/jszip/).

```html
<!-- Yggdrasil Decision Forests -->
<script src="ydf/inference.js"></script>
<!-- JSZip -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/jszip/3.10.0/jszip.min.js"></script>
```

See `yggdrasil_decision_forests/port/javascript/example/example.html` for an
example.

**Step 5**

In Javascript, load the library:

```js
let ydf = null;
YggdrasilDecisionForests().then(function (m) {
ydf = m;
console.log("The library is loaded");
});
```

Then, load the model from a URL:

```js
let model = null;
ydf.loadModelFromUrl("https://path/to/my/model.zip").then((loadedModel) => {
model = loadedModel;

console.log("The model is loaded");
console.log("The input features of the model are:", model.getInputFeatures());
});
```
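Both snippets above are asynchronous, so `ydf` is still `null` until the first promise resolves. The sketch below sequences the two steps with `async`/`await`, using only the calls shown above. `loadLibraryAndModel` is a hypothetical helper; the factory is passed as an argument so that the function does not depend on a global.

```js
// Hypothetical helper: load the library first, then the model.
// `factory` is expected to be the `YggdrasilDecisionForests` function
// defined by inference.js.
async function loadLibraryAndModel(factory, modelUrl) {
  const ydf = await factory(); // Resolves once the library is ready.
  const model = await ydf.loadModelFromUrl(modelUrl);
  return { ydf, model };
}

// Usage in the page (not executed here):
// loadLibraryAndModel(YggdrasilDecisionForests, "https://path/to/my/model.zip")
//   .then(({ model }) => console.log(model.getInputFeatures()));
```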

Compute predictions with the model:

```js
let examples = {
feature_1: [1, null, 3], // "null" represents a missing value.
feature_2: ["cat", "dog", "tiger"],
};
let predictions = model.predict(examples);
```
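Data often arrives as one object per example rather than one array per feature. The sketch below shows the conversion, assuming that `model.predict` accepts exactly the column-oriented layout shown above. `rowsToColumns` is a hypothetical helper, not part of the library, and the feature names are passed explicitly.

```js
// Hypothetical helper: convert row-oriented records into a column-oriented
// object. Fields that are absent from a row become null (missing value).
function rowsToColumns(rows, featureNames) {
  const columns = {};
  for (const name of featureNames) {
    columns[name] = rows.map((row) => {
      const value = row[name];
      return value === undefined ? null : value;
    });
  }
  return columns;
}

const rows = [
  { feature_1: 1, feature_2: "cat" },
  { feature_2: "dog" }, // feature_1 is missing.
  { feature_1: 3, feature_2: "tiger" },
];
const columnExamples = rowsToColumns(rows, ["feature_1", "feature_2"]);
```

Here `columnExamples` has the same content as the `examples` object above.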

Finally, unload the model:

```js
model.unload();
model = null;
```
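Since the model has to be unloaded explicitly, it is easy to leak it when an exception is thrown between loading and unloading. The sketch below guarantees the `unload()` call with `try`/`finally`. `withModel` is a hypothetical helper, not part of the library.

```js
// Hypothetical helper: run `fn` with the model and always unload the model
// afterwards, even if `fn` throws.
function withModel(model, fn) {
  try {
    return fn(model);
  } finally {
    model.unload();
  }
}

// Usage in the page (not executed here):
// const predictions = withModel(model, (m) => m.predict(examples));
```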

**Step 6**

Start an HTTP(S) server:

```shell
# Start an HTTP server with Python.
(cd yggdrasil_decision_forests/port/javascript/example && python3 -m http.server)
```

Open the webpage `http://localhost:8000/example.html`.

**Step 7**

In this example, you can see three buttons:

- **Load model:** Download and load the model into memory.
- **Apply model:** Apply the model to the toy examples specified in the `Input
examples` text area.
- **Unload model:** Unload the model from memory.
5 changes: 5 additions & 0 deletions yggdrasil_decision_forests/port/javascript/example/BUILD
@@ -0,0 +1,5 @@

package(
default_visibility = ["//yggdrasil_decision_forests/port/javascript:users"],
licenses = ["notice"],
)
36 changes: 36 additions & 0 deletions yggdrasil_decision_forests/port/javascript/example/README.md
@@ -0,0 +1,36 @@
# Usage example for Yggdrasil Decision Forests in JavaScript

See "../README.md" for details.

## Data

The `model.zip` was created as:

```shell
zip -r -j third_party/yggdrasil_decision_forests/ports/javascript/model.zip \
third_party/yggdrasil_decision_forests/test_data/model/adult_binary_class_gbdt
```

The 4 input examples are the first 4 examples in
`third_party/yggdrasil_decision_forests/test_data/dataset/adult_test.csv`.

The expected predictions on those examples are the values shown as comments at
the end of the block below. They can be generated using the CLI:

```shell
bazel build -c opt //third_party/yggdrasil_decision_forests/cli:predict

./bazel-bin/third_party/yggdrasil_decision_forests/cli/predict \
--alsologtostderr \
--model=third_party/yggdrasil_decision_forests/test_data/model/adult_binary_class_gbdt \
--dataset=csv:third_party/yggdrasil_decision_forests/test_data/dataset/adult_test.csv \
--output=csv:/tmp/predictions.csv

head /tmp/predictions.csv
# <=50K,>50K
# 0.987869,0.0121307
# 0.668998,0.331002
# 0.219888,0.780112
# 0.88848,0.11152
```
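When checking the Javascript port against these values, an exact comparison is fragile because the CLI output is rounded. The sketch below is a tolerance-based comparison. It assumes the predictions have been extracted into a plain array of probabilities for the `>50K` class, since the return format of `model.predict` is not described here. `allClose` is a hypothetical helper.

```js
// Probabilities of the ">50K" class, copied from the CLI output above.
const expected = [0.0121307, 0.331002, 0.780112, 0.11152];

// Hypothetical helper: true if both arrays have the same length and every
// pair of values differs by at most `tolerance`.
function allClose(actual, expectedValues, tolerance = 1e-5) {
  if (actual.length !== expectedValues.length) return false;
  return actual.every((v, i) => Math.abs(v - expectedValues[i]) <= tolerance);
}
```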