
Broaden MLJ target_scitype only when using TemplateExpression #392

Open
MilesCranmer wants to merge 7 commits into master from narrow-mlj-scitype

Conversation

MilesCranmer (Owner)

Fixes #390

@ablaom how is this?
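
For reference, here is a minimal sketch of the kind of conditional trait the PR title describes: narrowing the default target scitype, and broadening it only when the model is configured with a TemplateExpression. This is not the PR's actual diff; the trait overloads, the expression_type field access, and the chosen scitypes are all assumptions for illustration.

# Sketch only (not the actual diff): narrow the type-level default, then
# broaden per-instance for TemplateExpression models. Relies on the
# StatisticalTraits convention that instance traits fall back to type traits.
using SymbolicRegression: SRRegressor, TemplateExpression
import MLJModelInterface as MMI
using ScientificTypesBase: Continuous

# Default: plain models accept continuous target vectors only.
MMI.target_scitype(::Type{<:SRRegressor}) = AbstractVector{<:Continuous}

# Broadened only for template expressions (the `expression_type` field name
# is taken from the SymbolicRegression docs; treat it as an assumption here).
function MMI.target_scitype(model::SRRegressor)
    if model.expression_type isa Type && model.expression_type <: TemplateExpression
        return AbstractVector  # template outputs need not be Continuous
    end
    return MMI.target_scitype(typeof(model))
end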

MilesCranmer force-pushed the narrow-mlj-scitype branch 3 times, most recently from 37e7021 to a394da2 on December 21, 2024 03:29
github-actions bot commented Dec 21, 2024

Benchmark Results

| benchmark | master | c8a43be... | master/c8a43bea0b32cb... |
|---|---|---|---|
| search/multithreading | 16.9 ± 0.93 s | 16.5 ± 1.3 s | 1.03 |
| search/serial | 28.6 ± 0.21 s | 28.7 ± 0.47 s | 0.996 |
| utils/best_of_sample | 1.62 ± 0.4 μs | 1.6 ± 1.5 μs | 1.01 |
| utils/check_constraints_x10 | 11.1 ± 2.8 μs | 11.2 ± 2.9 μs | 0.985 |
| utils/compute_complexity_x10/Float64 | 2.06 ± 0.12 μs | 2.09 ± 0.11 μs | 0.986 |
| utils/compute_complexity_x10/Int64 | 2.04 ± 0.11 μs | 2.07 ± 0.12 μs | 0.986 |
| utils/compute_complexity_x10/nothing | 1.47 ± 0.14 μs | 1.41 ± 0.099 μs | 1.04 |
| utils/insert_random_op_x10 | 5.77 ± 1.9 μs | 5.8 ± 1.8 μs | 0.995 |
| utils/next_generation_x100 | 0.354 ± 0.094 ms | 0.431 ± 0.068 ms | 0.822 |
| utils/optimize_constants_x10 | 0.0335 ± 0.0083 s | 0.0346 ± 0.0082 s | 0.967 |
| utils/randomly_rotate_tree_x10 | 5.28 ± 0.59 μs | 5.26 ± 0.6 μs | 1 |
| time_to_load | 1.85 ± 0.018 s | 1.85 ± 0.0095 s | 1 |

Benchmark Plots

A plot of the benchmark results has been uploaded as an artifact to the workflow run for this PR.
Go to "Actions"->"Benchmark a pull request"->[the most recent run]->"Artifacts" (at the bottom).

coveralls commented Dec 21, 2024

Pull Request Test Coverage Report for Build 12509324821

Details

  • 20 of 22 (90.91%) changed or added relevant lines in 3 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.01%) to 94.776%

Changes Missing Coverage:

| File | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| src/MLJInterface.jl | 16 | 18 | 88.89% |
Totals Coverage Status:

  • Change from base Build 12509301759: +0.01%
  • Covered Lines: 3157
  • Relevant Lines: 3331

💛 - Coveralls

ablaom commented Dec 22, 2024

Thanks for this. Unfortunately, I'm getting a new problem. It looks like certain kinds of table are not supported. And training is extraordinarily slow.

using SymbolicRegression
using MLJBase
using Tables

M = SRRegressor

X = (x1 = rand(5), x2 = rand(5))
y = rand(5)

mach = machine(M(), X, y)
@time fit!(mach, verbosity=0)  # second attempt
# 6.551545 seconds (209.99 M allocations: 10.681 GiB, 39.67% gc time, 0.10% compilation time: 100% of which was recompilation)

X = Tables.rowtable(X)
machine(M(), X, y) |> fit!
# julia> machine(M(), X, y) |> fit!
# [ Info: Training machine(SRRegressor(defaults = nothing, …), …).
# ┌ Error: Problem fitting the machine machine(SRRegressor(defaults = nothing, …), …).
# └ @ MLJBase ~/.julia/packages/MLJBase/7nGJF/src/machines.jl:694
# [ Info: Running type checks...
# [ Info: Type checks okay.
# ERROR: MethodError: no method matching haskey(::Vector{@NamedTuple{x1::Float64, x2::Float64}}, ::Symbol)
# The function `haskey` exists, but no method is defined for this combination of argument types.
  [a7f614a8] MLJBase v1.7.0
  [697918b4] MLJTestIntegration v0.5.2
  [8254be44] SymbolicRegression v1.5.0 `https://github.com/MilesCranmer/SymbolicRegression.jl.git#narrow-mlj-scitype`
  [bd369af6] Tables v1.12.0
julia> versioninfo()
Julia Version 1.11.1
Commit 8f5b7ca12ad (2024-10-16 10:53 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: macOS (x86_64-apple-darwin22.4.0)
  CPU: 12 × Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, skylake)
Threads: 12 default, 0 interactive, 6 GC (on 12 virtual cores)
Environment:
  JULIA_LTS_PATH = /Applications/Julia-1.10.app/Contents/Resources/julia/bin/julia
  JULIA_PATH = /Applications/Julia-1.11.app/Contents/Resources/julia/bin/julia
  JULIA_EGLOT_PATH = /Applications/Julia-1.7.app/Contents/Resources/julia/bin/julia
  JULIA_NUM_THREADS = 12
  JULIA_NIGHTLY_PATH = /Applications/Julia-1.11.app/Contents/Resources/julia/bin/julia
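
For illustration, the MethodError above arises because a row table is a Vector of NamedTuples, so column-style lookups like haskey(X, :x1) have no method for it. A small sketch follows; Tables.columntable is shown as one generic way to normalize either layout, though whether that is the right fix for the interface code here is left to the PR.

using Tables

Xcol = (x1 = rand(5), x2 = rand(5))   # column table: NamedTuple of vectors
Xrow = Tables.rowtable(Xcol)          # row table: Vector of NamedTuples

haskey(Xcol, :x1)     # true
# haskey(Xrow, :x1)   # reproduces the MethodError above

# Normalizing to columns makes either layout behave the same.
Tables.columntable(Xrow) == Xcol      # true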

MilesCranmer (Owner, Author)

Is this related to this PR, or is it a different issue? Is this PR itself ok to merge, though? I can work on that issue afterwards.

MilesCranmer (Owner, Author) commented Dec 22, 2024

If I run your example I get

julia> @time fit!(mach, verbosity=0)
  3.370594 seconds (202.80 M allocations: 10.300 GiB, 22.58% gc time, 1.83% compilation time)

Doesn't seem too bad? Note that SRRegressor doesn't shrink its default search parameters when the dataset is small, so even a 5-row dataset runs the full default search budget.
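
Since the search budget doesn't scale down with the data, shrinking it explicitly makes small-data smoke tests much faster. A minimal sketch; the parameter values below are illustrative only, not recommendations:

using SymbolicRegression, MLJBase

X = (x1 = rand(5), x2 = rand(5))
y = rand(5)

# niterations and population_size are standard SRRegressor hyperparameters;
# small values here just shorten the smoke test.
model = SRRegressor(niterations=5, population_size=20)
mach = machine(model, X, y)
@time fit!(mach, verbosity=0)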

MilesCranmer (Owner, Author)

Regarding Tables.rowtable(X), is there a way I can test that and other table types with MLJTestInterface? Currently the MLJTestInterface tests are passing, so I guess there is a blind spot for types it should be compatible with.
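
One possible stopgap, sketched below, is to feed both layouts of the same dataset through MLJTestInterface.test directly. This assumes MLJTestInterface's make_regression dataset generator and a test(models, data...; ...) signature analogous to the MLJTestIntegration docstring quoted later in this thread; both are assumptions here, not confirmed API.

using MLJTestInterface, SymbolicRegression, Tables, Test

X, y = MLJTestInterface.make_regression()
for data in (X, Tables.rowtable(X))   # column table, then row table
    failures, summary = MLJTestInterface.test(
        [SRRegressor], data, y;
        mod=@__MODULE__, verbosity=0, throw=false,
    )
    @test isempty(failures)
end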

ablaom commented Dec 26, 2024

> Regarding Tables.rowtable(X), is there a way I can test that and other table types with MLJTestInterface? Currently the MLJTestInterface tests are passing, so I guess there is a blind spot for types it should be compatible with.

MLJTestIntegration.jl provides more comprehensive testing, plus a second signature for test that auto-generates some datasets (both column tables and row tables). It also has heavier dependencies. However, as these auto-generated datasets may not be sufficient for all models, we currently suggest packages create their own "comprehensive" collection of datasets and test those using MLJTestInterface.test, which should suffice to test just the "interface".

However, to improve coverage, you could add MLJTestIntegration tests, following the example below, to test with the auto-generated datasets:

using SymbolicRegression
using MLJTestIntegration

MLJTestIntegration.test(
    SRRegressor;  # second signature: a single model type
    verbosity=0,  # bump to debug
    throw=true,
    level=2,
)

Use level > 2 for more than just interface tests - up to level=4. Full docstring is below. I'm running this code with level = 4 in the MLJ integration tests for all models, and this is how I am finding your bugs.

(Some of the advanced tests require that models be truly reproducible with default hyperparameters. However, as I found this is not always the case, we check this manually before proceeding; for slow models, this can add a bit of test time.)

test(models, data...; mod=Main, level=2, throw=false, verbosity=1)
test(model; mod=Main, level=2, throw=false, verbosity=1)

The first signature above applies a battery of MLJ integration tests to a collection of
models, using data for training. Here mod should be the module from which test is
called (generally, mod=@__MODULE__ will work). Here models is either:

  1. A collection of model types implementing the MLJ model interface.

  2. A collection of named tuples, where each tuple includes :name and :package_name as keys, and whose corresponding values point to a model type appearing in the MLJ Model Registry. MLJ.models(...) always returns such a collection.

The second signature applies the same tests to a single model (of either type above) but
applies them using all MLJTestIntegration.jl datasets that might conceivably apply. If
none appear to apply, an exception is thrown.

Ordinarily, code defining the model types to be tested must already be
loaded into the module mod. An exception is described under "Testing
with automatic code loading" below.

The extent of testing is controlled by level:

| level | description | tests (full list below) |
|---|---|---|
| 1 | test code loading | :model_type |
| 2 (default) | basic test of model interface | first four tests |
| 3 | comprehensive, CPU1() | all non-accelerated tests |
| 4 | comprehensive | all tests |

By default, exceptions caught in tests are not thrown. If
throw=true, testing will terminate at the first exception
encountered, after throwing that exception (useful to obtain stack
traces).

Return value

The first signature returns (failures, summary) where:

  • failures: table of exceptions thrown

  • summary: table summarizing the outcomes of each test, where
    outcomes are indicated as below:

| entry | interpretation |
|---|---|
| ✓ | test successful |
| × | test unsuccessful |
| n/a | skipped because not applicable |
| - | test skipped for some other reason |

In the special case of operations, an empty entry, "", indicates that there don't
appear to be any operations implemented.

The second signature returns a vector of exceptions encountered.

Testing with automatic code loading

World Age issues pose challenges for testing Julia code if some code
is to be loaded "on demand". Nevertheless, in case 2 mentioned above,
model types to be tested need not be loaded, provided testing is
carried out in two stages, as shown in the second example below. In
this case, however, the necessary model interface packages need
to be listed in the current Julia environment, and the test calls
must appear in global scope.

Examples

Testing models in a new MLJ model interface implementation

The following tests the model interface implemented by some model type
MyClassifier, as might appear in tests for a package providing that
type:

import MLJTestIntegration
using Test
X, y = MLJTestIntegration.MLJ.make_blobs()
failures, summary = MLJTestIntegration.test([MyClassifier, ], X, y, verbosity=1, mod=@__MODULE__)
@test isempty(failures)

Testing models after filtering models in the registry

The following applies comprehensive integration tests to all
regressors provided by the package GLM.jl appearing in the MLJ Model
Registry. Since GLM.jl models are provided through the interface
package MLJGLMInterface, this must be in the current environment:

Pkg.add("MLJGLMInterface")
import MLJBase, MLJTestIntegration
using DataFrames # to view summary
X, y = MLJTestIntegration.MLJ.make_regression();
regressors = MLJTestIntegration.MLJ.models(matching(X, y)) do m
    m.package_name == "GLM"
end

# to test code loading *and* load code:
MLJTestIntegration.test(regressors, X, y, verbosity=1, mod=@__MODULE__, level=1)

# comprehensive tests:
failures, summary =
    MLJTestIntegration.test(regressors, X, y, verbosity=1, mod=@__MODULE__, level=3)

summary |> DataFrame

List of tests

Tests are applied in sequence. When a test fails, subsequent tests for
that model are skipped. The following are applied to all models:

  • :model_type: Load model type using registry (if named tuples are
    provided) or using load_path(model_type) (if types are provided, to
    check load_path trait is correctly overloaded).

  • :model_instance: Create a default instance.

  • :fitted_machine: Bind instance to data in a machine and fit!. Call report and
    fitted_params on the machine.

  • :operations: Call implemented operations, such as predict and transform

These additional tests are applied to Supervised models:

  • :threshold_prediction: If the model is Probabilistic and
    scitype(data[2]) <: Finite{2} (binary classification) then wrap
    model using BinaryThresholdPredictor and fit!.

  • :evaluation: Assuming MLJ is able to infer a suitable measure
    (metric), evaluate the performance of the model using evaluate!
    and cross-validation.

  • :accelerated_evaluation: Assuming the model appears to make
    repeatable predictions on retraining, repeat the :evaluation test
    using CPUThreads() acceleration and check agreement with CPU1() case.

  • :tuned_pipe_evaluation: Repeat the :evaluation test but first
    insert model in a pipeline with a trivial pre-processing step
    (applies the identity transformation) and wrap in TunedModel (only
    the default instance is actually evaluated).

  • :ensemble_prediction: Wrap the model as EnsembleModel, train, and
    attempt a predict call.

  • :iteration_prediction: If the model is iterable, repeat the
    :evaluation test but first wrap as an IteratedModel.

  • :stack_evaluation: test a Stack within a Stack, with the model
    being tested appearing at two levels, and evaluate the
    Stack. (Other base models and adjudicators in the double stack are
    instances of KNNClassifier or KNNRegressor.)
    This test is only applied to single target supervised models that
    are probabilistic classifiers or deterministic regressors.

  • :accelerated_stack_evaluation: If the model appears to make
    repeatable predictions on retraining, check consistency of
    evaluations for Stack(acceleration=CPU1(), ...) and
    Stack(acceleration=CPUThreads(), ...) (in the double stack above).

ablaom commented Dec 26, 2024

I should add that, in our experience, it is rare to find models that pass the interface tests but fail the integration tests.

MilesCranmer (Owner, Author)

Got it, thanks. I will likely add those integration tests in the future.

> (Some of the advanced tests require that models be truly reproducible with default hyperparameters. However, as I found this is not always the case, we check this manually before proceeding; for slow models, this can add a bit of test time.)

Just for reference, the SRRegressor can be made reproducible with the following setup:

model = SRRegressor(
    parallelism=:serial,
    deterministic=true,
    seed=0  # or whatever other seed
)

It's a bit slower but it does guarantee reproducibility. The default parallelism=:multithreading prevents determinism simply due to the non-deterministic nature of CPU thread scheduling.
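
As a sanity check, two fits with that setup should agree. A minimal sketch, assuming the parallelism/deterministic/seed kwargs shown above; niterations is lowered only to keep the check quick:

using SymbolicRegression, MLJBase

X = (x1 = rand(20),)
y = 2 .* X.x1

model = SRRegressor(parallelism=:serial, deterministic=true, seed=0, niterations=10)
mach1 = machine(model, X, y); fit!(mach1, verbosity=0)
mach2 = machine(model, X, y); fit!(mach2, verbosity=0)

# With a deterministic serial search, repeated fits should give identical predictions.
predict(mach1, X) == predict(mach2, X)  # expected: true with this setup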

Even things like SIMD operations mess with determinism, because floating-point addition is not associative:

julia> (0.1 + 0.2) - 0.3
5.551115123125783e-17

julia> 0.1 + (0.2 - 0.3)
2.7755575615628914e-17

ablaom commented Dec 26, 2024

> julia> (0.1 + 0.2) - 0.3
> 5.551115123125783e-17

The test for reproducibility is not that fussy; it uses isapprox. But yeah, your default model still fails the repeatability test. I think this just rules out some multithreading options in things like cross-validation, but I forget the details for now.

MilesCranmer (Owner, Author) commented Dec 26, 2024

Oh, by (0.1 + 0.2) - 0.3 != 0.1 + (0.2 - 0.3) I was just explaining one internal reason why SymbolicRegression.jl isn't reproducible in multithreaded mode. There are many reasons, in addition to summation order depending on the CPU job scheduler, why this is true.
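
As a self-contained illustration of the scheduling point (plain Julia, nothing SymbolicRegression-specific): any parallel or SIMD reduction amounts to picking some association order for the sum, and different orders round differently.

xs = fill(0.1, 10_000)

left = foldl(+, xs)   # strictly left-to-right association
pairwise = sum(xs)    # Base.sum uses pairwise (tree) association

left == pairwise           # typically false: each order rounds differently
isapprox(left, pairwise)   # true: the discrepancy is only at rounding level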

Commit: fix: make `:class` col more generic to `X` type
MilesCranmer (Owner, Author)

Ok, I merged #393 here.

Successfully merging this pull request may close issue #390: [BUG]: [MLJ Interface] SRRegressor likely has too broad a target_scitype.