I've been thinking through the design of the meta/sub learners, and I think it will be pretty reasonable to get Optim functionality into this framework. Here's a rough concept of what I think that looks like:
```julia
# keep vector of: ‖θ_true - θ‖
tracer = Tracer(Float64, (model, i) -> norm(θ - params(model)))

# build the MetaLearner
learner = make_learner(
    GradientLearner(...),
    TimeLimit(60),                       # stop after 60 seconds
    MaxIter(1000),                       # stop after 1000 iterations
    ShowStatus(100),                     # show a status update before iterating and every 100 iterations
    Converged(params, tol=1e-6),         # similar to x_converged for the function case
    Converged(output, tol=1e-6),         # similar to f_converged for the function case
    Converged(grad, tol=1e-6, every=10), # similar to g_converged; note: we can also check only every ith iteration
    tracer
)

# learn! is like optimize
learn!(model, learner)

# note: for the function-minimization case, it "iterates" over obs == nothing, since the
# `x` in f(x) is treated as learnable parameters in Transformations.Differentiable, NOT input
```
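To make that last note concrete, here's a rough sketch of wrapping a plain f(x) so that x itself is the learnable parameter vector. The `FunctionModel` type and the `params`/`grad`/`update!` methods below are hypothetical names for illustration, not the actual Transformations/StochasticOptimization API:

```julia
# Hypothetical sketch (names are illustrative, not the real package API):
# wrap f(x) so that x becomes the "model parameters" the learner updates,
# and each "observation" is just `nothing`.
struct FunctionModel{F,G}
    f::F                   # objective f(x)
    g!::G                  # in-place gradient g!(storage, x)
    x::Vector{Float64}     # current iterate == learnable parameters
    ∇::Vector{Float64}     # gradient storage
end

params(m::FunctionModel) = m.x
grad(m::FunctionModel)   = m.∇

# one "learning" step over obs == nothing: refresh the gradient/objective at the current x
function update!(m::FunctionModel, obs::Nothing)
    m.g!(m.∇, m.x)
    return m.f(m.x)
end
```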
Something like this would replace most of Optim.optimize, and individual algorithms would be implemented by creating sub-learners to do whatever is "special" as compared to common algos, reusing other components as necessary. For example, you may replace the SearchDirection within the GradientLearner with something specialized for that method.
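As a sketch of what that swap could look like (the `SearchDirection` abstract type, the `search_direction` function, and the constructor signature below are assumptions for illustration, not existing API):

```julia
# Hypothetical sketch: algorithm-specific behavior lives in a SearchDirection
# sub-component; everything else (stopping, tracing, step size) is shared.
abstract type SearchDirection end

struct SteepestDescent <: SearchDirection end
search_direction(::SteepestDescent, model) = -grad(model)

struct LBFGSDirection <: SearchDirection
    memory::Int   # number of (s, y) pairs kept for the two-loop recursion
end
# search_direction(dir::LBFGSDirection, model) would run the two-loop recursion here

# swapping methods is then just a constructor argument (illustrative signature):
learner = make_learner(
    GradientLearner(LBFGSDirection(10)),
    MaxIter(1000),
)
```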
Line search and similar could be made generic as implementations of LearningRate, which is a sub-learner that knows how to calculate the learning rate for a step.
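For instance, a backtracking (Armijo) line search could be one LearningRate implementation alongside a fixed step. This is only a sketch: the `learning_rate` function and the `value_at` helper are hypothetical names, not the package's API.

```julia
using LinearAlgebra: dot

# Hypothetical sketch: step-size policies as interchangeable LearningRate sub-learners.
abstract type LearningRate end

struct FixedLR <: LearningRate
    η::Float64
end
learning_rate(lr::FixedLR, model, direction) = lr.η

struct BacktrackingLS <: LearningRate
    η0::Float64   # initial step
    ρ::Float64    # shrink factor, e.g. 0.5
    c::Float64    # Armijo sufficient-decrease constant, e.g. 1e-4
end

function learning_rate(ls::BacktrackingLS, model, direction)
    η, f0, g0 = ls.η0, output(model), grad(model)
    # shrink η until the Armijo condition holds; value_at is a hypothetical helper
    # that evaluates the objective at a trial point without committing the step
    while value_at(model, params(model) .+ η .* direction) > f0 + ls.c * η * dot(g0, direction)
        η *= ls.ρ
    end
    return η
end
```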
Common components could be added through a high-level API, similar to make_learner. For a working example using SGD, see: https://github.com/JuliaML/StochasticOptimization.jl/blob/master/test/runtests.jl#L145
In the long term, beyond simple tracing, early stopping, and convergence checks, I plan on incorporating real time visualizations, animations, and more. So I think there's a benefit to get Optim functionality into this framework and pool resources.

cc: @pkofod @ChrisRackauckas @oxinabox
How does this compare with the changes already happening with Optim.jl? I'm not as up-to-date: so much to follow! But yes, I think this design makes a lot of sense.
@pkofod can hopefully answer better than me, but from what I've seen in the Optim codebase, there's implementation leakage and coupling of objects, tracing, temporary storage, etc. There are massive "state" objects that hold a range of diverse things and get passed all around, and I think adding new approaches/methods will sometimes require changes that touch a lot of files/code. I'm working toward alleviating a lot of these issues.
The "update refactor" probably decoupled pieces and made it more modular, but I'm not the one to comment there.
One thing that is not obvious: it's hard to make your own sub-components and inject them into the optimize loop, unless you can somehow wrap them in an iteration callback. Frequently you need callbacks from different parts of the iteration depending on the specific task, and that's tough right now.
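For contrast, the meta-learner loop gives each sub-learner hooks at different points of the iteration rather than one end-of-iteration callback. Roughly like this sketch, where the hook names are made up for illustration and are not the actual package functions:

```julia
# Hypothetical sketch of the meta-learner loop: each sub-learner can act at
# setup, per-iteration update, per-iteration hook, and cleanup, not just in
# a single end-of-iteration callback.
function learn!(model, meta, data)
    foreach(s -> pre_hook!(s, model), meta.learners)            # e.g. start timers, init tracers
    for (i, obs) in enumerate(data)
        foreach(s -> update!(s, model, obs), meta.learners)     # e.g. direction, step size, param update
        foreach(s -> iter_hook!(s, model, i), meta.learners)    # e.g. Tracer, ShowStatus
        any(s -> finished(s, model, i), meta.learners) && break # e.g. MaxIter, TimeLimit, Converged
    end
    foreach(s -> post_hook!(s, model), meta.learners)           # e.g. final report / plots
end
```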