Releases: microsoft/FLAML
v0.8.2
What's Changed
- include default value in rf search space by @sonichi in #317
- adding TODOs for NLP module, so students can implement other tasks easier by @liususan091219 in #321
- pred_time_limit clarification and logging by @sonichi in #319
- bug fix in config2params by @sonichi in #323
Full Changelog: v0.8.1...v0.8.2
v0.8.1
What's Changed
- Update test_regression.py by @fengsxy in #306
- Add conda forge minimal test by @MichalChromcak in #309
- fixing config2params for transformersestimator by @liususan091219 in #316
- Code quality improvement based on #275 by @abnsy and @sonichi in #313
- skip cv preparation if eval_method is holdout by @sonichi in #314
Full Changelog: v0.8.0...v0.8.1
v0.8.0
In this release, we add two NLP tasks to `flaml.AutoML`: sequence classification and sequence regression, using transformer-based neural networks. Previously, the NLP module was detached from `flaml.AutoML` with a separate API. We redesigned the API so that NLP tasks can be accessed the same way as other tasks, and more NLP tasks can be added easily in the future. Thanks for the hard work @liususan091219!
We've also continued to make more performance & feature improvements. Examples:
- We added a variation of the XGBoost search space that limits `max_depth` and includes the default configuration from the XGBoost library. The new search space leads to significantly better performance on some regression datasets.
- We allow arguments for `flaml.AutoML` to be passed to the constructor. This enables multioutput regression by combining sklearn's `MultiOutputRegressor` and flaml's `AutoML`.
- We made more memory optimizations, while allowing users to keep the best model per estimator in memory through the "model_history" option.
What's Changed
- Unify regression and classification for XGBoost by @sonichi in #276
- when max_iter=1, skip search only if retrain_final by @sonichi in #280
- example update by @sonichi in #281
- Merge exp into flaml by @liususan091219 in #210
- add best_loss_per_estimator by @qingyun-wu in #286
- model_history -> save_best_model_per_estimator by @sonichi in #283
- datetime feature engineering by @sonichi in #285
- add warmstart test by @qingyun-wu in #298
- empty search space by @sonichi in #295
- multioutput regression by @sonichi in #292
- add max_depth to xgboost search space by @sonichi in #282
- custom metric function clarification by @sonichi in #300
- checkpoint naming in nonray mode, fix ray mode, delete checkpoints in nonray mode by @liususan091219 in #293
Full Changelog: v0.7.1...v0.8.0
v0.7.1
v0.7.0
New feature: multivariate time series forecasting.
What's Changed
- Fix exception in CFO's `_create_condition` when no candidate start point has returned yet by @Yard1 in #263
- Integrate multivariate time series forecasting by @int-chaos in #254
- Update Dockerfile by @wuchihsu in #269
- limit time and memory consumption by @sonichi in #264
Full Changelog: v0.6.9...v0.7.0
v0.6.9
v0.6.8
What's Changed
- fix the bug in hierarchical search space (#248); make dependency on lgbm and xgboost optional (#252) by @sonichi in #250
- Add conda forge badge by @MichalChromcak in #251
New Contributors
- @MichalChromcak made their first contribution in #251
Full Changelog: v0.6.7...v0.6.8
v0.6.7
What's Changed
- remove big objects after fit by @sonichi in #176
- remove catboost training dir by @sonichi in #178
- Forecast v2 by @int-chaos in #182
- Fix decide_split_type bug. by @gianpdomiziani in #184
- Cleanml by @qingyun-wu in #185
- warmstart blendsearch by @sonichi in #186
- variable name by @sonichi in #187
- notebook example by @sonichi in #189
- make flaml work without catboost by @sonichi in #197
- package name in setup by @sonichi in #198
- clean up forecast notebook by @sonichi in #202
- consider num_samples in bs thread priority by @sonichi in #207
- accommodate nni usage pattern by @sonichi in #209
- random search by @sonichi in #213
- add consistency test by @qingyun-wu in #216
- set converge flag when no trial can be sampled by @sonichi in #217
- seed for hpo method by @sonichi in #224
- update config if n_estimators is modified by @sonichi in #225
- warning -> info for low cost partial config by @sonichi in #231
- Consistent California by @cdeil in #245
- Package by @sonichi in #244
Full Changelog: v0.6.0...v0.6.7
v0.6.6
What's Changed
- remove big objects after fit by @sonichi in #176
- remove catboost training dir by @sonichi in #178
- Forecast v2 by @int-chaos in #182
- Fix decide_split_type bug. by @gianpdomiziani in #184
- Cleanml by @qingyun-wu in #185
- warmstart blendsearch by @sonichi in #186
- variable name by @sonichi in #187
- notebook example by @sonichi in #189
- make flaml work without catboost by @sonichi in #197
- package name in setup by @sonichi in #198
- clean up forecast notebook by @sonichi in #202
- consider num_samples in bs thread priority by @sonichi in #207
- accommodate nni usage pattern by @sonichi in #209
- random search by @sonichi in #213
- add consistency test by @qingyun-wu in #216
- set converge flag when no trial can be sampled by @sonichi in #217
- seed for hpo method by @sonichi in #224
- update config if n_estimators is modified by @sonichi in #225
- warning -> info for low cost partial config by @sonichi in #231
Full Changelog: v0.6.0...v0.6.6
v0.6.0
In this release, we added support for the time series forecasting task and for NLP model fine-tuning. We have also made a large number of feature & performance improvements.
- data split by 'time' for time-ordered data, and by 'group' for grouped data.
- support parallel trials and random search in the `AutoML.fit()` API.
- support warm-start in `AutoML.fit()` by using previously found start points.
- support constraints on training/prediction time per model.
- new optimization metrics: ROC_AUC for multi-class classification, MAPE for time series forecasting.
- utility functions for getting normalized confusion matrices and multi-class ROC or precision-recall curves.
- automatically retrain models after search by default; options to disable retraining or enforce time limit.
- CFO supports hierarchical search space and uses points_to_evaluate more effectively.
- variation of CFO optimized for unordered categorical hyperparameters.
- BlendSearch improved for better performance in parallel setting.
- memory overhead optimization.
- search space improvements for random forest and lightgbm.
- make stacking ensemble work for categorical features.
- python 3.9 support.
- experimental support for automated fine-tuning of transformer models from Hugging Face.
- experimental support for time series forecasting.
- warnings to suggest increasing time budget, and warning to inform users there is no performance improvement for a long time.
Minor updates
- make log file name optional.
- notebook for time series forecasting.
- notebook for using AutoML in sklearn pipeline.
- bug fix when training_function returns a value.
- support fixed random seeds to improve reproducibility.
- code coverage improvement.
- exclusive upper bounds for hyperparameter type randint and lograndint.
- experimental features in BlendSearch.
- documentation improvement.
- bug fixes for multiple logged metrics in cv.
- adjust epsilon when time per trial is very fast.
Contributors
- @sonichi
- @qingyun-wu
- @int-chaos
- @liususan091219
- @Yard1
- @bnriiitb
- @su2umaru
- @eduardobull
- @sek788432
- @ekzhu
- @anshumandutt
- @yue-msr
- @sadtaf
- @fzanartu
- @dsbyprateekg
- @hanhanwu
- @PardeepRassani
- @gianpdomiziani
- @stepthom
- @anhnht3
- @zzheng93
- @flippercy
- @luizhemelo
- @nabalamu
- @lostmygithubaccount
- @suryajayaraman