Syntax for feature engineering #314
There's also this: https://github.com/joshday/Telperion.jl
Continuing the discussion started by @indymnv at JuliaAI/MLJ.jl#970: Existing MLJ transformers are documented here with the exception of
julia> using MLJModels
julia> models() do m
           m.package_name == "MLJModels" &&
           !m.is_supervised
       end
11-element Vector{NamedTuple{(:name, :package_name, :is_supervised, :abstract_type, :deep_properties, :docstring, :fit_data_scitype, :human_name, :hyperparameter_ranges, :hyperparameter_types, :hyperparameters, :implemented_methods, :inverse_transform_scitype, :is_pure_julia, :is_wrapper, :iteration_parameter, :load_path, :package_license, :package_url, :package_uuid, :predict_scitype, :prediction_type, :reporting_operations, :reports_feature_importances, :supports_class_weights, :supports_online, :supports_training_losses, :supports_weights, :transform_scitype, :input_scitype, :target_scitype, :output_scitype)}}:
(name = ContinuousEncoder, package_name = MLJModels, ... )
(name = FeatureSelector, package_name = MLJModels, ... )
(name = FillImputer, package_name = MLJModels, ... )
(name = InteractionTransformer, package_name = MLJModels, ... )
(name = OneHotEncoder, package_name = MLJModels, ... )
(name = Standardizer, package_name = MLJModels, ... )
(name = UnivariateBoxCoxTransformer, package_name = MLJModels, ... )
(name = UnivariateDiscretizer, package_name = MLJModels, ... )
(name = UnivariateFillImputer, package_name = MLJModels, ... )
(name = UnivariateStandardizer, package_name = MLJModels, ... )
(name = UnivariateTimeTypeToContinuous, package_name = MLJModels, ... )
A "fancier" version of
There is a project in progress to roll out a
TableTransforms.jl, referenced by @juliohm, is very active but not yet integrated with MLJ, although we are working towards doing so in the future (at least several months off). I think that is a good place to contribute generic table transformers, such as encoders. Some feature engineering tools, such as RFE, will probably not make sense there because, for example, they require supervised learners.
@indymnv It would be helpful if you could identify specific encoders or other tools you use frequently that are missing from MLJ (or TableTransforms.jl) so they can be prioritised.
@ablaom Thanks for all the information. In my ML work I generally use the following encoders a lot.
For now, in Julia I have only used the one-hot encoder; I have not checked the transformations. [Edit]: For context, I frequently work with linear/logistic regression models, decision trees, random forests and GBMs.
Thanks @indymnv. That's most helpful. PRs for missing items welcome 😉
I stumbled upon https://github.com/matthieugomez/PairsMacros.jl today and it seems to be close to what we discussed with @vollmersj with respect to defining new columns with a formula-like syntax.
@matthieugomez sorry to ping you here but would you be interested in something like PairsMacros for general-purpose feature engineering to work with MLJ?
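To make the idea a bit more concrete, here is a minimal sketch in plain Julia (no packages) of defining new columns from `name => function` pairs, which is roughly the kind of thing a formula-like macro might expand to. The helper `add_columns` and the column names are hypothetical, made up for illustration; this is not the PairsMacros.jl or MLJ API.

```julia
# Hypothetical helper: `add_columns` is NOT part of MLJ or PairsMacros.jl.
# A table is represented here as a NamedTuple of equal-length vectors.
function add_columns(table::NamedTuple, specs::Pair{Symbol,<:Function}...)
    for (name, f) in specs
        # Each function receives the current table and returns a new column;
        # `merge` appends it (or overwrites a column with the same name).
        table = merge(table, NamedTuple{(name,)}((f(table),)))
    end
    return table
end

# Made-up example data and feature definitions.
data = (height = [1.5, 2.0], width = [2.0, 3.0])

engineered = add_columns(
    data,
    :area => (t -> t.height .* t.width),
    :log_area => (t -> log.(t.area)),  # may use columns defined earlier
)
```

Under these assumptions, a macro-based syntax would mostly save writing out the anonymous functions; the pairs themselves already carry the "new column name, how to compute it" information.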