allow users to specify float types from OneHotEncoder and ContinuousEncoder #565
Comments
This sounds very reasonable. For the ContinuousEncoder, perhaps we can just call […]. Would it be enough to have a single option […]?

@EssamWisam Perhaps we can keep such an option in mind for the encoders you are working on. We should have a consistent name for the hyper-parameter. If it's a flag, how about […]?

@tiemvanderdeure What do you think?
I think it's common to train deep learning models at 16-bit precision and sometimes even lower. Thus, one way to approach this may be to add a transformer (e.g., […]).

Alternatively, I think it would be good for Flux models to do this conversion themselves (e.g., while batching). That would be easier, because one could then try out a Flux model at different parameter precisions without also manually changing the precision of the data.
My initial idea was to be a bit more flexible by adding a […]. Would a […]?

I think doing the conversion in MLJFlux is also reasonable - I can't see any reason why you wouldn't want the input data to be converted to whatever datatype the neural net has. I don't know if there are any other MLJ models that have specific requirements on input data? I can't think of any on the spot, but I only know a fraction of all the models.
When I proposed having a separate transformer, the reasoning was that it would not make sense to have the hyperparameter take only specific values for the type (e.g., only Float64 or Float32), since the model may also be Float16, or it may use an embedding layer requiring the input to be ordinal-encoded integers. A separate transformer could also be used for other applications, e.g., casting specific columns of a dataset to Float16 while casting others to Int16 to save memory before feeding the data into some model (see the sketch below). If I understand correctly now, you meant that […]
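For concreteness, a minimal sketch of that last use case, casting columns individually to narrower types; the data and column names here are made up for illustration, and this is plain DataFrames code rather than an existing MLJ transformer:

```julia
using DataFrames

# Hypothetical example: shrink a table column-by-column to save memory
# before feeding it to a model.
df = DataFrame(price = rand(Float64, 1_000), count = rand(1:100, 1_000))
df.price = Float16.(df.price)   # continuous column -> Float16
df.count = Int16.(df.count)     # integer column    -> Int16
```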
Just made a quick PR to show what I had in mind - it might still be better to handle this in (MLJ)Flux instead, in which case we don't merge the PR.
After thinking about this more, my vote would be to add this to MLJFlux, as this is, for now, the only package fussy about float types. In the rare case where we want Float32 in some other situation, one can always insert an ordinary function into the pipeline. For example, if you have a Float64 table, you could insert `float32ify(X) = Float32.(Tables.matrix(X))`, which returns a `Matrix{Float32}`.

In MLJFlux, I wouldn't implement the conversion by adding a layer to the Flux model. Rather, do it in the data pre-processing step, so that it only needs to happen once.
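A minimal sketch of what inserting such a function into a linear pipeline might look like. The wrapping of the result back into a table is my addition (to keep table-expecting models happy), and the exact model-loading and pipeline syntax depend on the MLJ version installed:

```julia
using MLJ
import Tables

# Convert any Tables.jl table to Float32; wrap back into a table (my addition).
float32ify(X) = MLJ.table(Float32.(Tables.matrix(X)))

NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux

# Plain functions can be inserted as static steps in an MLJ linear pipeline:
pipe = OneHotEncoder() |> float32ify |> NeuralNetworkClassifier()
```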
Admittedly, this is a bit of a pain because the […]
In the current state, Flux will behave as reported above (as far as I can tell) if the input is FloatX and the model is FloatY where X ≠ Y. The most frequent case will have X = 64 and Y = 32. If we add a `reformat` method to handle just this case, then we don't solve the problem for every setup, but we do solve it for by far the most common one (i.e., still an improvement over the status quo).
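As a rough sketch of the idea, here is what a data front-end along those lines could look like. The model type `MyFluxClassifier` is a hypothetical stand-in, the actual MLJFlux signature and return value may differ, and target handling is omitted entirely:

```julia
import MLJModelInterface as MMI

# Hypothetical stand-in for an MLJFlux model type:
mutable struct MyFluxClassifier <: MMI.Probabilistic end

# Sketch only: cast tabular input to Float32 once, at pre-processing time,
# so a Float32 Flux chain never sees Float64 data.
function MMI.reformat(::MyFluxClassifier, X)
    return (Float32.(permutedims(MMI.matrix(X))),)  # features × observations
end
```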
It makes good sense to me to just add this to MLJFlux. I'm leaving on holiday today, but I can write a PR in August.
MLJFlux now automatically converts inputs to Float32.
I am using a `LinearPipeline` consisting of a `OneHotEncoder` in combination with a `NeuralNetworkClassifier`, which I think is a basic yet really neat use case for linear pipelines. However, `Flux` gives this warning: […]

I know MLJ tries to abstract away from data types to use scitypes instead, but in this case it would be great to be able to specify the data type a `OneHotEncoder` gives as output. For `OneHotEncoder` this happens to be really easy to do (it calls `float`); for a `ContinuousEncoder` it would be a bit trickier (it calls `coerce`).

I also don't know if it conflicts too much with the scitype philosophy or if there's an easier way around this, so I thought I'd ask here before making a PR (which I would gladly do).
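For reference, a minimal sketch of the kind of setup described here (before the automatic conversion mentioned in the last comment); the tiny dataset and its column names are made up for illustration:

```julia
using MLJ

NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux

# Tiny made-up dataset with one categorical feature:
X = (colour = coerce(["red", "green", "blue", "red"], Multiclass),)
y = coerce(["a", "b", "a", "b"], Multiclass)

# OneHotEncoder emits Float64 columns, while the default Flux chain has
# Float32 parameters -- the mismatch is what triggers Flux's warning.
pipe = OneHotEncoder() |> NeuralNetworkClassifier()
mach = machine(pipe, X, y)
fit!(mach)
```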