wjlgatech/fastai_multimodal

fastai-multimodal

Purpose:

To create end-to-end multimodal classifiers based on Fastai-tabular, Fastai-text and Fastai-vision.

Specifically, I will construct three types of multimodal models:

  • early concat: concatenate the continuous (cnt), categorical (cat), text (txt) and image (img) features after data loading and preprocessing, then feed them to a learner of choice (e.g. fastai tabular, TabNet, Deep-RF, GSN-VSN).
  • middle concat: concatenate the embeddings from each of the trained tab (cnt+cat), txt and img models, then feed them to a learner of choice.
  • late concat: concatenate the probability predictions from each of the trained tab (cnt+cat), txt and img models, then feed them to a learner of choice.
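
The three strategies differ only in what gets concatenated before the final learner. The snippet below is a minimal sketch of that idea, not this package's API: it uses NumPy only, and every array, shape and helper name (`concat_blocks`, `softmax`, the embedding sizes, the two-class setup) is a hypothetical stand-in for what the trained fastai models would produce.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8  # toy number of samples (assumption)


def concat_blocks(*blocks):
    """Concatenate per-modality 2-D arrays column-wise after checking row counts."""
    rows = {b.shape[0] for b in blocks}
    if len(rows) != 1:
        raise ValueError(f"all blocks need the same number of rows, got {rows}")
    return np.concatenate(blocks, axis=1)


def softmax(z):
    """Row-wise softmax, used here to fake per-model class probabilities."""
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)


# Early concat: preprocessed raw features from every modality (shapes are made up).
cnt = rng.normal(size=(n, 3))                         # continuous columns
cat = rng.integers(0, 4, size=(n, 2)).astype(float)   # encoded categorical columns
txt = rng.normal(size=(n, 5))                         # numeric text features
img = rng.normal(size=(n, 6))                         # flattened image features
early = concat_blocks(cnt, cat, txt, img)

# Middle concat: embeddings taken from each trained single-modality model.
tab_emb, txt_emb, img_emb = (rng.normal(size=(n, 4)) for _ in range(3))
middle = concat_blocks(tab_emb, txt_emb, img_emb)

# Late concat: class-probability predictions of each trained model (2 classes here).
tab_prob, txt_prob, img_prob = (softmax(rng.normal(size=(n, 2))) for _ in range(3))
late = concat_blocks(tab_prob, txt_prob, img_prob)

print(early.shape, middle.shape, late.shape)
```

In each case the resulting matrix would be handed to the learner of choice as ordinary tabular input. The width of the matrix shrinks from early to late concat, which is one reason the three variants may differ in computational cost.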

Using a few benchmark datasets, I will compare the 3 types of multimodal models on their

  • computational efficiency
  • ML performance
  • interpretability
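
A comparison along the first two axes could be set up roughly as follows. This is only a sketch under stated assumptions: `benchmark` and `majority_baseline` are hypothetical helpers, the data is a toy example, and accuracy stands in for whatever ML metric is ultimately chosen. Interpretability is not covered because it has no single numeric measure.

```python
import time


def benchmark(fit_predict, X_train, y_train, X_test, y_test):
    """Time one fit-and-predict call and score its accuracy on the test labels."""
    start = time.perf_counter()
    preds = fit_predict(X_train, y_train, X_test)
    elapsed = time.perf_counter() - start
    correct = sum(p == t for p, t in zip(preds, y_test))
    return {"seconds": elapsed, "accuracy": correct / len(y_test)}


def majority_baseline(X_train, y_train, X_test):
    """Placeholder model: always predict the most frequent training label."""
    labels = list(y_train)
    most_common = max(set(labels), key=labels.count)
    return [most_common] * len(X_test)


# Toy data; a real run would pass the early, middle or late concat features.
result = benchmark(majority_baseline, [[0], [1], [2]], [1, 1, 0], [[3], [4]], [1, 0])
print(result)
```

Running the same `benchmark` call once per multimodal variant on the same train/test split would give directly comparable timing and accuracy figures.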

With every iteration, I aim to make this package 5% better with respect to:

  • ease of use
  • efficiency
  • stability and production readiness

Wanna contribute?

Check out the notebooks here and here. Any advice and comments are welcome. Please shoot me an email here.

Credits & References:
