Welcome to the repository containing code to fine-tune a Large Language Model (LLM) for answering multiple-choice questions, specifically using the dataset from the Kaggle LLM Science Exam competition. The primary objective of this project is to fine-tune a language model and adapt it to accurately predict the correct answers to a set of questions.
The repository contains Jupyter notebooks covering the main parts of the project:
- Crafting a Delectable Mix of Data and Prompts!: Setting up an ideal learning environment for Large Language Models (LLMs) is akin to creating a recipe for success. Imagine LLMs as eager learners, ready to absorb a varied mix of information. It's not just about throwing words together at random: we carefully design prompts, like clear instructions, to help LLMs tackle different language tasks, from simple summaries to more intricate challenges. After curating this diverse mix of data, we split it into training and validation sets. The training set acts as the main course, allowing the LLM to grasp language patterns, while the validation set checks that it genuinely comprehends the material. This notebook contains the data-processing code; a minimal sketch of the prompt formatting and split appears after this list.
- Training Large Language Models Made as Delightful as Cooking a Gourmet Meal!: Have you ever thought about the detailed steps involved in training a large language model? It's a bit like preparing a fancy meal: each part needs careful consideration and accurate execution, from gathering the best ingredients to mastering the cooking techniques. Just as a chef creates a masterpiece by paying attention to every detail, a well-trained language model demands precision and dedication at every stage. This notebook contains the code for training the LLM; a minimal training sketch also follows this list, and the flowchart below gives a clearer view of the training process.
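To make the first notebook concrete, here is a minimal sketch of the data-processing step. It assumes the Kaggle-style column layout (a `prompt` question, options `A` through `E`, and an `answer` letter) and a local `train.csv`; the exact prompt template used in the notebook may differ.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Assumed Kaggle-style layout: a question ("prompt"), five options ("A".."E"),
# and the correct letter ("answer").
df = pd.read_csv("train.csv")

def build_prompt(row: pd.Series) -> str:
    """Format one question, its options, and its answer as one training text."""
    options = "\n".join(f"{letter}. {row[letter]}" for letter in "ABCDE")
    return (
        "Answer the following multiple-choice question with the letter "
        "of the correct option.\n\n"
        f"Question: {row['prompt']}\n{options}\nAnswer: {row['answer']}"
    )

df["text"] = df.apply(build_prompt, axis=1)

# Hold out a validation set so we can check the model genuinely generalises
# rather than memorising the training questions.
train_df, val_df = train_test_split(df, test_size=0.1, random_state=42)
print(f"{len(train_df)} training rows, {len(val_df)} validation rows")
```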
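And for the second notebook, a minimal training sketch using Hugging Face `transformers` with a LoRA adapter from `peft`. It reuses `train_df` and `val_df` from the snippet above; the base model (`gpt2` as a lightweight stand-in) and all hyperparameters here are illustrative placeholders, not the exact recipe from the notebook.

```python
from datasets import Dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_model = "gpt2"  # lightweight stand-in; swap in the actual base model
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token  # gpt2 ships without a pad token
model = AutoModelForCausalLM.from_pretrained(base_model)

# Wrap the base model with a small LoRA adapter so only a fraction of the
# weights are updated; r/alpha/dropout here are illustrative, not tuned.
model = get_peft_model(
    model, LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")
)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

train_ds = Dataset.from_pandas(train_df.reset_index(drop=True))
train_ds = train_ds.map(tokenize, batched=True, remove_columns=train_ds.column_names)
val_ds = Dataset.from_pandas(val_df.reset_index(drop=True))
val_ds = val_ds.map(tokenize, batched=True, remove_columns=val_ds.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="outputs",
        per_device_train_batch_size=4,
        num_train_epochs=3,
        learning_rate=2e-4,
        logging_steps=50,
    ),
    train_dataset=train_ds,
    eval_dataset=val_ds,
    # mlm=False gives plain next-token (causal) language-modelling labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
print(trainer.evaluate())  # validation loss as a sanity check
trainer.save_model("fine-tuned-model")
```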
The fine-tuned model has been uploaded to the 🤗 Hub. Click 🤗 to land on the model space. You can also view the training board, which I have made public.
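Once the model is on the Hub, loading it back for inference takes only a few lines. The repo id below is a placeholder; substitute the actual model id from the link above.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "your-username/llm-science-exam-model"  # placeholder repo id
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

question = (
    "Answer the following multiple-choice question with the letter "
    "of the correct option.\n\n"
    "Question: Which gas makes up most of Earth's atmosphere?\n"
    "A. Oxygen\nB. Nitrogen\nC. Carbon dioxide\nD. Argon\nE. Helium\nAnswer:"
)
inputs = tokenizer(question, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=1)
# Decode only the newly generated token, i.e. the predicted letter.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:]))
```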