CH05- got weird result from train_model_simple #442
Hi
Which is kind of different from that printed in the book. Here is my code
Hi there,
thanks for sharing, and hm, that's weird. The code looks correct, but based on the high training loss, something is not right and the model is not learning. There could be an issue with the dataset, or with the model configuration itself; maybe the architecture is not set up correctly.
For the remaining code, can you use the functions from here:
https://github.com/rasbt/LLMs-from-scratch/blob/main/ch05/01_main-chapter-code/previous_chapters.py
It's just so that we can narrow down where the typo occurs.
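In the meantime, one quick sanity check on the "high training loss" point is to compare the printed losses against the loss of a model that guesses uniformly over the vocabulary, which is ln(vocab_size). This is only a rough sketch: the vocabulary size of 50257 assumes the GPT-2 tokenizer, and `is_learning` with its `margin` is a hypothetical helper made up for illustration, not something from the book or the repository.

```python
import math


def uniform_baseline_loss(vocab_size: int) -> float:
    """Cross-entropy of a model that assigns equal probability to every token."""
    return math.log(vocab_size)


def is_learning(losses, vocab_size, margin=0.5):
    """Hypothetical heuristic: True if the last recorded loss is clearly
    below the uniform-guessing baseline (by at least `margin`)."""
    return losses[-1] < uniform_baseline_loss(vocab_size) - margin


# Assumption: GPT-2 tokenizer vocabulary size.
baseline = uniform_baseline_loss(50257)
print(f"uniform baseline loss: {baseline:.2f}")
```

If the training loss stays near this baseline after several epochs, the model is probably not getting a useful gradient signal, which would point toward the data loading or the model setup rather than the training loop.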