CH05- got weird result from train_model_simple #442

Answered by rasbt
Nevermetyou65 asked this question in Q&A
Hi there,

thanks for sharing, and hm, that's weird. The code looks correct, but based on the high training loss, something is not right and the model is not learning. It could be an issue with the dataset or with the model configuration itself; for example, the architecture may not be set up correctly.

For the remaining code, could you use the functions from here:

https://github.com/rasbt/LLMs-from-scratch/blob/main/ch05/01_main-chapter-code/previous_chapters.py

It's just so that we can narrow down where the typo occurs.
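
To put a number on "high training loss", one quick sanity check is to compare the reported loss against what a model that guesses uniformly over the vocabulary would score, which is ln(vocab_size) in nats. The following is only a minimal sketch: the vocabulary size of 50257 assumes the GPT-2 BPE tokenizer, and `loss_is_stuck` and its `tolerance` are hypothetical helpers made up for illustration, not functions from the book's code.

```python
import math


def uniform_baseline_loss(vocab_size: int) -> float:
    """Cross-entropy (in nats) of a model that assigns equal probability to every token."""
    return math.log(vocab_size)


def cross_entropy(logits: list[float], target: int) -> float:
    """Cross-entropy for a single position, using a numerically stable log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target]


def loss_is_stuck(train_losses: list[float], baseline: float, tolerance: float = 0.5) -> bool:
    """Hypothetical helper: True if the latest loss is not clearly below the uniform baseline."""
    return train_losses[-1] > baseline - tolerance


VOCAB_SIZE = 50257  # assumption: GPT-2 BPE vocabulary size
baseline = uniform_baseline_loss(VOCAB_SIZE)
print(f"uniform-guess baseline: {baseline:.2f}")  # prints 10.82

# All-equal logits give exactly the uniform baseline for that vocabulary size.
print(abs(cross_entropy([0.0] * 4, target=2) - math.log(4)) < 1e-12)
```

If the training loss stays at or near this baseline after many steps, the model is effectively still guessing, which fits the suggestion above to swap in known-good functions one at a time until the loss starts to drop.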

Answer selected by Nevermetyou65