Hello team,
When I do:
```python
from transformers import AutoTokenizer

pretrained_model = "EleutherAI/pythia-160m"
tokenizer = AutoTokenizer.from_pretrained(
    pretrained_model,
    padding_side="left",
    cache_dir=pretrained_model + '_tokenizer',
)
print(tokenizer.pad_token)
```
It seems like the `pad_token` is empty (`None` is printed).

Setting `tokenizer.pad_token = tokenizer.eos_token` seems to fix the issue. Is this the same way the padding token was applied during training?

Thank you!
Yes, I think so. You can just set `pad_token = eos_token` during training.
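For later readers, a minimal sketch of what that looks like in practice (using the same checkpoint as above; the prompts are just examples):

```python
from transformers import AutoTokenizer

# Same checkpoint as in the issue; left padding is typical for generation.
tokenizer = AutoTokenizer.from_pretrained(
    "EleutherAI/pythia-160m",
    padding_side="left",
)

# The Pythia/GPT-NeoX tokenizer ships without a pad token, so reuse EOS.
tokenizer.pad_token = tokenizer.eos_token

# Padding now works for batched inputs of different lengths.
batch = tokenizer(
    ["Hello world", "A longer example sentence"],
    padding=True,
    return_tensors="pt",
)
print(batch["input_ids"].shape)
```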