Allow users to generate texts longer than 1024 tokens #2
Comments
Why half of the previous text and not all?
Would this be something you'd be willing to accept a PR on? I'd be willing to give it a go next week.
Is there some work on this? I'd like to see this feature implemented.
I guess that the length is computed on the generated text, including the prefix. If you feed the whole previous text as a prefix, then you cannot generate anything more once the input is already at the max length. By feeding only half of the previous text, you are guaranteed to have space(*) left for the rest of the generation. This lets you circumvent the length constraint by iterating. (*) at least half the maximum length
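The length arithmetic in that comment can be sketched as follows. This is a minimal illustration, assuming a hypothetical model limit `MAX_LEN` of 1024 tokens; `space_left` is an invented helper, not part of any library:

```python
# Sketch of the length budget described above. MAX_LEN is the assumed
# hard limit on (prefix + generated) tokens.
MAX_LEN = 1024

def space_left(prefix_tokens):
    """Tokens still available for generation after feeding a prefix."""
    return MAX_LEN - len(prefix_tokens)

# Feeding the full previous output can leave no room at all:
full = list(range(MAX_LEN))          # a previous output already at the limit
assert space_left(full) == 0

# Feeding only the second half always guarantees at least half the budget:
half = full[len(full) // 2:]
assert space_left(half) >= MAX_LEN // 2
```

This shows why half (rather than the whole text) is used: it keeps some context while guaranteeing room for new tokens on every iteration.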
I'm currently not working on this; if there's a PR, I'll merge it. The hard part is that the 1024 limit is enforced at the tensor level; I'm not sure what's needed to shift it and handle it efficiently, especially in the batch case.
Yeah, the batch case is where I fail to get it working. I already have code that works for a single batch, but I can't figure out how to make it work properly and efficiently for multiple batches, so I can't create a PR right now. Still, I'd be glad if someone created a PR that solves this problem.
It likely isn't possible to do this at the generation level (like other frameworks do), but we can hack around it by feeding the tail of the previously generated text back in as the prefix and iterating.
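The iterative hack discussed in this thread can be sketched as below. Everything here is hypothetical: `generate` is a stub standing in for a single bounded-length generation call (a real model would produce actual continuations), and `generate_long` shows the sliding-window loop, keeping the second half of each output as the next prefix:

```python
# Hypothetical sketch of iterating past a fixed length limit by feeding
# the second half of each output back in as the prefix.
MAX_LEN = 1024

def generate(prefix, max_len=MAX_LEN):
    # Stub: a real model would continue `prefix` with up to
    # (max_len - len(prefix)) sampled tokens. Here we pad with zeros
    # purely to make the length bookkeeping visible.
    budget = max_len - len(prefix)
    return prefix + [0] * budget

def generate_long(prefix, target_len):
    """Generate `target_len` tokens by repeated bounded-length calls."""
    text = generate(prefix)
    while len(text) < target_len:
        window = text[len(text) // 2:]       # second half as the new prefix
        continuation = generate(window)
        text += continuation[len(window):]   # append only the new tokens
    return text[:target_len]

out = generate_long([1, 2, 3], 3000)
assert len(out) == 3000
```

Note the trade-off: each iteration only sees the last half of the text, so long-range coherence degrades, but every call stays within the tensor-level limit.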