
Is there a limit to context (prefix) length? #82

Open

r3ndd opened this issue Jul 5, 2019 · 9 comments

Comments

r3ndd commented Jul 5, 2019

I understand there is a 1024 token output limit on each request, but are there any constraints on how long the context (prefix) can be?

Additionally, and this is directly related to a context limit, what is the best way to implement a system like a chatbot where the bot is continually switching between generating text (sending messages) and taking in new context (receiving messages)? The conversation history could grow indefinitely, but would the history need to be fed into the model every time the bot wants to send a new message? If so, are there speed/resource constraints to that naive approach which would warrant the removal of some older messages?
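For concreteness, the naive approach I have in mind looks roughly like the sketch below. It assumes a gpt-2-simple checkpoint fine-tuned under the default run name ("run1"), and it uses a crude character-based token estimate rather than the real BPE tokenizer, so treat it as illustrative only:

```python
import gpt_2_simple as gpt2

WINDOW = 1024        # model context window, in tokens
REPLY_BUDGET = 200   # tokens reserved for the bot's reply

def estimate_tokens(text):
    # Crude heuristic: English text averages roughly 3-4 characters per BPE token.
    return len(text) // 3 + 1

def trim_history(messages):
    # Drop the oldest messages until the remaining history fits in the window
    # alongside the reply budget.
    while sum(estimate_tokens(m) for m in messages) > WINDOW - REPLY_BUDGET:
        messages.pop(0)
    return messages

sess = gpt2.start_tf_sess()
gpt2.load_gpt2(sess, run_name="run1")  # previously fine-tuned checkpoint

history = []

def reply_to(user_message):
    history.append("User: " + user_message)
    prefix = "\n".join(trim_history(history)) + "\nBot:"
    reply = gpt2.generate(sess,
                          prefix=prefix,
                          length=REPLY_BUDGET,
                          truncate="\nUser:",   # stop once a new user turn begins
                          include_prefix=False,
                          return_as_list=True)[0].strip()
    history.append("Bot: " + reply)
    return reply
```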

woctezuma (Contributor) commented Jul 5, 2019

are there any constraints on how long the context (prefix) can be?

Based on #2, I think the output limit enforces a prefix limit, because the prefix is the start of the output.
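If that is right, the prefix and the generated continuation share the same 1024-token budget, so a longer prefix leaves less room for new text. A rough sketch of the arithmetic, assuming prefix tokens count one-for-one against the window:

```python
WINDOW = 1024                      # total token budget shared by prefix + output
prefix_tokens = 800                # e.g. a long prompt
max_new_tokens = WINDOW - prefix_tokens
print(max_new_tokens)              # 224 tokens left for the generated text
```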

r3ndd (Author) commented Jul 5, 2019

@woctezuma thanks for the info. If I understand the model correctly, in that example where you talked about feeding in half of the previous output as the next input, the only "memory" it has is the most recent context provided, right?

woctezuma (Contributor)

That is my understanding.

r3ndd (Author) commented Jul 5, 2019

@woctezuma thanks for the help. Any idea how this input size limit affects training? For instance, I have a bunch of documents in a specific format that I want to train the model on. At the top of each document is some metadata that the model will hopefully learn to use as context for the rest of the document. However, will the model still have the top of the document in context during training when it reaches the bottom of a document much larger than 1024 tokens?

woctezuma (Contributor)

That is a good question. After looking at the code, I believe the limit, called the window size, does affect training. I guess it is set here in the original GPT-2 repository.

minimaxir (Owner)

Yes, the metadata + text would have to be < 1024 tokens in order for it to be incorporated into the training.
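As a sanity check before fine-tuning, you could flag the documents that are likely to blow past that budget. A rough sketch, assuming one document per .txt file and a ~3 characters-per-token estimate rather than the real BPE tokenizer:

```python
import glob

WINDOW = 1024
CHARS_PER_TOKEN = 3   # rough average; an exact count needs the BPE encoder

# Hypothetical layout: one document (metadata block at the top) per .txt file.
documents = {path: open(path, encoding="utf-8").read()
             for path in glob.glob("docs/*.txt")}

too_long = [path for path, doc in documents.items()
            if len(doc) / CHARS_PER_TOKEN >= WINDOW]
print(f"{len(too_long)} of {len(documents)} documents probably exceed {WINDOW} tokens")
```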

r3ndd (Author) commented Jul 6, 2019

Hmm, thanks. So let's say all of my documents are <= 1024 tokens and are delimited by <|document|></|document|> tags. Is there a simple way I can ensure that the model is always trained on an entire document at once and not on two halves of separate documents?

woctezuma (Contributor) commented Jul 6, 2019

not two halves of separate documents

I imagine that the window is sliding, so even if the document is too big, it would not just be split in two. For instance, if the document is 1030 tokens long, I expect it to be used as 7 lists of 1024 tokens each.
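To illustrate that arithmetic (purely my assumption about how the chunks would be drawn, not a description of the actual training code):

```python
def sliding_windows(tokens, window=1024, stride=1):
    # Every contiguous chunk of `window` tokens, sliding by `stride`.
    return [tokens[i:i + window] for i in range(0, len(tokens) - window + 1, stride)]

doc = list(range(1030))      # stand-in for a 1030-token document
chunks = sliding_windows(doc)
print(len(chunks))           # 7 == 1030 - 1024 + 1 windows of 1024 tokens each
```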

r3ndd (Author) commented Jul 6, 2019

Good to know, thanks. By the way, is there a simple way to approximate token length so I can determine how many tokens a document is? When evaluating some of the model's outputs for the Shakespeare examples, I found the average token length to be about 3 characters, assuming all of the sample outputs were 1024 tokens.
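In other words, something along the lines of the sketch below, where the 3-characters-per-token figure is just my estimate from those samples; an exact count would presumably need the BPE encoder that ships with the original openai/gpt-2 repository:

```python
def estimate_tokens(text):
    # Rough estimate based on ~3 characters per token observed in the samples.
    return len(text) // 3 + 1

text = open("document.txt", encoding="utf-8").read()  # hypothetical document
print("estimated tokens:", estimate_tokens(text))

# For an exact count, the BPE encoder from an openai/gpt-2 checkout could be
# used instead (module path and model name depend on the local setup):
# from src import encoder
# enc = encoder.get_encoder("117M")
# print("exact tokens:", len(enc.encode(text)))
```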
