How to track session-related state beyond cookies? #141
Comments
Gonna add TLSv1.3 / QUIC session resumption tickets to the list of items that can be stored.
TLS session resumption doesn't work for Python on TLS 1.3 so we'd only get to use session resumption for TLS 1.2 (Still good!)
aioquic also has session tickets in this file, look for
Will need to figure out the parallels between the two to see if they can still be stored in the same group. For now, though, aioquic is a ways off, so I'll focus only on the TLS session tickets.
I realized I was really unclear on when it was safe to re-use session tickets, e.g. if TLS settings changed, so I asked around and got a pretty good answer:
So it sounds like session tickets probably should get handled at the same level as the connection pooling – which I guess will be a lower level "HTTP transport" object, not a higher-level "session with cookies and stuff" object?

I guess optimally, HSTS and Alt-Svc should be handled at this level as well. It doesn't make sense to stop enforcing HSTS just because you switched to a different set of cookies! And it makes sense architecturally too, because the "HTTP transport" is responsible for TLS and protocol negotiation, and that's what tickets/HSTS/Alt-Svc are all about. (Versus cookies, caches, redirects, which are all defined at the level of HTTP's abstract semantics, and don't care about transport details.)

(Or is it better to treat HSTS as an automatic redirect? I.e., if you request
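A minimal sketch of the split described above, assuming a low-level object that owns HSTS and ticket state and a high-level object that owns cookies. The names `Transport`, `Session`, `hsts_hosts`, `tls_tickets`, `effective_url` and `prepare` are all hypothetical, not an existing API, and real HSTS handling (max-age, includeSubDomains, port rewriting) is left out.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit, urlunsplit


@dataclass
class Transport:
    """Hypothetical low-level object: owns connection-level state."""

    # Hosts known to require HTTPS (simplified: no max-age or includeSubDomains).
    hsts_hosts: set = field(default_factory=set)
    # (host, port) -> opaque TLS session ticket; unused here, shown for layering only.
    tls_tickets: dict = field(default_factory=dict)

    def effective_url(self, url: str) -> str:
        # Upgrade http:// to https:// for hosts with a known HSTS policy.
        parts = urlsplit(url)
        if parts.scheme == "http" and parts.hostname in self.hsts_hosts:
            parts = parts._replace(scheme="https")
        return urlunsplit(parts)


@dataclass
class Session:
    """Hypothetical high-level object: owns HTTP-semantics state such as cookies."""

    transport: Transport
    cookies: dict = field(default_factory=dict)

    def prepare(self, url: str) -> str:
        # The session defers every transport-level decision to its transport.
        return self.transport.effective_url(url)


transport = Transport()
transport.hsts_hosts.add("example.com")

alice = Session(transport, cookies={"id": "alice"})
bob = Session(transport)  # fresh cookie jar, same transport
```

Here `bob` starts with no cookies, yet `bob.prepare("http://example.com/login")` still gets upgraded to HTTPS, because the HSTS state lives in the shared transport rather than in either session.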
I guess something else to say explicitly here is: re-using a connection is also an example of state that gets shared between otherwise unrelated requests! Assuming we do go with a design that has a low-level "transport" that handles connection-level stuff + higher-level "sessions" on top, we'll need to document that users who are serious about privacy/anonymization, and want to make sure servers can't link different requests together, need to use Tor + use a different "transport" object for each pseudonymous identity. Simply switching "session" objects won't be enough to stop an attacker from correlating requests.
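To make the linkability point concrete, here is a toy sketch in which a connection is just an integer ID handed out by a pool. `Transport`, `Session` and `connection_for` are hypothetical names used only for illustration; no real networking happens.

```python
import itertools


class Transport:
    """Hypothetical transport: owns a pool of (fake) connections."""

    _ids = itertools.count(1)  # stand-in for opening real sockets

    def __init__(self):
        self._pool = {}

    def connection_for(self, origin):
        # Reuse an existing connection to this origin if the pool has one.
        if origin not in self._pool:
            self._pool[origin] = next(Transport._ids)
        return self._pool[origin]


class Session:
    """Hypothetical session: separate cookies, but a borrowed transport."""

    def __init__(self, transport):
        self.transport = transport
        self.cookies = {}

    def connection_for(self, origin):
        return self.transport.connection_for(origin)


origin = "example.com:443"

shared = Transport()
alice, bob = Session(shared), Session(shared)
# Different cookie jars, same underlying connection: the server can link them.
linked = alice.connection_for(origin) == bob.connection_for(origin)

carol = Session(Transport())
# A separate transport gets its own connection, so nothing is shared at this layer.
isolated = carol.connection_for(origin) != alice.connection_for(origin)
```

Under these assumptions `linked` and `isolated` are both true, which is the behaviour the documentation would need to spell out.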
There's also this RFC we have to deal with when it comes to TLS session resumption: https://tools.ietf.org/html/rfc8470

Doesn't seem too bad, except we need to signal to the lifecycle somehow that a session is resumed and add the
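A minimal sketch of two client-side rules from RFC 8470, as I understand them: only requests with safe methods may go out in TLS early data, and a 425 (Too Early) response means the request should be retried once the handshake has completed, not in early data again. The function names and the `resumed` / `sent_in_early_data` flags are hypothetical; how the lifecycle would actually expose "this session is resumed" is still the open question.

```python
# Safe methods as defined by HTTP semantics; anything else must wait for the handshake.
SAFE_METHODS = frozenset({"GET", "HEAD", "OPTIONS", "TRACE"})
TOO_EARLY = 425  # status code defined by RFC 8470


def may_send_in_early_data(method: str, resumed: bool) -> bool:
    """Early data only exists on a resumed session, and only safe methods qualify."""
    return resumed and method.upper() in SAFE_METHODS


def should_retry_after_handshake(status: int, sent_in_early_data: bool) -> bool:
    """A 425 on an early-data request asks the client to retry after the handshake."""
    return sent_in_early_data and status == TOO_EARLY
```

For example, `may_send_in_early_data("POST", resumed=True)` is false, so a POST would always wait for the full handshake in this sketch.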
[This started as a comment on https://github.com//pull/159 but then I realized that it was really a general discussion of session state rather than anything much to do with that design sketch in particular, so I decided to post it here instead. But for context, that PR has a sketch of a session store with methods like

I guess the biggest architectural question here is whether we want a single uber-session-store that holds all this data, or separate objects to hold different kinds of state. Above, I suggested that we might want to handle Alt-Svc/HSTS/session-tickets at the connection pool level, rather than at the session level, so that would argue for splitting them off from cookies/redirects/caching.

Also, I think permanent redirects are a special case of a cached response? Based on this SO thread, it sounds like that's how browsers treat them. In particular, if a 301 has explicit cache control headers, then those are respected and can override the normal "permanent" caching. So maybe we just need a response caching API, and can drop the redirect caching part.

Cookies are also somewhat special: all the other bits of state we're talking about here are fundamentally optimizations – they can make things faster or more secure, but if you lose the state then it doesn't really matter; HTTP's semantics are unaffected. But for cookies, this isn't true; they have a massive effect on HTTP semantics. I'm not sure what the consequences of this are.

Okay, let's turn it around and think top-down for a bit. I'm tentatively imagining an architecture like this:
Does that all make sense? I feel like it's one of those things that's conceptually pretty simple but trying to explain it makes it seem super complicated. So let's consider a few different user scenarios:
Hmm. I feel like I understand the problem better, but not like I'm necessarily any closer to answering the original question about what the API should look like :-).
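On the earlier point that a permanent redirect is just a special case of a cached response: a rough sketch of how a response cache might decide how long to keep a redirect. This is an assumption-heavy simplification, not how any particular browser or library does it. `redirect_cache_lifetime` is a hypothetical name, only `no-store` and `max-age` are handled, 308 is treated like 301 as an assumption, and the header lookup is case-sensitive for brevity.

```python
import math
from typing import Mapping, Optional

# Assumption: 301 and 308 are cached "forever" unless the headers say otherwise.
PERMANENT_REDIRECTS = {301, 308}


def redirect_cache_lifetime(status: int, headers: Mapping[str, str]) -> Optional[float]:
    """Return seconds to cache a redirect, math.inf for "permanent", None for never."""
    directives = {}
    for part in headers.get("Cache-Control", "").split(","):
        name, _, value = part.strip().partition("=")
        if name:
            directives[name.lower()] = value.strip('"')

    # Explicit cache-control headers win over the default "permanent" behaviour.
    if "no-store" in directives:
        return None
    if "max-age" in directives:
        try:
            return float(max(0, int(directives["max-age"])))
        except ValueError:
            return None  # unparsable max-age: be conservative and do not cache

    if status in PERMANENT_REDIRECTS:
        return math.inf
    return None  # temporary redirects are not cached without explicit headers
```

With this shape, a 301 carrying `Cache-Control: max-age=60` is remembered for a minute rather than forever, which is the override behaviour described above, and no separate redirect cache is needed.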
100% agree that AltSvc, TLSTickets, and HSTS can be handled at the connection manager level. We do sacrifice a bit on keeping HSTS only in memory but not much, and websites that really care about their HSTS will end up on the preload list via

I'm not entirely convinced about having

    def request(method, url, <request options>, <session options>):
        session = Session(<session options>)
        return session.request(method, url, <request options>)

Maybe we do want to drive home the point that

The session object seems like a good place to hold onto the

A thought on in-memory cache for the "User doesn't want disk but does want caching" use-case: Because it's so easy to configure in-memory (maybe via
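For reference, a runnable version of the one-shot `request()` helper from this comment, under the assumption that session options are passed as a dict and request options as keyword arguments. The `Session` here is a stand-in that only records what it was asked to do; its constructor and return value are hypothetical.

```python
class Session:
    """Stand-in session: stores its options and starts with an empty cookie jar."""

    def __init__(self, **session_options):
        self.options = session_options
        self.cookies = {}

    def request(self, method, url, **request_options):
        # Placeholder for real I/O: echo back what would have been sent.
        return {
            "method": method,
            "url": url,
            "request_options": request_options,
            "cookies": dict(self.cookies),
        }


def request(method, url, *, session_options=None, **request_options):
    # A throwaway session per call: nothing survives to the next request.
    session = Session(**(session_options or {}))
    return session.request(method, url, **request_options)


response = request("GET", "https://example.com/", timeout=5)
```

Because each call builds its own `Session`, two consecutive `request()` calls never share cookies or any other session-level state in this sketch.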
@sethmlarson points out that there are a bunch of bits of state that HTTP clients might want to track across otherwise-independent requests:
Traditionally, HTTP client libs handle cookies but mostly drop the other stuff on the floor. Maybe we should have a more comprehensive strategy. I think he's thought about this some, so I'll let him fill in more details :-)