Idea: TreeVar (context variables that follow stack discipline) #1523
First reactions:
Is this a problem?
Answer: Maybe? I see your point about trio-asyncio. But in your example, I'm not sure it would be so bad if
async with trio.open_nursery() as nursery:
    async with trio.open_tcp_stream(...) as stream:
        nursery.start_soon(some_stream_task, stream)
Seems like a pretty similar case, it does confuse new users sometimes, but... there's not really anything to do about it except teach folks what things mean. I think the other way around might be potentially more compelling? If I do:
global aio_supporting_nursery
async with trio_asyncio.open_loop() as loop:
    async with trio.open_nursery() as aio_supporting_nursery:
        await sleep_forever()
# Elsewhere:
aio_supporting_nursery.start_soon(aio_task)
Then currently that won't work, but maybe it would be convenient if it did? I'm not sure whether this example is realistic or contrived :-) Another option I'll throw out there: trio-asyncio could do some hacky stuff like, find the loop by walking up the task tree and checking for
Should all contextvars work this way?
It's a bit arbitrary whether new tasks inherit their context from the task that called I think it actually is. Here's a fairly realistic example: consider a web app, that needs to spawn a background task in response to a request. (And maybe there will be other requests later to check on the status of the background job, etc.) Generally this will require a global nursery to handle the background tasks, and a request handler that spawns a task into this nursery, and then reports back that the task has been started. Now, let's say you're using a logging library that assigns each request a unique id, and then attaches that to all log messages generated by that request. You probably want that request-id to propagate into the background task, so any logs it generates can be traced back to the originating request. (Or at the least, that's more useful than every background task losing the logging context entirely!) That's what Trio gives you now, with context propagating along
If we do it, how should we implement it?
Instead of tuples like |
More potential use cases to think about:
Of course both of these cases involve deep involvement with trio's internals, so it would be easy to handle them as special cases, by putting some dedicated attributes on |
I don't know if I'd say that a lot of libraries would need it, but I think it's a feature that will be useful to libraries in at least some cases, and I don't think it costs much to support it in Trio core. This also fits with our principle of "as much of Trio as possible should be implemented in ways that don't require magic access to internals". |
Another use: Trio wants to clean up async generators in one way (#265); asyncio uses a different approach. trio-asyncio should use the right set of async generator semantics for the flavor of code that first iterates the async generator. A scopevar is the natural way to implement this. I think this pushes me over the edge into "yes, useful". |
Isn't that just a matter of being able to sniff out which mode we're running in at any given moment? Trio-asyncio already has ways to do that, doesn't it? |
I was imagining that the global asyncgen hooks (presumably installed by Trio) would check the scopevar for an optional override of "what hooks to use in this context". That way Trio doesn't have to be directly aware of trio-asyncio. If the hooks were specific to trio-asyncio, then I agree that they could just use the sniffio state, but I think having trio-asyncio reliably install its own global hooks might be tricky in practice. |
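A minimal sketch of that idea, using only the standard library: one globally installed firstiter hook that consults a context variable for an optional override. The names firstiter_override, default_firstiter and asyncio_flavored_firstiter are hypothetical, a plain ContextVar stands in for the proposed scopevar, and this is not how Trio or trio-asyncio actually install their hooks.

```python
import contextvars
import sys

# Hypothetical override slot; None means "use the default hook".
# A plain ContextVar stands in for the proposed scopevar here.
firstiter_override = contextvars.ContextVar("firstiter_override", default=None)

seen = []  # records which hook handled which async generator


def default_firstiter(agen):
    seen.append(("default", agen.__name__))


def asyncio_flavored_firstiter(agen):
    seen.append(("override", agen.__name__))


def dispatching_firstiter(agen):
    # The single global hook: defer to the override if one is set
    # in the context that first iterates the async generator.
    hook = firstiter_override.get() or default_firstiter
    hook(agen)


async def numbers():
    yield 1


def first_item(agen):
    """Drive one step of an async generator without an event loop."""
    try:
        agen.__anext__().send(None)
    except StopIteration as exc:
        return exc.value


old_hooks = sys.get_asyncgen_hooks()
sys.set_asyncgen_hooks(firstiter=dispatching_firstiter)
try:
    first_item(numbers())  # no override set: default hook runs
    ctx = contextvars.copy_context()
    ctx.run(firstiter_override.set, asyncio_flavored_firstiter)
    ctx.run(first_item, numbers())  # override visible only inside ctx
finally:
    sys.set_asyncgen_hooks(*old_hooks)
```

The library that owns the global hook never needs to know who set the override; it only reads the variable.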
trio-asyncio just uses a contextvar to store the loop. That works for applications because you typically have one loop at top level and there is no "outside" to call into trio-asyncio from. It does not work quite as well for a Trio library which wants to use some asyncio code. I'd recommend to simply add a Untested code:
|
The problem is that there isn't currently any "nursery's context". There is the nursery's parent task's context, but that might reflect changes that were made after opening the nursery and would otherwise only ever be visible to the body of the nursery's
I agree that we can't treat all contextvars in this scoped fashion -- I think @njsmith did a good analysis of that above. But I think having one context for traditional contextvars and another for scopevars will work out fine in practice. Copying contexts is really cheap -- that's almost their defining feature, relative to something more mundane like a dictionary. This is important in asyncio where every single callback registration does a context copy. I think Trio can get away with doing two context copies rather than one on each new task creation. |
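To make the "captured at nursery-open time" distinction concrete, here is a small standard-library sketch. It only shows that contextvars.copy_context() yields a snapshot that later set() calls in the original context do not affect; the variable name request_id and the values are made up for illustration, and no Trio API is involved.

```python
import contextvars

request_id = contextvars.ContextVar("request_id", default=None)


def main():
    request_id.set("at-open")
    # Stand-in for "the nursery captures its enclosing context here".
    snapshot = contextvars.copy_context()
    # A later change in the parent task, e.g. inside the nursery body.
    request_id.set("after-open")
    return (
        snapshot[request_id],          # value as of the snapshot
        request_id.get(),              # value in the live context
        snapshot.run(request_id.get),  # code run inside the snapshot
    )


# Run in a fresh copy so the demo does not leak state into the caller.
result = contextvars.copy_context().run(main)
```

Here result is ("at-open", "after-open", "at-open"): the snapshot keeps the value it had when taken.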
If we provide a similar API for frameworks and libraries like:
then pretty much an effects system is implemented. Compared with a conventional algebraic effects construct, we use a contextvar plus duck-typed function arguments in place of an Algebraic Data Type, which would specify the calling convention in a well-typed way. data SomeEffect = SomeEffect {
arg1 :: Int
, arg2 :: String
}; Functional language implementers tend to use continuation passing to implement effect calls, so that even exception handling could be implemented by a library as an effect. We don't have to go that way in a procedural, imperative language where exception handling is already established. And effect calls don't always have to be about function calls; resolving a value in context is a perfectly valid use case IMO, and such a pair of APIs ( |
An unresolved question: what should happen around |
Sorry, I'm not familiar with Trio yet. But I'm surprised to hear that a task can move between scopes. Isn't the idea of structured concurrency to describe tasks with lexical scoping? How can they move? Or is the task @oremanj refers to actually an asyncio task? Does Trio use asyncio loops/tasks under the hood just as resources? |
That's certainly a thing that you can do. But it's not clear to me what the goal is. What problem are you trying to solve?
We don't support moving tasks between scopes in general, but for Trio is completely separate from asyncio; we don't use any asyncio code.
My first intuition for This would be pretty straightforward to handle as a one-off in
I was thinking that trio-asyncio would have hooks like:
if in_asyncio_mode:
    ...
else:
    call_original_hook(...)
And every time it starts a loop, it would check if it has its own hooks installed, and if not then install them. You can't use trio-asyncio unless you're already using trio, and trio will install its hooks unconditionally at the beginning of Overall, I'm just not quite comfortable that we understand the situation well enough yet to commit to adding a whole new variable mechanism... I want to think harder about what exactly trio-asyncio needs, since that seems like the one that's potentially really compelling. I'm also pondering: the critical thing about the One case is logging libraries. I suspect that these prefer the "causal" context propagation along Another case is
with decimal.localcontext() as ctx:
    ctx.prec = 42
    nursery.start_soon(...)
Would users expect that high-precision setting to be inherited by the child task, or not? |
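For reference, a sketch of what the spawner-inherits rule discussed in this thread would mean for the decimal case. It assumes a standard CPython build, where decimal keeps its current context in a contextvar, and it uses contextvars.copy_context() as a stand-in for the context copy made when a task is spawned; no Trio call is made.

```python
import contextvars
import decimal


def spawn_inside_localcontext():
    with decimal.localcontext() as ctx:
        ctx.prec = 42
        # Stand-in for nursery.start_soon(...): the new task gets a
        # copy of the spawner's context as it is at this moment.
        child_ctx = contextvars.copy_context()
    return child_ctx


outer_prec = decimal.getcontext().prec
child_ctx = spawn_inside_localcontext()

# What the "child task" would see when it runs later.
child_prec = child_ctx.run(lambda: decimal.getcontext().prec)

# The spawner's own precision is restored once the with block exits.
after_prec = decimal.getcontext().prec
```

Under these assumptions the child sees the precision of 42 even though the with block has already exited in the spawner, which is exactly the behavior the question is asking users to have an opinion about.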
Also, I'm not sure "scope" is the best name here... a lot of things are scopes :-). Maybe |
A simplified motivating example for my case, though maybe less relevant to Trio so far:
class CurrSymbol:
    @staticmethod
    def curr():
        ...
class Price:
    def __init__(self, field='close'):
        self.field = field
    def tensor(self):
        return cxx.Price(CurrSymbol.curr(), self.field)
class RSI:
    def __init__(self, n, upstream=None):
        self.n = n
        self.upstream = upstream
    def tensor(self):
        return cxx.RSI(self.n, (self.upstream or Price()).tensor())
class SMA:
    def __init__(self, n, upstream=None):
        self.n = n
        self.upstream = upstream
    def tensor(self):
        return cxx.SMA(self.n, (self.upstream or Price()).tensor())
def trade(ind1, ind2):
    tensor1 = ind1.tensor()
    ...
def uiTradeBtn():
    sym = uiSymDropdown.sel() # AAPL
    ctor1 = eval(uiIndCode.text()) # SMA(10, RSI(5))
    with CurrSymbol(sym):
        trade(ctor1, ...)
This is similar to something in my current system, written in synchronous Python, which assembles a computation graph from tensors written in C++. In my next-generation system, with an effect system and dedicated syntax in Edh, I'll write it like:
class Price {
    method __init__(field as this.field='close') pass
    method tensor() {
        hs.Price(perform CurrSymbol, field)
    }
}
class RSI {
    method __init__(
        n as this.n,
        upstream as this.upstream=None,
    ) pass
    method tensor() {
        hs.RSI(n, case this.upstream |> Price() of
            { upstream } -> upstream.tensor() )
    }
}
class SMA {
    method __init__(
        n as this.n,
        upstream as this.upstream=None,
    ) pass
    method tensor() {
        hs.SMA(n, case this.upstream |> Price() of
            { upstream } -> upstream.tensor() )
    }
}
method trade(ind1, ind2) {
    tensor1 = ind1.tensor()
    ...
}
method uiTradeBtn() {
    sym = uiSymDropdown.sel() # AAPL
    effect CurrSymbol = sym
    ctor1 = eval(uiIndCode.text()) # SMA(10, RSI(5))
    trade( ctor1, ... )
} to assemble tensors written in Haskell. I feel similar needs could exist for contextual resources in web request handling, but I don't have a concrete example. |
As a user I would either set up that context inside the task (if it's task-specific anyway), or before creating the nursery (if it is not).
On the other hand, if workers are started without queuing the job, as in
they won't inherit that context. This may be kind of surprising. The problem is of course that things like decimal's context are contextvars just like
or maybe
|
Agreed that that's a much better name!
That's a good point, and KI is edge-triggered so we can't fall back on the "just don't move it and it'll get cancelled soon enough" logic we currently use to avoid an analogous problem with cancellations. So we need some special logic for KeyboardInterrupt... but I think a piece of task-local state that propagates along parent/child task relationships is still going to be a useful input to that logic!
Based on your comments in #265, I think we will have at least two libraries that want to extend asyncgen hooks behavior (trio-asyncio and tricycle), but not in the same parts of the task tree. That doesn't necessarily mean Trio needs to support task-local asyncgen hooks out of the box, because it's possible for each library to independently do something that composes well:
But that sort of falls apart if you can't determine "are we in a part of the task tree that wants this library's behavior?". trio-asyncio has an existing mechanism for this, because it can control the environment in which every individual step of an asyncio task runs. But that's not an option for everyone.
Yeah, I'm sufficiently convinced that for normal contextvars the current behavior is fine. For anything that doesn't inherently "expire" at the end of some scope, the causal propagation is strictly more flexible in common cases (where tasks are spawned into a nursery from within that nursery), and if you really need to pick up the nursery's ambient context when spawning from outside of the nursery, you can do
I'm sympathetic to this caution; I just don't see any good way to experiment with this without support from the Trio core. The best compromise I've been able to come up with, in terms of a minimal-size change to Trio itself, is to specify that nurseries capture their enclosing context as But I think the argument for TreeVar in Trio is pretty strong:
|
I implemented a
Now, these problems don't seem insurmountable. I could either build my own In short, +1 to merge #1543 ASAP. |
TreeVar is implemented in the latest release of I would be happy to revive #1543 if there's support for it being merged, but I haven't seen any further input on that PR or this issue since I left my last comment. |
FWIW, I'd like to have them.
Did you miss @richardsheridan's +1? |
A context variable is a convenient way to track something about the scope in which a task executes. Sometimes that "something" effectively constitutes a promise to child tasks that some resource remains usable. For example, trio_asyncio.open_loop() provides the asyncio loop via a contextvar, but it's not much use if the async with block has exited already.
Unfortunately, contextvars are inherited from the environment of the task spawner, not the environment of the nursery in which the task runs. That means they can't reliably provide resources to a task, because the resources might be released before the task finishes. Imagine:
It would be nice to have a kind of contextvar that is inherited by child tasks based on the value it had when their nursery was opened, rather than the value it had when the child task was spawned. (In theory, cancellation status could use this if it existed. In practice, we probably want to continue to special-case cancellation status because it's so fundamental.)
Implementation idea: each scoped contextvar uses an underlying regular contextvar whose value is a tuple (scoped contextvar's value, task that set it). Also, capture the contextvars context when creating a nursery (this is cheap, since contexts are meant to be shallow-copied all the time). When accessing the value of a scoped contextvar, check whether the underlying regular contextvar was set in the task we're currently in. If so, return the value from the underlying regular contextvar in the current context; otherwise, return its value in our parent nursery's captured context.
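The implementation idea can be sketched with the standard library alone. Everything below is a toy model and not Trio code: Task, Nursery, run_child and ScopedVar are hypothetical stand-ins, children run synchronously, and the lookup keeps walking up through enclosing nurseries when the captured value was itself inherited, which is one reading of the description above rather than something the proposal spells out.

```python
import contextvars

_current_task = contextvars.ContextVar("current_task")


class Task:
    """Toy stand-in for a task; it only knows its parent nursery."""

    def __init__(self, name, parent_nursery=None):
        self.name = name
        self.parent_nursery = parent_nursery


class Nursery:
    def __init__(self):
        self.parent_task = _current_task.get()
        # Capture the enclosing context at the moment the nursery opens.
        self.captured = contextvars.copy_context()

    def run_child(self, name, fn):
        # Like start_soon: the child inherits the *spawner's* context.
        child = Task(name, parent_nursery=self)
        ctx = contextvars.copy_context()
        ctx.run(_current_task.set, child)
        return ctx.run(fn)


class ScopedVar:
    def __init__(self, name, default=None):
        self._default = default
        # Underlying regular contextvar holding (value, task that set it).
        self._var = contextvars.ContextVar(name)

    def set(self, value):
        self._var.set((value, _current_task.get()))

    def get(self):
        task = _current_task.get()
        value, setter = self._var.get((self._default, None))
        # If another task set the value, consult the context captured
        # by our parent nursery instead (assumption: repeat upwards).
        while setter is not task:
            nursery = task.parent_nursery
            if nursery is None:
                return self._default
            value, setter = nursery.captured.get(
                self._var, (self._default, None)
            )
            task = nursery.parent_task
        return value


def demo():
    _current_task.set(Task("root"))
    plain = contextvars.ContextVar("plain")
    scoped = ScopedVar("scoped")
    plain.set("outer")
    scoped.set("outer")
    nursery = Nursery()   # captures "outer" for both variables
    plain.set("inner")    # changes made after the nursery opened
    scoped.set("inner")
    child_view = nursery.run_child(
        "child", lambda: (plain.get(), scoped.get())
    )
    return child_view, (plain.get(), scoped.get())


child_view, parent_view = contextvars.copy_context().run(demo)
```

In this model the child sees ("inner", "outer"): the plain contextvar follows the spawner, while the scoped one reflects the value at the time the nursery was opened. The parent still sees ("inner", "inner").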