-
Notifications
You must be signed in to change notification settings - Fork 205
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Revise shared memory multithreading proposal #4095
Conversation
@@ -1195,7 +855,7 @@ to control threads. | |||
```dart | |||
// dart:concurrent | |||
|
|||
abstract class Thread implements Shareable { | |||
abstract class Thread { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The below:
/// ... which uses the
/// spawned thread as an executor for all callbacks: this
/// means the thread will remain alive as long as there is
/// a callback referencing it.
I'm not sure what that means. Are the callbacks Zone
-functions, like scheduleMicrotask
, or are they the callback arguments to those zone functions?
Are executors part of zones, or separator?
(If zones are shareable, and you can share the value with a function run with a different Executor.schedule
, then it seems the executor is not bound by the zone or vice-versa.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think what this comment is trying to say is that if you have some sort of branching asynchronous execution (e.g. you do future.then(cb)
) then cb
will run on the spawned thread.
Maybe this is just bad API: if this behavior is desired users should use executors explicitly.
- introduces the concept a concept of [_shared isolate_](#shared-isolates), | ||
an isolate which only has access to a state shared between all isolates of the | ||
group. This concept allows to bridge interoperability gap with native code. | ||
_Shared isolate_ becomes an answer to the previously unresolved question |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the original answer to this question is that this arbitrary thread should "enter isolate" if it wants to run dart code on this thread.
I think "shared isolate" is an answer to the question of "what if native code wants to run dart code without access to any isolate state, with access to only isolate group state".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the original answer to this question is that this arbitrary thread should "enter isolate" if it wants to run dart code on this thread.
In the current model Dart code can only run in some isolate, so it is given that native must enter some isolate to be able to execute Dart - that's not the question. The question is really which isolate. There are different answers to this question we have explored over the years:
- One could answer same isolate that created callback. But this answer is unsatisfactory because one can't simply enter an isolate if it is busy: you either need to use asynchronous communication or introduces blocking-and-waiting for the isolate to become free.
- One could answer fresh temporary isolate created for the purpose. This answer is also unsatisfactory because there is no uniform answer for the story of static state (and event loop) for such temporary isolate.
Shared isolate gives an answer which is simple to explain and has a very clear semantics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"introduces the concept a concept of" -> "introduces the concept of"
One thing that seems a little unsettling is that shared isolate is not actually an isolate. The dart code that runs on them really runs in the context of containing isolate group, so for example, proposed name for new native callback constructor is NativeCallable.isolateGroupShared
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The shared isolate is an isolate. It's just an isolate that runs code slightly different from the code of the other isolates in the group (in particular, every global variable throws on read or write instead of what the source code says). That also makes it stateless so it can be shared by multiple executors, without anybody being able to see it.
So, as I understand it, the isolate provides the global state and the executor provides that stack state and likely an event loop, which is the dynamic, temporary, execution state of running code. All running code has both. Today they are the same thing.
No non-shared isolate has more than one executor at the same time (today, unless we get real concurrency).
A ReceivePort
belongs to both an isolate and the executor that created it. Anything that can schedule or create events belong to an executor, and cannot be shared between different executors.
I think we'll want executor-local "global" state, not just isolate-local state and isolate-group state (shared globals). For example, Zone._current
should probably be executor-local because it reflects position in the current stack, which code is currently executing. And Executor.current
should work in all isolates, even shared isolates, so it's "not state"? Or it's executor-local state, initialized to an object representing the current executor.).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shared isolate gives an answer which is simple to explain and has a very clear semantics.
Maaaaybe.
The semantics are very simple. Any code that tries to read or write a global or static non-const
variable will throw instead.
Being simple doesn't necessarily make it clear or easy to reason about, because it's impossible to see on a function whether it, or anything it calls recursively, may access a global variable.
Something like print
, Iterable.toString
, BigInt(1)
or everything in dart:async
does access global state today. ByteData
fails if it needs Endian.host
. You can't call an async
function, it throws ... at some point. Maybe immediately, maybe at the first await
, depending on when it tries to create the Future
it returns, which isn't specified today because creating a Future
object is not expected to be able to fail.
In almost all of these examples, the global state can either be shared (isolate-group-wide, has the same value in every isolate, and doesn't contain anything non-shareable) or executor local (caches a value that is used in a single threaded way, but it's OK that it's uninitialized in each new executor).
I'm still worried that it will be a breaking change to add an access to a global variable to any existing code.
That's not great. I'd go as far as saying that we should not have any non-shared non-executor-local global variables at all. Choose one. If you choose nothing, it's executor local by default. Which just means that every normal global/static variable is executor-local, not isolate local, and you can make them isolate-group-shared, and isolates no longer carry state. At which point an executor is an isolate, and we've come full circle.
So maybe we don't need an executor. We just need to use the isolates we have today, and the "shared isolate" is just a fresh isolate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Being simple doesn't necessarily make it clear or easy to reason about, because it's impossible to see on a function whether it, or anything it calls recursively, may access a global variable.
That is fair. I honestly don't know what is the good solution here: it seems costly to try lifting this segregation into the type-system so that you could statically know at the call site whether or not the invocation is valid in the context of shared isolate.
Something like
Iterable.toString
,BigInt(1)
or everything indart:async
does access global state today.ByteData
fails if it needsEndian.host
. You can't call anasync
function, it throws ... at some point. Maybe immediately, maybe at the firstawait
, depending on when it tries to create theFuture
it returns, which isn't specified today because creating aFuture
object is not expected to be able to fail.
As I see it we will over time fix core libraries to allow most of these things to work in the shared isolate.
We just need to use the isolates we have today, and the "shared isolate" is just a fresh isolate.
Fresh isolates don't fit here because you don't want them to have state (and event loop). When you invoke Dart code on some thread that is a transient action, thread might never call Dart code again. When is this isolate spawned and when does it get destroyed? What happens if multiple callbacks are called concurrently? What happens if they are called sequentially but on the same thread? There are so many different ways to answer these questions when you allow isolates to have isolated state - which this state comes need for proper lifecycle and I really want to avoid that.
That being said there is nothing preventing us from allowing developer to explicitly make specific choices here, e.g. imagine we have something like this:
class Isolate {
// Runs the given function in the given isolate.
//
// This method blocks and waits for isolate to become available.
Isolated<T> runSync<T>(T Function() body);
}
// This represents a value produced in the isolate, if it is
// accessed from another isolate it is copied if it is not
// trivially shareable.
class Isolated<T> {
T get value;
}
Then from the shared isolate you can always execute code inside some other isolate which you decided to create and run for these specific purposes. You can also use other means to switch to another isolate: e.g. you can post a task to another isolate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally this is exactly what I've been advocating for (mainly the removal of the type system changes, i.e. the Shared
interface).
Amongst my reasons behind supporting the removal of Shared
(a type system level concept) are:
- It would take long time to migrate the ecosystem packages (possibly many years)
- Many use cases would actually have ownership of an object graph and want to transfer that object graph ownership to another isolate in O(1). In order to send the object graph one would need to mark it's classes all as
Sharable
=> A class beingSharable
does not imply it's thread safe to use on concurrent threads
=> If we make type system changes then at least one would expect to get concurrency safety from it, but aSharable
will not give any guarantees at all - A simple use case like having a helper thread to decode json (that sends decoded json back to main thread) wouldn't even be solved by this because json decoder returns mutable
Map
s/List
s that don't implementSharable
=> I'd want any shared memory multithreading to support decoding json off-thread
I like the proposal and I think it is really really needed, but my biggest frustration with Isolates is that it is incredibly low level and so hard to use that no one uses it (it is incredibly complex to cancel, for example. When do you want to cancel? Always. What if someone closes the screen? Or tap the screen again?). Maybe it would be a different proposal, but at this point all I wish is that Isolates were so easy to use anyone would use (just like async await), like they do with go, kotlin coroutines or swift now. I think this proposal makes sense, but somehow I still hoped for more :( |
Yes, I think being able to trigger and cancel asynchronous/concurrent threads of execution & cancel those is orthogonal to sharing memory among those concurrent threads of execution.
Why do you say "so easy to use ... just like async await"? Generally one cannot cancel an async operation triggered via async/await. Our
In Go: It doesn't seem there's a generic cancel mechanism, instead you pass the go routine a channel and the go routine itself has to react to messages on that channel. One could use the same in Dart: Make an isolate listen on a port and if someone sends a message on that port, the isolate shuts itself down - arguably it's not so easy to use in Dart, but one could make a helper package that provides an abstraction for this. In Kotlin: Coroutines can be cancelled. It's implemented by throwing catchable exceptions in specific places. It's cooperative - so one doesn't have a guarantee it will actually be cancelled / cleaned up. Right now in Dart we have an Maybe this is more of a ask to have proper coroutines in Dart. @bernaferrari Please consider filing new issues for feature requests. |
FWIW the cancellation is actually mentioned in the proposal, in the section covering structured concurrency. I think we have a pretty reasonable design for cancellable futures, but nobody seems to have time to actually implement it. |
This revises shared memory multithreading proposal based on discussions we had internally and current implementation direction.
The biggest change is complete removal of
Shareable
type marker - instead we accept that most likely destination for us would be shared anything multithreading.Another change is that proposal now describes in more details the intermediate shared native multithreading: which does not introduce any fundamentally new capabilities (only native memory can truly be shared), but at the same time addresses a lot of ergonomics issues around interoperability and concurrency model mismatch between Dart and native.
fyi @a-siva @johnmccutchan