Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[move-vm] Relax script function rules #175

Merged
merged 1 commit into from
Apr 4, 2022
Merged

Conversation

tnowacki
Copy link
Contributor

@tnowacki tnowacki commented Apr 1, 2022

Apologies for the large PR, but per directory, the changes are fairly isolated and all related to the extensions to the session API. I'd recommend starting with the vm runtime, then the bytecode verifier, then the file format, then the transactional test runner, then the tests. This is the sort of logical flow that should help the changes make sense.

  • Scripts and public(script) functions can now take arguments of any type and return any values
  • Checks are preserved for legacy file format versions
  • Checks must be invoked by the adapter

Change log:

VM/Language
- Removed execute_script_function entry point
- Removed execute_function_for_effects
+ Added return values to execute_function and execute script
  * lets you implement execute_function_for_effects
- Removed most checks from script_signature verification
  * logic preserved for old files
  * NOTE visibility checks was not preserved and must be done in the adapter,
    since execute_script function was removed
  * This allows structs, references, and arbitrary signers in entry point functions
  * This allows return values for public(script) functions
    * added necessary checks for these changes
+ Added helpers in script_signature verification for adapter checks
+ Bumped VERSION_MAX

Testing
+ Added printing of return values and mutable inputs

Follow ups:
Return values for public(script) (and really anything from execute_function) are now supported.
The same support should be added to normal scripts

Motivation

  • We want to move the verification of entry point functions to the adapter layer

Test Plan

  • Extended transactional test runner
  • Added many new tests

@wrwg
Copy link
Contributor

wrwg commented Apr 1, 2022

This is great! I don't have the cycles to review from where I'm right now, but I'm wondering whether a followup step should be to just deprecate the public(script) fun stuff from the language, and also the old style script modules.

If Move is going to deployed in different ways (Aptos, Sui, Async Move, EVM Move, Starcoin, TBD Solana Move, ...) it appears misleading and counterproductive to pretend there would be a 'standard' transaction model. Rather different deployment environments have different notions of transactions with different context restrictions.

Instead of encoding the model of transactions into the language, we could use attributes to flag the transaction model. For example, in EVM Move, we use #[callable] fun, in Async Move we use #[message] fun, and so on. There is already a (bloody primitive but working) attribute derivation and context checking model in language/move-compiler/src/attr-derivation which is best established so far for Async Move.

@sblackshear @davidiw @jolestar

@tnowacki
Copy link
Contributor Author

tnowacki commented Apr 1, 2022

It is an interesting idea that I hadn't considered (just dropping public(script)).

But because there are no annotations in the CompiledModule, I think the big issue though would be that it then makes a programmers life a bit confusing. It would be unclear at the bytecode level for what functions were intended to be invoked and what functions can actually be invoked by the adapter.
It also makes tooling a bit more difficult, since when you see a compiled module, it isn't entirely clear what functions need a ABI/transaction generation (though this is adapter specific)

Writing this all out I might tweak the session API a bit to do some level of visibility checks...
I agree that we might do away with the whole thing at some point! But it is probably too big of a change for this PR? maybe?

@sblackshear
Copy link
Contributor

sblackshear commented Apr 2, 2022

I think it is indeed valuable to maintain a bytecode-level visibility modifier that represents an entrypoint. Without this, there is no way for a programmer to convey that a function should not be an entrypoint. We actually have this problem Sui right now. We have some entrypoint validity checks that the adapter enforces, but we only want a subset of the functions that pass those checks to be callable as entrypoints. We need this enforcement at the bytecode level because occasionally, an attacker calling an "accidental entrypoint" as an entrypoint would violate key invariants. public(script) (or an equivalent, but better-named bytecode visibility modifier) is the natural mechanism for solving this problem.

@jolestar
Copy link
Contributor

jolestar commented Apr 2, 2022

Giving the adapter more freedom to customize its entrypoints is good, but one of my concerns is that the VM adapter can only check the entry function at runtime.

The developer writes and deploys the modules and then finds out that the function is not defined correctly when he calls the entry function, which affects the development experience.

So is there a way to allow custom entry functions and at the same time check them in the compiler time?

Maybe the #[callable] fun annotation way can achieve this goal?

For example:

#[script]
public fun main(sender: singer,){
}

The function signature requirements for #[script] and public(script) are the same. Now we can use #[script] to replace public(script), and every adapter can define its annotation.

But maybe we need to keep the annotation with the bytecodes for tools like generating ABI, as @tnowacki says.

@sblackshear
Copy link
Contributor

sblackshear commented Apr 2, 2022

Giving the adapter more freedom to customize its entrypoints is good, but one of my concerns is that the VM adapter can only check the entry function at runtime

I think adapters will want/need custom bytecode verifier passes for this (e.g., Sui works this way). At the source level, we should make it easier to write compiler extensions or linters that enforce the same properties as the verifier pass. Neither of these requires annotations from the programmer--both the verifier and compiler can work directly off of the visibility modifier.

Note that the verifier pass eliminates the need to enforce entrypoint signature rules at runtime altogether--the verifier checks the rules once at publish time, and the runtime can then assume they hold for all entrypoints.

gas_status: &mut GasStatus,
) -> ExecutionResult
where
V: Borrow<[u8]>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was done to prevent copies in performance-sensitive adapters (see #136 for more details). We should be careful to preserve that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, didn't see this wasn't done on execute function and other spots (its all over in the runtime)
Will fix

@@ -24,56 +21,52 @@ pub struct Session<'r, 'l, S> {
pub(crate) native_extensions: NativeContextExtensions,
}

/// Result of executing a function in the VM
pub enum ExecutionResult {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to keep this type + an API that returns it around. Existing adapters are relying on it.

In addition, I think it's easier/better for an adapter to work with this structured type than it is to ask each adapter to figure out how to extract this other info (which almost every adapter will need) from other VM API's.

Totally fine with also having a more low-level API that returns SerializedReturnValues of course.

Copy link
Contributor Author

@tnowacki tnowacki Apr 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just execute_function (now execute_entry_point function) mixed with finish. I don't see much of a point of keeping it around. That is, you get the same results from composing those two (except for the gas calculation, but that is pretty easy to do on the callee side, and a bit awkward to do internally IMO since the session doesn't own the gas)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make finish() return this type?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that will have any benefit. It won't have the return values for the function or the gas used; those aren't members of the session.

And it is already fairly descriptive on return type, so I don't think a struct wrapper buys much readability just for finish.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would look at the most recent fixup for a good example of the before/after with this change

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, point taken. I guess I feel that writing an adapter was/is hard and this type made it a bit easier, so I'm a bit sad to see it go away. But making life easy for adapter-writers is a problem for later, no reason to block this improvement on it...

@wrwg
Copy link
Contributor

wrwg commented Apr 2, 2022

Neither of these requires annotations from the programmer--both the verifier and compiler can work directly off of the visibility modifier.

I agree with you that for most customized adapters, technically, this can be done off the existing language visibility modifiers (public(script)). But from a language design perspective, I find it rather problematic that the same syntax in Move would mean different static and dynamic semantics. I think it would be desirable to require attributes for transaction functions to indicate what flavor of Move one is actually targeting,

I also think the current nomenclature of "script" functions is very confusing. Personally, I never understood why we called this "scripts". I've a very different understanding of this notion than how it is used in Move. I would prefer to drop this notion, and mark transaction entry points exclusively via attributes.

@sblackshear
Copy link
Contributor

sblackshear commented Apr 3, 2022

I am fully onboard with changing or deprecating the script name, which no longer makes sense. public(entry) is the alternative I like best, but not religious here.

I would prefer to drop this notion, and mark transaction entry points exclusively via attributes.

As mentioned above, this is not sufficient for giving the programmer the ability to express that a function is not an entrypoint, and to carry that enforcement through to the bytecode level. That power is important.

@tnowacki
Copy link
Contributor Author

tnowacki commented Apr 4, 2022

I am fully onboard with changing or deprecating the script name, which no longer makes sense. public(entry) is the alternative I like best, but not religious here.

I have some longer thoughts on this whole space. Which I will writeup for further discussion elsewhere.


But backing up a bit, this PR should be fairly inline with the previously discussed changes to how these functions work. So it would be super helpful to get a review on this shortly.

Given the heavy session/runtime changes, @vgao1996's would particularly helpful :)

@sblackshear
Copy link
Contributor

But backing up a bit, this PR should be fairly inline with the previously discussed changes to how these functions work. So it would be super helpful to get a review on this shortly.

I have taken a fairly careful look and am happy to approve modulo concerns about keeping the existing API with efficient serialization and a more structured return type are addressed. This will free up Victor to focus on reviewing #160, which I have less context on.

@tnowacki tnowacki force-pushed the adapter branch 3 times, most recently from fb9c32c to feca91d Compare April 4, 2022 20:51
sblackshear
sblackshear previously approved these changes Apr 4, 2022
@@ -24,56 +21,52 @@ pub struct Session<'r, 'l, S> {
pub(crate) native_extensions: NativeContextExtensions,
}

/// Result of executing a function in the VM
pub enum ExecutionResult {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, point taken. I guess I feel that writing an adapter was/is hard and this type made it a bit easier, so I'm a bit sad to see it go away. But making life easy for adapter-writers is a problem for later, no reason to block this improvement on it...

@tnowacki
Copy link
Contributor Author

tnowacki commented Apr 4, 2022

Ok, point taken. I guess I feel that writing an adapter was/is hard and this type made it a bit easier, so I'm a bit sad to see it go away. But making life easy for adapter-writers is a problem for later, no reason to block this improvement on it...

We could do some fancy things here, but not sure how much "easier" it would make things. I can do some experimentation after this lands

@tnowacki
Copy link
Contributor Author

tnowacki commented Apr 4, 2022

/land

- Scripts and public(script) functions can now take arguments of any type and return any values
- Checks are preserved for legacy file format versions
- Checks must be invoked by the adapter

Closes: diem#175
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants