lang: Add QuickCheck<T> that checks owner (and other invariants) quickly, avoiding complete deserialization and consuming memory #2670

snawaz · 2023-10-14T16:52:42Z

for best experience, review commit by commit.

Problem

If a program requires few accounts to be used in CPI calls (or any scenario that uses .to_account_info() only on the accounts), then the developers have 3 choices:

choice1: Account<'info, Mint>,

 ///CHECK: REQUIRED
choice2: UncheckedAccount<'info>,

 ///CHECK: REQUIRED
choice3: AccountInfo<'info>,

choice1 is safe but it's overkill if program never needs to the Mint object itself. It could be slow, as it completely deserializes the account. The biggest problem is, it consumes stack (or heap) memory unnecessarily!
choice2 and choice3 are fast (no deserialization at all) but unsafe — and requires CHECK: doc comment as well.

All choices are suboptimal — choice1, despite being safe, is the worst, as it's memory hungry!

Solution

QuickCheck<T> solves this satisfactorily (at least to me and we're using this solution in production).

safe_and_fast_both: Account<'info, QuickCheck<Mint>>,

It's both safe and fast, as it checks the owner of the account (and if there is any discriminator, that will be automatically checked by the Anchor framework already).
It does not need CHECK: doc comment either.

More

In some cases, we do not have any#[account] types (such as Mint type above), instead we have AccountInfo only but we still want to (rather must) check owner of such account_infos. To support this, this PR also implements trait Owner for the program type that enables us to write this:

#[program]
mod candy_machine { //  generates CandyMachine type to represent thos program
    // .. 
}

// instead of
// some_account: AccountInfo<'info>,
// now we can write this: 
some_account: Account<'info, QuickCheck<CandyMachine>>,

In this code, CandyMachine is not account though — it's a program type. But we can use it to ensure that some_account is owned by CandyMachine.

Tests

I've added tests to cover both scenarios discussed above.

Docs

If this PR looks good to reviewers (and eventually acceptable), I'll also add the docs. So please let know.

…full deserialization

vercel · 2023-10-14T16:52:45Z

@snawaz is attempting to deploy a commit to the coral-xyz Team on Vercel.

A member of the Team first needs to authorize it.

snawaz · 2023-10-14T16:56:48Z

This PR solves the problem in better way than a similar PR (#1475 created a long back) tried to solve (which does not compile, as noted by @armaniferrante ).

acheroncrypto

Thanks for the comprehensive explanation.

safe_and_fast_both: Account<'info, QuickCheck<Mint>>,

How does this compare to the existing owner constraint?

#[account(owner = Mint::owner())]
my_account: UncheckedAccount<'info>,

Aursen · 2023-10-14T22:09:07Z

Why do you not use a new account type for such cases? It would improve the readability of the code. Is this intentional?

snawaz · 2023-10-14T23:24:06Z

@acheroncrypto

How does this compare to the existing owner constraint?
#[account(owner = Mint::owner())]
my_account: UncheckedAccount<'info>,

QuickCheck<T> has 4 (small) advantages:

You do not need to write CHECK: doc comment. With UncheckedAccount<>, you still need to write CHECK: doc comment. The above code wont compile!
From readability point of view, despite putting CHECK: and the constraints as you used, it's often unclear as to what else needs to be checked and what else is missed, as you pointed out here. With QuickCheck<>, we can be assured that except the deserialization, all checks that apply to Account<'info, Mint> also apply to Account<'info, QuickCheck<Mint>>, e.g do we need to check discriminator or not? The framework can automate this, without needing the users to think about the checks again and again.
The mere presence of UncheckedAccount in the derive(accounts) structs feels uncomfortable and unsafe (which is precisely why CHECK doc comments are enforced strictly).
- analogy: It feels like unsafe { .. } block, you get red alerts immediately, whereas QuickCheck<> feels like normal code without unsafe { .. } blocks. All members of derive(accounts) look similar and similarly safe/checked!
UncheckAccount is less readable as we do not see the type (i.e Mint) it is meant to represent. I understand that the type is being used in the constraints just above it, but you need to kinda scan other parts of the code to see it. Also, if someone writes the constraint as #[account(owner = spl_token::ID)] which is the same check, but now the type Mint has disappeared from the scene and even scanning other parts wont help.

In terms of speed and quickness, it has zero advantage.

snawaz · 2023-10-14T23:34:29Z

@Aursen

Why do you not use a new account type for such cases? It would improve the readability of the code. Is this intentional?

How would you write an equivalent of Account<'info, QuickCheck<Mint>> using new account type? (Mint is anchor_spl::token::Mint).

How about Account<'info, QuickCheck<TokenAccount>> ?

If I use new account (wrapper?) type approach more than a couple of times, I'd quickly start feeling the verbosity without much readability improvement. QuickCheck is already readable, as you can see the real account type, meant to be used if you needed full deserialized account.

Aursen · 2023-10-15T07:24:34Z

@Aursen

Why do you not use a new account type for such cases? It would improve the readability of the code. Is this intentional?

How would you write an equivalent of Account<'info, QuickCheck<Mint>> using new account type? (Mint is anchor_spl::token::Mint).

How about Account<'info, QuickCheck<TokenAccount>> ?

If I use new account (wrapper?) type approach more than a couple of times, I'd quickly start feeling the verbosity without much readability improvement. QuickCheck is already readable, as you can see the real account type, meant to be used if you needed full deserialized account.

Maybe my question is not very clear. I just mean this syntax:

CpiAccount<'info, Mint>

Against

Account<'info, QuickCheck<Mint>>

snawaz · 2023-10-15T08:12:33Z

Maybe my question is not very clear. I just mean this syntax:
CpiAccount<'info, Mint>

ah, I see. I thought about this solution as well. I've created another PR which implements Immutable<> account type (because the current Account<> type does not enforce compile-time immutability) and my experience is that adding a new account type requires more coding and more syncing (with other account types). I personally do not see that as problem because the syncing issue can be solved by extracting out the common code in a macro or so and use that.

However, @acheroncrypto probably thinks otherwise (as per his feedbacks on the other PR). Let me know what your current thoughts are. I'd happily work on that, if that is more desirable! (and hoping then my Immutable<> PR is approved as well).

We could try implementing a third-solution:

#[account(quick_check)]
safe_and_fast: Account<'info, Mint>,

But my guess is that all these alternative solutions require more code and more changes to the existing codebase.

acheroncrypto · 2023-10-18T14:20:05Z

@acheroncrypto
How does this compare to the existing owner constraint?
#[account(owner = Mint::owner())]
my_account: UncheckedAccount<'info>,
QuickCheck<T> has 4 (small) advantages:

You do not need to write CHECK: doc comment. With UncheckedAccount<>, you still need to write CHECK: doc comment. The above code wont compile!

CHECK: comment is just a best practice recommendation by the framework - it can be skipped easily from either command arguments or workspace level setting in Anchor.toml if you don't want to document why you're not checking the account type.

From readability point of view, despite putting CHECK: and the constraints as you used, it's often unclear as to what else needs to be checked and what else is missed, as you pointed out here. With QuickCheck<>, we can be assured that except the deserialization, all checks that apply to Account<'info, Mint> also apply to Account<'info, QuickCheck<Mint>>, e.g do we need to check discriminator or not? The framework can automate this, without needing the users to think about the checks again and again.

Overall fair point.

The mere presence of UncheckedAccount in the derive(accounts) structs feels uncomfortable and unsafe (which is precisely why CHECK doc comments are enforced strictly).

analogy: It feels like unsafe { .. } block, you get red alerts immediately, whereas QuickCheck<> feels like normal code without unsafe { .. } blocks. All members of derive(accounts) look similar and similarly safe/checked!

Well, that's the intent because it's unsafe! While it may be true that using Account<'info, QuickCheck<Mint>> "look similar and similarly safe/checked", looks can be deceiving. The type is not validated, so it's still considered unsafe.

To illustrate this, let's take a test case you've added:

anchor/lang/src/quick_check.rs

Lines 102 to 108 in 894f6bd

    
           fn test_quick_check_good_case() { 
        
               let mut good_mint_data = ([0; 1024], 10959, Mint::owner()); 
        
               let good_mint_info = new_account_info(&mut good_mint_data); 
        
               let good_mint = Account::<QuickCheck<Mint>>::try_from(&good_mint_info); 
        
               assert!(good_mint.is_ok()); 
        
               assert_eq!(*good_mint.unwrap(), deserialized_value()); 
        
           }

-        let mut good_mint_data = ([0; 1024], 10959, Mint::owner());
+        struct WrongAccount;
+        impl anchor_lang::Owner for WrongAccount {
+            fn owner() -> Pubkey {
+                spl_token::ID
+            }
+        }
+
+        let mut good_mint_data = ([0; 1024], 10959, WrongAccount::owner());

Make these changes and the test would still pass even though we are using WrongAccount.

UncheckAccount is less readable as we do not see the type (i.e Mint) it is meant to represent.
Similar to the last point, Account<'info, QuickCheck<Mint>> looks like it represent Mint but one could supply a TokenAccount and it would still work - so what's the type here? This is saying it's type Mint but not actually checking if it's type Mint. I'd say it's very hard to tell what Account<'info, QuickCheck<Mint>> does meanwhile #[account(owner = spl_token::ID)] is very explicit in what it does. I think we have quite different definitions for what's readable, for me

#[account(owner = spl_token::ID)]
my_account: UncheckedAccount<'info>,

is much more readable and understandable compared to

my_account: Account<'info, QuickCheck<Mint>>,

I understand that the type is being used in the constraints just above it, but you need to kinda scan other parts of the code to see it.

Given the idiomatic way of doing account validation in Anchor is with #[account] macro, I'd argue looking at inside the #[account] macro is not "scanning other parts of the code" because that's the default location one is supposed to look to see the account constraints.

Also, if someone writes the constraint as #[account(owner = spl_token::ID)] which is the same check, but now the type Mint has disappeared from the scene and even scanning other parts wont help.

The type Mint in Account<'info, QuickCheck<Mint> was never there in the first place as it's not being checked.

At the end of the day, this all comes down to personal preference. Anchor, being an opinionated framework, has idiomatic way of doing things and allowing multiple ways of doing the same thing usually results in people getting confused rather than the intended effect of flexibility.

snawaz · 2023-10-18T16:58:24Z

Well, that's the intent because it's unsafe! While it may be true that using Account<'info, QuickCheck<Mint>> "look similar and similarly safe/checked", looks can be deceiving. The type is not validated, so it's still considered unsafe.

To illustrate this, let's take a test case you've added:

anchor/lang/src/quick_check.rs

Lines 102 to 108 in 894f6bd

fn test_quick_check_good_case() {

let mut good_mint_data = ([0; 1024], 10959, Mint::owner());

let good_mint_info = new_account_info(&mut good_mint_data);

let good_mint = Account::<QuickCheck<Mint>>::try_from(&good_mint_info);

assert!(good_mint.is_ok());

assert_eq!(*good_mint.unwrap(), deserialized_value());

}
-        let mut good_mint_data = ([0; 1024], 10959, Mint::owner());
+        struct WrongAccount;
+        impl anchor_lang::Owner for WrongAccount {
+            fn owner() -> Pubkey {
+                spl_token::ID
+            }
+        }
+
+        let mut good_mint_data = ([0; 1024], 10959, WrongAccount::owner());
Make these changes and the test would still pass even though we are using WrongAccount.

Yes, the test would pass because QuickCheck<> does not check the discriminator currently — I've only implemented the basic idea, not everything. As I said earlier, Account<'info, QuickCheck<Mint>> must check EVERYTHING that Account<'info, Mint> does, except the former does not deserialize the type itself.

Now if you think deserialization ensures some safety, then that's not true at all. Deserialization merely interprets buffer as the given type and for most types you'll get some value (even if it does not make sense to you) without any failure — it might fail at later stage though when you try to use the deserialized type but that's a different topic altogether.

The following test defines 4 types, one of which is Correct type, the rest are Wrong types. If you replace these types with the generated code (by the macro), and then comment the discriminator checks for all the types, the test would pass (this gist shows the complete test)!

    #[account]
    #[derive(InitSpace, Debug)]
    pub struct Correct {
        pub simple: u32,
        #[max_len(4)]
        pub complicated: Vec<u32>,
    }

    #[account]
    #[derive(InitSpace, Debug)]
    pub struct Wrong1 {
        pub same_size_array: [u32; Correct::INIT_SPACE],
    }

    #[account]
    #[derive(InitSpace, Debug)]
    pub struct Wrong2 {
        pub smaller_size_array: [u32; Correct::INIT_SPACE / 2],
    }

    #[account]
    #[derive(InitSpace, Debug)]
    pub struct Wrong3 {
        pub signer_bump: u8,
        pub usdc_balance: u64,
        pub deposit_count: u32,
        pub withdraw_count: u32,
    }

    #[test]
    fn test_deserialization_does_not_ensure_safety() {
        let mut data = ([0; 1024], 10959, crate::ID);
        let info = new_account_info(&mut data);

        let correct = Account::<Correct>::try_from(&info);
        assert!(correct.is_ok(), "correct: {:?}", correct);

        let wrong1 = Account::<Wrong1>::try_from(&info);
        assert!(wrong1.is_ok(), "wrong1: {:?}", wrong1);

        let wrong2 = Account::<Wrong2>::try_from(&info);
        assert!(wrong2.is_ok(), "wrong2: {:?}", wrong2);

        let wrong3 = Account::<Wrong3>::try_from(&info);
        assert!(wrong3.is_ok(), "wrong3: {:?}", wrong3);
    }

You can think of this another way: if the deserialization to the type would have provided any safety-guarantee, then Anchor would not have the notion of "discriminator". It's the discriminator which ensures safety!

Also, impl anchor_lang::Owner for WrongAccount would be generated by Anchor, so we wont have wrong owner as your example shows.

snawaz · 2023-10-18T17:09:25Z

Given the idiomatic way of doing account validation in Anchor is with #[account] macro, I'd argue looking at inside the #[account] macro is not "scanning other parts of the code" because that's the default location one is supposed to look to see the account constraints.

Yes, I thought of using #[account(validation..)] approach, as I mentioned the following in one of my comment earlier:

#[account(quick_check)]
safe_and_fast: Account<'info, Mint>,

But later I realized, it wont work, because all of the members of Account<'info, Mint> still needs to be initialized properly but one member account: T cannot be initialized without deserialization which we want to avoid. If we change the definition of Account, and keep the member account behind a pointer-like thing (Rc or Option), then maybe that'd work. But I don't (currently) think that's a better solution.

Also, if someone writes the constraint as #[account(owner = spl_token::ID)] which is the same check, but now the type Mint has disappeared from the scene and even scanning other parts wont help.

The type Mint in Account<'info, QuickCheck<Mint> was never there in the first place as it's not being checked.

In the previous comment, I explained the type Mint will be checked using its discriminator. Deserialization is not necessary — neither does it ensure any safety!

acheroncrypto · 2023-10-18T23:50:00Z

Also, if someone writes the constraint as #[account(owner = spl_token::ID)] which is the same check, but now the type Mint has disappeared from the scene and even scanning other parts wont help.

The type Mint in Account<'info, QuickCheck<Mint> was never there in the first place as it's not being checked.

In the previous comment, I explained the type Mint will be checked using its discriminator. Deserialization is not necessary — neither does it ensure any safety!

Is the Mint discriminator you've been talking about in the room with us right now?

snawaz · 2023-10-19T02:28:53Z

Is the Mint discriminator you've been talking about in the room with us right now?

I took it as example. If you're referring to the real anchor_spl::Mint, then it's not with us, in which case Account<'info, Mint> is not any more safer than Account<'info, QuickCheck<Mint>> either, as neither will check for its discriminator!

Like I said earlier, Account<'info, QuickCheck<Mint>> must check EVERYTHING that Account<'info, Mint> does.

Does Account<'info, Mint> check for its discriminator? No. It does not have any discriminator to begin with. That means, the data which gets deserialized into Mint will also get deserialized into WrongAccount as well, which you defined earlier.

struct WrongAccount; 
// some impl deserialization
impl anchor_lang::Owner for WrongAccount {
      fn owner() -> Pubkey {
          spl_token::ID
      }
 }

acheroncrypto · 2023-10-19T15:26:42Z

Is the Mint discriminator you've been talking about in the room with us right now?

I took it as example. If you're referring to the real anchor_spl::Mint, then it's not with us, in which case Account<'info, Mint> is not any more safer than Account<'info, QuickCheck<Mint>> either, as neither will check for its discriminator!

This is incorrect — not checking the non-existent discriminator does not make them equally safe or unsafe. Account<'info, Mint> will fail to deserialize if the account is not a Mint account unlike Account<'info, QuickCheck<Mint>> which will NOT fail if the account is not a Mint account.

The following will ONLY accept Mint account owned by the spl-token program(safe):

#[derive(Accounts)]
pub struct MintAccountIx<'info> {
    pub mint: Account<'info, Mint>,
}

in contrast, the following(with QuickCheck) will accept ANY account owned by the spl-token program(unsafe):

#[derive(Accounts)]
pub struct QuickCheckMintAccountIx<'info> {
    pub quick_check_mint: Account<'info, QuickCheck<Mint>>,
}

Not having a discriminator does not make an account unsafe as Native programs have their own way of making sure the accounts are uniquely deserialized. We also do not blindly interpret the account data for the wrapper accounts like you might assume. As shown by the example above, inputting a token account to the MintAccountIx will fail with a deserialization error(as it should) but QuickCheckMintAccountIx will NOT fail. Therefore, your implementation of QuickCheck is unsafe.

Which part of being able to input a different account type than the one specified in the program being unsafe is unclear to you?

snawaz · 2023-10-19T16:28:30Z

This is incorrect — not checking the non-existent discriminator does not make them equally safe or unsafe. Account<'info, Mint> will fail to deserialize if the account is not a Mint account unlike Account<'info, QuickCheck<Mint>> which will NOT fail if the account is not a Mint account.

You missed the following two points that I mentioned about the verification:

Account<'info, QuickCheck<Mint>> must check EVERYTHING that Account<'info, Mint> does.
- The entire deserialization is not necessary. Only few checks need to be done. Also, the Mint object does not need to be stored (and occupy memory) if you think deserialization is the only way to verify it.
The current implementation of QuickCheck does not check everything yet. I have said it before that once we agree with the idea, we can implement the rest of the things.

Therefore, your implementation of QuickCheck is unsafe.

Yes, because it's not complete.

Here are my thoughts: `trait AccountVerification`

Currently the deserialization does two things:

Verify the correctness of account.
- Anchor verifies accounts using the discriminator approach.
- Native programs verify using specific bits/bytes and the length of the data. The rest of the buffers do not contribute to any verification and thus can be safely ignored!
- User's programs could verify their selected accounts in their own way, like zero-copy accounts or accounts which do not have discriminator.
Interpret the buffer into the given type. This part does not contribute to any verification (by definition).

If we separate the first step into its own trait, say trait AccountVerification, that can be used by QuickCheck to verify the accounts without full-deserialization.

In some corner cases where looking for specific bytes that contribute to verification is hard, OR there are too many such bytes, OR the full-deserialization can be quick (because the account is tiny) then deserialization can be done, to verify it and discard the deserialized object immediately to free memory, as it's not necessary to keep it around.

That means, QuickCheck can do full-deserialization (if we do not agree on AccountVerification) and discard the deserialized object immediately (if successful), that'd still be a huge improvement as it'd save lots of heap/stack memory! But I'm looking for a better and faster solution.

All I'm looking for a way to avoid consuming stack/heap memory unnecessarily, as I mentioned in my first post it's the biggest problem that can be solved using such approach.

snawaz · 2023-10-19T17:14:57Z

@acheroncrypto and @Aursen

I changed the implementation and now it uses full-deserialization to verify the accounts, just like Account<'info, Mint>. However, unlike Account<'info, Mint>, it's still memory efficient as it discards the deserialized object immediately, to save memory! The name QuickCheck<Mint>, however, is not good now. Maybe Check<Mint> would be good.

An alternative solution could to have Account<>-like type but that does not store the object.

struct CheckedAccount<'info, T> {
     info: AccountInfo<'info>,
    _marker: PhantomData<T>,
}

It's much more work though.

…memory

…accounts

snawaz · 2023-10-23T14:12:40Z

I wonder why an unrelated code is failing due to clippy. Looks like something has changed on the ci, maybe due to the upstream dependencies?

Add QuickCheck<T> that only checks owner (and discriminator) without …

2717cf9

…full deserialization

snawaz force-pushed the quick-check branch from c43336c to b0edac9 Compare October 14, 2023 17:07

snawaz added 2 commits October 14, 2023 22:47

Add tests for QuickCheck

d7a9e6c

Impl Owner for #program so that users could use it with QuickCheck

894f6bd

snawaz force-pushed the quick-check branch from 9d58d9d to 894f6bd Compare October 14, 2023 17:18

acheroncrypto added the lang label Oct 14, 2023

acheroncrypto reviewed Oct 14, 2023

View reviewed changes

snawaz added 2 commits October 20, 2023 00:50

Use full-deserialization and discard the deserialized object to save …

c7d53f0

…memory

Remove program_type test as we cannot used it anymore with zero-copy …

679537d

…accounts

snawaz force-pushed the quick-check branch from fe90bff to 679537d Compare October 19, 2023 19:20

Make clippy happy => -D clippy::unwrap-or-default

2a54334

snawaz force-pushed the quick-check branch from b74767b to 2a54334 Compare October 23, 2023 16:23

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

lang: Add QuickCheck<T> that checks owner (and other invariants) quickly, avoiding complete deserialization and consuming memory #2670

lang: Add QuickCheck<T> that checks owner (and other invariants) quickly, avoiding complete deserialization and consuming memory #2670

snawaz commented Oct 14, 2023 •

edited

Loading

vercel bot commented Oct 14, 2023

snawaz commented Oct 14, 2023

acheroncrypto left a comment

Aursen commented Oct 14, 2023

snawaz commented Oct 14, 2023 •

edited

Loading

snawaz commented Oct 14, 2023 •

edited

Loading

Aursen commented Oct 15, 2023

snawaz commented Oct 15, 2023 •

edited

Loading

acheroncrypto commented Oct 18, 2023

snawaz commented Oct 18, 2023

snawaz commented Oct 18, 2023 •

edited

Loading

acheroncrypto commented Oct 18, 2023

snawaz commented Oct 19, 2023 •

edited

Loading

acheroncrypto commented Oct 19, 2023

snawaz commented Oct 19, 2023 •

edited

Loading

snawaz commented Oct 19, 2023 •

edited

Loading

snawaz commented Oct 23, 2023

lang: Add QuickCheck<T> that checks owner (and other invariants) quickly, avoiding complete deserialization and consuming memory #2670

Are you sure you want to change the base?

lang: Add QuickCheck<T> that checks owner (and other invariants) quickly, avoiding complete deserialization and consuming memory #2670

Conversation

snawaz commented Oct 14, 2023 • edited Loading

Problem

Solution

More

Tests

Docs

vercel bot commented Oct 14, 2023

snawaz commented Oct 14, 2023

acheroncrypto left a comment

Choose a reason for hiding this comment

Aursen commented Oct 14, 2023

snawaz commented Oct 14, 2023 • edited Loading

snawaz commented Oct 14, 2023 • edited Loading

Aursen commented Oct 15, 2023

snawaz commented Oct 15, 2023 • edited Loading

acheroncrypto commented Oct 18, 2023

snawaz commented Oct 18, 2023

snawaz commented Oct 18, 2023 • edited Loading

acheroncrypto commented Oct 18, 2023

snawaz commented Oct 19, 2023 • edited Loading

acheroncrypto commented Oct 19, 2023

snawaz commented Oct 19, 2023 • edited Loading

Here are my thoughts: trait AccountVerification

snawaz commented Oct 19, 2023 • edited Loading

snawaz commented Oct 23, 2023

snawaz commented Oct 14, 2023 •

edited

Loading

snawaz commented Oct 14, 2023 •

edited

Loading

snawaz commented Oct 14, 2023 •

edited

Loading

snawaz commented Oct 15, 2023 •

edited

Loading

snawaz commented Oct 18, 2023 •

edited

Loading

snawaz commented Oct 19, 2023 •

edited

Loading

snawaz commented Oct 19, 2023 •

edited

Loading

Here are my thoughts: `trait AccountVerification`

snawaz commented Oct 19, 2023 •

edited

Loading