Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

store and storecontext: shareable when frozen #807

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

HoneyryderChuck
Copy link
Contributor

version of #803 scoped to store and store context classes.

It adds frozen checks in all methods which perform state changes on internal state, which includes X509_STORE_CTX and X509_STORE state.

This makes them useless standalone, due to how OpenSSL:::X509::Store#verify is implemented, where when it's called, the context verification call info gets passed into store ivars. I believe this could be reimplemented in a less intrusive way, where this info could be part of the return data of the method. This would be a breaking change though.

This change still has value considering the main goal of making DEFAULT_CERT_STORE shareable due to its usage in OpenSSL::SSL::SSLContext, and how the internal X509_STORE is passed internally to SSL_CTX via SSL_CTX_set_cert_store, so the ruby object doesn't get exposed.

@rhenium
Copy link
Member

rhenium commented Oct 29, 2024

An X509_STORE holds references to X509{,_CRL} that are passed to X509_STORE_add_{cert,crl}(). They are a reference counted object, so we need something to prevent this:

cert = OpenSSL::X509::Certificate.new
store = OpenSSL::X509::Store.new
store.add_cert(cert) # increments the reference counter on X509 wrapped by `cert`, using X509_up_ref()
store.freeze

cert.serial = 1234 # indirectly modifies `store`

I'm not sure what is the best solution for this.

  1. #add_cert and #add_crl freeze the passed object. This seems unusual.
  2. These methods make a deep copy of the passed object. This is an unnecessary overhead for programs that don't need a shareable Store.
  3. Keep track of all Certificate and CRL passed to the methods in order to ensure Store will only be shareable when all of them are also frozen. This is also an additional overhead.

Regarding OpenSSL::X509::StoreContext / X509_STORE_CTX, those instantiated in ossl_verify_cb_call() (i.e., in verify_callbacks) are not possible to be made shareable safely because ruby/openssl doesn't own the object.

I wonder if we can keep it unshareable, as I can't think of any use case of a frozen OpenSSL::X509::StoreContext.

We should still add these rb_check_frozen() calls to methods that modify the receiver, though.

@HoneyryderChuck
Copy link
Contributor Author

IMO the least intrusive and less overhead way of doing so is what you propose in 3. The store could then keep added certificates in an internal array, and a call to .freeze would also freeze the associated certificates. There's some overhead for the array, but the elements are references. The object could also receive a convenient .certificates attr reader (not sure how useful in real life that would be).

Bear in mind that the "happy path" this PR is targeting is freeze-and-share the default cert store, which uses set_default_paths, so the above would not interfere not cause overhead with it. This is also the most common usage of SSL Contexts, so the overhead wouldn't leak too much (this is my assumption).

For store context, I agree with you, the yielded ones don't need to be shared, as they're only used within a block, and there's very little chance a ractor is initiated there to use it. I think that reasoning matches my patch, right?

@rhenium
Copy link
Member

rhenium commented Oct 31, 2024

a call to .freeze would also freeze the associated certificates

FYI: a #freeze method is not guaranteed to be called when an object is frozen. For example, rb_obj_freeze() in a C extension or Object.instance_method(:freeze).bind(store).call will bypass it. A proper way would be to update the mark function to recursively mark referenced objects. That way, a T_DATA object will be shareable when all referenced objects are also shareable. (If I understand it correctly.)

After rethinking the approach 3, it wouldn't work because someone can add the same certificate repeatedly. We don't want the internal array to grow indefinitely in that case. X509_STORE_add_cert() doesn't return a useful value to see if a certificate is added or not, unfortunately.

100.times { store.add_cert(OpenSSL::X509::Certificate.new(File.binread(path))) }

While looking at the man pages, I stumbled upon a function called X509_STORE_get1_objects(). It seems we can use it, but it will have a performance impact. It returns all certificates stored in the X509_STORE including those loaded by #add_file or lazily by #add_path/#set_default_paths. I'm not sure if it would be acceptable.

For store context, I agree with you, the yielded ones don't need to be shared, as they're only used within a block, and there's very little chance a ractor is initiated there to use it. I think that reasoning matches my patch, right?

It is unsafe to be shared even if we have frozen checks. While it's technically possible to introduce a mutex to make it safe, I don't think it's worth it. My suggestion is to drop RUBY_TYPED_FROZEN_SHAREABLE from StoreContext.

@HoneyryderChuck
Copy link
Contributor Author

HoneyryderChuck commented Oct 31, 2024

A proper way would be to update the mark function to recursively mark referenced objects.

I'm a bit out of my ruby internals depth here. Not sure how to access referenced objects. Do you have an example I can learn from?

After rethinking the approach 3, it wouldn't work because someone can add the same certificate repeatedly.

Wouldn't a simple "return if present in array" suffice? Or is it supposed to allow multiple calls to X509_STORE_add_cert with the same cert? I believe it'd be the less performance impactful change.

Added a POC in the last commit.

My suggestion is to drop RUBY_TYPED_FROZEN_SHAREABLE from StoreContext.

Indeed, I forgot I enabled it. reverted.

@rhenium
Copy link
Member

rhenium commented Nov 6, 2024

Or is it supposed to allow multiple calls to X509_STORE_add_cert with the same cert?

Yes. Duplicates are checked by the contents and the details aren't documented. Even if we can replicate the logic and maintain a list ourselves, it seems inefficient and fragile against future changes in OpenSSL.

@rhenium
Copy link
Member

rhenium commented Nov 6, 2024

A proper way would be to update the mark function to recursively mark referenced objects.

I'm a bit out of my ruby internals depth here. Not sure how to access referenced objects. Do you have an example I can learn from?

If I understand correctly, Ractor.make_shareable will traverse the referenced objects and freeze each object as necessary. For T_DATA, it will check objects stored in instance variables and ones declared in the dmark function.

If we have a complete list of dependencies, we can just store it in an instance variable like 80ee037 did and that's all. However, it doesn't seem possible.

I don't have a good example to show, but we'll need to extend this current dmark function to call rb_gc_mark() on each relevant certificate and CRL.

static void
ossl_x509store_mark(void *ptr)
{
X509_STORE *store = ptr;
// Note: this reference is stored as @verify_callback so we don't need to mark it.
// However we do need to ensure GC compaction won't move it, hence why
// we call rb_gc_mark here.
rb_gc_mark((VALUE)X509_STORE_get_ex_data(store, store_ex_verify_cb_idx));
}

@rhenium
Copy link
Member

rhenium commented Nov 6, 2024

I was thinking of doing something like this with X509_STORE_get1_objects(), but it turned out that X509_CRL doesn't have an ex_data to store app-specific data. X509 does. I'm out of ideas...

in x509_cert.c, when initializing an object
  X509_get_ex_data(x509, ossl_x509_ex_obj_idx, (void *)self);
  // Another question is what to do if `x509` already has a linked Ruby object
  // Is that possible in the first place, or not?

in the dfree function
  X509_set_ex_data(x509, ossl_x509_ex_obj_idx, NULL);

in Init_x509cert()
  ossl_x509_ex_obj_idx = X509_get_ex_new_index(0, (void *)"ossl_x509_ex_obj_idx", 0, 0, 0);

diff --git a/ext/openssl/ossl_x509store.c b/ext/openssl/ossl_x509store.c
index 670519febc4a..43a51ad4f114 100644
--- a/ext/openssl/ossl_x509store.c
+++ b/ext/openssl/ossl_x509store.c
@@ -120,6 +132,32 @@ ossl_x509store_mark(void *ptr)
     // However we do need to ensure GC compaction won't move it, hence why
     // we call rb_gc_mark here.
     rb_gc_mark((VALUE)X509_STORE_get_ex_data(store, store_ex_verify_cb_idx));
+
+    STACK_OF(X509_OBJECT) *objects = X509_STORE_get1_objects(store);
+    for (int i = 0; i < sk_X509_OBJECT_num(objects); i++) {
+        X509_OBJECT *xobj = sk_X509_OBJECT_value(objects, i);
+        switch (X509_OBJECT_get_type(xobj)) {
+          case X509_LU_X509: {
+              X509 *x509 = X509_OBJECT_get0_X509(xobj);
+              VALUE obj = (VALUE)X509_get_ex_data(x509, ossl_x509_ex_obj_idx);
+              if (obj != 0)
+                  rb_gc_mark(obj);
+              break;
+          }
+          case X509_LU_CRL: {
+              X509_CRL *crl = X509_OBJECT_get0_X509_CRL(xobj);
+              // Wait... X509_CRL doesn't even have an ex_data. Any way around?
+              VALUE obj = (VALUE)X509_CRL_get_ex_data(crl, ossl_x509crl_ex_obj_idx);
+              if (obj != 0)
+                  rb_gc_mark(obj);
+              break;
+          }
+          default:
+            // TODO: future OpenSSL versions may add more types
+            break;
+        }
+    }
+    sk_X509_OBJECT_pop_free(objects, X509_OBJECT_free);
 }
 
 static void

@HoneyryderChuck HoneyryderChuck force-pushed the issue-521-1 branch 2 times, most recently from e4b656e to 440618c Compare November 7, 2024 12:58
@HoneyryderChuck
Copy link
Contributor Author

If we have a complete list of dependencies, we can just store it in an instance variable like 80ee037 did and that's all. However, it doesn't seem possible.

Why not? I've just extended the commit to include CRLs and the .freeze implementation which ensures references are also frozen. I'd expect this to work, but perhaps you know edge cases where it doesn't?

I was thinking of doing something like this with X509_STORE_get1_objects(), but it turned out that X509_CRL doesn't have an ex_data to store app-specific data. X509 does. I'm out of ideas...

I see what you mean now, that'd be awesome if it worked.

FWIW I was trying to avoid explicitly using Ractor.make_shareable, and instead ensure that frozen static variables are shareable. For no particular reason, other than my impression that the "shareable" concept was a "temporary workaround" for the lack of widespread guarantees in the ecosystem that frozen objects are "deep frozen", and it'd be better to achieve the latter unless there'd be something strong against it. Is that assessment wrong? Also, I believe that a call to .freeze would not run the mark callback, so my approach would not be compatible with that patch, and I'd have to resort to using .make_shareable.

While my suggested approach imposes some overhead, I'd say it's not much (if it works, of course). FWIW we're talking about two extra objects (array), and some operational overhead on calling add_cert and add_crl (include? + << on each call), but I'm expecting that, in most cases where X509 stores are used, they're initialized once and used for verification multiple times, so that overhead is somehow amortized. And for the (mostly) happy path of using the default cert store, that overhead is even more minimal.

@rhenium
Copy link
Member

rhenium commented Nov 8, 2024

If we have a complete list of dependencies, we can just store it in an instance variable like 80ee037 did and that's all. However, it doesn't seem possible.

Why not? I've just extended the commit to include CRLs and the .freeze implementation which ensures references are also frozen. I'd expect this to work, but perhaps you know edge cases where it doesn't?

The issue is with keeping the list of references up to date.

As I mentioned in an earlier comment, it's not documented how X509_STORE_add_{cert,crl}() exactly behaves when it's passed a certificate/CRL that is already in the X509_STORE. The man page doesn't specify what it considers to be the same. It seems to me that they are trying to keep it an implementation detail that users should not be depending on.

I was thinking of doing something like this with X509_STORE_get1_objects(), but it turned out that X509_CRL doesn't have an ex_data to store app-specific data. X509 does. I'm out of ideas...

I see what you mean now, that'd be awesome if it worked.

FWIW I was trying to avoid explicitly using Ractor.make_shareable, and instead ensure that frozen static variables are shareable. For no particular reason, other than my impression that the "shareable" concept was a "temporary workaround" for the lack of widespread guarantees in the ecosystem that frozen objects are "deep frozen", and it'd be better to achieve the latter unless there'd be something strong against it. Is that assessment wrong? Also, I believe that a call to .freeze would not run the mark callback, so my approach would not be compatible with that patch, and I'd have to resort to using .make_shareable.

Isn't a #freeze method generally expected to do a shallow freeze, like how Array#freeze works?

AFAIK, before the FL_SHAREABLE flag is finally set to an object, the dependencies will be checked in the same way, regardless of whether the user proactively froze the object or is being frozen by Ractor.make_shareable.

@HoneyryderChuck
Copy link
Contributor Author

HoneyryderChuck commented Nov 8, 2024

As I mentioned in an earlier comment, it's not documented how X509_STORE_add_{cert,crl}() exactly behaves when it's passed a certificate/CRL that is already in the X509_STORE.

ok, I see what you mean. I made a small tweak to accomodate that requirement: always call X509_STORE_add_*, and add it to the corresponding ivar when not present already. It'd mean that the ivar overhead would be solely for bookkeeping the reference for the eventual freeze/share event, unless there's any value in exposing these as attributes (none that I can think of right now). The same pros/cons from above would apply.

This overhead would probably be mitigated if your workaround using X509_STORE_get1_objects would work, but as you already pointed out, there's no ex_data in the X509_CRL related struct. I guess this could be proposed upstream, but it'd probably take months (if accepted) to have it available. Do you see this as a reasonable path forward? I guess the proposed workaround would be valid only for older versions of openssl.

Isn't a #freeze method generally expected to do a shallow freeze, like how Array#freeze works?

I think that's something that only the ruby core team can answer, but the way I perceived it, operations like dup, clone or freeze on core collections have been historically shallow, and there was pushback (for some reason, perhaps just backwards compatibility(?), and despite some requests) for implementing "deep" versions of them. There has been no mandate on freezing having to be shallow or not outside of core structures though, and libraries which require "deep freezing" for some reason resort to override "#freeze" in a similar way to what you see in the proposal. I'd say the openssl gem could do the same here, within the realm of possibility.

Moreover, the main goal of the MR is to make openssl usable across ractors; it'll never move away from "experimental" if at least the whole of ruby's standard lib isn't usable outside of the main ractor. And there's certainly more than one way to achieve it. My preferred approach would be to have less ractor-specific conditional code to accommodate that, which is why I'm avoiding making Ractor.make_shareable(DEFAULT_CERT_STORE) if defined?(Ractor) work, and would instead prefer making DEFAULT_CERT_STORE.freeze work in that manner. But I admit that's a personal preference, and I'm certainly open to alternative ones.

@rhenium
Copy link
Member

rhenium commented Nov 11, 2024

As I mentioned in an earlier comment, it's not documented how X509_STORE_add_{cert,crl}() exactly behaves when it's passed a certificate/CRL that is already in the X509_STORE.

ok, I see what you mean. I made a small tweak to accomodate that requirement: always call X509_STORE_add_*, and add it to the corresponding ivar when not present already. It'd mean that the ivar overhead would be solely for bookkeeping the reference for the eventual freeze/share event, unless there's any value in exposing these as attributes (none that I can think of right now). The same pros/cons from above would apply.

It makes no difference if you check before or after the X509_STORE_add_{cert,crl}() call because it will succeed whether the certificate/CRL is already in it or not.

Actually, this is a new behavior that was changed in OpenSSL 1.1.0i (a patch release!). The old behavior was undocumented and the change was therefore allowed even without a changelog entry: openssl/openssl@f96d3c1. This is a good example why we shouldn't depend on OpenSSL's undocumented behaviors.

There is no guarantee that X509_STORE_add_{cert,crl}()'s internal de-duplication logic won't change from the current implementation.

This overhead would probably be mitigated if your workaround using X509_STORE_get1_objects would work, but as you already pointed out, there's no ex_data in the X509_CRL related struct. I guess this could be proposed upstream, but it'd probably take months (if accepted) to have it available. Do you see this as a reasonable path forward? I guess the proposed workaround would be valid only for older versions of openssl.

Note that the approach I suggested with X509_STORE_get1_objects() is at best an ugly workaround that introduces malloc inside the mark function. It doesn't even work at the current state, let's leave it behind.

Honestly, I don't know what we can do.

Would it be acceptable that OpenSSL::X509::Store is shareable only when it's frozen and all #add_{cert,crl} calls passed an already-shareable certificates or CRLs? This is the easiest way while at least making the DEFAULT_CERT_STORE shareable.

Isn't a #freeze method generally expected to do a shallow freeze, like how Array#freeze works?

I think that's something that only the ruby core team can answer, but the way I perceived it, operations like dup, clone or freeze on core collections have been historically shallow, and there was pushback (for some reason, perhaps just backwards compatibility(?), and despite some requests) for implementing "deep" versions of them. There has been no mandate on freezing having to be shallow or not outside of core structures though, and libraries which require "deep freezing" for some reason resort to override "#freeze" in a similar way to what you see in the proposal. I'd say the openssl gem could do the same here, within the realm of possibility.

I looked at these issues and while I didn't really get the reason for rejection, changing #freeze's behavior wasn't discussed.

A #freeze method automatically freezing user-supplied objects would be surprising to me.

Moreover, the main goal of the MR is to make openssl usable across ractors; it'll never move away from "experimental" if at least the whole of ruby's standard lib isn't usable outside of the main ractor. And there's certainly more than one way to achieve it. My preferred approach would be to have less ractor-specific conditional code to accommodate that, which is why I'm avoiding making Ractor.make_shareable(DEFAULT_CERT_STORE) if defined?(Ractor) work, and would instead prefer making DEFAULT_CERT_STORE.freeze work in that manner. But that's a personal preference, and if you disagree with the approach, I'm open to an alternative one.

FWIW, DEFAULT_CERT_STORE in lib/openssl/ssl.rb doesn't call #add_cert nor #add_crl, so it will not need a deep freeze to be shareable.

@HoneyryderChuck
Copy link
Contributor Author

Would it be acceptable that OpenSSL::X509::Store is shareable only when it's frozen and all #add_{cert,crl} calls passed an already-shareable certificates or CRLs?

Isn't that what the last iteration of the last commit does? 🤔 As in, in the case of certs, unless frozen, it always calls X509_STORE_add_cert and adds it to the @certificates ivar list, which is only used to propagate frozen status for previously certs and crls? Or perhaps it's the last part you're against, and freezing a store with non-frozen certificates/crls should instead fail? Or perhaps you mean "when a store is frozen, only add frozen certs/crls", and if so, what to do when a store added unfrozen certs before freezing?

FWIW, DEFAULT_CERT_STORE in lib/openssl/ssl.rb doesn't call #add_cert nor #add_crl, so it will not need a deep freeze to be shareable.

Indeed, that's less important for that use-case. but one has to deal with class-level expectations nontheless.

@rhenium
Copy link
Member

rhenium commented Nov 13, 2024

This is the problematic part:

    if(!RTEST(rb_funcall(certificates, rb_intern("include?"), 1, arg)))
        rb_ary_push(certificates, arg);

This makes the assumption that the certificate arg is added to X509_STORE if and only if certificate.none? { X509_cmp(_1, arg) == 0 }. OpenSSL doesn't guarantee it. And if certificates and X509_STORE's internal list are no longer in sync, that means we have either thread safety issue or memory leak.

Or perhaps it's the last part you're against, and freezing a store with non-frozen certificates/crls should instead fail?

I wanted to suggest a safe option where freezing a store with an unshareable certificates/CRL should not make it shareable.

There should be a smarter way, but one way I can think of is to inject a forever unshareable object into it when we can't confidently say the Store will be safely shareable.

This is completely untested:

--- a/ext/openssl/ossl_x509store.c
+++ b/ext/openssl/ossl_x509store.c
@@ -429,6 +429,22 @@ ossl_x509store_set_default_paths(VALUE self)
     return Qnil;
 }
 
+static void
+ossl_make_ractor_unshareable(VALUE obj)
+{
+    static const rb_data_type_t ossl_ractor_unshareable_type = {
+        "OpenSSL/RACTOR_UNSHAREABLE_DUMMY_OBJECT",
+        {
+            0,
+        },
+        0, 0, RUBY_TYPED_FREE_IMMEDIATELY | RUBY_TYPED_WB_PROTECTED,
+    };
+
+    if (!RTEST(rb_attr_get(obj, rb_intern("ractor_unshareable_poison"))))
+        rb_ivar_set(obj, rb_intern("ractor_unshareable_poison"),
+                    TypedData_Wrap_Struct(0, &ossl_ractor_unshareable_type, NULL));
+}
+
 /*
  * call-seq:
  *   store.add_cert(cert) -> self
@@ -448,6 +464,9 @@ ossl_x509store_add_cert(VALUE self, VALUE arg)
     if (X509_STORE_add_cert(store, cert) != 1)
         ossl_raise(eX509StoreError, "X509_STORE_add_cert");
 
+    if (!rb_ractor_shareable_p(arg))
+        ossl_make_ractor_unshareable(self);
+
     return self;
 }
 

@HoneyryderChuck
Copy link
Contributor Author

HoneyryderChuck commented Nov 13, 2024

This is the problematic part:

Ahhh, I finally see what you mean: include? depends of the implementation of == in the certificate class, which relies on the undefined/undocumented behaviour. I agree, it's a bit fragile to depend on this method.

I wanted to suggest a safe option where freezing a store with an unshareable certificates/CRL should not make it shareable.

If I understood the suggestion there, you'd set a hidden ivar with an unshareable object which would then make it impossible to share further.

I could also think of this alternative: why don't we just unset the shareable when frozen flag when adding cert/crl? Smth like:

# on add cert to unfrozen store
RTYPEDDATA(self)-> flags ^= RUBY_TYPED_FROZEN_SHAREABLE;

Not sure if there's performance impact here. It's also dependent on the internal shape of the rbdata struct. Will it fail under compaction? Let me know what you think.

@rhenium
Copy link
Member

rhenium commented Nov 13, 2024

I could also think of this alternative: why don't we just unset the shareable when frozen flag when adding cert/crl? Smth like:

# on add cert to unfrozen store
RTYPEDDATA(self)-> flags ^= RUBY_TYPED_FROZEN_SHAREABLE;

RUBY_TYPED_* is a flag on the rb_data_type_t constant. There doesn't seem to be a per-object flag.

Not sure if there's performance impact here. It's also dependent on the internal shape of the rbdata struct. Will it fail under compaction? Let me know what you think.

I think the cost for setting an instance variable would be negligible. Instances variables should be handled during compaction automatically.

But as a hack, Ractor.make_shareable(store) would not raise a pretty error message.

I didn't actually test if this works. In any case, we must take care of Certificate, CRL, and PKey (which is a dependency of the two classes) before we can make Store shareable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants