
WIP: Add Redis-backed cache as an alternative to Infinispan #3534

Closed

Conversation


@StFS StFS commented Aug 24, 2023

This PR is very much WIP still.

So far I have refactored the client-device-connection-infinispan module into 3 cache modules (common, infinispan and redis) and a client-device-connection module.

The idea is that client-device-connection can be backed by different types of caches: one being the Infinispan data grid that Hono has used so far, another possibly being Redis.

Currently I've only refactored the module structure and copied the existing code to new places. I've added an extremely crude proof-of-concept implementation for a Redis cache and hard-coded that to be the type of cache that is created (it works, but as I said, it's not configurable in any way and the implementation is absolute garbage).

… modules (common, infinispan and redis) and a client-device-connection module

StFS commented Aug 24, 2023

Addresses #3532


@sophokles73 sophokles73 left a comment


At first glance, I believe it would be sufficient to introduce the modules client-device-connection and client-device-connection-redis, with client-device-connection containing the interfaces and generic classes.

Any particular reason why you want to introduce the cache-* modules?


StFS commented Aug 24, 2023

> Any particular reason why you want to introduce the cache-* modules?

Mainly because I saw the Cache interface and thought it looked so nice and clean that it would be better to abstract it out. The client-device-connection is higher-level logic that can be kept separate from, and agnostic to, the caching backend it uses. Any other higher-level logic that needs a "caching backend" could then use the same abstraction, since it's quite simple (basically just get/set/ping).

I don't really have a concrete example of some other higher level logic that might need a caching backend in Hono currently. Possibly keeping track of running adapter instances for cleanup purposes since that's something we've been running into issues with recently (as reported by my colleague).

Essentially the idea is to minimize the "domain" that you want to create multiple implementations of. It seems to me that if I had a client-device-connection-infinispan and a client-device-connection-redis next to the common client-device-connection module, all that would be left in the first two would be the different implementations of the backing cache... and then it wouldn't really make sense to tie their names to the client-device-connection logic.
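To make that concrete, a minimal sketch of the kind of abstraction I have in mind could look like this (interface and method names are made up for illustration, using plain CompletionStage; this is not Hono's actual Cache interface):

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical backend-agnostic cache abstraction: basically get/set plus a
// liveness check ("ping"). Any backend (Infinispan, Redis, ...) could sit
// behind it.
interface SimpleCache<K, V> {
    CompletionStage<Void> put(K key, V value);
    CompletionStage<V> get(K key);
    CompletionStage<Void> checkAvailability();
}

// Trivial in-memory implementation, just to show the shape of a backend.
class InMemoryCache<K, V> implements SimpleCache<K, V> {
    private final Map<K, V> store = new ConcurrentHashMap<>();

    public CompletionStage<Void> put(K key, V value) {
        store.put(key, value);
        return CompletableFuture.completedFuture(null);
    }

    public CompletionStage<V> get(K key) {
        return CompletableFuture.completedFuture(store.get(key));
    }

    public CompletionStage<Void> checkAvailability() {
        return CompletableFuture.completedFuture(null); // in-memory is always "up"
    }

    public static void main(String[] args) {
        SimpleCache<String, String> cache = new InMemoryCache<>();
        cache.put("device-1", "adapter-A").toCompletableFuture().join();
        System.out.println(cache.get("device-1").toCompletableFuture().join()); // adapter-A
    }
}
```

A Redis- or Infinispan-backed implementation would then plug in behind the same interface.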

Am I making any sense?

@sophokles73

> Am I making any sense?

In general, you do :-) And I agree that the Cache interface looks quite generic. However, we actually devised it specifically for this purpose, and while it might be usable for other use cases as well, it was never meant to be.

The other place(s) where we use caches include the protocol adapters, where we try to keep information from the device registry local by putting it into a Caffeine cache. However, I do not really see how the Cache interface could be put to good use there, because the context is completely different. Another (potential) use case would be the CoAP protocol adapter, where we could implement a DTLS session cache based on Redis (or Infinispan). However, in that case I would rather try to contribute such a component to the upstream project (Californium) instead of keeping the code in Hono. So, again, Hono's Cache interface would probably not come into play.

All in all, I guess we should simply start by putting the common classes into client-device-connection instead of introducing three additional modules for the cache classes. WDYT?


StFS commented Aug 25, 2023

Alright, sounds good to me. I'll change the layout and go from there.


StFS commented Sep 2, 2023

Okay, so I've modified the way I split up the modules as per your suggestions, @sophokles73.

I've also changed the Redis client library to vertx-redis-client and made a first stab at using it.

So my first two questions are: 1) does the module structure now look more like what you had imagined, and 2) from the looks of it, would you recommend that I stick with the vertx-redis-client library?

There are some issues that I'm not sure how to solve with that client library: it assumes that keys and values are Strings, so it doesn't quite fit the generic nature of the Cache<K,V> interface without some elaborate object "SerDe" mechanism where arbitrary class types would be serialized/deserialized to/from String. Currently I'm just bypassing the generics in Cache and casting everything to String, which feels icky. Maybe there is a better way of doing this without going to a full-blown SerDe implementation; I haven't really given it too much thought, and casting to String is fine for now.
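For illustration, such a String-bridging "SerDe" layer could be as small as a pair of functions per type (a hypothetical sketch, not actual Hono code):

```java
import java.util.function.Function;

// Hypothetical codec bridging a generic Cache<K, V> onto a client that only
// speaks String: one encode/decode pair per key or value type.
public final class StringCodec<T> {
    private final Function<T, String> encoder;
    private final Function<String, T> decoder;

    public StringCodec(Function<T, String> encoder, Function<String, T> decoder) {
        this.encoder = encoder;
        this.decoder = decoder;
    }

    public String encode(T value) {
        return encoder.apply(value);
    }

    public T decode(String raw) {
        // a missing key comes back as null and stays null
        return raw == null ? null : decoder.apply(raw);
    }

    public static void main(String[] args) {
        // e.g. an Integer-valued cache entry round-tripped through String
        StringCodec<Integer> ints = new StringCodec<>(String::valueOf, Integer::valueOf);
        String wire = ints.encode(42);
        System.out.println(ints.decode(wire)); // 42
    }
}
```

More complex value types would need a real serialization format (e.g. JSON), but the shape of the bridge stays the same.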

I noticed that there are some Quarkus client libraries being used in Hono and there does exist a Quarkus Redis client (see also here). I haven't looked much into it but it does seem to have some generics support. So I'm wondering whether I should look further into that or do you think I should stick with the vertx-redis-client one?

I've also noticed some interesting things regarding the semantics of the Cache. I haven't done a full deep dive yet, so what follows should be taken with a grain of salt, but...

  1. The remove method takes a key and a value. I'm not sure what the purpose of value is when calling remove. Should the key only be removed if the supplied value matches what is stored, or should value just be ignored here?

  2. The getAll method is called with multiple keys, but one (or possibly more; I've only noticed one in the test runs) of the requested keys does not seem to exist (at least I see that its value is null, both in my RedisCache implementation and in the EmbeddedCache and HotrodCache implementations). It looks like when this happens in the Hotrod/Embedded caches, the null value is simply ignored and that key/value pair is not included in the results that are returned. Am I okay to continue with that assumption?

Other than that, I'm making some progress. Values are being written to and read from Redis, but some of the integration tests still fail. I haven't investigated that much, so I'm not going to ask for help on it just yet. But I did want to check in to get your thoughts on the notes above.


@sophokles73 sophokles73 left a comment


FMPOV the project layout looks good.

@sophokles73

> I noticed that there are some Quarkus client libraries being used in Hono and there does exist a Quarkus Redis client (see also here). I haven't looked much into it but it does seem to have some generics support. So I'm wondering whether I should look further into that or do you think I should stick with the vertx-redis-client one?

IMHO you should try to use the Quarkus redis-client extension instead. Even if you then want to use the vertx-redis-client API for interacting with Redis, the extension will make sure that everything fits into the Quarkus build, in particular when it comes to the native images.
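Pulling in the extension should amount to roughly this dependency (coordinates as I recall them; the version is managed by the Quarkus BOM, so please verify against the current Quarkus docs):

```xml
<dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-redis-client</artifactId>
</dependency>
```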


sophokles73 commented Sep 4, 2023

> The remove method takes a key and a value. I'm not sure what the purpose of value is when calling remove? Should the key only be removed if the supplied value matches what is stored? Or should value just be ignored here?

No, you got the purpose right: the key should only be removed if the current value matches the given value. In our case it is actually more like: only remove the key if its value is still the (old) value which I believe it has.
This is to make sure that we do not accidentally remove a key that has been changed by another pod and thus contains more recent data. So, in fact, what we want to do is: remove the key if there is no more recent value available. I am saying this because it is not so much about the value itself but about whether it is still up to date, and it could therefore be implemented using other kinds of metadata in the cache as well ...
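This is essentially the same contract as java.util.concurrent's ConcurrentMap.remove(key, value); a toy illustration of the semantics (not Hono code, names made up):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Demonstrates conditional removal: the entry is only removed if it still
// holds the expected (old) value.
public class CompareAndRemoveDemo {

    public static void main(String[] args) {
        ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();
        cache.put("device-1", "adapter-A");

        // Meanwhile another pod re-maps the device to a newer adapter instance.
        cache.put("device-1", "adapter-B");

        // A stale disconnect handler tries to remove the old mapping;
        // the conditional remove refuses because the value has changed.
        boolean removed = cache.remove("device-1", "adapter-A");
        System.out.println(removed);               // false
        System.out.println(cache.get("device-1")); // adapter-B

        // Removing with the current value succeeds.
        removed = cache.remove("device-1", "adapter-B");
        System.out.println(removed);               // true
    }
}
```

In Redis, such a check-then-delete would need to be atomic as well, which is typically done with a small Lua script (GET the key, compare, DEL only on match) rather than separate commands.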

@calohmn I hope I got this right ...

@sophokles73

> The getAll method is called with multiple keys but one (or possibly more, I've only noticed one in the test runs) of the keys that are being requested does not seem to exist (or at least I see that the value for it is null and that applies both in my RedisCache implementation but also in the EmbeddedCache and HotrodCache implementations). It looks like when this happens in the Hotrod/Embedded caches, the null value is simply ignored and that key/value pair is not included in the results that are returned. Am I okay to continue with that assumption?

Yes, IMHO that is the right thing to do. @calohmn WDYT?


calohmn commented Sep 10, 2023

> The remove method takes a key and a value. [...]

> No, you got the purpose correctly. The key should only be removed if the current value matches the given value.

Yes, the cache is used to keep a mapping from a device ID (for which a command subscription exists) to the ID of the protocol adapter instance that the device is connected to (and that shall then receive commands for that device).
When a device disconnect is detected, removal of the cache entry is triggered. But in the meantime, the device could have connected to (and subscribed to commands at) another adapter instance, which will already have updated the cache value. So the cache removal mustn't remove the wrong/newer mapping value.

> The getAll method is called with multiple keys [...] It looks like when this happens in the Hotrod/Embedded caches, the null value is simply ignored and that key/value pair is not included in the results that are returned. [..]

That is actually interesting. Our implementation invokes AsyncCache.getAllAsync() here. There is no documentation for that, just a hint that it is probably meant to behave like AdvancedCache.getAll(), for which this behaviour is documented:

> If an entry cannot be loaded for a given key, the returned Map will contain null for value of the key.

So, the getAllAsync() implementation, which omits such entries with a null value, seems inconsistent here.

If it's not causing difficulties in the implementation, I would probably go with the documented AdvancedCache.getAll approach - letting all requested keys also be in the resulting Map - possibly with null values.

The places where we currently invoke Cache.getAll() could be adapted to skip entries with null values.

EDIT: But the approach of omitting null values would also be fine, I think, if that's how we document our getAll method behaviour. Then we also wouldn't need to change the existing Infinispan-related code.
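To make the two conventions concrete, here is a plain-Map sketch (illustrative only, not the Infinispan code; a Redis MGET likewise returns a positional reply with nil for every key that does not exist):

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Contrasts the two getAll result conventions discussed above.
public class GetAllDemo {

    static final Map<String, String> STORE = Map.of("a", "1", "c", "3");

    // Variant 1 (documented AdvancedCache.getAll behaviour):
    // every requested key appears in the result, missing ones map to null.
    static Map<String, String> getAllWithNulls(List<String> keys) {
        Map<String, String> result = new LinkedHashMap<>();
        keys.forEach(k -> result.put(k, STORE.get(k)));
        return result;
    }

    // Variant 2 (observed getAllAsync behaviour): missing keys are omitted.
    static Map<String, String> getAllOmitMissing(List<String> keys) {
        Map<String, String> result = new LinkedHashMap<>();
        keys.forEach(k -> {
            String v = STORE.get(k);
            if (v != null) {
                result.put(k, v);
            }
        });
        return result;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a", "b", "c");
        System.out.println(getAllWithNulls(keys));   // {a=1, b=null, c=3}
        System.out.println(getAllOmitMissing(keys)); // {a=1, c=3}
    }
}
```

Either convention is easy to provide from an MGET-style reply; variant 2 is just variant 1 with the null entries filtered out.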


@sophokles73 sophokles73 left a comment


I have taken a look, based on my very limited knowledge of Redis. My gut feeling is that the vert.x-client-based variant is easier to read (for me), as we do not use Mutiny anywhere in the code.
On the other hand, it seems the Quarkus project puts more focus on the Mutiny-based data source approach. I cannot tell which one will work better in practice. One thing to try in the vert.x-client-based implementation would be to get a RedisAPI instance injected by Quarkus, so we do not need to manually create and maintain the connection...

@sophokles73

I would suggest to rebase your branch on HEAD instead of merging master into your branch ...


StFS commented Nov 28, 2023

> I would suggest to rebase your branch on HEAD instead of merging master into your branch ...

Okay, I must admit that I'm not used to using rebase.

Are you suggesting that I do that now to clean up the history here, or do you mean that I should try doing that from now on?

If I do a git rebase master now on my branch, will that actually have any effect? Won't my merge commits still be there now that I have already made them? Or am I (probably) misunderstanding how that would work?


StFS commented Nov 28, 2023

I can squash the commits of the feature branch and force-push it to get rid of the merge commits. That would of course also squash the commits that I've made, but personally I'm fine with that, as most of these commits are experiments and playing around. I think that once I've properly decided on how to approach this, I may create a new branch / PR and just close this one to clean things up.

<artifactId>hono-client-common</artifactId>
</dependency>

<dependency> <!-- TODO: changed from vertx-web because there only seems to be dependency on -core. Maybe there are runtime dependencies though??? -->
@StFS

@sophokles73 do you know? Is it safe to change this or do you think there might be some runtime dependencies that we need from -web?

@sophokles73

> I think that once I get to a point where I've properly decided on how to approach this I may create a new branch / PR and just close this one to clean things up.

That sounds like a great idea 👍

StFS added 3 commits December 7, 2023 10:44
… the quarkus-redis-client extension and not manually managing connections. Minor cleanup based on PR comments. Experiment with configuration.

StFS commented Mar 2, 2024

Finally taking a look again.

I've changed to using the Vert.x RedisAPI instance that the quarkus-redis-client extension provides (the code has been checked in but is still very incomplete). Integration tests are running and passing now.

However, there is one consideration that I'd like to get input on.

The thing is that the Quarkus Redis client extension receives its configuration via properties rooted at quarkus.redis.*. There doesn't seem to be any way for me to "map" this over to some other config root such as hono.commandRouter.cache.redis.*, or at least I haven't found one yet. Do you know if this is doable? And if it's not, that is, if the config has to live under quarkus.redis.*, would that be a deal breaker for using the quarkus-redis-client extension?

StFS added 8 commits March 2, 2024 02:38
…edis config values to hono.commandRouter.cache.redis.
…edis config values to hono.commandRouter.cache.redis.
… copying the behavior of it into Hono. Integration Tests are now working for both embedded and redis cache configurations. Still some work to be done to enable metrics and health checks.

StFS commented Mar 25, 2024

Closing this PR until it's ready for a proper review.

@StFS StFS closed this Mar 25, 2024