How to scale Vuu server-side to handle very large tables (10m to 10bn) - Discussion #821

chrisjstevo · 2023-07-17T11:24:53Z

chrisjstevo
Jul 17, 2023
Maintainer

The value proposition of Vuu is that:

It allows a developer to get data and behaviour out to a user with minimal coding
It provides a consistent data API to the client, which allows viewporting and streaming of the data
It allows joining across multiple sources of data (in join tables)
It allows "modules" of functionality to be reused with one line of code
It allows mutable data to be stored and referenced on the server side, joining editable data (say, basket order quantities being entered) with volatile ticking data (like market data)

One of the restrictions with Vuu at the moment is handling very large data sets (10m to 10bn rows). The cause is that Vuu currently uses a table model that caches all data in memory. This performs well for up to 1m rows but degrades as we go above that.

One proposal for resolving this would be to break Vuu into its internal interfaces (such as DataTable, Viewport, ViewportKeys, Filter, Sort, etc.) and then allow the Vuu team or other people to provide implementations of those interfaces in their favorite technology.
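
As a rough sketch of what that split might look like: the trait below is the only thing a viewport depends on, and the in-memory class is one interchangeable implementation of it. All names here (Row, TableStore, InMemoryTableStore, WindowedViewport) are hypothetical and chosen for illustration; they are not Vuu's actual interfaces.

```scala
import scala.collection.mutable

// Hypothetical names throughout: these are NOT Vuu's real interfaces.
// A row is a key plus column name -> value, which is enough for this sketch.
final case class Row(key: String, data: Map[String, Any])

// The storage-facing contract that a viewport would depend on.
trait TableStore {
  def upsert(row: Row): Unit
  def get(key: String): Option[Row]
  def size: Long
  // Keys in the window [from, until), in the store's own ordering.
  def keysInRange(from: Int, until: Int): Vector[String]
}

// One implementation: everything cached in memory, ordered by first insertion.
final class InMemoryTableStore extends TableStore {
  private val rows = mutable.LinkedHashMap.empty[String, Row]
  override def upsert(row: Row): Unit = rows.update(row.key, row)
  override def get(key: String): Option[Row] = rows.get(key)
  override def size: Long = rows.size.toLong
  override def keysInRange(from: Int, until: Int): Vector[String] =
    rows.keysIterator.slice(from, until).toVector
}

// The viewport only sees the trait, so a database-backed store could be swapped in.
final class WindowedViewport(store: TableStore, from: Int, until: Int) {
  def visibleRows: Vector[Row] =
    store.keysInRange(from, until).flatMap(k => store.get(k))
}
```

A store backed by a database would implement the same trait and push keysInRange down as a query, so the viewport code would not need to change.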

Then we could allow users to use these features in a declarative manner:

// ...more Vuu config
    VuuThreadingOptions()
      .withViewPortThreads(4)
      .withTreeThreads(4)
  ).withModule(SimulationModule())
    .withModule(MetricsModule())
    .withModule(PermissionModule())
    .withFeature(HyperSqlFeature()) // something like this...

And they could declare table types in this manner:

        HyperSqlTableDef( // custom table type
          name = "instruments",
          keyField = "ric",
          columns = Columns.fromNames("ric".string(), "description".string(), "bbg".string(), "isin".string(),
                    "currency".string(), "exchange".string(), "lotSize".int()),
          VisualLinks(),
          joinFields = "ric"
        ),
        // unclear what we specify as provider, RPC, etc.
        )

What would be the benefit of this approach?

This way we can choose the table storage and query mechanism based on the size and latency requirements.
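
To make that concrete, here is a minimal sketch of a size-based choice. The object and type names are hypothetical, and the default threshold simply reuses the 1m-row figure mentioned earlier in this post; a real rule would presumably also weigh latency requirements.

```scala
// Hypothetical sketch: picking a backing store from the expected table size.
object BackingChoice {
  sealed trait Backing
  case object InMemory extends Backing
  case object ExternalStore extends Backing

  // inMemoryRowLimit defaults to the 1m-row figure quoted in this post;
  // it is an assumption to be tuned per deployment, not a measured limit.
  def choose(expectedRows: Long, inMemoryRowLimit: Long = 1000000L): Backing =
    if (expectedRows <= inMemoryRowLimit) InMemory else ExternalStore
}
```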

What are the implications of this approach?

The first implication of this approach is that we'd have to refactor the code base into abstract features and feature implementations. Most of the features are already wrapped in interfaces, so this shouldn't be a huge ask.

A more substantial impact to the code would be on components such as the join manager. The join manager's purpose is to manage the relationships between keys across data tables (for example, orders 1, 2 and 3 all have RIC = VOD.L and use that as a foreign key to prices).

In this new design the join table would have to allow the loading of foreign key relations from an underlying data store. This would add some complexity.
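
One way this could work, sketched under assumptions: rather than holding every relation in memory, the join layer asks the underlying store for the keys on demand and caches the answer. ForeignKeyIndex and loadKeys are hypothetical names, not part of Vuu's join manager; loadKeys stands in for whatever query the backing store supports.

```scala
import scala.collection.mutable

// Hypothetical sketch, not Vuu's join manager.
// loadKeys stands in for a lookup against the underlying store,
// for example "which order ids have this RIC?".
final class ForeignKeyIndex(loadKeys: String => Set[String]) {
  private val cache = mutable.Map.empty[String, Set[String]]
  var loads: Int = 0 // counts store round trips, for illustration only

  // Left-table keys (orders) that reference the given right-table key (a RIC).
  def keysFor(foreignKey: String): Set[String] =
    cache.getOrElseUpdate(foreignKey, { loads += 1; loadKeys(foreignKey) })

  // Drop a cached relation so that the next read goes back to the store.
  def invalidate(foreignKey: String): Unit = {
    cache.remove(foreignKey)
    ()
  }
}
```

The added complexity shows up in invalidate: with ticking data, something has to decide when a cached relation is stale, which the current all-in-memory model never has to worry about.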

What are the proposed next steps?

The proposed next step would be to abstract the interface definitions for the objects (or features) out, and then ship the existing in-memory implementation as the default.
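
A minimal sketch of "default unless overridden", loosely mirroring the .withFeature(...) idea above. TableFeature, DefaultInMemoryTables, HyperSqlTables and FeatureConfig are all hypothetical names, and the JDBC URL in the usage line is only an illustrative value.

```scala
// Hypothetical sketch of feature registration with a built-in default.
trait TableFeature { def name: String }

// The default: today's in-memory table model.
case object DefaultInMemoryTables extends TableFeature { val name = "in-memory" }

// An alternative implementation that a separate module could provide.
final case class HyperSqlTables(jdbcUrl: String) extends TableFeature { val name = "hypersql" }

final case class FeatureConfig(features: Vector[TableFeature] = Vector.empty) {
  // Immutable builder: returns a new config with the feature appended.
  def withFeature(f: TableFeature): FeatureConfig = copy(features = features :+ f)

  // The last registered feature wins; with none registered we fall back to the default.
  def tableFeature: TableFeature = features.lastOption.getOrElse(DefaultInMemoryTables)
}
```

For example, FeatureConfig().tableFeature yields the in-memory default, while FeatureConfig().withFeature(HyperSqlTables("jdbc:hsqldb:mem:demo")).tableFeature yields the HyperSQL-backed one.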

The second step would be to provide a simple second implementation of a "table" feature using another technology to back it. An example of this could be a Hyper SQL DB or an Apache Ignite data cache (TBC).

Discuss...

chrisjstevo · 2023-08-14T13:06:24Z

chrisjstevo
Aug 14, 2023
Maintainer Author

In order to begin this work I will create a few new structures within the Vuu source code.

1) I will create an org.finos.vuu.feature package within the existing vuu/vuu module. This will house all the potential features that we allow to be extended.
2) I will create a vuu/features top-level module. Within this top-level module we'll create a submodule for each feature.
3) I will create an example feature using HyperSQL as the backing for a Vuu table with a row count of ~5bn.

So 1) will contain org.finos.vuu.feature.table

and I will have a new module in 2) called table-hypersql under /vuu/features.

chrisjstevo · 2023-11-15T09:05:32Z

chrisjstevo
Nov 15, 2023
Maintainer Author

Closed as superseded by #971
