Expanding Prebid Cache for module data #3512

Open
bretg opened this issue Feb 16, 2024 · 2 comments
bretg commented Feb 16, 2024

As we've been talking with vendors who may be interested in building a Prebid Server module, the issue of local storage comes up consistently. Their browser- or SDK-based functionality can usually keep some local state. If there's no local state in Prebid Server, request latency and network costs are higher for everyone.

The first vendor that built a Prebid Server module requires a local Redis cluster with special stored procedure logic. PBS host companies are not going to be able to support an ecosystem of vendors that require disparate storage mechanisms.

So we've discussed in committee that expanding Prebid Cache to be able to operate as local storage would give PBS host companies some options. At a high level, the idea is that each host company + vendor combination can choose how to handle the state needs for each module:

  1. Configure the module to just phone home on each request. (higher latency and network costs)
  2. Configure the module to point to the host company's Prebid Cache server with a simplified Key-Value interface. (potential to affect existing PBC functionality)
  3. If absolutely necessary, a vendor may impose special storage requirements. (This may limit the adoption of their module.)

Prebid Cache

The current usage patterns of Prebid Cache are:

  • supports several NoSQL backends (Aerospike, Cassandra, Memcache, Redis)
  • stores VAST and bids for a short period (5-15 minutes in general)
  • problems with storing or retrieving VAST/bids affect publisher monetization

In general, Prebid Cache itself is a lightly loaded tier of servers and for most host companies these servers can probably handle more traffic. However, the backend storage is often more difficult to manage operationally: sizing, failovers, fragmentation, monitoring... bit of a pain.

There are several enhancements needed for PBC to support the new usage pattern:

  1. Support a 'text' type in addition to XML and JSON.
  2. Support accepting the key if the request is from its local PBS rather than always generating it. There's concern about allowing arbitrary keys from anywhere on the internet.
  3. Accept an "application" parameter that defines which backend configuration to utilize.
  4. Allow the host company to raise the max ttlseconds per application.

So the proposal here is to give host companies 3 options:

  1. Just use their current PBC and NoSQL backend and mix ads and module data.
  2. Spin up a new PBC+backend specifically to store module data.
  3. Build a new feature in PBC that enables host companies to use the same tier of PBC servers but split the backend so modules use different storage.

First, a reminder of the current PBC POST interface:

{
  "puts": [
    {
      "type": "xml",
      "key": "ArbitraryKeyValue1",
      "ttlseconds": 60,
      "value": "<tag>Your XML content goes here.</tag>"
    },
    {
      "type": "json",
      "key": "ArbitraryKeyValue2",
      "ttlseconds": 300,
      "value": [1, true, "JSON value of any type can go here."]
    }
  ]
}

PBC/S Enhancements

Storage Service

A set of functions that modules can invoke to store and retrieve their data.

This service should be designed to someday accommodate an LRU cache mechanism, but we don't need to implement the LRU initially.
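
As a rough illustration of what such a storage service could look like from a module's point of view, here is a minimal in-memory sketch. The class name `ModuleStorage` and its `store`/`retrieve` methods are hypothetical and not part of any Prebid Server API; a real implementation would call Prebid Cache rather than a local dict. The overwrite-on-write behavior follows the "Accept the key from PBS" section, and no LRU eviction is attempted.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class ModuleStorage:
    """Hypothetical key-value storage service a module could call.

    Backed by an in-memory dict for illustration only; a real service
    would forward these calls to Prebid Cache.
    """

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # key -> (value, expiry timestamp taken from the clock)
        self._entries: Dict[str, Tuple[str, float]] = {}

    def store(self, key: str, value: str, ttl_seconds: int) -> None:
        # Writing to an existing key overwrites the previous value.
        self._entries[key] = (value, self._clock() + ttl_seconds)

    def retrieve(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped lazily on read.
            del self._entries[key]
            return None
        return value
```

Injecting the clock keeps the TTL logic testable, and the small surface (`store`, `retrieve`) leaves room to put an LRU layer in front of it later.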

Text storage type

Modules could supply JSON if they want, but for usages that aren't JSON, they should be able to supply text data, e.g. base64 encoded.

{
  "puts": [
    {
      "type": "text",
      "key": "ArbitraryKeyValue1",
      "ttlseconds": 604800,
      "value": "dGV4dCBkYXRhIGhlcmU="
    }]
}
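
For reference, a module could produce a put like the one above with a few lines of Python. This is only a sketch: the helper names `build_text_put` and `read_text_value` are made up for illustration, and base64 is just one encoding a module might choose.

```python
import base64
import json


def build_text_put(key: str, payload: bytes, ttl_seconds: int) -> dict:
    """Wrap arbitrary bytes as a base64 string in a 'text' put entry."""
    return {
        "type": "text",
        "key": key,
        "ttlseconds": ttl_seconds,
        "value": base64.b64encode(payload).decode("ascii"),
    }


def read_text_value(put: dict) -> bytes:
    """Recover the original bytes from a base64-encoded 'text' entry."""
    return base64.b64decode(put["value"])


# Build the request body shown above (604800 seconds is 7 days).
put = build_text_put("ArbitraryKeyValue1", b"text data here", 604800)
request_body = json.dumps({"puts": [put]})
```

The payload `b"text data here"` encodes to the `dGV4dCBkYXRhIGhlcmU=` value used in the example.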

Accept the key from PBS

Modules will need to be able to supply a key for their data. e.g. "mymodule-sharedid-1234567890". Writing to an existing key should overwrite the existing value.

We can't allow arbitrary key overwrites from anywhere on the internet because that could open the door to abuse.

It's up for discussion how PBC should confirm that the request is allowed to set a key and overwrite. A couple of options:

a) IP address range configuration. PBC could recognize that any IP address matching a configured pattern (e.g. 100.100.*) is allowed to write.
b) public/private key pair
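
To make option (a) concrete, here is a minimal sketch of an IP allowlist check. It assumes the `100.100.*` pattern would be configured as the CIDR block `100.100.0.0/16`; the names `ALLOWED_WRITER_NETWORKS` and `may_supply_key` are hypothetical, and this is not how PBC is actually implemented. Option (b) is not sketched here.

```python
import ipaddress

# Assumed configuration: the wildcard 100.100.* expressed as a CIDR block.
ALLOWED_WRITER_NETWORKS = [ipaddress.ip_network("100.100.0.0/16")]


def may_supply_key(remote_ip: str) -> bool:
    """Return True if the caller's IP is allowed to set or overwrite a key."""
    try:
        addr = ipaddress.ip_address(remote_ip)
    except ValueError:
        # Unparseable addresses are never trusted.
        return False
    return any(addr in network for network in ALLOWED_WRITER_NETWORKS)
```

A request that fails this check could still be served the way PBC works today, with a generated key, so the restriction only applies to caller-supplied keys. Note that an IP check is only meaningful if PBC sees the true client address, which depends on how the host company's load balancers and proxies are set up.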

Application parameter

To support the ability for host companies to bifurcate data storage, the proposal is to allow PBC to configure different "applications". The default application is the "adstorage" that exists today. Any number of new applications can be added that can define different backend NoSQL, different max TTL, etc.

{
  "puts": [
    {
      "type": "text",
      "key": "ArbitraryKeyValue1",
      "application": "moduledata",
      "ttlseconds": 604800,
      "value": "<tag>Your XML content goes here.</tag>"
    }]
}
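
One way PBC could route a put by application is sketched below. Everything here is an assumption for illustration: the backend names, the 900-second cap for "adstorage", and the choice to clamp an oversized `ttlseconds` rather than reject it are not specified by this proposal.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class ApplicationConfig:
    backend: str          # hypothetical name of a configured NoSQL backend
    max_ttl_seconds: int  # per-application TTL ceiling


# Assumed host-company configuration; values are illustrative only.
APPLICATIONS: Dict[str, ApplicationConfig] = {
    "adstorage": ApplicationConfig(backend="redis-ads", max_ttl_seconds=900),
    "moduledata": ApplicationConfig(backend="redis-modules", max_ttl_seconds=604800),
}
DEFAULT_APPLICATION = "adstorage"


def resolve_put(put: dict) -> Tuple[str, int]:
    """Pick the backend and effective TTL for a single put entry."""
    app_name = put.get("application", DEFAULT_APPLICATION)
    if app_name not in APPLICATIONS:
        raise ValueError(f"unknown application: {app_name}")
    config = APPLICATIONS[app_name]
    requested = put.get("ttlseconds", config.max_ttl_seconds)
    # Assumption: TTLs above the application's maximum are clamped.
    return config.backend, min(requested, config.max_ttl_seconds)
```

With this configuration, the put above would land on the module backend with its full 7-day TTL, while the same entry without an "application" field would go to the ad storage backend with the TTL clamped to 900 seconds.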

bretg commented May 8, 2024

@bsardo and @muuki88 to review.

@bretg bretg moved this from Needs Requirements to Ready for Dev in Prebid Server Prioritization Jun 5, 2024

bretg commented Jul 12, 2024

Done with PBS-Java 3.5, though there were some changes from the proposed spec here. A new endpoint was created rather than trying to expand the existing /cache endpoint. Details forthcoming.
