Stephen Pascoe edited this page Apr 9, 2014 · 3 revisions
Wiki Reorganisation
This page has been classified for reorganisation. It has been given the category MOVE.
The content of this page will be revised and moved to one or more other pages in the new wiki structure.

ESGF Replica Support

The ESGF P2P system fully supports the notion of _replicas_: identical copies of the same entity published at separate locations. To be precise, the first copy of a Dataset or File to be published is called the _master_, while all subsequent copies published elsewhere are called _replicas_.

Thredds Catalogs

In a Thredds catalog, replicas are identified by the presence of two special properties that are attached to the top-level Dataset:

  • : is the fully qualified hostname where the master catalog is found

  • : is the fully qualified hostname where the replica catalog is found

The dataset and file IDs are identical in the master and replica catalogs. If a catalog does not possess the above properties, it is considered to be a master catalog.
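The rule above can be sketched in Python as a small check on the top-level dataset of a catalog. This is only an illustration, not the ESGF publisher's code. The property names are not shown on this page, so the function takes them as parameters, and the names used in the usage lines (`example_master_host`, `example_replica_host`) are placeholders. The sketch also assumes that a catalog counts as a replica only when both properties are present, and that properties appear as `<property name="..." value="..."/>` children of the dataset in the THREDDS InvCatalog 1.0 namespace.

```python
import xml.etree.ElementTree as ET

# THREDDS InvCatalog 1.0 XML namespace, in ElementTree's {uri}tag notation.
THREDDS_NS = "{http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0}"


def classify_catalog(catalog_xml, master_prop, replica_prop):
    """Return "replica" if the top-level dataset carries both properties, else "master"."""
    root = ET.fromstring(catalog_xml)
    top_dataset = root.find(THREDDS_NS + "dataset")
    if top_dataset is None:
        raise ValueError("catalog has no top-level dataset")
    # Only <property> elements attached directly to the top-level dataset count.
    names = {p.get("name") for p in top_dataset.findall(THREDDS_NS + "property")}
    # Assumption: both properties must be present to flag a replica.
    return "replica" if {master_prop, replica_prop} <= names else "master"


# Usage with a made-up catalog; the property names are placeholders.
sample = """<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0">
  <dataset name="example" ID="example.dataset.v1">
    <property name="example_master_host" value="master.example.org"/>
    <property name="example_replica_host" value="replica.example.org"/>
  </dataset>
</catalog>"""
kind = classify_catalog(sample, "example_master_host", "example_replica_host")
```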

Solr Index

When Thredds catalogs are ingested into the Solr index, the properties above, if found, are used to flag the records as replicas; records without them are flagged as masters. Specifically, the following fields are stored in the Solr index for each record:

  • master record:
      * _id_: exactly as found in the Thredds catalog (from the _dataset_id_ or _file_id_ field)
      * _replica_: set to _false_
      * _master_id_: set equal to the record id

  • replica record:
      * _id_: built as the composition of the master id and the replica hostname, separated by a colon: _master_id:replica_node_
      * _replica_: set to _true_
      * _master_id_: set equal to the master record id

Note that all records in the Solr index have universally unique ids. In other words, the Solr _id_ of a master record and the _id_ of each of its replica records are different, but their _master_id_ values are the same.
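As a minimal sketch of the field rules above, the following Python function builds the three fields for one record. The function name and the example id and hostname are hypothetical and are not part of the ESGF code base. It only mirrors the composition rule _master_id:replica_node_ described on this page.

```python
def solr_replica_fields(record_id, replica_node=None):
    """Build the id, replica and master_id fields for one Solr record.

    record_id is the dataset_id or file_id from the Thredds catalog.
    replica_node is the replica hostname, or None for a master record.
    """
    if replica_node is None:
        # Master record: the id is used unchanged and master_id points to itself.
        return {"id": record_id, "replica": False, "master_id": record_id}
    # Replica record: the id is the master id and the replica hostname joined by a colon.
    return {
        "id": f"{record_id}:{replica_node}",
        "replica": True,
        "master_id": record_id,
    }


# Usage with made-up values.
master = solr_replica_fields("example.dataset.v1")
replica = solr_replica_fields("example.dataset.v1", "replica.example.org")
```

Here `master["id"]` and `replica["id"]` differ, while both records share the same `master_id`, so all copies of an entity can be found by querying on that field.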
