fn prune accepts a range: CheckpointMapping, and callers can choose to use the checkpoint, tx, or epoch_interval, all of which have an exclusive hi

address first set of comments. trait fn prune still accepts from and to checkpoints, up to handler prune to decide what it needs. doc comments for PrunableRange regarding what it expects on instantiation

extend PrunableRange::get_range to return an error if from_cp is not < to_cp. Output warning if it tries to fetch from cp_mapping and can't. and remove warning in framework/lib.rs around cp_mapping

split out models from handlers

simplify error story

expose intervals as functions instead of in a struct

anyhow error when from_cp not < to_cp
wlmyng committed Dec 26, 2024
1 parent bed5641 commit 87504e7
Showing 5 changed files with 100 additions and 10 deletions.
@@ -3,23 +3,14 @@

use std::sync::Arc;

use crate::models::cp_sequence_numbers::StoredCpSequenceNumbers;
use crate::pipeline::{concurrent::Handler, Processor};
use crate::schema::cp_sequence_numbers;
use anyhow::Result;
use diesel::prelude::*;
use diesel_async::RunQueryDsl;
use sui_field_count::FieldCount;
use sui_pg_db::{self as db};
use sui_types::full_checkpoint_content::CheckpointData;

#[derive(Insertable, Selectable, Queryable, Debug, Clone, FieldCount)]
#[diesel(table_name = cp_sequence_numbers)]
pub struct StoredCpSequenceNumbers {
    pub cp_sequence_number: i64,
    pub tx_lo: i64,
    pub epoch: i64,
}

pub struct CpSequenceNumbers;

impl Processor for CpSequenceNumbers {
1 change: 1 addition & 0 deletions crates/sui-indexer-alt-framework/src/lib.rs
@@ -26,6 +26,7 @@ use watermarks::CommitterWatermark;
pub mod handlers;
pub mod ingestion;
pub(crate) mod metrics;
pub mod models;
pub mod pipeline;
pub(crate) mod schema;
pub mod task;
88 changes: 88 additions & 0 deletions crates/sui-indexer-alt-framework/src/models/cp_sequence_numbers.rs
@@ -0,0 +1,88 @@
// Copyright (c) Mysten Labs, Inc.
// SPDX-License-Identifier: Apache-2.0

use crate::schema::cp_sequence_numbers;
use anyhow::{bail, Result};
use diesel::prelude::*;
use diesel_async::RunQueryDsl;
use std::ops::Range;
use sui_field_count::FieldCount;
use sui_pg_db::Connection;

#[derive(Insertable, Selectable, Queryable, Debug, Clone, FieldCount)]
#[diesel(table_name = cp_sequence_numbers)]
pub struct StoredCpSequenceNumbers {
    pub cp_sequence_number: i64,
    pub tx_lo: i64,
    pub epoch: i64,
}

/// Inclusive start and exclusive end range of prunable txs.
pub async fn tx_interval(conn: &mut Connection<'_>, cps: Range<u64>) -> Result<Range<u64>> {
    let result = get_range(conn, cps).await?;

    Ok(Range {
        start: result.0.tx_lo as u64,
        end: result.1.tx_lo as u64,
    })
}

/// Inclusive start and exclusive end range of epochs.
///
/// The start and end of the returned range are the epochs that the `from` and `to` checkpoints
/// belong to, respectively.
pub async fn epoch_interval(conn: &mut Connection<'_>, cps: Range<u64>) -> Result<Range<u64>> {
    let result = get_range(conn, cps).await?;

    Ok(Range {
        start: result.0.epoch as u64,
        end: result.1.epoch as u64,
    })
}

/// Gets the tx and epoch mappings for the given checkpoint range.
///
/// The values are expected to exist since the cp_mapping table must have enough information to
/// encompass the retention of other tables.
pub(crate) async fn get_range(
    conn: &mut Connection<'_>,
    cps: Range<u64>,
) -> Result<(StoredCpSequenceNumbers, StoredCpSequenceNumbers)> {
    let Range {
        start: from_cp,
        end: to_cp,
    } = cps;

    if from_cp >= to_cp {
        bail!(format!(
            "Invalid checkpoint range: `from` {from_cp} must be less than `to` {to_cp}"
        ));
    }

    let results = cp_sequence_numbers::table
        .select(StoredCpSequenceNumbers::as_select())
        .filter(cp_sequence_numbers::cp_sequence_number.eq_any([from_cp as i64, to_cp as i64]))
        .order(cp_sequence_numbers::cp_sequence_number.asc())
        .load::<StoredCpSequenceNumbers>(conn)
        .await
        .map_err(anyhow::Error::from)?;

    let Some(from) = results
        .iter()
        .find(|cp| cp.cp_sequence_number == from_cp as i64)
    else {
        bail!(format!(
            "No checkpoint mapping found for checkpoint {from_cp}"
        ));
    };
    let Some(to) = results
        .iter()
        .find(|cp| cp.cp_sequence_number == to_cp as i64)
    else {
        bail!(format!(
            "No checkpoint mapping found for checkpoint {to_cp}"
        ));
    };

    Ok((from.clone(), to.clone()))
}
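
The lookup above can be tried without a database. The following is a minimal sketch, assuming an in-memory `BTreeMap` stands in for the `cp_sequence_numbers` table; `CpRow`, `CpTable`, and `sample_table` are hypothetical names introduced only for illustration, and errors are plain `String`s rather than `anyhow` errors. It mirrors the same checks: the range must be non-empty, and both endpoints must have a mapping.

```rust
use std::collections::BTreeMap;
use std::ops::Range;

/// Hypothetical in-memory stand-in for one row of `cp_sequence_numbers`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CpRow {
    pub tx_lo: u64,
    pub epoch: u64,
}

/// Hypothetical table, keyed by checkpoint sequence number.
pub type CpTable = BTreeMap<u64, CpRow>;

/// Fetch the rows for both ends of `cps`, rejecting empty or inverted ranges.
pub fn get_range(table: &CpTable, cps: Range<u64>) -> Result<(CpRow, CpRow), String> {
    let Range {
        start: from_cp,
        end: to_cp,
    } = cps;

    if from_cp >= to_cp {
        return Err(format!(
            "Invalid checkpoint range: `from` {from_cp} must be less than `to` {to_cp}"
        ));
    }

    let from = table
        .get(&from_cp)
        .ok_or_else(|| format!("No checkpoint mapping found for checkpoint {from_cp}"))?;
    let to = table
        .get(&to_cp)
        .ok_or_else(|| format!("No checkpoint mapping found for checkpoint {to_cp}"))?;

    Ok((*from, *to))
}

/// Inclusive start, exclusive end range of transactions covered by `cps`.
pub fn tx_interval(table: &CpTable, cps: Range<u64>) -> Result<Range<u64>, String> {
    let (from, to) = get_range(table, cps)?;
    Ok(from.tx_lo..to.tx_lo)
}

/// Epochs that the `from` and `to` checkpoints belong to.
pub fn epoch_interval(table: &CpTable, cps: Range<u64>) -> Result<Range<u64>, String> {
    let (from, to) = get_range(table, cps)?;
    Ok(from.epoch..to.epoch)
}

/// Made-up sample data: three consecutive checkpoints spanning an epoch boundary.
pub fn sample_table() -> CpTable {
    BTreeMap::from([
        (10, CpRow { tx_lo: 100, epoch: 0 }),
        (11, CpRow { tx_lo: 105, epoch: 0 }),
        (12, CpRow { tx_lo: 112, epoch: 1 }),
    ])
}
```

With the sample data, `tx_interval(&sample_table(), 10..12)` yields `100..112`: the first transaction of checkpoint 12 is excluded, which is why the upper checkpoint itself must have a mapping even though nothing in it is pruned.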
4 changes: 4 additions & 0 deletions crates/sui-indexer-alt-framework/src/models/mod.rs
@@ -0,0 +1,4 @@
// Copyright (c) Mysten Labs, Inc.
// SPDX-License-Identifier: Apache-2.0

pub mod cp_sequence_numbers;
6 changes: 6 additions & 0 deletions crates/sui-indexer-alt/src/README.md
@@ -6,3 +6,9 @@ The required flags are --remote-store-url (or --local-ingestion-path) and the --
```
cargo run --bin sui-indexer-alt -- --database-url {url} indexer --remote-store-url https://checkpoints.mainnet.sui.io --skip-watermark --first-checkpoint 68918060 --last-checkpoint 68919060 --config indexer_alt_config.toml
```

## Pruning
To enable pruning, the `cp_sequence_numbers` pipeline must be enabled. Otherwise, even if pruning
logic is configured for a table, the pruner task skips its work whenever it cannot find a mapping
for the checkpoint pruning watermark. Only one committer needs to update this table; it is not
necessary for every indexer instance to have this pipeline enabled.
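
The skip behaviour can be illustrated with a small sketch. This is not the framework's pruner: `PruneOutcome` and `prune_once` are hypothetical names, the mapping is an in-memory `BTreeMap` from checkpoint to its first transaction sequence number, and the "table" being pruned is a plain `Vec` of transaction sequence numbers.

```rust
use std::collections::BTreeMap;
use std::ops::Range;

/// Hypothetical result of one pruning attempt.
#[derive(Debug, PartialEq, Eq)]
pub enum PruneOutcome {
    Pruned { rows_deleted: usize },
    Skipped,
}

/// Prune the transactions covered by checkpoints `cps` (exclusive upper bound).
///
/// `mapping` stands in for `cp_sequence_numbers` (checkpoint -> `tx_lo`).
/// If either endpoint has no mapping, nothing is deleted and the attempt is skipped.
pub fn prune_once(
    mapping: &BTreeMap<u64, u64>,
    rows: &mut Vec<u64>,
    cps: Range<u64>,
) -> PruneOutcome {
    let (Some(&lo), Some(&hi)) = (mapping.get(&cps.start), mapping.get(&cps.end)) else {
        return PruneOutcome::Skipped;
    };

    let before = rows.len();
    // Keep every row outside the half-open transaction interval [lo, hi).
    rows.retain(|tx| !(lo..hi).contains(tx));

    PruneOutcome::Pruned {
        rows_deleted: before - rows.len(),
    }
}
```

When the mapping is empty, as it would be if no instance ran the `cp_sequence_numbers` pipeline, every call returns `Skipped` and the rows are left untouched.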
