Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(stats): Backend charts & Migration 6: Transactions page #1158

Merged
merged 17 commits into from
Dec 25, 2024

Conversation

bragov4ik
Copy link
Contributor

@bragov4ik bragov4ik commented Dec 19, 2024

I suggest to review the PR by commits, as I've organized them this time properly, so that changes are well-separated.

Changes:

  • migration of counters in /txns page (+ same behaviour as in blockscout)
  • new caching functionality based on update context
  • (technical) remove marked db connection as it won't be used

Chart sources:

Summary by CodeRabbit

  • New Features

    • Introduced new metrics for monitoring blockchain activity, including new_txns_24h, pending_txns, txns_fee_24h, and average_txn_fee_24h.
    • Added new scheduling groups for transaction monitoring: pending_txns_group and txns_stats_24h_group.
    • Enhanced transaction statistics with new modules for AverageTxnFee24h, NewTxns24h, and TxnsFee24h.
  • Bug Fixes

    • Streamlined database connection handling across various components to improve performance and readability.
  • Documentation

    • Updated comments and documentation for clarity regarding the new functionalities and metrics.
  • Chores

    • Removed deprecated MarkedDbConnection references and related utility functions throughout the codebase.

Copy link

coderabbitai bot commented Dec 19, 2024

Walkthrough

This pull request introduces significant enhancements to the Blockscout stats service, focusing on expanding transaction-related metrics and improving the overall data collection and caching infrastructure. The changes span multiple components of the stats service, including configuration files, database interaction methods, and chart generation logic.

Key modifications include:

  • Adding new transaction-related counters: pending transactions, 24-hour transaction statistics (new transactions, transaction fees, average transaction fees)
  • Updating dependency management, specifically for blockscout-service-launcher
  • Refactoring database connection handling by removing MarkedDbConnection and utilizing Arc<DatabaseConnection>
  • Introducing a new caching mechanism with UpdateCache to improve performance and reduce redundant database queries
  • Expanding update groups and scheduling for new metrics
  • Implementing new chart sources and query methods for transaction-related statistics

The changes are comprehensive and touch multiple layers of the stats service, from configuration and dependency management to data retrieval and processing.

Possibly related issues

  • stats: main page statistics migration #1148 (stats: main page statistics migration)
    • This PR directly addresses some aspects of the issue by:
      • Adding new transaction-related charts
      • Improving chart update mechanisms
      • Enhancing data retrieval and caching strategies

Possibly related PRs

Poem

🐰 Hop, hop, through data's domain,
Where metrics dance and numbers reign,
New counters bloom like spring's delight,
Transactions sparkle, metrics bright!
A rabbit's code, both swift and keen,
Makes stats more clear than e'er they've been! 🚀


🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (28)
stats/stats/src/charts/counters/average_txn_fee_24h.rs (3)

22-22: Review data type for ETHER constant.
Although storing 10^18 in a 64-bit integer is valid, EVM-based fee calculations can be sensitive to precision. If fees exceed certain thresholds or if future expansions require more than 64-bit ranges, you may need a bigger integer or a decimal type for safer arithmetic.


77-82: Ensure large sums won't overflow.
'fee_sum' is stored in an Option, but if transactions are extremely large in quantity or value, floating-point precision might become a concern. Consider using a decimal type for more exact financial calculations.


129-166: Expand test coverage for boundary scenarios.
The test cases appear to focus on typical ranges and a zero-fee scenario. Adding tests for extremely high values, negative or zero gas usage, or partial data in the database will further strengthen the reliability of the transactions fee calculations.

stats/stats/src/charts/counters/pending_txns.rs (2)

45-64: Consider making the 30-minute offset configurable.
Hardcoding “30 minutes” might be acceptable in many scenarios, but a configuration-driven approach could offer greater flexibility.


90-104: Test coverage improvement.
The test is ignored and requires a live database. To accelerate CI feedback, you could add a smaller unit test that mocks or stubs the database, verifying the logic without needing a live environment.

stats/stats/src/data_source/kinds/remote_db/query/one.rs (3)

50-50: Handle “no data” scenario more explicitly.
Using “query returned nothing” might be valid, but returning an explicit zero or empty result could be less confusing, depending on downstream logic. Consider clarifying or differentiating real error conditions from “no query results.”

Do you want a follow-up PR to refine this behavior?


58-60: Doc comment clarity.
These doc comments explain that the result is cached for the same query. Highlight the caching key more explicitly (e.g. statement text) to avoid confusion about how collisions might occur if different statements share the same string.


65-104: Potential concurrency & eviction considerations.
If your application handles many unique queries, the in-memory cache could grow unbounded. Consider adding an eviction or capacity limit, or storing partial query digests if the statements can be large.

stats/stats/src/data_source/types.rs (2)

139-144: Check for new cache variants.
The enum covers String, Option, and TxnsStatsValue. If more data types arise in the future, consider a more generic approach—like storing JSON or a dynamic type—to avoid frequent expansions of CacheValue.


205-238: Evaluate a capacity or TTL for the cache.
As the code stands, the cache never evicts values. In a long-running system, memory usage could steadily grow. A time-based or size-based eviction policy could be beneficial.

stats/stats/src/range.rs (1)

195-197: Clarify sub-millisecond range accuracy limitation.
Thanks for the cautionary note. Add a snippet in the doc example that illustrates potential off-by-one or rounding scenarios. This can reduce confusion about exclusive/inclusive boundary transformations.

stats/stats/src/charts/counters/mod.rs (1)

2-2: LGTM! Consider grouping transaction-related modules.

The new transaction-related modules and exports align well with the PR's objective of migrating the transactions page functionality. The naming and visibility modifiers are appropriate.

Consider grouping all transaction-related modules into a submodule (e.g., txns/) to improve code organization as the number of transaction-related counters grows. This would make the module hierarchy more maintainable:

counters/
  txns/
    average_txn_fee_24h.rs
    new_txns_24h.rs
    pending_txns.rs
    txns_fee_24h.rs
    mod.rs  # Re-exports all transaction-related types
  mod.rs    # Re-exports from txns/mod.rs

Also applies to: 6-7, 18-18, 25-26, 30-31, 42-42

stats/stats/src/data_source/kinds/data_manipulation/map/unwrap_or.rs (1)

7-7: Add documentation for UnwrapOrFunction struct

Consider adding documentation explaining the purpose and usage of this struct, including examples.

+/// A function that unwraps optional values in TimespanValue, providing a default value when None.
+/// 
+/// # Type Parameters
+/// * `DefaultValue` - A type implementing Get trait that provides the default value
pub struct UnwrapOrFunction<DefaultValue: Get>(PhantomData<DefaultValue>);
stats/stats/src/charts/counters/total_contracts.rs (1)

Line range hint 71-75: Consider adding mock-based tests

The current test is ignored due to database dependency. Consider adding mock-based tests to improve test coverage without requiring a live database.

Example approach:

#[cfg(test)]
mod tests {
    use super::*;
    use mockall::predicate::*;
    use mockall::mock;

    mock! {
        DatabaseConnection {
            async fn count(&self, query: Select<addresses::Entity>) -> Result<u64, DbErr>;
        }
    }

    #[tokio::test]
    async fn test_query_data() {
        let mut mock_db = MockDatabaseConnection::new();
        mock_db.expect_count()
            .return_const(Ok(23u64));
        
        let context = UpdateContext {
            blockscout: &mock_db,
            time: Utc::now(),
            // ... other fields
        };

        let result = TotalContractsQueryBehaviour::query_data(
            &context,
            UniversalRange::default()
        ).await;
        
        assert!(result.is_ok());
        assert_eq!(result.unwrap().value, "23");
    }
}
stats/stats/src/charts/counters/new_txns_24h.rs (1)

51-84: Consider adding more test coverage

While the current tests cover basic scenarios and edge cases, consider adding tests for:

  • Error handling scenarios
  • Boundary conditions for the 24h window
stats/stats-server/tests/it/counters.rs (2)

Line range hint 33-34: Consider increasing the wait time for server initialization

The current 5-second sleep might be insufficient if the server needs to calculate many values, especially with the new counters added.

-    tokio::time::sleep(std::time::Duration::from_secs(5)).await;
+    // Increased wait time to ensure all counters are calculated
+    tokio::time::sleep(std::time::Duration::from_secs(10)).await;

Line range hint 35-41: Consider adding specific value assertions

The test only verifies non-empty strings for description and title. Consider adding assertions for specific expected values or value ranges.

stats/stats/src/charts/counters/txns_fee_24h.rs (1)

63-88: Consider adding test case descriptions

While the test cases cover different scenarios, adding comments describing what each test verifies would improve maintainability.

 #[tokio::test]
 #[ignore = "needs database to run"]
+// Test case 1: Verifies handling of small transaction fees
 async fn update_txns_fee_24h_1() {
     simple_test_counter::<TxnsFee24h>("update_txns_fee_24h_1", "0.000023592592569", None).await;
 }
stats/stats/src/charts/query_dispatch.rs (1)

38-38: Important lifetime annotation improvement

The change from &UpdateContext<'a> to &'a UpdateContext<'a> strengthens the lifetime guarantees by ensuring that the UpdateContext reference lives for the entire duration of the async operation. This prevents potential dangling references in async contexts.

Consider documenting these lifetime requirements in the trait documentation to help future maintainers understand the importance of these constraints.

Also applies to: 46-46, 154-154

stats/stats/src/update_groups.rs (1)

145-147: LGTM: New transaction statistics group

The new TxnsStats24hGroup logically groups related 24-hour transaction metrics (AverageTxnFee24h, NewTxns24h, TxnsFee24h) together for coordinated updates.

Consider adding a comment explaining the update frequency requirements or dependencies between these metrics, if any exist.

stats/stats/src/data_source/kinds/data_manipulation/filter_deducible.rs (1)

Line range hint 4-164: Consider adding edge case tests

While the current tests cover the basic functionality, consider adding tests for edge cases such as:

  • Empty input vectors
  • Single point vectors
  • Vectors with all zero values for FillZero policy
  • Vectors with all same values for FillPrevious policy
stats/stats/src/data_source/kinds/local_db/parameters/query.rs (1)

Line range hint 78-89: Consider enhancing error message with more context

The error message could be more helpful by including additional context such as the date being queried.

-            "no data for counter '{}' was found",
-            C::name()
+            "no data for counter '{}' was found for date {}",
+            C::name(), cx.time.date_naive()
stats/stats/src/utils.rs (1)

Line range hint 1-6: Consider updating module documentation

The module documentation could be enhanced to better describe the current utilities after the removal of database connection related code.

-//! Common utilities used across statistics
+//! Common utilities for SQL query generation and date/time handling used across statistics
stats/stats/src/charts/counters/total_blocks.rs (1)

79-90: Consider improving the estimation accuracy.

The current implementation:

  1. Uses a fixed 0.9 multiplier for estimation
  2. Doesn't provide documentation explaining the rationale for this multiplier
  3. Silently defaults to 0 when estimation fails

Consider:

  1. Documenting the reasoning behind the 0.9 multiplier
  2. Implementing a more dynamic estimation strategy
  3. Adding error context when estimation fails
stats/stats-server/src/read_service.rs (1)

16-16: LGTM! Clean transition to Arc.

The changes consistently implement the transition from MarkedDbConnection to Arc<DatabaseConnection>, maintaining thread-safety and improving the architecture. The error handling remains robust, and the service's functionality is preserved.

Consider grouping related imports together:

-use sea_orm::{DatabaseConnection, DbErr};
 use stats::{
     data_source::{types::BlockscoutMigrations, UpdateContext, UpdateParameters},
     query_dispatch::{CounterHandle, LineHandle, QuerySerializedDyn},
     range::UniversalRange,
     types::Timespan,
-    utils::day_start,
     ChartError, RequestedPointsLimit, ResolutionKind,
+    utils::day_start,
 };
+use sea_orm::{DatabaseConnection, DbErr};

Also applies to: 22-22, 30-31, 38-39, 107-107

stats/stats/src/data_source/kinds/data_manipulation/resolutions/average.rs (1)

205-213: Consider extracting common test setup code

There's significant duplication in the test setup code across multiple test functions. Consider extracting the common initialization into a helper function.

+async fn setup_test_context(time: &str) -> (DatabaseConnection, UpdateContext<'_>) {
+    let empty_db = sea_orm::Database::connect("sqlite::memory:").await.unwrap();
+    let context = UpdateContext::from_params_now_or_override(UpdateParameters {
+        db: &empty_db,
+        blockscout: &empty_db,
+        blockscout_applied_migrations: BlockscoutMigrations::latest(),
+        update_time_override: Some(dt(time).and_utc()),
+        force_full: false,
+    });
+    (empty_db, context)
+}

This would allow simplifying the test setup:

-let empty_db = sea_orm::Database::connect("sqlite::memory:").await.unwrap();
-let context = UpdateContext::from_params_now_or_override(UpdateParameters {
-    db: &empty_db,
-    blockscout: &empty_db,
-    blockscout_applied_migrations: BlockscoutMigrations::latest(),
-    update_time_override: Some(dt("2023-03-30T09:00:00").and_utc()),
-    force_full: false,
-});
+let (empty_db, context) = setup_test_context("2023-03-30T09:00:00").await;

Also applies to: 253-261, 303-311, 349-357

stats/stats/src/data_source/kinds/local_db/mod.rs (2)

Line range hint 145-188: Consider adding retry mechanism for database operations.

The database operations could benefit from a retry mechanism for transient failures, especially for the metadata updates which are critical for chart synchronization.

Consider implementing a retry mechanism:

+async fn with_retry<F, T, E>(f: F, retries: u32, delay: Duration) -> Result<T, E>
+where
+    F: Fn() -> Pin<Box<dyn Future<Output = Result<T, E>> + Send>> + Send + Sync,
+    T: Send,
+    E: std::error::Error + Send,
+{
+    let mut attempts = 0;
+    loop {
+        match f().await {
+            Ok(result) => return Ok(result),
+            Err(e) if attempts < retries => {
+                attempts += 1;
+                tokio::time::sleep(delay).await;
+                continue;
+            }
+            Err(e) => return Err(e),
+        }
+    }
+}

438-441: Consider parameterizing test data setup.

The test initialization could be more flexible by parameterizing the mock data setup.

Consider extracting test parameters:

+struct TestParams {
+    name: &'static str,
+    current_time: DateTime<Utc>,
+    mock_data: Vec<MockData>,
+}
+
+async fn setup_test(params: TestParams) -> (DatabaseConnection, DatabaseConnection) {
+    let (db, blockscout) = init_db_all(params.name).await;
+    fill_mock_blockscout_data(&blockscout, params.current_time.date_naive()).await;
+    (db, blockscout)
+}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 57c1884 and 087bfd7.

⛔ Files ignored due to path filters (1)
  • stats/Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (50)
  • stats/Cargo.toml (1 hunks)
  • stats/config/charts.json (1 hunks)
  • stats/config/layout.json (1 hunks)
  • stats/config/update_groups.json (2 hunks)
  • stats/stats-server/src/read_service.rs (2 hunks)
  • stats/stats-server/src/runtime_setup.rs (2 hunks)
  • stats/stats-server/src/server.rs (4 hunks)
  • stats/stats-server/src/update_service.rs (3 hunks)
  • stats/stats-server/tests/it/counters.rs (1 hunks)
  • stats/stats/Cargo.toml (1 hunks)
  • stats/stats/src/charts/counters/average_block_time.rs (2 hunks)
  • stats/stats/src/charts/counters/average_txn_fee_24h.rs (1 hunks)
  • stats/stats/src/charts/counters/mod.rs (3 hunks)
  • stats/stats/src/charts/counters/new_txns_24h.rs (1 hunks)
  • stats/stats/src/charts/counters/pending_txns.rs (1 hunks)
  • stats/stats/src/charts/counters/total_addresses.rs (2 hunks)
  • stats/stats/src/charts/counters/total_blocks.rs (8 hunks)
  • stats/stats/src/charts/counters/total_contracts.rs (2 hunks)
  • stats/stats/src/charts/counters/total_txns.rs (3 hunks)
  • stats/stats/src/charts/counters/txns_fee_24h.rs (1 hunks)
  • stats/stats/src/charts/counters/yesterday_txns.rs (2 hunks)
  • stats/stats/src/charts/db_interaction/read.rs (2 hunks)
  • stats/stats/src/charts/lines/native_coin_holders_growth.rs (2 hunks)
  • stats/stats/src/charts/lines/new_accounts.rs (1 hunks)
  • stats/stats/src/charts/lines/new_block_rewards.rs (2 hunks)
  • stats/stats/src/charts/lines/new_blocks.rs (8 hunks)
  • stats/stats/src/charts/lines/new_txns_window.rs (2 hunks)
  • stats/stats/src/charts/query_dispatch.rs (2 hunks)
  • stats/stats/src/data_source/kinds/data_manipulation/filter_deducible.rs (2 hunks)
  • stats/stats/src/data_source/kinds/data_manipulation/map/mod.rs (1 hunks)
  • stats/stats/src/data_source/kinds/data_manipulation/map/unwrap_or.rs (1 hunks)
  • stats/stats/src/data_source/kinds/data_manipulation/resolutions/average.rs (5 hunks)
  • stats/stats/src/data_source/kinds/data_manipulation/resolutions/last_value.rs (2 hunks)
  • stats/stats/src/data_source/kinds/data_manipulation/resolutions/sum.rs (2 hunks)
  • stats/stats/src/data_source/kinds/local_db/mod.rs (7 hunks)
  • stats/stats/src/data_source/kinds/local_db/parameters/query.rs (8 hunks)
  • stats/stats/src/data_source/kinds/local_db/parameters/update/batching/mod.rs (3 hunks)
  • stats/stats/src/data_source/kinds/local_db/parameters/update/point.rs (1 hunks)
  • stats/stats/src/data_source/kinds/remote_db/mod.rs (2 hunks)
  • stats/stats/src/data_source/kinds/remote_db/query/all.rs (1 hunks)
  • stats/stats/src/data_source/kinds/remote_db/query/each.rs (1 hunks)
  • stats/stats/src/data_source/kinds/remote_db/query/mod.rs (1 hunks)
  • stats/stats/src/data_source/kinds/remote_db/query/one.rs (2 hunks)
  • stats/stats/src/data_source/tests.rs (3 hunks)
  • stats/stats/src/data_source/types.rs (5 hunks)
  • stats/stats/src/range.rs (2 hunks)
  • stats/stats/src/tests/init_db.rs (0 hunks)
  • stats/stats/src/tests/simple_test.rs (8 hunks)
  • stats/stats/src/update_groups.rs (2 hunks)
  • stats/stats/src/utils.rs (1 hunks)
💤 Files with no reviewable changes (1)
  • stats/stats/src/tests/init_db.rs
✅ Files skipped from review due to trivial changes (1)
  • stats/Cargo.toml
🧰 Additional context used
📓 Learnings (1)
stats/stats/src/data_source/kinds/local_db/parameters/update/batching/mod.rs (1)
Learnt from: bragov4ik
PR: blockscout/blockscout-rs#1147
File: stats/stats/src/data_source/kinds/local_db/parameters/update/batching/parameters/mod.rs:76-98
Timestamp: 2024-12-11T15:25:33.248Z
Learning: In the `ClearAllAndPassStep::batch_update_values_step_with` function in `stats/src/data_source/kinds/local_db/parameters/update/batching/parameters/mod.rs`, it's acceptable to perform the clear and insert operations separately without wrapping them in a transaction, as inconsistency is acceptable in this context, and the insert step is very unlikely to fail unless the service is down.
🔇 Additional comments (57)
stats/stats/src/charts/counters/average_txn_fee_24h.rs (2)

26-74: Validate potential negative or edge-case fees.
While the expression for fee calculation appears correct for standard EVM transactions (using gas price, base fee, etc.), consider adding explicit safeguards or tests to handle potential negative or nonsensical computed fees (e.g., if maxFeePerGas < baseFeePerGas).

Would you like me to generate a script to check for usage or references of this calculation in test or production code to confirm it never calculates negative fee values?


86-99: Good extraction logic for average fees.
The ExtractAverage struct succinctly transforms the fee stats into just the average fee. This is a clean implementation that isolates logic neatly.

stats/stats/src/data_source/kinds/remote_db/mod.rs (2)

5-8: Documentation improvements look good!

The updated documentation better reflects the component's nature and provides clearer guidance about its behavior and limitations.


37-38: Verify cache implementation details

The addition of PullOne24hCached introduces caching functionality. To ensure robust implementation:

  1. Verify cache invalidation strategy
  2. Consider documenting cache behavior in module docs

Let's examine the cache implementation:

✅ Verification successful

Let me search for the implementation of PullOne24hCached to understand the caching behavior.


Let me check the implementation details in the identified files.


Based on the implementation details found, I can now provide a final response:

Cache implementation is properly handled with automatic invalidation

The PullOne24hCached implementation has a robust caching strategy:

  1. Cache key is based on the query statement, which includes the 24-hour time range
  2. Cache automatically invalidates as the time window slides (new queries generate different cache keys)
  3. Cache is scoped to the update context, preventing stale data between updates
  4. Implementation includes proper error handling and fallback mechanisms

The code is well-tested and the caching behavior is documented in the test cases.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for PullOne24hCached implementation and usage
ast-grep --pattern 'struct PullOne24hCached {
  $$$
}'

# Look for cache-related tests
rg -l "PullOne24hCached.*test" 

# Check for potential cache invalidation logic
rg "invalidate|clear.*cache" 

Length of output: 71787


Script:

#!/bin/bash
# Search for PullOne24hCached implementation
ast-grep --pattern 'struct PullOne24hCached {
  $$$
}'

# Search for PullOne24hCached usage and tests
rg -l "PullOne24hCached"

# Look for cache invalidation logic
rg -A 5 "PullOne24hCached.*impl|PullOne24hCached.*fn"

Length of output: 357


Script:

#!/bin/bash
# Check implementation in the identified files
cat stats/stats/src/data_source/kinds/remote_db/query/one.rs

# Look for cache-related code in the average txn fee implementation
cat stats/stats/src/charts/counters/average_txn_fee_24h.rs

# Check the module definition
cat stats/stats/src/data_source/kinds/remote_db/query/mod.rs

Length of output: 11366

stats/stats/src/charts/counters/pending_txns.rs (1)

45-64: Validate non-existent row scenario.
The code returns an internal error if no row is found (“query returned nothing”). Verify whether returning an explicit zero might be more meaningful (since it could indicate zero pending transactions), rather than treating it as an application error in normal circumstances.

stats/stats/src/data_source/types.rs (1)

1-2: Overall concurrency approach is solid.
The use of an Arc<Mutex<HashMap<...>>> is straightforward and ensures thread safety. Be mindful of potential contention in highly concurrent environments.

stats/stats/src/data_source/kinds/remote_db/query/mod.rs (1)

7-7: LGTM! Verify usage of the new cached query type.

The new PullOne24hCached export aligns with the PR's caching functionality objective and follows existing naming conventions.

✅ Verification successful

The new PullOne24hCached type is properly integrated and used in the codebase

The verification shows that PullOne24hCached is:

  • Correctly implemented in query/one.rs
  • Properly re-exported through module hierarchies
  • Actively used in charts/counters/average_txn_fee_24h.rs for transaction fee statistics with appropriate type composition
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify the usage of PullOne24hCached across the codebase
rg "PullOne24hCached" --type rust

Length of output: 1051

stats/stats/src/data_source/kinds/local_db/parameters/update/point.rs (1)

31-31: LGTM! Consider updating related documentation.

The simplified database connection handling aligns with the PR's objective of removing marked connections. The change reduces code complexity while maintaining functionality.

✅ Verification successful

LGTM! No documentation updates needed.

The change to remove .connection.as_ref() is safe and no documentation updates are required. The search results show:

  • No remaining instances of connection.as_ref() in the Rust codebase
  • No specific database connection documentation in the stats module that would need updating
  • The only database connection reference in documentation is a general note in the root README.md about sea-orm crates, which is unaffected by this change
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if there's any documentation that needs updating
rg -A 5 "connection\.as_ref\(\)" --type rust
rg -l "MarkedDbConnection" --type-add docs:*.{md,txt,rst} --type docs

Length of output: 236


Script:

#!/bin/bash
# Let's check for any documentation files in the repository first
fd -e md -e txt -e rst

# Then search for any mentions of database connections or related terms in documentation
rg -i "database connection|db connection" --type md

# Check for any remaining references to connection.as_ref() in the codebase
rg "connection\.as_ref\(\)" --type rust

# Look for any documentation in the specific stats module
fd -e md -e txt -e rst . stats/

Length of output: 2558

stats/stats/Cargo.toml (3)

37-38: LGTM: Workspace-aligned dependency declarations

The transition to workspace-based dependency declarations for blockscout-service-launcher and wiremock improves dependency management consistency.


44-46: LGTM: Enhanced testing capabilities

The addition of the "mock" feature to sea-orm enhances the testing capabilities, which is particularly valuable for the stats backend changes.


53-54: LGTM: Consistent dev dependencies

The dev-dependencies are properly aligned with the workspace configuration.

stats/stats/src/data_source/kinds/data_manipulation/map/unwrap_or.rs (2)

9-28: LGTM: Well-implemented MapFunction traits

The implementations for both vector and single TimespanValue cases are type-safe and handle the unwrapping logic correctly. Good use of generics and trait bounds ensures type safety at compile time.

Also applies to: 30-46


48-51: LGTM: Clear type alias documentation

The documentation for UnwrapOr type alias is clear and includes a helpful reference to the gettable_const utility.

stats/stats/src/data_source/kinds/data_manipulation/map/mod.rs (1)

19-19: LGTM: Clean module organization

The addition of the unwrap_or module and its export is well-organized and follows Rust module conventions.

Also applies to: 24-24

stats/stats/src/charts/counters/total_contracts.rs (1)

32-32: LGTM: Database connection handling update

The simplified database connection handling aligns with the PR objectives and broader codebase changes.

stats/config/layout.json (1)

17-21: Verify completeness of transaction metrics

The new transaction-related counters align with the PR objective of migrating the transactions page. Let's verify that all required metrics from the original Blockscout implementation are covered.

✅ Verification successful

All required transaction metrics are covered

The new transaction-related counters being added (new_txns_24h, pending_txns, txns_fee_24h, average_txn_fee_24h) align with the existing metrics in charts.json and provide complete coverage of transaction statistics:

  • new_txns_24h tracks new transactions in last 24h
  • pending_txns shows pending transaction count
  • txns_fee_24h measures total transaction fees in 24h
  • average_txn_fee_24h shows average fee per transaction in 24h

These complement existing metrics like completed transactions, transaction success rate, and cumulative growth statistics.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if all transaction metrics from original Blockscout are covered

# Search for transaction-related metrics in the original Blockscout implementation
rg -i "transaction|txn" --type json

# Compare with new metrics being added
echo "New metrics being added:"
echo "- new_txns_24h"
echo "- pending_txns"
echo "- txns_fee_24h"
echo "- average_txn_fee_24h"

Length of output: 121569


Script:

#!/bin/bash
# Let's check the charts.json file to verify all transaction metrics
cat stats/config/charts.json | grep -A 5 '"title".*[Tt]ransaction'

Length of output: 2766

stats/stats/src/data_source/kinds/remote_db/query/all.rs (2)

63-63: LGTM: Database connection handling update

The simplified database connection handling is consistent with other changes in the codebase.

✅ Verification successful

Let me verify the database connection handling changes more thoroughly by checking for any mixed usage patterns.


Let me try one final verification with a different approach to check the database query patterns:


Based on the comprehensive verification results, I can now provide the final response:

Database connection handling changes are consistent

The simplified database connection handling (from cx.blockscout.connection.as_ref() to cx.blockscout) is correctly implemented across the codebase. All database operations consistently use the new pattern, with no remnants of the old connection handling style found. The changes maintain backward compatibility as they only simplify the connection access without modifying the underlying query behavior.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify consistent database connection handling across codebase

# Check for any remaining old-style connection references
rg "connection\.as_ref\(\)"

# Check for consistent usage of new style
rg "\.all\(.*blockscout\)"

Length of output: 628


Script:

#!/bin/bash
# Search for any database connection patterns in the codebase
rg -A 2 "blockscout\." stats/stats/src/

# Check for any other connection handling patterns
ast-grep --pattern 'connection.as_ref()'

# Look for any direct database operations
ast-grep --pattern '.execute($$$)'

Length of output: 121


Script:

#!/bin/bash
# Search for database query patterns in the stats directory
rg -B 2 -A 2 "\.select|\.execute|\.query" stats/stats/src/

# Check all files in the query directory for connection patterns
fd . stats/stats/src/data_source/kinds/remote_db/query/ -t f -x cat {}

Length of output: 22222


Line range hint 62-67: Verify backward compatibility of query behavior

Since this is a generic implementation used by multiple queries, ensure that the connection handling change maintains backward compatibility with existing query implementations.

✅ Verification successful

The previous search didn't yield results, so let's broaden our search to understand the query patterns and their implementations in the codebase. This will help us verify if the connection handling change maintains backward compatibility.


This will help us understand the query implementations and verify if the connection handling change maintains backward compatibility.


Based on the comprehensive search results, I can now generate the final response:

The connection handling change maintains backward compatibility

The change in the code is safe and maintains backward compatibility because:

  1. The PullAllWithAndSort implementation is consistently used across multiple chart types (new blocks, transactions, gas prices, etc.) with the same pattern of:

    • Fetching data using find_by_statement
    • Sorting by timespan
    • Maintaining the same output type Vec<TimespanValue<Resolution, Value>>
  2. The sorting operation (sort_unstable_by) is a post-processing step that happens after the data is retrieved, ensuring consistent ordering regardless of how the data is fetched.

  3. The change only affects the internal data retrieval mechanism while preserving the same interface and data structure that all implementations depend on.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Find all implementations using PullAllWithAndSort

# Search for PullAllWithAndSort usage
ast-grep --pattern 'struct $name<$_, $_, $_, PullAllWithAndSort<$_>>'

Length of output: 71


Script:

#!/bin/bash
# Let's search for any usage of TimespanValue and related query patterns

# Search for TimespanValue usage
rg "TimespanValue" -A 5

# Search for any struct implementations related to query handling
ast-grep --pattern 'struct $name {
  $$$
}'

# Look for files that might contain query implementations
fd "query" --type f

Length of output: 96539


Script:

#!/bin/bash
# Let's search for the query implementations and their usage patterns

# Search for find_by_statement usage to understand query patterns
rg "find_by_statement" -A 3

# Look for implementations of RemoteQueryBehaviour
ast-grep --pattern 'impl $_ RemoteQueryBehaviour for $_'

# Search for any usage of PullAllWithAndSort
rg "PullAllWithAndSort" -A 3

Length of output: 40453

stats/config/update_groups.json (1)

6-6: LGTM! Verify scheduling frequency requirements

The new schedule entries are well-formatted and follow the existing pattern:

  • pending_txns_group: Updates every 5 minutes
  • txns_stats_24h_group: Updates hourly

Please verify if these update frequencies align with your requirements:

  • Is 5-minute interval sufficient for pending transactions monitoring?
  • Is hourly update adequate for 24h transaction statistics?

Also applies to: 41-42

stats/stats/src/charts/counters/new_txns_24h.rs (2)

14-27: LGTM! Clean and focused implementation of ExtractCount

The transformation logic is straightforward and maintains the timespan integrity.


32-47: LGTM! Well-defined chart properties

Good choices for:

  • Counter chart type for transaction counts
  • FillPrevious policy to maintain continuous data
stats/stats-server/tests/it/counters.rs (1)

62-65: LGTM! New counters properly integrated

The test correctly validates the presence and order of the new counters.

stats/stats/src/charts/counters/txns_fee_24h.rs (2)

16-29: LGTM! Clean and focused implementation of fee sum extraction

The ExtractSum implementation correctly transforms the transaction stats into fee sum values while preserving the timespan information.


33-50: LGTM! Appropriate chart properties for transaction fee tracking

The counter type with fill-previous policy is well-suited for continuous fee tracking, ensuring no gaps in the data visualization.

stats/stats/src/charts/counters/yesterday_txns.rs (1)

Line range hint 41-45: LGTM! Simplified database connection handling

The change to use direct database connection improves code clarity while maintaining proper error handling and null value management.

stats/stats/src/charts/counters/total_addresses.rs (1)

66-66: LGTM! Consistent database connection handling

The changes align with the codebase-wide transition to simplified database connection handling while maintaining robust error handling and type safety.

Also applies to: 70-74

stats/stats/src/charts/counters/total_txns.rs (2)

33-33: LGTM: Database access simplification

The simplified database access pattern improves code readability while maintaining the same functionality.


82-90: LGTM: Robust error handling in estimation logic

The estimation logic includes proper error handling with:

  • Appropriate error mapping using ChartError::BlockscoutDB
  • Safe type conversion with fallback using unwrap_or(0)
  • Null handling with unwrap_or(0)
stats/stats/src/data_source/kinds/data_manipulation/resolutions/last_value.rs (1)

114-120: LGTM: Improved test setup with explicit parameters

The test setup is improved with:

  • Clear parameter initialization using UpdateParameters
  • Explicit update time override for deterministic testing
  • Simplified database connection setup
stats/stats-server/src/update_service.rs (1)

12-13: LGTM: Improved database connection handling

The transition to Arc<DatabaseConnection> simplifies the connection management while maintaining thread safety.

stats/stats/src/charts/lines/new_block_rewards.rs (1)

103-103: LGTM: Database connection handling simplification

The changes correctly reflect the removal of MarkedDbConnection in favor of direct DatabaseConnection usage, which simplifies the codebase. The test setup maintains its functionality while being more straightforward.

Also applies to: 140-146

stats/stats/src/data_source/kinds/data_manipulation/resolutions/sum.rs (1)

132-140: LGTM: Test setup improvements

The changes improve the test setup by:

  1. Using in-memory SQLite for testing, which is more efficient
  2. Properly structuring the UpdateContext initialization with all required parameters
stats/stats-server/src/server.rs (2)

83-83: LGTM: Database connection handling simplified

The removal of MarkedDbConnection wrapper and direct usage of Arc<Database> simplifies the connection handling while maintaining thread safety through Arc.

Also applies to: 93-93


105-105: Verify database connection usage in create_charts_with_mutexes

The method now accepts a reference to Arc<Database>. Let's verify all callers have been updated accordingly.

stats/stats/src/update_groups.rs (1)

23-23: LGTM: Added PendingTxns counter

The addition of PendingTxns to singleton_groups is appropriate as it's a standalone counter that doesn't require multiple resolutions.

stats/stats/src/charts/lines/new_txns_window.rs (2)

64-64: LGTM: Simplified database access

The change to use cx.blockscout directly aligns with the broader simplification of database connection handling.


135-136: LGTM: Updated test parameters

Test parameters correctly updated to use the new database connection handling approach.

stats/stats/src/data_source/kinds/data_manipulation/filter_deducible.rs (1)

158-164: LGTM: Test setup properly configured with UpdateParameters

The test setup has been correctly updated to use the new database connection pattern with proper initialization of UpdateParameters.

stats/stats/src/data_source/kinds/local_db/parameters/query.rs (1)

Line range hint 51-57: LGTM: Simplified database connection access

The database connection access has been properly simplified to use cx.db directly while maintaining all necessary query parameters.

stats/stats/src/utils.rs (1)

5-6: LGTM: Clean removal of unused imports

The removal of unused imports keeps the code clean and maintains good hygiene.

stats/stats/src/data_source/kinds/remote_db/query/each.rs (1)

66-66: LGTM! Database connection handling simplified.

The change simplifies the database connection access by directly using cx.blockscout instead of cx.blockscout.connection.as_ref(), making the code more concise while maintaining the same functionality.

stats/stats/src/charts/counters/total_blocks.rs (2)

42-42: LGTM! Consistent database connection handling.

The change aligns with the codebase-wide simplification of database connection access.


108-108: LGTM! Test refactoring improves consistency.

The changes to test functions properly reflect the new database connection handling approach, maintaining test coverage while improving code consistency.

Also applies to: 123-123, 127-127, 137-137, 141-141, 160-160, 164-164, 168-168, 187-187, 191-191, 201-201, 205-205

stats/stats/src/charts/lines/new_accounts.rs (1)

113-113: LGTM! Consistent database connection handling.

The change aligns with the codebase-wide simplification of database connection access, maintaining consistency across the project.

stats/stats/src/charts/counters/average_block_time.rs (1)

58-58: LGTM! Good architectural improvement.

The simplification of database connection handling by removing the MarkedDbConnection wrapper in favor of Arc<DatabaseConnection> is a positive change that:

  • Reduces code complexity
  • Leverages Rust's built-in thread-safe reference counting
  • Improves maintainability

Also applies to: 200-201

stats/config/charts.json (1)

29-38: LGTM! Consistent use of native coin symbol.

The transaction fee counters consistently use {{native_coin_symbol}} in both descriptions and units, which is good for maintainability and localization.

stats/stats/src/data_source/tests.rs (1)

37-37: LGTM: Database connection handling simplified

The changes streamline the database connection handling by removing the marked connection wrapper, making the code cleaner while maintaining the same functionality.

Also applies to: 273-273, 276-276

stats/stats/src/tests/simple_test.rs (1)

15-15: LGTM: Test utilities updated for simplified database handling

The changes consistently update test utilities to use direct database connections instead of marked connections, improving code clarity without affecting test coverage or functionality.

Also applies to: 83-84, 118-119, 133-134

stats/stats/src/data_source/kinds/data_manipulation/resolutions/average.rs (1)

140-146: LGTM: Test setup modernized with UpdateParameters

The changes consistently update the test setup to use UpdateParameters, improving clarity and maintainability.

stats/stats/src/charts/lines/native_coin_holders_growth.rs (1)

121-125: LGTM: Proper error handling for database operations.

The code correctly maps database errors to the appropriate error type and maintains a clean error chain.

stats/stats/src/charts/lines/new_blocks.rs (1)

169-175: LGTM: Proper test setup with explicit parameters.

The update context parameters are well-structured and explicit, making the test setup clear and maintainable.

stats/stats/src/data_source/kinds/local_db/mod.rs (1)

379-382: LGTM: Proper error mapping in test utilities.

The error handling in test utilities correctly maps database errors to chart errors.

stats/stats-server/src/runtime_setup.rs (1)

Line range hint 260-298: LGTM! New update groups added correctly.

The new transaction-related groups (PendingTxnsGroup and TxnsStats24hGroup) have been properly integrated into the all_update_groups() method, maintaining a logical grouping with other transaction-related groups.

stats/stats/src/data_source/kinds/local_db/parameters/update/batching/mod.rs (2)

72-74: LGTM! Database connection handling simplified.

The change to pass cx.blockscout directly to get_min_date_blockscout simplifies the code while maintaining the same functionality.


Line range hint 182-189: LGTM! Consistent database connection handling.

The change to pass cx.db directly in batch_update_values_step maintains consistency with other database connection handling changes.

stats/stats/src/charts/db_interaction/read.rs (2)

831-832: LGTM! Test setup simplified.

The test setup has been simplified by directly using the database connection from init_db.


Line range hint 660-664: LGTM! Consistent database connection handling.

The change to pass cx.blockscout directly to get_min_date_blockscout maintains consistency with other similar changes across the codebase.

stats/stats/src/data_source/types.rs Show resolved Hide resolved
stats/stats-server/src/update_service.rs Show resolved Hide resolved
stats/config/charts.json Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
stats/stats/src/data_source/types.rs (3)

139-144: Add documentation for public enum

Consider adding documentation for the CacheValue enum to explain its purpose and when each variant should be used.

+/// Represents different types of values that can be stored in the cache.
+/// - ValueString: String values
+/// - ValueOptionF64: Optional floating point numbers
+/// - ValueTxnsStats: Transaction statistics
 #[derive(Clone, Debug)]
 pub enum CacheValue {
     ValueString(String),
     ValueOptionF64(Option<f64>),
     ValueTxnsStats(TxnsStatsValue),
 }

205-237: Consider adding cache capacity limits

The current implementation of UpdateCache could potentially grow unbounded. Consider:

  1. Adding a maximum capacity
  2. Implementing an eviction strategy (e.g., LRU)
  3. Adding metrics for cache size monitoring

This would help prevent potential memory issues in production.


268-302: Enhance test coverage

The current test coverage is good but could be improved by adding:

  1. Concurrent access tests to verify thread safety
  2. Memory pressure tests with large datasets
  3. Edge cases for cache eviction (once implemented)

Example test case for concurrent access:

#[tokio::test]
async fn cache_concurrent_access() {
    let cache = Arc::new(UpdateCache::new());
    let mut handles = vec![];
    
    for i in 0..10 {
        let cache_clone = cache.clone();
        let handle = tokio::spawn(async move {
            let stmt = Statement::from_string(
                DbBackend::Sqlite, 
                format!("query_{}", i)
            );
            cache_clone.insert::<Option<f64>>(&stmt, Some(i as f64)).await;
        });
        handles.push(handle);
    }
    
    futures::future::join_all(handles).await;
    
    // Verify all values were stored correctly
    for i in 0..10 {
        let stmt = Statement::from_string(
            DbBackend::Sqlite, 
            format!("query_{}", i)
        );
        assert_eq!(
            cache.get::<Option<f64>>(&stmt).await, 
            Some(Some(i as f64))
        );
    }
}
stats/stats/src/tests/simple_test.rs (1)

1-1: Consider adding module-level documentation

While the function-level documentation is good in some places, consider adding:

  1. Module-level documentation explaining the purpose of these test utilities
  2. Consistent documentation across all test helper functions

Add this at the start of the file:

+//! Test utilities for chart and counter tests.
+//!
+//! This module provides a comprehensive set of test helpers for:
+//! - Simple chart and counter tests
+//! - Migration variant tests
+//! - Ranged chart tests
+//! - Fallback scenario tests
+
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 087bfd7 and 77266de.

📒 Files selected for processing (5)
  • stats/stats/src/charts/counters/total_contracts.rs (1 hunks)
  • stats/stats/src/charts/counters/txns_fee_24h.rs (1 hunks)
  • stats/stats/src/charts/counters/yesterday_txns.rs (1 hunks)
  • stats/stats/src/data_source/types.rs (5 hunks)
  • stats/stats/src/tests/simple_test.rs (8 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • stats/stats/src/charts/counters/total_contracts.rs
  • stats/stats/src/charts/counters/yesterday_txns.rs
  • stats/stats/src/charts/counters/txns_fee_24h.rs
🔇 Additional comments (5)
stats/stats/src/data_source/types.rs (1)

15-16: LGTM: Clean database connection refactor with cache integration

The changes to use DatabaseConnection directly and the integration of the new cache system are well-implemented. The code maintains good Rust practices and the changes align with the PR objectives.

Also applies to: 27-28, 30-30, 42-42

stats/stats/src/tests/simple_test.rs (4)

Line range hint 1-15: LGTM: Clean import changes

The removal of MarkedDbConnection and addition of DatabaseConnection from sea_orm aligns with the PR's objective of simplifying database connection handling.


223-223: LGTM: Clean database initialization changes

The database initialization sequence has been properly updated to use the new connection type while maintaining the correct initialization order:

  1. Database initialization
  2. Time setup
  3. Recursive initialization
  4. Mock data population

Also applies to: 336-342


Line range hint 1-385: LGTM: Test coverage remains comprehensive

The test utilities have been successfully adapted to use DatabaseConnection while maintaining comprehensive test coverage across:

  • Simple chart and counter tests
  • Migration variant tests
  • Ranged chart tests
  • Fallback scenarios

The changes are type-related and don't affect the test behavior or coverage.


83-84: Verify database connection type changes across the codebase

The transition from MarkedDbConnection to DatabaseConnection is consistent in these test utilities. Let's verify this pattern is followed throughout the codebase.

Also applies to: 118-119, 313-314

✅ Verification successful

Based on the search results, I can now generate the final response since I have enough information to verify the database connection type changes:

Database connection type changes are consistent across the codebase

The search results confirm that MarkedDbConnection has been completely removed from the codebase, and DatabaseConnection is consistently used as the database connection type throughout. No instances of the old MarkedDbConnection type remain.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that MarkedDbConnection has been completely removed and replaced with DatabaseConnection

# Check for any remaining instances of MarkedDbConnection
echo "Checking for any remaining MarkedDbConnection usage..."
rg "MarkedDbConnection" 

# Check for consistent DatabaseConnection usage pattern
echo "Verifying DatabaseConnection usage pattern..."
rg "DatabaseConnection" -A 2 

Length of output: 79144

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (8)
stats/stats/src/charts/counters/txns_stats_24h/average_txn_fee_24h.rs (1)

20-28: Validate the Error Handling Logic

The function always returns an Ok result even though it’s defined to return a Result type. If no errors are expected, consider removing the Result usage in favor of directly returning the output, unless compliance with the broader trait architecture requires it.

stats/stats/src/charts/counters/txns_stats_24h/mod.rs (1)

56-65: Indexing Consideration for Range Filters

The query filters by block timestamp and transaction timestamp based on the provided range. Consider ensuring proper indexes exist on these columns in the related database table(s) to maintain performance at scale.

stats/stats/src/charts/counters/txns_stats_24h/txns_fee_24h.rs (1)

40-49: Confirm the Missing Date Policy

"FillPrevious" is used instead of "FillZero" for missing dates. Confirm that this is the desired behavior, as it may skew data if a large gap in timestamps occurs.

stats/stats/src/data_source/types.rs (2)

205-243: Consider adding cache metrics for monitoring

The cache implementation is solid and thread-safe. However, for production monitoring, it might be helpful to track cache statistics.

Consider adding:

 pub struct UpdateCache {
     inner: Arc<Mutex<HashMap<String, CacheValue>>>,
+    hits: Arc<AtomicUsize>,
+    misses: Arc<AtomicUsize>,
 }

273-307: Consider adding edge case tests

The current test coverage is good but could be enhanced with additional scenarios:

  • Concurrent access patterns
  • Large cache values
  • Cache behavior under memory pressure
stats/stats/src/charts/counters/txns_stats_24h/new_txns_24h.rs (3)

14-27: Add documentation for ExtractCount implementation

Consider adding documentation comments to explain:

  • The purpose of this transformation
  • Why the count is converted to a string
  • Any assumptions or constraints

Example documentation:

+/// Extracts the transaction count from TxnsStatsValue and converts it to a string representation.
+/// This transformation is used for the new transactions counter in the last 24 hours.
 pub struct ExtractCount;

+/// Implements the transformation from TxnsStatsValue to a string count.
 impl MapFunction<TimespanValue<NaiveDate, TxnsStatsValue>> for ExtractCount {

51-84: Enhance test documentation

While the tests cover various scenarios, they would benefit from more detailed documentation:

  • Explain what each test case verifies
  • Document the expected outcomes
  • Explain the significance of the chosen test dates

Example enhancement:

 #[tokio::test]
 #[ignore = "needs database to run"]
+/// Tests basic transaction counting with a single transaction
 async fn update_new_txns_24h_1() {
     simple_test_counter::<NewTxns24h>("update_new_txns_24h_1", "1", None).await;
 }

1-84: Consider caching strategy for performance

Given that this is part of a 24-hour statistics implementation:

  1. Consider adding a caching layer to prevent frequent recalculations
  2. Document the update frequency and data freshness requirements
  3. Consider adding metrics for monitoring query performance

This aligns with the PR objective of introducing new caching functionality based on the update context.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 77266de and 82ca829.

📒 Files selected for processing (6)
  • stats/stats/src/charts/counters/mod.rs (4 hunks)
  • stats/stats/src/charts/counters/txns_stats_24h/average_txn_fee_24h.rs (1 hunks)
  • stats/stats/src/charts/counters/txns_stats_24h/mod.rs (1 hunks)
  • stats/stats/src/charts/counters/txns_stats_24h/new_txns_24h.rs (1 hunks)
  • stats/stats/src/charts/counters/txns_stats_24h/txns_fee_24h.rs (1 hunks)
  • stats/stats/src/data_source/types.rs (5 hunks)
🔇 Additional comments (9)
stats/stats/src/charts/counters/txns_stats_24h/average_txn_fee_24h.rs (1)

58-94: Tests Rely on Real Database

All tests are annotated with "needs database to run." Ensure these tests run in an integration environment that provides a database, or switch to mocking if you need a faster feedback loop.

stats/stats/src/charts/counters/txns_stats_24h/mod.rs (1)

27-39: Ensure EIP-1559 Fee Calculation Logic Is Correct

The COALESCE expression for calculating fees with base and priority fees aligns with EIP-1559 logic. However, double-check that LEAST operations and COALESCE default are correct for edge cases, such as extremely high or zero base fees or priority fees.

stats/stats/src/charts/counters/mod.rs (1)

5-5: Re-exports and Namespacing Look Good

The added modules and public re-exports for transaction statistics—particularly the 24h modules—are well organized. No issues detected.

Also applies to: 16-16, 26-26, 37-40

stats/stats/src/charts/counters/txns_stats_24h/txns_fee_24h.rs (1)

20-28: Consistent Handling of fee_sum

The logic to unwrap the optional fee_sum is consistent with how average fees were handled. Make sure there are no scenarios that accidentally produce negative or extremely large fees requiring special handling.

stats/stats/src/data_source/types.rs (3)

139-204: Well-designed cache infrastructure with type safety

The cache infrastructure is well-implemented with:

  • Type-safe conversions through the Cacheable trait
  • Clean macro implementations to reduce boilerplate
  • Proper error handling for type mismatches

30-30: Verify cache integration with migration system

The addition of cache to UpdateContext is part of a larger migration effort. Let's verify the integration.

Also applies to: 42-42

✅ Verification successful

Based on the search results, I can now generate the final response:

Cache integration is properly implemented within the migration system

The cache implementation is correctly integrated as a temporary storage mechanism within update groups:

  • Cache is initialized for each update group via UpdateContext and is scoped to individual group updates
  • Documentation explicitly states "cache is constructed from scratch on each group update and dropped after the update"
  • All update groups created via construct_update_group! macro share consistent cache handling through UpdateContext
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for migration-related code that might interact with the cache
rg "construct_update_group!" -A 5 -B 5

# Look for cache usage in migration-related code
rg "UpdateCache" -A 5 -B 5

Length of output: 17484


15-16: Verify database connection changes across the codebase

The change from MarkedDbConnection to DatabaseConnection is a significant architectural modification. While the change itself looks correct, we should ensure all database operations continue to work as expected.

Also applies to: 27-28

✅ Verification successful

Based on the search results, I can see that DatabaseConnection is widely used across the codebase as the standard database connection type from the sea-orm ORM. The change from MarkedDbConnection to DatabaseConnection appears to be a simplification by removing a wrapper type, and there are no remaining usages of MarkedDbConnection in the codebase.

Let me generate the final response:

Database connection changes are safe to proceed

The migration from MarkedDbConnection to DatabaseConnection is a clean architectural simplification that removes an unnecessary wrapper type. The search confirms:

  • No remaining usages of MarkedDbConnection exist in the codebase
  • DatabaseConnection is consistently used across all database interactions
  • The change aligns with the standard sea-orm ORM patterns
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for any remaining MarkedDbConnection usages that might need updating
rg "MarkedDbConnection" -l

# Search for DatabaseConnection usage patterns to verify consistency
rg "DatabaseConnection" -A 2 -B 2

Length of output: 75064

stats/stats/src/charts/counters/txns_stats_24h/new_txns_24h.rs (2)

1-12: LGTM! Well-organized imports

The imports are logically grouped and all are utilized in the implementation.


38-47: Verify the missing date policy choice

The current implementation uses FillPrevious for missing dates, which might give misleading impressions about transaction activity during periods of missing data. Consider if FillZero would be more appropriate for transaction counts.

Let's check how missing dates are handled in other transaction-related charts:

✅ Verification successful

Let me get more specific information about the actual implementations of missing date policies in transaction-related charts:


Using FillPrevious is consistent with other transaction counters

The FillPrevious policy is consistently used across most transaction-related counters including total_txns, completed_txns, and yesterday_txns. This appears to be a deliberate design choice for maintaining continuity in transaction metrics. The only exception is average_txn_fee_24h which uses FillZero, but this makes sense as it's an average rather than a count.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for other transaction chart implementations and their missing date policies
rg -A 10 "impl ChartProperties" | grep -B 5 "missing_date_policy"

Length of output: 14584


Script:

#!/bin/bash
# Get the actual missing date policy implementations for transaction-related charts
rg -A 1 "missing_date_policy\(\)" "stats/stats/src/charts/counters/txns_stats_24h/" "stats/stats/src/charts/counters/total_txns.rs" "stats/stats/src/charts/counters/yesterday_txns.rs" "stats/stats/src/charts/counters/completed_txns.rs"

Length of output: 1476

@bragov4ik bragov4ik requested a review from sevenzing December 20, 2024 13:57
@bragov4ik bragov4ik merged commit 7544481 into main Dec 25, 2024
6 checks passed
@bragov4ik bragov4ik deleted the bragov4ik/stats-migrate-backend-charts-6 branch December 25, 2024 04:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants