Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IGNITE-23601 Create broadcast partitioned compute method #4871

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

valepakh
Copy link
Contributor

https://issues.apache.org/jira/browse/IGNITE-23601

This adds new compute API methods for executing jobs based on the table partitioning. These methods take a table name and submit jobs on nodes when the primary replica for the specified table is located, passing the list of partition indices on this node via the JobExecutionContext.

@valepakh valepakh changed the title IGNITE-23601 Create brodcast partioned compute method IGNITE-23601 Create broadcast partitioned compute method Dec 11, 2024
*
* @return list of partitions numbers associated with this job.
*/
@Nullable List<Integer> partitions();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
@Nullable List<Integer> partitions();
@Nullable List<org.apache.ignite.table.partition.Partition> partitions();

Should be compatible with PartitionManager API.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed the type to Partition

) {
Objects.requireNonNull(tableName);
Objects.requireNonNull(descriptor);
return failedFuture(new UnsupportedOperationException("Not implemented"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please create a ticket and add TODO for client impl. Ideally also tickets for other clients (.NET, C++).

@@ -244,6 +244,98 @@ default <T, R> Map<ClusterNode, R> executeBroadcast(
return map;
}

/**
* Submits a {@link ComputeJob} of the given class for an execution on nodes where the primary replicas of a partitions of the specified
* table are located. The partition indices are passed to the job in the {@link JobExecutionContext#partitions()}.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any "partition pinning"? What if the partition is moved to another node while the job is executing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use "best-effort" approach. If partition is moved, then it will not be local, that's it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the use case? What is the difference from a regular broadcast call, apart from access to the partition list?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically it's a shorthand version of the broadcast with manually getting the table, getting the partition list and filtering it to get the partitions on a local node.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then, in my opinion, we don't need those new broadcast methods. Instead let's simplify "get local partitions for a table" scenario by extending PartitionManager API.

Thoughts?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants