Presto Queue Length Based Routing Manager #81

ssanthanam185 · 2019-12-10T06:52:56Z

Introduced listeners for the Active Cluster Monitor. The HealthChecker and the routing table are listeners to the stats collected from the Active Cluster Monitor.
The PrestoQueueLengthRoutingTable is an extension of the HARoutingManager that gets the list of all active backends along with their queue depth. It assigns a weight to each cluster relative to the most queued-up cluster. The most queued-up cluster's weight is adjusted so that it handles the following cases:
a. There is a huge difference between the least and most queued-up clusters, in which case the most queued-up cluster is given a negligible weight so that it gets almost no new requests.
b. The difference between the least and most queued-up clusters is not huge, in which case the most queued-up cluster is given a small-ish weight so that it gets only a few requests, in favor of the other, healthier clusters.
c. All clusters have equal load or zero load, at which point it's an even distribution.
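
The three cases can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class name, the constants (`MAX_WEIGHT`, `SMALL_WEIGHT`, `NEGLIGIBLE_WEIGHT`, `HUGE_GAP_FACTOR`) and the definition of "huge" are all assumptions picked to make the cases concrete.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class QueueWeightSketch {
  // Hypothetical constants, not taken from the PR.
  static final int MAX_WEIGHT = 100;
  static final int SMALL_WEIGHT = 10;      // case b: "small-ish" weight
  static final int NEGLIGIBLE_WEIGHT = 1;  // case a: almost no new requests
  static final int HUGE_GAP_FACTOR = 5;    // assumed: "huge" means max >= 5x min

  /** Maps clusterId -> weight, given clusterId -> queued query count. */
  static Map<String, Integer> computeWeights(Map<String, Integer> queueDepths) {
    Map<String, Integer> weights = new HashMap<>();
    if (queueDepths.isEmpty()) {
      return weights;
    }
    int max = Collections.max(queueDepths.values());
    int min = Collections.min(queueDepths.values());

    // Case c: equal (possibly zero) load everywhere, so distribute evenly.
    if (max == min) {
      for (String cluster : queueDepths.keySet()) {
        weights.put(cluster, MAX_WEIGHT);
      }
      return weights;
    }

    // Cases a and b: choose the weight of the most queued-up cluster by gap size.
    boolean hugeGap = max >= (long) HUGE_GAP_FACTOR * Math.max(min, 1);
    int maxClusterWeight = hugeGap ? NEGLIGIBLE_WEIGHT : SMALL_WEIGHT;

    for (Map.Entry<String, Integer> e : queueDepths.entrySet()) {
      int depth = e.getValue();
      if (depth == max) {
        weights.put(e.getKey(), maxClusterWeight);
      } else {
        // Weight relative to the most queued-up cluster: emptier means heavier.
        int relative = (int) ((long) MAX_WEIGHT * (max - depth) / max);
        // Never rank a less loaded cluster below the most queued-up one.
        weights.put(e.getKey(), Math.max(relative, maxClusterWeight));
      }
    }
    return weights;
  }
}
```

With these assumed constants, queue depths of 10 and 100 fall under case a (weights 90 and 1), while 80 and 100 fall under case b (weights 20 and 10).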

With this refactor, users who want to use the ActiveClusterMonitor managed app must also add the ClusterStateListenerModule module to gateway-ha-config.yml.

ssanthanam185 · 2019-12-10T07:07:31Z

gateway-ha/src/main/java/com/lyft/data/gateway/ha/clustermonitor/ActiveClusterMonitor.java

-  @Inject private Notifier emailNotifier;
-  @Inject private GatewayBackendManager gatewayBackendManager;
+  @Inject
+  private List<PrestoClusterStatsObserver> clusterStatsObservers;


Introducing listeners, which can react to the stats collected from the Active Cluster Monitor.
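
For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of a monitor fanning stats out to a list of observers. Only the name `PrestoClusterStatsObserver` and the injected `List` come from the diff; the `observe` method name, the `Stats` fields and the `Monitor` class are stand-ins assumed for illustration.

```java
import java.util.List;

final class ObserverSketch {
  /** Hypothetical, trimmed-down stand-in for the per-cluster stats. */
  static final class Stats {
    final String clusterId;
    final int queuedQueries;
    final int numWorkerNodes;

    Stats(String clusterId, int queuedQueries, int numWorkerNodes) {
      this.clusterId = clusterId;
      this.queuedQueries = queuedQueries;
      this.numWorkerNodes = numWorkerNodes;
    }
  }

  /** Stand-in for PrestoClusterStatsObserver; the method name is an assumption. */
  interface StatsObserver {
    void observe(List<Stats> stats);
  }

  /** Stand-in for the monitor: it only knows the observer list, not what each one does. */
  static final class Monitor {
    private final List<StatsObserver> observers;

    Monitor(List<StatsObserver> observers) {
      this.observers = observers;
    }

    /** Called after each polling round with the freshly collected stats. */
    void publish(List<Stats> stats) {
      for (StatsObserver observer : observers) {
        observer.observe(stats);
      }
    }
  }
}
```

A health checker and a routing table would then each implement the observer interface, so the monitor no longer needs direct fields for the notifier or the backend manager.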

bearcage

This looks good to me!

I have a few nits, and a couple questions about how generating the sampling distribution works, but nothing that should block you from proceeding when you're ready.

bearcage · 2019-12-10T15:33:30Z

gateway-ha/src/main/java/com/lyft/data/gateway/ha/clustermonitor/HealthChecker.java

+        }
+        if (clusterStats.getNumWorkerNodes() < 1) {
+          notifyForNoWorkers(clusterStats);
+        }


I know it's a lot of boilerplate, but do you think we should split up the healthchecks to different observer classes?

We could definitely do that. We also want to handle health checks for different routingGroups differently. I'll add that as a follow-up PR.

bearcage · 2019-12-10T15:39:07Z

gateway-ha/src/main/java/com/lyft/data/gateway/ha/clustermonitor/PrestoQueueLengthChecker.java

+
+    for (ClusterStats stat : stats) {
+      if (!clusterQueueMap.containsKey(stat.getRoutingGroup())) {
+        clusterQueueMap.put(stat.getRoutingGroup(), new HashMap<String, Integer>() {


TIL you can do initializers like this in Java!

It might be slightly simpler to use Map.of if we're on Java 9+.
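
One caveat worth noting: `Map.of` returns an unmodifiable map, so it only fits if the inner map is never updated afterwards. If the loop goes on to add more clusters to the same routing group (an assumption, since the hunk is cut off here), `computeIfAbsent` is another option that avoids both the `containsKey` check and the anonymous subclass. A minimal sketch, with a hypothetical `Stat` class standing in for `ClusterStats`:

```java
import java.util.HashMap;
import java.util.Map;

final class GroupQueueMapSketch {
  /** Hypothetical, trimmed-down stand-in for ClusterStats. */
  static final class Stat {
    final String routingGroup;
    final String clusterId;
    final int queuedQueries;

    Stat(String routingGroup, String clusterId, int queuedQueries) {
      this.routingGroup = routingGroup;
      this.clusterId = clusterId;
      this.queuedQueries = queuedQueries;
    }
  }

  /** Builds routingGroup -> (clusterId -> queued query count). */
  static Map<String, Map<String, Integer>> groupByRoutingGroup(Iterable<Stat> stats) {
    Map<String, Map<String, Integer>> clusterQueueMap = new HashMap<>();
    for (Stat stat : stats) {
      // Creates the mutable inner map on first sight of a group, then reuses it.
      clusterQueueMap
          .computeIfAbsent(stat.routingGroup, group -> new HashMap<>())
          .put(stat.clusterId, stat.queuedQueries);
    }
    return clusterQueueMap;
  }
}
```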

bearcage · 2019-12-10T15:39:52Z

gateway-ha/src/main/java/com/lyft/data/gateway/ha/module/HaGatewayProviderModule.java

@@ -89,4 +89,4 @@ public RoutingManager getRoutingManager() {
  public JdbcConnectionManager getConnectionManager() {
    return this.connectionManager;
  }
-}
+}


nit: Missing newline at EOF

bearcage · 2019-12-10T15:47:34Z

gateway-ha/src/main/java/com/lyft/data/gateway/ha/router/PrestoQueueLengthRoutingTable.java

+
+
+  /**
+   * Performs routing to an adhoc backend based compute weights base don cluster queue depth.


typo: base don -> based on

bearcage · 2019-12-10T15:47:53Z

gateway-ha/src/main/java/com/lyft/data/gateway/ha/router/PrestoQueueLengthRoutingTable.java

+  /**
+   * Performs routing to an adhoc backend based compute weights base don cluster queue depth.
+   *
+   * <p>d.


bearcage · 2019-12-10T16:18:33Z

gateway-ha/src/main/java/com/lyft/data/gateway/ha/router/PrestoQueueLengthRoutingTable.java

+         *  provisioned or not.
+         */
+
+        if (maxQueueLn == 0) {


I'm not sure I understand what each branch of this conditional is doing — it reads to me like this is trying to set a max weight such that all of the weights will sum to a particular value, and establish a discontinuous jump in the weight distribution at the last-but-one-th element?

I think it might be simpler to reduce the weighting to a linear interpolation — pseudocode, but:

given queueSum.
for each cluster
    fractionOfQueueSum = cluster.weight() / queueSum
    alpha = 1 - fractionOfQueueSum
    weight = alpha * MAX_WT
    bucket_count = floor(NUM_BUCKETS * weight)
    create_n_sampling_buckets(cluster.name, bucket_count)

If you're concerned about the bucket count not exactly matching up, we can either make the routing function do its modular math in terms of weightedDistributionRouting.size(), or pad the map with an extra bucket pointing to the least-queued cluster, slightly skewing the distribution.

I believe this is basically what's already happening down on L156-157, but I think perhaps we don't need to do any of this conditional special-casing for small routing groups. If we assign each cluster a simple numeric weight (inversely proportional to its queue depth as a fraction of the whole) it should accomplish the same thing, right?
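
To make the suggestion concrete, here is a rough Java rendering of the pseudocode above together with routing done modulo the table size. It is a sketch under assumptions: `cluster.weight()` is read as the cluster's queue depth, `MAX_WT` is taken as 1.0, `NUM_BUCKETS` as 100, and the fallback for an empty table (all queues empty, or a single cluster) is an addition that the pseudocode does not cover.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class LinearBucketSketch {
  static final int NUM_BUCKETS = 100; // hypothetical table scale
  static final double MAX_WT = 1.0;   // hypothetical maximum weight

  /** Returns a sampling table in which less queued clusters occupy more buckets. */
  static List<String> buildBuckets(Map<String, Integer> queueDepths) {
    long queueSum = 0;
    for (int depth : queueDepths.values()) {
      queueSum += depth;
    }
    List<String> buckets = new ArrayList<>();
    if (queueSum > 0) {
      for (Map.Entry<String, Integer> e : queueDepths.entrySet()) {
        double fractionOfQueueSum = (double) e.getValue() / queueSum;
        double weight = (1.0 - fractionOfQueueSum) * MAX_WT;
        int bucketCount = (int) Math.floor(NUM_BUCKETS * weight);
        for (int i = 0; i < bucketCount; i++) {
          buckets.add(e.getKey());
        }
      }
    }
    // Assumed fallback: with no usable weights, give every cluster one bucket.
    if (buckets.isEmpty()) {
      buckets.addAll(queueDepths.keySet());
    }
    return buckets;
  }

  /** Picks a bucket modulo the actual table size; the table must be non-empty. */
  static String route(List<String> buckets, int sample) {
    return buckets.get(Math.floorMod(sample, buckets.size()));
  }
}
```

Because the per-cluster alphas sum to the number of clusters minus one, the table size varies with the group size, which is why routing here indexes modulo `buckets.size()` rather than a fixed `NUM_BUCKETS`.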

Alex, my initial approach was as you mentioned: the weight would be the inverse of the queue depth. As I ran my tests, I realized that the distribution of queries (especially when the query volume is low) didn't really work well at excluding the most queued-up cluster, even when the difference between the least and most queued-up clusters was significant.

Just to clarify, the computed weights potentially change every minute (as often as the Active Cluster Monitor runs), and since the gateway is HA with a fleet of proxy hosts, each proxy host would get a small number of queries.

The design keeps each proxy host stateless and agnostic of how the query distribution is happening at a global level.

To account for the above, I wanted to tweak the weight given to the most queued-up cluster. The idea is to assign a negligible weight to the most queued-up cluster when other, less utilized clusters are available, almost removing it from the mix but not entirely.

@bearcage I experimented with the suggestion to use the square of the inverse (computed weight), but that does not work well for cases with a somewhat even distribution. I've tried to simplify the math to a large extent to make the code more readable. However, I still need to handle some edge cases specially.

bkyryliuk · 2022-06-03T02:04:59Z

@ssanthanam185 could you share a sample configuration for using the Queue Length Based Routing Manager?

Chaho12 · 2023-07-28T08:28:54Z

@puneetjaiswal would be great if there was a sample guide for this feature!

okayhooni · 2023-08-12T10:01:26Z

@Chaho12

Hello, I wrote a pull request to make this router function available, after understanding the structure of the codebase.

Activate queue based router as default & Consider backend with no worker as unhealthy #208

(I enjoyed your DEVIEW presentation about Trino, Thank you..!)

ssanthanam185 added 4 commits December 9, 2019 22:59

First Draft : Weighted Dstribution Routing

223dc8c

Test cases & checkstyle

66b3a2e

Undo provider module/'

0cae4b1

rebasing changes from master

681db56

ssanthanam185 force-pushed the MakeRoutingManagerAModule branch from 145542e to 681db56 Compare December 10, 2019 07:06

ssanthanam185 commented Dec 10, 2019

ssanthanam185 requested a review from bearcage December 10, 2019 07:14

ssanthanam185 changed the title ~~Make routing manager a module~~ Presto Queue Length Based Routing Manager Dec 10, 2019

bearcage previously approved these changes Dec 10, 2019

Simplifying routing Math, Bumping up Artifact version

6b6d6eb

ssanthanam185 dismissed bearcage’s stale review via 6b6d6eb December 11, 2019 20:34

ssanthanam185 added 3 commits December 11, 2019 12:53

Missed Bumping up version in modules

1efaa51

Not Handling MAth ceil double correctly

f8d804f

Eeks, fixing checkstyle

7c7b099

nishantrayan approved these changes Dec 11, 2019

ssanthanam185 merged commit f558835 into master Dec 11, 2019

bkyryliuk mentioned this pull request Jun 6, 2022

Example on how to use queue size based load balancer #171

Open

okayhooni mentioned this pull request Aug 12, 2023

Activate queue based router as default & Consider backend with no worker as unhealthy #208

Open
