
Adds more resiliency to retrying read operations during failures. #685

Closed
jonhyman wants to merge 2 commits

Conversation

@jonhyman (Contributor) commented Sep 1, 2015

Adds some failover support code from Moped which handles reads during failovers better.

We've been doing failover testing with the new driver (add8a77). While doing a kill -9 of the mongod primary, we saw this error:

D, [2015-09-01T17:58:43.370391 #2194] DEBUG -- : MONGODB | REDACTED | staging.find | STARTED | {"find"=>"my_collection", "filter"=>{}}
D, [2015-09-01T17:58:43.374908 #2194] DEBUG -- : MONGODB | REDACTED | staging.find | FAILED | can't connect to new replica set master [REDACTED], err: couldn't connect to server REDACTED (REDACTED), connection attempt failed (13639) | 0.004322347000000001s
Mongo::Error::OperationFailure: can't connect to new replica set master [REDACTED], err: couldn't connect to server REDACTED (REDACTED), connection attempt failed (13639)
  from /var/www/platform/shared/dashboard/bundle/ruby/2.2.0/bundler/gems/mongo-ruby-driver-add8a77a48aa/lib/mongo/operation/result.rb:226:in `validate!'
  from /var/www/platform/shared/dashboard/bundle/ruby/2.2.0/bundler/gems/mongo-ruby-driver-add8a77a48aa/lib/mongo/operation/executable.rb:36:in `block in execute'
  from /var/www/platform/shared/dashboard/bundle/ruby/2.2.0/bundler/gems/mongo-ruby-driver-add8a77a48aa/lib/mongo/server/connection_pool.rb:111:in `with_connection'

While doing a stepdown of a primary, we got the same stack trace but the error was

D, [2015-09-01T18:05:43.974336 #2194] DEBUG -- : MONGODB | REDACTED | staging.find | FAILED | error querying server (15988) | 0.0012782479999999998s
Mongo::Error::OperationFailure: error querying server (15988)
from /var/www/platform/shared/dashboard/bundle/ruby/2.2.0/bundler/gems/mongo-ruby-driver-add8a77a48aa/lib/mongo/operation/result.rb:226:in `validate!'

This pull should fix these issues.

@jonhyman (Contributor, Author) commented Sep 1, 2015

Actually, hold off on merging. This code retries only once. We just tested it by kill -9ing a mongod and got this:

D, [2015-09-01T18:40:34.779203 #2090] DEBUG -- : MONGODB | REDACTED | staging.find | FAILED | error querying server (15988) | 0.006565967999999999s
D, [2015-09-01T18:40:34.781297 #2090] DEBUG -- : MONGODB | REDACTED | staging.find | STARTED | {"find"=>"my_collection", "filter"=>{}}
D, [2015-09-01T18:40:34.785768 #2090] DEBUG -- : MONGODB | REDACTED | staging.find | FAILED | ReplicaSetMonitor no master found for set: f35f71b69154572b602a36ded3d5e04b (10009) | 0.004394989s

so it retried once, but then failed. It needs to retry more than once, as an election can take a few seconds.

@jonhyman (Contributor, Author) commented Sep 3, 2015

@durran / @estolfo:

We ran a variety of failover tests earlier this week against HEAD. It seems that this driver is very resilient to errors at the mongoS layer, but abysmal at handling elections and problems in the mongoD layer. Scenarios we tested:

  • Graceful step down: Fails, the app exceptions out. Stack trace in the body of the pull.
  • kill -9 a mongod primary for a "dirty election": Fails, the app exceptions out. Stack trace in the body of the pull.
  • kill -9 a mongos process: Success, reconnects and keeps going.
  • Rolling restart of mongos processes: Success, reconnects and keeps going.
  • Simulate a host hanging completely. This has happened to us on bad hardware. To simulate it, while performing reads, enable two iptables firewall rules that silently drop all incoming/outgoing packets, so zero bytes are sent on the wire (the driver hangs on the blocking socket read; see my comments at mongoid/moped#331, "Hangs forever in refresh after 5 minutes when 2ary unreachable"). This mimics a really bad network/host failure, since heartbeats and connections from the rest of the replica set and the mongoS hang until they hit a timeout, which makes the election take much longer. We have patched the driver as I describe in that Moped issue, setting a long-running driver timeout via read_nonblock and IO.select (a sketch of that approach appears below). In this failure scenario our patch works great in Moped 2, but with this new driver we get the following stack trace:
from /var/www/bundle/ruby/2.2.0/bundler/gems/mongo-ruby-driver-a2a7e0f949a3/lib/mongo/socket.rb:123:in `block in read'
  from /var/www/bundle/ruby/2.2.0/bundler/gems/mongo-ruby-driver-a2a7e0f949a3/lib/mongo/socket.rb:191:in `handle_errors'
  from /var/www/bundle/ruby/2.2.0/bundler/gems/mongo-ruby-driver-a2a7e0f949a3/lib/mongo/socket.rb:122:in `read'
  from /var/www/bundle/ruby/2.2.0/bundler/gems/mongo-ruby-driver-a2a7e0f949a3/lib/mongo/protocol/serializers.rb:62:in `deserialize'
  from /var/www/bundle/ruby/2.2.0/bundler/gems/mongo-ruby-driver-a2a7e0f949a3/lib/mongo/protocol/message.rb:193:in `deserialize_header'
  from /var/www/bundle/ruby/2.2.0/bundler/gems/mongo-ruby-driver-a2a7e0f949a3/lib/mongo/protocol/message.rb:83:in `deserialize'
  from /var/www/bundle/ruby/2.2.0/bundler/gems/mongo-ruby-driver-a2a7e0f949a3/lib/mongo/server/connectable.rb:106:in `block in read'
  from /var/www/bundle/ruby/2.2.0/bundler/gems/mongo-ruby-driver-a2a7e0f949a3/lib/mongo/server/connectable.rb:91:in `ensure_connected'
  from /var/www/bundle/ruby/2.2.0/bundler/gems/mongo-ruby-driver-a2a7e0f949a3/lib/mongo/server/connectable.rb:106:in `read'
  from /var/www/bundle/ruby/2.2.0/bundler/gems/mongo-ruby-driver-a2a7e0f949a3/lib/mongo/server/connection.rb:180:in `deliver'
  from /var/www/bundle/ruby/2.2.0/bundler/gems/mongo-ruby-driver-a2a7e0f949a3/lib/mongo/server/connection.rb:125:in `block in dispatch'
  from /var/www/bundle/ruby/2.2.0/bundler/gems/mongo-ruby-driver-a2a7e0f949a3/lib/mongo/monitoring/publishable.rb:47:in `publish_command'
  from /var/www/bundle/ruby/2.2.0/bundler/gems/mongo-ruby-driver-a2a7e0f949a3/lib/mongo/server/connection.rb:124:in `dispatch'
  from /var/www/bundle/ruby/2.2.0/bundler/gems/mongo-ruby-driver-a2a7e0f949a3/lib/mongo/operation/executable.rb:36:in `block in execute'

I believe this pull will fix that as well via commit 86053b8.
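For reference, a minimal sketch of the kind of read-timeout patch described in the hang scenario above, assuming a raw Ruby socket; the method name, default timeout, and error class are illustrative, not driver API:

```ruby
require 'timeout'

# Sketch only: read `length` bytes, but bound each blocking wait with IO.select
# so a host that silently drops packets raises instead of hanging forever.
def read_with_deadline(socket, length, deadline = 30)
  data = String.new
  while data.bytesize < length
    begin
      data << socket.read_nonblock(length - data.bytesize)
    rescue IO::WaitReadable
      # Wait up to `deadline` seconds for the socket to become readable.
      unless IO.select([socket], nil, nil, deadline)
        raise Timeout::Error, "socket not readable within #{deadline} seconds"
      end
      retry
    end
  end
  data
end
```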

However, this driver still very badly needs additional retry logic in order to handle even a graceful step down. The read_with_retry code only retries one time, but it takes a few seconds for the election to happen. I can copy similar code from Moped that has something like cluster.max_retries and cluster.retry_interval and retry more than once. But do you have plans to add this yourself?
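A rough sketch of what that Moped-style loop could look like; `max_retries`, `retry_interval`, and the rescued error classes are assumptions here, not existing settings in this driver:

```ruby
# Sketch only: retry a read several times across an election, rescanning the
# cluster between attempts. max_retries and retry_interval mirror Moped's
# cluster options and do not exist in this driver.
def read_with_retries(cluster, max_retries: 5, retry_interval: 1)
  attempts = 0
  begin
    yield
  rescue Mongo::Error::SocketError, Mongo::Error::OperationFailure => e
    attempts += 1
    raise e if attempts > max_retries
    sleep(retry_interval) # give the replica set time to elect a new primary
    cluster.scan!         # refresh the topology view before trying again
    retry
  end
end
```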

@@ -103,7 +107,9 @@ def ensure_same_process!
       end
 
       def read
-        ensure_connected{ |socket| Protocol::Reply.deserialize(socket) }
+        read_with_retry do
A member commented on the diff:
This will not work here as the connection does not know anything about the cluster and the retry code calls cluster.scan!.

@jonhyman (Contributor, Author) replied:
Hm. Thoughts on how to handle a retry if the read that ultimately gets called from here gets a socket error, then?

The stack trace is the same as the one in my comment above (socket.rb:123 `block in read` up through operation/executable.rb:36 `block in execute`).

@durran (Member) commented Sep 4, 2015

What is your server_selection_timeout option set at? It defaults to 30 seconds and should give a 30-second window to retry here. Once the cluster is scanned on the retry and the operation is executed again, it goes back through the server selection code, which should wait up to 30 seconds before issuing the next query. Unless the ping command is actually telling us it's ready when it's not... I'll double-check this as well now.
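For context, the option is set on the client; the host and database name below are placeholders:

```ruby
require 'mongo'

# server_selection_timeout bounds how long server selection waits for a usable
# server before raising (defaults to 30 seconds).
client = Mongo::Client.new(
  ['load-balancer.example.com:27017'],
  database: 'staging',
  server_selection_timeout: 30
)
```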

@jonhyman (Contributor, Author) commented Sep 4, 2015

We don't alter the server_selection_timeout at all. During the stepdown and the kill -9 of the mongod process, the code exceptioned out instantly, so it is likely the ping command reports ready when it is not.

@jonhyman (Contributor, Author) commented:

@durran I saw that you all released 2.1.0 last week. Did you test this and fix the problem? IMO if the official driver can't handle a graceful step-down, I don't think it's ready for prime-time.

@durran (Member) commented Sep 14, 2015

This is not yet resolved... We did tests with planned elections and everything worked as expected for the basic scenarios, which is already better than the 1.x series... We're still looking at better retry behaviour along the lines of what you've proposed here, and that will probably go into the 2.1.1 release. We had other dependencies pressuring the 2.1.0 GA that did not need more robust retry... But we're still looking at it.

@jonhyman (Contributor, Author) commented:

Do you have any idea when you are aiming for that release? That's the last thing we are bottlenecked on in order to upgrade to the official driver / Mongoid 5, which then lets us immediately upgrade our staging environment to MongoDB 3.0.x and start testing that. Many features in our upcoming roadmap are dependent on 3.0.x and we've been waiting a very long time to try it out.

@durran (Member) commented Sep 15, 2015

I'm looking into this today and will then put it into master for testing... If that looks good, we'll figure out the release of 2.1.1 in our weekly meeting tonight.

@durran (Member) commented Sep 15, 2015

Just to confirm, you're testing on a sharded cluster with multiple mongoses behind a single load balancer, correct? This would be why the general ping strategy is not enough for your case: the ping goes to the mongoses, which are always running and ready throughout the reconfiguration.

@jonhyman (Contributor, Author) commented:

Yes, that's correct. We host with ObjectRocket, which uses one connection string to the load balancer. We once tried to move away from this scenario but ran into a lot of issues with connection limits since we're still on Mongo 2.4 and the connection handling isn't nearly as good as it is in >= 2.6.

@jonhyman (Contributor, Author) commented:

@durran if you can get this onto master this week I'll redo the failover testing and report back how it worked.

@durran (Member) commented Sep 21, 2015

@jonhyman Yeah I'll have something in this week.

@jonhyman (Contributor, Author) commented:

@durran friendly reminder. I have scheduled time with our DBAs to do failover testing on Tuesday night, so please try to have something before then. Thanks.

@durran (Member) commented Sep 25, 2015

Sounds great - will have it in before Monday.

@durran (Member) commented Sep 27, 2015

See: #692

You can test against the branch retry-mongos.

durran closed this Sep 27, 2015