Skip to content
This repository has been archived by the owner on Dec 7, 2023. It is now read-only.

Refactor misleading database errors #417

Open
wants to merge 2 commits into
base: staging
Choose a base branch
from

Conversation

tanyongzhi
Copy link
Contributor

@tanyongzhi tanyongzhi commented Oct 26, 2021

This PR fixes misleading database errors. In particular, this PR, when applied, will:

- Fix mongo database initialization hanging during local development if local mongo instance is not started

  • In development, the timeout for database connection is 0 - corresponding to a blocking call to DialWithInfo(). This effectively leads to the service being unresponsive, as the services will forever be waiting on the connection to the database during initialization without logging anything meaningful. As a result, all subsequent requests to the service will simply be timed out and return 500 errors.
  • This is fixed by setting the timeout for database connections to 60 both in production and in development. Correspondingly, DialWithInfo() will return an error, which will be logged. In the future, we can consider exiting the service upon initialization failure.

- Change not found errors to return 404 instead of 500

  • Previously, many GET requests that retrieve data from mongo will return a 500 error if the data is not present in the database. This is misleading, and 500 errors should be reserved for internal connection errors with the database backend. This is fixed by changing the database error to a 404 status code if the database returns a not found error.

@tanyongzhi tanyongzhi marked this pull request as ready for review October 26, 2021 05:17
@@ -44,13 +44,13 @@ func (db *MongoDatabase) Connect(host string) error {
return ErrConnection
}

dial_info.Timeout = 60 * time.Second
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested this locally and it didn't seem to work. If the API can't make a connection to mongodb within 60 seconds, it should crash?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you running all services with make run? If so, can you check which services are still running after the mongo timeout happens? The gateway and the stats service should still be running, as these services do not access mongo and would not be terminated on connection failure. All other services should be terminated.

While on this topic, perhaps we should change run.sh to terminate all services if anyone exits with an error...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah no, I tested it with make setup which (ideally) should crash immediately once it detects mongod is not running. I didn't realize this fix was for make run.

It still looks like those services aren't dying after missing the mongod connection:

matas      35633   35628  3 16:25 pts/1    00:00:09 /home/matas/Desktop/api/bin/hackillinois-api --service user
matas      35634   35628  3 16:25 pts/1    00:00:09 /home/matas/Desktop/api/bin/hackillinois-api --service registration
matas      35635   35628  3 16:25 pts/1    00:00:09 /home/matas/Desktop/api/bin/hackillinois-api --service decision
matas      35636   35628  3 16:25 pts/1    00:00:08 /home/matas/Desktop/api/bin/hackillinois-api --service rsvp
matas      35637   35628  3 16:25 pts/1    00:00:09 /home/matas/Desktop/api/bin/hackillinois-api --service checkin
matas      35638   35628  3 16:25 pts/1    00:00:09 /home/matas/Desktop/api/bin/hackillinois-api --service upload
matas      35639   35628  3 16:25 pts/1    00:00:09 /home/matas/Desktop/api/bin/hackillinois-api --service mail
matas      35640   35628  3 16:25 pts/1    00:00:09 /home/matas/Desktop/api/bin/hackillinois-api --service event
matas      35642   35628  0 16:25 pts/1    00:00:00 /home/matas/Desktop/api/bin/hackillinois-api --service stat
matas      35647   35628  3 16:25 pts/1    00:00:09 /home/matas/Desktop/api/bin/hackillinois-api --service notifications
matas      35651   35628  3 16:25 pts/1    00:00:09 /home/matas/Desktop/api/bin/hackillinois-api --service project
matas      35656   35628  3 16:25 pts/1    00:00:09 /home/matas/Desktop/api/bin/hackillinois-api --service profile
matas      35661   35628  3 16:25 pts/1    00:00:08 /home/matas/Desktop/api/bin/hackillinois-api --service room
matas      35666   35628  0 16:25 pts/1    00:00:00 /home/matas/Desktop/api/bin/hackillinois-api --service gateway

(ignore the missing auth service, I killed it manually)

Let's go over this in tomorrow's meeting, maybe I am missing something simple

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the confusion! But yes, let's go over this tomorrow.

@TakshPSingh
Copy link
Contributor

TakshPSingh commented Nov 3, 2021

@tanyongzhi do you think it might be a better idea to move the 404 error part into a separate commit so that we can merge it now before the rest of the commit is approved?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants