Here is an example error log showing a timeout that occurred during a query.
2024-04-25T05:45:55.408746Z ERROR sql{protocol="http" request_type="sql"}: client::region: Failed to do Flight get, addr: greptimedb-datanode-0.greptimedb-datanode.greptimedb:4001, code: The operation was cancelled err=0: Timeout expired
2024-04-25T05:45:55.409104Z ERROR sql{protocol="http" request_type="sql"}: servers::http::error_result: Failed to handle HTTP request err=0: , at greptimedb/src/common/recordbatch/src/adapter.rs:254:55
1: External(0: External error, at greptimedb/src/query/src/dist_plan/merge_scan.rs:207:22
1: Region query error, at greptimedb/src/frontend/src/instance/region_query.rs:53:14
2: Failed to query, at greptimedb/src/frontend/src/instance/region_query.rs:74:14
3: External error, at greptimedb/src/client/src/region.rs:72:14
4: Failed to do Flight get, code: The operation was cancelled
5: Timeout expired)
2024-04-25T05:45:55.409893Z ERROR tower_http::trace::on_failure: response failed classification=Status code: 500 Internal Server Error latency=10008 ms
We have two major issues here:
The response carries 1003 as the error code, which is inaccurate and confusing. Perhaps the first External hides the actual reason. If the failure is a timeout, we want to say so clearly to the end user.
The timeout threshold (10 seconds) is not configurable yet. This is inappropriate, since some large queries can easily take more than a minute.
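To illustrate the first issue, here is a minimal sketch of how a timeout buried under opaque wrapper layers could be detected by walking the error's `source()` chain. The `TimeoutExpired` and `External` types and the `is_timeout` helper are hypothetical stand-ins, not GreptimeDB's actual error types. It also only works in-process: once an error crosses the gRPC boundary between frontend and datanode, the concrete type is lost and the cause would have to be carried in a status code or message instead.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical leaf error standing in for the real timeout cause.
#[derive(Debug)]
struct TimeoutExpired;

impl fmt::Display for TimeoutExpired {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Timeout expired")
    }
}

impl Error for TimeoutExpired {}

/// Hypothetical wrapper that mimics an opaque `External` layer.
#[derive(Debug)]
struct External {
    source: Box<dyn Error + 'static>,
}

impl fmt::Display for External {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "External error")
    }
}

impl Error for External {
    // Expose the wrapped error so callers can walk past this layer.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&*self.source)
    }
}

/// Walks the `source()` chain and reports whether any layer is a timeout.
fn is_timeout(err: &(dyn Error + 'static)) -> bool {
    let mut current: Option<&(dyn Error + 'static)> = Some(err);
    while let Some(e) = current {
        if e.downcast_ref::<TimeoutExpired>().is_some() {
            return true;
        }
        current = e.source();
    }
    false
}
```

With a check like this, the outermost handler could pick a timeout-specific code and message instead of reporting whatever the first wrapper says.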
Implementation challenges
We want to:
Fix the case where the actual timeout reason is hidden. This might relate to our error passing mechanism; please refer to this blog (previewing) for some insights.
Add a configuration for the timeout, or better yet, a separate configuration for each protocol, such as HTTP, MySQL, PG, and gRPC. This might be harder than it seems, because a timeout can also occur during the gRPC call from frontend to datanode.
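As a rough sketch of the second goal, per-protocol timeouts could be modeled as optional overrides on top of a shared fallback. Every name here (`Protocol`, `TimeoutOptions`, `fallback`, `for_protocol`) is hypothetical and does not reflect existing GreptimeDB configuration; the 10-second fallback simply mirrors the current fixed threshold described above.

```rust
use std::time::Duration;

/// Hypothetical set of protocols a query can arrive on.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Protocol {
    Http,
    Mysql,
    Postgres,
    Grpc,
}

/// Hypothetical timeout options: one fallback plus optional per-protocol overrides.
#[derive(Debug, Clone)]
struct TimeoutOptions {
    fallback: Duration,
    http: Option<Duration>,
    mysql: Option<Duration>,
    postgres: Option<Duration>,
    grpc: Option<Duration>,
}

impl Default for TimeoutOptions {
    fn default() -> Self {
        Self {
            // Mirrors the fixed 10-second threshold mentioned in this issue.
            fallback: Duration::from_secs(10),
            http: None,
            mysql: None,
            postgres: None,
            grpc: None,
        }
    }
}

impl TimeoutOptions {
    /// Returns the override for `protocol` if one is set, else the fallback.
    fn for_protocol(&self, protocol: Protocol) -> Duration {
        let specific = match protocol {
            Protocol::Http => self.http,
            Protocol::Mysql => self.mysql,
            Protocol::Postgres => self.postgres,
            Protocol::Grpc => self.grpc,
        };
        specific.unwrap_or(self.fallback)
    }
}
```

The resolved value would presumably also need to reach the frontend-to-datanode gRPC call; otherwise the inner call could still expire first, which is the difficulty noted above.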
What type of enhancement is this?
User experience
What does the enhancement do?