-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FOGL-6891: Handle storage service restart #893
base: develop
Are you sure you want to change the base?
Conversation
Signed-off-by: Amandeep Singh Arora <[email protected]>
Signed-off-by: Amandeep Singh Arora <[email protected]>
Signed-off-by: Amandeep Singh Arora <[email protected]>
Signed-off-by: Amandeep Singh Arora <[email protected]>
Signed-off-by: Amandeep Singh Arora <[email protected]>
…start Signed-off-by: Amandeep Singh Arora <[email protected]>
Signed-off-by: Amandeep Singh Arora <[email protected]>
Signed-off-by: Amandeep Singh Arora <[email protected]>
Signed-off-by: Amandeep Singh Arora <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unit tests are failing.
Please remove code that is not really needed.
There is lots of log output at the info level that looks at best to be debug and much that looks like it is not really needed. Merging this will cause a huge increase in logging output at info level that is meant to be a level to give the user more information, not for the code developer.
The changes are impacting the default log levels of various areas of the code and these should be reverted.
The code is littered with PRINT_FUNC, this looks like your debugging and should be removed.
The use of ps in the storage client limits the deployment options to having all services in the same machine or container, this should be avoided.
@@ -19,7 +19,7 @@ | |||
using namespace std; | |||
|
|||
// uncomment line below to get uSec level timestamps | |||
// #define ADD_USEC_TS | |||
#define ADD_USEC_TS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably best not to checkin with this uncommented
m_urlbase << hostname << ":" << port; | ||
m_logger->info("StorageClient c'tor 2: this=%p, m_urlbase=%s", this, m_urlbase.str().c_str()); | ||
|
||
m_logger->info("StorageClient c'tor 2: current thread ID=%u, m=%p", std::this_thread::get_id(), m); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is debug rather than info information
storage_service_urlbase.str(""); | ||
storage_service_host = ""; | ||
|
||
/* else |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it is not needed please remove
@@ -180,6 +258,8 @@ bool StorageClient::readingAppend(const vector<Reading *>& readings) | |||
ss << m_pid << "#" << thread_id << "_" << m_seqnum_map[thread_id].load(); | |||
sto_mtx_client_map.unlock(); | |||
|
|||
m_logger->info("ss=%s", ss.str().c_str()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This really does not look like useful info for the user, should be debug at best but probably should be removed.
do | ||
{ | ||
PRINT_FUNC; | ||
system("ps -ef | grep fledge.services.storage | grep -v grep | wc -l > /tmp/storage_service_instance_count"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is quite ugly, not very portable and makes the assumption the storage service is on the same machine as the client that is using it. We have been very careful not to add that restriction so that we may distribute services across machines or containers at some point.
I suspect you can find out what you need to by just using calls to the API and looking at error codes. A TCP connection to the API port can tell you if it is running or not.
@@ -135,8 +155,11 @@ def __init__(self, core_management_host=None, core_management_port=None, is_safe | |||
|
|||
# Initialize class attributes | |||
if not cls._logger: | |||
cls._logger = logger.setup(__name__, level=logging.INFO) | |||
# cls._logger = logger.setup(__name__, level=logging.DEBUG) | |||
# cls._logger = logger.setup(__name__, level=logging.INFO) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please reinstate original before merging
@@ -439,6 +479,9 @@ async def _check_schedules(self): | |||
if schedule.exclusive and schedule_execution.task_processes: | |||
continue | |||
|
|||
# if schedule.type == Schedule.Type.STARTUP and self._scheduler_restart: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove commented out code if it is not required
@@ -56,7 +57,7 @@ | |||
__license__ = "Apache 2.0" | |||
__version__ = "${VERSION}" | |||
|
|||
_logger = logger.setup(__name__, level=20) | |||
_logger = logger.setup(__name__, level=logging.DEBUG) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please reinstate before merging
|
||
''' | ||
def start_storage(): | ||
with open("/tmp/monitor.log", "w+") as file: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this file used for? It looks like it is for your debugging purposes and should not be part of the final merge.
if not self._acl_handler: | ||
self._acl_handler = ACLManager(connect.get_storage_async()) | ||
await self._acl_handler.\ | ||
resolve_pending_notification_for_acl_change(service_record._name) | ||
|
||
check_count[service_record._id] = 1 | ||
|
||
# self._logger.debug("step D") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove this if it is really not required
This is really POC code only. |
Storage service restart is handled in south service, north service & notification service. But this hasn't been tested a lot. Also lots of debug logs are still present in this current code.