This is a collection of related ideas to improve the management of Nova services:
i) Add a text field to capture the reason a service is disabled. Disabling a specific service is normally a manual operation, carried out for a specific operational reason, such as an issue on a particular server, or to reserve some servers as part of capacity management. We’ve found that a reason field, which can be set when the service is disabled and is displayed as part of the service status, makes it much easier to manage this aspect of the system. The messages are normally short and refer to, for example, issues captured in JIRA.
ii) In addition to updating the services table with a timestamp and report count, services should report the number of threads currently running in their greenpool, as this indicates how busy the service is. This information could be used, for example, by schedulers.
iii) Extend the services table to include a status field that the service sets to “starting/started/stopping/stopped” at the appropriate points in its lifecycle. This helps distinguish services that have stopped as part of a software update from those that have failed for some reason.
iv) Service shutdown currently kills any threads that are in progress. Instead, shutdown should try to leave the service in a clean state by no longer reading new messages from the queue and waiting for the running thread count to drop to 0. If the threads don’t stop within a configurable timeout window, the service should force a stop anyway.
(Session lead is Phil Day)
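As a rough illustration of ideas ii) to iv), the following is a minimal sketch, not Nova code. It assumes plain Python threads in place of an eventlet greenpool, and the `Service` class, its `submit`, `running_threads` and `stop` methods, and the `shutdown_timeout` parameter are all hypothetical names chosen for the example. It shows a service that records its lifecycle status, exposes its running thread count, and on shutdown stops accepting work and waits for in-flight work up to a timeout.

```python
import threading
import time


class Service:
    """Hypothetical service tracking lifecycle status and running workers."""

    def __init__(self, shutdown_timeout=5.0):
        self.status = "stopped"                  # starting/started/stopping/stopped
        self.shutdown_timeout = shutdown_timeout  # assumed configurable timeout (seconds)
        self._accepting = False
        self._running = 0
        self._cond = threading.Condition()

    def start(self):
        self.status = "starting"
        with self._cond:
            self._accepting = True
        self.status = "started"

    def running_threads(self):
        """Number of workers in flight; the figure a service could report."""
        with self._cond:
            return self._running

    def submit(self, func, *args):
        """Run func in a worker thread; return False once shutdown has begun."""
        with self._cond:
            if not self._accepting:
                return False
            self._running += 1
        worker = threading.Thread(
            target=self._run, args=(func,) + args, daemon=True
        )
        worker.start()
        return True

    def _run(self, func, *args):
        try:
            func(*args)
        finally:
            with self._cond:
                self._running -= 1
                self._cond.notify_all()

    def stop(self):
        """Stop taking new work, then wait for running work up to the timeout.

        Returns True if all workers finished, False if the timeout expired
        and the stop was forced.
        """
        self.status = "stopping"
        with self._cond:
            self._accepting = False
            drained = self._cond.wait_for(
                lambda: self._running == 0, timeout=self.shutdown_timeout
            )
        self.status = "stopped"
        return drained


if __name__ == "__main__":
    svc = Service(shutdown_timeout=2.0)
    svc.start()
    svc.submit(time.sleep, 0.2)
    print(svc.status, svc.running_threads())
    print("clean shutdown:", svc.stop())
```

Python threads cannot be killed from outside, so in this sketch a "forced" stop simply stops waiting and leaves the daemon threads to end with the process. A real service would presumably also persist `status` and the thread count to the services table alongside the existing timestamp and report count.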
Tuesday April 17, 2012 12:00pm - 12:25pm PDT
Seacliff AB