We have added service-lifecycle to a number of internal libraries and we use it in applications that are running in production. Generally we really like the new APIs and we want to +1 them!
What we did before
Some of our components already followed the pattern of having a single run function which stops when it receives the task cancellation signal. We had additional helper code that listened for Unix termination signals and cancelled the Task when certain signals were received. We were able to replace this with graceful shutdown listeners. We now implement the Service protocol for all the components that we offer and expect our adopters to use ServiceLifecycle.
Previously, we often found ourselves writing helper functions in the style of withHTTPClient { httpClient in ... }, which would handle the lifecycle for us. The with function would start the service, call the closure, and then stop the service. It ensured the service was always shut down, even if the closure threw an error. However, this resulted in heavily nested code when calling many with functions. It was also easy for users of our libraries to forget to use these helper functions and instantiate services directly instead, in which case they might forget to shut them down correctly.
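To make the nesting problem concrete, here is a minimal sketch of such a with helper. ExampleClient, withExampleClient and ExampleError are hypothetical names invented for this illustration; they are not our real components.

```swift
// Hypothetical resource type, used only to illustrate the pattern.
final class ExampleClient {
    private(set) var isRunning = false
    func start() { isRunning = true }
    func stop() { isRunning = false }
}

struct ExampleError: Error {}

// Starts the client, runs the body, and always stops the client afterwards,
// whether the body returns normally or throws.
func withExampleClient<T>(_ body: (ExampleClient) throws -> T) rethrows -> T {
    let client = ExampleClient()
    client.start()
    defer { client.stop() }
    return try body(client)
}

// Every additional dependency adds another level of nesting.
func runApplication(failing: Bool) throws -> String {
    try withExampleClient { database in
        try withExampleClient { cache in
            try withExampleClient { http in
                if failing { throw ExampleError() }
                return "all clients running"
            }
        }
    }
}
```

With three dependencies the real work already sits three closures deep, and nothing stops an adopter from calling ExampleClient() directly and skipping the cleanup.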
Why service-lifecycle is better
For libraries, adopting service-lifecycle has made our code much simpler to write and to understand, as service-lifecycle does a lot of the heavy lifting for us. We were able to remove our own Unix signal handling, which our adopters used, since this functionality is now provided by ServiceLifecycle.
For applications, developers no longer need to ensure they shut down every service cleanly in every scenario (e.g. error cases). Using the with helper functions made this easier, but resulted in heavily nested and hard-to-read code. With service-lifecycle, it becomes very clear what is running and what order the dependencies are in.
Our experience
Conforming libraries to service-lifecycle
In the case of components which already have a single run function, adopting service-lifecycle is trivial and has made our code much simpler to write and to understand. Furthermore, we now have a concept of graceful shutdown, which allows us to implement more complex shutdown functions. For example, an HTTP server can wait for in-flight requests to finish when asked to shut down gracefully, but forcefully stop them when the Task is cancelled.
Adapting legacy libraries
For components which are not based on the pattern of a single run function, it is not too difficult to adapt. We found there are two common patterns.
The first is components which have a start function and a stop function, and some way to wait for the shutdown to happen.
They can implement a run function as follows:
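    public func run() async throws {
        // Turn a graceful shutdown request into cancellation of the closure below.
        // The closure throws, so the call needs `try` as well as `await`.
        try await cancelOnGracefulShutdown {
            try await withTaskCancellationHandler {
                try await self.start()
                // Suspends until the component has shut down.
                try await shutdownFuture.get()
            } onCancel: {
                // On cancellation, ask the component to stop.
                self.shutdown()
            }
        }
    }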
The second type is components which have no way of waiting until shutdown. They need some way to pause until shutdown is requested.
For example, AsyncHTTPClient does not have any equivalent to the run function but instead requires shutting down once it is no longer needed. We want to use AHC in projects which otherwise use service-lifecycle. We achieved this by implementing the conformance ourselves in two steps:
- Do nothing until the shutdown signal is received
- Shut down
This requires suspending execution until the shutdown signal is received. We implemented an actor to keep track of this state, as follows:
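    actor CancellationWaiter {
        private var taskContinuation: CheckedContinuation<Void, Never>?
        // Set when stop() runs before the continuation has been stored.
        private var isStopped = false
        init() {}
        // Suspends until graceful shutdown is requested or the task is cancelled.
        func wait() async {
            await withTaskCancellationHandler {
                await withGracefulShutdownHandler {
                    await withCheckedContinuation { continuation in
                        if self.isStopped {
                            // Shutdown was already requested, so do not suspend.
                            continuation.resume()
                        } else {
                            self.taskContinuation = continuation
                        }
                    }
                } onGracefulShutdown: {
                    Task {
                        await self.stop()
                    }
                }
            } onCancel: {
                Task {
                    await self.stop()
                }
            }
        }
        private func stop() {
            self.isStopped = true
            self.taskContinuation?.resume()
            self.taskContinuation = nil
        }
    }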
Then the run function can be implemented as:
    extension HTTPClient: Service {
        public func run() async throws {
            await CancellationWaiter().wait()
            try await self.shutdown()
        }
    }
This is quite verbose, but is a small price to pay. In exchange, our adopters can specify their dependencies and have the lifecycle managed for them.
In any case, care is needed to ensure that the run function does not terminate prematurely, as this is considered an error by service-lifecycle, and would trigger a full shutdown of all services.
Adopting service-lifecycle in applications
Our applications which use service-lifecycle typically need to instantiate the services one by one, often in a particular order if one service needs a reference to another.
These services then need to be added to the ServiceGroup and run. The API is generally simple and easy to use.
This makes it very easy to handle the shutdown, as it is done for us and done in order. However, if using legacy libraries, these either need to be adapted using the pattern above, or managed separately outside of the ServiceGroup.
It is easy to forget to add a service to the service group. The service then never runs, which can cause the application to not work correctly.
This is a big improvement over the patterns we were previously using. Adopters don’t need to worry about ensuring every service is shut down correctly or about handling signals.
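As a rough sketch of the application setup described in this section: ExampleService and runApplication are hypothetical names, and the ServiceGroup initializer shown is the one we understand recent 2.x releases to offer, so the exact spelling may differ in the version under review.

```swift
import Logging
import ServiceLifecycle

// Hypothetical service used only for illustration.
struct ExampleService: Service {
    let name: String

    func run() async throws {
        print("\(name) running")
        // Suspend until graceful shutdown is requested or the task is cancelled.
        let (stream, continuation) = AsyncStream.makeStream(of: Void.self)
        await withGracefulShutdownHandler {
            for await _ in stream {}
        } onGracefulShutdown: {
            continuation.finish()
        }
        print("\(name) stopped")
    }
}

func runApplication() async throws {
    let logger = Logger(label: "example-application")

    // Instantiate the services one by one. In a real application the server
    // would typically hold a reference to the database.
    let database = ExampleService(name: "database")
    let server = ExampleService(name: "server")

    let group = ServiceGroup(
        // Every service must be listed here, otherwise it never runs.
        services: [database, server],
        // Assumption: SIGTERM should trigger graceful shutdown.
        gracefulShutdownSignals: [.sigterm],
        logger: logger
    )

    // Returns once all services have finished.
    try await group.run()
}
```

Compared with nested with helpers, the list passed to the group is the single place that shows what is running and in which order.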
Testing a library with service-lifecycle
The run function of a Service is a normal Swift function and so can be tested trivially. To check that shutdown is handled correctly, ServiceLifecycleTestKit provides the testGracefulShutdown helper function, used as try await testGracefulShutdown { shutdownManager in ... }. We simply need to start our service, call shutdownManager.triggerGracefulShutdown(), and then assert that the service has shut down. Depending on the service, we might want to make further assertions that things were shut down cleanly and correctly.
This is really clean and simple!
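A minimal sketch of such a test, assuming XCTest and a hypothetical IdleService that simply runs until it is asked to stop. The service is started in a child task so that it inherits the graceful shutdown context set up by the helper.

```swift
import ServiceLifecycle
import ServiceLifecycleTestKit
import XCTest

// Hypothetical service under test: runs until graceful shutdown or cancellation.
struct IdleService: Service {
    func run() async throws {
        let (stream, continuation) = AsyncStream.makeStream(of: Void.self)
        await withGracefulShutdownHandler {
            for await _ in stream {}
        } onGracefulShutdown: {
            continuation.finish()
        }
    }
}

final class IdleServiceTests: XCTestCase {
    func testRunReturnsAfterGracefulShutdown() async throws {
        try await testGracefulShutdown { shutdownManager in
            try await withThrowingTaskGroup(of: Void.self) { group in
                // Start the service in a child task so the test can continue.
                group.addTask { try await IdleService().run() }

                // Simulate the request a ServiceGroup would normally deliver.
                shutdownManager.triggerGracefulShutdown()

                // Completes only once run() has returned; rethrows if it failed.
                try await group.waitForAll()
            }
        }
    }
}
```

If the service ignored the shutdown request, this test would hang rather than fail, so a real suite would probably add a timeout around it.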
Nesting services
For advanced use cases, we found the API to be flexible enough to allow us to nest services. That is, our run function can be implemented as running a TaskGroup, which then runs multiple services underneath. This allows us to expose a group of Services as a single Service to our adopters.
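One way this nesting could look, as a sketch rather than our production code. CompositeService is a hypothetical name, and the sketch relies on child tasks inheriting the graceful shutdown context of the task that calls run.

```swift
import ServiceLifecycle

// Hypothetical wrapper that exposes several child services as one Service.
struct CompositeService: Service {
    let children: [any Service]

    func run() async throws {
        try await withThrowingTaskGroup(of: Void.self) { group in
            // Run every child service concurrently in its own child task.
            for child in children {
                group.addTask { try await child.run() }
            }
            // Wait for all children to finish. If one throws, the error leaves
            // this closure, which cancels the remaining children.
            for try await _ in group {}
        }
    }
}
```

Adopters then add only the CompositeService to their ServiceGroup and do not need to know about the services inside it.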
Shortcomings
One potential shortcoming of the current API is that there is no way to wait for a service to finish starting up. Services are started up one after the other. However, it is possible to work around this. For example, we can wrap service B in a LazyService (an implementation of Service which waits for a continuation from service A before running the underlying service B). This way, we ensure B is not started before A has reached a certain point in its startup.
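A sketch of how such a wrapper might be written. StartupSignal and this LazyService are hypothetical and not part of ServiceLifecycle, and the sketch uses an AsyncStream instead of a raw continuation so that a signal sent before the wait begins is not lost.

```swift
import ServiceLifecycle

// Hypothetical one-shot signal that service A fires once it is far enough
// into its startup. Only one task is expected to wait on it.
struct StartupSignal: Sendable {
    private let stream: AsyncStream<Void>
    private let continuation: AsyncStream<Void>.Continuation

    init() {
        let (stream, continuation) = AsyncStream.makeStream(of: Void.self)
        self.stream = stream
        self.continuation = continuation
    }

    // Called by service A when it is ready.
    func markReady() {
        continuation.yield()
        continuation.finish()
    }

    // Called when shutdown starts before service A became ready.
    func cancelWaiting() {
        continuation.finish()
    }

    // Returns true once markReady() has been called, or false if waiting
    // ended first (shutdown requested or task cancelled).
    func waitUntilReady() async -> Bool {
        for await _ in stream { return true }
        return false
    }
}

// Hypothetical wrapper that delays running service B until the signal fires.
struct LazyService<Wrapped: Service>: Service {
    let wrapped: Wrapped
    let signal: StartupSignal

    func run() async throws {
        let ready = await withGracefulShutdownHandler {
            await signal.waitUntilReady()
        } onGracefulShutdown: {
            signal.cancelWaiting()
        }
        // If shutdown began before A was ready, B is never started.
        guard ready else { return }
        try await wrapped.run()
    }
}
```

Service A would hold the same StartupSignal and call markReady() at the appropriate point in its own run function.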
Service-lifecycle should be able to add helper functions for these use cases in future if needed, without breaking API.