Generic Connection Pool

Connection pooling is a critical component for many server-side applications. I would attempt to summarize connection pooling here, but honestly Wikipedia's page does a better job:

In software engineering, a connection pool is a cache of database connections maintained so that the connections can be reused when future requests to the database are required. Connection pools are used to enhance the performance of executing commands on a database.

There are several Swift connection pool implementations currently floating around; the ones discussed in this thread include Vapor's AsyncKit pools, RediStack's pool, and MongoKitten's.

Vapor currently uses its AsyncKit pools for PostgreSQL, MySQL, SQLite, and APNS.

Given the variety of implementations currently available, it seems like a good time to compare and contrast approaches and see whether a generic connection pool could help us combine efforts.

It's important to note that such a generic connection pool probably can't be a silver bullet. It's likely that AsyncHTTPClient, for example, will need to continue to ship its own implementation due to the complex requirements of the HTTP spec. However, as @lukasa said at the last SSWG meeting, we should be able to make progress on "homogeneous connection" pooling; in other words, pools where all the connections can be treated equally.

Since I'm most familiar with Vapor's connection pools, I'm going to share an explanation of how they work, and what I believe are the pros / cons of the methods we chose. I invite anyone else who has experience with connection pooling to share their thoughts here. If I left something out from the list above, please let me know and I'll add it!


Vapor's AsyncKit package defines two connection pools: EventLoopConnectionPool and EventLoopGroupConnectionPool.

These pools rely on two protocols: ConnectionPoolItem and ConnectionPoolSource.

The ConnectionPoolItem protocol is very simple, but enforces a core assumption of Vapor's connection pools: Each connection belongs to an EventLoop.

/// Item managed by a connection pool.
public protocol ConnectionPoolItem: class {
    /// EventLoop this connection belongs to.
    var eventLoop: EventLoop { get }
    
    /// If `true`, this connection has closed.
    var isClosed: Bool { get }
    
    /// Closes this connection.
    func close() -> EventLoopFuture<Void>
}

The isClosed property is used by the connection pool to prune closed connections. The close() method is used when the pool is shutting down.
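
To make the protocol concrete, here's a sketch of a conformance for a hypothetical Channel-backed connection type (ExampleConnection is invented for illustration, not an AsyncKit type):

import NIO

// Hypothetical example (not part of AsyncKit): a minimal conformance that
// wraps a NIO Channel.
final class ExampleConnection: ConnectionPoolItem {
    let channel: Channel

    init(channel: Channel) {
        self.channel = channel
    }

    /// The event loop the underlying channel runs on.
    var eventLoop: EventLoop {
        return self.channel.eventLoop
    }

    /// The connection counts as closed once the channel is no longer active.
    var isClosed: Bool {
        return !self.channel.isActive
    }

    /// Closing the connection just closes the channel.
    func close() -> EventLoopFuture<Void> {
        return self.channel.close()
    }
}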

The ConnectionPoolSource protocol is also quite simple and again assumes connections are created for a single event loop.

/// Source of new connections for `ConnectionPool`.
public protocol ConnectionPoolSource {
    /// Associated `ConnectionPoolItem` that will be returned by `makeConnection()`.
    associatedtype Connection: ConnectionPoolItem
    
    /// Creates a new connection.
    func makeConnection(logger: Logger, on eventLoop: EventLoop) -> EventLoopFuture<Connection>
}

Note that the Logger the connection should use is passed here. Going forward, this method may need to accept additional context like BaggageContext for tracing. This should be considered in advance to preempt protocol breakage.
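
Continuing the hypothetical example above, a source for that connection type might look roughly like this (ExampleConnectionSource is likewise invented for illustration):

import Logging
import NIO

// Hypothetical example: a source that opens one TCP connection per
// makeConnection() call, on the requested event loop.
struct ExampleConnectionSource: ConnectionPoolSource {
    let host: String
    let port: Int

    func makeConnection(logger: Logger, on eventLoop: EventLoop) -> EventLoopFuture<ExampleConnection> {
        logger.debug("Opening connection to \(self.host):\(self.port)")
        return ClientBootstrap(group: eventLoop)
            .connect(host: self.host, port: self.port)
            .map { channel in
                ExampleConnection(channel: channel)
            }
    }
}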

A simplified version of EventLoopConnectionPool's interface is supplied below. Some important things to note:

  • If a connection is not available when requested, a new connection will be created until maxConnections is reached.
  • Closed connections are pruned, but the pool never closes connections itself unless shutting down.
  • There is no "min connections" requirement.
  • The requestTimeout prevents deadlocking if more connections are required at once than the pool can possibly yield. In other words, if you are waiting on three connections from the pool at once, and it can hold at most two, you would be waiting forever without this timeout.
  • withConnection simply calls requestConnection and releaseConnection around the supplied closure; there is no special logic here (a rough sketch follows the interface below).
  • You can call this pool from any thread. It will return to the associated EventLoop before doing work.
  • This pool never locks and is meant to be used on its associated EventLoop.
  • Additional context passing besides just Logger should be considered.

final class EventLoopConnectionPool<Source> 
    where Source: ConnectionPoolSource 
{
    let source: Source
    let maxConnections: Int
    let requestTimeout: TimeAmount
    let logger: Logger
    let eventLoop: EventLoop

    init(
        source: Source,
        maxConnections: Int,
        requestTimeout: TimeAmount = .seconds(10),
        logger: Logger = .init(label: "codes.vapor.pool"),
        on eventLoop: EventLoop
    )
    
    func withConnection<Result>(
        logger: Logger? = nil,
        _ closure: @escaping (Source.Connection) -> EventLoopFuture<Result>
    ) -> EventLoopFuture<Result>
    
    func requestConnection(
        logger: Logger? = nil
    ) -> EventLoopFuture<Source.Connection>
    
    func releaseConnection(
        _ connection: Source.Connection, 
        logger: Logger? = nil
    )
    
    func close() -> EventLoopFuture<Void>
}
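
As a rough illustration of the withConnection point above, the with-style API is essentially request + release wrapped around the caller's closure. A conceptual sketch (renamed here so it doesn't look like the actual implementation):

import Logging
import NIO

// Conceptual sketch only; the real implementation differs in details.
extension EventLoopConnectionPool {
    func withPooledConnection<Result>(
        logger: Logger? = nil,
        _ closure: @escaping (Source.Connection) -> EventLoopFuture<Result>
    ) -> EventLoopFuture<Result> {
        return self.requestConnection(logger: logger).flatMap { connection in
            closure(connection).always { _ in
                // Hand the connection back whether the closure succeeded or failed.
                self.releaseConnection(connection, logger: logger)
            }
        }
    }
}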

A simplified version of EventLoopGroupConnectionPool is supplied below. Some important things to note:

  • This pool is just a collection of EventLoopConnectionPool and does not implement any actual pooling logic.
  • maxConnectionsPerEventLoop sets maxConnections on each EventLoopConnectionPool. This means that you must have at least one connection per event loop.
  • pool(for:) is the most important method this pool offers. The {with,request,release}Connection methods all call into this method.
  • This pool locks during shutdown, and to check that it has not already been shut down when other methods are called.
  • This pool is meant to be used on the main thread. If you are on an EventLoop, you should ask for your respective EventLoopConnectionPool.
  • Additional context passing besides just Logger should be considered.

final class EventLoopGroupConnectionPool<Source> 
    where Source: ConnectionPoolSource 
{
    let source: Source
    let maxConnectionsPerEventLoop: Int
    let eventLoopGroup: EventLoopGroup
    let logger: Logger
    
    public init(
        source: Source,
        maxConnectionsPerEventLoop: Int = 1,
        requestTimeout: TimeAmount = .seconds(10),
        logger: Logger = .init(label: "codes.vapor.pool"),
        on eventLoopGroup: EventLoopGroup
    )
    
    public func withConnection<Result>(
        logger: Logger? = nil,
        on eventLoop: EventLoop? = nil,
        _ closure: @escaping (Source.Connection) -> EventLoopFuture<Result>
    ) -> EventLoopFuture<Result>
    
    public func requestConnection(
        logger: Logger? = nil,
        on eventLoop: EventLoop? = nil
    ) -> EventLoopFuture<Source.Connection>
    
    public func releaseConnection(
        _ connection: Source.Connection,
        logger: Logger? = nil
    )

    func pool(for eventLoop: EventLoop) -> EventLoopConnectionPool<Source>
    
    func syncShutdownGracefully() throws

    func shutdownGracefully(_ callback: @escaping (Error?) -> Void)
}

Vapor integrates these pools into the framework and you normally don't interact with them directly. An example of using these pools without Vapor can be seen in PostgresKit's README.

import PostgresKit

let eventLoopGroup: EventLoopGroup = ...
defer { try! eventLoopGroup.syncShutdownGracefully() }

let configuration = PostgresConfiguration(
    hostname: "localhost",
    username: "vapor_username",
    password: "vapor_password",
    database: "vapor_database"
)
let pools = EventLoopGroupConnectionPool(
    source: PostgresConnectionSource(configuration: configuration), 
    on: eventLoopGroup
)
defer { try! pools.syncShutdownGracefully() }

pools.withConnection { conn in
    print(conn) // PostgresConnection on randomly chosen event loop
}

let eventLoop: EventLoop = ...
let pool = pools.pool(for: eventLoop)

pool.withConnection { conn in
    print(conn) // PostgresConnection on eventLoop
}

Pros:

  • Very concise implementation (~600 loc with lots of comments).
  • Connections must be used on the same EventLoop meaning there is no hopping.
  • CircularBuffer is used to achieve O(1) request and release.
  • LIFO ordering is used to help reduce connection timeouts.
  • Lots of trace and debug logging to help diagnose issues.
  • deinit assertions to help quickly track down pools not being closed properly.

Cons:

  • You cannot share connections between EventLoops and you must have at least one connection per loop. There is no way to set an application-wide max connection count. This has been a fairly consistent point of confusion for Vapor developers.
  • Very limited configurability: no min connection count option, no "leaky connection" option, etc.
  • No guarantees that connections are not still being used after they are returned to the pool.

I'd love to have a general-purpose NIO connection pool; I've been waiting for this for a long time.

I was always imagining that the pool itself would just be a specialised NIO Bootstrap (maybe taking another bootstrap) and an integral part of NIO itself. I.e. there wouldn't be a need for the protocol, because NIO essentially has that already.

Does this make any sense?

The Vapor pool is broadly very similar to the RediStack one. The RediStack pool has a few differences, most of which are motivated by its desire to solve a minimal amount of the problem set. By that I mostly mean we should not assume that the things RediStack is missing are things we shouldn't build.

  1. No equivalent of EventLoopGroupConnectionPool. The RediStack pool assumes all connections will be on a single event loop as well, but doesn't bother with a cross-pool abstraction.
  2. No use of protocols: we store concrete types. There's no good reason to do this, we probably should use an abstraction.
  3. More complex functionality around maintaining connections. The RediStack pool allows you to ask the pool to try to maintain a certain minimum connection count, to reduce the odds that idle machines will have to wait for connections to be established (reducing the risk of higher tail latency). It also allows a "leaky" model in which, if the pool has already leased all its connections, more connections will be created to serve waiters. (A rough configuration sketch covering both follows below.)
  4. Deadlines instead of timeouts, just for composition purposes really.

I think we would do well to consider "merging" the two approaches. Given that both are about equally small, if there are no more compelling choices we could start with Vapor's and tweak it a bit to add anything missing from RediStack that we think we need.
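
For discussion's sake, a merged configuration surface covering point 3 above might look something like this (the names are illustrative only, not a proposed API):

import NIO

// Hypothetical configuration sketch, merging Vapor's options with the
// RediStack behaviours described in point 3.
struct PoolConfigurationSketch {
    /// Hard upper bound on pooled connections (Vapor's maxConnections).
    var maxConnections: Int

    /// Number of connections the pool tries to keep open even when idle,
    /// to avoid connection-establishment latency on a cold pool (RediStack-style).
    var minConnections: Int

    /// If true, the pool may temporarily open connections beyond maxConnections
    /// to serve waiters rather than queueing them (RediStack's "leaky" model).
    var allowsLeakedConnections: Bool

    /// How long a waiter may wait before its request fails; expressed as a
    /// timeout here, though it is easily converted into a deadline per request.
    var requestTimeout: TimeAmount
}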


An important feature set we discussed in the SSWG meeting was the question of customisability. It is common to do this with strategies: essentially pluggable bits of code that the pool invokes whenever it needs to do anything. Places we might want strategies:

  1. Connection creation. This is already present in both implementations, we should formalise how that looks.
  2. How to select a connection to be leased from the pool.
  3. What to do with a connection when it is returned to the pool.
  4. A strategy for what to do if an idle connection dies.

At a higher level, the EventLoopGroupConnectionPool in Vapor (what I've often seen called a PoolManager) may also want strategies around selecting which pool to use, as well as the ability to share strategies between the sub-pools.
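
To make the strategy idea concrete, hooks for points 2 and 3 might be spelled roughly like this (purely illustrative shapes, assuming the ConnectionPoolItem protocol from earlier):

import NIO

// Hypothetical sketch of pluggable strategy hooks; names and shapes are
// illustrative, not a proposed API.
protocol ConnectionSelectionStrategy {
    /// Pick and remove the next connection to lease (e.g. LIFO, FIFO, prefer-warm).
    func nextConnection<Connection: ConnectionPoolItem>(
        from available: inout CircularBuffer<Connection>
    ) -> Connection?
}

protocol ConnectionReturnStrategy {
    /// Decide what to do with a connection handed back to the pool:
    /// keep it, close it, reset its pipeline, and so on.
    func connectionReturned<Connection: ConnectionPoolItem>(
        _ connection: Connection,
        into available: inout CircularBuffer<Connection>
    )
}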


Another topic raised by @weissi is that we really want to arrange that users don't have to deal with event loop hopping if they can possibly avoid it. This probably doesn't get done at the lowest-level pool abstraction, because it's explicitly an event-loop-aware construct, but the PoolManager level almost certainly does need a solution for how to deal with that. A Channel wrapper type is probably the way to go here.

Broadly, yes. The thing I think is a bit tricky with this idea is the question of channel initialisers: what happens to the handlers in the pipeline when you give your connection back?

They stay up. If one wanted to make it really smart, there could be pooling/parking events (not sure that is necessary).
What do you currently do, just pool the Channel itself?

I suppose the top-level "client" handler would need to be pushed/popped as the stack enters/leaves the pool? So the pool bootstrap would have the top-level handler which is attached to the "network stack" maintained by the inner connection setup bootstrap?

There is also the question of what happens if there is I/O while a connection is in the pool. It could be a keep-alive (say, WebSocket ping/pongs) or an unexpected data delivery (close the connection?).

P.S.: I didn't really think this through, it just feels natural to have it there instead of adding another layer.

My SwiftNIO IRC and Redis clients have something like that (not sure it is actually implemented :-), I think I just took what some Node lib suggested): https://github.com/NozeIO/swift-nio-irc-client/blob/develop/Sources/IRC/IRCRetryStrategyCB.swift#L19

Pool the Channel directly, which persists its state as-is. It's incumbent on the user of the pool to clean up after themselves if they don't want that state stored.

This is also up to the user to handle: they are responsible for ensuring the pipeline is in the state they want when it's in the pool.

+1 to this being handled by the generic connection pool. This was something I had to implement myself when adopting RediStack's pools here: Remove RedisKit dep, use RedisStack pooling directly by tanner0101 · Pull Request #166 · vapor/redis · GitHub. I think most high level frameworks and people rolling their own framework (especially using the new service lifecycle package) will want something like this.

+1.

+1 to both min connections and the "leaky" model. This is something I've been meaning to implement in Vapor's pools already.

If it's easier to produce a deadline value from a desired timeout period than vice versa, then +1.
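
For reference, producing a deadline from a timeout is a one-liner with NIO's types, while going the other way requires capturing "now":

import NIO

// Converting a timeout into a deadline, and recovering the remaining time.
let requestTimeout: TimeAmount = .seconds(10)
let deadline = NIODeadline.now() + requestTimeout
let remaining = deadline - NIODeadline.now()
print("time remaining before the request should fail: \(remaining)")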

+1

The relevant code in EventLoopConnectionPool is here: https://github.com/vapor/async-kit/blob/master/Sources/AsyncKit/ConnectionPool/EventLoopConnectionPool.swift#L187. It currently pops first from the available CircularBuffer until it finds an open connection. This could instead use an enum-like strategy type that offers some good defaults (like "find first open") or lets you pick one from the buffer yourself.

The relevant code in EventLoopConnectionPool is here: https://github.com/vapor/async-kit/blob/master/Sources/AsyncKit/ConnectionPool/EventLoopConnectionPool.swift#L263. It currently appends the released connection to the CircularBuffer. This is what creates the LIFO ordering. This could also be managed by an enum-like strategy type.

Relevant code here: https://github.com/vapor/async-kit/blob/master/Sources/AsyncKit/ConnectionPool/EventLoopConnectionPool.swift#L194-L196. Since looking for a connection pops it from the CircularBuffer, finding a closed one simply decrements the available connection count. This approach is lazy in that we don't prune until we need to, and connections are not replaced until the pool becomes exhausted again. IIRC RediStack monitors a connection's closeFuture. We might want to adopt that here.
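
For example, if the pooled item exposed the underlying channel's closeFuture (ConnectionPoolItem does not currently require one), eager pruning could look roughly like this sketch, where removeFromAvailable(_:) is an imagined pool helper rather than real AsyncKit API:

// Hypothetical sketch, imagined as a method on the pool: prune eagerly when
// the underlying channel closes instead of waiting for the next request.
private func track(_ connection: ExampleConnection) {
    connection.channel.closeFuture.whenComplete { [weak self] _ in
        // Runs on the connection's event loop once the channel closes;
        // drop the dead connection and optionally open a replacement.
        self?.removeFromAvailable(connection)
    }
}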

This is the part I'm least sure about. As in, I have no idea how it will work, but I think it's important that we figure it out, for two reasons:

  • The way Vapor's maxConnectionsPerEventLoop configuration works is constantly confusing people. Could there be an (additional?) way to configure the pool manager with a cross-EventLoop "max connections" count and then use something like AHC's EventLoopPreference when requesting connections? If it finds one on the event loop you want, great. If not, it pulls one from another event loop and does hopping. This approach is obviously less performant but could be valuable to people with a limited connection budget on their DB.

  • Wrapping the channel type in some way would also let us enforce that once a connection is returned to the pool, you can no longer use it. I'm not sure how much of a problem this really is (I haven't heard any complaints about this with Vapor's connection pools yet) but if it's something we could do relatively easily I'm definitely +1 for safety.
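
One way to get that safety would be a lease handle that invalidates itself when released. A rough sketch (all names hypothetical):

// Hypothetical sketch: a lease that refuses to vend the connection after it
// goes back to the pool. Not thread-safe as written; a real version would
// confine access to the pool's event loop.
final class ConnectionLease<Connection: ConnectionPoolItem> {
    private var connection: Connection?

    init(_ connection: Connection) {
        self.connection = connection
    }

    /// The leased connection. Traps if the lease has already been released.
    var wrapped: Connection {
        guard let connection = self.connection else {
            fatalError("Connection used after being returned to the pool")
        }
        return connection
    }

    /// Invalidate the lease and hand the connection back (to the pool).
    func release() -> Connection {
        guard let connection = self.connection else {
            fatalError("Connection released twice")
        }
        self.connection = nil
        return connection
    }
}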

What does Netty do here? Do they have something for this?

Netty has ChannelPool. This is broadly the same idea as what is being proposed here, albeit with a much more Java spelling.

I like this idea, but I want to make sure this is defined properly. With MySQL/PostgreSQL you're supposed to have a connection pool to avoid being limited by the lack of multiplexing, whereas in MongoDB, and I suppose a tonne of other drivers, you do have multiplexing.

In MongoKitten, the connection pool is used to maintain fallback connections when a database is hosted as a cluster. When MongoKitten stops being able to maintain a connection to one server, it hops to the next looking for the new master. Meanwhile it tries to reconnect to the dropped server.

I suppose the wide set of connection pool use cases would make it relevant to define the use case explicitly up front.

Chiming in a bit late here 🙂

I'm definitely in support of such a project and we would love to be able to depend on this in the MongoDB driver as we tackle rewriting internals in Swift, rather than starting off completely from scratch.

As we've discussed, there are quite a number of requirements around how MongoDB drivers pool connections, and the ability to supply custom strategies for various behavior as you describe will be key for us.

The typical MongoDB driver architecture (at least for the in-house drivers; MongoKitten and other community drivers may be a bit different) is to have a shared, thread-safe MongoClient, which users typically need only one of for their entire application. The MongoClient is backed by one connection pool per host it is connected to.
For each read or write performed via a client’s API, we use an algorithm called server selection, which takes into account the information we know about each host (tracked via a separate monitoring connection to each host, maintained outside of the pool) to decide which host to route the command to.

Given the conversation here about a PoolManager that maintains a pool per event loop to prevent having to hop event loops, and that a MongoClient is meant to be used across event loops, it seems like maybe we would end up with each MongoClient being backed by one PoolManager per host?

Alternatively maybe we could also handle the server selection algorithm via a PoolManager strategy as well, so there would be a 1:1 MongoClient-PoolManager relationship.
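
To make the first option concrete, the per-host arrangement might look roughly like this (ClusterClientSketch and selectHost are made-up names for illustration, not actual driver API):

import NIO

// Purely illustrative sketch of a client backed by one pool manager per host.
final class ClusterClientSketch<Source: ConnectionPoolSource> {
    /// One EventLoopGroupConnectionPool ("PoolManager") per known host.
    private let poolsByHost: [String: EventLoopGroupConnectionPool<Source>]

    init(poolsByHost: [String: EventLoopGroupConnectionPool<Source>]) {
        self.poolsByHost = poolsByHost
    }

    func withConnection<Result>(
        selectHost: () -> String,
        on eventLoop: EventLoop? = nil,
        _ closure: @escaping (Source.Connection) -> EventLoopFuture<Result>
    ) -> EventLoopFuture<Result> {
        // Server selection picks the host; that host's pool manager then leases
        // a connection, preferring the caller's event loop when one is given.
        let host = selectHost()
        guard let pools = self.poolsByHost[host] else {
            preconditionFailure("No pool configured for host \(host)")
        }
        return pools.withConnection(on: eventLoop, closure)
    }
}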

So one thing I've been thinking about is that, per the specification, we must allow our users to specify a maxPoolSize when creating a MongoClient, which is intended to cap the number of concurrent connections the driver will have open to a particular host. I think a setting like this would help us meet that requirement while still providing some optimization around avoiding hops where possible.


MongoDB is indeed quite a different beast from other databases here when it comes to requirements. I'm not sure it's reasonable to make a connection pool as generic as would be needed; however, it would be great if a good design could be found for these kinds of use cases without sacrificing others.