[Discussion] NIORedis: NIO-based Redis Driver

Package Description

Non-blocking Swift driver for Redis built on SwiftNIO.

Package name nio-redis
Proposed Maturity Level Incubating
License Apache 2
Dependencies SwiftNIO 2.x, Server Logging API 0.1.x

Introduction

NIORedis is a module providing general implementations for connecting to Redis and executing
commands against an instance using its proprietary Redis Serialization Protocol (RESP).

These types are designed to work in a request / response loop, representing individual connections to Redis.

The goal of this library is to provide a semi-low level API to work with Redis, while still feeling like a normal Swift package that can be used as-is.

Motivation

Implementations of Swift Redis clients have been around for as long as Swift has, but most have been abandoned, rely on Objective-C runtimes, or use C libraries.

All of the currently maintained libraries either have framework specific dependencies, are not built with NIO, or do not provide enough extensibility while providing "out of the box" capabilities.

Existing Solutions

Proposed Solution

NIORedis provides the essential types for interacting with RESP and building NIO Channel pipelines for communicating with Redis, with default implementations designed to cover most use cases.

RESPValue

RESPValue represents the different types outlined in Redis' protocol as an enum.

public enum RESPValue {
    case null
    case simpleString(String)
    case bulkString([UInt8])
    case error(RedisError)
    case integer(Int)
    case array([RESPValue])
}
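A caller would typically pattern-match over this enum when inspecting a response. A small illustrative sketch, assuming the enum exactly as declared above:

```swift
// Illustrative only: inspecting a RESPValue by switching over its cases.
let response: RESPValue = .array([.simpleString("OK"), .integer(42)])

switch response {
case .simpleString(let text):
    print("status: \(text)")
case .integer(let number):
    print("count: \(number)")
case .array(let elements):
    print("array of \(elements.count) elements")
case .error(let error):
    print("Redis error: \(error)")
default:
    break
}
```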

RESPValueConvertible

RESPValue needs to be translated to and from Swift values easily and generically - which is provided by the RESPValueConvertible protocol.

public protocol RESPValueConvertible {
    init?(_ value: RESPValue)

    func convertedToRESPValue() -> RESPValue
}

Default conformance is provided for:

  • Optional where Wrapped: RESPValueConvertible
  • Array where Element: RESPValueConvertible
  • RedisError
  • RESPValue
  • String
  • FixedWidthInteger (Int, Int8, Int16, ...)
  • Double
  • Float
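Conforming a custom type is straightforward. A sketch, assuming the protocol as declared above (`ExpirationDate` is a hypothetical user-defined type, not part of NIORedis):

```swift
// Hypothetical example: a user type that round-trips through RESP integers.
struct ExpirationDate: RESPValueConvertible {
    let secondsSinceEpoch: Int

    init?(_ value: RESPValue) {
        // Fail the conversion if the RESP value isn't an integer.
        guard case let .integer(seconds) = value else { return nil }
        self.secondsSinceEpoch = seconds
    }

    func convertedToRESPValue() -> RESPValue {
        return .integer(secondsSinceEpoch)
    }
}
```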

RedisClient

As a protocol, RedisClient just defines a type that is capable of sending commands and receiving responses.

public protocol RedisClient {
    var eventLoop: EventLoop { get }

    func send(command: String, with arguments: [RESPValueConvertible]) -> EventLoopFuture<RESPValue>
}

In general, this is the type to rely on for executing commands - all convenience command extensions are applied to this protocol, so that conformers gain those implementations for free.

An example of this is RedisPipeline, which conforms to RedisClient in order to implement its internal pipelining logic.

RedisConnection

As the primary connection type (and RedisClient implementation), RedisConnection works with NIO's ClientBootstrap to build a pipeline for executing commands in request / response cycles.

It is designed to be long-lived; being re-used between commands as needed.

public final class RedisConnection: RedisClient {
    public var eventLoop: EventLoop { get }
    public var isConnected: Bool { get }

    public init(channel: Channel, logger: Logger = default)

    @discardableResult
    public func close() -> EventLoopFuture<Void>

    public func makePipeline() -> RedisPipeline

    public func send(command: String, with arguments: [RESPValueConvertible]) -> EventLoopFuture<RESPValue>
}

While RedisConnection can be created on the fly with just a Channel instance, it's more likely you won't have one, and can instead instantiate a connection with the following static method:

extension RedisConnection {
    public static func connect(
        to socket: SocketAddress,
        with password: String? = nil,
        on eventLoopGroup: EventLoopGroup,
        logger: Logger = default) -> EventLoopFuture<RedisConnection>
}

Example usage of RedisConnection:

import NIORedis

// create a new event loop group
let elg = MultiThreadedEventLoopGroup(numberOfThreads: 1)
defer { try! elg.syncShutdownGracefully() }

// create a new connection
let address = try SocketAddress(ipAddress: "127.0.0.1", port: 6379)
let connection = try RedisConnection.connect(to: address, with: "password", on: elg.next()).wait()
defer { try! connection.close().wait() }

// `get` is one of the convenience command extensions; `.wait()` blocks for this demo
let value = try connection.get("my_key").wait()

RedisPipeline

Redis itself provides a decent overview of the benefits of pipelining, and NIORedis supports it by
providing the RedisPipeline object, created from RedisConnection.makePipeline().

Roughly stated - a regular command through RedisConnection will be a "write and flush" operation, while RedisPipeline will write all of the commands, but not flush until execute() is called.

public final class RedisPipeline {
    /// Number of queued commands
    public var count: Int

    public init(channel: Channel, logger: Logger = default)

    /// Returns `self` to allow chaining
    @discardableResult
    public func enqueue<T>(operation: (RedisClient) -> EventLoopFuture<T>) -> RedisPipeline

    /// Drains the queue
    public func execute() -> EventLoopFuture<[RESPValue]>
}

An example of usage:

let connection = RedisConnection.connect(...)
let results = connection.makePipeline()
    .enqueue { $0.set("my_key", to: 1) }
    .enqueue { $0.get("my_key") }
    .enqueue { $0.increment("my_key") }
    .enqueue { $0.increment("my_key", by: 30) }
    .execute()
    .wait()
// results = [RESPValue]
// results[0].string == "OK"
// results[1].int == 1
// results[2].int == 2
// results[3].int == 32

RESPEncoder

As part of the pipeline, RESPValue needs to be translated to RESP byte streams, which is handled by the publicly available RESPEncoder.

This class conforms to NIO's MessageToByteEncoder protocol.

public final class RESPEncoder {
    public init(logger: Logger = default)

    public func encode(_ value: RESPValue, into buffer: inout ByteBuffer)
}
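For reference, RESP's wire format is line-oriented with `\r\n` terminators. A sketch of what the encoder produces (the byte layouts shown are defined by the Redis protocol specification; the ByteBuffer calls assume NIO's standard API):

```swift
// Illustrative only: encoding two RESPValues into their RESP byte forms.
var buffer = ByteBufferAllocator().buffer(capacity: 64)
let encoder = RESPEncoder()

encoder.encode(.simpleString("OK"), into: &buffer)
// buffer now contains the bytes of "+OK\r\n"

buffer.clear()
encoder.encode(.integer(100), into: &buffer)
// buffer now contains the bytes of ":100\r\n"
```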

RESPDecoder

Inversely, RESPDecoder will translate RESP byte streams into RESPValues.

This also conforms to NIO's ByteToMessageDecoder protocol.

public final class RESPDecoder {
    public enum ParsingState {
        case notYetParsed
        case parsed(RESPValue)
    }

    public func parse(at position: inout Int, from buffer: inout ByteBuffer) throws -> ParsingState
}
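Because a TCP read may deliver a partial message, the decoder reports its state rather than throwing on incomplete input. A usage sketch, where `inboundBuffer` and `handleResponse` are stand-ins for bytes accumulated from the channel and the caller's own handling:

```swift
// Illustrative only: feeding buffered bytes through the decoder.
let decoder = RESPDecoder()
var position = 0

switch try decoder.parse(at: &position, from: &inboundBuffer) {
case .notYetParsed:
    // A complete RESP message hasn't arrived yet - keep the bytes and wait for more.
    break
case .parsed(let value):
    // `position` has advanced past the consumed bytes.
    handleResponse(value)
}
```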

RedisCommandHandler

Redis' request / response cycle is a 1:1 mapping, and at the end of the Redis channel pipeline is RedisCommandHandler to marshal the queue of incoming and outgoing messages.

As RedisCommandContext instances are written to the channel, RedisCommandHandler pulls out the information to store callbacks in a queue.

When responses from Redis return, callbacks are popped off the queue and given the response as RESPValues.

public struct RedisCommandContext {
    public let command: RESPValue
    public let responsePromise: EventLoopPromise<RESPValue>
}

open class RedisCommandHandler {
    public init(logger: Logger = default)
}
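The queueing behaviour described above can be sketched roughly like this (a simplified illustration of the idea, not the actual implementation - the real handler participates in the NIO channel pipeline):

```swift
// Simplified sketch of the 1:1 request / response queueing described above.
final class SketchedCommandHandler {
    private var pendingResponses: [EventLoopPromise<RESPValue>] = []

    // Outbound: a RedisCommandContext is written to the channel.
    func handleWrite(_ context: RedisCommandContext) {
        pendingResponses.append(context.responsePromise)
        // ...then context.command continues down the pipeline to be encoded.
    }

    // Inbound: a decoded RESPValue arrives from Redis.
    func handleRead(_ response: RESPValue) {
        // Because Redis responds in order, the first queued promise
        // always belongs to this response.
        pendingResponses.removeFirst().succeed(response)
    }
}
```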

Maturity Justification

Until now, packages through the SSWG process have been accepted as Sandbox maturity - so it's appropriate to justify why NIORedis should be considered mature enough for Incubating.

This package supports:

  • ~90 of Redis' commands as convenience extension methods
  • 130+ unit tests, including RESP encoding / decoding and all command extensions

In addition, it meets the following criteria according to the SSWG Incubation Process:

In addition, Vapor and I are in the process of providing a (framework-agnostic) higher-level library, RedisKit, which Vapor has said they will use as their implementation in version 4.

Seeking Feedback

  • API-wise, what are things you do or don't like?
  • Is there anything we could remove and still be happy?
  • Is there anything you think is missing before it can be accepted?

CC @Helge_Hess1/@tanner0101/@IanPartridge for the existing Redis drivers.

Thank you for your proposal!

I have a question about this function signature. This suggests that operation can return a value of any type but that surprises me because we can't transform any type into a RESPValue. Am I missing something?

Totally, everybody can suggest libraries. To move forward to the later SSWG maturity stages, there is a requirement of having a few committers, and probably also a 'sponsor' from the SSWG, just in case you'd walk away from your project or in case there's a bad security vulnerability whilst you're on a long holiday.


The function signature does allow that possibility, but is documented as "undefined behavior" in a sense - as the purpose of this callback is to have access to the RedisClient for invoking commands.

RedisPipeline implements RedisClient to handle the enqueuing of commands through send(command:with:).

I wrote an issue to at least improve the ergonomics of execute(): nio-redis#28

I have a feeling that while working on that, the enqueue(operation:) signature will become more restrictive to prevent inadvertent API misuse.

Yes, I think we should restrict the API to not permit undefined behaviour if at all possible. And in this case it seems like it's totally possible.

I'll update the proposal to include the different parts of the implementation, rather than a single repo.

I think we discussed that before, IMO it is nice if the protocol encoders/decoders are at least kept in separate modules.

I still haven't been convinced that splitting those out into a separate module or package is entirely necessary - especially as it adds an additional step to remember "I have to declare a dependency on two different packages, or two different modules rather than just one" without using @_exported

Also I'd very much prefer to see (and strongly recommend to build) a common NIO connection handling and pooling library before clients are added. Otherwise we end up with N different implementations. (If high level clients are to be added to NIO in the first place, which IMO is debatable, I'd prefer to include just protocol implementations in the spirit of NIOHTTP1/2/WebSocket).

This is true, but unless it's handled by NIO, or we agree to use something like NIOKit or SwiftNIO Extras I think it's "out of scope" for the purpose of any base driver library aside from building a connection.

I think this is incorrect. Redis strings should always be treated as arbitrary Data objects, even a simple string doesn't have to be UTF-8, could be Latin-1 as well. They call them "strings", but they are Data objects really.

Is this feedback to change this to [UInt8] storage rather than String? I don't disagree with this, and can easily make the change.

I think it is desirable to keep this as a ByteBuffer/ByteBuffer slice. Again, just for reference I have this thing:

I remember this feedback during the pitch, and that's why I worked to switch .bulkString to [UInt8] from Data like it was. However, I disagree with going further to ByteBuffer.

My goal for RESPValue & RESPValueConvertible was to provide bridges and representations of RESP with Swift native values. Users who want a ByteBuffer can easily create one however they want using the [UInt8] while it doesn't push users who don't want to, to have to.

In addition, the use of a shared allocator floating as a global (even fileprivate) in order to create RESPValue on the fly just rubs me the wrong way that I can't quite articulate.

In regards to ContiguousArray<RESPValue>, I don't think that's necessary over [RESPValue], as the Standard Library says this:

If the array’s Element type is a struct or enumeration, Array and ContiguousArray should have similar efficiency.

All of the RESPValue cases use struct types for their storage.

Well, so in the implementation itself, you will already have it as a ByteBuffer, free of charge. Converting it to [UInt8] (or String for that matter) will allocate a new storage and copy it. If a user would want to go back to a ByteBuffer, they would need to allocate & copy again.

So I think ByteBuffer would actually be a very good type here. We can offer convenience APIs on top which do the conversion to [UInt8], but then it'll be the user who opts in to allocate and copy. If however (as it's done right now) the library chooses to allocate and copy, the user can't opt out of that, and peak performance might suffer.

Alright, I've seen the reasoning for changing from [UInt8] for storage to ByteBuffer. I've written issue #31 to track this


As chatted about in the Vapor Discord, I'd like to add some feedback for RedisPipeline. So if I understand correctly, the main task of RedisPipeline is batching up the commands so you can get back an array of responses. So whilst the implementation of this feature might make use of pipelining, that's not what the user is after when using the API. A user really wants to send a bunch of commands and get the responses back as an array. Therefore, if we leave everything else as is, I think we should rename this RedisCommandBatcher or something.

RedisPipeline is also confusing in other ways: First, the type doesn't create a pipeline but may use pipelining. The more important thing is that a user might assume that the library doesn't make use of pipelining if the user doesn't use the RedisPipeline type but that's not the case, if the user enqueues two commands with a single flush, it will still be pipelined, the only difference is that their results don't come back as an array.

Quick question regarding this: I really like the above way of having a declarative way for a value. Is there a reason we shouldn't also have a declarative way for the commands themselves? Something like

public enum RESPCommand {
    case get(RESPValueConvertible)
    case increment(RESPValueConvertible)
    case set(RESPValueConvertible, RESPValueConvertible)
    // ...
}

That way, batching of requests can become more natural and we don't need

public func enqueue<T>(operation: (RedisClient) -> EventLoopFuture<T>) -> RedisPipeline

where we have this weird operation: (RedisClient) -> EventLoopFuture<T> which I commented about before. Correct me if I'm wrong, but I believe the T is just there because we need to support all of the singular operations like $0.get(...), which already return an EventLoopFuture but of different types... If we created an enum RESPCommand we could have

connection.sendBatch([.get("foo"), .set("bar", 1), .increment("buz")])

and sendBatch would just take [RESPCommand].
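If such an enum were adopted, one conceivable way to back it onto the existing send(command:with:) primitive could look like this (purely a sketch; `keywordAndArguments` is a made-up helper, not part of the proposal):

```swift
extension RESPCommand {
    // Hypothetical helper: flatten each case into the command keyword and
    // argument list that `send(command:with:)` already accepts.
    var keywordAndArguments: (String, [RESPValueConvertible]) {
        switch self {
        case let .get(key): return ("GET", [key])
        case let .increment(key): return ("INCR", [key])
        case let .set(key, value): return ("SET", [key, value])
        }
    }
}
```

A `sendBatch([RESPCommand])` could then map each case through this helper and pipeline the underlying sends.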


I've thought about this too - but held off until now because I didn't see a huge use case for it.

This discussion has shown that use case. I'll write an issue for this and update this comment with a link to it for tracking.

EDIT: NIORedis#33


No, pipelining doesn't map to write/flush. Even if you do (for brevity written as a synchronous program)

write(request1); flush(); write(request2); flush(); write(request3); flush()

from the perspective of Redis, you still pipelined the requests. The exact TCP packets may or may not (depending on the TCP_NODELAY setting) differ.

If you truly don't want pipelining the program would look like

write(request1); flush(); waitForResponse(); write(request2); flush(); waitForResponse(); write(request3); flush();

Which means in an asynchronous world, you would need to implement some extra stuff to hold onto the N+1st command until we have the Nth command's response.

So whether we use pipelining is not determined by when we flush but rather if we wait for a response before sending the next request.
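In future terms, the strictly non-pipelined version above would chain each command on the previous response. A sketch, using the proposal's send(command:with:) API:

```swift
// Each send only begins after the previous response resolves,
// so no two requests are ever in flight at once (i.e. no pipelining).
let done = connection.send(command: "SET", with: ["key1", "a"])
    .flatMap { _ in connection.send(command: "SET", with: ["key2", "b"]) }
    .flatMap { _ in connection.send(command: "SET", with: ["key3", "c"]) }
```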

Said that,

write(request1); write(request2); write(request3); flush()

is clearly more efficient than

write(request1); flush(); write(request2); flush(); write(request3); flush()

But that's a separate issue. The only thing I was saying is that the main purpose of RedisPipeline is not whether to pipeline or not but rather whether to batch. And if we have a batching mode we can also make sure we don't flush more often than necessary.

Why not both APIs? I would absolutely recommend users to use the latter one (connection.get("key")).

Indeed, in NIO's case I think that's fine because NIO is a low-level tool built for people who write networking libraries which then offer the syntactical sugar on top. For the Redis library I think we should offer the nicest possible API, ie. enums internally but not in the simple user-facing APIs. If however the enums solve a particular issue that users might face too, we might just expose them too in more advanced APIs. So maybe

// nice convenience API
connection.get("key") -> EventLoopFuture<RespValue>
// more complicated API, used to implement the simple API (maybe `internal` only?)
connection.singular(.get("key")) ->  EventLoopFuture<RespValue>
// the underlying API that implements all others, may be public or not
connection.batched([.get("key"), .get("other-key"), .set("foo", "bar")])

For the Redis batching API case however, the enum could work. Just because the user might want to batch multiple thing that don't have the same return type and with the enum we can return them in one array without losing any type-safety.

You are right. Using RESPValue as a key for GET is wrong because Redis won't accept all RESP values as keys. We should solve this with an explicit RESPKey type, maybe something like below.

The enum RESPCommand itself, I would keep as closely aligned to the wire-protocol as possible, as type-safe as possible. Therefore, below I suggest a RESPKey which can represent all RESP keys. It could be backed by either a String or a ByteBuffer, whatever's more correct in the protocol. For convenience (users will want String literals), we should implement the various ExpressibleBy* protocols. So RESPKey would feel pretty much like a String, but we could do some extra validation (ie. no \r\ns in there)

public struct RESPKey: ExpressibleByStringLiteral, CustomStringConvertible, ExpressibleByStringInterpolation {
    // similar to https://github.com/apple/swift-log/blob/master/Sources/Logging/Logging.swift#L433-L448
    [...]
}

public enum RESPCommand {
    case get(RESPKey)
    case set(RESPKey, RESPValue)
}

// because some users might just want to query with a String
extension RESPCommand {
    public static func get(_ string: String) -> RESPCommand {
        // warning: needlessly inefficient implementation because RESPKey will probably offer
        // a constructor directly from String. Just to show-case 
        return .get("\(string)" as RESPKey)
    }
}

let connection = ...

// convenience API
let result1: EventLoopFuture<RESPValue> = connection.get("foo")

 // directly construct a RESPKey using ExpressibleByStringLiteral/ExpressibleByStringInterpolation
let result2: EventLoopFuture<RESPValue> = connection.singular(.get("foo"))

// via "public static func get(_ string: String) -> RESPCommand"
let explicitlyStringKey: String = "foo"
let result3: EventLoopFuture<RESPValue> = connection.singular(.get(explicitlyStringKey)) 

EDIT: Sorry everybody, whilst I totally stand by the points made in this message I don't think it adds much to the API discussion at all, so you might just skip reading it :slight_smile:. The core message here is: First, we should design the best API, then we should look how to implement this in the most efficient way. I believe that NIO is flexible enough that any reasonable API can be implemented in the most flexible way, of course using things like Redis pipelining without surfacing the word/concept of 'pipelining' to the user.


Is it? If you don't set TCP_NODELAY, write; flush; write; flush if followed very quickly after each other will still end up in one packet (if it fits). If you don't disable Nagle's Algorithm by setting TCP_NODELAY the kernel will try to fit as much as possible into one packet. With NIO, you can disable TCP_NODELAY and more-or-less take control of the packets that get sent out. There's never a guarantee, TCP is a stream and not a message/datagram based protocol.

What you're missing is that even if you send it out in 3 packets very quickly followed by each other, Redis might still see this in one read. Redis (or any other TCP software) cannot reliably distinguish the number of write/send/... calls or the number of flush calls made with NIO, TCP is not framed. Making one write doesn't mean it comes out in one packet, making two writes doesn't mean it comes out in two packets (may be 1, may be 2, may be more). And on the receiving end, receiving something in one packet doesn't necessarily mean it comes out in one read, and receiving something as two packets doesn't necessarily mean it doesn't come out in just one read.

Doesn't my example have a flush in between the write and the waitForResponse? I apologise if I missed it, yes, in NIO you need to flush for any data to be sent.

I understand what Redis is. And I understand why it's beneficial to make use of pipelining because it can reduce the amount of work Redis needs to do. But even write; flush; write; flush; write; flush in very quick succession is very likely to make use of Redis' pipelining support. Without TCP_NODELAY (which is the default) it's pretty much guaranteed, with TCP_NODELAY I'd still think it's likely. And again, I'm not arguing to not do the more efficient write; write; write; flush, I'm just saying this is not about pipelining being used or not. As I understand the main point of @Mordil's 'pipeline' API is to batch the results up in an array. But @Mordil will be able to tell us more.

Regarding pipelining more generally, check this image, pipelining means sending a subsequent request before receiving the previous response as outlined before.

I'm 100% happy if we don't need it.

Again, I don't think withPipelining is a great name because it implies that not using withPipelining does never pipeline (which isn't true) but otherwise, sure we can have such an API. But I still believe that the main point of the original RedisPipeline API is the returned array and not some potential efficiency gains. Sure, you can manually create that array (eg. using whenAllComplete) but it's less convenient for users than just getting the array in the first place. @Mordil will be able to tell us if it's about the returned array or some potential efficiency gains.

If it was just about when to flush you could also just add an optional autoFlush: Bool = true parameter to each command and write:

client.get("foo1", autoFlush: false) -> future
client.get("foo2", autoFlush: false) -> future
client.get("foo3", autoFlush: false) -> future
client.get("foo4") -> future

but I really don't think it's about the flushes :slight_smile:.

Good point, connection pooling is interesting for all sorts of client APIs: Redis, MySQL, Postgres, HTTP, ...

Awesome, then RESPKey should just be backed by a ByteBuffer but for convenience still be ExpressibleByString*.

The problems RedisPipeline (the type) was intended to solve were twofold:

  1. Sending a series of commands to Redis
  2. Reducing the overhead of an intended batch of commands as much as possible
    a. This solution ideally would be easily extensible to later have a public facing API for Redis' "transactions" by inserting the necessary commands before and after the user's.

I am leaning more towards the following:

class RedisConnection {
    func send(command: String, with: [RESPValueConvertible]) -> EventLoopFuture<RESPValue>
    func send(batch: (RedisClient) -> Void) -> EventLoopFuture<[RESPValue]>
}

// `RedisPipeline` would be deleted

This would remove the need for 3 types (RedisPipeline and the proposed RESPKey and RESPCommand), solve some of the concerns raised about the enqueuing syntax, and still supports the 2 1/2 points I listed above.

All that, while still not exposing much of the NIO implementation details such as "auto flush", ByteBuffer to execute commands, or other conversions.

fwiw, one thing we did when implementing redis clients in the past (not in swift) was to create a basic api that is "command" based, then generate command specific wrappers (eg get, set, inc, etc) since they can be generated using metadata. we arrived at this design because there are 300+ commands


Okay, this sounds great. I would even go further and say: Let's design the best API first, only introducing the concepts a user knows they want to use. I believe that NIO is flexible enough so that we can turn something that the user knows into the most efficient way of sending the bytes (for example leveraging pipelining, without the user needing to specify this).

These "transactions" and batching are great examples of something that should be part of the API: The user knows if they want transactions (if they want 'isolation'), and a user knows when they want to batch requests (if they'd like to execute a few things and get the results back in one array).

Cool! I do however have a question about how to use the batched API. If I understand correctly, the idea is that I do

let connection = ...
// singular API
let getResult = connection.send(command: "GET", with: ["foo"])
// batched API
connection.send { client in
    client.send(command: "GET", with: ["bar"]) /* what happens with the result ? */
    client.send(command: "SET", with: ["foo", "bar"])
    client.send(command: "GET", with: ["foo"])
}

Whilst that looks great API-wise I don't quite understand how the API works. In your first post, you show that the type of RedisClient.send is:

public protocol RedisClient {
    func send(command: String, with arguments: [RESPValueConvertible]) -> EventLoopFuture<RESPValue>
}

ie. each send operation returns an EventLoopFuture<RESPValue> which makes a lot of sense. What I don't get is how that is fitting within your batched API. How are you collecting all the results of the individual send calls that you make in the closure? They all return a future of their values and you need to 'collect' them. Did you maybe mean to write

class RedisConnection {
    /* [...] */
    func send(batch: (RedisClient) -> [EventLoopFuture<RESPValue>]) -> EventLoopFuture<[RESPValue]>
    //                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    //                                make the user return the RESPValues
}

? If yes, then do you want this API to be used this way?

let connection = ...
// batched API
let arrayOfResults: EventLoopFuture<[RESPValue]> = connection.send { client -> [EventLoopFuture<RESPValue>] in
    return [client.send(command: "GET", with: ["bar"]),
            client.send(command: "SET", with: ["foo", "bar"]),
            client.send(command: "GET", with: ["foo"])]
}

Please note, I'm not suggesting to change anything, I'm just trying to understand how this was intended.

And quick secondary question: What's the difference between RedisClient and RedisConnection, are they the same?

That sounds like a very reasonable plan to design to me given that it's 300+ commands :flushed:

Shoot, you caught a great mistake I made.

Below is an example that I'd be OK with:

class RedisConnection {
    func sendBatch(_ commands: (RedisClient) -> [EventLoopFuture<RESPValueConvertible>]) -> EventLoopFuture<[RESPValueConvertible]>
    //                          ^^^^^^^^^^^
    //                       It could be argued that this should be `RedisConnection`,
    //                       since `RedisClient` has a weaker use case now,
    //                       and this isn't a protocol requirement
}

let connection = ...
// batched API
let results = connection.sendBatch { client in
    let first = client.sadd([1, 2, 3, 4, 5], to: "first")
    let second = client.sadd([3, 4, 5, 6], to: "second")
    return [first, second, client.sdiffstore(as: "third", sources: ["first", "second"])]
}
...
let first = results[0] as? Int
let second = results[1] as? Int
let third = results[2] as? Int

This will technically make it possible to chain futures to do their own transformations on the data, returning anything that is RESPValueConvertible - actually giving more freedom to the user without hidden implementation details, while also making it easier to work with the results.

RedisClient is a base protocol for sending commands through Redis, that should be used for generics and other function signatures as a form of type erasure in case you might be passed some other delegate object or implementation.

RedisConnection is just an "out of the box" implementation of RedisClient with other capabilities.

Originally, it was to allow RedisPipeline to be implemented as it has been, but now that it is slated for removal, there's less of a use case for the RedisClient defined this low level. If I was to split the current module into two, as NozeIO does, these types would be in the very base package:

  • RESPEncoder
  • RESPDecoder
  • RedisCommandHandler
    • RedisCommandContext
  • RESPValue
  • RESPValueConvertible
  • RedisError
  • RedisClient

then the "out of the box" implementation module would contain:

  • RedisConnection

Hi All, it's been a few days and I thought I should give a status update:

  • Migrating RESPValue .bulkString and .simpleString to use ByteBuffers has been completed: #34
    • Also with this, I took the time to re-evaluate implementations of the RESPEncoder and RESPDecoder
  • I've started work on re-implementing the feature of sending batch commands
  • These next two weeks are going to be busy for me personally, as for one week I'll be on vacation.

Barring further feedback, which is 100% welcomed, my goal is to have NIORedis ready for formal review by May - with the SSWG voting on it May 16th.
