Hi all,
I, along with @Ben_Cohen, @moiseev and @Devin_Coughlin, have been working on improving Swift APIs for Apple's unified logging system, based on some recent exciting developments to the Swift language and compiler. This post presents an overview of our plan for creating a customizable interface for Apple's logging system. Our proposal uses custom string interpolation, which is a recently-improved language feature, and compile-time interpretation, which is a compiler technology that is in the process of being upstreamed as an experimental feature.
We are very excited to discuss our proposal and welcome your feedback on this. However, the main aim of this post is not to focus too much on Apple specific logging APIs, but to talk about the language features and underlying ideas that enable these improvements to this logging API. Specifically, we think the following aspects of the proposal are applicable more generally:
- We show how the new string interpolation features can be used to surface all functionalities provided by printf-style format string and varargs.
- We present an extensible Swift implementation that deconstructs a string interpolation into a format string and an argument sequence.
- We present a compiler optimization based on compile-time interpretation that can infer the format string and other compile-time constants from the implementation that deconstructs the string interpolation.
We therefore see this proposal as being relevant to the following potential future directions:
- Development of a general-purpose, platform-agnostic logging API for Swift.
- Development of custom logging libraries for interfacing with other logging systems.
- Migration of other C-style Swift APIs such as localization APIs to string interpolation.
- Development of the compile-time interpretation technology and making it available to Swift users.
(A more detailed version of this proposal is available here).
Highlights of this Proposal
-
Changes
os_log
API to accept string interpolations instead of the printf-style format string and varargs combination that it currently accepts.
The new API (written in camelCase) will expose all the formatting options that are supported by the logging system such as tagging data as private, specifying the precision of floating-point numbers etc. -
Presents a Swift implementation of the osLog API based on custom string interpolation.
The implementation will interface with the native C ABIs provided by the logging system, which expect aprintf
-style format string and a contiguous sequence of bytes containing the varargs. The implementation constructs the required inputs to the C ABIs from the string interpolation that is passed in, thereby relieving the users ofosLog
from these low-level considerations. All of this construction logic will be implemented in Swift. Some general, reusable components of the implementation could be made available in the standard library for use by other similar use case. -
Enables Swift users to extend the osLog API to log their custom types.
Swift users can define extensions to the new logging API. The extensions may accept values of user-defined types and may define how to render the type and which fields are private etc. Instances of user-defined types can be interpolated in the log calls as if they were natively-supported values:osLog("User-defined type: \(instance)")
. -
Makes osLog invocations fast and aims to match the performance of os_log calls in C programs compiled with Clang.
This is achieved by an optimization pass that uses compile-time interpretation. Using such a general-purpose framework reduces the need for hardcoded knowledge within the optimization passes. This makes it easier for the implementation to evolve, and allows the extensions to the logging API to benefit from the optimizations. However, the use of the interpreter means that implementing the logging API requires more sophistication. It requires separating out the implementation into a compile-time interpretable part and a runtime part. Nonetheless, this cognitive overhead is hidden from the users and is a concern only for the implementers/extenders of the logging API.
Proposal
The New User Interface for Logging
The following table shows some examples of osLog
calls with formatting options in the proposed syntax.
An equivalent call in the existing os_log
syntax is also shown for comparison. The new custom string interpolation feature allows interpolated terms to be sequences of labeled expressions (similar to parameters of functions). We propose to use these features to specify formatting options with the interpolated expressions.
Proposed Syntax | Existing syntax |
---|---|
osLog("Network delay: \(delay) ms") |
os_log("Network delay: %lf ms", delay) |
osLog("Network delay: \(delay, .decimal(2)) ms") |
os_log("Network delay: %.2lf ms", delay) |
osLog("Network delay: \(private: delay, .decimal(2)) ms") |
os_log("Network delay: %{private}.2lf ms", delay) |
osLog("Login time: \(private: currentTime, .time) ms") |
os_log("Login time: %{private, time_t}d ms", currentTime) |
osLog("My user-defined type: \(myinstance, .myoption(myparam))") |
- |
The formatting options are cases of enums. For standard types the enums could be defined along with the definition of the osLog
API. (For user-defined types, it can be defined by the user creating the logging extension for the type.) E.g., the enum sketched below lists (some of) the formatting options for an Int
type supported by osLog
.
public enum IntLogFormat {
case hex
case octal
case time
case ipAddress
...
}
The formatting options that can be paired with a type is fixed by the definition of the interpolation functions for the type. This provides more type safety than the existing APIs, where formatting options such as time_t
are specified as strings. The interpolated values can also be user-defined types and can also accept user-defined options much like the natively defined types, provided the logging API has been appropriately extended to handle the user-defined type.
Formatting options are required to be static i.e., compile-time known values
An important requirement of the API is that the formatting options must be compile-time constants. If the enum cases accept parameters (e.g. like .decimal(2)
), those parameters should also be compile-time constants. This is necessary as the logging system requires the format string to be a static string in the compiled binary. (This is essential for performance as well as for enforcing a privacy policy.) We propose to generate a compiler error if this requirement is violated.
Logging API Implementation Model
We propose to implement osLog
as a Swift program and make it extensible by Swift users. The details described here would be relevant for users interested in creating such extensions.
As a running example, consider a call: osLog("Login time: \(private: currentTime, .time)")
. The implementation described below will construct a format string of the form: "Login time: %{private,time_t}ld"
and a buffer consisting of the bytes of the interpolated expression: currentTime
, and will invoke the C logging ABIs with these values.
We define osLog
as a function that accepts a struct: PackedLogMsg
, which conforms to the string interpolation protocol: ExpressibleByStringInterpolation
protocol. When a string interpolation is passed to the osLog
function, the compiler would generate code to create an instance of PackedLogMsg
and invoke a sequence of methods on it passing the literal and interpolated parts of the string interpolation.
For example, the sequence of calls generated by the compiler for the call osLog("Login time: \(private: currentTime, .time)")
is similar to the code shown below:
var stringInterpol = PackedLogMsg.StringInterpolation(literalCapacity: 12, interpolationCount: 1)
stringInterpol.appendLiteral("Login time: ")
stringInterpol.appendInterpolation(private: currentTime, .time)
var packedMsg = PackedLogMsg(stringInterpolation: stringInterpol)
osLog(packedMsg)
The construction of the format string and byte buffer happens in the string interpolation protocol methods of PackedLogMsg
. The following code snippet sketches the implementation of these functions. The code snippet has some missing pieces which are elided for brevity. The complete implementation of PackedLogMsg
and osLog
can be found here: PackedLogMsg prototype.
A sketch of the implementation of osLog and PackedLogMsg
public func osLog(_ packedMsg: PackedLogMsg) {
// Check if logging is enabled.
guard logger.isEnabled(...) else { return }
// Allocate a raw byte buffer and a byte encoder.
let byteCount = packedMsg.totalArgByteCount
var bufferPtr = UnsafeMutablePointer<UInt8>.allocate(capacity: byteCount)
var byteEncoder = ByteEncoder(bufferPtr)
// Move the data tracked by 'packedMsg' into the byte buffer.
packedMsg.encode(&byteEncoder)
// Invoke the C logging ABI.
_os_log_impl_wrapper(packedMsg.formatString, bufferPtr, byteCount, ...)
// Clean up all state.
bufferPtr.deallocate()
}
public struct PackedLogMsg : ExpressibleByStringInterpolation {
let stringInterpol: StringInterpolation
public struct StringInterpolation : StringInterpolationProtocol {
var preamble: UInt8 // The first byte of the header.
var argCount: UInt8 // Number of arguments in the buffer.
var formatString: String
var encodeOps: EncodeOperations
public mutating func appendLiteral(_ literal: String) {
formatString += literal // Escaping of percent characters is elided for brevity.
}
public mutating func appendInterpolation(private number: @autoclosure () -> Int64, _ format: IntLogFormat) {
preamble |= 1 // Set the private argument bit in the preamble.
argCount += 1
formatString += "%{private"
switch format {
case .time:
formatString += ",time_t"
case .ipAddress:
formatString += ",network:in_addr"
...
default:
break
}
formatString += "}ld"
encodeOps.append(UInt8(1)) // Append a preamble for the argument.
encodeOps.append(UInt8(8)) // Append the byte size of the argument.
encodeOps.append(number) // Append the argument itself.
}
}
public init(stringInterpolation: StringInterpolation) {
stringInterpol = stringInterpolation
}
}
The implementation is also compile-time interpretable, i.e., it can be interpreted (by a powerful enough interpreter) to infer the the value of the format string at compile time. The use of EncodeOperations
to abstract buffer construction and "autoclosuring" of parameters in the above code snippet are essential design choices that make the implementation easier to interpret. More details are available here.
Enabling User-defined Types to be Logged
The PackedLogMsg.StringInterpolation
struct can be extended by Swift users in order to log user-defined types. To log a user-defined type it suffices to define an overload of appendInterpolation
on PackedLogMsg.StringInterpolation
that accepts the user-defined type and interpolates the properties of the type as desired. The following code snippet illustrates this on a struct Developer
that uses an enum Language
. With this implementation, one can directly log instances of Developer
e.g. osLog("Developer \(dev) made a commit")
.
Note that all that the extension has to do is to invoke appendInterpolation
methods on types already
supported by PackedLogMsg
. In fact, as illustrated by appendInterpolation(_ dev: Developer)
, this can be accomplished by recursively using the string interpolation syntax itself, once we have defined a appendInterpolation
overload on PackedLogMsg
that can accept instances of itself. (See here for the full implementation.)
Extending PackedLogMsg to log a user-defined struct Developer and an enum Language
struct Developer {
let id: Int
var name: String
var preferredLanguage: Language
}
enum Language {
case swift
case objectiveC
case cpp
}
extension PackedLogMsg.StringInterpolation {
// Ideally, the parameters should be made autoclosures to enable compile-time interpretation.
mutating func appendInterpolation(_ dev: Developer) {
let msg: PackLogMsg = "Id: \(private: dev.id) Name: \(private: dev.name) Preferred Language: \(dev.preferredLanguage)"
appendInterpolation(msg) // Combines self with msg.stringInterpol.
}
mutating func appendInterpolation(_ lang: Language) {
switch lang {
case .swift:
appendInterpolation("swift")
case .objectiveC:
appendInterpolation("objectiveC")
case .cpp:
appendInterpolation("C++")
}
}
}
Optimization
The overall optimization workflow is summarized below. We propose to create a dedicated optimization pass that drives the following steps. The goal of this optimization is to match the performance of the code generated by Clang for equivalent os_log
calls in C.
-
The compile-time interpreter is run on the string interpolation calls to infer the compile-time value of the struct
packedMsg
and its properties. -
The implementation of
osLog
is optimized by folding in the uses of the properties ofpackedMsg
by their constant values. This involves replacing accesses toformatString
property with the inferred string literal, replacingtotalArgsByteCount
with the inferred constant, and replacing all indirect invocation of the closures tracked bypackedMsg.encodeOps
with their inferred targets. (The functionpackedMsg.encode(&byteEncoder)
invokes the closures tracked bypackedMsg.encodeOps
onbyteEncoder
.) -
All invocations of autoclosures that were created from the arguments to the string interpolation methods are converted to direct calls based on the targets inferred by the compile-time interpreter.
-
Existing function inlining pass is used to inline all direct function calls to
ByteEncoder
methods and closure targets. -
Existing dead-code and dead-object elimination passes are used to eliminate the
packedMsg
instance, which is now unused as its uses have been folded, and also to eliminate the string interpolation calls. -
The heap-allocation of the byte buffer in the
osLog
function is promoted to stack, based on the fact that it is deallocated at the end of the function.
If any of the above steps is not possible, we propose to present an error or warning to the user (or implementer of the logging API) depending on whether it results in a violation of the contract of the logging system or a performance degradation. E.g., if the format string cannot be inferred as a compile-time constant, it's a contract violation and is therefore an error. However, if step 6 is not possible, it is a loss of performance but is not an error.
Compile-Time Interpreter - Upstreaming Plans and Extensions
The compile-time interpreter is available in the Swift tensorflow branch. The interpreter is being upstreamed as an experimental compiler feature as per the plan laid out here: Compile-time interpretation by marcrasi ¡ Pull Request #19579 ¡ apple/swift ¡ GitHub.
We will use our real-world experience of it with osLog to help inform a future language evolution proposal for the compile-time interpreter and for any user-visible feature it enables.
The following is a list of language features that the interpreter would have to support in order for it to be applicable to the implementation sketched above.
- Basic language constructs such as: integers, floats, structures, function calls and generics.
- Array literal construction and array operations such as array indexing, array append.
- String literal construction and string operations such as
+
,+=
etc. - Closure creation, closure assignment and invocation.
The interpreter already supports most of the features listed above, except for closures. We propose to create a new representation within the interpreter for closures, which consists of the target of the closure (which is a SIL function) and variables that are captured. The closures would be treated as another kind of symbolic constant within the interpreter. The closures that are not invoked during interpretation are allowed to use arbitrary Swift code. The targets of the closures that are invoked during interpretation must be interpretable and all the captures of the closure must be constants tracked by the interpreter.
ABI Stability and Backward Deployment
The existing os_log API will be supported to ensure ABI stability. More interestingly, the optimizations proposed here enable backward deployment of code compiled with the new osLog API to older, ABI-stable Swift by eliminating abstractions such as PackedLogMsg
from the compiled code.