[Pitch] Standard Network Address types

Recently I’ve been working on a package named swift-dns, which is:

A Swift DNS library built on top of SwiftNIO; aiming to provide DNS client, resolver and server implementations.

I’ve been thinking about decoupling some IP and domain name types that can be useful outside that package as well, and putting them into a dedicated package.

Most of the code is already written and lives in the swift-dns repository, in Sources/DNSModels/NetworkAddress.

Here’s what I’m thinking right now about how to make these useable as a standalone package.

  • There will be a package named something like “NetworkAddress”.
  • A few products: one dedicated to the IP address types, one to DomainName, one for compatibility between the IP and DomainName types, and another that just re-exports everything.
  • The current DomainName implementation contains IDNA compatibility code (for those unfamiliar, IDNA helps with non-ASCII domain names, for example a domain name in Persian or Chinese). I’m thinking of disabling IDNA compatibility by default and introducing an “IDNA” trait that re-enables it. The expectation would be that enabling the trait is up to the end user, not any libraries that might use this “NetworkAddress” library.
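To make the layout above concrete, here is a rough sketch of what such a manifest could look like, assuming SwiftPM package traits (Swift 6.1+). The package, product, and target names are placeholders I made up for illustration, not a final design.

```swift
// swift-tools-version: 6.1
// Hypothetical manifest: every name below is illustrative only.
import PackageDescription

let package = Package(
    name: "network-address",
    products: [
        .library(name: "IPAddress", targets: ["IPAddress"]),
        .library(name: "DomainName", targets: ["DomainName"]),
        .library(name: "IPDomainNameCompatibility", targets: ["IPDomainNameCompatibility"]),
        // Umbrella product that re-exports the three modules above.
        .library(name: "NetworkAddress", targets: ["NetworkAddress"]),
    ],
    traits: [
        // Disabled by default because no `.default(enabledTraits:)` entry lists it.
        .trait(name: "IDNA", description: "Enables IDNA handling for non-ASCII domain names")
    ],
    targets: [
        .target(name: "IPAddress"),
        // Inside this target, IDNA-only code would sit behind `#if IDNA`.
        .target(name: "DomainName"),
        .target(name: "IPDomainNameCompatibility", dependencies: ["IPAddress", "DomainName"]),
        .target(
            name: "NetworkAddress",
            dependencies: ["IPAddress", "DomainName", "IPDomainNameCompatibility"]
        ),
    ]
)
```

An end user would then opt in by passing `traits: ["IDNA"]` on their own `.package(...)` dependency declaration, while libraries in between would leave the trait alone.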

Right now the main API consists of a few types:
IPv4Address, IPv6Address, IPAddress, DomainName, CIDR.

All these types are self-explanatory, and all of them contain optimized implementations of common operations, such as en/decoding to/from String/ByteBuffer, as well as IP <-> DomainName conversions.
That also means that for now this package would be relying on SwiftNIO, until we have a better solution for a “bag of bytes” in the ecosystem, or until I find a reasonable way to make the types independent of ByteBuffer.

The CIDR type contains a somewhat simple CIDR implementation, providing a way to do containment checks and initializations of the CIDR type through trivial bitwise operations, as well as optimized String en/decoding implementations.
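For readers who haven't worked with CIDRs, here is a minimal sketch of the kind of bitwise containment check described above. It is not the package's code: `SketchIPv4Address` and `SketchIPv4CIDR` are hypothetical stand-ins, and only IPv4 is covered.

```swift
/// Hypothetical stand-in for an IPv4 address: the four octets as one 32-bit integer.
struct SketchIPv4Address: Equatable {
    var rawValue: UInt32

    init(_ a: UInt8, _ b: UInt8, _ c: UInt8, _ d: UInt8) {
        rawValue = (UInt32(a) << 24) | (UInt32(b) << 16) | (UInt32(c) << 8) | UInt32(d)
    }
}

/// Hypothetical CIDR block: a masked prefix plus the mask itself.
struct SketchIPv4CIDR {
    let prefix: UInt32
    let mask: UInt32

    init?(_ address: SketchIPv4Address, prefixLength: Int) {
        guard (0...32).contains(prefixLength) else { return nil }
        // A prefix length of 0 matches everything, so its mask is all zeros.
        let mask: UInt32 = prefixLength == 0 ? 0 : UInt32.max << UInt32(32 - prefixLength)
        self.mask = mask
        // Drop any host bits so the stored prefix is canonical.
        self.prefix = address.rawValue & mask
    }

    /// Containment is a single AND followed by a compare.
    func contains(_ address: SketchIPv4Address) -> Bool {
        (address.rawValue & mask) == prefix
    }
}

// Usage: 127.0.0.0/8 is the IPv4 loopback block.
let loopback = SketchIPv4CIDR(SketchIPv4Address(127, 0, 0, 0), prefixLength: 8)!
print(loopback.contains(SketchIPv4Address(127, 0, 0, 1)))
```

Checks like "is loopback" or "is multicast" then reduce to one containment test against a constant block.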

Another thing to mention is that a bunch of the implementations currently require macOS 26 when running on macOS, due to their use of spans, especially String’s UTF8Span. Most if not all of these implementations can be back-deployed if needed, although with worse performance.

There are currently a good number of tests, as well as benchmarks, for these types.
The latest results can be found in the “Summary” section of the benchmark CI runs. Currently the latest CI run is this one.

The current benchmark results are as below:



```
Host 'd20e4aedf073' with 2 'x86_64' processors with 7 GB memory, running:
#71-Ubuntu SMP PREEMPT_DYNAMIC Tue Jul 22 16:52:38 UTC 2025 (Hetzner - Falkenstein)
```

## DomainName

### Equality_Check_CPU_20M

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Time (user CPU) (ms) * |       260 |       260 |       260 |       260 |       270 |       270 |       270 |        20 |

### Equality_Check_Malloc

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Malloc (total) *       |         0 |         0 |         0 |         0 |         0 |         0 |         0 |        10 |

### app-analytics-services_dot_com_Binary_Parsing_CPU_2M

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Time (user CPU) (ms) * |       180 |       180 |       180 |       180 |       190 |       190 |       190 |        28 |

### app-analytics-services_dot_com_Binary_Parsing_Malloc

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Malloc (total) *       |         1 |         1 |         1 |         1 |         1 |         1 |         1 |        10 |

### app-analytics-services_dot_com_String_Parsing_CPU_200K

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Time (user CPU) (ms) * |        80 |        90 |        90 |        90 |        90 |       100 |       100 |        57 |

### app-analytics-services_dot_com_String_Parsing_Malloc

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Malloc (total) *       |         4 |         4 |         4 |         4 |         4 |         4 |         4 |        10 |

### google_dot_com_Binary_Parsing_CPU_2M

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Time (user CPU) (ms) * |       170 |       170 |       170 |       170 |       170 |       180 |       180 |        30 |

### google_dot_com_Binary_Parsing_Malloc

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Malloc (total) *       |         1 |         1 |         1 |         1 |         1 |         1 |         1 |        10 |

### google_dot_com_String_Parsing_CPU_200K

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Time (user CPU) (ms) * |        50 |        60 |        60 |        60 |        60 |        70 |        70 |        83 |

### google_dot_com_String_Parsing_Malloc

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Malloc (total) *       |         4 |         4 |         4 |         4 |         4 |         4 |         4 |        10 |

## IPAddress

### 111_Machine_Warmup_Benchmark

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Time (user CPU) (μs) * |         0 |         0 |         0 |         0 |         0 |     10000 |     10000 |     30916 |

### IPv4_CIDR_Create_Then_Check_Is_Loopback_100M

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Time (user CPU) (ms) * |       100 |       110 |       110 |       110 |       120 |       120 |       120 |        45 |

### IPv4_CIDR_Create_Then_Check_Is_Multicast_100M

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Time (user CPU) (ms) * |       110 |       110 |       110 |       110 |       120 |       120 |       120 |        45 |

### IPv4_CIDR_Create_Then_Check_Is_Multicast_Malloc

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Malloc (total) *       |         0 |         0 |         0 |         0 |         0 |         0 |         0 |        10 |

### IPv4_String_Decoding_Local_Broadcast_10M

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Time (user CPU) (ms) * |       180 |       190 |       190 |       190 |       190 |       200 |       200 |        27 |

### IPv4_String_Decoding_Local_Broadcast_Malloc

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Malloc (total) *       |         0 |         0 |         0 |         0 |         0 |         0 |         0 |        10 |

### IPv4_String_Decoding_Localhost_10M

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Time (user CPU) (ms) * |       150 |       160 |       160 |       160 |       170 |       170 |       170 |        31 |

### IPv4_String_Decoding_Zero_10M

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Time (user CPU) (ms) * |       150 |       150 |       150 |       160 |       160 |       160 |       160 |        33 |

### IPv4_String_Encoding_Local_Broadcast_15M

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Time (user CPU) (ms) * |       180 |       190 |       190 |       190 |       190 |       200 |       200 |        27 |

### IPv4_String_Encoding_Localhost_15M

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Time (user CPU) (ms) * |       170 |       170 |       170 |       180 |       180 |       180 |       180 |        29 |

### IPv4_String_Encoding_Mixed_15M

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Time (user CPU) (ms) * |       170 |       170 |       180 |       180 |       180 |       180 |       180 |        29 |

### IPv4_String_Encoding_Mixed_Malloc

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Malloc (total) *       |         0 |         0 |         0 |         0 |         0 |         0 |         0 |        10 |

### IPv4_String_Encoding_Zero_15M

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Time (user CPU) (ms) * |       170 |       170 |       170 |       180 |       180 |       180 |       180 |        29 |

### IPv6_CIDR_Create_Then_Check_Is_Loopback_100M

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Time (user CPU) (ms) * |       110 |       110 |       110 |       110 |       120 |       120 |       120 |        45 |

### IPv6_CIDR_Create_Then_Check_Is_Multicast_100M

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Time (user CPU) (ms) * |       110 |       110 |       110 |       110 |       120 |       120 |       120 |        45 |

### IPv6_CIDR_Create_Then_Check_Is_Multicast_Malloc

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Malloc (total) *       |         0 |         0 |         0 |         0 |         0 |         0 |         0 |        10 |

### IPv6_String_Decoding_2_Groups_Compressed_At_The_Begining_2M

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Time (user CPU) (ms) * |       100 |       100 |       110 |       110 |       110 |       110 |       110 |        47 |

### IPv6_String_Decoding_2_Groups_Compressed_At_The_End_2M

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Time (user CPU) (ms) * |        90 |        90 |        90 |       100 |       100 |       100 |       100 |        54 |

### IPv6_String_Decoding_2_Groups_Compressed_In_The_Middle_2M

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Time (user CPU) (ms) * |       100 |       100 |       100 |       100 |       110 |       110 |       110 |        49 |

### IPv6_String_Decoding_2_Groups_Compressed_In_The_Middle_Malloc

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Malloc (total) *       |         0 |         0 |         0 |         0 |         0 |         0 |         0 |        10 |

### IPv6_String_Decoding_Localhost_Compressed_10M

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Time (user CPU) (ms) * |        90 |       100 |       100 |       100 |       100 |       110 |       110 |        51 |

### IPv6_String_Decoding_Uncompressed_2M

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Time (user CPU) (ms) * |       120 |       120 |       130 |       130 |       130 |       130 |       130 |        40 |

### IPv6_String_Decoding_Zero_Compressed_10M

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Time (user CPU) (ms) * |        80 |        80 |        80 |        90 |        90 |        90 |        90 |        60 |

### IPv6_String_Decoding_Zero_Uncompressed_2M

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Time (user CPU) (ms) * |       120 |       120 |       130 |       130 |       130 |       130 |       130 |        40 |

### IPv6_String_Encoding_Localhost_10M

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Time (user CPU) (ms) * |       150 |       150 |       150 |       160 |       160 |       160 |       160 |        33 |

### IPv6_String_Encoding_Max_4M

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Time (user CPU) (ms) * |       200 |       210 |       210 |       210 |       210 |       220 |       220 |        24 |

### IPv6_String_Encoding_Mixed_4M

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Time (user CPU) (ms) * |       180 |       180 |       190 |       190 |       190 |       190 |       190 |        27 |

### IPv6_String_Encoding_Mixed_Malloc

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Malloc (total) *       |         1 |         1 |         1 |         1 |         1 |         1 |         1 |        10 |

### IPv6_String_Encoding_Zero_10M

| Metric                 |        p0 |       p25 |       p50 |       p75 |       p90 |       p99 |      p100 |   Samples |
|:-----------------------|----------:|----------:|----------:|----------:|----------:|----------:|----------:|----------:|
| Time (user CPU) (ms) * |       170 |       170 |       180 |       180 |       180 |       180 |       180 |        29 |

Some examples from the results above:
IPv6_String_Decoding_2_Groups_Compressed_In_The_Middle_2M ([2001:0db8:85a3::8a2e:0370:7334]) takes at most 110ms for 2 million rounds, meaning ~18 million+ rounds per second.
IPv4_String_Decoding_Local_Broadcast_10M (255.255.255.255) takes at most 200ms for 10 million rounds, meaning ~50 million+ rounds per second.
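The per-second figures are just the iteration count from the benchmark name divided by the measured time. A tiny sketch of that arithmetic, with a helper name I made up:

```swift
/// Rounds per second, given an iteration count and the time it took in milliseconds.
func roundsPerSecond(iterations: Double, milliseconds: Double) -> Double {
    iterations * 1_000 / milliseconds
}

// 2 million IPv6 decodes in 110 ms, 10 million IPv4 decodes in 200 ms.
let ipv6Rate = roundsPerSecond(iterations: 2_000_000, milliseconds: 110)
let ipv4Rate = roundsPerSecond(iterations: 10_000_000, milliseconds: 200)
print(ipv6Rate, ipv4Rate)
```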

Right now the only thing that is not reasonably optimized is the IDNA implementation, which is only used when a domain name is not plain ASCII.
The implementation is tested extensively against the whole Unicode 17 IDNA test suite of 6400 tests and passes it, but I haven’t yet worked on making sure the implementation is optimized.

I don’t expect a package like this to have a massive impact. I’m mostly just hoping for some refined APIs in different packages. For example, if a library takes an argument for the address of another host, it should be able to use these types to make an incorrect value harder to pass to the library, while also getting better performance in some situations.
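As a rough illustration of that last point, here is a sketch of an API that takes a typed host instead of a raw String. All names here (`SketchIPv4`, `SketchHost`, `describeTarget`) are hypothetical, and the parser is deliberately simple rather than optimized.

```swift
/// Hypothetical stand-in for an IPv4 address type; not the package's real API.
struct SketchIPv4: Equatable {
    let rawValue: UInt32

    /// Accepts exactly four dot-separated groups of 1 to 3 ASCII digits, each in 0...255.
    init?(_ text: String) {
        let parts = text.split(separator: ".", omittingEmptySubsequences: false)
        guard parts.count == 4 else { return nil }
        var value: UInt32 = 0
        for part in parts {
            // Reject signs and anything that is not an ASCII digit before converting.
            guard (1...3).contains(part.utf8.count),
                  part.utf8.allSatisfy({ (0x30...0x39).contains($0) }),
                  let octet = UInt8(part) else { return nil }
            value = (value << 8) | UInt32(octet)
        }
        rawValue = value
    }
}

/// Hypothetical host type: the IP case can only hold an address that parsed successfully.
enum SketchHost: Equatable {
    case ipv4(SketchIPv4)
    case name(String) // a real DomainName type would validate this as well

    init(_ text: String) {
        if let address = SketchIPv4(text) {
            self = .ipv4(address)
        } else {
            self = .name(text)
        }
    }
}

/// Hypothetical library entry point that takes the typed value instead of a String.
func describeTarget(_ host: SketchHost, port: UInt16) -> String {
    switch host {
    case .ipv4(let address):
        return "ipv4 0x\(String(address.rawValue, radix: 16)):\(port)"
    case .name(let name):
        return "name \(name):\(port)"
    }
}
```

The library no longer has to re-parse or re-validate the address on every call, which is where the performance benefit would come from.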

So … what do you-all think? Would such a package benefit the community? Would you do this in another way?

11 Likes

I tried something similar for the domain type: GitHub - coenttb/swift-domain-type: A Swift package with a domain-accurate and type-safe Domain model in line with web standards. Will check out your package in more detail later when I have time.

2 Likes

Can you do some benchmark using inet_ntop and inet_pton as baseline to compare against?

There is also GitHub - tayloraswift/swift-ip: inline ip address types from @taylorswift

That’s an interesting idea. I could add some permanent benchmarks for the inet APIs, just so there is something to compare the library’s benchmarks to.

Off-hand, I have little idea how they’d compare, though.

That’s also an interesting library. I had seen it before but hadn’t taken a deep look at it. I knew it has IP implementations, as is obvious from the name, and had taken a look at the IP APIs there, but that was pretty much it.

There are some differences between this to-be-library (let’s call it network-address) and swift-ip though.

  • The biggest one is that swift-ip has no DomainName support, which makes sense since the library is called swift-ip.
  • I struggled a bit to find where the swift-ip tests are, and thought maybe there are no tests at all. Anyway, it does look like network-address has a much more complete test suite. network-address already has a lot of tests for every situation I could think of.
  • Another difference is that swift-ip isn’t yet using the newer APIs like Span, which can result in better performance unless you’re already using the unsafe pointer APIs. Again, this makes sense to some extent, since the Span APIs are very new.
  • I notice swift-ip has a few dependencies and modules that make it less of a minimal library. If you need any of those modules, network-address won’t be a replacement: it will only contain essential implementations with minimal dependencies.
  • There are also some other, subjective differences, e.g. the API surface in swift-ip. One might prefer swift-ip, or network-address.

So all in all, it appears to me that swift-ip was made for use by the author themselves, with less focus on a general audience. I do think it’s working well for the author, but I’m not sure how an external user would feel.

It looks like network-address will be faster than the inet APIs in 3 of 4 benchmarks, and do the same number of mallocs.

Note that this is when calling the inet APIs from Swift. In some situations it won’t make much of a difference; in others it might. In any case, this is what a Swift user would experience.

I tried to make it a fair battle and make sure we have the least overhead from Swift when using the inet APIs. In the initial benchmarks I put together, the inet APIs were doing worse due to the Swift interaction overhead, and I had to resolve those.
Most of the inet usage is copy pasted from swift-nio.
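For anyone who hasn't called these C APIs from Swift, this is roughly what the baseline side looks like. It is a minimal sketch written for this post, not the benchmark code and not the swift-nio code; the function name is mine.

```swift
#if canImport(Darwin)
import Darwin
#elseif canImport(Glibc)
import Glibc
#elseif canImport(Musl)
import Musl
#endif

/// Parses dotted-decimal IPv4 text with the C `inet_pton` function.
/// Returns the address as a host-order integer, or nil if the text is rejected.
func parseIPv4WithInetPton(_ text: String) -> UInt32? {
    var address = in_addr()
    let status = text.withCString { cString in
        inet_pton(AF_INET, cString, &address)
    }
    // inet_pton returns 1 on success, 0 for unparseable text, -1 for a bad address family.
    guard status == 1 else { return nil }
    // s_addr is stored in network byte order (big-endian).
    return UInt32(bigEndian: address.s_addr)
}
```

Note that `withCString` may need to produce a null-terminated copy of the String, which is one example of the Swift-side interaction overhead a fair benchmark has to watch out for.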

The benchmark results are in this GitHub comment.
Click on ‘Click to expand benchmark result’ to see the results.
The benchmarks with inet in their name are the inet-related benchmarks.
Every inet-related benchmark has a counterpart benchmark using the library’s Swift APIs.
It’s also worth noting that the inet APIs can do more than the Swift APIs do, but not by that much. And again, this is what a Swift user would have to do anyway if they wanted to perform these operations (e.g. parse an IP address from a string).

The run-count only applies to the cpu-user-time benchmarks, not mallocs.

15 million IPv4_String_Encoding_Mixed
swift: 190ms - 0 mallocs
inet_ntop: 1570ms - 0 mallocs

10 million IPv4_String_Decoding_Local_Broadcast
swift: 180ms - 0 mallocs
inet_pton: 240ms - 0 mallocs

4 million IPv6_String_Encoding_Mixed
swift: 200ms - 1 malloc
inet_ntop: 1830ms - 1 malloc

2 million IPv6_String_Decoding_2_Groups_Compressed_In_The_Middle_No_Brackets
swift: 110ms - 0 mallocs
inet_pton: 70ms - 0 mallocs

Good job.

Take a look at the inet_ntop source; Swift can meet and exceed it.

In general, the CIDR/IPAddress types should be in a separate library from the DNS library.

Yeah, I’d want to see which algorithm is used there to parse/decode IPv6 addresses … Working with the IPv6 String representation is not trivial due to the compression marker (::), so they could simply be using a superior algorithm.

By DNS, did you mean DNS protocol, or Domain Name? Just so I know what you’re exactly talking about.

If you meant the DNS protocol, then I’m happy we’re on the same page, since this whole post is about me trying to see if I should actually go for that.

But if you mean Domain Name, then I’ll have to ask others (e.g. the swift-nio team) whether they think the IP address types should live in a completely separate package that the DomainName package can depend on if needed, or whether I can move on with the current plan, which is to host these in the same package but in different targets. With the current plan you can still just not depend on Domain Name if you don’t want to, and then you won’t be compiling it either. It does add some (small?) development overhead, though, since the package manager will still need to fetch the Domain Name code that lives in the same package, and e.g. fetch swift-nio for Domain Name, since it currently uses NIO’s ByteBuffer as its storage. (I’m thinking of putting that behind a trait as well. So an IDNA trait, and a NO_NIO_BYTE_BUFFER trait.)

1 Like

I got nerd-sniped into pixel-peeping the string-to-IPv6 parsing logic…
It definitely wasn’t needed, as what I had written was only ~60% slower and could do ~18 million rounds per second for “2_Groups_Compressed_In_The_Middle_No_Brackets”,
but the net result is positive.
The previous implementation used to be faster for short IPv6 strings, but long story short, worsening that is a fine compromise to me if it makes parsing longer IPv6 strings faster. Short IPv6 strings are still faster to parse than longer ones anyway, which also makes sense.

I basically copy-pasted the Glibc IPv6 parsing implementation to Swift. The logic still has some differences (e.g. my implementation can parse IPv6 strings that are enclosed in brackets, while Glibc can’t, or maybe doesn’t want to for some reason (legacy?!)), but the two are very close.
In the end Glibc wasn’t doing things too differently in terms of logic.
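To show why the compression marker makes this harder than IPv4, here is a deliberately simple sketch of `::` handling with optional brackets. It is neither the Glibc algorithm nor my ported implementation: it splits on the first `::` and parses each side separately, and it ignores zone IDs and embedded IPv4 suffixes. All function names are made up for this example.

```swift
/// Value of one ASCII hex digit, or nil for any other byte.
func hexValue(_ byte: UInt8) -> UInt16? {
    switch byte {
    case 0x30...0x39: return UInt16(byte - 0x30)        // "0"..."9"
    case 0x61...0x66: return UInt16(byte - 0x61) + 10   // "a"..."f"
    case 0x41...0x46: return UInt16(byte - 0x41) + 10   // "A"..."F"
    default: return nil
    }
}

/// Parses colon-separated groups of 1 to 4 hex digits; empty input yields no groups.
func parseGroups(_ bytes: ArraySlice<UInt8>) -> [UInt16]? {
    if bytes.isEmpty { return [] }
    var groups: [UInt16] = []
    for piece in bytes.split(separator: 0x3A, omittingEmptySubsequences: false) { // ":"
        guard (1...4).contains(piece.count) else { return nil }
        var value: UInt16 = 0
        for byte in piece {
            guard let digit = hexValue(byte) else { return nil }
            value = (value << 4) | digit
        }
        groups.append(value)
    }
    return groups
}

/// Parses IPv6 text into eight 16-bit groups, or returns nil if the text is invalid.
func parseIPv6Sketch(_ text: String) -> [UInt16]? {
    var bytes = Array(text.utf8)[...]
    // Accept the bracketed form, e.g. "[::1]".
    if bytes.first == 0x5B { // "["
        guard bytes.count >= 2, bytes.last == 0x5D else { return nil } // "]"
        bytes = bytes.dropFirst().dropLast()
    }
    // Locate the first "::", if any.
    var compression: Int? = nil
    var index = bytes.startIndex
    while index + 1 < bytes.endIndex {
        if bytes[index] == 0x3A && bytes[index + 1] == 0x3A {
            compression = index
            break
        }
        index += 1
    }
    guard let at = compression else {
        // No compression: exactly eight groups are required.
        guard let groups = parseGroups(bytes), groups.count == 8 else { return nil }
        return groups
    }
    // With compression: parse both sides, then fill the gap with zero groups.
    // A second "::" leaves an empty piece on the right side and is rejected there.
    guard let head = parseGroups(bytes[..<at]),
          let tail = parseGroups(bytes[(at + 2)...]),
          head.count + tail.count <= 7 else { return nil }
    let zeros = [UInt16](repeating: 0, count: 8 - head.count - tail.count)
    return head + zeros + tail
}
```

A single-pass parser avoids the intermediate arrays this sketch allocates, which matters a lot at tens of millions of rounds per second.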

The final result is:

3 million IPv6_String_Decoding_2_Groups_Compressed_In_The_Middle_No_Brackets
swift: 130ms - 0 mallocs
inet_pton: 100ms - 0 mallocs

So it went from ~60% slower to ~30% slower than Glibc.

1 Like

Where do you think the remainder of the gap is?

I have some ideas but I’m not quite sure. I could take a look at the assembly to tell, though. I'd guess it's likely due to the lower-level handling of the bytes.

Checking the assembly was part of the plan, but the whole thing took too long so I let it be.

As mentioned, my initial implementation was only 60% slower. After that I didn’t immediately go for copy-pasting the Glibc implementation. I tried a few more implementations out of curiosity, and all were worse than the initial implementation I had put together.

Namely, I tried a first-analyze-then-parse approach, which would first find the positions of all colons and dots (for IPv4-mapped IPv6 addresses), and only then attempt to parse the string. The positions were stack-allocated, packed into different bytes of a few integers. Even then, that turned out even worse than my initial implementation. Presumably the cost of iterating over the bytes, although minimal, was still too high, because in this approach I had to iterate over the bytes twice: once for analyzing and once for parsing.
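To give an idea of what "positions packed into the bytes of an integer" means, here is a small sketch of just the analysis pass. It is a reconstruction for illustration, not the code I benchmarked: it only records colons (not dots), and the function names are hypothetical.

```swift
/// Analysis pass: records the byte offset of each ":" in the text,
/// packed one offset per byte into a single UInt64, so nothing is heap-allocated.
/// Returns nil if there are more than 8 colons or a colon sits past offset 255.
func colonPositions(in text: String) -> (packed: UInt64, count: Int)? {
    var packed: UInt64 = 0
    var count = 0
    var offset = 0
    for byte in text.utf8 {
        if byte == 0x3A { // ":"
            guard count < 8, offset <= 255 else { return nil }
            packed |= UInt64(offset) << UInt64(count * 8)
            count += 1
        }
        offset += 1
    }
    return (packed, count)
}

/// Reads back the n-th recorded offset (n counts from 0).
func position(_ n: Int, in packed: UInt64) -> Int {
    Int((packed >> UInt64(n * 8)) & 0xFF)
}
```

The parsing pass would then walk the same bytes again using these offsets, and that second walk is the extra cost mentioned above.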

One other thing that also took a decent amount of time and threw me off was the Swift compiler.

At some point I had some too-long-to-evaluate compiler errors due to some invalid code, so I just moved part of the function into another function. After that the whole thing started to perform 40% better. Apparently Swift was nudged to actually inline and optimize everything, unlike when the whole logic was in one function.

For the record, the Darwin APIs do quite a bit worse than Glibc or my implementation, for some reason:

On my MacBook:

15 million IPv4_String_Encoding_Mixed
swift: 153ms - 0 mallocs
inet_ntop: 3036ms - 0 mallocs

10 million IPv4_String_Decoding_Local_Broadcast
swift: 251ms - 0 mallocs
inet_pton: 468ms - 0 mallocs

4 million IPv6_String_Encoding_Mixed
swift: 281ms - 1 malloc
inet_ntop: 1473ms - 1 malloc

3 million IPv6_String_Decoding_2_Groups_Compressed_In_The_Middle_No_Brackets
swift: 180ms - 0 mallocs
inet_pton: 360ms - 0 mallocs

1 Like

I just published this as a standalone package under GitHub - swift-dns/swift-endpoint: a high-performance library containing types representing an endpoint, such as DomainName and IPv4/v6Address.
I chose `swift-endpoint` as the name.
See the README for more info.

There are two traits, IDNA_SUPPORT and NIO_BYTE_BUFFER_SUPPORT, but they don’t do anything yet.
I’ll have to see if they’re worth the hassle.
For example, I’ll have to make sure swift-idna has a noticeable impact at all before I work on a trait that can disable the IDNA support. I think IDNA is not needed for most users, unless e.g. they are taking domain names from their users and want to make sure they support all cases. Disabling IDNA might help trim down the binary size, for example for embedded apps.

1 Like

Some more follow-up:

I made the IDNA code 5x-13x faster (based on some benchmarks I put together).
This only applies to cases where the IDNA conversion isn’t already skipped because the implementation notices that the input is already all-good or only needs a simple uppercase → lowercase conversion.
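For context, the skip check works along these lines. This is a simplified sketch, not the swift-idna code: the type and function names are hypothetical, and I'm assuming here that "all-good" means lowercase ASCII letters, digits, hyphens and dots, which is narrower than what the real rules have to consider.

```swift
/// How much work a domain name needs before it can be used (hypothetical type).
enum SketchIDNAWork: Equatable {
    case alreadyNormalized // only lowercase ASCII letters, digits, "-" and "."
    case lowercaseOnly     // plain ASCII, but with uppercase letters
    case fullConversion    // anything else goes through the full IDNA mapping
}

/// Classifies the input in a single pass over its UTF-8 bytes.
func classify(_ domainName: String) -> SketchIDNAWork {
    var sawUppercase = false
    for byte in domainName.utf8 {
        switch byte {
        case 0x61...0x7A, 0x30...0x39, 0x2D, 0x2E: // a-z, 0-9, "-", "."
            continue
        case 0x41...0x5A: // A-Z
            sawUppercase = true
        default:
            return .fullConversion
        }
    }
    return sawUppercase ? .lowercaseOnly : .alreadyNormalized
}
```

Only inputs that land in the last case pay for the slow path, so the 5x-13x figure applies to those alone.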

In all three of swift-idna, swift-endpoint and swift-dns, I also introduced some backwards-compatibility code, so now basically all swift-endpoint and swift-dns APIs are available on macOS 15 as well. In swift-idna, the APIs support down to macOS 13.

I think another ~5x improvement is possible in swift-idna. It would need some more backwards-compatibility hacks since UTF8Span is only available on macOS 26, but it should technically be doable.

2 Likes