One type per file: helpful or harmful?

taylorswift · March 8, 2024, 10:54pm

i had an interesting debate with a colleague this afternoon - it was over some snippets of code that look like this:

public
func build(pipeline:inout Mongo.PipelineEncoder)
{
    Self.loadSubscribers(&pipeline, premium: false, limit: self.limit, skip: 0)
    Self.loadSubscribers(&pipeline, premium: true, limit: self.limit, skip: 0)
}

he said this was stupid, that there was obviously a mutating instance function on Mongo.PipelineEncoder trying to come out here, and that the code would read much more fluently if it were written like this:

public
func build(pipeline:inout Mongo.PipelineEncoder)
{
    pipeline.loadSubscribers(premium: false, limit: self.limit, skip: 0)
    pipeline.loadSubscribers(premium: true, limit: self.limit, skip: 0)
}

but i shot that down because it would run afoul of our organization’s one type per file rule. punctiliously applying that policy would require loadSubscribers to go in a separate Mongo.PipelineEncoder (ext).swift file, and this function is too specific to this particular aggregation pipeline to be visible to the entire module.

it was half-seriously suggested that we should break up the DB queries module into individual modules for each database query, which would satisfy the policy by giving each query its own Mongo.PipelineEncoder (ext).swift file. but i personally felt like that would be a lot of effort to expend in order to satisfy some policy i introduced.

my question is then: do you have a similar one type per file policy in place where you code? if so, have you found it productive or counterproductive?

jlukas · March 8, 2024, 11:15pm

My personal rule is one externally-visible type per file, with a small exception for types that should be nested under the primary type, but can’t be for technical reasons.

For protocols, conforming other types to the protocol is also allowed, as long as the conformance is simple.

Private types/extensions are always allowed.

jrose · March 8, 2024, 11:24pm

You’re probably well aware that I, the main person behind the original private/internal/public access control, think one type per file is garbage. I do think having “stripes” of types across files might indicate that something is odd in your design, but the truth is that types within a module often don’t correspond to component boundaries, and using types as your only tool for component boundaries leads to…well, this. Helper functions stuck arbitrarily onto types or floating free in a file, and way more things being internal when they could have been fileprivate.

JuneBash · March 8, 2024, 11:25pm

No, but we only have 4 total iOS developers. During code review, if a file is getting unwieldy, we'll ask them to split it up so it's easier to parse and better organized. But sometimes it makes sense to group things into a single file, especially if they're small (<20 lines) types.

I could imagine with larger teams it could be helpful to codify this a bit more.

Karl · March 8, 2024, 11:32pm

I am vehemently opposed to all kinds of dogmatic thinking.

Engineers deserve the freedom to decide for themselves how best to structure and organise the systems they build, and blanket policies of this sort take away that freedom. I've found these kinds of policies lead to engineers who have a lack of confidence in their own instincts and experience.

If you find a pattern that works for you, fine. But if it becomes an iron-clad rule, to the point where it excludes alternatives which may be advantageous in their own ways, then I think it's gone too far and it's worth considering whether it has started to sap the craftsmanship out of engineering and suppress the individual talents of the engineers who work on the project. Because personally, I think that's an important thing that all engineers (and all people) need to be fulfilled in their work and lives.

michelf · March 8, 2024, 11:47pm

My own rule: an extension with fileprivate stuff can belong anywhere. If you later need to expose the fileprivate function to other files, then consider moving the extension to its own file or in a file with related extensions.
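A minimal sketch of that rule applied to the snippet from the first post. PipelineEncoder here is a hypothetical stand-in that only records strings (it is not the real Mongo.PipelineEncoder API), and SubscriberQuery, the file name, and the stage text are likewise invented for illustration.

```swift
// SubscriberQuery.swift (hypothetical file name)

// Hypothetical stand-in for the encoder; in a real module it would be
// declared in its own file and imported or shared module-wide.
struct PipelineEncoder {
    var stages: [String] = []
}

struct SubscriberQuery {
    var limit: Int

    func build(pipeline: inout PipelineEncoder) {
        pipeline.loadSubscribers(premium: false, limit: limit, skip: 0)
        pipeline.loadSubscribers(premium: true, limit: limit, skip: 0)
    }
}

// The helper lives next to the only query that uses it. Because the
// extension is fileprivate, other files in the module cannot see the method.
fileprivate extension PipelineEncoder {
    mutating func loadSubscribers(premium: Bool, limit: Int, skip: Int) {
        stages.append("subscribers premium=\(premium) skip=\(skip) limit=\(limit)")
    }
}

var pipeline = PipelineEncoder()
SubscriberQuery(limit: 25).build(pipeline: &pipeline)
```

If another query later needs loadSubscribers, that is the point at which the extension would move to its own file and lose the fileprivate modifier.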

tcldr · March 9, 2024, 12:06am

I followed the same rule once upon a time, probably because I read it was 'good practice' at some point. But like you, I think it created a kind of reluctance or friction to creating a new type. Especially as I think Swift benefits from types being cheap – the compiler likely optimises many of them into thin air anyway – so now I often find myself creating many types per file.

Usually, they'll be fileprivate, and often I find myself designing a mini 'module within a module' with one public (or internal) type and many local fileprivate types. Occasionally, I'll even declare a small type within a function – if it helps, or makes things a bit neater.
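As a rough sketch of what such a 'module within a module' might look like: one internal entry point, two fileprivate helper types, and a small type declared inside a function. All names here (WordCounter, Tokenizer, Tally, Entry) are made up for illustration.

```swift
// The only type the rest of the module is meant to use.
struct WordCounter {
    func topWords(in text: String, limit: Int) -> [String] {
        // A small type declared inside the function, since nothing else needs it.
        struct Entry {
            let word: String
            let count: Int
        }

        let tokens = Tokenizer(text: text).tokens()
        let entries = Tally(tokens).counts.map { Entry(word: $0.key, count: $0.value) }

        // Highest count first; ties are broken alphabetically.
        let sorted = entries.sorted { lhs, rhs in
            if lhs.count != rhs.count { return lhs.count > rhs.count }
            return lhs.word < rhs.word
        }
        return sorted.prefix(limit).map { $0.word }
    }
}

// File-local helper: splits text into lowercase words.
fileprivate struct Tokenizer {
    let text: String

    func tokens() -> [String] {
        text.lowercased()
            .split(whereSeparator: { !$0.isLetter })
            .map(String.init)
    }
}

// File-local helper: counts how often each token occurs.
fileprivate struct Tally {
    private(set) var counts: [String: Int] = [:]

    init(_ tokens: [String]) {
        for token in tokens { counts[token, default: 0] += 1 }
    }
}

let top = WordCounter().topWords(in: "the cat and the hat and the bat", limit: 2)
```

Only WordCounter is visible outside the file, so the helpers can be renamed or removed without touching any other file.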

A+++. Would recommend.

tera · March 9, 2024, 12:20am

I consider one type per file harmful, as well as any other similarly rigid coding rule.

Take an example I posted in another thread:

struct  X1<T> { var x: (T) }
struct  X2<T> { var x: (T, T) }
struct  X3<T> { var x: (T, T, T) }
struct  X4<T> { var x: (T, T, T, T) }
struct  X5<T> { var x: (T, T, T, T, T) }
struct  X6<T> { var x: (T, T, T, T, T, T) }
struct  X7<T> { var x: (T, T, T, T, T, T, T) }
struct  X8<T> { var x: (T, T, T, T, T, T, T, T) }
struct  X9<T> { var x: (T, T, T, T, T, T, T, T, T) }
struct X10<T> { var x: (T, T, T, T, T, T, T, T, T, T) }

typealias   X20<T> =   X2<X10<T>>
typealias   X30<T> =   X3<X10<T>>
typealias   X40<T> =   X4<X10<T>>
typealias   X50<T> =   X5<X10<T>>
typealias   X60<T> =   X6<X10<T>>
typealias   X70<T> =   X7<X10<T>>
typealias   X80<T> =   X8<X10<T>>
typealias   X90<T> =   X9<X10<T>>
typealias  X100<T> =  X10<X10<T>>
typealias  X200<T> =  X2<X100<T>>
typealias  X300<T> =  X3<X100<T>>
typealias  X400<T> =  X4<X100<T>>
typealias  X500<T> =  X5<X100<T>>
typealias  X600<T> =  X6<X100<T>>
typealias  X700<T> =  X7<X100<T>>
typealias  X800<T> =  X8<X100<T>>
typealias  X900<T> =  X9<X100<T>>
typealias X1000<T> = X10<X100<T>>

Taken literally, the one-type-per-file rule would cost 28 files here! It would also add ~200 extra lines of code if you use the standard attribution header in each file:

//
//  FileName.swift
//  ProjectName
//
//  Created by Author on XX/XX/XXXX.
//

IMHO that's the definition of counterproductive.

Jon_Shier · March 9, 2024, 3:10am

I’ve always wondered where the compiler performance inflection point would be between individual files and many types in one file. Did you ever benchmark the various forms?

Joe_Groff · March 9, 2024, 3:12am

Benchmark information would be valuable. My intuition given the way Swift works is that I wouldn't expect breaking things into files within a module to have much effect one way or another on overall compiler performance, since we more or less freely allow cross-file references and so have to treat all of the files in a module as one unit for many purposes.

Jon_Shier · March 9, 2024, 3:22am

The existence and benefit of the batch compilation mode would indicate there are at least some forms of Swift code where the balance moves one way or the other.

taylorswift · March 9, 2024, 3:31am

i have heard of extreme cases where breaking up very long files into smaller files had a dramatic impact on compilation speed, because the long file uses a lot of memory, and this can cause the kernel to start swapping. but that was with a file that was tens of thousands of lines long, and the speed difference was observed on Swift 5.8, which was worse with memory than 5.10 is.

in any case, i don’t think compilation speed should be an important driver in deciding how to distribute code across files. the simple act of upgrading from 5.9 to 5.10, which supports incremental builds in Docker, has already saved me more time than any amount of repo layout optimization could.

jrose · March 9, 2024, 3:38am

Yeah, parallelism within a module is mostly split on file lines (when not doing whole-module optimization), so if you have everything in one file, you won’t get any parallelism. But I wouldn’t expect 32 vs 64 files to make too much of a difference unless the size distributions are wildly off and one batch job gets stuck with all the big files. Still, that’s an analysis from principles, not a benchmark.

rauhul · March 9, 2024, 4:35am

FWIW this absolutely does not hold true for the MMIO interfaces generated by svd2swift. Splitting each peripheral type into its own file dramatically improved compile performance.

dnadoba · March 9, 2024, 5:47am

The Swift API Design Guidelines say "Prefer methods and properties to free functions."

Static methods are often just free functions in a namespace and in your case there is an obvious self (Mongo.PipelineEncoder). Therefore a method should be preferred.
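To put the two spellings side by side, here is a small sketch. PipelineEncoder is again a hypothetical stand-in rather than the real Mongo.PipelineEncoder, and SubscriberQuery, buildStatic, buildMethod, and the stage text are assumptions made up for the comparison.

```swift
// Hypothetical stand-in for the encoder; it only records stage descriptions.
struct PipelineEncoder {
    var stages: [String] = []

    // Method form: the encoder is the obvious self.
    mutating func loadSubscribers(premium: Bool, limit: Int, skip: Int) {
        stages.append("premium=\(premium) skip=\(skip) limit=\(limit)")
    }
}

struct SubscriberQuery {
    var limit: Int

    // Static form: effectively a free function in a namespace, with the
    // real receiver passed as the first inout argument.
    static func loadSubscribers(
        _ pipeline: inout PipelineEncoder, premium: Bool, limit: Int, skip: Int
    ) {
        pipeline.stages.append("premium=\(premium) skip=\(skip) limit=\(limit)")
    }

    func buildStatic(pipeline: inout PipelineEncoder) {
        Self.loadSubscribers(&pipeline, premium: false, limit: self.limit, skip: 0)
        Self.loadSubscribers(&pipeline, premium: true, limit: self.limit, skip: 0)
    }

    func buildMethod(pipeline: inout PipelineEncoder) {
        pipeline.loadSubscribers(premium: false, limit: self.limit, skip: 0)
        pipeline.loadSubscribers(premium: true, limit: self.limit, skip: 0)
    }
}

var viaStatic = PipelineEncoder()
var viaMethod = PipelineEncoder()
let query = SubscriberQuery(limit: 10)
query.buildStatic(pipeline: &viaStatic)
query.buildMethod(pipeline: &viaMethod)
```

Both builders record the same stages; the only difference is where the helper is declared and how the call site reads.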

wadetregaskis · March 9, 2024, 6:08am

Although presumably fileprivate helps, in this particular case?

young · March 9, 2024, 6:45am

Even a simple static initializer for a large array is very, very slow to compile; see the thread "Why compile an array size 3773 statically init each element very slow?"

So I'd guess a larger, more complicated Swift file will be slow too.

Hacksaw · March 9, 2024, 7:00am

That Self.function(&actualTarget, ...) construction makes me feel like my sanity is fading.

In general I try to keep files short and highly related. This sometimes is one type in a file, but often I have a storage manager which is really a collection paired with file loading and saving functions, maybe a current item index, and other helpers, and then the type in the collection is declared afterward.

So class Things {}, and then struct Thing {}.

If I have helper functions which only involved the types declared in a particular file, I'd put them in that file, but if they involved another type, I might put them in a third file. Maybe.

If a function is semantically coupled to the business logic of the app, it'd be in the file where it is used. For instance, a function computing portions of a color code bar mapping time passed to percentage goes in the file drawing the bars, because it's a local convenience function, though not a function that directly does anything with members of the type.

So more like "one highly related and coupled group of things per file."
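A sketch of the Things/Thing layout described above, assuming a JSON-backed store. The members (items, currentIndex, add, encoded, load) are invented for illustration and are not taken from any real project.

```swift
import Foundation

// The storage manager comes first: a collection paired with loading and
// saving helpers and a current-item index.
final class Things {
    private(set) var items: [Thing] = []
    var currentIndex: Int? = nil

    // The item at currentIndex, or nil if the index is unset or out of range.
    var current: Thing? {
        guard let index = currentIndex, items.indices.contains(index) else { return nil }
        return items[index]
    }

    // Appends an item and makes it the current one.
    func add(_ thing: Thing) {
        items.append(thing)
        currentIndex = items.count - 1
    }

    func encoded() throws -> Data {
        try JSONEncoder().encode(items)
    }

    // Replaces the collection and resets the index to the first item, if any.
    func load(from data: Data) throws {
        items = try JSONDecoder().decode([Thing].self, from: data)
        currentIndex = items.isEmpty ? nil : 0
    }
}

// The element type is declared afterward, in the same file.
struct Thing: Codable, Equatable {
    var name: String
}

let store = Things()
store.add(Thing(name: "first"))
store.add(Thing(name: "second"))

let copy = Things()
try copy.load(from: store.encoded())
```

Someone looking for Thing would most naturally open the file that manages the collection, which is the "place I would naturally look first" principle in practice.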

Part of this comes from a basic organizational principle I picked up, which is I try to put things in the place I would naturally look for them first.

tera · March 9, 2024, 12:38pm

Another reason not to have this rigid rule: the type could be huge! E.g. it may have various extensions for several protocol conformances. In that case I'll put those in separate files.
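For example, a type might be spread across files like this. The Money type and the file names in the comments are hypothetical; the sketch is shown as a single listing so it compiles on its own.

```swift
// Money.swift (hypothetical): the core declaration and stored state.
// Equatable is declared here because its synthesized == must live in the
// same file as the type.
struct Money: Equatable {
    var cents: Int
}

// Money+Comparable.swift (hypothetical): one conformance per extension.
extension Money: Comparable {
    static func < (lhs: Money, rhs: Money) -> Bool {
        lhs.cents < rhs.cents
    }
}

// Money+CustomStringConvertible.swift (hypothetical)
extension Money: CustomStringConvertible {
    var description: String {
        "\(cents) cents"
    }
}

let prices = [Money(cents: 250), Money(cents: 99)].sorted()
```

Each extension can grow independently, and the core file stays short even when the type as a whole is large.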

David_Ungar2 · March 10, 2024, 3:44am

Files date back to the days of 9-track mag tape (at least). I'm saddened that they play such a central role for program organization today. A Swift program is a web of interconnections, including depends-on and is-depended-upon. We hope to mostly confine the web to within types, but plenty of strands must go between types, for various reasons, including functionality and features. Despite the best efforts of most IDE teams, the division of code into files still dominates affordances and salience, sigh.

Having gotten that whinge off my chest ;), I wish I could say something helpful. But all I have is "Let the punishment fit the crime" wrt both policy and meta-policy. It depends on your context, including tooling, team organization, priorities, etc.