Protocol naming

kgominyuk · August 4, 2022, 10:11am

Hi all,

There are two statements on protocol naming in the Swift API Design Guidelines:

Protocols that describe what something is should read as nouns (e.g. Collection).
Protocols that describe a capability should be named using the suffixes able, ible, or ing (e.g. Equatable, ProgressReporting).

They are so broad that many articles are trying to define the naming approaches.

Could you please provide more detail on the difference between "what something is" and "a capability"?
When should SomethingProvider be used? When should SomethingProviding be used?

Gero · August 4, 2022, 11:48am

Hm, this is an interesting question for sure. Of course it all depends, but maybe it's helpful to rephrase the guidelines a bit:

"What something is": I think you could also say that this is something that you can do stuff with. Or that you can do things to. A collection is a "thing" that you can put elements into. You do something to a collection (most of the time). If you put an instance of the protocol somewhere, you perceive it as a static part that "sits around".
"Capability" or "What something is doing": I think here the majority of the time you ask the thing to do something on its own. I admit it's a bit shaky because, e.g. checking equality of two Equatables could be seen as you doing that, but in my opinion that is more something that the two instances "do" themselves. You invoke the function, sure (just as you invoke append on collection), but the result is more the two Equatables doing the check and less you modifying them (so they are more active than passive).

Obviously this is a very opinionated area of discussion, but I rarely have a problem following the above guidelines: If a protocol contains only (or mostly) functions, thus defining active behavior of any instance of an adopting type, I go with the -ing name. If it is about abstracting the storage of data, e.g. it has associated types or properties containing things that are persisted over the lifetime of the instance, I use a noun.
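A minimal sketch of that heuristic, assuming two made-up protocols. `Inventory`, `PriceCalculating`, `Shelf`, and `FlatTaxCalculator` are hypothetical names chosen only for illustration and do not come from any library:

```swift
// Hypothetical names throughout; this only illustrates the heuristic above.

// Noun: the protocol abstracts stored data (an associated type plus a
// property that lives as long as the instance does).
protocol Inventory {
    associatedtype Item
    var items: [Item] { get set }
}

// "-ing": the protocol contains only behavior that the instance performs
// when asked.
protocol PriceCalculating {
    func totalPrice(of prices: [Double]) -> Double
}

struct Shelf: Inventory {
    var items: [String] = []
}

struct FlatTaxCalculator: PriceCalculating {
    let taxRate: Double

    func totalPrice(of prices: [Double]) -> Double {
        // Sum the prices, then add the tax on top.
        prices.reduce(0, +) * (1 + taxRate)
    }
}

// You do something *to* the shelf; you ask the calculator to *do* something.
var shelf = Shelf()
shelf.items.append("book")
let total = FlatTaxCalculator(taxRate: 0.5).totalPrice(of: [10, 30])
```

Whether a given protocol falls cleanly on one side is still a judgment call; the sketch only shows the two ends of the spectrum.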

itaiferber · August 4, 2022, 1:10pm

Like @Gero notes, there are no hard and fast rules here, and things are largely up to interpretation. A slightly different interpretation of these rules (that largely leads to the same results):

"Protocols that describe what something is" are tied inherently to the identity of a type: for example, an Array would be a collection of elements even if the Collection protocol did not exist, or if Array did not adopt it. That is central to the classification of what Array is, and a protocol describing that identity helps group similar types together.
- As a note: these types of protocols are usually so fundamental to a type that they are conformed to unconditionally; e.g., an Array is a Collection regardless of what it holds.
"Protocols that describe a capability" are tied inherently to additional behavior that a type can perform, on top of its core behavior: for instance, OperationQueue is ProgressReporting, but that is not central to its identity, and it could do its job just as well if it weren't ProgressReporting.
- These types of protocols are usually less fundamental to a type's existence, and may be adopted conditionally (e.g. Array is Equatable only if its element type is Equatable) or in an extension externally.
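The unconditional/conditional distinction can be sketched as follows. `Stack` is a hypothetical type invented for this example; only `Sequence`, `Equatable`, and `IndexingIterator` come from the standard library:

```swift
// Hypothetical container type, used only to illustrate the two kinds of
// conformance described above.
struct Stack<Element> {
    var storage: [Element] = []

    mutating func push(_ element: Element) {
        storage.append(element)
    }
}

// Identity: a Stack is a Sequence of its elements no matter what it holds,
// so the conformance is unconditional.
extension Stack: Sequence {
    func makeIterator() -> IndexingIterator<[Element]> {
        storage.makeIterator()
    }
}

// Capability: a Stack can be compared for equality only when its elements
// can, so the conformance is conditional and lives in an extension.
extension Stack: Equatable where Element: Equatable {
    static func == (lhs: Stack, rhs: Stack) -> Bool {
        lhs.storage == rhs.storage
    }
}

var a = Stack<Int>()
a.push(1)
a.push(2)

var b = Stack<Int>()
b.push(1)
b.push(2)

let elements = Array(a)   // works for every Stack
let same = (a == b)       // works only because Int is Equatable
```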

Personally: I would call a type SomethingProvider if providing Somethings is inherently tied to its existence as a type (an example from the codebase I work on: ViewStateProvider), and SomethingProviding if it can provide Somethings, but that's not a central point of the type's existence.

kgominyuk · August 8, 2022, 11:42am

These two interpretations are the most common. Because of this ambiguity, it’s easy to choose a path that leads to an unclear API.

I tried to dig into the SE-0023 API Design Guidelines review, but I didn’t manage to find an answer there. Maybe @dabrahams can shed light on the original intention behind the protocol naming statements.

dabrahams · August 9, 2022, 5:18pm

Chiming in because I was mentioned. Personally, I would never use either of those kinds of names. They describe a function signature or property type that is fully encoded in the type system (“returns a Something”), rather than anything that adds cognitive value for a reader above and beyond the declaration itself. Names should always add value that is not implied by other program elements.

@Gero's rephrasing of “what something is” doesn't really resonate for me. To me, “do stuff with (or to)” sounds like capability. I view “what something is” as a taxonomic classification, as in zoology/biology: is this thing animal, vegetable, or mineral? An animal? OK, more specifically, is it a mammal, insect, marsupial, bird… etc. It's a question that goes to the type's fundamental nature. Most types don't/shouldn't have multiple fundamental natures ("it's a floor wax! No, it's a dessert topping!"). When you come up with a “what something is” name like this, it helps explain to a reader the roles the type can play in a larger context: in the context of architectural elements, it's a support element, or a decorative element, or a door, or a window, or a room.

Capabilities generally apply across many different categories of things, like "can fly," or "has a color," and say less about a thing's role, and more about its discernible attributes.

HTH,
Dave

kgominyuk · August 11, 2022, 9:45pm

I agree - not the best example. My only intention was to give an example of a name with and without the "-ing" suffix. I've read a few articles that suggested using -ing names for things that are doing something, and I was wondering how this applies to the real world.

The zoology example is excellent. But it has one downside - it is very obvious. It is so deeply ingrained in us that bird is a noun that no one will argue. But I'm sure that there are people who will easily say that RepositoryProvider should be named RepositoryProviding.

I just want to clarify to make sure that I understand correctly. Architectural elements fall into the “what something is” category. The name should explain to the reader the role of the element in the architecture and how it fits into the big picture.

The "capability" category describes the object itself outside of the architecture.

dabrahams · August 14, 2022, 8:28pm

Very few names seem to work well for me with "ing," so I may not be the best person to ask about this. I guess my advice would be “try really hard to avoid it.”

The zoology example is excellent. But it has one downside - it is very obvious. It is so deeply ingrained in us that bird is a noun that no one will argue.

My point wasn't about its part of speech, but that it describes what we think of as the fundamental essence of a large category of things. If I ask you what that thing over there is and you tell me it's “airborne” or “physical” or “sentient” or “colorful,” I'll say, “yeah, those are qualities it has that cut across many essences of thing, but what is it?” until you finally tell me it's a bird. If you tell me it's an animal, I might ask you to be more specific, but now we're in the hierarchy of concepts that describes its essence.
Hashable, for example, describes a quality or capability of a type that could cut across the essential categories of collections, integers, or color mappings.
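As a small illustration of a capability cutting across categories, the sketch below uses one generic function that relies only on Hashable. `distinctCount` and `ColorMapping` are hypothetical names made up for this example:

```swift
// Hypothetical helper: it needs only the Hashable capability and knows
// nothing about what kind of thing T fundamentally is.
func distinctCount<T: Hashable>(_ values: [T]) -> Int {
    Set(values).count
}

// Hypothetical "color mapping" value type; Hashable is synthesized because
// every stored property is Hashable.
struct ColorMapping: Hashable {
    var name: String
    var hex: UInt32
}

// The same capability applies to integers, collections, and color mappings.
let integers = distinctCount([1, 2, 2, 3])
let collections = distinctCount([[1, 2], [1, 2], [3]])
let mappings = distinctCount([
    ColorMapping(name: "red", hex: 0xFF0000),
    ColorMapping(name: "red", hex: 0xFF0000),
])
```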

But I'm sure that there are people who will easily say that RepositoryProvider should be named RepositoryProviding.

I don't know what to tell ya. I don't think this thing we've got here identifies a real abstraction… and going back in history to its introduction, it started out as an unnecessary protocol that could have just been a simple function type ((RepositorySpecifier, String) throws -> Void).

A strong hint that there's no abstraction here is that the doc comment summary is "A repository provider," which does nothing to describe the thing being declared beyond repeating its name. This is just one of several reasons the API guidelines put so much emphasis on writing a good doc summary: doing the exercise will tell you when your names are bad, and if you have a hard time writing a good description, it can tell you that your abstraction is bad. Another red flag is that the protocol was able to evolve into something very different from its original API without the name ever changing. "Provider" itself (like “Manager”) is another very good hint that there's no abstraction here. There are a billion ways a thing could "provide repositories" (() -> Repository fits that description); what's special about this one?

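I just want to clarify to make sure that I understand correctly. Architectural elements fall into the “what something is” category. The name should explain to the reader the role of the element in the architecture and how it fits into the big picture.

Yes. To be precise, I would say that architectural element categories (like door, or window) describe what something is. I can't comment on “the big picture.” Saying that an Array is a Collection tells you what it is, but doesn't say much about “the big picture” AFAICT.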

The "capability" category describes the object itself outside of the architecture.

I'm afraid I don't really understand that statement. I would say that the capability category describes “cross-cutting aspects” of types, like that they are serializable or equatable.

Jon_Shier · August 15, 2022, 2:41am

dabrahams:

I don't know what to tell ya. I don't think this thing we've got here identifies a real abstraction… and going back in history to its introduction, it started out as an unnecessary protocol that could have just been a simple function type ((RepositorySpecifier, String) throws -> Void).

A strong hint that there's no abstraction here is that the doc comment summary is "A repository provider," which does nothing to describe the thing being declared beyond repeating its name. This is just one of several reasons the API guidelines put so much emphasis on writing a good doc summary: doing the exercise will tell you when your names are bad, and if you have a hard time writing a good description, it can tell you that your abstraction is bad. Another red flag is that the protocol was able to evolve into something very different from its original API without the name ever changing. "Provider" itself (like “Manager”) is another very good hint that there's no abstraction here. There are a billion ways a thing could "provide repositories" (() -> Repository fits that description); what's special about this one?

Starting with a function as an abstraction can make a lot of sense. The increasingly popular Swift Composable Architecture actually starts with modeling the dependency environment as closures, so the approach is becoming more mainstream. But simple functions don't scale particularly well. This is especially true when you have many functions related to single features, when you need to provide default implementations, when you need to provide associated functionality for some groups of functions, or when you want easily pronounceable names for those groups of related functionality. But perhaps the most common need is that there really do need to be multiple implementations of some set of functionality. Most often there are only two, production and testing, but there may also be multiple versions of each of those as well. So how are we to model and name this abstraction?
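To make the trade-off concrete, here is a rough sketch of the two modeling styles. Every name in it (`RepositoryClient`, `RepositoryAccess`, `StubRepositoryAccess`) is hypothetical and is not taken from SPM or the Swift Composable Architecture:

```swift
// All names here are hypothetical; this is not SPM's or TCA's actual API.

// Style 1: the dependency is a struct of closures. Swapping an
// implementation means building a different value.
struct RepositoryClient {
    var fetch: (String) throws -> String
    var isValid: (String) -> Bool
}

// Style 2: the same requirements grouped under a protocol, which gives the
// group a name and allows default implementations.
protocol RepositoryAccess {
    func fetch(_ url: String) throws -> String
    func isValid(_ url: String) -> Bool
}

extension RepositoryAccess {
    // Default implementation shared by every conforming type.
    func isValid(_ url: String) -> Bool {
        url.hasPrefix("https://")
    }
}

// A testing implementation that only has to supply `fetch`.
struct StubRepositoryAccess: RepositoryAccess {
    func fetch(_ url: String) throws -> String {
        "stub checkout of \(url)"
    }
}

// The closure-based equivalent has to spell out every member.
let testClient = RepositoryClient(
    fetch: { url in "stub checkout of \(url)" },
    isValid: { _ in true }
)

let viaClosure = try? testClient.fetch("https://example.com/repo")
let viaProtocol = try? StubRepositoryAccess().fetch("https://example.com/repo")
let defaultCheck = StubRepositoryAccess().isValid("ftp://example.com/repo")
```

Both styles support a production and a testing implementation; the protocol version is where default implementations and a pronounceable name for the group come from.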

Most often these are modeled exactly as SPM has in the example provided: a single protocol which lists the related requirements necessary for some set of functionality. Now, I can't answer whether the RepositoryProvider protocol is actually necessary, that is, whether there are actually multiple possible providers, but for the sake of argument let's assume there are. Given a single set of associated functionality, how else are we to model it in Swift other than a protocol? What would you name it?

Note that I'm not sure your documentation exercise is as valuable as you hope. If we look at Collection's intro statement, "A sequence whose elements can be traversed multiple times, nondestructively, and accessed by an indexed subscript.", we can easily express RepositoryProvider in similar terms. Perhaps "A cancellable source of information and functionality used to fetch, validate, and create working copies from a repository." So I don't think it's that hard to create a statement that meets your criteria, it's just that the intro statement is most often the least read part of any type's documentation, so doc writers don't often bother to expand it. This is especially true for something like SPM, whose public API is almost never consumed. Libraries like the Swift standard library or the Swift Composable Architecture are somewhat rare in that regard.

dabrahams · August 15, 2022, 8:59pm

You clearly misunderstood me. First, I was not suggesting that functions are a good substitute for every protocol (I am, after all, “Mr. Protocol-Oriented”). I was saying that this particular protocol started out as something that didn't even need to exist. I said it could have been a function type (a.k.a. closure). When your protocol has a single requirement that's a method, you might as well represent that signature as a function type. You can have multiple functions that match the signature of that type, and you can store any one of them in a closure instance.
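A sketch of that equivalence, under stated assumptions: `RepositorySpecifier` below is a hypothetical stand-in rather than SPM's real type, and the protocol is a simplified illustration, not SPM's actual declaration. Only the function signature is taken from the discussion above:

```swift
// Hypothetical stand-in for the SPM type of the same name.
struct RepositorySpecifier {
    var url: String
}

// A simplified, hypothetical protocol whose only requirement is one method...
protocol RepositoryProvider {
    func fetch(_ repository: RepositorySpecifier, to path: String) throws
}

// ...carries the same information as this function type.
typealias FetchRepository = (RepositorySpecifier, String) throws -> Void

// Several functions can match the signature.
var log: [String] = []
let recordingFetch: FetchRepository = { repository, path in
    log.append("\(repository.url) -> \(path)")
}
let noOpFetch: FetchRepository = { _, _ in }

// Any one of them can be stored in a closure instance and swapped later.
let repository = RepositorySpecifier(url: "https://example.com/repo")
var currentFetch: FetchRepository = noOpFetch
try? currentFetch(repository, "/tmp/repo")   // does nothing
currentFetch = recordingFetch
try? currentFetch(repository, "/tmp/repo")   // records one entry in `log`
```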

Note that I'm not sure your documentation exercise is as valuable as you hope.

It's not a hope, it's a fact that I can attest to from experience, and that has been validated by many people I've worked with over the last 20 years.

If we look at Collection's intro statement, "A sequence whose elements can be traversed multiple times, nondestructively, and accessed by an indexed subscript.", we can easily express RepositoryProvider in similar terms.

Yes, that's a good idea.

Perhaps "A cancellable source of information and functionality used to fetch, validate, and create working copies from a repository."

“Source of information and functionality” is vacuous; every type fits that description. All you've said here is that you have three functions that operate on repositories, (fetch, validate, and create). Also, calling a type “cancellable” has no obvious meaning. This is not yet an abstraction.

So I don't think it's that hard to create a statement that meets your criteria,

I'm still waiting to see one.

To be clear, this exercise is not about saying something that has the “ring in the ear” of a formalism, but about saying something that is actually meaningful. The desire for names to have a familiar/uniform feel, irrespective of information content, is part of what led to many of the “muffin man” names in Cocoa. It's obviously a very strong human instinct. In my experience, though, it's not just harmless/needless, but works against comprehension, by making things that are actually different seem the same. It's just as easy to fall into the same trap with documentation as with names, and it takes real work to do better.

it's just that the intro statement is most often the least read part of any type's documentation

If that's true, it's only because intro statements are typically weak, like the example we're discussing now.