Using jargon in the name of the @derivative attribute

This is a spinoff from another thread. Therefore, it's focused on a specific case ("wrt"), but the topic itself might be more general.

So, without further ado:

That is fine, but don't you see a difference? This is all simple and accessible English. There are some abbreviations (func, var, struct… I think that's it?), but those are simple as well (and much less prone to misinterpretation than acronyms).

I never argued that there's something wrong with "with respect to": no matter whether it's a term of art, this is quite simple and accessible, too. uses the long form all over the place, but never mentions the acronym, and shows that it is not well established.

I don't know how I could have made it clearer that I'm not proposing to actually use those words (OT: what is the meaning of the innocent little word "by" for English mathematicians?).
Instead, I suggested asking somebody who's not an expert but is a native speaker (I would do this myself, but I guess many here have much better connections to a suitable high school).


When it comes to naming, I think it's best to keep the majority in mind, and not to try to please a very small group by preferring their jargon. An expert may prefer their own wording, but would certainly also understand common language easily.


One thing I think we should keep in mind is that Wikipedia and academic papers are essentially documentation. I would certainly expect the documentation of something like derivative(of:wrt:) to spell out “takes the derivative of foo with respect to x”.
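For illustration, here is a minimal sketch of what such spelled-out documentation could look like on a function with a short `wrt:` label. Everything in it is an assumption made for the example: the `Argument` enum, the `at:` and `step:` parameters, and the finite-difference body are hypothetical stand-ins, not the compiler's automatic differentiation or any proposed API.

```swift
/// Identifies which argument of a two-argument function to differentiate by.
/// (Hypothetical type, introduced only for this sketch.)
enum Argument { case first, second }

/// Takes the derivative of `f` with respect to one of its arguments.
///
/// - Parameters:
///   - f: The function to differentiate.
///   - argument: The argument that varies; "wrt" stands for "with respect to".
///   - point: The values of both arguments at which to evaluate the derivative.
///   - h: Step size for the central-difference approximation.
/// - Returns: An approximation of the partial derivative of `f` at `point`.
func derivative(
    of f: (Double, Double) -> Double,
    wrt argument: Argument,
    at point: (Double, Double),
    step h: Double = 1e-6
) -> Double {
    let (x, y) = point
    switch argument {
    case .first:
        return (f(x + h, y) - f(x - h, y)) / (2 * h)
    case .second:
        return (f(x, y + h) - f(x, y - h)) / (2 * h)
    }
}

// f(x, y) = x²y, so ∂f/∂x = 2xy, which is 12 at (3, 2).
let slope = derivative(of: { x, y in x * x * y }, wrt: .first, at: (3, 2))
```

The label stays short at the call site, while the doc comment carries the full phrase for anyone who looks it up.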


I have no problem with spelling it out in full, I think. I’m only objecting to using another, unrelated argument label.

Well, only if you don't read the second answer, which (with almost as many upvotes as the accepted one), says:

In mathematics "w.r.t." is part of the standard jargon. It is not unusual to see it used (sparingly) in peer-reviewed journal articles.


That does not matter to me — to consider something as well established, I expect that there is a vast majority of supporters.
You could only criticize that this thread isn't representative, because there aren't that many people involved, and it could be that the majority of them have no idea about the truth... after all, this is the internet, and actually every true mathematician would simply answer "yes" (and show a link to a single paper which uses the abbreviation ;-)

I do doubt that wrt is any less simple/accessible compared to with respect to. Especially in the domain-specific usage we're talking about (differentiable maths).

Perhaps wrt is less formal, but that's about it. The two have about the same clarity, and wrt is much more concise.


I would prefer withRespectTo. As a general rule I prefer clarity over concision in just about every case (which I suppose makes me a bit of an outlier in the programming world...)

For this particular case I would point out that one of the explicitly stated goals of the first-class autodifferentiation project is putting these tools in the hands of more people, many of whom aren’t domain experts. One of the things that is hoped for is that more general programmers will dip their toes in neural net programming and integrate NN techniques deeper in their code because it’s easier than ever to do so with first-class support of auto-diff in the compiler.

So while I get that domain-specific abbreviations and acronyms are fair game per Swift naming conventions, I don’t think that’s a great lens for this specific project. We should be shooting for naming that’s easy to scan by people with limited domain knowledge who are moving towards the use of these code patterns for the first time.


My favourite example is some Collection. It could have been spelled opaque Collection to more precisely reflect its behaviour, but instead the syntax is plain and simple English.

With respect to "wrt", I think it's simple enough jargon for a feature like differentiable programming. It's the correct phrase, but the full spelling is simply too long-winded.


Why is it too long? Code is read much more often than written. Also autocomplete will most probably write the label for you.


The question lacks context. It is about English academic language in general, and not about mathematical terminology. It is also a matter of opinion.

“The derivative of x with respect to f” is the correct and established name. In this context, “wrt” is commonplace in written running text. However, in abbreviated form, dx/df is of course much more common, but that translates poorly to a programming language.

True, but even when reading having something fully spelled out can get to be heavy. There's a reason nobody designs languages like COBOL anymore. I mean, what's easier to read: let x = 3 * 5 or MULTIPLY 3 BY 5 GIVING X?

More, longer words do not necessarily lead to more clarity.


Papers are not the only medium by which people communicate. I can assure you that every person in the English-speaking world who has taken high-school calculus has seen “w.r.t.” on the blackboard at some point. Whether they recall this usage is another question.

For precedent on naming for our numerical APIs, see “ulp.”


Sure. But I’m not sure that’s a very apt counterexample for the specific bikeshedding we’re doing here. It uses substantially more words, more letters, and different (and more drawn-out) syntax. For our purposes we’re just talking about a difference in letters.

‘wrt’ versus ‘withRespectTo’ is 10 letters difference. In my opinion it doesn’t read as “heavy” at all.

And again I go back to the meta use case - increasing the approachability of NN techniques and code patterns by programmers who are neither experts in NNs nor necessarily mathematicians.


I did not learn calculus in English, and that's probably why I never saw "w.r.t.". But I can translate words and "with respect to" is easy to understand at a glance, whereas "wrt" is pretty inscrutable until you realize it's an acronym (not necessarily obvious without the dots) and then search online to know what words it stands for, and then map those words to those you learned.

Just providing a data point; I don't really care personally.


If they really are that new, and just want to tinker, they wouldn't care either way. They'll likely go "The example uses x here, so I'll put x here", or "I renamed x to speed, so here should be speed".

Some papers using the abbreviated phrase:


So... let me get this straight: it's not jargon because it's in a mathematical paper?


To me "wrt" is a perfectly clear mathematical term. Yes, it's short for "with respect to". Just like "log" is short for "logarithm", "cos" is short for "cosine", "atan" is short for "arc tangent" or "inverse tangent", "e" is short for Euler's number and the exponential function. And so on. In mathematical formulae and code it's normal to use abbreviations and acronyms, where they exist, for mathematical entities.


Well, yes, it was a bit of a hyperbolic example, but the point still stands. COBOL was designed to be read, and even though code is read more than it is written, nobody is looking to follow in its syntactic footsteps.

And it may be only ten letters difference, but that is still ten extra, unnecessary, letters each and every time your eyes roll over it. It adds up.

I’m sorry, but it’s going to take a lot to convince me that someone is going to overcome the hurdles that need to be overcome to start using a function like this in their code only to find “wrt” a bridge too far. Or to never come across a use of “wrt” in their learning leading up to this usage.

FWIW, as a mathematician turned programmer: I don't love "wrt", even though I immediately know what it means. In practice, we don't really write out "with respect to x"; we write 𝜕f/𝜕x. I also don't love withRespectTo:, however. It's far too wordy for what will essentially be the most basic operation in differentiable programming¹.

My personal taste would probably be to use a word that's just slightly out of usual formal math usage, but also immediately clear and easier to read: derivative(of: by:).

¹ This is like if function application used a three-word phrase (call f withArguments: x, y) or if we used the name VariableLengthArray<T>. It's an absolutely fundamental operation to the programming model, and warrants a concise name.
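
To compare the three spellings discussed in this thread side by side at the call site, here is a rough sketch. All names in it are hypothetical and exist only for the comparison: the `Variable` enum, the `partial` helper, the `at:` label, and the finite-difference approximation are assumptions, not the actual differentiable programming API.

```swift
// Hypothetical marker for the variable being differentiated by.
enum Variable { case x, y }

// Hypothetical helper: central-difference estimate of a partial derivative.
func partial(
    _ f: (Double, Double) -> Double,
    _ v: Variable,
    _ p: (x: Double, y: Double)
) -> Double {
    let h = 1e-6
    switch v {
    case .x: return (f(p.x + h, p.y) - f(p.x - h, p.y)) / (2 * h)
    case .y: return (f(p.x, p.y + h) - f(p.x, p.y - h)) / (2 * h)
    }
}

// The same operation under the three candidate argument labels.
func derivative(of f: (Double, Double) -> Double, wrt v: Variable,
                at p: (x: Double, y: Double)) -> Double { partial(f, v, p) }
func derivative(of f: (Double, Double) -> Double, withRespectTo v: Variable,
                at p: (x: Double, y: Double)) -> Double { partial(f, v, p) }
func derivative(of f: (Double, Double) -> Double, by v: Variable,
                at p: (x: Double, y: Double)) -> Double { partial(f, v, p) }

// f(x, y) = xy, so ∂f/∂x = y, which is 2 at (3, 2).
let area: (Double, Double) -> Double = { width, height in width * height }
let p = (x: 3.0, y: 2.0)

let a = derivative(of: area, wrt: .x, at: p)
let b = derivative(of: area, withRespectTo: .x, at: p)
let c = derivative(of: area, by: .x, at: p)
```

All three calls compute the same value; the only difference is how the label reads when scanning the call.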