[FOU] Locale Components, Language, and Language Components

I want to share a story that may inspire improvements to these APIs, or better documentation for newcomers like me.

My use case is to create a custom Locale using Locale.Components that will render numbers using a specific script. I will then use it in a date formatter.
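For readers who want to see the shape of this use case, here is a minimal sketch. It assumes the Foundation APIs available from macOS 13 / iOS 16, and the "en_US" base identifier and the "arab" numbering system are only illustrative choices, not the ones from my project.

```swift
import Foundation

// Start from an existing locale identifier ("en_US" is an illustrative choice).
var components = Locale.Components(identifier: "en_US")

// Override only the numbering system; "arab" is one BCP 47 numbering system ID.
components.numberingSystem = Locale.NumberingSystem("arab")

// Build the custom locale and hand it to a date formatter.
let customLocale = Locale(components: components)

let dateFormatter = DateFormatter()
dateFormatter.locale = customLocale
dateFormatter.dateStyle = .medium
dateFormatter.timeStyle = .none

// The digits in the output should follow the chosen numbering system.
print(dateFormatter.string(from: Date()))
```

The rest of the locale (language, region, calendar) is inherited from the base identifier, so only the digits should change.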

It all started when I wanted to create a locale components instance with a specific numbering system, which led me to this page in Xcode's Developer Documentation window. This was my first encounter with BCP 47 (I didn't know what it was at that point).

After digging deeper, I found availableNumberingSystems; printing its value showed me an array of strings. I then spent a lot of time working out where to look to learn which numbering system ID produces the digits I wanted. I followed a chain of links, mostly on unicode.org. In order:

  1. Googling "BCP 47" led me to this Wikipedia article.
  2. The section "Extension U (Unicode Locale)" gave me a hint, so I googled '"latn" numbering system unicode extension', which led me to this unicode.org page.
  3. The last line of the Numbering Systems section led me to this page.
  4. This finally led me to what I was looking for here: supplemental/numberingSystems.xml.

Then I was finally able to see which numbering system identifier renders the digits I wanted.
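In hindsight, a quicker way to get the same answer might have been to let Foundation render a sample number for each identifier. The sketch below assumes macOS 13 / iOS 16 or later; sampleDigits(for:) is a hypothetical helper name of my own, and "en" is an arbitrary base language. I have not checked every identifier, and some numbering systems are algorithmic rather than digit-based, so their output may not be a simple digit substitution.

```swift
import Foundation

// Hypothetical helper: formats a sample number using the given numbering system.
func sampleDigits(for system: Locale.NumberingSystem) -> String {
    // "en" is an arbitrary base; only the numbering system is varied.
    var components = Locale.Components(identifier: "en")
    components.numberingSystem = system

    let formatter = NumberFormatter()
    formatter.locale = Locale(components: components)
    formatter.numberStyle = .none
    formatter.usesGroupingSeparator = false

    return formatter.string(from: 1234567890) ?? "?"
}

// Print every identifier Foundation knows about next to a rendered sample.
for system in Locale.NumberingSystem.availableNumberingSystems {
    print(system.identifier, sampleDigits(for: system))
}
```

Scanning that output for the digits you want gives you the identifier without leaving Xcode, although numberingSystems.xml remains the authoritative list.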

I think many developers like me, solving this problem for the first time, will find the documentation a very confusing place to start. I imagine many will give up on it and resort to trial and error until they find what they are looking for.

Since the list of numbering system identifiers is predefined in the standard, I think it would be a major improvement if those numbering systems were defined as static instances like .arab and .latn. Locale.NumberingSystem could have a new initializer that accepts one of those static instances instead of a String value (e.g. Locale.NumberingSystem(identifier: .arab)). The documentation for each of these static values could also show which digits will be rendered. For example:

/// A numbering system that uses the digits: ٠١٢٣٤٥٦٧٨٩
public static var arab: Locale.NumberingSystem { get }
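Until something like that exists in Foundation, the idea can be approximated in your own code with an extension. This is only a sketch of the suggestion, not an existing Foundation API: the static members .arab and .latn below are my own additions, and it assumes macOS 13 / iOS 16 or later.

```swift
import Foundation

// Not part of Foundation: app-level shorthands for two numbering systems.
extension Locale.NumberingSystem {
    /// A numbering system that uses the digits: ٠١٢٣٤٥٦٧٨٩
    static var arab: Locale.NumberingSystem { Locale.NumberingSystem("arab") }

    /// A numbering system that uses the digits: 0123456789
    static var latn: Locale.NumberingSystem { Locale.NumberingSystem("latn") }
}

// Usage: the call site no longer needs a raw string identifier.
var arabicDigitComponents = Locale.Components(identifier: "en_US")
arabicDigitComponents.numberingSystem = .arab
```

This gives autocomplete and inline documentation at the call site, but it has to be maintained by hand, which is why I would rather see it in the framework.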

If this is not feasible, then I think the documentation for Locale.NumberingSystem could explain what BCP 47 is, and either provide a table showing how each value in availableNumberingSystems renders digits or link to supplemental/numberingSystems.xml as the source of truth for this information.
