PatrickA
(Patrick A)
1
I am able to compile in Swift a very simple source file encoded in UTF-8 (with or without BOM).
But when I try to build the same source file encoded in UTF-16, I get an error: "Invalid Swift parseable output (malformed JSON)".
The file (main.swift) :
import Foundation
let str : String = "abc"
print (str)
What is happening? What JSON is malformed?
Note that I can see the UTF-16 file in the code editor, I can edit it, but I cannot build it.
In Xcode > Preferences > Text Editing, I set "Default Text Encoding" to UTF-16BE/LE but to no avail.
Any help is appreciated. Does Swift support only UTF-8 source files?
jrose
(Jordan Rose)
2
Yes, Swift only supports UTF-8 input (in any environment, not just Xcode). That assumption is woven through the compiler in a few ways, mainly in assuming that some content can be copied directly from the source file into the output. Supporting UTF-16 in particular would be tricky because of its 16-bit code units: it would require reworking the lexer quite a bit, which could affect parsing speed (though the effect might not be significant).
Do you have a particular reason why it'd be useful to accept source files encoded as UTF-16?
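Since the compiler only accepts UTF-8, one workaround is to transcode a UTF-16 file before building. Below is a minimal sketch of that conversion using only the standard library. The helper name `transcodeUTF16ToUTF8` is hypothetical, and the sketch assumes that a leading byte order mark selects the byte order and that input without one is big-endian.

```swift
// Hypothetical helper for illustration: transcodes UTF-16 bytes to UTF-8 bytes.
// Assumption: a leading BOM (FF FE or FE FF) selects the byte order and is
// dropped from the output; input without a BOM is treated as big-endian.
func transcodeUTF16ToUTF8(_ bytes: [UInt8]) -> [UInt8] {
    var littleEndian = false
    var start = 0
    if bytes.count >= 2 {
        if bytes[0] == 0xFF && bytes[1] == 0xFE {
            littleEndian = true
            start = 2
        } else if bytes[0] == 0xFE && bytes[1] == 0xFF {
            start = 2
        }
    }

    // Reassemble pairs of bytes into 16-bit code units.
    // A trailing odd byte, if any, is ignored.
    var units: [UInt16] = []
    var index = start
    while index + 1 < bytes.count {
        let first = UInt16(bytes[index])
        let second = UInt16(bytes[index + 1])
        units.append(littleEndian ? (second << 8) | first : (first << 8) | second)
        index += 2
    }

    // String(decoding:as:) repairs invalid sequences with U+FFFD.
    return Array(String(decoding: units, as: UTF16.self).utf8)
}

// "let" encoded as UTF-16LE with a BOM.
let utf16Source: [UInt8] = [0xFF, 0xFE, 0x6C, 0x00, 0x65, 0x00, 0x74, 0x00]
let utf8Source = transcodeUTF16ToUTF8(utf16Source)
print(utf8Source) // [108, 101, 116]
```

Reading the bytes from main.swift and writing the result back is left out to keep the sketch self-contained.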
PatrickA
(Patrick A)
3
Thank you, Jordan.
No, I don't have a pressing need to accept source files encoded as UTF-16. I was just testing how Xcode would behave under various circumstances.
PatrickA
(Patrick A)
4
I have one ancillary question: what is the purpose of the Xcode > Preferences > Text Editing > Default Text Encoding if using anything but UTF-8 may cause a problem?
Avi
5
Xcode supports languages other than Swift.
michelf
(Michel Fortin)
6
That sounds rather off the mark as an error message. I wonder where it comes from.
jrose
(Jordan Rose)
7
Moreover, you can edit files that are not source code from within Xcode.
jrose
(Jordan Rose)
8
Yeah, it'd be worth filing a bug against Xcode and/or Swift to make sure the error message is good. I suspect some of the null bytes from the source file are making it into the output unescaped somehow (UTF-16-encoded ASCII characters have all zeroes in the high byte).
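To make the point about null bytes concrete, here is a small sketch comparing the bytes of the string from the original post in both encodings. The helper `utf16BigEndianBytes` is a hypothetical name introduced for illustration. It only shows where the zero bytes come from; it does not confirm that they are what breaks the JSON output.

```swift
// Hypothetical helper for illustration: big-endian UTF-16 bytes of a string,
// without a byte order mark.
func utf16BigEndianBytes(of text: String) -> [UInt8] {
    var bytes: [UInt8] = []
    for unit in text.utf16 {
        bytes.append(UInt8(unit >> 8))    // high byte (zero for ASCII)
        bytes.append(UInt8(unit & 0xFF))  // low byte
    }
    return bytes
}

let utf8Bytes = Array("abc".utf8)
let utf16Bytes = utf16BigEndianBytes(of: "abc")

print(utf8Bytes)  // [97, 98, 99]
print(utf16Bytes) // [0, 97, 0, 98, 0, 99]
```

Every ASCII character picks up a zero byte in UTF-16, so a tool that reads the file as a sequence of 8-bit characters sees a NUL between each letter.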