I've been looking at the init(contentsOfFile:usedEncoding:) initializer for
NSString in corelibs-foundation.
Am I right in thinking that this initializer should attempt to detect the
character encoding of the file before returning a decoded String?
If so, I've been working on a pure Swift library to detect string
encodings, and wondered if continued work on it might be useful for
implementing this missing method?
On Jun 21, 2017, at 7:39 AM, Andy Best via swift-corelibs-dev <swift-corelibs-dev@swift.org> wrote:
> I've been looking at the init(contentsOfFile:usedEncoding:) initializer for NSString in corelibs-foundation.
> Am I right in thinking that this initializer should attempt to detect the character encoding of the file before returning a decoded String?
In this case, the Foundation implementation just looks at an extended attribute of the file to see if it contains the encoding. If it doesn’t have the xattr then we don’t attempt to guess (name of xattr is “com.apple.TextEncoding”).
Foundation has another API which attempts to guess the encoding of a data blob, but I think we left it out of the swift-corelibs stubs:
> If so, I've been working on a pure Swift library to detect string encodings, and wondered if continued work on it might be useful for implementing this missing method?
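The xattr lookup described above could look roughly like the following. This is only a sketch, not the actual Foundation code: both helper names are hypothetical, and it assumes the attribute value has the form "<IANA charset name>;<CFStringEncoding number>" (for example "utf-8;134217984"). The reader is compiled only where Darwin's six-argument getxattr is available.

```swift
import Foundation

/// Hypothetical helper: maps an attribute value such as "utf-8;134217984"
/// to a String.Encoding. The value format is an assumption, and only the
/// IANA name before the first ';' is inspected.
func parseTextEncodingAttribute(_ value: String) -> String.Encoding? {
    let name = String(value.prefix(while: { $0 != ";" }))
        .trimmingCharacters(in: .whitespaces)
        .lowercased()
    switch name {
    case "utf-8": return .utf8
    case "utf-16": return .utf16
    case "us-ascii": return .ascii
    case "iso-8859-1": return .isoLatin1
    default: return nil // unknown or missing name: do not guess
    }
}

#if canImport(Darwin)
/// Hypothetical helper: reads the raw "com.apple.TextEncoding" attribute,
/// returning nil when the file does not carry it.
func readTextEncodingAttribute(atPath path: String) -> String? {
    let attr = "com.apple.TextEncoding"
    // A first call with a nil buffer asks only for the value's length.
    let size = getxattr(path, attr, nil, 0, 0, 0)
    guard size > 0 else { return nil }
    var buffer = [UInt8](repeating: 0, count: size)
    let read = getxattr(path, attr, &buffer, size, 0, 0)
    guard read > 0 else { return nil }
    return String(decoding: buffer.prefix(read), as: UTF8.self)
}
#endif
```

A caller would pass the result of readTextEncodingAttribute(atPath:) to parseTextEncodingAttribute(_:) and treat nil from either step as "no encoding recorded", which matches the "don't attempt to guess" behaviour described above.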
Is the preferred approach to mirror Foundation as closely as possible (e.g.
under Linux basically do nothing), or is implementing something like
stringEncodingForData under the hood preferable in this case?
On 21 June 2017 at 17:43, Tony Parker <anthony.parker@apple.com> wrote:
> Hi Andy,
> In this case, the Foundation implementation just looks at an extended attribute of the file to see if it contains the encoding. If it doesn’t have the xattr then we don’t attempt to guess (name of xattr is “com.apple.TextEncoding”).
> Foundation has another API which attempts to guess the encoding of a data blob, but I think we left it out of the swift-corelibs stubs:
Our preferred approach so far is to mirror Foundation as closely as possible.
I don’t know if we want to implement stringEncodingForData as part of swift-corelibs-foundation. In any case, we are trying to bring in as few dependencies from outside the Swift project itself as possible, to keep Foundation as low level as possible for stability, ease of use, and ease of portability.
- Tony
On Jun 21, 2017, at 9:51 AM, Andy Best <andybest.net@gmail.com> wrote:
> Is the preferred approach to mirror Foundation as closely as possible (e.g. under Linux basically do nothing), or is implementing something like stringEncodingForData under the hood preferable in this case?
Someone on the team here just reminded me that we do have a very basic form of encoding detection here as well: just looking for the BOM at the beginning of the data.
- Tony
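For illustration, BOM-based detection of the kind mentioned above could be sketched as follows. This is a minimal sketch and not the corelibs implementation; the function name encodingFromBOM is hypothetical, and the byte sequences are the standard Unicode byte order marks.

```swift
import Foundation

/// Hypothetical helper: returns the encoding implied by a leading byte
/// order mark, or nil when the data does not start with one.
func encodingFromBOM(_ data: Data) -> String.Encoding? {
    let bytes = [UInt8](data.prefix(4))
    // The 4-byte UTF-32 marks are checked first because FF FE 00 00 also
    // begins with the UTF-16 little-endian mark FF FE. That sequence is
    // genuinely ambiguous (UTF-16LE text starting with U+0000); this sketch
    // resolves it in favour of UTF-32.
    if bytes.starts(with: [0x00, 0x00, 0xFE, 0xFF]) { return .utf32BigEndian }
    if bytes.starts(with: [0xFF, 0xFE, 0x00, 0x00]) { return .utf32LittleEndian }
    if bytes.starts(with: [0xEF, 0xBB, 0xBF]) { return .utf8 }
    if bytes.starts(with: [0xFE, 0xFF]) { return .utf16BigEndian }
    if bytes.starts(with: [0xFF, 0xFE]) { return .utf16LittleEndian }
    return nil // no BOM: the caller has to fall back to something else
}
```

Files without a BOM (which covers most UTF-8 text) return nil here, so this only complements the xattr lookup and is not a general encoding detector.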
On Jun 21, 2017, at 9:55 AM, Tony Parker via swift-corelibs-dev <swift-corelibs-dev@swift.org> wrote:
> Our preferred approach so far is to mirror Foundation as closely as possible.
> I don’t know if we want to implement stringEncodingForData as part of swift-corelibs-foundation. In any case, we are trying to bring in as few dependencies from outside the Swift project itself as possible, to keep Foundation as low level as possible for stability, ease of use, and ease of portability.
> - Tony