I'm currently working on integrating SourceKit with a macOS application. AppKit APIs (e.g. NSAttributedString, NSLayoutManager, etc.) deal in terms of NSRange (UTF-16 code units?). SourceKit, however, deals in terms of integer offsets and lengths (UTF-8 code units?). Is there a more efficient or easier way to convert back and forth between the two other than doing the index(_:offsetBy:) -> samePosition(in:) dance?
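For reference, the index(_:offsetBy:) -> samePosition(in:) approach mentioned above might look roughly like the sketch below. The function name and signature are hypothetical, and it assumes the offsets are byte offsets into the same text that the String holds. Every call walks the string from its start, which is why the cost grows with file size.

```swift
import Foundation

/// Hypothetical helper: converts a UTF-8 byte offset and length into an
/// NSRange measured in UTF-16 code units.
/// Returns nil if the range is out of bounds or does not fall on a
/// position that also exists in the UTF-16 view.
func nsRange(fromUTF8Offset offset: Int, length: Int, in text: String) -> NSRange? {
    let utf8 = text.utf8
    let utf16 = text.utf16
    guard offset >= 0, length >= 0,
          // Walk the UTF-8 view to the start and end positions.
          let start8 = utf8.index(utf8.startIndex, offsetBy: offset, limitedBy: utf8.endIndex),
          let end8 = utf8.index(start8, offsetBy: length, limitedBy: utf8.endIndex),
          // Map both positions into the UTF-16 view.
          let start16 = start8.samePosition(in: utf16),
          let end16 = end8.samePosition(in: utf16)
    else { return nil }
    // Count UTF-16 code units from the start of the string.
    let location = utf16.distance(from: utf16.startIndex, to: start16)
    let count = utf16.distance(from: start16, to: end16)
    return NSRange(location: location, length: count)
}
```

For the string "a\u{1F600}b", the emoji takes four UTF-8 bytes but two UTF-16 code units, so UTF-8 offset 5 with length 1 (the "b") maps to NSRange(location: 3, length: 1).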
I'm currently working on integrating SourceKit with a macOS application. AppKit APIs (e.g. NSAttributedString, NSLayoutManager, etc) deal in terms of NSRange (UTF-16 code units?). SourceKit, however, deals in terms of integer offsets and lengths (UTF-8 code units?).
UTF8 byte offsets
Is there a more efficient or easier way to convert back and forth between the two other than doing the index(_:offsetBy:) -> samePosition(in:) dance?
If you’re doing a bunch of queries in the same file, you could build a table of line start offsets in both UTF-8 and UTF-16. You may then get faster results by going UTF-8 offset -> UTF-8 line + delta -> UTF-16 line + delta -> UTF-16 offset, since the expensive part is then O(line length) instead of O(file size).
I don’t know of a good canned solution.
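A minimal sketch of the line-table idea, for illustration only. The type LineOffsetTable and its method names are hypothetical, and it assumes lines are split on "\n" and that the table is rebuilt whenever the text changes. Building the table is one pass over the file; each lookup is a binary search plus a walk over part of a single line.

```swift
import Foundation

/// Hypothetical helper: records where each line starts in both UTF-8 bytes
/// and UTF-16 code units.
struct LineOffsetTable {
    private let bytes: [UInt8]
    private var utf8Starts: [Int] = [0]
    private var utf16Starts: [Int] = [0]

    init(_ text: String) {
        bytes = Array(text.utf8)
        var utf16Count = 0
        for (i, byte) in bytes.enumerated() {
            utf16Count += LineOffsetTable.utf16Width(ofUTF8Byte: byte)
            if byte == 0x0A { // "\n": the next line starts right after it
                utf8Starts.append(i + 1)
                utf16Starts.append(utf16Count)
            }
        }
    }

    /// UTF-16 code units contributed by one byte of well-formed UTF-8.
    static func utf16Width(ofUTF8Byte byte: UInt8) -> Int {
        if (byte & 0xC0) == 0x80 { return 0 } // continuation byte
        return byte >= 0xF0 ? 2 : 1           // 4-byte sequence -> surrogate pair
    }

    /// Maps a UTF-8 byte offset to a UTF-16 offset, or nil if the offset is
    /// out of bounds or lands inside a multi-byte sequence.
    func utf16Offset(forUTF8Offset offset: Int) -> Int? {
        guard offset >= 0, offset <= bytes.count else { return nil }
        if offset < bytes.count, (bytes[offset] & 0xC0) == 0x80 { return nil }
        // Binary search for the last line whose UTF-8 start is <= offset.
        var low = 0
        var high = utf8Starts.count - 1
        while low < high {
            let mid = (low + high + 1) / 2
            if utf8Starts[mid] <= offset { low = mid } else { high = mid - 1 }
        }
        // Walk only the bytes between the line start and the offset.
        var delta = 0
        for byte in bytes[utf8Starts[low]..<offset] {
            delta += LineOffsetTable.utf16Width(ofUTF8Byte: byte)
        }
        return utf16Starts[low] + delta
    }

    /// Converts a UTF-8 offset and length into an NSRange.
    func nsRange(utf8Offset: Int, utf8Length: Int) -> NSRange? {
        guard utf8Length >= 0,
              let start = utf16Offset(forUTF8Offset: utf8Offset),
              let end = utf16Offset(forUTF8Offset: utf8Offset + utf8Length)
        else { return nil }
        return NSRange(location: start, length: end - start)
    }
}
```

The reverse direction (UTF-16 offset to UTF-8 offset) could follow the same pattern by searching the UTF-16 line starts and walking the line until the UTF-16 delta is reached.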
···
On Mar 24, 2017, at 10:59 AM, Tyler Stromberg via swift-dev <swift-dev@swift.org> wrote:
_______________________________________________
swift-dev mailing list
swift-dev@swift.org
https://lists.swift.org/mailman/listinfo/swift-dev
I ended up writing some convenience APIs to perform these conversions, along with many other useful SourceKit<->Cocoa conversions like line+column, UTF-8, UTF-16 and String.Index, in SourceKitten. It's MIT-licensed so feel free to grab the String extensions from the project yourself: https://github.com/jpsim/SourceKitten/blob/master/Source/SourceKittenFramework/String+SourceKitten.swift
That being said, you might have an easier time working with SourceKitten than with SourceKit directly, since it does a whole lot more, like dynamically resolving+loading which SourceKit to use, caching expensive operations, easier multi-threaded access, generating documentation, etc.
I'm currently working on integrating SourceKit with a macOS application. AppKit APIs (e.g. NSAttributedString, NSLayoutManager, etc) deal in terms of NSRange (UTF-16 code units?). SourceKit, however, deals in terms of integer offsets and lengths (UTF-8 code units?).
UTF8 byte offsets
Thanks for the clarification. =)
Is there a more efficient or easier way to convert back and forth between the two other than doing the index(_:offsetBy:) -> samePosition(in:) dance?
If you’re doing a bunch of queries in the same file, you could build a table of line start offsets in both UTF-8 and UTF-16. You may then get faster results by going UTF-8 offset -> UTF-8 line + delta -> UTF-16 line + delta -> UTF-16 offset, since the expensive part is then O(line length) instead of O(file size).
I don’t know of a good canned solution.
Yeah, this is what I figured I'd have to do.
Thanks for the help!
···
On Mar 24, 2017, at 12:08 PM, Ben Langmuir <blangmuir@apple.com> wrote:
On Mar 24, 2017, at 10:59 AM, Tyler Stromberg via swift-dev <swift-dev@swift.org> wrote:
We started off using those convenience APIs from SourceKitten (huge thanks for SourceKitten, BTW) but ended up moving to our own solution: for this conversion, because of performance problems, particularly in large files; and for interfacing with SourceKit in general, because we wanted a little more control over request/response handling.
···
On Mar 24, 2017, at 12:09 PM, Jean-Pierre Simard <jp@jpsim.com> wrote:
I ended up writing some convenience APIs to perform these conversions along with many other useful SourceKit<->Cocoa conversions like line+column, UTF-8, UTF-16 and String.Index in SourceKitten. It's MIT-licensed so feel free to grab the String extensions from the project yourself: https://github.com/jpsim/SourceKitten/blob/master/Source/SourceKittenFramework/String+SourceKitten.swift
That being said, you might have an easier time working with SourceKitten than with SourceKit directly, since it does a whole lot more, like dynamically resolving+loading which SourceKit to use, caching expensive operations, easier multi-threaded access, generating documentation, etc.