I was able to get a simple C program calling into libSourceKitInProc.so running, so the next step is getting it to compile under Node.js and apm and writing the JavaScript Nan bindings for it. SourceKit is actually really easy to use; it’s a shame the documentation is so poor.
Weird. It looks like SourceKit is actually inferior to many text editor syntax highlighters in the number of different things it can identify. It can’t separate out function calls from ordinary variable identifiers, or “special” builtins like self or $0. It doesn’t know about standard library symbols like abs(_:) or min(_:_:). The only place it seems to outperform regex-based highlighters is that it can tell the difference between type identifiers and regular identifiers, and it can highlight string interpolations…
IMHO none of those are something that should receive special treatment…
i think basically every editor and most users agreed they are?
That is... non-optimal.
I agree that you really want to be able to categorise things as finely as possible (even if you then map some of them to the same appearance on your syntax theme).
With the standard library stuff, does it maybe work if you actually have an import of the relevant library in the file? I found that was how it works for auto-completion (which is good, I think).
sourcekit exposes about 20 species of syntax tokens. most of them have to do with attributes, the preprocessor, and doc comments, so “normal” code really only gets sorted into identifiers, types, keywords, numeric literals, strings, and string interpolation anchors
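To make the bucketing above concrete, here is a minimal sketch of collapsing SourceKit syntax-kind UIDs into those six categories. The `Category` enum and `categorize` function are hypothetical names made up for illustration, and the sketch assumes the UIDs follow the `source.lang.swift.syntaxtype.<name>` pattern; everything else (attributes, build configuration, comments) is lumped into `Other`.

```cpp
#include <string>

// Coarse categories for "normal" code, following the list in the post above.
// (Hypothetical enum, for illustration only.)
enum class Category { Identifier, Type, Keyword, Number, String, InterpolationAnchor, Other };

// Maps a SourceKit syntax-kind UID string to a coarse category.
// Assumption: UIDs look like "source.lang.swift.syntaxtype.<name>".
Category categorize(const std::string& uid) {
    static const std::string prefix = "source.lang.swift.syntaxtype.";
    // compare() clamps the length, so UIDs shorter than the prefix are handled safely.
    if (uid.compare(0, prefix.size(), prefix) != 0) return Category::Other;

    const std::string name = uid.substr(prefix.size());
    if (name == "identifier") return Category::Identifier;
    if (name == "typeidentifier") return Category::Type;
    if (name == "keyword") return Category::Keyword;
    if (name == "number") return Category::Number;
    if (name == "string") return Category::String;
    if (name == "string_interpolation_anchor") return Category::InterpolationAnchor;
    return Category::Other;  // attributes, build config, comments, doc comments, ...
}
```

A theme can then map several categories to the same appearance, while the finer distinctions asked for earlier in the thread (function calls, self, $0) would still need extra processing on top of this.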
Yeah, a sourcekit node module would be ideal. I think Ryan Lovelett was aiming to do something like that, but I'm not sure what the status of it is.
Also some relevant discussion here I think (Brian - @modocache - is the person responsible for nuclide-swift I think).
i know, i’ve figured most of that out in the last 48 hours or so. I process the data in C and then just pass (row, column) marker ranges to the javascript which handles the Atom UI highlighting. Node.js/APM builds the module cleanly with Nan and V8. Probably the more efficient way to do it compared with parsing SourceKitten output. The only real issue remaining is figuring out how to get more detailed highlighting that doesn’t treat foo and self and $0 the same and knows about stdlib symbols. And a weird Nan bug that seems to corrupt UTF8 strings if they’re exactly 342 characters or longer.
The code is here btw: atomic-blonde/blonde.cpp at master · kelvin13/atomic-blonde · GitHub
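For readers who don't want to dig through the repository, the offset-to-(row, column) step described above could look roughly like the following. This is an independent sketch, not the code from blonde.cpp. It assumes each token arrives as a byte offset and byte length into the source text, and the names `Position`, `Range`, `lineStarts`, `locate`, and `toRange` are made up for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Zero-based position; the column is counted in bytes (an assumption of this sketch).
struct Position { std::size_t row; std::size_t column; };
struct Range { Position start; Position end; };

// Records the byte offset at which each line of `source` starts.
std::vector<std::size_t> lineStarts(const std::string& source) {
    std::vector<std::size_t> starts{0};
    for (std::size_t i = 0; i < source.size(); ++i)
        if (source[i] == '\n') starts.push_back(i + 1);
    return starts;
}

// Converts a byte offset into a (row, column) pair with a binary search.
Position locate(const std::vector<std::size_t>& starts, std::size_t offset) {
    // Find the first line start strictly greater than `offset`;
    // the line just before it is the one containing `offset`.
    auto next = std::upper_bound(starts.begin(), starts.end(), offset);
    std::size_t row = static_cast<std::size_t>(next - starts.begin()) - 1;
    return {row, offset - starts[row]};
}

// Converts a token's (offset, length) pair into a half-open marker range.
Range toRange(const std::vector<std::size_t>& starts, std::size_t offset, std::size_t length) {
    return {locate(starts, offset), locate(starts, offset + length)};
}
```

The line-start table is built once per highlighter cycle, so each token costs one binary search. Columns here are byte counts, so an editor that counts columns in other units would need an extra conversion for lines containing non-ASCII text.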
Another potential problem is linking with the Swift runtime libraries. People seem to install them all over the place (mine live in $HOME/tools/swift/usr/lib for example) which makes -rpath not work.
I think you'd maybe want syntax highlighting to be done from a regex parser because regex are easier to analyze and a regex parser would almost certainly outperform something more complex like SK at this task.
Can't the general context-sensitive tooling be done with SK and the more specific regex-tied tooling be done with a parser and a syntax (YAML I think is the standard?) file?
Edit: that's my best guess as to why SK seems to not be designed for syntax highlighting
we could probably treat self and $ as special cases (hacky) but highlighting standard library symbols is going to be more problematic. that said, the current language-swift89 isn’t much better at it: stdlib symbols like min get overloaded so often in user code that it gives way too many false positives (and it basically breaks every time a new Swift Evolution proposal gets implemented) for me to be happy with it. But doing better would require more sophisticated name resolution beyond simple syntax parsing
There really isn’t a good solution to this because you can define min(_:_:) in another file in the same module and all of a sudden min(_:_:) and Swift.min(_:_:) are two completely different symbols. language-swift89 currently treats all occurrences of min as a standard library symbol, unless a . comes before it, unless Swift comes before the ., which is maybe 75% accurate
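As a rough illustration of the special cases and the prefix rule discussed above, here is a small sketch that refines one identifier token. It is not taken from language-swift89 or from atomic-blonde: the `Refined` enum, the helper names, and the three-entry stdlib list are all hypothetical, and the token is assumed to be given as a byte offset and length into the source.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical refinement categories for an identifier token.
enum class Refined { Plain, SelfKeyword, ClosureArgument, StdlibGuess };

bool isIdentifierChar(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_';
}

// True if `text` is "$" followed by one or more digits, e.g. "$0".
bool isClosureArgument(const std::string& text) {
    if (text.size() < 2 || text[0] != '$') return false;
    for (std::size_t i = 1; i < text.size(); ++i)
        if (!std::isdigit(static_cast<unsigned char>(text[i]))) return false;
    return true;
}

// Tiny stand-in for a real stdlib symbol table (assumption for this sketch).
bool isKnownStdlibName(const std::string& text) {
    return text == "min" || text == "max" || text == "abs";
}

// Refines the identifier occupying [offset, offset + length) in `source`.
Refined refine(const std::string& source, std::size_t offset, std::size_t length) {
    const std::string text = source.substr(offset, length);
    if (text == "self") return Refined::SelfKeyword;
    if (isClosureArgument(text)) return Refined::ClosureArgument;
    if (!isKnownStdlibName(text)) return Refined::Plain;

    // Not preceded by '.': guess that it is the standard library symbol.
    if (offset == 0 || source[offset - 1] != '.') return Refined::StdlibGuess;

    // Preceded by '.': only "Swift." as a whole-word qualifier keeps the guess.
    const std::size_t dot = offset - 1;
    const bool swiftBefore = dot >= 5 && source.compare(dot - 5, 5, "Swift") == 0;
    const bool wholeWord = swiftBefore && (dot == 5 || !isIdentifierChar(source[dot - 6]));
    return wholeWord ? Refined::StdlibGuess : Refined::Plain;
}
```

A purely textual rule like this still marks a bare min as a standard library symbol even when the module defines its own min(_:_:) in another file, which is exactly the kind of false positive described above.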
Oh damn
More syntax highlighting fun:
The atom plugin is basically fully functional and you can test it here: atomic-blonde
atomic-blonde seems to be a bit slower than the language-swift89 grammar. The SourceKit highlighter has a latency of about 11ms per keypress. The old regex one by comparison takes just 3ms.
Here’s a time graph of the SourceKit highlighter (each lobe containing a green rectangle represents one highlighter cycle, 3 cycles are shown):
(Un)surprisingly, over 90% of the extra time is spent in javascript, mostly in calls to clearing and constructing marker layers in the Atom interface. The actual call to blonde, the C++ module that calls SourceKit and does additional processing, takes less than a millisecond. (It’s the tiny yellow rectangle at the bottom outlined in blue.)
Could you trigger the build of the cpp on demand, rather than as part of the apm install? That way you can make the location of the runtime libraries a configuration that people can edit in Atom's settings UI.
(this approach looks great by the way - I think I'll do something similar, with the bare minimum of the SourceKit calls that I need)
user would have to have clang++ or g++ installed no?