There really is a dearth of information. From personal experience, it mostly comes down to reverse engineering what others have done, plus trial and error.
I have a native Node.js module for interacting with SourceKit that I started a long time ago and have since abandoned, mostly because I moved my VSCode plugin to a native Swift-based implementation of the same idea.
It did work, and it might still. Regardless, it should show you how to use both the C API and the SourceKit protocol (more on that later) from Node. It also gives you at least a starting point for compiler and linker flags, etc.
Most sources, and the SourceKit readme, say you should use the sourcekitd(.so) library object instead of libsourcekitdInProc.so, but I only have the latter file in my swift installation.
It might say that, but in my experience you are right: libsourcekitdInProc.so is the way to go.
Where is the SourceKit API documented?
When you say API, to my mind there are really two APIs for SourceKit. The first is the C API, meaning the individual function calls exposed by libsourcekitdInProc.so. The second is the communication protocol that SourceKit actually uses to do work.
The first one, as far as I know, has no documentation. What I know I've learned through reading the source code and trial and error.
The second one has the beginnings of docs, with a goal of having more (e.g., SR-2117).
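To make the "protocol" half concrete, here is a minimal sketch of building a request in the textual, YAML-like form. It assumes the `source.request.editor.open` request and the `key.name` / `key.sourcefile` keys as I remember them from the protocol docs, so verify them against your toolchain. The helper name `editor_open_request` is made up for illustration, and the sketch only builds the text. It does not call into the library.

```python
import json


def editor_open_request(path: str) -> str:
    """Build the text of an editor.open request for one source file.

    Assumption: the request and key names below match the SourceKit
    protocol docs for your toolchain version.
    """
    # json.dumps is used only to get a double-quoted, escaped string literal.
    quoted = json.dumps(path, ensure_ascii=False)
    fields = [
        ("key.request", "source.request.editor.open"),
        ("key.name", quoted),
        ("key.sourcefile", quoted),
    ]
    body = ",\n".join(f"  {key}: {value}" for key, value in fields)
    return "{\n" + body + "\n}"


print(editor_open_request("/tmp/example.swift"))
```

On the C API side, text like this is what you would hand to something like sourcekitd_request_create_from_yaml before sending the request.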
It seems that SourceKit returns tokens as character offsets from the beginning of the file, but I need line numbers, as Atom’s marker API works in 2 dimensions.
The way I'd put it: it works in byte offsets from the beginning of the file. Those might correspond to characters, but that depends on your file's encoding.
I cannot remember exactly where, but if you look at the source code for sourcekit-test, it has the same requirement (line numbers and character offsets within the line). It has a private API that handles this mapping.
In every implementation I've seen, people implement that mapping manually in their own library.
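A minimal sketch of that manual mapping, assuming the file is UTF-8 and that you want zero-based rows and columns. The function names are my own, and the column here counts Unicode code points. An editor whose API counts UTF-16 code units (as JavaScript strings do) would need an extra conversion step.

```python
from bisect import bisect_right


def build_line_starts(data: bytes) -> list[int]:
    """Return the byte offset at which each line of the file starts."""
    starts = [0]
    for index, byte in enumerate(data):
        if byte == 0x0A:  # "\n"
            starts.append(index + 1)
    return starts


def offset_to_position(data: bytes, line_starts: list[int], offset: int) -> tuple[int, int]:
    """Map a byte offset to a zero-based (row, column) pair.

    Assumes UTF-8 input. An offset that falls in the middle of a
    multi-byte character raises UnicodeDecodeError.
    """
    if not 0 <= offset <= len(data):
        raise ValueError("offset out of range")
    # The last line start that is <= offset identifies the row.
    row = bisect_right(line_starts, offset) - 1
    # Decode only the bytes between the line start and the offset,
    # so multi-byte characters count as a single column.
    column = len(data[line_starts[row]:offset].decode("utf-8"))
    return row, column
```

Build the line-start table once per version of the buffer, then each lookup is a binary search plus decoding a single line prefix.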
How separable are the syntax parsing passes? Does Sourcekit have the concept of synchronization points? Parsing a 2000-line .swift file on every keystroke cannot be very fast.
I was shocked at how performant this is, even when naively rerunning on every edit. That is especially true for the sort of tool I assume you are trying to build (e.g., some sort of Atom code completion tool).
All that said, there is an incremental API that has to do with indexing files, though it is not very well documented or understood by the community. It stands to reason that someone at Apple knows how it works, but to date the docs do not reflect that. I think it is something that should be updated and better discussed in the protocol docs.