Using SourceKit?

What do I need to do to use SourceKit APIs from another language (say, JavaScript)? There’s surprisingly little information on this online.

  • Most sources, and the SourceKit readme, say you should use the sourcekitd(.so) library object instead of libsourcekitdInProc.so, but I only have the latter file in my Swift installation.

  • Where is the SourceKit API documented?

  • It seems that SourceKit returns tokens as character offsets from the beginning of the file, but I need line numbers, as Atom’s marker API works in 2 dimensions.

  • How separable are the syntax parsing passes? Does SourceKit have the concept of synchronization points? Parsing a 2000-line .swift file on every keystroke cannot be very fast.

I'd maybe start by compiling SourceKitten and using its CLI from JavaScript.

SourceKit itself has a CLI, but I think SourceKitten's might be more uniform since there are secondary pieces of SourceKit that are not compiled yet for Linux.

That said, Norio Nomura seems to be the expert here (especially on the compatibility side). Try to contact him, he's a nice guy and will probably help you right away :slight_smile:


Edit: I forgot to mention it, but SourceKitten is the Swift wrapper of SourceKit. It works really well and is well tested, and as of Swift 3.2 (I think?) it compiles and passes every test on Linux.

And on this I really have no idea. It seems to me that it must be working incrementally (i.e. not re-parsing after every keystroke) because SourceKittenDaemon is pretty fast and it uses SourceKit through SourceKitten.

For this, any of the authors of SourceKitten or Benedikt Terhechte (the owner of SourceKittenDaemon) should definitely know the answer.

I think calling a C API from JavaScript will be much easier than calling a Swift API?

Yeah yeah yeah, of course. But what I mean is calling the console commands xD

SourceKitten is both a library and a CLI :slight_smile:

I don’t understand what you mean. (I know SourceKitten has a CLI tool; I don’t get how that helps with the JavaScript Atom package.)

Ahh. I mean that you should be able to do something like this from JS:

  • output = exec ("sourcekitten <query>")
  • resultOfQuery = Result(output)

Now that I think about it though, that means that you'll probably have to parse strings? :thinking: Not good.

Unless SourceKitten outputs JSON from the CLI. I don't know the specs.

SourceKitten does output JSON, and parsing it from JavaScript is trivial.

Here's an example.
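
Independently of that link, a minimal sketch of the round trip in Node might look like the following. It assumes `sourcekitten` is on the PATH and that `sourcekitten structure --file <path>` prints a single JSON document to stdout. The helper name `sourceKittenStructure` and its `run` parameter are made up for illustration; `run` is only there so the process call can be swapped for a stub.

```javascript
const { execFileSync } = require("child_process");

// Hypothetical helper (not part of SourceKitten): runs
// `sourcekitten structure --file <path>` and returns the parsed JSON.
// `run` defaults to execFileSync and can be replaced with a stub.
function sourceKittenStructure(path, run = execFileSync) {
  const output = run("sourcekitten", ["structure", "--file", path], {
    encoding: "utf8", // ask for a string instead of a Buffer
  });
  return JSON.parse(output);
}
```

Using execFileSync with an argument array rather than a shell string avoids quoting problems with file paths that contain spaces.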

What do I need to do to use SourceKit APIs from another language (say: javascript?) There’s surprisingly little information on this online.

There is really a dearth of information. I can tell you from personal experience it is a bunch of reverse engineering what others have done and trial and error.

I have a native Node.js module for interacting with SourceKit that I started a long time ago and have subsequently abandoned, mostly because I transitioned my VSCode plugin to a native Swift-based implementation of the same idea.

It did work and it might still. Regardless, it should show you how you can use both the C API and the SourceKit protocol (more on that later) from Node. It is also at least a starting point for compiler and linker flags, etc.

Most sources, and the SourceKit readme, say you should use the sourcekitd(.so) library object instead of libsourcekitdInProc.so, but I only have the latter file in my swift installation.

It might say that, but in my experience you are right: libsourcekitdInProc.so is the way to go.

Where is the SourceKit API documented?

So when you say API, to my mind there are really two APIs for SourceKit. The first is the C API, i.e. the individual function calls exposed by libsourcekitdInProc.so. The second is the communication protocol that SourceKit actually uses to do work.

The first one, as far as I know, has no documentation. What I know I've learned through reading the source code and trial and error.

The second one has the beginnings of docs, with a goal of having more (e.g., SR-2117).
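
To give a rough idea of what the second kind of API looks like: a protocol request is a dictionary keyed by names such as `key.request`. The sketch below builds an editor-open request as a plain JavaScript object and renders it as YAML-style text. The helper names `editorOpenRequest` and `toYaml` are hypothetical, and the YAML rendering is an assumption about the textual form that command-line tools accept, so check it against the protocol docs before relying on it.

```javascript
// Hypothetical helper: builds the dictionary for an editor-open request.
// The key and request names follow the SourceKit protocol docs.
function editorOpenRequest(name, sourceText) {
  return {
    "key.request": "source.request.editor.open",
    "key.name": name,
    "key.sourcetext": sourceText,
  };
}

// Hypothetical helper: renders a flat request as YAML-style text.
// Assumption: the request UID is written bare, while other values are
// written as JSON literals (quoted strings, plain numbers).
function toYaml(request) {
  return Object.entries(request)
    .map(([key, value]) =>
      key === "key.request"
        ? `${key}: ${value}`
        : `${key}: ${JSON.stringify(value)}`
    )
    .join("\n");
}
```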

It seems that SourceKit returns tokens as character offsets from the beginning of the file, but I need line numbers, as Atom’s marker API works in 2 dimensions.

The way I'd say it is that it uses byte offsets from the beginning of the file. Those might correspond to characters, but that depends on your file's encoding.

I cannot remember exactly where, but if you look at the source code for sourcekit-repl or sourcekit-test, they have the same requirement (line numbers and character offsets within the line). It has a private API that handles this mapping.

In every implementation I've seen, people implement that mapping manually in their own library.
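
A hand-rolled version of that mapping can be quite small. The following is a sketch, assuming the source is UTF-8 on disk, that offsets always fall on character boundaries, and that the editor wants zero-based rows with columns counted in UTF-16 code units (the way JavaScript strings count). The function name `byteOffsetToPoint` is made up for illustration.

```javascript
// Converts a UTF-8 byte offset into a zero-based { row, column } pair.
// The column is measured in UTF-16 code units, like JavaScript string indices.
function byteOffsetToPoint(source, byteOffset) {
  const bytes = Buffer.from(source, "utf8");
  const end = Math.min(byteOffset, bytes.length);
  let row = 0;
  let lineStart = 0; // byte index where the current line begins
  for (let i = 0; i < end; i++) {
    // 0x0a never appears inside a multi-byte UTF-8 sequence,
    // so scanning raw bytes for "\n" is safe.
    if (bytes[i] === 0x0a) {
      row += 1;
      lineStart = i + 1;
    }
  }
  // Decode only the bytes between the line start and the offset;
  // the decoded string's length is the column.
  const column = bytes.toString("utf8", lineStart, end).length;
  return { row, column };
}
```

This scans from the start of the file on every call, which is fine for occasional lookups. For many tokens per file, precomputing an array of line-start byte offsets and binary-searching it would avoid the repeated scan.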

How separable are the syntax parsing passes? Does Sourcekit have the concept of synchronization points? Parsing a 2000-line .swift file on every keystroke cannot be very fast.

I know I was shocked at how performant this is, even when just naively rerunning on every edit. That is especially true for the sort of tool I am assuming you are trying to make (e.g., some sort of Atom code completion tool).

All that having been said, there is an incremental API that has to do with indexing files, though it is not very well documented or understood by the community. It stands to reason that someone at Apple knows, but to date the docs do not seem to reflect that. I think that is something that should be updated and better discussed in the protocol docs.

Just to be clear, the JSON output really is a basic function of SourceKit itself and not just SourceKitten. As far as I know, SourceKitten is just surfacing the output of this SourceKit implementation in its CLI.

A pedantic distinction to be sure, but I think it might be important for someone who is implementing things with SourceKit.

As an example, if you look at this you can see how the Node application calls the SourceKit JSON serialization, then later uses V8 to parse that JSON response into a native JavaScript object. This code has no dependency other than Node and SourceKit.
