Kafka client package

Shilpee · March 11, 2022, 7:16am

I am Shilpee Gupta, an undergraduate student from IGDTUW, Delhi, India. I am very interested in working on the Kafka Client Package project, since the new concurrency features have always intrigued me. I have gone through the project description and was hoping to discuss it with you privately and get some pointers. I have sent a message for further details.

Thank You!

-- Shilpee Gupta --

felixschlegel · March 11, 2022, 10:18am

Hey @FranzBusch, @Shilpee and everybody else interested in this project!

First of all, thank you @Shilpee for starting the public discussion about the Kafka client package project!

My name is Felix, a CS undergrad at the Technical University of Munich. This project sparked my interest in particular as making C libraries easily accessible in Swift is a game-changer for Swift on Server. Although I have experience with Swift and C, I must admit that I have not worked with Apache Kafka before.

For this reason, I want to ask whether you know of any best practices or resources for getting some hands-on experience with Kafka.

I am looking forward to your answers and wish you all a great start to GSoC!

Best regards

Felix Schlegel

hassila · March 11, 2022, 10:32am

For anyone working on this: you might want to check out https://redpanda.com/ as an additional, perhaps easier to set up, Kafka-compatible backend.

Also GitHub - redpanda-data/redpanda: Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM!

Shilpee · March 11, 2022, 10:33am

Thanks!

felixschlegel · March 11, 2022, 11:08am

Thank you very much! @hassila

FranzBusch · March 11, 2022, 2:01pm

Hello everyone,

Thanks for voicing your interest in this project! Let me try to explain a bit more what this project is about.

Kafka plays a crucial part in many modern software environments, and almost every modern language has clients available. In the past, a lot of different groups have worked on Kafka clients for Swift (Franz, PerfectKafka, SwiftKafka). Some of them are already using librdkafka and can be used as a reference point for how to interop between Swift and the lib. Additionally, a good starting point is to read up on the official documentation of librdkafka, which can be found here. There are also recommendations for developers of language bindings at the bottom.

This project aims to wrap this lib and vend a native Swift API using the new Concurrency features. The two major APIs that need to be exposed are the Consumer and Producer API that librdkafka offers.

As a GSoC participant, you need to develop a proposal on how you want to achieve the end goal. Some rough ideas I have on how you can get started:

First, I would focus on the currency types that a Kafka library needs, regardless of the underlying implementation, e.g., a KafkaClient, KafkaTopic, KafkaMessage, etc. Next, I would focus on the APIs that adopters are using to produce and consume messages. Lastly, you need to figure out how to wrap the actual C library.
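To make the idea of currency types a bit more concrete, here is a minimal sketch. The type names come from the examples above; every field is an assumption for illustration and not a proposed final design.

```swift
// Hypothetical currency types. The names follow the examples in this post;
// the stored properties are assumptions chosen only for illustration.
struct KafkaTopic: Hashable, Sendable {
    let name: String
}

struct KafkaMessage: Equatable, Sendable {
    let topic: KafkaTopic
    let partition: Int32
    let offset: Int64
    let key: [UInt8]?      // optional message key as raw bytes
    let value: [UInt8]     // message payload as raw bytes
}

// Building a message by hand, as a producer-side caller might.
let topic = KafkaTopic(name: "greetings")
let message = KafkaMessage(
    topic: topic,
    partition: 0,
    offset: 42,
    key: nil,
    value: Array("hello".utf8)
)
```

Keeping these as plain value types with no reference to the C library means they can stay the same regardless of what sits underneath.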

Kafka has a large surface area and can be very complex in some parts; I think it is essential to state this here. This project’s scope should be to produce a starting point with the basic APIs needed to consume, produce, and configure librdkafka. Additionally, we want to follow the Swift on Server guides' best practices around logging and adopting Concurrency, which can be found here.

Another point of reference that I think is great to look at is the Go library kafka-go, which natively implements the Kafka protocols and vends both a high-level and a low-level API.

If you have more detailed questions around specific areas, please ask questions here or reach out to me through the forums or at f.busch at apple.com.

Franz

PS: Also, thanks for the link @hassila, having a running Kafka backend makes everything easier. I can also recommend trying to run Kafka inside minikube (tutorial how-to).

lyiyu · March 29, 2022, 3:50pm

Hello @FranzBusch and everyone,
My name is Huiju Li, an undergraduate student at Huazhong University of Science and Technology in China. I have experience with Swift and C, among other languages, and I am particularly interested in your project.

I read your reply above and hope to learn more about this project.

I followed the tutorial you provided to run Kafka in minikube and tried writing a simple wrapper around a C library in Swift, which made me even more interested in the project.

I intend to look at how other Kafka clients for Swift are implemented, hopefully to gain some experience and see where this project can innovate or improve on them.

I'm not sure if I'm learning in the right direction, and I'd appreciate your help on how to move forward and gain a deeper understanding of the project.

I would like to have a more detailed discussion with you on how to develop a proposal. It would probably be best to contact you privately to discuss further.

I just found out about GSoC a few days ago; I hope it's not too late.

Thanks,
Li Huiju

FranzBusch · March 31, 2022, 10:11am

Hi @lyiyu,

Thanks for voicing your interest in the project. It's certainly not too late to get started with GSoC!

I think it is worth exploring all the current Kafka Swift libs to understand how they are doing certain things.

You can contact me here on the forum via private message or directly in this thread for further questions about the proposal. I would encourage everyone to ask as many questions as possible in this thread so that everybody who is interested gets the same information.

A very good resource is the Contributor Guide on the official GSoC page. For writing a proposal in particular, it has a good section on what should be included, which you can find here.

Franz

lyiyu · March 31, 2022, 10:30am

Thanks for your reply. I have read the official guidance and understand the basic structure of a proposal, but I still have some questions and would appreciate help.

Is a good proposal enough to participate in the program, or are there other requirements (in addition to the recommended skills mentioned in the project description)?
Do you have any suggestions for how to divide up the project timeline?
If I finish a draft of the proposal, could I send it to you for review?

I hope I didn't ask a stupid question. Looking forward to your reply, thanks!

FranzBusch · March 31, 2022, 10:57am

Thanks for the questions @lyiyu.

A good proposal is the key to being able to participate in the program. After the submission deadline, all proposals will be evaluated and a decision is taken on which candidates to admit for which project.

Great question! I haven't fully made up my mind but I think a good direction could follow along this timeline:

Set up a local testing environment with Kafka
First prototype that calls librdkafka directly to produce and consume messages
Defining the necessary currency types (KafkaTopic, KafkaMessage, etc.)
Coming up with the Producer interface of the KafkaClient
Creating the Consumer interface based on AsyncSequence
Exposing the librdkafka configuration through some means to the adopters of the new package
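As a rough illustration of the AsyncSequence step, the consumer side could be sketched as below. All names are hypothetical, and the array stands in for whatever a real poll loop would fetch; this is only meant to show the shape adopters would see.

```swift
// Hypothetical message type, kept minimal for this sketch.
struct ConsumedMessage: Sendable, Equatable {
    let offset: Int64
    let value: String
}

// Hypothetical helper: bridges already-fetched messages into an AsyncStream.
// A real consumer would yield from its poll loop instead of from an array.
func makeMessageStream(from polled: [ConsumedMessage]) -> AsyncStream<ConsumedMessage> {
    AsyncStream { continuation in
        for message in polled {
            continuation.yield(message)
        }
        // Signal the end of the sequence so `for await` loops terminate.
        continuation.finish()
    }
}

// Adopters consume the sequence with a plain `for await` loop.
func collectValues(from stream: AsyncStream<ConsumedMessage>) async -> [String] {
    var values: [String] = []
    for await message in stream {
        values.append(message.value)
    }
    return values
}
```

AsyncStream is only one way to get an AsyncSequence here; a custom type conforming to AsyncSequence would give more control over back-pressure and cancellation.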

Sure, feel free to send it to me via private message on this forum or via email at f.busch at apple.com.

On a general note, one thing that we should achieve with the package is exposing as little as possible of librdkafka in the public interface. Librdkafka is a great starting point to bootstrap a Kafka library but it uses its own threading model. At a future point, it could make sense to migrate the internals of the new package to a native implementation and it would be great if this could be achieved without an API breaking change.
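One possible way to keep librdkafka out of the public interface is to put it behind an internal protocol. The sketch below uses hypothetical names throughout (KafkaBackend, InMemoryBackend, KafkaProducer) and a synchronous send for brevity; it is an assumption about how this could be structured, not a settled design.

```swift
// Internal seam: only this protocol knows which implementation is in use.
// `KafkaBackend` and its method are hypothetical names for this sketch.
protocol KafkaBackend: AnyObject {
    func send(_ value: String, toTopic topic: String) throws
}

// Stand-in backend for the sketch. A librdkafka-backed type, and later a
// native Swift one, would conform to the same protocol.
final class InMemoryBackend: KafkaBackend {
    private(set) var sent: [(topic: String, value: String)] = []

    func send(_ value: String, toTopic topic: String) throws {
        sent.append((topic: topic, value: value))
    }
}

// Public surface: no librdkafka types appear in any signature.
public struct KafkaProducer {
    private let backend: any KafkaBackend

    // Internal initializer, so callers outside the module never pick a backend.
    init(backend: any KafkaBackend) {
        self.backend = backend
    }

    public func send(_ value: String, toTopic topic: String) throws {
        try backend.send(value, toTopic: topic)
    }
}
```

With this shape, swapping the backend later would not change any public signature, which is the property described above.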

Franz

lyiyu · March 31, 2022, 11:07am

Thank you most sincerely. I benefited a lot from your answer.

Shilpee · April 1, 2022, 1:30pm

Sorry for the delay in joining the discussion.

Some pointers to discuss:

Franz has its own implementation for sending messages and everything else. It doesn't use interop with any library, as far as I understood.

PerfectKafka and SwiftKafka use the header file from librdkafka for interoperability, which is also the approach recommended in the introduction.md of librdkafka. They both require installing librdkafka from Homebrew and writing an export statement in the terminal.
I think installing it from Homebrew will be smooth, but writing an export statement might lead to errors when setting up the environment, or some other random error might pop up. This can lead to a bad start for anyone who is trying it out for the first time.

So could we write a script to install librdkafka? I have seen such an implementation in swift-distributed-actors.
An implementation like this clones the library; I don't know whether it is applicable here (just asking!).
Or is there anything else we can do to avoid this export statement?

On Debian, we will install librdkafka-dev, and on RPM-based systems, we will install librdkafka-devel (assuming we are expanding to Linux as well).

Do we need to provide SSL and SASL configuration support in our client package? librdkafka also provides some configuration properties for them.

So if at some point an API break occurs, do we need to write the underlying implementation in Swift, like Franz does?

I don't remember exactly which GSoC post it was or where I read this on the Swift Forums, but it said to follow the swift-evolution template for writing the proposal. So I wanted to confirm the final template, in case we need to think about ABI stability in our project.
The swift-evolution template has a section on ABI stability, and the header file of librdkafka provides some types for ABI compatibility. So do we need to write the proposal according to the swift-evolution template, or would following this template (Writing a proposal | Google Summer of Code Guides) suffice?

tbartelmess · April 2, 2022, 12:32am

Hi @Shilpee, a while ago I started a Kafka client on top of SwiftNIO: GitHub - tbartelmess/kafka-nio: Non-blocking, event-driven Swift client for Apache Kafka., but I had to stop working on it because of a job change.

The project is in a very early state, but it can communicate with Kafka brokers and produce/consume messages.

It doesn't use librdkafka; it has a script to generate Swift definitions for the various Kafka messages from the protocol definitions.

I am not planning to work on it in the near to medium future. If you are interested in using it as a starting point, would like to take over the project, or are interested in some of the implementation details and motivations, let me know.

Shilpee · April 2, 2022, 3:47am

Hi @tbartelmess,
Thanks for sharing the repo!
I will look into the implementation details for now.

FranzBusch · April 11, 2022, 8:52am

Thanks for asking all these questions @Shilpee and sorry for the delayed response but I was out last week.

I think within the scope of GSoC we could go with the easiest route and just let the users know how to install librdkafka on the system, similar to what PerfectKafka is doing. Later on, we might want to make this better and maybe we could build librdkafka through SPM, but this is out of scope for this project IMO.
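For reference, the system-install route could look roughly like this in a package manifest. The package and module names are hypothetical, the pkg-config name is my assumption about what librdkafka installs, and the module map that the system library target needs is not shown.

```swift
// swift-tools-version:5.5
import PackageDescription

let package = Package(
    name: "KafkaClientSketch",            // hypothetical package name
    targets: [
        // System library target. It needs a module.modulemap in its
        // directory that points at the librdkafka header (not shown).
        .systemLibrary(
            name: "Crdkafka",             // hypothetical module name
            pkgConfig: "rdkafka",         // assumed pkg-config name
            providers: [
                // Install hints that SwiftPM can show when the library is missing.
                .brew(["librdkafka"]),
                .apt(["librdkafka-dev"]),
            ]
        ),
        .target(
            name: "KafkaClientSketch",
            dependencies: ["Crdkafka"]
        ),
    ]
)
```

When pkg-config can locate the library, SwiftPM picks up the compiler and linker flags from it, which may reduce the need for manual environment setup; whether it removes the export step entirely depends on the installation.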

Good question, I think this is not the primary goal of this project but something that might be needed. It would be good to have a small outline of this in the proposal: which use cases need it and how it might be achieved.

Not sure if I get this question right. What I meant with my initial statement was that, to begin with, using librdkafka to bootstrap a new Kafka library is great, but for maximum performance and portability, a native Swift implementation would be desired. When doing the native implementation, it would be great to just swap out the internals of the library without a breaking API change. This is something that one can actively influence by not exposing details of librdkafka in the public interface; however, it might not be achievable and that is also okay :)
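As a starting point for such an outline, security settings could be modelled as typed configuration that is flattened into librdkafka's string properties. The struct and its members below are hypothetical; only the property names and protocol values are taken from librdkafka's configuration, and only a small subset is shown.

```swift
// Hypothetical typed configuration; only a few well-known librdkafka
// property names are mapped here.
struct KafkaClientConfiguration {
    enum SecurityProtocol: String {
        case plaintext = "plaintext"
        case ssl = "ssl"
        case saslPlaintext = "sasl_plaintext"
        case saslSSL = "sasl_ssl"
    }

    var bootstrapServers: [String]
    var groupID: String?
    var securityProtocol: SecurityProtocol = .plaintext

    // Flattens the typed values into the string key/value pairs
    // that librdkafka's configuration interface works with.
    var properties: [String: String] {
        var result = [
            "bootstrap.servers": bootstrapServers.joined(separator: ","),
            "security.protocol": securityProtocol.rawValue,
        ]
        if let groupID = groupID {
            result["group.id"] = groupID
        }
        return result
    }
}
```

A caller would then write something like `KafkaClientConfiguration(bootstrapServers: ["localhost:9092"], groupID: "demo", securityProtocol: .saslSSL)` and never touch the raw property strings.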

Good question, I am not 100% sure what template we recommend. Maybe @ktoso can chime in here quickly. Regardless of what template we use in the end, ABI stability is not something that concerns us here since we are going to ship it as an SPM package.

FranzBusch · April 11, 2022, 8:53am

Thanks for sharing this! I think this is a great project but sadly a native Kafka implementation in Swift is out of scope for GSoC. It would just take too long. What we want to achieve is having a library that allows us to use Kafka and vend nice Concurrency APIs. In the future, we can then work on actually backing these APIs with a nice native Swift implementation.

ktoso · April 11, 2022, 9:46am

FranzBusch:

Shilpee:

I don't remember exactly which GSoC post it was or where I read this on the Swift Forums, but it said to follow the swift-evolution template for writing the proposal. So I wanted to confirm the final template, in case we need to think about ABI stability in our project.

The swift-evolution template has a section on ABI stability, and the header file of librdkafka provides some types for ABI compatibility. So do we need to write the proposal according to the swift-evolution template, or would following this template (Writing a proposal | Google Summer of Code Guides) suffice?

Good question, I am not 100% sure what template we recommend. Maybe @ktoso can chime in here quickly. Regardless of what template we use in the end, ABI stability is not something that concerns us here since we are going to ship it as an SPM package.

The idea to follow the swift-evolution style is more for projects which work on "Swift" and "Swift project" projects. Either way, you don't have to follow that pattern, and with regard to things like ABI, you don't need to write that up for this proposal either, because it is new work and does not have to worry about breaking any existing users.

Feel free to use any template you like, no problem if you make your own etc.

Hope this clarifies!

AdieOlami · May 16, 2023, 6:18pm

Hi, do we have any update on this? I would like to contribute as well.

felixschlegel · May 17, 2023, 4:10pm

Hey @AdieOlami, it's great to hear you want to contribute to our project! We are actively working on getting a v1.0 release ready. Once that's finished, there will be tasks other contributors can get involved in!

Best regards,
Felix