C++/Swift interop

Hi,

I'm experimenting with C++/Swift interop and stumbled upon a problem. But first of all some code.

The Swift side:

import Foundation
import OSLog

// Struct to hold log entry data
public struct LogEntry {
   public let timestamp: TimeInterval
   public let message: String
   public let process: String
   public let pid: Int
   public let level: String
}

// Function to fetch logs
public func fetchLogs() -> [LogEntry] {
   var logEntriesList: [LogEntry] = []
   
   guard let logStore = try? OSLogStore(scope: .system) else {
      return logEntriesList
   }
   
   do {
      let lastThirtySeconds = logStore.position(date: Date().addingTimeInterval(-30))
      let logEntries = try logStore.getEntries(at: lastThirtySeconds)
      
      for entry in logEntries {
         if let logEntry = entry as? OSLogEntryLog {
            let entryData = LogEntry(
               timestamp: logEntry.date.timeIntervalSince1970,
               message: logEntry.composedMessage,
               process: logEntry.process,
               pid: Int(logEntry.processIdentifier),
               level: "\(logEntry.level)"
            )
            logEntriesList.append(entryData)
         }
      }
   } catch {
      return []
   }
   
   return logEntriesList
}

And the C++ side:

#include <iostream>
#include <test_log_retrieval_interop-Swift.h>
#include <chrono>

int main(int argc, const char * argv[]) {
   
   const auto logEntries = test_log_retrieval_interop::fetchLogs();
   for (const auto& logEntry : logEntries) {
      const auto duration_since_epoch = std::chrono::duration<double>(logEntry.getTimestamp());
      const auto duration_since_epoch_casted = std::chrono::duration_cast<std::chrono::system_clock::duration>(duration_since_epoch);
      
      std::chrono::time_point<std::chrono::system_clock> tp(duration_since_epoch_casted);

//      std::string msg;
      const std::string msg = (logEntry.getMessage());
      const std::string level = logEntry.getLevel();
      const std::string process = logEntry.getProcess();
      const pid_t pid = static_cast<pid_t>(logEntry.getPid());
      std::cout << "Timestamp: " << tp << std::endl
                 << "Message: " << msg << std::endl
                 << "Process: " << process << std::endl
                 << "PID: " << std::to_string(pid) << std::endl
                 << "Level: " << level << std::endl<< std::endl;
   }
   return 0;
}

My Swift code retrieves log entries using the OSLog framework under macOS. I wrote a function in Swift returning an array of entries consumed on the C++ side. So far, so good. Everything works as expected except one thing.

The line

const std::string msg = log.getMessage();

causes the whole process to max out at about 100% CPU usage after successfully iterating over a few dozen entries (see the attached screenshot). Moreover, the process consumes more and more RAM and won't exit until manually forced to.

Commenting the line out yields the expected result, i.e., each entry's value (except message) will be printed.

Explicit casting like

const std::string msg = static_cast<std::string>(log.getMessage());

does not help either.

As I'm new to Swift, I might be making a fundamental mistake I don't see. Can somebody give me a hint as to what the problem could be?

The program is run under macOS 14.5 with Xcode 15.4 and Swift 5.

Thanks in advance!

Update:
This was a debug build.

Does this happen if you rewrite the same loop in Swift? It might be a bug in Swift’s String implementation.

My C++ a bit rusty, I don't remember how static_cast differs from old-C-style cast (if differs at all, and that’s why I can’t remember), but Swift docs on the matter suggest conversion in that way to std::string:

const std::string msg = (std::string)log.getMessage();

Hi,

Thanks for the replies and suggestions!

@ksluder I tried a Swift loop, which works fine.
@vns I tried this, too, but it did not make a difference.

Update:
I tried a release build, too, which shows the same problem.

I would say this indicates there’s a bug in the C++ implementation of swift::String or swift::UTF8View.

1 Like

So this happens not for all messages? Maybe that's something inside the string that causes this behaviour? Seems like it stuck in iteration loop. Have you tried to look at the string that causes this?

After pausing the process, the offending string seems to be:

"Device found changed: CBDevice DD28E924-670E-893B-1E47-CB86AE3B6DDD, BDA 6E:2A:A4:6F:48:35, Nm 'Bluetooth-Gerät', DsFl 0x40 < NearbyInfo >, DvF 0x40000000000 < Connectable >, RSSI -59, AMfD <4c 00 10 06 29 1e 45 5c 7b 38>, nbIAT <45 5c 7b>, nbIF 0x3C2 < Ranging AT Duet WiFiP2P HtSp >, CF 0x80200000000 < RSSI Attributes >"

It does not look to be especially long or malformed. The only exception could be the German "ä," although setting the variable "msg" to "äää" or "ä" did not trigger the above-mentioned behavior.

What happens if you loop over the string’s UTF8View on the Swift side? Something like:

for x in logEntry.message.utf8 {
  print("\(x)")
}

Because that’s what the Swift implementation of operator std::string is doing:

extension std.string {
  /// Creates a C++ string having the same content as the given Swift string.
  ///
  /// - Complexity: O(*n*), where *n* is the number of UTF-8 code units in the
  ///   Swift string.
  public init(_ string: String) {
    self.init()
    let utf8 = string.utf8
    self.reserve(utf8.count)
    for char in utf8 {
      self.push_back(value_type(bitPattern: char))
    }
  }

(Also, it seems like this could be made a lot faster for contiguous strings by using the std::basic_string constructor that takes a count and a pointer to CharT. Filed as issue #75110.)

Of course, running this takes a little longer, but it only consumes about 6% of the CPU and runs fine, as well as exits normally.

Hm, I’m starting to wonder if something is throwing an exception and corrupting the Swift iterator’s state. Can you try adding a breakpoint that triggers on C++ exception throw?

I added a breakpoint that triggers on C++ exceptions, but none was thrown.
As a side note, I let the process run for two minutes to show the memory usage in more detail as (maybe) another hint. See this screenshot:

As it is getting late here in Germany, I have to apologize for leaving for the moment. I will definitely be back tomorrow, maybe (hopefully) with an explanation.

In the meantime, I would like to really thank everybody here for your help. The quick responses I got, as well as the thorough guidance, are outstanding!

Have a good and relaxing evening everyone.

2 Likes

Hello,

I investigated the problem further today but could not find the exact issue. As it seems not to be too trivial to find, I opened up a ticket at Apple. Maybe they can shed some light on it.

I profiled the process with instruments, and as @ksluder suggested, there seems to be a problem with UTF8View:

What you see above the marked line in the allocation profile, is repeated infinitely if the process is not stopped.

Can you please share the FB number?

String you are receiving seems to be not exactly "native" if I understood correctly call tree, but bridged from CoreFoundation... Maybe you can try to convert it to pure Swift string before passing to LogEntry (maybe constructing it char by char, as assumption)? But if any, that seems like a bug after all.

1 Like

@ksluder Of course. The number is FB14264301.
@vns I will try that. Another data point is never wrong.

1 Like