Performance issues with iOS Swift function to word-wrap streaming text received in real-time

I am facing performance issues with an iOS Swift function that word-wraps streaming text received in real time.

We have written an iOS function that performs word wrapping, based on character count, on streaming text received in real time, for display on an OLED. We are trying to replicate what the Android StringTokenizer class and its methods do, since iOS Swift does not have an equivalent class/method. However, the function has performance issues: the wrapped output lags behind the streaming text being received. We are looking for the best possible method/solution/class in Swift that can handle the entire process of word-wrapping streaming text with minimal delay.

Background:

We are developing an iOS mobile app to convert speech to text in real-time using cloud-based automatic speech-to-text (STT) model APIs including STT APIs from Google, Azure & IBM, and display the transcribed text on an OLED display via a BLE device.

The app uses the mobile device’s microphone to capture “endless” or “infinite” streaming audio feeds, streams the audio to the cloud STT API, and receives streaming speech recognition (transcription) results in real time as the audio is processed.

Unlike standard speech-to-text implementations in voice assistants which listen to voice commands, wait for silence and then give the best output, in infinite streaming, there is a continuous stream of input audio being sent to the STT API which returns interim and final transcription results continuously.

As the transcription data is received from the API (interim/final), the app performs word-wrapping logic to prepare the text for display on the OLED. This logic depends on the number of characters that can be displayed on one line of the OLED and the number of lines in the display.

For example, on our current OLED, we can display 18 characters on one line and a maximum of 5 lines at a time. So if a string of more than 18 characters is received from the STT API, the app logic has to find the appropriate space character in the string after which to break it, and display the remaining string on the next line(s). Here’s an example of a string received from the STT API:

TranscribeGlass shows closed captions from any source in your field of view

The expected result on the 5 lines of the OLED after performing word wrapping:

TranscribeGlass
shows closed
captions from any
source in your
field of view

Explanation:
Line 1 - “TranscribeGlass” - because “TranscribeGlass shows” exceeds 18 characters, break the string after the space before “shows” and wrap it to the next line
Line 2 - “shows closed” - because “shows closed captions” exceeds 18 characters, break the string after the space before “captions”
Line 3 - “captions from any” - because “captions from any source” exceeds 18 characters, break the string after the space before “source”
Line 4 - “source in your” - because “source in your field” exceeds 18 characters, break the string after the space before “field”
Line 5 - “field of view” - no wrapping needed since the string is <= 18 characters
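The wrapping rule described above can be sketched as a greedy word wrap: keep appending whole words to the current line until the next word would push it past the limit. This is only a minimal sketch, not our production code; the function name `wrapWords` and its `width` parameter are hypothetical, and a single word longer than `width` is left unbroken.

```swift
/// Greedy word wrap (hypothetical helper, not part of the app code).
/// `width` is the maximum number of characters per line, e.g. 18 for the OLED.
func wrapWords(_ text: String, width: Int) -> [String] {
    var lines: [String] = []
    var current = ""
    // split(separator:) walks the string once and drops empty pieces,
    // so repeated spaces do not produce empty words.
    for word in text.split(separator: " ") {
        if current.isEmpty {
            // First word on a line; a word longer than `width` stays unbroken.
            current = String(word)
        } else if current.count + 1 + word.count <= width {
            // The word plus one separating space still fits on this line.
            current += " \(word)"
        } else {
            // The word does not fit: close the line and start a new one.
            lines.append(current)
            current = String(word)
        }
    }
    if !current.isEmpty {
        lines.append(current)
    }
    return lines
}
```

Called with the example string and a width of 18, this returns the five lines listed above.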

On Android, the ready-made StringTokenizer class and its methods allow an application to break a string into tokens.
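Swift has no class named StringTokenizer, but the basic tokenizing step can be approximated with the standard library's `split(whereSeparator:)`. The sketch below is illustrative only; the function name `tokenize` is hypothetical, and it splits on any whitespace character, which is roughly what StringTokenizer does with its default delimiters.

```swift
/// Splits `text` into whitespace-separated tokens (hypothetical helper).
/// Empty pieces are omitted by default, so runs of spaces yield no empty tokens.
func tokenize(_ text: String) -> [String] {
    return text
        .split(whereSeparator: { $0.isWhitespace })
        .map { String($0) }   // convert each Substring slice to an owned String
}
```

`split` returns `Substring` slices that share storage with the original string; the `map` step is only needed if independent `String` values are required.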

Our iOS Swift function:

We have written a function writeOnlyCaption(), which takes a string as input and performs the word wrapping. See the code below:

For strings that have fewer than 18 characters, we use the function below:

public func writeOnlyCaption(text:String, isFinalBLE: Bool) {

    print("TEXT IS NOW:\(text)")
  
    var sendString:String = " "
    var previousWrapLocation = 0
    var currentWrapLocationIndex = characterPerLine

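    // Repeat while the remaining text is longer than one display line
    while(text.count - previousWrapLocation > characterPerLine) {

        // Walk backwards from the line limit to the nearest space.
        // Note: Swift's String has no built-in Int subscript, so text[Int]
        // relies on a custom extension that is not shown here.
        while(text[currentWrapLocationIndex] != " ") {
            currentWrapLocationIndex -= 1
        }
        // Append the line (including the space it was broken on) and a newline
        sendString = sendString + String(text[previousWrapLocation...currentWrapLocationIndex]) + "\n"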

        line.addLineIndex(textIndex: currentWrapLocationIndex)

        currentWrapLocationIndex += 1

        previousWrapLocation = currentWrapLocationIndex
        currentWrapLocationIndex = currentWrapLocationIndex  + characterPerLine
    }
    sendString = sendString + String(text[previousWrapLocation..<text.count]) + "\n"
    print("sendSting:->\(sendString)")
    line.addLineIndex(textIndex: sendString.count)
    line.setSendString(sendText: sendString)

    line.tooString()
    if previousLine == 0 {
        
        previousLine = line.getLines().count
    }
    print("previousLine:->\(previousLine)")
    var seed     = ""
    var counter  = 0

    var stringData = line.getSendString().components(separatedBy: "\n")
    stringData.removeLast()
    
    if stringData.count < 5 {
        repeat {
            if counter >= 0 {
                seed = seed + stringData[counter] + "\n"
            }
            counter += 1
        } while(stringData.count != counter)
    } else {

        counter = stringData.count - 5
        repeat {
            if counter > 0 {
                seed = seed + stringData[counter] + "\n"
            }
            counter += 1
        } while(stringData.count != counter)
    }
   print("Seeeddddddd:->\(seed)")
   print("counter: \(counter)")
 
    if seed != "" {
        if isFinalBLE {

            var seedBytes = seed.bytes
            
            if previousLine == line.getLines().count {
                seedBytes.append(0x03)
            } else if line.getLines().count > previousLine {

                seedBytes.append(0x02)
                seedBytes.append(0x03)
            }
            seedBytes.append(0x00)
            print("-- seedBytes: \(seedBytes) --")
            previousLine = line.getLines().count
            self.writeDataWithResponse(writeData: Data(seedBytes), characteristic: self.getCharsticFrom(UUID: CHAR.GLASS_WRITE)!)
        } else {
            var seedBytes = seed.bytes
            if previousLine == line.getLines().count {
                //seedBytes.append(0x00)
            } else if line.getLines().count > previousLine {

                if line.getLines().count - previousLine == 1 {
                    seedBytes.append(0x02)
                } else if line.getLines().count - previousLine == 2{
                    seedBytes.append(0x02)
                    seedBytes.append(0x02)
                } else {
                    seedBytes.append(0x02)
                }
            }
            seedBytes.append(0x00)
            print("-- seedBytes: \(seedBytes) --")
            previousLine = line.getLines().count
            self.writeDataWithResponse(writeData: Data(seedBytes), characteristic: self.getCharsticFrom(UUID: CHAR.GLASS_WRITE)!)
        }
        
    }
}

As you can see, we are using four while loops to perform the word wrapping correctly and send it to the OLED.
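For comparison, the two `repeat ... while` loops that build `seed` could be expressed with `suffix(_:)`, assuming the intent is simply to keep the most recent lines that fit on the display. This is a sketch under that assumption, with a hypothetical function name `visibleText`; note that it always keeps up to `maxLines` lines, which differs from the code above when there are exactly 5 lines (there the `counter > 0` check skips the first one).

```swift
/// Returns at most `maxLines` of the most recent lines, each terminated
/// by "\n" like the `seed` string in the function above (hypothetical helper).
func visibleText(from lines: [String], maxLines: Int) -> String {
    return lines
        .suffix(maxLines)        // last `maxLines` elements, or all if fewer
        .map { $0 + "\n" }
        .joined()
}
```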

The main issue with this function is that the four while loops and the various if-else conditions introduce a lot of delay (varying from 450 ms to 900 ms) while processing the text string. This results in poor performance: either the text on the OLED lags behind the streaming audio, or entire chunks of transcribed text get skipped.
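One possible contributor, stated here only as an assumption: the function indexes `text` with an `Int`, which the standard `String` type does not support, so an extension not shown here must provide it. If that extension walks the string with `index(_:offsetBy:)`, every `text[i]` costs time proportional to its offset, and `text.count` is also linear on a `String`, so the nested loops repeat that work many times. The sketch below keeps the same backward scan for a space but copies the characters into an array once, so each access is constant time. The function name `wrapByIndex` is hypothetical, and unlike the original it hard-breaks a word longer than `width` instead of scanning past the start of the line.

```swift
/// Index-based wrap on an Array<Character> (hypothetical helper).
/// Returns the wrapped text with "\n" after every line.
func wrapByIndex(_ text: String, width: Int) -> String {
    let chars = Array(text)      // one O(n) copy; chars[i] is then O(1)
    var out = ""
    var start = 0
    while chars.count - start > width {
        // Scan back from the furthest allowed position to the nearest space.
        var cut = start + width
        while cut > start && chars[cut] != " " {
            cut -= 1
        }
        if cut == start {
            // No usable space on this line: hard-break after `width` characters.
            cut = start + width
            out += String(chars[start..<cut]) + "\n"
            start = cut
        } else {
            out += String(chars[start..<cut]) + "\n"
            start = cut + 1      // skip the space the line was broken on
        }
    }
    // Whatever remains fits on one line.
    out += String(chars[start...]) + "\n"
    return out
}
```

Whether this removes the delay we observe would need to be confirmed by measuring, since the BLE write and the logging calls may also contribute.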
