Does speech synthesis via Swift work on Mac or is it restricted to iOS?

Hi, I tried:

import AVFoundation

print("Starting")
let utterance = AVSpeechUtterance(string: "Hello world")
utterance.voice = AVSpeechSynthesisVoice(language: "en-GB")
utterance.rate = 1.0

let synthesizer = AVSpeechSynthesizer()
synthesizer.speak(utterance)
print("Done")

in an iOS app and in a command-line app on macOS.

The code compiles without problems as a command-line app on macOS, but it does not seem to produce any sound (speech).

Am I missing some ancillary parameters (authorisations, etc)?

Try it in a Mac GUI app, it'll work.

What's probably happening in your case is that, when run from the command line, the program quits as soon as it reaches the last statement. speak() only queues the utterance and returns immediately, so there's never time for the speech to actually happen. You need to keep the command-line program alive long enough for the audio to play. Some basic strategies for that here.
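
As a rough illustration of the keep-alive idea, here is a minimal sketch that runs the main run loop for a fixed interval after calling speak(). The five-second duration is an assumption chosen for this short phrase, not something from the thread; a delegate-based approach that stops when speech actually ends is more robust.

```swift
import AVFoundation

let utterance = AVSpeechUtterance(string: "Hello world")
utterance.voice = AVSpeechSynthesisVoice(language: "en-GB")

let synthesizer = AVSpeechSynthesizer()
synthesizer.speak(utterance)   // queues the utterance and returns immediately

// Assumption: five seconds is long enough for this phrase.
// Running the main run loop keeps the process alive so the
// synthesizer has time to play the audio before the program exits.
RunLoop.main.run(until: Date(timeIntervalSinceNow: 5))
print("Done")
```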

Thank you.

That Stack Overflow thread is a little old; the most modern solution for this is probably to use async execution. However, AVSpeechSynthesizer doesn't expose an async API directly, so you'd also have to write a wrapper around it that bridges its delegate callbacks using unsafe continuations.
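
A rough sketch of what such a wrapper might look like follows. AsyncSpeaker is a hypothetical name, not part of AVFoundation, and the sketch uses withCheckedContinuation rather than the unsafe variant because it traps on misuse. It assumes a main.swift with top-level await (Swift 5.7 or later), one utterance in flight at a time, and Swift 5 language mode; strict concurrency checking may require extra annotations.

```swift
import AVFoundation

// Hypothetical wrapper: bridges the delegate callbacks to async/await.
final class AsyncSpeaker: NSObject, AVSpeechSynthesizerDelegate {
    private let synthesizer = AVSpeechSynthesizer()
    private var continuation: CheckedContinuation<Void, Never>?

    override init() {
        super.init()
        synthesizer.delegate = self
    }

    // Speaks `text` and suspends until the synthesizer reports that the
    // utterance finished or was cancelled.
    // Assumption: only one call is in flight at a time.
    @MainActor
    func speak(_ text: String, language: String = "en-GB") async {
        let utterance = AVSpeechUtterance(string: text)
        utterance.voice = AVSpeechSynthesisVoice(language: language)
        await withCheckedContinuation { (continuation: CheckedContinuation<Void, Never>) in
            self.continuation = continuation
            synthesizer.speak(utterance)
        }
    }

    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer,
                           didFinish utterance: AVSpeechUtterance) {
        resumeWaiter()
    }

    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer,
                           didCancel utterance: AVSpeechUtterance) {
        resumeWaiter()
    }

    // Resume at most once, then clear the stored continuation.
    private func resumeWaiter() {
        continuation?.resume()
        continuation = nil
    }
}

// main.swift usage
let speaker = AsyncSpeaker()
await speaker.speak("Hello world")
print("Done")
```

With this shape the program suspends at the await until the delegate fires, so no explicit run-loop call appears in the calling code.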

But if you just need something quick and dirty that works, don't worry about that and just use what's shown on the Stack Overflow page.

When using it on iOS, I noticed that the actual speak() operation is indeed carried out in a separate process.

Thank you again!

No need for a GUI; it only needs a run loop. The following works when run from the command line.

Main
//
//  main.swift
//  Speech
//
//  Created by ibex on 24/4/2022.
//

MySpeechEngine.render (text:"Twin primes are pairs of primes which differ by two.")
MySpeechEngine.render (text:"(This name was coined by Stäckel in 1916.)")
MySpeechEngine.render (text:"3 and 5, 5 and 7, 11 and 13, 17 and 19 are twin primes.")
MySpeechEngine.render (text:"[From the PrimePages @ primes.utm.edu.]")
SpeechEngine
//
//  SpeechEngine.swift
//
//  Created by ibex on 11/6/2022.
//

// Speech Synthesis
import AVFoundation

struct MySpeechEngine {
    static func render (text:String) {
        // Create an utterance.
        let utterance = AVSpeechUtterance (string: text)

        // Configure the utterance.
        utterance.rate = SpeechParameters.speechRate
        utterance.pitchMultiplier = 0.8
        utterance.postUtteranceDelay = 0.2
        utterance.volume = 0.8

        // Retrieve the Australian English voice.
        let voice = AVSpeechSynthesisVoice (language: "en-AU")

        // Assign the voice to the utterance.
        utterance.voice = voice

        // Create a speech synthesizer.
        let synthesizer = AVSpeechSynthesizer ()

        // Set up a delegate to detect the end of speech and to stop the run loop
        let delegate = SpeechSynthesizerDelegate ()
        synthesizer.delegate = delegate

        // Tell the synthesizer to speak the utterance.
        DispatchQueue.main.async {
            synthesizer.speak (utterance)
        }
        // Keep this context alive while the synthesizer is running...
        // Get stuck here until the delegate stops the run loop!
        CFRunLoopRun ()
    }

    //
    // ------------------------------------------------
    //
    private struct SpeechParameters {
        static var speechRate :  Float {
        #if false
            AVSpeechUtteranceMinimumSpeechRate
        #elseif false
            AVSpeechUtteranceMaximumSpeechRate
        #else
            0.6 * AVSpeechUtteranceDefaultSpeechRate
        #endif
        }
    }

    // ------------------------------------------------
    //
    private class SpeechSynthesizerDelegate : NSObject, AVSpeechSynthesizerDelegate {
        func speechSynthesizer( _ synthesizer: AVSpeechSynthesizer,
                               didFinish utterance: AVSpeechUtterance) {
            
            print ("\(type (of:self)): \(#function): stopping the run loop...")
            CFRunLoopStop (CFRunLoopGetCurrent ())
        }
    }
}

Thank you.