I’m trying to learn Swift. I haven’t done much programming in a very long time and could use some help.
I’m trying to load a local WebVTT file and then parse the timestamps and their associated text.
Ultimately I’m hoping to make an app for personal use on the Mac that loads a WebVTT transcript and syncs it to the associated audio file as an interactive transcript: the text follows the audio as it plays, and conversely, tapping on text starts the audio playing from that location.
Below is a sample WebVTT file:
let testString = """
WEBVTT
00:00:00.000 --> 00:00:04.560
Did you know that up to 80 to 90 percent

00:00:04.560 --> 00:00:09.080
of those making a decision to Code will hit a wall and give up?

00:00:09.080 --> 00:00:14.920
What? Yes, it's true. Did you know that most people give up within the first year?

00:00:14.920 --> 00:00:19.400
Some extra dialog would go here

00:00:19.400 --> 00:00:23.360
also additional dialog would also go here

"""
In case the formatting doesn’t show well: there’s a single return between WEBVTT and the first timestamp, and then a double return between all the remaining cues, including at the end of the file.
My Swift code:
let regexPattern = /(?m)^(\d{2}:\d{2}:\d{2}\.\d+) +--> +(\d{2}:\d{2}:\d{2}\.\d+).*[\r\n]+\s*(?s)((?:(?!\r?\n\r?\n).)*)/
if let match40 = testString.firstMatch(of: regexPattern) {
    let entireResult = match40.output.0   // the whole matched cue
    let firstResult = match40.output.1    // start timestamp
    let secondResult = match40.output.2   // end timestamp
    let thirdResult = match40.output.3    // cue text
}
This works for getting the first match and each of its groups, which was my proof of concept that I seem to be on the right track.
My issue, however, is that when I try switching out firstMatch for wholeMatch, nothing I’ve tried lets me go beyond the first match (or works at all).
I’ve had Xcode convert the regex into RegexBuilder, but then the code above doesn’t work, and nothing I try works with the auto-generated builder code.
Secondly, and the larger next issue: how do I iterate through the matches as needed? This will be for a file that contains approximately an hour of transcription.
I’m at a loss. Over the last month I have posted this in other places and I’m really kind of stuck.
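On the wholeMatch question: as far as I understand the Swift Regex API, wholeMatch(of:) only succeeds when the pattern covers the entire string from start to end, so it returns nil for a multi-cue file, while matches(of:) returns every non-overlapping match. Here is a minimal sketch of the difference. The names timings and timingPattern are made up for illustration, the pattern is a simplified timestamp-only version of the one above, and it assumes Swift 5.7 or later (macOS 13 or later for an app target).

```swift
// Two cue timing lines, standing in for a real file (hypothetical sample).
let timings = "00:00:00.000 --> 00:00:04.560\n00:00:04.560 --> 00:00:09.080"

// The #/.../# delimiters are used so the literal compiles even where
// bare-slash regex literals are not enabled.
let timingPattern = #/(\d{2}:\d{2}:\d{2}\.\d+)\s+-->\s+(\d{2}:\d{2}:\d{2}\.\d+)/#

// wholeMatch: the pattern must account for the entire string, so this is nil.
let whole = timings.wholeMatch(of: timingPattern)

// firstMatch: only the first occurrence.
let first = timings.firstMatch(of: timingPattern)

// matches(of:): every occurrence, in order.
let all = timings.matches(of: timingPattern)

print(whole == nil)
print(first?.output.1 ?? "none")
for match in all {
    print(match.output.1, "->", match.output.2)
}
```

So for stepping through a whole transcript, matches(of:) looks like the call to reach for rather than wholeMatch.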
Why don't you tackle the problem by breaking it into smaller pieces and working on one piece at a time? That is, break the whole text into lines, and parse each line individually and collate the results.
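A rough sketch of that line-by-line idea, with no regex at all. Cue and parseCues are hypothetical names invented for this example, and it assumes each cue is a timing line containing "-->" followed by one or more text lines, with cues separated by blank lines as in the sample above. It does not try to handle the rest of the WebVTT format (cue identifiers, NOTE blocks, and so on).

```swift
import Foundation

// Hypothetical container for one parsed cue.
struct Cue {
    var start: String
    var end: String
    var text: String
}

// Walk the file one line at a time and collate lines into cues.
func parseCues(from vtt: String) -> [Cue] {
    var cues: [Cue] = []
    var current: Cue? = nil

    // Keep empty lines, since a blank line marks the end of a cue.
    let lines = vtt.split(omittingEmptySubsequences: false, whereSeparator: \.isNewline)
    for rawLine in lines {
        let line = rawLine.trimmingCharacters(in: .whitespaces)
        if line.isEmpty {
            // Blank line: close the cue in progress, if any.
            if let cue = current { cues.append(cue) }
            current = nil
        } else if line.contains("-->") {
            // Timing line: close any open cue, then start a new one.
            if let cue = current { cues.append(cue) }
            let parts = line.components(separatedBy: "-->")
            let start = parts[0].trimmingCharacters(in: .whitespaces)
            // Take only the first token after the arrow, dropping any trailing settings.
            let end = parts[1].split(separator: " ").first.map { String($0) } ?? ""
            current = Cue(start: start, end: end, text: "")
        } else if var cue = current {
            // Text line belonging to the open cue.
            cue.text = cue.text.isEmpty ? line : cue.text + "\n" + line
            current = cue
        }
        // Anything before the first timing line (such as "WEBVTT") is skipped.
    }
    if let cue = current { cues.append(cue) }
    return cues
}

let sample = """
WEBVTT
00:00:00.000 --> 00:00:04.560
Did you know that up to 80 to 90 percent

00:00:04.560 --> 00:00:09.080
of those making a decision to Code will hit a wall and give up?
"""

let cues = parseCues(from: sample)
for cue in cues {
    print(cue.start, cue.end, cue.text)
}
```

Each piece (splitting, spotting a timing line, collecting text) can be tested on its own this way, which tends to be easier to debug than one large pattern.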
That’s initially what I was trying to do when I had Xcode try to convert the regex into the newer RegexBuilder code.
I figured that with the captures I could do named captures, but nothing worked for me, and I’m not at the point where I could figure out why it wasn’t working, as there were no explicit errors.
Maybe it was a mistake on my part, but many of the tutorials out there for learning Swift focus on games, and I figured I’d try to learn by building something I was interested in and would maintain myself.
Breaking it down is my next step, but I wanted to put out a call for help as well, as I realize I’m in over my head at the moment.
The above is my RegexBuilder code that I got working.
Below is my code to cycle through and show all the regex matches:
let matches = testString.matches(of: patternTest)
print("Array of Text elements are:")
for match14 in matches {
    let (notSureWhatGoesHere) = match14.output
    print(match14.output.3)
}
The code above prints only the text from the WebVTT file, without the timestamps. If you need all of the starting timestamps, change "print(match14.output.3)" to "print(match14.output.1)", and if you need all of the ending timestamps, use "print(match14.output.2)".
Embarrassingly enough, I wasn't sure what goes in the second let, hence the "notSureWhatGoesHere", but for my purposes of just accessing all the regex matches and groups in Swift, this is working for me in a Swift Playground file.
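For anyone wondering about the "notSureWhatGoesHere" line: with a regex literal that has three capture groups, match.output is a four-element tuple (the whole match plus the three groups), so it can be destructured into named constants instead of using .1, .2 and .3. A small sketch under that assumption follows. cuePattern is a simplified single-line-text stand-in for the pattern earlier in the thread, and seconds(from:) is a hypothetical helper added here because syncing to audio will probably need the timestamps as numbers.

```swift
// One cue, standing in for the full test string (hypothetical sample).
let cueSample = "00:00:04.560 --> 00:00:09.080\nof those making a decision"

// Simplified pattern: start time, end time, then one line of text.
let cuePattern = #/(\d{2}:\d{2}:\d{2}\.\d+)\s+-->\s+(\d{2}:\d{2}:\d{2}\.\d+)[^\n]*\n([^\n]*)/#

// Hypothetical helper: "HH:MM:SS.mmm" -> seconds as a Double.
func seconds(from timestamp: Substring) -> Double? {
    let parts = timestamp.split(separator: ":")
    guard parts.count == 3,
          let hours = Double(parts[0]),
          let minutes = Double(parts[1]),
          let secs = Double(parts[2]) else { return nil }
    return hours * 3600 + minutes * 60 + secs
}

var starts: [Double] = []
var texts: [String] = []

for match in cueSample.matches(of: cuePattern) {
    // Element 0 is the whole match; it is ignored here with "_".
    let (_, start, end, text) = match.output
    print(start, end, text)
    if let startSeconds = seconds(from: start) {
        starts.append(startSeconds)
    }
    texts.append(String(text))
}
```

The captures come back as Substring values, so wrapping them in String(...) is needed before storing them somewhere that expects a String.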
I hope this might help someone else later on as well.