Reading s16 PCM WAV file gets different values in Swift

I’m reading in an s16 PCM WAV file in both Python and Swift. In Python, I get a bunch of values like this:

 [ 0.00000000e+00]
 [-3.45625685e-09]
 [-3.51880986e-09]
 [-3.58192380e-09]
 [-3.64559867e-09]
 [-3.70983445e-09]
 [-3.77463116e-09]
 [-3.83998878e-09]
 [-3.90590732e-09]
 [-3.97238677e-09]
 [-4.03942713e-09]

Those same values in Swift come out like this:

0.0
-3.0517578e-05
-3.0517578e-05
-3.0517578e-05
-3.0517578e-05
-3.0517578e-05
-3.0517578e-05
-3.0517578e-05
-3.0517578e-05
-3.0517578e-05
-3.0517578e-05

ffprobe says the file is pcm_s16le, and I don’t do anything special with the Python for reading it in.

Swift is reading it in as a Float32, which for some reason works (Float64 fails because it’s a Double). Python appears to be reading it in as Float64 (the default for the soundfile library), which explains the multiple orders of magnitude of difference.

However, why would Swift be reading the same value over and over instead of different ones?

Might be worth sharing some code.

You’re right, and as with all things, producing a full MRE has solved all my problems.

Paste: hqgfgUJz is the Swift code I had, and in adding the Python code that produces the values that I thought were correct, I realized that it was the PYTHON code that was wrong.

tl;dr Python was overwriting the values with a Hann window, which is why they looked different by the time I inspected them.

I was using numpy and had code to the effect of the following two lines:

inp  = self.voltages[start:end]
inp *= self.window

This causes self.voltages to be overwritten, which I was NOT expecting.
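
For anyone who hits the same thing, here is a minimal sketch of the behaviour, assuming self.voltages is a NumPy array. Basic slicing returns a view rather than a copy, so an in-place `*=` on the slice writes through to the original. The array names and sizes below are made up for illustration and are not from my actual code.

```python
import numpy as np

# Hypothetical stand-ins for self.voltages and self.window.
voltages = np.ones(8)
window = np.hanning(4)  # 4-point Hann window, approximately [0, 0.75, 0.75, 0]

# Basic slicing returns a view, so the in-place multiply
# also modifies the matching region of `voltages`.
inp = voltages[2:6]
inp *= window
aliased = np.shares_memory(inp, voltages)

# An out-of-place multiply allocates a new array and leaves the source alone.
fresh = np.ones(8)
out = fresh[2:6] * window
untouched = bool(np.all(fresh == 1.0))
```

Writing `inp = self.voltages[start:end] * self.window` (or taking an explicit `.copy()` of the slice first) would avoid the overwrite.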

Anyways, the Swift code was totally fine, but I am absolutely open to suggestions on improving it!

pcm_s16le is signed Int16 little-endian samples; reading them as 32-bit floats cannot possibly be right.
I don't have the PCM file, so I can't see what the first few sample values should be.

Here’s the audio file: https://bellwether.llc/transformer.wav

My understanding is that the difference between reading them as Int16 and Float32 is just a scale factor: Float32 value = Int16 value / 32768 (the Int16 full-scale magnitude), so I wasn’t worried about it. And since I feed the samples into a Fourier transform, I figure it’s their relative values that matter.
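
A quick sketch of that scaling, in Python for easy checking. Dividing by 32768 (2**15) is a common convention for s16 PCM and is consistent with the Swift output above, where -3.0517578e-05 is -1/32768 at Float32 print precision. Whether the Swift reader uses exactly this factor is an assumption, and the function name is made up for illustration.

```python
# Assumed convention: divide each signed 16-bit sample by 32768 (2**15).
INT16_SCALE = 32768.0


def s16_to_float(sample: int) -> float:
    """Map a signed 16-bit PCM sample into the range [-1.0, 1.0)."""
    return sample / INT16_SCALE


# A sample of -1 becomes -3.0517578125e-05, the value repeated in the Swift output.
scaled = [s16_to_float(s) for s in (0, -1, 1, -32768, 32767)]
```

Because every sample is divided by the same constant, the relative values are unchanged.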

The first samples of that file are:

0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 FFFF
FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 0000 0000 FFFF 
FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 0000 0000 0000 0100 0100 0200 0200 
0200 0200 0200 0200 0200 0200 0200 0300 0300 0300 0200 0200 0200 0200 0200 0200 
0200 0200 0200 0100 0100 0100 0100 0000 0000 0000 0000 FFFF FFFF FEFF FEFF FEFF 
FEFF FEFF FEFF FEFF FEFF FEFF FEFF FEFF FEFF FEFF FEFF FEFF FEFF FEFF FEFF FEFF 
FFFF FFFF FFFF 0000 0000 0000 0000 0000 0000 0000 0000 0000 FFFF FFFF FFFF FFFF 
FFFF FFFF FFFF FFFF 0000 0000 0000 0000 0000 0100 0100 0100 0100 0100 0100 0100 
0100 0100 0100 0100 0100 0100 0000 0000 0000 0000 0000 FFFF FFFF FFFF FFFF FFFF 
FEFF FEFF FEFF FEFF FEFF FEFF FEFF FEFF FEFF FEFF FFFF FFFF FFFF 0000 0100 0100

which is this if read as little endian Int16 samples:

0, ..., -1, ..., 0, ..., -1, ..., 0, ..., 1, ..., 2, ..., 3, ..., 2, ..., 1, ..., 0, ..., -1, ..., -2, ...
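
The decoding can be checked with a short sketch using Python's standard `struct` module. It assumes each four-digit group in the dump is two bytes in file order, and the helper name is made up for illustration.

```python
import struct


def decode_s16le(hex_dump: str) -> list:
    """Decode a whitespace-separated hex dump as little-endian Int16 samples."""
    raw = bytes.fromhex(hex_dump)  # fromhex ignores the spaces between groups
    count = len(raw) // 2          # two bytes per sample
    return list(struct.unpack(f"<{count}h", raw))


# Last eight groups of the ninth row of the dump above.
samples = decode_s16le("FFFF 0000 0000 0000 0100 0100 0200 0200")
# samples == [-1, 0, 0, 0, 1, 1, 2, 2]
```

The same helper gives -2 for `FEFF`, since the bytes FE FF read little-endian are 0xFFFE.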