Sample value distribution statistics

Release: 0.2


Overview

sampledist is intended to analyze audio data statistically.

Use cases

Check whether your sound device works well.
Analyze audio data for artifacts of sound compression.

Program sequence

The program reads 16-bit PCM encoded audio data with two channels from stdin or from a file in RIFF wave format.
The data is cut into blocks of N samples.
Each block is analyzed.
The results are written to stderr.
Optionally the value distribution is written to a file.
Optionally a command is executed or a constant is written to stdout.
The program continues with the next block.
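
The block-wise reading described above can be sketched as follows. This is only an illustration, not the program's actual source: the function name `read_blocks` is hypothetical, and the byte layout (little-endian, interleaved left/right samples) is an assumption that this page does not state.

```python
import io
import struct

BYTES_PER_FRAME = 4  # assumption: 2 channels x 16 bit, interleaved L/R


def read_blocks(stream, n_samples):
    """Yield (left, right) sample lists for each complete block."""
    block_bytes = n_samples * BYTES_PER_FRAME
    while True:
        data = stream.read(block_bytes)
        if len(data) < block_bytes:
            return  # incomplete block: nothing is produced for it
        # assumption: little-endian signed 16-bit samples
        values = struct.unpack("<%dh" % (2 * n_samples), data)
        yield list(values[0::2]), list(values[1::2])


# Three stereo frames, block length 2: one complete block, one leftover frame.
demo = struct.pack("<6h", 1, -1, 2, -2, 3, -3)
blocks = list(read_blocks(io.BytesIO(demo), 2))
```

With a buffered binary stream such as `sys.stdin.buffer`, `read()` only returns fewer bytes than requested at end of input, so a short read can be treated as the end of the data.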

Command line options

infilename - name of input file - default stdin: Read PCM data from filename (instead of stdin). The file name may also refer to a transient source such as a FIFO or a character device.
bnsamples - block length - 32768 by default: This is the number of samples read in one block. Each time this number of samples has been received, the result is analyzed and written to console or file. Fewer samples do not produce any output.
lncount - number of cycles - 1 by default: Number of blocks to process until the program completes.
loop - infinite input: Switches the program to continuous mode. It can only be terminated by sending an interrupt signal or by closing the input stream.
alcount - add blocks - infinite by default: Add the results of count blocks before the statistics are cleared. By default the results are cumulative, i.e. the statistics apply to all samples analyzed so far rather than to the last block only. Use al1 to provide only statistics for the last block read.
psanum - discard first num samples: Use this to discard spikes at the start or to reach a steady state. You may also use this option to discard headers from PCM files like RIFF wave format.
dffilename - write histogram data to file: Writes the sample value distribution to filename. See File formats.
wd - write histogram data to hist.dat: Writes the sample value distribution to hist.dat - shortcut for dfhist.dat.
rffilename - raw data file: Write raw data to filename. This is the raw input data without any processing so far, except for option psa. It is intended for diagnostics only. See File formats.
wr - write raw data: Write raw data to default file raw.dat - shortcut for rfraw.dat.
execcommand - execute shell command: Each time a block has completed and the data has been written, command is passed to system(). Note that sampledist waits for the command to complete. This gives you exclusive access to the data files, but it may also interfere with the real time processing of the input data. If you do not need this kind of synchronization, consider writing the command to stdout instead (option plot).
plotcommand - pipe command: Write command to stdout each time a block has completed. You can use this to synchronize plot programs when new data arrives. Note that sampledist will not wait for any command completion.
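
The difference between the two notification options can be illustrated with a small sketch. It is not the program's implementation: the function names are hypothetical, `subprocess.run` merely stands in for the `system()` call mentioned above, and the trailing newline written by `notify_plot` is an assumption.

```python
import io
import subprocess
import sys


def notify_exec(command):
    """Run command through the shell and wait until it has finished."""
    return subprocess.run(command, shell=True).returncode


def notify_plot(command, out=sys.stdout):
    """Write command to a stream and return immediately."""
    out.write(command + "\n")  # assumption: one command per line
    out.flush()                # make it visible to the reader right away


# "replot" is only an example string for a plot program reading the pipe.
buffer = io.StringIO()
notify_plot("replot", buffer)
```

The blocking variant guarantees that the data files are not rewritten while the command runs; the non-blocking variant never delays the input processing.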

Examples

@@TODO

Result

The program writes blocks like this to stderr.

	samples 	dB
min	-30581	-23544	-0.6	-2.9
max	30736	24574	-0.6	-2.5
mean	1.90	-16.49
stddev	5995.40	3912.31
skew	-0.0087	-0.0189
kurtos.	0.00820	0.00918
crest	0.19506	0.15920	-14.2	-16.0

min, max: Minimum and maximum sample value of left and right channel, as raw sample value and in dB relative to full scale (dBFSR).
mean: Average sample value of left and right channel. A larger non-zero value indicates a DC bias in the audio data.
stddev: Standard deviation of sample value of left and right channel. The unit is digits. This is related to the RMS power in the audio data, but it should not be confused with the psychoacoustic loudness.
skew: Bias-corrected skewness of the sample value distribution of left and right channel.
This value should be close to zero, indicating a symmetric distribution. Anything else likely indicates a significant non-linearity.
kurtos.: Bias-corrected excess kurtosis of the sample value distribution of left and right channel.
This value is close to zero for distributions close to the normal distribution, e.g. Gaussian white noise. It increases as the audio data contains higher dynamics. On the other hand, negative values indicate highly compressed audio data, e.g. loudness war.
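crest: Crest factor of the audio data, i.e. ratio of the peak value to the RMS value: √2 for a sinusoidal signal, about 4 for noise, even more for typical audio. Note that the values in the sample block above match the reciprocal, stddev divided by max (5995.40 / 30736 ≈ 0.19506, i.e. -14.2 dB).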
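
The quantities listed above can be computed per channel roughly as in the following sketch. It is not taken from the program's source: the names `channel_stats`, `to_db` and `FULL_SCALE` are hypothetical, the standard deviation uses the population form (division by n), the peak is taken as the larger of |min| and |max|, and the bias correction uses the common adjusted Fisher-Pearson formulas. All of these are assumptions.

```python
import math

FULL_SCALE = 32768  # assumption: dB values refer to 16-bit full scale


def to_db(ratio):
    """Convert an amplitude ratio to decibels."""
    return 20.0 * math.log10(ratio)


def channel_stats(x):
    """Statistics of one channel; needs at least 4 samples, not all equal."""
    n = len(x)
    mean = sum(x) / n
    # central moments, dividing by n (population form, an assumption)
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    stddev = math.sqrt(m2)
    g1 = m3 / m2 ** 1.5        # uncorrected skewness
    g2 = m4 / m2 ** 2 - 3.0    # uncorrected excess kurtosis
    # small-sample bias corrections
    skew = math.sqrt(n * (n - 1)) / (n - 2) * g1
    kurt = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6.0)
    peak = max(abs(min(x)), abs(max(x)))
    return {
        "min": min(x),
        "max": max(x),
        "mean": mean,
        "stddev": stddev,
        "skew": skew,
        "kurtosis": kurt,
        "crest": peak / stddev,  # peak-to-RMS ratio as defined in the text
    }


# One full period of a sine wave: the crest factor is close to sqrt(2).
sine = [10000.0 * math.sin(2.0 * math.pi * k / 1000) for k in range(1000)]
stats = channel_stats(sine)
```

As a cross-check against the sample block, `to_db(30736 / FULL_SCALE)` rounds to -0.6, the dB value printed next to the left channel maximum.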

File formats

`hist.dat` - value distribution

Column	Symbol	Description
[1]	`v`	sample value, [-32768, 32767]
[2]	`h_L`	relative frequency of sample value at the left channel
[3]	`h_R`	relative frequency of sample value at the right channel
[4]	`H_L`	absolute frequency of sample value at the left channel
[5]	`H_R`	absolute frequency of sample value at the right channel
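
How the five columns relate to each other can be shown with a short sketch. It is only an illustration under assumptions: the function name `histogram_rows` is hypothetical, and whether the real file lists every possible sample value or only those that occurred is not stated here (the sketch lists all of them).

```python
from collections import Counter


def histogram_rows(left, right):
    """Yield (v, h_L, h_R, H_L, H_R) for every 16-bit sample value v."""
    n_left, n_right = len(left), len(right)
    count_left, count_right = Counter(left), Counter(right)
    for v in range(-32768, 32768):
        abs_left, abs_right = count_left[v], count_right[v]
        # relative frequency = absolute frequency / number of samples
        yield v, abs_left / n_left, abs_right / n_right, abs_left, abs_right


rows = list(histogram_rows([0, 0, 1], [1, -1, -1]))
zero_row = rows[32768]  # the row for sample value 0
```

Each row could then be written as one line, for example with `"%d\t%g\t%g\t%d\t%d" % row`; the tab separator is likewise an assumption.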

`raw.dat` - raw data

Column	Symbol	Description
-	line number	sample index `n`
[1]	`L(n)`	Channel 1 sample value
[2]	`R(n)`	Channel 2 sample value

Change log

Version 0.2

Port to Linux
Support for skewness and kurtosis.

Version 0.1

Internal revision

TODOs, known issues

Input formats: Currently only 16-bit PCM data is supported. Analyzing the frequency of individual sample values makes little sense for 24- or 32-bit data, but the moment analysis of the value distribution would still be meaningful.
