SoundTouch library Copyright (c) Olli Parviainen 2002-2004
SoundTouch is an open-source audio processing library that allows changing the sound tempo, pitch and playback rate parameters independently of each other.
Author email: oparviai @ iki.fi
SoundTouch WWW page: http://www.iki.fi/oparviai/soundtouch
Note: The above URL is a relay address that forwards the browser to the actual server. If you create a link to the SoundTouch library page, please use the above URL instead of the actual address, so that the link stays valid even if the pages move to another server.
Before compiling, note that you can choose the sample data format if you prefer floating point sample data to 16-bit integers. See the section "sample data format" for more information.
Project files for Microsoft Visual C++ 6.0 are supplied with the source code package. Please note that the SoundTouch library uses processor-specific optimizations for Pentium III and AMD processors. To support these optimizations, Visual Studio 6.0 requires a “processor pack” upgrade to be installed. The processor pack upgrade can be downloaded from the Microsoft site at this URL:
http://msdn.microsoft.com/vstudio/downloads/tools/ppack/default.aspx
If the above URL is unavailable or removed, go to http://msdn.microsoft.com and perform a search with keywords “processor pack”.
Visual Studio .NET supports the required instructions by default and thus doesn't require installing the processor pack.
To build the binaries with the Visual C++ 6.0 compiler, either run the "make-win.bat" script or open the appropriate project files in the source code directories with Visual Studio. The final executable will appear under the "SoundTouch\bin" directory. If you use the Visual Studio IDE instead of the “make-win.bat” script, the directories “bin” and “lib” have to be created manually in the SoundTouch package root for the final executables. The “make-win.bat” script creates these directories automatically.
C++ compilers other than Visual C++ can also be used, but the project files or makefiles then have to be adapted accordingly. Performance optimizations are written in Visual C++ compatible syntax and may or may not be compatible with other compilers. If you use a GCC (GNU Compiler Collection) package such as DJGPP or Cygwin, please see the next chapter for instructions.
The SoundTouch library can be compiled on practically any platform that supports the GNU compiler (GCC) tools. SoundTouch has been tested with gcc version 3.2.3, but it shouldn't be very specific about the gcc version. Assembler-level performance optimizations for the GNU platform are currently available on x86 platforms only; on other processor platforms they are automatically disabled and replaced with standard C routines.
To build and install the binaries, run the following commands in SoundTouch/ directory:
./configure | Configures the SoundTouch package for the local environment. |
make | Builds the SoundTouch library and SoundStretch utility. |
make install | Installs the SoundTouch and BPM libraries to /usr/local/lib and the SoundStretch utility to /usr/local/bin. Please notice that 'root' privileges may be required to install the binaries to the destination locations. |
NOTE: At the time of release the SoundTouch package has been tested to compile on the GNU/Linux platform. However, in the past new gcc versions haven't always been compatible with the assembler settings used in the optimized routines. If you have problems getting the SoundTouch library to compile, as a first work-around try disabling the optimizations by editing the file "include/STTypes.h" and removing the following definition there:
#define ALLOW_OPTIMIZATIONS 1
The sample data format can be either 16-bit signed integer or 32-bit floating point values; the default is 16-bit signed integer. The sample data format is chosen in the file "STTypes.h" by selecting the corresponding define.
The sample data can have either a single (mono) or a double (stereo) audio channel. Stereo data is interleaved so that the sample values alternate between the left channel and the right channel. Notice that while it would be possible in theory to process stereo sound as two separate mono channels, this isn't recommended: processing the channels separately loses the phase coherency between them, which would ruin the stereo effect.
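To make the interleaved layout concrete, here is a minimal C++ sketch that splits an interleaved 16-bit stereo buffer into its channels and merges them back. The names `StereoChannels`, `deinterleave` and `interleave` are hypothetical helpers for illustration and aren't part of the SoundTouch API; the sketch also assumes that the left channel value comes first in each pair.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type: one stereo buffer split into its two channels.
struct StereoChannels {
    std::vector<short> left;
    std::vector<short> right;
};

// Splits interleaved stereo data (L0 R0 L1 R1 ...) into two channel buffers.
// Assumption: the left channel value comes first in each pair.
StereoChannels deinterleave(const std::vector<short>& interleaved) {
    assert(interleaved.size() % 2 == 0);  // one left and one right value per frame
    StereoChannels out;
    const std::size_t frames = interleaved.size() / 2;
    out.left.reserve(frames);
    out.right.reserve(frames);
    for (std::size_t i = 0; i < frames; ++i) {
        out.left.push_back(interleaved[2 * i]);
        out.right.push_back(interleaved[2 * i + 1]);
    }
    return out;
}

// Merges two equally long channel buffers back into interleaved order.
std::vector<short> interleave(const std::vector<short>& left,
                              const std::vector<short>& right) {
    assert(left.size() == right.size());
    std::vector<short> out;
    out.reserve(left.size() * 2);
    for (std::size_t i = 0; i < left.size(); ++i) {
        out.push_back(left[i]);
        out.push_back(right[i]);
    }
    return out;
}
```

Splitting the channels like this is useful for inspecting or converting data, but as noted above the stereo stream should be handed to the library in interleaved form rather than processed as two separate mono streams.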
Sample rates between 8000-48000Hz are supported.
The processing and latency constraints of the SoundTouch library are:
SoundTouch provides three seemingly independent effects: tempo, pitch and playback rate control. These three controls are implemented as a combination of two primary effects, sample rate transposing and time-stretching.
Sample rate transposing affects both the audio stream duration and pitch. It's implemented simply by converting the original audio sample stream to the desired duration by interpolating from the original audio samples. In SoundTouch, linear interpolation with anti-alias filtering is used. Theoretically a higher-order interpolation provides a better result than 1st order linear interpolation, but in audio applications linear interpolation together with anti-alias filtering performs subjectively about as well as higher-order filtering would.
Time-stretching means changing the audio stream duration without affecting its pitch. SoundTouch uses WSOLA-like time-stretching routines that operate in the time domain. Compared to sample rate transposing, time-stretching is a much heavier operation and also requires a longer processing "window" of sound samples to be kept inside the processing algorithm, which increases the algorithm input/output latency. Typical i/o latency for the SoundTouch time-stretch algorithm is around 100 ms.
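The interpolation step can be illustrated with a small C++ sketch. It resamples a mono floating point buffer with 1st order linear interpolation only; the anti-alias filtering mentioned above is left out, and the function name `transposeLinear` and its `rate` parameter are assumptions made for this example rather than the library's own routine.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Resamples a mono buffer with linear interpolation (illustrative sketch only).
// rate > 1 produces fewer samples (shorter duration, higher pitch on playback),
// rate < 1 produces more samples (longer duration, lower pitch on playback).
// No anti-alias filtering is done here.
std::vector<float> transposeLinear(const std::vector<float>& in, double rate) {
    assert(rate > 0.0);
    std::vector<float> out;
    if (in.size() < 2) {
        return out;  // at least two samples are needed to interpolate
    }
    const double last = static_cast<double>(in.size() - 1);
    for (std::size_t n = 0; ; ++n) {
        const double pos = n * rate;  // read position in the input stream
        if (pos >= last) {
            break;
        }
        const std::size_t i = static_cast<std::size_t>(pos);
        const double frac = pos - static_cast<double>(i);
        // Weighted average of the two neighbouring input samples.
        out.push_back(static_cast<float>((1.0 - frac) * in[i] + frac * in[i + 1]));
    }
    return out;
}
```

Computing the read position as `n * rate` instead of adding `rate` repeatedly avoids accumulating rounding error over long buffers.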
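As a rough illustration of what a time-domain, WSOLA-like routine does at each step, the C++ sketch below searches a seek window for the input position that best matches the end of the previously produced audio and then cross-fades the two. It is a simplified sketch under stated assumptions, not the SoundTouch implementation: the function names are hypothetical, the similarity measure is a plain unnormalized cross-correlation, and the cross-fade is a linear ramp.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the offset in [0, seekLength) at which `input` correlates best
// with `tail`, the last samples of the audio produced so far.
std::size_t findBestOverlapOffset(const std::vector<float>& tail,
                                  const std::vector<float>& input,
                                  std::size_t seekLength) {
    assert(!tail.empty());
    assert(seekLength >= 1);
    assert(input.size() >= tail.size() + seekLength - 1);
    std::size_t bestOffset = 0;
    double bestCorr = 0.0;
    for (std::size_t offset = 0; offset < seekLength; ++offset) {
        double corr = 0.0;  // unnormalized cross-correlation at this offset
        for (std::size_t i = 0; i < tail.size(); ++i) {
            corr += static_cast<double>(tail[i]) * input[offset + i];
        }
        if (offset == 0 || corr > bestCorr) {
            bestCorr = corr;
            bestOffset = offset;
        }
    }
    return bestOffset;
}

// Cross-fades linearly from `tail` into `input` starting at `offset`.
std::vector<float> crossfade(const std::vector<float>& tail,
                             const std::vector<float>& input,
                             std::size_t offset) {
    const std::size_t n = tail.size();
    assert(input.size() >= offset + n);
    std::vector<float> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        const double w = static_cast<double>(i) / static_cast<double>(n);
        out[i] = static_cast<float>((1.0 - w) * tail[i] + w * input[offset + i]);
    }
    return out;
}
```

The nested loop in the search is where most of the processing time goes in a scheme like this, which is consistent with time-stretching being the heavier of the two primary effects.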
Sample rate transposing and time-stretching are then used together to produce the tempo, pitch and rate controls:
The time-stretch algorithm has a few parameters that can be tuned to optimize sound quality for a certain application. The current default parameters have been chosen by iterative if-then analysis (read: "trial and error") to obtain the best subjective sound quality in pop/rock music processing, but in applications that process different kinds of sound the default parameter set may give a sub-optimal result.
The time-stretch algorithm default parameter values are set by these #defines in file "TDStretch.h":
#define DEFAULT_SEQUENCE_MS 82
#define DEFAULT_SEEKWINDOW_MS 28
#define DEFAULT_OVERLAP_MS 12
These parameters affect the time-stretch algorithm as follows:
Notice that these parameters can also be set at run time with the functions "TDStretch::setParameters()" and "SoundTouch::setSetting()".
The table below summarizes how the parameters can be adjusted for different applications:
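For a sense of scale, the millisecond values of the defaults can be converted to sample counts for a given sample rate. The helper `msToSamples` below is a hypothetical function written for this example, and it simply truncates toward zero; the library's own rounding may differ.

```cpp
#include <cassert>

// Default values as listed in "TDStretch.h".
#define DEFAULT_SEQUENCE_MS   82
#define DEFAULT_SEEKWINDOW_MS 28
#define DEFAULT_OVERLAP_MS    12

// Converts a duration in milliseconds to a number of samples per channel.
// Assumption: the result is truncated toward zero.
int msToSamples(int sampleRate, int ms) {
    assert(sampleRate > 0 && ms >= 0);
    return (sampleRate * ms) / 1000;
}
```

At a 44100 Hz sample rate this gives 3616 samples for the sequence, 1234 samples for the seek window and 529 samples for the overlap, per channel.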
Parameter name | Default value magnitude | Larger value affects... | Smaller value affects... | Music | Speech | Effect on CPU burden |
SEQUENCE_MS | Default value is relatively large, chosen for slowing down music tempo | Larger value is usually better for slowing down tempo. Growing the value decelerates the "echoing" artifact when slowing down the tempo. | Smaller value might be better for speeding up tempo. Reducing the value accelerates the "echoing" artifact when slowing down the tempo. | Default value usually good | A smaller value than default might be better | Increasing the parameter value reduces computation burden |
SEEKWINDOW_MS | Default value is relatively large, chosen for slowing down music tempo | Larger value eases finding a good mixing position, but may cause a "drifting" artifact | Smaller value reduces the possibility of finding a good mixing position, but reduces the "drifting" artifact. | Default value usually good, unless a "drifting" artifact is disturbing. | Default value usually good | Increasing the parameter value increases computation burden |
OVERLAP_MS | Default value is relatively large, chosen to suit the above parameters. | | If you reduce the "sequence ms" setting, you might wish to try a smaller value. | | | Increasing the parameter value increases computation burden |
General optimizations:
The time-stretch routine has a 'quick' mode that substantially speeds up the algorithm but may degrade the sound quality by a small amount. This mode is activated by calling the SoundTouch::setSetting() function with the parameter id SETTING_USE_QUICKSEEK and the value "1", i.e.
setSetting(SETTING_USE_QUICKSEEK, 1);
CPU-specific optimizations:
SoundStretch audio processing utility
Copyright (c) Olli Parviainen 2002-2003
SoundStretch is a simple command-line application that can change the tempo, pitch and playback rate of WAV sound files. This program is intended primarily to demonstrate how the "SoundTouch" library can be used to process sound in your own programs, but it can just as well be used for processing sound files.
SoundStretch Usage syntax:
soundstretch infile.wav outfile.wav [switches]
Where:
"infile.wav" | is the name of the input sound data file (in .WAV audio file format). |
"outfile.wav" | is the name of the output sound file where the resulting sound is saved (in .WAV audio file format). This parameter may be omitted if the output doesn't need to be saved (e.g. when only calculating the BPM rate with the '-bpm' switch). |
[switches] | are one or more control switches. |
Available control switches are:
-tempo=n | Change sound tempo by n percent (n = -95.0 .. +5000.0 %) |
-pitch=n | Change sound pitch by n semitones (n = -60.0 .. +60.0 semitones) |
-rate=n | Change sound playback rate by n percent (n = -95.0 .. +5000.0 %) |
-bpm=n | Detect the Beats-Per-Minute (BPM) rate of the sound and adjust the tempo to meet 'n' BPMs. If this switch is defined, the "-tempo=n" switch value is ignored. If "=n" is omitted, i.e. the switch "-bpm" is used alone, the program just calculates and displays the BPM rate but doesn't adjust the tempo according to the BPM value. |
-quick | Use a quicker tempo change algorithm. Gains speed but loses sound quality. |
-naa | Don't use anti-alias filtering in sample rate transposing. Gains speed but loses sound quality. |
-license | Displays the program license text (LGPL) |
Notes:
Example 1
The following command increases the tempo of the sound file "originalfile.wav" by 12.5% and saves the result to the file "destinationfile.wav":
soundstretch originalfile.wav destinationfile.wav -tempo=12.5
Example 2
The following command decreases the sound pitch (key) of the sound file "orig.wav" by two semitones and saves the result to file "dest.wav":
soundstretch orig.wav dest.wav -pitch=-2
Example 3
The following command processes the file "orig.wav" by decreasing the sound tempo by 25.3% and increasing the sound pitch (key) by 1.5 semitones. The result is saved to the file "dest.wav":
soundstretch orig.wav dest.wav -tempo=-25.3 -pitch=1.5
Example 4
The following command detects the BPM rate of the file "orig.wav" and adjusts the tempo to match 100 beats per minute. The result is saved to the file "dest.wav":
soundstretch orig.wav dest.wav -bpm=100
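The switch values used in the examples above are given in percent and semitones. The C++ sketch below shows one way to turn them into plain multiplication factors. The function names are hypothetical and this isn't taken from the SoundStretch source; it assumes that a change of n percent means a factor of 1 + n/100 and that one semitone corresponds to a frequency ratio of 2^(1/12), as in equal temperament.

```cpp
#include <cassert>
#include <cmath>

// Converts a relative change in percent (e.g. +12.5 or -25.3) to a factor.
// Assumption: n percent corresponds to a factor of 1 + n/100.
double percentToRatio(double percent) {
    assert(percent > -100.0);  // a change of -100 % or less would stop or reverse playback
    return 1.0 + percent / 100.0;
}

// Converts a pitch change in semitones to a frequency ratio.
// Assumption: equal temperament, 12 semitones per octave.
double semitonesToRatio(double semitones) {
    return std::pow(2.0, semitones / 12.0);
}
```

Under these assumptions "-tempo=12.5" corresponds to a factor of 1.125, and "-pitch=-2" to a frequency ratio of about 0.891.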
v1.3.0:
v1.2.1:
v1.2.0:
v1.1.1:
v1.01:
v1.0:
v1.3.0:
v1.2.1:
v1.2.0:
v1.1.1:
v1.1:
v1.01:
SoundTouch audio processing library
Copyright (c) Olli Parviainen
This library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version.
This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public License along with this library; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA