# Created by Octave 3.6.1, Tue Mar 20 07:03:19 2012 UTC # name: cache # type: cell # rows: 3 # columns: 7 # name: # type: sq_string # elements: 1 # length: 2 au # name: # type: sq_string # elements: 1 # length: 315 y = au(x, fs, lo [, hi]) Extract data from x for time range lo to hi in milliseconds. If lo is [], start at the beginning. If hi is [], go to the end. If hi is not specified, return the single element at lo. If lo<0, prepad the signal to time lo. If hi is beyond the end, postpad the signal to time hi. # name: # type: sq_string # elements: 1 # length: 26 y = au(x, fs, lo [, hi]) # name: # type: sq_string # elements: 1 # length: 6 auload # name: # type: sq_string # elements: 1 # length: 1008 -- Function File: [X,FS,SAMPLEFORMAT] = auload (FILENAME) Reads an audio waveform from a file given by the string FILENAME. Returns the audio samples in data, one column per channel, one row per time slice. Also returns the sample rate and stored format (one of ulaw, alaw, char, int16, int24, int32, float, double). The sample value will be normalized to the range [-1,1] regardless of the stored format. [x, fs] = auload(file_in_loadpath("sample.wav")); auplot(x,fs); Note that translating the asymmetric range [-2^n,2^n-1] into the symmetric range [-1,1] requires a DC offset of 2/2^n. The inverse process used by ausave requires a DC offset of -2/2^n, so loading and saving a file will not change the contents. Other applications may compensate for the asymmetry in a different way (including previous versions of auload/ausave) so you may find small differences in calculated DC offsets for the same file. # name: # type: sq_string # elements: 1 # length: 65 Reads an audio waveform from a file given by the string FILENAME. # name: # type: sq_string # elements: 1 # length: 6 auplot # name: # type: sq_string # elements: 1 # length: 2310 -- Function File: [Y,T,SCALE] = auplot (X) -- Function File: [Y,T,SCALE] = auplot (X,FS) -- Function File: [Y,T,SCALE] = auplot (X,FS,OFFSET) -- Function File: [Y,T,SCALE] = auplot (...,PLOTSTR) Plot the waveform data, displaying time on the X axis. If you are plotting a slice from the middle of an array, you may want to specify the OFFSET into the array to retain the appropriate time index. If the waveform contains multiple channels, then the data are scaled to the range [-1,1] and shifted so that they do not overlap. If a PLOTSTR is given, it is passed as the third argument to the plot command. This allows you to set the linestyle easily. FS defaults to 8000 Hz, and OFFSET defaults to 0 samples. Instead of plotting directly, you can ask for the returned processed vectors. If Y has multiple channels, the plot should have the y-range [-1 2*size(y,2)-1]. scale specifies how much the matrix was scaled so that each signal would fit in the specified range. Since speech samples can be very long, we need a way to plot them rapidly. For long signals, auplot windows the data and keeps the minimum and maximum values in the window. Together, these values define the minimal polygon which contains the signal. The number of points in the polygon is set with the global variable auplot_points. The polygon may be either 'filled' or 'outline', as set by the global variable auplot_format. For moderately long data, the window does not contain enough points to draw an interesting polygon. In this case, simply choosing an arbitrary point from the window looks best. The global variable auplot_window sets the size of the window required for creating polygons. You can turn off the polygons entirely by setting auplot_format to 'sampled'. To turn off fast plotting entirely, set auplot_format to 'direct', or set auplot_points=1. There is no reason to do this since your screen resolution is limited and increasing the number of points plotted will not add any information. auplot_format, auplot_points and auplot_window may be set in .octaverc. By default auplot_format is 'outline', auplot_points=1000 and auplot_window=7. # name: # type: sq_string # elements: 1 # length: 54 Plot the waveform data, displaying time on the X axis. # name: # type: sq_string # elements: 1 # length: 6 ausave # name: # type: sq_string # elements: 1 # length: 871 usage: ausave('filename.ext', x, fs, format) Writes an audio file with the appropriate header. The extension on the filename determines the layout of the header. Currently supports .wav and .au layouts. Data is a matrix of audio samples in the range [-1,1] (inclusive), one row per time step, one column per channel. Fs defaults to 8000 Hz. Format is one of ulaw, alaw, char, short, long, float, double Note that translating the symmetric range [-1,1] into the asymmetric range [-2^n,2^n-1] requires a DC offset of -2/2^n. The inverse process used by auload requires a DC offset of 2/2^n, so loading and saving a file will not change the contents. Other applications may compensate for the asymmetry in a different way (including previous versions of auload/ausave) so you may find small differences in calculated DC offsets for the same file. # name: # type: sq_string # elements: 1 # length: 25 usage: ausave('filename. # name: # type: sq_string # elements: 1 # length: 4 clip # name: # type: sq_string # elements: 1 # length: 206 Clip values outside the range to the value at the boundary of the range. X = clip(X) Clip to range [0, 1] X = clip(X, hi) Clip to range [0, hi] X = clip(X, [lo, hi]) Clip to range [lo, hi] # name: # type: sq_string # elements: 1 # length: 74 Clip values outside the range to the value at the boundary of the range. # name: # type: sq_string # elements: 1 # length: 5 sound # name: # type: sq_string # elements: 1 # length: 2377 usage: sound(x [, fs, bs]) Play the signal through the speakers. Data is a matrix with one column per channel. Rate fs defaults to 8000 Hz. The signal is clipped to [-1, 1]. Buffer size bs controls how many audio samples are clipped and buffered before sending them to the audio player. bs defaults to fs, which is equivalent to 1 second of audio. Note that if $DISPLAY != $HOSTNAME:n then a remote shell is opened to the host specified in $HOSTNAME to play the audio. See manual pages for ssh, ssh-keygen, ssh-agent and ssh-add to learn how to set it up. This function writes the audio data through a pipe to the program "play" from the sox distribution. sox runs pretty much anywhere, but it only has audio drivers for OSS (primarily linux and freebsd) and SunOS. In case your local machine is not one of these, write a shell script such as ~/bin/octaveplay, substituting AUDIO_UTILITY with whatever audio utility you happen to have on your system: #!/bin/sh cat > ~/.octave_play.au SYSTEM_AUDIO_UTILITY ~/.octave_play.au rm -f ~/.octave_play.au and set the global variable (e.g., in .octaverc) global sound_play_utility="~/bin/octaveplay"; If your audio utility can accept an AU file via a pipe, then you can use it directly: global sound_play_utility="SYSTEM_AUDIO_UTILITY flags" where flags are whatever you need to tell it that it is receiving an AU file. With clever use of the command dd, you can chop out the header and dump the data directly to the audio device in big-endian format: global sound_play_utility="dd of=/dev/audio ibs=2 skip=12" or little-endian format: global sound_play_utility="dd of=/dev/dsp ibs=2 skip=12 conv=swab" but you lose the sampling rate in the process. Finally, you could modify sound.m to produce data in a format that you can dump directly to your audio device and use "cat >/dev/audio" as your sound_play_utility. Things you may want to do are resample so that the rate is appropriate for your machine and convert the data to mulaw and output as bytes. If you experience buffer underruns while playing audio data, the bs buffer size parameter can be increased to tradeoff interactivity for smoother playback. If bs=Inf, then all the data is clipped and buffered before sending it to the audio player pipe. By default, 1 sec of audio is buffered. # name: # type: sq_string # elements: 1 # length: 28 usage: sound(x [, fs, bs]) # name: # type: sq_string # elements: 1 # length: 7 soundsc # name: # type: sq_string # elements: 1 # length: 793 usage: soundsc(x, fs, limit) or soundsc(x, fs, [ lo, hi ]) soundsc(x) Scale the signal so that [min(x), max(x)] -> [-1, 1], then play it through the speakers at 8000 Hz sampling rate. The signal has one column per channel. soundsc(x,fs) Scale the signal and play it at sampling rate fs. soundsc(x, fs, limit) Scale the signal so that [-|limit|, |limit|] -> [-1, 1], then play it at sampling rate fs. If fs is empty, then the default 8000 Hz sampling rate is used. soundsc(x, fs, [ lo, hi ]) Scale the signal so that [lo, hi] -> [-1, 1], then play it at sampling rate fs. If fs is empty, then the default 8000 Hz sampling rate is used. y=soundsc(...) return the scaled waveform rather than play it. See sound for more information. # name: # type: sq_string # elements: 1 # length: 60 usage: soundsc(x, fs, limit) or soundsc(x, fs, [ lo, hi ])