TSHEADPHONE(1)                                                TSHEADPHONE(1)

NAME
       tsheadphone - compensate a time series file from a music CD for
       headphones

SYNOPSIS
       tsheadphone [-b lfcf] [-c hfcf] [-d dly] [-f pole] [-g gain] [-p]
       [-v] [infile.txt [outfile.txt]]

DESCRIPTION
       RIFF WAVE compensator. Compensates a file consisting of a time
       series of numbers. Each record in the time series is two tab
       delimited numbers: the first number is the PCM magnitude for stereo
       channel one, and the second number is the PCM magnitude for stereo
       channel two. The output is a time series, compensated for use with
       stereo headphones.

       Tsheadphone takes the time series output of tsriff(1) and
       compensates it for headphones. For example:

           tsriff myfile.wav | tsheadphone | tsunriff myfile.wav new.wav

       where the file new.wav is the RIFF/WAVE file myfile.wav,
       compensated for headphones. The header for new.wav is copied from
       the original, myfile.wav, by the tsunriff program (however, the -H
       option to the tsriff(1) program could be used if the header needs
       to be manipulated).

       Consider listening to a stereo system with speakers. At low
       frequencies, at half wavelengths longer than the distance between
       the ears, sound will diffract around the head and there will be no
       directional characteristics. So, for wavelengths longer than about
       a foot (corresponding to a frequency of about 1100 Hz), there is no
       stereo effect. For frequencies below about 1100 Hz, the left and
       right channels can be simply added together for each headphone
       channel.

       Now consider frequencies higher than about 1100 Hz. Those
       frequencies have directional characteristics; if the sound from the
       left and right speakers is identical (i.e., monophonic), the sound
       would appear to come from a point source directly in front of the
       listener, for both speakers and headphones.
       If the sound from the left and right speakers is not identical,
       then it would appear to come from point sources to the left and
       right of a listener with headphones (which makes the music appear
       to come from inside the head). Note that it is a vector addition:
       identical sound in both channels appears to come from straight
       ahead in headphones, while different sounds appear to come directly
       from the left or right. To make sound appear to come from in
       between those directions, a fraction of one channel, the cross feed
       factor, must be added to the other, and vice versa.

       A standard configuration for a stereo system is for the speakers to
       be located 15 feet apart, and aimed at the listener at an included
       angle of 60 degrees, which provides an illusion of spatially
       distributed music. For example, if the listener is looking straight
       ahead, then the sound level of the right channel would be
       cos (60) = 0.5 from the right speaker, and cos (120) = -0.5 from
       the left speaker. If the listener's head turns 15 degrees toward
       the left speaker, the sound level would rise to cos (45) = 0.707
       from the right speaker and cos (105) = -0.259 from the left
       speaker. Likewise, sound that is balanced in a ratio of 0.707 from
       the right speaker and -0.259 from the left speaker would appear 15
       degrees to the right of the listener (by reciprocity arguments).
       This is the stereo illusion. Most recorded music is mixed according
       to this model.

       Looked at from a slightly different perspective, suppose a
       recording engineer wants the illusion that a sound is coming from a
       source that is theta degrees to the right of straight ahead. Then
       mixing the sound level in a ratio of
       sin(theta) / cos(theta) = tan(theta) between the left and right
       channels would produce the desired effect. For example, if theta
       was 15 degrees, then sin(15) = 0.259 and cos(15) = 0.966, so the
       ratio of left to right channel sound level would be
       0.259 / 0.966 = 0.268 (equivalently, right to left is
       0.966 / 0.259 = 3.73).
       (Assuming that the speakers were "spherical radiators.")

       For headphones to reproduce the same stereo illusion, by
       reciprocity arguments, the sound should appear to come from two
       sources separated by +/- 30 degrees (or 60 and 120 degrees from the
       axis of the listener's ears). Recorded music mixed with a sound
       level ratio of sin (60) / cos (60) = 1.732 between the left and
       right speakers would produce sound appearing to come from 30
       degrees off axis. Or, a factor of 1 / 1.732 = 0.577 from one
       channel should be added to the other, and vice versa.

       So as an approximate model, at frequencies below about 1100 Hz,
       both the right and left channels are added together for each stereo
       channel. At frequencies above 1100 Hz, each stereo channel has a
       cross feed of about a factor of 0.577 (about -5 dB) from the other
       channel added to it, ignoring sound blocking by the head itself.

       Note that the analysis is in good agreement with the empirical
       literature (see: http://headwize.com/tech/elemnts_tech.htm,
       Technical Paper: The Elements of Musical Perception by HeadWize,
       for particulars): E. MacPherson, "A Computer Model of Binaural
       Localization of Stereo Imaging Measurement," JAES, September, 1991,
       claims a cross feed of about -3 dB = 0.71 below 700 Hz, falling to
       about -10.0 dB = 0.32 above 700 Hz, with a 300 microsecond delay in
       the cross feed between channels, corresponding to about a quarter
       of a foot (see: http://headwize.com/tech/headrm1_tech.htm,
       Technical Paper: The Elements of Musical Perception). (Which is
       about -3 dB from the analysis at all frequencies; the above
       analysis did not include blocking by the head, which apparently is
       about -3 dB.)

       To produce the single pole low pass filter at 700 Hz, the sources
       (http://localhost/www.johncon.com/ndustrix/utilities/tspole.tar.gz)
       of the tspole(1) program
       (http://localhost/www.johncon.com/ndustrix/utilities/tspole.txt)
       from the NdustriX site (http://localhost/www.johncon.com/ndustrix/)
       were used.
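       The figures quoted in the analysis above can be checked with a few
       lines of arithmetic. The following is only a sketch for verifying
       the numbers; the function names (cos_deg, cross_feed_factor, to_db,
       from_db) are hypothetical and are not part of tsheadphone(1).

```python
import math


def cos_deg(degrees):
    """Cosine of an angle given in degrees."""
    return math.cos(math.radians(degrees))


def cross_feed_factor(angle_from_ear_axis_deg):
    """Cross feed factor cos/sin, the reciprocal of the sin/cos mixing ratio."""
    angle = math.radians(angle_from_ear_axis_deg)
    return math.cos(angle) / math.sin(angle)


def to_db(ratio):
    """Convert a sound level (voltage or pressure) ratio to decibels."""
    return 20.0 * math.log10(ratio)


def from_db(db):
    """Convert decibels back to a sound level ratio."""
    return 10.0 ** (db / 20.0)


# Head turned 15 degrees toward the left speaker.
print(round(cos_deg(45.0), 3))    # 0.707
print(round(cos_deg(105.0), 3))   # -0.259

# Source placed 15 degrees to the right: left to right ratio tan(15).
print(round(math.tan(math.radians(15.0)), 3))  # 0.268

# Headphone sources at 60 degrees from the ear axis.
factor = cross_feed_factor(60.0)
print(round(factor, 3))           # 0.577
print(round(to_db(factor), 1))    # -4.8

# Empirical cross feed levels quoted from the literature.
print(round(from_db(-3.0), 2))    # 0.71
print(round(from_db(-10.0), 2))   # 0.32
```

       The decibel conversions use 20 times log base 10 of the ratio,
       which is the voltage or sound pressure convention.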
       The single pole low pass filtering of a time series is implemented
       from the following discrete time equation:

           v[n+1] = I * k2 + v[n] * k1

       where I is the value of the current input sample in the time
       series, v[n] and v[n+1] are the n'th and n + 1'th values of the
       output time series, and k1 and k2 are constants determined from the
       following equations:

           k1 = e^(-2 * p * pi)

       and

           k2 = 1 - k1

       where p is a constant that determines the frequency of the pole; a
       value of unity places the pole at the sample frequency of the time
       series. For a pole frequency of 700 Hz the value of p is about
       0.016 (p = 700 / 44,100), and pi is 3.14159 ...

       The high frequency zero is constructed by multiplying the magnitude
       of one channel by 0.32 and summing it into the other channel, for
       all frequencies. For frequencies below 700 Hz, the magnitude of one
       channel is additionally multiplied by 0.71 - 0.32 = 0.39 and summed
       into the other channel (e.g., for frequencies below 700 Hz, the
       left channel has a sound level of L + 0.32R + 0.39R = L + 0.71R).
       Above 700 Hz, only the factor of 0.32 of one channel is summed into
       the other (so the left channel would be L + 0.32R).

       RIFF/WAVE PCM values are restricted to -32768 to 32767, and for
       frequencies below the pole of 700 Hz the magnitude has been
       increased by a factor of 1.71, so the gain is adjusted by
       multiplying the output by 0.59 (about 1 / 1.71), which brings the
       low frequency sound level back near the original. This means, for a
       mono recording (i.e., both stereo channels contain the same PCM
       values), at frequencies much higher than 700 Hz, the magnitude
       would be reduced by a factor of (1 + 0.32) / 1.71 = 0.77, or about
       2 dB (voltage or sound pressure, i.e., 20 times log base 10 of the
       ratio).

       The delay in the cross feed can be altered using the -d option to
       the tsheadphone(1) program. The delay equivalent to 0.25 feet would
       be about 300 microseconds.
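       The filter, cross feed, and delay described here can be sketched in
       a few lines. This is a minimal illustration written from the
       description in this page, not the source code of tsheadphone(1);
       the names pole_constants, crossfeed, and LEVEL are hypothetical,
       the delay is modeled as a ring buffer of whole samples, and
       applying the 0.59 level factor and the -g gain as two separate
       multipliers is an assumption.

```python
import math
from collections import deque

LEVEL = 0.59  # level adjustment quoted in the text, about 1 / 1.71


def pole_constants(pole_hz, sample_hz=44100.0):
    """Return (k1, k2) for the single pole low pass filter."""
    p = pole_hz / sample_hz            # p = 700 / 44100 is about 0.016
    k1 = math.exp(-2.0 * p * math.pi)
    return k1, 1.0 - k1


def crossfeed(samples, lfcf=0.71, hfcf=0.32, pole_hz=700.0,
              delay_s=0.0003, gain=1.0, sample_hz=44100.0):
    """Cross feed a list of (left, right) PCM pairs; returns a new list."""
    k1, k2 = pole_constants(pole_hz, sample_hz)
    buckets = round(delay_s * sample_hz)   # 300 microseconds is about 13
    # Ring buffers holding the last `buckets` samples of each channel.
    ring_l = deque([0.0] * buckets, maxlen=buckets)
    ring_r = deque([0.0] * buckets, maxlen=buckets)
    v_l = v_r = 0.0                        # low pass filter state
    out = []
    for left, right in samples:
        if buckets:
            d_l, d_r = ring_l[0], ring_r[0]   # oldest sample = delayed
            ring_l.append(left)
            ring_r.append(right)
        else:
            d_l, d_r = left, right
        # v[n+1] = I * k2 + v[n] * k1, applied to the delayed cross feed.
        v_l = d_l * k2 + v_l * k1
        v_r = d_r * k2 + v_r * k1
        # hfcf at all frequencies, plus (lfcf - hfcf) below the pole.
        out_l = left + hfcf * d_r + (lfcf - hfcf) * v_r
        out_r = right + hfcf * d_l + (lfcf - hfcf) * v_l
        out.append((LEVEL * gain * out_l, LEVEL * gain * out_r))
    return out


if __name__ == "__main__":
    # A constant (zero frequency) mono input settles near
    # 0.59 * 1.71 * 1000, i.e. close to the original level.
    mono = [(1000.0, 1000.0)] * 2000
    left, right = crossfeed(mono)[-1]
    print(round(left, 1), round(right, 1))
```

       With the default parameters, an impulse in one channel first
       appears in the other channel 13 samples later, which matches the
       300 microsecond default of the -d option at a 44.1 kHz sample rate.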
       The delay is implemented with a ring buffer, each "bucket"
       representing 1 / 44.1 kHz, so 300 microseconds would require 13.2
       buckets, or about 13. This delay is inserted in the total cross
       feed (both below and above 700 Hz) for both channels.

       The -p option causes the program to print data, including the
       minimum, maximum, and RMS of all PCM values in both channels.
       Because the RIFF/WAVE standard only supports PCM data values that
       are a short signed int (two 8 bit bytes), overflow is a regular
       possibility, which can be remedied by lowering the value of the -g
       parameter and re-running the program.

OPTIONS
       -b lfcf
              Low frequency cross feed between stereo channels (defaults
              to 0.71).

       -c hfcf
              High frequency cross feed between stereo channels (defaults
              to 0.32).

       -d dly Delay, in seconds, of cross feed between stereo channels
              (defaults to 0.0003).

       -f pole
              Pole frequency (defaults to 700 Hz).

       -g gain
              Volume gain (defaults to 1.0).

       -p     Print data about PCM values.

       -v     Print the version and copyright banner of the program.

       infile.txt
              Input filename of text PCM data (defaults to STDIN).

       outfile.txt
              Output filename of text PCM data (defaults to STDOUT).

WARNINGS
       There is little or no provision for handling numerical exceptions.

SEE ALSO
       tsriff(1), tsunriff(1), tsheadphone(1)

DIAGNOSTICS
       Error messages for incompatible arguments, failure to allocate
       memory, inaccessible files, and opening and closing files.

AUTHORS
       ----------------------------------------------------------------------

       A license is hereby granted to reproduce this software source code
       and to create executable versions from this source code for
       personal, non-commercial use. The copyright notice included with
       the software must be maintained in all copies produced.

       THIS PROGRAM IS PROVIDED "AS IS". THE AUTHOR PROVIDES NO WARRANTIES
       WHATSOEVER, EXPRESSED OR IMPLIED, INCLUDING WARRANTIES OF
       MERCHANTABILITY, TITLE, OR FITNESS FOR ANY PARTICULAR PURPOSE.
       THE AUTHOR DOES NOT WARRANT THAT USE OF THIS PROGRAM DOES NOT
       INFRINGE THE INTELLECTUAL PROPERTY RIGHTS OF ANY THIRD PARTY IN ANY
       COUNTRY.

       Copyright (c) 1994-2005, John Conover, All Rights Reserved.

       Comments and/or bug reports should be addressed to:

           john@email.johncon.com (John Conover)

       ----------------------------------------------------------------------

                              January 19, 2005                TSHEADPHONE(1)