This is an outline of the methodology used in the
construction of the data presented in Appendix C of the
document (fractal.ps.gz or fractal.pdf, about 6MB). The
reader is assumed to have a basic knowledge of computing
concepts, statistics, and the manipulation of time series
data sets. The man(1) pages and the C sources for the
programs used in the methodology are available on the
Utilities page.
Procedure
The following procedure is executed in the markets
subdirectory of the sources from a "Makefile" using the Unix
utility make(1). The programs tsfraction,
tslsq,
tsnormal,
tsrms,
tshurst,
tshcalc,
tsunfairbrownian,
tsshannonmax,
tsshannon,
tsXsquared,
tsderivative,
tsstatest,
and tslogreturns
are described briefly in the Programs Appendix of the
document, and in addition have online manual pages which can
be viewed with the Unix utility man(1). In-depth
descriptions of the programs are available in the program
sources.
Note that many of the parametric values in the analysis of
the fractal time series data set are derived by more than
one method, so that the results can be checked against each
other for consistency.
Run the program tsfraction
on the fractal time series data set to produce a time series
of the increments.
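As a rough illustration of this step, the sketch below computes the increments as fractional changes, (x[t] - x[t-1]) / x[t-1]. That definition is an assumption, based on the "normalized increments" referred to later in this outline. The function name and the sample values are hypothetical, and the tsfraction sources remain the authority on what the program actually does.

```python
def increments(series):
    """Fractional change between consecutive values (assumed definition)."""
    return [(b - a) / a for a, b in zip(series, series[1:])]


# Made-up sample data, for illustration only.
prices = [100.0, 110.0, 99.0, 99.0]
inc = increments(prices)
print(inc)
```

The resulting series is one record shorter than the input, since the first value has no predecessor.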
Run the program tslsq,
with the -p option, on the time series of the increments to
produce the least squares fit formula for the average of the
increments in the time series of the increments.
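A least squares fit of a straight line to a series can be sketched as follows. The sketch uses the ordinary closed-form formulas with the record number as the independent variable, which is an assumption about how tslsq treats its input. The function name is hypothetical.

```python
def least_squares_line(values):
    """Fit values[t] ~ a + b * t for t = 0, 1, 2, ... and return (a, b)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points")
    t_mean = (n - 1) / 2.0
    v_mean = sum(values) / n
    # Sums of squares and cross products about the means.
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    b = sxy / sxx
    a = v_mean - b * t_mean
    return a, b


a, b = least_squares_line([1.0, 3.0, 5.0, 7.0])
print(a, b)
```

For a series of increments with no trend, the slope b should be close to zero, and the constant a should be close to the mean of the increments.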
Run the program tsnormal,
with the -p option, on the time series of the increments to
produce the mean and standard deviation for the average of
the increments in the time series of the
increments.
Run the program tsrms
on the time series of the increments to produce a time
series of the root mean square of the time series of the
increments.
Run the program tsrms,
with the -p option, on the time series of the increments to
produce the root mean square of the time series of the
increments.
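The mean, standard deviation, and root mean square used in these steps are simple to state in code. The sketch below is an illustration under assumptions: the standard deviation uses n - 1 in the denominator, and the "time series of the root mean square" is taken to be a running value over the first 1, 2, ..., n records. Neither choice is confirmed by this outline, and all function names are hypothetical.

```python
import math


def mean(values):
    return sum(values) / len(values)


def std_dev(values):
    """Sample standard deviation (n - 1 in the denominator, an assumption)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def rms(values):
    """Root mean square of the whole series."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def running_rms(values):
    """Root mean square of the first 1, 2, ..., n values, as a series."""
    out, total = [], 0.0
    for n, v in enumerate(values, start=1):
        total += v * v
        out.append(math.sqrt(total / n))
    return out


sample = [0.01, -0.01, 0.01, -0.01]   # made-up increments
print(mean(sample), rms(sample), running_rms(sample))
```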
Using the Unix utility
sed(1), remove any negative signs
from the time series of the increments to produce a time
series of the absolute value of the time series of the
increments.
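The same result can be sketched without sed(1) by parsing each record as a number and taking its absolute value. The sketch assumes one numeric value per record, which may not match the actual file layout, and the function name is hypothetical.

```python
def absolute_values(lines):
    """Parse one number per line and return the absolute values."""
    return [abs(float(line)) for line in lines if line.strip()]


# Made-up records, including a blank line and a value in exponent form.
records = ["0.02", "-0.013", "", "-1.5e-05"]
print(absolute_values(records))
```

Working on parsed numbers rather than on characters leaves a sign inside an exponent, as in 1.5e-05, untouched.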
Run the program tslsq,
with the -p option, on the time series of the absolute value
of the increments to produce the least squares fit formula
for the absolute value of the increments.
Run the program tsnormal,
with the -p option, on the time series of the absolute value
of the increments to produce the mean and standard deviation
for the average of the time series of the absolute value of
the increments.
Run the program tsnormal,
with the options -t -s 30, on the time series of the
increments to produce a time series graph of the bell curve
of the distribution of the increments in the time series of
the increments.
Run the program tsnormal,
with the options -t -s 30 -f, on the time series of the
increments to produce a time series graph of the
distribution of the increments in the time series of the
increments.
Run the program tsXsquared
on the distribution of the increments to produce a chi^2
confidence level that the distribution of the increments
does have a Gaussian distribution.
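To make the idea of a chi^2 comparison against a Gaussian concrete, the sketch below computes the chi^2 statistic for a sample against a normal distribution with the sample's own mean and standard deviation. The bin count, the equal-width bins, and the open-ended tail bins are choices made for this illustration, not a description of tsXsquared. Converting the statistic to a confidence level requires the chi^2 distribution and is omitted. All names are hypothetical.

```python
import math
from statistics import NormalDist


def chi_squared_vs_normal(values, bins=10):
    """Chi^2 statistic of a histogram of values against a fitted normal."""
    n = len(values)
    mu = sum(values) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / (n - 1))
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    if width == 0.0:
        raise ValueError("all values are identical")
    # Observed counts in equal-width bins between min and max.
    observed = [0] * bins
    for v in values:
        observed[min(int((v - lo) / width), bins - 1)] += 1
    dist = NormalDist(mu, sigma)
    chi2 = 0.0
    for i in range(bins):
        # The first and last bins are open-ended so that the expected
        # counts sum to n.
        left = -math.inf if i == 0 else lo + i * width
        right = math.inf if i == bins - 1 else lo + (i + 1) * width
        expected = n * (dist.cdf(right) - dist.cdf(left))
        if expected > 0.0:
            chi2 += (observed[i] - expected) ** 2 / expected
    return chi2


# A deterministic, nearly Gaussian sample built from normal quantiles,
# and a two-valued sample that is clearly not Gaussian.
near_gaussian = [NormalDist().inv_cdf((i + 0.5) / 1000) for i in range(1000)]
two_valued = [-1.0] * 500 + [1.0] * 500
chi2_gaussian = chi_squared_vs_normal(near_gaussian)
chi2_two_valued = chi_squared_vs_normal(two_valued)
print(chi2_gaussian, chi2_two_valued)
```

A small statistic indicates that the histogram is close to the fitted bell curve; the two-valued sample should produce a far larger value than the nearly Gaussian one.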
Run the program tsstatest
on the distribution of the increments to produce an
estimation of the size of the required data set for
reasonable accuracy.
Run the program tsderivative
on the time series of the increments to produce the first
derivative of the time series of the
increments. Additionally, run the program tsnormal,
with the options -t -s 30, and -t -s 30 -f, to produce a time
series graph of the distribution of the first derivative of
the increments.
Run the program tsderivative
on the time series of the increments to produce the second
derivative of the time series of the increments.
Additionally, run the program tsnormal,
with the options -t -s 30, and -t -s 30 -f, to produce a time
series graph of the distribution of the second derivative of
the increments.
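For a discrete time series, a derivative can be approximated by the difference between consecutive records. The sketch below assumes that definition, and assumes that the second derivative is the difference of the first differences; whether tsderivative works this way is not stated in this outline. The function name is hypothetical.

```python
def derivative(series):
    """First difference: series[t] - series[t - 1]."""
    return [b - a for a, b in zip(series, series[1:])]


x = [1.0, 4.0, 9.0, 16.0, 25.0]    # made-up data: squares of 1..5
first = derivative(x)              # [3.0, 5.0, 7.0, 9.0]
second = derivative(first)         # [2.0, 2.0, 2.0]
print(first, second)
```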
Run the program tshurst
on the time series to produce a graph of the Hurst
coefficient of the time series.
Run the program tslsq,
with the -p option, on the graph of the Hurst coefficient of
the time series to produce the least squares fit formula for
the Hurst coefficient of the time series.
Run the program tshcalc
on the time series to produce a graph of the H parameter of
the time series of the increments.
Run the program tslsq,
with the -p option, on the graph of the H parameter of the
time series of the increments to produce the least squares
fit formula for the H parameter of the time series of the
increments.
Run the program tsunfairbrownian,
with the -f option and the root mean square value of the
time series of the increments, on the fractal time series
data set to produce a simulation of the fractal time series
data set.
Run the program tsfraction
on the simulation of the fractal time series data set to
produce a time series of the increments of the simulation of
the fractal time series data set.
Run the program tsnormal,
with the -p option, on the simulation of the time series of
the increments to produce the mean and standard deviation
for the average of the increments in the simulation of the
time series of the increments.
Run the program tsnormal,
with the options -t -s 30, on the simulation of the time
series of the increments to produce a time series graph of
the bell curve of the distribution of the increments in the
simulation of the time series of the increments.
Run the program tsnormal,
with the options -t -s 30 -f, on the simulation of the time
series of the increments to produce a time series graph of
the distribution of the increments in the simulation of the
time series of the increments.
Run the program tsshannonmax
on the fractal time series data set to produce a graph of
the maximum Shannon probability for the fractal time series
data set.
Run the program tsshannonmax,
with the -p option, on the fractal time series data set to
produce the value of the maximum Shannon probability for the
fractal time series data set.
Run the program tslogreturns,
with the -p option, on the fractal time series data set to
produce the value of the logarithmic returns of the fractal
time series data set.
Run the program tsshannon
with the value of the logarithmic returns of the fractal
time series data set to produce the value of the Shannon
probability for the fractal time series data set.
Run the program tslsq,
with the -e -p options, on the fractal time series data set
to produce the value of the coefficient of the exponential
returns for the fractal time series data set.
Run the program tsshannon
with the value of the coefficient of the exponential returns
of the fractal time series data set to produce the value of
the Shannon probability for the fractal time series data
set.
Use the Unix utility
egrep(1) with the argument "-e -"
on the time series of the increments to "filter" records
containing a negative sign. Pipe this time series to the
Unix utility wc(1) to produce a
count of the records in the time series of the increments
with negative signs.
Use the Unix utility wc(1)
on the time series of the increments to produce a count of
the records in the time series of the increments.
Use the Unix utility awk(1) to
divide the count of the records in the time series of the
increments with negative signs by the count of the records
in the time series of the increments, and subtract the
result from unity, to produce the value of the maximum
Shannon probability for the time series of the increments.
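The counting steps above amount to one minus the fraction of negative increments. A minimal sketch of that arithmetic follows, with a hypothetical function name and made-up data.

```python
def shannon_probability_by_count(increments):
    """One minus the fraction of records with a negative sign."""
    negatives = sum(1 for v in increments if v < 0)
    return 1.0 - negatives / len(increments)


sample = [0.01, -0.02, 0.03, 0.0]   # made-up increments
p = shannon_probability_by_count(sample)
print(p)
```

As in the procedure, a record with a zero increment carries no negative sign and is therefore counted with the positive movements.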
In addition, the Unix utility
awk(1) is used to parse and reformat
data from this procedure into LaTeX macros for direct import
into this manuscript. The Markets Appendix is machine
generated.
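The kind of reformatting described here can be sketched as a function that turns a name and a computed value into a LaTeX macro definition. The macro name, the number of digits, and the use of \newcommand are assumptions for illustration; the actual macros used in the manuscript are not shown in this outline.

```python
def latex_macro(name, value, digits=6):
    """Return a \\newcommand definition holding a formatted number."""
    # LaTeX macro names may contain letters only.
    if not name.isalpha():
        raise ValueError("macro name must consist of letters only")
    return "\\newcommand{\\" + name + "}{" + format(value, ".%df" % digits) + "}"


# Hypothetical macro name and made-up value.
line = latex_macro("incmean", 0.000417)
print(line)
```

A file of such lines can be read into a document with \input, so that the numbers in the text always match the most recent run of the analysis.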
Verification Methodology
As a cursory verification methodology:
Use the mean and root mean square values of the
normalized increments of the time series data, together with
the Shannon probability as calculated by counting the number
of records in which the market movement was positive,
relative to the total number of records in the data set, to
verify the accuracy.
Compare the Shannon probability, as found by the tsshannonmax
program, to the value of the Shannon probability as
calculated by counting the number of records in which the
market movement was positive, relative to the total number
of records in the data set.
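A comparison of this kind might look like the sketch below. It assumes the relation P = (avg / rms + 1) / 2 between the Shannon probability and the mean and root mean square of the increments. That relation is not stated in this outline and is used here as a labeled assumption; it is exact when every increment has the same magnitude and differs only in sign. The function names and data are hypothetical.

```python
import math


def shannon_probability_from_moments(increments):
    """P = (avg / rms + 1) / 2, an assumed relation (see the text above)."""
    n = len(increments)
    avg = sum(increments) / n
    rms = math.sqrt(sum(v * v for v in increments) / n)
    return (avg / rms + 1.0) / 2.0


def shannon_probability_by_count(increments):
    """Fraction of records in which the movement was positive."""
    positives = sum(1 for v in increments if v > 0)
    return positives / len(increments)


# Made-up increments: six up movements and four down, equal magnitude.
sample = [0.02] * 6 + [-0.02] * 4
p_moments = shannon_probability_from_moments(sample)
p_count = shannon_probability_by_count(sample)
print(p_moments, p_count)
```

On real data the two values will generally differ somewhat, and the size of the difference is what the comparison is meant to expose.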
Compare the four methods of calculating the
logarithmic returns:
By calculation based on the mean of the normalized
increments.
By the calculation of the constant in the least
squares approximation to the normalized
increments.
By the calculation of the exponential least squares
fit to the original time series data set, with the program
tslsq.
By the calculation of the logarithmic returns, with
the program tslogreturns.
Using the mean, standard deviation, and root mean
square of the normalized increments, and the Shannon
probability as calculated by counting the number of records
in which the industrial market movement was positive,
relative to the total number of records in the data set,
verify the accuracy.
Compare the accuracy of the equality of the absolute
value and root mean square of the normalized
increments.
Note that the numerical manipulations are relatively
simple, and can be implemented with simple
awk(1) scripts.
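As an example of how simple such a manipulation can be, the sketch below compares two of the ways of estimating the logarithmic returns mentioned above: the mean of the logarithm of the ratio of consecutive values, and the slope of a least squares line fitted to the logarithm of the series. It is written in Python rather than awk(1) for readability. Both definitions are assumptions about what tslogreturns and the exponential fit of tslsq compute, and the function names are hypothetical.

```python
import math


def mean_log_return(series):
    """Mean of ln(x[t] / x[t-1]) over the series."""
    logs = [math.log(b / a) for a, b in zip(series, series[1:])]
    return sum(logs) / len(logs)


def exponential_fit_rate(series):
    """Slope of the least squares line fitted to ln(x[t]) against t."""
    n = len(series)
    ys = [math.log(v) for v in series]
    t_mean = (n - 1) / 2.0
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(ys))
    return sxy / sxx


# Made-up series growing by exactly 1% per record.
series = [100.0 * 1.01 ** t for t in range(50)]
rate_mean = mean_log_return(series)
rate_fit = exponential_fit_rate(series)
print(rate_mean, rate_fit, math.log(1.01))
```

For a series with exact exponential growth the two estimates coincide; on market data they differ, and the comparison indicates how well the exponential model describes the series.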
A license is hereby granted to reproduce this software
source code and to create executable versions from this source
code for personal, noncommercial use. The copyright notice
included with the software must be maintained in all copies
produced.
THIS PROGRAM IS PROVIDED "AS IS". THE AUTHOR PROVIDES NO
WARRANTIES WHATSOEVER, EXPRESSED OR IMPLIED, INCLUDING
WARRANTIES OF MERCHANTABILITY, TITLE, OR FITNESS FOR ANY
PARTICULAR PURPOSE. THE AUTHOR DOES NOT WARRANT THAT USE OF
THIS PROGRAM DOES NOT INFRINGE THE INTELLECTUAL PROPERTY
RIGHTS OF ANY THIRD PARTY IN ANY COUNTRY.
So there.
Copyright © 1994-2011, John Conover, All Rights
Reserved.
Comments and/or bug reports should be addressed to:
 john@email.johncon.com
 http://www.johncon.com/
 http://www.johncon.com/ntropix/
 http://www.johncon.com/ndustrix/
 http://www.johncon.com/nformatix/
 http://www.johncon.com/ndex/
 John Conover
 john@email.johncon.com
 January 6, 2006
