Ndustrix: Architecture

Software For Business Intelligence Analytics:

Architecture

This is an outline of the methodology used to construct the data presented in Appendix C of the document (fractal.ps.gz, or fractal.pdf, about 6MB). The reader is assumed to have a rudimentary knowledge of computing concepts, statistics, and the manipulation of time series data sets. The man(1) pages for the programs used in the methodology, and their C sources, are available on the Utilities page.

Procedure

The following procedure is executed in the markets subdirectory of the sources from a "Makefile", using the Unix utility make(1). The programs tsfraction, tslsq, tsnormal, tsrms, tshurst, tshcalc, tsunfairbrownian, tsshannonmax, tsshannon, tsXsquared, tsderivative, tsstatest, and tslogreturns are described briefly in the Programs Appendix of the document, and in addition have online manual pages, which can be viewed with the Unix utility man(1). In-depth descriptions of the programs are available in the program sources.

Note that many of the parametric values in the analysis of the fractal time series data set are derived by more than one methodology. This is deliberate: comparing the independently derived values verifies their consistency.

Run the program tsfraction on the fractal time series data set to produce a time series of the increments.
Run the program tslsq, with the -p option, on the time series of the increments to produce the least squares fit formula for the average of the increments.
Run the program tsnormal, with the -p option, on the time series of the increments to produce the mean and standard deviation of the increments.
Run the program tsrms on the time series of the increments to produce a time series of the root mean square of the time series of the increments.
Run the program tsrms, with the -p option, on the time series of the increments to produce the root mean square of the time series of the increments.
Using the Unix utility sed(1), remove any negative signs from the time series of the increments to produce a time series of the absolute value of the time series of the increments.
Run the program tslsq, with the -p option, on the time series of the absolute value of the increments to produce the least squares fit formula for the absolute value of the increments.
Run the program tsnormal, with the -p option, on the time series of the absolute value of the increments to produce the mean and standard deviation of the absolute value of the increments.
Run the program tsnormal, with the options -t -s 30, on the time series of the increments to produce a time series graph of the bell curve of the distribution of the increments in the time series of the increments.
Run the program tsnormal, with the options -t -s 30 -f, on the time series of the increments to produce a time series graph of the distribution of the increments in the time series of the increments.
Run the program tsXsquared on the distribution of the increments to produce a chi^2 confidence level that the increments have a Gaussian distribution.
Run the program tsstatest on the distribution of the increments to produce an estimation of the size of the required data set for reasonable accuracy.
Run the program tsderivative on the time series of the increments to produce the first derivative of the time series of the increments. Additionally, run the program tsnormal on the first derivative, once with the options -t -s 30 and once with the options -t -s 30 -f, to produce time series graphs of the distribution of the first derivative of the increments.
Run the program tsderivative on the time series of the increments to produce the second derivative of the time series of the increments. Additionally, run the program tsnormal on the second derivative, once with the options -t -s 30 and once with the options -t -s 30 -f, to produce time series graphs of the distribution of the second derivative of the increments.
Run the program tshurst on the time series to produce a graph of the Hurst coefficient of the time series.
Run the program tslsq, with the -p option, on the graph of the Hurst coefficient of the time series to produce the least squares fit formula for the Hurst coefficient of the time series.
Run the program tshcalc on the time series to produce a graph of the H parameter of the time series of the increments.
Run the program tslsq, with the -p option, on the graph of the H parameter of the time series of the increments to produce the least squares fit formula for the H parameter of the time series of the increments.
Run the program tsunfairbrownian, with the -f option and the root mean square value of the time series of the increments, on the fractal time series data set to produce a simulation of the fractal time series data set.
Run the program tsfraction on the simulation of the fractal time series data set to produce a time series of the increments of the simulation of the fractal time series data set.
Run the program tsnormal, with the -p option, on the time series of the increments of the simulation to produce the mean and standard deviation of the increments of the simulation.
Run the program tsnormal, with the options -t -s 30, on the time series of the increments of the simulation to produce a time series graph of the bell curve of the distribution of the increments of the simulation.
Run the program tsnormal, with the options -t -s 30 -f, on the time series of the increments of the simulation to produce a time series graph of the distribution of the increments of the simulation.
Run the program tsshannonmax on the fractal time series data set to produce a graph of the maximum Shannon probability for the fractal time series data set.
Run the program tsshannonmax, with the -p option, on the fractal time series data set to produce the value of the maximum Shannon probability for the fractal time series data set.
Run the program tslogreturns, with the -p option, on the fractal time series data set to produce the value of the logarithmic returns of the fractal time series data set.
Run the program tsshannon with the value of the logarithmic returns of the fractal time series data set to produce the value of the Shannon probability for the fractal time series data set.
Run the program tslsq, with the -e -p options, on the fractal time series data set to produce the value of the coefficient of the exponential returns for the fractal time series data set.
Run the program tsshannon with the value of the coefficient of the exponential returns of the fractal time series data set to produce the value of the Shannon probability for the fractal time series data set.
Use the Unix utility egrep(1) with the argument "-e -" on the time series of the increments to select only the records containing a negative sign. Pipe these records to the Unix utility wc(1) to produce a count of the records in the time series of the increments with negative signs.
Use the Unix utility wc(1) on the time series of the increments to produce a count of the records in the time series of the increments.
Use the Unix utility awk(1) to divide the count of the records in the time series of the increments with negative signs by the count of the records in the time series of the increments, and subtract the result from unity, to produce the value of the maximum Shannon probability for the time series of the increments.
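
The sign-counting steps at the end of the procedure can be sketched in a few lines. This is a minimal Python illustration, not the egrep(1), wc(1), and awk(1) pipeline itself. It assumes that the increments are the normalized increments (x[t] - x[t-1]) / x[t-1], which is an assumption about what tsfraction computes, and the function names and sample values are hypothetical.

```python
def increments(series):
    """Normalized increments (x[t] - x[t-1]) / x[t-1].

    Assumption: this is what "time series of the increments" means here;
    the formula is not taken from the tsfraction sources.
    """
    return [(cur - prev) / prev for prev, cur in zip(series, series[1:])]


def shannon_by_counting(incs):
    """One minus the fraction of records that carry a negative sign.

    A zero increment has no negative sign, so it is not counted as negative.
    """
    negatives = sum(1 for v in incs if v < 0)
    return 1.0 - negatives / len(incs)


# Hypothetical sample data, for illustration only.
prices = [100.0, 102.0, 101.0, 103.0, 104.0, 103.5]
incs = increments(prices)
p = shannon_by_counting(incs)
print(p)
```

Counting by sign rather than by magnitude mirrors the text-based approach of the procedure, where only the presence of a negative sign in a record matters.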

In addition, the Unix utility awk(1) is used to parse and reformat data from this procedure into LaTeX macros for direct import into this manuscript. The Markets Appendix is machine generated.
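
As a rough illustration of turning computed values into LaTeX macros, the following Python sketch formats each result as a \newcommand definition. It is written in Python rather than awk(1), and the macro names, the values, and the number of digits are hypothetical. The actual macro naming scheme used for the manuscript is not described here.

```python
def latex_macro(name, value, digits=6):
    """Format one numeric result as a LaTeX \\newcommand definition."""
    # LaTeX control words may contain letters only.
    if not (name.isascii() and name.isalpha()):
        raise ValueError("LaTeX macro names may contain letters only")
    return "\\newcommand{\\%s}{%.*f}" % (name, digits, value)


# Hypothetical macro names and values, for illustration only.
results = {"incrementsMean": 0.000417, "incrementsRms": 0.011263}
lines = [latex_macro(name, value) for name, value in results.items()]
print("\n".join(lines))
```

Writing the definitions to a file that the manuscript reads with \input would keep the typeset numbers in step with the computed ones.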

Verification Methodology

As a cursory verification methodology:

Use the mean and root mean square values of the normalized increments of the time series data, and the Shannon probability as calculated by counting the total number of records in which the market movement was positive, in relation to the total number of records in the data set, to verify the accuracy.
Compare the Shannon probability, as found by the tsshannonmax program, to the value of the Shannon probability as calculated by counting the total number of records in which the market movement was positive, in relation to the total number of records in the data set.
Compare the four methods of calculating the logarithmic returns:
- By calculation based on the mean of the normalized increments.
- By the calculation of the constant in the least squares approximation to the normalized increments.
- By the calculation of the exponential least squares fit to the original time series data set, with the program tslsq.
- By the calculation of the logarithmic returns, with the program tslogreturns.
Using the mean, standard deviation, and the root mean square of the normalized increments, and the Shannon probability as calculated by counting the total number of records in which the industrial market movement was positive, in relation to the total number of records in the data set, verify the accuracy.
Compare the absolute value and the root mean square of the normalized increments, and check how closely the two agree.

Note that the numerical manipulations are relatively simple, and can be implemented with simple awk(1) scripts.
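
Two of the checks listed above can be sketched as follows, in Python rather than awk(1). The sketch compares the Shannon probability found by counting with one computed from the mean and root mean square, and compares the mean absolute value with the root mean square. The relation P = (mean / rms + 1) / 2 is an assumption made for this illustration and is not stated in this outline. The function names and the sample increments are hypothetical.

```python
import math


def summarize(incs):
    """Mean, root mean square, and mean absolute value of the increments."""
    n = len(incs)
    mean = sum(incs) / n
    rms = math.sqrt(sum(v * v for v in incs) / n)
    mean_abs = sum(abs(v) for v in incs) / n
    return mean, rms, mean_abs


def shannon_from_counts(incs):
    """Fraction of records in which the movement was not negative."""
    return sum(1 for v in incs if v >= 0) / len(incs)


def shannon_from_stats(mean, rms):
    """Assumed relation P = (mean / rms + 1) / 2; not stated in this outline."""
    return (mean / rms + 1.0) / 2.0


# Hypothetical increments of equal magnitude: 7 up, 3 down.
incs = [0.01, 0.01, 0.01, -0.01, 0.01, -0.01, 0.01, 0.01, -0.01, 0.01]
mean, rms, mean_abs = summarize(incs)

# The two probabilities should agree closely if the assumed relation holds.
print(shannon_from_counts(incs), shannon_from_stats(mean, rms))

# Difference between the mean absolute value and the root mean square.
print(abs(mean_abs - rms))
```

The sample increments all have the same magnitude, so the two comparisons agree to within floating point error. Real market data would show a difference, and the size of that difference is what the checks are meant to expose.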

A license is hereby granted to reproduce this software source code and to create executable versions from this source code for personal, non-commercial use. The copyright notice included with the software must be maintained in all copies produced.

THIS PROGRAM IS PROVIDED "AS IS". THE AUTHOR PROVIDES NO WARRANTIES WHATSOEVER, EXPRESSED OR IMPLIED, INCLUDING WARRANTIES OF MERCHANTABILITY, TITLE, OR FITNESS FOR ANY PARTICULAR PURPOSE. THE AUTHOR DOES NOT WARRANT THAT USE OF THIS PROGRAM DOES NOT INFRINGE THE INTELLECTUAL PROPERTY RIGHTS OF ANY THIRD PARTY IN ANY COUNTRY.

So there.

Comments and/or bug reports should be addressed to:

john@email.johncon.com

http://www.johncon.com/

http://www.johncon.com/ntropix/

http://www.johncon.com/ndustrix/

http://www.johncon.com/nformatix/

http://www.johncon.com/ndex/

John Conover

January 6, 2006

Last modified: Tue Mar 1 16:07:41 PST 2011 $Id: architecture.html,v 1.0 2011/03/02 00:20:11 conover Exp $