NdeX_medium.jpg

 

Program for Binary Searching a Constant Flat File Database:

Architecture


Home | Installation | Usage | FAQs | Utilities | Architecture | QA | Tests | Links | License | Author | Download | Thanks



home.jpg
installation.jpg
usage.jpg
FAQs.jpg
utilities.jpg
architecture.jpg
QA.jpg
tests.jpg
links.jpg
license.jpg
author.jpg
download.jpg
thanks.jpg

The ndex program requires mmap(2) to map the database file into the Unix VM system, meaning the database file is not copied into allocated memory. The database file resides in the Unix VM system, and access is quite quick since it uses the hardware LRU/MRU/tag table/cache memory page system for access/search, and there are no practical limits, (other than integer size,) on the size of the file.

The binary search algorithm is implemented as follows:

  1. Recursively binary search a memory area, referenced by the begin and end variables, for string. The variable begin references the beginning of the area to be searched; the variable end references the end of the area to be searched, and will always reference the '\n' terminating character of the last record of the memory area, (which is a requirment); the variable string is the null terminated string to search for, and has a length represented by the variable string_len.

  2. Requirements:

    1. The memory area size must be greater than zero, (this is a requirement for mmap(1), also.)

    2. The last character of the memory area must be a '\n' text file record terminator, e.g., the last record, (which, minimally, may be the only record,) must be terminated.

    3. The records in the memory area must be in lexical sequence, (consistent with strcmp() and strncmp().)

  3. The general strategy is:

    1. For each iteration in the recursion:

      1. Divide the memory area exactly in two.

      2. Work from this middle, in both directions, to find the end of the preceding record, (or beginning of file, whichever comes first) and, the end of the current record:

        1. The current record contains the first instance of the the terminating character at, or past, the middle.

        2. The current record's beginning is the next character after the preceding records's terminating character, or the beginning of the memory area.

      3. Compare this record with the string for a match.

      4. If not a match, decide which direction to move for the next iteration, and call this function, again.

      5. The recursive binary search terminates when a match is found, or when there is no more memory area to search.

    2. Return a reference to the matched area in memory, or null if no match.


A license is hereby granted to reproduce this software source code and to create executable versions from this source code for personal, non-commercial use. The copyright notice included with the software must be maintained in all copies produced.

THIS PROGRAM IS PROVIDED "AS IS". THE AUTHOR PROVIDES NO WARRANTIES WHATSOEVER, EXPRESSED OR IMPLIED, INCLUDING WARRANTIES OF MERCHANTABILITY, TITLE, OR FITNESS FOR ANY PARTICULAR PURPOSE. THE AUTHOR DOES NOT WARRANT THAT USE OF THIS PROGRAM DOES NOT INFRINGE THE INTELLECTUAL PROPERTY RIGHTS OF ANY THIRD PARTY IN ANY COUNTRY.

So there.

Copyright © 1994-2007, John Conover, All Rights Reserved.


Comments and/or bug reports should be addressed to:

john@email.johncon.com

http://www.johncon.com/
http://www.johncon.com/ntropix/
http://www.johncon.com/ndustrix/
http://www.johncon.com/nformatix/
http://www.johncon.com/ndex/
John Conover
john@email.johncon.com
January 6, 2005



Home | Installation | Usage | FAQs | Utilities | Architecture | QA | Tests | Links | License | Author | Download | Thanks


Copyright © 1994-2007 John Conover, john@email.johncon.com. All Rights Reserved.
Last modified: Thu Mar 22 18:07:33 PDT 2007 $Id: architecture.html,v 1.0 2007/03/23 01:11:10 conover Exp $
Valid HTML 4.0!