BSEARCHTEXT(1)                                          BSEARCHTEXT(1)

NAME
       bsearchtext - binary search a text file database for character
       string(s)

SYNOPSIS
       bsearchtext [-A] [-f] [-P] [-r m|n] [-v] dbfilename [string(s)]

DESCRIPTION
       Bsearchtext binary searches a text file database for character
       strings. Any matches are printed to stdout, in order. The
       program requires mmap(2) to map the database file into the Unix
       VM system. The database file name is a required command line
       argument.

       The database is a standard Unix text file, one string per line,
       in lexical order, constructed with "sort -u infile > outfile",
       or equivalent.

       The database mechanism is conservative with machine resources,
       requiring about 12.5 microseconds of machine time to look up a
       word in the Unix system dictionary (2.5 MB, a quarter of a
       million words). The figure was measured on a single, lightly
       loaded 466 MHz Pentium running Linux 2.2, using the time(1)
       command to look up every word in the dictionary and dividing
       by the number of words. Conceptually, the database mechanism is
       implemented similarly to the technique used in the look(1)
       command, but requires exact matches, as opposed to partial key
       matches.

       The strings to be searched for may be supplied as additional
       optional command line arguments, or redirected to the program
       via stdin for compatibility with procmail(1) and other e-mail
       scripting agents. A suitable procmail(1) recipe example might
       be:

           :0 wfh
           * ? something | bsearchtext reject.db
           | formail -A "X-Notice: Word in reject.db database"

       which could be overridden, if necessary, on a case-by-case
       basis, with the example recipe:

           :0 wfh
           * ^X-Notice: +Word +in +reject.db +database
           * ? something | bsearchtext accept.db
           | formail -I "X-Notice: Word in reject.db database"

       or a similar construct, where the databases contain e-mail
       addresses, domain names, etc.

       Similarly, the look(1) program could be used:

           :0
           * ? look -f "Word" "${HOME}/reject.db"
           | formail -I "X-Notice: Word in reject.db database"

       which would provide the same functionality, but with partial
       key matches.

       The program contains fewer than 200 lines of declarations and
       statements, all of which are documented with inline comments.
       The program has been compiled and tested on SunOS, Solaris, and
       Linux, and may work on other brands of Unix.

       The program returns 0 if there was no error and any of the
       specified strings were found in the database file, and 1 if
       there was no error and no strings were found. Otherwise, it
       returns a unique error code greater than 1 representing the
       error encountered, and also prints an error diagnostic to
       stderr.

       The -r option is useful for controlling the return value under
       error conditions. For example, if the database file cannot be
       opened (or read), the program's return can be preempted with a
       return value of match, or no match, depending on environmental
       requirements.

OPTIONS
       dbfilename
              Database file name.

       string(s)
              Character string(s) to be searched for (defaults to
              stdin).

       -A     Return match only if all strings are found (the default
              is match if any string is found).

       -f     Lower case search (the database must be in lower case).

       -P     Print the string(s) not in the database.

       -r m|n On file error, exit with a return of match for m, or no
              match for n.

       -v     Print the program's version information.

WARNINGS
       Under buffer overflow conditions, the program makes no attempt
       at handling the situation; it just detects it, prints an error
       message, and exits.

SEE ALSO
       receivedIP(1), receivedIPdb(1), receivedIPdbdedup(1),
       receivedIPdbrm(1), receivedIPdbusort(1), bsearchtext(1),
       receivedAddress(1), receivedTodb(1), receivedMSGIDdb(1),
       receivedUnknowndb(1), tolower(1), toupper(1), bsorttext(1),
       receivedIPforgedb(1), hsearchtext(1), bsearchbody(1)

DIAGNOSTICS
       Error messages for incompatible arguments, failure to allocate
       memory, inaccessible files, opening and closing files, and
       input record buffer overflow.
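
EXAMPLES
       The lookup technique described above (a memory-mapped, sorted
       text file searched by bisection for exact line matches) can be
       illustrated with a minimal Python sketch. This is not the
       program's source code. The names line_bounds, contains, and
       lookup are hypothetical, and the return value of 2 on a file
       error is an assumption; this page states only that error codes
       are greater than 1. The sketch compares raw bytes, so it
       assumes the database was sorted in the C locale, for example
       with "LC_ALL=C sort -u infile > outfile".

```python
import mmap
import sys


def line_bounds(buf, pos):
    """Return (start, end) of the line that contains byte offset pos."""
    start = buf.rfind(b"\n", 0, pos) + 1   # 0 when there is no earlier newline
    end = buf.find(b"\n", pos)             # offset of this line's newline
    if end == -1:                          # last line has no trailing newline
        end = len(buf)
    return start, end


def contains(buf, key):
    """Bisect a sorted, newline-delimited buffer for an exact line match."""
    lo, hi = 0, len(buf)                   # search window, in byte offsets
    while lo < hi:
        mid = (lo + hi) // 2
        start, end = line_bounds(buf, mid)
        line = buf[start:end]
        if line == key:
            return True
        if line < key:
            lo = end + 1                   # key sorts after this line
        else:
            hi = start                     # key sorts before this line
    return False


def lookup(path, keys):
    """Print each key found in the file; return 0 (match), 1 (none), 2 (error)."""
    try:
        with open(path, "rb") as f:
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as buf:
                found = [k for k in keys if contains(buf, k.encode())]
    except (OSError, ValueError) as exc:   # ValueError: empty file cannot be mapped
        print(exc, file=sys.stderr)
        return 2                           # assumed error code, see above
    for k in found:
        print(k)
    return 0 if found else 1
```

       Calling lookup("reject.db", ["example.com"]) would print
       example.com and return 0 if that exact line is in the file,
       and return 1 otherwise. Each probe touches only the pages
       around the midpoint, which is why a memory map suits this
       kind of search.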

AUTHORS
       ----------------------------------------------------------------------

       A license is hereby granted to reproduce this software source
       code and to create executable versions from this source code
       for personal, non-commercial use. The copyright notice included
       with the software must be maintained in all copies produced.

       THIS PROGRAM IS PROVIDED "AS IS". THE AUTHOR PROVIDES NO
       WARRANTIES WHATSOEVER, EXPRESSED OR IMPLIED, INCLUDING
       WARRANTIES OF MERCHANTABILITY, TITLE, OR FITNESS FOR ANY
       PARTICULAR PURPOSE. THE AUTHOR DOES NOT WARRANT THAT USE OF
       THIS PROGRAM DOES NOT INFRINGE THE INTELLECTUAL PROPERTY RIGHTS
       OF ANY THIRD PARTY IN ANY COUNTRY.

       Copyright (c) 2001-2007, John Conover, All Rights Reserved.

       Comments and/or bug reports should be addressed to:

           john@email.johncon.com (John Conover)

       ----------------------------------------------------------------------

January 16, 2007                                        BSEARCHTEXT(1)