BSEARCHBODY(1)                                            BSEARCHBODY(1)

NAME
       bsearchbody - compare domain names and IP addresses found in the
       body of an e-mail against a database file

SYNOPSIS
       bsearchbody [-c] [-P] [-r m|n] [-v] dbfilename [file(s)]

DESCRIPTION
       Bsearchbody searches e-mail bodies for site domain names and IP
       addresses, and compares them against the names and addresses in a
       database file. The program uses mmap(2) to map the database file
       into the Unix VM system. The names and addresses from the e-mail
       body are compared against the database file using a binary search.

       The database file name is a required command line argument.

       The database is a standard Unix text file, one IP address or site
       domain name per line, in lexical order, constructed with
       "sort -u infile > outfile", or equivalent.

       An IP address range can be represented as a Class A, B, or C
       range. For example, the IP address "123.210." in the database file
       would match "123.210.1.0" in a "Received: " e-mail header record.

       The database mechanism is conservative with machine resources,
       requiring about 12.5 microseconds of machine time to look up a
       word in the Unix system dictionary. (Measured on a 2.5 MB
       dictionary of a quarter of a million words, on a single, lightly
       loaded 466 MHz Pentium running Linux 2.2, by using time(1) to time
       a lookup of every word in the dictionary and dividing by the
       number of words.) Conceptually, the database mechanism is
       implemented similarly to the technique used in the look(1)
       command, but requires exact matches, as opposed to partial key
       matches.

       The input e-mail file name(s) may be supplied as additional
       optional command line arguments, or redirected to the program via
       stdin for compatibility with procmail(1) and other e-mail
       scripting agents.

       A suitable procmail(1) recipe example might be:

           :0 wfh
           * ? bsearchbody blacklist.db
           | formail -A "X-Notice: Message in blacklist.db database"

       A suitable way of making a database is by something like:

           bsearchbody -c -P blacklist.db spam.1 spam.2 ... > temp

       where the spam.(n) files are spam messages. The temp file will
       have to be hand edited for appropriate spam addresses, have the
       current blacklist.db database added to it, and then be sorted in
       lexical order as per sort(1) to make a new blacklist.db standard
       Unix text file database.

       The program contains less than 1000 lines of declarations and
       statements, all of which are documented with inline comments. The
       program has been compiled and tested on SunOS, Solaris, and Linux,
       and may work on other brands of Unix.

       The program returns 0 if no error occurred and a match was found
       in the database file for the site domain names or IP addresses,
       and 1 if no error occurred and no match was found; otherwise it
       returns a unique error code greater than 1 representing the error
       encountered, and prints an error diagnostic to stderr. The -r
       option is useful for controlling the return value under error
       conditions. For example, if the database file cannot be opened
       (or read), the error return can be preempted with a return value
       of match, or no match, depending on environmental requirements.

OPTIONS
       dbfilename
              Database file name.

       -c     Print all addresses, not just the first.

       -P     Print the addresses not in the database.

       -r m|n On file error, exit return = match for m, no match for n.

       -v     Print the program's version information.

WARNINGS
       Under buffer overflow conditions, the program makes no attempt to
       handle the situation: it detects it, prints an error message, and
       exits.
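
EXAMPLES
       The lookup described above (an exact-match binary search over a
       sorted database, with Class A, B, or C prefixes standing in for
       address ranges) can be sketched in Python. This is an
       illustration only, not the program's source: the function names
       in_database, candidate_keys, and matches are hypothetical, an
       in-memory sorted list stands in for the mmap(2)'d file, and the
       scheme of probing the full address and then each shorter dotted
       prefix is an assumption inferred from the "123.210." example.

```python
from bisect import bisect_left


def in_database(db, key):
    """Exact-match binary search over a sorted list of strings."""
    i = bisect_left(db, key)
    return i < len(db) and db[i] == key


def candidate_keys(address):
    """Yield the full dotted quad, then its Class C, B, and A prefixes.

    Assumed scheme: a prefix is stored with a trailing dot, e.g. "123.210.".
    """
    parts = address.split(".")
    yield address
    for n in (3, 2, 1):
        yield ".".join(parts[:n]) + "."


def matches(db, token):
    """Return True if a domain name or IPv4 address is in the database."""
    parts = token.split(".")
    # Treat the token as an IPv4 address only if it has four numeric fields.
    if len(parts) == 4 and all(p.isdigit() for p in parts):
        return any(in_database(db, key) for key in candidate_keys(token))
    # Domain names require an exact match, as the description states.
    return in_database(db, token)


# Hypothetical database contents; sorted() orders ASCII strings the same
# way "sort -u" does in the C locale.
db = sorted({"123.210.", "192.0.2.7", "spam.example.com"})
```

       With this sample database, matches(db, "123.210.1.0") succeeds
       through the "123.210." entry, while matches(db, "example.com")
       fails because only exact domain matches count.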

SEE ALSO
       receivedIP(1), receivedIPdb(1), receivedIPdbdedup(1),
       receivedIPdbrm(1), receivedIPdbusort(1), bsearchtext(1),
       receivedAddress(1), receivedTodb(1), receivedMSGIDdb(1),
       receivedUnknowndb(1), tolower(1), toupper(1), bsorttext(1),
       receivedIPforgedb(1), hsearchtext(1), bsearchbody(1)

DIAGNOSTICS
       Error messages for incompatible arguments, failure to allocate
       memory, inaccessible files, opening and closing files, and input
       record buffer overflow.

AUTHORS
       ----------------------------------------------------------------------

       A license is hereby granted to reproduce this software source code
       and to create executable versions from this source code for
       personal, non-commercial use. The copyright notice included with
       the software must be maintained in all copies produced.

       THIS PROGRAM IS PROVIDED "AS IS". THE AUTHOR PROVIDES NO
       WARRANTIES WHATSOEVER, EXPRESSED OR IMPLIED, INCLUDING WARRANTIES
       OF MERCHANTABILITY, TITLE, OR FITNESS FOR ANY PARTICULAR PURPOSE.
       THE AUTHOR DOES NOT WARRANT THAT USE OF THIS PROGRAM DOES NOT
       INFRINGE THE INTELLECTUAL PROPERTY RIGHTS OF ANY THIRD PARTY IN
       ANY COUNTRY.

       Copyright (c) 2001-2007, John Conover, All Rights Reserved.

       Comments and/or bug reports should be addressed to:

           john@email.johncon.com (John Conover)

       ----------------------------------------------------------------------

January 16, 2007                                          BSEARCHBODY(1)