RECEIVEDIPDBDEDUP(1)                                  RECEIVEDIPDBDEDUP(1)

NAME
       receivedIPdbdedup - dedup the IP addresses in database file(s)
       against a database file

SYNOPSIS
       receivedIPdbdedup [-v] dbfilename [dbfilename(s)]

DESCRIPTION
       The receivedIPdbdedup program dedups the IP addresses in database
       file(s) against a database file. The dedup'ed database is printed
       to stdout.

       The program requires mmap(2) to map the database file into the
       Unix VM system. The database file name is a required command line
       argument.

       The database is a standard Unix text file, one IP address per
       line, in lexical order, constructed with "sort -u infile >
       outfile", or equivalent.

       An IP address range can be represented as a Class A, B, or C
       range. For example, the IP address "123.210." in the database
       file would match "123.210.1.0" in a "Received: " e-mail header
       record.

       The database mechanism is conservative with machine resources,
       requiring about 12.5 microseconds of machine time to look up a
       word in the Unix system dictionary (2.5 MB, a quarter of a
       million words, on a single, lightly loaded 466 MHz Pentium
       running Linux 2.2; measured by using the time(1) command to look
       up every word in the dictionary, and dividing by the number of
       words). Conceptually, the database mechanism is implemented
       similarly to the technique used in the look(1) command, but
       requires exact matches, as opposed to partial key matches.

       The program has implicit IP addresses that do not have to be
       included in the database: those with invalid "dotted quad"
       element values (such as a value greater than 255, for example).

       The input database file name(s) may be supplied as additional
       optional command line arguments, or redirected to the program via
       stdin.
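
       The range matching described above can be illustrated with a
       short sketch. This is not the program's source code; it is a
       minimal Python illustration that assumes a range is stored as a
       dotted prefix with a trailing dot (as in the "123.210." example),
       and that a lookup tries the full address and then each shorter
       prefix as an exact key. The names in_db, candidate_keys, and
       lookup are hypothetical.

```python
from bisect import bisect_left


def in_db(db, key):
    """Exact-match binary search over a sorted list of database lines."""
    i = bisect_left(db, key)
    return i < len(db) and db[i] == key


def candidate_keys(ip):
    """Yield the full address, then its Class C, B, and A prefixes.

    Assumption: ranges are stored with a trailing dot, e.g. "123.210.".
    """
    parts = ip.split(".")
    yield ip
    for n in (3, 2, 1):
        yield ".".join(parts[:n]) + "."


def lookup(db, ip):
    """Return True if the address, or a range covering it, is in db."""
    return any(in_db(db, key) for key in candidate_keys(ip))


# Python's sorted() orders ASCII strings by byte value, which plays the
# role of the "lexical order" produced by "sort -u" in the C locale.
db = sorted({"10.0.0.1", "123.210.", "192.168.7."})

print(lookup(db, "123.210.1.0"))  # True: covered by the range "123.210."
print(lookup(db, "123.211.1.0"))  # False: no address or range matches
```

       Each probe is an exact match, so at most four binary searches
       are needed per address in this sketch.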

       The most common usage is where the input database file name and
       the database file name are the same:

              receivedIPdbdedup example.db < example.db > newexample.db

       Besides removing duplicate records, this also removes records
       that would be matched by a superset of IP addresses. For example,
       if the database file contained the IP address "123.210.1.0" and
       the Class B address "123.210", the "123.210.1.0" address would be
       removed from the output, since the Class B address is a superset
       of the IP address.

       The program contains less than 300 lines of declarations and
       statements, all of which are documented with in-line comments.
       The program has been compiled and tested on SunOS, Solaris, and
       Linux, and may work on other brands of Unix.

       The program returns 0 if no error occurred and a match was found
       in the database file for the IP addresses, and 1 if no error
       occurred and no match was found; otherwise it returns a unique
       error code greater than 1 representing the error encountered,
       and also prints an error diagnostic to stderr.

OPTIONS
       dbfilename
              Database file name.

       dbfilename(s)
              Database file name(s) (defaults to stdin).

       -v     Print the program's version information.

WARNINGS
       Under buffer overflow conditions, the program makes no attempt
       at handling the situation; it just detects it, prints an error
       message, and exits.

       The program is capable of rejecting entire Class A, Class B, or
       Class C IP address ranges. Discretion is advised.

SEE ALSO
       receivedIP(1), receivedIPdb(1), receivedIPdbdedup(1),
       receivedIPdbrm(1), receivedIPdbusort(1), bsearchtext(1),
       receivedAddress(1), receivedTodb(1), receivedMSGIDdb(1),
       receivedUnknowndb(1), tolower(1), toupper(1), bsorttext(1),
       receivedIPforgedb(1), hsearchtext(1), bsearchbody(1)

DIAGNOSTICS
       Error messages for incompatible arguments, failure to allocate
       memory, inaccessible files, opening and closing files, and input
       record buffer overflow.
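
EXAMPLES
       The superset removal described in DESCRIPTION can be sketched as
       follows. This is a hedged Python illustration, not the program's
       implementation: it assumes ranges are written with a trailing
       dot (as in "123.210."), uses an in-memory set instead of a
       binary search over an mmap(2)'ed file, and the names
       covering_prefixes and dedup are hypothetical.

```python
def covering_prefixes(entry):
    """Return the shorter range prefixes that would cover this entry.

    Assumption: a range is a dotted prefix with a trailing dot, so
    "123.210.1.0" is covered by "123.", "123.210.", and "123.210.1.".
    """
    parts = [p for p in entry.split(".") if p]
    return [".".join(parts[:n]) + "." for n in range(1, len(parts))]


def dedup(records, db):
    """Drop duplicate records and records covered by a range in db."""
    db_set = set(db)
    seen = set()
    kept = []
    for rec in records:
        if rec in seen:
            continue  # exact duplicate of an earlier record
        seen.add(rec)
        if any(prefix in db_set for prefix in covering_prefixes(rec)):
            continue  # a broader range in the database already covers it
        kept.append(rec)
    return kept


# Input and database are the same file, as in the most common usage.
records = ["10.0.0.1", "10.0.0.1", "123.210.", "123.210.1.0"]
print(dedup(records, records))  # ['10.0.0.1', '123.210.']
```

       A range never lists itself among its covering prefixes, so the
       entry "123.210." survives while "123.210.1.0" is dropped.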

AUTHORS
       ----------------------------------------------------------------------

       A license is hereby granted to reproduce this software source
       code and to create executable versions from this source code for
       personal, non-commercial use. The copyright notice included with
       the software must be maintained in all copies produced.

       THIS PROGRAM IS PROVIDED "AS IS". THE AUTHOR PROVIDES NO
       WARRANTIES WHATSOEVER, EXPRESSED OR IMPLIED, INCLUDING WARRANTIES
       OF MERCHANTABILITY, TITLE, OR FITNESS FOR ANY PARTICULAR PURPOSE.
       THE AUTHOR DOES NOT WARRANT THAT USE OF THIS PROGRAM DOES NOT
       INFRINGE THE INTELLECTUAL PROPERTY RIGHTS OF ANY THIRD PARTY IN
       ANY COUNTRY.

       Copyright (c) 2001-2007, John Conover, All Rights Reserved.

       Comments and/or bug reports should be addressed to:

              john@email.johncon.com (John Conover)

       ----------------------------------------------------------------------

January 16, 2007                                      RECEIVEDIPDBDEDUP(1)