ARCHIE(1L)            MISC. REFERENCE MANUAL PAGES            ARCHIE(1L)

NAME
     archie(tm) - Internet archive server listing service

SYNOPSIS
     archie

DESCRIPTION
     This manual page describes Version 3 of the archie system. This
     Internet information service allows the user to query a catalog
     containing a list of files which are available on hosts connected
     to the Internet. Software located through this service can be
     obtained by means of ftp(1); for hosts with access to
     BITNET/NetNorth/EARN, it can be obtained by electronic mail
     through the Princeton bitftp(1L) service. Send mail to

          bitftp@pucc.princeton.edu

     Other Internet users who are not directly connected may use the
     services of various ftp-by-mail servers, including

          ftpmail@decwrl.dec.com

     Some archie systems track archive sites globally; others track
     only the archive sites in their country, region or continent in
     order to reduce the load on trans-oceanic links.

     There are a number of archie hosts serving different continental
     user communities. The servers command will list the most
     up-to-date information on archie servers worldwide.

          archie.au                    Australia
          archie.edvz.uni-linz.ac.at   Austria
          archie.univie.ac.at          Austria
          archie.uqam.ca               Canada
          archie.cs.mcgill.ca          Canada
          archie.funet.fi              Finland
          archie.univ-rennes1.fr       France
          archie.th-darmstadt.de       Germany
          archie.ac.il                 Israel
          archie.unipi.it              Italy
          archie.wide.ad.jp            Japan
          archie.hana.nm.kr            Korea
          archie.sogang.ac.kr          Korea
          archie.uninett.no            Norway
          archie.rediris.es            Spain
          archie.luth.se               Sweden
          archie.switch.ch             Switzerland
          archie.ncu.edu.tw            Taiwan
          archie.doc.ic.ac.uk          United Kingdom
          archie.hensa.ac.uk           United Kingdom
          archie.unl.edu               USA (NE)
          archie.internic.net          USA (NJ)
          archie.rutgers.edu           USA (NJ)
          archie.ans.net               USA (NY)
          archie.sura.net              USA (MD)

     archie can be accessed interactively, via electronic mail or
     through archie client programs widely available on the Internet.
  Using the Interactive (telnet) Interface
     To use the interactive system, follow this procedure:

     1)   telnet to the archie system closest to you. Do not use ftp
          for this; it will not work.

     2)   Log in as user

               archie

          (no capitals); no password is required. The system should
          print a banner message and status report before presenting
          you with the command prompt. Some newer operating systems
          will prompt for a password. Just hit the return key and
          continue.

     3)   Type

               help

          for complete information on the system.

     For full details, refer to the section entitled ARCHIE COMMANDS
     which appears below.

  Using the Electronic Mail Interface
     To use the email interface, send requests to:

          archie@<archie-server>

     where <archie-server> is one of the hosts listed above, or one
     returned by the servers command. Send the word help in a message
     to obtain a list of available commands and features. This is a
     completely automated interface, acting without human intervention.

     For full details, refer to the section entitled ARCHIE COMMANDS
     which appears below.

  Using the archie clients
     The source code as well as machine executables for a variety of
     archie client programs can be obtained via anonymous ftp(1) from
     many of the archie server hosts listed above. They are usually
     stored in the archie/clients or pub/archie/clients directories.

     These clients communicate via the Prospero distributed file
     system protocol with archie servers, which perform the specified
     queries and return the results to the user. Currently there are
     Unix and VMS command line, curses and X window clients as well as
     Mac and PC Windows versions.

     For more information on Prospero send your queries to

          info-prospero-request@isi.edu

  Communicating with the Database Administrators
     Mail to the archie administrators at a particular archie server
     should be sent to the address

          archie-admin@<archie-server>

     where <archie-server> is one of the hosts listed above.
     To reach the implementors of the archie system, send mail to

          archie-group@bunyip.com

     The archie server system is a product of Bunyip Information
     Systems.

     Requests for additions to the set of hosts surveyed for the
     catalog, modifications to the Software Description Catalog, or
     other administrative matters, should be sent to:

          archie-admin@bunyip.com

ARCHIE COMMANDS
     In version 3 of the archie system the telnet and email clients
     accept a common set of commands. In addition, there are
     specialized commands specific to the particular interfaces. See
     THE INTERACTIVE INTERFACE and THE EMAIL INTERFACE sections below
     for a list of these commands.

     Note that some archie server sites may disable some of the
     commands for reasons particular to their site. In addition, some
     sites limit the number of concurrent interactive (telnet)
     sessions to make better use of limited resources.

  Commands
     Arguments to commands shown in square brackets '[]' are optional;
     all others are mandatory.

     find <pattern>
     prog <pattern>
          This command produces a list of files matching the pattern
          <pattern>. The <pattern> may be interpreted as a simple
          substring, a case-sensitive substring, an exact string or a
          regular expression, depending on the value of the search
          variable. The output normally contains such information as
          the file name that was matched, the directory path leading
          to it, the site containing it and the time at which that
          site was last updated.

          The format of the output can be selected through the
          output_format variable. The results are sorted according to
          the value of the sortby variable, and are limited in number
          by the maxhits variable.

          prog is identical to find. It is included for backward
          compatibility with older versions of the system.

     help [<topic> [<subtopic>] ...]
          Invokes the help system and presents help on the specified
          topic. A list of words is considered to be one topic, not a
          list of individual topics. Thus,

               help set maxhits

          requests help on the subtopic maxhits of topic set, not on
          two separate topics.
          After help is presented the user is placed in the help
          system at the deepest level containing subtopics. For
          example, after typing

               help set maxhits

          and being shown the information for that topic, the user is
          placed at the level set in the help hierarchy.

     list [<pattern>]
          Produce a list of sites whose contents are contained in the
          archie catalog. With no argument all the sites are listed.
          If given, the argument is interpreted as a regular
          expression (see "REGULAR EXPRESSIONS" below) against which
          to match site names: only those names matching are printed.

          The format of the output can be selected through the
          output_format variable.

          Note that the numerical (IP) address associated with a site
          name was valid when the site was last updated in the archie
          catalog but may have changed since. Furthermore, the listed
          IP address is the primary address as listed in the Domain
          Name System (secondary addresses are not stored).

          Example:

               list

          lists all sites in the catalog, while

               list .de$

          lists all German sites.

     mail <address>
          Mail the result of the last command that produced output
          (e.g. find, whatis, list) to <address>. This must be a valid
          email address.

     manpage [ roff | ascii ]
          Display the archie manual page (this file). The optional
          arguments specify the format of the returned document. roff
          specifies UNIX troff(1) format while ascii specifies plain,
          preformatted ASCII output. With no arguments it defaults to
          ascii.

     domains
          Asks the current server for the list of the archie
          pseudo-domains that it supports. See the entry for the
          match_domain variable below. This command takes no
          arguments.

          Example:

               domains

          requests the list of pseudo-domains from the server. The
          result looks (in part) something like this:

               africa          Africa            za
               anzac           OZ & New Zealand  au:nz
               asia            Asia              kr:hk:sg:jp:cn:my:tw:in
               centralamerica  Central America   sv:gt:hn
               easteurope      Eastern Europe    bg:hu:pl:cs:ro:si:hr
               mideast         Middle East       eg:.il:kw:sa
               northamerica    North America     usa:ca:mx
               scandinavia     Scandinavia       no:dk:se:fi:ee:is
               southamerica    South American    ar:bo:br:cl:co:cr:cu:ec:pe
               usa             United States     edu:com:mil:gov:us
               westeurope      Western Europe    westeurope1:westeurope2
               world           The World         world1:world2

          The first column gives the names of the pseudo-domains
          supported by the server. The second gives the "natural
          language" description of the pseudo-domain and the third
          column gives the actual definition of the domain. Thus here
          the "asia" domain is comprised of the Domain Name System
          country codes for Korea ("kr"), Hong Kong ("hk"), Singapore
          ("sg") etc. Pseudo-domains may also be constructed from
          other pseudo-domains: thus one component of the
          "northamerica" domain is itself the "usa" pseudo-domain.

     motd
          Re-display the "message of the day", which is normally
          printed when the user initially logs on to the client (in
          the case of the interactive interface) or at the start of
          the returned message (in the email interface).

     servers
          Display a list of all publicly accessible archie servers
          worldwide. The names of the hosts, their IP addresses and
          geographical locations are listed.

     set <variable> [<value>]
          Set the specified variable.
          Variables are used to control various aspects of the way
          archie operates: the interpretation of arguments, the format
          of output from various commands, etc. See the section below
          on variables for a description of each one as well as the
          entries for unset and show.

     show [<variable> ...]
          Without any argument, display the status of all the
          user-settable variables, including each variable's type
          (boolean, numeric, string), whether or not it is set and its
          current value (if its type requires a value). Otherwise show
          the status of each of the specified arguments.

          Example:

               show maxhits

     site
          This command is currently unimplemented under version 3 of
          the archie system.

     unset <variable>
          Remove any value associated with the specified variable.
          This may cause counter-intuitive behavior in some cases; for
          example, if maxhits is not defined by the user, the find
          command will print the internal default number of matches
          rather than an unlimited number of matches.

     version
          Print the current version of the client.

     whatis <substring>
          Search the Software Description Catalog for the given
          substring, ignoring case. This catalog consists of names and
          short descriptions of many software packages, documents
          (like RFCs and educational material), and data files stored
          on the Internet.

          Example:

               whatis uucp

          in part gives as a result:

               findpath.sh       UUCP Pathfinder
               logfile-stats     UUCP LOGFILE analyzer
               mapstats          UUCP map statistics program

  Variable Types
     The behavior of archie can be modified by certain variables, the
     values of which may be changed using the set command, or removed
     entirely by the unset command. There are three variable types:

     boolean
          (Set or unset)

     numeric
          (Integer within a defined range)

     string
          (String of characters which may or may not be restricted).
          If the value of a string variable should contain leading or
          trailing spaces then it should be quoted.
          Two ways of quoting text are to surround it with a pair of
          double quotes (`"'), or to precede individual characters
          with a backslash (`\'). (A double quote or a backslash may
          itself be quoted by preceding it with a backslash.) The
          resulting value is that of the string with the quotes
          stripped off.

  Numeric Variables
     maxhits
          Allow the find command to generate at most the specified
          number of matches (permissible range: 0-1000; default: 100).

          Example:

               set maxhits 100

          halts prog after 100 matches have been found in total.

     maxhitspm
          Across all the anonymous FTP archives on the Internet (and
          even on one single anonymous FTP archive) many files will
          have the same name. For example, if you search for a very
          common filename like "README" you can get hundreds or even
          thousands of matches. You can limit the number of files with
          the same name through this variable. For example,

               set maxhitspm 100

          tells the system to return at most 100 files with the same
          name. Note that the overall maximum number of files returned
          is still controlled by the maxhits variable.

Sun Release 4.1           Last change: 12 Apr 1994

     maxmatch
          This variable limits the number of filenames returned. For
          example, if maxmatch is set to 2 and you perform a substring
          search for the string "etc", and the catalog contains
          filenames "etca", "betc" and "detc", only the filenames
          "etca" and "betc" will be returned. However, depending on
          the values of maxhitspm and maxhits, you will get back a
          number of actual files with those names.

          Example:

               set maxmatch 20

     max_split_size
          Approximate maximum size, in bytes, of a file to be mailed
          to the user. Any output larger than this will be split into
          pieces of about this size. This can be set by the user in
          the range 1024 to ~2Gb, with a default of 51200 bytes.

  String Variables
     compress
          The kind of data compression the user can specify when
          mailing back output.
          Currently allowed values are none and compress (standard
          UNIX compress(1)), with a default of

     encode
          The type of post-compression encoding the user can specify
          when mailing back output. Currently allowed values are none
          and uuencode, with a default of none. Note that this
          variable is ignored unless compression is enabled (via the
          compress variable).

     language
          Allows the user to specify the language in which the help,
          etc. is presented. Currently the default value is english.

     mailto
          If the mail command is issued with no arguments, mail the
          output of the last command to the address specified by this
          string variable. Initially this variable is unset.

          Example:

               set mailto user@frobozz.com

          Conventional Internet addressing styles are understood.
          BITNET sites should use the convention:

               user@sitename.bitnet

          UUCP addresses can be specified as

               user@sitename.uucp

     match_domain
          This variable allows users to restrict the scope of their
          search based upon the Fully Qualified Domain Names (FQDN) of
          the anonymous FTP sites being searched. The user specifies a
          colon-separated list of domain names, at least one of which
          every returned site must match. Each component in the list
          is taken as the rightmost part of the FQDN.

          For example,

               set match_domain ca:internic.net:harvard.edu

          means that the names of all returned sites must end in "ca"
          (Canada), "internic.net" (sites in the Internet NIC) or
          "harvard.edu" (sites at Harvard University).

          While these are all real domain names, listing all possible
          combinations for, say, the USA would quickly become tedious
          (and if you think that is bad, try listing all the countries
          on the Internet in Europe). To help with this problem, the
          archie system has the concept of pseudo-domains, which give
          users a shorthand notation for this facility. Pseudo-domains
          are defined on a server-by-server basis; you can use the
          domains command to query your current server for its list of
          predefined pseudo-domains.
          A pseudo-domain is a list of real DNS domain names and/or a
          list of other pseudo-domains. For example, the archie
          administrator on the server could define the pseudo-domain
          "usa" to be

               "edu:mil:com:gov:us"

          If this definition existed on the server, then you could

               set match_domain usa

          which would be the same as saying

               set match_domain edu:mil:com:gov:us

          In addition, the server administrator may define
          "northamerica" to be

               "usa:ca:mx"

          meaning that "northamerica" is composed of the pseudo-domain
          "usa" and the real domains "ca" (Canada) and "mx" (Mexico).
          This process can be repeated for 20 levels (more than
          sufficient for any naming scheme).

          By using the domains command you can determine what
          pseudo-domains your current server supports.

     match_path
          Sometimes you would like your search (using the find
          command) to look only for files or directories with a
          certain set of names in their full path.

          For example, many anonymous FTP site administrators will put
          software packages for the Macintosh in a path containing the
          name "mac" or "macintosh". Another example is when a
          document exists in several formats and you are only looking
          for the PostScript version. You can guess that the file name
          may end in ".ps" or that the file may be in a directory
          called "ps" or "PostScript".

          This is usually guesswork, but it is useful to have the
          archie system look only for files or directories with
          particular components in their path name. This variable
          allows you to do this.

          The arguments are a colon-separated list of possible path
          name components. In the last example above, saying

               set match_path ps:postscript

          will restrict the search to match only those files or
          directories which have the strings "ps" or "postscript" in
          their path.

          The comparison is always case-insensitive (regardless of the
          value of the search variable) and there is a logical OR
          connecting the components, so that the above statement says:
          "find only files which have 'ps' OR 'postscript' in their
          path".
          If either component matches then the condition is satisfied.

     output_format
          Affects the way the output of find and list is displayed.
          User settable, with valid values of machine (machine
          readable format), terse and verbose, with a default of
          verbose.

     search
          The type of search done by the find (or prog) command. User
          settable, with valid values of exact, regex, sub, subcase,
          exact_regex, exact_sub and exact_subcase, with a default of
          sub. (The exact_<type> values cause the search to try exact
          first, then fall back to <type> if no matches are found.)
          The values have the following meanings:

          exact
               Exact match (the fastest method). A match occurs if the
               file (or directory) name in the catalog corresponds
               exactly to the user-given string (including case). For
               example, this type of search could be used to locate
               all files called

                    xlock.tar.Z

          regex
               Allow user-specified (search) strings to take the form
               of ed(1) regular expressions.

               Note: unless specifically anchored to the beginning
               (with ^) or end (with $) of a line, ed(1) regular
               expressions (effectively) have ``.*'' prepended and
               appended to them. For example, it is not necessary to
               type

                    find .*xnlock.*

               because

                    find xnlock

               suffices. In this instance, the regex match is
               equivalent to a simple substring match. Those
               unfamiliar with regular expressions should refer to the
               section entitled REGULAR EXPRESSIONS which appears
               below.

          sub
               Substring (case insensitive). A match occurs if the
               file (or directory) name in the catalog contains the
               user-given substring, without regard to case.

               Example: The pattern:

                    is

               matches any of the following:

                    islington
                    this
                    poison

          subcase
               Substring (case sensitive). As above, but taking case
               as significant.

               Example: The pattern:

                    TeX

               will match:

                    LaTeX

               but neither of the following:

                    Latex
                    TExTroff

     server
          The Prospero server to which the client connects when find
          or list commands are invoked. User settable, with a default
          value of localhost.
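As an illustration of the search types described above, the following Python sketch filters a list of names under each mode. It is not the archie implementation: the function names name_matches and find and the in-memory catalog list are hypothetical, and Python's re syntax only approximates ed(1) regular expressions.

```python
import re


def name_matches(name, pattern, mode):
    """Return True if a catalog name matches under one basic search mode."""
    if mode == "exact":
        # Whole name must be identical, including case.
        return name == pattern
    if mode == "sub":
        # Case-insensitive substring.
        return pattern.lower() in name.lower()
    if mode == "subcase":
        # Case-sensitive substring.
        return pattern in name
    if mode == "regex":
        # re.search is unanchored, which mirrors the implicit ".*" on
        # both sides of an unanchored pattern. Assumption: the pattern
        # is written in Python regex syntax, not ed(1) syntax.
        return re.search(pattern, name) is not None
    raise ValueError(f"unknown search mode: {mode}")


def find(catalog, pattern, mode="sub"):
    """Return the names in catalog (a hypothetical list) that match."""
    if mode.startswith("exact_"):
        # exact_<type>: try an exact match first, then fall back to
        # <type> only if the exact search found nothing.
        hits = [n for n in catalog if name_matches(n, pattern, "exact")]
        if hits:
            return hits
        mode = mode[len("exact_"):]
    return [n for n in catalog if name_matches(n, pattern, mode)]


if __name__ == "__main__":
    print(find(["islington", "this", "poison"], "is"))
    print(find(["LaTeX", "Latex", "TExTroff"], "TeX", "subcase"))
    print(find(["xnlock.c", "README"], "xnlock", "exact_sub"))
```

With the sample names from the sub and subcase entries, the first call keeps all three names and the second keeps only LaTeX; the third call finds no exact match and so falls back to a substring search.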
     sortby
          Set the method of sorting to be applied to output from the
          find command. Typing the keyboard interrupt character
          (generally Ctrl-C on UNIX hosts) aborts a search. This will
          also dequeue the request from the server. Unlike previous
          versions of the archie system, version 3 does not allow
          partial results. The output phase may be aborted by typing
          the abort character a second time.

          The five permitted methods (and their associated reverse
          orders) are:

          none
               Unsorted (default; no reverse order, though rnone is
               accepted)

          filename
               Sort files/directories by name, using lexical order
               (reverse order: rfilename)

          hostname
               Sort on the archive hostname, in lexical order (reverse
               order: rhostname)

          size
               Sort by size, largest files/directories first (reverse
               order: rsize)

          time
               Sort by modification time, with the most recent
               file/directory names first (reverse order: rtime)

THE INTERACTIVE (TELNET) INTERFACE
     The interactive interface accepts the following commands and
     variables in addition to those listed above.

  Commands
     stty [[