add release dir

Mario Fetka
2024-05-28 17:59:32 +02:00
parent 1a700daf11
commit 2d5eb9fe1c
142 changed files with 34106 additions and 0 deletions

@@ -0,0 +1,85 @@
.\" Copyright (c) 1992,1994,1996 Bunyip Information Systems Inc.
.\" All rights reserved.
.\"
.\" Archie 3.5
.\" August 1996
.\"
.\" @(#)anonftp_parser_output.5
.\"
.TH ANONFTP_PARSER_OUTPUT 5 "August 1996"
.SH NAME
.B anonftp_parser_output
\- description of the output of the Archie anonymous FTP listing parsers
for the anonftp catalog
.SH DESCRIPTION
.PP
Currently all anonymous FTP listings in the Archie system are parsed from
the operating-system-specific format into a common Archie-defined parser
output format. This allows the system to have standard database
structures and updating routines without having to worry about the
individual characteristics of each operating system. This format is
described below. One of the underlying assumptions is that the file
system of the listing being parsed has a tree-like structure with the
concepts of internal nodes (directories) and leaves (files).
The parser output corresponding to a particular listing consists of a
sequence of variable length records, one record per file in the listing.
Each record is composed of a fixed "core" component followed by a variable
length string.
The core consists of the following fields:
<file_size><date_time><parent><child><perms><flags>
where
.TP
.B <file_size>
An unsigned 32 bit quantity containing the size of the file.
.TP
.B <date_time>
An unsigned 32 bit quantity. Its value is interpreted as seconds UTC
(GMT) since Jan 1, 1970. The actual quantity that it represents varies
between operating systems, but it is typically the creation or last
modification time for the file in question.
.TP
.B
<parent>
Unsigned 32 bit quantity. It is the record number of the parent directory
in the current file. A value of zero signifies that the parent directory
is the root.
.TP
.B
<child>
Unsigned 32 bit. The record number of the first child of this directory.
It is zero if the current record describes a directory which has no
children. It is undefined for regular files.
.TP
.B <perms>
Unsigned 16 bit. Contains a bit vector for the permission field values
in an operating-system dependent fashion.
.TP
.B <flags>
Unsigned 16 bit. A bit vector of particular attributes of the entry.
Currently if the entry is a directory, bit 1 is set. If it is a link, bit
2 is set.
.LP
The rest of the record consists of an unsigned 16 bit quantity containing
the length of the string (in bytes) in the record. This value is always
rounded up to the next multiple of 4. The final element is a variable
length byte sequence containing the string itself. This is always padded
up to the next 4-byte boundary.
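As an illustration of the layout described above, the following Python sketch packs and unpacks a single record. It is not taken from the Archie sources. The big-endian byte order, the NUL padding byte, the choice to store the padded length in the length field, and the names pack_record and unpack_record are all assumptions made for the example.

```python
import struct

# Assumed byte order: big-endian. The manual page does not state one.
CORE = struct.Struct(">IIIIHH")  # file_size, date_time, parent, child, perms, flags
LENGTH = struct.Struct(">H")     # length of the padded string, in bytes


def pack_record(file_size, date_time, parent, child, perms, flags, name):
    """Build one record: fixed core, 16 bit length, padded string."""
    raw = name.encode("ascii")
    padded_len = (len(raw) + 3) // 4 * 4   # round up to a multiple of 4
    raw = raw.ljust(padded_len, b"\0")     # NUL padding is an assumption
    core = CORE.pack(file_size, date_time, parent, child, perms, flags)
    return core + LENGTH.pack(padded_len) + raw


def unpack_record(buf, offset=0):
    """Return (core_fields, name, next_offset) for the record at offset."""
    fields = CORE.unpack_from(buf, offset)
    offset += CORE.size
    (padded_len,) = LENGTH.unpack_from(buf, offset)
    offset += LENGTH.size
    name = buf[offset:offset + padded_len].rstrip(b"\0").decode("ascii")
    return fields, name, offset + padded_len
```

Because unpack_record returns the offset of the following record, a caller can walk a whole file by feeding that offset back in until the buffer is exhausted.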
.SH "SEE ALSO"
.BR parse_anonftp_* (n),
.BR net_anonftp (n),
Archie System Manual
.SH AUTHOR
Bunyip Information Systems.
.br
Montr\o"\'e"al, Qu\o"\'e"bec, Canada
.sp
Archie is a registered trademark of Bunyip Information Systems Inc., Canada,
1990.

File diff suppressed because it is too large

@@ -0,0 +1,454 @@
.\" Copyright (c) 1996 Bunyip Information Systems Inc.
.\" All rights reserved.
.\"
.\" Archie 3.5
.\" April 1996
.\"
.\" @(#)archie_clients.n
.\"
.TH ARCHIE_CLIENTS N "August 1996"
.SH NAME
.B telnet-client, email-client
\- Archie client interfaces
.SH SYNOPSIS
.B telnet-client
[
.BI -d \ <debug\ level>
] [
.B -e
] [
.B -i
] [
.B -l
] [
.BI -L \ <logfile>
] [
.BI -o \ <output\ file>
] [
.B -s
]
.sp
.B email-client
[
.BI -i \ <input\ file>
] [
.BI -M \ <dir>
] [
.BI -h \ <dir>
] [
.B -l
] [
.BI -L \ <logfile>
] [
.BI -T \ <telnet\ client>
] [
.BI -t \ <tmp\ dir>
] [
.BI -u
] [
.BI -v
]
.SH DESCRIPTION
.LP
This manual page describes the Archie telnet and email interface clients for
Archie system administrators. The instructions for general use can be found in
.BR archie (n).
In Archie versions 3.X all clients use the Prospero server to process incoming
requests. This lowers the demand that each telnet session and email message
places on the host and allows more users to access the system at one time.
Currently the email and telnet clients only support access to the
\fBanonftp\fP database. It is likely that in future releases they will be
modified to allow access to other databases.
.SH "EMAIL CLIENT"
The email client interface is a wrapper around the telnet client, which
performs all of the query processing. The email client determines the return
address from the header of the incoming mail as well as extracting the
`Subject:' line, which, in the Archie system, is treated as part of the
message body. This information is passed to the telnet client, which arranges
for the queries to be performed and the resulting message returned to the
user.
.SS "Email Client Options"
The following options are accepted by the email client.
.TP
.B \-l
\fIDo not\fP write any user output to the default email log file
.BR ~archie/logs/email.log .
Note that this is the \fIreverse\fP of most other programs in the Archie
system. Errors will be written to
.I stderr
if this option is used.
.TP
.BI \-M \ <dir>
The master Archie database directory path. If not specified, the program
looks in the directory
.BR ~archie/db ,
then
.BR ./db .
.TP
.BI \-h \ <dir>
The Archie host database directory path. If not specified, the program first
tries
.BR ~archie/db/host_db ,
then
.BR ./host_db .
.TP
.BI \-L \ <logfile>
The name of the file to be used for logging information, rather than the
default,
.BR ~archie/logs/email.log .
Note that debugging information is also written to the log file. This option is
passed to the telnet client.
.TP
.BI -i \ <input\ file>
Read the incoming mail from
.I <input\ file>
rather than the default,
.IR stdin .
This option is provided so that the incoming mail may be queued in a temporary
directory, then processed periodically.
.TP
.BI -T \ <telnet\ client>
Use the
.I absolute path
specified by
.I <telnet\ client>
as the program to execute to process the incoming mail. If omitted, the
program invokes the program
.BR ~archie/bin/telnet-client .
.TP
.BI -t \ <tmp\ dir>
Use the directory
.I <tmp\ dir>
for temporary files. This must be the absolute path of the directory. By
default, the directory
.B ~archie/db/tmp
is used.
.TP
.BI -u
Log the incoming mail to the current log file. The log entry contains the
default return address and a timestamp. Note that the log will not reflect
return addresses modified in the message body by the \fBpath\fP command or
\fBmailto\fP variable. An entry is written to the log file on completion of the
request.
.TP
.BI -v
Verbose mode. The email client will log each phase of the request processing,
as well as echoing each line of the input message as it is written to the
telnet client process. This mode should only be used when trying to determine
problems, as the verbosity will cause the log file to become very large, very
quickly.
.PP
Once the appropriate information has been extracted from the incoming mail,
the email client invokes the telnet client to service the request, then exits
when the telnet client has completed. Note that the email client performs
only minor processing and that all queries are ultimately submitted to the
Prospero server, thus it uses very few system resources.
.SS Variables
Variables in the telnet client have a number of attributes affecting their
use. These are:
.TP
.B name
This is the unique character string by which the user refers to a variable.
.TP
.B type
One of boolean, numeric or string. A boolean variable is either set or unset,
corresponding to true and false. Except in certain cases, both numeric and
string variables may also be unset, in which case they lose their current
value.
.TP
.B status
Set or unset.
.TP
.B value
As above, boolean variables have no value, merely a status. When set, numeric
and string variables will have \fIsome\fP value, when unset they lose their
current value. Some numeric or string variables have default values when the
telnet client is invoked.
Some variables may not be set by the user, and may be set only in the telnet
client initialization file.
.TP
.B visibility
Some variables may be set only through the initialization file. Such
variables are typically used for internal state information or for local
configuration, and may be neither set nor displayed by the user.
.TP
.B range
Due to their special meaning, some numeric and string variables are restricted
to a limited range of values. (Currently, these ranges are fixed in the
software.) An example is the \fBsearch\fP variable, which, due to fixed
search types, \fImust\fP take one of a preselected range of values.
If the value of a string variable is to contain leading or trailing spaces
then it must be quoted. Text may be quoted by surrounding it with a pair of
double quotes (`"'), or by preceding individual characters with a
backslash (`\e'). (A double quote, or a backslash may itself be quoted by
preceding it by a backslash.) The resulting value is that of the string with
the quotes stripped off. For example
.\" WARNING: the indented lines contain _real_ tabs. Keep 'em.
.sp
set prompt "zork-archie> "
.sp
would cause the value of \fBprompt\fP to retain the trailing space.
.sp
set prompt "a "prompt\\ >
.sp
embeds two spaces in the prompt, while
.sp
set prompt slash-quote\\\\\\"
.sp
will put a slash and a double quote at the end of the prompt.
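The quoting rules above can be sketched in Python as follows. This is only an illustration of the stated rules, not the telnet client's actual parser, and the function name unquote is hypothetical. The sketch handles quote removal only and makes no attempt to model how unquoted leading or trailing spaces are treated.

```python
def unquote(text):
    """Strip quoting from a variable value, keeping the quoted characters."""
    out = []
    chars = iter(text)
    for ch in chars:
        if ch == "\\":
            # A backslash quotes the next character, whatever it is.
            out.append(next(chars, ""))
        elif ch == '"':
            # Double quotes delimit quoted text and are themselves dropped.
            continue
        else:
            out.append(ch)
    return "".join(out)
```

Applied to the three values shown above as the user would type them, the sketch keeps the trailing space in the first, yields two embedded spaces in the second, and ends the third with a backslash followed by a double quote.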
.SH "TELNET CLIENT"
This client has four modes which determine how it behaves for certain
operations. Some variables may only be set, and are only visible when the
telnet client is running in a particular mode. Similarly, some commands may
only be executed in certain modes. The current modes are \fIsystem batch\fP,
for initialization files set up by the administrator, \fIuser batch\fP, for
initialization files set up by an ordinary user, \fIinteractive\fP, for the
normal command line mode and \fIemail\fP, when the telnet client is run from
the email client. These are explained below.
.SS "Telnet Client Options"
.TP
.B \-l
Write any user output to the default log file
.BR ~archie/logs/archie.log .
If desired, this can be overridden with the
.B -L
option. Errors will, by default, be written to
.IR stderr .
.TP
.BI \-L " <logfile>"
The name of the file to be used for logging information. Note that debugging
information is also written to the log file.
.TP
.B -e
Set the telnet client to run in email mode. See
.SM "Email Mode"
below.
.TP
.B -i
Set the program to run in interactive mode. See
.SM "Interactive Mode"
below.
.TP
.BI -o " <output file>"
This option is no longer supported and is allowed for backward compatibility
only.
.TP
.B -s
Run in system batch mode. See
.SM "System Batch Mode"
below.
.PP
The telnet client need not reside on the same machine as the Prospero server,
or the rest of the Archie system. For example, a university may decide to
install the telnet client on several machines around the campus.
Users have the option of executing the telnet client on these hosts to have
their queries performed. This technique allows the Archie server machine to be
free of any Archie sessions itself, although it is still responsible for
processing the queries in the normal fashion. In this way the telnet client
is effectively the same as other client programs like
.BR xarchie ,
or the Archie command line client. If this technique is used then the
\fBserver\fP variable must be set to the name of the Archie server host.
.SS "System Batch Mode"
Unless the
.B \-i
option is supplied, the telnet client always starts in this mode. The purpose
is to allow system specific commands to be executed, and variables to be set,
which cannot then be changed once in interactive or email mode. The client
first reads the default configuration file
.BR ~archie/.archierc .
If the client is invoked as any user other than the
pre-defined
.I archie
user, it will then switch to
.I user batch
mode and attempt to read the file
.BI ~ <user> /.archierc
for further configuration information (where
.I <user>
is the name of the user executing the program).
The following variables may only be set in the system configuration file.
Variables which are interactive in nature (such as
.B term
and
.BR status )
should
.I not
be set in the configuration file.
.TP
.B niceness
A number passed to the
.BR nice (3)
function before the client enters the command loop. This has the effect of
reducing the priority of the telnet client process. This may be useful when
the host system is under a high load or has other, more time\-critical,
processes running.
.TP
.B email_help_file
The name of the help file, returned to the sender, when the incoming mail
message contains the command
.BR help ,
or an empty message is received. By default this is
.BR ~archie/etc/email.help .
.TP
.B help_dir
The name of the directory containing the help tree. It has a default value of
.BI ~archie/help/ <language>\fR,\fP
where
.I <language>
is the value of the
.B language
variable. This value is modified when the
.B language
variable is modified. The internal system default is
.BR ~archie/help/english .
See
.SM "The Help System"
below.
.TP
.B mail_from
This is the `From' address put in mail sent to the user. The default is
`archie\(emerrors'.
.TP
.B mail_host
The machine to which to connect in order to send mail. This machine contains
the programs to actually compress, encode, split, etc. the mail. The default
is `localhost'.
.TP
.B mail_service
The service, as listed in
.BR /etc/services ,
to which to connect, in order to reach the actual mail sending program. A
numeric value is interpreted as a port number. The default is
`archiemail'.
.TP
.B man_ascii_file
The path to the plain ASCII version of the Archie manual page, which is
accessed through the
.B manpage
command. It has a default value of
.BR ~archie/etc/manpage.ascii .
.TP
.B man_roff_file
As above, but the troff (nroff) version. The default value is
.BR ~archie/etc/manpage.roff .
.TP
.B pager_help_opts
Options to be passed to the pager, when invoked within the help system. The
default is
.BR \-c .
See the
.BR less (1)
manual page for further information.
.TP
.B pager_opts
Similar to the previous variable, but it applies when the pager is used
outside of the help system. The default is
.BR \-c .
.TP
.B prompt
The prompt displayed in the main command loop. The default is `archie> '.
.TP
.B servers_file
Relative path to the file containing the list of current Archie servers. The
contents of the file are printed when the
.B servers
command is invoked. The default is
.BR ~archie/etc/serverlist .
.PP
The following commands may only be used in system batch mode.
.TP
.BI disable
Disables the use of the command name supplied as an argument. For example, the
line
.sp
.\" WARNING: the indented line contains _real_ tabs. Keep 'em
\fCdisable mail\fP
.sp
if placed in the system configuration file, will disable the use of the
.B mail
command by users.
.SS "User Batch Mode"
Currently all variables available in interactive mode are also available in
user batch mode.
.SS "Email Mode"
None of the interactive commands (such as
.BR status )
or variables (such as
.BR pager )
are available in email mode.
.SS "Interactive Mode"
The variables documented in
.BR archie (n)
may all be set by the user. In addition, Archie administrators should be aware
of the following variables.
.TP
.B language
This variable allows the user to specify the language in which help, etc. is
presented. Currently the default value is `english'. The directory
.BI help/ <language>
must exist for this variable to be changed to
.IR <language> .
To add a new language to the help facility, the directory
.BI ~archie/help/ <language>
must be created. Then, a translated version of an existing help hierarchy
must be placed under that directory.
.SS "The Help System"
The Archie help system is hierarchical, in that there is an initial set of
topics, some of which may have further subtopics. Within the help system the
user is presented with a different prompt and has a limited set of commands
for moving around the hierarchy. Different languages may be supported in the
help system by creating an alternate hierarchy for each language.
For example, to set up a hierarchy for French, you would create the directory
.BR ~archie/help/francais ,
under which you would create a separate directory for each topic for which you
wanted to provide help. In each such directory, a file called
.B =
must contain the actual text to be displayed. For example, the
.BR ~archie/help/english
directory contains the directory
.BR set .
If the user types `help set' the text in the file
.B ~archie/help/english/set/=
is displayed.
Subcommands (or subtopics) are supported by creating subdirectories of the
main topic directories similarly to that described above. For example, the
text for the subtopic `search', under the topic `set', would be in
the file
.BR ~archie/help/english/set/search/= .
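The mapping from a help request to a file can be sketched as below. The function name help_file and its parameters are hypothetical, and the sketch only builds the path string. It does not expand the leading tilde or check that the file exists.

```python
import posixpath


def help_file(help_root, language, topics):
    """Return the path of the '=' file holding the text for a help topic.

    topics is the list of words following 'help', e.g. ["set", "search"].
    """
    return posixpath.join(help_root, language, *topics, "=")
```

For instance, help_file("~archie/help", "english", ["set", "search"]) gives the path of the file named in the paragraph above.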
Please note that while the help system can support multiple languages, the
list of commands and the error messages are currently returned in English.
Future versions of the client will also allow these to be converted in an
administrator-defined fashion, thereby allowing the client to operate
completely in the language of choice.
Full details of the operation of the Archie clients can be found in the user
manual page in
.BR ~archie/etc/manpage.roff .
.SH "SEE ALSO"
.BR archie (n).
.SH AUTHOR
Bunyip Information Systems.
.br
Montr\o"\'e"al, Qu\o"\'e"bec, Canada
.sp
Archie is a registered trademark of Bunyip Information Systems Inc., Canada,
1990.

@@ -0,0 +1,184 @@
.\" Copyright (c) 1992,1994,1996 Bunyip Information Systems Inc.
.\" All rights reserved.
.\"
.\" Archie 3.5
.\" August 1996
.\"
.\" @(#)archie_headers.5
.\"
.TH ARCHIE_HEADERS 5 "August 1996"
.SH NAME
.B Archie_headers
\- description of header format for the Archie 3.X system
.SH DESCRIPTION
.PP
From start to finish, every data file in the Archie Update Cycle begins
with an Archie "Header Record". This contains all the information
necessary for the various components to process the data obtained from
the Data Host. Much of the information transmitted in the header record
is ultimately stored in the Host databases at completion of the cycle and
the record is modified along the cycle to reflect the changing status of
the data.
The header is in ASCII format and is human readable regardless of the
format of the other data which may or may not follow it. In some cases,
the header itself contains the data necessary to complete the cycle.
All headers are delimited by a `@header_begin' string and terminated with
a `@header_end' which must start in the first column, that is, they must
be immediately preceded by a NEWLINE character. The data itself starts
immediately after the final NEWLINE of the termination string.
The following fields are used by the Archie system:
.TP
.B primary_hostname
The primary hostname of the site to which the data belongs. These names
are used internally by the Archie system.
.TP
.B preferred_hostname
The name under which users see this site listed. It will be a valid
canonical name (CNAME) for that site.
.TP
.B generated_by
The component of the Archie system which has
generated this header. Valid values are:
.RS
.RS
.TP
.B parser
Output from the parse phase
.TP
.B retrieve
Output from the data acquisition phase
.TP
.B server
Generated by the data retrieval phase
.TP
.B admin
Generated by an external administrative procedure
.TP
.B control
Generated by the controlling routines (usually after an error)
.RE
.RE
.TP
.B source_Archie_hostname
The name of the Archie host responsible for monitoring information at
this Data Host.
.TP
.B primary_ipaddr
The primary IP address of the Data Host used internally by the Archie
system.
.TP
.B access_methods
The name of the Archie database to which this data belongs. For example, "anonftp"
(for anonymous ftp listings) or "whois" (for a white pages service).
.TP
.B access_command
The database-specific sequence of parameters used during the Data
Acquisition phase to acquire the raw data from the Data
Host.
.TP
.B os_type
The operating system of the Data Host.
.TP
.B timezone
The timezone of the Data Host in signed seconds from GMT.
.TP
.B retrieve_time
The time of data acquisition from the data host. This is written as
YYYYMMDDHHMMSS (year, month, day, hour, minute, second) and is always in
UTC (GMT).
.TP
.B parse_time
The time the data was parsed. Written in the same format as the
retrieve_time field.
.TP
.B update_time
The time the data was updated. Written in the same format as the
retrieve_time field.
.TP
.B no_recs
The number of "records" in this data. For example, the value for a file
listing would be the number of files in the listing. This field may not
be appropriate for some databases and would not be used.
.TP
.B current_status
Lists the current status of the data host. This can be:
.RS
.RS
.TP
.B active
available to be queried and updated
.TP
.B inactive
temporarily disabled from the system
.TP
.B del_by_Archie
scheduled to be deleted. Usually means that the data in the system is
out of date
.TP
.B del_by_admin
scheduled to be deleted by the local Archie administrator
.TP
.B disabled
inactivated by the local Archie administrator
.TP
.B not_supported
Database type is not supported at this site
.RE
.RE
.TP
.B update_status
One of "fail" or "succeed". Used internally by the system to determine
result of the previous phase of the update.
.TP
.B prospero_host
One of "yes" or "no" depending on whether the Prospero system is in operation
at that site.
.TP
.B data_name
In the case that the data acquisition phase of the update cycle generates
more than one data file, this field will contain a unique string
identifying this particular data. For example, if wildcards were used
during data acquisition for a set of files, then data_name will be set to
the name of the particular file that is the source of the data.
.SH EXAMPLE
The following is an example of a header record:
.RS
.RS
.nf
\fC
@header_begin
generated_by server
source_Archie_hostname java.cc.mcgill.ca
primary_hostname acfcluster.nyu.edu
access_method anonftp
access_command :anonymous:
os_type vms_std
retrieve_time 19930404172308
no_recs 0
current_status active
update_status succeed
format raw
prospero_host no
data_name /pub/gnu/gcc.tar.Z
@header_end
[...data begins...]
\fP
.fi
.RE
.RE
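A header record such as the one above can be read with a few lines of Python. The sketch below is illustrative only: the function names parse_header and parse_time are hypothetical, it assumes each field occupies one line with the name separated from the value by whitespace, and it treats the input as text even though the data following a real header may be binary.

```python
from datetime import datetime, timezone


def parse_header(text):
    """Return a dict of the fields between @header_begin and @header_end."""
    lines = text.splitlines()
    start = lines.index("@header_begin")      # must begin in column one
    end = lines.index("@header_end", start)
    fields = {}
    for line in lines[start + 1:end]:
        parts = line.split(None, 1)           # field name, then the value
        if parts:
            fields[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return fields


def parse_time(value):
    """Convert a YYYYMMDDHHMMSS field, always in UTC, to a datetime."""
    return datetime.strptime(value, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
```

All values are kept as strings. Only the time fields are converted, since their format is the one the page defines precisely.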
.SH "SEE ALSO"
Archie System Manual
.SH AUTHOR
Bunyip Information Systems.
.br
Montr\o"\'e"al, Qu\o"\'e"bec, Canada
.sp
Archie is a registered trademark of Bunyip Information Systems Inc., Canada,
1990.

@@ -0,0 +1,197 @@
.\" Copyright (c) 1992,1994,1996 Bunyip Information Systems Inc.
.\" All rights reserved.
.\"
.\" Archie 3.5
.\" August 1996
.\"
.\" @(#)archie_protocol.5
.\"
.TH ARCHIE_PROTOCOL 5 "August 1996"
.SH NAME
.B Archie_protocol
\- description of the internal client/server Archie protocol
.SH DESCRIPTION
.PP
This protocol describes the \fBinternal\fP Archie protocol, not that used
between the Prospero servers and clients for querying the database.
This protocol is used for distributing data between Archie systems
worldwide.
Protocol commands themselves are in ASCII printable characters and <cr>,
<lf> (ASCII 13 and 10 respectively) although data such as that from the
Archie files database are transmitted as a binary stream in Sun XDR
format. All commands must start at the beginning of the line and are
terminated by <cr><lf>. Intervening whitespace may be spaces or tabs or
both. The data stream may be compressed depending on the configuration of
the servers at either end of the connection.
The basic security mechanism enforced by the server is such that it only
allows those clients attempting connection from "approved" hosts to
establish a session. This is described in
.BR arserver (n).
It is assumed in the following description that such a session (the
\fIcontrol connection\fP) has been
established.
The following commands are used by the client and server to communicate:
.PP
.B LISTSITES
Sent by the client to the server to request all sites matching the
criteria set out by the parameters to the command. These parameters in
order are:
.RS
<db> '<'|'>' <from date> <domainlist>
.RE
All sites which have been updated in the databases named <db> more/less
recently than <from date> are to be listed. <db> is composed of a colon
separated list of database names. The character '<' means less recently,
'>' means more recently. <from date> is a date string in the format
YYYYMMDDHHMMSS in UTC. All zeros in this field is taken to mean that all
sites in <db> are to be listed. Sites listed must also match the
<domainlist> field. This can either be '*' for any domain or a colon
separated list of real domains or pseudo domains. Pseudo-domains are
user created and reside after installation in the file
\fB~Archie/host_db/domain-db\fP. See
.BR ardomains (n)
for further explanation.
The first line of response from the server is:
.RS
TUPLELIST <number>
.RE
If <number> is non-zero the rest of the response is composed of a set of
tuples, <site tuple>, one for each site/database pair in the files
matching the given criteria, one per line terminated by <cr><lf>. If
<number> is zero, then no further lines are transmitted. Each tuple
uniquely identifies that entry across the entire Archie system. The
tuples consist of the following fields in the given order:
<source>:<date>:<primary name>:<pref name>:<ip addr>:<db>
Where:
.RS
.TP
.B <source>
is the Archie host responsible for monitoring that site
.TP
.B <date>
is the date on which the listing was retrieved from the anonymous FTP host
.TP
.B <primary name>
is the primary host name of site in the database
.TP
.B <pref name>
is the preferred host name (CNAME) of host. Field may be empty if no such
name is stored
.TP
.B <ip addr>
is the primary IP address of the host
.TP
.B <db>
is the database name
.RE
Host names are specified in standard RFC 1037 format.
IP addresses are specified in standard 'dotted decimal' notation.
The date is specified YYYYMMDDHHMMSS in UTC. However this value is not
examined other than for inequality tests with internal records.
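The LISTSITES exchange can be sketched in Python as below. This is an illustration under stated assumptions rather than the arserver implementation: the function names are hypothetical, a single space is assumed between the command and each parameter, and the IP address used in the usage note is a made-up documentation address.

```python
from collections import namedtuple

SiteTuple = namedtuple(
    "SiteTuple", "source date primary_name pref_name ip_addr db")


def listsites_command(dbs, newer, from_date, domains="*"):
    """Format a LISTSITES line; dbs and domains are colon-joined lists."""
    op = ">" if newer else "<"
    domainlist = domains if isinstance(domains, str) else ":".join(domains)
    return "LISTSITES {} {} {} {}\r\n".format(
        ":".join(dbs), op, from_date, domainlist)


def parse_site_tuple(line):
    """Split one response line into its six colon-separated fields."""
    parts = line.rstrip("\r\n").split(":")
    if len(parts) != 6:
        raise ValueError("expected 6 fields, got %d" % len(parts))
    return SiteTuple(*parts)
```

For example, listsites_command(["anonftp"], True, "00000000000000") requests every site in the anonftp database, and parse_site_tuple("java.cc.mcgill.ca:19930404172308:acfcluster.nyu.edu::192.0.2.1:anonftp") returns a tuple whose pref_name field is empty.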
.PP
.B SENDSITE
With this command, the client informs the server that it would like the
information about a particular site/database combination. It determines
this by comparing the tuples returned by the LISTSITES command above with
its local database. The parameters to the command in order are:
.RS
<primary hostname>:<database>[:<port>] ["compress"]
.RE
If the "compress" string is included with the protocol command, then the
remote server is requested to compress the data stream. The remote server
may agree but also has the option of ignoring this request.
In the case of the `webindex' database, the port is specified, as multiple
servers can reside on the `<primary hostname>' machine.
The server responds with:
.RS
SITELIST <port number>
.RE
With this response the server informs the client that it is ready to
transmit the information and that it is available on <port number>. The
client then opens a connection (data channel) on the server host with
that <port number>.
The format in which the information for this site/database combination is
transmitted is determined by the actual database information. See
.BR arserver (n).
The header for that site/database is usually the first piece of
information to be transmitted on the data channel, however this is not
defined by the protocol.
There is no acknowledgement from the client.
.PP
.B SENDHEADER
With this command the client specifies to the server that the header
record should be transmitted on the control connection. This command is
used by the client when invoked in "retrieval mode" (see
.BR arserver (n)).
The parameters with this command are (in order):
.RS
<primary hostname>:<database>
.RE
There is no acknowledgement from the client.
.PP
.B SENDEXCERPT
With this command the client specifies to the server that the excerpt record
should be transmitted on the data connection. This command is used by the
client after receiving the site for the `webindex' database.
This command elicits the same reply from the server as a
SENDSITE command.
The parameters with this command are (in order):
.RS
<primary hostname>:<database>:<port>
.RE
.PP
.B DUMPCONFIG
This command causes the server to list its arupdate.cf configuration
file. The format is the same as that used in the configuration file
except that colons (':') are used as field separators. The server
signals the termination of the output by the line ENDDUMP.
.PP
.B QUIT
The client requests that the control connection be closed. There is no
acknowledgement from the server.
.SH "SEE ALSO"
.BR arserver (n)
.SH AUTHOR
Bunyip Information Systems
.br
Montr\o"\'e"al, Qu\o"\'e"bec, Canada
.sp
Archie is a registered trademark of Bunyip Information Systems Inc., Canada,
1990.

@@ -0,0 +1,352 @@
.\" Copyright (c) 1994, 1996 Bunyip Information Systems Inc.
.\" All rights reserved.
.\"
.\" Archie 3.5
.\" August 1996
.\"
.\" @(#)arcontrol.n
.\"
.TH ARCONTROL N "August 1996"
.SH NAME
.B arcontrol
\- perform automated updating routines on Archie catalogs
.SH SYNOPSIS
.B arcontrol \-u | \-p | \-r
[
.BI \-M \ <dir>
] [
.BI \-h \ <dir>
] [
.BI \-m \ <maxcount>
] [
.B \-U
] [
.B \-n
] [
.BI \-T \ <timeout>
] [
.BI \-Z
] [
.B \-t
.I <dir>
] [
.B \-v
] [
.B \-l
] [
.B \-L
.I <logfile>
]
.SH DESCRIPTION
.LP
The
.B arcontrol
program is normally invoked automatically by the
.BR cron (8)
daemon. The program initiates the processes necessary to acquire, process
and incorporate new data into the various Archie catalogs.
.SH "OPTIONS"
.PP
One of the following options must be supplied:
.RS
.TP
.B \-r
Process data files with the
.B .retr
suffix, deposited in the holding (temporary) directory by the retrieval phase.
.TP
.B \-p
Process data files with the
.B .parse
suffix, created by the data acquisition phase.
.TP
.B \-u
Process data files with the
.B .update
suffix, created by the parse phase
.RE
.PP
In addition, the following options are available:
.RS
.TP
.BI \-M " <dir>"
The name of the master Archie database directory. If not given,
the program tries to look in the directory
.B ~archie/db
and, failing that, defaults to
.BR ./db .
.TP
.BI \-h " <dir>"
The name of the Archie host database directory. If not
supplied the program will default first to
.B ~archie/db/host_db
and failing that, to
.BR ./host_db .
.TP
.BI \-t " <dir>"
Sets the name of the directory used for temporary files.
If not given, the program uses
.BR ~archie/db/tmp .
.TP
.BI \-m " <maxcount>"
The maximum number of data files to process in any given invocation. This
is especially useful when there are many data files and a limit on how
many to process simultaneously is desired. There is an internal
default of 30 data files in retrieval mode, which may be raised or
lowered by this option. By default in update or parse mode, as many files
as are available will be processed. The special value 0 may be supplied
as an argument to this option and has the meaning of overriding the internal
default maximum: as many files as are available will be processed.
.TP
.B \-n
Do not modify the compression status of the temporary data files. By
default, data stored temporarily on disk throughout the Update Cycle is stored
in a compressed state. However, this data must be uncompressed before
being used. This option tells the system to perform the least amount of
processing in order to use the data. This option requires that there be
more disk space for the uncompressed data.
.TP
.B \-U
Actively uncompress temporary data. Data that is obtained in compressed
form should be uncompressed before writing temporary files. This may
speed processing at certain stages of the update cycle. This option
requires that there be more disk space for the uncompressed data.
.TP
.BI \-T " <timeout>"
Set the timeout on the retrieval phase of the Update Cycle. If the
retrieval connection has been idle for more than the timeout value the
retrieval is terminated and an error generated.
.I <timeout>
is specified in minutes. This value is passed directly to the data acquisition
process. The default is 10 minutes.
.TP
.B \-Z
If in retrieval mode, then the retrieval process will automatically look
for an indexing file (this is defined in the retrieval program's
configuration file).
.TP
.B \-v
Verbose mode. Will tell you what it is doing.
.TP
.B \-l
Write any user output to the default log file
.BR ~archie/logs/archie.log .
If desired, this can be overridden with the
.B \-L
option. Errors will by default be written to
.IR stderr .
.TP
.BI \-L " <logfile>"
The name of the file to be used for logging information.
Note that debugging information is also written to the
log file. This implies the
.B \-l
option, as well.
.RE
.SH "NAMING CONVENTIONS"
The subprocesses spawned by
.B arcontrol
follow a well-defined naming convention:
.IP
.IR "<phase prefix>" _ <dbname> _ <special>
.PP
where
.I <phase prefix>
is one of
.RS
.TP
.B retrieve
For the data acquisition phase of the cycle
.TP
.B parse
For the parse phase
.TP
.B update
For the update phase
.RE
.PP
and
.I <dbname>
is the name of the catalog associated with the data being processed.
.PP
In certain cases, it is necessary to process data destined for the same
Archie catalog in different ways, depending on their source. For example,
UNIX and VMS anonymous FTP listings are significantly different in form
and are parsed differently. Therefore
.I <special>
could apply to, among other things, operating systems.
.TP
Example:
.RS
.PP
.B parse_anonftp
is responsible for parsing the data for the anonftp
catalog. This program then spawns
.IP
.PD .1v
.B parse_anonftp_unix_bsd
.PP
or
.IP
.PD 1v
.B parse_anonftp_vms_std
.PP
depending on the operating system of the source data host. The
information required to determine which program to use is read from the
header record associated with all data files.
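.PP
The naming convention above can be illustrated with a minimal Python sketch.
The function name and its arguments are hypothetical and are not part of the
Archie distribution; only the layout of the name is taken from the text.
.nf
```python
PHASE_PREFIXES = ("retrieve", "parse", "update")


def subprocess_name(phase, dbname, special=None):
    """Build a program name of the form <phase prefix>_<dbname>[_<special>]."""
    if phase not in PHASE_PREFIXES:
        raise ValueError("unknown phase prefix: " + phase)
    parts = [phase, dbname]
    if special:
        # <special> distinguishes sources, for example an operating system.
        parts.append(special)
    return "_".join(parts)


print(subprocess_name("parse", "anonftp"))
print(subprocess_name("parse", "anonftp", "unix_bsd"))
```
.fi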
.br
.PP
The current convention for naming data files during the
update cycle is:
.IP
\fI<site name>\fR\-\fI<dbname>\fR_\fI<cntl num>\fR.\fI<phase suffix>\fR[\fI<tmp suffix>\fR]
.PP
where
.RS
.TP
.I <site name>
is the name of the source host for this data
.TP
.I <dbname>
is the name of the Archie catalog with which this data is
associated
.TP
.I <cntl num>
is a number whose function is to distinguish different
sets of data from the same site and for the same catalog.
Note that this number is arbitrarily determined and may
change after undergoing any given phase of the update
cycle
.TP
.I <phase suffix>
is one of `.retr', `.parse' or `.update' depending on which phase of the cycle
the data is destined for.
.TP
.I <tmp suffix>
is usually `_t'. This is used as a temporary name for data files currently
undergoing processing.
.RE
.PP
Example:
.RS
.PP
While it is being processed, a data file may be called
.IP
\fCarchie.mcgill.ca-anonftp_23.parse_t\fP
.PP
Upon completion, the retrieval phase may generate a file with the name
.IP
\fCarchie.mcgill.ca-anonftp_69.parse\fP
.PP
The control number may differ between the two names, as noted above.
.RE
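.PP
As an illustration only, the following Python sketch splits a data file name
into the components described above. The function name, the returned keys and
the assumption that catalog names contain neither `_' nor `-' are
hypothetical; the hyphen between the site name and the catalog name follows
the example file names.
.nf
```python
import re

# <site name>-<dbname>_<cntl num>.<phase suffix>[<tmp suffix>]
DATA_FILE = re.compile(
    "^(?P<site>.+)-(?P<dbname>[^_-]+)_(?P<cntl>[0-9]+)"
    "[.](?P<phase>retr|parse|update)(?P<tmp>_t)?$"
)


def parse_data_file_name(name):
    """Split a data file name into its components, or return None."""
    match = DATA_FILE.match(name)
    if match is None:
        return None
    fields = match.groupdict()
    fields["cntl"] = int(fields["cntl"])
    # A trailing _t marks a file that is currently being processed.
    fields["in_progress"] = fields.pop("tmp") is not None
    return fields


print(parse_data_file_name("archie.mcgill.ca-anonftp_23.parse_t"))
```
.fi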
.SH "DATA PROCESSING"
.PP
Data acquisition, processing and update provide the basis for the Archie
system model and operate under the direction of
.BR arcontrol .
.PP
The Archie system temporary directory (by default
.B ~archie/db/tmp
unless overridden by the
.B \-t
option) is first scanned for the data files whose
filename suffixes are appropriate for the mode in which the program was
invoked. The header record for each file is then read to determine the
actions which are to be taken. A pre-process pass is taken over each
data file which may modify it to conform to the correct format for the
next processing phase. For example, a compressed data file may be
uncompressed.
.B arcontrol
is also responsible for coordinating the processing operations so that,
for example, no more than one processing program is operating on any
given data file concurrently.
.PP
.B Data Acquisition Phase
.RS
.PP
All retrieval is performed asynchronously. That is, the retrieval
processes are launched without the control process waiting for each
one to return. They are monitored after all have been
launched.
.PP
The connection on which the retrieval is taking place is monitored by the
retrieval process responsible. If the connection has been idle for more
than a preset limit, the connection is closed. Since arcontrol is
responsible for running the appropriate retrieval process in normal
operation this idle interval may be set with the
.B \-T
switch, with units in minutes.
.PP
All programs in the retrieval phase generate data files with the `.parse'
suffix.
.RE
.PP
.B Parse Phase
.RS
.PP
Parsing is performed synchronously, each file in turn. This phase generates
data files with the `.update' suffix.
.RE
.PP
.B Update Phase
.RS
.PP
Updating is performed synchronously.
.B arcontrol
waits for the return of the appropriate update process after launching it.
This mechanism aims to prevent the concurrent updating of any of the Archie
catalogs by more than one process.
.RE
.SH "STOPPING PROCESSING"
If for some reason it is necessary for the Archie administrator to terminate
the program before it has completed processing the current batch of files the
file
.B ~archie/etc/process.stop
should be created. After the completion of processing each file, the arcontrol
program checks for the existence of this file. If it exists, processing
terminates and log and mail entries are generated (if they are being
requested). Creation of this file will also prevent further
update cycles from running, so the file should be removed when no longer needed.
.PP
.B Note:
While this functionality is useful, files that were preprocessed but not yet
processed when the program terminated are left with the `_t' suffix. They are
not picked up by subsequent invocations of the arcontrol program and have
to be removed or renamed (without the `_t' suffix) manually by the
administrator.
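.PP
The stop-file check can be pictured with the following Python sketch. It is
not the arcontrol implementation: the function and parameter names are
hypothetical, and the only behaviour taken from the text is that the
existence of the file is tested after each data file has been processed.
.nf
```python
import os


def process_batch(files, handler, stop_file):
    """Process files in order, stopping early if stop_file exists.

    Returns the list of files that were actually processed.
    """
    done = []
    for name in files:
        handler(name)
        done.append(name)
        # The check happens only after a file has been completed.
        if os.path.exists(stop_file):
            break
    return done
```
.fi
.PP
In this sketch a caller would pass the expanded path of
.B ~archie/etc/process.stop
as the last argument.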
.SH BUGS
.LP
Files are preprocessed as a batch operation at the start of the program rather
than one at a time as needed. As a result, if the process terminates before
completing its tasks, files with the `_t' suffix will be left in the temporary
directory and have to be removed manually.
.LP
Sites that change their primary host names between updates
are currently not correctly handled.
.SH FILES
There are no configuration files currently associated with this program.
.LP
The only compression format currently implemented is Lempel-Ziv with
.BR compress (1).
.SH "SEE ALSO"
.BR retrieve_* (n),
.BR parse_* (n),
.BR update_* (n)
.SH AUTHOR
Bunyip Information Systems.
.br
Montr\o"\'e"al, Qu\o"\'e"bec, Canada
.sp
Archie is a registered trademark of Bunyip Information Systems Inc., Canada,
1990.


@@ -0,0 +1,211 @@
.\" Copyright (c) 1994, 1996 Bunyip Information Systems Inc.
.\" All rights reserved.
.\"
.\" Archie 3.5
.\" August 1996
.\"
.\" @(#)ardomains.n
.\"
.TH ARDOMAINS N "August 1996"
.SH NAME
.B ardomains
\- maintain the Archie system pseudo-domains database
.SH SYNOPSIS
.B ardomains
[
.B \-M
.I <dir>
] [
.B \-h
.I <dir>
] [
.B \-d
] [
.B \-f
.I <dom file>
]
.SH DESCRIPTION
.PP
.B ardomains
takes a correctly formatted file and enters it into the Archie
pseudo-domains database.
.SH OPTIONS
.TP
.BI \-M " <dir>"
The name of the master Archie database directory. If not
given, the program tries to look in the directory
.B ~archie/db
and, failing that, defaults to
.BR ./db .
.TP
.BI \-h " <dir>"
The name of the Archie host database directory. If not
supplied the program will default first to
.B ~archie/db/host_db
and failing that, to
.BR ./host_db .
.TP
.B \-d
Dump the current domain database to stdout in a format suitable for
reloading into the compiled format.
.TP
.BI \-f " <dom file>"
Use the file
.I <dom file>
instead of the default, which is
.BR ~archie/etc/ardomains.cf .
.PP
.SH Pseudo-Domains & File Format
.RS
.PP
The Archie system has the concept of
.IR pseudo-domains .
This is primarily for the convenience of being able to specify ``domains''
like `.usa', rather than specifying `.gov' & `.com' & `.edu' & `.us', etc. By
default (unless overridden by the
.BR \-M " or " \-h
options) the files
.BI ~archie/db/host_db/domain-db. { dir , pag }
contain this database. The format of the input for this program is
a file (by default
.BR ~archie/etc/ardomains.cf )
containing lines of the form:
.IP
.I <pseudo domain> <pseudo domain>
[[
.B :
[
.I <pseudo domain>
|
.I <domain>
]]...] [
.I <description>
]
.PP
The definition of the domain may be followed by a description. This
description is incorporated into the database where it is available to
other components of the system.
.PP
Any line in the file may be continued on the following line by making the
backslash (`\e') the last character on the line.
Comments may be included in the file. They are started by `#' and end at the
next newline.
For example:
.RS
.PP
.nf
\fCeurope  de:ie:pt:es:uk:at:fr:.il\e
        :be:nl:ch
scan    no:fi:dk:se               Scandinavia
noram   edu:com:gov:us:ca         North America
asia    kr:hk:sg:jp:cn:my:tw:in   #Subset of Asia
world   europe:scan:noram:asia\fP
.fi
.RE
.PP
This means that the pseudo-domain `europe' is composed of the DNS domain names
of the countries of Europe. Similarly `world' is composed of the
pseudo-domains `europe', `scan' (Scandinavia), `noram' (North America) and
`asia'.
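.PP
The file format can be read with a few lines of Python, shown here as a
sketch rather than as the parser used by ardomains. The function name is
hypothetical, and two details are assumptions: comments are stripped before
continuation lines are joined, and leading white space on a continued line is
discarded.
.nf
```python
BACKSLASH = chr(92)  # the line continuation character


def parse_domain_file(lines):
    """Parse pseudo-domain definitions into {name: (members, description)}."""
    logical, pending = [], ""
    for raw in lines:
        text = raw.split("#", 1)[0].rstrip()  # drop comments
        if pending:
            text = text.lstrip()  # assumption: continued text is joined directly
        if text.endswith(BACKSLASH):
            pending += text[:-1]
            continue
        logical.append(pending + text)
        pending = ""
    if pending:
        logical.append(pending)

    table = {}
    for line in logical:
        fields = line.split(None, 2)
        if len(fields) < 2:
            continue  # blank or incomplete line; this sketch skips it
        name, members = fields[0], fields[1]
        description = fields[2] if len(fields) > 2 else ""
        table[name] = ([m for m in members.split(":") if m], description)
    return table


sample = [
    "europe de:ie:pt:es:uk:at:fr:.il" + BACKSLASH,
    "        :be:nl:ch",
    "scan no:fi:dk:se Scandinavia",
]
print(parse_domain_file(sample))
```
.fi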
.PP
When trying to determine if a site is a member of a list of given domains
and pseudo-domains, the Archie system first resolves each pseudo-domain
into its base constituents by walking the domain tree:
.IP
`world' would be `europe:scan:noram:asia'
.IP
`noram' would be `edu:com:gov:us:ca'
.PP
The system would then search to see if any of these are in turn
pseudo-domains. This process can be nested to (currently) 20 levels. If an
entry cannot be resolved into subcomponents, it is taken as is. Thus for
example, the pseudo-domain `uquebec' could be arbitrarily defined as
.IP
uquebec mcgill.ca:uqam.ca:concordia.ca:crim.ca
.PP
defining the various domains within the pseudo-domain of universities in
Quebec. Since the `mcgill.ca' domain cannot be further resolved by the
system, it is taken to be a base component.
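.PP
The resolution step can be sketched in Python as follows. This is an
illustration under stated assumptions, not the Archie code: the function
name and the table layout are hypothetical, and exactly how the limit of 20
levels is counted is a guess.
.nf
```python
MAX_DEPTH = 20  # nesting limit stated in the text


def resolve(name, table, depth=0):
    """Expand a name into base domains using a {name: [members]} table."""
    if name not in table:
        return [name]  # not a pseudo-domain: taken as is
    if depth >= MAX_DEPTH:
        raise RecursionError("pseudo-domains nested too deeply: " + name)
    base = []
    for member in table[name]:
        for domain in resolve(member, table, depth + 1):
            if domain not in base:
                base.append(domain)
    return base


table = {
    "scan": ["no", "fi", "dk", "se"],
    "noram": ["edu", "com", "gov", "us", "ca"],
    "uquebec": ["mcgill.ca", "uqam.ca", "concordia.ca", "crim.ca"],
    "world": ["scan", "noram"],  # shortened for the example
}
print(resolve("world", table))
print(resolve("uquebec", table))
```
.fi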
.PP
This technique gives both the Archie administrators and users the ability
to use a form of shorthand when specifying domains.
.PP
Note that on final resolution the base names must match a real DNS domain
to be meaningful (or it will never match in the comparison). The system
makes no attempt to verify the authenticity of the base domains: they are
just used for comparisons with other names which they may or may not
match.
.PP
Any pseudo domain used must be defined before being used in the
definition of another pseudo-domain.
.PP
Loops in the domain database are detected when this program is run.
For example, if domain A is composed of domains B and C, then B and C may not
be composed of any domains which directly or indirectly contain domain A.
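.PP
One way to detect such loops is a depth-first walk over the definitions, as
in the Python sketch below. The text does not say how ardomains performs the
check, so the algorithm, the function name and the table layout are all
assumptions made for illustration.
.nf
```python
def find_loop(table):
    """Return a name that takes part in a definition loop, or None."""
    visiting, finished = set(), set()

    def visit(name):
        if name in finished or name not in table:
            return None  # already checked, or a base domain
        if name in visiting:
            return name  # reached a name that is still being expanded
        visiting.add(name)
        for member in table[name]:
            found = visit(member)
            if found is not None:
                return found
        visiting.discard(name)
        finished.add(name)
        return None

    for name in table:
        found = visit(name)
        if found is not None:
            return found
    return None


print(find_loop({"A": ["B", "C"], "B": ["x"], "C": ["A"]}))
print(find_loop({"A": ["B", "C"], "B": ["x"], "C": ["y"]}))
```
.fi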
.TP
Note:
The `.il' entry differs from the rest in the above example by having a
preceding period (`.'). This is so that sites in Israel (`.il') do not match
US military sites (`.mil') since the comparison is done right to left. There
are only a few cases in which this is important since only the US and Canada
allow entries in DNS which do not end in the unique ISO country code.
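.PP
The effect of the leading period can be reproduced with a plain suffix
comparison, sketched below in Python. The function name and the host names
are hypothetical, and treating the comparison as a case-insensitive suffix
test is an assumption based on the description above.
.nf
```python
def matches_domain(host, domain):
    """Compare from the right: True if host ends with the domain string."""
    return host.lower().endswith(domain.lower())


print(matches_domain("ftp.example.af.mil", "il"))
print(matches_domain("ftp.example.af.mil", ".il"))
print(matches_domain("ftp.example.ac.il", ".il"))
```
.fi
.PP
Under these assumptions the bare `il' also matches the `.mil' host, whereas
`.il' matches only the host in Israel.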
.PP
.B ardomains
is designed so that the Archie system is never without a domains database even
during update of that database. This is done by manipulating the links to old
and new copies of the database.
.SH FILES
The default input file (if the
.B \-f
option is not specified) is
.BR ~archie/etc/ardomains.cf .
.LP
The active domain files are by default (when not overridden by the
.BR \-M " or " \-h
options) in the directory
.BR ~archie/db/host_db .
The
.BR ndbm (3)
system is used to store the pseudo-domain database, which is composed of
`.dir' and `.pag' files. The active domain files are
.BR domain-db .
These are links to
.BR domain-db-new .
The files
.B domain-db-old
are used as backups in the relinking process.
.SH NOTES
The domain database files \fIcannot\fP be copied by standard means such
as
.BR cp (1),
.BR tar (1),
etc. See
.BR ndbm (3)
for an explanation. The only safe way to copy the files is to use
.BR dump (8)
and
.BR restore (8).
.SH DIAGNOSTICS
When loops are detected, an error message is printed specifying the offending
domain and the program exits, restoring the previous domain database in the
process.
.LP
Syntax errors are usually detected and flagged with the same result.
.SH "SEE ALSO"
.BR ndbm (3),
Archie system documentation.
.SH AUTHOR
Bunyip Information Systems
.br
Montr\o"\'e"al, Qu\o"\'e"bec, Canada
.sp
Archie is a registered trademark of Bunyip Information Systems Inc., Canada,
1990.


@@ -0,0 +1,353 @@
.\" Copyright (c) 1996 Bunyip Information Systems Inc.
.\" All rights reserved.
.\"
.\" Archie 3.5
.\" August 1996
.\"
.\" @(#)arserver.n
.\"
.TH AREXCHANGE N "August 1996"
.SH NAME
.B arexchange
\- Archie data exchange program
.SH SYNOPSIS
.B arexchange
[
.BI \-M " <dir>"
] [
.BI \-h " <dir>"
] [
.BI \-C " <config>"
] [
.BI \-f " <force hosts>"
] [
.BI \-F " <remote server list>"
] [
.BI \-T " <timeout>"
] [
.B \-e
] [
.BI \-d " <catalog list>"
] [
.BI \-I " <size>"
] [
.BI \-m " <maximum number>"
] [
.B \-Z
] [
.B \-v
] [
.B \-c
] [
.B \-u
] [
.B \-j
] [
.B \-l
] [
.BI \-L " <logfile>"
]
.SH DESCRIPTION
.PP
.B arexchange
is used by the local Archie host to retrieve preprocessed data, from other
remote Archie hosts, by contacting the remote
.B arserver
process. It does so on a per-database basis with the use of the associated
.B ~archie/bin/net_*
programs. So, for example, to exchange webindex data the
.B net_webindex
program is invoked, and it is responsible for the actual data
transmission.
.B arexchange
is responsible for the administration of the data exchange.
.RE
.SH OPTIONS
.TP
.BI \-M " <dir>"
The name of the master Archie database directory. If not specified, the
program looks in the directory
.BR ~archie/db ,
then
.BR ./db .
.TP
.BI \-h " <dir>"
The name of the Archie host database directory. If not supplied, the program
will first try
.BR ~archie/db/host_db ,
then
.BR ./host_db .
.TP
.BI \-C " <config>"
Use
.I <config>
as the configuration file. See the
.SM CONFIGURATION
section.
.TP
.BI \-f " <force hosts>"
.I <force hosts>
is a colon separated list of data hosts to be retrieved. This overrides the
default requests which would normally be obtained from the configuration file,
and retrieves the data hosts even if they are not scheduled for
retrieval. Note that the data hosts' information will not be retrieved if the
server process at the other end of the connection determines that the
associated records are currently inactive. All the servers listed in the
configuration file are contacted for the specified
.I <force hosts>
(unless overridden by the
.B \-F
option). The databases retrieved are those specified in the configuration
file.
.TP
.BI \-F " <remote server list>"
.I <remote server list>
is the colon separated list of names of Archie servers in the configuration
file from which the program should retrieve the information requested. No
other servers are contacted, and if the given servers are not listed in the
configuration file no action is taken. A simple case insensitive string
comparison is performed between the server list on the command line and those
in the configuration file. No DNS comparison is done, so the same names in the
configuration file must be used on the command line (i.e. aliases must not be
used).
.TP
.BI \-T " <timeout>"
If, during the data exchange, the connection is idle for
.I <timeout>
minutes, the process is aborted. The timeout has a default value of
10 minutes.
.TP
.B \-j
The program does not invoke the actual data exchange, but prints the data
provided by the remote server process as if the data exchange were being
carried out. This is printed out in an internal protocol format. See
.BR archie_protocol (5).
.TP
.B \-e
Do not expand pseudo-domains on output. By default, a pseudo-domain specified
in the
.B arupdate.cf
file (to be exchanged) is replaced with its corresponding list of domains.
.TP
.BI \-d " <database list>"
Only exchange those databases in the colon separated
.IR "<database list>" .
If none of the databases in the list are present in the configuration file
entry for any given server, then no transfers are performed with that
server. This option may be used in conjunction with the
.B \-F
and/or
.B \-f
options to obtain the specified databases, on particular data hosts, from
specific remote Archie servers.
.TP
.B \-v
Verbose mode. Write debugging information to
.IR stderr ,
or to a log file, if one is specified.
.TP
.B \-c
Request compression of the transferred data (using the
.BR compress (1)
or
.BR gzip (1)
program). The remote server may choose to ignore the request.
.TP
.B \-u
Request uncompressed transmission of data. The remote server may choose
to ignore the request.
.TP
.B \-l
Write any messages to the default log file
.BR ~archie/logs/archie.log .
The name of the log file can be overridden with the
.B \-L
option. By default, errors are written to
.IR stderr .
.TP
.B \-Z
Force the retrieve programs to pick up the
.B ls-lR.Z
or
.B ls-lR.gz
files.
.TP
.BI \-m " <maximum number>"
Specify the maximum number of entries to process.
.TP
.BI \-I " <size>"
Set a minimum size, in bytes, for a site file to be indexed. If the size of
the site file is greater than or equal to this size, a `.idx' file will
accompany the site file in order to speed up search queries. The default
value of
.I <size>
is 500000 bytes. This option is useful only when updating (using the
.B \-u
option), and is used by the insert programs.
.TP
.BI \-L " <logfile>"
The name of the file to which information is logged. This option must be used
with the
.B \-l
option. Note that debugging information is also written to the log file.
.RE
.SH CONFIGURATION
.PP
This program is intended to be periodically run by the
.BR cron (8)
daemon (see
.SM "Configuring the System"
in the Archie documentation). The program reads the configuration file,
.BR ~archie/etc/arupdate.cf ,
unless overridden by the
.B \-C
option. This file is also used by the
.B arserver
program. (See
.BR arserver (n)).
Lines in the configuration file have the following format:
.IP
\fI<archie host> <config>\fP [, \fI<config>\fP ...]
.PP
where
.RS
.TP
.I <archie host>
is the Fully Qualified Domain Name of the host with which data is to be
exchanged. Note that the local Archie server name should not be in this file.
.RE
.PP
For each
.I <archie host>
there may be multiple
.I <config>
entries. Since each entry starting with
.I <archie host>
is considered to be a single line, the backslash (`\\') is used as a line
continuation character. Each
.I <config>
consists of the following fields:
.LP
.I <db list> <domain list> <maxno> <perms> <freq> <date> <fail>
.PP
.SS Field Interpretation
.PP
The fields have the following syntax and meaning:
.RS
.TP
.I <db list>
A list of Archie databases about which to query the server. An asterisk (`*')
indicates that the server is to be queried about
.I all
databases.
.TP
.I <domain list>
The list of domains about which to query the server.
.TP
.I <maxno>
The maximum number of sites to accept, from the server, at a time. A value of
zero indicates that
.I all
available sites should be obtained.
.TP
.I <perms>
If the character `w' appears here, the arexchange program should query the
server about the information on this line (see below).
.TP
.I <freq>
A number specifying the minimum delay, by default in minutes, until the server
is again contacted. The time may be specified in hours or days by appending
an `h' or `d'
.I immediately
after the number.
.TP
.I <date>
A date, in YYYYMMDDHHMMSS format, indicating the last time the
.B arexchange
program queried this server.
.TP
.I <fail>
The number of consecutive, failed attempts to contact this server.
.RE
.PP
Example:
.IP
The following is an example of a configuration file entry and how the program
would interpret it.
.LP
\fCbunyip.com anonftp:webindex europe:usa 30 w 12h 19920703162322 2\fP
.PP
.RS
.TP
Field 1.
Contact the server at `bunyip.com' to retrieve the exchanged data.
.TP
Field 2.
Request information about the anonftp and webindex databases.
.TP
Field 3.
Ask only for those sites in `europe' and `usa'.
.TP
Field 4.
Retrieve at most 30 sites from the server, in one session.
.TP
Field 5.
This line is
.I enabled
so the client is to ask the server about information specified on this line.
.TP
Field 6.
Contact the server every 12 hours.
.TP
Field 7.
The time at which the server was last contacted.
.TP
Field 8.
There have been two consecutive failures to contact this site.
.sp
.sp
.RE
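.PP
The field layout can be made concrete with a short Python sketch. It is not
taken from arexchange: the names
.I Config
and
.I parse_entry
are hypothetical, and the sketch assumes that continuation lines have already
been joined and that successive
.I <config>
groups are separated by commas, as in the format line above.
.nf
```python
from collections import namedtuple

Config = namedtuple(
    "Config", "db_list domain_list maxno perms freq date fail"
)


def parse_entry(line):
    """Parse '<archie host> <config> [, <config> ...]' into (host, configs)."""
    host, rest = line.split(None, 1)
    configs = []
    for chunk in rest.split(","):
        fields = chunk.split()
        if len(fields) != 7:
            raise ValueError("expected 7 fields, got %d" % len(fields))
        db, dom, maxno, perms, freq, date, fail = fields
        configs.append(Config(db.split(":"), dom.split(":"),
                              int(maxno), perms, freq, date, int(fail)))
    return host, configs


entry = "bunyip.com anonftp:webindex europe:usa 30 w 12h 19920703162322 2"
print(parse_entry(entry))
```
.fi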
Note that both the local
.BR arserver
and
.BR arexchange
programs read the same configuration file. Furthermore, it is through the
.I <perms>
field that the respective server and client invocations determine the lines
intended for them. The
.B arexchange
program uses lines with a
.I <perms>
value of `w'.
.sp
.RE
It is also possible to specify `rw' in the permission field. This means both
that the remote client may contact the local server and that the local client
is to contact the remote server.
.sp
.RE
See
.BR arserver (n)
for more examples.
.RE
.SH FILES
~archie/etc/arupdate.cf
.br
~archie/db/host_db/*
.SH SEE ALSO
.PP
.BR cron (8),
.BR arserver (n),
.BR arretrieve (n),
Archie system documentation.
.SH AUTHOR
Bunyip Information Systems,
.br
Montr\o"\'e"al, Qu\o"\'e"bec, Canada
.sp
Archie is a registered trademark of Bunyip Information Systems Inc., Canada,
1990.


@@ -0,0 +1,216 @@
.\" Copyright (c) 1993, 1994, 1996 Bunyip Information Systems Inc.
.\" All rights reserved.
.\"
.\" Archie 3.5
.\" August 1996
.\"
.\" @(#)arretrieve.n
.\"
.TH ARRETRIEVE N "August 1996"
.SH NAME
.B arretrieve
\- local Archie retrieval client
.SH SYNOPSIS
.B arretrieve
[
.BI \-M " <dir>"
] [
.BI \-h " <dir>"
] [
.BI \-C " <config>"
] [
.BI \-f " <force hosts>"
] [
.BI \-F " <remote server>"
] [
.BI \-d " <catalog list>"
] [
.BI \-T " <timeout>"
] [
.B \-j
] [
.B \-v
] [
.B \-l
] [
.BI \-L " <logfile>"
]
.SH DESCRIPTION
.PP
This program is normally invoked by the
.BR cron (8)
process and obtains a set of headers (see the Archie documentation on `archie
headers'), from the \fIlocal\fP arserver program, corresponding to those sites
requiring update. Each header matching the criteria specified by this program
(the `client') is placed in a separate file, where it is then used as the
first step of the Archie update cycle.
.RE
.SH OPTIONS
.RS
.TP
.BI \-M " <dir>"
The name of the master Archie database directory. If not given, the program
tries to look in the directory
.B ~archie/db
and, failing that, defaults to
.BR ./db .
.TP
.BI \-h " <dir>"
The name of the Archie host database directory. If not supplied the program
will default first to
.B ~archie/db/host_db
and failing that, to
.BR ./host_db .
.TP
.BI \-C " <config>"
Use the file
.I <config>
as the configuration file. See below.
.TP
.BI \-f " <force hosts>"
.I <force hosts>
is a colon separated list of hosts to be retrieved. This overrides the default
requests which would normally be obtained from the configuration file and
retrieves the hosts even if they are not scheduled for retrieval. Note that
the hosts' information will not be retrieved if the server process at the other
end of the connection determines that the associated records are currently
inactive. All the servers listed in the configuration file are contacted for
the specified
.I <force hosts>
(unless overridden by the
.B \-F
option). The catalogs retrieved are those specified in the configuration file.
.TP
.BI \-F " <remote server>"
.I <remote server>
is the name of an Archie server in the configuration file from which the
program should retrieve the information requested. No other servers are
contacted and if the given server is not listed in the configuration file no
action is taken.
.TP
.BI \-d " <catalog list>"
Only exchange those catalogs in the colon separated
.IR "<catalog list>" .
If none of the catalogs in the list are present in the configuration file
entry for any given server then no transfers are performed with that server.
.TP
.B \-j
The program does not perform the actual data retrieval, but prints the data
provided by the remote server process as if the retrieval were being carried
out.
.TP
.BI \-T " <timeout>"
If, during data retrieval, the connection is idle for
.I <timeout>
minutes, abort the process. The timeout is set, by default, to 10 minutes.
.TP
.B \-v
Verbose mode. All output is written to the current log file.
.TP
.B \-l
Write any user output to the default log file
.BR ~archie/logs/archie.log .
If desired, this can be overridden with the
.B \-L
option. Errors will, by default, be written to
.IR stderr .
.TP
.BI \-L " <logfile>"
The name of the file to be used for logging information. Note that debugging
information is also written to the log file. This implies the
.B \-l
option, as well.
.RE
.RE
.SH CONFIGURATION
.PP
The program reads a configuration file which is, by default,
.B ~archie/etc/arretrieve.cf
(unless overridden by the
.B \-C
option). This file has the same format as the
.B arupdate.cf
file and very similar semantics to those used by the arexchange program. (See
.BR arexchange (n).)
.PP
This file has lines of the following format:
.IP
.IR "<archie host> <config> " [, " <config> " ...]
.PP
where
.I <archie host>
is the Fully Qualified Domain Name of the host from which the data for the
start of the Update Cycle is to be obtained. Normally, this would be the
local Archie server. If the arserver program is running on the same host,
.I <archie host>
may be specified as `localhost'. The asterisk character (`*') may be used to
signify the fact that \fIany\fP remote Archie host may connect to the local
Archie host, although this would be very unusual.
.PP
The backslash character (`\\') is used as a line continuation marker. Each
.I <config>
consists of the following fields.
.IP
.I <db list> <domain list> <maxno> <perms> <freq> <date> <fail>
.PP
.TP
.I <db list>
A colon separated list of Archie catalogs to query the server about. An
asterisk (`*') specifies that the server is to be queried about
.I all
catalogs.
.TP
.I <domain list>
The colon separated list of domains to query the server about.
.TP
.I <maxno>
The maximum number of sites (headers) to accept from the server at any one
time. If 0 is specified, all available sites in need of update in the
Archie catalog will be downloaded.
.TP
.I <perms>
The character `w' here indicates the client should query the server about the
information on this line. If this character is `r' the current line is
ignored.
.TP
.I <freq>
A number specifying the minimum number of minutes that are to elapse before
contacting the server with this query again. It may have the modifiers `h' or
`d',
.I immediately
following the number, specifying that the value is in units of hours or days
respectively.
.TP
.I <date>
Date in YYYYMMDDHHMMSS format. This specifies the last time that the client
performed a query.
.TP
.I <fail>
The number of consecutive attempts to contact this server which have failed.
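.PP
The interaction of the
.I <freq>
and
.I <date>
fields can be sketched in Python as follows. The function names are
hypothetical, and the rule that a query is due once at least
.I <freq>
has elapsed since
.I <date>
is an interpretation of the field descriptions, not a statement about the
arretrieve source.
.nf
```python
from datetime import datetime, timedelta

UNITS = {"h": 60, "d": 24 * 60}  # multipliers to minutes


def freq_minutes(freq):
    """Convert a <freq> value such as '90', '12h' or '30d' to minutes."""
    if freq[-1] in UNITS:
        return int(freq[:-1]) * UNITS[freq[-1]]
    return int(freq)


def is_due(freq, date, now):
    """True if at least <freq> has elapsed since the <date> timestamp."""
    last = datetime.strptime(date, "%Y%m%d%H%M%S")
    return now - last >= timedelta(minutes=freq_minutes(freq))


print(freq_minutes("12h"))
print(is_due("12h", "19920703162322", datetime(1992, 7, 4, 9, 0, 0)))
```
.fi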
.PP
Note that this program will normally not be configured to contact any Archie
.B arserver
program other than the local one, since retrieval of the raw data will, under
normal circumstances, be handled by the Archie host responsible for that data
site.
.RE
.SH FILES
~archie/etc/arretrieve.cf
.br
~archie/db/host_db/*
.SH SEE ALSO
.PP
.BR cron (8),
.BR arserver (n),
.BR arexchange (n),
Archie system documentation
.SH AUTHOR
Bunyip Information Systems.
.br
Montr\o"\'e"al, Qu\o"\'e"bec, Canada
.sp
Archie is a registered trademark of Bunyip Information Systems Inc., Canada,
1990.


@@ -0,0 +1,335 @@
.\" Copyright (c) 1992, 1994, 1996 Bunyip Information Systems Inc.
.\" All rights reserved.
.\"
.\" Archie 3.5
.\" August 1996
.\"
.\" @(#)arserver.n
.\"
.TH ARSERVER N "August 1996"
.SH NAME
.B arserver
\- Archie data exchange server
.SH SYNOPSIS
.B arserver
[
.BI \-M " <dir>"
] [
.BI \-h " <dir>"
] [
.BI \-C " <config>"
] [
.B \-S
] [
.B \-c
] [
.B \-v
] [
.B \-l
] [
.BI \-L " <logfile>"
]
.SH DESCRIPTION
.PP
.B arserver
is responsible for providing remote access to and from the Archie
catalogs and data. This process is currently used
to perform two functions:
.RS
.IP a)
Allow the exchange of preprocessed data between Archie hosts in conjunction
with the
.B arexchange
program and
.IP b)
Provide the information to the local data acquisition phase of the update cycle
in conjunction with the
.B arretrieve
program.
.RE
.SH OPTIONS
.TP
.BI \-M " <dir>"
The name of the master Archie database directory. If not given, the program
tries to look in the directory
.B ~archie/db
and, failing that, defaults to
.BR ./db .
.TP
.BI \-h " <dir>"
The name of the Archie host database directory. If not supplied the program
will default first to
.B ~archie/db/host_db
and failing that, to
.BR ./host_db .
.TP
.BI \-C " <config>"
Use the file
.I <config>
as the configuration file. See below.
.TP
.B \-S
Sleep for 20 seconds after initial startup. This is primarily used for
debugging.
.TP
.B \-c
Send outgoing inter-Archie data in compressed format (using the
.BR compress (1)
program). This results in a significant improvement in transfer times. Remote
clients may specifically override this setting and get the data
uncompressed. However, in this case a message that this has occurred is logged
in the log file.
.TP
.B \-v
Verbose mode. Output is written to the current log file.
.TP
.BI \-L " <logfile>"
The name of the file to be used for logging information. Note that debugging
information is also written to the log file. Logging is by default written to
the default log file
.BR ~archie/logs/archie.log .
.SH CONFIGURATION
.PP
This program is designed to be run from the
.BR inetd (8)
process in normal operation (see
.SM "Configuring the System"
in the Archie documentation).
.PP
The program uses a configuration file,
.BR ~archie/etc/arupdate.cf ,
unless overridden by the
.B \-C
option. This file is also used by the arexchange program (see
.BR arexchange (n)).
Any program contacting the arserver process is called a
.IR client .
The file has lines of the following format:
.IP
.IR "<archie host> <config>" " [, " <config> " ] ..."
.PP
where
.RS
.TP
.I <archie host>
is the Fully Qualified Domain Name of the host with which data is to be
exchanged. This name enforces a basic security mechanism by specifying which
remote Archie clients are allowed to connect to the local Archie host. The
asterisk character (`*') may be used to signify the fact that \fIany\fP remote
Archie host may connect to the local Archie host.
.RE
.PP
For each
.I <archie host>
there may be multiple
.I <config>
entries. Since each entry starting with
.I <archie host>
is considered to be a single line, the backslash character (`\\') is used as a
line continuation marker. Each
.I <config>
consists of the following fields:
.PP
.I <db list> <domain list> <maxno> <perms> <freq> <date> <fail>
.PP
.B Field Interpretation
.RS
.PP
The fields have the following syntax and semantics:
.RS
.TP
.I <db list>
A colon separated list of Archie catalog names. The requesting client may
query the local server only about the catalogs listed here.
.I All
catalogs may be queried by using the asterisk character (`*').
.TP
.I <domain list>
A colon separated list of Archie pseudo-domains and domains. The client
program asks questions of any site in the local server which belongs to the
domains listed here. Specify an asterisk (`*') if queries about \fIall\fP
domains are to be allowed.
.TP
.I <maxno>
The maximum number of sites, matching the above constraints, to be returned to
the client process.
.TP
.I <perms>
If the character `r' occurs in this field then the client has `read
permission' on the catalogs/domains specified on this configuration line. This
field exists to allow \fIasymmetric\fP data exchanges where Archie site A may
request data from Archie site B but not vice versa.
.TP
.I <freq>
Ignored (and never modified) by
.BR arserver .
See below.
.TP
.I <date>
Ignored by
.BR arserver .
See below.
.TP
.I <fail>
Ignored by
.BR arserver .
See below.
.RE
.LP
Example:
.LP
The following is a sample configuration file entry accompanied by an
explanation of how the server would interpret this line.
.sp
\fCbunyip.com anonftp:webindex europe:usa 30 r 12h 19920703162322 0\\
.br
.RS
whois:yp asia 10 r 30d 19920603162322 0\fP
.RE
.PP
Line 1.
.RS
.TP
Field 1.
The remote client (RC) on the site `bunyip.com' is allowed to
contact the local server.
.TP
Field 2.
The RC is allowed to query the local server about the anonftp \fIand\fP
webindex Archie catalogs.
.TP
Field 3.
The RC can query about sites in the Archie pseudo domains of `europe'
\fIand\fP `usa' (which of course must be defined in the server Archie
system). See
.BR ardomains (n).
.TP
Field 4.
The RC can request data for at most 30 sites in any given session.
.TP
Field 5.
The data in the Archie catalogs specified by this line is `enabled' for the
RC.
.TP
Field 6.
Ignored.
.TP
Field 7.
Ignored.
.TP
Field 8.
Ignored.
.RE
.PP
Line 2.
.RS
is a continuation of line 1, due to the continuation character (\\) as the
last character on line 1. Therefore line 2 applies to the remote client
connecting from `bunyip.com'.
.RS
.TP
Field 1.
For the sake of clarity there is no ``Field 1'' used on continuation lines.
.TP
Field 2.
The RC may query for entries in the `whois' and `yp' catalogs.
.RE
.RE
.PP
Note that both the
.B arserver
and
.B arexchange
programs read the same configuration file, and further note that it is through
the
.I <perms>
field that the respective server and client invocations can determine the
lines intended for them. The server picks up lines with a
.I <perms>
value of `r'. It is also possible to specify `rw' as the permission
field. This means both that the remote client may contact the local server and
that the local client can contact the remote server.
.PP
Example:
.RS
.PP
The following line specifies one
.I symmetric
and two
.I asymmetric
examples of data exchange.
.LP
\fCbunyip.com anonftp usa 50 rw 1d 19920703162322 0\\
.PD .1v
.RS
.IP
\fCwebindex europe 30 r 10d 19920703162322 0\\
.IP
.PD 1v
\fCwhois usa 70 w 5d 19920703162322 0\fP
.RE
.PP
The first line specifies that the local client is to contact the remote server
on `bunyip.com' and request all sites in the `anonftp' catalog from the `usa'
pseudo-domain. In addition, the remote client may contact the local server and
request the same information (symmetric).
.PP
The second line specifies that the remote client may request information about
the `webindex' catalog for sites in the `europe' domain. Since `r' is
specified as the permissions field, the local client will not query the remote
server for this information (asymmetric). Similarly, the local client is to
query the remote server about information on the `whois' catalog from sites
in `usa' (line 3), but may not be queried by the remote client about this
catalog in this domain.
.RE
.RE
.RE
.RE
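.PP
The entry format described above can be illustrated with a short sketch.
The following Python fragment is not part of the Archie distribution. The
names parse_entry, server_lines and Config are hypothetical, and the sketch
assumes that fields are separated by white space and that each physical
line carries exactly one
.I <config>
group, as in the examples above.
.nf

```python
# Illustrative sketch only: parse_entry(), server_lines() and Config are
# hypothetical names, not part of the Archie software.
from typing import List, NamedTuple, Tuple

BACKSLASH = chr(92)  # the line continuation marker
NEWLINE = chr(10)


class Config(NamedTuple):
    db_list: List[str]      # catalog names, or ["*"] for all catalogs
    domain_list: List[str]  # (pseudo-)domains, or ["*"] for all domains
    maxno: int              # maximum number of sites returned
    perms: str              # "r", "w" or "rw"
    freq: str               # kept as text; ignored by arserver
    date: str               # kept as text; ignored by arserver
    fail: int               # ignored by arserver


def parse_entry(text: str) -> Tuple[str, List[Config]]:
    """Split one logical entry into its host name and <config> groups."""
    groups = []
    for line in text.splitlines():
        line = line.strip()
        if line.endswith(BACKSLASH):
            line = line[:-1]
        if line.split():
            groups.append(line.split())
    # Only the first physical line starts with <archie host>.
    host = groups[0].pop(0)
    configs = []
    for g in groups:
        if len(g) != 7:
            raise ValueError("expected 7 fields, got %d" % len(g))
        configs.append(Config(g[0].split(":"), g[1].split(":"), int(g[2]),
                              g[3], g[4], g[5], int(g[6])))
    return host, configs


def server_lines(configs: List[Config]) -> List[Config]:
    """Groups the server acts on: those whose <perms> field contains r."""
    return [c for c in configs if "r" in c.perms]


entry = (
    "bunyip.com anonftp usa 50 rw 1d 19920703162322 0" + BACKSLASH,
    "    webindex europe 30 r 10d 19920703162322 0" + BACKSLASH,
    "    whois usa 70 w 5d 19920703162322 0",
)
host, configs = parse_entry(NEWLINE.join(entry))
print(host, [c.perms for c in server_lines(configs)])
```
.fi
.PP
In this sketch the server-side filter keeps the `rw' and `r' groups and
skips the `w' group, which matches the description of the
.I <perms>
field given above.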
.PP
.B Data Exchange
.RS
.PP
The actual data exchange is not performed by the server or client, which merely
negotiate the information which is to be received and transmitted. The
protocol used by arserver and its clients is described in
.BR archie_protocol (5).
An auxiliary process designed specifically to transfer the underlying data is
used for actual data transmission. The client and server processes
automatically generate the name of the program to run from the following rule:
.IP
.BI net_ <dbname>
.PP
where
.I <dbname>
is the name of the Archie catalog with which this data is associated. This
auxiliary program resides in the
.B ~archie/bin
directory. The data exchanged in this manner may be in a `raw' or preprocessed
state depending on the catalog. No assumptions should be made as to the state
of this data. However, the underlying data exchange routines will produce
files appropriately prepared for insertion at some phase of the update cycle.
.RE
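.PP
As a minimal illustration of the naming rule above, the following Python
sketch builds the path of the auxiliary program for a catalog. The function
name transfer_program is hypothetical, and the directory /home/archie is
only an assumed expansion of
.BR ~archie .
.nf

```python
# Hypothetical helper; only the net_<dbname> rule and the ~archie/bin
# location come from the text above.
import os


def transfer_program(dbname: str, archie_home: str) -> str:
    """Return the path of the auxiliary transfer program for a catalog."""
    return os.path.join(archie_home, "bin", "net_" + dbname)


print(transfer_program("anonftp", "/home/archie"))
```
.fi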
.SH FILES
~archie/etc/arupdate.cf,
.br
~archie/db/host_db/*
.SH SEE ALSO
.PP
.BR inetd (8),
.BR arexchange (n),
.BR arretrieve (n),
.BR archie_protocol (5),
Archie system documentation
.SH AUTHOR
Bunyip Information Systems.
.br
Montr\o"\'e"al, Qu\o"\'e"bec, Canada
.sp
Archie is a registered trademark of Bunyip Information Systems Inc., Canada,
1990.

View File

@@ -0,0 +1,53 @@
.\" Copyright (c) 1992, 1994, 1996 Bunyip Information Systems Inc.
.\" All rights reserved.
.\"
.\" Archie 3.5
.\" August 1996
.\"
.\" @(#)convert_hostdb.n
.\"
.TH CONVERT_HOSTDB N "August 1996"
.SH NAME
.B convert_hostdb
\- convert the host_db database from the old Archie-3.2 format to the new
Archie-3.5 format
.SH SYNOPSIS
.B convert_hostdb
[
.BI \-M " <dir>"
] [
.BI \-h " <dir>"
]
.SH DESCRIPTION
.B convert_hostdb
creates a backup of the old host_db database, then creates a database in
the new format, suitable for Archie-3.5 systems.
.SH OPTIONS
.TP
.BI \-M " <dir>"
The name of the master Archie database directory. If not specified, the
program looks in the directory
.BR ~archie/db ,
then
.BR ./db .
.TP
.BI \-h " <dir>"
The name of the Archie host database directory. If not specified, the program
will first try
.BR ~archie/db/host_db ,
then
.BR ./host_db .
.PP
.SH FILES
~archie/db/host_db/*
.SH AUTHOR
Bunyip Information Systems.
.br
Montr\o"\'e"al, Qu\o"\'e"bec, Canada
.sp
Archie is a registered trademark of Bunyip Information Systems Inc., Canada,
1990.

View File

@@ -0,0 +1,125 @@
.\" Copyright (c) 1992, 1994, 1996 Bunyip Information Systems Inc.
.\" All rights reserved.
.\"
.\" Archie 3.5
.\" August 1996
.\"
.\" @(#)db_build.n
.\"
.TH DB_BUILD N "August 1996"
.SH NAME
.B db_build
\- build the strings index file.
.SH SYNOPSIS
.B db_build
[
.BI \-k " <maxsize>"
] [
.BI \-d " <database>"
] [
.B \-f
] [
.BI \-t " <tmp-dir>"
] [
.BI \-M " <dir>"
] [
.BI \-h " <dir>"
] [
.B \-v
] [
.B \-l
] [
.BI \-L " <logfile>"
]
.SH DESCRIPTION
.B db_build
builds the strings index file,
.BR Stridx.Index .
The index file allows faster searching in the strings database. The
file
.B Stridx.Split
contains the size of the initial segment of
.B Stridx.Strings
that is currently indexed. If
.B Stridx.Strings
has grown much larger than the value in
.BR Stridx.Split ,
it is recommended that
.B db_build
be run, to speed up the searching.
.B db_build
will usually take a long time to build an index. The time required
to build the index grows with the size of the
.B Stridx.Strings
file.
.SH OPTIONS
.TP
.BI \-k " <maxsize>"
The approximate, maximum amount of memory to use while building the index.
The value is specified in kilobytes (1024 bytes).
.TP
.BI \-f
Force the building of the index. By default,
.B db_build
checks the sizes of the
.B Stridx.*
files to decide whether or not to rebuild the index.
.TP
.BI \-M " <dir>"
The name of the master Archie database directory. If not specified, the
program looks in the directory
.BR ~archie/db ,
then
.BR ./db .
.TP
.BI \-d " <database>"
The catalog for which to build the index,
.BR Stridx.Index .
By default the program will examine all available catalogs listed in
.BR ~archie/etc/catalogs.cf .
.TP
.BI \-t " <tmp-dir>"
The name of the temporary directory where the build will be performed.
.TP
.BI \-h " <dir>"
The name of the Archie host database directory. If not specified, the program
will first try
.BR ~archie/db/host_db ,
then
.BR ./host_db .
.TP
.BI \-v
Verbose. Print messages indicating what the program is doing.
.TP
.BI \-l
Write messages to the default log file,
.BR ~archie/logs/archie.log .
The location of the log file can be overridden with the
.B \-L
option. By default, messages are written to
.IR stderr .
.TP
.BI \-L " <logfile>"
The name of the file to be used for logging messages. Note that debugging
information is also written to the log file. The
.B \-l
option must also be specified.
.SH FILES
~archie/db/\fI<db-name>\fP/Stridx.*
.SH SEE ALSO
.BR db_stats (n)
.SH AUTHOR
Bunyip Information Systems.
.br
Montr\o"\'e"al, Qu\o"\'e"bec, Canada
.sp
Archie is a registered trademark of Bunyip Information Systems Inc., Canada,
1990.

View File

@@ -0,0 +1,117 @@
.\" Copyright (c) 1992, 1994, 1996 Bunyip Information Systems Inc.
.\" All rights reserved.
.\"
.\" Archie 3.5
.\" August 1996
.\"
.\" @(#)db_check.n
.\"
.TH DB_CHECK N "August 1996"
.SH NAME
.B db_check
\- Check the consistency of the Archie databases.
.SH SYNOPSIS
.B db_check
[
.BI \-H " <hostname>"
] [
.BI \-p " <port>"
] [
.BI \-k " <keyword>"
] [
.BI \-d " <database>"
] [
.BI \-M " <dir>"
] [
.BI \-h " <dir>"
] [
.BI \-v
] [
.B \-l
] [
.BI \-L " <logfile>"
]
.SH DESCRIPTION
.B db_check
performs a quick scan over the database files and prints out statistical
information. Numbers indicating the size of each site file and the
different types of records contained are printed.
.SH OPTIONS
.TP
.BI \-H " <hostname>"
The host name of the site about which you want statistics.
.TP
.BI \-p " <port>"
The port number of the site, if it is different from the
default value.
.TP
.BI \-k " <keyword>"
One can check for the existence of a certain word in any
site.
.B db_check
will produce a list of host indices containing this word.
.TP
.BI \-M " <dir>"
The name of the master Archie database directory. If not
specified, the program looks in the directory
.BR ~archie/db ,
then
.BR ./db .
.TP
.BI \-d " <database>"
The catalog about which to print statistics.
By default, the program will examine all available catalogs listed in
.BR ~archie/etc/catalogs.cf .
.TP
.BI \-h " <dir>"
The name of the Archie host database directory. If not
specified, the program will first try
.BR ~archie/db/host_db ,
then
.BR ./host_db .
.TP
.BI \-v
Verbose mode. Print messages indicating what the program is
doing.
.TP
.BI \-l
Log messages to the file
.BR ~archie/logs/archie.log .
The location of the file may be overridden with the
.B \-L
option. By default, messages are written to
.IR stderr .
.TP
.BI \-L " <logfile>"
Specify the log file. For this to have any effect, the
.B \-l
option must be specified, as well. Note that debugging information is also
written to the log file.
.SH DIAGNOSTICS
.B db_check
gives a variety of warning and error messages if the catalog is found to be
inconsistent. Messages containing the string `ERROR' are considered serious
and are usually the sign of a corrupt catalog.
.SH FILES
~archie/db/<db-name>/*
.br
~archie/db/<db-name>/start_db/*
.br
~archie/db/host_db/*
.SH SEE ALSO
.BR db_stats (n)
.SH AUTHOR
Bunyip Information Systems.
.br
Montr\o"\'e"al, Qu\o"\'e"bec, Canada
.sp
Archie is a registered trademark of Bunyip Information Systems Inc., Canada,
1990.

View File

@@ -0,0 +1,99 @@
.\" Copyright (c) 1992, 1994, 1996 Bunyip Information Systems Inc.
.\" All rights reserved.
.\"
.\" Archie 3.5
.\" August 1996
.\"
.\" @(#)db_dump.n
.\"
.TH DB_DUMP N "August 1996"
.SH NAME
.B db_dump
\- dump the list of available sites, in the database, in ASCII format.
.SH SYNOPSIS
.B db_dump
[
.BI \-H " <hostname>"
] [
.BI \-d " <database>"
] [
.BI \-M " <dir>"
] [
.BI \-h " <dir>"
] [
.B \-v
] [
.B \-l
] [
.BI \-L " <logfile>"
]
.SH DESCRIPTION
.B db_dump
prints a brief message about each site indexed in any database.
The information includes host name, port number, IP address and the site index
number. The site index number is the number by which the site is known to
the database.
.B db_dump
is useful for a quick listing of the database in question.
.SH OPTIONS
.TP
.BI \-H " <hostname>"
The host name of the site about which you want statistics.
.TP
.BI \-M " <dir>"
The name of the master Archie database directory. If not
specified, the program looks in the directory
.BR ~archie/db ,
then
.BR ./db .
.TP
.BI \-d " <database>"
The catalog about which to print information.
By default, the program will examine all available catalogs listed in
.BR ~archie/etc/catalogs.cf .
.TP
.BI \-h " <dir>"
The name of the Archie host database directory. If not
specified, the program will first try
.BR ~archie/db/host_db ,
then
.BR ./host_db .
.TP
.B \-v
Verbose mode. Print messages indicating what the program is
doing.
.TP
.B \-l
Log messages to the file
.BR ~archie/logs/archie.log .
The location of the file may be overridden with the
.B \-L
option. By default, messages are written to
.IR stderr .
.TP
.BI \-L " <logfile>"
Specify the log file. For this to have any effect, the
.B \-l
option must be specified, as well. Note that debugging
information is also written to the log file.
.SH FILES
~archie/db/\fI<db-name>\fP/*
.SH SEE ALSO
.BR db_stats (n)
.SH AUTHOR
Bunyip Information Systems.
.br
Montr\o"\'e"al, Qu\o"\'e"bec, Canada
.sp
Archie is a registered trademark of Bunyip Information Systems Inc., Canada,
1990.

View File

@@ -0,0 +1,116 @@
.\" Copyright (c) 1992, 1994, 1996 Bunyip Information Systems Inc.
.\" All rights reserved.
.\"
.\" Archie 3.5
.\" August 1996
.\"
.\" @(#)db_reorder.n
.\"
.TH DB_REORDER N "August 1996"
.SH NAME
.B db_reorder
\- fix the order of sites in the start_db database according to the
domain.order file.
.SH SYNOPSIS
.B db_reorder
[
.BI \-H " <hostname>"
] [
.BI \-d " <database>"
] [
.BI \-M " <dir>"
] [
.BI \-h " <dir>"
] [
.B \-v
] [
.B \-l
] [
.BI \-L " <logfile>"
]
.SH DESCRIPTION
For each string in the
.B start_db
database,
.B db_reorder
sorts the list of associated hosts according to the contents of
.BR ~archie/etc/domain.order .
This file specifies the order in which hosts are listed, in response
to a search, when multiple hosts are associated with a match.
.PP
.B db_reorder
must be run in order for changes to
.B domain.order
to take effect.
The
.B domain.order
file contains one or more lines, each of which is a colon (`:') separated list of
domains, or pseudo-domains. Hosts belonging to domains on lines nearer the
start of the file will appear first in search results. The order of domains
within a single line has no effect on the result.
.PP
For example, if the
.B domain.order
file contains:
.nf
.IP
\fCca:edu
org:com\fP
.fi
.PP
then hosts in `.ca' and `.edu' domains will appear before those in the `.org'
and `.com' domains.
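.PP
The ordering rule can be sketched in a few lines of Python. This is an
illustration, not the algorithm used by the program: the names load_order
and reorder are hypothetical, only the last label of each host name is
compared, pseudo-domains are not expanded, and placing unlisted domains
last is an assumption.
.nf

```python
# Illustrative sketch only; the function names are hypothetical and
# pseudo-domains (which need the ardomains definitions) are not handled.
from typing import Dict, List


def load_order(text: str) -> Dict[str, int]:
    """Map each domain to the number of the line on which it appears."""
    rank = {}
    for lineno, line in enumerate(text.splitlines()):
        for domain in line.strip().split(":"):
            if domain:
                rank.setdefault(domain.lower(), lineno)
    return rank


def reorder(hosts: List[str], rank: Dict[str, int]) -> List[str]:
    """Stable sort of hosts by the rank of their last domain label.

    Hosts whose domain is not listed are placed last (an assumption).
    """
    last = len(rank)

    def key(host: str) -> int:
        return rank.get(host.rsplit(".", 1)[-1].lower(), last)

    return sorted(hosts, key=key)


rank = load_order("ca:edu" + chr(10) + "org:com")
print(reorder(["a.com", "b.edu", "c.org", "d.ca"], rank))
```
.fi
.PP
Because the sort is stable and every domain on one line receives the same
rank, the relative order of `.ca' and `.edu' hosts is left unchanged, in
keeping with the statement that the order within a line has no effect.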
.SH OPTIONS
.TP
.BI \-M " <dir>"
The name of the master Archie database directory. If not specified, the
program first looks in the directory
.BR ~archie/db ,
then
.BR ./db .
.TP
.BI \-d " <database>"
The catalog for which the corresponding start_db database will be reordered. By
default, the program will reorder all catalogs listed in
.BR ~archie/etc/catalogs.cf .
.TP
.BI \-h " <dir>"
The name of the Archie host database directory. If not specified, the program
will first look in
.BR ~archie/db/host_db ,
then
.BR ./host_db .
.TP
.B \-v
Verbose mode. Print messages indicating what the program is doing.
.TP
.B \-l
Log messages to the file
.BR ~archie/logs/archie.log .
The location of the file may be overridden with the
.B \-L
option. By default, messages are written to
.IR stderr .
.TP
.BI \-L " <logfile>"
Specify the log file. For this to have any effect, the
.B \-l
option must be specified, as well. Note that debugging information is also
written to the log file.
.SH FILES
~archie/db/<db-name>/start_db/*
.br
~archie/etc/domain.order
.SH AUTHOR
Bunyip Information Systems.
.br
Montr\o"\'e"al, Qu\o"\'e"bec, Canada
.sp
Archie is a registered trademark of Bunyip Information Systems Inc., Canada,
1990.

View File

@@ -0,0 +1,101 @@
.\" Copyright (c) 1992, 1994, 1996 Bunyip Information Systems Inc.
.\" All rights reserved.
.\"
.\" Archie 3.5
.\" August 1996
.\"
.\" @(#)db_siteidx.n
.\"
.TH DB_SITEIDX N "August 1996"
.SH NAME
.B db_siteidx
\- create the site index file for a site.
.SH SYNOPSIS
.B db_siteidx
[
.BI \-H " <hostname>"
] [
.BI \-p " <port>"
] [
.BI \-d " <database>"
] [
.BI \-I " <size>"
] [
.BI \-M " <dir>"
] [
.BI \-h " <dir>"
] [
.B \-v
] [
.B \-l
] [
.BI \-L " <logfile>"
]
.SH DESCRIPTION
.B db_siteidx
creates an index corresponding to a site. The index file has the same name as
the site file, but with a `.idx' suffix. A site index file is not necessary,
but it speeds up searches in its associated site. Therefore, larger site
files will benefit from an index.
.SH OPTIONS
.TP
.BI \-H " <hostname>"
The host name about which you want statistics.
.TP
.BI \-p " <port>"
The port number of the site if it is different from the default value.
.TP
.BI \-M " <dir>"
The name of the master Archie database directory. If not specified, the
program looks in the directory
.BR ~archie/db ,
then
.BR ./db .
.TP
.BI \-d " <database>"
The catalog where the site exists.
.TP
.BI \-I " <size>"
Set the minimum size for a site file to be indexed. The size is in bytes. If
the size of the site file is greater than or equal to this size, a `.idx' file will
accompany this site file to speed up searches in it. By default, this size is
500000 bytes.
.TP
.BI \-h " <dir>"
The name of the Archie host database directory. If not specified, the program
will first try
.BR ~archie/db/host_db ,
then
.BR ./host_db .
.TP
.B \-v
Verbose. Print debugging information.
.TP
.B \-l
Write any user output to the default log file
.BR ~archie/logs/archie.log .
If desired, this can be overridden with the
.B \-L
option. Errors will by default be written to
.IR stderr .
.TP
.BI \-L " <logfile>"
The name of the file to be used for logging information. Note that debugging
information is also written to the log file. This implies the
.B \-l
option, as well.
.SH FILES
~archie/db/\fI<database name>\fP/*
.SH AUTHOR
Bunyip Information Systems.
.br
Montr\o"\'e"al, Qu\o"\'e"bec, Canada
.sp
Archie is a registered trademark of Bunyip Information Systems Inc., Canada,
1990.

View File

@@ -0,0 +1,108 @@
.\" Copyright (c) 1992, 1994, 1996 Bunyip Information Systems Inc.
.\" All rights reserved.
.\"
.\" Archie 3.5
.\" August 1996
.\"
.\" @(#)db_stats.n
.\"
.TH DB_STATS N "August 1996"
.SH NAME
.B db_stats
\- print statistics about Archie databases.
.SH SYNOPSIS
.B db_stats
[
.BI \-H " <hostname>"
] [
.BI \-p " <port>"
] [
.BI \-d " <database>"
] [
.BI \-M " <dir>"
] [
.BI \-h " <dir>"
] [
.B \-v
] [
.B \-l
] [
.BI \-L " <logfile>"
]
.SH DESCRIPTION
.B db_stats
performs a quick scan over the database files and prints out statistical
information. Basic numbers reflecting the size of each site file and the
number of different types of records contained are calculated. It also
gives general information regarding each database like the total size
of indexed files and so on.
.BR db_stats ,
when run on the webindex catalog, reports the number of URLs on each site and
the number of indexable URLs. An example of a non-indexable URL is a gif file
or a binary file.
.SH OPTIONS
.TP
.BI \-H " <hostname>"
The host name of the site about which you want statistics. By default
.B db_stats
will examine all sites.
.TP
.BI \-p " <port>"
The port number of the site, if it is different from the default value.
.TP
.BI \-d " <database>"
The catalog about which to print statistics. By default, the program will
examine all available catalogs listed in
.BR ~archie/etc/catalogs.cf .
.TP
.BI \-M " <dir>"
The name of the master Archie database directory. If not specified, the
program looks in the directory
.BR ~archie/db ,
then
.BR ./db .
.TP
.BI \-h " <dir>"
The name of the Archie host database directory. If not specified, the program
will first try
.BR ~archie/db/host_db ,
then
.BR ./host_db .
.TP
.B \-v
Verbose mode. Print messages indicating what the program is doing.
.TP
.B \-l
Log messages to the file
.BR ~archie/logs/archie.log .
The location of the file may be overridden with the
.B \-L
option. By default, messages are written to
.IR stderr .
.TP
.BI \-L " <logfile>"
Specify the log file. For this to have any effect, the
.B \-l
option must be specified, as well. Note that debugging information is also
written to the log file.
.SH FILES
~archie/db/\fI<database name>\fP/*,
.br
~archie/db/host_db/*
.SH SEE ALSO
.BR db_check (n)
.SH AUTHOR
Bunyip Information Systems.
.br
Montr\o"\'e"al, Qu\o"\'e"bec, Canada
.sp
Archie is a registered trademark of Bunyip Information Systems Inc., Canada,
1990.

View File

@@ -0,0 +1,118 @@
.\" Copyright (c) 1992, 1994, 1996 Bunyip Information Systems Inc.
.\" All rights reserved.
.\"
.\" Archie 3.5
.\" August 1996
.\"
.\" @(#)delete_anonftp.n
.\"
.TH DELETE_ANONFTP N "August 1996"
.SH NAME
.B delete_anonftp
\- delete a site from the Archie anonftp database
.SH SYNOPSIS
.B delete_anonftp
.BI \-H " <hostname>"
[
.BI \-M " <dir>"
] [
.BI \-h " <dir>"
] [
.BI \-p " <port>"
] [
.BI \-w " <dir>"
] [
.BI \-t " <tmp>"
] [
.B \-v
] [
.B \-l
] [
.BI \-L " <logfile>"
]
.SH OPTIONS
.TP
.BI \-H " <hostname>"
The fully qualified domain name or IP address, in standard `quad' or `dotted
decimal' format, of the site to be deleted. This name must already exist in
the Archie host database or be resolvable via the Domain Name System. This
parameter is mandatory.
.TP
.BI \-M " <dir>"
The name of the master Archie database directory. If not specified, the
program looks in the directory
.BR ~archie/db ,
then
.BR ./db .
.TP
.BI \-w " <dir>"
The name of the directory in which the Archie anonftp catalog resides. This
parameter overrides the default catalog name,
.BR ~archie/db/anonftp ,
as well as the
.B \-M
option.
.TP
.BI \-h " <dir>"
The name of the Archie host database directory. If not specified, the program
first looks in
.BR ~archie/db/host_db ,
then
.BR ./host_db .
.TP
.BI \-p " <port>"
The port number at the FTP site. The default value is 21.
.TP
.B \-v
Verbose mode. Print messages indicating what the program is doing.
.TP
.B \-l
Log messages to the file
.BR ~archie/logs/archie.log .
The location of the file may be overridden with the
.B \-L
option. By default, messages are written to
.IR stderr .
.TP
.BI \-L " <logfile>"
Specify the log file. For this to have any effect, the
.B \-l
option must be specified, as well. Note that debugging information is also
written to the log file.
.SH DESCRIPTION
.B delete_anonftp
modifies and deactivates the appropriate entry in the auxiliary
host database. The program issues an error message and exits if
the specified host does not exist in the anonftp catalog. In normal
operation this program is only invoked by the
.BR update_anonftp (n)
program.
If invoking this program from the command line, care should be taken to ensure
that no other processes are modifying the anonftp catalog. The exclusive
locking mechanism provided by the
.BR update_anonftp (n)
program, which invokes
.B delete_anonftp
in normal operation, will not be available in the command line invocation.
.SH FILES
~archie/db/host_db/*
.br
~archie/db/anonftp/*
.SH "SEE ALSO"
.BR update_anonftp (n),
.BR insert_anonftp (n)
.SH AUTHOR
Bunyip Information Systems.
.br
Montr\o"\'e"al, Qu\o"\'e"bec, Canada
.sp
Archie is a registered trademark of Bunyip Information Systems Inc., Canada,
1990.

View File

@@ -0,0 +1,121 @@
.\" Copyright (c) 1992, 1994, 1996 Bunyip Information Systems Inc.
.\" All rights reserved.
.\"
.\" Archie 3.5
.\" August 1996
.\"
.\" @(#)delete_webindex.n
.\"
.TH DELETE_WEBINDEX N "August 1996"
.SH NAME
.B delete_webindex
\- delete a site from the Archie webindex database
.SH SYNOPSIS
.B delete_webindex
.BI \-H " <hostname>"
[
.BI \-M " <dir>"
] [
.BI \-h " <dir>"
] [
.BI \-p " <port>"
] [
.BI \-w " <dir>"
] [
.B \-v
] [
.B \-l
] [
.BI \-L " <logfile>"
]
.SH OPTIONS
.TP
.BI \-H " <hostname>"
The fully qualified domain name or IP address, in standard `quad' or `dotted
decimal' format, of the site to be deleted. This name must already exist in
the Archie host database, or be resolvable via the Domain Name System. This
parameter is required.
.TP
.BI \-M " <dir>"
The name of the master Archie database directory. By default, the program
first looks in the directory
.BR ~archie/db ,
then
.BR ./db .
.TP
.BI \-w " <dir>"
The name of the directory in which the Archie webindex catalog resides. This
parameter overrides the default catalog name,
.BR ~archie/db/webindex ,
as well as the
.B \-M
option.
.TP
.BI \-h " <dir>"
The name of the Archie host database directory. By default, the program first
looks in
.BR ~archie/db/host_db ,
then
.BR ./host_db .
.TP
.BI \-p " <port>"
The port number at the Web site. The default value is 80.
.TP
.B \-v
Verbose mode. Debugging information is printed to
.IR stderr ,
or to the log file, if
.B \-l
is specified.
.TP
.B \-l
Write output to the default log file
.BR ~archie/logs/archie.log .
The name of this file can be overridden with the
.B \-L
option. By default, errors are written to
.IR stderr .
.TP
.BI \-L " <logfile>"
The name of the log file. This option has no effect unless the
.B \-l
option is also specified. Note that debugging information is also written to
the log file.
.SH DESCRIPTION
.B delete_webindex
modifies and deactivates the appropriate entry in the auxiliary host database.
If the specified host does not exist in the webindex catalog, the program
prints an error message, then exits. Normally, this program is only invoked
by
.BR update_webindex (n).
.PP
When running
.B delete_webindex
from the command line, one should ensure
that no other processes are modifying the webindex catalog.
The exclusive locking mechanism provided by
.BR update_webindex (n)
is not available from the command line.
.SH FILES
~archie/db/host_db/*
.br
~archie/db/webindex_db/*
.br
~archie/db/webindex_db/start_db/*
.SH "SEE ALSO"
.BR update_webindex (n),
.BR insert_webindex (n)
.SH AUTHOR
Bunyip Information Systems.
.br
Montr\o"\'e"al, Qu\o"\'e"bec, Canada
.sp
Archie is a registered trademark of Bunyip Information Systems Inc., Canada,
1990.

View File

@@ -0,0 +1,179 @@
.\" Copyright (c) 1993, 1994, 1996 Bunyip Information Systems Inc.
.\" All rights reserved.
.\"
.\" Archie 3.5
.\" August 1996
.\"
.\" @(#)handle_header.n
.\"
.TH HANDLE_HEADER N "August 1996"
.SH NAME
.B handle_header
\- perform Archie header modifications and update host databases
.SH SYNOPSIS
.B handle_header
[
.BI \-M " <dir>"
] [
.BI \-h " <dir>"
] [
.BI \-H " <header string>"
] [
.B \-U
] [
.BI \-a " <filename>"
] [
.BI \-d " <filename>"
] [
.BI \-m " <message>"
] [
.BI \-r " <header fieldname>"
] [
.BI \-p " <header fieldname>"
] [
.B \-s
]
.SH DESCRIPTION
This program is a temporary measure to provide administrators with a method of
manipulating data files with Archie headers. The program always reads from
stdin and writes to stdout (which can be suppressed). If invoked without
arguments, it behaves like
.BR cat (1),
except that it expects the incoming data to begin with an Archie header and
reports an error otherwise.
.SH OPTIONS
.TP
.BI \-M " <dir>"
The name of the master Archie database directory. If not given, the program
tries to look in the directory
.B ~archie/db
and, failing that, defaults to
.BR ./db .
.TP
.BI \-h " <dir>"
The name of the Archie host database directory. If not supplied the program
will default first to
.BR ~archie/db/host_db ,
and failing that, to
.BR ./host_db .
.TP
.BI \-d " <filename>"
Take the incoming data, strip off the Archie header, writing it to
.IR <filename> .
The rest of the data is written to
.IR stdout .
.TP
.BI \-a " <filename>"
Copy
.I stdin
to
.IR stdout ,
prepending the Archie header stored in
.IR <filename> .
This option, with
.BR \-d ,
allows one to temporarily store the headers while the main body of the data is
being processed, then later restore the headers.
.TP
.BI \-H " <header string>"
Modifies the Archie header that is written to the output. Refer to
.BR archie_headers (5)
for the header field names. The syntax for
.I <header string>
is:
.RS
.RS
.sp
.IR "<fieldname> <value>" " [; " "<fieldname> <value>" " ] ..."
.sp
.RE
.RE
.RS
Note:
.I <header string>
is a single parameter and must be quoted in shell scripts to avoid being
interpreted as multiple arguments. All the fields listed in
.BR archie_headers (5)
are supported except `primary_hostname' and `primary_ipaddr', to which changes
are not allowed. Any number of these fields may be modified. In the case of
duplicates an error will be flagged. For the time-based fields (`parse_time',
`retrieve_time', `update_time') the special word `now' may be used instead of
the time. The program will take the current time and insert it into the
field. Otherwise, the time must be specified in YYYYMMDDHHMMSS format, GMT
(\fBdate\fP(1) can provide this format).
.RE
.TP
.B \-s
Suppress the normal action of copying the rest of the (non-header)
.I stdin
data to
.IR stdout .
.TP
.BI \-p " <header fieldname>"
Implies
.BR \-s .
Print the value of the
.I <header fieldname>
to
.IR stdout .
One can extract particular values in the header with, for example:
.sp
.RS
.RS
\fCcat file | (set x=`handle_header -p retrieve_time`)\fP
.RE
.RE
.sp
.RS
This sets the
.BR csh (1)
variable `x' to the value of `retrieve_time' in YYYYMMDDHHMMSS string format.
Note that this value would not be exported to the calling shell.
.RE
.TP
.BI \-r " <header fieldname>"
Removes the given
.IR "<header fieldname>" ,
and its associated value, from the header being manipulated.
.TP
.BI \-m " <message>"
Tells the program to write
.I <message>
to the Archie mail notification system (see
.SM "Monitoring The System"
in the Archie manual).
.I <message>
must be supplied. The required information will be extracted from the current
header. For example, if the `update_status' field is `fail' then a failure
message will be written.
.TP
.B \-U
Update the record in the host databases. The standard DNS host name checks
will be done. Note that it is the responsibility of the update program to do
any file locking required to prevent concurrent updates. The
.B arcontrol
program will run
.B retrieve_*
processes in parallel, but
.B parse_*
and
.B update_*
processes are run in sequence. The program will mark all inserted records as
`ACTIVE'.
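.PP
The syntax of the
.I <header string>
accepted by the
.B \-H
option can be sketched as follows. The Python fragment below is only an
illustration of the rules stated for that option, not the code used by
.BR handle_header .
The function name parse_header_string is hypothetical, and the sketch
assumes that a value may contain white space but no semicolon.
.nf

```python
# Illustrative sketch only: parse_header_string() is a hypothetical name
# and does not reproduce the actual handle_header implementation.
import time
from typing import Dict

READ_ONLY = {"primary_hostname", "primary_ipaddr"}
TIME_FIELDS = {"parse_time", "retrieve_time", "update_time"}
TIME_FORMAT = "%Y%m%d%H%M%S"  # YYYYMMDDHHMMSS


def parse_header_string(arg: str) -> Dict[str, str]:
    """Parse a string of <fieldname> <value> pairs separated by ';'."""
    fields = {}
    for part in arg.split(";"):
        pieces = part.split(None, 1)
        if len(pieces) != 2:
            raise ValueError("expected <fieldname> <value>: %r" % part)
        name, value = pieces[0], pieces[1].strip()
        if name in READ_ONLY:
            raise ValueError("changes to %s are not allowed" % name)
        if name in fields:
            raise ValueError("duplicate field: %s" % name)
        if name in TIME_FIELDS:
            if value == "now":
                # Substitute the current time, expressed in GMT.
                value = time.strftime(TIME_FORMAT, time.gmtime())
            else:
                # Raises ValueError unless the value is YYYYMMDDHHMMSS.
                time.strptime(value, TIME_FORMAT)
        fields[name] = value
    return fields


print(parse_header_string("update_status fail; retrieve_time 19920703162322"))
```
.fi
.PP
Since the whole string is one parameter, a shell invocation would quote
it, which is why the sketch receives a single argument and splits it at
the semicolons itself.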
.SH "SEE ALSO"
.BR retrieve_* (n),
.BR parse_* (n),
.BR update_* (n)
.SH AUTHOR
Bunyip Information Systems.
.br
Montr\o"\'e"al, Qu\o"\'e"bec, Canada
.sp
Archie is a registered trademark of Bunyip Information Systems Inc., Canada,
1990.

View File

@@ -0,0 +1,552 @@
.\" Copyright (c) 1992, 1994, 1996 Bunyip Information Systems Inc.
.\" All rights reserved.
.\"
.\" Archie 3.5
.\" August 1996
.\"
.\" @(#)host_manage.n
.\"
.TH HOST_MANAGE N "August 1996"
.SH NAME
.B host_manage
\- administrative management of the Archie host databases
.SH SYNOPSIS
.B host_manage
[
.BI \-M " <dir>"
] [
.BI \-h " <dir>"
] [
.BI \-C " <config>"
] [
.BI \-D " <domain list>"
] [
.BI \-H " <source hostname>"
] [
.I <sitename>
]
.SH DESCRIPTION
.PP
.B host_manage
is a
.BR curses (3)
based program which allows the Archie administrator to view and change the
contents of the host databases.
.SH OPTIONS
.TP
.BI \-M " <dir>"
The name of the master Archie database directory. If not given, the program
tries to look in the directory
.B ~archie/db
and, failing that, defaults to
.BR ./db .
.TP
.BI \-h " <dir>"
The name of the Archie host database directory. If not supplied the program
will default first to
.B ~archie/db/host_db
and failing that, to
.BR ./host_db .
.TP
.BI \-C " <config>"
The name of the file containing configuration information for displaying the
catalog access methods. Without this flag, the default used is
.BR ~archie/etc/hm_db.cf .
.TP
.BI \-D " <domain list>"
Restrict the display to only those sites in the database which belong to the
colon separated list of Archie pseudo-domains given by
.IR <domain list> .
.TP
.BI \-H " <source hostname>"
Use the given
.I <source hostname>
as the value for the `source' field on the display, when adding a new host,
overriding the default value returned by
.BR gethostname (2).
.TP
.I <sitename>
The information about the given site will be displayed upon entry into the
program.
.SH DISPLAY
.PP
.B host_manage
displays the complete information on each site stored in the Archie host
databases. Note that a number of the fields on the display are for
informational value only and may not be directly modified by the user. These
fields are marked below by an asterisk (`*').
.PP
The screen is partitioned into 4 main areas.
.IP 1)
Information about the site itself. This is composed of:
.RS
.RS
.TP
.B Primary Hostname
The primary DNS name for the site. This is used internally by the Archie
system.
.TP
.B Preferred Name
A valid DNS CNAME record for the site, if one exists. For example, many FTP
archive sites use a CNAME record of the form ftp.\fI<domain>\fP.
.TP
.BR Catalogs *
The list of the auxiliary catalogs currently associated with this site.
.TP
.BR "Primary IP address" *
The valid DNS IP address; used internally by the Archie system.
.TP
.B Operating System
The operating system of the site, if known.
.TP
.B Prospero Host
Whether or not the site runs a Prospero system daemon.
.TP
.B Timezone
The timezone in which the site is located. This is displayed as a signed
quantity in hours and minutes (in 15 minute increments). Negative values
represent timezones to the west of the Greenwich meridian, positive
values to the east.
.RE
.RE
.IP 2)
Information about the various (auxiliary) databases of this site.
.RS
.RS
.TP
.B Catalog
The name of this catalog entry.
.TP
.BR Source *
The Archie host responsible for monitoring this catalog at this site.
.TP
.BR "Generated By" *
Shows which stage in the Archie update cycle was responsible for this
record. This is useful for entries for which there was some error in
processing.
.TP
.BR "Retrieve Time" *
The time that the Data Acquisition phase acquired the information. Converted to
local time for display.
.TP
.BR "Parse Time" *
The time that the information was parsed. Converted to local time for display.
.TP
.BR "Update Time" *
The time that the information was updated in the catalog. Converted to local
time for display.
.TP
.BR "Number of Records" *
The number of records stored. The definition of `record' is catalog dependent.
.TP
.BR "Fail Count" *
The number of times that update of this site/catalog combination has been
attempted and failed.
.TP
.BR Msg *
Contains information generated by the Archie subsystems which may be of use to
the Archie administrator. This may include such things as reasons for update
failure, etc.
.RE
.RE
.IP 3)
Information concerning the commands used to retrieve the data. Database
dependent. See the section
.SM "Access Display Configuration"
for a description of this display.
.IP 4)
Ancillary information.
.RS
.RS
.TP
.B Current Status
The current status of this catalog in the system. Valid values are:
.RS
.RS
.TP
.B Active
Entry currently active.
.TP
.B Inactive
Entry temporarily rendered inactive by the Archie subsystems. Data files have been
purged from the system.
.TP
.B Not Supported
This site does not support this catalog type.
.TP
.B Scheduled for deletion by local administrator
This catalog has been rendered `permanently' inactive by the Archie
administrator.
.TP
.B Scheduled for deletion by Archie
Usually means that the data is out of date and has been scheduled for
deletion.
.TP
.B Disabled
Temporarily disabled by the local Archie administrator.
.TP
.B Deleted
The associated data files have been removed and the record will be removed
from the host databases at a later date by the
.BR host_manage (n)
program.
.RE
.RE
.TP
.BR "Force Update" *
The data for this site has been scheduled for early update.
.TP
.BR "Action Status" *
Current action being performed on this record. Values are:
.RS
.RS
.TP
.B Add
For a new entry
.TP
.B Update
To modify existing data
.RE
.RE
.RE
.RE
.SH EDITING
.PP
The command set for editing is based on the standard emacs key bindings.
Currently these are not user configurable.
.RS
.PP
.B Cursor Movement
.RS
.TP
.PD .1v
.B ^A
Beginning of line
.TP
.B ^E
End of line
.TP
.B ^N
Next field
.TP
.B ^P
Previous field
.TP
.B ^F
Next character
.TP
.PD 1v
.B ^B
Previous character
.RE
.PP
.B Editing
.RS
.PP
The program is always in insert mode.
.TP
.PD .1v
.B ^K
Kill to end of line (into yank buffer)
.TP
.B ^Y
Insert from yank buffer
.TP
.B ^D
Delete character at cursor
.TP
.B ^H
Delete previous character
.TP
.PD 1v
.B <DEL>
Delete previous character
.RE
.PP
.B Viewing/Modifying the Data
.RS
.TP
.PD .1v
.B <space>
Next value
.TP
.B <ESC>
Previous value
.TP
.B <TAB>
Display modification menu. See below.
.TP
.B ^G
Bring up host list display. See below.
.TP
.B ^U
Update current entry
.TP
.PD 1v
.B <return>
Go to (display record for) given entry.
.RE
.PP
.B Miscellaneous
.RS
.TP
.PD .1v
.B ^L
Refresh Screen
.TP
.PD 1v
.B ^C
Exit program
.RE
.RE
.PP
Some fields on the display are read-only. The cursor will not enter these
fields.
.PP
At any time the user may specify the name of a site in the `Primary Hostname'
field, then type
.BR <return> .
This will bring up the record for that site if it is found in the database.
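.PP
As a rough illustration of the caret notation used in the tables above, the
following Python sketch maps a few of the listed control keys to their ASCII
codes and looks up the associated action. The names ctrl, BINDINGS and
describe are hypothetical and are not part of host_manage; the sketch makes no
claim about how the program itself reads its input.
.PP
.nf
```python
def ctrl(letter):
    """Return the ASCII code sent by Control plus the given letter."""
    return ord(letter.upper()) & 0x1F


# A few of the bindings documented above, keyed by ASCII code.
BINDINGS = {
    ctrl("A"): "beginning of line",
    ctrl("E"): "end of line",
    ctrl("K"): "kill to end of line",
    ctrl("Y"): "insert from yank buffer",
    ctrl("H"): "delete previous character",
    0x7F: "delete previous character",  # <DEL>
    ctrl("U"): "update current entry",
}


def describe(code):
    """Return the action bound to a key code, or None if unbound."""
    return BINDINGS.get(code)
```
.fi
.PP
A table keyed by code keeps the lookup independent of how the key is written
in the documentation (^H and <DEL> share one action).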
.SH HOST LIST DISPLAY
This multi-column display allows one to browse through all the host names
stored in the database. The key bindings are a subset of those on the main
display. All other keys are ignored.
.RS
.RS
.TP
.B <blank>
Forward one page
.TP
.B <ESC>
Backwards one page
.TP
.B ^F
Forward one column.
.TP
.B ^N
Down one entry.
.TP
.B ^P
Up one entry.
.TP
.B ^B
Backward one column.
.TP
.B <return>
Go back to main display and show full record of current (highlighted)
entry.
.TP
.B ^
Go to first host in database.
.TP
.B $
Go to last host in database.
.TP
.B ^G
Return to main display. The record last shown on the main display is
redisplayed.
.SH "MODIFICATION MENU"
This menu allows the user to modify the status of the site entry in the host
databases. The following alternatives are provided:
.RS
.RS
.LP
Add site to host database
.IP
This option clears the values in the fields (with the exception of the
`source' field) and allows the user to enter a new site into the host
databases. All normal cursor movement is permitted. After the information has
been entered, type the `update' key (by default ^U) to add the new entry to
the database.
Note that no default is provided for the `Database' field. This should be
filled with the name of the initial catalog for this host (for example,
`anonftp' or `webindex').
.LP
Delete a site from the host database
.IP
Since the Archie catalogs (such as anonftp) depend on the host databases for
host name resolution and other information, it is not currently possible to
immediately delete a site or Archie catalog entry. As a result, the method
used to remove a site or catalog entry from the host databases involves first
marking it for deletion. The system will process this information as part of
its normal operations. When the appropriate files (such as sites files in the
anonftp catalog) have been removed, the catalog entry is marked as
`Deleted'. After all catalogs associated with a host have been so marked, the
site itself can be physically removed from the host databases. This command
will perform this function.
.LP
Add Archie catalog to auxiliary host database
.IP
This command will clear the auxiliary host database area of the screen and
permit the addition of a new auxiliary database name. Type the update key (by
default, ^U) to complete the addition.
.LP
Delete Archie catalog from auxiliary host database
.IP
Use this command to mark an auxiliary database for deletion.
.LP
Reactivate a catalog
.IP
The system or administrator can disable or mark for deletion a catalog for a
particular site. This option causes the current entry to be placed on `ACTIVE'
status.
.LP
Force early update of a site/catalog entry
.IP
Sometimes it is desirable to force the early update of a particular site or
catalog entry, before the time at which it would normally be scheduled. This
command will mark the current catalog for early update.
.LP
Force deletion of the current catalog
.IP
This is a dangerous operation, but may, in some circumstances, be
necessary. For example, since arbitrary catalog names may be used, a
typographical error (such as typing `webindex' as `webinex') can cause an
incorrect entry into the host databases. This option allows the Archie
administrator to forcibly remove such a catalog without the standard checks
being performed.
.LP
.SH "ACCESS DISPLAY CONFIGURATION"
.PP
Internally, the generic access method for each Archie catalog type is
stored as a sequence of values (fields) separated by colons. Since each
catalog type may have a different ordering and meaning for these fields,
the host_manage program allows the administrator, for convenience, to
define names and default values for these fields.
.PP
For this purpose there is a configuration file associated with the
program. By default the file is
.B ~archie/etc/hm_db.cf
unless overridden by the
.B \-C
command line option. This describes how the various
catalog-specific access methods are to be displayed on the screen.
.PP
The file is formatted as
.IP
.I <catalog name>
{
.I field configurations
}
.PP
Each field configuration is composed of:
.LP
<field name>,<maximum field width>,<default field value>,<modify>
.RS
.TP
.I <field name>
A string giving the name of the field to be
displayed. This entry is mandatory.
.TP
.I <maximum field width>
A number specifying the maximum number of characters
the field value may have. This entry is mandatory.
.TP
.I <default field value>
Optional. A string giving the default value of the
field should it be empty in the data.
.TP
.I <modify>
Optional. Indicates whether the Archie administrator is
allowed to modify this display field or not. Must either be the
character
.B W
or left empty.
.RE
.PP
Multiple field configurations are delimited by a colon ":". Entries for
the optional fields may be omitted. However an empty entry followed by
other optional fields must have its place maintained by ",,".
.PP
.B Example.
.RS
.PP
The following line describes a typical entry for the anonftp catalog.
.LP
anonftp {Filename,40,,W:User,15:Password,15,,W:Account,15:Path,30,,W}
.PP
This specifies that the access method for the catalog "anonftp" contains
5 fields, here named "Filename", "User", "Password", "Account" and "Path".
.PP
The
.B Filename
field is defined to have a maximum of 40 characters. No
default value is given. The field can be modified by the administrator.
Note that even though a default value is not given, since the optional
.I <modify>
field has been specified, a placeholder ",," is required to
specify that the
.I <default field value>
field has been omitted.
.PP
The
.B User
field has a maximum \fIwidth\fP of 15 characters and has as its default
value the string "anonymous". It cannot be modified from the host_manage
program (unless the configuration file is modified to make the field
writable).
.PP
The
.B Password
field is also a maximum of 15 characters. No default value is given
and it cannot be modified by the administrator. Note that since none
of the optional fields are specified they are omitted and no placeholder
is needed.
.PP
Finally, the
.B Path
field specifies the root path of the `ls-lR' file. This is required for
Novell systems.
.PP
.B Another example.
.LP
webindex {Port,5,,W:Path,30,,W}
.PP
The
.B Port
field specifies the port on which the services run.
.PP
The
.B Path
field specifies the root path of the service.
.RE
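.PP
The entry format described in this section can be read mechanically. The
following Python sketch is one possible reading of it, not the parser used by
host_manage: the names FieldConfig and parse_entry are hypothetical, and the
sketch assumes that field names and default values never contain a comma, a
colon or a brace.
.PP
.nf
```python
from typing import NamedTuple


class FieldConfig(NamedTuple):
    name: str       # <field name>
    width: int      # <maximum field width>
    default: str    # <default field value>, empty if omitted
    writable: bool  # True when <modify> is "W"


def parse_entry(line):
    """Parse one '<catalog name> { field configurations }' entry."""
    catalog, _, rest = line.partition("{")
    body = rest.rsplit("}", 1)[0]
    fields = []
    for chunk in body.split(":"):
        parts = [p.strip() for p in chunk.split(",")]
        parts += [""] * (4 - len(parts))  # pad omitted optional entries
        name, width, default, modify = parts[:4]
        fields.append(FieldConfig(name, int(width), default, modify == "W"))
    return catalog.strip(), fields
```
.fi
.PP
Applied to the webindex entry shown above, parse_entry returns the catalog
name "webindex" and two writable fields, Port (width 5) and Path (width 30),
both with an empty default.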
.SH DIAGNOSTICS
The program tries to ensure that the host data that is entered is
consistent with the Domain Name System. Errors will be flagged on the
status line at the bottom of the display.
.SH BUGS
Due to a strange interaction between xterms and curses(3), use of this
program on xterm windows often causes random garbage to appear on the
screen. Use of the ^L command will redraw and restore the display. This can
be fixed by removing the string:
.sp
.in +2in
:me=\E[m:
.in -2in
.sp
from the
.B /etc/termcap
entry for this and other affected terminal types.
.SH NOTES
There is no file or record locking performed on the host database and so
this program may interact badly with any updates which are currently
being performed by other processes. This will be fixed in a later release.
Inconsistencies in the DNS are tolerated but flagged as errors.
.SH FILES
~archie/db/host_db/*
.br
~archie/etc/hm_db.cf
.SH AUTHOR
Bunyip Information Systems
.br
Montr\o"\'e"al, Qu\o"\'e"bec, Canada
.sp
Archie is a registered trademark of Bunyip Information Systems Inc., Canada,
1990.

@@ -0,0 +1,173 @@
.\" Copyright (c) 1992, 1994, 1996 Bunyip Information Systems Inc.
.\" All rights reserved.
.\"
.\" Archie 3.5
.\" August 1996
.\"
.\" @(#)insert_anonftp.n
.\"
.TH INSERT_ANONFTP N "August 1996"
.SH NAME
.B insert_anonftp
\- insert a parsed anonymous FTP listing into the Archie anonftp catalog
.SH SYNOPSIS
.B insert_anonftp
.BI \-i \ <file>
[
.BI \-M \ <dir>
] [
.BI \-w \ <dir>
] [
.BI \-h \ <dir>
] [
.BI \-t \ <dir>
] [
.BI \-I \ <size>
] [
.B \-v
] [
.B \-l
] [
.BI \-L \ <logfile>
]
.SH DESCRIPTION
.PP
.B insert_anonftp
inserts, into the anonftp database, the data from a parsed anonymous ftp listing.
It is normally run from the master control program,
.BR arcontrol (n),
as part of the update phase of the Archie update cycle. It may also be invoked
from the command line.
.PP
The file to be inserted must be in Archie anonftp parse output format, such as is
generated by the suite of Archie anonftp parsers
.BR parse_anonftp_* .
This format is standard, regardless of the original format of the
anonymous FTP directory listing. The output of this
program is placed directly into the anonftp catalog.
.B insert_anonftp
updates the anonftp database in several phases, in order to
minimize the critical period where interruption of the insertion could
cause the database to be corrupted. However, to prevent the possibility
of this happening at all, the program should not be interrupted at any time.
.PP
.B insert_anonftp
will not attempt to update a site which is listed as
.I active
in the host databases, nor overwrite an existing data file in the
anonftp database. Error conditions, such as a full file system, are also
detected before the anonftp database is modified.
.PP
This program also modifies the anonftp entry, in the auxiliary host database,
corresponding to this site.
.PP
When invoking this program from the command line, care should be taken to
ensure that no other processes are modifying the anonftp catalog.
The exclusive locking mechanism provided by the
.BR update_anonftp (n)
program which invokes
.B insert_anonftp
in normal operation will not be available in the command line
invocation.
.SH OPTIONS
.PP
The following option must be supplied:
.RS
.TP
.BI \-i \ <file>
The name of the file containing the parsed ftp listing.
.RE
.PP
Additionally, the following options are accepted:
.RS
.TP
.BI \-M \ <dir>
The name of the master Archie database directory. If not
specified, the program looks in the directory
.BR ~archie/db ,
then
.BR ./db .
.TP
.BI \-w \ <dir>
The name of the directory in which the Archie anonftp
catalog resides. This parameter overrides the default
catalog name,
.BR ~archie/db/anonftp ,
as well as the
.B \-M
option.
.TP
.BI \-h \ <dir>
The name of the Archie host database directory. If not
specified, the program first looks in
.BR ~archie/db/host_db ,
then
.BR ./host_db .
.TP
.BI \-t \ <dir>
Set the name of the directory used to store temporary files. By
default,
.B ~archie/db/tmp
is used.
.TP
.BI \-I \ <size>
Set a minimum size, in bytes, for a site file to be indexed.
If the size of the site file is greater than or equal to this size,
a .idx file will accompany the site file in order to speed up search
queries. The default value of
.I <size>
is 500000 bytes.
.TP
.BI \-v
Verbose mode. Print messages indicating what the program is
doing.
.TP
.BI \-l
Log messages to the file
.BR ~archie/logs/archie.log .
The location of the file may be overridden with the
.B \-L
option. By default, messages are written to
.IR stderr .
.TP
.BI \-L \ <logfile>
Specify the log file. For this to have any effect, the
.B \-l
option must be specified, as well. Note that debugging
information is also written to the log file.
.RE
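.PP
The lookup order described for the \-M and \-h options can be pictured with
the following Python sketch. It is an assumption about the behaviour rather
than the program's actual code: the function name find_db_dir is
hypothetical, and the man page does not say what happens when none of the
candidate directories exists, so the sketch simply returns None in that case.
.PP
.nf
```python
import os


def find_db_dir(explicit=None, candidates=("~archie/db", "./db")):
    """Return the explicit directory, else the first candidate that exists."""
    if explicit is not None:
        return explicit  # a command line option always wins
    for candidate in candidates:
        path = os.path.expanduser(candidate)  # expands the ~archie prefix
        if os.path.isdir(path):
            return path
    return None
```
.fi
.PP
The same function covers the host database lookup when called with the
candidates "~archie/db/host_db" and "./host_db".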
.SH FILES
~archie/db/host_db/*
.br
~archie/db/anonftp_db/*
.SH BUGS
Unlike the previous Archie system, corruption of the database is limited to
a single site when the program is aborted prematurely. If an insert ends
before the site is created then the data in the
.B db/webindex_db/start_db.*
database will not be accurate but will not affect other insertions.
Queries will write error messages reflecting the inconsistency to the
log files.
.SH "SEE ALSO"
.BR db_check (n),
.BR fix_start_db (n),
.BR update_anonftp (n),
.BR delete_anonftp (n),
.BR parse_anonftp_* (n),
.BR arcontrol (n)
.SH AUTHOR
Bunyip Information Systems
.br
Montr\o"\'e"al, Qu\o"\'e"bec, Canada
.sp
Archie is a registered trademark of Bunyip Information Systems Inc., Canada,
1990.

@@ -0,0 +1,171 @@
.\" Copyright (c) 1992, 1994, 1996 Bunyip Information Systems Inc.
.\" All rights reserved.
.\"
.\" Archie 3.5
.\" August 1996
.\"
.\" @(#)insert_webindex.n
.\"
.TH INSERT_WEBINDEX N "August 1996"
.SH NAME
.B insert_webindex
\- insert a parsed web index listing into the Archie webindex catalog
.SH SYNOPSIS
.B insert_webindex
.BI \-i \ <file>
[
.BI \-M \ <dir>
] [
.BI \-w \ <dir>
] [
.BI \-h \ <dir>
] [
.BI \-t \ <dir>
] [
.BI \-I \ <size>
] [
.B \-v
] [
.B \-l
] [
.BI \-L \ <logfile>
]
.SH DESCRIPTION
.PP
.B insert_webindex
inserts, into the webindex database, the data from a parsed web index
listing.
It is normally run from the master control program,
.BR arcontrol (n),
as part of the update phase of the Archie update cycle. It may also be invoked
from the command line.
.PP
The file to be inserted must be in Archie webindex parse output format, such as
is
generated by the Archie webindex parser,
.BR parse_webindex .
The output of this program is placed directly into the webindex catalog.
.B insert_webindex
updates the webindex database in several phases, in order to
minimize the critical period where interruption of the insertion could
cause the database to be corrupted. However, to prevent the possibility
of this happening at all, the program should not be interrupted at any time.
.PP
.B insert_webindex
will not attempt to update a site which is listed as
.I active
in the host databases, nor overwrite an existing data file in the
webindex database. Error conditions, such as a full file system, are also
detected before the webindex database is modified.
.PP
This program also modifies the webindex entry, in the auxiliary host database,
corresponding to this site.
.PP
When invoking this program from the command line, care should be taken to
ensure that no other processes are modifying the webindex catalog.
The exclusive locking mechanism provided by the
.BR update_webindex (n)
program, which invokes
.B insert_webindex
in normal operation, will not be available in the command line
invocation.
.SH OPTIONS
.PP
The following option must be supplied:
.RS
.TP
.BI \-i \ <file>
The name of the file containing the parsed web index listing.
.RE
.PP
Additionally, the following options are accepted:
.RS
.TP
.BI \-M \ <dir>
The name of the master Archie database directory. If not
specified, the program looks in the directory
.BR ~archie/db ,
then
.BR ./db .
.TP
.BI \-w \ <dir>
The name of the directory in which the Archie webindex
catalog resides. This parameter overrides the internally generated
catalog name
.B ~archie/db/webindex
and the
.B \-M
option, if specified.
.TP
.BI \-h \ <dir>
The name of the Archie host database directory. If not
specified, the program will first default to
.BR ~archie/db/host_db ,
then
.BR ./host_db .
.TP
.BI \-t \ <dir>
Set the name of the directory used for temporary files. By
default the program uses
.BR ~archie/db/tmp .
.TP
.BI \-I \ <size>
Set the minimum size for a site file to be indexed.
The size is in bytes.
If the size of the site file is greater than or equal to this size,
a .idx file will accompany this site file to speed up
searches in it. By default this size is
500000 bytes.
.TP
.BI \-v
Verbose mode. Tells you what it is doing.
.TP
.B \-l
Write any user output to the default log file
.BR ~archie/logs/archie.log .
If desired, this can be overridden with the
.B \-L
option. Errors will by default be written to
.IR stderr .
.TP
.BI \-L \ <logfile>
The name of the file to be used for logging information.
Note that debugging information is also written to the
log file. This implies the
.B \-l
option as well.
.RE
.SH FILES
~archie/db/host_db/*
.br
~archie/db/webindex_db/*
.SH BUGS
Unlike the previous Archie system, corruption of the database is limited
to a single site when the program is aborted prematurely. If an insert ends
before the site is created then the data in the db/webindex_db/start_db.*
database will not be accurate but will not affect other insertions.
Queries will write error messages reflecting the inconsistency to the
log files.
.SH "SEE ALSO"
.BR db_check (n),
.BR fix_start_db (n),
.BR update_webindex (n),
.BR delete_webindex (n),
.BR parse_webindex (n),
.BR arcontrol (n)
.SH AUTHOR
Bunyip Information Systems
.br
Montr\o"\'e"al, Qu\o"\'e"bec, Canada
.sp
Archie is a registered trademark of Bunyip Information Systems Inc., Canada,
1990.

@@ -0,0 +1,57 @@
.\" Copyright (c) 1993, 1994, 1996 Bunyip Information Systems Inc.
.\" All rights reserved.
.\"
.\" Archie 3.5
.\" August 1996
.\"
.\" @(#)mail_stats.n
.\"
.TH MAIL_STATS N "August 1996"
.SH NAME
.B mail_stats
\- monitor the transactions of the Archie system
.SH SYNOPSIS
.B mail_stats
.SH DESCRIPTION
\fBmail_stats\fP is a shell script which is designed to periodically be
run from the
.BR cron (8)
process. It sends mail to Archie system maintainers describing the
transactions of the system, including sites which have been successfully
and unsuccessfully retrieved, parsed and updated.
.SH CONFIGURATION
In addition to the standard log entries, the Archie system will generate
information on each transaction it performs if the system administrators
create a file called mail.results in the ~archie/etc directory. This file
must be readable and writable by the Archie user. The files mail.fail,
mail.success, mail.add, mail.delete, mail.parse, and mail.retr
will be created in the same directory.
In order for mail_stats to work correctly, the line
.IP
MAIL_PGM=/usr/ucb/mail
.PP
should be set to the user mail agent of choice (if the default given is
not acceptable). The mailer chosen should accept a "-s" command line
switch with a subject as the argument. Also the line
.IP
ARCHIE_USER=archuser
.PP
should be changed to the name of the Archie administration user code,
if it is not the given default. Mail to this user should be set to be sent
to the Archie system administrator(s).
Once run, mail_stats will reset all the mail data files except mail.results
which remains until the next invocation of the program.
.SH "SEE ALSO"
.BR arcontrol (n)
.SH FILES
~archie/etc/mail.*
.SH AUTHOR
Bunyip Information Systems.
.br
Montr\o"\'e"al, Qu\o"\'e"bec, Canada
.sp
Archie is a registered trademark of Bunyip Information Systems Inc., Canada,
1990.

@@ -0,0 +1,151 @@
.\" Copyright (c) 1992, 1994, 1996 Bunyip Information Systems Inc.
.\" All rights reserved.
.\"
.\" Archie 3.5
.\" August 1996
.\"
.\" @(#)net_anonftp.n
.\"
.TH NET_ANONFTP N "August 1996"
.SH NAME
.B net_anonftp
\- send and receive preprocessed anonftp data for inter-Archie data exchange
.SH SYNOPSIS
.B net_anonftp
[
.B \-I
] [
.BI \-O \ <host>
] [
.BI \-M \ <dir>
] [
.BI \-w \ <dir>
] [
.BI \-p \ <port>
] [
.B \-v
] [
.B \-c
] [
.BI \-h \ <dir>
] [
.B \-l
] [
.BI \-L \ <logfile>
]
.SH DESCRIPTION
.PP
This program is not normally invoked from the command line. Rather, it is
run by the
.BR arserver (n)
and
.BR arexchange (n)
programs to transfer preprocessed Archie anonftp catalog files between
Archie servers.
.PP
In output mode it reads in the requested anonftp catalog file and
re-formats the data into the same format as the output prepared by the
.BR parse_anonftp (n)
Archie programs. This re-formatted data is then converted
into Sun XDR (see
.BR xdr (3n))
format for transmission (which does any conversions necessary
for machines with different hardware architectures).
.PP
In input mode, the conversion is simply between Sun XDR and the local
parser standard binary format.
.PP
This and all other programs like it
.RB ( ~archie/bin/net_* )
read from and write to stdin and stdout, respectively. All file
names are created in the calling process (usually
.BR arserver (n)\fR).
.SH OPTIONS
.TP
.BI \-O \ <host>
Output mode. Read the local anonftp catalog for the
.I <host>
specified, and after appropriate conversions, write it to stdout.
.TP
.B \-I
Input mode. Read stdin for the incoming data, transform it in the
appropriate manner and write it to stdout.
.TP
.BI \-M \ <dir>
The name of the master Archie database directory. If not
specified, the program looks in the directory
.BR ~archie/db ,
then
.BR ./db .
.TP
.BI \-h \ <dir>
The name of the Archie host database directory. If not
specified, the program will first default to
.BR ~archie/db/host_db ,
then
.BR ./host_db .
.TP
.BI \-p \ <port>
The port number for the site being transferred.
.TP
.BI \-w \ <dir>
The name of the directory in which the Archie anonftp
catalog resides. This parameter overrides the internally generated
catalog name
.B ~archie/db/anonftp
and the
.B \-M
option, if specified.
.TP
.BI \-c
In output mode the outgoing data will be compressed (with the
.BR compress (1)
program) in the final stage before being put on the network. This results
in a significant improvement in transfer times. In input mode the format
is automatically recognized from the incoming header and this option is
ignored.
.TP
.BI \-v
Tells you what it is doing.
.TP
.B \-l
Write any user output to the default log file
.BR ~archie/logs/archie.log .
If desired, this can be overridden with the
.B \-L
option. Errors will by default be written to
.IR stderr .
.TP
.BI \-L \ <logfile>
The name of the file to be used for logging information.
Note that debugging information is also written to the
log file. This implies the
.B \-l
option as well.
.PP
Since the output of the anonftp parsers is in binary format, files of
this type may not be blindly copied from one hardware architecture to
another since the manner of storing data types (ints, strings, etc.) may be
different. Hence the use of Sun XDR format to transparently convert the
transmitted data.
This program creates a temporary file during data transfer. This file
will be removed if the program is interrupted or aborts during the transfer.
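.PP
To illustrate why XDR makes the data portable, the following Python sketch
encodes two basic XDR types: an unsigned 32-bit integer is always written as
four bytes, most significant byte first, and a string is written as a length
word followed by its bytes, zero-padded to a multiple of four. The function
names are hypothetical, and the sketch does not reproduce the record layout
actually transmitted by net_anonftp.
.PP
.nf
```python
import struct


def xdr_uint(value):
    """Encode an unsigned 32-bit integer as 4 big-endian bytes."""
    return struct.pack(">I", value)


def xdr_string(text):
    """Encode a string as a length word plus data padded to 4 bytes."""
    data = text.encode("ascii")
    padding = (4 - len(data) % 4) % 4
    return xdr_uint(len(data)) + data + bytes(padding)


def xdr_read_uint(buf, offset=0):
    """Decode an unsigned 32-bit integer; return (value, next offset)."""
    (value,) = struct.unpack_from(">I", buf, offset)
    return value, offset + 4
```
.fi
.PP
Because the byte order is fixed by the encoding rather than by the host, a
value written on one architecture decodes to the same number on any other.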
.SH FILES
~archie/db/host_db/*
.br
~archie/db/anonftp_db/*
.SH SEE ALSO
.BR arserver (n),
.BR parse_anonftp (n),
.BR insert_anonftp (n)
.SH AUTHOR
Bunyip Information Systems
.br
Montr\o"\'e"al, Qu\o"\'e"bec, Canada
.sp
Archie is a registered trademark of Bunyip Information Systems Inc., Canada,
1990.

@@ -0,0 +1,138 @@
.\" Copyright (c) 1992, 1994, 1996 Bunyip Information Systems Inc.
.\" All rights reserved.
.\"
.\" Archie 3.5
.\" August 1996
.\"
.\" @(#)parse_anonftp.n
.\"
.TH PARSE_ANONFTP N "August 1996"
.SH NAME
.B parse_anonftp
\- parse the raw data for an Archie anonftp catalog site
.SH SYNOPSIS
.B parse_anonftp
.BI \-i \ <filename>
.BI \-o \ <template>
[
.BI \-M \ <dir>
] [
.BI \-h \ <dir>
] [
.BI \-f \ <filter\ pgm>
] [
.BI \-t \ <dir>
] [
.B \-v
] [
.BI \-l
] [
.BI \-L \ <logfile>
]
.SH DESCRIPTION
This program determines the appropriate parser for an Archie anonftp catalog
raw data file and invokes it.
.SH OPTIONS
.TP
.BI \-i \ <filename>
Input filename. Mandatory.
.TP
.BI \-o \ <template>
A "template" for the output filename. This argument is used
to determine the base name of the output file which is
used to generate the final output filename. Mandatory.
.TP
.BI \-M \ <dir>
The name of the master Archie database
directory. If not given, the program tries to look in the
directory
.B ~archie/db
and, failing that, defaults to
.BR ./db .
.TP
.BI \-h \ <dir>
The name of the Archie host database directory. If not
supplied the program will default first to
.B ~archie/db/host_db
and failing that, to
.BR ./host_db .
.TP
.BI \-f \ <filter\ pgm>
The name of the filter program to be run on the raw input before
parsing. If not given, the name is automatically generated. See below.
.TP
.BI \-t \ <dir>
Sets the name of the directory used for temporary files.
If not specified, the program uses
.BR ~archie/db/tmp .
.TP
.BI \-v
Verbose mode. Will tell you what it is doing.
.TP
.BI \-l
Write any user output to the default log file
.BR ~archie/logs/archie.log .
If desired, this can be overridden with the
.B \-L
option. Errors will by default be written to stderr.
.TP
.BI \-L \ <logfile>
The name of the file to be used for logging information.
Note that debugging information is also written to the
log file. This implies the
.B \-l
option as well.
.PP
This program is usually invoked by
.BR arcontrol (n)
automatically during the
update cycle to coordinate the parsing of raw anonftp data files. By
reading the header record the program can determine what type of
parser is to be invoked. The raw data is first run through a
preprocessing filter to remove errors and extraneous information. The
name of this filter is automatically generated from the header of the
input data unless overridden by the \-f option and uses the following
convention:
.IP
.B filter_anonftp_\fI<os name>\fR
.PP
where
.I <os name>
is the name of the operating system running at the source
data site for the data. Thus for BSD UNIX systems the name would be
.BR filter_anonftp_unix_bsd .
.PP
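.PP
The naming convention above amounts to a simple concatenation. A minimal
Python sketch, in which the function name and the override argument (standing
in for the \-f option) are hypothetical:
.PP
.nf
```python
def filter_program_name(os_name, override=None):
    """Return the name of the filter program for an operating system name."""
    if override:
        return override  # corresponds to a name given with the -f option
    return "filter_anonftp_" + os_name
```
.fi
.PP
How the operating system name is read from the header record is not covered
here, since the header format is not described in this page.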
.SH DIAGNOSTICS
If a data file fails to be successfully parsed, a file with a ".parse_t"
extension is left in the ~archie/db/tmp directory (unless another temporary
directory is being used). The header record will report the parsing error
in the "comment" field. If desired, the Archie administrator may attempt
to correct the file manually. If this is done, the file need only be
renamed with a ".parse" extension and will automatically be picked up by
the next parse phase of the Update Cycle.
To help the administrator identify the problem, a file with the suffix
".filtered" is also created. This file contains the raw data filtered
through the pre-processing phase and is the actual file on which the
parser was invoked.
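.PP
The renaming step described above can be scripted. The following Python
sketch assumes that the failed file's name ends in ".parse_t" exactly as
stated; the function names are hypothetical and the sketch is not part of the
Archie distribution.
.PP
.nf
```python
import os


def requeued_name(path):
    """Return the '.parse' name for a corrected '.parse_t' file."""
    base, ext = os.path.splitext(path)
    if ext != ".parse_t":
        raise ValueError("expected a .parse_t file: " + path)
    return base + ".parse"


def requeue_for_parsing(path):
    """Rename the corrected file so the next parse phase picks it up."""
    new_path = requeued_name(path)
    os.rename(path, new_path)
    return new_path
```
.fi
.PP
Keeping the name computation separate from the rename makes it easy to check
the target name before any file is touched.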
.SH BUGS
Sites whose anonymous FTP tree does not start at "/" are not handled
correctly. This will be fixed in a later release.
The line numbers generated by this program when an error occurs are
incorrect since the number of lines in the header at the top of the data
file is not taken into account. The actual line causing the error is the
line number given plus the number of lines in the header. This will be
fixed in a later release.
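.PP
Until this is fixed, the correction described above is a single addition. A
minimal Python sketch, assuming the number of header lines has been counted by
hand (the function name is hypothetical):
.PP
.nf
```python
def actual_line(reported_line, header_lines):
    """Return the line of the data file that actually caused the error."""
    return reported_line + header_lines
```
.fi
.PP
For instance, a reported line of 12 in a file with a 9 line header
corresponds to line 21 of the file.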
.SH FILES
~archie/db/tmp/*
.SH "SEE ALSO"
.BR parser_output (n),
.BR arcontrol (n)
.SH AUTHOR
Bunyip Information Systems
.br
Montr\o"\'e"al, Qu\o"\'e"bec, Canada
.sp
Archie is a registered trademark of Bunyip Information Systems Inc., Canada,
1990.

@@ -0,0 +1,96 @@
.\" Copyright (c) 1992, 1994, 1996 Bunyip Information Systems Inc.
.\" All rights reserved.
.\"
.\" Archie 3.5
.\" August 1996
.\"
.\" @(#)parse_anonftp_<sys>
.\"
.TH PARSE_ANONFTP_SYS N "August 1996"
.SH NAME
parse_anonftp_<sys type> \- generate input to the insertion routine from
recursive listings for the anonftp catalog
.SH SYNOPSIS
parse_anonftp_\fIsys_type\fP
[
.B \-h
] [
.BI \-i \ <input\ file>
] [
.BI \-o \ <output\ file>
] [
.BI \-p \ <prep\ dir>
] [
.BI \-r \ <root\ dir>
]
.SH DESCRIPTION
.LP
\fBparse_anonftp\fP_\fIsys_type\fP describes a family of parsers,
currently with members
.B parse_anonftp_unix_bsd
for BSD UNIX systems,
.B parse_anonftp_novell
for Novell systems and
.B parse_anonftp_vms_std
for VMS systems.
.PP
\fBparse_anonftp\fP_\fIsys_type\fP reads a pre-filtered recursive
directory listing obtained under the \fIsys_type\fR operating system.
Its output is intended to be the input to the
.BR insert_anonftp (n)
program.
.SH OPTIONS
.LP
.TP 5n
.B \-h
No headers.
The listing is not expected to have a special header (normally used by all
programs in the database insertion pipeline), nor will it generate one on
output. This option can be used for debugging or to test raw (unprocessed) listings.
.TP 5n
.BI \-i \ <input\ file>
The next argument is the name of the file containing the recursive listing. If
unspecified,
.I stdin
is assumed.
.TP 5n
.BI \-o \ <output\ file>
The next argument is the name of the file to which the output will be written.
If unspecified,
.I stdout
is assumed.
.TP 5n
.BI \-p \ <prep\ dir>
Currently, this is used only by the UNIX parser. The next argument is
prepended to the start of all \fIdirectory definitions\fR. For
example, an argument of `.' can be used to turn the directory
definition `bin:' into `./bin:', which is more easily digested by the parser.
.TP 5n
.BI \-r \ <root\ dir>
Root directory.
This is currently only used by the UNIX parser. The next argument is the name
of the root directory from which the listing is assumed to be taken. This option
is often, but not always, used in conjunction with the
.B \-p
option, and is typically the same string, but with the trailing `/' replaced
with a `:'. For example, together these arguments might be `-r .: -p .'.
.TP 5n
.B \-v
Verbose. Report what the program is doing.
.PP
The output of the program is in Archie Listings Parser Output format,
which is described in
.BR anonftp_parser_output (5).
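.PP
The effect of the
.B \-p
option can be sketched as follows. This is a minimal illustration in
Python, not the parser's own code: the function name
\fIprepend_prep_dir\fR, the test used to recognize a directory
definition (a line without white space that ends in `:') and the `/'
inserted between the two parts are all assumptions made for the example.
.nf
```python
def prepend_prep_dir(line, prep_dir):
    """Prefix a directory definition such as 'bin:' with prep_dir."""
    stripped = line.rstrip()
    # Assumed test for a directory definition: a non-empty name with no
    # white space, terminated by ':'. Real listings may need more care.
    is_dir_def = (
        len(stripped) > 1
        and stripped.endswith(":")
        and not any(ch.isspace() for ch in stripped)
    )
    if not is_dir_def:
        return line
    return prep_dir + "/" + stripped


# A made-up fragment of a recursive listing, for illustration only.
listing = ["bin:", "total 2", "-rw-r--r--  1 ftp  ftp  10 Jan  1  1996 ls"]
for entry in listing:
    print(prepend_prep_dir(entry, "."))
```
.fi
.PP
Only the line `bin:' is rewritten here (to `./bin:'); ordinary file
entries pass through unchanged.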
.SH SEE ALSO
.BR parse_anonftp (n),
.BR anonftp_parser_output (5)
.SH AUTHOR
Bunyip Information Systems
.br
Montr\o"\'e"al, Qu\o"\'e"bec, Canada
.sp
Archie is a registered trademark of Bunyip Information Systems Inc., Canada,
1990.

View File

@@ -0,0 +1 @@
.so mann/parse_anonftp_unix_bsd.n

View File

@@ -0,0 +1,298 @@
.\" Copyright (c) 1992, 1994, 1996 Bunyip Information Systems Inc.
.\" All rights reserved.
.\"
.\" Archie 3.5
.\" August 1996
.\"
.\" @(#)retrieve_anonftp.n
.\"
.TH RETRIEVE_ANONFTP N "August 1996"
.SH NAME
.B retrieve_anonftp
\- retrieve anonymous FTP directory listings for the Archie anonftp catalog
.SH SYNOPSIS
.B retrieve_anonftp
.BI \-i \ <input>
.BI \-o \ <template>
[
.BI \-M \ <dir>
] [
.B \-U
] [
.B \-n
] [
.B \-g
] [
.BI \-C \ <config>
] [
.BI \-T \ <timeout>
] [
.B \-Z
] [
.B \-l
] [
.BI \-L \ <logfile>
] [
.B \-v
]
.SH DESCRIPTION
.PP
.B retrieve_anonftp
performs the data acquisition phase of the update cycle
for the Archie system and is normally invoked by the
.BR arcontrol (n)
program. As input it is given the name of the file containing the header
information necessary for the retrieval. The output may be several files,
depending on the type of information being retrieved. It is essentially a
self-contained FTP client program. By default the program stores the
retrieved files in adaptive Lempel-Ziv compressed format, the same as that
used by the standard UNIX
.BR compress (1)
and
.BR uncompress (1)
utilities. This may be overridden by the
.B \-U
and
.B \-n
options.
.SH OPTIONS
.PP
The following two options are mandatory:
.RS
.TP
.BI \-i \ <input>
The filename of the header file containing the necessary
information for the retrieval of the anonymous FTP listings.
.TP
.BI \-o \ <template>
The base name (template) for the output file(s) generated by the program.
.RE
.PP
The following are optional:
.RS
.TP
.B \-U
Uncompress the retrieved listings and store them in an
uncompressed format. This is for those listings which are
retrieved already compressed (e.g., ls-lR.Z files). This can potentially
speed up the execution of subsequent phases of the Update Cycle; however,
more disk space is needed to hold the uncompressed data.
.TP
.B \-n
Do not compress or uncompress the retrieved listings. They will
remain in the format in which they were retrieved. This can potentially
speed up the execution of subsequent phases of the Update Cycle; however,
more disk space is needed to hold any uncompressed data.
.TP
.B \-g
Disable the globbing feature. If the globbing characters specified in the
configuration file occur in the names of the files in the file list being
retrieved then a separate data file will be produced for each wildcard
matched. If this flag is specified then this feature will be disabled and
the wildcard characters will no longer have their special meaning.
.TP
.BI \-C \ <config>
Use
.I <config>
as the configuration file. If not given, the file
.B ~archie/etc/arretdefs.cf
is used.
.TP
.BI \-T \ <timeout>
If the transfer is inactive for <timeout> minutes, the retrieve is
terminated. The default value is 10 minutes.
.TP
.B \-Z
If this flag is supplied and no explicit file list is given on
input for retrieval, the program will look for indexing files on the
remote anonymous FTP archive using information supplied in the
configuration file. It first looks for the filename with the compression
extension specified in the configuration file; if this is not successful,
it then looks for the file without the extension. If this too fails,
it attempts to find those files in subdirectories called "pub" and
"PUB". If the files are not found, it gives up and continues with the default
behavior. If one of the files is found, its modification time is checked
to determine whether it has been changed since the last retrieval of this
host. If it has, then the file is picked up in preference to doing the
recursive listing. If it has not, then the file is ignored.
This procedure makes use of the FTP protocol extensions "MDTM" and
"SIZE". If these extensions are not supported on the remote site, then the
program proceeds with the default behavior.
.TP
.B \-v
Verbose. Describe the details of the session.
.TP
.BI \-M \ <dir>
The name of the master Archie database directory. If
not given, the program tries to look in the directory
.B ~archie/db
and, failing that, defaults to
.BR ./db .
.TP
.B \-l
Write any user output to the default log file
.BR ~archie/logs/archie.log .
If desired, this can be overridden with the
.B \-L
option. Errors will by default be written to stderr.
.TP
.BI \-L \ <logfile>
The name of the file to be used for logging information.
Note that debugging information is also written to the
log file. This implies the
.B \-l
option as well.
.RE
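.PP
The search order described for the
.B \-Z
option can be sketched as follows. This is an illustration in Python
under stated assumptions, not the program's own code: the function name
\fIindex_candidates\fR is hypothetical, and the order in which the two
names are tried inside each subdirectory is assumed to match the order
used at the top level.
.nf
```python
def index_candidates(idx, compext, subdirs=("pub", "PUB")):
    """List candidate index file paths in the order described for -Z."""
    # Compressed name first, then the plain name (a single name when no
    # compression extension is configured, to avoid a duplicate).
    names = [idx + compext, idx] if compext else [idx]
    paths = list(names)
    # Assumption: each subdirectory is tried with the same name order.
    for subdir in subdirs:
        for name in names:
            paths.append(subdir + "/" + name)
    return paths


# Values taken from the example configuration line (fields 4 and 10).
for path in index_candidates("ls-lR", ".Z"):
    print(path)
```
.fi
.PP
The first candidate found on the remote archive would then be checked
with "MDTM" before being retrieved.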
.PP
The input file containing the header information (see
.BR archie_header (5))
is read and the site listed therein is contacted. In the
absence of access command information in the header, the default
action for the operating system at this site is taken. This action
is described in a configuration file (see "Configuration File" below).
The retrieved data is automatically compressed unless the
.B \-U
or
.B \-n
options have been used. The program will write either the output data
file(s), or a file containing an "error header" in the case that an error
occurred during the retrieve.
.SH CONFIGURATION FILE
The primary purpose of the configuration file for this program is to
provide default parameters to be used when the information in the header
file does not provide explicit instructions of the actions to be
performed during the FTP retrieve. The default configuration file is
~archie/etc/arretdefs.cf.
.sp
NOTE: The semantics of each field of this file are determined on a
per-catalog basis. Only the first 2 fields are invariant across
different catalogs.
.sp
For the anonftp catalog it is composed of lines of the following format:
.LP
.I
<dbname>\fB:\fI<os>\fB:\fI<bintrans>\fB:\fI<compext>\fB:\fI<user>\fB:\fI<passwd>\fB:\fI<acct>\fB:\fI<ftp arg>\fB:\fI<glob>\fB:\fI<idx>\fR
.PP
Where
.RS
.TP
.I <dbname>
is the name of the Archie catalog.
.TP
.I <os>
is the operating system, as specified in Archie header records.
.TP
.I <bintrans>
is the FTP protocol command (as defined in RFC 959) to be used to place
the remote FTP server in binary transfer mode.
.TP
.I <compext>
is the file extension used on that operating system for
the default compression method. For example, ".Z" for files compressed
using the
.BR compress (1)
program.
.TP
.I <user>
is the default user code to use for anonymous FTP access.
.TP
.I <passwd>
is the default password to use for anonymous FTP access. If this
field is not specified the system will automatically generate a
password of the form
.BI archie@ <hostname>
(where
.I <hostname>
is the name of the host performing the retrieve). If the file
~archie/etc/archie.hostname has been configured then the host name given
there is used.
.TP
.I <acct>
is the default account to be used for anonymous FTP access.
.TP
.I <ftp arg>
is the default argument to be used with the FTP "LIST"
command (see RFC 959) when performing a listing at this
site.
.TP
.I <glob>
are the globbing characters used by the remote system.
.TP
.I <idx>
is the base name used by convention by the remote system to store indexing
information.
.RE
.PP
.B Example
.RS
.PP
anonftp:unix_bsd:image:.Z:anonymous:::-R:*?:ls-lR
.RS
.TP
.B Field 1.
Specifies the "anonftp" Archie catalog.
.TP
.B Field 2.
Specifies "unix_bsd" as the operating system.
.TP
.B Field 3.
The FTP protocol command for binary transmission is "image".
.TP
.B Field 4.
The default extension for compressed files on this system is ".Z" (from
.BR compress (1)).
.TP
.B Field 5.
The default user is "anonymous".
.TP
.B Field 6.
No password specified. archie@\fI<hostname>\fR will be used.
.TP
.B Field 7.
No account specified. Most anonymous FTP implementations do
not require this command to be used.
.TP
.B Field 8.
The argument to the BSD UNIX
.BR ls (1)
command (which on most FTP implementations is invoked by the FTP
daemon on a "LIST" command) for a recursive listing is
.BR \-R .
.TP
.B Field 9.
UNIX uses '*' and '?' as wildcard characters ("globbing"). If these
characters appear in the names of the file lists then by default the
system will create separate output files for each wildcard match found.
This feature is disabled by the -g option.
.TP
.B Field 10.
The convention on most anonymous FTP sites is that if the recursive
listing is maintained at the site then the file is called "ls-lR" or
compressed as "ls-lR.Z". The -Z flag uses this information to retrieve
these indexing files in preference to doing the recursive listing if they
exist. See the -Z option description above.
.RE
.RE
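.PP
Reading a line of this form can be sketched as follows. This is a minimal
illustration in Python, not the code used by
.BR retrieve_anonftp :
the names \fIFIELDS\fR, \fIparse_config_line\fR and
\fIeffective_password\fR are hypothetical, the host name
archie.example.org is a placeholder, and the sketch assumes that a colon
never occurs inside a field and that the file holds no comment lines.
.nf
```python
FIELDS = ("dbname", "os", "bintrans", "compext", "user",
          "passwd", "acct", "ftp_arg", "glob", "idx")


def parse_config_line(line):
    """Split one anonftp configuration line into its ten named fields."""
    parts = line.strip().split(":")
    if len(parts) != len(FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(FIELDS), len(parts)))
    return dict(zip(FIELDS, parts))


def effective_password(entry, hostname):
    """Apply the documented default when no password is configured."""
    return entry["passwd"] or "archie@" + hostname


entry = parse_config_line("anonftp:unix_bsd:image:.Z:anonymous:::-R:*?:ls-lR")
print(entry["os"], entry["compext"], entry["ftp_arg"], entry["idx"])
print(effective_password(entry, "archie.example.org"))
```
.fi
.PP
Empty fields, such as the password and account in the example line, come
back as empty strings, which is what lets the default password rule apply.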
.SH BUGS
.PP
The FTP command for binary access is not a function of the Operating
System but of the underlying architecture.
.PP
Currently only the Lempel-Ziv compression method is supported.
.SH FILES
~archie/etc/arretdefs.cf
.SH SEE ALSO
.BR arcontrol (n),
.BR compress (1),
.BR uncompress (1)
.SH AUTHOR
Bunyip Information Systems
.br
Montr\o"\'e"al, Qu\o"\'e"bec, Canada
.sp
Archie is a registered trademark of Bunyip Information Systems Inc., Canada,
1990.

View File

@@ -0,0 +1,133 @@
.\" Copyright (c) 1992, 1994, 1996 Bunyip Information Systems Inc.
.\" All rights reserved.
.\"
.\" Archie 3.5
.\" August 1996
.\"
.\" @(#)update_anonftp.n
.\"
.TH UPDATE_ANONFTP N "August 1996"
.SH NAME
.B update_anonftp
\- update the entry for an Archie anonftp database site
.SH SYNOPSIS
.B update_anonftp
.BI \-i \ <file>
[
.BI \-M \ <dir>
] [
.BI \-w \ <dir>
] [
.BI \-h \ <dir>
] [
.BI \-t \ <dir>
] [
.BI \-I \ <size>
] [
.BI \-v
] [
.B \-l
] [
.BI \-L \ <logfile>
]
.SH DESCRIPTION
.B update_anonftp
performs a delete-insert sequence in the Archie anonftp
catalog for the site specified in the given input file and is normally
invoked by the
.BR arcontrol (n)
program. The site is first deleted from the anonftp catalog and its
entry in the host database made inactive by the
.BR delete_anonftp (n)
program. An insert into the catalog is then attempted by
.BR insert_anonftp (n)
which, if successful, incorporates the new data and reactivates the site.
.SH OPTIONS
.TP
.BI \-i \ <file>
Input file name containing the header information and (possibly) additional
information.
.TP
.BI \-M \ <dir>
The name of the master Archie database directory. If not
specified, the program looks in the directory
.BR ~archie/db ,
then
.BR ./db .
.TP
.BI \-w \ <dir>
The name of the directory in which the Archie anonftp
catalog resides. This parameter overrides the internally
generated catalog name
.B ~archie/db/anonftp_db
and the
.B \-M
option, if specified.
.TP
.BI \-h \ <dir>
The name of the Archie host database directory. If not
specified, the program will first default to
.BR ~archie/db/host_db ,
then
.BR ./host_db .
.TP
.BI \-t \ <dir>
Set the name of the directory used for temporary files. By
default the program will use
.BR ~archie/db/tmp .
.TP
.BI \-I \ <size>
Set the minimum size for a site file to be indexed.
The size is in bytes.
If the size of the site file is greater than or equal to this size
a .idx file will accompany this site file to speed up
searches in it. By default this size is
500000 bytes.
.TP
.BI \-v
Verbose. Report what the program is doing.
.TP
.BI \-l
Write any user output to the default log file
.B ~archie/logs/archie.log.
If desired, this can be overridden with the
.B \-L
option. Errors will by default be written to stderr.
.TP
.BI \-L \ <logfile>
The name of the file to be used for logging information.
Note that debugging information is also written to the
log file. This implies the
.B \-l
option as well.
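.PP
The size test applied by the
.B \-I
option amounts to a single comparison, sketched here in Python. The names
\fIDEFAULT_INDEX_THRESHOLD\fR and \fIneeds_index\fR are hypothetical and
the sketch is not taken from the program itself; only the default of
500000 bytes and the "greater than or equal" rule come from the option
description.
.nf
```python
DEFAULT_INDEX_THRESHOLD = 500000  # bytes; the documented default for -I


def needs_index(site_file_size, threshold=DEFAULT_INDEX_THRESHOLD):
    """Return True when a site file should get an accompanying .idx file."""
    return site_file_size >= threshold


# Sizes just below, at, and well above the default threshold.
for size in (499999, 500000, 2000000):
    print(size, needs_index(size))
```
.fi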
.PP
.B update_anonftp
is also responsible for the file locking required to prevent two or
more programs from concurrently updating the catalog. The lock file is named
.B anonftp
and is placed in
.B ~archie/db/locks
during the update process. It contains the host name and time of its creation.
.SH FILES
~archie/db/anonftp/*
.br
~archie/db/locks/anonftp
.br
~archie/db/host_db/*
.SH SEE ALSO
.BR arcontrol (n),
.BR parse_anonftp (n),
.BR delete_anonftp (n),
.BR insert_anonftp (n)
.SH AUTHOR
Bunyip Information Systems.
.br
Montr\o"\'e"al, Qu\o"\'e"bec, Canada
.br
Archie is a registered trademark of Bunyip Information Systems Inc., Canada,
1990.

View File

@@ -0,0 +1,120 @@
.\" Copyright (c) 1992, 1994, 1996 Bunyip Information Systems Inc.
.\" All rights reserved.
.\"
.\" Archie 3.5
.\" August 1996
.\"
.\" @(#)weaseld.n
.\"
.TH WEASELD N "August 1996"
.SH NAME
weaseld \- convert between Gopher and Prospero protocols
.SH SYNOPSIS
.B weaseld [\fB\-acl \fIacl\-file\fR]
[\fB\-config \fItcl\-script\fR] [\fB\-debug\fR]
[\fB\-emesg \fIfile\fR] [\fB\-log \fIlog\-file\fR]
[\fB\-pdebug \fInum\fR] [\fB\-phost \fIhost\fR]
[\fB\-port \fInum\fR] [\fB\-proot \fIprosp\-root\fR]
[\fB\-stay\fR] [\fB\-user \fIname\fR]
.SH DESCRIPTION
.LP
.B weaseld
serves the contents of a Prospero File System directory tree, while giving
the appearance, to Gopher clients, of an ordinary Gopher server. When
started without arguments \fBweaseld\fR listens on port 70 (officially
assigned to Gopher), and treats the root of the Prospero tree as the main
Gopher menu.
.SH OPTIONS
.LP
.TP 5n
.B \-acl \fIacl\-file\fR
Access control list.
.IR acl\-file
contains a list of network and host addresses, one per line, which
limits the sites that may access the server. All addresses must be
in dotted decimal format, and no other type of line may appear in the
file. Network addresses may be abbreviated; for example, `8' refers
to the network `8.0.0.0'.
.TP 5n
.B \-config \fItcl\-script\fR
Execute \fItcl\-script\fR upon start\-up. Sending SIGUSR1 to the parent
process will cause the script to be re\-executed. This code is still under
development, so the exact effect of executing the script is not yet
defined.
.TP 5n
.B \-debug
Write debugging information to \fIstderr\fR. Note that in normal
operation, as a daemon, \fIstderr\fR is not attached to any file,
so any output to it will be lost. (See below for information on the
\fB\-log\fR and \fB\-stay\fR options.) The \fB\-debug\fR option is
normally used for debugging.
.TP 5n
.B \-emesg \fIfile\fR
Where to log special messages.
Special messages, as opposed to those written to a log file, are normally
sent to the console. They may be directed to the file or device given
by the \fIfile\fR argument. This option is normally used for debugging.
.TP 5n
.B \-log \fIlog\-file\fR
Log debugging information to a file.
\fIstderr\fR is connected to the file given as the argument. This
allows debugging information to be saved when running as a daemon.
This option is normally used for debugging.
.TP 5n
.B \-pdebug \fInum\fR
Print Prospero debugging information to \fIstderr\fR. The Prospero
libraries use the numeric argument as a debug level. This option is
normally used for debugging.
.TP 5n
.B \-phost \fIhost\fR
The argument is the name of the host on which the Prospero directory
tree resides. The default is \fIlocalhost\fR.
.TP 5n
.B \-port \fInum\fR
The numeric argument is the port on which to listen for connections from
Gopher clients. The default is 70.
.TP 5n
.B \-proot \fIprosp\-root\fR
The argument is the name of the Prospero virtual directory which is to
appear as the root of the Gopher directory tree. The default is the
root of the Prospero virtual directory tree.
.TP 5n
.B \-stay
Stay in the foreground. The server will not background itself, as it
would when started normally. If the \fB\-debug\fR option is specified,
debugging information will be sent to \fIstderr\fR, unless overridden by
the \fB\-log\fR option. The \fB\-stay\fR option is normally used for
debugging.
.TP 5n
.B \-user \fIname\fR
.B weaseld
will attempt to change both its real and effective user IDs to that of
the specified user, which may be either a name or a number. Normally,
this is used only when the daemon is started by root.
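.PP
The abbreviation rule for entries in the
.B \-acl
file can be sketched as follows. This is an illustration in Python, not
the server's own code: the function name \fIexpand_address\fR is
hypothetical, and padding with zero octets is an assumption generalized
from the single documented case, in which `8' refers to `8.0.0.0'.
.nf
```python
def expand_address(entry):
    """Expand an abbreviated dotted decimal address to four octets."""
    parts = entry.strip().split(".")
    if not 1 <= len(parts) <= 4:
        raise ValueError("bad address: " + entry)
    for part in parts:
        if not (part.isascii() and part.isdigit()) or int(part) > 255:
            raise ValueError("bad octet in address: " + entry)
    # Assumption: missing trailing octets are filled with zeros.
    return ".".join(parts + ["0"] * (4 - len(parts)))


# Made-up ACL entries: two abbreviated networks and one full host address.
for line in ("8", "128.100", "192.0.2.7"):
    print(line, "->", expand_address(line))
```
.fi
.PP
A full four-octet address is returned unchanged, so the same function can
be applied to every line of the file.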
.SH SEE ALSO
Documentation provided with the Prospero distribution,
.br
Documentation provided with the Gopher distribution.
.SH AUTHOR
Bunyip Information Systems
.br
Montr\o"\'e"al, Qu\o"\'e"bec, Canada
.sp
Archie is a registered trademark of Bunyip Information Systems Inc.,
Canada, 1990.