.\" Copyright (c) 1992, 1994, 1996 Bunyip Information Systems Inc.
.\" All rights reserved.
.\"
.\" Archie 3.5
.\" August 1996
.\"
.\" @(#)retrieve_anonftp.n
.\"
.TH RETRIEVE_ANONFTP N "August 1996"
.SH NAME
.B retrieve_anonftp
\- retrieve anonymous FTP directory listings for the Archie anonftp catalog
.SH SYNOPSIS
.B retrieve_anonftp
.BI \-i \ <input>
.BI \-o \ <template>
[
.BI \-M \ <dir>
] [
.B \-U
] [
.B \-n
] [
.B \-g
] [
.BI \-C \ <config>
] [
.BI \-T \ <timeout>
] [
.B \-Z
] [
.B \-l
] [
.BI \-L \ <logfile>
] [
.B \-v
]
.SH DESCRIPTION
.PP
.B retrieve_anonftp
performs the data acquisition phase of the update cycle
for the Archie system and is normally invoked by the
.BR arcontrol (n)
program. As input it is given the name of the file containing the header
information necessary for the retrieval. The output may be several files,
depending on the type of information being retrieved. It is essentially a
self-contained FTP client program. By default the program stores the
retrieved files in adaptive Lempel-Ziv compressed format, the same as that
used by the standard UNIX
.BR compress (1)
and
.BR uncompress (1)
utilities. This may be overridden by the
.B \-U
and
.B \-n
options.
.SH OPTIONS
.PP
The following two options are mandatory:
.RS
.TP
.BI \-i \ <input>
The name of the header file containing the information
necessary for the retrieval of the anonymous FTP listings.
.TP
.BI \-o \ <template>
The base name (template) for the output file(s) generated by the program.
.RE
.PP
The following are optional:
.RS
.TP
.B \-U
Uncompress the retrieved listings and store them in
uncompressed format. This applies to listings that are
retrieved already compressed (e.g., ls-lR.Z files). It can potentially
speed up the execution of subsequent phases of the update cycle; however,
more disk space is needed to hold the uncompressed data.
.TP
.B \-n
Do not compress or uncompress the retrieved listings. They
remain in the format in which they were retrieved. This can potentially
speed up the execution of subsequent phases of the update cycle; however,
more disk space is needed to hold any uncompressed data.
.TP
.B \-g
Disable the globbing feature. By default, if the globbing characters
specified in the configuration file occur in the names of the files in
the file list being retrieved, a separate data file is produced for each
wildcard match. If this flag is specified, the feature is disabled and
the wildcard characters no longer have their special meaning.
.TP
.BI \-C \ <config>
Use
.I <config>
as the configuration file. If not given, the file
.B ~archie/etc/arretdefs.cf
is used.
.TP
.BI \-T \ <timeout>
If the transfer is inactive for
.I <timeout>
minutes, the retrieval is terminated. The default value is 10 minutes.
.TP
.B \-Z
If this flag is supplied and no explicit file list is given on
input, the program looks for indexing files on the
remote anonymous FTP archive using information supplied in the
configuration file. It first looks for the file name with the compression
extension specified in the configuration file; if this is not successful,
it looks for the file without the extension. If this too fails,
it attempts to find those files in subdirectories called "pub" and
"PUB". If the files are not found, it gives up and continues with the
default behavior. If one of the files is found, its modification time is
checked to determine whether it has changed since the last retrieval of
this host. If it has, the file is picked up in preference to doing the
recursive listing. If it has not, the file is ignored.
This procedure makes use of the FTP protocol extensions "MDTM" and
"SIZE". If these extensions are not supported by the remote site, the
program proceeds with the default behavior.
.TP
.B \-v
Verbose. Describe the details of the session.
.TP
.BI \-M \ <dir>
The name of the master Archie database directory. If
not given, the program tries to look in the directory
.B ~archie/db
and, failing that, defaults to
.B ./db.
.TP
.B \-l
Write any user output to the default log file
.B ~archie/logs/archie.log.
If desired, this can be overridden with the
.B \-L
option. Errors will by default be written to stderr.
.TP
.BI \-L \ <logfile>
The name of the file to be used for logging information.
Note that debugging information is also written to the
log file. This implies the
.B \-l
option as well.
.RE
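.PP
The lookup order described for the
.B \-Z
option can be sketched in Python as follows. This is an illustration
only, not the program's actual code: the function names are hypothetical,
and the order in which the "pub" and "PUB" subdirectories are tried, as
well as the order of names within each of them, is an assumption.
.PP
.nf
```python
from datetime import datetime


def index_candidates(idx, compext, subdirs=("pub", "PUB")):
    """Return candidate index file paths in the order they would be tried."""
    # Name with the compression extension first, then the bare name.
    names = [idx + compext, idx] if compext else [idx]
    paths = list(names)                  # top-level directory first
    for sub in subdirs:                  # then each subdirectory (order assumed)
        paths.extend(sub + "/" + name for name in names)
    return paths


def use_index_file(remote_mtime, last_retrieval):
    """True if the index file changed since the last retrieval of this host."""
    return remote_mtime > last_retrieval


candidates = index_candidates("ls-lR", ".Z")
changed = use_index_file(datetime(1996, 8, 2), datetime(1996, 8, 1))
```
.fi
.PP
In this sketch the modification time would come from an "MDTM" reply;
when the file is unchanged, or the extensions are unsupported, the
caller falls back to the recursive listing.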
.PP
The input file containing the header information (see
.BR archie_header (5))
is read and the site listed therein is contacted. In the
absence of access command information in the header, the default
action for the operating system at this site is taken. This action
is described in a configuration file (see "Configuration File" below).
The retrieved data is automatically compressed unless the
.B \-U
or
.B \-n
options have been used. The program writes either the output data
file(s) or, if an error occurred during the retrieval, a file containing
an "error header".
.SH CONFIGURATION FILE
The primary purpose of the configuration file for this program is to
provide default parameters to be used when the information in the header
file does not provide explicit instructions for the actions to be
performed during the FTP retrieval. The default configuration file is
.BR ~archie/etc/arretdefs.cf .
.sp
NOTE: The semantics of each field of this file are determined on a
per-catalog basis. Only the first two fields are invariant across
different catalogs.
.sp
For the anonftp catalog it is composed of lines of the following format:
.LP
.I
<dbname>\fB:\fI<os>\fB:\fI<bintrans>\fB:\fI<compext>\fB:\fI<user>\fB:\fI<passwd>\fB:\fI<acct>\fB:\fI<ftp arg>\fB:\fI<glob>\fB:\fI<idx>\fR
.PP
Where
.RS
.TP
.I <dbname>
is the name of the Archie catalog.
.TP
.I <os>
is the operating system, as specified in Archie header records.
.TP
.I <bintrans>
is the FTP protocol command (as defined in RFC 959) to be used to place
the remote FTP server in binary transfer mode.
.TP
.I <compext>
is the file extension used on that operating system for
the default compression method. For example, ".Z" for files compressed
using the
.BR compress (1)
program.
.TP
.I <user>
is the default user code to use for anonymous FTP access.
.TP
.I <passwd>
is the default password to use for anonymous FTP access. If this
field is not specified the system will automatically generate a
password of the form
.BI archie@ <hostname>
(where
.I <hostname>
is the name of the host performing the retrieval). If the file
.B ~archie/etc/archie.hostname
has been configured, the host name given there is used.
.TP
.I <acct>
is the default account to be used for anonymous FTP access.
.TP
.I <ftp arg>
is the default argument to be used with the FTP "LIST"
command (see RFC 959) when performing a listing at this
site.
.TP
.I <glob>
are the globbing characters used by the remote system.
.TP
.I <idx>
is the base name used by convention by the remote system to store
indexing information.
.RE
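.PP
The default password rule described for
.I <passwd>
can be sketched in Python as follows. This is a minimal illustration,
not the program's actual code: the function and parameter names are
hypothetical, and
.I configured_hostname
stands for whatever host name has been set in
~archie/etc/archie.hostname (how that file is read is not shown).
.PP
.nf
```python
def default_password(passwd_field, local_hostname, configured_hostname=None):
    """Return the password to send for anonymous FTP access."""
    if passwd_field:
        # An explicit password in the configuration line is used as given.
        return passwd_field
    # Otherwise prefer the configured host name over the local one.
    host = configured_hostname or local_hostname
    return "archie@" + host


generated = default_password("", "retriever.example.org")
```
.fi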
.PP
.B Example
.RS
.PP
anonftp:unix_bsd:image:.Z:anonymous:::-R:*?:ls-lR
.RS
.TP
.B Field 1.
Specifies the "anonftp" Archie catalog.
.TP
.B Field 2.
Specifies "unix_bsd" as the operating system.
.TP
.B Field 3.
The FTP protocol command for binary transmission is "image".
.TP
.B Field 4.
The default extension for compressed files on this system is ".Z" (from
.BR compress (1)).
.TP
.B Field 5.
The default user is "anonymous".
.TP
.B Field 6.
No password is specified, so archie@\fI<hostname>\fR is used.
.TP
.B Field 7.
No account specified. Most anonymous FTP implementations do
not require this command to be used.
.TP
.B Field 8.
The argument to the BSD UNIX
.BR ls (1)
command (which on most FTP implementations is invoked by the FTP
daemon on a "LIST" command) for a recursive listing is
.BR \-R .
.TP
.B Field 9.
UNIX uses '*' and '?' as wildcard characters ("globbing"). If these
characters appear in the names in the file list, then by default the
system creates a separate output file for each wildcard match found.
This feature is disabled by the
.B \-g
option.
.TP
.B Field 10.
The convention on most anonymous FTP sites is that, if a recursive
listing is maintained at the site, the file is called "ls-lR", or
"ls-lR.Z" when compressed. The
.B \-Z
flag uses this information to retrieve these indexing files, if they
exist, in preference to doing the recursive listing. See the
.B \-Z
option description above.
.RE
.RE
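.PP
As an illustration of the format above, the following Python sketch
splits the example line into its ten fields. It is not part of Archie:
the names
.I FIELDS,
.I RetrieveDefaults
and
.I parse_defaults_line
are hypothetical, and the sketch assumes that no field itself contains
a ':' character.
.PP
.nf
```python
from collections import namedtuple

# Field names follow the order given in the format description.
FIELDS = ("dbname", "os", "bintrans", "compext", "user",
          "passwd", "acct", "ftp_arg", "glob", "idx")
RetrieveDefaults = namedtuple("RetrieveDefaults", FIELDS)


def parse_defaults_line(line):
    """Split one anonftp configuration line into its ten named fields."""
    parts = line.rstrip().split(":")
    if len(parts) != len(FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(FIELDS), len(parts)))
    return RetrieveDefaults(*parts)


entry = parse_defaults_line("anonftp:unix_bsd:image:.Z:anonymous:::-R:*?:ls-lR")
```
.fi
.PP
Empty fields, such as the password and account in the example, come
back as empty strings, which a caller can treat as "not specified".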
.SH BUGS
.PP
The FTP command for binary access is not a function of the Operating
System but of the underlying architecture.
.PP
Currently only the Lempel-Ziv compression method is supported.
.SH FILES
~archie/etc/arretdefs.cf
.SH SEE ALSO
.BR arcontrol (n),
.BR archie_header (5),
.BR compress (1),
.BR uncompress (1)
.SH AUTHOR
Bunyip Information Systems
.br
Montr\o"\'e"al, Qu\o"\'e"bec, Canada
.sp
Archie is a registered trademark of Bunyip Information Systems Inc., Canada,
1990.