.\" Copyright (c) 1992, 1994, 1996 Bunyip Information Systems Inc.
.\" All rights reserved.
.\"
.\" Archie 3.5
.\" August 1996
.\"
.\" @(#)retrieve_anonftp.n
.\"
.TH RETRIEVE_ANONFTP N "August 1996"
.SH NAME
.B retrieve_anonftp
\- retrieve anonymous FTP directory listings for the Archie anonftp catalog
.SH SYNOPSIS
.B retrieve_anonftp
.BI \-i \ <input>
.BI \-o \ <template>
[
.BI \-M \ <dir>
] [
.B \-U
] [
.B \-n
] [
.B \-g
] [
.BI \-C \ <config>
] [
.BI \-T \ <timeout>
] [
.B \-Z
] [
.B \-l
] [
.BI \-L \ <logfile>
] [
.B \-v
]
.SH DESCRIPTION
.PP
.B retrieve_anonftp
performs the data acquisition phase of the update cycle
for the Archie system and is normally invoked by the
.BR arcontrol (n)
program. As input it is given the name of the file containing the header
information necessary for the retrieval. The output may be several files,
depending on the type of information being retrieved. It is essentially a
self-contained FTP client program. By default the program stores the
retrieved files in adaptive Lempel-Ziv compressed format, the same as that
used by the standard UNIX
.BR compress (1)
and
.BR uncompress (1)
utilities. This may be overridden by the
.B \-U
and
.B \-n
options.
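.PP
The interaction between the default storage format and the
.B \-U
and
.B \-n
options (described under OPTIONS) can be sketched as follows.
This is a minimal illustration in Python, not the program's actual logic.
The function name and its parameters are hypothetical, and two points are
assumptions: that a listing retrieved already compressed is left as it is
by default, and that
.B \-n
takes priority when both options are given.
.PP
.nf
```python
def storage_action(retrieved_compressed, uncompress=False, no_change=False):
    """Return "compress", "uncompress" or "keep" for one retrieved listing.

    uncompress models -U and no_change models -n. Giving -n priority
    when both are set is an assumption made for this sketch.
    """
    if no_change:
        # -n: leave the listing in the format in which it was retrieved.
        return "keep"
    if uncompress:
        # -U: only listings that arrived compressed need any work.
        return "uncompress" if retrieved_compressed else "keep"
    # Default: store compressed (assumed no-op if already compressed).
    return "keep" if retrieved_compressed else "compress"
```
.fi
.PP
Under these assumptions a plain listing retrieved without options yields
"compress", while the same listing retrieved with
.B \-n
yields "keep".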
.SH OPTIONS
.PP
The following two options are mandatory:
.RS
.TP
.BI \-i \ <input>
The filename of the header file containing the information
necessary for the retrieval of the anonymous FTP listings.
.TP
.BI \-o \ <template>
The base name (template) for the output file(s) generated by the program.
.RE
.PP
The following options are optional:
.RS
.TP
.B \-U
Uncompress the retrieved listings and store them in an
uncompressed format. This applies to listings that are
retrieved already compressed (e.g., ls-lR.Z files). This can potentially
speed up the execution of subsequent phases of the update cycle, but
more disk space is needed to hold the uncompressed data.
.TP
.B \-n
Do not compress or uncompress the retrieved listings. They will
remain in the format in which they were retrieved. This can potentially
speed up the execution of subsequent phases of the update cycle, but
more disk space is needed to hold any uncompressed data.
.TP
.B \-g
Disable the globbing feature. By default, if the globbing characters
specified in the configuration file occur in the names of the files in the
file list being retrieved, a separate data file is produced for each
wildcard match. If this flag is specified, the feature is disabled and
the wildcard characters no longer have their special meaning.
.TP
.BI \-C \ <config>
Use
.I <config>
as the configuration file. If not given, the file
.B ~archie/etc/arretdefs.cf
is used.
.TP
.BI \-T \ <timeout>
If the transfer is inactive for
.I <timeout>
minutes, the retrieve is terminated. The default value is 10 minutes.
.TP
|
|
.BI \-Z
|
|
If this flag is supplied and there is not an explicit file list given on
|
|
input for retrieval the program will look for indexing files on the
|
|
remote anonymous FTP archive using information supplied in the
|
|
configuration file. It first looks for the filename with compression
|
|
extension specified in the configuration file, if this is not successful
|
|
it then it looks for the file without the extension. If this too fails,
|
|
it attempts to fine those files in a subdirectories called "pub" and
|
|
"PUB". If the files are not found it gives up and continues the default
|
|
behavior. If one of the files is found its modification time is checked
|
|
to determine if it has been changed since the last retrieval of this
|
|
host. If it has, then the file is picked up in preference to doing the
|
|
recursive listing. If it is not then the file is ignored.
|
|
This procedure makes use of the FTP protocol extensions "MDTM" and
|
|
"SIZE". If these extensions are not supported on the remote site then the
|
|
program proceeds with the default behavior.
|
|
.TP
.B \-v
Verbose. Describe the details of the session.
.TP
.BI \-M \ <dir>
The name of the master Archie database directory. If
not given, the program tries to look in the directory
.B ~archie/db
and, failing that, defaults to
.BR ./db .
.TP
.B \-l
Write any user output to the default log file
.BR ~archie/logs/archie.log .
If desired, this can be overridden with the
.B \-L
option. Errors will by default be written to stderr.
.TP
.BI \-L \ <logfile>
The name of the file to be used for logging information.
Note that debugging information is also written to the
log file. This implies the
.B \-l
option as well.
.RE
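.PP
The fallback order described for the
.B \-M
option can be illustrated with the short Python sketch below. It is not
taken from the program. The function name and the injectable
.I isdir
check are hypothetical, and the sketch assumes that "failing that" means
the directory
.B ~archie/db
does not exist.
.PP
.nf
```python
import os


def resolve_db_dir(explicit=None, isdir=os.path.isdir):
    """Pick the master database directory following the -M fallback order.

    isdir is injectable so the order can be exercised without touching disk.
    """
    if explicit:
        # A directory given with -M always wins.
        return explicit
    default = os.path.expanduser("~archie/db")
    if isdir(default):
        return default
    # Last resort named in the option description.
    return "./db"
```
.fi
.PP
Calling the function with no arguments on a host that has no
.B ~archie/db
directory returns "./db".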
.PP
The input file containing the header information (see
.BR archie_header (5))
is read and the site listed therein is contacted. In the
absence of access command information in the header, the default
action for the operating system at this site is taken. This action
is described in a configuration file (see "Configuration File" below).
The retrieved data is automatically compressed unless the
.B \-U
or
.B \-n
option has been used. The program writes either the output data
file(s) or, if an error occurred during the retrieve, a file
containing an "error header".
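.PP
The search performed for the
.B \-Z
option can be sketched in Python as follows. The index base name and the
compression extension come from the configuration file described in the
next section. The function names are hypothetical. The compressed-then-plain
order follows the option description, but trying both names in "pub" before
"PUB" is an assumption, as is comparing modification times as plain numbers.
.PP
.nf
```python
def index_candidates(idx, compext, subdirs=("pub", "PUB")):
    """List remote paths to probe for an index file, in order.

    The compressed-then-plain order comes from the -Z description; probing
    both names in "pub" before "PUB" is an assumption of this sketch.
    """
    names = [idx + compext, idx]
    paths = list(names)
    for subdir in subdirs:
        paths.extend(subdir + "/" + name for name in names)
    return paths


def should_fetch_index(remote_mtime, last_retrieval):
    """True if the index file changed since the host was last retrieved."""
    return remote_mtime > last_retrieval
```
.fi
.PP
With the values "ls-lR" and ".Z" the first function returns ls-lR.Z, ls-lR,
pub/ls-lR.Z, pub/ls-lR, PUB/ls-lR.Z and PUB/ls-lR, in that order.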
.SH CONFIGURATION FILE
The primary purpose of the configuration file for this program is to
provide default parameters to be used when the information in the header
file does not provide explicit instructions for the actions to be
performed during the FTP retrieve. The default configuration file is
.BR ~archie/etc/arretdefs.cf .
.sp
NOTE: The semantics of each field of this file are determined on a
per-catalog basis. Only the first 2 fields are invariant across
different catalogs.
.sp
For the anonftp catalog it is composed of lines of the following format:
.LP
.I
<dbname>\fB:\fI<os>\fB:\fI<bintrans>\fB:\fI<compext>\fB:\fI<user>\fB:\fI<passwd>\fB:\fI<acct>\fB:\fI<ftp arg>\fB:\fI<glob>\fB:\fI<idx>\fR
.PP
Where
.RS
.TP
.I <dbname>
is the name of the Archie catalog.
.TP
.I <os>
is the operating system, as specified in Archie header records.
.TP
.I <bintrans>
is the FTP protocol command (as defined in RFC 959) to be used to place
the remote FTP server in binary transfer mode.
.TP
.I <compext>
is the file extension used on that operating system for
the default compression method. For example, ".Z" for files compressed
using the
.BR compress (1)
program.
.TP
.I <user>
is the default user code to use for anonymous FTP access.
.TP
.I <passwd>
is the default password to use for anonymous FTP access. If this
field is not specified, the system automatically generates a
password of the form
.BI archie@ <hostname>
(where
.I <hostname>
is the name of the host performing the retrieve). If the file
.B ~archie/etc/archie.hostname
has been configured, the host name given there is used.
.TP
.I <acct>
is the default account to be used for anonymous FTP access.
.TP
.I <ftp arg>
is the default argument to be used with the FTP "LIST"
command (see RFC 959) when performing a listing at this
site.
.TP
.I <glob>
are the globbing characters used by the remote system.
.TP
.I <idx>
is the base name used by convention by the remote system to store indexing
information.
.RE
.PP
.B Example
.RS
.PP
anonftp:unix_bsd:image:.Z:anonymous:::-R:*?:ls-lR
.RS
.TP
.B Field 1.
Specifies the "anonftp" Archie catalog.
.TP
.B Field 2.
Specifies "unix_bsd" as the operating system.
.TP
.B Field 3.
The FTP protocol command for binary transmission is "image".
.TP
.B Field 4.
The default extension for compressed files on this system is ".Z" (from
.BR compress (1)).
.TP
.B Field 5.
The default user is "anonymous".
.TP
.B Field 6.
No password specified. archie@\fI<hostname>\fR will be used.
.TP
.B Field 7.
No account specified. Most anonymous FTP implementations do
not require this command to be used.
.TP
.B Field 8.
The argument to the BSD UNIX
.BR ls (1)
command (which on most FTP implementations is invoked by the FTP
daemon on a "LIST" command) for a recursive listing is
.BR \-R .
.TP
.B Field 9.
UNIX uses '*' and '?' as wildcard characters ("globbing"). If these
characters appear in the names of the file lists then by default the
system will create separate output files for each wildcard match found.
This feature is disabled by the
.B \-g
option.
.TP
.B Field 10.
The convention on most anonymous FTP sites is that if the recursive
listing is maintained at the site then the file is called "ls-lR" or,
when compressed, "ls-lR.Z". The
.B \-Z
flag uses this information to retrieve
these indexing files, if they exist, in preference to doing the recursive
listing. See the
.B \-Z
option description above.
.RE
.RE
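.PP
To make the field layout concrete, the following Python sketch splits the
example line above into its ten fields and applies the documented default
for an empty password. It is an illustration only: the class and function
names are hypothetical, the
.I hostname
parameter stands in for the name of the retrieving host, and the sketch
assumes that field values never contain a colon.
.PP
.nf
```python
from dataclasses import dataclass


@dataclass
class RetrieveDefaults:
    dbname: str
    os: str
    bintrans: str
    compext: str
    user: str
    passwd: str
    acct: str
    ftp_arg: str
    glob: str
    idx: str


def parse_defaults_line(line, hostname):
    """Split one anonftp configuration line into its ten fields.

    hostname is a hypothetical parameter standing in for the name of the
    retrieving host; it is only used to fill in an empty password field.
    """
    fields = line.strip().split(":")
    if len(fields) != 10:
        raise ValueError("expected 10 colon-separated fields")
    entry = RetrieveDefaults(*fields)
    if not entry.passwd:
        # Documented default: archie@<hostname>.
        entry.passwd = "archie@" + hostname
    return entry


entry = parse_defaults_line(
    "anonftp:unix_bsd:image:.Z:anonymous:::-R:*?:ls-lR", "archie.example.org")
```
.fi
.PP
Here the host name "archie.example.org" is a placeholder. The resulting
entry has an empty account field and the password
"archie@archie.example.org".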
.SH BUGS
.PP
The FTP command for binary access is not a function of the operating
system but of the underlying architecture.
.PP
Currently only the Lempel-Ziv compression method is supported.
.SH FILES
~archie/etc/arretdefs.cf
.SH SEE ALSO
.BR arcontrol (n),
.BR compress (1),
.BR uncompress (1)
.SH AUTHOR
Bunyip Information Systems
.br
Montr\o"\'e"al, Qu\o"\'e"bec, Canada
.sp
Archie is a registered trademark of Bunyip Information Systems Inc., Canada,
1990.