Archie 3.5 Beta
---------------

Patch level 3:
==============

- Support for ls-lR.gz files.
- New WWW front-end to the Archie database.
- Improved search speed.
- ISO-Latin-1 support for searches.
- Various bug fixes.

Some notes on how to install or take advantage of the new features:

ls-lR.gz
--------
You must have gzip and gunzip installed on your system.
Edit the file

    ~archie/etc/arretdefs.cf

and change the line

    anonftp:unix_bsd:image:.Z:anonymous:::-R:*?:ls-lR

to

    anonftp:unix_bsd:image:.gz,.Z:anonymous:::-R:*?:ls-lR
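
The change to arretdefs.cf can also be scripted. The following is a
minimal Python sketch and not part of the Archie distribution. It assumes
that the entry is colon-separated and that the fourth field holds the list
of compression suffixes, as the example line suggests. The function name
add_gz_suffix is hypothetical.

```python
def add_gz_suffix(line: str) -> str:
    """Return the line with '.gz' added to the suffix field.

    Assumption: fields are separated by ':' and the fourth field is a
    comma-separated list of compression suffixes (e.g. '.Z').
    """
    fields = line.rstrip("\n").split(":")
    if len(fields) < 4 or fields[0] != "anonftp":
        return line  # not an anonftp entry; leave it untouched
    suffixes = [s for s in fields[3].split(",") if s]
    if ".gz" not in suffixes:
        suffixes.insert(0, ".gz")  # list .gz first, as in the notes
    fields[3] = ",".join(suffixes)
    newline = "\n" if line.endswith("\n") else ""
    return ":".join(fields) + newline


old = "anonftp:unix_bsd:image:.Z:anonymous:::-R:*?:ls-lR"
print(add_gz_suffix(old))
# prints: anonftp:unix_bsd:image:.gz,.Z:anonymous:::-R:*?:ls-lR
```

Applying the function to a line that already lists .gz leaves it
unchanged, so the script can be run more than once. Keep a backup copy of
arretdefs.cf before rewriting it.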

You also need to modify the file

    ~archie/etc/options.cf

(read the instructions in that file).

WWW front-end
-------------

The files related to the WWW front-end are in ~archie/cgi.

In ~archie/cgi/bin you will find a Perl script (archie) and a binary
program (cgi-client).

The top part of the Perl script tells you what needs to be set up.

In ~archie/cgi/html are the GIF files and search forms.
They are not in their final form, so do not hesitate to
send us your comments.

The files archie and archie-adv in that directory should
be modified to indicate where the Perl script is located.

We recommend that a uniform URL be used for Archie so that
Archie users can easily find the search page:

    http://archie.foo.bar/archie and
    http://archie.foo.bar/archie-adv


New in this release:
====================

Here are the major components added to the system, with some
of the key points of each.

- Support for a new database module (webindex)
    - Retrieval of HTML pages over HTTP
    - Keyword extraction
    - Controlled crawling of the WWW
    - Site-by-site basis
    - Content extraction
    - Configurable stoplist (keyword exclusion)
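
To illustrate what keyword extraction with a stoplist means, here is a
minimal Python sketch. It is not the webindex implementation: the
tokenizing rule, the sample stoplist and the function name
extract_keywords are all assumptions made for the example.

```python
import re

# Hypothetical stoplist; a real one would be read from a configuration file.
SAMPLE_STOPLIST = {"the", "a", "an", "and", "of", "to", "in"}


def extract_keywords(text, stoplist=SAMPLE_STOPLIST):
    """Return the distinct words of text, in order, minus stoplist words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    seen = set()
    keywords = []
    for word in words:
        if word in stoplist or word in seen:
            continue  # excluded keyword or duplicate
        seen.add(word)
        keywords.append(word)
    return keywords


print(extract_keywords("The index of the Archie database"))
# prints: ['index', 'archie', 'database']
```

Passing a different set as stoplist changes which words are excluded,
which is the sense in which the stoplist is configurable in this sketch.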

- New database structure
    - More reliable structure

- New search engine
    - Based on a paged, tree-structured index
    - Faster searches
    - Less memory required
    - More disk space required for construction of the index

- New search interface
    - cgi-bin compliant interface

- A better domain filter for anonftp
    - Results can be pre-configured to return in a certain order
      (e.g. FTP sites close to the server first)
    - Configurable on a per-server basis
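
The ordering idea behind the domain filter can be sketched as follows.
This is an illustration only and not the Archie code: the function name
order_by_domain, the preference list and the host names are hypothetical.
Hosts are ranked by the first preferred domain they fall under, and hosts
matching no preferred domain come last.

```python
def order_by_domain(hosts, preferred_domains):
    """Sort hosts so that those in earlier preferred domains come first."""

    def rank(host):
        for position, domain in enumerate(preferred_domains):
            if host == domain or host.endswith("." + domain):
                return position
        return len(preferred_domains)  # no match: sort after all matches

    # sorted() is stable, so hosts with equal rank keep their input order.
    return sorted(hosts, key=rank)


hosts = ["ftp.example.com", "ftp.site.ca", "ftp.uni.edu"]
print(order_by_domain(hosts, ["ca", "edu"]))
# prints: ['ftp.site.ca', 'ftp.uni.edu', 'ftp.example.com']
```

Each server would supply its own preference list, so that sites close to
that server are returned first.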


Fixed bugs
==========

- The ``-t'' switch on arcontrol now creates the new files and work files
  in the specified temp directory and not in ~archie/db/tmp.

- Lock files are now created in ~archie/db/locks.

- host_manage can handle multiple preferred hostnames.


Currently working on:
=====================

- Rewrite of the cgi-bin front-end to be more flexible.

- Archie Help page.

- Regular expressions with the new search engine.

- New set of manpages and documentation.

- Additional types of searches.


Currently testing:
==================

- dirsrv with the new database technology.


Known Problems:
===============

- arexchange of webindex is not yet fully functional:
  it will not transfer .excerpt files. We still need to experiment
  with indexing of the Web and see what is involved in
  exchanges of data.