Help For Java Indexer V1.5
Welcome to ExNet's Java-based real-time index-search tool, designed to search
a site's web pages instantly.
This tool does not require a CGI server and uses only flat files. That makes it ideal if you are an end user behind a firewall,
or if you run a Web site in space you buy from a service provider that does not provide CGI services.
How To Use the Search Tool
When you first load a page containing the search applet, the applet will
attempt to load up the index associated with that applet (note that each applet
can load a different index). It will also have to load up the classes for the
applet. Depending on the speed of your network connection and the size of the
index, that will take anything from a few seconds to a couple of minutes.
In most browsers, if you revisit the page with the applet on it without
quitting the browser in the meantime, you will not need to reload the index.
Type some words or a phrase that you want to look
for into the ``Search for'' box (which is yellow where the
browser permits). Either hit the RETURN or ENTER key to start a search
immediately, or wait a little while and a search will start automatically. The
tool will find the documents it thinks best match those search words, and
will list them in the ``Results'' window, best match first.
Double-click on the document you want to look at. The tool scores first by the
number of the search words that were found in each document, and then by how
rare those words are, i.e. how good they should be at picking out the documents
you want to see. Thus the word ``the'' generally won't add much to how good
the tool thinks a document is if you include it in your search words, because
for typical English-text documents it is very common.
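The two-level ordering described above can be sketched in Java as follows. This is only an illustration, not ExNet's code: the class name RankSketch, the shape of the index (a map from each word to the set of documents containing it) and the rarity weight log(N/df) are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not the applet's real classes.
final class RankSketch {

    /** One ranked document: its name, how many search words it matched, and a rarity score. */
    static final class Hit {
        final String doc;
        final int matched;
        final double rarity;

        Hit(String doc, int matched, double rarity) {
            this.doc = doc;
            this.matched = matched;
            this.rarity = rarity;
        }
    }

    /**
     * postings:  word -> documents containing it (assumed index shape)
     * totalDocs: number of documents in the index
     * query:     search words, already lower-cased and de-duplicated
     */
    static List<Hit> rank(Map<String, Set<String>> postings, int totalDocs, List<String> query) {
        Map<String, Integer> matched = new HashMap<>();
        Map<String, Double> rarity = new HashMap<>();
        for (String word : query) {
            Set<String> docs = postings.get(word);
            if (docs == null || docs.isEmpty()) {
                continue; // word is not in the index
            }
            // Assumed weight: a word found in every document contributes 0.
            double weight = Math.log((double) totalDocs / docs.size());
            for (String doc : docs) {
                matched.merge(doc, 1, Integer::sum);
                rarity.merge(doc, weight, Double::sum);
            }
        }
        List<Hit> hits = new ArrayList<>();
        for (Map.Entry<String, Integer> e : matched.entrySet()) {
            hits.add(new Hit(e.getKey(), e.getValue(), rarity.get(e.getKey())));
        }
        // Most matched words first, then rarer words first, then by name.
        hits.sort((a, b) -> {
            if (a.matched != b.matched) {
                return Integer.compare(b.matched, a.matched);
            }
            int byRarity = Double.compare(b.rarity, a.rarity);
            return byRarity != 0 ? byRarity : a.doc.compareTo(b.doc);
        });
        return hits;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> postings = new HashMap<>();
        postings.put("the", new HashSet<>(Arrays.asList("a.html", "b.html", "c.html")));
        postings.put("applet", new HashSet<>(Arrays.asList("b.html")));
        for (Hit h : rank(postings, 3, Arrays.asList("the", "applet"))) {
            System.out.println(h.doc + " matched=" + h.matched + " rarity=" + h.rarity);
        }
    }
}
```

With a weight of this kind, a word such as ``the'' that occurs in every document adds to the count of matched words but contributes nothing to the rarity score.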
Good Documents and Bad Documents
Documents that the tool thinks are very likely to be good on the basis of the
words they contain will be marked with three stars (``***'') at the
start of the entry, down to one star for entries that are marginal, and just a
question mark (``?'') for documents that don't match enough of your
search words. This means that sometimes documents at the top of the list (that
match most words) will not have the most stars (because the words matched may
not be very rare and thus good at selecting between documents).
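As a rough illustration of why list position and stars can disagree, the sketch below marks an entry from two separate inputs: how many search words matched, and how rare the matched words are. The class name StarSketch and every threshold in it are hypothetical; the cut-offs the tool really uses are not documented here.

```java
// Hypothetical sketch: the thresholds are invented for illustration only.
final class StarSketch {

    /**
     * matched:    number of search words found in the document
     * queryWords: number of distinct search words
     * rarity:     summed rarity of the matched words (larger means rarer)
     */
    static String mark(int matched, int queryWords, double rarity) {
        if (matched * 2 < queryWords) {
            return "?"; // assumed rule: fewer than half of the words matched
        }
        if (rarity >= 3.0) {
            return "***";
        }
        if (rarity >= 1.5) {
            return "**";
        }
        return "*";
    }

    public static void main(String[] args) {
        // Matches all three words, but only common ones.
        System.out.println(mark(3, 3, 0.2));
        // Matches two of three words, both rare.
        System.out.println(mark(2, 3, 3.5));
    }
}
```

Under these assumed rules a document that matches every search word still gets a single star if the words are all common, while a document lower in the list can get three.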
How Many Documents Are In The Index?
You can see the number of different documents or document sections that the
index was generated from, and the number of different words in the index, in
the bottom line of the search applet once the index has loaded. You can also
see a status indicator that shows when the tool is busy, and where possible how
close it is to finishing whatever it is doing. You will usually see this
indicator in use when the index is being loaded or a search is being done.
This tool is multi-threaded and so may be doing several things at once for you!
Auto Cue
If you leave the ``Search for'' box blank for a while you will find that
the applet is usually set up to cue you with some interesting searches you might
do, or brief instructions. If this ``auto cue'' system is being used, text
will appear in the ``Search for'' box as if typed in there, and the system
will search for the words there as normal. You can disable this by leaving
some text in the search box, or by starting to edit any of the search text that
the auto cue system generated.
Which Words and How Many?
As you are typing in your search words you will see text appearing in the
``Doc counts'' box. (You may have to scroll this box to see all the
text in it if your search has many words.) The first part of this box shows
the two closest words in the index on either side of the last search word you
have typed in. If your search word is not in the index it will appear between
these two surrounded by ``?'' question marks. This will help you choose which
words to look for. After that, you will see the word ``FOUND:'' followed by a
list of words. As with the words in the first part of the box, each is
followed by a colon (``:'') and a number, which is the number of documents the
word appeared in. You will notice that the words are all converted to
lower-case, and that duplicates are removed; words that aren't in the index are
not shown at all. Very long words or strings of digits are broken up into
shorter bits. All this helps the indexing mechanism help you find words that
you are looking for.
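The behaviour of the ``Doc counts'' box can be approximated with a sorted map from each index word to its document count. The sketch below is an assumption-laden illustration, not the applet's code: the class name DocCountsSketch, the maximum word length of 8 and the exact output layout are invented for the example. It relies only on the standard TreeMap methods lowerKey and higherKey to find the neighbouring words.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch, not the applet's real classes.
final class DocCountsSketch {

    static final int MAX_LEN = 8; // assumed limit; the real value is not documented

    /** Lower-cases the text, splits it into words, breaks up long words, drops duplicates. */
    static List<String> normalise(String text) {
        Set<String> seen = new LinkedHashSet<>(); // keeps first-seen order
        for (String raw : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            for (int i = 0; i < raw.length(); i += MAX_LEN) {
                seen.add(raw.substring(i, Math.min(raw.length(), i + MAX_LEN)));
            }
        }
        return new ArrayList<>(seen);
    }

    /** Shows the index words either side of 'word'; a missing word is wrapped in '?'. */
    static String neighbours(TreeMap<String, Integer> index, String word) {
        StringBuilder sb = new StringBuilder();
        String below = index.lowerKey(word);  // closest index word before 'word'
        String above = index.higherKey(word); // closest index word after 'word'
        if (below != null) {
            sb.append(below).append(':').append(index.get(below)).append(' ');
        }
        Integer count = index.get(word);
        if (count != null) {
            sb.append(word).append(':').append(count);
        } else {
            sb.append('?').append(word).append('?');
        }
        if (above != null) {
            sb.append(' ').append(above).append(':').append(index.get(above));
        }
        return sb.toString();
    }

    /** Lists only the words that are in the index, each with its document count. */
    static String found(TreeMap<String, Integer> index, List<String> words) {
        StringBuilder sb = new StringBuilder("FOUND:");
        for (String w : words) {
            Integer count = index.get(w);
            if (count != null) {
                sb.append(' ').append(w).append(':').append(count);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        TreeMap<String, Integer> index = new TreeMap<>();
        index.put("applet", 4);
        index.put("index", 7);
        index.put("java", 9);
        index.put("search", 3);
        List<String> words = normalise("Java jaws APPLET java");
        System.out.println(neighbours(index, words.get(words.size() - 1)));
        System.out.println(found(index, words));
    }
}
```

A sorted map is a natural fit here because the two neighbouring words can be found without scanning the whole index.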
Jumping To Your Chosen Document
If you want to look at one of the documents listed in the ``Results'' window,
double-click on its line. If the browser is able to, the selected document
will be shown in a new browser window; on some older browsers the new
document will be displayed in the window the applet was in.
Tuning Your Search
If you use the search tool a lot you may wish to tune its behaviour. Press the
``Control Panel'' button to get to the control panel, and the ``Search'' button
to get back to the normal search interface.
Technical Details and Miscellany
Changes Since Previous Versions
Changes Since V1.4
The main changes from V1.4 to V1.5 are:
- Improved GUI code for faster redisplay when applet is stopped and restarted.
- Trimmed code where possible for faster loading.
Client Requirements and Environment
This tool should work with Netscape 2.01 or later, or Internet Explorer 3.0 or
later, or any other Java interpreter that can run the output of Sun's JDK 1.0.1
or 1.0.2 javac compiler. Some parts of it may not be functional in
some viewers or with some interpreters.
Credits and Thanks
Many of the techniques used in this index are to be found in the excellent book
``Managing Gigabytes,'' Witten, IH, Moffat, A, Bell, TC, Van Nostrand Reinhold
1994, ISBN 0-442-01863-0. Most of the classes have been written by Damon
Hart-Davis; some have been written by Caroline Skene.
Use of This Code by Site Maintainer and End User
Note that the Java classes and code to build these indices are available from
ExNet; please mail us for the price. The viewer (.class) code is free for
non-commercial use, providing you tell us who you are and what you are using it
for. De-compilation is not permitted. All other rights are reserved.
This software is supplied as-is. To end users of the free parts of this software
we provide no warranties of any kind. Our liability to a Web-site provider
is limited to at most the price paid to us for this product.
Web site provider: you may modify this document for display on your site, providing the alterations
are reasonable and providing you note the source of the original document and
provide a link back to ExNet where possible.
ExNet's home page.
Sales queries to info@exnet.com,
technical queries to sysadmin@exnet.com.
All code and documentation copyright DHD/EL 1995--1997.