What is the purpose of yet another mailing list archive?
To be fast
To be searchable
To be a real life "test-bed" for our technology
How many messages are indexed?
As of January 24, 2004, we have 292766 messages (> 3 GB) indexed.
The database size (metadata and search indices) is about 1 GB.
What hardware is your server running on?
Everything (database, httpd, search) is running on one server:
Dual PIII 1 GHz, 1 GB RAM, SCSI and IDE drives.
Sponsors are welcome!
What about the software setup?
Linux, PostgreSQL 7.4.1 (contrib modules: intarray, tsearch2),
Apache, thttpd, mod_perl, Mason, OpenFTS.
More information is available on the MailWare page.
Why can the performance be slow?
There may be several reasons:
Your query is too common and/or the search region is too broad.
Try to be more specific (the more words, the faster the search),
or limit the search to a date range or a specific mailing list.
The server is overloaded.
We are performing maintenance on the server.
Also, you may try PGsearch,
which should be faster, but lacks metadata support.
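The advice about narrowing a query can be illustrated with a toy example. This is only a sketch, not the code behind this archive: it assumes that all query words must match (AND semantics), and the sample messages, the build_index and search functions, and their parameters are hypothetical. Each added word or filter shrinks the set of candidate messages that has to be processed.

```python
from datetime import date

# Hypothetical toy archive: (message id, list name, date, text).
MESSAGES = [
    (1, "pgsql-general", date(2003, 11, 2), "vacuum is slow on large table"),
    (2, "pgsql-hackers", date(2003, 12, 15), "gist index vacuum patch"),
    (3, "pgsql-general", date(2004, 1, 10), "slow query with gist index"),
    (4, "pgsql-general", date(2004, 1, 20), "vacuum full locks table"),
]


def build_index(messages):
    """Map each word to the set of ids of the messages that contain it."""
    index = {}
    for msg_id, _list_name, _day, text in messages:
        for word in text.split():
            index.setdefault(word, set()).add(msg_id)
    return index


def search(index, messages, words, mailing_list=None, since=None):
    """Intersect the id sets of all words, then apply the optional filters."""
    ids = None
    for word in words:
        postings = index.get(word, set())
        ids = postings if ids is None else ids & postings
    ids = ids or set()

    hits = []
    for msg_id, list_name, day, _text in messages:
        if msg_id not in ids:
            continue
        if mailing_list is not None and list_name != mailing_list:
            continue
        if since is not None and day < since:
            continue
        hits.append(msg_id)
    return hits


INDEX = build_index(MESSAGES)

# A single common word matches many messages.
print(search(INDEX, MESSAGES, ["vacuum"]))
# A second word narrows the candidate set.
print(search(INDEX, MESSAGES, ["vacuum", "table"]))
# Restricting the list and the date range narrows it further.
print(search(INDEX, MESSAGES, ["vacuum", "table"],
             mailing_list="pgsql-general", since=date(2004, 1, 1)))
```

In a real full-text index the effect is similar in spirit: rare words and tight restrictions leave fewer rows to rank and fetch, while a single very common word forces the engine to touch a large part of the archive.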
What is a stop word?
Some words are too common or non-informative to be indexed.
We use a standard list of stop words, with the addition of several
PostgreSQL-specific words such as "PostgreSQL", "postgres", "pgsql"...
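As a rough illustration of what stop-word removal does, here is a minimal sketch. It is not the tsearch2 or OpenFTS implementation: the STOP_WORDS set is a tiny hypothetical sample (the real list is much longer), and index_terms is a made-up helper name.

```python
# Hypothetical stop-word list: a few common English words plus the
# PostgreSQL-specific additions mentioned in the answer above.
STOP_WORDS = {
    "the", "a", "is", "to", "of", "in",
    "postgresql", "postgres", "pgsql",
}


def index_terms(text):
    """Lowercase the text, strip surrounding punctuation, drop stop words."""
    terms = []
    for raw in text.lower().split():
        word = raw.strip(".,;:!?\"'()")
        if word and word not in STOP_WORDS:
            terms.append(word)
    return terms


# Only the informative words are kept for indexing.
print(index_terms("Is the PostgreSQL vacuum slow in pgsql 7.4?"))
```

A practical consequence is that a query consisting only of stop words, for example just "postgres", has nothing left to search for once the stop words are removed.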
Why is there no spell-checking support (for query misprints)?
Actually, we have "query correction" in our TODO for contrib/tsearch2,
but we're busy with other work. If you want to sponsor
tsearch2 development, contact us for details.
Authors, please...
Oleg Bartunov and Teodor Sigaev. We develop and maintain
GiST in PostgreSQL. We are also the authors of several contrib modules
for PostgreSQL.
More information is available on our development page.