Welcome to POPFile v0.21.0

This version consists of a major update to v0.20.1 with many improvements and bug fixes:

1. Multi-user phase 1

POPFile now recognizes two environment variables (POPFILE_ROOT and POPFILE_USER) which can be used to customize the location of POPFile and the location of its per-user data. POPFile *no longer* has to be run with the current working directory set to where popfile.pl is installed.

POPFILE_ROOT: this is the full path to the popfile.pl file. Once set you can run POPFile from any directory and it will find its modules using POPFILE_ROOT.

POPFILE_USER: this is where POPFile will keep its per-user data (i.e. the database and history).

This means that a single installation of POPFile can be used with multiple users. Just specify a different directory in POPFILE_USER for each user and run POPFile once per user.

In a future release POPFile will support a single instance of POPFile with multiple users sharing the same database (this is known as Multi-user Phase 2 and you can read more in the POPFile Roadmap; see below for the link).

2. Switch to SQL database

In v0.20.0 POPFile switched from storing all its information in flat files to using BerkeleyDB. BerkeleyDB has proved to be not stable enough on Windows 98 in certain cases. For that reason, and to prepare for Multi-user Phase 2, the underlying data has been moved into an SQL database.

POPFile currently uses SQLite by default but can be successfully used with other SQL databases like MySQL. (SQLite can be found on the web at http://www.sqlite.com/ and is public domain software.)

The schema for POPFile can be found in Classifier/popfile.sql for those who want to load it into their own SQL database. If you start using POPFile with a database we don't currently support please let me know (e.g. anyone want to try Oracle?). POPFile *should* work with any SQL database supported by Perl's DBI mechanism.

3. UI improvements

The global options for Subject Modification, X-Text-Classification insertion and X-POPFile-Link insertion have been removed and replaced with individual options on a per-bucket basis to give greater choice in configuring POPFile.

IMPORTANT NOTE: Because of the elimination of the global parameters you will need to check and set/unset the individual bucket parameters for Subject Modification, X-Text-Classification and X-POPFile-Link. Visit the Buckets page to configure each bucket to your liking.

The 'unclassified' bucket is now visible in the UI so that you can see how many messages were unclassified, and configure header modification. This also means that unclassified messages are counted in the accuracy statistics; previously they were not counted, which could have skewed the accuracy statistics if there were unclassified messages.

The history "page" bar has been simplified so that it uses a fixed amount of screen space, while making navigation easy.

Filters and searches on the history are now persistent. For example, you can click on the Buckets page and return to the History page without losing your filter or search settings.

The UI password is now stored as a one-way hash and is no longer ever displayed or stored in plain text.

The Buckets page has been modified to only show the 'distinct word' count per bucket and to (finally :-) show the total number of distinct words in the database. Previously we showed two counts with confusing titles: now we show the true number of words in the database, not the "word counts" (which was the number of times each word occurred).

Another language has been added to the list of interface localizations: POPFile is now available in our second right-to-left language, Arabic.

4. Change to command line

If you are using the command line to configure POPFile there has been a major change in the way it is parsed.
The old style (for example, perl popfile.pl -pop3_port 110) has been deprecated and replaced with a proper getopt style command line. We now have a single -- parameter: --set which is used to set POPFile configuration options on the command line. For example 'perl popfile.pl -pop3_port 110' would be replaced with 'perl popfile.pl --set pop3_port=110'.

Note that existing scripts will continue to work since the old style is merely deprecated, but they should be upgraded as support for the old style is not guaranteed for all future versions.

5. Enable/disable modules

Each POPFile module can now be disabled or enabled with command line options: for example if you don't need the optional XMLRPC module loaded it's possible to specify --set xmlrpc_enabled=0 and it will be unloaded.

6. Anti-spam improvements

We've recently seen spam start to use CSS to obscure messages and fool filters like POPFile; in response, this version of POPFile analyzes the CSS in HTML-encoded messages.

POPFile now correctly uses the SpamAssassin headers to make POPFile more efficient when used in conjunction with SpamAssassin.

We now also look at TLDs (Top Level Domains) and store them as pseudowords (most useful for TLDs like .biz).

It's possible that you might see a drop in accuracy as your corpus gets trained up on the new anti-spam features. This drop in accuracy will be corrected once you've retrained POPFile a little.

7. POPFile Documentation Project

In addition, since v0.20.1 a lot of work has gone into a set of web pages called the 'POPFile Documentation Project' that includes FAQs, setup guides, etc. Please visit:

http://popfile.sf.net/cgi-bin/wiki.pl

ESSENTIAL READING IF YOU ARE UPGRADING TO v0.21.0

1. BACK UP YOUR OLD INSTALLATION: POPFile makes this really easy, just copy the entire POPFile directory somewhere. You can then safely install POPFile v0.21.0 on top of your current installation.
The installer will automatically create a backup of your existing corpus and configuration file; I just think a full backup is a sensible precaution.

2. IF YOU ARE RUNNING WINDOWS: Please read the section below I AM RUNNING WINDOWS AND NEED TO CHECK MULTIPLE EMAIL ACCOUNTS SIMULTANEOUSLY

3. ON WINDOWS POPFILE IS AN EXE. Windows users will now be able to see POPFile running in the Task Manager with an executable called popfileXX.exe where the XX is one of f, if, b, ib depending on configuration. POPFile is started by running runpopfile.exe which chooses the appropriate popfileXX.exe. This might cause your firewall to ask about giving popfileXX.exe permissions. In addition, if you had allowed Perl permissions in your firewall they are NO LONGER needed.

I AM RUNNING WINDOWS AND NEED TO CHECK MULTIPLE EMAIL ACCOUNTS SIMULTANEOUSLY

Because the time taken to start a new process on Windows is long under Perl there is an optimization for Windows that is active by default: when a new connection is made between your email program and POPFile, POPFile handles it in the 'parent' process. This means that the connection happens fast and mail starts downloading very quickly, but it means that you can only download messages from one server at a time (up to 6 other connections will be queued up and dealt with in the order they arrive) and the UI is unavailable while downloading email.

You can turn this behavior off (and get simultaneous UI/email access and as many email connections as you like) by going to the Configuration panel in the UI and making sure that "Allow concurrent POP3 connections:" is set to Yes, or by specifying --set pop3_force_fork=1 on the command line.

SPECIAL NOTES FOR WINDOWS USERS

POPFile works best if the POPFILE_ROOT and POPFILE_USER environment variables use lowercase short file name format. For example, instead of using C:\Program Files\POPFile you should use c:\progra~1\popfile.
The use of lower case is *very important* if an existing flat file or BerkeleyDB corpus is to be converted. The short file name format is not very user friendly, so the Windows installer makes it easy to create suitable environment variables.

The Windows installer assumes that each POPFile user will have different POPFILE_USER settings, and provides some simple utilities to make it easy to create these new environment variables.

When the installer is run, it automatically creates the necessary lowercase short file name format environment variables for the user running the installer. This data is stored in a user-specific part of the registry.

The "Run POPFile" shortcuts created by the installer in the Start Menu (and, optionally, in the StartUp folder) start the 'runpopfile.exe' program which is responsible for ensuring that POPFILE_ROOT and POPFILE_USER have the correct values before calling the 'popfile.exe' program which starts POPFile.

If 'runpopfile.exe' is run by a user who does not yet have any POPFile settings in the registry, the 'Add POPFile User' wizard will be started if it is available. [This wizard is only available if POPFile was installed by a user with 'Admin' rights]

This wizard lets the user choose a location for their POPFile data and offers to reconfigure any suitable Outlook Express, Outlook or Eudora email accounts for use with POPFile. If any email settings are changed, the wizard will save the data needed to restore the original settings. After this, the wizard will start POPFile by calling 'popfile.exe'.

Windows 9x supports system-wide environment variables which are defined in AUTOEXEC.BAT. Normally to change the value of these environment variables AUTOEXEC.BAT has to be updated and the computer rebooted. The 'runpopfile.exe' program avoids the need to reboot Windows 9x systems in order to switch between different sets of user data (users just need to log on and log off as normal).
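
As a rough illustration of what a launcher of this kind has to do, here is a minimal sketch in Python. It is not the code of 'runpopfile.exe': the function names and the use of 'perl popfile.pl' are assumptions made for the example, POPFILE_ROOT is assumed to name the directory that contains popfile.pl, and the real launcher reads its values from the registry and calls 'popfile.exe'. The sketch also assumes the paths it is given are already in short file name form; it only lowercases them.

```python
import os
import subprocess


def build_popfile_env(root, user, base_env=None):
    """Return a copy of the environment with POPFILE_ROOT and POPFILE_USER set.

    Both values are lowercased, following the recommendation above. The
    paths are assumed to already be in short file name form; nothing here
    converts long names to short ones.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["POPFILE_ROOT"] = root.lower()
    env["POPFILE_USER"] = user.lower()
    return env


def build_popfile_command(root, settings=None):
    """Build a command line that starts popfile.pl with optional --set options.

    Assumption: 'root' is the directory that contains popfile.pl.
    """
    # Join with a backslash when the root looks like a Windows path.
    sep = "\\" if "\\" in root else "/"
    script = root.lower().rstrip("\\/") + sep + "popfile.pl"
    command = ["perl", script]
    for name, value in (settings or {}).items():
        command += ["--set", f"{name}={value}"]
    return command


def launch_popfile(root, user, settings=None):
    """Start POPFile for one user (hypothetical helper, not part of POPFile)."""
    return subprocess.Popen(
        build_popfile_command(root, settings),
        env=build_popfile_env(root, user),
    )
```

Under these assumptions, calling launch_popfile(r"c:\progra~1\popfile", r"c:\progra~1\popfile\alice") would start one copy of POPFile that keeps its data in the second directory; the two builder functions can be inspected on their own without starting anything.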
The 'Add POPFile User' wizard only offers some simple features at present, as it is still under development. For example there is very little error checking in the current version.

I AM USING THE CROSS PLATFORM VERSION

POPFile requires a number of Perl modules that are available from CPAN. New in v0.21.0 is the need for the following:

    DBI
    DBD::SQLite
    Digest::Base
    Digest::MD5
    HTML::Tagset

You may also need:

    BerkeleyDB (if you are upgrading from v0.20.x)
    Text::Kakasi (if you want Japanese language support)
    Encode (if you want Japanese language support)

Notes on installing the cross platform version can be found here:

http://popfile.sf.net/cgi-bin/wiki/pl?HowTos/CrossPlatformInstall

I LIKE TO LIVE DANGEROUSLY

In a future version POPFile will add official support for message classification through the SMTP and NNTP (Usenet news) protocols. There are currently proxy modules for these protocols that work with v0.21.0, but they have not been fully tested. If you are interested in trying them, get them here:

http://cvs.sourceforge.net/viewcvs.py/*checkout*/popfile/engine/Proxy/SMTP.pm?rev=1.26
http://cvs.sourceforge.net/viewcvs.py/*checkout*/popfile/engine/Proxy/NNTP.pm?rev=1.25

and place them in POPFile's Proxy/ directory.

DOWNLOADING

You can obtain the latest releases of POPFile by visiting

http://sourceforge.net/project/showfiles.php?group_id=63137

UPGRADING

Just install POPFile on top of the currently installed version. But did you read the ESSENTIAL READING above first?

DONATIONS

Thank you to everyone who has clicked the Donate! button and donated their hard-earned cash to me in support of POPFile. Thank you also to the people who have contributed patches, feature requests, bug reports and translations.
http://sourceforge.net/forum/forum.php?forum_id=213876

As well as the usual donations we have a specific "BuyJohnAMac" campaign that you can read about here:

http://popfile.sf.net/cgi-bin/wiki.pl?BuyJohnAMac

PRAISE

This release sees an enormous improvement in the Windows installer, which has been the exclusive playground of Brian Smith. Brian has worked tirelessly to make the Windows installer robust, clear and easy. Nice work, Brian!

Sam continues to make contributions to POPFile's core, with the latest being his addition of the CSS parser.

Naoki Iimura has done a huge amount of work converting the FAQ from English to Japanese for the benefit of the many users of POPFile in Japan.

THE FUTURE

The rough plan for the future of POPFile is as follows:

v0.22.0 Theme: Protocols/UI
    Add support for SMTP, NNTP, IMAP(?)
    New skinning system
    Support SSL and SOCKS for all proxies
    Move history into database

v0.23.0 Theme: Multi-user Phase 2
    Support for multiple users with logins
    Scalable to large organizations/ISPs
    History encryption

v1.00.0 Theme: Stable Release
    First non v0.X release of POPFile

The full roadmap can be found here:

http://sourceforge.net/docman/display_doc.php?docid=17906&group_id=63137

CONCLUSION

Keep the ideas and bug reports coming. If you are interested in knowing more about what's planned for future POPFile versions (or just learning about POPFile's history) visit the POPFile Roadmap:

http://sourceforge.net/docman/display_doc.php?docid=17906&group_id=63137

John.