Out of the box, POPFile is dumb. It doesn't know what spam is, what email is, or what any of the buckets you specified mean. It takes a little time to train it. Why can't POPFile come “pre-trained” to know what spam is? Because email classification is a subjective exercise. If you are a medical doctor, you might well receive important email containing words that, for other people, would indicate a high likelihood of spam. POPFile is effective because it is personalized for each user: it learns from you.
POPFile's classification system needs to be trained for a while before it becomes effective: the more it's trained, the more effective it becomes. In fact, it won't even classify mail the first time you use it; it will leave it as unclassified. As of POPFile v0.20, by default POPFile marks a message as 'unclassified' unless it is at least 100 times more certain that the message belongs in bucket A than in bucket B. This is to reduce the false positive rate. If you wish to adjust this threshold, find the bayes_unclassified_weight setting on the Advanced page.
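The '100 times more certain' rule can be pictured with a small sketch. This is not POPFile's own code; it is a simplified illustration that assumes each bucket has a log-probability score, and the function name pick_bucket and its parameters are hypothetical. In log space, "100 times more likely" becomes a gap of log(100) between the best score and the runner-up.

```python
import math

def pick_bucket(log_scores, unclassified_weight=100.0):
    """Return the winning bucket, or 'unclassified' if its lead is too small.

    log_scores is a hypothetical mapping of bucket name -> log-probability.
    unclassified_weight plays the role of bayes_unclassified_weight.
    """
    # Assumption for this sketch: with fewer than two buckets there is
    # nothing to compare against, so the message stays unclassified.
    if len(log_scores) < 2:
        return "unclassified"
    ranked = sorted(log_scores.items(), key=lambda kv: kv[1], reverse=True)
    (best, best_score), (_, runner_up_score) = ranked[0], ranked[1]
    # A ratio of N between probabilities is a difference of log(N) in log space.
    if best_score - runner_up_score > math.log(unclassified_weight):
        return best
    return "unclassified"
```

With the default weight of 100, scores of -1.0 and -3.0 are too close to call, so the message is left unclassified. Lowering the weight makes POPFile commit to a bucket more readily, at the cost of more misclassifications.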
Whenever POPFile misclassifies an email, or doesn't classify it, head to the web interface and take a look at the 'History' tab (it loads by default). There, you'll see the last twenty or so emails you received, along with how POPFile classified them. (If you want to see why POPFile classified an email the way it did, click on its subject line.) For each email that was classified wrongly, correct POPFile by selecting the right classification in the right-hand column (for one or more messages at a time), then clicking the Reclassify button.
The emails are already stored - you can happily move them around or delete them in your email program without affecting what POPFile thinks about them! POPFile only learns when you reclassify an email - it works under the theory of 'if it ain't broke, don't fix it'.
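The 'learn only on reclassify' idea can be sketched as follows. This is a toy model rather than POPFile's implementation: the class ToyTrainer, its fields, and the rule that a second correction undoes the first are all assumptions made for illustration. The point it shows is that nothing is learned when a message is merely classified, moved, or deleted; word counts change only when you reclassify.

```python
from collections import Counter, defaultdict

class ToyTrainer:
    """Hypothetical per-bucket word counts that change only on reclassify."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # bucket -> word -> count
        self.trained = {}  # message id -> bucket it was last trained into

    def reclassify(self, message_id, words, correct_bucket):
        previous = self.trained.get(message_id)
        if previous == correct_bucket:
            return  # already trained into this bucket, nothing to fix
        if previous is not None:
            # Assumption: a later correction undoes the earlier training.
            self.word_counts[previous].subtract(words)
        self.word_counts[correct_bucket].update(words)
        self.trained[message_id] = correct_bucket
```

For example, calling reclassify(1, ["cheap", "pills"], "spam") raises the count of both words in the spam bucket by one, and a message that is never reclassified leaves every count untouched.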