Buckets Tab

This tab is dedicated to your POPFile buckets, their options, and the classification statistics.

Bucket options

For each bucket, you can specify:

  • whether to modify the Subject line of messages that are classified to that bucket (see Subject Modification in the Glossary), by clicking the toggle button in the Subject Header Modification column.
  • whether to add the X-Text-Classification header to classified messages (see X-Text-Classification header in the Glossary), by clicking the toggle button in the X-Text-Classification Header column.
  • whether to add the X-POPFile-Link header to classified messages (see X-POPFile-Link in the Glossary), by clicking the toggle button in the X-POPFile-Link Header column.
  • whether to quarantine classified messages (see quarantine in the Glossary), by clicking the toggle button in the Quarantine Message column.
  • the color for each bucket. This setting is reflected throughout the UI, especially on the History tab. Select a color from the drop-down list and click Apply for each of your buckets.
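
To illustrate what the X-Text-Classification option makes possible, here is a minimal Python sketch of how a script outside POPFile might read the bucket name from a processed message. The sample message, the bucket name "spam", and the helper name bucket_of are assumptions made for illustration; the sketch is not part of POPFile.

```python
from email import message_from_string

# Hypothetical message as it might look after POPFile has added the
# X-Text-Classification header. Addresses and bucket name are made up.
RAW_MESSAGE = """\
From: sender@example.com
To: you@example.com
Subject: Cheap watches
X-Text-Classification: spam

Message body.
"""


def bucket_of(raw_message, default="unclassified"):
    """Return the bucket named in the X-Text-Classification header.

    The default value is an assumption used when the header is missing,
    for example because the option is switched off for that bucket.
    """
    message = message_from_string(raw_message)
    # Header names are matched case-insensitively by the email package.
    return message.get("X-Text-Classification", default).strip()


print(bucket_of(RAW_MESSAGE))
```

A mail client filter does essentially the same thing: it matches on the header value and moves the message to the corresponding folder.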

Bucket Maintenance

This widget lets you create new buckets and delete or rename existing ones. (To purge all the words from one bucket, see “Bucket Details” below.)

Word Lookup

With the Lookup widget, you can check whether a given word is in your corpus. Enter the word and click “Lookup”, and POPFile will tell you whether the word is in your corpus. If it is, POPFile shows the frequency, probability, and score of the word for each bucket, as well as the bucket where the word is most likely to appear.
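
The idea behind such a lookup can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not POPFile's actual calculation: the corpus is a made-up dictionary, the probability is taken to be the word's share of all word occurrences in a bucket, and the score column is left out entirely.

```python
# Hypothetical corpus: bucket name -> {word: number of occurrences}.
corpus = {
    "inbox": {"meeting": 6, "report": 3, "popfile": 1},
    "spam": {"cheap": 12, "watches": 7, "meeting": 1},
}


def lookup(word, corpus):
    """Return (rows, best_bucket) for a word, or None if it is unknown.

    rows maps each bucket to (frequency, probability), where probability
    is assumed to be frequency divided by the bucket's total word count.
    """
    rows = {}
    for bucket, words in corpus.items():
        count = words.get(word, 0)
        total = sum(words.values())
        rows[bucket] = (count, count / total if total else 0.0)
    if all(count == 0 for count, _ in rows.values()):
        return None  # the word is not in the corpus at all
    # The bucket with the highest probability is the most likely one.
    best = max(rows, key=lambda bucket: rows[bucket][1])
    return rows, best


print(lookup("meeting", corpus))
print(lookup("unknown", corpus))
```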

Statistics

The Classification Accuracy box shows how accurate POPFile is. It lists the number of messages classified, the number of classification errors, and the resulting accuracy (in percent). To reset these figures, click the “Reset Statistics” button. The date of your last reset is shown below the button.
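
As a rough illustration of how the three figures relate, the following Python sketch assumes that accuracy is the share of messages that were not classification errors. The function name and the formula are assumptions for illustration; the page itself does not spell out the calculation.

```python
def accuracy_percent(classified, errors):
    """Return the accuracy in percent, or None if nothing was classified.

    Assumes accuracy = (classified - errors) / classified * 100.
    """
    if classified == 0:
        return None  # no messages yet, so no meaningful percentage
    return 100.0 * (classified - errors) / classified


# Example: 200 messages classified, 5 of them reclassified by the user.
print(accuracy_percent(200, 5))
```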

Messages Classified

This box gives you the details of POPFile's classification accuracy. For each of your buckets, it lists the Classification Count as an absolute number and as a percentage of all classified messages, the False Positives (messages that were wrongly classified to this bucket), and the False Negatives (messages that belong in this bucket but were classified to another one).

The bar chart below these numbers shows how the classified messages were distributed across your buckets. If you hover your mouse over one of the bars, you should also see the details.
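
The per-bucket figures described above can be sketched in Python as follows. The history list of (assigned bucket, correct bucket) pairs and the function name bucket_stats are hypothetical; the sketch only illustrates the definitions of false positives and false negatives, not how POPFile stores its statistics.

```python
from collections import Counter

# Hypothetical history: (bucket POPFile assigned, bucket the user confirmed).
history = [
    ("spam", "spam"),
    ("spam", "inbox"),   # false positive for spam, false negative for inbox
    ("inbox", "inbox"),
    ("inbox", "spam"),   # false positive for inbox, false negative for spam
    ("inbox", "inbox"),
]


def bucket_stats(history):
    """Return count, percentage, false positives and negatives per bucket."""
    counts, false_pos, false_neg = Counter(), Counter(), Counter()
    for assigned, correct in history:
        counts[assigned] += 1
        if assigned != correct:
            false_pos[assigned] += 1  # wrongly put into this bucket
            false_neg[correct] += 1   # wrongly kept out of this bucket
    total = len(history)
    buckets = sorted(set(counts) | set(false_neg))
    return {
        bucket: {
            "count": counts[bucket],
            "percent": 100.0 * counts[bucket] / total,
            "false_positives": false_pos[bucket],
            "false_negatives": false_neg[bucket],
        }
        for bucket in buckets
    }


for bucket, stats in bucket_stats(history).items():
    print(bucket, stats)
```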

Word Counts

The word counts statistic gives you an overview of the number of words in your corpus for each bucket. The Distinct Words column of the buckets table at the top of the Buckets tab shows how many distinct (unique) words are in each bucket. The word counts box farther down the page goes beyond the number of distinct words: it also takes into account how often each word is stored in the corpus. For example, if you reclassify a message that contains the term “popfile” to bucket X once, the unique word “popfile” has a word count of 1 in bucket X. If you then reclassify another message that contains that term to bucket X, the word count for “popfile” in bucket X increases to 2.
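
The difference between distinct words and word counts can be shown with a small Python sketch. The whitespace tokenizer and the train helper are simplifying assumptions; POPFile's own message parsing is more involved.

```python
from collections import Counter

# Hypothetical corpus for a single bucket: word -> number of times stored.
bucket_x = Counter()


def train(bucket, text):
    """Add every word of a (re)classified message to the bucket."""
    bucket.update(text.lower().split())


train(bucket_x, "popfile sorts mail")
train(bucket_x, "popfile learns")

distinct_words = len(bucket_x)          # number of unique words
word_count = sum(bucket_x.values())     # unique words weighted by frequency

print(distinct_words, word_count, bucket_x["popfile"])
```

After the second message, “popfile” is still a single distinct word, but it contributes 2 to the bucket's word count.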

Bucket Details

Click the name of one of your buckets in the table at the top of the Buckets tab to be taken to the Bucket Details page.

On this page, you can:

  • see a further word count statistic for the chosen bucket
  • remove all the words from the chosen bucket's corpus
  • browse through the words in the corpus of that bucket by clicking one of the first-letter links
 
ui/buckets.txt · Last modified: 2008/02/08 19:49 by 127.0.0.1

The content of this wiki is protected by the GNU Free Documentation License