This shows you the differences between two versions of the page.
faq:whengood [2009/09/29 09:54] – Minor text adjustment xuesheng | faq:whengood [2012/08/28 11:21] – Add a table to the footnote showing how quickly POPFile can learn xuesheng | ||
---|---|---|---|
Line 5: | Line 5: | ||
Most people report very good accuracy (> 97%) after about 1,000 messages. The global statistics (see link below) show that after 500 messages the average accuracy is 98.29% and that over 92% of users achieve an accuracy above 95% at that point (as of 28 September 2009; check the link below for the latest data). | Most people report very good accuracy (> 97%) after about 1,000 messages. The global statistics (see link below) show that after 500 messages the average accuracy is 98.29% and that over 92% of users achieve an accuracy above 95% at that point (as of 28 September 2009; check the link below for the latest data). | ||
- | This does not mean that you will have to reclassify a thousand times before you reach that point. You will see that the majority of those 1000 messages will be put into the correct bucket. But you should be prepared to see many errors when your corpus is still fresh.((Some real statistics, starting from scratch: In the first 1,000 messages processed only 21 messages had to be reclassified, | + | This does not mean that you will have to reclassify a thousand times before you reach that point. You will see that the majority of those 1000 messages will be put into the correct bucket. But you should be prepared to see many errors when your corpus is still fresh.((Some real statistics, starting from scratch: In the first 1,000 messages processed only 21 messages had to be reclassified, |
+ | \\ | ||
+ | \\ | ||
+ | ^ Messages | ||
+ | ^ ::: ^ ::: ^ of Messages | ||
+ | | 1 to 1,000 | 21 | 1,000 | 21 | 97.9% | | ||
+ | | 1,001 to 2,000 | | ||
+ | | 2,001 to 4,000 | | ||
+ | |||
+ | |||
+ | In other words, out of 4,000 messages only 36 had to be reclassified, which gives an overall accuracy of 99.1% [Of course, your mileage may vary.])) | ||
POPFile offers users an opportunity to help improve over time by sharing statistics anonymously. To turn this feature on, go to http:// | POPFile offers users an opportunity to help improve over time by sharing statistics anonymously. To turn this feature on, go to http:// |
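The accuracy figures quoted in the footnote can be reproduced with simple arithmetic: accuracy is the share of messages that did not need to be reclassified. The sketch below is only an illustration of that calculation; the function name `accuracy` is hypothetical and is not part of POPFile.

```python
def accuracy(total_messages: int, reclassified: int) -> float:
    """Return the percentage of messages that were classified correctly.

    Hypothetical helper for illustration; not a POPFile function.
    """
    if total_messages <= 0:
        raise ValueError("total_messages must be positive")
    if not 0 <= reclassified <= total_messages:
        raise ValueError("reclassified must be between 0 and total_messages")
    return 100.0 * (total_messages - reclassified) / total_messages


# Figures taken from the footnote: 21 reclassifications in the first
# 1,000 messages, and 36 reclassifications in 4,000 messages overall.
first_thousand = accuracy(1000, 21)
overall = accuracy(4000, 36)

print(f"First 1,000 messages: {first_thousand:.1f}%")
print(f"All 4,000 messages: {overall:.1f}%")
```

Rounded to one decimal place, the two results match the 97.9% and 99.1% stated in the footnote.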