Differences

This shows you the differences between two versions of the page.

faq:whengood [2009/09/28 18:19] xuesheng
faq:whengood [2012/08/28 13:21] (current) – external edit 127.0.0.1
Line 5: Line 5:
 Most people report very good accuracy (> 97%) after some 1000 messages. The global statistics (see link below) tell us that after 500 messages the average accuracy is 98.29% and that over 92% of users get an accuracy over 95% at that point (as of 28 September 2009; check the link below for the latest data).
  
-This does not mean that you will have to reclassify a thousand times before you reach that point. You will see that the majority of those 1000 messages will be put into the correct bucket. But you should be prepared to see many errors when your corpus is still fresh.
+This does not mean that you will have to reclassify a thousand times before you reach that point. You will see that the majority of those 1000 messages will be put into the correct bucket. But you should be prepared to see many errors when your corpus is still fresh.((Some real statistics, starting from scratch: In the first 1,000 messages processed only 21 messages had to be reclassified, giving an accuracy of 97.9%. In the next 1,000 messages received only 7 messages had to be reclassified, and in the next 2,000 messages only 8 messages had to be reclassified. 
 +\\ 
 +\\ 
 +^  Messages  ^ Reclassifications ^ Total Number ^ Total Number of   ^ Accuracy ^ 
 +^  :::       ^ :::               ^ of Messages  ^ Reclassifications ^ :::      ^ 
 +|      1 to 1,000  |  21  |  1,000  |  21  |  97.9%  | 
 +|  1,001 to 2,000  |  7  |  2,000  |  28  |  98.6%  | 
 +|  2,001 to 4,000  |  8  |  4,000  |  36  |  99.1%  |
  
-  * Some real statistics, starting from scratch: In the first 1,000 messages processed only 21 messages had to be reclassified, giving an accuracy of 97.9%. In the next 1,000 messages received only 7 messages had to be reclassified, and in the next 2,000 messages only 8 messages had to be reclassified. In other words, out of 4,000 messages only 36 messages had to be reclassified which gives an overall accuracy of 99.1% (Of course your mileage may vary)
 +
 +In other words, out of 4,000 messages only 36 messages had to be reclassified which gives an overall accuracy of 99.1% [Of course your mileage may vary.]))
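
The cumulative figures quoted in the footnote can be re-derived with a few lines of Python. This is only an illustrative sketch: the function name `cumulative_accuracy` and the `(messages, reclassified)` tuple layout are assumptions made for this example and are not part of POPFile. Accuracy is taken to be the share of messages that did not need reclassification.

```python
def cumulative_accuracy(batches):
    """Return (total messages, total reclassifications, accuracy %) per batch.

    batches: list of (messages_in_batch, reclassified_in_batch) tuples.
    This helper is hypothetical and exists only to illustrate the arithmetic.
    """
    rows = []
    total_messages = 0
    total_reclassified = 0
    for messages, reclassified in batches:
        total_messages += messages
        total_reclassified += reclassified
        # Accuracy = correctly classified messages / all messages so far.
        accuracy = 100.0 * (total_messages - total_reclassified) / total_messages
        rows.append((total_messages, total_reclassified, round(accuracy, 1)))
    return rows


# The three batches quoted in the footnote: 1,000, 1,000 and 2,000 messages.
rows = cumulative_accuracy([(1000, 21), (1000, 7), (2000, 8)])
for total, reclassified, accuracy in rows:
    print(f"{total:>5} messages, {reclassified:>2} reclassified, {accuracy}% accurate")
```

Note that each accuracy value is cumulative: 98.6% covers all 2,000 messages received so far, not just the second batch of 1,000.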
  
 POPFile offers users an opportunity to help improve it over time by sharing statistics anonymously. To turn this feature on, go to http://127.0.0.1:8080/security
  
 See also:
-  * [[/popfile_stats.html| Global Statistics (updated every two hours)]]
-  * [[/stats2.html| POPFile Statistics analysis (from 2003)]]
+  * [[/popfile_stats.html| Global Statistics]] (updated every two hours)
+  * [[/stats2.html| POPFile Statistics analysis]] (from 2003)
  
 
faq/whengood.1254161944.txt.gz · Last modified: 2009/09/28 20:19 (external edit)

Should you find anything in the documentation that is incomplete, unclear, outdated or just plain wrong, please let us know and leave a note in the Documentation Forum.

The content of this wiki is protected by the GNU Free Documentation License