dbverify [2008/02/08 19:49] (current)
 +====== dbverify - checking for a corrupt corpus ======
 +
 +**This applies to POPFile 0.20.x only.  It may be helpful if you are having trouble upgrading from 0.20.x.  The problems described no longer affect the current version of POPFile.**
 +
 +The **unsupported** dbverify utility checks your corpus for corruption. It performs the following checks:
 +
 +  - Uses BerkeleyDB's db_verify call to check each bucket's table.db file, and reports any issues found.
 +  - Checks each table.db file for internal consistency: it reads the entire bucket, recalculates the unique word count and the total word count, compares them with the internally stored values, and reports any disagreement.
 +
 +If the utility reports no errors, you can be reasonably assured that your table.db files are not corrupt.
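
The internal consistency check described above can be illustrated with a minimal sketch. It is not the dbverify.pl code: the `Bucket` class, its field names, and the message wording are hypothetical stand-ins, and the real table.db layout is not described here. The sketch only shows the idea of recalculating both counts and comparing them with the stored ones.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    """Hypothetical in-memory stand-in for one bucket's table.db."""
    words: dict          # word -> number of occurrences
    stored_unique: int   # unique word count recorded in the bucket
    stored_total: int    # total word count recorded in the bucket


def check_bucket(name, bucket):
    """Recalculate both counts and return one message per mismatch."""
    problems = []
    total = sum(bucket.words.values())   # recalculated total word count
    unique = len(bucket.words)           # recalculated unique word count
    if total != bucket.stored_total:
        problems.append(
            f"{name}: stored word count {bucket.stored_total}, recalculated {total}")
    if unique != bucket.stored_unique:
        problems.append(
            f"{name}: stored unique count {bucket.stored_unique}, recalculated {unique}")
    return problems


# Assumed sample data: one consistent bucket, one with a bad unique count.
good = Bucket({"offer": 3, "free": 2}, stored_unique=2, stored_total=5)
bad = Bucket({"offer": 3, "free": 2}, stored_unique=7, stored_total=5)

print(check_bucket("corpus/spam", good))
print(check_bucket("corpus/magnet", bad))
```

An empty list means the stored and recalculated values agree for that bucket.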
 +
 +===== Running dbverify =====
 +
 +  * Download the utility by **right-clicking** this link http://www.geocities.com/helphand1/popfile/0_20/dbverify.pl and selecting **Save Target As**.
 +
 +  * Save it to your POPFile directory.
 +
 +  * Open a DOS command box and switch to your POPFile directory (normally c:\program files\popfile), e.g., <code>
 +cd  "\program files\popfile"
 +</code>
 +  * Run the utility, e.g., <code>
 +perl dbverify.pl
 +</code>
 +
 +
 +**Note:** If you have changed the default location of your corpus via the bayes_corpus parameter, you **must** pass the new corpus directory to the utility on the command line. Most users can ignore this note; only advanced users would have made this change.
 +
 +===== Example Output for a Corrupt Corpus =====
 +
 +In this example, the utility reports corruption in both the //magnet// and //spam// buckets of the corpus.
 +
 +<code>
 +Checking corpus/magnet/table.db
 +    *ERROR** bucket corpus/magnet has a corrupt corpus,
 +db_verify returns: DB_VERIFY_BAD: Database verification failed
 +Bucket corpus/magnet is likely corrupt, word count is 6237 versus 5250
 +Bucket corpus/magnet is likely corrupt, unique count is 1767 versus 4308
 +Checking corpus/normal/table.db
 +Checking corpus/spam/table.db
 +    *ERROR** bucket corpus/spam has a corrupt corpus,
 +db_verify returns: DB_VERIFY_BAD: Database verification failed
 +</code>
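
If you save the utility's output, a short script can list the buckets it flagged. The following is a rough sketch, not part of dbverify: the function name `corrupt_buckets` is hypothetical, and the patterns are inferred solely from the example output above, so other versions of the utility may word their messages differently.

```python
import re

# Matches the two message shapes seen in the example output above.
FLAGGED = re.compile(
    r"[Bb]ucket (\S+) (?:has a corrupt corpus|is likely corrupt)")


def corrupt_buckets(output):
    """Return the sorted, de-duplicated buckets flagged in the output."""
    flagged = set()
    for line in output.splitlines():
        match = FLAGGED.search(line)
        if match:
            flagged.add(match.group(1))
    return sorted(flagged)


sample = """\
Checking corpus/magnet/table.db
    *ERROR** bucket corpus/magnet has a corrupt corpus,
db_verify returns: DB_VERIFY_BAD: Database verification failed
Bucket corpus/magnet is likely corrupt, word count is 6237 versus 5250
Checking corpus/normal/table.db
Checking corpus/spam/table.db
    *ERROR** bucket corpus/spam has a corrupt corpus,
db_verify returns: DB_VERIFY_BAD: Database verification failed
"""

print(corrupt_buckets(sample))  # → ['corpus/magnet', 'corpus/spam']
```

For output that contains only "Checking ..." lines, the function returns an empty list.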
 +
 +
 +===== Example Output for a Normal Corpus =====
 +
 +In this example, the utility reports no corruption in any of the three buckets of the corpus.
 +
 +<code>
 +Checking corpus/magnet/table.db
 +Checking corpus/normal/table.db
 +Checking corpus/spam/table.db
 +</code>
 
dbverify.txt · Last modified: 2008/02/08 19:49 (external edit)

The content of this wiki is protected by the GNU Free Documentation License