This shows you the differences between two versions of the page.
utilityscripts:insert [2007/01/24 09:34] – texasfett | utilityscripts:insert [2008/02/08 19:49] (current) – external edit 127.0.0.1 | ||
---|---|---|---|
Line 5: | Line 5: | ||
**About Sample Size** | **About Sample Size** | ||
- | If you use this script to train POPFile via email samples, be careful about sample size. We do **not** recommend you submit thousands of emails to the script; you will end up with a huge corpus that offers little additional benefit to classification accuracy. Your best approach would be to stick to small representative samples of at most 100 emails per bucket. | + | If you use this script to train POPFile via email samples, be careful about sample size. This is not a recommended way to train POPFile; it is a utility designed for testing. |
===== Usage ===== | ===== Usage ===== | ||
- | **Shut Down POPFile Before Using** Shut down any running instance of POPFile before you use this script. insert.pl modifies the corpus by adding words to it, so it should not be run concurrently with POPFile, to avoid damaging the corpus databases. | + | **Shut Down POPFile Before Using** | ||
+ | Shut down any running instance of POPFile before you use this script. insert.pl modifies the corpus by adding words to it, so it should not be run concurrently with POPFile, to avoid damaging the corpus databases. | ||
- | The script must be run from the popfile | + | The script must be run from the POPFile |
- | < | + | < |
cd " | cd " | ||
- | </ | + | </ |
Once in the popfile installation directory, issue the following to run the program. | Once in the popfile installation directory, issue the following to run the program. | ||
- | < | + | **Feeding a directory of messages** |
- | | + | |
+ | < | ||
perl insert.pl bucketname \path\to\messages\*.* | perl insert.pl bucketname \path\to\messages\*.* | ||
- | </ | + | </ |
- | | + | |
- | | + | **Feeding a single message** |
+ | < | ||
perl insert.pl bucketname messagefilename | perl insert.pl bucketname messagefilename | ||
- | </ | + | </ |
Line 41: | Line 44: | ||
The messages will be placed in the poptemp folder as .eml files. You can feed that folder directly to insert.pl as follows: | The messages will be placed in the poptemp folder as .eml files. You can feed that folder directly to insert.pl as follows: | ||
- | < | + | < |
perl insert.pl bucketname \poptemp\*.eml | perl insert.pl bucketname \poptemp\*.eml | ||
- | </ | + | </ |
Line 53: | Line 56: | ||
- Feed the mbx file to insert.pl as follows: | - Feed the mbx file to insert.pl as follows: | ||
- | < | + | < |
perl insert.pl bucketname \path\to\eudora\newsletters.mbx | perl insert.pl bucketname \path\to\eudora\newsletters.mbx | ||
- | </ | + | </ |
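
The single-message form shown above (`perl insert.pl bucketname messagefilename`) lends itself to scripting. What follows is a minimal Python sketch and not part of POPFile: the function name `build_insert_commands`, its parameters, and the default `*.eml` pattern are hypothetical, and the cap of 100 messages per bucket is borrowed from the older revision's advice. The function only builds the command lines; it does not run anything.

```python
import random
from pathlib import Path


def build_insert_commands(bucket, message_dir, pattern="*.eml", limit=100, seed=0):
    """Build one `perl insert.pl <bucket> <file>` command per sampled message.

    Hypothetical helper: at most `limit` files matching `pattern` are picked
    from `message_dir`. A fixed `seed` makes the selection repeatable.
    """
    # Collect the candidate message files in a stable order.
    files = sorted(p for p in Path(message_dir).glob(pattern) if p.is_file())

    # Draw a random sample only when the folder exceeds the cap.
    if len(files) > limit:
        files = sorted(random.Random(seed).sample(files, limit))

    # One argument list per message, matching the single-message usage.
    return [["perl", "insert.pl", bucket, str(p)] for p in files]
```

Each returned list could then be passed to `subprocess.run(cmd, cwd=popfile_dir, check=True)`, where `popfile_dir` stands for the POPFile installation directory, once POPFile itself has been shut down.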