Where are words inserted into the DB?

Hi all,

I'm grepping the sources and trying to find out how POPFile works.

In particular, I can't see where the data corresponding to the words found in a message is stored in the DB.

I started by studying the method:
Bayes::classify_and_modify()

which, near its end, executes this code:

 $self->{history__}->commit_slot( $session, $slot, $classification, $self->{magnet_detail__} );

and the commit_slot() method of the history module in turn executes:

$self->mq_post_( 'COMIT', $session, $slot, $bucket, $magnet );

which is in turn processed by the so-called "central mutex queue", but at this point I'm completely lost.
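
For readers following the same trail, here is a minimal sketch of the general "post now, deliver later" pattern that a message queue implies. It is not POPFile's queue module: the ToyQueue package and its register(), post() and service() methods are hypothetical names made up for illustration. It assumes that posting only stores the message and that a later service pass hands it to whichever module registered for that message type, which would explain why the code handling 'COMIT' cannot be found by following direct calls from commit_slot().

```perl
use strict;
use warnings;

# Toy message queue: NOT POPFile's implementation, only an illustration of
# the post-now, deliver-later pattern. All names here are hypothetical.
package ToyQueue;

sub new {
    my ($class) = @_;
    return bless { waiters => {}, queue => [] }, $class;
}

# A module registers a handler for one message type.
sub register {
    my ( $self, $type, $handler ) = @_;
    push @{ $self->{waiters}{$type} }, $handler;
    return;
}

# post() only queues the message; no handler runs yet.
sub post {
    my ( $self, $type, @args ) = @_;
    push @{ $self->{queue} }, [ $type, @args ];
    return;
}

# service() runs later (for example from a main loop) and delivers each
# queued message to every handler registered for its type.
sub service {
    my ($self) = @_;
    while ( my $message = shift @{ $self->{queue} } ) {
        my ( $type, @args ) = @{$message};
        $_->( $type, @args ) for @{ $self->{waiters}{$type} || [] };
    }
    return;
}

package main;

my @delivered;
my $mq = ToyQueue->new();
$mq->register( 'COMIT', sub { push @delivered, join( ':', @_ ) } );

$mq->post( 'COMIT', 'session1', 42, 'spam', '' );
print scalar(@delivered), "\n";    # 0: post() only queued the message

$mq->service();
print "$delivered[0]\n";           # COMIT:session1:42:spam:
```

With this pattern, the place to look is the code that registers for the message type rather than the code that posts it.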

What I'd like to know is when and where the word matrix of the database is updated.

I see that the Bayes::add_message_bucket() method must be involved somehow, but I cannot see where it is called in the code.

I'm having a hard time figuring this out alone, so maybe someone with knowledge of the code can help me get to the point.

Many thanks in advance. Greetings.

  • Message #1246

    The wiki has a general overview of POPFile and there is a database schema (POPFile 1.1.1 uses this schema). There may be other documents but at the moment I cannot think of any (I'm not a Perl programmer so I don't know much about the internal working of POPFile; I rely upon the comments in the code to find things).

    The Perl DBI module has a "tracing" mode that shows low level "behind the scenes" information which might help, in the absence of something better. CPAN has more information: http://search.cpan.org/~timb/DBI-1.609/DBI.pm#TRACING
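
    As a minimal sketch of how that tracing mode can be switched on from Perl code, the DBI class method trace() takes a level and an optional file name. The file name below is a hypothetical choice, and this is not necessarily how POPFile itself enables tracing. DBI also reads a DBI_TRACE environment variable, which avoids touching the code at all.

```perl
use strict;
use warnings;
use DBI;

# Hypothetical file name; any writable path will do.
my $trace_file = 'dbi_trace.log';

# Class method: sets the default trace level for handles created from now
# on and sends the trace output to the named file (opened in append mode).
DBI->trace( 1, $trace_file );

# ... connect and run statements as usual; DBI calls made from here on are
# recorded in $trace_file ...

# Level 0 switches tracing off again.
DBI->trace(0);
```

    Level 1 is the least verbose level; higher levels add progressively more detail.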

    Here is an extract from a level 1 trace on my Windows system to give you some idea of what to expect:

        <- prepare('select bucket_params.val from bucket_params
                      where bucket_params.bucketid = ? and
                            bucket_params.btid = ?;')= DBI::st=HASH(0x38514a4) at Bayes.pm line 1043
        <- prepare('replace into bucket_params ( bucketid, btid, val ) values ( ?, ?, ? );')= DBI::st=HASH(0x385154c) at Bayes.pm line 1048
        <- prepare('select bucket_template.def from bucket_template
                      where bucket_template.id = ?;')= DBI::st=HASH(0x38515f4) at Bayes.pm line 1051
        <- prepare('select buckets.name from buckets, magnets
                      where buckets.userid = ? and
                            magnets.id != 0 and
                            magnets.bucketid = buckets.id group by buckets.name order by buckets.name;')= DBI::st=HASH(0x385169c) at Bayes.pm line 1055
    

    Brian

    Edited to fix a minor typo ("by" instead of "but")

    • Message #1247

      Thanks for the reply, Brian.

      Do you know if it is possible to have the DBI traces in the log file?

      I had a more in-depth look at the code and I think I'm starting to see how it works.

      Statement handles are prepare()d in Bayes::db_connect() and stored in corresponding members of the Bayes module. I was looking for "INSERT" commands, but in fact only REPLACE commands are used.

      The statement handles are then executed by calling validate_sql_prepare_and_execute() in various parts of the code. Despite its name, that method doesn't prepare the handles, it just executes them, which looks like a fossil of an older design.

      Then in Bayes::db_disconnect() the statements are cleared by calling finish() on them.
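
      The prepare once, execute many times, finish at the end lifecycle described above can be sketched as follows. This is only an illustration under stated assumptions: it needs the DBD::SQLite driver, it uses an in-memory database, and the word_counts table and its columns are hypothetical names rather than POPFile's schema. It also shows why a search for "INSERT" can come up empty when rows are written with REPLACE.

```perl
use strict;
use warnings;
use DBI;

# In-memory SQLite database (requires DBD::SQLite). The table and column
# names are hypothetical, not POPFile's schema.
my $dbh = DBI->connect( 'dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, PrintError => 0 } );

$dbh->do('create table word_counts ( word text primary key, times integer );');

# Prepare once, as a connect routine might do.
my $put = $dbh->prepare('replace into word_counts ( word, times ) values ( ?, ? );');
my $get = $dbh->prepare('select times from word_counts where word = ?;');

# Execute as often as needed. REPLACE adds the row if the key is new;
# otherwise it removes the existing row and inserts the new one, so no
# separate INSERT statement is required.
$put->execute( 'hello', 1 );
$put->execute( 'hello', 2 );

$get->execute('hello');
my ($times) = $get->fetchrow_array();
print "$times\n";    # 2

# Release the statement handles before disconnecting, as a disconnect
# routine might do.
$get->finish();
$put->finish();
$dbh->disconnect();
```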

      Greetings.

      • Message #1254

        Do you know if it is possible to have the DBI traces in the log file?

        If you mean you want the DBI trace output and the normal POPFile log file entries combined into a single file you can try changing the "Logger output" setting in the CONFIGURATION page in the UI.

        When I selected "To Screen (console)" or "To Screen and File" the log file messages and DBI trace messages both appeared in the console window on my Windows system. (I actually used a modified version of the Message Capture utility to display the console messages.)

        I increased the logger level to 2 to make sure there were lots of POPFile log messages. Although an enormous amount of information was generated, I'm not sure it is really what you wanted because it also includes POPFile's console messages. However, it did not require any code changes, so it was easy for me to try :)

        Brian

        • Message #1266

          (I actually used a modified version of the Message Capture utility to display the console messages.)

          For the benefit of any Windows users reading this topic I have now uploaded the DBI Trace Capture utility and created some documentation in the wiki for it:

          POPFile DBI Trace Capture utility (this includes a download link for the utility)

          Brian