Help → NULL wordid in matrix table

NULL wordid in matrix table

Hello,

I am using popfile 1.0.1 under gentoo with the MySQL backend.

As far as I can see, words are being correctly added to the words table, however in the matrix table, all records have a wordid of "NULL" and so nothing is being classified.

In the error logs I have:

2008/11/17 10:49:05 12495: bayes: 4013: DBI::execute failed.  Called from package 'Classifier::Bayes' (Classifier/Bayes.pm), line 2117

The line is sometimes 1604.

This corresponds to a sql lookupm which looks like this in the sql log:

7 Query       select matrix.times, matrix.wordid, buckets.name
                  from matrix, buckets
                  where matrix.wordid in (  )
                    and matrix.bucketid = buckets.id
                    and buckets.userid = 1

Notice the empty wordid set?

I am guessing that it is possible for the wordid set to be empty, though I have over 140000 words in the database.

Any ideas, or should I raise a bug?

Thanks,

Paul

  • Message #590

    Hi Paul,

    It seems that something is wrong with the table settings.
    Could you run these commands in the MySQL shell?

    1. mysql> show columns from popfile.words;

    2. mysql> show columns from popfile.matrix;

    3. mysql> show create table popfile.words;

    4. mysql> show create table popfile.matrix;

    Here's the results of my machine (MySQL 4.1.22 on Mac OS X 10.5.5):

    1:

    +-------+-------------+------+-----+---------+----------------+
    | Field | Type        | Null | Key | Default | Extra          |
    +-------+-------------+------+-----+---------+----------------+
    | id    | int(11)     |      | PRI | NULL    | auto_increment |
    | word  | binary(255) | YES  | MUL | NULL    |                |
    +-------+-------------+------+-----+---------+----------------+
    2 rows in set (0.01 sec)
    

    2:

    +----------+---------+------+-----+---------+----------------+
    | Field    | Type    | Null | Key | Default | Extra          |
    +----------+---------+------+-----+---------+----------------+
    | id       | int(11) |      | PRI | NULL    | auto_increment |
    | wordid   | int(11) | YES  | MUL | NULL    |                |
    | bucketid | int(11) | YES  |     | NULL    |                |
    | times    | int(11) | YES  |     | NULL    |                |
    | lastseen | date    | YES  |     | NULL    |                |
    +----------+---------+------+-----+---------+----------------+
    5 rows in set (0.00 sec)
    

    3:

    +-------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | Table | Create Table                                                                                                                                                                                 |
    +-------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | words | CREATE TABLE `words` (
      `id` int(11) NOT NULL auto_increment,
      `word` binary(255) default NULL,
      PRIMARY KEY  (`id`),
      UNIQUE KEY `word` (`word`)
    ) ENGINE=MyISAM DEFAULT CHARSET=latin1 |
    +-------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    1 row in set (0.00 sec)
    

    4:

    +--------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | Table  | Create Table                                                                                                                                                                                                                                                                                                  |
    +--------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | matrix | CREATE TABLE `matrix` (
      `id` int(11) NOT NULL auto_increment,
      `wordid` int(11) default NULL,
      `bucketid` int(11) default NULL,
      `times` int(11) default NULL,
      `lastseen` date default NULL,
      PRIMARY KEY  (`id`),
      UNIQUE KEY `wordid` (`wordid`,`bucketid`)
    ) ENGINE=MyISAM DEFAULT CHARSET=latin1 |
    +--------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    1 row in set (0.00 sec)
    

    And, please tell me how you created the database.

    Naoki

    • Message #591
      mysql> show columns from popfile.words;
      +-------+-------------+------+-----+---------+----------------+
      | Field | Type        | Null | Key | Default | Extra          |
      +-------+-------------+------+-----+---------+----------------+
      | id    | int(11)     | NO   | PRI | NULL    | auto_increment | 
      | word  | binary(255) | YES  | UNI | NULL    |                | 
      +-------+-------------+------+-----+---------+----------------+ 
      2 rows in set (0.00 sec)                                        
      
      mysql> show columns from popfile.matrix;
      +----------+---------+------+-----+---------+----------------+
      | Field    | Type    | Null | Key | Default | Extra          |
      +----------+---------+------+-----+---------+----------------+
      | id       | int(11) | NO   | PRI | NULL    | auto_increment | 
      | wordid   | int(11) | YES  | MUL | NULL    |                | 
      | bucketid | int(11) | YES  |     | NULL    |                | 
      | times    | int(11) | YES  |     | NULL    |                | 
      | lastseen | date    | YES  |     | NULL    |                | 
      +----------+---------+------+-----+---------+----------------+ 
      5 rows in set (0.00 sec)                                       
      
      mysql> show create table popfile.words;
      +-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+                                                      
      | Table | Create Table                                                                                                                                                                                                     |                                                      
      +-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+                                                      
      | words | CREATE TABLE `words` (                                                                                                         
        `id` int(11) NOT NULL auto_increment,
        `word` binary(255) default NULL,
        PRIMARY KEY  (`id`),
        UNIQUE KEY `word` (`word`)
      ) ENGINE=MyISAM AUTO_INCREMENT=145916 DEFAULT CHARSET=utf8 |
      +-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      1 row in set (0.02 sec)
      
      mysql> show create table popfile.matrix;
      +--------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      | Table  | Create Table                                                        |
      +--------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      | matrix | CREATE TABLE `matrix` (
        `id` int(11) NOT NULL auto_increment,
        `wordid` int(11) default NULL,
        `bucketid` int(11) default NULL,
        `times` int(11) default NULL,
        `lastseen` date default NULL,
        PRIMARY KEY  (`id`),
        UNIQUE KEY `wordid` (`wordid`,`bucketid`)
      ) ENGINE=MyISAM AUTO_INCREMENT=482 DEFAULT CHARSET=utf8 |
      +--------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      1 row in set (0.00 sec)
      

      The obvious difference is the charset, and I guess is the cause - double bytes?

      I created the database from my sqlite2 popfile.db, following the instructions on this site. This didn't work incidentally, I forgot to review the contents of the mysql db once the file was imported - so that may be why it failed and I am not sure what state it was in when I fired up the mysql version of popfile.

      I ignored the errors because I was happy to start over - I wanted to restructure the buckets anyway.

      • Message #593

        Hi Paul,

        Sorry for late reply.

        The obvious difference is the charset, and I guess is the cause - double bytes?

        The 'word' column is binary format so I guess the charset does not cause the problem.

        | word  | binary(255) | YES  | UNI | NULL    |                | 
        

        Are there any other errors or warnings recorded in the log files? For example:

        2008/11/18 00:13:07 1400: Perl warning: DBD::mysql::st execute failed: MySQL server has gone away at /path/to/popfile/Classifier/Bayes.pm line 4144.[0a]
        

        Currently POPFile does not reconnect to the MySQL server even if it loses the connection.
        I'll fix this problem later.

        And could you tell me the version of MySQL and dbd::mysql.
        I'm testing POPFile with MySQL 4.1.22 and dbd::mysql 4.010.

        BTW, POPFile 1.0.1 has some incompatibilities with MySQL.
        We've fixed a few of them in POPFile 1.1.0 (we've been releasing the Release Candidates of it) but it is not fully tested with MySQL.

        Naoki

        • Message #596

          Currently POPFile does not reconnect to the MySQL server even if it loses the connection.
          I'll fix this problem later.

          I've fixed the problem.
          This change will be included in the next version of POPFile (v1.1.0).

          You can try the new version (not yet released) by downloading tarballs from here:
          http://getpopfile.org/downloads/popfile_0_22_2.tgz
          (Please note that this is not the officially released version.)

          Naoki