Opened 10 years ago
Last modified 8 years ago
#182 new defect
POPfile failed start due to failure to copy freed scalar in MQ.pm
Reported by: | zephryus | Owned by: | |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | Database | Version: | 1.1.3 |
Severity: | normal | Keywords: | |
Cc: |
Description
System: POPfile v1.1.3 under Debian GNU/Linux (32bit) [I installed my own copy of POPfile.]
Problem: A transient(?) error seems to have failed the launch of a new instance of POPfile.
I show the session log below.
The script to run POPfile, bin/run-popfile.sh contains the following environnment variables, etc.
: # PD=/extra/ishikawa/download/POPFILE-DIR #PD=/new-hd1/extra/ishikawa/download/POPFILE-DIR #PD=/tmp/usb-old-extra/ishikawa/download/POPFILE-DIR POPFILE_USER=$PD/ POPFILE_ROOT=$PD/ export POPFILE_USER POPFILE_ROOT perl $PD/popfile.pl \ --set html_language=Nihongo \ --set html_local=0 \ --set html_port=9090 \ --set pop3_port=1110
--- end quote ---
--- Start Session Log: with a comment line
debian-vbox-ci:/home/ishikawa# bin/run-popfile.sh POPFile Engine loading Loading... {core: config mq logger history} {classifier: bayes wordmangle} {interface: xmlrpc html} {proxy: pop3 nntp smtp} {services: imap} POPFile Engine v1.1.3 starting Initializing... {core: config history logger mq} {classifier: bayes wordmangle} {interface: html xmlrpc} {proxy: nntp pop3 smtp} {services: imap} Starting... {core: A copy of POPFile appears to be running. Attempting to signal the previous copy. Waiting 10 seconds for a reply. The other POPFile (4393) failed to signal back, starting new copy (32396) config history logger mq} {classifier: bayes wordmangle} {interface: html} {proxy: pop3} {services:} POPFile Engine v1.1.3 running panic: attempt to copy freed scalar 9c71e0c to 982c1d0 at /extra/ishikawa/download/POPFILE-DIR//POPFile/MQ.pm line 330. **** The above is the error line ??? ***** POPFile Engine v1.1.3 stopping Stopping... {classifier: bayes wordmangle} {core: config history logger mq} {interface: html} {proxy: pop3} {services:} POPFile Engine v1.1.3 terminated debian-vbox-ci:/home/ishikawa# bin/run-popfile.sh POPFile Engine loading
Note the error on line 330 of MQ.pm /extra/ishikawa/download/POPFILE-DIRPOPFile/MQ.pm line 330.
Failure to run popfile when the previous copy is deemed non-existent doesn't happen very often. (I think I see this particular bug for the first time.)
It is in read_pipe. My guess is that there was some transitory error or something. But I am not sure.
Just thought to report this bug for someone in the know to take pre-cautionary measure (like the pipe cache? is being removed before use, the child process disappearing before having a chance to wait for it, etc.???).
265 #---------------------------------------------------------------------------- 266 # 267 # read_pipe_ 268 # 269 # reads a single message from a pipe in a cross-platform way. 270 # returns undef if the pipe has no message 271 # 272 # $handle The handle of the pipe to read 273 # 274 #---------------------------------------------------------------------------- 275 sub read_pipe_ 276 { 277 my ( $self, $handle ) = @_; 278 279 if ( $^O eq "MSWin32" ) { 280 281 # bypasses bug in -s $pipe under ActivePerl 282 283 my $message; # PROFILE PLATFORM START MSWin32 284 285 if ( &{ $self->{pipeready_} }($handle) ) { 286 287 # add data to the pipe cache whenever the pipe is ready 288 289 sysread($handle, my $string, -s $handle); 290 291 # push messages onto the end of our cache 292 293 $self->{pipe_cache__} .= $string; 294 } 295 296 # pop the oldest message; 297 298 $message = $1 if (defined($self->{pipe_cache__}) && # PROFILE BLOCK START 299 ( $self->{pipe_cache__} =~ s/(.*?\n)// ) ); # PROFILE BLOCK STOP 300 301 return $message; # PROFILE PLATFORM STOP 302 } else { 303 304 # do things normally 305 306 if ( &{ $self->{pipeready_} }($handle) ) { 307 return <$handle>; 308 } 309 } 310 311 return undef; 312 } 313
TIA
Change History (9)
comment:1 by , 10 years ago
comment:2 by , 10 years ago
Hi,
Thank you for your time to look at the report, and am sorry I missed the correct uploading of the problematic lines. (I have no idea why I ended up uploading only the lines preceding the 330th-line and not include it. Oh well.)
Anyway, I found that I am using perl 5.14.2 under Debian GNU/Linux:
perl --version This is perl 5, version 14, subversion 2 (v5.14.2) built for i486-linux-gnu-thread-multi-64int (with 75 registered patches, see perl -V for more detail) Copyright 1987-2011, Larry Wall Perl may be copied only under the terms of either the Artistic License or the GNU General Public License, which may be found in the Perl 5 source kit. Complete documentation for Perl, including FAQ lists, should be found on this system using "man perl" or "perldoc perl". If you have access to the Internet, point your browser at http://www.perl.org/, the Perl Home Page. [email protected]:~/bin$
My observation about "(like the pipe cache? is being removed before use, the child process disappearing before having a chance to wait for it, etc.???). " because the parameters passed to flush_child_data seems to be some kind of data related to the child process or the data in pipe from cursory reading of the code.
grep -n -e flush_child /extra/ishikawa/download/POPFILE-DIR/POPFile/*.pm /dev/null /extra/ishikawa/download/POPFILE-DIR/POPFile/MQ.pm:118: $self->flush_child_data_( $self->{children__}{$kid} ); /extra/ishikawa/download/POPFILE-DIR/POPFile/MQ.pm:178: $self->flush_child_data_( $pipe ) /extra/ishikawa/download/POPFILE-DIR/POPFile/MQ.pm:255: $self->flush_child_data_( $self->{children__}{$kid} ); /extra/ishikawa/download/POPFILE-DIR/POPFile/MQ.pm:316:# flush_child_data_ /extra/ishikawa/download/POPFILE-DIR/POPFile/MQ.pm:328:sub flush_child_data_
So maybe by the time flush_child_data tries to access the parameter, the underlying data silently disappeared (?). I am not sure how perl handles such system related data that changes its status dynamically. Maybe this is like referencing an object through a pointer AFTER the referenced object is freed immediately before the usage?
Thank you again for your time.
comment:3 by , 10 years ago
I am having exactly the same problem.
POPFile running on my Ubuntu 12.04 LTS server
Error in POPFile log:
2013/3/22 11:03:52 1059: Perl fatal error : panic: attempt to copy freed scalar 205ffe0 to 185b4b8 at /usr/share/popfile/POPFile/MQ.pm line 330.[0a]
perl version information:
This is perl 5, version 14, subversion 2 (v5.14.2) built for x86_64-linux-gnu-thread-multi (with 55 registered patches, see perl -V for more detail)
comment:4 by , 10 years ago
Hi,
So I think this problem is caused by some race condition between the main POPfile process, whatever children process / pipes that are created by it, and the interaction. The problem caused by this is very rare, but if you have 100,000 of users or more, it will eventually shows up on someone's computer, I suppose.
Sorry, I am not able to figure out the issue for now, but now it seems it can be repeatably reproduced under carefully managed test environment by knowledgeable someone.
TIA
comment:5 by , 10 years ago
Hi,
Sorry for late reply.
To find the cause, I've written a patch to record more details (caller functions) when POPFile crashes with a fatal error.
This change will be included the next version (perhaps v1.1.4). I hope this patch help us to find out the cause.
Naoki
comment:6 by , 8 years ago
I'm getting the same error on Ubuntu 14.04. It's rare but regular (once a month). I do not see v1.1.4 in SVN, but I could try to patch my installation if you can provide a patch (the one with more diagnostics).
Thanks, Andy
comment:7 by , 8 years ago
I do not see v1.1.4 in SVN.
An SVN tag is only created when a new version is released. The "Browse Source" link at the top of the page can be used to display the available tags: http://getpopfile.org/browser/tags
The current state of POPFile 1.1.4 development can be found in the "b0_22_2" branch (the "trunk" is being used for POPFile 2.x development): http://getpopfile.org/browser/branches/b0_22_2
I could try to patch my installation if you can provide a patch (the one with more diagnostics).
Naoki referred to a patch to record more details (caller functions) when POPFile crashes with a fatal error on 26 October 2013. I think the relevant commit was changeset:3850 which only modified one file (Loader.pm).
Since then the only change to Loader.pm has been to change the copyright date.
The on-line manual has some information about obtaining the source code: http://getpopfile.org/docs/faq:getsourcecode
If you don't want to use an SVN client you can use the download link in the source code browser (as described in the on-line manual) to get the most recent version of Loader.pm: http://getpopfile.org/browser/branches/b0_22_2/engine/POPFile/Loader.pm
Shut down POPFile, rename the existing file (e.g. Loader.pm.old), copy the new Loader.pm file to the relevant folder and restart POPFile.
comment:8 by , 8 years ago
I patched Loader.pm, here is the log from the first crash that I saw today:
2015/7/13 00:39:43 32035: mq: 259: Done with 16172 (0 to go) 2015/7/13 00:39:49 16176: bayes: 811: Attempting to connect to dbi:SQLite:dbname=/var/lib/popfile/popfile.db (1) 2015/7/13 00:39:49 16176: bayes: 912: Using SQLite library version 3.8.2 2015/7/13 00:39:49 16176: bayes: 512: Performing tweak 1 to 1 2015/7/13 00:39:49 16176: pop3: 311: +OK POP3 POPFile (v1.1.3) server ready[0d][0a] 2015/7/13 00:39:49 16176: pop3: 311: -ERR No secure server specified[0d][0a] 2015/7/13 00:39:49 16176: pop3: 682: POP3 proxy done 2015/7/13 00:39:49 32035: mq: 259: Done with 16176 (0 to go) 2015/7/13 00:39:49 32035: Perl fatal error : panic: attempt to copy freed scalar 2f4c7b0 to 24f6f78 at /usr/share/popfile/POPFile/MQ.pm line 330.[0a] 2015/7/13 00:39:49 32035: called from POPFile::Loader /usr/share/popfile/POPFile/Loader.pm 135 POPFile::Loader::CORE_die 1 0 2015/7/13 00:39:49 32035: called from POPFile::MQ /usr/share/popfile/POPFile/MQ.pm 330 POPFile::Loader::__ANON__ 1 1766 UUUUUUUUUUUUUUU[05] 2015/7/13 00:39:49 32035: called from POPFile::MQ /usr/share/popfile/POPFile/MQ.pm 118 POPFile::MQ::flush_child_data_ 1 1766 UUUUUUUUUUUUUUU[05] 2015/7/13 00:39:49 32035: called from POPFile::Loader /usr/share/popfile/POPFile/Loader.pm 796 POPFile::MQ::service 1 256 2015/7/13 00:39:49 32035: called from main /usr/share/popfile/popfile.pl 118 POPFile::Loader::CORE_service 1 2022 2015/7/13 00:39:49 32035: POPFile stopped
comment:9 by , 8 years ago
One more crash today , essentially the same:
2015/7/15 15:26:13 29134: mq: 259: Done with 19527 (0 to go) 2015/7/15 15:28:12 19645: bayes: 811: Attempting to connect to dbi:SQLite:dbname=/var/lib/popfile/popfile.db (1) 2015/7/15 15:28:12 19645: bayes: 912: Using SQLite library version 3.8.2 2015/7/15 15:28:12 19645: bayes: 512: Performing tweak 1 to 1 2015/7/15 15:28:12 19645: pop3: 311: +OK POP3 POPFile (v1.1.3) server ready[0d][0a] 2015/7/15 15:28:12 19645: pop3: 311: -ERR No secure server specified[0d][0a] 2015/7/15 15:28:12 19645: pop3: 682: POP3 proxy done 2015/7/15 15:28:12 29134: mq: 259: Done with 19645 (0 to go) 2015/7/15 15:28:12 29134: Perl fatal error : panic: attempt to copy freed scalar 3483bf0 to 2896058 at /usr/share/popfile/POPFile/MQ.pm line 330.[0a] 2015/7/15 15:28:12 29134: called from POPFile::Loader /usr/share/popfile/POPFile/Loader.pm 135 POPFile::Loader::CORE_die 1 0 2015/7/15 15:28:12 29134: called from POPFile::MQ /usr/share/popfile/POPFile/MQ.pm 330 POPFile::Loader::__ANON__ 1 1766 UUUUUUUUUUUUUUU[05] 2015/7/15 15:28:12 29134: called from POPFile::MQ /usr/share/popfile/POPFile/MQ.pm 118 POPFile::MQ::flush_child_data_ 1 1766 UUUUUUUUUUUUUUU[05] 2015/7/15 15:28:12 29134: called from POPFile::Loader /usr/share/popfile/POPFile/Loader.pm 796 POPFile::MQ::service 1 256 2015/7/15 15:28:12 29134: called from main /usr/share/popfile/popfile.pl 118 POPFile::Loader::CORE_service 1 2022 2015/7/15 15:28:12 29134: POPFile stopped
Hi,
Thanks for your report.
The line 330 of MQ.pm is in 'flush_child_data_' function:
http://getpopfile.org/browser/tags/v1_1_3/engine/POPFile/MQ.pm#L330
This code just copies values of the parameter and I have no idea why the error occurs.
This seems to be an internal error of Perl. I think this might be a bug of Perl itself or some Perl module which POPFile depends on.
What version of Perl are you using?
Naoki