Help → Possible parsing problem

Possible parsing problem

I've been using popfile for quite some time, now on 1.0.0


My ISP recently changed their upstream and from about that time I am getting maybe 1 in a 1000 emails fatally broken. The XP box here is stable.
Popfile seems to be otherwise operating normally.


The system here is complex. There is software between popfile (ignoring AV which is well respected) and the mail server but this is useful and allowed me to turn on logging there.


This shows the dialog both sides, local to remote mail server and local to popfile. No abnormality is seen.


Having the stream which was fed to popfile in a log file and the corrupt message as stored by Thunderbird the finger points at popfile.


I don't yet know how to turn on popfile logging.


Example of the problem (fake users, is spam, but problem is not unique to spam)


Sent to popfile

RETR 3
+OK Message follows
Return-path: <[email protected]>                                                  
Envelope-to: [email protected]                                             
Delivery-date: Thu, 10 Apr 2008 21:24:48 +0100    


As stored by TB after passing through popfile

Return-path: <[email protected]>                                                  
Subject:
X-Text-Classification: unclassified
X-POPFile-Link: http://127.0.0.1:9091/jump_to_message?view=623834

.
Envelope-to: [email protected]       


Don't miss the '.' on the line above. Quoted above is followed by the message header in the body as seen by TB.


A correct email is as follows: -


Return-path: <[email protected]>                                     
Envelope-to: [email protected]                                                   
Delivery-date: Thu, 10 Apr 2008 21:36:24 +0100                             
Received: from exprod5mx242.postini.com ([64.18.0.162] helo=psmtp.com)


Popfile in this case correctly tags on at the end of the header.


I have examples which in theory can be forwarded to a developer and perhaps fed into popfile from a local test pop3 server.

Out of amusement...

"Messages classified: 724,786
Classification errors: 1,020
Accuracy: 99.85%
(Last Reset: Mon Apr 16 15:23:53 2007 )"

Database check seems to show no errors.

  • Message #106

    That's ugly!

    It's been quite a while since we've seen reports of POPFile not being able to correctly parse a message, so I suspect that those messages arrive in a really weird state. But let's try to find out what is going on here.

    First, enable full logging in POPFile.

    Then wait for a mangled message to arrive and check the log file, you can access it from the Configuration Tab.

    Post the relevant parts of the log file here and perhaps we'll can see even better what is going on.

    Manni

    • Message #107

      Got one. A fairly short spam.

      Err... we have a whoops!

      Sent to popfile

      RETR 7
      +OK Message follows
      Return-path: <[email protected]>                            
      Envelope-to: [email protected]                                                
      Delivery-date: Fri, 11 Apr 2008 16:42:18 +0100                             
      Received: from exprod5mx223.postini.com ([64.18.0.82] helo=psmtp.com)
      	  by pih-sunmxcore11.plus.net with smtp (PlusNet MXCore v2.00) id 1JkLOM-0007Ss-5h 
      	  for [email protected]; Fri, 11 Apr 2008 16:42:18 +0100                      
      Received: from source ([88.252.146.213]) by exprod5mx223.postini.com ([64.18.4.11]) with SMTP;
      	Fri, 11 Apr 2008 11:42:15 EDT                                              
      Message-ID: <[email protected]>
      From: "diarmid rudolph" <[email protected]>
      To: <[email protected]>
      Date: Fri, 11 Apr 2008 13:54:53 +0000
      MIME-Version: 1.0
      Content-Type: multipart/alternative;
      	boundary="----=_NextPart_000_0005_01C89BEA.03B204FB"
      X-Priority: 3                                                              
      X-MSMail-Priority: Normal                                                  
      X-Mailer: Microsoft Outlook Express 6.00.2900.3138
      X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3198
      X-pstn-levels:     (S: 0.00000/71.15880 CV:99.9000 R:95.9108 P:95.9108 M:97.0282 C:97.9508 )
      X-pstn-status: off                                                         
      X-pn-pstn: Spam 1                                                          
      Subject:  special deal on replica
      X-EsetId: AA1C812BD0B53636FE5C                                             
      X-RegEx-Score: 65.6                                                        
      X-RegEx: [59.6] FROM_AND_RECEIVED_DO_NOT_MATCH FQDN in From and Received header do not match
      X-RegEx: [6.0] BE_AMAZED Apparently, you'll be amazed                      
      X-SpamPal: SPAM SBL+XBL 89.38.113.105 BODY                                                                                                                                                                                                                                                                                                                                                                                                                                      
      
      This is a multi-part message in MIME format.
      
      ------=_NextPart_000_0005_01C89BEA.03B204FB
      Content-Type: text/plain;
      	charset="iso-8859-1"
      Content-Transfer-Encoding: quoted-printable
      
      " My order arrived yesterday via registered mail in good order THE WATCH =
      IS BEAUTIFUL AND EVEN BETTER THAN I EXPECTED."=20
         Try it for yourself - u will be amazed!!
      - The worlds largest online retailer of luxury products, including:
      Rolex Sports Models
      Rolex Datejusts
        Breitling
      Cartier
      Porsche Design
      Dolce & Gabbana
      Dior
      Gucci
      Hermes Watches
      Patek Philippe
       Visit - www.selemmu.com
      ------=_NextPart_000_0005_01C89BEA.03B204FB
      Content-Type: text/html;
      	charset="iso-8859-1"
      Content-Transfer-Encoding: quoted-printable
      
      <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
      <HTML><HEAD>
      <META http-equiv=3DContent-Type content=3D"text/html; =
      charset=3Diso-8859-1">
      <META content=3D"MSHTML 6.00.2900.3199" name=3DGENERATOR>
      <STYLE></STYLE>
      </HEAD>
      <BODY bgColor=3D#ffffff>
      <p>" My order arrived yesterday via registered mail in good order THE =
      WATCH IS BEAUTIFUL AND EVEN BETTER THAN I EXPECTED." </p>=20
      
        <p>Try it for yourself - u will be amazed!!</p>
      
      <p>- The worlds largest online retailer of luxury products, =
      including:</p>
      <ul><li>Rolex Sports Models<br/></li>
      <li>Rolex Datejusts<br/></li> =20
      <li>Breitling<br /></li>
      <li>Cartier<br /></li>
      <li>Porsche Design<br /></li>
      <li>Dolce & Gabbana<br /></li>
      <li>Dior<br /></li>
      <li>Gucci<br /></li>
      <li>Hermes Watches<br /></li>
      <li>Patek Philippe<br/></li></ul>=20
      
      
      Visit - <a =
      href=3D"http://selemmu.com/">www.selemmu.com</a></BODY></HTML>
      ------=_NextPart_000_0005_01C89BEA.03B204FB--
      
      
      .
      

      popfile log without timestamp

      pop3: 209: Command: --RETR 7--
      pop3: 301: RETR 7[0d][0a]
      pop3: 658: Read slurp data +OK Message follows[0d][0a]
      pop3: 301: +OK Message follows[0d][0a]
      history: 284: reserve_slot selected random number 350280763
      history: 316: reserve_slot returning slot id 627817
      bayes: 658: Read slurp data Return-path: <[email protected]>                            
      bayes: 658: Read slurp data [0d][0a].[0d][0a]Envelope-to: [email protected]                                                [0d][0a]Delivery-date: Fri, 11 Apr 2008 16:42:18 +0100                             [0d][0a]R
      mq: 377: post COMIT (27hi7n1s0Dlubi7kKq:627817:unclassified:0)
      mq: 392: sending post COMIT (27hi7n1s0Dlubi7kKq:627817:unclassified:0) to parent GLOB(0x2116c50)
      mq: 377: post COMIT (27hi7n1s0Dlubi7kKq:627817:unclassified:0)
      mq: 384: queuing post COMIT (27hi7n1s0Dlubi7kKq:627817:unclassified:0)
      mq: 386: COMIT queue length now 0
      mq: 128: Message COMIT (27hi7n1s0Dlubi7kKq:627817:unclassified:0) ready for delivery
      mq: 131: Delivering message COMIT (27hi7n1s0Dlubi7kKq:627817:unclassified:0) to history
      pop3: 209: Command: --DELE 7--
      mq: 131: Delivering message COMIT (27hi7n1s0Dlubi7kKq:627817:unclassified:0) to bayes
      pop3: 301: DELE 7[0d][0a]
      pop3: 658: Read slurp data +OK Deleted.[0d][0a]
      pop3: 301: +OK Deleted.[0d][0a]
      pop3: 209: Command: --RETR 8--
      

      Message from email client

      From - Fri Apr 11 16:49:26 2008
      X-Account-Key: account5
      X-UIDL: UID778688-1152481720
      X-Mozilla-Status: 0001
      X-Mozilla-Status2: 00000000
      X-Mozilla-Keys:                                                                                 
      Return-path: <[email protected]>                            
      Subject:
      X-Text-Classification: unclassified
      X-POPFile-Link: http://127.0.0.1:9091/jump_to_message?view=627817
      
      .
      Envelope-to: [email protected]                                                
      Delivery-date: Fri, 11 Apr 2008 16:42:18 +0100                             
      Received: from exprod5mx223.postini.com ([64.18.0.82] helo=psmtp.com)
      	  by pih-sunmxcore11.plus.net with smtp (PlusNet MXCore v2.00) id 1JkLOM-0007Ss-5h 
      	  for [email protected]; Fri, 11 Apr 2008 16:42:18 +0100                      
      Received: from source ([88.252.146.213]) by exprod5mx223.postini.com ([64.18.4.11]) with SMTP;
      	Fri, 11 Apr 2008 11:42:15 EDT                                              
      Message-ID: <[email protected]>
      From: "diarmid rudolph" <[email protected]>
      To: <[email protected]>
      Date: Fri, 11 Apr 2008 13:54:53 +0000
      MIME-Version: 1.0
      Content-Type: multipart/alternative;
      	boundary="----=_NextPart_000_0005_01C89BEA.03B204FB"
      X-Priority: 3                                                              
      X-MSMail-Priority: Normal                                                  
      X-Mailer: Microsoft Outlook Express 6.00.2900.3138
      X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3198
      X-pstn-levels:     (S: 0.00000/71.15880 CV:99.9000 R:95.9108 P:95.9108 M:97.0282 C:97.9508 )
      X-pstn-status: off                                                         
      X-pn-pstn: Spam 1                                                          
      Subject:  special deal on replica
      X-EsetId: AA1C812BD0B53636FE5C                                             
      X-RegEx-Score: 65.6                                                        
      X-RegEx: [59.6] FROM_AND_RECEIVED_DO_NOT_MATCH FQDN in From and Received header do not match
      X-RegEx: [6.0] BE_AMAZED Apparently, you'll be amazed                      
      X-SpamPal: SPAM SBL+XBL 89.38.113.105 BODY                                                                                                                                                                                                                                                                                                                                                                                                                                      
      
      This is a multi-part message in MIME format.
      
      ------=_NextPart_000_0005_01C89BEA.03B204FB
      Content-Type: text/plain;
      	charset="iso-8859-1"
      Content-Transfer-Encoding: quoted-printable
      
      " My order arrived yesterday via registered mail in good order THE WATCH =
      IS BEAUTIFUL AND EVEN BETTER THAN I EXPECTED."=20
         Try it for yourself - u will be amazed!!
      - The worlds largest online retailer of luxury products, including:
      Rolex Sports Models
      Rolex Datejusts
        Breitling
      Cartier
      Porsche Design
      Dolce & Gabbana
      Dior
      Gucci
      Hermes Watches
      Patek Philippe
       Visit - www.selemmu.com
      ------=_NextPart_000_0005_01C89BEA.03B204FB
      Content-Type: text/html;
      	charset="iso-8859-1"
      Content-Transfer-Encoding: quoted-printable
      
      <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
      <HTML><HEAD>
      <META http-equiv=3DContent-Type content=3D"text/html; =
      charset=3Diso-8859-1">
      <META content=3D"MSHTML 6.00.2900.3199" name=3DGENERATOR>
      <STYLE></STYLE>
      </HEAD>
      <BODY bgColor=3D#ffffff>
      <p>" My order arrived yesterday via registered mail in good order THE =
      WATCH IS BEAUTIFUL AND EVEN BETTER THAN I EXPECTED." </p>=20
      
        <p>Try it for yourself - u will be amazed!!</p>
      
      <p>- The worlds largest online retailer of luxury products, =
      including:</p>
      <ul><li>Rolex Sports Models<br/></li>
      <li>Rolex Datejusts<br/></li> =20
      <li>Breitling<br /></li>
      <li>Cartier<br /></li>
      <li>Porsche Design<br /></li>
      
      <li>Dolce & Gabbana<br /></li>
      <li>Dior<br /></li>
      <li>Gucci<br /></li>
      <li>Hermes Watches<br /></li>
      <li>Patek Philippe<br/></li></ul>=20
      
      
      Visit - <a =
      href=3D"http://selemmu.com/">www.selemmu.com</a></BODY></HTML>
      ------=_NextPart_000_0005_01C89BEA.03B204FB--
      
      
      • Message #108

        No takers?

      • Message #109

        Well, yes and no.

        POPFile didn't react quite gracefully, but it was forced to deal with a badly mangled message in the first place. Apparently some piece of software that dealt with the message before POPFile simply added its headers AFTER the Content-Type header, thus completely breaking the message. Then along came POPFile and broke it some more.

        I have no idea how we should deal with something like that.

        Regards,
        Manni

        • Message #110

          Okay and yes.

          I'm trying to avoid more long posts with data.

          As I understand it the usual processing is add new header content at the end of the header. End of header is deliminated by a blank line which is not classed as breaking white space within the header.

          So "simply added its headers AFTER the Content-Type header" is entirely normal if that is within the header and happens to be the last item.

          RFC822 "A message consists of header fields and, optionally, a body.

          The body is simply a sequence of lines containing ASCII charac-
          ters. It is separated from the headers by a null line (i.e., a
          line with nothing preceding the CRLF)."

          crlf

          Is breaking white space inside the header.

          Oldest mail outside of archives here is 1993 and random checks since then show as above, including later on popfile doing just that. The content-type header is part of the header, not necessarily the last item. The blank line always delimits.

          The ISP change here seems to have propagated back to how it used to be, presumably the MX record changed. This means that examples of the problem appear with both mails servers. Sigh, so it isn't that.

          The defective emails after processing are always this format

          Return-path: xxxxxxxxxxxxx
          Subject:
          X-Text-Classification: xxxxxxxxxxxxx
          X-POPFile-Link: http://127.0.0.1:9091/jump_to_message?view=643902

          .

          Subject is always blank but the real subject line isn't.

          Looks to me like popfile is being stalled, somehow aborts out, gets as far as writing out Subject:
          Then tags on it's header tags and end of header. Somehow the message then gets written out. Thunderbird gets the message and correctly takes the header as above, putting the '.' as the first visible line of the body and the blank Subject as the subject.

          Do you have the parser broken out to a standalone so it can be fed data?

          • Message #111

            Sorry, you are right. I was barking up the wrong tree. I guess that Content-Type header was simply the first thing that struck me as strange.

            Did you somehow edit the POPFile log, apart from cutting away the time stamps? Are the timestamps just plain boring or is there a significant delay at some point?

            As you might have noticed, I am not an expert on POPFile's email parser. But I guess it should be possible to figure out where the dot is inserted. I'll look into that later today.

            Regards,
            Manni

            PS: I might be helpful if you posted your popfile.cfg file which contains your POPFile configuration data.

            • Message #112

              Sorry, you are right. I was barking up the wrong tree. I guess that Content-Type header was simply the first thing that struck me as strange.

              No worry, the comments might help someone one day. If something is not mentioned, it is not discussed.


              Did you somehow edit the POPFile log, apart from cutting away the time stamps? Are the timestamps just plain boring or is there a significant delay at some point?

              Not that I recall. I'm embarressed about posting acres of content. A massive timestamp seemed of no interest. Time was going forwards as ususal. When you get several thousand emails first thing (2630 today) there had better not be major processing delays. Better than one msg a second, so I don't care. More limited by RBL lookups (typical response <200ms per lookup)

              As you might have noticed, I am not an expert on POPFile's email parser. But I guess it should be possible to figure out where the dot is inserted. I'll look into that later today.

              I don't even know the layout of popfile, seem to be a lot of files where it doesn't at first sight make much sense.

              PS: I might be helpful if you posted your popfile.cfg file which contains your POPFile configuration data.

              "bayes_bad_sqlite_version 3.0.0
              bayes_corpus corpus
              bayes_database popfile.db
              bayes_dbauth
              bayes_dbconnect dbi:SQLite2:dbname=$dbname
              bayes_dbuser
              bayes_hostname xxxxxxx
              bayes_localhostname
              bayes_nihongo_parser kakasi
              bayes_sqlite_tweaks 4294967295
              bayes_subject_mod_left [
              bayes_subject_mod_right ]
              bayes_unclassified_weight 100
              bayes_xpl_angle 0
              config_piddir ./
              GLOBAL_debug 1
              GLOBAL_message_cutoff 100000
              GLOBAL_msgdir messages/
              GLOBAL_timeout 60
              history_archive 0
              history_archive_classes 0
              history_archive_dir archive
              history_history_days 2
              html_cache_templates 0
              html_column_characters 39
              html_columns +inserted,+from,+to,-cc,+subject,-date,-size,+bucket
              html_date_format
              html_language English
              html_last_reset Mon Apr 16 15:23:53 2007
              html_last_update_check 1207872000
              html_local 1
              html_page_size 200
              html_password b61124a488acc5f4406e74272629ee86
              html_port 9091
              html_send_stats 1
              html_session_dividers 1
              html_show_bucket_help 0
              html_show_training_help 0
              html_skin default
              html_strict_templates 0
              html_test_language 0
              html_update_check 0
              html_wordtable_format
              imap_bucket_folder_mappings
              imap_enabled 1
              imap_expunge 0
              imap_hostname 127.0.0.1
              imap_login xxxxxxx
              imap_password xxxxxxxxx
              imap_port 5143
              imap_training_mode 0
              imap_uidnexts
              imap_uidvalidities
              imap_update_interval 20
              imap_use_multiple_connections 0
              imap_use_ssl 0
              imap_watched_folders INBOX
              logger_format default
              logger_level 0
              logger_logdir ./
              pop3_enabled 1
              pop3_force_fork 1
              pop3_local 1
              pop3_port 111
              pop3_secure_port 110
              pop3_secure_server
              pop3_separator :
              pop3_socks_port 1080
              pop3_socks_server
              pop3_toptoo 0
              pop3_welcome_string POP3 POPFile (v1.0.0) server ready
              smtp_chain_port 25
              smtp_chain_server
              smtp_enabled 0
              smtp_force_fork 0
              smtp_local 1
              smtp_port 25
              smtp_socks_port 1080
              smtp_socks_server
              smtp_welcome_string SMTP POPFile (v0.22.5) welcome
              windows_console 0
              windows_trayicon 1
              xmlrpc_enabled 0
              xmlrpc_local 1
              xmlrpc_port 8081"

              IMAP is not being used. Was used experimentally with a local server.

      • Message #113

        Hi,

        bayes: 658: Read slurp data Return-path: <[email protected]>
        bayes: 658: Read slurp data [0d][0a].[0d][0a]Envelope-to: [email protected] [0d][0a]Delivery-date: Fri, 11 Apr 2008 16:42:18 +0100 [0d][0a]R

        According to the log file, the mail server (or some other proxies) sent
        a line which contains just a single period '.' after sending the first
        line of the message.

        In the POP3 protocol, that line means the end of the message.
        So POPFile stopped analyzing the message and added its message headers
        after the "last" header of the message.

        I think this is not a POPFile's problem.

        Here's a snippet from the RFC 1939:

        Responses to certain commands are multi-line. In these cases, which
        are clearly indicated below, after sending the first line of the
        response and a CRLF, any additional lines are sent, each terminated
        by a CRLF pair. When all lines of the response have been sent, a
        final line is sent, consisting of a termination octet (decimal code
        046, ".") and a CRLF pair. If any line of the multi-line response
        begins with the termination octet, the line is "byte-stuffed" by
        pre-pending the termination octet to that line of the response.
        Hence a multi-line response is terminated with the five octets
        "CRLF.CRLF". When examining a multi-line response, the client checks
        to see if the line begins with the termination octet. If so and if
        octets other than CRLF follow, the first octet of the line (the
        termination octet) is stripped away. If so and if CRLF immediately
        follows the termination character, then the response from the POP
        server is ended and the line containing ".CRLF" is not considered
        part of the multi-line response.

        http://www.faqs.org/rfcs/rfc1939.html

        Naoki

        • Message #114

          bayes: 658: Read slurp data Return-path: <[email protected]>
          bayes: 658: Read slurp data [0d][0a].[0d][0a]Envelope-to: [email protected] [0d][0a]Delivery-date: Fri, 11 Apr 2008 16:42:18 +0100 [0d][0a]R


          According to the log file, the mail server (or some other proxies) sent
          a line which contains just a single period '.' after sending the first
          line of the message.

          In the POP3 protocol, that line means the end of the message.

          That seems to be what popfile _output_, the stream sent _to_ popfile is the first log: I posted logs of input, popfile log and as stored by email client.

          The last item of sent to popfile is '.'

          Looks normal, similar to all other messages, ie. I cannot so far see anything peculiar about the feed from spampal (version is over a year old, primary task is RBL lookup of all URL in header and message body, in the case of the example message a hit on a URL in the message body, selemmu.com, the spam money route).

          Here is my decode of the sent to popfile: -

          preamble and early headers not interesting

          X-pstn-status: off
          X-pn-pstn: Spam 1
          Added by remote Postini mail server


          Subject: special deal on replica
          Presumably subject written here by mail server

          X-EsetId?: AA1C812BD0B53636FE5C
          Added by NOD32 AV here, 3.0.642.0

          X-RegEx?-Score: 65.6
          X-RegEx?: [59.6] FROM_AND_RECEIVED_DO_NOT_MATCH FQDN in From and Received header do not match
          X-RegEx?: [6.0] BE_AMAZED Apparently, you'll be amazed
          X-SpamPal?: SPAM SBL+XBL 89.38.113.105 BODY
          Added by spampal 1.73g here


          Blank line end of header

          This is a multi-part message in MIME format."
          First line of body

          ...

          Visit - <a =
          href=3D"http://selemmu.com/">www.selemmu.com</a></BODY></HTML>


          .
          That is the end of the whole message sent to popfile.

          It might be possible to insert yet another snoop point to confirm what spampal is receiving but a better way would be have spampal clone it's input to a file.
          The only possibility I can see of anything corrupting between spampal output and popfile input is a scan by AV and this is not toy AV.

          As I understand it popfile is logging what it thinks is going on, not actual input data, hence the log is being produced under some kind of fault condition. popfile reports the '.' in the wrong place and indeed that is what it writes to it's output.

          Q: the popfile log shows a very long line, is that normal?

          • Message #115

            As I understand it popfile is logging what it thinks is going on, not actual input data,

            POPFile's logging happens right when the stream is read from the socket, there is no "thinking" going on here. I definitely think that we should regard POPFile's log as correct.

            Q: the popfile log shows a very long line, is that normal?

            How long is very long? The data are read (and logged) in chunks of 160 Bytes. Add the timestamp, pid and other overhead and you can end up with some 200 characters per line.

            Regards,
            Manni

            • Message #117

              I have now reconfigured AV for watch streams only.

              Q: the popfile log shows a very long line, is that normal?


              How long is very long? The data are read (and logged) in chunks of 160 Bytes. Add the timestamp, pid and other overhead and you can end up with some 200 characters per line.

              Ok, from popfile log at start of this thread and looks about that.

              "bayes: 658: Read slurp data [0d][0a].[0d][0a]Envelope-to: [email protected] [0d][0a]Delivery-date: Fri, 11 Apr 2008 16:42:18 +0100 [0d][0a]R"

          • Message #116

            Hi,

            bayes: 658: Read slurp data Return-path: <[email protected]>
            bayes: 658: Read slurp data [0d][0a].[0d][0a]Envelope-to: [email protected] [0d][0a]Delivery-date: Fri, 11 Apr 2008 16:42:18 +0100 [0d][0a]R


            According to the log file, the mail server (or some other proxies) sent
            a line which contains just a single period '.' after sending the first
            line of the message.

            In the POP3 protocol, that line means the end of the message.


            That seems to be what popfile _output_, the stream sent _to_ popfile is the first log: I posted logs of input, popfile log and as stored by email client.

            No. The 'Read slurp data' log shows what was input into POPFile.

            It seems that something is wrong with some software between
            POPFile and the mail server.

            How did you get the first log? That was the mail server's log?
            If not using POPFile, there's no problem?

            Naoki

            • Message #118

              How did you get the first log? That was the mail server's log?
              If not using POPFile, there's no problem?

              mail_server ==> AV ==> (inlog) spampal (outlog) ==> popfile ==> thunderbird

              (outlog posted)

              Not known if it is really.

              mail_server ==> AV ==> spampal ==> AV ==> popfile ==> AV ==> thunderbird

              Done in that order so that popfile can see RBL and other results from spampal.

            • Message #120

              It seems that something is wrong with some software between
              POPFile and the mail server.

              Update 17th April

              10th April 8 bad
              11th April 6 bad
              12th April 5 bad
              13th April 2 bad
              14th April 7 bad
              15th April 7 bad last received at 16:14, AV reconfigured
              16th April 0 bad
              17th April 0 bad

              Been about 15,000 emails since the 15th.

              I will give it a few more days and if it continues at zero, reconfigure AV back to how it was.

              • Message #134

                The defect has been traced and stopped but the root cause is unknown.

                I am posting this to close the issue but also put the information on record so that it might help someone else.

                1. No local software change is known to have occured at the onset of the problem.

                2. The email feed changed at the ISP at about the time of the problem onset but seems to be RFC compliant.

                3. The defect was in AV scanning of a POP3 stream on an alien port, 111 instead of the standard 110.

                4. It was noticed that the AV was inserting a "scanned" marker into the header in more than one position, in other words multiple scanning was going on. In itself benign but not desirable.

                5. Setting AV to only watch the stream, no modification allowed, stopped the data corruption, part proof that there was a connection between the two.

                6. The solution worked out with AV vender support, who responded well, is removal of port 110 and adding of port 111 as POP3 source. This means that scanning will only take place on 111 and that is after Spampal and Popfile, between Popfile and Thunderbird, the email client.

                &. Since the change, over >20,000 emails, zero defects. The current handful of infected emails every day continue to be detected and deleted/quarantined/sent for further investigation, depending.