POPFile Developer's Guide

1. How do I become a POPFile developer?

If you are interested in developing code for POPFile then you have come to the right place. The first thing you must do is read this document and understand it.

Then go to the Bleeding Edge - Source Code forum on the POPFile page on SourceForge and post a message saying that you are interested in writing some code and outlining your suggestion. You'll get honest feedback on the idea. This is very important because you don't want to waste your time coding something that someone else is already working on. It's also a good idea to find out whether I think your idea is good; otherwise you might end up coding something that gets rejected. Most likely you'll get encouragement and suggestions from other POPFile developers and a “show me” response from me.

It is important to realize that one thing I really care about in the code base is the quality of the code itself. That means taking the time to understand the POPFile coding style and how to write POPFile tests is very important. Changes to POPFile that are untested or don't meet the coding style will be rejected.

You will also need to sign a POPFile License Agreement (see below).

2. How do I post a patch?

First read this entire document, then create a patch using diff3 (or a similar program) and post it to the Patches database on SourceForge. Make sure that your patch meets the coding standard and has associated unit tests (unless it's really, really, really simple).

Before you post your patch it's a good idea to run the POPFile test suite against your version of the code to make sure you didn't break anything else. Posting a patch that meets the coding standard, has its own tests and passes the full POPFile test suite is the best way to get on my good side, get your code into CVS and one day maybe get CVS access for yourself.

3. How do I get CVS commit access?

Contact the project owner, John Graham-Cumming, [email protected]. You'll get CVS access if you consistently deliver great patches, join in with the other developers in the [Bleeding Edge - Source Code] forum, follow the coding guidelines, write good tests and, even better, take direction from me on changes that I want to make in POPFile.

4. Does POPFile have a coding style?

Yes, and here it is (note that not all of the POPFile source code matches this coding style… yet… if you see something that needs cleaning up then clean it up!).

POPFile has a coding style so that all the code looks consistent and is easy for others to read and understand. POPFile's coding standard may not be the one you prefer. Too bad! Individual style is one thing, but getting a single consistent style across multiple developers is vital to keeping the code clean.

  • Don't obfuscate your Perl. Perl is a great language for writing cross platform code, but it is easy to be obscure and there's no excuse. So minimize your use of implicit variables and imagine that a new Perl programmer is trying to decipher what you have written.

  • No TAB characters in files, and use 4 space indentation. No carriage return characters in files.
  • { goes at the beginning of a new line after a sub declaration but at the end of the line in all other cases.
    sub foo
    {
    }
     
    if ( $done != 0 ) {
    }
  • Leave blank lines above and below blocks of comments, even when the comments are small.
    # Check whether the elephant feeding module is loaded and load it
    # if necessary
     
    if ( $efm->loaded() ) {
  • Every module or file must have a file header that states the name of the code and the purpose of the code in the file and the standard copyright notice as follows:
    #---------------------------------------------------------------------------
    #
    # Module.pm --- A module that handles the loading of banana wumpus drivers
    #               and links them into the octopus subsystem using dinolinking
    #
    # Copyright (c) 2001-2003 John Graham-Cumming
    #
    #---------------------------------------------------------------------------
  • Every subroutine must have a header of the following form:
    #---------------------------------------------------------------------------
    #
    # update_word
    #
    # Updates the word frequency for a word
    #
    # $word         The word that is being updated
    # $encoded      1 if the line was found in encoded text (base64)
    # $before       The character that appeared before the word in the original
    #               line
    # $after        The character that appeared after the word in the original
    #               line
    # $prefix       A string to prefix any words with in the corpus, used for
    #               the special identification of values found in for example
    #               the subject line
    #
    #---------------------------------------------------------------------------
  • Explain any code where the operation is not obvious with a comment. The bar for not obvious should be very low, but never write something like:
    # Increment i
     
    $i += 1;

    but a good comment would be:

    # This is A PIECE OF PLATFORM SPECIFIC CODE and all it does is force
    # Windows users to have v5.8.0 because that's the version with good fork()
    # support; everyone else can use whatever they want.  This is probably only
    # temporary because at some point I am going to force 5.8.0 for everyone
    # because of the better Unicode support
     
    my $on_windows = 0;
     
    if ( $^O eq 'MSWin32' ) {
        require v5.8.0;
        $on_windows = 1;
    }
  • Use parens instead of relying on precedence rules. Write
    if ( ( $foo == $bar ) && ( $number > 0 ) ) {

    instead of

    if ( $foo == $bar && $number > 0 ) {
  • Don't use elsif. This is because of some lameness in the POPFile code coverage tool.
  • Use lowercase with underscore between words for subroutine and variable names, e.g.
    start_your_engines();
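
Taken together, the rules above might look something like the following in practice. This is only a sketch: the subroutine classify_count and its parameter are hypothetical names made up for illustration and are not part of POPFile, and the file header with the copyright notice is left out because this is not a real POPFile file.

```perl
use strict;
use warnings;

#---------------------------------------------------------------------------
#
# classify_count
#
# Returns a short description of how many messages are waiting (this is a
# hypothetical subroutine used only to illustrate the coding style)
#
# $message_count  The number of messages that are waiting
#
#---------------------------------------------------------------------------
sub classify_count
{
    my ( $message_count ) = @_;

    # elsif is not allowed, so the second test is nested inside the else
    # block of the first test

    if ( $message_count <= 0 ) {
        return 'none';
    } else {
        if ( ( $message_count > 0 ) && ( $message_count < 10 ) ) {
            return 'few';
        } else {
            return 'many';
        }
    }
}

print classify_count( 3 ), "\n";
```

Note the 4 space indentation, the { on its own line after the sub declaration only, the blank lines around the comment block, the explicit parens in the compound condition and the nested if in place of elsif.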

5. How is POPFile tested?

POPFile has an automatic test suite that is run using the tests.pl script. You can type make test in the engine directory to run the tests or simply run tests.pl yourself.

tests.pl searches the tests/ subdirectory for files ending with .tst and loads them as Perl scripts. Each .tst module uses two helper functions: test_assert( $test ) and test_assert_equal( $test, $expected ). test_assert asserts that a particular test (an arbitrary piece of Perl that the test_assert subroutine will eval) is true. test_assert_equal is a glorified string and number comparison subroutine that asserts that the result of some test is equal to some expected value (test_assert_equal does not eval the $test parameter).

tests.pl will run all the .tst files printing a . for each test that passes and an appropriate error for those that fail, and then print out a summary at the end of the total number of tests and the number that failed.

Each Perl module in POPFile should have a corresponding test file in the tests/ subdirectory. For example, for MailParse.pm we have TestMailParse.tst.
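
To make the difference between the two helpers concrete, here is a minimal, self-contained sketch. The two helper subroutines below are simplified stand-ins written for this example and are an assumption about how such helpers could behave; the real ones are provided by tests.pl and may differ in detail. count_words is likewise a hypothetical subroutine standing in for the code under test.

```perl
use strict;
use warnings;

my $test_count    = 0;
my $test_failures = 0;

# Stand-in for test_assert (an assumption, not the tests.pl code): the
# test is a string of Perl that is passed to eval and must be true

sub test_assert
{
    my ( $test ) = @_;

    $test_count += 1;

    if ( eval( $test ) ) {
        print '.';
    } else {
        $test_failures += 1;
        print "\nFailed: $test\n";
    }
}

# Stand-in for test_assert_equal (an assumption, not the tests.pl code):
# the test has already been evaluated by the caller and is only compared,
# here as a string, with the expected value

sub test_assert_equal
{
    my ( $test, $expected ) = @_;

    $test_count += 1;

    if ( $test eq $expected ) {
        print '.';
    } else {
        $test_failures += 1;
        print "\nFailed: expected [$expected] but got [$test]\n";
    }
}

# Hypothetical code under test

sub count_words
{
    my ( $line ) = @_;

    my @words = split( ' ', $line );

    return scalar( @words );
}

# The lines below are the kind of thing a .tst file might contain

test_assert( 'count_words( "one two" ) == 2' );
test_assert_equal( count_words( 'buy cheap pills now' ), 4 );
test_assert_equal( lc( 'SPAM' ), 'spam' );

print "\n$test_count tests, $test_failures failed\n";
```

In a real .tst file only the last group of calls would appear, because tests.pl supplies the helpers and prints the summary itself.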

Before checking in new code or submitting a patch run the test suite to protect against regressions.

POPFile is released under the General Public License used for free software. Even so, contributors are required to sign the POPFile License Agreement. This ensures that the actual code is free of any claims by people whose interests are different from the GPL, enables me to litigate cleanly if someone were to break the GPL, and allows derivative versions of POPFile to be created from contributed code without legal problems.

A simple summary of this license is “you tell me that the code you wrote doesn't belong to someone else, you give me the right to do what I like with it in the context of POPFile, and I protect you from getting sued if your code in POPFile 'injures' someone”.

Some background reading on the subject of copyright and copyleft and the POPFile License Agreement can be found here:

  1. My original thread on POPFile and code copyright: http://sourceforge.net/forum/forum.php?thread_id=800798&forum_id=230652
  2. The Free Software Foundation's FAQ on the GPL: http://www.fsf.org/licenses/gpl-faq.html
      
      A very important section reads:
      http://www.fsf.org/licenses/gpl-faq.html#WhoHasThePower
      
      "Who has the power to enforce the GPL?
    
       Since the GPL is a copyright license, the copyright holders of the 
       software are the ones who have the power to enforce the GPL. If you see a 
       violation of the GPL, you should inform the developers of the GPL-covered
       software involved. They either are the copyright holders, or are connected
       with the copyright holders."
       
       It is vital that I have clear ownership of the copyright on POPFile so 
       that if it becomes necessary to sue under the GPL I can do so without hindrance.
       
       I URGE YOU TO READ THE ENTIRE FAQ.
  3. The Free Software Foundation's note on copyright assignment: http://www.fsf.org/licenses/why-assign.html
       
      "By Professor Eben Moglen, Columbia University Law School 
       
       Under US copyright law, which is the law under which most free software 
       programs have historically been first published, there are very 
       substantial procedural advantages to registration of copyright. And 
       despite the broad right of distribution conveyed by the GPL, enforcement 
       of copyright is generally not possible for distributors: only the 
       copyright holder or someone having assignment of the copyright can enforce 
       the license. If there are multiple authors of a copyrighted work, 
       successful enforcement depends on having the cooperation of all authors. 
       
       In order to make sure that all of our copyrights can meet the 
       recordkeeping and other requirements of registration, and in order to be 
       able to enforce the GPL most effectively, FSF requires that each author of 
       code incorporated in FSF projects provide a copyright assignment, and, 
       where appropriate, a disclaimer of any work-for-hire ownership claims by 
       the programmer's employer. That way we can be sure that all the code in FSF 
       projects is free code, whose freedom we can most effectively protect, and 
       therefore on which other developers can completely rely." 
 
developersguide.1169721336.txt.gz · Last modified: 2008/02/08 19:49 (external edit)

The content of this wiki is protected by the GNU Free Documentation License