Some brief notes on patch submission

The aim of this chunk of text is to encourage people to produce beautiful, consistent, maintainable code (as patches) that one can take pleasure in reading.

Read your patch

You should read what the reviewer is going to read; ie. when you take your diff -u <files>, read it through rather than just mailing it. If you see chunks like this:

@@ -838,9 +838,24 @@
...
	rtl::OUString   aWorkPath;
-
+   
 	aSecurity.getHomeDir( aHomePath );
...

Then it's best to manually chop these out of the patch - this is just some rather irrelevant whitespace / other change that has happened. Strip all your debugging code out, and ensure what remains makes sense.

Reading the patch also has the distinct advantage of refreshing your memory of the whole change, which often flags missing pieces. If you notice any, go back and make it complete, then re-diff.

Some people believe that a large line-count makes a patch impressive - however, the opposite is true, particularly for fixes; a neat & well contained solution is usually best.

Cleanliness: code hygiene

Commented out cruft

Leaving commented out code lying around is really bad news. It also advertises the programmer's uncertainty to the reader; such an uncertain change should never be committed. There is no point in pretending we have a fix, unless we are certain it is the correct fix. Until we have exceeded the understanding of the original author who made the slip, we can't make a reliable & accurate fix. A simple example:

//    if( 0 != a + 1 ) removed - looks wrong
    if( 0 != a )

Of course - as is likely with carelessly thought out code that isn't written with a full understanding - this is likely to be wrong again; so we get:

//    if( 0 != a + 1 ) removed before later H.Opeless change
//    if( 0 != a ) - removed H.Opeless 2005/01/26 bug #123456
    if( 0 != a - 1 )

Notice how the programmer has helpfully tried to improve his lot by referencing the bug he (hopefully) fixed. Notice how unconfident he is in his fix - if this is indeed the correct fix. Notice also how the date information is redundant & unreliable clutter; the history of the source should always be verified from an RCS for regression analysis.

Poor programmers often 'try' things, ie. when they reach a problem, instead of reading around it, understanding it well and then making a simple incision; they start fiddling (unsystematically) with ever increasing combinations of tweaks, testing each one - without any real understanding. Luckily they often leave a great trail of commented out things they tried before (perhaps) happening on a fix.

An excellent programmer would provide a patch that removed any such cruft; it would perhaps add a single line and remove 3.

-//    if( 0 != a + 1 ) removed before H.Opeless change
-//    if( 0 != a ) - removed H.Opeless 2005/01/26 bug #123456
-    if( 0 != a - 1 )
+    if( a > 1 )

It would require no comment. It would perhaps also fix a similar misunderstanding in 2 other places; and add some pre-conditions or assertions, such that this couldn't happen again. She would also perhaps re-write a few German comments & rename a few German variable-names to English equivalents. She would also provide a simple justification to the patch reviewer to make their job easier.
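
As a rough illustration of the pre-conditions and assertions mentioned above, here is a minimal sketch. The function remainingItems and the meaning given to 'a' are hypothetical, invented purely for this example; the point is only that the assumption behind if( a > 1 ) is written down once, in code, where a debug build will trap a violation.

```cpp
#include <cassert>

// Hypothetical helper: 'a' is assumed to be a count that is never negative.
// The assertion states that assumption and traps violations in debug builds.
inline int remainingItems( int a )
{
    assert( a >= 0 && "count must never be negative" );
    if( a > 1 )
        return a - 1;
    return 0;
}
```

A caller that passes a negative count now fails loudly at the point of the mistake, rather than prompting yet another round of tweaks to the condition.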

Over-commenting

No over-commenting: comments may describe the function of functions - however, functions should have clear names to avoid the necessity for that; people doing:

if( ( curLang & LANGUAGE_ENGLISH ) == LANGUAGE_ENGLISH ) // Some sort of english language

style commenting will be laughed at & told to remove it.
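
One way to make such a comment unnecessary is to move the test into a function whose name says what it means. The following is only a sketch: isEnglish, checkIsEnglish and the flag values are assumptions made up for illustration, and the real LANGUAGE_ENGLISH constant may be defined quite differently.

```cpp
#include <cassert>

// Hypothetical bit-flag values, for illustration only.
const unsigned LANGUAGE_ENGLISH = 0x01;
const unsigned LANGUAGE_GERMAN  = 0x02;

// The name carries the meaning, so the call site needs no comment.
// The brackets are required here: == binds tighter than &.
inline bool isEnglish( unsigned curLang )
{
    return ( curLang & LANGUAGE_ENGLISH ) == LANGUAGE_ENGLISH;
}

// Usage: the call reads as plain English.
inline void checkIsEnglish()
{
    assert( isEnglish( LANGUAGE_ENGLISH | LANGUAGE_GERMAN ) );
    assert( !isEnglish( LANGUAGE_GERMAN ) );
}
```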

Bracket over-use

Some people like (to (insert(random) ) brackets) in their expressions that serve no useful purpose. Of course - sometimes brackets are helpful - when operator precedence is not obvious; eg. a = 1 << 2 + 3 is a shooting offence [ NB. this is, surprisingly, effectively a = 1 << (2 + 3) ].
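
The precedence surprise is easy to check for yourself. A small sketch follows (the function names are made up for the example): since + binds tighter than <<, the unbracketed form shifts by 5, not by 2.

```cpp
#include <cassert>

// Parsed as 1 << (2 + 3); GCC and Clang can warn about this line.
inline int shiftUnbracketed() { return 1 << 2 + 3; }
inline int shiftBracketed()   { return 1 << ( 2 + 3 ); }
inline int shiftFirst()       { return ( 1 << 2 ) + 3; }

inline void checkPrecedence()
{
    assert( shiftUnbracketed() == shiftBracketed() ); // both are 32
    assert( shiftFirst() == 7 );
}
```

This is the one case where the extra brackets earn their keep: they tell the reader which of the two results was intended.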

Other places people like to abuse brackets: 'return' is not a function: ie. return 3; not return (3);

Style

Carefully written code matches the style of the surrounding code, or - where that is heterogeneous - creates an oasis of clean homogeneous style amid the mess. Your code should be consistent and conservative wrt. whitespace usage.

It's well worth reading Linus' coding style document - although ignoring the tab-stop section. Wasting vspace (vertical space) is particularly frowned upon.

if( something )
{
    one_line();
}
else
{
    another_line();
}

wastes huge chunks of vspace; it should be:

if( something )
    one_line();
else
    another_line();

But never:

if( something) one_line();
else another_line();

or something equally cramped.

Copy & Paste

In the bad old days before keyboards, cutting and pasting took lots of time, glue & co-ordination; it was also obvious that it filled your 128 bytes of memory with duplicate rubbish. In the modern world - the invention of the copy/paste combination has led to a massive increase in programmer productivity. Now - without the use of artificial arms, thousands of lines of bad code can be duplicated in many different places all over a largeish project. An added career benefit is that it looks as if you are really clever to have (apparently) written such large chunks of (what looks like) new code so quickly.

In reality - anyone cutting and pasting more than 2 lines of code will just be laughed at, called an amateur, and poked with a sharp stick; before being sent to tame some vicious, but simple sounding bug somewhere else. The net result of a nice patch should be to reduce complexity & code-size while adding features and increasing maintainability; if some existing code needs to be re-factored to avoid copy/paste - then you need to do that first; it's not optional. Particularly laughed at are large blocks of very similar code with a few numbers changed.
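
For the 'few numbers changed' case the re-factoring is usually mechanical: the number becomes a parameter. A minimal sketch, assuming three pasted blocks that each pad a string to a different width; padLeft and checkPadLeft are hypothetical names invented for this example.

```cpp
#include <cassert>
#include <string>

// One helper replaces N pasted blocks that differed only in the width.
inline std::string padLeft( const std::string &rText, std::string::size_type nWidth )
{
    if( rText.size() >= nWidth )
        return rText;
    return std::string( nWidth - rText.size(), ' ' ) + rText;
}

// Usage: each former copy shrinks to a single call.
inline void checkPadLeft()
{
    assert( padLeft( "ab", 4 ) == "  ab" );
    assert( padLeft( "ab", 6 ) == "    ab" );
    assert( padLeft( "abcdef", 4 ) == "abcdef" );
}
```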

Tab stops

So - this is an old problem, and one that keeps coming back to bite people. Here is the problem: a tab character is 0x09. A space is 0x20. Indentation is determined by the amount of visible space before code on a given line.

When an editor / viewer renders a '0x09' character there is a disagreement as to how many on-screen '0x20' characters to transliterate it into. Often in Unix-land it's 8. In Windows-land it's often 4 (by default etc.). ie. "0x09Hello World" would be:

    Hello World	        on Win32
        Hello World     in Linux

So what, you say; it should at least be consistent. The problem is - for smaller indents there is a consistent method - the space. ie. for a small indent of 4 columns on Unix - you can just use 4 spaces; leaving tabs for the bigger 8 column indents. ie.

{
    if (foo)
0x09baa();
}

Which will look fine on default Unix:

{
    if (foo)
        baa();
}

But on Win32, since the 0x09 will turn into 4 spaces, you will see:

{
    if (foo)
    baa();
}

which is ugly - and of course, is only a simple instance of this problem.
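
The difference comes down to where the next tab stop falls. The following sketch (expandTabs is a made-up name for this example, not any editor's real code) pads each 0x09 out to the next multiple of the tab width, which is roughly what a viewer does when it draws the line.

```cpp
#include <cassert>
#include <string>

// Replace each 0x09 with spaces up to the next multiple of nTabWidth.
inline std::string expandTabs( const std::string &rLine, std::string::size_type nTabWidth )
{
    assert( nTabWidth > 0 );
    std::string aResult;
    for( char c : rLine )
    {
        if( c == '\t' )
            aResult.append( nTabWidth - aResult.size() % nTabWidth, ' ' );
        else
            aResult += c;
    }
    return aResult;
}
```

With a width of 8 the line "0x09baa();" comes out indented by 8 spaces and sits nicely under the if; with a width of 4 it comes out indented by 4 and lines up with the if instead.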

How to fix it - since the OO.o standard is different to the stock Unix standard - first we need to configure our editor to deal with spaces. Inside emacs you can use M-x set-variable tab-width 4 to change the view of a single file; or a setq tab-width 4 hook for certain files.

When it all goes wrong: there is an app called 'expand' that will convert all your 0x09 tab characters to a given number of spaces. If you created the entire file on Unix, ie. it all has this 8-stop tab assumption, then expand <file> will do what you want.

Testing

There are a few things that people tend to forget about when altering the code - that require special care and testing.

Explain your patch

The person that reviews your patch is going to need to understand the context, and your reasoning. Hopefully the code you're changing is clear, and the bug is quite obvious; in that case no explanation is necessary. Perhaps it's a simple kind of bug: eg. a buffer over-run. Then something like:

-      for( i = 0; i < nLength; i++ )
+      for( i = 0; i < nLength + 1; i++ )

'nLength is not the real length, cf. its decl.' would be adequate.

When the bug or patch is more complex, proportionally more explanation is required. No essays are required though - bad grammer, mis-spelliing, rough code pointers & terse explanation are fine. If you find the reviewer is asking you for an ever more detailed analysis - it's most likely because they think your fix is wrong & want to encourage you to find that out for yourself - try thinking again. Of course, they in turn may be wrong - convince them.

When the patch re-factors, or re-indents a chunk of code, you can get hundreds of lines of unreadable diff; it's nice to have a comment on any substantive changes there: 'Re-factored getFoo() into two methods getBaa() and getBaz() to allow re-use in getNurgh() - no other changes' is great.