Skip to content

Programming Pet Peeves from Real Code Reviews

July 19, 2011

Doing code reviews, I’ve noticed some recurring problems often enough that I started documenting the most aggravating ones. I know that over my career I am culpable for some bad code, so I’m not excusing my former self from any of this either, but hope that recording these bad practices here will prevent myself and others from repeating them.

I’ve broken them into rough categories, and included examples encountered from real code at my workplace (changing the names of variables in most cases).

Using branches to compute booleans

This is one of the most commonly encountered, as it seems to permeate the code around here. It’s when a branching instruction (such as if) is used to set the result of a conditional check, when instead the value of the conditional statement itself suffices:

bool isOverThreshold = (val > threshold) ? true : false;
...
if(isOverThreshold)
    return false;
return true;

Here’s a more succinct way to express the same, that is easier to read (no need to parse the logic of the ifs) and doesn’t take any costly branch statements:

bool isOverThreshold = (val > threshold);
...
return !isOverThreshold;

Taking the long road to get there

Sometimes, there’s code that works correctly, but chooses to take a scenic route. Take this example:

if(text[0].ToString().Equals("q"))
   DoSomething();

It compares the first character of a string in text to the character 'q'. But to get there, creates a new string on the heap to hold a single character, and then does string comparison when it only needs to check a character. I guess the author thought the garbage collector wanted the extra work that is avoided by using a direct character compare:

if(text[0] == 'q')
   DoSomething();

This is another more seemingly innocuous version, but it has some unintended effects:

if(someEnum.Equals(SomeEnumType.EnumeratedElement))
   DoSomething();

The problem here is that the argument to Equals() is an object. This has two consequences: first is performance, because an enum value must be boxed, and second is maintenance, because the lack of type checking makes it possible to compare this enum to the elements of a different enum (or anything, for that matter), which the compiler would have otherwise prevented.

if(someEnum == SomeEnumType.EnumeratedElement)
   DoSomething();

The new code works without boxing and allows the compiler to prevent accidental comparison to the wrong type (which did occur in the case of this particular code).

Not choosing the proper function for the job

I’ve seen the following code:

try
{
    int val = Int.Parse(str);
}
catch(Exception e)
{
    val = defaultVal;
}

And also this:

int val;
bool good = Int.TryParse(str, out val);
if(!good)
{
   throw new Exception("Couldn't parse int");
}

The difference between Parse and TryParse is the first will throw on error, but the second will not, instead returning a status. In both the examples, the author decided to enforce the opposite behavior of their chosen function, instead of just selecting the function they really wanted. They could be rewritten as:

//non-throwing version:
int val;
if(!Int.TryParse(str, out val))
{
   val = defaultVal;
}
 
//throwing version
int val = Int.Parse(str);

Ignoring features of the language

Here’s some code that searches through a list of items, returning the first one that is not marked as processed:

ListItem unprocessed = null;
for(int i = 0; i < list.Count; i++)
{
    if(!list[i].IsProcessed)
    {
        unprocessed = list[i];
        break;
    }
}
... // do something with unprocessed

While that’s perfectly valid C code, it becomes cringe-worthy in C# because LINQ obsoletes many common looping tasks. The following code does the same, and to me is easier to understand:

ListItem unprocessed = list.FirstOrDefault(l => !l.IsProcessed);
... // do something with unprocessed

For a newbie to C#, the lambda expression might throw them at first, but after some exposure becomes second nature.

Ignoring API helpers

The author of this code must have surely examined the API surrounding the X509 Certificate objects in order to write this code:

foreach (X509Certificate2 cert in store.Certificates)
{
    if (cert.Thumbprint == certificateInfo.Thumbprint)
    {
        found= cert;
        break;
    }
}

However, they overlooked a function that does the job even better, because it is case-insensitive, whereas the original is not (potentially leading to a bug):

var found = store.Certificates.Find(X509FindType.FindByThumbprint, certificateInfo.Thumbprint, false);

While that function’s existence may not be obvious unless you look for it, people still often submit their own custom binary search, instead of using List.BinarySearch. That kind of grievous oversight is inexcusable, as most languages that support collections will provide an efficient search function. And I admit to being guilty of this some years ago: I wrote my own binary search algorithm (I forget why now, but recall thinking there was a valid reason, something about the structure of the data not working well with STL). Even though the code (eventually) worked (once the bugs that were already released in earlier versions got fixed) and I was proud of it, I later reworked the structure to replace the search with std::lower_bound. I felt sad deleting my battle-hardened code, but also knew it was for the better in the long run.

In another example, for some reason, users of the TimeSpan struct always seems to miss out on the various available functions:

// doing this:
double sec = ts.TotalMilliseconds / 1000;
//instead of this:
double sec = ts.TotalSeconds;
 
// or this:
var ts = TimeSpan.FromSeconds(0);
// instead of this:
var ts = TimeSpan.Zero;
 
// or this:
var ticksPerSec = TimeSpan.FromSeconds(1.0f).Ticks;
// instead of this:
var ticksPerSec = TimeSpan.TicksPerSecond;

I’ve seen them all (although I suspect they came from the same author).

Overkill

I once witnessed a function to parse a boolean configuration setting. The parser checked the following cases:

bool setting = false;
string[] positiveAnswers = {"true" , "t", "1", "yes", "y", "on" , "set" , "positive", "pos", "+"};
foreach(var p in positiveAnswers)
{
   if(p == val)
   {
       setting = true;
       break;
   }
}

If the configuration setting string matched any of those positiveAnswers, the value was set to true. What’s lacking here? Well they could’ve added “aye-aye”, “absolutely”, “sí”, and scraped a thesaurus of every language for every possible affirmative statement.

Why go to extremes to support a config setting which should be limited to either “true” or “false”? It should’ve said:

setting = bool.Parse(val);

Here’s another piece of WTF:

List<string> links = new List<string>();
links.Add(data.Url);
foreach (string link in links)
{
    ... // do something with the link
}

Looking at it out of context you may not see what is wrong, but here’s the catch: links was not referenced anywhere else. The creation of the list and the loop was totally gratuitous.

Non-symmetrical APIs

I remember reviewing some legacy C code that for the most part used the OOP practice of encapsulation (so far, so good). However this particular “object” had routines called

Conn *conn_create();
void conn_destroy(Conn *);

(again, still good so far). However, the kicker is that conn_destroy was not the inverse of conn_create. It did not destroy the connection, nor did it free the resources created by it. It merely set some bookkeeping that the connection was no longer active. When you actually wanted to destroy the object you had to go and manually free all its resources.

When I brought this up in the code review, I received a five minute explanation why it was impossible to make these two functions symmetrical (which still didn’t make sense after that). Even more non-nonsensical was the refusal to address the problem (even by renaming the function to avoid confusion) because, although the new code under review was added to this poorly named routine, the author didn’t want to touch any legacy code that existed before. Which brings me to the final peeve:

“Sorry, this is outside the scope of the review” syndrome

Part of software development is maintenance. Well, more than “part of”, rather a significant part of our work involves changing the behavior of existing code, either to add features or fix bugs. Of course, code reviews are supposed to ensure the new code is up to standard, contains no obvious logical bugs, makes sense, etc. This means that code under review must be understood in the context of existing work, which follows that part of reviewing code is examining legacy parts surrounding it. And when the reviewer notices and suggests something to improve the nearby code, I hope that the author will incorporate those suggestions into their change set.

All too often, the developer attempts to apply changes as though it’s non-invasive surgery, when instead they should be custodians of the code. If you’re in a position to change a module, then you should consider yourself a part-owner, even if not the original author (often the original author is long gone). But too many times a request to fix up the surrounding bits is replied with “that’s not my code, that was there before I started, etc.” instead of the attitude that you should always leave the code better than you found it.

Leave a Reply

Your email address will not be published. Required fields are marked *