Coding Idioms Gone Wrong

Before I started this blog, I researched the exact meaning of the word Idiom, and was surprised to find from Wikipedia that it does not mean quite what I expected; “An idiom is a term or phrase whose meaning cannot be deduced from the literal definitions and the arrangement of its parts, but refers instead to a figurative meaning that is known only through common use”. Their example is ‘Kick the bucket’ meaning to die, which has no apparent connection with striking an object for carrying solids or liquids with the extremity of one’s leg!

So surely the use of the word ‘idiom’ in relation to programming should not apply? Programming languages are, after all, interpreted or compiled under a very restricted set of rules that surely means that nothing could be expressed in the language (or at least that interpreted / compiled correctly)?

Well, yes and no. Although I have my own examples to follow, I found this wonderful example of a perl implementation of a switch statement in Matt Ryall’s blog:

for ($action) {
    /list/ && do { print list($cgi, $articles, $template); last; };
    /edit/ && do { print edit($cgi, $articles, $template); last; };
    ... code removed ...
    die "Action not supported: $action";
}

Firstly, the language uses a for loop construct to set up the context for the switch-equivalent statement, in this case the $action variable. The entries between slashes represent regex matches; if $action matches /list/ then… and so on. Finally, wait, what is a logical && doing in here? Well, it is saying “if the first part is true (the regex match) then run the second part to see if it too is true” (it does not need to bother running the second part if the first part is not true, as the interpreter knows that the result has to be false). The second part of the && is a do command which happens to execute the desired statements.

So yes, with my new understanding of the meaning of the word, I would say that an idiom is something that can be applied to languages because – who in the hell would think to use a for loop construct to replicate the effect of a switch?

My own idiom examples for today are this from unix korn-shell-scripting:

[expression] && [command to run if true / success]

and the opposite:

[expression] || [command to run if false / failure]

which, as per the perl example above, are shorthand ways to test a condition and only run the command if something is true or false, success or failure.  Either is a handy space-saver when compared to an if block or test – especially in unix scripting where the testing of variables etc is clunky.

However, I have seen a lot of code recently where it seems the developer has got into a habit of doing something (quite possibly for a good or appropriate reason), but then applied the same style or approach to something that is not necessary.  Here’s an example:

return (someList.Count == 0) ? null : someList[0];

which might, say, be used to return an individual item from a list, but only if there are any values returned by a SQL query, and is reasonably valid (though I know some people hate the ‘? :’ conditional expression construct). I guess we could also pause to argue that this is not an idiom as such; but I think it probably is from the point of view that this is a very definite contraction of a more complicated if – else contstruct to define what we return, or the use of an intermediate variable – again using an if – else or the conditional expression.

Somewhere along the line, they’ve used the same constuct to represent this (return a boolean result from a function):

return varX == 0 ? false : true;

or even worse this:

return varX == 0 ? true : false;

when they could have done it all (for the former case) with:

return varX != 0;

I don’t know if this is because the developer misunderstood that you can return the result of a comparison, or something else, but I find it frustrating when I find code like this.

Another case I’ve found recently is this:

if (myFlagVariable & anEnumeration.SomeFlag > 0)
{
    ...
}

which to my mind is a dangerous way to use the & ‘bitwise And’ operator. That’s because, in C# at least and probably many other languages, the enumeration value that you are testing for might be a collation of several flags at once. Here’s a C# and contrived example:

[Flags]
enum Vehicle
{
    None = 0,
    SpeedTriple = 1,
    SprintST = 2,
    Ka = 4,
    Gt = 8,
    Bonneville = 16,
    // Cars happens to work out as 12
    Cars = Ka | Gt,
    // Bikes happens to work out as 19
    Bikes = SpeedTriple | SprintST | Bonneville
}

Now, this is a contrived example and it seems to fit the pattern perfectly for the pattern of testing used above; I have some sort of flag variable that describes what a vehicle is, and I run a test as follows:

Vehicle something = Vehicle.Bonneville;
...
if (something & Vehicle.Bikes > 0)
    // do something related to this vehicle being a bike

and everything makes sense still, so what’s the worry? Well, the worry is if you want to find another type of enumeration where the ‘collections’ of values have some signifigant meaning, but only when they are all present:

[Flags]
enum FraudMarker
{
    None = 0,
    Address = 1,
    Telephone = 2,
    Card = 4,
    Name = 8,
    IP = 16,
    // High Level frauds must include Card and Name
    High = Card | Name,
    // Medium Level fraud must have Telephone and Address markers
    Medium = Telephone | Address
}

In this case, we might want to run a test as follows: if an application has been marked as having possibly-fraudulent markers that match all the High Level flags, then do something. In this situation we must use the following semantic:

if (myFraudMarker & FraudMarker.High == FraudMarker.High)
{
    // do something related to high fraud-warnings
}

Using this approach to matching flags is generally safer than the method comparing the result of the bitwise-and to 0, because in this situation, we want to be sure that both flags are set as necessary, not ‘one of them’. I guess my underlying assumption here is that people will likely set groupings of flags to be things that have to have all elements present (and maybe this is a bad assumption), but I can’t help but feel the comparison back to itself is the safer mechanism, because it makes it very clear that the test is ‘all flags must be present’, whereas the original test style says something different:

if (myFraudMarker & FraudMarker.High > 0)
{
    // One or more High fraud warnings are set.
    // ...which might be right: 'I intersect with the High warnings',
    // but unless it is commented, I can't know as a maintenance
    // programmer if that was really intended
}

Finally, I guess I should admit that although the examples here are contrived and therefore ‘wrong’ at some level, perhaps I should think about how we would test if some particular set of flags were set but no others – hey, it’s easy!

if (myFraudMarker == FraudMarker.High)
{
    // Only the High Level flag(s) are set
}

Leave a Reply

Your email address will not be published. Required fields are marked *