Programming Paradigms – Variable Declaration

I’ve never understood the idea that variables ‘should be declared as close to their first usage as possible’. I guess I would agree with it from a perspective of scope rules, but not so much on any other point.

OK, so let’s just quickly cover some basics. Variables are named items in a program that hold information that can change while the program runs. Languages fall into two main categories:

  • Some languages allow you to use variables that you have never mentioned before, simply by assigning a value to them – these are often described as weakly or loosely-typed languages (strictly speaking, implicit declaration and weak typing are separate properties, but they frequently go together). They tend to be extremely easy to write programs in, as their operations are at quite a high level, but the very fact that the ‘type’ of the variable is decided almost ad hoc can contribute to some problems;
  • Strongly-typed languages, however, insist that you tell the system about the variable you want to use before you use it for the first time. You tell the system, in the right syntax, what the variable is to be called and what type of information it holds. It might be an integer of a certain size, a pointer to a type of information, an array, or even a ‘structure’ that includes other distinct data types.

In practice, these are two extremes of a continuum, with many languages falling somewhere in-between the two.

In this article, I am referring to relatively strongly-typed languages where variables must be declared before first use.
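As a rough illustration of the kinds of declaration mentioned above, here is a small sketch in C++. The structure and function names are made up for the example; the point is only that every variable is given a name and a type before it is first used.

```cpp
// Hypothetical structure grouping two distinct data types.
struct Reading {
    int sensorId;
    double value;
};

// Every variable below is declared, with a name and a type, before first use.
int sumOfDeclaredKinds()
{
    int n;                               // loop counter, declared up front
    int total = 0;                       // an integer
    int samples[3] = {10, 20, 30};       // an array of three integers
    int *cursor = samples;               // a pointer to an integer
    Reading latest = {7, 1.5};           // a structure

    for (n = 0; n < 3; n++) {
        total += *(cursor + n);          // walk the array through the pointer
    }
    return total + latest.sensorId;      // 60 + 7
}
```

Calling sumOfDeclaredKinds() adds the three array elements and the structure's sensorId field.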

Historically (as in, before C++), strongly-typed languages tended to have all their variables declared at the top of a block or unit of code, and in some cases this was strictly enforced through some kind of ‘declare’ section. Here’s an example structure of an Oracle PL/SQL block of code:

DECLARE
    -- Variable declarations MUST all go here
BEGIN
    -- Your PL/SQL ‘program’ goes here, and this program block ends with the END keyword below:
END;

and in practice, although the constraints were not typically quite so tight, in languages like C one tended to write:

void functionName(void)
{
    /* Most users put declarations early on in the function */
    int i; /* and then used them later …
              maybe several pages later … */
    for (i = 0; i < 20; i++) {
        printf("ksb was 'ere\n");
    }
    ...

For larger programs, or functions, this could mean spotting a variable you wanted to know more about and then scrolling several pages up to the start of the function or code unit to find out how it had been declared. This undoubtedly had some disadvantages, but I’m sure that lots of clever people with clever programming editors figured out ways to bookmark the places where the declarations were, and so on. The convenience of this approach (to my mind) was that you knew where to look for the variable: if you could not find it at the start of the unit you were looking at, either in the parameter list or in the ‘virtual’ declarations section, then it must have been a global variable (normally considered to be bad practice) or you had made a mistake.

Then along came C++, which seemed to make quite a big thing of a more relaxed approach to where variable declarations could go. In C++, you could even declare a variable inside a statement like for!

void functionName(void)
{
    /* In C++ you could forget having to name stuff up here */
    /* and instead declare them where you first needed them …
       maybe several pages later … */
    for (int i = 0; i < 20; i++) {   // The declaration with initialisation
        printf("ksb was 'ere\n");    // should probably use cout here but I prefer printf
    }
    ...

Now, this is fine and helpful if the scope of that ‘i’ variable is limited to the loop that it is declared in. You would see very easily that the i was a ‘new’ i that started off as 0, and so on. However, in pre-standard C++, and in older compilers such as Visual C++ 6.0 by default, this was not the case: i carried on existing in the enclosing block after the loop had finished. (Standard C++, from C++98 onwards, does confine i to the loop.) Under the older rule you can not redeclare it, so a few pages down in the code you might have to write:

for (i = 0; i < 20; i++) {   // The 'i = 0' bit would fail if you
                             // stuck 'int' before it
    printf("ksb was 'ere\n");
}
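For comparison, here is a sketch of two such loops on a compiler that follows the standard for-scope rule (C++98 onwards), where each i is confined to its own loop. The function name and the loop counts are just for illustration.

```cpp
#include <cstdio>

// Hypothetical function: two loops, each declaring its own i.
int printGreetings()
{
    int printed = 0;

    for (int i = 0; i < 2; i++) {        // this i exists only inside this loop
        std::printf("ksb was 'ere\n");
        printed++;
    }

    for (int i = 0; i < 3; i++) {        // a new, unrelated i: not a redeclaration
        std::printf("ksb was 'ere again\n");
        printed++;
    }

    return printed;                      // 2 + 3 lines printed
}
```

On a conforming compiler this builds without complaint; on one using the older, wider scope, the second int i is the line that would be rejected.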

So now we lack consistency. As programmers looking for the declaration of i, we now need to search from here back up towards the top of the code unit. We could redeclare the int i if we wrapped the for statement in braces (the braces would then become the smallest enclosing compound statement, forming the scope of the new int i variable)… but this sort of thing could lead to craziness which I am not going to talk about right now.
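The brace trick might look something like the minimal sketch below. The outer i stands in for one left over from earlier in the function, and the function name is hypothetical. On a standard-conforming compiler the extra braces are harmless but unnecessary.

```cpp
// Hypothetical function showing an extra pair of braces creating a new scope.
int wrappedLoop()
{
    int i = 100;                          // stands in for the earlier, function-wide i
    int iterations = 0;

    {                                     // extra braces: a new compound statement
        for (int i = 0; i < 20; i++) {    // a separate i that hides the outer one
            iterations++;
        }
    }                                     // the inner i is gone by this point

    return i + iterations;                // outer i is still 100, so this is 120
}
```

Two variables with the same name in one function is exactly the kind of thing that makes hunting for ‘the’ declaration harder, and many compilers can warn about the shadowing.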

I’m just a little bit uncomfortable with this approach. Maybe I’m stuck in the mud of procedural languages, whilst these C++ high-flyers are shooting into the stratosphere. However, I don’t think so as we are discussing procedural elements that occur even within OO languages.

Since the first Draft…

Since the first draft of this blog, I have found out that certain refactoring tools within Visual Studio really can take advantage of variables declared very close to their first usage. If you select a block of code and refactor it into a subroutine, the automation is pretty handy at figuring out whether it can move the variable declaration inside the new function, or whether it has to pass the variable to the function. Of course, the nearer the declaration sits to the start of the enclosing function, rather than to the selected block, the less likely this will end up being neat.
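To illustrate the sort of result such a refactoring aims for, here is a hand-written sketch (not actual Visual Studio output, and all of the names are made up). Because total is declared right next to the loop that uses it, the declaration can travel with the loop into the extracted function.

```cpp
// Before: 'total' is declared right where the block that uses it begins.
int reportBefore(const int *values, int count)
{
    int total = 0;
    for (int n = 0; n < count; n++) {
        total += values[n];
    }
    return total * 2;
}

// After extracting the loop: the declaration has moved with the block,
// and only the inputs the block needs are passed in as parameters.
int sumValues(const int *values, int count)
{
    int total = 0;
    for (int n = 0; n < count; n++) {
        total += values[n];
    }
    return total;
}

int reportAfter(const int *values, int count)
{
    return sumValues(values, count) * 2;
}
```

Had total been declared at the very top of a long function, with other code in between, a tool would more likely have to pass it in as an extra parameter instead.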

Despite this, and despite all the excellent tools within Visual Studio for finding definition points and usage points of a particular variable, I still think the ‘declare your variables at the top of your function’ approach has some merit – though perhaps not as much as it once did.
