The Basic Lookup Table

‘Lookup Tables’ are commonly created in relational data models and databases as part of the normalisation process. For example, instead of having address rows in an address table that continually repeat the words ‘HOME’ or ‘WORK’ to indicate if this is a customer’s home or work address, we might introduce an ‘AddressType’ table, with a primary key of ‘AddressTypeId’. Our AddressType table may look like this:

AddressTypeId    Name
1                Home
2                Work

Now, we only have to put the integer AddressTypeId value in our address table, a relationship between the two tables, and we run far less risk of having invalid data in our Address table.

What I find interesting about this is that it is now faster in SQL (and in any programming language that can access that database) that it is now preferable to refer to that AddressTypeId by number in your queries, for the following reasons:

  • The query will be slightly easier to read (potentially) and quicker, because you won’t have to refer to the AddressType table in the SQL
  • Because in many cases it is still possible and reasonable to consider changing the ‘Name’ or ‘Description’ column at some point; especially when you realise that you have named it poorly in the first place. e.g. perhaps I realise that I ought not to have called the second address-type above ‘Work’ but ‘Delivery’ instead.

But now we are in a bit of a bind. How do we make references to look-up table keys readable so that we do not have ‘magic numbers’ littered through our code? A common response is to introduce an enumeration in the DB-access-layer that essentially matches the values of the lookup table with enumeration entries (in C#):

enum AddressType : short
{
    Home = 1,
    Work = 2
}

It’s a little annoying that in pure SQL you still need to refer to the ‘magic’ Id key or the name, but inside of our 3GL we can use the enumeration for meaningful code… perhaps:

AddressTypeRecord at = customer.GetAddress(AddressType.Home);

While I am not massively happy with this duplication between the lookup table data and the enumeration, it does make for so-much-more readable (and close-to strongly-typed code) that it really is worth it… at least for simpler lookup tables.

Now, consider what would happen if we did realise that we had named the second address type poorly, and we did want to change it from ‘Work’ to ‘Delivery’… what would we have to do?

  • If we use the enumeration method above, to rename the enumeration we can use refactoring tools (e.g. those built into Visual Studio), or perhaps compiler errors, to help us identify what needs to be changed. Easy!
  • We will change the Name column on our lookup table’s second row, to ‘Delivery’. We will also need to check all our SQL for references to the AddressTypeId column. Equality tests where the test is ‘where addressTypeId = 2‘ will be left unchanged, but any reference to ‘AddressType.Name = 'Work'‘ will need to have the comparison string changed.

The latter will likely be awkward, depending on how clever you’ve been in centralising your SQL and of course the quantity of it, but the enumerations have given us a quick win on the program code side of things.

In contrast, if we worked with a denormalised database like in the original example (a text ‘AddressType’ string on every address row), we would have to:

  • Update every ‘Work’ row in the address table to say ‘Delivery’ – this could take quite some time depending on the table size;
  • Change all references in SQL to do comparisons on the new string…
  • and find all queries in code that make the same comparisons

In truth, if you are lucky, a well-organised environment will have standards and strategies in place to help ensure changes like this are simply not necessary, or easy to do. In the environment I am currently in, though, using the enumeration has helped significantly.

3 thoughts on “The Basic Lookup Table

  1. Hi James – typically YES, this would be the ‘one’ side of the relationship.

    If you imagine a classic shopping-cart system, you might have a ‘Country’ lookup table that limits the possible choices of countries that you will ship to. A ‘Country’ lookup table would have a primary key of CountryId (or often, CountryCode – the two or three digit standard code for the country). This table’s Primary Key would then be used ‘zero or many times’ in a table such as CustomerAddress, or similar.

    I hope that makes sense!

Leave a Reply

Your email address will not be published. Required fields are marked *