2

I am going to implement several lookup tables in a system. In general all lookup tables have the same structure like id, name, value, rank, active

We are using AngularJS as front-end and Web API/Entity framework as backend in this project

There are some options on my mind

Option 1 - Create a set of lookup tables with the same structure e.g. LKRegion, LKStatus, LKPeriod, LKState, LKDepartment, etc. This option is a traditional design. The data schema is structural and easy to understand. It is easy to implement/enforce foreign key integrity. But you have to create separated web methods to handle CRUD actions. You have to repeat the same thing if you have another lookup table to add in the future.

Option 2 - Create a big lookup table by adding an extra column called LookupType to identify the lookup group This option reduces the number of tables. Make the lookup table easy to maintain and retrieve (e.g. One schema, one web method can handle all general lookup CRUD actions). But the foreign key integrity is a little bit loose due to the LookupType.

Please share your preference and the tell me why. I would like to get the best practise on this implementation. Thank you!

5
  • 3
    There is no advantage whatsoever in the option 2) and the loss of a proper foreign key is a show-stopper from my point of view. There is no point in "reducing the number of tables". It won't make your database faster or smaller or easier to maintain (actually it will be harder to maintain: what if you need to add a new attribute for one lookup type? And then another attribute for another lookup type?). Option 2 is known as the anti pattern "one true lookup table" Commented Dec 7, 2015 at 22:42
  • If my lookup tables are just for serving dropdown list text/values, I may not need the extra attributes in the future so I will ignore the disadvantage on the attribute maintenance. My concern on option 1 is that if I need to implement a new look up table, I need to drag new data schema and recompile the project. For option 2, it is just a data change. If considering to promote a project from dev to production and many developments happen in the same project and some are not ready to go live, option 2 is more suitable and has less risk. Commented Dec 7, 2015 at 23:12
  • 3
    @RayGao, when designing a database the data integrity is more important than ease of CRUD operations. Often data lives much longer than the application that provides user interface to this data. In 10 years there will be another new shiny frameworks and new fresh graduates will be putting them to use everywhere, but the business will continue to use the accumulated data and it is important that data remains valid and correct. That's why it is important to enforce constraints like foreign keys on the database level, not on the application level. Commented Dec 8, 2015 at 4:03
  • 1
    If you add a new lookup table, that table will be used by some other table - which most probably is new as well. So you would need to roll out a new version of your application anyway (what sense would it make to add a new lookup table that isn't used). And you might not see the need to extend those tables now, but are you sure there isn't one in a year? Or two years? Important data lives much longer than an application. Very often the applications are re-designed but keep the old data. Fixing a bad data model later is much more complicated than doing it the right way from the start. Commented Dec 8, 2015 at 6:48
  • Re "I may not need the extra attributes" - even if you stated "I will never need the extra attributes", this would change within a year or so anyway :) Commented Dec 8, 2015 at 12:39

3 Answers 3

3

I'll defend Option 2, although in general, you want Option 1. As others have mentioned, Option 1 is the simpler method and easily allows foreign key relationships.

There are some circumstances where having a single reference table is handy. For instance, if you are writing a system that will support multiple human languages, then having a single reference table with the names of things is much simpler than a zillion reference tables spread throughout the database. Or, I suppose, you could have very arcane security requirements that require complex encryption algorithms -- and dealing with a single table is easier.

Nevertheless, referential integrity on reference tables is important. Some databases have non-trigger-based mechanisms that will support referential integrity for one table (foreign keys and computed columns in Oracle and SQL Server). These mechanisms are a bit cumbersome but they do allow different foreign key references to a single table. And, you can always enforce referential integrity using triggers, although I don't recommend that approach.

As with most things that databases do, there isn't a right answer. There is a generally accepted answer that works/is correct in most cases (Option 1). The second option would only be desirable under limited circumstances depending on system requirements.

Sign up to request clarification or add additional context in comments.

Comments

2

I suggest that :

A. Follow the organization standard if this is an enterprise system (some may laugh loud on this, I know). If such a thing exists, it would certainly promote individual tables.

B. Use Enums or 1 aggregated lookup table for programming level lookups only (such as error messages, etc,) if you must only. Any lookup data for business related data should be (in my opinion) be in a separate table for the following reasons at least:

  1. When you have separate tables, you need to use the correct table name when you join and not use a code column of the reference table. This makes writing queries less error prone. Writing "Select ... Where (TableID=12 and State="NY") AND (TableId=133 and Country="USA")"...style of coding is quite error prone during development. This is the major issue for me from coding perspective.

  2. RI errors on inserts and updates may be ambiguous when there is more 1 than reference to the lookup in the row being inserted or updated.

  3. In some cases, the a lookup table may have self references (relationships). For example, a Geographical location can be described as a hierarchy which would add more confusion to the model.

  4. The relationships (references) could loose meaning in your database. You will find that almost every table in your system is linked to this one table. It some how will not make sense.

  5. If you ever decided to allow the user to perform ad-hoc reporting, it would be difficult for them to use codes for lookup tables instead of names.

  6. I feel that the 1 table approach breaks Normalization concepts - Can't prove it now though.

An disadvantage, is that you may need to build an indexes on PKs and FKs for some (or all) of the separate tables. However, in the world of powerful database available today, this may not be a big deal.

There are plenty of discussion in the net that I should have read before answering your question, however I present some of the links if you care to take a look at some yourself:

Link 1, Link 2, Link 3, Link 4, Link 5...

Comments

2

Avoid option 2 at all costs, go with option 1 without even thinking about it.(*)

Referential integrity is far too important to compromise in favour of virtually any other concern.

If there you go, only pain will you find.

If you want to reduce duplication, implement a list of services in your web-api implementation language (java?) and parametrize each service with the name of the lookup table to work with.

Edit

(*) It was wrong on my behalf to say "without even thinking about it". Of course, think about it. If need be, go ahead and even post a question on stackoverflow about it. Thinking is good, and Gordon Linoff's answer above demonstrates this nicely.

1 Comment

@usr I am not sure these are enum lookup tables, and I am not sure he has any other option, because the OP speaks of "CRUD actions".

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.