Learning from my mistakes on SQL…

Here’s some design mistakes I’ve made in the past… learn from my example and shun these anti-patterns!

  1. Normalization is here for a reason. Normalization defines a set of methods to break down tables to their constituent parts until each table represents one and only one “thing”, and its columns serve to fully describe only the one “thing” that the table represents. If you find yourself using string operators in WHERE clauses a lot (SUBSTRING, CHARINDEX, LIKE) to parse out a value in a column – you’ve missed the boat design-wise and your data will become less and less searchable as you move away from the SQL paradigm. (And don’t get me started on COLDESC1, cOLDESC2, etc)
  2. Natural keys are the thing, man. Think carefully before you use an identity or a GUID as your only key. You should reconsider your design if you can’t pick out which row you want from a table without knowledge of the surrogate key. Each of your tables in general should have a natural key that means something to the user and uniquely identifies each row in the table – i.e. go with PartNumber over an integer PartID GUID. In the extremely rare event that there’s no natural key (like with an audit table) – then and only then use a surrogate key.
  3. Protect your data, dammit. Your tables should have constraints on nullability, string length, FK assignment, etc.
  4. Naming standards are actually important. For example, think about X304E_DESCR. It’s important to be CONSISTENT above all and be descriptive – this is your first line of defense when it comes to documentation. Make your names clear, simple and descriptive – and don’t include metadata in the column names.
  5. Don’t use one table to hold all your domain values. Remember that SQL works in sets, according to normalization rules. If you’re doing joins like “AND customer.TableName = ‘Customer’ and customer.RelatedToColumn = ‘descr’, well jeez… that’s a performance killer and will quickly make your queries maddeningly hard to write and update. Remember, tables should represent one ‘thing’ – and one thing only.
  6. Avoid dynamic SQL unless there’s truly no other choice. The same reason above applies to trying to code generic T-SQL objects – such as accepting a table or set of column values as a parameter. It seems neat and tidy to do this initially as it can handle multiple tables with one sproc – but we don’t think of the performance implications of misusing sp_executeSQL in this way. Sometimes there’s no shortcut to writing CRUD statements against the entities directly – or using EF.

Looking through this article on the interwebs, I’m thinking a lot of my experience here isn’t unique. Tip of the hat to that blogger and the great books from Celko and the Guru series from Ken Henderson.

Advertisement

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.