Friday, April 28

Refactoring Databases - Reducing the Risk

I am reading the book "Refactoring Databases: Evolutionary Database Design" by Scott W Ambler and Pramod J Sadalage at the moment.

The biggest difficulty with refactoring databases is caused by having multiple client applications (that you may or may not know about) accessing your database in ways that are inconsistent with good programming practice, such as using "select * from table" and referring to the columns returned by numbered parameters in your code. I wrote an article a few months ago about the strengths and weaknesses of using stored procedures.

If you want to remove a column, you therefore need to be sure to change all your client applications.

The risk of doing this could be greatly reduced if there were a way in which you could trigger on select statements. This would mean that if you performed a "select *", you could replace it with a select statement that returned the columns in the correct order and inserted a null column in place of the one you have just removed. You could also record the fact a select had been made to an audit table, and use this to help identify if you have changed all your client applications.

Unfortunately, you can only trigger on insert, update and delete statements. Dejan Sarka describes a workaround for this by using the profiler and defining a FOR INSERT trigger on the table that it generates. However, this slows things down, and with the increased use of agile practices, a way to trigger on select statements should be introduced.

Until databases support this functionality, a first stage of any refactoring should be to reduce the numbers of ways in which you can connect to a database, and creating an interface that is as small as possible. I discussed the arguments for and against using stored procedures in January. I would now be more inclined to use SPs to connect to the database. If there are a small number of entry points to the database, there are less things to test when you refactor, and it will be more likely you will keep your database schema up to date.

2 comments:

Helephant said...

I think that your point about store procedures making the surface area of your database smaller from a maintenance point of view is an interesting one.

I think you get a similar advantage from a security point of view (provided you make sure that the programs accessing your database are using DB users that have access to the stored procedure and not any of the underlying tables) because then suddenly any malicious people using the programs have a much fewer ways to attack you.

Richard Jonas said...

Good point - for security a small surface area makes it easier to grant and deny access to certain users and applications with a reduced risk of side effects if you deny access to one table, but later find another application does in fact need it.

Using stored procedures also allows you to create an audit trail, so if someone changes something maliciously and denies it, they could be found out more easily.