The Art of SQL Server Database Administration, Development, and Career Skills for the Technically Minded
I absolutely love this month’s TSql2sDay topic. Take a generally accepted rule of being a DBA and argue against it. If there is one thing I like more than being a DBA it is being cornered and out-numbered in a fight. Hosting Adam Machanic’s rolling blog party this month is Michael J Swart – great topic Michael – great topic.
As I pondered which of the many rules I would argue against I realized that in my experience one stood above the rest. User Defined Functions. I’ll say it right now. You might hate UDFs. Well … I’m not ashamed to say it … I HEART UDFs.
User defined functions to me are the screw driver of the tool box world. They do one thing extremely well. Better than any other tool in the box. They’re also capable of doing other things – but because of their shortcomings in all but that one area they get no love. It seems most people now avoid them entirely. Most posts on UDFs seem to go something like this.
Stop using flat head screw drivers – just stop it. Every time you try to pry out a nail with a flat head screw driver it bends. They are horrible at punching out roll pins. They can’t scrape paint worth a damn. Don’t even get me started on how bad they are at chiseling. They just pretty much suck at everything. The last time I used one I ended up stabbing myself in the thigh because the back makes such a poor strike point for a mallet and it slipped right off the bolt I was trying to break free.
Pry Tools and UDFs in Where Clauses
Does a screw driver make a good pry tool? If we’re talking about a small paint can – then yes – a screw driver makes the perfect pry tool. It’s the first thing I’d look for when needing to open a paint can. Does a screw driver make a good pry tool for removing a four inch nail from a six inch stud. Who cares?? I’m talking about paint cans right now.
In Defense of the UDF
We wrote a database one time to track information about a chain of child care sites. Each site got it’s own database that used replication to consolidate back at the main office. We made heavy use of scalar functions to answer questions like what a kids outstanding balance was – what a kids next payment due date was – what a kids primary emergency contact number was.
We were working with LINQ to SQL at the time and encapsulating some of these simple lookups into UDFs exposed by the data context as regular C# functions made the code elegant, easy to use, easy to troubleshoot, and easy to document. Further, needing these data points on the website, and the management application, and the check in kiosk, and the parent phone app made these UDFs solid gold for ensuring data consistency across each UI.
Did it work? It worked great actually – thanks for asking. I’ll even admit it in front of my peers. I put some of those UDFs into the where clauses of queries, report logic, and other places that needed that same info. I’m a rebel like that.
Say it aint so Johnny!! How did the program survive? A UDF in a where clause?? That should have caused a fracture in the space time continuum. How could I have been so reckless? Was it raging ignorance? No. I could have re-written the logic to use inline code – complex joins – CTEs or other approaches – but the thing is… I wasn’t writing code to remove a four inch nail from a six inch stud. I just needed to open a paint can. I had a UDF with the data I needed, a deadline, and above all – a good understanding of scale.
The reality is – the largest site in the program had a couple hundred kids at most. Even if they attended every single day in a year – that is only 365 days. Let’s say we decided to track before and after lunch separately – ok – 700+. These aren’t huge numbers folks. Further – each use of these UDFs were in relation to the activity for a single child. Let’s say suddenly this program went national and president Obama needed it to track every kid in America. Scale isn’t changing on a child per child basis or even a child per site basis. One room can only hold around 40 kids before the fire marshal and child protective services starts getting involved.
Granted – we would need to be careful not to try and use these UDFs to distinguish one kid apart from every other kid in America. Something like a search on all kids nationwide whose outstanding balance is $252.36 with the UDF calculating balance in the where clause. But again, we’re not talking about 4 inch nails and studs.
Ok fine, you might say, scale wasn’t an issue in this single instance – but those queries still perform badly in comparison to other techniques.
Hypothetical Question: If a query that gets run by one person – once a week – in a forest – could run in 10 milliseconds but in reality comes back in 20, does that person notice?
Answer: Unless they are a DBA with a blog – NO.
Here is another one.
Question: If that same code is run in the forest and because of poor code re-use starts reporting the wrong due date for a bill … and then, the person gets a late fee because the date on the website doesn’t match the date controlling fees back at the office – does anyone notice?
Answer: Yes. If stuff like this happened often you might even start loosing customers.
UDFs have a place in database design. Could improvements be made in how they scale? Sure. But, if you’re smart enough to figure out how many milliseconds a UDF runs in compared to an inline query or complex join with the same output you should also be able to recognize when the benefits of code reuse make the hit worth it and how to minimize that hit to the furthest extent possible.
For prying open paint cans I’ll grab a screwdriver every time. What say you?