The Art of SQL Server Database Administration, Development, and Career Skills for the Technically Minded
I’m actually pretty used to finding things out about myself that I would rather not know. I can’t bench 400 lbs. I haven’t run a sub-6-minute mile in 20 years, a sub-7 in 15, or a sub-8 in 10. I make annoying clicking sounds in my cube with the novelty poker chips I acquired at SQL Saturday Vegas, and I’ve been known to talk too much. So, you might think that finding out about yet another undesirable trait wouldn’t be that injurious to my sense of self. You’d be wrong. I discovered this week that I’m no better than those annoying hippies at the mall waiting in line for an iPhone. I’m a brand snob. This revelation really bothers me because, well, those frickin’ hippies are annoying. I don’t want to be one of those guys.
I’ve always considered myself enlightened. I grew up on Pepsi but switched to Coke after I started drinking diet. I grew up with Chevys and John Deere tractors but currently drive a Ford Focus and kind of like Case. I thought I was past this whole cola war thing. I go with what works best for the job at hand – not what looks great on a t-shirt and fits my peer group image.
I inherited a project a few weeks ago that required me to perform a significant amount of ETL on flat files obtained from a mainframe source. I’ve never worked in the mainframe world as a front-line contributor, but I have lots of ETL experience. I didn’t think it’d be such a big deal; data is data. I knew the file to be extracted was some type of “fixed width” file produced by COBOL.
I’ve been around, but this was my first introduction to COBOL, and to COBOL copybooks in particular. Despite what you may think, COBOL still has a pretty significant presence in industries with legacy systems, such as finance and government. The copybook serves as a type of schema descriptor. It defines specific byte ranges and rules for reading an extract. It also includes instructions for repeating values, dependent values, and redefine clauses for when a value or set of bytes is actually going to be something else for a while. In other words, a COBOL export can be hierarchical, semi-structured, and multi-valued all at the same time. I needed to pull it into a purely relational format.
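To make the byte-range idea concrete, here is a minimal Python sketch of the kind of hand-rolled parser I had in mind. Everything in it is made up for illustration: the layout, the field names, and the sample record are hypothetical, and it assumes the extract has already been converted to plain text. A real mainframe file may well be EBCDIC with packed decimals, and this handles neither, nor redefine clauses. It only shows how a repeating group in one flat record turns into parent and child rows.

```python
# Hypothetical layout, loosely mirroring a copy book along the lines of:
#   05 CUST-ID    PIC X(5).
#   05 CUST-NAME  PIC X(10).
#   05 PHONE      PIC X(7) OCCURS 2 TIMES.
# Each simple field is (name, start offset, length).
FIELDS = [("cust_id", 0, 5), ("cust_name", 5, 10)]
PHONE_START, PHONE_LEN, PHONE_COUNT = 15, 7, 2


def parse_record(line):
    """Split one fixed-width record into a parent row and child phone rows."""
    parent = {
        name: line[start:start + length].strip()
        for name, start, length in FIELDS
    }
    phones = []
    for i in range(PHONE_COUNT):
        offset = PHONE_START + i * PHONE_LEN
        value = line[offset:offset + PHONE_LEN].strip()
        if value:  # skip unused (blank) occurrences
            phones.append(
                {"cust_id": parent["cust_id"], "seq": i + 1, "phone": value}
            )
    return parent, phones


# Made-up sample record: 5 + 10 + 7 + 7 = 29 characters.
record = "00042" + "Jane Doe  " + "5551234" + "5550000"
parent, phones = parse_record(record)
print(parent)
print(phones)
```

The parent dictionary maps to one table and the phone list to a child table keyed on cust_id. Multiply that by nested groups and conditional layouts and you can see how a single copy book fans out into dozens of tables.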
On the first day, one of our newer hires over in the finance area said this data could probably be read by Informatica. Informatica? Yuck. I’m an SSIS guy. Besides, I’ve written my own parsers before to grab fixed width data. I was pretty sure that even if SSIS couldn’t make quick sense of the data, I could script something out that would. I started a multi-day effort that led to dead end after dead end.
I spent at least a full day just wrapping my head around the syntax and format of a copybook. I then spent another good day searching the interwebs, Stack Exchange, blogs, freeware sites, and just about any other source imaginable for existing code or solutions for making sense of the copybook. I considered everything from hiring a foreign asset to write the .NET code for me to purchasing a no-name tool that promised it could convert copybooks into anything, including SQL Server, JSON, XML, Oracle, DB2, CSV, Tab Delimited, Flag Semaphore, Navajo, Mandarin Chinese, and Klingon. Four days into the project, I was getting concerned.
Somewhere in the back of my mind a small voice kept prodding. Yo, dude, why not give Informatica a shot? I kept ignoring it, but deadlines have a funny way of making you desperate. I finally grabbed that IBM yuppie from finance and asked for a tutorial on Informatica. A few hours later I had a full SQL Server schema generated from my original copybook. I’m not talking something basic. The final product was made up of over 70 tables with every data type imaginable and foreign keys intact.
The moral of this story isn’t that Informatica is a great product. Frankly, I don’t know enough about it to judge. I do know it’s ridiculously expensive, and searching for online tutorials, walkthroughs, or documentation is near pointless. The moral of the story is that my own snobbery wasted several days of my time and my employer’s time. Whatever its merits, shortcomings, or expense, it was a sunk cost; we had the license. I was avoiding it for a task it was obviously very capable of performing because it’s not my brand. That’s just stupid.