The Art of SQL Server Database Administration, Development, and Career Skills for the Technically Minded
ETL or Extract, Transform, and Load is an entire discipline in data science. There are companies who focus on nothing but selling tools to meet this business need. ETL can be as simple as pulling a query from a database table, converting dates from a 2 digit year to a 4, and inserting it into another table – or as complex as pulling prices from a stock ticker, making calculations against the value of the dollar to the yen, aggregating millions of rows of data, and pushing the final result as a json feed to a web service. Today we’ll talk about what kind of ETL you can do right in SSMS.
SSIS, or SQL Server Integration Services is Microsoft’s ETL product in the SQL Server stack. Any ETL development of significance is better built there. SSIS development occurs within Visual Studio. The feature set is wide and ever growing. With all that said however, SSMS is not entirely void of options for simple ETL.
Right click any database, look under “Tasks”, and you will see two options. Import and Export Data. Either option brings up the exact same Wizard. To cut right to the chase this wizard will walk through building an ETL package that is in fact an SSIS package.
The Wizard will first establish a data source. Your data source options will depend somewhat on what is installed on your specific box but usually will consist of flat file sources (i.e. text files, csv, delimited etc.), ODBC sources, .Net Sources, OLE Sources and others such as Oracle, Excel etc.
Next, the Wizard will establish a destination. Data Source options are the same as what was available for sources.
As stated above, choosing Import or Export brings up the exact same wizard. The difference is that depending on which one you pick the wizard will either pre-fill the import or export page of the wizard with the database you right clicked to start the wizard, as long as you select a data source that is compatible with the database that is. If you want to Import and/or Export between two data sources having nothing to do with this specific database or instance, you can do that as well.
With source and destination identified you have defined the locations for both the Extract and Load portions of the ETL. As far as what data will be exported/imported, that depends somewhat on what source you select.
Picking a Flat File source for example, as shown to the right, brings up another window where you must identify further parameters such as delimiter, code page, whether or not the column names are available, if not, what you want them to be etc.
If you pick a SQL Server source your options will be to either identify a table or view where the data will come from or a TSQL Query.
This is where the Transform phase of ETL can occur. You can effect some pretty significant data transformations in a single TSQL Query. This query can leverage complex joins, CTEs, sub queries, aggregates, on and on and on. Below, for example, is a quick CTE from AdventureWorks.
Once the wizard parses this query your next page will allow you to map the columns established at the source to the columns at the destination. Click the ‘Mappings’ button at the bottom of the wizard page. You can set options for dealing with data types, whether to create a destination table and add new rows, or append rows to an existing destination. You can also edit the exact SQL Statements that will create the destination table if that is what you select.
The final page is where you can either simply run the package as is and/or save it as an SSIS package to either the file system or an Integration Services catalog on your SQL Server.
If saved to the file system you can open this SSIS package in Visual Studio and work with it as you would any other SSIS package. This allows you to enhance it, expand it, and/or deploy it to multiple servers on schedules as you would a full end to end ETL package developed in Visual Studio.