The Art of SQL Server Database Administration, Development, and Career Skills for the Technically Minded
A typical question while reviewing code for release to production is whether an alter statement will change meta data only or cause page level activity. I am not aware of a matrix or chart in the wild that contains an exhaustive guide. There is a great article by Kalen Delaney circa SQL 2000 that describes the different behaviors of alter table statements with regards to page activity but it seems to come down to experience and common sense for most DBAs.
I started working on my own exhaustive matrix. I’ll publish if I ever complete it. The problem is one of scope. An alter table statement can have a LOT of variables. Kalen herself warned me on twitter – you’re going to need a really big chart.
My lazy side is daunted by the exponential math involved in an exhaustive matrix. The OCD portion of my brain wants to carry on. While these two personalities fight it out here is a non-exhaustive guide from my notes.
Disclaimer: Just because a change affects meta data only doesn’t mean it’s automatically a good idea. A meta data change today can still lead to long term fragmentation, wasted storage, poor query plans, and data integrity issues tomorrow.
ALTER TABLE DROP COLUMN
Any time you are allowed to drop a column it is meta data only. If an exception to this exists – I haven’t discovered it yet.
Why wouldn’t you be allowed to drop a column? Well – it might be part of an index key. It might be schema bound to another object. It might be participating in a calculated column. The good news is – the error message you’ll get under these circumstances comes back really fast as well.
Knowing that dropping a column is meta data only, the next question is typically – “How then, do we reclaim space after dropping a column?”. Answer: rebuild the clustered index.
ALTER TABLE ADD COLUMN
Adding a new column is usually a meta data only change. Starting with SQL Server 2012 even specifying a default value is almost always a meta data only change. Test for yourself against AdventureWorks2012.
-- simple column add with null allowed ALTER TABLE person.person ADD test1 char(10) NULL -- column add with default constraint ALTER TABLE person.person ADD test2 char(10) DEFAULT 'test' NULL -- new column with non null and default ALTER TABLE person.person ADD test3 char(10) DEFAULT 'test' NOT NULL
Clearly adding a default value of test – 4 characters – to a char(10) would probably indicate a poor data type choice. I use it here simply to demonstrate that even fixed types are meta data only.
MORE ON DEFAULT VALUES
As stated, default values can usually be added with a new column and only change meta data.
How default values are now stored in 2012 is out of scope of this post but a very interesting topic. It is covered extremely well by Remus Rusanu on his site. This outstanding post also identifies the exceptions – i.e. blobs, xml, hierarchy, geometry, geography, CLR UDTs, etc. All types where a default value would be really odd anyway.
Even adding a new column with a default value that is non deterministic can be a meta data only change as long as it doesn’t require a different value for each row by design. GetDate() is a classic non deterministic example. The trick? It only gets evaluated once at runtime. All existing rows get the same value.
-- non deterministic default still meta data only ALTER TABLE person.perosn ADD test4 varchar(15) DEFAULT cast(getdate() as varchar(15)) NOT NULL
I highly encourage the Rusanu article for a deeper dive. He also talks about what happens when a default value will push a row over the 8060 byte limit.
SPARSE
Adding the sparse keyword does not affect the data page when adding a new column, but changing a column that already exists is a different story. See alter table alter column section.
WHEN ADDING COLUMNS DOES HIT THE PAGES
There are probably others but two examples where adding a new column requires more disk activity than a simple meta data change:
Adding an identity column.
ALTER TABLE person.person ADD person_id INT Identity(1, 1)
Adding a persisted computed column. Key word here is persisted.
ALTER TABLE person.person ADD FullName AS LastName + ', ' + FirstName + ' ' + MiddleName PERSISTED
ALTER TABLE ALTER COLUMN
CHANGING DATA TYPES
Changes to data type always hit disk. The engine must verify that conversions of existing values can be accommodated and then conversions effected.
While they are different data types you might think a TINY to an INT or an INT to a BIGINT would be meta data only. They are all integer types but changes from one to the other still hits the page.
MAKING COLUMNS LARGER
Variable length types appear to be the only columns you can grow with a meta data only change. (varchar, nvarchar, varbinary)
Increasing the size of fixed length data types must hit disk. This makes sense if you think about what a fixed size data type is and how it is stored.
Increasing the size or precision of numeric and decimal types also hits disk.
MAKING COLUMNS SMALLER
Whether variable, fixed length, or numeric this kind of change needs more than just a meta data change. It may not always make changes at the page level but it must at least scan the table to find out.
SPARSE
Changing a sparse column to non sparse or non sparse to sparse will take a table lock and affects all pages. This change is definitely not meta data only. Books online describes the process thoroughly.
ALTER TABLE ADD CONSTRAINT
Here is a quick breakdown of each of the constraints and how adding one affects meta data vs page data.
CONSTRAINT | META DATA ONLY? | |
Default: | Y | There is a subtle difference in the way SQL 2012 handles default constraints vs default values but adding a default constraint like adding a default value is meta data only. Enforcement takes place on future inserts only. |
Check: | N | Adding a check constraint (with check) doesn’t mean data pages are changed but it does mean that existing rows must be checked to ensure that they meet the requirements of the constraint – so depending on the number of rows to check this could be a lengthy IO process even if ultimately no pages are changed. |
Foreign Key: | N | Adding an FK constraint (with check) like a check constraint must scan rows to verify each value complies with the constraint. |
Unique: | N | Adding unique constraints creates a unique index so obviously a lot more than just a meta data change going on there. |
NOCHECK Option:
Adding a check constraint or a foreign key constraint with NOCHECK is a meta data only change. You may have just created an untrusted constraint so make sure you understand how this works if you are using NOCHECK to get around the overhead of adding new constraints and what you should do to ensure future query performance. Thanks to Hugo Kornelis for the post on untrusted constraints.
ALTER TABLE DROP CONSTRAINT
Dropping constraints is no problem. Like dropping columns, if you are allowed to drop the constraint then the change is meta data only.
PARITION SCHEMES
Table partitioning has several activities related to ALTER TABLE statements that are very powerful and useful. There are some really cool things you can do with partitioning and meta data only changes that deserve a full post of its own. I will update a link from this post when it’s posted.
FINAL DISCLAIMER
I’d love to hear from the community on other variances in ALTER TABLE statements and how they affect data on the pages. Hopefully I haven’t made any mistakes in this post. I tested each of my statements above to the best of my ability – but I’d still recommend testing on your own before making production decisions based on what is written here.
Pingback: (SFTW) SQL Server Links 28/03/14 • John Sansom
Thanks for the exhaustive list. Really appreciate your efforts.
No problem – I’m still working on how to chart it in a nice easy reference and clean up some of the nooks and crannies that I didn’t cover above. Appreciate the feedback.