The Art of SQL Server Database Administration, Development, and Career Skills for the Technically Minded
You never thought it’d happen. Way back when you picked an Int as your PK with an auto-increment identity, you said: 2.1 billion inserts? That’ll never happen. But business exploded. Failed loads kept leaving gaps in the sequence. That giant ETL project ate up a bunch of your keys. That company you purchased, whose data you had to import, scattered your keys all over the range. Whatever the reason … you’re now facing the prospect of having to change the data type of your PK from an Int to a BigInt. Scary on a table that size.
The error message would look like this:
“Arithmetic overflow error converting IDENTITY to data type int”.
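If you want to see the failure for yourself, here’s a quick repro (the table and column names are made up for the demo):

```sql
-- Seed the identity at the Int maximum so the second insert overflows.
CREATE TABLE dbo.overflow_demo (
    id int IDENTITY(2147483647, 1) NOT NULL,
    v varchar(10) NULL
);

INSERT INTO dbo.overflow_demo (v) VALUES ('last');  -- gets id 2,147,483,647, the final key
INSERT INTO dbo.overflow_demo (v) VALUES ('boom');  -- raises the arithmetic overflow error
```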
We recently played this scenario out when trying to choose between an Int and a BigInt as our PK on a new project. No matter how we ran the numbers, we could never wrap our heads around the table ever hitting 2.1 billion entries (the max value of an Int being 2,147,483,647). In the back of my mind, however, I kept asking myself … yeah, but what if it actually happened – how bad would that be? What would our options be? Is it worth the performance hit to just go with BigInt from the beginning?
An interesting thought came out of that discussion. We kept saying, we only have 2.1 billion. We only have 2.1 billion. But in reality the Int data type supports 2.1 billion positive integers and just as many negative ones. So, in truth, you really have 4.2+ billion potential keys. After all, the Int data type supports the full range from -2^31 (-2,147,483,648) to 2^31-1 (2,147,483,647).
Can an auto-incrementing identity PK deal with negative numbers? How would we implement that? Can a negative value be seeded? What if we didn’t do it now and waited until we actually hit the max number – could we reseed at that point with a negative number? Turns out the answer to all these questions is yes. Take a look:
Checking whether an identity can accept a negative seed with an increment that moves lower each time is easy. Try this. It works:
CREATE TABLE [dbo].[mytest](
    [mykey] [int] IDENTITY(-1,-1) NOT NULL,
    [myvalue] [varchar](20) NULL
) ON [PRIMARY]
Insert a few records and you’ll move from -1 to -2, -3, -4 and so on.
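For instance, using the mytest table defined above (the inserted values are just placeholders):

```sql
INSERT INTO dbo.mytest (myvalue) VALUES ('first');   -- mykey = -1
INSERT INTO dbo.mytest (myvalue) VALUES ('second');  -- mykey = -2
INSERT INTO dbo.mytest (myvalue) VALUES ('third');   -- mykey = -3

SELECT mykey, myvalue FROM dbo.mytest ORDER BY mykey DESC;
```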
What we really wanted to test, however, is whether we would have this option later. Could we take an existing table that already had positive values and reseed it to a negative value? Keep in mind this new value will be lower than all the existing values. Would SQL Server let you get away with this? I had my doubts – but check it out. Using AdventureWorks2012 we reseeded the Person.Address table to negative one billion. Worked fine.
DBCC CHECKIDENT ('Person.Address', RESEED, -1000000000)
Does inserting a new value really use it? Would it increment by one from that point?
INSERT INTO Person.Address (AddressLine1, City, StateProvinceID, PostalCode)
VALUES ('1234 street', 'alamosa', 1, '81140')
SELECT * FROM Person.Address WHERE City = 'alamosa'
The answer is yes:
The interesting thing is that DBCC CHECKIDENT still reports the old maximum (32521 in this case) as the current column value.
DBCC CHECKIDENT ('Person.Address', NORESEED)
But that is a minor inconvenience:
Things to Keep in Mind
The one gotcha is that while you can reseed your auto-incrementing value, you can’t change the increment’s size or direction. In other words, if you originally said start at 1 and add 1 with each new value, you can reseed the first number but not the second; the amount added and the direction it moves are a one-time decision.
So if you are facing running out of integers with your PK and you are thinking, great, I’ll just reseed to zero and then move backwards the other way – that won’t work. If your identity was originally set to add one with each new value, you can’t change that without recreating the table. You’ll have to accept that you’re still moving in an increasing direction and reseed at the lowest possible value instead: -2,147,483,648, heading back towards zero (which presumably was your initial seed point).
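As a sketch, the rescue move looks like this (the table name is hypothetical; it assumes the identity was originally defined as IDENTITY(1,1) and the positive range is exhausted):

```sql
-- Reseed to the minimum Int value. The increment is still +1 and can't be
-- changed, so inserts now climb from the bottom of the range toward zero.
DBCC CHECKIDENT ('dbo.mytable', RESEED, -2147483648);

-- On a table that already has rows, the next insert receives the new seed
-- plus the increment, i.e. -2,147,483,647, then -2,147,483,646, and so on.
```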
You might be asking yourself how this approach will affect page splits. There is a well-known rule among DBAs that your clustered index, which is usually the PK, should be built on an ever-increasing key. Following this rule ensures new inserts do not cause “bad” page splits. (Keep in mind “good” splits are a fact of life and unavoidable in a growing table.) David Browne wrote an interesting article on this topic. His example uses sequential GUIDs, but the concept is the same. He demonstrates that creating a new sequential series which sorts before existing records does not create “bad” page splits the way a random key would.
Look at that! You just bought yourself 2.1 billion more keys to work with. By the time you run out a second time the table will be twice as big … but chances are you’ll have moved on to a different company by then anyway and it will be some other schmoe’s problem. Ok, I was being purposely snarky – that’s not a very good attitude. If you really have passed 2.1 billion keys, you need to proactively address the reality that you’re on your way towards the true 4.2 billion dead-end. So do yourself a favor: use the time you’ve bought to address the real problem and convert that Int to a BigInt. Kendra Little shares some excellent advice on that undertaking. You might also look into archiving while you’re at it!
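For a small table that can tolerate a maintenance window, the conversion itself can be as simple as the sketch below (constraint and table names are made up; large tables generally need the online approach – new table, backfill, swap – that Kendra Little covers):

```sql
-- ALTER COLUMN can't touch a column that's part of a PK, so drop the
-- constraint first. Widening int -> bigint preserves the identity property
-- but rewrites every row, so expect it to take time and log space.
ALTER TABLE dbo.mytable DROP CONSTRAINT PK_mytable;
ALTER TABLE dbo.mytable ALTER COLUMN mykey bigint NOT NULL;
ALTER TABLE dbo.mytable ADD CONSTRAINT PK_mytable PRIMARY KEY CLUSTERED (mykey);
```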