Sunday, 10 June 2012

I broke the Distribution database

I've been dragged away from the creative side of my job to fix creaking infrastructure recently. It's not a part of the job I enjoy. But I recognise I've reached a level at which I may never again be able to escape back into pure development.

This weeks fun involved a corrupt LDF files on a production server. The server is virtual and appears to be having disk issues. The databases in question continued to mirror and replicate without issue and a test restore followed by DBCC checks revealed no database corruption or data loss.

To fix the corrupt ldf files we had to failover to our mirror, break the mirroring for the affected database, restore form a full backup and re-sync with the mirror again. It worked fine. But the second corruption coming just 6 days after the first suggested an underlying disk issue so our Network guys suggested we nuke the log file  disk partition and recreate it. We failed-over all the mirrored databases and stopped replication from the suspect server. We also moved the ldf files for a few small less important databases away from the dodgy partition.

The disk work done, we had a few hours of reconfiguration to get the environment back to normal. However, when I attempted to setup replication again I realised I'd made a stupid mistake. The distribution database - which is used extensively by SQL Server in replication - was broken as I'd not moved the ldf file for this database. Googling the fix didn't turn up much of use. Most posts were DBA's having issues with replication after "accidentely" dropping the distribution database. None of the posts help me solve the particular issue I'd created. Consequently it took a couple of hours to find a solution. Hence why I've blogged it here.

The problem:
My Distribution database has no ldf file anymore. I want to drop the broken distribution database but decommissioning using the wizard initiated by right clicking Replication in the SSMS UI returns errors.

The solution:
MSDN revealed a number of system stored procedures for managing the distribution database and the procedure shown below with the addition of the parameter @Ignore_Distributor = 1 has the effect of removing the distribution database cleanly with 'No Questions Asked'. This enabled me to then use the simple UI wizard for reconfiguring a new distribution database and quickly establishing replication again.

 USE [master]  
 EXEC sp_dropdistributor @no_checks = 1, @ignore_distributor = 1  
 GO  

What we learned this week.

  • Creative DBA work is more fun than high stress open heart surgery on your production database.
  • Our DR solution works!!
  • Our documentation is non-existent and despite my best efforts will probably remain that way.


Monday, 4 June 2012

Detecting & Fixing Collation Issues

Differences between database and server collations is a common DBA problem especially if the environment has developed organically over years. This is exactly the problem my team faces as we begin our data warehouse development. We're lucky in the sense that we're upgrading and consolidating old SQL Server instances to 2008 R2. This provides an excellent opportunity to standardise the server and database collations although it does require extra work as some legacy apps will need to be thoroughly tested.

We found a two useful scripts that help us identify and synchronise our databases which had collation differences.

The first is from respected SQL Blogger Tibor Karaski and identifies tables and columns within a database that do not match the database collation. Download this script here.
1:  IF OBJECT_ID('tempdb..#res') IS NOT NULL   
2:    DROP TABLE #res  
3:  GO  
4:  DECLARE @db SYSNAME,  
5:    @sql NVARCHAR(2000)  
6:  CREATE TABLE #res  
7:    (  
8:     server_name SYSNAME,  
9:     db_name SYSNAME,  
10:     db_collation SYSNAME,  
11:     table_name SYSNAME,  
12:     column_name SYSNAME,  
13:     column_collation SYSNAME  
14:    )  
15:  DECLARE c CURSOR  
16:  FOR  
17:  SELECT name  
18:  FROM  sys.databases  
19:  WHERE  NAME NOT IN ('master','model','tempdb','msdb') AND  
20:      state_desc = 'ONLINE'  
21:  OPEN c  
22:  WHILE 1 = 1   
23:    BEGIN  
24:      FETCH NEXT FROM c INTO @db  
25:      IF @@FETCH_STATUS <> 0   
26:        BREAK  
27:      SET @sql = 'SELECT  
28:    @@SERVERNAME AS server_name  
29:   ,''' + @db + ''' AS db_name  
30:   ,CAST(DATABASEPROPERTYEX(''' + @db + ''', ''Collation'') AS sysname) AS db_collation  
31:   ,OBJECT_NAME(c.object_id, ' + CAST(DB_ID(@db) AS SYSNAME) + ') AS table_name  
32:   ,c.name AS column_name  
33:   ,c.collation_name AS column_collation   
34:  FROM ' + QUOTENAME(@db) + '.sys.columns AS c  
35:   INNER JOIN ' + QUOTENAME(@db) + '.sys.tables AS t ON t.object_id = c.object_id  
36:  WHERE t.type = ''U''  
37:   AND c.collation_name IS NOT NULL  
38:   AND c.collation_name <> CAST(DATABASEPROPERTYEX(''' + @db + ''', ''Collation'') AS sysname)  
39:  '  
40:  --PRINT @sql  
41:      INSERT INTO #res  
42:          EXEC (@sql)  
43:    END  
44:  CLOSE c  
45:  DEALLOCATE c  
46:  SELECT *  
47:  FROM  #res  

The second script is much bigger and I can't remember where we found it online. If you wrote it I'm happy to reference you here as I certainly can't claim it.

The script will amend a database and the objects within it to a collation of your choice. Line 68 is where you set the variable for the collation you're changing to and line 70 is the database you're altering. It takes time to run as there are some complex conversions involved.

DO NOT RUN THIS SCRIPT IN A PRODUCTION ENVIRONMENT BEFORE TESTING EXTENSIVELY. By publishing this script I am not guaranteeing it is safe to use in your environment and I'm not advocating using it without first understanding it. However, it worked really well for us.

Here is a link to the script.