MS SQL Server - Trust me it's easy!: March 2010

Sunday, March 28, 2010

What if you forget the password of Database Master Key (DMK)

Have you forgotten the Database Masker Key password of your database?

If yes, then execute the below statement in your database as soon as possible.

ALTER MASTER KEY REGENERATE WITH ENCRYPTION BY PASSWORD = 'YourPassword';

Have the statement completed? If yes, then be relaxed, you are at safer side now.

Is this really a matter of urgency?

Answer could be yes or no and that depends on the current configuration of Database Master Key. By default Database Master Key is encrypted by Service Master Key and Password both. And re-generating the password is not a matter of urgency. But if you have already altered the Database Master Key to drop encryption by Service Master Key, in that case password is the only weapon to recover that database if the database get corrupted.

NOTE: The REGENERATE option re-creates the database master key and all the keys it protects. The keys are first decrypted with the old master key, and then encrypted with the new master key. This resource-intensive operation should be scheduled during a period of low demand, unless the master key has been compromised.

You can get the syntax of ALTER MASTER KEY statement here:

http://msdn.microsoft.com/en-us/library/ms186937.aspx

Database Master Key (DMK)

One database can have only one Database Master Key at a time.

Like Service Master Key, Database Master Key is also a symmetric key.

Database Master Key is created using CREATE MASTER KEY statement. It doesn’t generated automatically as like Service Master Key. The syntax to create a DMK is:

CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'password'

For example:

USE TestDataBase

GO

CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'SQLReality_1'

In default settings a Database Master Key can be decrypted either by Service Master Key or by password that was used to create it. It helps in reducing the key administration task. Because when a database that has Database Master Key, is restored on same server, its DMK is automatically decrypted by Service Master Key without coming into the notice. In such cases password that is specified while creating the DMK are not used. But it doesn’t mean that password is just a value to create the DMK and through away.

Password that is used while creating a DMK plays its crucial role when database is restored on different server. Because the Service Master Key of new server can not decrypt the DMK, after restore the database we will have to decrypt the DMK with password as below:

OPEN MASTER KEY DECRYPTION BY PASSWORD = 'password'

For example:

USE TestDataBase

GO

OPEN MASTER KEY DECRYPTION BY PASSWORD = 'SQLReality_1'

Once decrypted by password we can alter it to decrypt by Service Master Key of new server so that it can automatically decrypt in future. For that execute following statement on new server:

USE TestDataBase

GO

ALTER MASTER KEY ADD ENCRYPTION BY SERVICE MASTER KEY

We can also use the ALTER MASTER KEY statement to re-set the password of Database master key for security purpose or if forget.

Service Master Key (SMK)

Service Master Key is automatically generated the first time when it is needed. In most cases its created just after SQL Server installed. Because this key is not only required for key or data encryption but also required to encrypt Linked server password and SQL Server credentials. SMK is encrypted by MACHINE KEY (computer specific) and by windows credential of SQL Server’s service account. It means that this key becomes invalid on changing the SQL Server computer or on changing the SQL server service account. But to minimize the management trouble it automatically re-generate SMK when either of these credentials becomes invalid on the basis of remaining valid key. We can also use ALTER SERVICE MASTER KEY statement to alter the service account. But the DBA has to take action when both of these credentials (computer and service account) changes at once.

Service Master Key is the base of complete automatic key management and directly or indirectly encrypts all other keys on that server, it is crucial to take precautions to handle such situations.

So what is the precaution to keep the business running when SMK turns invalid?

And the answer is: have a backup of SMK.

The syntax to take the backup of SMK:

BACKUP SERVICE MASTER KEY TO FILE = 'path_to_file' ENCRYPTION BY PASSWORD = 'password'

For example:

BACKUP SERVICE MASTER KEY TO FILE = 'D:\Key_Backup\SMK' ENCRYPTION BY PASSWORD = 'SQLReality_1'

The password provided here is used to encrypt the service master key in backup file and will require when restoring this SMK. So the complexity and security of this password along with backup file is subject of high concern for successful restore of SMK when needed.

Syntax for restore statement is following:

RESTORE SERVICE MASTER KEY FROM FILE = 'path_to_file' DECRYPTION BY PASSWORD = 'password' [FORCE]

For example:

RESTORE SERVICE MASTER KEY FROM FILE = 'D:\Key_Backup\SMK' DECRYPTION BY PASSWORD = 'SQLReality_1'

Even there are T-SQL statements for ALTER, BACKUP and RESTORE to the Service Master key but there is not CREATE statement to create this key. This can be viewed by below statement:

USE master

SELECT * FROM sys.symmetric_keys

WHERE name = '##MS_ServiceMasterKey##'

Difference between Cell- level Encryption and Transparent Data Encryption (TDE)

Cell- level Encryption

Transparent Data Encryption (TDE)

Granular control over which data is encrypted

Encrypts the entire database

User‐aware encryption can control access on a need‐to‐know basis

Encryption is not user‐aware; data is open to all users who have permission to access the database

Requires analysis to find sensitive data

No analysis required because entire database is encrypted

Affect table structure because encrypted data can only be stored in varbinary data type columns.

No change is required in table structure.

Database applications need to be modified to use specific functions to encrypt and decrypt data.

No database application change needed.

Indexes, primary keys, and foreign keys cannot be encrypted.

No impact on indexing, primary keys, or foreign keys

Potential impact on performance because indexes on encrypted columns are not used while searching a value.

Small impact on performance (up to 5%)

Facts about SQL Server Data Encryption & Decryption

1. Encryption is supported by SQL Server Developer and Enterprise editions only.

2. Encryption is the conversion of readable plaintext into ciphertext, which cannot be easily understood by unauthorized people.

3. Decryption is the process of converting ciphertext back into its original form so it can be understood.

There is no SSMS interface to create encryption keys, and certificate. They can be created only by T-SQL.

4. Two types of encryption can be implemented in SQL Server: Cell-level encryption and Transparent Data Encryption (TDE). You can implement any one or both in a database.

5. Data can be encrypted by any of following methods:

·         Password (ENCRYPTBYPASSPHRASE function)
·         Hashing (HASHBYTES function)
·         Symmetric Key
·         Asymmetric key
·         Certificate
·         TDE

6. TDE secure the data on disk. TDE encrypt data while saving to disk in data files or backup files and decrypt data while reading from disk. TDE ensured that database files or backup files cannot be stolen and restored on another machine without the necessary DMK. No application and database design changes are required for TDE.

7. Cell-level encryption is the most part of encryption and it is maintained by multi level hierarchy of encryption keys. Particular value in a table can be encrypted with this type of encryption.

8. Following is the hierarchy of encryption keys (following scenario is one of many possible key hierarchy configurations):

Table data encrypted by Symmetric key

Symmetric key encrypted by Asymmetric key or certificate

Asymmetric key encrypted by Database Master Key (DMK)

DMK encrypted by Service Master key (SMK)

SMK encrypted by Windows Data Protection API

9. Best practices for cell-level data encryption

·         Encrypt high-value and sensitive data only.
·         Use symmetric keys to encrypt data, and asymmetric keys or certificates to protect the symmetric keys.
·         Password-protect keys and remove master key encryption for the most secure configuration.
·         Always back up the service master key, database master keys, and certificates by using the key-specific DDL statements.
·         Always back up your database to back up your symmetric and asymmetric keys.

10. Data encrypted by cell-level encryption can only be stored in a varbinary data type column. So to encrypt the data of existing table needs table structure changes.

11. Cell-level encryption is implemented through many encryption and decryption functions. So it needs T-SQL code changes to implement DML operations.

12. Certificate is preferred against Asymmetric key because certificate can be backed up to a file while asymmetric key not.

13. The automatic key management hierarchy works as follow (take the above mentioned scenario):

14. The Windows Data Protection API automatically decrypts SMK (Service Master Key).

SMK automatically decrypts DMK (Database Master Key).

DMK automatically decrypts asymmetric key.

Asymmetric key automatically decrypts symmetric key

Symmetric key automatically decrypts table data.

So by default, the automatic key management does not provide any more security because data is decrypted automatically when accessed.
15. Data can be protected if we take control over automatic key management. This can be done by following three methods:

1. Granting authorization on key to authorized users only.
2. Encrypting the data or keys with secure password.
3. Extensible Key Management (EKM).

16. Password can be used to encrypt table data and any level of key up to DMK.

Using OUTPUT clause for auditing

In SQL Server 2000 and earlier versions, trigger was used to log changes into a table for audit purpose. But the OUTPUT clause introduced in SQL Server 2005 can also be used to perform such logging. Let us see by an example:

CREATE TABLE [dbo].[MyTable1](

[Col1] [int] NULL,

[Col2] [varchar](10) NULL

) ON [PRIMARY]

CREATE TABLE [dbo].[MyTable1_Audit](

[Col1_Old] [int] NULL,

[Col1_New] [int] NULL,

[Col2_Old] [varchar](10) NULL,

[Col2_New] [varchar](10) NULL,

[Action] varchar(50) NULL

) ON [PRIMARY]

GO

INSERT INTO MyTable1

OUTPUT NULL, inserted.Col1, NULL, inserted.Col2, 'INSERT'

INTO [MyTable1_Audit]

VALUES (1,'WOW')

SELECT * FROM MyTable1

SELECT * FROM [MyTable1_Audit]

UPDATE MyTable1 SET Col1 = 10 , Col2 = 'Great'

OUTPUT deleted.Col1, inserted.Col1,

deleted.Col2, inserted.Col2, 'UPDATE' INTO [MyTable1_Audit]

WHERE Col1 = 1

SELECT * FROM MyTable1

SELECT * FROM [MyTable1_Audit]

Saturday, March 6, 2010

Get free space in Data and Log files in all databases

To get total size and free space (unallocated space) in transaction log files of all databases we have a very handy command:

DBCC SQLPERF(LOGSPACE)

But there is no such command to get the free space in data files of all databases. Even in current database, data file size and free space can be checked by sp_spaceused.

But if you have many databases than connecting to each database and executing the sp_spaceused command is troublesome.

To simplify this task I have written following query. This provides you all information about file size and free space in all databases.

CREATE TABLE #temp (DatabaseName VARCHAR(100), DataFileSize_MB INT,

Free_Space_in_Data_Files_MB INT, LogFileSize_MB INT)

INSERT INTO #temp

EXEC sp_msforeachdb 'SELECT ''?'' as DatabaseName, DBSize.DataSize,

case when DBSize.DataSize - allocated.ReservedPages > 0 then

    DBSize.DataSize - allocated.ReservedPages else 0 end as FreeSpace_MB, DBSize.LogSize

FROM (SELECT (sum(convert(bigint,case when status & 64 = 0 then

    size else 0 end))* 8192 / 1048576) DataSize,

(sum(convert(bigint,case when status & 64 <> 0 then

    size else 0 end))* 8192 / 1048576) LogSize

FROM [?].dbo.sysfiles) DBSize,

(SELECT (sum(total_pages)* 8192 / 1048576) AS ReservedPages

FROM [?].sys.allocation_units) allocated'

SELECT instance_name AS 'DatabaseName', tt.DataFileSize_MB, tt.Free_Space_in_Data_Files_MB,

tt.LogFileSize_MB, (tt.LogFileSize_MB -(MAX(cntr_value)/1024)) AS 'Free_Space_in_Log_Files_MB'

FROM #temp tt inner join sysperfinfo on tt.DatabaseName = sysperfinfo.instance_name

WHERE counter_name ='Log File(s) Used Size (KB)' AND instance_name != '_total'

GROUP BY instance_name , tt.DataFileSize_MB, tt.Free_Space_in_Data_Files_MB, tt.LogFileSize_MB

ORDER BY tt.Free_Space_in_Data_Files_MB + (tt.LogFileSize_MB -(MAX(cntr_value)/1024)) DESC

--drop table #temp

Following is the screenshot of its output

Friday, March 5, 2010

Best practices to write complex JOINs on multiple tables

Always use ANSI standard joins: Tables can be joined in two join standards: ANSI and Transact-SQL. Syntax for these two standards is following:

--Transact-SQL standard:

SELECT tableA.colX, tableB.colY
FROM tableA, tableB
WHERE tableA.col1= tableB.col1

--ANSI standard:

SELECT tableA.colX, tableB.colY
FROM tableA INNER JOIN tableB
ON tableA.col1= tableB.col1

In earlier version of SQL Server, Transact-SQL standard had *= and =* operator for LEFT and RIGHT OUTER joins. But now these operators have been deprecated. So always use the ANSI standard for unanimity and clarity. Find the easier to understand query out of following two same queries:

--Query 1: 
SELECT tableA.colX, tableB.colY, tableC.colZ
FROM tableA, tableB
LEFT OUTER JOIN tableC
ON tableA.col2 = tableC.col2
WHERE tableA.col1= tableB.col1 

--Query 2: 
SELECT tableA.colX, tableB.colY, tableC.colZ
FROM tableA INNER JOIN tableB
ON tableA.col1= tableB.col1 
LEFT OUTER JOIN tableC
ON tableA.col2 = tableC.col2

I find the second query easier because it tells the relation between joining tables at the time of joining and uses same syntax to join all tables.

Do not mix LEFT and RIGHT OUTER JOIN: If we are joining three tables with LEFT or RIGHT OUTER joins than we can write in any one of following three ways:

--Query 1: Using left outer join

SELECT tableA.colX, tableB.colY, tableC.colZ
FROM tableA LEFT OUTER JOIN tableB
ON tableA.col1= tableB.col1 
LEFT OUTER JOIN tableC
ON tableB.col2 = tableC.col2

--Query 2: Using left and right outer join<?xml:namespace prefix = o />

SELECT tableA.colX, tableB.colY, tableC.colZ
FROM tableB RIGHT OUTER JOIN tableA
ON tableA.col1= tableB.col1 
LEFT OUTER JOIN tableC
ON tableB.col2 = tableC.col2

--Query 3: Using right outer join<?xml:namespace prefix = o />

SELECT tableA.colX, tableB.colY, tableC.colZ
FROM tableC RIGHT OUTER JOIN tableB
ON tableB.col2 = tableC.col2
RIGHT OUTER JOIN tableA
ON tableA.col1= tableB.col1

Here you would observe that Query 2 is most complex and Query 1 is easiest to understand. We can always convert RIGHT OUTER JOIN into LEFT OUTER JOIN or vice versa by just changing the sequence of tables. So why not we write the queries in easiest way.

Write INNER joins first and then LEFT OUTER joins: If a table is joined with INNER and LEFT OUTER joins than always write the INNER joins first and then LEFT OUTER joins. Find the easier to understand query out of following two same queries:

--Query 1: 
SELECT tableA.colX, tableB.colY, tableC.colZ
FROM tableA LEFT OUTER JOIN tableB
ON tableA.col1= tableB.col1 
INNER JOIN tableC
ON tableA.col2 = tableC.col2

--Query 2: 
SELECT tableA.colX, tableB.colY, tableC.colZ
FROM tableA INNER JOIN tableC
ON tableA.col2 = tableC.col2
LEFT OUTER JOIN tableB
ON tableA.col1= tableB.col1

I find second query easier because it defines the essential result-set first and then joins with other tables for additional data.

Keep the ON clause adjacent to appropriate JOIN: Find the easier to understand query out of following two same queries:

--Query 1: 
SELECT tableA.colX, tableB.colY, tableC.colZ
FROM tableA INNER JOIN tableB
LEFT OUTER JOIN tableC
ON tableA.col1= tableB.col1 
ON tableA.col2 = tableC.col2

--Query 2: 
SELECT tableA.colX, tableB.colY, tableC.colZ
FROM tableA INNER JOIN tableB
ON tableA.col1= tableB.col1 
LEFT OUTER JOIN tableC
ON tableA.col2 = tableC.col2

I find second query easier because it tells the relation within joining tables at the time of joining.

Use database diagram to identify the relation between tables:
If I am to write a query to know the Salesperson, product and the total sold quantity of that product, I have to identify the relation between Salesperson, Product and there sold quantity. Following is the screenshot of database diagram of required tables in AdventureWorks database:

Now when we know the relation, we can write a query to join these tables:

SELECT Emp.LoginID, Prod.Name, SUM (SOD.OrderQty) AS TotalQuantitySold

FROM HumanResources.Employee Emp

LEFT OUTER JOIN Sales.SalesOrderHeader SOH

ON Emp.EmployeeID = SOH.SalesPersonID

INNER JOIN Sales.SalesOrderDetail SOD

ON SOH.SalesOrderID = SOD.SalesOrderID

INNER JOIN Production.Product Prod

ON SOD.ProductID = Prod.ProductID

GROUP BY Emp.LoginID, Prod.Name

ORDER BY Emp.LoginID, Prod.Name

The screenshot of query result is following:

clip_image002[5]

Thursday, March 4, 2010

The purpose of Transaction Log

We all know that all DDL and DML changes are logged in Transaction log. But when I ask a question that what is the primary purpose of Transaction log in SQL Server then the most common answers that I get are following:

· For Transaction log backup.

· For Transactional Replication, Log shipping or Mirroring.

As we know these two features can not be implemented in SIMPLE recovery model but transaction log is still maintained in this recovery model. This indicates that even these features use transaction log but these are not the prime purpose of Transaction log.

The primary purpose is Rollback & Recovery.

We all know that reversing the changes is called Rollback in SQL Server. This is the process to keep the database consistent and integrated. Transaction is the unit of action and either all statements of a transactions complete successfully or none.

So what is Recovery? Recovery includes two steps: Redo and Undo. In Redo all changes of completed transactions, after the last checkpoint are updated in data pages. In Undo process all changes of uncompleted transactions up to last checkpoint are rolled back.

Recovery is performed on a database in two cases:

1. While starting the database when SQL Server starts

2. While restoring the database.

The recovery process while restoring a database can be paused for later time by using NORECOVERY option in RESTORE DATABASE or RESTORE LOG command.