Sunday, August 23, 2009

Change Data Capture - Part 2

As I mention in my previous post on Change Data Capture in SQL Server 2008, Microsoft provides table valued functions (TVF's) to access changed data. The main TVFs used to query the change data are:

cdc.fn_cdc_get_all_changes_capture_instance
and
cdc.fn_cdc_get_net_changes_capture_instance

Each of these TVF's require two LSN's or log sequence numbers to identify the beginning and ending of the range of changes to be queried. Two functions are provided to aid in determining the min and max LSN available. They are:

sys.fn_cdc_get_min_lsn and sys.fn_cdc_get_max_lsn


Get All Changes

The first TVF, cdc.fn_cdc_get_all_changes_capture_instance
,will return one row per changed row on the source table. This means that if more than one column is modified in a transaction then one row is returned in the result set for the given data modification.

The TVF accepts the following paramters when utilized.

  • Min LSN

  • Max LSN

  • Row Filter option



The min LSN is obtained using the sys.fn_cdc_get_min_lsn function. Using Employees table from the previous post as an example, the min lsn can be obtained as follows

SELECT sys.fn_cdc_get_min_lsn ('dbo_Employees')AS min_lsn;


The max LSN is obtained using the sys.fn_cdc_get_max_lsn function. This function does not require any inputs and returns the max lsn for the captured data of the source table. To get the max LSN from the Employees table:

SELECT sys.fn_cdc_get_max_lsn()AS max_lsn;


The row filter option has two settings, All and All Update Old. The All option will return all changes made to the source table in one row. The All Update Old option will return one row for an update showing the previous value and one row showing the changed value.

Putting it all together to get the changes:

DECLARE @minLSN BINARY(10),@maxLSN BINARY(10)
SELECT @minLSN=sys.fn_cdc_get_min_lsn ('dbo_Employees');
SELECT @maxLSN=sys.fn_cdc_get_max_lsn();
select * from cdc.fn_cdc_get_all_changes_dbo_Employees(@minLSN,@maxLSN,N'All');



Get Net Changes

The second TVF available to work with captured change data is sys.fn_cdc_get_min_lsn. This TVF returns a row that specifies the whole change to a source row if more than one transaction has made modifications to that data. To use this feature the @supports_net_changes parameter must be set to 1 when enabling CDC on a given table.


The input parameters are the same as cdc.fn_cdc_get_all_changes_capture_instance
however the row filter option is slightly different. The options available are All, All With Mask, and All With Merge. All will return all net changes. All With Mask will return the final row plus the operation needed to apply the row. The All With Merge will return the final row with the resturn column _$oprtation being either a 1 ( indicating a delete ) or a 5 ( indicating either an insert or update is needed to apply the change )


Here is an example utilizing the net changes.


DECLARE @minLSN BINARY(10),@maxLSN BINARY(10)
SELECT @minLSN=sys.fn_cdc_get_min_lsn ('dbo_Employees');
SELECT @maxLSN=sys.fn_cdc_get_max_lsn();
select * from cdc.fn_cdc_get_net_changes_dbo_Employees(@minLSN,@maxLSN,N'all');



One last item to mention is that Microsoft provides a function that correlate time to LSN's. This is useful if you have a specific time period in which you would like to capture changes made to data. The function is sys.fn_cdc_map_time_to_lsn and can be used to determine lsn boundries based on time to be used in acquiring the change data.

Saturday, August 15, 2009

Change Data Capture in SQL Server 2008

Some applications require the ability to understand the full history of how data has been modified over time. A new feature included in SQL Server 2008 is Change Data Capture. Change Data Capture (CDC) allows for the ability to query data that has been modified through INSERT, UPDATE, or DELETE statements on a database table.

The SQL Server 2008 database engine provides the necessary framework to implement Change Data Capture. It is currently available in SQL Server 2008 Enterprise, Developer and Evaluation editions. The CDC process is accomplished through the use of the SQL Server transaction logs. As data modifications are recorded in the database transaction log, the CDC process reads the log files and captures the changes for those tables that have been identified to participate in CDC. It then writes those modifications to the Change Capture Data tables. Each table participating in CDC has a corresponding capture table that follows the naming standard and is owned by the database user cdc to hold the modified data. SQL Server allows for the placement of the table containing the modified data to be placed on a separate filegroup. This is a recommended configuration however the default location for the table will be the default filegroup for the database.

Once a database has been enabled for CDC, two SQL Agent jobs are created to manage the capture process. One job is for the purpose of capturing data on those tables that have been identified for CDC. The other job is created for the purpose of cleaning up the collected change data. Two procedures are provided to view and modify configuration parameters of these jobs. They are named sys.sp_cdc_change_job and sys.sp_cdc_help_jobs. Changes made through these procedures do not go into effect until the job is stopped and restarted.

The following is a simple demo that walks through enabling CDC and querying changes made.

Set Up

Create a test table to execute INSERT, UPDATE, and DELETE.


CREATE TABLE [dbo].[EMPLOYEES](
[EMP_ID] [int] IDENTITY(1,1) NOT NULL,
[EMP_NAME] [varchar](100) NOT NULL,
[EMP_NUMBER] [int] NOT NULL,
[EMP_TITLE] [varchar](50) NOT NULL,
[EMP_HIRE_DATE] [date] NOT NULL,
[EMP_TERM_DATE] [date] NULL,
CONSTRAINT [PK_EMPLOYEES] PRIMARY KEY CLUSTERED
(
[EMP_ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY])


Create database role for the management of the cdc tables.

CREATE ROLE [ChangeData_Admin] AUTHORIZATION [cdc] 


To utilize CDC, it must be enabled for the database. This is accomplished using the following stored procedure.

EXECUTE sys.sp_cdc_enable_db;


Once CDC has been enabled for the database, the tables you wish to participate in CDC can then be identified through sys.sp_cdc_enable_table.


EXECUTE sys.sp_cdc_enable_table
@source_schema = N'dbo'
, @source_name = N'Employees'
, @role_name = N'ChangeData_Admin';


The cdc agent jobs are created as a result of this procedure


Results
------------
Job 'cdc.TestDB_1_capture' started successfully.
Job 'cdc.TestDB_1_cleanup' started successfully.


Now we are ready to insert data into our table.


INSERT INTO [Test_db1].[dbo].[EMPLOYEES]
([EMP_NAME]
,[EMP_NUMBER]
,[EMP_HIRE_DATE]
,[EMP_TERM_DATE]
,[EMP_TITLE])
VALUES
('Steve Stark'
,1001
,'05/15/2008'
,null
,'Associate Developer')
GO

INSERT INTO [Test_db1].[dbo].[EMPLOYEES]
([EMP_NAME]
,[EMP_NUMBER]
,[EMP_HIRE_DATE]
,[EMP_TERM_DATE]
,[EMP_TITLE])
VALUES
('Mary Doe'
,1002
,'10/24/2003'
,null
,'Database Administrator')
GO


To see the changes we can query the cdc.dbo_employees_CT table. Microsoft does provide table valued functions to work with captured change data. For the sake of this post, I will directly query the capture table to view the results. For more information on how to use the provided table value functions, go here.


SELECT * FROM cdc.dbo_employees_CT


Here you will see two rows for the previous insert statements.

Now update some data:


UPDATE dbo.EMPLOYEES
SET EMP_TITLE = 'Development Director'
WHERE EMP_ID = 1


Again query cdc.dbo_employees_CT to observe the changes. Since we performed an update there will be two rows captured. The first is the old value and the second is the new value. Captured delete data will also only contain one row displaying the value prior to the deletion.


Clean Up


DROP TABLE dbo.EMPLOYEES;
EXECUTE sys.sp_cdc_disable_db;

Thursday, August 13, 2009

UPDATE: SQL Encryption Assistant v.1.0.6.238


Devenius has just released an update to SQL Encryption Assistant. The latest version is now 1.0.6.238.

This update applies to the following editions:


Standard
Standard 2008
Professional
Professional 2008


Some of the new features include:

- Expanded EKM functionality
- "Color-Coded" Encryption Status
- Extensible Key Management device support or asymmetric and symmetric keys

To apply the update, please see the post "How do I get the latest version of SQL Encryption Assistant?".

Thursday, August 6, 2009

The DBA vs. Service Broker Assistant Part II

So being the DBA portion of Devenius ( see About ) I figured I would try to test my developer partner's statistics from Part 1 using Server Broker Assistant in implementing the event notification logging example. I did deviate slightly from the plan by creating the tables and stored procedure prior to beginning the time study. My test went as follows.

1) Enable service broker for the test database.
2) Create the queue for notification events.
3) Create the service to manage the queue.
4) Alter the queue to use the previously created stored procedure.


I suppose I could have just as easily skipped step 4 and hooked in the stored procedure upon queue creation. However, I figured I would try to get close to the same process I am attempting to validate.

Test Case A - scripted

Using Management Studio's Query window, I wrote and executed the T-SQL manually. I too had the syntax available to me and wasn't required to perform any lookups on MSDN or BOL. Here is a copy of the sytax:

1. ALTER DATABASE TestDB_1 SET ENABLE_BROKER;

2. CREATE QUEUE NotifyQueue;

3. CREATE SERVICE NotifyService
ON QUEUE NotifyQueue
(
[http://schemas.microsoft.com/SQL/Notifications/PostEventNotification]
);

4. ALTER QUEUE [dbo].[NotifyQueue]
WITH ACTIVATION (
STATUS = ON,
PROCEDURE_NAME = [dbo].[LogEventsProc] ,
MAX_QUEUE_READERS = 2,
EXECUTE AS SELF
);

Test Case B - Service Broker Assistant

In this scenario, I used the Service Broker Assistant to perform the same actions in Test Case A.


And the results:

Test Case A = 3 min. 54 sec.
Test Case B = 1 min. 46 sec.

Service Broker Assistant showed quite an improvement in time to execute over the traditional method of manual T-SQL.

Sunday, August 2, 2009

The DBA vs. Service Broker Assistant Part I


A while back I posted the MSDN example for event logging with SQL Service Broker using our latest product SQL Service Broker Assistant. The example is a Service Broker implementation for logging event notifications. Then I got to thinking...how long would it have taken me to script the application in T-SQL? Good question. I have no idea.

So, I decided to have a little competition.

The DBA vs. Service Broker Assistant

A couple notes for the test. The example has several steps and for the sake of simplicity the dependent tables and stored procedure are deployed ahead of time. This test will be performed using the Standard edition of Service Broker Assistant. The professional edition is in the final stages of development and once it is released I will attempt the same test. The professional edition will allow you to create event notifications for Service Broker, the final step in the example.

Steps

1. Enable the Service Broker on the instance.

2. Create the service queue for logging.

3. Create the logging service.

4. Create logging tables, one in relational format the other to handle XML.

5. Create the stored procedure for executing upon activation.

6. Alter the queue to add the stored procedure created in step 5.

Ready...set...GO!!!

The completely unscientific results:

DBA 2 minutes 55 seconds
SBA 1 minute 25 seconds

Over a 50% decrease in time!
Now that assumes I remembered all the syntax for each step, while possible, I don't know about you but most of the time I need to look things up. Remember, I'm a developer by trade not a DBA...so wait maybe it should be Developer vs. Service Broker Assistant?

I...need...a...DBA. Hmmm...to be continued.