Sync Group – A good solution to synchronize SQL Databases using the Windows Azure infrastructure
Working with Windows Azure is becoming more and more pleasant. In this post we will focus on how to synchronize multiple databases hosted on Azure or on premise using Sync Group.
First of all, let’s start with the following requirement: we have an application that contains a database that needs to be replicated in each data center.
What should we do to replicate the entire content to all the data centers?
A good solution could be Sync Group (SDS). Using this feature we can define one or more instances of SQL Databases (from the cloud or on premise) which will be synchronized. For this group we can specify the tables and the columns which will be synchronized.
Creating an SDS can be done very easily from the Windows Azure portal. This feature can be found under the SQL DATABASES tab – SYNC. I will not start explaining how to create this group, because it is very easy; all weneed to know are the server addresses of the databases, the required user names and the passwords.
I think it’s more important that we focus on the options available on an SDS.
One of the databases that form the group needs to be the hub. The hub represents the master node of the group, from where all the data propagates.
The Synchronization Direction
Once we add the hub, we can add more databases to the group. At this point, we need to specify the synchronization direction and to do so we have 3 options:
- Sync from the Hub – With this option, all the changes made within the hub are replicated to the rest of the databases in the group. In this configuration, when data is different, the hub will win. In addition, changes made in the databases are not written to the hub.
- Sync to the Hub – With this option, no changewithin the hub is written in the databases. But all the changes made in the databases are written in the hub.
- Bi-directional – The synchronization is done both ways – from the Hub to the databases and from the databases to the hub.
From the portal, we have the option to select the tables and the columns from the hub table that will be synchronized. This way we don’t need to have databases with the same schema. The most important thing to keep in mind is to have the tables that you want to synchronize in the same schema/format.
Note: We don’t need to replicate the database schema to all the databases. Once we select the tables and columns that we want to synchronize (from the Hub schema), all these tables will be replicated to the rest of the group.
Conflict Resolution Policies
When creating a hub, we have two options for conflict resolution.
- Hub Wins – With this option, all the changes written to the hub will be preserved and, in case of conflict, the version on the hub will be the ‘good one’.
- Client Wins – With this option, the changes written on the slaves (non-hub databases) will win and the change within the slaves will propagate to the hub and to the rest of the group.
For more information about these conflict resolution policies, I recommend additional search on MSDN.
The Synchronization Frequency
The synchronization between the databases in a group is not carried out in real time. We can select the time interval in which the synchronization needs to be made. This time interval can be between 5 minutes and 1 month. We also have a button that can trigger the synchronization action.
The on-premise SQL Server
To be able to use this feature on the SQL Server we need to download and install SQL Data Sync. This is a tool that will integrate this functionality on the SQL Server.
All the synchronization actions between group nodes are logged. Using this information, we can determine how long the synchronization action has taken, what nodes have been synchronized and how the action has ended.
I think this feature has a lot of potential. Why is that? Because database synchronization can be carried out very easily now. This feature can really add value to your application with minimal costs and few headaches.