Introduction
Earlier, most applications ran in a standalone environment in which a single centralized server responded to multiple users working in different locations.
Centralized Approach and Problems
- Performance problems
- Availability problems
- Maintenance problems
To overcome these problems, we can use replication.
Replication maintains multiple copies of the same database at different locations. Log shipping and mirroring maintain complete database redundancy, whereas replication maintains only part of the database (a set of required objects) at each user location. Changes made at the different user locations are synchronized to the main server. It is an object-level high availability feature. According to Books Online:
Replication is a set of technologies for copying and distributing data and database objects from one database to another and then synchronizing between databases to maintain consistency.
Unlike other high availability methods, it does not distribute the entire database, only selected parts of it such as tables or views.
Advantages
- Improved performance
- Reduced locking conflicts when multiple users are working
- Improved availability
- Easy maintenance
- Allows sites to work independently, so that each location can set up its own rules and procedures for working with its copy of the data
- Moves data closer to the user
SQL Server 2005 Features
- Restartable Snapshots
- Oracle Publishing
- Replicating all DDLs
- Merge Replication allows custom business logic to be introduced into the synchronization process
- Merge Replication provides the ability to replicate data over HTTP with web synchronization option
- Updatable Transactional Subscriptions can now handle updates to large data types at Subscribers
SQL Server 2008 Features
- In SQL Server 2005, replication had to be stopped to perform actions such as adding nodes or making schema changes. In 2008, these can be done online.
- Conflict detection capability in peer-to-peer replication.
- All types of conflicts are detected and reported through the agent error log or the conflicts table.
- In SQL Server 2005, partition switching is not supported on replicated tables. SQL Server 2008 supports it through the following publication properties:
@allow_partition_switch
@replicate_partition_switch
- Performance improvements under Windows Server 2008
- Snapshot delivery of more than 500 MB/minute
- Delivery of 100,000 varbinary(max) records in less than 2 minutes, compared with 223 minutes in SQL Server 2005.
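As a rough sketch, the two partition-switching properties listed above could be enabled on an existing transactional publication with sp_changepublication. The publication name SalesPub is a hypothetical placeholder, and the script is assumed to run at the Publisher in the publication database.

```sql
-- Assumption: 'SalesPub' is a hypothetical transactional publication.
-- Run at the Publisher, in the publication database.

-- Allow ALTER TABLE ... SWITCH PARTITION on published tables.
EXEC sp_changepublication
    @publication = N'SalesPub',
    @property    = N'allow_partition_switch',
    @value       = N'true';

-- Optionally replicate the SWITCH statement itself to Subscribers.
-- This only makes sense once allow_partition_switch is true.
EXEC sp_changepublication
    @publication = N'SalesPub',
    @property    = N'replicate_partition_switch',
    @value       = N'true';
```

Replicating the switch assumes the Subscriber tables are partitioned the same way as the Publisher tables; if they are not, leaving the second property off is the safer choice.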
SQL Server 2012 Features
- Updatable subscriptions with transactional publications are discontinued.
- Four new stored procedures provide replication support for AlwaysOn.
sp_get_redirected_publisher
sp_redirect_publisher
sp_validate_replica_hosts_as_publishers
sp_validate_redirected_publisher
- Replication supports the following features on Availability groups:
- A publication database can be part of an availability group. The publisher instances must share a common distributor.
- In an AlwaysOn Availability Group, an AlwaysOn secondary cannot be a publisher. Republishing is not supported when replication is combined with AlwaysOn.
- Heterogeneous replication to non-SQL Server subscribers is deprecated. To move data, create solutions using change data capture and SSIS.
- Oracle Publishing is deprecated.
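The AlwaysOn stored procedures listed above are typically run at the Distributor. The following is a minimal sketch of redirecting a publisher to an availability group listener and then validating the replicas. SERVER1, MyPublishedDB, and MyAGListener are placeholder names, not values from this article.

```sql
-- Run at the Distributor, in the distribution database.
-- Assumption: SERVER1, MyPublishedDB and MyAGListener are hypothetical names.
USE distribution;
GO

-- Point the original publisher/database pair at the availability group listener.
EXEC sys.sp_redirect_publisher
    @original_publisher   = N'SERVER1',
    @publisher_db         = N'MyPublishedDB',
    @redirected_publisher = N'MyAGListener';

-- Check that every replica host is able to act as the publisher.
DECLARE @redirected_publisher sysname;
EXEC sys.sp_validate_replica_hosts_as_publishers
    @original_publisher   = N'SERVER1',
    @publisher_db         = N'MyPublishedDB',
    @redirected_publisher = @redirected_publisher OUTPUT;

SELECT @redirected_publisher AS redirected_publisher;
```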
Replication Architecture
REPLICATION ENTITIES
SQL Server replication is based on the “Publish and Subscribe” metaphor. Let us look at each of the individual components in detail.
- Publisher
- It is a source database where replication starts. It makes data available for replication.
- Publishers define what they publish through a publication.
- Article
- Articles are the actual database objects included in replication, such as tables, views, and stored procedures.
- An article can be filtered when sent to the subscriber.
- Publication
- A group of articles is called a publication.
- An article can’t be distributed individually; it must belong to a publication.
- Distributor
- It is the intermediary between the publisher and the subscriber.
- It receives published transactions or snapshots and then stores and forwards these publications to the subscriber.
- It has 6 system databases including distribution.
- Subscriber
- It is the destination database where replication ends.
- It can subscribe to multiple publications from multiple publishers.
- It can send data back to publisher or publish data to other subscribers.
- Subscription
- It is a request by a subscriber to receive a publication.
- We have two types of subscriptions - push and pull.
- Push Subscriptions
- With this subscription, the publisher is responsible for pushing all changes to the subscriber without the subscriber asking for them.
- Push subscriptions are created at the Publisher server.
- Pull Subscriptions
- With this subscription the subscriber initiates the replication instead of the publisher.
- The subscriptions are created at the Subscriber server.
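To see which existing subscriptions are push and which are pull, the distribution database can be queried. This is a sketch that assumes the distribution database uses the default name distribution and is run at the Distributor.

```sql
-- Run at the Distributor.
-- Assumption: the distribution database has the default name "distribution".
SELECT DISTINCT
       p.publication,
       s.publisher_db,
       s.subscriber_db,
       CASE s.subscription_type
            WHEN 0 THEN 'push'
            WHEN 1 THEN 'pull'
            WHEN 2 THEN 'anonymous'
       END AS subscription_type
FROM distribution.dbo.MSsubscriptions AS s
JOIN distribution.dbo.MSpublications  AS p
     ON  p.publisher_id   = s.publisher_id
     AND p.publisher_db   = s.publisher_db
     AND p.publication_id = s.publication_id
WHERE s.subscriber_id >= 0;   -- skip the internal "virtual" subscription rows
```

MSsubscriptions holds one row per article per subscription, which is why DISTINCT is used here.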
REPLICATION AGENTS
- The replication process works in the background with the help of SQL Server Agent jobs.
- These jobs are also called agents. Internally, each job uses the respective .exe file present in the …………….. \110\COM folder.
- Information about all agents is stored in the distribution database in the following tables.
dbo.MSxxx_agents
dbo.MSxxx_history
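For example, for the Snapshot Agent the xxx in the table names above is snapshot. A minimal sketch of reading those two tables, assuming the default database name distribution:

```sql
-- Run at the Distributor.
-- Assumption: the distribution database has the default name "distribution".
USE distribution;
GO

-- One row per Snapshot Agent.
SELECT id, name, publisher_db, publication
FROM dbo.MSsnapshot_agents;

-- Most recent history rows for those agents.
-- runstatus: 1 = start, 2 = succeed, 3 = in progress, 4 = idle, 5 = retry, 6 = fail
SELECT TOP (10)
       a.name,
       h.runstatus,
       h.start_time,
       h.[time],
       h.comments
FROM dbo.MSsnapshot_history AS h
JOIN dbo.MSsnapshot_agents  AS a
     ON a.id = h.agent_id
ORDER BY h.[time] DESC;
```

The same pattern applies to the other agents, for instance dbo.MSlogreader_agents and dbo.MSdistribution_agents with their matching history tables.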
Snapshot Agent
- It is an executable file that prepares snapshot files containing schema and data of published tables and db objects.
- It stores the files in the snapshot folder, and records synchronization jobs in the distribution database.
Distribution Agent
- It is used with snapshot and transactional replication.
- It applies the initial snapshot to the Subscriber and moves transactions held in the Distribution db to Subscribers.
- It runs at either the Distributor for push subscriptions or at the Subscriber for pull subscriptions.
Log Reader Agent
- It is used with transactional replication. It moves transactions marked for replication from the transaction log on the publisher to the distribution db.
- Each published db has its own Log Reader Agent that runs on the Distributor and connects to the Publisher.
Merge Agent
- It is used with merge replication.
- It applies the initial snapshot to the Subscriber and then moves and reconciles the incremental data changes that occur.
- Each merge subscription has its own Merge Agent that connects to both the Publisher and the Subscriber and updates both.
- Merge replication tracks these changes using triggers.
Queue Reader Agent
- It is used with transactional replication with the queued updating option.
- It runs at the Distributor and moves changes made at the Subscriber back to the Publisher.
- Unlike Distribution Agent and Merge Agent, only one instance of the Queue Reader Agent exists to service all Publishers and publications for a given distribution db.
REPLICATION TYPES
- Snapshot Replication
- Transactional Replication
- Merge Replication
1. Snapshot Replication
- The snapshot process is commonly used to provide the initial set of data and database objects for transactional and merge publications.
- It copies and distributes data and database objects exactly as they appear at a specific moment in time.
- It can also be used when complete refreshes of data are appropriate (BOL).
- Scenarios
- When the data is not changing frequently.
- When we want to replicate a small amount of data.
- To replicate lookup tables that do not change frequently.
- When it is acceptable to have copies of data that are out of date with respect to the publisher for a period of time.
For example, if a sales organization maintains a product price list and the prices are all updated at the same time once or twice each year, replicating the entire snapshot of data after it has changed is recommended.
Snapshot Replication Architecture
Source: BOL
How It Works
- The Snapshot Agent establishes a connection from the distributor to the publisher and places locks on the published tables so that it can generate a fresh snapshot in the snapshot folder.
- It writes a copy of the table schema for each article to a .sch file.
- It copies data from each published table at the Publisher and writes it to the snapshot folder as a .bcp file.
- It appends rows to the MSrepl_commands and MSrepl_transactions tables in the distribution database.
- It releases any locks on the published tables.
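The rows that the Snapshot Agent appends can be inspected directly at the Distributor. This is a small sketch, assuming the default distribution database name; on a busy system these tables can be large, so the queries are kept narrow.

```sql
-- Run at the Distributor.
-- Assumption: the distribution database has the default name "distribution".
USE distribution;
GO

-- How many commands are currently stored for delivery.
SELECT COUNT(*) AS stored_commands
FROM dbo.MSrepl_commands;

-- The most recently recorded transactions.
SELECT TOP (10)
       publisher_database_id,
       xact_seqno,
       entry_time
FROM dbo.MSrepl_transactions
ORDER BY entry_time DESC;

-- Readable view of the stored commands (can return many rows).
EXEC dbo.sp_browsereplcmds;
```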
Configuring Replication
- Configuring distributor
- Configuring publisher
- Creating publication of required type
- Creating subscription(s)
Step 1: Configuring distributor and publisher
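- Take three instances (in this example, Server1 is the publisher, Server2 is the distributor, and Server1\test is the subscriber)
- Go to the second instance (Server2) -> Right click on Replication -> Configure Distribution…
- Next -> Select the option “‘SERVER2’ will act as its own Distributor”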
- Next
- Next
- Next
- Uncheck the check box present at Server2 -> Add
- Select instance Server1
- Next
- Enter a strong password. (A login named Distributor_Admin is created automatically on the distributor.)
- Next
- Next
- Finish
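The wizard steps above can also be scripted. The following is a minimal sketch, assuming SERVER2 is the distributor and SERVER1 the publisher as in this walkthrough. The password and the snapshot share \\SERVER2\repldata are placeholders to replace with your own values.

```sql
-- Part 1: run at the Distributor (SERVER2), in master.
-- Assumption: the password and the working directory below are placeholders.
USE master;
GO
EXEC sp_adddistributor
    @distributor = N'SERVER2',
    @password    = N'<StrongPasswordHere>';   -- password for distributor_admin

EXEC sp_adddistributiondb
    @database      = N'distribution',
    @security_mode = 1;                        -- Windows authentication

EXEC sp_adddistpublisher
    @publisher         = N'SERVER1',
    @distribution_db   = N'distribution',
    @security_mode     = 1,
    @working_directory = N'\\SERVER2\repldata'; -- snapshot folder (placeholder)
GO

-- Part 2: run at the Publisher (SERVER1), in master.
-- Registers SERVER2 as the remote distributor, using the same password.
EXEC sp_adddistributor
    @distributor = N'SERVER2',
    @password    = N'<StrongPasswordHere>';
```

The two parts run on different instances, so they are shown in one block only for readability.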
Observations
- Go to distributor -> Databases -> Find the new database “Distribution”
- Go to Security -> Logins -> Find a new login “Distributor_admin”
- Go to Server Objects -> Linked servers -> Find new linked server “repl_distributor”
- Right Click on Replication -> Select distributor Properties
- By default, transactions stored in the distribution database are removed after 72 hours and agent history is removed after 48 hours.
- To view snapshot folder path -> Click on publishers -> click on browse button (…) present to right side of publisher name.
- Go to SQL Server Agent -> Jobs -> Find that 6 new jobs have been created automatically.
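The same observations can be checked from a query window on the distributor. This is a sketch that uses only catalog views and replication helper procedures; the exact job names returned depend on your instance names.

```sql
-- Run at the Distributor.

-- Distributor and distribution database details,
-- including the transaction and history retention periods.
EXEC sp_helpdistributor;
EXEC sp_helpdistributiondb;

-- The distribution database and the repl_distributor linked server.
SELECT name FROM sys.databases WHERE is_distributor = 1;
SELECT name FROM sys.servers   WHERE is_distributor = 1;

-- The login created during configuration.
SELECT name
FROM sys.server_principals
WHERE name = N'distributor_admin';

-- Replication-related SQL Server Agent jobs and their categories.
SELECT j.name AS job_name,
       c.name AS category
FROM msdb.dbo.sysjobs       AS j
JOIN msdb.dbo.syscategories AS c
     ON c.category_id = j.category_id
WHERE c.name LIKE N'REPL%'
ORDER BY c.name, j.name;
```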
Step 2: Creating Snapshot Publication
- Go to publisher (Server1) -> Replication -> Right Click on Local Publications -> New publication.
- Next
- Select second option -> Click on Add -> Select Distributor instance (Server2)
- Connect -> Next
- Enter the password of the Distributor_admin login that we specified while configuring the distributor and publisher.
- Next
- Select required database. For example, SSISDb
- Next
- Select “Snapshot Publication” -> Next
- Select required tables -> Next
- Next -> Next
- Select the check box to create snapshot as follows
- Next
- Click on security settings
- Select as follows
- OK
- Next
- Next -> Next
- Enter the publication name (SSISDBSP in this example)
- Finish
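For reference, a snapshot publication like the one created by the wizard can be sketched in T-SQL as follows. The database SSISDb and publication SSISDBSP come from this walkthrough; the table dbo.MyTable is a hypothetical article, and the Snapshot Agent is assumed to run under the SQL Server Agent service account.

```sql
-- Run at the Publisher (Server1).
-- Assumption: dbo.MyTable is a hypothetical table chosen as the article.
USE SSISDb;
GO

-- Enable the database for publishing.
EXEC sp_replicationdboption
    @dbname  = N'SSISDb',
    @optname = N'publish',
    @value   = N'true';

-- Create the snapshot publication.
EXEC sp_addpublication
    @publication = N'SSISDBSP',
    @repl_freq   = N'snapshot',
    @status      = N'active';

-- Create the Snapshot Agent job for the publication
-- (no job login given: it runs under the SQL Server Agent service account).
EXEC sp_addpublication_snapshot
    @publication = N'SSISDBSP';

-- Add one table as an article.
EXEC sp_addarticle
    @publication   = N'SSISDBSP',
    @article       = N'MyTable',
    @source_owner  = N'dbo',
    @source_object = N'MyTable';

-- Generate the initial snapshot now.
EXEC sp_startpublication_snapshot
    @publication = N'SSISDBSP';
```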
Observations
- Go to publisher -> Replication -> Local publications -> Find that the new publication has been created
- To check whether the snapshot was created -> Right click on the publication (SSISDBSP) -> View Snapshot Agent Status
- Go to the repldata folder (the snapshot folder)
- Go to the sub folders and find the snapshot files (.bcp, .sch, .idx, .trg)
- Go to distributor -> SQL Server Agent -> Jobs -> Find that the Snapshot Agent job has been created
FAQ: How to display the names of databases that contain publications or subscriptions?
Ans: Go to publisher -> open a new query window -> run the following query:
select name from sys.databases where is_published=1 or is_subscribed=1
Creating Subscription
- Go to publisher -> Replication -> Local Publications -> Right Click on SSISDBSP -> New Subscription
- Next
- Select the publication name: SSISDBSP
- Next
- Select Push subscriptions
- Next
- Add Subscriber -> Select third instance (Server1\test) -> Connect
- Next
- Under Subscription Database, if no database with the same name exists -> Select New database -> Enter Database Name -> OK -> Next
- Click on the browse button (…) on the Distribution Agent Security page.
- Select the “Run under Agent Service Account” and “By impersonating the process account” options, because the distributor and the subscriber use the same service account. If the subscriber’s service account is different, create a login with sysadmin privileges on the subscriber and enter that login’s details instead.
- Next
- Under Agent Schedule -> Select “Run Continuously”
- Under Initialize When -> Select Immediately
- Next -> Next -> Finish
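The push subscription configured above could be scripted roughly as follows. This sketch assumes the subscription database SSISDb already exists on Server1\test (the wizard creates it for you, a script does not) and that the Distribution Agent connects with Windows authentication.

```sql
-- Run at the Publisher (Server1), in the publication database.
-- Assumption: the destination database already exists at the Subscriber.
USE SSISDb;
GO

-- Register the push subscription.
EXEC sp_addsubscription
    @publication       = N'SSISDBSP',
    @subscriber        = N'Server1\test',
    @destination_db    = N'SSISDb',
    @subscription_type = N'push',
    @sync_type         = N'automatic';   -- initialize from the snapshot

-- Create the Distribution Agent job at the Distributor.
EXEC sp_addpushsubscription_agent
    @publication              = N'SSISDBSP',
    @subscriber               = N'Server1\test',
    @subscriber_db            = N'SSISDb',
    @subscriber_security_mode = 1,        -- Windows authentication
    @frequency_type           = 64;       -- run continuously
```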
Observations
- Go to subscriber -> SSISDB -> Tables -> Find that the two tables have been created
- Go to distributor -> SQL Server Agent -> Find that a new job related to the Distribution Agent has been created
Verifying Replication
- Go to the publisher and make some changes in any table included in the publication.
- Go to the distributor and run the Snapshot Agent job.
- Go to the subscriber and observe the changes in the respective table.
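The verification steps above can also be driven from T-SQL. This is only a sketch: dbo.MyTable stands in for whichever published table you changed, and the row count is a coarse check rather than a full data comparison.

```sql
-- Step 1: run at the Publisher, in the publication database.
-- Starts the Snapshot Agent job so that a fresh snapshot is generated.
USE SSISDb;
GO
EXEC sp_startpublication_snapshot
    @publication = N'SSISDBSP';
GO

-- Step 2: run the same query at the Publisher and at the Subscriber
-- once the Distribution Agent has applied the snapshot.
-- Assumption: dbo.MyTable is a hypothetical published table.
SELECT COUNT(*) AS row_count
FROM dbo.MyTable;
```

If the two counts differ for longer than expected, the agent history tables in the distribution database are usually the first place to look.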
FAQ: How many articles can a snapshot publication contain?
Ans: 32767
FAQ: What is the maximum number of columns in a table published for snapshot replication?
Ans: 1000
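To see how close a publication is to these limits, the article and column counts can be queried at the publisher. This sketch assumes the SSISDb publication database from this walkthrough and uses dbo.MyTable as a hypothetical published table.

```sql
-- Run at the Publisher, in the publication database.
USE SSISDb;
GO

-- Number of articles in each snapshot or transactional publication.
SELECT p.name   AS publication,
       COUNT(*) AS article_count
FROM dbo.syspublications AS p
JOIN dbo.sysarticles     AS a
     ON a.pubid = p.pubid
GROUP BY p.name;

-- Number of columns in one published table.
-- Assumption: dbo.MyTable is a hypothetical table name.
SELECT COUNT(*) AS column_count
FROM sys.columns
WHERE object_id = OBJECT_ID(N'dbo.MyTable');
```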