Hello All, We are going to start new batch from next week. message/call or mail us for more details.

30 March 2015

SQL Server :: How Transactional Replication Works

Transactional replication is implemented by the SQL Server Snapshot Agent, Log Reader Agent, and Distribution Agent. The Snapshot Agent prepares snapshot files containing schema and data of published tables and database objects, stores the files in the snapshot folder, and records synchronization jobs in the distribution database on the Distributor.
The Log Reader Agent monitors the transaction log of each database configured for transactional replication and copies the transactions marked for replication from the transaction log into the distribution database, which acts as a reliable store-and-forward queue. The Distribution Agent copies the initial snapshot files from the snapshot folder and the transactions held in the distribution database tables to Subscribers.
Incremental changes made at the Publisher flow to Subscribers according to the schedule of the Distribution Agent, which can run continuously for minimal latency, or at scheduled intervals. Because changes to the data must be made at the Publisher (when transactional replication is used without immediate updating or queued updating options), update conflicts are avoided. Ultimately, all Subscribers will achieve the same values as the Publisher. If immediate updating or queued updating options are used with transactional replication, updates can be made at the Subscriber, and with queued updating, conflicts might occur. 
The following illustration shows the principal components of transactional replication.

Transactional replication components and data flow

Initial Dataset

Before a new transactional replication Subscriber can receive incremental changes from a Publisher, the Subscriber must contain tables with the same schema and data as the tables at the Publisher. The initial dataset is typically a snapshot that is created by the Snapshot Agent and distributed and applied by the Distribution Agent. The initial dataset can also be supplied through a backup or other means, such as SQL Server Integration Services. 
When snapshots are distributed and applied to Subscribers, only those Subscribers waiting for initial snapshots are affected. Other Subscribers to that publication (those that have already been initialized) are unaffected.

Concurrent Snapshot Processing

Snapshot replication places shared locks on all tables published as part of replication for the duration of snapshot generation. This can prevent updates from being made on the publishing tables. Concurrent snapshot processing, the default with transactional replication, does not hold the share locks in place during the entire snapshot generation, which allows users to continue working uninterrupted while replication creates initial snapshot files.

Snapshot Agent

The procedures by which the Snapshot Agent implements the initial snapshot in transactional replication are the same procedures used in snapshot replication (except as outlined above with regard to concurrent snapshot processing). 
After the snapshot files have been generated, you can view them in the snapshot folder using Microsoft Windows Explorer.

Modifying Data and the Log Reader Agent

The Log Reader Agent runs at the Distributor; it typically runs continuously, but can also run according to a schedule you establish. When executing, the Log Reader Agent first reads the publication transaction log (the same database log used for transaction tracking and recovery during regular SQL Server Database Engine operations) and identifies any INSERT, UPDATE, and DELETE statements, or other modifications made to the data in transactions that have been marked for replication. Next, the agent copies those transactions in batches to the distribution database at the Distributor. The Log Reader Agent uses the internal stored procedure sp_replcmds to get the next set of commands marked for replication from the log. The distribution database then becomes the store-and-forward queue from which changes are sent to Subscribers. Only committed transactions are sent to the distribution database.
After the entire batch of transactions has been written successfully to the distribution database, it is committed. Following the commit of each batch of commands to the Distributor, the Log Reader Agent calls sp_repldone to mark where replication was last completed. Finally, the agent marks the rows in the transaction log that are ready to be purged. Rows still waiting to be replicated are not purged.
Transaction commands are stored in the distribution database until they are propagated to all Subscribers or until the maximum distribution retention period has been reached. Subscribers receive transactions in the same order in which they were applied at the Publisher.

Distribution Agent

The Distribution Agent runs at the Distributor for push subscriptions and at the Subscriber for pull subscriptions. The agent moves transactions from the distribution database to the Subscriber. If a subscription is marked for validation, the Distribution Agent also checks whether data at the Publisher and Subscriber match. 

SQL Server :: How Snapshot Replication Works

By default, all three types of replication use a snapshot to initialize Subscribers. The SQL Server Snapshot Agent always generates the snapshot files, but the agent that delivers the files differs depending on the type of replication being used. Snapshot replication and transnational replication use the Distribution Agent to deliver the files, whereas merge replication uses the SQL Server Merge Agent. The Snapshot Agent runs at the Distributor. The Distribution Agent and Merge Agent run at the Distributor for push subscriptions, or at Subscribers for pull subscriptions. 
Snapshots can be generated and applied either immediately after the subscription is created or according to a schedule set at the time the publication is created. The Snapshot Agent prepares snapshot files containing the schema and data of published tables and database objects, stores the files in the snapshot folder for the Publisher, and records tracking information in the distribution database on the Distributor. You specify a default snapshot folder when you configure a Distributor, but you can specify an alternate location for a publication instead of or in addition to the default. 
The following illustration shows the principal components of snapshot replication.
Snapshot replication components and data flow


Snapshot Agent


For merge replication, a snapshot is generated every time the Snapshot Agent runs. For transactional replication, snapshot generation depends on the setting of the publication property immediate_sync. If the property is set to TRUE (the default when using the New Publication Wizard), a snapshot is generated every time the Snapshot Agent runs, and it can be applied to a Subscriber at any time. If the property is set to FALSE (the default when using sp_addpublication), the snapshot is generated only if a new subscription has been added since the last Snapshot Agent run; Subscribers must wait for the Snapshot Agent to complete before they can synchronize.
The Snapshot Agent performs the following steps:
  1. Establishes a connection from the Distributor to the Publisher, and then takes locks on published tables if necessary:
    • For merge publications, the Snapshot Agent does not take any locks.
    • For transactional publications, by default the Snapshot Agent take locks only during the initial phase of snapshot generation.
    • For snapshot publications, locks are held during the entire snapshot generation process.
  2. Writes a copy of the table schema for each article to a .sch file. If other database objects are published, such as indexes, constraints, stored procedures, views, user-defined functions, and so on, additional script files are generated.
  3. Copies the data from the published table at the Publisher and writes the data to the snapshot folder. The snapshot is generated as a set of bulk copy program (BCP) files.
  4. For snapshot and transactional publications, the Snapshot Agent appends rows to the MSrepl_commands and MSrepl_transactions tables in the distribution database. The entries in the MSrepl_commands table are commands indicating the location of .sch and .bcp files, any other snapshot files, and references to any pre- or post-snapshot scripts. The entries in the MSrepl_transactions table are commands relevant to synchronizing the Subscriber.
  5. Releases any locks on published tables.
During snapshot generation, you cannot make schema changes on published tables. After the snapshot files are generated, you can view them in the snapshot folder using Windows Explorer.

Distribution Agent and Merge Agent


For snapshot publications, each time the Distribution Agent runs for the publication, it moves a new snapshot to each Subscriber that has not yet been synchronized, has been marked for reinitialization, or includes new articles.
For snapshot and transactional replication, the Distribution Agent performs the following steps:
  1. Establishes a connection to the Distributor.
  2. Examines the MSrepl_commands and MSrepl_transactions tables in the distribution database on the Distributor. The agent reads the location of the snapshot files from the first table and Subscriber synchronization commands from both tables.
  3. Applies the schema and commands to the subscription database.
For an unfiltered merge replication publication, the Merge Agent performs the following steps:
  1. Establishes a connection to the Publisher.
  2. Examines the sysmergeschemachange table on the Publisher and determines whether there is a new snapshot that should be applied at the Subscriber.
  3. If a new snapshot is available, the Merge Agent applies to the subscription database the snapshot files from the location specified in sysmergeschemachange.

SQL Server Replication Step by Step

Introduction

Earlier, most of the applications were using standalone environment where a single centralized server was responding to multiple users, working in different locations.

Centralized Approach and Problems

  • Performance problems
  • Availability problems
  • Maintenance problems
To overcome all the above problems, we can use replication solution.
Replication allows to maintain same database multiple copies at different locations. Log shipping and mirroring allows to maintain complete database redundancy whereas replication allows to maintain some part of the database (a set of required objects) at users location. Changes made at different user locations are synchronized to the main server. It is object level high availability feature. According to Books Online:
Replication is a set of technologies for copying and distributing data and database objects from one database to another and then synchronizing between databases to maintain consistency.
Unlike other methods of high availability, it doesn’t distribute entire database, but only distributes some part of database like tables or views.

Advantages

  • Improved performance
  • To reduce locking conflicts when multiple users are working
  • Improved availability
  • Easy maintenance
  • To allow sites work independently. So that each location can set up its own rules and procedures for working with its copy of the data.
  • To move data closer to the user

SQL Server 2005 Features

  • Restartable Snapshots
  • Oracle Publishing
  • Replicating all DDLs
  • Merge Replication allows to introduce custom business logic into the synchronization process
  • Merge Replication provides the ability to replicate data over HTTP with web synchronization option
  • Updatable Transactional Subscriptions can now handle updates to large data types at Subscribers

SQL Server 2008 Features

  • In SQL Server 2005, replication had to be stopped in order to perform some actions like adding nodes, making schema changes, etc. But in 2008, these can be done online.
  • Conflict detection capacity in peer-to-peer replication.
  • All types of conflicts are detected and reported through agent error log or conflicts table.
  • In SQL Server 2005, switch partition is unsupported, but in 2008 it supports.
    • @allow_partition_switch
    • @replicate_partition_switch
  • Performance improvements, under Windows 2008
    • Snapshot delivery of more than 500MB/minute
    • Time to deliver 100000 varbinary(max) records in less than 2minutes where in 2005 223 minutes.

SQL Server 2012 Features

  • Updatable subscriptions with transactional publications are discontinued.
  • Four new stored procedures provide replication support for AlwaysOn.
    • sp_get_redirected_publisher
    • sp_redirect_publisher
    • sp_validate_replica_hosts_as_publishers
    • sp_validate_redirected_publisher
  • Replication supports the following features on Availability groups:
    • A publication database can be part of an availability group. The publisher instances must share a common distributor.
    • In an AlwaysOn Availability Group, an AlwaysOn secondary cannot be a publisher. Republishing is not supported when replication is combined with AlwaysOn.
    • Heterogeneous replication to non-SQL Server subscribers is deprecated. To move data, create solutions using change data capture and SSIS.
    • Oracle Publishing is deprecated.

Replication Architecture


REPLICATION ENTITIES

SQL Server replication is based on the “Publish and Subscribe” metaphor. Let us look at each of the individual components in detail.
  1. Publisher
    • It is a source database where replication starts. It makes data available for replication.
    • Publishers define what they publish through a publication.
  2. Article
    • Articles are the actual database objects included in replication like tables, views, indexes, etc.
    • An article can be filtered when sent to the subscriber.
  3. Publication
    • A group of articles is called publication.
    • An article can’t be distributed individually. Hence publication is required.
  4. Distributor
    • It is intermediary between publisher and subscriber.
    • It receives published transactions or snapshots and then stores and forwards these publications to the subscriber.
    • It has 6 system databases including distribution.
  5. Subscriber
    • It is the destination database where replication ends.
    • It can subscribe to multiple publications from multiple publishers.
    • It can send data back to publisher or publish data to other subscribers.
  6. Subscription
    • It is a request by a subscriber to receive a publication.
    • We have two types of subscriptions - push and pull.
  7. Push Subscriptions
    • With this subscription, the publisher is responsible for updating all the changes to the subscriber without the subscriber asking those changes.
    • Push subscriptions are created at the Publisher server
  8. Pull Subscriptions -
    • With this subscription the subscriber initiates the replication instead of the publisher.
    • The subscriptions are created at the Subscriber server.

REPLICATION AGENTS

  • We have discussed that replication process works in the background with the help of jobs.
  • These jobs are also called as agents. These jobs internally uses respective .exe files present in …………….. \110\COM folder.
  • All the agents information is present in Distribution db in the following tables.
    • dbo.MSxxx_agents
    • dbo.MSxxx_history

Snapshot Agent

  • It is an executable file that prepares snapshot files containing schema and data of published tables and db objects.
  • It stores the files in the snapshot folder, and records synchronization jobs in the distribution database.

Distribution Agent

  • It is used with snapshot and transactional replication.
  • It applies the initial snapshot to the Subscriber and moves transactions held in the Distribution db to Subscribers.
  • It runs at either the Distributor for push subscriptions or at the Subscriber for pull subscriptions.

Log Reader Agent

  • It is used with transactional replication, which moves transactions marked for replication from the transaction log on the publisher to the distribution db.
  • Each db has its own Log Reader Agent that runs on the Distributor and connects to the Publisher.

Merge Agent

  • It is used with merge replication.
  • It applies the initial snapshot to the Subscriber and moves incremental data changes that occur.
  • Each merge subscription has its own Merge Agent that connects to both the Publisher and the Subscriber and updates both.
  • It captures changes using triggers.

Queue Reader Agent

  • It is used with transactional replication with the queued updating option.
  • It runs at the Distributor and moves changes made at the Subscriber back to the Publisher.
  • Unlike Distribution Agent and Merge Agent, only one instance of the Queue Reader Agent exists to service all Publishers and publications for a given distribution db.

REPLICATION TYPES

  • Snapshot Replication
  • Transactional Replication
  • Merge Replication

1. Snapshot Replication

  • The snapshot process is commonly used to provide the initial set of data and database objects for transactional and merge publications.
  • It copies and distributes data and database objects exactly as they appear at the current moment of time.
  • Snapshot replication is used to provide the initial data set for transactional and merge replication.
  • It can also be used when complete refreshes of data are appropriate (BOL).
  • Scenarios
  • When the data is not changing frequently.
  • If we want to replicate small amount of data.
  • To replicate Look-up tables which are not changing frequently.
  • It is acceptable to have copies of data that are out of date with respect to the publisher for a period of time
For example, if a sales organization maintains a product price list and the prices are all updated at the same time once or twice each year, replicating the entire snapshot of data after it has changed is recommended.
Snapshot Replication Architecture

Source: BOL

How it Works?

  • Snapshot Agent establish a connection from distributor to publisher and generates fresh snapshot into snapshot folder by placing locks.
  • Snapshot agent writes copy of the table schema for each article to .sch file.
  • Copies data from published table at the Publisher and writes data to the snapshot folder in the form of.bcp file.
  • Appends rows to the Msrepl_commands and Msrepl_transactions.
  • Releases any locks on published tables.

Configuring Replication

  1. Configuring distributor
  2. Configuring publisher
  3. Creating publication of required type
  4. Creating subscription(s)
Step 1: Configuring distributor and publisher
  1. Take three instances
  2. Go to second instance -> Right click on Replication -> Configure Distribution…
  3. Next -> Select ‘SERVER2’ will act as its own distributor;
  4. Next
  5. Next
  6. Next
  7. Uncheck the check box present at Server2 -> Add
  8. Select instance Server1
  9. Next
  10. Enter strong password. (Automatically one login is created in distributor with the name Distributor_Admin)
  11. Next
  12. Next
  13. Finish
Observations
  1. Go to distributor -> Databases -> Find the new database “Distribution
  2. Go to Security -> Logins -> Find a new login “Distributor_admin
  3. Go to Server Objects -> Linked servers -> Find new linked server “repl_distributor
  4. Right Click on Replication -> Select distributor Properties


    • Transactions stored in distribution database are removed after 72 hrs and agents history is removed after 48 hrs.
    • To view snapshot folder path -> Click on publishers -> click on browse button (…) present to right side of publisher name.
    • Go to SQL Server Agent -> Jobs -> Find 6 new jobs are created automatically.
Step 2: Creating Snapshot Publication
  1. Go to publisher (Server1) -> Replication -> Right Click on Local Publications -> New publication.
  2. Next
  3. Select second option -> Click on Add -> Select Distributor instance (Server2)
  4. Connect ? Next
  5. Enter password of Distributor_admin login which we have mentioned while configuring publisher.
  6. Next
  7. Select required database. For example SSISDb
  8. Next
  9. Select “Snapshot Publication” -> Next
  10. Select required tables -> Next
  11. Next -> Next
  12. Select the check box to create snapshot as follows
  13. Next
  14. Click on security settings
  15. Select as follows
  16. OK
  17. Next
  18. Next -> Next
  19. Enter publication name as follows
  20. Finish
Observations
  1. Go to publisher -> Replication -> Local publications -> Find new publication is created
  2. To check snapshot was created or not -> Right click on the publication (SSISDBSP) -> View Snapshot Agent Status
  3. Go to repldata folder as follows:
  4. Go to sub folders find the snapshot files (.bcp, .sch, idx, .trg)
  5. Go to distributor -> SQL Server Agent -> Jobs -> Find snapshot agent job was created
FAQ: How to display database names which consists of publications?
Ans: Go to publisher -> take new query ->
select name from sys.databases where is_published=1 or is_subscribed=1

Creating Subscription

  1. Go to publisher -> Replication -> Local Publications -> Right Click on SSISDBSP -> New Subscription
  2. Next
  3. Select the publication name: SSISDBSP
  4. Next
  5. Select Push subscriptions
  6. Next
  7. Add Subscriber -> Select third instance (Server1\test) -> Connect
  8. Next
  9. Under Subscription Database if there is no database exists with same name -> Select New database -> Enter Database Name -> OK -> Next
  10. Click on browse button (…) under distribution agent security page.
  11. Select “Run under Agent Service Account” and “By impersonating the process account” options as both distributor and subscriber’s service accounts are same. If the service account of subscriber is different, then create a login in subscriber with sysadmin privileges then mention that login details.
  12. Next
  13. Under Agent Schedule -> Select “Run Continuously”
  14. Under Initialize when select -> Immediately
  15. Next -> Next -> Finish
Observations
  1. Go to subscriber -> SSISDB -> Tables -> Find two tables are created
  2. Go to distributor -> SQL Server Agent -> Find new job is created, related to Distribution Agent

Verifying Replication

  1. Go to publisher perform some changes in any table present in publication
  2. Go to distributor run Snapshot Agent Job
  3. Go to subscriber observe the changes in the respective table
FAQ: How many articles may be there in a snapshot publication?
32767
FAQ: Max columns in a table?
1000