MySQL Replication Troubleshooting for Oracle DBAs September, 22, 2016 Sveta Smirnova
•MySQL Replicaiton: must know •Troubleshooting tools •Typical MySQL replication issues Table of Contents 2
MySQL Replicaiton: must know
• How it works • Slave initiates replication • It requests a packet from the master • Master sends the packet • Master does not check if slave received the packet Asynchronous 4
• How it works • There is no conflict resolution • You can shoot yourself in the leg with option –slave skip errors • Global • No table-specific setup Asynchronous 4
• How it works • There is no conflict resolution • Master does not know if slave is up to date • Data inconsistency may happen and stay unnoticed Asynchronous 4
• MySQL Replication either ON or OFF • No ”once-per-hour” synchronizations • Delayed replication just applies updates specified number of seconds later • You must stop replication if you want to stop applying updates Timing 5
• No remotely accessible MATERIALIZED VIEWs • Federated tables are the nearest to them • They do not store any data • These are just clients • Old, but not stable! Whole data with filters 6
• No remotely accessible MATERIALIZED VIEWs • Filters on master • binlog-do-db=db • binlog-ignore-db=db • db must be default database! Whole data with filters 6
• No remotely accessible MATERIALIZED VIEWs • Filters on master • Filters on slave • replicate-do | ignore-db • replicate-do | ignore-table • replicate-wild-do | ignore-table • Handling depends on the binary log format! • IGNORE SERVER IDS Whole data with filters 6
• How it works • After slave receives packet it sends ”Ack” • Master waits for only rpl semi sync master wait for slave count ”Ack” answers This option was introduced in 5.7 Before it was always one slave Semisynchronous 7
• If there are more than rpl semi sync master wait for slave count slaves master will not wait ”Ack” from all of them • Master does not know if event applied by SQL thread successfully or not • If timeout occurs without any slave acknowledged event replication fails back to asynchronous Semisynchronous 7
• Master maintains binary (update) log files • Write happens two times: into engine files and into binary log file • Extra IO activity: any event written on disk twice • You can write on slave Logical 8
• IO thread • Reads events from the master • Stores received events in the relay log file • SQL thread • Reads events from the relay log • Executes them Two types of slave threads 9
• Since 5.6 you can specify maximum number of parallel SQL threads • Still single IO thread • Performance is better than if one SQL thread used, but can be lower than master’s • Error during one thread execution stops all of them Multiple SQL threads 10
• Since 5.7 you can replicate from multiple masters • How it works • Multiple masters store data in binary logs • Slave creates replication channel for each of masters Multiple sets of relay logs Multiple IO and SQL threads Error stops replication from the affecting master only slave parallel workers for every channel Filters for all channels Multiple Masters 11
• A is master for B • B is master for C • C is master for A • Number of servers in circle is unlimited • This is still asynchronous replication! Circular replication 12
• MySQL Cluster • Galera/Percona XtraDB Cluster • Group Replication Plugin • Separate products! Synchronous replication 13
• When start replication you need to specify • Master’s binary log file name • Position in the log file • Skip events by moving position forward • Move position backward: duplicate transactions • No check if transaction already applied Position-based replication 14
• Each transaction receives its own unique id: GTID • You indicate AUTO POSITION=1 when point slave to the master • No need to care about binary log file names and positions • Easy failover GTID-based replicaiton 15
• Statement-based (SBR) • SQL statements stored in the binary log as is • Not-safe statements can lead to data inconsistency INSERT IGNORE LIMIT without ORDER BY Non-deterministic functions ... • Row-based (RBR) • Mixed • Triggers fired only for SBR events! Binary log format 16
• Statement-based (SBR) • Row-based (RBR) • Binary row as written to the table • Can log more data IO Network transfer cost binlog row image • Data consistency issues and slowdowns may happen for tables without PRIMARY KEY • Mixed • Triggers fired only for SBR events! Binary log format 16
• Statement-based (SBR) • Row-based (RBR) • Mixed • Combines advantages of both • Triggers fired only for SBR events! Binary log format 16
• Extremely easy • Make sure binary log is on: –log-bin=basename • Record log name and position: mysql> show master statusG *************************** 1. row *************************** File: mysqld-bin.000001 Position: 583 Binlog_Do_DB: Binlog_Ignore_DB: Executed_Gtid_Set: d60a1531-7491-11e6-a7f6-30b5c2208a0f:1-2 1 row in set (0,00 sec) Setup on master 17
• Extremely easy • GTID-based: mysql> change master to master_user=’repl’, master_host=’127.0.0.1’, master_port=13000, master_auto_position=1; Query OK, 0 rows affected, 1 warning (0,52 sec) • Position-based: mysql> change master to master_user=’root’, master_host=’127.0.0.1’, master_port=13000, master_log_file=’mysqld-bin.000001’, master_log_pos=583; Query OK, 0 rows affected, 1 warning (0,39 sec) • START SLAVE in both cases: mysql> start slave; Query OK, 0 rows affected (0,06 sec) Setup on slave 18
• Temporarily on mastermysql> set sql_log_bin=0; Query OK, 0 rows affected (0,00 sec) mysql> insert into t1 values(1); Query OK, 1 row affected (0,00 sec) mysql> set sql_log_bin=1; Query OK, 0 rows affected (0,00 sec) • Requires privilege SUPER • Temporarily on slave • STOP SLAVE • Permanently on slave • RESET MASTER Switch replication OFF 19
• Temporarily on master • Requires privilege SUPER • Temporarily on slave • Permanently on slave • RESET SLAVE Clears master and relay info repositories Deletes all relay log files Resets replication delay to 0 • RESET SLAVE ALL Additionally clears master connection options • RESET MASTER Switch replication OFF 19
• Temporarily on master • Requires privilege SUPER • Temporarily on slave • Permanently on slave • RESET MASTER • Deletes all binary logs • Empties binary log index • Creates new binary log file • Does not stop binary logging! Switch replication OFF 19
• REPLICATION SLAVE on master • REPLICATION CLIENT on master • SHOW MASTER STATUS • SHOW SLAVE STATUS • SHOW BINARY LOGS. • No replication roles • No DBMS REPUTIL.FROM REMOTE • SUPER to setup slave Access 20
• Always available, setup required • Asynchronous master-slave • Master• Keeps all updates in the binary log file Two formats: ROW and STATEMENT • Slave • IO thread reads updates from the master • SQL thread executes updates Multiple SQL threads since 5.6 Multiple masters since 5.7 • GTIDs since 5.6 Replication essentials: summary 21
Troubleshooting tools
• Error log file • Slave • SHOW SLAVE STATUS • Tables in Performance Schema • Tables in mysql database • Master • SHOW MASTER STATUS • SHOW BINLOG EVENTS • mysqlbinlog • mysql command line client Main troubleshooting tools 23
• Slave start 2016-09-07T11:00:27.004560Z 2 [Note] ’CHANGE MASTER TO FOR CHANNEL ’’ executed’. Previous state master_host=’’, master_port= 3306, master_log_file=’’, master_log_pos= 4, master_bind=’’. New state master_host=’127.0.0.1’, master_port= 13000, master_log_file=’’, master_log_pos= 4, master_bind=’’. 2016-09-07T11:00:29.787214Z 3 [Warning] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the ’START SLAVE Syntax’ in the MySQL Manual for more information. 2016-09-07T11:00:29.787871Z 3 [Note] Slave I/O thread for channel ’’: connected to master ’root@127.0.0.1:13000’,replication started in log ’FIRST’ at position 4 2016-09-07T11:00:29.788005Z 4 [Note] Slave SQL thread for channel ’’ initialized, starting replication in log ’FIRST’ at position 0, relay log ’./Thinkie-relay-bin.000001’ position: 4 • Errors • Slave shutdown Error log file 24
• Slave start • Errors 2016-08-23T12:11:21.867440Z 4 [ERROR] Slave SQL for channel ’master-1’: Could not execute Update_rows event on table m2.t1; Can’t find record in ’t1’, Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event’s master log master-bin.000001, end_log_pos 1213, Error_code: 1032 2016-08-23T12:11:21.867471Z 4 [Warning] Slave: Can’t find record in ’t1’ Error_code: 1032 2016-08-23T12:11:21.867484Z 4 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log ’master-bin.000001’ position 989 • Slave shutdown Error log file 24
• Slave start • Errors • Slave shutdown 2016-09-07T11:03:34.703008Z 4 [Note] Error reading relay log event for channel ’’: slave SQL thread was killed 2016-09-07T11:03:34.704772Z 3 [Note] Slave I/O thread killed while reading event for channel ’’ 2016-09-07T11:03:34.704828Z 3 [Note] Slave I/O thread exiting for channel ’’, read up to log ’mysqld-bin.000001’, position 154 Error log file 24
Everything about running slave • IO thread configuration • SQL thread configuration • IO thread state • SQL thread state • Error information Only last error shown All are written into error log file SHOW SLAVE STATUS 25
mysql> show slave status G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 127.0.0.1 Master_User: root Master_Port: 13000 Connect_Retry: 60 Master_Log_File: master-bin.000002 Read_Master_Log_Pos: 63810611 Relay_Log_File: slave-relay-bin-master@002d1.000004 Relay_Log_Pos: 1156 Relay_Master_Log_File: master-bin.000001 Slave_IO_Running: Yes Slave_SQL_Running: No Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: SHOW SLAVE STATUS 25
Replicate_Wild_Ignore_Table: Last_Errno: 1032 Last_Error: Could not execute Update_rows event on table m2.t1; Can’t find record in ’t1’, Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event’s master log master-bin.000001, end_log_pos 1213 Skip_Counter: 0 Exec_Master_Log_Pos: 989 Relay_Log_Space: 63814652 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: NULL Master_SSL_Verify_Server_Cert: No SHOW SLAVE STATUS 25
Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 1032 Last_SQL_Error: Could not execute Update_rows event on table m2.t1; Can’t find record in ’t1’, Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event’s master log master-bin.000001, end_log_pos 1213 Replicate_Ignore_Server_Ids: Master_Server_Id: 1 Master_UUID: d08c509e-6857-11e6-8872-30b5c2208a0f Master_Info_File: mysql.slave_master_info SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Master_Retry_Count: 10 Master_Bind: Last_IO_Error_Timestamp: Last_SQL_Error_Timestamp: 160823 15:11:21 SHOW SLAVE STATUS 25
Master_SSL_Crl: Master_SSL_Crlpath: Retrieved_Gtid_Set: Executed_Gtid_Set: Auto_Position: 0 Replicate_Rewrite_DB: Channel_Name: master-1 Master_TLS_Version: SHOW SLAVE STATUS 25
• No need to parse SHOW output • Configuration • IO thread • SQL thread Tables in Performance Schema 26
• No need to parse SHOW output • Configuration • replication connection configuration • replication applier configuration • IO thread • SQL thread Tables in Performance Schema 26
• No need to parse SHOW output • Configuration • IO thread • replication connection status • SQL thread Tables in Performance Schema 26
• No need to parse SHOW output • Configuration • IO thread • SQL thread • replication applier status • replication applier status by coordinator - MTS only • replication applier status by worker Tables in Performance Schema 26
mysql> select * from replication_connection_configuration -> join replication_applier_configuration using(channel_name)G *************************** 1. row *************************** CHANNEL_NAME: HOST: 127.0.0.1 PORT: 13000 USER: root NETWORK_INTERFACE: AUTO_POSITION: 1 SSL_ALLOWED: NO SSL_CA_FILE: ... Performance Schema: Configuration 27
... CONNECTION_RETRY_INTERVAL: 60 CONNECTION_RETRY_COUNT: 10 HEARTBEAT_INTERVAL: 60.000 CHANNEL_NAME: DESIRED_DELAY: 0 1 row in set (0.00 sec) Performance Schema: Configuration 27
mysql> select * from replication_connection_statusG *************************** 1. row *************************** CHANNEL_NAME: GROUP_NAME: SOURCE_UUID: d0753e78-14ec-11e5-b3fb-28b2bd7442fd THREAD_ID: 21 SERVICE_STATE: ON COUNT_RECEIVED_HEARTBEATS: 17 LAST_HEARTBEAT_TIMESTAMP: 2015-06-17 15:49:08 RECEIVED_TRANSACTION_SET: LAST_ERROR_NUMBER: 0 LAST_ERROR_MESSAGE: LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00 1 row in set (0.00 sec) Performance Schema: State of IO Thread 28
• Coordinator thread for multiple workers mysql> select * from replication_applier_status join -> replication_applier_status_by_coordinator using(channel_name)G *************************** 1. row *************************** CHANNEL_NAME: SERVICE_STATE: ON REMAINING_DELAY: NULL COUNT_TRANSACTIONS_RETRIES: 0 THREAD_ID: 22 SERVICE_STATE: ON LAST_ERROR_NUMBER: 0 LAST_ERROR_MESSAGE: LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00 1 row in set (0.00 sec) • Other cases Performance Schema: State of SQL Thread 29
• Coordinator thread for multiple workers • Other casesmysql> select * from replication_applier_status join -> replication_applier_status_by_worker using(channel_name)G *************************** 1. row *************************** CHANNEL_NAME: master-1 SERVICE_STATE: OFF REMAINING_DELAY: NULL COUNT_TRANSACTIONS_RETRIES: 0 WORKER_ID: 0 THREAD_ID: NULL SERVICE_STATE: OFF LAST_SEEN_TRANSACTION: ANONYMOUS LAST_ERROR_NUMBER: 1032 LAST_ERROR_MESSAGE: Could not execute Update_rows event on table m2.t1; Can’t find record in ’t1’, Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event’s master log master-bin.000001, end_log_pos 1213 LAST_ERROR_TIMESTAMP: 2016-08-23 15:48:20 Performance Schema: State of SQL Thread 29
• Coordinator thread for multiple workers • Other cases *************************** 2. row *************************** CHANNEL_NAME: master-2 SERVICE_STATE: ON REMAINING_DELAY: NULL COUNT_TRANSACTIONS_RETRIES: 0 WORKER_ID: 0 THREAD_ID: 42 SERVICE_STATE: ON LAST_SEEN_TRANSACTION: ANONYMOUS LAST_ERROR_NUMBER: 0 LAST_ERROR_MESSAGE: LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00 2 rows in set (0,00 sec) Performance Schema: State of SQL Thread 29
• RECEIVED TRANSACTION SET in table replication connection status • LAST SEEN TRANSACTION in replication applier status by worker GTID diagnostics 30
• Single-threaded slave mysql> select cs.CHANNEL_NAME, cs.SOURCE_UUID, cs.RECEIVED_TRANSACTION_SET, -> asw.LAST_SEEN_TRANSACTION, aps.SERVICE_STATE from -> replication_connection_status cs join replication_applier_status_by_worker -> asw using(channel_name) join replication_applier_status aps -> using(channel_name) G *************************** 1. row *************************** CHANNEL_NAME: SOURCE_UUID: 9038967d-7164-11e6-8c88-30b5c2208a0f RECEIVED_TRANSACTION_SET: 9038967d-7164-11e6-8c88-30b5c2208a0f:1-2 LAST_SEEN_TRANSACTION: 9038967d-7164-11e6-8c88-30b5c2208a0f:2 SERVICE_STATE: ON 1 row in set (0,00 sec) • Multi-threaded GTID: all in one place 31
• Single-threaded slave • Multi-threaded *************************** 1. row *************************** THREAD_ID: 30 SERVICE_STATE: ON RECEIVED_TRANSACTION_SET: 9038967d-7164-11e6-8c88-30b5c2208a0f:1-3 LAST_SEEN_TRANSACTION: ... *************************** 8. row *************************** THREAD_ID: 37 SERVICE_STATE: ON RECEIVED_TRANSACTION_SET: 9038967d-7164-11e6-8c88-30b5c2208a0f:1-3 LAST_SEEN_TRANSACTION: 9038967d-7164-11e6-8c88-30b5c2208a0f:3 8 rows in set (0,00 sec) GTID: all in one place 31
• Master Infomysql> select * from slave_master_infoG *************************** 1. row *************************** Number_of_lines: 25 Master_log_name: mysqld-bin.000001 Master_log_pos: 154 Host: 127.0.0.1 User_name: root User_password: secret Port: 13000 Connect_retry: 60 Enabled_ssl: 0 ... Uuid: 31ed7c8f-74ea-11e6-8de8-30b5c2208a0f Retry_count: 86400 ... Enabled_auto_position: 1 ... mysql database: on the slave only 32
• Master Info • Relay log info mysql> select * from slave_relay_log_infoG *************************** 1. row *************************** Number_of_lines: 7 Relay_log_name: ./slave-relay-bin-master@002d1.000004 Relay_log_pos: 1156 Master_log_name: master-bin.000001 Master_log_pos: 989 Sql_delay: 0 Number_of_workers: 0 Id: 1 Channel_name: master-1 mysql database: on the slave only 32
• Master Info • Relay log info • Worker info: multi-threaded slavemysql> select * from slave_worker_infoG *************************** 1. row *************************** Id: 1 ... *************************** 8. row *************************** Id: 8 Relay_log_name: ./Thinkie-relay-bin.000004 Relay_log_pos: 1216 Master_log_name: mysqld-bin.000001 Master_log_pos: 1342 Checkpoint_relay_log_name: ./Thinkie-relay-bin.000004 Checkpoint_relay_log_pos: 963 Checkpoint_master_log_name: mysqld-bin.000001 Checkpoint_master_log_pos: 1089 mysql database: on the slave only 32
mysql> show master statusG *************************** 1. row *************************** File: mysqld-bin.000001 Position: 583 Binlog_Do_DB: Binlog_Ignore_DB: Executed_Gtid_Set: 31ed7c8f-74ea-11e6-8de8-30b5c2208a0f:1-2 1 row in set (0,00 sec) SHOW MASTER STATUS 33
mysql> show binlog events in ’master-bin.000001’ from 989; +-------------------+------+----------------+-----------+-------------+-------------------------------- | Log_name | Pos | Event_type | Server_id | End_log_pos | Info +-------------------+------+----------------+-----------+-------------+-------------------------------- | master-bin.000001 | 989 | Anonymous_Gtid | 1 | 1054 | SET @@SESSION.GTID_NEXT= ... | master-bin.000001 | 1054 | Query | 1 | 1124 | BEGIN | master-bin.000001 | 1124 | Table_map | 1 | 1167 | table_id: 109 (m2.t1) | master-bin.000001 | 1167 | Update_rows | 1 | 1213 | table_id: 109 flags: STMT_END_F | master-bin.000001 | 1213 | Xid | 1 | 1244 | COMMIT /* xid=64 */ +-------------------+------+----------------+-----------+-------------+-------------------------------- 5 rows in set (0,00 sec) SHOW BINLOG EVENTS 34
$ mysqlbinlog -v var/mysqld.1/data/master-bin.000001 --start-position=989 --stop-position=1213 /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/; /*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/; DELIMITER /*!*/; # at 4 #160822 14:01:44 server id 1 end_log_pos 123 CRC32 0xc6ab0154 Start: binlog v 4, server v 5.7.13-debug-log created 160822 14:01:44 at startup ROLLBACK/*!*/; BINLOG ’ mNu6Vw8BAAAAdwAAAHsAAAAAAAQANS43LjEzLWRlYnVnLWxvZwAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAACY27pXEzgNAAgAEgAEBAQEEgAAXwAEGggAAAAICAgCAAAACgoKKioAEjQA AVQBq8Y= ’/*!*/; # at 989 #160822 14:15:11 server id 1 end_log_pos 1054 CRC32 0xe388e040 Anonymous_GTID last_committed=4 sequence_number=5 SET @@SESSION.GTID_NEXT= ’ANONYMOUS’/*!*/; mysqlbinlog 35
# at 1054 #160822 14:15:11 server id 1 end_log_pos 1124 CRC32 0x82cb7a14 Query thread_id=9 exec_time=0 error_code=0 SET TIMESTAMP=1471864511/*!*/; SET @@session.pseudo_thread_id=9/*!*/; SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/; SET @@session.sql_mode=1436549152/*!*/; SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/; /*!C utf8 *//*!*/; SET @@session.character_set_client=33,@@session.collation_connection=33, @@session.collation_server=8/*!*/; SET @@session.lc_time_names=0/*!*/; SET @@session.collation_database=DEFAULT/*!*/; BEGIN /*!*/; # at 1124 #160822 14:15:11 server id 1 end_log_pos 1167 CRC32 0xa1ce6163 Table_map: ‘m2‘.‘t1‘ mapped to number 109 mysqlbinlog 35
# at 1167 #160822 14:15:11 server id 1 end_log_pos 1213 CRC32 0x1f346c6b Update_rows: table id 109 flags: STMT_END_F BINLOG ’ v966VxMBAAAAKwAAAI8EAAAAAG0AAAAAAAEAAm0yAAJ0MQABAwABY2HOoQ== v966Vx8BAAAALgAAAL0EAAAAAG0AAAAAAAEAAgAB///+BQAAAP4GAAAAa2w0Hw== ’/*!*/; ### UPDATE ‘m2‘.‘t1‘ ### WHERE ### @1=5 ### SET ### @1=6 ROLLBACK /* added by mysqlbinlog */ /*!*/; SET @@SESSION.GTID_NEXT= ’AUTOMATIC’ /* added by mysqlbinlog */ /*!*/; DELIMITER ; # End of log file /*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/; /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/; mysqlbinlog 35
• Percona Toolkit • pt-table-checksum Checks master and slave for consistency • pt-table-sync Fixes consistency issues • pt-slave-find Displays replication topology Troubleshooting Toolkits 36
• MySQL Utilities • mysqlrplcheck Checks replication pre-requisites • mysqlrplshow Displays replication topology • mysqlrplsync Checks master and slave for consistency • mysqlslavetrx Skips 1-N transactions Troubleshooting Toolkits 36
• MySQL Utilities • mysqldbcompare Compares two databases for changes • mysqldiff Compares object definitions • mysqlserverinfo Displays basic server options, such as port and datadir Replication-oriented Troubleshooting Toolkits 36
Typical MySQL replication issues
• Performance • More IO binlog row image = FULL | MINIMAL | NOBLOB Binlog stmt | cache disk use • Synchronization • Behavior • expire log days • READ COMMITTED and READ UNCOMMITTED not supported for SBR • Non-deterministic events with SBR Master issues 38
• Various connection-related issues • Check error log file • Connect with regular MySQL client as replication user • Performance issues solved same way as other client-server MySQL network issues Slave IO thread issues 39
• Simple master-slave • Data is different on master and slave Replication event can not be applied • Different errors on master and slave • Slave lags far behind the master • Circular replication or other writes in addition to slave SQL thread add complexity, but solved same way SQL thread issues 40
• Performance • Single relay log • slave parallel type=DATABASE | LOGICAL CLOCK • Wrong behavior • Same as with single-threaded slave, but mind additional concurrency • Error in one SQL thread stops them all Multi-threaded slave 41
• Replication should be configured for each channel • Mixing masters with GTID and without GTIDs is possible • Same issues as with traditional replication • Replication filters are same for all channels Multi-master 42
Summary
• Master issues • Same as for regular server • Additional IO and safety checks • Slave IO thread • Common network-related issues • Use mysql command line client to test • Slave SQL thread • Common SQL-related issues • Common engine-specific issues • Less concurrency than on master Summary 44
• MySQL Replication Troubleshooting webinar • MySQL High Availability book • MySQL Replication Team blog • Percona Webinars • Percona Toolkit • MySQL Utilities More information 45
??? Place for your questions 46
http://www.slideshare.net/SvetaSmirnova https://twitter.com/svetsmirnova Thank you! 47

MySQL Replication Troubleshooting for Oracle DBAs

  • 1.
    MySQL Replication Troubleshooting forOracle DBAs September, 22, 2016 Sveta Smirnova
  • 2.
    •MySQL Replicaiton: mustknow •Troubleshooting tools •Typical MySQL replication issues Table of Contents 2
  • 3.
  • 4.
    • How itworks • Slave initiates replication • It requests a packet from the master • Master sends the packet • Master does not check if slave received the packet Asynchronous 4
  • 5.
    • How itworks • There is no conflict resolution • You can shoot yourself in the leg with option –slave skip errors • Global • No table-specific setup Asynchronous 4
  • 6.
    • How itworks • There is no conflict resolution • Master does not know if slave is up to date • Data inconsistency may happen and stay unnoticed Asynchronous 4
  • 7.
    • MySQL Replicationeither ON or OFF • No ”once-per-hour” synchronizations • Delayed replication just applies updates specified number of seconds later • You must stop replication if you want to stop applying updates Timing 5
  • 8.
    • No remotely accessibleMATERIALIZED VIEWs • Federated tables are the nearest to them • They do not store any data • These are just clients • Old, but not stable! Whole data with filters 6
  • 9.
    • No remotelyaccessible MATERIALIZED VIEWs • Filters on master • binlog-do-db=db • binlog-ignore-db=db • db must be default database! Whole data with filters 6
  • 10.
    • No remotelyaccessible MATERIALIZED VIEWs • Filters on master • Filters on slave • replicate-do | ignore-db • replicate-do | ignore-table • replicate-wild-do | ignore-table • Handling depends on the binary log format! • IGNORE SERVER IDS Whole data with filters 6
  • 11.
    • How itworks • After slave receives packet it sends ”Ack” • Master waits for only rpl semi sync master wait for slave count ”Ack” answers This option was introduced in 5.7 Before it was always one slave Semisynchronous 7
  • 12.
    • If thereare more than rpl semi sync master wait for slave count slaves master will not wait ”Ack” from all of them • Master does not know if event applied by SQL thread successfully or not • If timeout occurs without any slave acknowledged event replication fails back to asynchronous Semisynchronous 7
  • 13.
    • Master maintainsbinary (update) log files • Write happens two times: into engine files and into binary log file • Extra IO activity: any event written on disk twice • You can write on slave Logical 8
  • 14.
    • IO thread • Readsevents from the master • Stores received events in the relay log file • SQL thread • Reads events from the relay log • Executes them Two types of slave threads 9
  • 15.
    • Since 5.6 youcan specify maximum number of parallel SQL threads • Still single IO thread • Performance is better than if one SQL thread used, but can be lower than master’s • Error during one thread execution stops all of them Multiple SQL threads 10
  • 16.
    • Since 5.7 youcan replicate from multiple masters • How it works • Multiple masters store data in binary logs • Slave creates replication channel for each of masters Multiple sets of relay logs Multiple IO and SQL threads Error stops replication from the affecting master only slave parallel workers for every channel Filters for all channels Multiple Masters 11
  • 17.
    • A is masterfor B • B is master for C • C is master for A • Number of servers in circle is unlimited • This is still asynchronous replication! Circular replication 12
  • 18.
    • MySQL Cluster • Galera/PerconaXtraDB Cluster • Group Replication Plugin • Separate products! Synchronous replication 13
  • 19.
    • When startreplication you need to specify • Master’s binary log file name • Position in the log file • Skip events by moving position forward • Move position backward: duplicate transactions • No check if transaction already applied Position-based replication 14
  • 20.
    • Each transaction receivesits own unique id: GTID • You indicate AUTO POSITION=1 when point slave to the master • No need to care about binary log file names and positions • Easy failover GTID-based replicaiton 15
  • 21.
    • Statement-based (SBR) • SQLstatements stored in the binary log as is • Not-safe statements can lead to data inconsistency INSERT IGNORE LIMIT without ORDER BY Non-deterministic functions ... • Row-based (RBR) • Mixed • Triggers fired only for SBR events! Binary log format 16
  • 22.
    • Statement-based (SBR) •Row-based (RBR) • Binary row as written to the table • Can log more data IO Network transfer cost binlog row image • Data consistency issues and slowdowns may happen for tables without PRIMARY KEY • Mixed • Triggers fired only for SBR events! Binary log format 16
  • 23.
    • Statement-based (SBR) • Row-based(RBR) • Mixed • Combines advantages of both • Triggers fired only for SBR events! Binary log format 16
  • 24.
    • Extremely easy •Make sure binary log is on: –log-bin=basename • Record log name and position: mysql> show master statusG *************************** 1. row *************************** File: mysqld-bin.000001 Position: 583 Binlog_Do_DB: Binlog_Ignore_DB: Executed_Gtid_Set: d60a1531-7491-11e6-a7f6-30b5c2208a0f:1-2 1 row in set (0,00 sec) Setup on master 17
  • 25.
    • Extremely easy • GTID-based: mysql>change master to master_user=’repl’, master_host=’127.0.0.1’, master_port=13000, master_auto_position=1; Query OK, 0 rows affected, 1 warning (0,52 sec) • Position-based: mysql> change master to master_user=’root’, master_host=’127.0.0.1’, master_port=13000, master_log_file=’mysqld-bin.000001’, master_log_pos=583; Query OK, 0 rows affected, 1 warning (0,39 sec) • START SLAVE in both cases: mysql> start slave; Query OK, 0 rows affected (0,06 sec) Setup on slave 18
  • 26.
    • Temporarily onmastermysql> set sql_log_bin=0; Query OK, 0 rows affected (0,00 sec) mysql> insert into t1 values(1); Query OK, 1 row affected (0,00 sec) mysql> set sql_log_bin=1; Query OK, 0 rows affected (0,00 sec) • Requires privilege SUPER • Temporarily on slave • STOP SLAVE • Permanently on slave • RESET MASTER Switch replication OFF 19
  • 27.
    • Temporarily onmaster • Requires privilege SUPER • Temporarily on slave • Permanently on slave • RESET SLAVE Clears master and relay info repositories Deletes all relay log files Resets replication delay to 0 • RESET SLAVE ALL Additionally clears master connection options • RESET MASTER Switch replication OFF 19
  • 28.
    • Temporarily on master • Requiresprivilege SUPER • Temporarily on slave • Permanently on slave • RESET MASTER • Deletes all binary logs • Empties binary log index • Creates new binary log file • Does not stop binary logging! Switch replication OFF 19
  • 29.
    • REPLICATION SLAVEon master • REPLICATION CLIENT on master • SHOW MASTER STATUS • SHOW SLAVE STATUS • SHOW BINARY LOGS. • No replication roles • No DBMS REPUTIL.FROM REMOTE • SUPER to setup slave Access 20
  • 30.
    • Always available,setup required • Asynchronous master-slave • Master• Keeps all updates in the binary log file Two formats: ROW and STATEMENT • Slave • IO thread reads updates from the master • SQL thread executes updates Multiple SQL threads since 5.6 Multiple masters since 5.7 • GTIDs since 5.6 Replication essentials: summary 21
  • 31.
  • 32.
    • Error logfile • Slave • SHOW SLAVE STATUS • Tables in Performance Schema • Tables in mysql database • Master • SHOW MASTER STATUS • SHOW BINLOG EVENTS • mysqlbinlog • mysql command line client Main troubleshooting tools 23
  • 33.
    • Slave start 2016-09-07T11:00:27.004560Z2 [Note] ’CHANGE MASTER TO FOR CHANNEL ’’ executed’. Previous state master_host=’’, master_port= 3306, master_log_file=’’, master_log_pos= 4, master_bind=’’. New state master_host=’127.0.0.1’, master_port= 13000, master_log_file=’’, master_log_pos= 4, master_bind=’’. 2016-09-07T11:00:29.787214Z 3 [Warning] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the ’START SLAVE Syntax’ in the MySQL Manual for more information. 2016-09-07T11:00:29.787871Z 3 [Note] Slave I/O thread for channel ’’: connected to master ’root@127.0.0.1:13000’,replication started in log ’FIRST’ at position 4 2016-09-07T11:00:29.788005Z 4 [Note] Slave SQL thread for channel ’’ initialized, starting replication in log ’FIRST’ at position 0, relay log ’./Thinkie-relay-bin.000001’ position: 4 • Errors • Slave shutdown Error log file 24
  • 34.
    • Slave start • Errors 2016-08-23T12:11:21.867440Z4 [ERROR] Slave SQL for channel ’master-1’: Could not execute Update_rows event on table m2.t1; Can’t find record in ’t1’, Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event’s master log master-bin.000001, end_log_pos 1213, Error_code: 1032 2016-08-23T12:11:21.867471Z 4 [Warning] Slave: Can’t find record in ’t1’ Error_code: 1032 2016-08-23T12:11:21.867484Z 4 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log ’master-bin.000001’ position 989 • Slave shutdown Error log file 24
  • 35.
    • Slave start •Errors • Slave shutdown 2016-09-07T11:03:34.703008Z 4 [Note] Error reading relay log event for channel ’’: slave SQL thread was killed 2016-09-07T11:03:34.704772Z 3 [Note] Slave I/O thread killed while reading event for channel ’’ 2016-09-07T11:03:34.704828Z 3 [Note] Slave I/O thread exiting for channel ’’, read up to log ’mysqld-bin.000001’, position 154 Error log file 24
  • 36.
    Everything about runningslave • IO thread configuration • SQL thread configuration • IO thread state • SQL thread state • Error information Only last error shown All are written into error log file SHOW SLAVE STATUS 25
  • 37.
    mysql> show slavestatus G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 127.0.0.1 Master_User: root Master_Port: 13000 Connect_Retry: 60 Master_Log_File: master-bin.000002 Read_Master_Log_Pos: 63810611 Relay_Log_File: slave-relay-bin-master@002d1.000004 Relay_Log_Pos: 1156 Relay_Master_Log_File: master-bin.000001 Slave_IO_Running: Yes Slave_SQL_Running: No Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: SHOW SLAVE STATUS 25
  • 38.
    Replicate_Wild_Ignore_Table: Last_Errno: 1032 Last_Error: Couldnot execute Update_rows event on table m2.t1; Can’t find record in ’t1’, Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event’s master log master-bin.000001, end_log_pos 1213 Skip_Counter: 0 Exec_Master_Log_Pos: 989 Relay_Log_Space: 63814652 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: NULL Master_SSL_Verify_Server_Cert: No SHOW SLAVE STATUS 25
  • 39.
    Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 1032 Last_SQL_Error:Could not execute Update_rows event on table m2.t1; Can’t find record in ’t1’, Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event’s master log master-bin.000001, end_log_pos 1213 Replicate_Ignore_Server_Ids: Master_Server_Id: 1 Master_UUID: d08c509e-6857-11e6-8872-30b5c2208a0f Master_Info_File: mysql.slave_master_info SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Master_Retry_Count: 10 Master_Bind: Last_IO_Error_Timestamp: Last_SQL_Error_Timestamp: 160823 15:11:21 SHOW SLAVE STATUS 25
  • 40.
  • 41.
    • No needto parse SHOW output • Configuration • IO thread • SQL thread Tables in Performance Schema 26
  • 42.
    • No needto parse SHOW output • Configuration • replication connection configuration • replication applier configuration • IO thread • SQL thread Tables in Performance Schema 26
  • 43.
    • No needto parse SHOW output • Configuration • IO thread • replication connection status • SQL thread Tables in Performance Schema 26
  • 44.
    • No needto parse SHOW output • Configuration • IO thread • SQL thread • replication applier status • replication applier status by coordinator - MTS only • replication applier status by worker Tables in Performance Schema 26
  • 45.
    mysql> select *from replication_connection_configuration -> join replication_applier_configuration using(channel_name)G *************************** 1. row *************************** CHANNEL_NAME: HOST: 127.0.0.1 PORT: 13000 USER: root NETWORK_INTERFACE: AUTO_POSITION: 1 SSL_ALLOWED: NO SSL_CA_FILE: ... Performance Schema: Configuration 27
  • 46.
    ... CONNECTION_RETRY_INTERVAL: 60 CONNECTION_RETRY_COUNT: 10 HEARTBEAT_INTERVAL:60.000 CHANNEL_NAME: DESIRED_DELAY: 0 1 row in set (0.00 sec) Performance Schema: Configuration 27
  • 47.
    mysql> select *from replication_connection_statusG *************************** 1. row *************************** CHANNEL_NAME: GROUP_NAME: SOURCE_UUID: d0753e78-14ec-11e5-b3fb-28b2bd7442fd THREAD_ID: 21 SERVICE_STATE: ON COUNT_RECEIVED_HEARTBEATS: 17 LAST_HEARTBEAT_TIMESTAMP: 2015-06-17 15:49:08 RECEIVED_TRANSACTION_SET: LAST_ERROR_NUMBER: 0 LAST_ERROR_MESSAGE: LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00 1 row in set (0.00 sec) Performance Schema: State of IO Thread 28
  • 48.
    • Coordinator threadfor multiple workers mysql> select * from replication_applier_status join -> replication_applier_status_by_coordinator using(channel_name)G *************************** 1. row *************************** CHANNEL_NAME: SERVICE_STATE: ON REMAINING_DELAY: NULL COUNT_TRANSACTIONS_RETRIES: 0 THREAD_ID: 22 SERVICE_STATE: ON LAST_ERROR_NUMBER: 0 LAST_ERROR_MESSAGE: LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00 1 row in set (0.00 sec) • Other cases Performance Schema: State of SQL Thread 29
  • 49.
    • Coordinator threadfor multiple workers • Other casesmysql> select * from replication_applier_status join -> replication_applier_status_by_worker using(channel_name)G *************************** 1. row *************************** CHANNEL_NAME: master-1 SERVICE_STATE: OFF REMAINING_DELAY: NULL COUNT_TRANSACTIONS_RETRIES: 0 WORKER_ID: 0 THREAD_ID: NULL SERVICE_STATE: OFF LAST_SEEN_TRANSACTION: ANONYMOUS LAST_ERROR_NUMBER: 1032 LAST_ERROR_MESSAGE: Could not execute Update_rows event on table m2.t1; Can’t find record in ’t1’, Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event’s master log master-bin.000001, end_log_pos 1213 LAST_ERROR_TIMESTAMP: 2016-08-23 15:48:20 Performance Schema: State of SQL Thread 29
  • 50.
    • Coordinator threadfor multiple workers • Other cases *************************** 2. row *************************** CHANNEL_NAME: master-2 SERVICE_STATE: ON REMAINING_DELAY: NULL COUNT_TRANSACTIONS_RETRIES: 0 WORKER_ID: 0 THREAD_ID: 42 SERVICE_STATE: ON LAST_SEEN_TRANSACTION: ANONYMOUS LAST_ERROR_NUMBER: 0 LAST_ERROR_MESSAGE: LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00 2 rows in set (0,00 sec) Performance Schema: State of SQL Thread 29
  • 51.
    • RECEIVED TRANSACTIONSET in table replication connection status • LAST SEEN TRANSACTION in replication applier status by worker GTID diagnostics 30
  • 52.
    • Single-threaded slave mysql>select cs.CHANNEL_NAME, cs.SOURCE_UUID, cs.RECEIVED_TRANSACTION_SET, -> asw.LAST_SEEN_TRANSACTION, aps.SERVICE_STATE from -> replication_connection_status cs join replication_applier_status_by_worker -> asw using(channel_name) join replication_applier_status aps -> using(channel_name) G *************************** 1. row *************************** CHANNEL_NAME: SOURCE_UUID: 9038967d-7164-11e6-8c88-30b5c2208a0f RECEIVED_TRANSACTION_SET: 9038967d-7164-11e6-8c88-30b5c2208a0f:1-2 LAST_SEEN_TRANSACTION: 9038967d-7164-11e6-8c88-30b5c2208a0f:2 SERVICE_STATE: ON 1 row in set (0,00 sec) • Multi-threaded GTID: all in one place 31
  • 53.
    • Single-threaded slave •Multi-threaded *************************** 1. row *************************** THREAD_ID: 30 SERVICE_STATE: ON RECEIVED_TRANSACTION_SET: 9038967d-7164-11e6-8c88-30b5c2208a0f:1-3 LAST_SEEN_TRANSACTION: ... *************************** 8. row *************************** THREAD_ID: 37 SERVICE_STATE: ON RECEIVED_TRANSACTION_SET: 9038967d-7164-11e6-8c88-30b5c2208a0f:1-3 LAST_SEEN_TRANSACTION: 9038967d-7164-11e6-8c88-30b5c2208a0f:3 8 rows in set (0,00 sec) GTID: all in one place 31
  • 54.
    • Master Infomysql>select * from slave_master_infoG *************************** 1. row *************************** Number_of_lines: 25 Master_log_name: mysqld-bin.000001 Master_log_pos: 154 Host: 127.0.0.1 User_name: root User_password: secret Port: 13000 Connect_retry: 60 Enabled_ssl: 0 ... Uuid: 31ed7c8f-74ea-11e6-8de8-30b5c2208a0f Retry_count: 86400 ... Enabled_auto_position: 1 ... mysql database: on the slave only 32
  • 55.
    • Master Info •Relay log info mysql> select * from slave_relay_log_infoG *************************** 1. row *************************** Number_of_lines: 7 Relay_log_name: ./slave-relay-bin-master@002d1.000004 Relay_log_pos: 1156 Master_log_name: master-bin.000001 Master_log_pos: 989 Sql_delay: 0 Number_of_workers: 0 Id: 1 Channel_name: master-1 mysql database: on the slave only 32
  • 56.
    • Master Info •Relay log info • Worker info: multi-threaded slavemysql> select * from slave_worker_infoG *************************** 1. row *************************** Id: 1 ... *************************** 8. row *************************** Id: 8 Relay_log_name: ./Thinkie-relay-bin.000004 Relay_log_pos: 1216 Master_log_name: mysqld-bin.000001 Master_log_pos: 1342 Checkpoint_relay_log_name: ./Thinkie-relay-bin.000004 Checkpoint_relay_log_pos: 963 Checkpoint_master_log_name: mysqld-bin.000001 Checkpoint_master_log_pos: 1089 mysql database: on the slave only 32
  • 57.
    mysql> show masterstatusG *************************** 1. row *************************** File: mysqld-bin.000001 Position: 583 Binlog_Do_DB: Binlog_Ignore_DB: Executed_Gtid_Set: 31ed7c8f-74ea-11e6-8de8-30b5c2208a0f:1-2 1 row in set (0,00 sec) SHOW MASTER STATUS 33
  • 58.
    mysql> show binlogevents in ’master-bin.000001’ from 989; +-------------------+------+----------------+-----------+-------------+-------------------------------- | Log_name | Pos | Event_type | Server_id | End_log_pos | Info +-------------------+------+----------------+-----------+-------------+-------------------------------- | master-bin.000001 | 989 | Anonymous_Gtid | 1 | 1054 | SET @@SESSION.GTID_NEXT= ... | master-bin.000001 | 1054 | Query | 1 | 1124 | BEGIN | master-bin.000001 | 1124 | Table_map | 1 | 1167 | table_id: 109 (m2.t1) | master-bin.000001 | 1167 | Update_rows | 1 | 1213 | table_id: 109 flags: STMT_END_F | master-bin.000001 | 1213 | Xid | 1 | 1244 | COMMIT /* xid=64 */ +-------------------+------+----------------+-----------+-------------+-------------------------------- 5 rows in set (0,00 sec) SHOW BINLOG EVENTS 34
  • 59.
    $ mysqlbinlog -vvar/mysqld.1/data/master-bin.000001 --start-position=989 --stop-position=1213 /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/; /*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/; DELIMITER /*!*/; # at 4 #160822 14:01:44 server id 1 end_log_pos 123 CRC32 0xc6ab0154 Start: binlog v 4, server v 5.7.13-debug-log created 160822 14:01:44 at startup ROLLBACK/*!*/; BINLOG ’ mNu6Vw8BAAAAdwAAAHsAAAAAAAQANS43LjEzLWRlYnVnLWxvZwAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAACY27pXEzgNAAgAEgAEBAQEEgAAXwAEGggAAAAICAgCAAAACgoKKioAEjQA AVQBq8Y= ’/*!*/; # at 989 #160822 14:15:11 server id 1 end_log_pos 1054 CRC32 0xe388e040 Anonymous_GTID last_committed=4 sequence_number=5 SET @@SESSION.GTID_NEXT= ’ANONYMOUS’/*!*/; mysqlbinlog 35
  • 60.
    # at 1054 #16082214:15:11 server id 1 end_log_pos 1124 CRC32 0x82cb7a14 Query thread_id=9 exec_time=0 error_code=0 SET TIMESTAMP=1471864511/*!*/; SET @@session.pseudo_thread_id=9/*!*/; SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/; SET @@session.sql_mode=1436549152/*!*/; SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/; /*!C utf8 *//*!*/; SET @@session.character_set_client=33,@@session.collation_connection=33, @@session.collation_server=8/*!*/; SET @@session.lc_time_names=0/*!*/; SET @@session.collation_database=DEFAULT/*!*/; BEGIN /*!*/; # at 1124 #160822 14:15:11 server id 1 end_log_pos 1167 CRC32 0xa1ce6163 Table_map: ‘m2‘.‘t1‘ mapped to number 109 mysqlbinlog 35
  • 61.
    # at 1167 #16082214:15:11 server id 1 end_log_pos 1213 CRC32 0x1f346c6b Update_rows: table id 109 flags: STMT_END_F BINLOG ’ v966VxMBAAAAKwAAAI8EAAAAAG0AAAAAAAEAAm0yAAJ0MQABAwABY2HOoQ== v966Vx8BAAAALgAAAL0EAAAAAG0AAAAAAAEAAgAB///+BQAAAP4GAAAAa2w0Hw== ’/*!*/; ### UPDATE ‘m2‘.‘t1‘ ### WHERE ### @1=5 ### SET ### @1=6 ROLLBACK /* added by mysqlbinlog */ /*!*/; SET @@SESSION.GTID_NEXT= ’AUTOMATIC’ /* added by mysqlbinlog */ /*!*/; DELIMITER ; # End of log file /*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/; /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/; mysqlbinlog 35
  • 62.
    • Percona Toolkit • pt-table-checksum Checksmaster and slave for consistency • pt-table-sync Fixes consistency issues • pt-slave-find Displays replication topology Troubleshooting Toolkits 36
  • 63.
    • MySQL Utilities •mysqlrplcheck Checks replication pre-requisites • mysqlrplshow Displays replication topology • mysqlrplsync Checks master and slave for consistency • mysqlslavetrx Skips 1-N transactions Troubleshooting Toolkits 36
  • 64.
    • MySQL Utilities •mysqldbcompare Compares two databases for changes • mysqldiff Compares object definitions • mysqlserverinfo Displays basic server options, such as port and datadir Replication-oriented Troubleshooting Toolkits 36
  • 65.
  • 66.
    • Performance • More IO binlog rowimage = FULL | MINIMAL | NOBLOB Binlog stmt | cache disk use • Synchronization • Behavior • expire log days • READ COMMITTED and READ UNCOMMITTED not supported for SBR • Non-deterministic events with SBR Master issues 38
  • 67.
    • Various connection-related issues • Checkerror log file • Connect with regular MySQL client as replication user • Performance issues solved same way as other client-server MySQL network issues Slave IO thread issues 39
  • 68.
    • Simple master-slave • Datais different on master and slave Replication event can not be applied • Different errors on master and slave • Slave lags far behind the master • Circular replication or other writes in addition to slave SQL thread add complexity, but solved same way SQL thread issues 40
  • 69.
    • Performance • Single relaylog • slave parallel type=DATABASE | LOGICAL CLOCK • Wrong behavior • Same as with single-threaded slave, but mind additional concurrency • Error in one SQL thread stops them all Multi-threaded slave 41
  • 70.
    • Replication shouldbe configured for each channel • Mixing masters with GTID and without GTIDs is possible • Same issues as with traditional replication • Replication filters are same for all channels Multi-master 42
  • 71.
  • 72.
    • Master issues •Same as for regular server • Additional IO and safety checks • Slave IO thread • Common network-related issues • Use mysql command line client to test • Slave SQL thread • Common SQL-related issues • Common engine-specific issues • Less concurrency than on master Summary 44
  • 73.
    • MySQL ReplicationTroubleshooting webinar • MySQL High Availability book • MySQL Replication Team blog • Percona Webinars • Percona Toolkit • MySQL Utilities More information 45
  • 74.
    ??? Place for yourquestions 46
  • 75.