nelaaro

I found that the following worked for me. It does not restore the slave to a state that is an exact replica of the master. There will be data differences, which I will fix with pt-table-sync.

1. Restart Replication without GTID method
2. Stop Parallel slave threads
3. Enable GTID replication
4. Using percona-toolkit pt-slave-restart to skip past all the errors.

1. Restart Replication without GTID method, using the master binlog position

    CHANGE MASTER TO
      MASTER_HOST='12.34.56.789',
      MASTER_USER='slave_user',
      MASTER_PASSWORD='password',
      MASTER_LOG_FILE='mysql-bin.000001',
      MASTER_LOG_POS=107;

This is well documented; search for the instructions if you need more detail.

2. Stop Parallel slave threads

This was part of the problem as seen in the original question.

ERROR 1966 (HY000): When using parallel replication and GTID with multiple replication domains, @@sql_slave_skip_counter cannot be used. Instead, setting @@gtid_slave_pos explicitly can be used to skip to after a given GTID position.

I want to be able to skip events without having to work out or advance the GTID position for each one.

    MariaDB [(none)]> stop slave;
    Query OK, 0 rows affected (0.35 sec)

    MariaDB [(none)]> set global slave_parallel_threads = 0;
    Query OK, 0 rows affected (0.00 sec)

    MariaDB [(none)]> set global slave_parallel_mode = none;
    Query OK, 0 rows affected (0.00 sec)

    MariaDB [(none)]> start slave;
    Query OK, 0 rows affected (0.00 sec)

Now if I check the parallel slave threads I see

    MariaDB [(none)]> show slave status \G
    *************************** 1. row ***************************
    ..........
    Parallel_Mode: none

I can reverse this process to re-enable parallel slave threads once I am done and know that GTID is working.
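
As a rough sketch of that reversal, the helper below only prints the statements so they can be reviewed before being piped to the client. The function name `parallel_sql` is made up for this illustration, and the values `4` and `conservative` are assumptions; use whatever the slave was configured with before.

```shell
#!/bin/sh
# parallel_sql THREADS MODE
# Hypothetical helper: prints the statements that undo the step above.
# It does not talk to the server itself.
parallel_sql() {
    printf 'STOP SLAVE;\n'
    printf 'SET GLOBAL slave_parallel_threads = %s;\n' "$1"
    printf 'SET GLOBAL slave_parallel_mode = %s;\n' "$2"
    printf 'START SLAVE;\n'
}

# Example values only (assumed, not taken from the original setup).
parallel_sql 4 conservative

# To apply the statements, pipe them to the client, for example:
#   parallel_sql 4 conservative | mysql
```

Note that SET GLOBAL changes do not survive a server restart, so the same values also need to be in the server configuration file.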

3. Enable GTID replication

I can now try restarting the slave with GTID enabled.

On the master

    MariaDB [(none)]> SHOW MASTER STATUS\G
    *************************** 1. row ***************************
    File: mariadb-bin.000001
    Position: 510
    Binlog_Do_DB:
    Binlog_Ignore_DB:
    1 row in set (0.00 sec)

    SELECT BINLOG_GTID_POS('mariadb-bin.000001', 510);
    +--------------------------------------------+
    | BINLOG_GTID_POS('mariadb-bin.000001', 510) |
    +--------------------------------------------+
    | 1-101-1                                    |
    +--------------------------------------------+
    1 row in set (0.00 sec)

On the slave

    STOP SLAVE;
    SET GLOBAL gtid_slave_pos = '1-101-1';
    CHANGE MASTER TO master_use_gtid=slave_pos;
    START SLAVE;

Now when I check the slave it has some events to skip to get back into the same state as the master.

Last_Error: An attempt was made to binlog GTID 1-1050-5004291 which would create an out-of-order sequence number with existing GTID 1-1050-5004322, and gtid strict mode is enabled.

    MariaDB [(none)]> show slave status \G
    *************************** 1. row ***************************
    Slave_IO_State: Waiting for master to send event
    Master_Log_File: binary.000599
    Read_Master_Log_Pos: 364810491
    Relay_Log_File: tmsdb-relay-bin.001240
    Relay_Log_Pos: 716
    Relay_Master_Log_File: binary.000599
    Slave_IO_Running: Yes
    Slave_SQL_Running: No
    Replicate_Do_DB:
    Replicate_Ignore_DB:
    Replicate_Do_Table:
    Replicate_Ignore_Table:
    Replicate_Wild_Do_Table:
    Replicate_Wild_Ignore_Table:
    Last_Errno: 1950
    Last_Error: An attempt was made to binlog GTID 1-1050-5004291 which would create an out-of-order sequence number with existing GTID 1-1050-5004322, and gtid strict mode is enabled.
    Skip_Counter: 0
    Exec_Master_Log_Pos: 286447058
    Relay_Log_Space: 78364447
    Until_Condition: None
    Until_Log_File:
    Until_Log_Pos: 0
    Master_SSL_Allowed: No
    Master_SSL_CA_File:
    Master_SSL_CA_Path:
    Master_SSL_Cert:
    Master_SSL_Cipher:
    Master_SSL_Key:
    Seconds_Behind_Master: NULL
    Master_SSL_Verify_Server_Cert: No
    Last_IO_Errno: 0
    Last_IO_Error:
    Last_SQL_Errno: 1950
    Last_SQL_Error: An attempt was made to binlog GTID 1-1050-5004291 which would create an out-of-order sequence number with existing GTID 1-1050-5004322, and gtid strict mode is enabled.
    Replicate_Ignore_Server_Ids:
    Master_Server_Id: 1050
    Master_SSL_Crl:
    Master_SSL_Crlpath:
    Using_Gtid: Slave_Pos
    Gtid_IO_Pos: 1-1050-5005223,2-1051-379101,3-1010-3273
    Replicate_Do_Domain_Ids:
    Replicate_Ignore_Domain_Ids:
    Parallel_Mode: none
    1 row in set (0.00 sec)
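
To make the error easier to read: a MariaDB GTID has the form domain-server-sequence, and with gtid strict mode enabled the sequence number within a domain must keep increasing. The sketch below applies that check to the two GTIDs from the Last_SQL_Error message. The function name `gtid_out_of_order` is invented for this illustration and is not part of MariaDB or percona-toolkit.

```shell
#!/bin/sh
# gtid_out_of_order NEW EXISTING
# Hypothetical helper: prints "out-of-order" when both GTIDs share a
# replication domain and NEW's sequence number is not greater than
# EXISTING's; prints "ok" otherwise.
gtid_out_of_order() {
    new_domain=${1%%-*}   # text before the first "-" is the domain id
    new_seq=${1##*-}      # text after the last "-" is the sequence number
    old_domain=${2%%-*}
    old_seq=${2##*-}
    if [ "$new_domain" = "$old_domain" ] && [ "$new_seq" -le "$old_seq" ]; then
        echo "out-of-order"
    else
        echo "ok"
    fi
}

# GTIDs copied from the Last_SQL_Error message.
gtid_out_of_order 1-1050-5004291 1-1050-5004322
```

Both GTIDs are in domain 1 and 5004291 is lower than 5004322, so the slave refuses the event. That is the kind of event that has to be skipped.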

4. Using percona-toolkit pt-slave-restart to skip past all the errors

    sudo yum install http://www.percona.com/downloads/percona-release/redhat/0.1-4/percona-release-0.1-4.noarch.rpm
    sudo yum search percona-toolkit

pt-slave-restart will skip all the events needed to get the slave into a working state.

    # pt-slave-restart
    2017-12-22T13:39:59 tmsdb-relay-bin.001240 716 1950
    2017-12-22T13:40:00 tmsdb-relay-bin.001240 69702 1950
    2017-12-22T13:40:00 tmsdb-relay-bin.001240 97912 1950
    2017-12-22T13:40:00 tmsdb-relay-bin.001240 98144 1950
    2017-12-22T13:40:00 tmsdb-relay-bin.001240 363903 1950
    2017-12-22T13:40:00 tmsdb-relay-bin.001240 364135 1950
    2017-12-22T13:40:00 tmsdb-relay-bin.001240 712776 1950
    2017-12-22T13:40:00 tmsdb-relay-bin.001240 713008 1950
    2017-12-22T13:40:00 tmsdb-relay-bin.001240 759737 1950
    2017-12-22T13:40:00 tmsdb-relay-bin.001240 827932 1950
    2017-12-22T13:40:00 tmsdb-relay-bin.001240 828164 1950
    2017-12-22T13:40:00 tmsdb-relay-bin.001240 934851 1950
    2017-12-22T13:40:00 tmsdb-relay-bin.001240 952088 1950
    2017-12-22T13:40:00 tmsdb-relay-bin.001240 952320 1950
    2017-12-22T13:40:00 tmsdb-relay-bin.001240 1084249 1950
    2017-12-22T13:40:00 tmsdb-relay-bin.001240 1084481 1950
    2017-12-22T13:40:00 tmsdb-relay-bin.001240 1351188 1950
    2017-12-22T13:40:00 tmsdb-relay-bin.001240 1351420 1950
    2017-12-22T13:40:00 tmsdb-relay-bin.001240 1621561 1950
    2017-12-22T13:40:00 tmsdb-relay-bin.001240 1693920 1950
    2017-12-22T13:40:00 tmsdb-relay-bin.001240 1711677 1950
    2017-12-22T13:40:00 tmsdb-relay-bin.001240 1711909 1950
    2017-12-22T13:40:00 tmsdb-relay-bin.001240 1880931 1950
    2017-12-22T13:40:00 tmsdb-relay-bin.001240 1881163 1950
    2017-12-22T13:40:00 tmsdb-relay-bin.001240 1916544 1950
    2017-12-22T13:40:00 tmsdb-relay-bin.001240 2124672 1950
    2017-12-22T13:40:01 tmsdb-relay-bin.001240 2124904 1950
    2017-12-22T13:40:01 tmsdb-relay-bin.001240 2125136 1950
    2017-12-22T13:40:01 tmsdb-relay-bin.001240 2452030 1950
    2017-12-22T13:40:01 tmsdb-relay-bin.001240 2452262 1950
    2017-12-22T13:40:01 tmsdb-relay-bin.001240 2819749 1950
    2017-12-22T13:40:01 tmsdb-relay-bin.001240 2819981 1950

Now when I check my slave status

    MariaDB [(none)]> show slave status \G
    *************************** 1. row ***************************
    Slave_IO_State: Waiting for master to send event
    Master_Host: masterhost
    Master_User: maxscale
    Master_Port: 3306
    Connect_Retry: 5
    Master_Log_File: binary.000600
    Read_Master_Log_Pos: 37801368
    Relay_Log_File: tmsdb-relay-bin.001242
    Relay_Log_Pos: 37801653
    Relay_Master_Log_File: binary.000600
    Slave_IO_Running: Yes
    Slave_SQL_Running: Yes
    Last_Errno: 0
    Last_Error:
    Skip_Counter: 0
    Exec_Master_Log_Pos: 37801368
    Relay_Log_Space: 37801991
    Until_Condition: None
    Seconds_Behind_Master: 0
    Master_SSL_Verify_Server_Cert: No
    Last_IO_Errno: 0
    Last_IO_Error:
    Last_SQL_Errno: 0
    Last_SQL_Error:
    Master_Server_Id: 1050
    Using_Gtid: Slave_Pos
    Gtid_IO_Pos: 1-1050-5014401,2-1051-379101,3-1010-3273
    Parallel_Mode: none
    1 row in set (0.00 sec)
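
The fields that matter in that output are Slave_IO_Running, Slave_SQL_Running and Last_SQL_Errno. If you want to check them from a script, something like the following might do. It is only a sketch: `slave_healthy` is a hypothetical helper, and it assumes the status is fed to it in the vertical `\G` format.

```shell
#!/bin/sh
# slave_healthy: reads `SHOW SLAVE STATUS\G` output on stdin and exits 0
# only when both replication threads report "Yes" and Last_SQL_Errno is 0.
slave_healthy() {
    awk -F': *' '
        {
            key = $1; val = $2
            gsub(/^[ \t]+|[ \t]+$/, "", key)   # strip padding around the field name
            gsub(/[ \t\r]+$/, "", val)         # strip trailing whitespace from the value
        }
        key == "Slave_IO_Running"  { io = val }
        key == "Slave_SQL_Running" { sql = val }
        key == "Last_SQL_Errno"    { errno = val }
        END { exit !(io == "Yes" && sql == "Yes" && errno == "0") }
    '
}

# Sample input using values from the status output.
slave_healthy <<'EOF' && echo "replication healthy"
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Last_SQL_Errno: 0
EOF

# Against a live server (credentials and host are up to you):
#   mysql -e 'SHOW SLAVE STATUS\G' | slave_healthy && echo "replication healthy"
```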

Lastly, I need to restart the server and make sure the setup is reboot safe, etc.