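Features:

1) Master failover (the VIP floats to the new master, in the style of keepalived)

2) Promotion and re-pointing of replication roles

The master serves writes. The candidate master2 (in practice a slave) serves reads, as do slave1 and slave2. If master1 goes down, the candidate master2 is promoted to be the new master, and slave1 and slave2 are re-pointed at it.

3) MHA consists of two parts: MHA Manager (the management node) and MHA Node (the data node). MHA Manager can be deployed on a dedicated machine to manage several master-slave clusters, or on one of the slaves. MHA Node runs on every MySQL server and on the manager server. MHA Manager probes the master of the cluster at regular intervals; when the master fails, it automatically promotes the slave with the most recent data to be the new master and then re-points all other slaves at it.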

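Five machines:

Master1 host 192.168.30.25 server1     write

Master2 standby master host 192.168.30.24 server2     write (read-only until promoted)

Slave1 host 192.168.30.23 server3       read

Slave2 host 192.168.30.21 server4       read

Manager host 192.168.30.26 server5      monitors the replication group

Configure the hostname mappings on all hosts. The hostnames must be exact, because later steps depend on them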

[root@bogon ~]# vim /etc/hosts

192.168.30.25 server1

192.168.30.24 server2

192.168.30.23 server3

192.168.30.21 server4

192.168.30.26 server5
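Because every host needs the same five mappings, a small helper can append them only when they are missing. This is a minimal sketch assuming bash; the function name add_host_entries and its file argument are illustrative and not part of the original steps.

```shell
#!/bin/bash
# Sketch: append the five name mappings to a hosts file only if they are missing.
# The function name and the file argument are illustrative assumptions.
add_host_entries() {
    local hosts_file="${1:-/etc/hosts}"
    local entry
    while read -r entry; do
        # -x matches whole lines, -F treats the entry as a fixed string
        grep -qxF "$entry" "$hosts_file" 2>/dev/null || echo "$entry" >> "$hosts_file"
    done <<'EOF'
192.168.30.25 server1
192.168.30.24 server2
192.168.30.23 server3
192.168.30.21 server4
192.168.30.26 server5
EOF
}

# Usage (as root, on every host): add_host_entries /etc/hosts
```

Running the helper twice leaves the file unchanged the second time, so it is safe to repeat on hosts that were already configured by hand.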

Disable the firewall and SELinux enforcement on all hosts

systemctl stop firewalld

setenforce 0

iptables -F

systemctl disable firewalld

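Download mha-manager and mha-node

Install the perl dependencies required by MHA node on all hosts

rpm -ivh epel-release-latest-7.noarch.rpm      # the EPEL release package can be downloaded from the Aliyun mirror site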

yum install -y perl-DBD-MySQL.x86_64 perl-DBI.x86_64 perl-CPAN perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker

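After installation, check that everything is installed

Package perl-DBD-MySQL-4.023-6.el7.x86_64 already installed and latest version

Package perl-DBI-1.627-4.el7.x86_64 already installed and latest version

Package perl-CPAN-1.9800-292.el7.noarch already installed and latest version

Package 1:perl-ExtUtils-CBuilder-0.28.2.6-292.el7.noarch already installed and latest version

Package perl-ExtUtils-MakeMaker-6.68-3.el7.noarch already installed and latest version

Nothing to do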

Install MHA node on all hosts

tar xf mha4mysql-node-0.56.tar.gz

cd mha4mysql-node-0.56/

perl Makefile.PL

make && make install

After installation, MHA node places the following scripts in /usr/local/bin

[root@bogon ~]# ls -l /usr/local/bin

total 40

-r-xr-xr-x. 1 root root 16346 Apr 11 12:29 apply_diff_relay_logs

-r-xr-xr-x. 1 root root  4807 Apr 11 12:29 filter_mysqlbinlog

-r-xr-xr-x. 1 root root  7401 Apr 11 12:29 purge_relay_logs

-r-xr-xr-x. 1 root root  7395 Apr 11 12:29 save_binary_logs

Install MHA manager on server5; a single machine is enough to act as the monitoring manager

[root@bogon ~]# yum install -y perl perl-Log-Dispatch perl-Parallel-ForkManager perl-DBD-MySQL perl-DBI perl-Time-HiRes

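If the installation fails at first, the missing packages have to be supplied as rpm files; once they are added it works

Sometimes the local yum repository lacks perl-Log-Dispatch and perl-Parallel-ForkManager; in that case use the online Aliyun yum repository. The EPEL repo file must be under /etc/yum.repos.d, otherwise the packages cannot be found

wget -O /etc/yum.repos.d/aliyun.repo https://mirrors.aliyun.com/repo/Centos-7.repo

[root@bogon ~]# rpm -ivh perl-Config-Tiny-2.14-7.el7.noarch.rpm      # this package is required

Install the MHA manager package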

tar xf mha4mysql-manager-0.56.tar.gz

cd mha4mysql-manager-0.56/

perl Makefile.PL

make && make install

After installation the following script files are present

[root@bogon mha4mysql-manager-0.56]# ls -l /usr/local/bin

total 76

-r-xr-xr-x. 1 root root 16346 Apr 11 12:29 apply_diff_relay_logs

-r-xr-xr-x. 1 root root  4807 Apr 11 12:29 filter_mysqlbinlog

-r-xr-xr-x. 1 root root  1995 Apr 11 13:31 masterha_check_repl

-r-xr-xr-x. 1 root root  1779 Apr 11 13:31 masterha_check_ssh

-r-xr-xr-x. 1 root root  1865 Apr 11 13:31 masterha_check_status

-r-xr-xr-x. 1 root root  3201 Apr 11 13:31 masterha_conf_host

-r-xr-xr-x. 1 root root  2517 Apr 11 13:31 masterha_manager

-r-xr-xr-x. 1 root root  2165 Apr 11 13:31 masterha_master_monitor

-r-xr-xr-x. 1 root root  2373 Apr 11 13:31 masterha_master_switch

-r-xr-xr-x. 1 root root  3879 Apr 11 13:31 masterha_secondary_check

-r-xr-xr-x. 1 root root  1739 Apr 11 13:31 masterha_stop

-r-xr-xr-x. 1 root root  7401 Apr 11 12:29 purge_relay_logs

-r-xr-xr-x. 1 root root  7395 Apr 11 12:29 save_binary_logs

Configure SSH key-pair authentication

Each server first generates a key pair

and then copies its public key to the other hosts

[root@server5 ~]# ssh-keygen -t rsa

[root@server5 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.30.25

[root@server5 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.30.24

[root@server5 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.30.23

[root@server5 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.30.21

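On server5 (192.168.30.26)

Test every connection and answer yes to the host-key prompt, so that the prompt cannot block a failover later; verify SSH access to each hostname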

[root@server5 ~]# ssh server1

[root@server5 ~]# ssh server2

[root@server5 ~]# ssh server3

[root@server5 ~]# ssh server4

Master(192.168.30.25):

[root@server1 ~]# ssh-keygen -t rsa

[root@server1 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.30.24

[root@server1 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.30.23

[root@server1 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.30.21

Master2(192.168.30.24):

[root@server2 ~]# ssh-keygen -t rsa

[root@server2 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.30.25

[root@server2 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.30.23

[root@server2 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.30.21

[root@server2 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.30.26

Slave1(192.168.30.23):

[root@server3 ~]# ssh-keygen -t rsa

[root@server3 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.30.25

[root@server3 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.30.24

[root@server3 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.30.21

[root@server3 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.30.26

Slave2(192.168.30.21):

[root@server4 ~]# ssh-keygen -t rsa

[root@server4 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.30.25

[root@server4 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.30.24

[root@server4 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.30.23

[root@server4 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.30.26
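The per-host ssh-copy-id lists above follow one pattern: each host sends its key to every other host. A minimal sketch of that pattern, assuming bash; copy_id_commands is an illustrative name, and the sketch prints a full mesh including the manager, which is slightly more than some of the lists above.

```shell
#!/bin/bash
# Sketch: print the ssh-copy-id commands one host needs so that its public key
# reaches every other host in the list. The function name is an illustrative assumption.
ALL_HOSTS="192.168.30.25 192.168.30.24 192.168.30.23 192.168.30.21 192.168.30.26"

copy_id_commands() {
    local self="$1" peer
    for peer in $ALL_HOSTS; do
        if [ "$peer" = "$self" ]; then
            continue    # never copy the key to ourselves
        fi
        echo "ssh-copy-id -i /root/.ssh/id_rsa.pub root@$peer"
    done
}

# Usage on server3, after ssh-keygen: copy_id_commands 192.168.30.23
```

Printing the commands instead of running them keeps the step reviewable; the output can be pasted into the shell once it looks right.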

Install MySQL

Install MySQL (MariaDB) on hosts 25, 24, 23 and 21

[root@server1 ~]# yum -y install mariadb mariadb-server mariadb-devel

[root@server1 ~]# systemctl start mariadb

Set the database root password

[root@server1 ~]# mysqladmin -u root password 123456

[root@server1 ~]# mysql -u root -p123456

Set up the master-slave replication environment

Edit the MySQL configuration file on each host

Master (192.168.30.25):

[mysqld]

server-id = 1

log-bin=master-bin

log-slave-updates=true

relay_log_purge=0

[root@server1 ~]# systemctl restart mariadb

Master2(192.168.30.24):

[mysqld]

server-id = 2

log-bin=master-bin

log-slave-updates=true

relay_log_purge=0

[root@server2 ~]# systemctl restart mariadb

Slave1(192.168.30.23):

[mysqld]

server-id = 3

log-bin=mysql-bin

relay-log=slave-relay-bin

log-slave-updates=true

relay_log_purge=0

[root@server3 ~]# systemctl restart mariadb

Slave2(192.168.30.21):

[mysqld]

server-id = 4

log-bin=mysql-bin

relay-log=slave-relay-bin

log-slave-updates=true

relay_log_purge=0

[root@server4 ~]# systemctl restart mariadb
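The four configuration fragments differ only in server-id (which must be unique per server), the binlog base name, and the optional relay-log name. A minimal sketch that prints the fragment for one node, assuming bash; mysqld_fragment and its arguments are illustrative names.

```shell
#!/bin/bash
# Sketch: print the [mysqld] replication settings for one node.
# mysqld_fragment and its arguments are illustrative assumptions.
mysqld_fragment() {
    local server_id="$1" binlog_name="$2" relay_name="${3:-}"
    echo "[mysqld]"
    echo "server-id = $server_id"
    echo "log-bin=$binlog_name"
    if [ -n "$relay_name" ]; then
        echo "relay-log=$relay_name"    # only the two pure slaves set this
    fi
    echo "log-slave-updates=true"
    echo "relay_log_purge=0"
}

# Usage: mysqld_fragment 1 master-bin                 (master)
#        mysqld_fragment 3 mysql-bin slave-relay-bin  (slave1)
```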

Create the replication user on every MySQL server

MariaDB [(none)]> grant replication slave on *.* to 'repl'@'192.168.30.%' identified by '123456';

MariaDB [(none)]> flush privileges;

Check the binlog file name and position on the master

MariaDB [(none)]> show master status;

+-------------------+----------+--------------+------------------+

| File              | Position | Binlog_Do_DB | Binlog_Ignore_DB |

+-------------------+----------+--------------+------------------+

| master-bin.000001 |      472 |              |                  |

+-------------------+----------+--------------+------------------+

On the slaves (24, 23, 21), point replication at the master IP 192.168.30.25; the log file and position must be the values shown on the master

MariaDB [(none)]> stop slave;

MariaDB [(none)]> change master to

-> master_host='192.168.30.25',

-> master_user='repl',

-> master_password='123456',

-> master_log_file='master-bin.000001',

-> master_log_pos=472;

MariaDB [(none)]> start slave;

MariaDB [(none)]> show slave status\G

Slave_IO_Running and Slave_SQL_Running must both show Yes
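Checking the two thread states by eye is easy to get wrong on three slaves, so the check can be scripted. A minimal sketch, assuming bash and awk; slave_threads_ok is an illustrative name, and it only reads the text that show slave status\G prints.

```shell
#!/bin/bash
# Sketch: read `show slave status\G` output on stdin and succeed only when both
# replication threads report Yes. The function name is an illustrative assumption.
slave_threads_ok() {
    awk -F': *' '
        { gsub(/^[ \t]+/, "", $1) }                 # strip the left padding of \G output
        $1 == "Slave_IO_Running"  { io  = $2 }
        $1 == "Slave_SQL_Running" { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}

# Usage on a slave:
#   mysql -u root -p123456 -e 'show slave status\G' | slave_threads_ok && echo "replication running"
```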

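Set the three slave servers to read_only (reads only)

The slaves only serve reads. The setting is not written into the MySQL configuration file because server2 may be promoted to master at any time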

[root@server2 ~]# mysql -uroot -p123456 -e 'set global read_only=1'

[root@server3 ~]# mysql -u root -p123456 -e 'set global read_only=1'

[root@server4 ~]# mysql -u root -p123456 -e 'set global read_only=1'

Create the monitoring user (on hosts 25, 24, 23 and 21)

MariaDB [(none)]> grant all privileges on *.* to 'root'@'192.168.30.%' identified by '123456';

MariaDB [(none)]> flush privileges;

Grant access for the host's own hostname

MariaDB [(none)]> grant all privileges on *.* to 'root'@'server1' identified by '123456';

MariaDB [(none)]> flush privileges;

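The MySQL master-slave cluster is now fully set up

Configure the MHA environment

Create the MHA working directory and related configuration files

Server5 (192.168.30.26): the directory extracted from the package contains sample configuration files

Edit the app1.cnf configuration file

The /usr/local/bin/master_ip_failover script must be adapted to your own environment (IP address, NIC name, and so on)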

[root@server5 ~]# mkdir -p /etc/masterha /var/log/masterha/app1

[root@server5 ~]# cp mha4mysql-manager-0.56/samples/conf/app1.cnf /etc/masterha/

[root@server5 ~]# vim /etc/masterha/app1.cnf

[server default]

manager_workdir=/var/log/masterha/app1

manager_log=/var/log/masterha/app1/manager.log

master_binlog_dir=/var/lib/mysql

master_ip_failover_script=/usr/local/bin/master_ip_failover

password=123456

user=root

ping_interval=1

remote_workdir=/tmp

repl_password=123456

repl_user=repl

[server1]

hostname=server1

port=3306

#candidate_master=1

[server2]

hostname=server2

candidate_master=1

port=3306

check_repl_delay=0

[server3]

hostname=server3

port=3306

[server4]

hostname=server4

port=3306
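Every hostname= value in app1.cnf has to resolve on the manager, which is why the /etc/hosts mappings matter. A minimal sketch of a check for this, assuming bash on Linux (getent); conf_hostnames and check_conf_hosts are illustrative names, not MHA tools.

```shell
#!/bin/bash
# Sketch: list the hostname= values of an MHA configuration file and report any
# that cannot be resolved. Function names are illustrative assumptions.
conf_hostnames() {
    # Print the value of every "hostname=" line in the given file.
    sed -n 's/^[[:space:]]*hostname[[:space:]]*=[[:space:]]*//p' "$1"
}

check_conf_hosts() {
    local conf="$1" host status=0
    for host in $(conf_hostnames "$conf"); do
        if ! getent hosts "$host" > /dev/null; then
            echo "unresolvable: $host" >&2
            status=1
        fi
    done
    return "$status"
}

# Usage on server5: check_conf_hosts /etc/masterha/app1.cnf && echo "all hostnames resolve"
```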

Configure the failover script

[root@server5 ~]# vim /usr/local/bin/master_ip_failover

#!/usr/bin/env perl

use strict;

use warnings FATAL =>'all';

use Getopt::Long;

my (

$command,          $ssh_user,        $orig_master_host, $orig_master_ip,

$orig_master_port, $new_master_host, $new_master_ip,    $new_master_port

);

my $vip = '192.168.30.254';  # Virtual IP

my $key = "1";

my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip";

my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";

$ssh_user="root";

GetOptions(

'command=s'          => \$command,

'ssh_user=s'         => \$ssh_user,

'orig_master_host=s' => \$orig_master_host,

'orig_master_ip=s'   => \$orig_master_ip,

'orig_master_port=i' => \$orig_master_port,

'new_master_host=s'  => \$new_master_host,

'new_master_ip=s'    => \$new_master_ip,

'new_master_port=i'  => \$new_master_port,

);

exit &main();

sub main {

print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";

if ( $command eq "stop" || $command eq "stopssh" ) {

# $orig_master_host, $orig_master_ip, $orig_master_port are passed.

# If you manage master ip address at global catalog database,

# invalidate orig_master_ip here.

my $exit_code = 1;

#eval {

#print "Disabling the VIP - $vip on old master: $orig_master_host\n";

#&stop_vip();

#           $exit_code = 0;

#        };

eval {

print "Disabling the VIP on old master: $orig_master_host \n";

#my $ping=`ping -c 1 10.0.0.13 |grep "packet loss" |awk -F',''{print $3}' |awk '{print $1}'`;

#if ($ping le "90.0%"&& $ping gt "0.0%" ){

#$exit_code = 0;

#}

#else {

&stop_vip();

# updating global catalog, etc

$exit_code = 0;

#}

};

if ($@) {

warn "Got Error: $@\n";

exit $exit_code;

}

exit $exit_code;

}

elsif ( $command eq "start" ) {

# all arguments are passed.

# If you manage master ip address at global catalog database,

# activate new_master_ip here.

# You can also grant write access (create user, set read_only=0, etc) here.

my $exit_code = 10;

eval {

print "Enabling the VIP - $vip on new master: $new_master_host \n";

&start_vip();

$exit_code = 0;

};

if ($@) {

warn $@;

exit $exit_code;

}

exit $exit_code;

}

elsif ( $command eq "status" ) {

print "Checking the Status of the script.. OK \n";

`ssh $ssh_user\@$orig_master_host \" $ssh_start_vip \"`;

exit 0;

}

else {

&usage();

exit 1;

}

}

# A simple system call that enable the VIP on the new master

sub start_vip() {

`ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;

}

# A simple system call that disable the VIP on the old_master

sub stop_vip() {

`ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;

}

sub usage {

print

"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";

}

[root@server5 ~]# chmod +x /usr/local/bin/master_ip_failover
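Stripped of its argument handling, the failover script runs one of two ifconfig commands over SSH: one on the old master to drop the VIP, one on the new master to raise it. A minimal sketch of just that core, assuming bash; vip_command is an illustrative helper and not part of MHA, and the VIP, key and NIC values are the ones used in the script above.

```shell
#!/bin/bash
# Sketch of the two commands master_ip_failover sends over SSH.
# vip_command is an illustrative helper, not part of MHA.
VIP='192.168.30.254'
KEY='1'
NIC='ens33'

vip_command() {
    case "$1" in
        start) echo "/sbin/ifconfig ${NIC}:${KEY} ${VIP}" ;;    # raise the VIP alias
        stop)  echo "/sbin/ifconfig ${NIC}:${KEY} down" ;;      # drop the VIP alias
        *)     echo "usage: vip_command start|stop" >&2; return 1 ;;
    esac
}

# Example: ssh root@server2 "$(vip_command start)"
```

Running the two commands by hand against a test host is a quick way to confirm the NIC name before relying on the Perl script during a real failover.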

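Set how relay logs are purged on the slaves (24, 23, 21)

Turn off automatic purging so that relay logs are purged manually

mysql -u root -p123456 -e 'set global relay_log_purge=0;'

On the slaves (24, 23, 21), add a relay log purge script to cron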

[root@server2 ~]# vim purge_relay_log.sh

#!/bin/bash

user=root

password=123456

port=3306

log_dir='/tmp'

work_dir='/tmp'

purge='/usr/local/bin/purge_relay_logs'

if [ ! -d $log_dir ]

then

mkdir $log_dir -p

fi

$purge --user=$user --password=$password --disable_relay_log_purge --port=$port --workdir=$work_dir >> $log_dir/purge_relay_logs.log 2>&1

[root@server2 ~]# chmod +x purge_relay_log.sh

[root@server2 ~]# crontab -e

0   4  *   *    *  /bin/bash /root/purge_relay_log.sh

Purge the relay logs manually on the slave nodes

On the slaves (24, 23, 21)

[root@server2 ~]# purge_relay_logs --user=root --password=123456 --disable_relay_log_purge --port=3306 --workdir=/tmp

Check MHA SSH connectivity

[root@server5 ~]#  masterha_check_ssh --conf=/etc/masterha/app1.cnf

Sat Apr 13 19:42:47 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.

Sat Apr 13 19:42:47 2019 - [info] Reading application default configurations from /etc/masterha/app1.cnf..

Sat Apr 13 19:42:47 2019 - [info] Reading server configurations from /etc/masterha/app1.cnf..

Sat Apr 13 19:42:47 2019 - [info] Starting SSH connection tests..

Sat Apr 13 19:42:51 2019 - [debug]

Sat Apr 13 19:42:48 2019 - [debug]  Connecting via SSH from root@server2(192.168.30.24:22) to root@server1(192.168.30.25:22)..

Sat Apr 13 19:42:49 2019 - [debug]   ok.

Sat Apr 13 19:42:49 2019 - [debug]  Connecting via SSH from root@server2(192.168.30.24:22) to root@server3(192.168.30.23:22)..

Sat Apr 13 19:42:50 2019 - [debug]   ok.

Sat Apr 13 19:42:50 2019 - [debug]  Connecting via SSH from root@server2(192.168.30.24:22) to root@server4(192.168.30.21:22)..

Sat Apr 13 19:42:51 2019 - [debug]   ok.

Sat Apr 13 19:42:51 2019 - [debug]

Sat Apr 13 19:42:47 2019 - [debug]  Connecting via SSH from root@server1(192.168.30.25:22) to root@server2(192.168.30.24:22)..

Sat Apr 13 19:42:48 2019 - [debug]   ok.

Sat Apr 13 19:42:48 2019 - [debug]  Connecting via SSH from root@server1(192.168.30.25:22) to root@server3(192.168.30.23:22)..

Sat Apr 13 19:42:49 2019 - [debug]   ok.

Sat Apr 13 19:42:49 2019 - [debug]  Connecting via SSH from root@server1(192.168.30.25:22) to root@server4(192.168.30.21:22)..

Sat Apr 13 19:42:50 2019 - [debug]   ok.

Sat Apr 13 19:42:51 2019 - [debug]

Sat Apr 13 19:42:48 2019 - [debug]  Connecting via SSH from root@server3(192.168.30.23:22) to root@server1(192.168.30.25:22)..

Sat Apr 13 19:42:50 2019 - [debug]   ok.

Sat Apr 13 19:42:50 2019 - [debug]  Connecting via SSH from root@server3(192.168.30.23:22) to root@server2(192.168.30.24:22)..

Sat Apr 13 19:42:50 2019 - [debug]   ok.

Sat Apr 13 19:42:50 2019 - [debug]  Connecting via SSH from root@server3(192.168.30.23:22) to root@server4(192.168.30.21:22)..

Sat Apr 13 19:42:51 2019 - [debug]   ok.

Sat Apr 13 19:42:52 2019 - [debug]

Sat Apr 13 19:42:49 2019 - [debug]  Connecting via SSH from root@server4(192.168.30.21:22) to root@server1(192.168.30.25:22)..

Sat Apr 13 19:42:50 2019 - [debug]   ok.

Sat Apr 13 19:42:50 2019 - [debug]  Connecting via SSH from root@server4(192.168.30.21:22) to root@server2(192.168.30.24:22)..

Sat Apr 13 19:42:51 2019 - [debug]   ok.

Sat Apr 13 19:42:51 2019 - [debug]  Connecting via SSH from root@server4(192.168.30.21:22) to root@server3(192.168.30.23:22)..

Sat Apr 13 19:42:52 2019 - [debug]   ok.

Sat Apr 13 19:42:52 2019 - [info] All SSH connection tests passed successfully.

Check the status of the whole cluster

[root@server5 ~]#  masterha_check_repl --conf=/etc/masterha/app1.cnf

Sat Apr 13 20:05:46 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.

Sat Apr 13 20:05:46 2019 - [info] Reading application default configurations from /etc/masterha/app1.cnf..

Sat Apr 13 20:05:46 2019 - [info] Reading server configurations from /etc/masterha/app1.cnf..

Sat Apr 13 20:05:46 2019 - [info] MHA::MasterMonitor version 0.56.

Sat Apr 13 20:05:47 2019 - [info] Dead Servers:

Sat Apr 13 20:05:47 2019 - [info] Alive Servers:

Sat Apr 13 20:05:47 2019 - [info]   server1(192.168.30.25:3306)

Sat Apr 13 20:05:47 2019 - [info]   server2(192.168.30.24:3306)

Sat Apr 13 20:05:47 2019 - [info]   server3(192.168.30.23:3306)

Sat Apr 13 20:05:47 2019 - [info]   server4(192.168.30.21:3306)

Sat Apr 13 20:05:47 2019 - [info] Alive Slaves:

Sat Apr 13 20:05:47 2019 - [info]   server2(192.168.30.24:3306)  Version=5.5.56-MariaDB (oldest major version between slaves) log-bin:enabled

Sat Apr 13 20:05:47 2019 - [info]     Replicating from 192.168.30.25(192.168.30.25:3306)

Sat Apr 13 20:05:47 2019 - [info]     Primary candidate for the new Master (candidate_master is set)

Sat Apr 13 20:05:47 2019 - [info]   server3(192.168.30.23:3306)  Version=5.5.56-MariaDB (oldest major version between slaves) log-bin:enabled

Sat Apr 13 20:05:47 2019 - [info]     Replicating from 192.168.30.25(192.168.30.25:3306)

Sat Apr 13 20:05:47 2019 - [info]   server4(192.168.30.21:3306)  Version=5.5.56-MariaDB (oldest major version between slaves) log-bin:enabled

Sat Apr 13 20:05:47 2019 - [info]     Replicating from 192.168.30.25(192.168.30.25:3306)

Sat Apr 13 20:05:47 2019 - [info] Current Alive Master: server1(192.168.30.25:3306)

Sat Apr 13 20:05:47 2019 - [info] Checking slave configurations..

Sat Apr 13 20:05:47 2019 - [warning]  relay_log_purge=0 is not set on slave server2(192.168.30.24:3306).

Sat Apr 13 20:05:47 2019 - [warning]  relay_log_purge=0 is not set on slave server3(192.168.30.23:3306).

Sat Apr 13 20:05:47 2019 - [warning]  relay_log_purge=0 is not set on slave server4(192.168.30.21:3306).

Sat Apr 13 20:05:47 2019 - [info] Checking replication filtering settings..

Sat Apr 13 20:05:47 2019 - [info]  binlog_do_db= , binlog_ignore_db=

Sat Apr 13 20:05:47 2019 - [info]  Replication filtering check ok.

Sat Apr 13 20:05:47 2019 - [info] Starting SSH connection tests..

Sat Apr 13 20:05:53 2019 - [info] All SSH connection tests passed successfully.

Sat Apr 13 20:05:53 2019 - [info] Checking MHA Node version..

Sat Apr 13 20:05:54 2019 - [info]  Version check ok.

Sat Apr 13 20:05:54 2019 - [info] Checking SSH publickey authentication settings on the current master..

Sat Apr 13 20:05:54 2019 - [info] HealthCheck: SSH to server1 is reachable.

Sat Apr 13 20:05:54 2019 - [info] Master MHA Node version is 0.56.

Sat Apr 13 20:05:54 2019 - [info] Checking recovery script configurations on the current master..

Sat Apr 13 20:05:54 2019 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/tmp/save_binary_logs_test --manager_version=0.56 --start_file=master-bin.000002

Sat Apr 13 20:05:54 2019 - [info]   Connecting to root@server1(server1)..

Creating /tmp if not exists..    ok.

Checking output directory is accessible or not..

ok.

Binlog found at /var/lib/mysql, up to master-bin.000002

Sat Apr 13 20:05:55 2019 - [info] Master setting check done.

Sat Apr 13 20:05:55 2019 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..

Sat Apr 13 20:05:55 2019 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=server2 --slave_ip=192.168.30.24 --slave_port=3306 --workdir=/tmp --target_version=5.5.56-MariaDB --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx

Sat Apr 13 20:05:55 2019 - [info]   Connecting to root@192.168.30.24(server2:22)..

Checking slave recovery environment settings..

Opening /var/lib/mysql/relay-log.info ... ok.

Relay log found at /var/lib/mysql, up to mariadb-relay-bin.000002

Temporary relay log file is /var/lib/mysql/mariadb-relay-bin.000002

Testing mysql connection and privileges.. done.

Testing mysqlbinlog output.. done.

Cleaning up test file(s).. done.

Sat Apr 13 20:05:55 2019 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=server3 --slave_ip=192.168.30.23 --slave_port=3306 --workdir=/tmp --target_version=5.5.56-MariaDB --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx

Sat Apr 13 20:05:55 2019 - [info]   Connecting to root@192.168.30.23(server3:22)..

Checking slave recovery environment settings..

Opening /var/lib/mysql/relay-log.info ... ok.

Relay log found at /var/lib/mysql, up to slave-relay-bin.000002

Temporary relay log file is /var/lib/mysql/slave-relay-bin.000002

Testing mysql connection and privileges.. done.

Testing mysqlbinlog output.. done.

Cleaning up test file(s).. done.

Sat Apr 13 20:05:56 2019 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=server4 --slave_ip=192.168.30.21 --slave_port=3306 --workdir=/tmp --target_version=5.5.56-MariaDB --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx

Sat Apr 13 20:05:56 2019 - [info]   Connecting to root@192.168.30.21(server4:22)..

Checking slave recovery environment settings..

Opening /var/lib/mysql/relay-log.info ... ok.

Relay log found at /var/lib/mysql, up to slave-relay-bin.000002

Temporary relay log file is /var/lib/mysql/slave-relay-bin.000002

Testing mysql connection and privileges.. done.

Testing mysqlbinlog output.. done.

Cleaning up test file(s).. done.

Sat Apr 13 20:05:56 2019 - [info] Slaves settings check done.

Sat Apr 13 20:05:56 2019 - [info]

server1 (current master)

+--server2

+--server3

+--server4

Sat Apr 13 20:05:56 2019 - [info] Checking replication health on server2..

Sat Apr 13 20:05:56 2019 - [info]  ok.

Sat Apr 13 20:05:56 2019 - [info] Checking replication health on server3..

Sat Apr 13 20:05:56 2019 - [info]  ok.

Sat Apr 13 20:05:56 2019 - [info] Checking replication health on server4..

Sat Apr 13 20:05:56 2019 - [info]  ok.

Sat Apr 13 20:05:56 2019 - [info] Checking master_ip_failover_script status:

Sat Apr 13 20:05:56 2019 - [info]   /usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=server1 --orig_master_ip=192.168.30.25 --orig_master_port=3306

Checking the Status of the script.. OK

Sat Apr 13 20:05:56 2019 - [info]  OK.

Sat Apr 13 20:05:56 2019 - [warning] shutdown_script is not defined.

Sat Apr 13 20:05:56 2019 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.
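Both check commands end with a clear success line, so their saved output can be tested before the manager is started. A minimal sketch, assuming bash and that the output of each command was redirected to a file; checks_passed and the log paths in the usage comment are illustrative.

```shell
#!/bin/bash
# Sketch: succeed only when both saved check outputs contain their success line.
# checks_passed and the file arguments are illustrative assumptions.
checks_passed() {
    local ssh_log="$1" repl_log="$2"
    grep -qF 'All SSH connection tests passed successfully.' "$ssh_log" &&
        grep -qF 'MySQL Replication Health is OK.' "$repl_log"
}

# Usage on server5:
#   masterha_check_ssh  --conf=/etc/masterha/app1.cnf > /tmp/check_ssh.log  2>&1
#   masterha_check_repl --conf=/etc/masterha/app1.cnf > /tmp/check_repl.log 2>&1
#   checks_passed /tmp/check_ssh.log /tmp/check_repl.log && echo "ready to start masterha_manager"
```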

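VIP configuration management

Open the previously edited /etc/masterha/app1.cnf, check that the following line is correct, then check the cluster status again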

[root@server5 ~]# grep -n 'master_ip_failover_script' /etc/masterha/app1.cnf

5:master_ip_failover_script=/usr/local/bin/master_ip_failover

Master1(192.168.30.25)

[root@server1 ~]# ip a |grep ens33

2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000

inet 192.168.30.25/24 brd 192.168.30.255 scope global noprefixroute ens33

inet 192.168.30.254/24 brd 192.168.30.255 scope global secondary ens33:1

Server5 (192.168.30.26): adjust the failover script

[root@server5 ~]# head -15 /usr/local/bin/master_ip_failover

#!/usr/bin/env perl

use strict;

use warnings FATAL =>'all';

use Getopt::Long;

my (

$command,          $ssh_user,        $orig_master_host, $orig_master_ip,

$orig_master_port, $new_master_host, $new_master_ip,    $new_master_port

);

my $vip = '192.168.30.254';  # Virtual IP

my $key = "1";

my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip";

my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";

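The /usr/local/bin/master_ip_failover script works as follows: when the master fails, MHA triggers a failover. MHA manager brings down the ens33:1 interface on the old master and brings the virtual IP up on the candidate master, which completes the switch

Server5 (192.168.30.26): check the manager status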

[root@server5 ~]# masterha_check_status --conf=/etc/masterha/app1.cnf

app1 is stopped(2:NOT_RUNNING).

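When monitoring is healthy the command reports PING_OK; otherwise it reports NOT_RUNNING, which means MHA monitoring has not been started

Server5 (192.168.30.26): start manager monitoring

--remove_dead_master_conf means that after a failover the old master's entry is removed from the configuration file

[root@server5 ~]# nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &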

[1] 10458

Server5 (192.168.30.26): check that monitoring is working

[root@server5 ~]# masterha_check_status --conf=/etc/masterha/app1.cnf

app1 (pid:10458) is running(0:PING_OK), master:server1
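The state token in parentheses (PING_OK or NOT_RUNNING in the two outputs above) is the part worth testing in a script. A minimal sketch that extracts it from a status line, assuming bash and sed; mha_state is an illustrative name.

```shell
#!/bin/bash
# Sketch: extract the state token (e.g. PING_OK, NOT_RUNNING) from one line of
# masterha_check_status output read on stdin. mha_state is an illustrative name.
mha_state() {
    # Match "(<digits>:<STATE>)"; "(pid:10458)" is skipped because it has no leading digit.
    sed -n 's/.*([0-9][0-9]*:\([A-Z_]*\)).*/\1/p'
}

# Usage on server5:
#   masterha_check_status --conf=/etc/masterha/app1.cnf | mha_state
```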

Monitoring is now active

Server5 (192.168.30.26): view the startup log

[root@server5 ~]# cat /var/log/masterha/app1/manager.log

Sat Apr 13 20:27:40 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.

Sat Apr 13 20:27:40 2019 - [info] Reading application default configurations from /etc/masterha/app1.cnf..

Sat Apr 13 20:27:40 2019 - [info] Reading server configurations from /etc/masterha/app1.cnf..

Sat Apr 13 20:27:40 2019 - [info] MHA::MasterMonitor version 0.56.

Sat Apr 13 20:27:41 2019 - [info] Dead Servers:

Sat Apr 13 20:27:41 2019 - [info] Alive Servers:

Sat Apr 13 20:27:41 2019 - [info]   server1(192.168.30.25:3306)

Sat Apr 13 20:27:41 2019 - [info]   server2(192.168.30.24:3306)

Sat Apr 13 20:27:41 2019 - [info]   server3(192.168.30.23:3306)

Sat Apr 13 20:27:41 2019 - [info]   server4(192.168.30.21:3306)

Sat Apr 13 20:27:41 2019 - [info] Alive Slaves:

Sat Apr 13 20:27:41 2019 - [info]   server2(192.168.30.24:3306)  Version=5.5.56-MariaDB (oldest major version between slaves) log-bin:enabled

Sat Apr 13 20:27:41 2019 - [info]     Replicating from 192.168.30.25(192.168.30.25:3306)

Sat Apr 13 20:27:41 2019 - [info]     Primary candidate for the new Master (candidate_master is set)

Sat Apr 13 20:27:41 2019 - [info]   server3(192.168.30.23:3306)  Version=5.5.56-MariaDB (oldest major version between slaves) log-bin:enabled

Sat Apr 13 20:27:41 2019 - [info]     Replicating from 192.168.30.25(192.168.30.25:3306)

Sat Apr 13 20:27:41 2019 - [info]   server4(192.168.30.21:3306)  Version=5.5.56-MariaDB (oldest major version between slaves) log-bin:enabled

Sat Apr 13 20:27:41 2019 - [info]     Replicating from 192.168.30.25(192.168.30.25:3306)

Sat Apr 13 20:27:41 2019 - [info] Current Alive Master: server1(192.168.30.25:3306)

Sat Apr 13 20:27:41 2019 - [info] Checking slave configurations..

Sat Apr 13 20:27:41 2019 - [warning]  relay_log_purge=0 is not set on slave server2(192.168.30.24:3306).

Sat Apr 13 20:27:41 2019 - [warning]  relay_log_purge=0 is not set on slave server3(192.168.30.23:3306).

Sat Apr 13 20:27:41 2019 - [warning]  relay_log_purge=0 is not set on slave server4(192.168.30.21:3306).

Sat Apr 13 20:27:41 2019 - [info] Checking replication filtering settings..

Sat Apr 13 20:27:41 2019 - [info]  binlog_do_db= , binlog_ignore_db=

Sat Apr 13 20:27:41 2019 - [info]  Replication filtering check ok.

Sat Apr 13 20:27:41 2019 - [info] Starting SSH connection tests..

Sat Apr 13 20:27:46 2019 - [info] All SSH connection tests passed successfully.

Sat Apr 13 20:27:46 2019 - [info] Checking MHA Node version..

Sat Apr 13 20:27:47 2019 - [info]  Version check ok.

Sat Apr 13 20:27:47 2019 - [info] Checking SSH publickey authentication settings on the current master..

Sat Apr 13 20:27:48 2019 - [info] HealthCheck: SSH to server1 is reachable.

Sat Apr 13 20:27:48 2019 - [info] Master MHA Node version is 0.56.

Sat Apr 13 20:27:48 2019 - [info] Checking recovery script configurations on the current master..

Sat Apr 13 20:27:48 2019 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/tmp/save_binary_logs_test --manager_version=0.56 --start_file=master-bin.000002

Sat Apr 13 20:27:48 2019 - [info]   Connecting to root@server1(server1)..

Creating /tmp if not exists..    ok.

Checking output directory is accessible or not..

ok.

Binlog found at /var/lib/mysql, up to master-bin.000002

Sat Apr 13 20:27:48 2019 - [info] Master setting check done.

Sat Apr 13 20:27:48 2019 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..

Sat Apr 13 20:27:48 2019 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=server2 --slave_ip=192.168.30.24 --slave_port=3306 --workdir=/tmp --target_version=5.5.56-MariaDB --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx

Sat Apr 13 20:27:48 2019 - [info]   Connecting to root@192.168.30.24(server2:22)..

Checking slave recovery environment settings..

Opening /var/lib/mysql/relay-log.info ... ok.

Relay log found at /var/lib/mysql, up to mariadb-relay-bin.000002

Temporary relay log file is /var/lib/mysql/mariadb-relay-bin.000002

Testing mysql connection and privileges.. done.

Testing mysqlbinlog output.. done.

Cleaning up test file(s).. done.

Sat Apr 13 20:27:49 2019 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=server3 --slave_ip=192.168.30.23 --slave_port=3306 --workdir=/tmp --target_version=5.5.56-MariaDB --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx

Sat Apr 13 20:27:49 2019 - [info]   Connecting to root@192.168.30.23(server3:22)..

Checking slave recovery environment settings..

Opening /var/lib/mysql/relay-log.info ... ok.

Relay log found at /var/lib/mysql, up to slave-relay-bin.000002

Temporary relay log file is /var/lib/mysql/slave-relay-bin.000002

Testing mysql connection and privileges.. done.

Testing mysqlbinlog output.. done.

Cleaning up test file(s).. done.

Sat Apr 13 20:27:49 2019 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=server4 --slave_ip=192.168.30.21 --slave_port=3306 --workdir=/tmp --target_version=5.5.56-MariaDB --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx

Sat Apr 13 20:27:49 2019 - [info]   Connecting to root@192.168.30.21(server4:22)..

Checking slave recovery environment settings..

Opening /var/lib/mysql/relay-log.info ... ok.

Relay log found at /var/lib/mysql, up to slave-relay-bin.000002

Temporary relay log file is /var/lib/mysql/slave-relay-bin.000002

Testing mysql connection and privileges.. done.

Testing mysqlbinlog output.. done.

Cleaning up test file(s).. done.

Sat Apr 13 20:27:50 2019 - [info] Slaves settings check done.

Sat Apr 13 20:27:50 2019 - [info]

server1 (current master)

+--server2

+--server3

+--server4

Sat Apr 13 20:27:50 2019 - [info] Checking master_ip_failover_script status:

Sat Apr 13 20:27:50 2019 - [info]   /usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=server1 --orig_master_ip=192.168.30.25 --orig_master_port=3306

Checking the Status of the script.. OK

Sat Apr 13 20:27:50 2019 - [info]  OK.

Sat Apr 13 20:27:50 2019 - [warning] shutdown_script is not defined.

Sat Apr 13 20:27:50 2019 - [info] Set master ping interval 1 seconds.

Sat Apr 13 20:27:50 2019 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.

Sat Apr 13 20:27:50 2019 - [info] Starting ping health check on server1(192.168.30.25:3306)..

Sat Apr 13 20:27:50 2019 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..

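Note the line "Ping(SELECT) succeeded, waiting until MySQL doesn't respond..", which shows that the whole system is now being monitored

To stop MHA manager monitoring (for reference only; skip this step here)

masterha_stop --conf=/etc/masterha/app1.cnf

The VIP 192.168.30.254 is now bound to NIC ens33 on the master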

[root@server1 ~]# ip a |grep ens33

2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000

inet 192.168.30.25/24 brd 192.168.30.255 scope global noprefixroute ens33

inet 192.168.30.254/24 brd 192.168.30.255 scope global secondary ens33:1

Master (192.168.30.25): simulate a master failure

[root@server1 ~]# systemctl stop mariadb

[root@server1 ~]# ip a | grep ens33

2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000

inet 192.168.30.25/24 brd 192.168.30.255 scope global noprefixroute ens33
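The two ip a listings differ only in the 192.168.30.254 line, so whether a host currently holds the VIP can be tested directly. A minimal sketch, assuming bash; has_vip is an illustrative name and reads ip a output on stdin.

```shell
#!/bin/bash
# Sketch: succeed when the `ip a` output on stdin contains the given VIP.
# has_vip is an illustrative name.
has_vip() {
    local vip="$1"
    # Match "inet <vip>/" so that 192.168.30.25 is not mistaken for 192.168.30.254.
    grep -qF "inet ${vip}/"
}

# Usage after the failover:
#   ssh root@server2 ip a | has_vip 192.168.30.254 && echo "VIP is on server2"
```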

Check slave1 (192.168.30.23): it now replicates from the standby master master2 (192.168.30.24)

MariaDB [(none)]> show slave status\G

*************************** 1. row ***************************

Slave_IO_State: Waiting for master to send event

Master_Host: 192.168.30.24

Master_User: repl

Master_Port: 3306

Connect_Retry: 60

Master_Log_File: master-bin.000003

Read_Master_Log_Pos: 2472

Relay_Log_File: slave-relay-bin.000002

Relay_Log_Pos: 530

Relay_Master_Log_File: master-bin.000003

Slave_IO_Running: Yes

Slave_SQL_Running: Yes

Replicate_Do_DB:

Replicate_Ignore_DB:

Replicate_Do_Table:

Replicate_Ignore_Table:

Replicate_Wild_Do_Table:

Replicate_Wild_Ignore_Table:

Last_Errno: 0

Last_Error:

Skip_Counter: 0

Exec_Master_Log_Pos: 2472

Relay_Log_Space: 824

Until_Condition: None

Until_Log_File:

Until_Log_Pos: 0

Master_SSL_Allowed: No

Master_SSL_CA_File:

Master_SSL_CA_Path:

Master_SSL_Cert:

Master_SSL_Cipher:

Master_SSL_Key:

Seconds_Behind_Master: 0

Master_SSL_Verify_Server_Cert: No

Last_IO_Errno: 0

Last_IO_Error:

Last_SQL_Errno: 0

Last_SQL_Error:

Replicate_Ignore_Server_Ids:

Master_Server_Id: 2

1 row in set (0.00 sec)

Check the status of slave2 (192.168.30.21): it also replicates from the standby master, master2 (192.168.30.24)

MariaDB [(none)]> show slave status\G

*************************** 1. row ***************************

Slave_IO_State: Waiting for master to send event

Master_Host: 192.168.30.24

Master_User: repl

Master_Port: 3306

Connect_Retry: 60

Master_Log_File: master-bin.000003

Read_Master_Log_Pos: 2472

Relay_Log_File: slave-relay-bin.000002

Relay_Log_Pos: 530

Relay_Master_Log_File: master-bin.000003

Slave_IO_Running: Yes

Slave_SQL_Running: Yes

Replicate_Do_DB:

Replicate_Ignore_DB:

Replicate_Do_Table:

Replicate_Ignore_Table:

Replicate_Wild_Do_Table:

Replicate_Wild_Ignore_Table:

Last_Errno: 0

Last_Error:

Skip_Counter: 0

Exec_Master_Log_Pos: 2472

Relay_Log_Space: 824

Until_Condition: None

Until_Log_File:

Until_Log_Pos: 0

Master_SSL_Allowed: No

Master_SSL_CA_File:

Master_SSL_CA_Path:

Master_SSL_Cert:

Master_SSL_Cipher:

Master_SSL_Key:

Seconds_Behind_Master: 0

Master_SSL_Verify_Server_Cert: No

Last_IO_Errno: 0

Last_IO_Error:

Last_SQL_Errno: 0

Last_SQL_Error:

Replicate_Ignore_Server_Ids:

Master_Server_Id: 2

1 row in set (0.00 sec)
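Only three fields of the long "show slave status\G" output matter for this check: Master_Host, Slave_IO_Running and Slave_SQL_Running. The sketch below pulls them out of the text; the function name slave_health is hypothetical, and it assumes the output format shown above.

```shell
# slave_health
# Read `show slave status\G` output on stdin. Print "<master_host> OK" when
# both replication threads report Yes, otherwise "<master_host> FAIL".
# Exit status is 0 for OK and 1 for FAIL. (Hypothetical helper.)
slave_health() {
    awk -F': *' '
        { sub(/^[ \t]+/, "", $1) }              # drop leading indentation
        $1 == "Master_Host"       { host = $2 }
        $1 == "Slave_IO_Running"  { io = $2 }
        $1 == "Slave_SQL_Running" { sql = $2 }
        END {
            ok = (io == "Yes" && sql == "Yes")
            print host, (ok ? "OK" : "FAIL")
            exit(ok ? 0 : 1)
        }'
}
```

A possible call on a slave: mysql -u root -p -e 'show slave status\G' | slave_health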

Server5 (192.168.30.26): the monitoring process has exited on its own

^C[1]+  Done                  nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1

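Server5 (192.168.30.26): the monitoring configuration file has changed (the server1 section was removed, because the manager was started with --remove_dead_master_conf)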

[root@server5 ~]# cat /etc/masterha/app1.cnf

[server default]

manager_log=/var/log/masterha/app1/manager.log

manager_workdir=/var/log/masterha/app1

master_binlog_dir=/var/lib/mysql

master_ip_failover_script=/usr/local/bin/master_ip_failover

password=123456

ping_interval=1

remote_workdir=/tmp

repl_password=123456

repl_user=repl

user=root

[server2]

candidate_master=1

check_repl_delay=0

hostname=server2

port=3306

[server3]

hostname=server3

port=3306

[server4]

hostname=server4

port=3306
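To confirm which hosts are still registered after a failover, the section names can be listed directly. A minimal sketch, assuming sections are named server followed by a number as in this article; list_mha_servers is a hypothetical helper name.

```shell
# list_mha_servers
# Read an MHA configuration file on stdin and print each [serverN] section
# name, one per line. "[server default]" is skipped because it contains a
# space. (Hypothetical helper.)
list_mha_servers() {
    sed -n 's/^\[\(server[0-9][0-9]*\)\]$/\1/p'
}
```

A possible call on server5: list_mha_servers < /etc/masterha/app1.cnf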

Server5 (192.168.30.26): the manager log written during the failover test reads as follows

[root@server5 ~]# tail -f /var/log/masterha/app1/manager.log

Sat Apr 13 20:59:11 2019 - [info] Checking master_ip_failover_script status:

Sat Apr 13 20:59:11 2019 - [info]   /usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=server2 --orig_master_ip=192.168.30.24 --orig_master_port=3306

Checking the Status of the script.. OK

Sat Apr 13 20:59:11 2019 - [info]  OK.

Sat Apr 13 20:59:11 2019 - [warning] shutdown_script is not defined.

Sat Apr 13 20:59:11 2019 - [info] Set master ping interval 1 seconds.

Sat Apr 13 20:59:11 2019 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.

Sat Apr 13 20:59:11 2019 - [info] Starting ping health check on server2(192.168.30.24:3306)..

Sat Apr 13 20:59:11 2019 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..

Sat Apr 13 21:00:10 2019 - [info] Got terminate signal. Exit.

Repairing the failed master and testing VIP failback

Master(192.168.30.25):

[root@server1 ~]# systemctl start mariadb

[root@server1 ~]# netstat -anpt |grep :3306

tcp        0      0 0.0.0.0:3306            0.0.0.0:*               LISTEN      7435/mysqld

Master1 (192.168.30.25): point it at the new master (192.168.30.24) so that it rejoins as a slave

[root@server1 ~]# mysql -u root -p123456

Welcome to the MariaDB monitor.  Commands end with ; or \g.

Your MariaDB connection id is 2

Server version: 5.5.56-MariaDB MariaDB Server

Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> stop slave;

Query OK, 0 rows affected, 1 warning (0.00 sec)

MariaDB [(none)]> change master to

-> master_host='192.168.30.24',

-> master_user='repl',

-> master_password='123456';

Query OK, 0 rows affected (0.00 sec)

MariaDB [(none)]> start slave;

Query OK, 0 rows affected (0.12 sec)

MariaDB [(none)]> show slave status\G

*************************** 1. row ***************************

Slave_IO_State: Waiting for master to send event

Master_Host: 192.168.30.24

Master_User: repl

Master_Port: 3306

Connect_Retry: 60

Master_Log_File: master-bin.000003

Read_Master_Log_Pos: 2472

Relay_Log_File: mariadb-relay-bin.000004

Relay_Log_Pos: 1421

Relay_Master_Log_File: master-bin.000003

Slave_IO_Running: Yes

Slave_SQL_Running: Yes

Replicate_Do_DB:

Replicate_Ignore_DB:

Replicate_Do_Table:

Replicate_Ignore_Table:

Replicate_Wild_Do_Table:

Replicate_Wild_Ignore_Table:

Last_Errno: 0

Last_Error:

Skip_Counter: 0

Exec_Master_Log_Pos: 2472

Relay_Log_Space: 2002

Until_Condition: None

Until_Log_File:

Until_Log_Pos: 0

Master_SSL_Allowed: No

Master_SSL_CA_File:

Master_SSL_CA_Path:

Master_SSL_Cert:

Master_SSL_Cipher:

Master_SSL_Key:

Seconds_Behind_Master: 0

Master_SSL_Verify_Server_Cert: No

Last_IO_Errno: 0

Last_IO_Error:

Last_SQL_Errno: 0

Last_SQL_Error:

Replicate_Ignore_Server_Ids:

Master_Server_Id: 2

1 row in set (0.00 sec)
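The CHANGE MASTER TO statement above omits the binary log file and position, so the rejoined server starts reading from the beginning of the new master's binary logs. That works in this test setup, but on a busier system explicit coordinates are safer. During a failover, MHA normally records a suggested statement in the manager log ("All other slaves should start replication from here. Statement should be: CHANGE MASTER TO ..."). The sketch below extracts it; it assumes that line is still present in the log (the nohup redirection used in this article truncates the log on every restart), and last_change_master is a hypothetical helper name.

```shell
# last_change_master
# Read an MHA manager log on stdin and print the most recent
# "CHANGE MASTER TO ...;" statement it contains, if any.
# Assumption: the log still holds the failover section; the password in
# that statement is masked by MHA and must be filled in by hand.
last_change_master() {
    grep -o 'CHANGE MASTER TO .*;' | tail -n 1
}
```

A possible call on server5: last_change_master < /var/log/masterha/app1/manager.log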

Server5 (192.168.30.26): edit the monitoring configuration file and add the server1 section back

[server1]

hostname=server1

port=3306
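Because the manager removes the dead master's section on every failover, this edit has to be repeated after each test. A minimal sketch of a helper that appends the section only when it is missing; add_mha_server is a hypothetical name, and port 3306 plus the optional candidate settings mirror the values used in this article.

```shell
# add_mha_server CONF NAME [candidate]
# Append a [NAME] section to the MHA configuration file CONF unless it is
# already there. Passing "candidate" as the third argument also writes
# candidate_master=1 and check_repl_delay=0. (Hypothetical helper.)
add_mha_server() {
    local conf="$1"
    local name="$2"
    local candidate="${3:-}"
    if grep -qxF "[$name]" "$conf"; then
        echo "[$name] already present in $conf"
        return 0
    fi
    {
        printf '\n[%s]\n' "$name"
        printf 'hostname=%s\n' "$name"
        printf 'port=3306\n'
        if [ "$candidate" = "candidate" ]; then
            printf 'candidate_master=1\ncheck_repl_delay=0\n'
        fi
    } >> "$conf"
}
```

A possible call on server5: add_mha_server /etc/masterha/app1.cnf server1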

Server5 (192.168.30.26): check the cluster status

[root@server5 ~]# masterha_check_repl --conf=/etc/masterha/app1.cnf

Sat Apr 13 21:25:17 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.

Sat Apr 13 21:25:17 2019 - [info] Reading application default configurations from /etc/masterha/app1.cnf..

Sat Apr 13 21:25:17 2019 - [info] Reading server configurations from /etc/masterha/app1.cnf..

Sat Apr 13 21:25:17 2019 - [info] MHA::MasterMonitor version 0.56.

Sat Apr 13 21:25:26 2019 - [info] Dead Servers:

Sat Apr 13 21:25:26 2019 - [info] Alive Servers:

Sat Apr 13 21:25:26 2019 - [info]   server1(192.168.30.25:3306)

Sat Apr 13 21:25:26 2019 - [info]   server2(192.168.30.24:3306)

Sat Apr 13 21:25:26 2019 - [info]   server3(192.168.30.23:3306)

Sat Apr 13 21:25:26 2019 - [info]   server4(192.168.30.21:3306)

Sat Apr 13 21:25:26 2019 - [info] Alive Slaves:

Sat Apr 13 21:25:26 2019 - [info]   server1(192.168.30.25:3306)  Version=5.5.56-MariaDB (oldest major version between slaves) log-bin:enabled

Sat Apr 13 21:25:26 2019 - [info]     Replicating from 192.168.30.24(192.168.30.24:3306)

Sat Apr 13 21:25:26 2019 - [info]   server3(192.168.30.23:3306)  Version=5.5.56-MariaDB (oldest major version between slaves) log-bin:enabled

Sat Apr 13 21:25:26 2019 - [info]     Replicating from 192.168.30.24(192.168.30.24:3306)

Sat Apr 13 21:25:26 2019 - [info]   server4(192.168.30.21:3306)  Version=5.5.56-MariaDB (oldest major version between slaves) log-bin:enabled

Sat Apr 13 21:25:26 2019 - [info]     Replicating from 192.168.30.24(192.168.30.24:3306)

Sat Apr 13 21:25:26 2019 - [info] Current Alive Master: server2(192.168.30.24:3306)

Sat Apr 13 21:25:26 2019 - [info] Checking slave configurations..

Sat Apr 13 21:25:26 2019 - [info]  read_only=1 is not set on slave server1(192.168.30.25:3306).

Sat Apr 13 21:25:26 2019 - [warning]  relay_log_purge=0 is not set on slave server1(192.168.30.25:3306).

Sat Apr 13 21:25:26 2019 - [warning]  relay_log_purge=0 is not set on slave server3(192.168.30.23:3306).

Sat Apr 13 21:25:26 2019 - [warning]  relay_log_purge=0 is not set on slave server4(192.168.30.21:3306).

Sat Apr 13 21:25:26 2019 - [info] Checking replication filtering settings..

Sat Apr 13 21:25:26 2019 - [info]  binlog_do_db= , binlog_ignore_db=

Sat Apr 13 21:25:26 2019 - [info]  Replication filtering check ok.

Sat Apr 13 21:25:26 2019 - [info] Starting SSH connection tests..

Sat Apr 13 21:25:31 2019 - [info] All SSH connection tests passed successfully.

Sat Apr 13 21:25:31 2019 - [info] Checking MHA Node version..

Sat Apr 13 21:25:32 2019 - [info]  Version check ok.

Sat Apr 13 21:25:32 2019 - [info] Checking SSH publickey authentication settings on the current master..

Sat Apr 13 21:25:32 2019 - [info] HealthCheck: SSH to server2 is reachable.

Sat Apr 13 21:25:33 2019 - [info] Master MHA Node version is 0.56.

Sat Apr 13 21:25:33 2019 - [info] Checking recovery script configurations on the current master..

Sat Apr 13 21:25:33 2019 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/tmp/save_binary_logs_test --manager_version=0.56 --start_file=master-bin.000003

Sat Apr 13 21:25:33 2019 - [info]   Connecting to root@server2(server2)..

Creating /tmp if not exists..    ok.

Checking output directory is accessible or not..

ok.

Binlog found at /var/lib/mysql, up to master-bin.000003

Sat Apr 13 21:25:33 2019 - [info] Master setting check done.

Sat Apr 13 21:25:33 2019 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..

Sat Apr 13 21:25:33 2019 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=server1 --slave_ip=192.168.30.25 --slave_port=3306 --workdir=/tmp --target_version=5.5.56-MariaDB --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx

Sat Apr 13 21:25:33 2019 - [info]   Connecting to root@192.168.30.25(server1:22)..

Checking slave recovery environment settings..

Opening /var/lib/mysql/relay-log.info ... ok.

Relay log found at /var/lib/mysql, up to mariadb-relay-bin.000004

Temporary relay log file is /var/lib/mysql/mariadb-relay-bin.000004

Testing mysql connection and privileges.. done.

Testing mysqlbinlog output.. done.

Cleaning up test file(s).. done.

Sat Apr 13 21:25:34 2019 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=server3 --slave_ip=192.168.30.23 --slave_port=3306 --workdir=/tmp --target_version=5.5.56-MariaDB --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx

Sat Apr 13 21:25:34 2019 - [info]   Connecting to root@192.168.30.23(server3:22)..

Checking slave recovery environment settings..

Opening /var/lib/mysql/relay-log.info ... ok.

Relay log found at /var/lib/mysql, up to slave-relay-bin.000002

Temporary relay log file is /var/lib/mysql/slave-relay-bin.000002

Testing mysql connection and privileges.. done.

Testing mysqlbinlog output.. done.

Cleaning up test file(s).. done.

Sat Apr 13 21:25:34 2019 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=server4 --slave_ip=192.168.30.21 --slave_port=3306 --workdir=/tmp --target_version=5.5.56-MariaDB --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx

Sat Apr 13 21:25:34 2019 - [info]   Connecting to root@192.168.30.21(server4:22)..

Checking slave recovery environment settings..

Opening /var/lib/mysql/relay-log.info ... ok.

Relay log found at /var/lib/mysql, up to slave-relay-bin.000002

Temporary relay log file is /var/lib/mysql/slave-relay-bin.000002

Testing mysql connection and privileges.. done.

Testing mysqlbinlog output.. done.

Cleaning up test file(s).. done.

Sat Apr 13 21:25:35 2019 - [info] Slaves settings check done.

Sat Apr 13 21:25:35 2019 - [info]

server2 (current master)

+--server1

+--server3

+--server4

Sat Apr 13 21:25:35 2019 - [info] Checking replication health on server1..

Sat Apr 13 21:25:35 2019 - [info]  ok.

Sat Apr 13 21:25:35 2019 - [info] Checking replication health on server3..

Sat Apr 13 21:25:35 2019 - [info]  ok.

Sat Apr 13 21:25:35 2019 - [info] Checking replication health on server4..

Sat Apr 13 21:25:35 2019 - [info]  ok.

Sat Apr 13 21:25:35 2019 - [info] Checking master_ip_failover_script status:

Sat Apr 13 21:25:35 2019 - [info]   /usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=server2 --orig_master_ip=192.168.30.24 --orig_master_port=3306

Checking the Status of the script.. OK

Sat Apr 13 21:25:35 2019 - [info]  OK.

Sat Apr 13 21:25:35 2019 - [warning] shutdown_script is not defined.

Sat Apr 13 21:25:35 2019 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.
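The check output is long, and the two things worth reading are the final health line and any [warning] entries (here, the relay_log_purge and shutdown_script notices). The sketch below condenses the output to one line; repl_check_summary is a hypothetical helper name and it relies only on the text shown above.

```shell
# repl_check_summary
# Read masterha_check_repl output on stdin. Print the number of [warning]
# lines and whether the final health line was seen. Exit status is 0 only
# when "MySQL Replication Health is OK." appears. (Hypothetical helper.)
repl_check_summary() {
    awk '
        /\[warning\]/                      { warnings++ }
        /MySQL Replication Health is OK\./ { healthy = 1 }
        END {
            printf "warnings=%d healthy=%s\n", warnings, (healthy ? "yes" : "no")
            exit(healthy ? 0 : 1)
        }'
}
```

A possible call on server5: masterha_check_repl --conf=/etc/masterha/app1.cnf 2>&1 | repl_check_summary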

Server5 (192.168.30.26): start monitoring

[root@server5 ~]# nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &

[2] 14177

Master2 (192.168.30.24): stop MariaDB on the current master

[root@server2 ~]# systemctl stop mariadb

[root@server2 ~]# netstat -anpt |grep :3306

Master1 (192.168.30.25): once the second master is stopped, the VIP moves automatically back to the original master1

[root@server1 ~]# ip a |grep ens33

2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000

inet 192.168.30.25/24 brd 192.168.30.255 scope global noprefixroute ens33

inet 192.168.30.254/24 brd 192.168.30.255 scope global secondary ens33:1

Slave1 (192.168.30.23) status

MariaDB [(none)]> show slave status\G

*************************** 1. row ***************************

Slave_IO_State: Waiting for master to send event

Master_Host: 192.168.30.25

Master_User: repl

Master_Port: 3306

Connect_Retry: 60

Master_Log_File: master-bin.000003

Read_Master_Log_Pos: 1807

Relay_Log_File: slave-relay-bin.000002

Relay_Log_Pos: 530

Relay_Master_Log_File: master-bin.000003

Slave_IO_Running: Yes

Slave_SQL_Running: Yes

Replicate_Do_DB:

Replicate_Ignore_DB:

Replicate_Do_Table:

Replicate_Ignore_Table:

Replicate_Wild_Do_Table:

Replicate_Wild_Ignore_Table:

Last_Errno: 0

Last_Error:

Skip_Counter: 0

Exec_Master_Log_Pos: 1807

Relay_Log_Space: 824

Until_Condition: None

Until_Log_File:

Until_Log_Pos: 0

Master_SSL_Allowed: No

Master_SSL_CA_File:

Master_SSL_CA_Path:

Master_SSL_Cert:

Master_SSL_Cipher:

Master_SSL_Key:

Seconds_Behind_Master: 0

Master_SSL_Verify_Server_Cert: No

Last_IO_Errno: 0

Last_IO_Error:

Last_SQL_Errno: 0

Last_SQL_Error:

Replicate_Ignore_Server_Ids:

Master_Server_Id: 1

1 row in set (0.00 sec)

Slave2 (192.168.30.21) status

MariaDB [(none)]> show slave status\G

*************************** 1. row ***************************

Slave_IO_State: Waiting for master to send event

Master_Host: 192.168.30.25

Master_User: repl

Master_Port: 3306

Connect_Retry: 60

Master_Log_File: master-bin.000003

Read_Master_Log_Pos: 1807

Relay_Log_File: slave-relay-bin.000002

Relay_Log_Pos: 530

Relay_Master_Log_File: master-bin.000003

Slave_IO_Running: Yes

Slave_SQL_Running: Yes

Replicate_Do_DB:

Replicate_Ignore_DB:

Replicate_Do_Table:

Replicate_Ignore_Table:

Replicate_Wild_Do_Table:

Replicate_Wild_Ignore_Table:

Last_Errno: 0

Last_Error:

Skip_Counter: 0

Exec_Master_Log_Pos: 1807

Relay_Log_Space: 824

Until_Condition: None

Until_Log_File:

Until_Log_Pos: 0

Master_SSL_Allowed: No

Master_SSL_CA_File:

Master_SSL_CA_Path:

Master_SSL_Cert:

Master_SSL_Cipher:

Master_SSL_Key:

Seconds_Behind_Master: 0

Master_SSL_Verify_Server_Cert: No

Last_IO_Errno: 0

Last_IO_Error:

Last_SQL_Errno: 0

Last_SQL_Error:

Replicate_Ignore_Server_Ids:

Master_Server_Id: 1

1 row in set (0.00 sec)

Server5 (192.168.30.26): the configuration file has changed (the section for the failed server2 was removed)

[server default]

manager_log=/var/log/masterha/app1/manager.log

manager_workdir=/var/log/masterha/app1

master_binlog_dir=/var/lib/mysql

master_ip_failover_script=/usr/local/bin/master_ip_failover

password=123456

ping_interval=1

remote_workdir=/tmp

repl_password=123456

repl_user=repl

user=root

[server1]

hostname=server1

port=3306

[server3]

hostname=server3

port=3306

[server4]

hostname=server4

port=3306

Server5 (192.168.30.26): monitoring log

[root@server5 ~]# tail -f /var/log/masterha/app1/manager.log

Selected server1 as a new master.

server1: OK: Applying all logs succeeded.

server1: OK: Activated master IP address.

server3: This host has the latest relay log events.

server4: This host has the latest relay log events.

Generating relay diff files from the latest slave succeeded.

server4: OK: Applying all logs succeeded. Slave started, replicating from server1.

server3: OK: Applying all logs succeeded. Slave started, replicating from server1.

server1: Resetting slave info succeeded.

Master failover to server1(192.168.30.25:3306) completed successfully.
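The last line of the failover report names the new master. If a follow-up script needs that value (for example to re-point a repaired host), it can be parsed from the log. This is a sketch under the assumption that the line has exactly the form shown above; new_master_from_report is a hypothetical helper name.

```shell
# new_master_from_report
# Read an MHA manager log on stdin and print "<host> <ip> <port>" for each
# "Master failover to ... completed successfully." line.
# (Hypothetical helper; the line format is taken from the log above.)
new_master_from_report() {
    sed -n 's/^Master failover to \([^ (]*\)(\([0-9.]*\):\([0-9]*\)) completed successfully\.$/\1 \2 \3/p'
}
```

A possible call on server5: new_master_from_report < /var/log/masterha/app1/manager.log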

Repair the master2 host (192.168.30.24)

[root@server2 ~]# systemctl start mariadb

[root@server2 ~]# netstat -anpt |grep :3306

tcp        0      0 0.0.0.0:3306            0.0.0.0:*               LISTEN      8451/mysqld

Master2 (192.168.30.24): point it at the new master (192.168.30.25)

[root@server2 ~]# mysql -u root -p123456

MariaDB [(none)]> stop slave;

Query OK, 0 rows affected, 1 warning (0.00 sec)

MariaDB [(none)]> change master to

-> master_host='192.168.30.25',

-> master_user='repl',

-> master_password='123456';

Query OK, 0 rows affected (0.00 sec)

MariaDB [(none)]> start slave;

Query OK, 0 rows affected (0.00 sec)

MariaDB [(none)]> show slave status\G

*************************** 1. row ***************************

Slave_IO_State: Waiting for master to send event

Master_Host: 192.168.30.25

Master_User: repl

Master_Port: 3306

Connect_Retry: 60

Master_Log_File: master-bin.000003

Read_Master_Log_Pos: 1807

Relay_Log_File: mariadb-relay-bin.000004

Relay_Log_Pos: 530

Relay_Master_Log_File: master-bin.000003

Slave_IO_Running: Yes

Slave_SQL_Running: Yes

Replicate_Do_DB:

Replicate_Ignore_DB:

Replicate_Do_Table:

Replicate_Ignore_Table:

Replicate_Wild_Do_Table:

Replicate_Wild_Ignore_Table:

Last_Errno: 0

Last_Error:

Skip_Counter: 0

Exec_Master_Log_Pos: 1807

Relay_Log_Space: 2447

Until_Condition: None

Until_Log_File:

Until_Log_Pos: 0

Master_SSL_Allowed: No

Master_SSL_CA_File:

Master_SSL_CA_Path:

Master_SSL_Cert:

Master_SSL_Cipher:

Master_SSL_Key:

Seconds_Behind_Master: 0

Master_SSL_Verify_Server_Cert: No

Last_IO_Errno: 0

Last_IO_Error:

Last_SQL_Errno: 0

Last_SQL_Error:

Replicate_Ignore_Server_Ids:

Master_Server_Id: 1

Server5 (192.168.30.26): edit the monitoring configuration file and add the server2 section back

[server2]

hostname=server2

candidate_master=1

port=3306

check_repl_delay=0

Server5 (192.168.30.26): check the cluster status

[root@server5 ~]# masterha_check_repl --conf=/etc/masterha/app1.cnf

Sat Apr 13 21:51:01 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.

Sat Apr 13 21:51:01 2019 - [info] Reading application default configurations from /etc/masterha/app1.cnf..

Sat Apr 13 21:51:01 2019 - [info] Reading server configurations from /etc/masterha/app1.cnf..

Sat Apr 13 21:51:01 2019 - [info] MHA::MasterMonitor version 0.56.

Sat Apr 13 21:51:02 2019 - [info] Dead Servers:

Sat Apr 13 21:51:02 2019 - [info] Alive Servers:

Sat Apr 13 21:51:02 2019 - [info]   server1(192.168.30.25:3306)

Sat Apr 13 21:51:02 2019 - [info]   server2(192.168.30.24:3306)

Sat Apr 13 21:51:02 2019 - [info]   server3(192.168.30.23:3306)

Sat Apr 13 21:51:02 2019 - [info]   server4(192.168.30.21:3306)

Sat Apr 13 21:51:02 2019 - [info] Alive Slaves:

Sat Apr 13 21:51:02 2019 - [info]   server2(192.168.30.24:3306)  Version=5.5.56-MariaDB (oldest major version between slaves) log-bin:enabled

Sat Apr 13 21:51:02 2019 - [info]     Replicating from 192.168.30.25(192.168.30.25:3306)

Sat Apr 13 21:51:02 2019 - [info]     Primary candidate for the new Master (candidate_master is set)

Sat Apr 13 21:51:02 2019 - [info]   server3(192.168.30.23:3306)  Version=5.5.56-MariaDB (oldest major version between slaves) log-bin:enabled

Sat Apr 13 21:51:02 2019 - [info]     Replicating from 192.168.30.25(192.168.30.25:3306)

Sat Apr 13 21:51:02 2019 - [info]   server4(192.168.30.21:3306)  Version=5.5.56-MariaDB (oldest major version between slaves) log-bin:enabled

Sat Apr 13 21:51:02 2019 - [info]     Replicating from 192.168.30.25(192.168.30.25:3306)

Sat Apr 13 21:51:02 2019 - [info] Current Alive Master: server1(192.168.30.25:3306)

Sat Apr 13 21:51:02 2019 - [info] Checking slave configurations..

Sat Apr 13 21:51:02 2019 - [info]  read_only=1 is not set on slave server2(192.168.30.24:3306).

Sat Apr 13 21:51:02 2019 - [warning]  relay_log_purge=0 is not set on slave server2(192.168.30.24:3306).

Sat Apr 13 21:51:02 2019 - [warning]  relay_log_purge=0 is not set on slave server3(192.168.30.23:3306).

Sat Apr 13 21:51:02 2019 - [warning]  relay_log_purge=0 is not set on slave server4(192.168.30.21:3306).

Sat Apr 13 21:51:02 2019 - [info] Checking replication filtering settings..

Sat Apr 13 21:51:02 2019 - [info]  binlog_do_db= , binlog_ignore_db=

Sat Apr 13 21:51:02 2019 - [info]  Replication filtering check ok.

Sat Apr 13 21:51:02 2019 - [info] Starting SSH connection tests..

Sat Apr 13 21:51:10 2019 - [info] All SSH connection tests passed successfully.

Sat Apr 13 21:51:10 2019 - [info] Checking MHA Node version..

Sat Apr 13 21:51:12 2019 - [info]  Version check ok.

Sat Apr 13 21:51:12 2019 - [info] Checking SSH publickey authentication settings on the current master..

Sat Apr 13 21:51:13 2019 - [info] HealthCheck: SSH to server1 is reachable.

Sat Apr 13 21:51:13 2019 - [info] Master MHA Node version is 0.56.

Sat Apr 13 21:51:13 2019 - [info] Checking recovery script configurations on the current master..

Sat Apr 13 21:51:13 2019 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/tmp/save_binary_logs_test --manager_version=0.56 --start_file=master-bin.000003

Sat Apr 13 21:51:13 2019 - [info]   Connecting to root@server1(server1)..

Creating /tmp if not exists..    ok.

Checking output directory is accessible or not..

ok.

Binlog found at /var/lib/mysql, up to master-bin.000003

Sat Apr 13 21:51:13 2019 - [info] Master setting check done.

Sat Apr 13 21:51:13 2019 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..

Sat Apr 13 21:51:13 2019 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=server2 --slave_ip=192.168.30.24 --slave_port=3306 --workdir=/tmp --target_version=5.5.56-MariaDB --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx

Sat Apr 13 21:51:13 2019 - [info]   Connecting to root@192.168.30.24(server2:22)..

Checking slave recovery environment settings..

Opening /var/lib/mysql/relay-log.info ... ok.

Relay log found at /var/lib/mysql, up to mariadb-relay-bin.000004

Temporary relay log file is /var/lib/mysql/mariadb-relay-bin.000004

Testing mysql connection and privileges.. done.

Testing mysqlbinlog output.. done.

Cleaning up test file(s).. done.

Sat Apr 13 21:51:14 2019 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=server3 --slave_ip=192.168.30.23 --slave_port=3306 --workdir=/tmp --target_version=5.5.56-MariaDB --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx

Sat Apr 13 21:51:14 2019 - [info]   Connecting to root@192.168.30.23(server3:22)..

Checking slave recovery environment settings..

Opening /var/lib/mysql/relay-log.info ... ok.

Relay log found at /var/lib/mysql, up to slave-relay-bin.000002

Temporary relay log file is /var/lib/mysql/slave-relay-bin.000002

Testing mysql connection and privileges.. done.

Testing mysqlbinlog output.. done.

Cleaning up test file(s).. done.

Sat Apr 13 21:51:14 2019 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=server4 --slave_ip=192.168.30.21 --slave_port=3306 --workdir=/tmp --target_version=5.5.56-MariaDB --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx

Sat Apr 13 21:51:14 2019 - [info]   Connecting to root@192.168.30.21(server4:22)..

Checking slave recovery environment settings..

Opening /var/lib/mysql/relay-log.info ... ok.

Relay log found at /var/lib/mysql, up to slave-relay-bin.000002

Temporary relay log file is /var/lib/mysql/slave-relay-bin.000002

Testing mysql connection and privileges.. done.

Testing mysqlbinlog output.. done.

Cleaning up test file(s).. done.

Sat Apr 13 21:51:15 2019 - [info] Slaves settings check done.

Sat Apr 13 21:51:15 2019 - [info]

server1 (current master)

+--server2

+--server3

+--server4

Sat Apr 13 21:51:15 2019 - [info] Checking replication health on server2..

Sat Apr 13 21:51:15 2019 - [info]  ok.

Sat Apr 13 21:51:15 2019 - [info] Checking replication health on server3..

Sat Apr 13 21:51:15 2019 - [info]  ok.

Sat Apr 13 21:51:15 2019 - [info] Checking replication health on server4..

Sat Apr 13 21:51:15 2019 - [info]  ok.

Sat Apr 13 21:51:15 2019 - [info] Checking master_ip_failover_script status:

Sat Apr 13 21:51:15 2019 - [info]   /usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=server1 --orig_master_ip=192.168.30.25 --orig_master_port=3306

Checking the Status of the script.. OK

Sat Apr 13 21:51:15 2019 - [info]  OK.

Sat Apr 13 21:51:15 2019 - [warning] shutdown_script is not defined.

Sat Apr 13 21:51:15 2019 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.

The experiment is complete.
