MySQL MHA troubleshooting notes
2024-10-21 16:03:08
The monitoring service on the manager host would not start:
[root@manager ~]# managerStart
[]
[root@manager ~]# managerStatus
app1 is stopped(:NOT_RUNNING).
[]+ Exit nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log >&1
# Note: managerStart and managerStatus are aliases I defined for the commands that start the service and check its status.
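The alias definitions themselves are not shown in this post. A minimal sketch of what they might look like, written as shell functions: the start command is taken from the job line above (with the redirection written in its usual `2>&1 &` form), while the use of `masterha_check_status` for `managerStatus` and the variable names `MHA_CONF` and `MHA_LOG` are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the real alias definitions are not shown in the post.
MHA_CONF=/etc/masterha/app1.cnf
MHA_LOG=/var/log/masterha/app1/manager.log

# Start the manager in the background, detached from the terminal.
managerStart() {
  nohup masterha_manager --conf="$MHA_CONF" \
    --remove_dead_master_conf --ignore_last_failover \
    < /dev/null > "$MHA_LOG" 2>&1 &
}

# Report whether the manager is currently monitoring the master.
managerStatus() {
  masterha_check_status --conf="$MHA_CONF"
}
```

Functions are used here instead of `alias` because they also work in non-interactive shells.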
# Checking the log turned up this message:
Sun Mar :: - [error][/usr/local/share/perl5/MHA/ServerManager.pm, ln781] Multi-master configuration is detected,
but two or more masters are either writable (read-only is not set) or dead! Check configurations for details. Master configurations are as below:
Master 10.0.0.50(10.0.0.50:)
Master 10.0.0.60(10.0.0.60:), replicating from 10.0.0.50(10.0.0.50:)
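Roughly, this means two servers are configured as masters and both are writable. Only one host should accept writes at any given time; otherwise the data can diverge, which would be a disastrous failure.
Set MySQL on 10.0.0.60 to read-only: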
mysql -e 'set global read_only=1'
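To confirm the setting took effect, the value can be read back from each node. The sketch below assumes the mysql client can log in to every host without prompting (for example through `~/.my.cnf`); the function name `show_read_only` is made up for illustration. Note that `SET GLOBAL` does not survive a server restart, so the option would also need to go into the configuration file to be permanent.

```shell
#!/usr/bin/env bash
# Print the global read_only value of each host given as an argument.
# Assumes passwordless client login to every host (not shown in the post).
show_read_only() {
  local host
  for host in "$@"; do
    printf '%s read_only=%s\n' "$host" \
      "$(mysql -h "$host" -N -B -e 'SELECT @@global.read_only')"
  done
}
```

For example, `show_read_only 10.0.0.50 10.0.0.60` should report 0 for the writable master and 1 for the read-only one.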
Even after this change the monitor still would not start, and the log still reported an error:
Sun Mar :: - [info] Multi-master configuration is detected. Current primary(writable) master is 10.0.0.50(10.0.0.50:)
Sun Mar :: - [info] Master configurations are as below:
Master 10.0.0.50(10.0.0.50:)
Master 10.0.0.60(10.0.0.60:), replicating from 10.0.0.50(10.0.0.50:), read-only
Sun Mar :: - [error][/usr/local/share/perl5/MHA/ServerManager.pm, ln726] Slave 10.0.0.70(10.0.0.70:) replicates from 10.0.0.60:, but real master is 10.0.0.50(10.0.0.50:)!
Sun Mar :: - [error][/usr/local/share/perl5/MHA/MasterMonitor.pm, ln424] Error happened on checking configurations. at /usr/local/share/perl5/MHA/MasterMonitor.pm line
Sun Mar :: - [error][/usr/local/share/perl5/MHA/MasterMonitor.pm, ln523] Error happened on monitoring servers.
Sun Mar :: - [info] Got exit code (Not master dead).
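Thinking it over: why were there two masters? Earlier I had simulated a master crash. The VIP floated to 60 and 60 was promoted to master. 70 had originally been a slave of 50, but after that failover it became a slave of 60. With 50 back in the picture, that left two masters in the topology.
To verify this guess, I forced 70 to follow 50 again: I ran CHANGE MASTER TO on 70 with 50 as the master host, using 50's binlog file name and position.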
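The exact statement is not shown in this post. As a rough sketch, the re-pointing could be scripted as below; `repoint_sql` is a hypothetical helper, and the binlog file name and position are whatever `SHOW MASTER STATUS` reports on 50 at that moment. The replication user and password are left out on the assumption that 70 keeps the credentials it already has, since CHANGE MASTER TO retains options that are not specified.

```shell
#!/usr/bin/env bash
# Print the statements that re-point a slave at a given master.
# Arguments: master host, binlog file name, binlog position
# (the last two as reported by SHOW MASTER STATUS on that master).
repoint_sql() {
  local master_host=$1 log_file=$2 log_pos=$3
  cat <<SQL
STOP SLAVE;
CHANGE MASTER TO
  MASTER_HOST='${master_host}',
  MASTER_LOG_FILE='${log_file}',
  MASTER_LOG_POS=${log_pos};
START SLAVE;
SQL
}
```

The output would be piped into the client on the slave, for example `repoint_sql 10.0.0.50 <binlog-file> <position> | mysql -h 10.0.0.70`, and checked afterwards with `SHOW SLAVE STATUS\G`. Jumping to the master's current position like this is only reasonable in a test setup where the slave's data is known to match the master.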
( ̄▽ ̄)" Haha, I guessed right, and was rather pleased about it.
[root@manager ~]# managerStatus
app1 monitoring program is now on initialization phase(:INITIALIZING_MONITOR). Wait for a while and try checking again.
[root@manager ~]# managerStatus
app1 (pid:) is running(:PING_OK), master:10.0.0.50
Sun Mar :: - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sun Mar :: - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Sun Mar :: - [info] Reading server configuration from /etc/masterha/app1.cnf..
Sun Mar :: - [info] MHA::MasterMonitor version 0.56.
Sun Mar :: - [info] GTID failover mode =
Sun Mar :: - [info] Dead Servers:
Sun Mar :: - [info] Alive Servers:
Sun Mar :: - [info] 10.0.0.50(10.0.0.50:)
Sun Mar :: - [info] 10.0.0.60(10.0.0.60:)
Sun Mar :: - [info] 10.0.0.70(10.0.0.70:)
Sun Mar :: - [info] Alive Slaves:
Sun Mar :: - [info] 10.0.0.60(10.0.0.60:) Version=5.6.-log (oldest major version between slaves) log-bin:enabled
Sun Mar :: - [info] Replicating from 10.0.0.50(10.0.0.50:)
Sun Mar :: - [info] Primary candidate for the new Master (candidate_master is set)
Sun Mar :: - [info] 10.0.0.70(10.0.0.70:) Version=5.6. (oldest major version between slaves) log-bin:disabled
Sun Mar :: - [info] Replicating from 10.0.0.50(10.0.0.50:)
Sun Mar :: - [info] Current Alive Master: 10.0.0.50(10.0.0.50:)
Sun Mar :: - [info] Checking slave configurations..
Sun Mar :: - [warning] relay_log_purge= is not set on slave 10.0.0.60(10.0.0.60:).
Sun Mar :: - [warning] relay_log_purge= is not set on slave 10.0.0.70(10.0.0.70:).
Sun Mar :: - [warning] log-bin is not set on slave 10.0.0.70(10.0.0.70:). This host cannot be a master.
Sun Mar :: - [info] Checking replication filtering settings..
Sun Mar :: - [info] binlog_do_db= , binlog_ignore_db=
Sun Mar :: - [info] Replication filtering check ok.
Sun Mar :: - [info] GTID (with auto-pos) is not supported
Sun Mar :: - [info] Starting SSH connection tests..
Sun Mar :: - [info] All SSH connection tests passed successfully.
Sun Mar :: - [info] Checking MHA Node version..
Sun Mar :: - [info] Version check ok.
Sun Mar :: - [info] Checking SSH publickey authentication settings on the current master..
Sun Mar :: - [info] HealthCheck: SSH to 10.0.0.50 is reachable.
Sun Mar :: - [info] Master MHA Node version is 0.56.
Sun Mar :: - [info] Checking recovery script configurations on 10.0.0.50(10.0.0.50:)..
Sun Mar :: - [info] Executing command: save_binary_logs --command=test --start_pos= --binlog_dir=/mysql/data --output_file=/tmp/save_binary_logs_test --manager_version=0.56 --start_file=mysql-bin.
Sun Mar :: - [info] Connecting to root@10.0.0.50(10.0.0.50:)..
Creating /tmp if not exists.. ok.
Checking output directory is accessible or not..
ok.
Binlog found at /mysql/data, up to mysql-bin.
Sun Mar :: - [info] Binlog setting check done.
Sun Mar :: - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Sun Mar :: - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=10.0.0.60 --slave_ip=10.0.0.60 --slave_port= --workdir=/tmp --target_version=5.6.-log --manager_version=0.56 --relay_log_info=/mysql/data/relay-log.info --relay_dir=/mysql/data/ --slave_pass=xxx
Sun Mar :: - [info] Connecting to root@10.0.0.60(10.0.0.60:)..
Checking slave recovery environment settings..
Opening /mysql/data/relay-log.info ... ok.
Relay log found at /mysql/data, up to cadicate-master-relay-bin.
Temporary relay log file is /mysql/data/cadicate-master-relay-bin.
Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure.
done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Sun Mar :: - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=10.0.0.70 --slave_ip=10.0.0.70 --slave_port= --workdir=/tmp --target_version=5.6. --manager_version=0.56 --relay_log_info=/mysql/data/relay-log.info --relay_dir=/mysql/data/ --slave_pass=xxx
Sun Mar :: - [info] Connecting to root@10.0.0.70(10.0.0.70:)..
Checking slave recovery environment settings..
Opening /mysql/data/relay-log.info ... ok.
Relay log found at /mysql/data, up to slave-relay-bin.
Temporary relay log file is /mysql/data/slave-relay-bin.
Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure.
done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Sun Mar :: - [info] Slaves settings check done.
Sun Mar :: - [info]
10.0.0.50(10.0.0.50:) (current master)
+--10.0.0.60(10.0.0.60:)
+--10.0.0.70(10.0.0.70:)
Sun Mar :: - [info] Checking master_ip_failover_script status:
Sun Mar :: - [info] /usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=10.0.0.50 --orig_master_ip=10.0.0.50 --orig_master_port=
IN SCRIPT TEST====/etc/init.d/keepalived stop==/etc/init.d/keepalived start===
Checking the Status of the script.. OK
Sun Mar :: - [info] OK.
Sun Mar :: - [warning] shutdown_script is not defined.
Sun Mar :: - [info] Set master ping interval seconds.
Sun Mar :: - [info] Set secondary check script: /usr/local/bin/masterha_secondary_check -s server03 -s server02
Sun Mar :: - [info] Starting ping health check on 10.0.0.50(10.0.0.50:)..
Sun Mar :: - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..
Analyze the logs, analyze the logs, analyze the logs. Important things get said three times!
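In that spirit, a small sketch for pulling only the problem lines out of the manager log. The function name `mha_problems` is made up; the default path is the log file used by the start command earlier in this post.

```shell
#!/usr/bin/env bash
# Show only the [error] and [warning] lines of an MHA manager log.
# Argument: log file path (defaults to the app1 manager log).
mha_problems() {
  grep -E '\[(error|warning)\]' "${1:-/var/log/masterha/app1/manager.log}"
}
```

Running `mha_problems` right after a failed start shows the lines that matter without the surrounding info output.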