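An HA (high-availability) cluster, also called a dual-machine hot standby, keeps business-critical services running without interruption. For example, with two machines A and B, A normally provides the service while B stands by idle; once A goes down or its service fails, the service switches automatically to B. Open-source software for high availability includes heartbeat and keepalived, and keepalived also offers load balancing. Because heartbeat is a commonly used open-source cluster package, it is well worth knowing how to configure it.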

Note: what follows covers installing heartbeat with yum and configuring it. The EPEL repository is required; if it is not yet enabled, run:

# yum install -y epel-release

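1. Test environment:
  Two CentOS 6.0 64-bit virtual machines (master: eth1: 192.168.220.11; slave: eth1: 192.168.220.22). The master host is given a virtual IP, which is the address clients use to reach the service. Each VM here has only one NIC, so heartbeat traffic shares it; in production, use multiple NICs or a serial link for the heartbeat, otherwise the single link is a weak point.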

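2. Preparation:
   [1] Set the hostname (heartbeat identifies nodes by the name that uname -n reports, so it must match the node names used in ha.cf later; the name itself is up to you)
  master host: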

# vim /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=master
# hostname master;bash

  slave host:

# vim /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=slave
# hostname slave;bash

[2] Edit /etc/hosts (same configuration on both hosts)

# vim /etc/hosts
192.168.220.11 master
192.168.220.22 slave
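
Since heartbeat relies on both node names resolving consistently, it can help to check the hosts file on each machine before going further. The following is a minimal sketch; the helper name hosts_ip is hypothetical and not part of heartbeat.

```shell
#!/bin/sh
# hosts_ip NAME [FILE]: print the first address mapped to NAME in a
# hosts-format file (defaults to /etc/hosts). Hypothetical helper.
hosts_ip() {
    name=$1
    file=${2:-/etc/hosts}
    awk -v n="$name" '
        /^[ \t]*#/ { next }      # skip comment lines
        NF < 2     { next }      # skip blank or incomplete lines
        { for (i = 2; i <= NF; i++) if ($i == n) { print $1; exit } }
    ' "$file"
}

# Example use on either node:
#   hosts_ip master    # expected here: 192.168.220.11
#   hosts_ip slave     # expected here: 192.168.220.22
#   uname -n           # should print this node's own name (master or slave)
```

If either lookup prints nothing, the entry is missing from that machine's /etc/hosts.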

[3] Disable the firewall and SELinux

# iptables -F
# getenforce //if this prints Disabled, nothing needs changing; if it prints Enforcing, edit the config as follows:
# vim /etc/selinux/config
SELINUX=enforcing --> SELINUX=disabled   //takes effect after a reboot
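
The same edit can be made without opening an editor. This is a sketch only; the function names are hypothetical, and the file path is passed in explicitly so the functions can be tried on a copy first.

```shell
#!/bin/sh
# selinux_configured_mode [FILE]: print the value of the SELINUX= line.
selinux_configured_mode() {
    sed -n 's/^SELINUX=\(.*\)$/\1/p' "${1:-/etc/selinux/config}"
}

# set_selinux_disabled FILE: rewrite the SELINUX= line to SELINUX=disabled.
# SELINUXTYPE= and comment lines are left untouched.
set_selinux_disabled() {
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Example use (as root):
#   set_selinux_disabled /etc/selinux/config
#   selinux_configured_mode        # should now print: disabled
```

The config file only applies from the next boot; setenforce 0 switches the running system to permissive mode in the meantime.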

[4] Set up the virtual IP

# cd /etc/sysconfig/network-scripts
# cp ifcfg-eth1 ifcfg-eth1:0
# vim ifcfg-eth1:0 //a minimal configuration; most parameters are not needed:
DEVICE=eth1:0 //change to eth1:0
TYPE=Ethernet
ONBOOT=yes
NM_CONTROLLED=yes
BOOTPROTO=static
IPADDR=192.168.220.33 //change the last octet to 33
NETMASK=255.255.255.0
# /etc/init.d/network restart
# ifconfig //if the configuration is correct, the eth1:0 virtual interface is listed
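
The copy-and-edit step above can also be scripted. The sketch below derives the alias file from the base file with sed; make_alias_cfg is a hypothetical name, and dropping the HWADDR and UUID lines is an assumption based on the trimmed file shown above, which contains neither.

```shell
#!/bin/sh
# make_alias_cfg BASE_FILE ALIAS_DEVICE IP: print an alias ifcfg to stdout.
# Rewrites DEVICE= and IPADDR=, drops HWADDR=/UUID= (assumption: these
# belong to the physical interface only), and keeps every other line.
make_alias_cfg() {
    base=$1
    alias_dev=$2
    ip=$3
    sed -e '/^HWADDR=/d' -e '/^UUID=/d' \
        -e "s/^DEVICE=.*/DEVICE=$alias_dev/" \
        -e "s/^IPADDR=.*/IPADDR=$ip/" "$base"
}

# Example use (as root, in /etc/sysconfig/network-scripts):
#   make_alias_cfg ifcfg-eth1 eth1:0 192.168.220.33 > ifcfg-eth1:0
```

The // annotations shown in the listing above are notes for the reader and should not be typed into the file.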

 

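3. Installing and configuring heartbeat:
 [1] Install with yum: # yum install -y heartbeat* libnet nginx   //heartbeat depends on libnet; nginx is the service under test and can also be installed with yum.
 [2] Configuration on the master host: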

# cd /usr/share/doc/heartbeat-3.0.4/   //mind the version; yours may not be 3.0.4
# cp authkeys ha.cf haresources /etc/ha.d/ //copy the three core configuration files
# cd /etc/ha.d

  (1) Edit authkeys

# vim authkeys   //the last 4 lines of the file are:
# auth 1
#1 crc //weakest
#2 sha1 HI! //strongest
#3 md5 Hello! //in between

  Uncomment the auth line and change its value to 3, then uncomment the last line (3 md5 Hello!). This selects md5, the middle option.

# chmod 600 authkeys   //set permissions to 600, otherwise heartbeat will not start
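
The sample key Hello! is public, so outside a test setup a random key is a safer choice. The sketch below is an alternative to the hand edit above, not the procedure this walkthrough follows: it writes a one-entry authkeys file using sha1 with a key derived from /dev/urandom and sets mode 600. The function name write_authkeys is hypothetical.

```shell
#!/bin/sh
# write_authkeys PATH: create an authkeys file with a single sha1 entry
# and a random 32-hex-character key, readable by the owner only.
write_authkeys() {
    path=$1
    key=$(dd if=/dev/urandom bs=512 count=1 2>/dev/null | md5sum | cut -d' ' -f1)
    # Create the file under a restrictive umask so the key is never world-readable.
    ( umask 077; printf 'auth 1\n1 sha1 %s\n' "$key" > "$path" )
    chmod 600 "$path"
}

# Example use (as root):
#   write_authkeys /etc/ha.d/authkeys
```

Both nodes need the identical file, so generate it once on the master and copy it to the slave rather than running the function on each host.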

  (2) Edit haresources

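# vim haresources   //everything is commented out by default, so just append one line at the end:
master 192.168.220.33/24/eth1:0 nginx //the IP is the virtual IP configured on the alias interface, 24 is the subnet prefix length, and nginx is the name of the service under test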
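
To make the field layout of that line explicit, here is a small sketch that splits each active haresources line into its parts. It assumes the bare IP/prefix/interface form used in this article; describe_haresources is a hypothetical helper, not a heartbeat tool.

```shell
#!/bin/sh
# describe_haresources FILE: for every non-comment line, print the preferred
# node, the virtual IP with its prefix and interface, and the service list.
describe_haresources() {
    awk '
        /^[ \t]*#/ { next }      # skip comments
        NF < 2     { next }      # skip blank or incomplete lines
        {
            split($2, a, "/")    # IP / prefix length / interface label
            printf "node=%s ip=%s prefix=%s iface=%s services=", $1, a[1], a[2], a[3]
            for (i = 3; i <= NF; i++) printf "%s%s", $i, (i < NF ? "," : "")
            printf "\n"
        }
    ' "$1"
}

# Example use:
#   describe_haresources /etc/ha.d/haresources
```

The first field names the node that owns the resources when both nodes are healthy, which is why the slave's copy of the file keeps master there too.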

  (3) Edit ha.cf

# > ha.cf   //empty the file
# vim !$ //edit it and add the following configuration:
debugfile /var/log/ha-debug //debug log path
logfile /var/log/ha-log //runtime log
logfacility local0
keepalive 2
deadtime 30
warntime 10
initdead 60
udpport 694
ucast eth1 192.168.220.22 //NIC IP of the slave
auto_failback on
node master
node slave
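ping 192.168.220.2 //arbitration address, usually the router, or some other reliable IP whose service is stable
respawn hacluster /usr/lib64/heartbeat/ipfail //note: on 32-bit Linux the path is lib, not lib64; with the wrong path heartbeat reports an error like: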
########## ERROR: Client child command [/usr/lib/heartbeat/ipfail] is not executable ##############
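
The four timing directives above (keepalive, warntime, deadtime, initdead) depend on one another, so a quick sanity check can catch typos. The sketch below applies common rules of thumb: keepalive below warntime, warntime below deadtime, and initdead at least twice deadtime. It assumes whole-second values with no unit suffix; the function names are hypothetical.

```shell
#!/bin/sh
# hacf_value FILE KEY: print the first value given for KEY in an ha.cf-style file.
hacf_value() {
    awk -v k="$2" '$1 == k { print $2; exit }' "$1"
}

# check_hacf_timing FILE: print "timing ok" or a description of the first problem.
check_hacf_timing() {
    file=$1
    keepalive=$(hacf_value "$file" keepalive)
    warntime=$(hacf_value "$file" warntime)
    deadtime=$(hacf_value "$file" deadtime)
    initdead=$(hacf_value "$file" initdead)
    for v in "$keepalive" "$warntime" "$deadtime" "$initdead"; do
        case $v in
            ''|*[!0-9]*) echo "missing or non-integer timing value"; return 1 ;;
        esac
    done
    if [ "$keepalive" -ge "$warntime" ]; then
        echo "keepalive ($keepalive) should be smaller than warntime ($warntime)"; return 1
    fi
    if [ "$warntime" -ge "$deadtime" ]; then
        echo "warntime ($warntime) should be smaller than deadtime ($deadtime)"; return 1
    fi
    if [ "$initdead" -lt $((deadtime * 2)) ]; then
        echo "initdead ($initdead) should be at least twice deadtime ($deadtime)"; return 1
    fi
    echo "timing ok"
}

# Example use:
#   check_hacf_timing /etc/ha.d/ha.cf
```

With the values used here (2, 10, 30, 60) the check passes: a heartbeat is sent every 2 seconds, a warning is logged after 10 seconds of silence, and the peer is declared dead after 30.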

  (4) Copy the configuration files to the slave host:

# scp authkeys ha.cf haresources slave:/etc/ha.d/

[3] Configuration on the slave host: only ha.cf needs changing:

ucast eth1 192.168.220.22 --> ucast eth1 192.168.220.11   //change the IP to the master's address
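
Because the ucast line is the only difference between the two nodes, the slave's ha.cf can be derived from the master's copy mechanically. This is a sketch with a hypothetical function name; it prints the rewritten file to stdout and leaves all other lines as they are.

```shell
#!/bin/sh
# retarget_ucast FILE INTERFACE PEER_IP: print FILE with the ucast line for
# INTERFACE pointed at PEER_IP.
retarget_ucast() {
    awk -v ifc="$2" -v peer="$3" '
        $1 == "ucast" && $2 == ifc { print "ucast", ifc, peer; next }
        { print }
    ' "$1"
}

# Example use on the slave, after the scp step:
#   retarget_ucast /etc/ha.d/ha.cf eth1 192.168.220.11 > /tmp/ha.cf.new
#   mv /tmp/ha.cf.new /etc/ha.d/ha.cf
```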

[4] Start heartbeat (master first, then slave)

  (1) master host

# /etc/init.d/heartbeat start
Starting High-Availability services: INFO: Running OK
CRITICAL: Resource 192.168.220.33/24/eth1:0 is active, and should not be!
CRITICAL: Non-idle resources can affect data integrity!
info: If you don't know what this means, then get help!
info: Read the docs and/or source to /usr/share/heartbeat/ResourceManager for more details.
CRITICAL: Resource 192.168.220.33/24/eth1:0 is active, and should not be!
CRITICAL: Non-idle resources can affect data integrity!
info: If you don't know what this means, then get help!
info: Read the docs and/or the source to /usr/share/heartbeat/ResourceManager for more details.
CRITICAL: Non-idle resources will affect resource takeback!
CRITICAL: Non-idle resources may affect data integrity!
Done.

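  The CRITICAL messages above most likely appear because the virtual IP was already brought up by hand in step [4], while heartbeat expects to bring that resource up itself. heartbeat starts nginx automatically, though the first start is slow. After a while (a little over 10 seconds), check whether nginx has been started: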

# ps aux |grep nginx

  (2) Edit nginx's index.html so it is easy to see which machine is serving:

# > /usr/share/doc/nginx/html/index.html   //empty the file
# echo "masterMMMMMMMMMMMM" > !$

  If nginx is running, enter the virtual IP 192.168.220.33 in a browser; the response should be: masterMMMMMMMMMMMM

  (3) slave host:
  Under normal conditions nginx is not started here, because the master has not gone down, so ps aux |grep nginx shows no nginx process.
Edit nginx's index.html:

# > /usr/share/doc/nginx/html/index.html
# echo "slaveSSSSSSSSSSSSSS" > !$

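  ipfail judges connectivity by pinging the arbitration address, so we block ping on the master; once heartbeat sees the ping fail, it hands the nginx service over to the slave:
# iptables -A INPUT -p icmp -j DROP   //ping uses the ICMP protocol; dropping ICMP makes ping fail.
  You can now follow what heartbeat does with tail -f /var/log/ha-log: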

  ha-log on the master:

Jan 11 22:47:32 master heartbeat: [2574]: WARN: node 192.168.220.2: is dead    //ping to the router 192.168.220.2 failed
Jan 11 22:47:32 master ipfail: [2601]: info: Status update: Node 192.168.220.2 now has status dead
Jan 11 22:47:32 master heartbeat: [2574]: info: Link 192.168.220.2:192.168.220.2 dead. //the link to the router is down
harc(default)[2929]: 2016/01/11_22:47:32 info: Running /etc/ha.d//rc.d/status status
Jan 11 22:47:33 master ipfail: [2601]: info: NS: We are dead. :<
Jan 11 22:47:33 master ipfail: [2601]: info: Link Status update: Link 192.168.220.2/192.168.220.2 now has status dead
Jan 11 22:47:34 master ipfail: [2601]: info: We are dead. :< //so it is actually this node that is cut off
Jan 11 22:47:34 master ipfail: [2601]: info: Asking other side for ping node count.
Jan 11 22:47:37 master ipfail: [2601]: info: Giving up because we were told that we have less ping nodes.
Jan 11 22:47:37 master ipfail: [2601]: info: Delayed giveup in 4 seconds.
Jan 11 22:47:41 master ipfail: [2601]: info: giveup() called (timeout worked)
Jan 11 22:47:42 master heartbeat: [2574]: info: master wants to go standby [all]
Jan 11 22:47:42 master heartbeat: [2574]: info: standby: slave can take our all resources //the slave can take over the services
Jan 11 22:47:42 master heartbeat: [2956]: info: give up all HA resources (standby). //give up our resources
ResourceManager(default)[2969]: 2016/01/11_22:47:42 info: Releasing resource group: master 192.168.220.33/24/eth1:0 nginx
ResourceManager(default)[2969]: 2016/01/11_22:47:42 info: Running /etc/init.d/nginx stop //stop the nginx service
ResourceManager(default)[2969]: 2016/01/11_22:47:42 info: Running /etc/ha.d/resource.d/IPaddr 192.168.220.33/24/eth1:0 stop
IPaddr(IPaddr_192.168.220.33)[3057]: 2016/01/11_22:47:42 INFO: IP status = ok, IP_CIP=
/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.220.33)[3031]: 2016/01/11_22:47:42 INFO: Success
Jan 11 22:47:42 master heartbeat: [2956]: info: all HA resource release completed (standby).
Jan 11 22:47:42 master heartbeat: [2574]: info: Local standby process completed [all].
Jan 11 22:47:43 master heartbeat: [2574]: WARN: 1 lost packet(s) for [slave] [459:461]
Jan 11 22:47:43 master heartbeat: [2574]: info: remote resource transition completed. //remote resource transition completed
Jan 11 22:47:43 master heartbeat: [2574]: info: No pkts missing from slave! //no packets lost
Jan 11 22:47:43 master heartbeat: [2574]: info: Other node completed standby takeover of all resources. //the slave node has fully taken over our work

   ha-log on the slave:

Jan 12 11:48:17 slave ipfail: [115215]: info: Telling other node that we have more visible ping nodes.   //tell the master that we can still reach the ping node
Jan 12 11:48:22 slave heartbeat: [115188]: info: master wants to go standby [all] //the master wants to go standby, so we take over
Jan 12 11:48:22 slave heartbeat: [115188]: info: standby: acquire [all] resources from master //acquire the resources from the master
Jan 12 11:48:22 slave heartbeat: [115841]: info: acquire all HA resources (standby).
ResourceManager(default)[115854]: 2016/01/12_11:48:22 info: Acquiring resource group: master 192.168.220.33/24/eth1:0 nginx
/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.220.33)[115882]: 2016/01/12_11:48:22 INFO: Resource is stopped
ResourceManager(default)[115854]: 2016/01/12_11:48:22 info: Running /etc/ha.d/resource.d/IPaddr 192.168.220.33/24/eth1:0 start //bring up the virtual IP
IPaddr(IPaddr_192.168.220.33)[116015]: 2016/01/12_11:48:22 INFO: Adding inet address 192.168.220.33/24 with broadcast address 192.168.220.255 to device eth1 (with label eth1:0) //the virtual IP is attached to this node's NIC
IPaddr(IPaddr_192.168.220.33)[116015]: 2016/01/12_11:48:22 INFO: Bringing device eth1 up
IPaddr(IPaddr_192.168.220.33)[116015]: 2016/01/12_11:48:22 INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.220.33 eth1 192.168.220.33 auto not_used not_used
/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.220.33)[115989]: 2016/01/12_11:48:22 INFO: Success //virtual IP configured
ResourceManager(default)[115854]: 2016/01/12_11:48:22 info: Running /etc/init.d/nginx start //start the nginx service
Jan 12 11:48:23 slave heartbeat: [115841]: info: all HA resource acquisition completed (standby). //all HA resources acquired
Jan 12 11:48:23 slave heartbeat: [115188]: info: Standby resource acquisition done [all]. //resource acquisition done
Jan 12 11:48:24 slave heartbeat: [115188]: info: remote resource transition completed. //remote resource transition completed, all done!
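
The logs are verbose, so when checking a failover it can be easier to pull out only the milestone lines. The sketch below filters an ha-log for a handful of phrases taken from the excerpts above; the phrase list is an assumption based on these excerpts and may need extending for other heartbeat versions. The function name failover_events is hypothetical.

```shell
#!/bin/sh
# failover_events FILE: print only the log lines that mark the main steps
# of a failover (ping node lost, standby request, release, acquire, done).
failover_events() {
    grep -E 'is dead|wants to go standby|Releasing resource group|Acquiring resource group|completed standby takeover|resource acquisition completed' "$1"
}

# Example use on either node:
#   failover_events /var/log/ha-log
```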

  These logs show how heartbeat works through a failover. Entering the virtual IP 192.168.220.33 in the browser now returns:
slaveSSSSSSSSSSSSSS
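  At this point nginx on the master has been stopped and nginx on the slave has taken over, so the service continues without interruption.
  If, instead of using the firewall, we had stopped the heartbeat service directly, the result would be the same: the slave takes over nginx. So what happens if we remove the iptables rule we just added, or start heartbeat again?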

# iptables -D INPUT -p icmp -j DROP
# service heartbeat start

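   The result: the slave stops nginx automatically, and the master starts its nginx again and takes the web service back. Try it yourself; refreshing the browser shows the result clearly.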
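
The automatic return to the master follows from the auto_failback on line in ha.cf; with it set to off, the resources would stay on the slave until moved back by hand. The sketch below reads the setting from a given file; failback_mode is a hypothetical helper name.

```shell
#!/bin/sh
# failback_mode FILE: print the auto_failback value from an ha.cf-style file,
# or "unset" when the directive is absent.
failback_mode() {
    awk '
        $1 == "auto_failback" { print $2; found = 1; exit }
        END { if (!found) print "unset" }
    ' "$1"
}

# Example use:
#   failback_mode /etc/ha.d/ha.cf    # with the configuration above: on
```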
