本文共 16010 字,大约阅读时间需要 53 分钟。
MMM配置官方文档:
本文环境:
主库:CentOS6.7 x64 192.168.0.201 mysql-5.6.29 主库:CentOS6.7 x64 192.168.0.202 mysql-5.6.29 备库:CentOS6.7 x64 192.168.0.203 mysql-5.6.29 备库:CentOS6.7 x64 192.168.0.204 mysql-5.6.29 管理:CentOS6.7 x64 192.168.0.205MMM服务器角色对应关系:
node1(主1) 192.168.0.201 db1 1 node2(主2) 192.168.0.202 db2 2 node3(从1) 192.168.0.203 db3 11 node4(从1) 192.168.0.204 db4 12 node5(监控) 192.168.0.205 mon -配置完成后,使用下面的VIP访问MySQL Cluster。
192.168.0.211 writer 192.168.0.212 reader 192.168.0.213 reader 192.168.0.214 reader 192.168.0.215 reader
[mysqld]
server-id = 1 binlog-format = ROW log-bin = master-bin log-bin-index = master-bin.index log-slave-updates = true auto_increment_offset = 1 auto_increment_increment = 2 mysql> show master status; +-------------------+----------+--------------+------------------+-------------------+ | File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set | +-------------------+----------+--------------+------------------+-------------------+ | master-bin.000001 | 120 | | | | +-------------------+----------+--------------+------------------+-------------------+ 1 row in set (0.00 sec)mysql>
[mysqld]
server-id = 2 binlog-format = ROW log-bin = master-bin log-bin-index = master-bin.index log-slave-updates = true auto_increment_offset = 2 auto_increment_increment = 2 mysql> show master status; +-------------------+----------+--------------+------------------+-------------------+ | File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set | +-------------------+----------+--------------+------------------+-------------------+ | master-bin.000001 | 120 | | | | +-------------------+----------+--------------+------------------+-------------------+ 1 row in set (0.00 sec)mysql>
grant replication slave on *.* to 'repl'@'%' identified by '123456';
flush privileges;
[mysqld]
binlog-format = ROW log-bin = mysql-bin relay-log = slave-relay-bin relay-log-index = slave-relay-bin.index log-slave-updates = true server-id = 11 skip-name-resolve
[mysqld]
binlog-format = ROW log-bin = mysql-bin relay-log = slave-relay-bin relay-log-index = slave-relay-bin.index log-slave-updates = true server-id = 12 skip-name-resolve
(1) 配置同步,手动执行同步参数,该配置会写入master.info文件中。
mysql >
CHANGE MASTER TO
MASTER_HOST='192.168.0.202', MASTER_PORT=3306, MASTER_USER='repl', MASTER_PASSWORD='123456', MASTER_LOG_FILE='master-bin.000001', MASTER_LOG_POS=120;mysql> start slave;
mysql> show slave status\G;
*************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 192.168.0.202 Master_User: repl Master_Port: 3306 Connect_Retry: 60 Master_Log_File: master-bin.000001 Read_Master_Log_Pos: 407 Relay_Log_File: node1-relay-bin.000002 Relay_Log_Pos: 571 Relay_Master_Log_File: master-bin.000001 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB:
(1) 配置同步,手动执行同步参数,该配置会写入master.info文件中。
mysql >
CHANGE MASTER TO
MASTER_HOST='192.168.0.201', MASTER_PORT=3306, MASTER_USER='repl', MASTER_PASSWORD='123456', MASTER_LOG_FILE='master-bin.000001', MASTER_LOG_POS=120;mysql> start slave;
mysql> show slave status\G; *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 192.168.0.201 Master_User: repl Master_Port: 3306 Connect_Retry: 60 Master_Log_File: master-bin.000001 Read_Master_Log_Pos: 694 Relay_Log_File: node2-relay-bin.000002 Relay_Log_Pos: 571 Relay_Master_Log_File: master-bin.000001 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB:
(1) node1上创建数据库
mysql> create database mydb;
Query OK, 1 row affected (0.00 sec)mysql> use mydb;
Database changed mysql> show tables; +----------------+ | Tables_in_mydb | +----------------+ | user | +----------------+ 1 row in set (0.01 sec)mysql> select * from user;
+----+----------+----------+ | id | username | password | +----+----------+----------+ | 1 | koumm | 123456 | +----+----------+----------+ 1 row in set (0.01 sec)mysql>
(2) node2上的操作如下: mysql> show databases; +--------------------+ | Database | +--------------------+ | information_schema | | mydb | | mysql | | performance_schema | +--------------------+ 4 rows in set (0.00 sec)mysql> use mydb;
Database changed mysql> CREATE TABLE `user` ( -> `id` varchar(20) NOT NULL, -> `username` varchar(20) NOT NULL, -> `password` char(32) NOT NULL, -> PRIMARY KEY (`id`) -> ) ; Query OK, 0 rows affected (0.14 sec)mysql> INSERT INTO user VALUES ('1', 'koumm', '123456');
Query OK, 1 row affected (0.00 sec)mysql>
(1) 主库锁表
mysql> flush tables with read lock;
Query OK, 0 rows affected (0.00 sec) mysql> show master status; +-------------------+----------+--------------+------------------+-------------------+ | File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set | +-------------------+----------+--------------+------------------+-------------------+ | master-bin.000001 | 1226 | | | | +-------------------+----------+--------------+------------------+-------------------+ 1 row in set (0.00 sec) (2) 主库备份[root@master ~]# mysqldump -uroot -p -B mydb > mydb.sql
说明:-B参数有建库语句。 (3) 主库解开锁表功能mysql> unlock tables;
Query OK, 0 rows affected (0.00 sec) mysql>
# mysql -uroot -padmin < mydb.sql
(1) 配置同步,手动执行同步参数,该配置会写入master.info文件中。
mysql >
CHANGE MASTER TO
MASTER_HOST='192.168.0.201', MASTER_PORT=3306, MASTER_USER='repl', MASTER_PASSWORD='123456', MASTER_LOG_FILE='master-bin.000001', MASTER_LOG_POS=1226; mysql> start slave; Query OK, 0 rows affected (0.04 sec)mysql> show slave status\G;
*************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 192.168.1.121 Master_User: repl Master_Port: 3306 Connect_Retry: 60 Master_Log_File: master-bin.000005 Read_Master_Log_Pos: 730 Relay_Log_File: slave-relay-bin.000002 Relay_Log_Pos: 284 Relay_Master_Log_File: master-bin.000005 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: 可以简单测试一下同步数据情况。
(1) node1,node2,node3,node4,node5节点
# wget # rpm -ivh epel-release-6-8.noarch.rpm(2) node5(监控节点)
# yum install mysql-mmm mysql-mmm-agent mysql-mmm-tools mysql-mmm-monitor(3) node1,node2,node3,node4 (mysql节点)
# yum install mysql-mmm-agent
mysql> grant replication client on *.* to 'mmm_monitor'@'192.168.0.%' identified by '123456';
mysql> grant super, replication client, process on *.* to 'mmm_agent'@'192.168.0.%' identified by '123456'; mysql> flush privileges;
[root@node1 ~]# vi /etc/mysql-mmm/mmm_common.conf
active_master_role writer<host default>
cluster_interface eth0 pid_path /var/run/mysql-mmm/mmm_agentd.pid bin_path /usr/libexec/mysql-mmm/ replication_user repl replication_password 123456 agent_user mmm_agent agent_password 123456 </host><host db1>
ip 192.168.0.201 mode master peer db2 </host><host db2>
ip 192.168.0.202 mode master peer db1 </host><host db3>
ip 192.168.0.203 mode slave </host><host db4>
ip 192.168.0.204 mode slave </host><role writer>
hosts db1, db2 ips 192.168.0.211 mode exclusive </role><role reader>
hosts db1, db2, db3, db4 ips 192.168.0.212,192.168.0.213,192.168.0.214,192.168.0.215 mode balanced </role>复制该配置文件到所有节点上:
[root@node1 ~]# scp /etc/mysql-mmm/mmm_common.conf node2:/etc/mysql-mmm/ [root@node1 ~]# scp /etc/mysql-mmm/mmm_common.conf node3:/etc/mysql-mmm/ [root@node1 ~]# scp /etc/mysql-mmm/mmm_common.conf node4:/etc/mysql-mmm/ [root@node1 ~]# scp /etc/mysql-mmm/mmm_common.conf node5:/etc/mysql-mmm/
(1) node1,node2,node3,node4 的MySQL节点配置mmm_agent.conf
# vi /etc/mysql-mmm/mmm_agent.conf
include mmm_common.conf this db1 #在四台mysql节点上设置对应的db,分别为db1、db2、db3、db4 (2) 默认为启用,可以不用修改 # cat /etc/default/mysql-mmm-agent # mysql-mmm-agent defaults ENABLED=1(3) mysql节点启动服务
# chkconfig mysql-mmm-agent on
# /etc/init.d/mysql-mmm-agent start
(1) 配置监控
[root@node5 ~]# vi /etc/mysql-mmm/mmm_mon.conf include mmm_common.conf<monitor>
ip 127.0.0.1 pid_path /var/run/mysql-mmm/mmm_mond.pid bin_path /usr/libexec/mysql-mmm status_path /var/lib/mysql-mmm/mmm_mond.status ping_ips 192.168.0.201,192.168.0.202,192.168.0.203,192.168.0.204 auto_set_online 30# The kill_host_bin does not exist by default, though the monitor will
# throw a warning about it missing. See the section 5.10 "Kill Host # Functionality" in the PDF documentation. # # kill_host_bin /usr/libexec/mysql-mmm/monitor/kill_host # </monitor><host default>
monitor_user mmm_monitor monitor_password 123456 </host>debug 0
(2) mysql节点启动服务# chkconfig mysql-mmm-monitor on
# /etc/init.d/mysql-mmm-monitor start
(1) 管理服务器上查看状态
[root@node1 ~]# mmm_control show
db1(192.168.0.201) master/ONLINE. Roles: reader(192.168.0.215), writer(192.168.0.211) db2(192.168.0.202) master/ONLINE. Roles: reader(192.168.0.213) db3(192.168.0.203) slave/ONLINE. Roles: reader(192.168.0.212) db4(192.168.0.204) slave/ONLINE. Roles: reader(192.168.0.214)[root@node1 ~]# mmm_control checks all
db4 ping [last change: 2016/04/27 16:45:49] OK db4 mysql [last change: 2016/04/27 16:45:49] OK db4 rep_threads [last change: 2016/04/27 16:45:49] OK db4 rep_backlog [last change: 2016/04/27 16:45:49] OK: Backlog is null db2 ping [last change: 2016/04/27 16:45:49] OK db2 mysql [last change: 2016/04/27 16:45:49] OK db2 rep_threads [last change: 2016/04/27 16:45:49] OK db2 rep_backlog [last change: 2016/04/27 16:45:49] OK: Backlog is null db3 ping [last change: 2016/04/27 16:45:49] OK db3 mysql [last change: 2016/04/27 16:45:49] OK db3 rep_threads [last change: 2016/04/27 16:45:49] OK db3 rep_backlog [last change: 2016/04/27 16:45:49] OK: Backlog is null db1 ping [last change: 2016/04/27 16:45:49] OK db1 mysql [last change: 2016/04/27 16:45:49] OK db1 rep_threads [last change: 2016/04/27 16:45:49] OK db1 rep_backlog [last change: 2016/04/27 16:45:49] OK: Backlog is null [root@node1 ~]# (2) 服务器的启动VIP地址[root@node1 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000 link/ether 00:0c:29:ff:33:6a brd ff:ff:ff:ff:ff:ff inet 192.168.0.201/24 brd 192.168.0.255 scope global eth0 inet 192.168.0.215/32 scope global eth0 inet 192.168.0.211/32 scope global eth0 inet6 fe80::20c:29ff:feff:336a/64 scope link valid_lft forever preferred_lft forever [root@node1 ~]#[root@node2 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000 link/ether 00:0c:29:8c:60:58 brd ff:ff:ff:ff:ff:ff inet 192.168.0.202/24 brd 192.168.0.255 scope global eth0 inet 192.168.0.213/32 scope global eth0 inet6 fe80::20c:29ff:fe8c:6058/64 scope link valid_lft forever preferred_lft forever [root@node2 ~]#服务器读写采有VIP地址进行读写,出现故障时VIP会漂移到其它节点,由其它节点提供服务。
四、MMM高可用测试
首先查看整个集群的状态,可以看到整个集群状态正常。
[root@node5 ~]# mmm_control show db1(192.168.0.201) master/ONLINE. Roles: reader(192.168.0.212), writer(192.168.0.211) db2(192.168.0.202) master/ONLINE. Roles: reader(192.168.0.213) db3(192.168.0.203) slave/ONLINE. Roles: reader(192.168.0.215) db4(192.168.0.204) slave/ONLINE. Roles: reader(192.168.0.214) [root@192.168.0.30 ~]#1. 模拟node2宕机,手动停止mysql服务,观察monitor日志node2由关闭到启动的日志如下:
[root@192.168.0.30 ~]# tail -f /var/log/mysql-mmm/mmm_mond.log
2016/04/29 09:58:33 FATAL State of host 'db2' changed from ONLINE to HARD_OFFLINE (ping: OK, mysql: not OK) 2016/04/29 10:00:54 FATAL State of host 'db2' changed from HARD_OFFLINE to AWAITING_RECOVERY 2016/04/29 10:01:24 FATAL State of host 'db2' changed from AWAITING_RECOVERY to ONLINE because of auto_set_online(30 seconds). It was in state AWAITING_RECOVERY for 30seconds
重新查看集群的最新状态:
[root@node5 ~]# mmm_control show
db1(192.168.0.201) master/ONLINE. Roles: reader(192.168.0.212), writer(192.168.0.211) db2(192.168.0.202) master/HARD_OFFLINE. Roles: #从日志发现db2的状态有ONLINE转换为HARD_OFFLINE db3(192.168.0.203) slave/ONLINE. Roles: reader(192.168.0.213), reader(192.168.0.215) db4(192.168.0.204) slave/ONLINE. Roles: reader(192.168.0.214)[root@node5 ~]# mmm_control checks all
db4 ping [last change: 2016/04/29 09:01:08] OK db4 mysql [last change: 2016/04/29 09:01:08] OK db4 rep_threads [last change: 2016/04/29 09:01:08] OK db4 rep_backlog [last change: 2016/04/29 09:01:08] OK: Backlog is null db2 ping [last change: 2016/04/29 09:01:08] OK db2 mysql [last change: 2016/04/29 09:58:33] ERROR: Connect error (host = 192.168.0.202:3306, user = mmm_monitor)! Lost connection to MySQL server at 'readinginitial communication packet', system error: 111 能ping通,说明只是服务死掉了。
db2 rep_threads [last change: 2016/04/29 09:01:08] OK db2 rep_backlog [last change: 2016/04/29 09:01:08] OK: Backlog is null db3 ping [last change: 2016/04/29 09:01:08] OK db3 mysql [last change: 2016/04/29 09:01:08] OK db3 rep_threads [last change: 2016/04/29 09:01:08] OK db3 rep_backlog [last change: 2016/04/29 09:01:08] OK: Backlog is null db1 ping [last change: 2016/04/29 09:01:08] OK db1 mysql [last change: 2016/04/29 09:56:33] OK db1 rep_threads [last change: 2016/04/29 09:01:08] OK db1 rep_backlog [last change: 2016/04/29 09:01:08] OK: Backlog is null[root@node5 ~]#
启动node2,node2由HARD_OFFLINE转到AWAITING_RECOVERY状态。这里db2再次接管读请求。
[root@node5 ~]# mmm_control show
db1(192.168.0.201) master/ONLINE. Roles: reader(192.168.0.212), writer(192.168.0.211) db2(192.168.0.202) master/ONLINE. Roles: reader(192.168.0.213) db3(192.168.0.203) slave/ONLINE. Roles: reader(192.168.0.215) db4(192.168.0.204) slave/ONLINE. Roles: reader(192.168.0.214) [root@node5 ~]# 2. 模拟node1主库宕机,查看集群状态。[root@node5 ~]# mmm_control show
db1(192.168.0.201) master/HARD_OFFLINE. Roles: db2(192.168.0.202) master/ONLINE. Roles: reader(192.168.0.213), writer(192.168.0.211) db3(192.168.0.203) slave/ONLINE. Roles: reader(192.168.0.212), reader(192.168.0.215) db4(192.168.0.204) slave/ONLINE. Roles: reader(192.168.0.214)node1主库再启动:
[root@node5 ~]# mmm_control show
db1(192.168.0.201) master/ONLINE. Roles: reader(192.168.0.212) db2(192.168.0.202) master/ONLINE. Roles: reader(192.168.0.213), writer(192.168.0.211) db3(192.168.0.203) slave/ONLINE. Roles: reader(192.168.0.215) db4(192.168.0.204) slave/ONLINE. Roles: reader(192.168.0.214)可以看到主库启动用会自动转到读的角色,不会接管主,只到现有的主再次宕机。
[root@192.168.0.30 ~]# tail -f /var/log/mysql-mmm/mmm_mond.log
2016/04/29 10:03:25 FATAL State of host 'db1' changed from ONLINE to HARD_OFFLINE (ping: OK, mysql: not OK) 2016/04/29 10:06:20 FATAL State of host 'db1' changed from HARD_OFFLINE to AWAITING_RECOVERY 2016/04/29 10:06:51 FATAL State of host 'db1' changed from AWAITING_RECOVERY to ONLINE because of auto_set_online(30 seconds). It was in state AWAITING_RECOVERY for 31seconds
3. 测试小结:
(1)node2备选主节点宕机不影响集群的状态,就是移除了node2备选节点的读状态。 (2)node1主节点宕机,由node2备选主节点接管写角色,node3,node4指向新node2主库进行复制,node3,node4会自动change master到node2.(3)问题来了: 如果node1主库宕机,node2复制应用又落后于node1时就变成了主可写状态,这进的数据主无法保证一致性。 如果node2,node3,node4延迟于node1主,这个时node1宕机,node3,node4将会等待数据追上db1后,再重新指向新的主node2进行复制操作,这时的数据也无法保证同步的一致性。 (4)如查采用MMM高可用架构,主,主备选节点机器配置一样,而且开启半同步进一步提高安全性或采用MariaDB进行多线程从复制,提高复制的性能。说明:MMM(Master-Master Replication Manager for MySQL)主主复制管理器, MMM集群套件具有良好的稳定性、高可用性和可扩展性。当活动的Master节点出现故障时,备用Master节点可以立即接管,而其他的Slave节点也能自动切换到备用Master节点继续进行同步复制,而无需人为干涉;MMM架构需要多个节点、多个IP,对服务器数量有要求,在读写非常繁忙的业务系统下表现不是很稳定,可能会出现复制延时、切换失效等问题。MMM方案并不太适应于对数据安全性要求很高,并且读、写繁忙的环境中。
本文转自 koumm 51CTO博客,原文链接:http://blog.51cto.com/koumm/1768381,如需转载请自行联系原作者