Distributed Systems in Practice (7): Redis Cluster Mode in Practice

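Redis has supported a native cluster mode, Redis Cluster, since version 3.0. Redis Cluster targets high-concurrency, high-availability workloads over large data sets, where "large" means the data no longer fits on a single Redis instance and has to be sharded across nodes.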

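In this chapter I walk through building a Redis Cluster with 3 Masters and 3 Slaves. For the underlying principles of Redis Cluster, see the article 《分布式框架之高性能:Redis集群模式》 (on Redis cluster mode) in the advanced series.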

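1. Cluster Deployment

Redis Cluster requires at least 3 Master nodes. For high availability, each Master should have at least one Slave, and a Master must not run on the same machine as its own Slave. I will therefore deploy one Master and one Slave on each of the three machines ressmix-dsf01, ressmix-dsf02, and ressmix-dsf03. Before starting, I stop the Sentinel and Redis processes already running on these three machines.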

1.1 Cluster Configuration

First, assign the ports. The 6 Redis nodes use ports 7001 through 7006: 7001 and 7002 on ressmix-dsf01, 7003 and 7004 on ressmix-dsf02, and 7005 and 7006 on ressmix-dsf03.

Taking ressmix-dsf01 as an example, these are the directories to create:

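# Redis configuration files
mkdir -p /etc/redis
# In cluster mode, Redis saves the cluster state here (node membership and so on)
mkdir -p /etc/redis-cluster
# Redis logs
mkdir -p /var/log/redis
# Runtime working directories, one per port on this machine
mkdir -p /var/redis/7001
mkdir -p /var/redis/7002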

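Next, copy redis.conf from the Redis installation package directory into /etc/redis/ on each machine, renaming the copies 7001.conf, 7002.conf ... 7006.conf so that each node has its own file. Taking 7001.conf as an example, these are the settings to change for Redis Cluster mode: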

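# Listening port
port 7001
# Enable cluster mode
cluster-enabled yes
# Location of the cluster state file (written and maintained by Redis itself)
cluster-config-file /etc/redis-cluster/nodes7001.conf
# Node timeout in milliseconds: a node unreachable for longer than this is treated as failed.
# A failed Master triggers a failover to its Slave; a failed Slave simply stops serving requests.
cluster-node-timeout 15000
daemonize yes
pidfile /var/run/redis_7001.pid
dir /var/redis/7001
logfile /var/log/redis/7001.log
# IP address of this host
bind 192.168.0.107
appendonly yes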
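
Because the six configuration files differ only in the port and the bind address, they can be generated instead of edited by hand. The following is a minimal Python sketch under that assumption. The names NODE_IPS and render_cluster_conf are illustrative and not part of Redis; the port-to-IP table is taken from the node addresses used in this chapter.

```python
# Hypothetical helper: renders the cluster-specific settings for one node.
# The port-to-IP table mirrors the node addresses used in this chapter.
NODE_IPS = {
    7001: "192.168.0.107", 7002: "192.168.0.107",
    7003: "192.168.0.109", 7004: "192.168.0.109",
    7005: "192.168.0.110", 7006: "192.168.0.110",
}


def render_cluster_conf(port: int) -> str:
    """Return the settings shown above, adjusted for the given port."""
    lines = [
        f"port {port}",
        "cluster-enabled yes",
        f"cluster-config-file /etc/redis-cluster/nodes{port}.conf",
        "cluster-node-timeout 15000",  # milliseconds
        "daemonize yes",
        f"pidfile /var/run/redis_{port}.pid",
        f"dir /var/redis/{port}",
        f"logfile /var/log/redis/{port}.log",
        f"bind {NODE_IPS[port]}",
        "appendonly yes",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_cluster_conf(7003), end="")
```

The sketch only produces the settings listed above; merging them into each copied redis.conf is still a manual step.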

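Then prepare the production start scripts. Copy the redis_init_script script from the utils directory of the Redis installation package into /etc/init.d once per node, naming the copies redis_7001, redis_7002, redis_7003, redis_7004, redis_7005, and redis_7006. In each script, set REDISPORT to the matching port number: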

#!/bin/sh
#
# Simple Redis init.d script conceived to work on Linux systems
# as it does use of the /proc filesystem.

# chkconfig:   2345 90 10
# description:  Redis is a persistent key-value database

Finally, run /etc/init.d/redis_XXXX start for each port to start the 6 nodes, then look at the startup logs under /var/log/redis/ to confirm that each node came up:

[root@ressmix-dsf01 redis]# cat 7001.log 
1087:C 28 Apr 2020 23:03:04.763 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1087:C 28 Apr 2020 23:03:04.763 # Redis version=5.0.8, bits=32, commit=00000000, modified=0, pid=1087, just started
1087:C 28 Apr 2020 23:03:04.763 # Configuration loaded
1088:M 28 Apr 2020 23:03:04.765 * Increased maximum number of open files to 10032 (it was originally set to 1024).
1088:M 28 Apr 2020 23:03:04.765 # Warning: 32 bit instance detected but no memory limit set. Setting 3 GB maxmemory limit with 'noeviction' policy now.
1088:M 28 Apr 2020 23:03:04.765 * No cluster configuration found, I'm 6ac3762d35e60af9d8d12f3ac148c8ede11b94af
                _._                                                  
           _.-``__ ''-._                                             
      _.-``    `.  `_.  ''-._           Redis 5.0.8 (00000000/0) 32 bit
  .-`` .-```.  ```\/    _.,_ ''-._                                   
 (    '      ,       .-`  | `,    )     Running in cluster mode
 |`-._`-...-` __...-.``-._|'` _.-'|     Port: 7001
 |    `-._   `._    /     _.-'    |     PID: 1088
  `-._    `-._  `-./  _.-'    _.-'                                   
 |`-._`-._    `-.__.-'    _.-'_.-'|                                  
 |    `-._`-._        _.-'_.-'    |           http://redis.io        
  `-._    `-._`-.__.-'_.-'    _.-'                                   
 |`-._`-._    `-.__.-'    _.-'_.-'|                                  
 |    `-._`-._        _.-'_.-'    |                                  
  `-._    `-._`-.__.-'_.-'    _.-'                                   
      `-._    `-.__.-'    _.-'                                       
          `-._        _.-'                                           
              `-.__.-'                                               

1088:M 28 Apr 2020 23:03:04.776 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1088:M 28 Apr 2020 23:03:04.776 # Server initialized
1088:M 28 Apr 2020 23:03:04.776 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
1088:M 28 Apr 2020 23:03:04.776 * Ready to accept connections

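1.2 Creating the Cluster

After the 6 nodes are configured and started, they are still standalone: they do not belong to a cluster and cannot discover one another. The cluster is created with the following command: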

redis-cli --cluster create 192.168.0.107:7001 192.168.0.107:7002 192.168.0.109:7003 192.168.0.109:7004 192.168.0.110:7005 192.168.0.110:7006  --cluster-replicas 1

Here, --cluster-replicas 1 gives each Master one Slave, and Redis tries to avoid placing a Master and its Slave on the same machine. The command prints the following:

>>> Performing hash slots allocation on 6 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
Adding replica 192.168.0.109:7004 to 192.168.0.107:7001
Adding replica 192.168.0.110:7006 to 192.168.0.109:7003
Adding replica 192.168.0.107:7002 to 192.168.0.110:7005
M: 6ac3762d35e60af9d8d12f3ac148c8ede11b94af 192.168.0.107:7001
   slots:[0-5460] (5461 slots) master
S: 359dec6ff52d85c983f49c97997b98d91758f049 192.168.0.107:7002
   replicates 94b96a1d009232fc013f9d15535000087c82248a
M: 50d9fb7c86d15847233fc691f6e54e6173fd5d85 192.168.0.109:7003
   slots:[5461-10922] (5462 slots) master
S: 477af6a0d870cd61930d654667992eb4abeac920 192.168.0.109:7004
   replicates 6ac3762d35e60af9d8d12f3ac148c8ede11b94af
M: 94b96a1d009232fc013f9d15535000087c82248a 192.168.0.110:7005
   slots:[10923-16383] (5461 slots) master
S: e8091d7c48647d55a34fa83949600d025d68b879 192.168.0.110:7006
   replicates 50d9fb7c86d15847233fc691f6e54e6173fd5d85
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
...
>>> Performing Cluster Check (using node 192.168.0.107:7001)
M: 6ac3762d35e60af9d8d12f3ac148c8ede11b94af 192.168.0.107:7001
   slots:[0-5460] (5461 slots) master
   13816612910207598593 additional replica(s)
M: 50d9fb7c86d15847233fc691f6e54e6173fd5d85 192.168.0.109:7003
   slots:[5461-10922] (5462 slots) master
   653934029518667777 additional replica(s)
S: e8091d7c48647d55a34fa83949600d025d68b879 192.168.0.110:7006
   slots: (0 slots) slave
   replicates 50d9fb7c86d15847233fc691f6e54e6173fd5d85
M: 94b96a1d009232fc013f9d15535000087c82248a 192.168.0.110:7005
   slots:[10923-16383] (5461 slots) master
   653934854152388609 additional replica(s)
S: 477af6a0d870cd61930d654667992eb4abeac920 192.168.0.109:7004
   slots: (0 slots) slave
   replicates 6ac3762d35e60af9d8d12f3ac148c8ede11b94af
S: 359dec6ff52d85c983f49c97997b98d91758f049 192.168.0.107:7002
   slots: (0 slots) slave
   replicates 94b96a1d009232fc013f9d15535000087c82248a
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

At this point the 3-Master, 3-Slave Redis Cluster is up.
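
The slot ranges in the output above (0-5460, 5461-10922, 10923-16383) come from dividing the 16384 hash slots as evenly as possible among the Masters. The Python sketch below reproduces those ranges. It is an illustration of the arithmetic, not the redis-cli source, and the function name split_slots is made up for this example.

```python
from typing import List, Tuple

TOTAL_SLOTS = 16384  # fixed number of hash slots in Redis Cluster


def split_slots(master_count: int) -> List[Tuple[int, int]]:
    """Split the slot space into contiguous, near-equal ranges."""
    slots_per_node = TOTAL_SLOTS / master_count
    ranges = []
    first = 0
    cursor = 0.0
    for i in range(master_count):
        last = round(cursor + slots_per_node - 1)
        # The last Master always takes whatever remains up to slot 16383.
        if i == master_count - 1 or last > TOTAL_SLOTS - 1:
            last = TOTAL_SLOTS - 1
        ranges.append((first, last))
        first = last + 1
        cursor += slots_per_node
    return ranges


if __name__ == "__main__":
    for index, (first, last) in enumerate(split_slots(3)):
        print(f"Master[{index}] -> Slots {first} - {last}")
```

With 3 Masters, the middle range ends up one slot larger (5462 slots) than the other two, matching the output above.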

2. Cluster Operations

2.1 Reading and Writing Data

On ressmix-dsf01, let's try writing some data to the Master node:

[root@ressmix-dsf01 src]# redis-cli -h 192.168.0.107 -p 7001

192.168.0.107:7001> set k1 v1
(error) MOVED 12706 192.168.0.110:7005
192.168.0.107:7001> set k2 v2
OK
192.168.0.107:7001> KEYS *
1) "k2"

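For the key k1, the node returned a MOVED redirection: the slot that k1 hashes to (12706) is not served by the Master at 192.168.0.107:7001 but by the one at 192.168.0.110:7005. The key k2 hashes to a slot that 7001 owns, so that write succeeds.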
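
Redis Cluster maps a key to a slot by taking the CRC16 (XMODEM variant) of the key modulo 16384. The Python sketch below implements that calculation, including the hash tag rule, so the slot in a MOVED reply can be checked by hand. The function names are chosen for this example.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16 with polynomial 0x1021 and initial value 0 (XMODEM)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Return the hash slot (0..16383) for a key."""
    raw = key.encode("utf-8")
    # Hash tag rule: if the key contains "{...}" with a non-empty body,
    # only the text between the first "{" and the next "}" is hashed.
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        if end != -1 and end > start + 1:
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % 16384


if __name__ == "__main__":
    print(key_slot("k1"))  # 12706, owned by 192.168.0.110:7005 (slots 10923-16383)
    print(key_slot("k2"))  # 449, owned by 192.168.0.107:7001 (slots 0-5460)
```

On a live cluster, the command CLUSTER KEYSLOT k1 returns the same value.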

To have redis-cli follow such redirections automatically, add the -c option: redis-cli -h 192.168.0.107 -p 7001 -c.
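
A cluster-aware client does roughly the same thing as -c: it reads the slot and the target address out of the MOVED error and retries the command there. The Python sketch below shows only the parsing step. The Redirect type and the parse_moved function are hypothetical names for illustration, not part of any Redis client library.

```python
from typing import NamedTuple, Optional


class Redirect(NamedTuple):
    slot: int
    host: str
    port: int


def parse_moved(error: str) -> Optional[Redirect]:
    """Parse 'MOVED <slot> <host>:<port>'; return None for anything else."""
    parts = error.split()
    if len(parts) != 3 or parts[0] != "MOVED":
        return None
    host, _, port = parts[2].rpartition(":")
    return Redirect(int(parts[1]), host, int(port))


if __name__ == "__main__":
    target = parse_moved("MOVED 12706 192.168.0.110:7005")
    if target is not None:
        print(f"retry on {target.host}:{target.port} (slot {target.slot})")
```

A real client would also cache the slot-to-node mapping so that later commands for the same slot go straight to the right node.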

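2.2 High Availability

We can kill one Master node and then run redis-cli --cluster check 192.168.0.107:7001 to inspect the cluster state and see whether automatic failover and recovery take place.

Before killing 192.168.0.107:7001: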

[root@ressmix-dsf01 src]# redis-cli --cluster check 192.168.0.107:7001
192.168.0.107:7001 (6ac3762d...) -> 1 keys | 5461 slots | 1 slaves.
192.168.0.109:7003 (50d9fb7c...) -> 0 keys | 5462 slots | 1 slaves.
192.168.0.110:7005 (94b96a1d...) -> 0 keys | 5461 slots | 1 slaves.
[OK] 1 keys in 3 masters.
0.00 keys per slot on average.
>>> Performing Cluster Check (using node 192.168.0.107:7001)
M: 6ac3762d35e60af9d8d12f3ac148c8ede11b94af 192.168.0.107:7001
   slots:[0-5460] (5461 slots) master
   13832323230561468417 additional replica(s)
M: 50d9fb7c86d15847233fc691f6e54e6173fd5d85 192.168.0.109:7003
   slots:[5461-10922] (5462 slots) master
   625984512660078593 additional replica(s)
S: e8091d7c48647d55a34fa83949600d025d68b879 192.168.0.110:7006
   slots: (0 slots) slave
   replicates 50d9fb7c86d15847233fc691f6e54e6173fd5d85
M: 94b96a1d009232fc013f9d15535000087c82248a 192.168.0.110:7005
   slots:[10923-16383] (5461 slots) master
   625985406013276161 additional replica(s)
S: 477af6a0d870cd61930d654667992eb4abeac920 192.168.0.109:7004
   slots: (0 slots) slave
   replicates 6ac3762d35e60af9d8d12f3ac148c8ede11b94af
S: 359dec6ff52d85c983f49c97997b98d91758f049 192.168.0.107:7002
   slots: (0 slots) slave
   replicates 94b96a1d009232fc013f9d15535000087c82248a
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

Then kill 192.168.0.107:7001 and check the cluster state again, this time through 192.168.0.107:7002:

[root@ressmix-dsf01 run]# redis-cli --cluster check 192.168.0.107:7002
Could not connect to Redis at 192.168.0.107:7001: Connection refused
192.168.0.109:7004 (477af6a0...) -> 1 keys | 5461 slots | 0 slaves.
192.168.0.109:7003 (50d9fb7c...) -> 0 keys | 5462 slots | 1 slaves.
192.168.0.110:7005 (94b96a1d...) -> 0 keys | 5461 slots | 1 slaves.
[OK] 1 keys in 3 masters.
0.00 keys per slot on average.
>>> Performing Cluster Check (using node 192.168.0.107:7002)
S: 359dec6ff52d85c983f49c97997b98d91758f049 192.168.0.107:7002
   slots: (0 slots) slave
   replicates 94b96a1d009232fc013f9d15535000087c82248a
S: e8091d7c48647d55a34fa83949600d025d68b879 192.168.0.110:7006
   slots: (0 slots) slave
   replicates 50d9fb7c86d15847233fc691f6e54e6173fd5d85
M: 477af6a0d870cd61930d654667992eb4abeac920 192.168.0.109:7004
   slots:[0-5460] (5461 slots) master
M: 50d9fb7c86d15847233fc691f6e54e6173fd5d85 192.168.0.109:7003
   slots:[5461-10922] (5462 slots) master
   643121260372426753 additional replica(s)
M: 94b96a1d009232fc013f9d15535000087c82248a 192.168.0.110:7005
   slots:[10923-16383] (5461 slots) master
   643122188085362689 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

192.168.0.109:7004 has been promoted from Slave to Master, so automatic failover worked. Next, restart 192.168.0.107:7001 and check the cluster state once more; 192.168.0.107:7001 has rejoined as a Slave of 192.168.0.109:7004:

[root@ressmix-dsf01 init.d]# redis-cli --cluster check 192.168.0.107:7002
192.168.0.109:7004 (477af6a0...) -> 1 keys | 5461 slots | 1 slaves.
192.168.0.109:7003 (50d9fb7c...) -> 0 keys | 5462 slots | 1 slaves.
192.168.0.110:7005 (94b96a1d...) -> 0 keys | 5461 slots | 1 slaves.
[OK] 1 keys in 3 masters.
0.00 keys per slot on average.
>>> Performing Cluster Check (using node 192.168.0.107:7002)
S: 359dec6ff52d85c983f49c97997b98d91758f049 192.168.0.107:7002
   slots: (0 slots) slave
   replicates 94b96a1d009232fc013f9d15535000087c82248a
S: 6ac3762d35e60af9d8d12f3ac148c8ede11b94af 192.168.0.107:7001
   slots: (0 slots) slave
   replicates 477af6a0d870cd61930d654667992eb4abeac920
S: e8091d7c48647d55a34fa83949600d025d68b879 192.168.0.110:7006
   slots: (0 slots) slave
   replicates 50d9fb7c86d15847233fc691f6e54e6173fd5d85
M: 477af6a0d870cd61930d654667992eb4abeac920 192.168.0.109:7004
   slots:[0-5460] (5461 slots) master
   603169234066866177 additional replica(s)
M: 50d9fb7c86d15847233fc691f6e54e6173fd5d85 192.168.0.109:7003
   slots:[5461-10922] (5462 slots) master
   603169612023988225 additional replica(s)
M: 94b96a1d009232fc013f9d15535000087c82248a 192.168.0.110:7005
   slots:[10923-16383] (5461 slots) master
   603170539736924161 additional replica(s)
[OK] All nodes agree about slots configuration.
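
The per-Master summary lines at the top of each check output make it easy to spot a Master that currently has no Slave, as 192.168.0.109:7004 did right after the failover. The Python sketch below parses those lines. It assumes the line format shown in the outputs above, and the function name masters_without_slaves is made up for this example.

```python
import re
from typing import List

# Matches lines such as:
# 192.168.0.109:7004 (477af6a0...) -> 1 keys | 5461 slots | 0 slaves.
SUMMARY = re.compile(
    r"^(?P<addr>\S+:\d+) \((?P<id>[0-9a-f]+)\.\.\.\) -> "
    r"(?P<keys>\d+) keys \| (?P<slots>\d+) slots \| (?P<slaves>\d+) slaves\.$"
)


def masters_without_slaves(check_output: str) -> List[str]:
    """Return the addresses of Masters reported with 0 slaves."""
    unprotected = []
    for line in check_output.splitlines():
        match = SUMMARY.match(line.strip())
        if match and int(match.group("slaves")) == 0:
            unprotected.append(match.group("addr"))
    return unprotected


if __name__ == "__main__":
    sample = """\
Could not connect to Redis at 192.168.0.107:7001: Connection refused
192.168.0.109:7004 (477af6a0...) -> 1 keys | 5461 slots | 0 slaves.
192.168.0.109:7003 (50d9fb7c...) -> 0 keys | 5462 slots | 1 slaves.
192.168.0.110:7005 (94b96a1d...) -> 0 keys | 5461 slots | 1 slaves.
"""
    print(masters_without_slaves(sample))
```

Until the failed node is restarted, a Master flagged this way has no Slave to take over if it fails too.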

3. Summary

In this chapter I covered how to deploy a Redis Cluster. Try building one locally to deepen your understanding.
