MongoDB Sharding


Deng YongJie's blog 269 2021-09-13

Chapter 1: Why Sharding Is Needed

1. With replica sets available, why is sharding still needed?
Replica sets make poor use of resources
The primary alone bears all the read and write pressure

2. Pros and cons of sharding
Pros:
High resource utilization
Read/write load is balanced across shards
Horizontal scale-out

Cons:
Ideally requires a fairly large number of machines
Configuration and operations become far more complex
Must be planned carefully up front; once the cluster is built, changing the architecture is difficult

Chapter 2: Sharding Concepts

1. Routing service - mongos
No replica set required; each mongos is independent, and all of them use identical configuration
mongos has no data directory and stores no data
A routing service acting as a proxy: it fetches data from the shards on behalf of clients

2. Shard configuration servers - config
Since MongoDB 3.4, the config servers must be deployed as a replica set
Store which shard each piece of data lives on
Store the configuration of every shard
Serve metadata lookups from mongos

3. Shard key - shard-key
The partitioning rule that decides which shard a document is placed on
The shard key is backed by an index

4. Data nodes - shard
The nodes that actually hold the data; each shard is one part of the sharded cluster

Chapter 3: Types of Shard Keys

1. Hashed shard key

Sample data:

id  name     host  sex
1   yongjie  SH    boy
2   jie      BJ    boy
3   jiejie   SZ    girl

Using id as the shard key:

Index: id
1  hash()  shard1
2  hash()  shard2
3  hash()  shard3

Hashed shard key characteristics:

Sufficiently random, sufficiently even distribution
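
The hash routing above can be sketched as a toy shell loop. This is only an illustration of "hash the key, then map it to a shard": cksum stands in for the md5-based hash MongoDB actually uses, and real chunk assignment is driven by config server metadata, not a simple modulo.

```shell
# Toy sketch of hashed sharding: hash each id, then map the hash to one
# of 3 shards. cksum is only a stand-in for MongoDB's real hash function.
for id in 1 2 3; do
  hash=$(printf '%s' "$id" | cksum | awk '{print $1}')
  echo "id=$id -> shard$(( hash % 3 + 1 ))"
done
```

Even for sequential ids, the hash scatters neighboring keys across shards, which is why the distribution ends up random and even.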

2. Ranged sharding

Sample data:

id  name     host  sex
1   yongjie  SH    boy
2   jie      BJ    boy
3   jiejie   SZ    girl

If id is the shard key:

id
1-100       shard1
101-200     shard2
201-300     shard3
301 to +∞   shard4

Using host as the shard key:

SH  shard1
BJ  shard2
SZ  shard3
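
The id ranges above can be sketched as a small shell function. This is a toy lookup only; in a real cluster the range-to-shard mapping lives in chunk metadata on the config servers and changes as the balancer splits and migrates chunks.

```shell
# Toy sketch of ranged sharding: map an id to a shard by range boundary.
range_shard() {
  if   [ "$1" -le 100 ]; then echo shard1
  elif [ "$1" -le 200 ]; then echo shard2
  elif [ "$1" -le 300 ]; then echo shard3
  else                        echo shard4
  fi
}
range_shard 57    # -> shard1
range_shard 250   # -> shard3
range_shard 999   # -> shard4
```

Range sharding keeps neighboring ids on the same shard, which helps range queries but can hotspot a single shard when ids are inserted in increasing order.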

Chapter 4: IP, Port, and Directory Planning

1. Sharding architecture diagram

(architecture diagram image not included)

2. IP and port plan

db01	10.0.0.51	
			Shard1_Master	  28100
			Shard3_Slave	  28200
			Shard2_Arbiter	  28300
			Config Server	  40000
			mongos Server	  60000
		
db02	10.0.0.52	
			Shard2_Master	  28100
			Shard1_Slave	  28200
			Shard3_Arbiter	  28300
			Config Server	  40000
			mongos Server	  60000
		
db03	10.0.0.53	
			Shard3_Master	  28100
			Shard2_Slave	  28200
			Shard1_Arbiter	  28300
			Config Server	  40000
			mongos Server	  60000

3. Directory plan

Service directories:

/opt/master/{conf,log,pid}
/opt/slave/{conf,log,pid}
/opt/arbiter/{conf,log,pid}
/opt/config/{conf,log,pid}
/opt/mongos/{conf,log,pid}

Data directories:

/data/master
/data/slave
/data/arbiter
/data/config

4. Deployment steps

1. Deploy the shard replica sets
2. Deploy the config server replica set
3. Deploy mongos
4. Add the shard members
5. Enable sharding on the database
6. Set a shard key on the collection
7. Write test data
8. Verify the sharding results
9. Install and use a GUI tool

Chapter 5: Deploying the Shard Replica Sets

1. Install the software

Run on db01:

pkill mongo
rm -rf /opt/mongo_2*
rm -rf /data/mongo_2*
rsync -avz /opt/mongodb* 10.0.0.52:/opt/
rsync -avz /opt/mongodb* 10.0.0.53:/opt/

Run on db02 and db03:

echo 'export PATH=$PATH:/opt/mongodb/bin' >> /etc/profile
source /etc/profile

2. Create directories - run on all three machines

mkdir -p /opt/master/{conf,log,pid}
mkdir -p /opt/slave/{conf,log,pid}
mkdir -p /opt/arbiter/{conf,log,pid}

mkdir -p /data/master
mkdir -p /data/slave
mkdir -p /data/arbiter

3. Create the config files on db01

master node config file:

cat >/opt/master/conf/mongod.conf<<EOF   
systemLog:
  destination: file 
  logAppend: true 
  path: /opt/master/log/mongodb.log

storage:
  journal:
    enabled: true
  dbPath: /data/master/
  directoryPerDB: true

  wiredTiger:
    engineConfig:
      cacheSizeGB: 0.5
      directoryForIndexes: true
    collectionConfig:
      blockCompressor: zlib
    indexConfig:
      prefixCompression: true

processManagement:
  fork: true
  pidFilePath: /opt/master/pid/mongodb.pid
  timeZoneInfo: /usr/share/zoneinfo

net:
  port: 28100
  bindIp: 127.0.0.1,$(ifconfig eth0|awk 'NR==2{print $2}')

replication:
  oplogSizeMB: 1024 
  replSetName: shard1

sharding:
  clusterRole: shardsvr
EOF

slave node config file:

cat >/opt/slave/conf/mongod.conf<<EOF   
systemLog:
  destination: file 
  logAppend: true 
  path: /opt/slave/log/mongodb.log

storage:
  journal:
    enabled: true
  dbPath: /data/slave/
  directoryPerDB: true

  wiredTiger:
    engineConfig:
      cacheSizeGB: 0.5
      directoryForIndexes: true
    collectionConfig:
      blockCompressor: zlib
    indexConfig:
      prefixCompression: true

processManagement:
  fork: true
  pidFilePath: /opt/slave/pid/mongodb.pid
  timeZoneInfo: /usr/share/zoneinfo

net:
  port: 28200
  bindIp: 127.0.0.1,$(ifconfig eth0|awk 'NR==2{print $2}')

replication:
  oplogSizeMB: 1024 
  replSetName: shard3

sharding:
  clusterRole: shardsvr
EOF

arbiter node config file:

cat >/opt/arbiter/conf/mongod.conf<<EOF   
systemLog:
  destination: file 
  logAppend: true 
  path: /opt/arbiter/log/mongodb.log

storage:
  journal:
    enabled: true
  dbPath: /data/arbiter/
  directoryPerDB: true

  wiredTiger:
    engineConfig:
      cacheSizeGB: 0.5
      directoryForIndexes: true
    collectionConfig:
      blockCompressor: zlib
    indexConfig:
      prefixCompression: true

processManagement:
  fork: true
  pidFilePath: /opt/arbiter/pid/mongodb.pid
  timeZoneInfo: /usr/share/zoneinfo

net:
  port: 28300
  bindIp: 127.0.0.1,$(ifconfig eth0|awk 'NR==2{print $2}')

replication:
  oplogSizeMB: 1024 
  replSetName: shard2

sharding:
  clusterRole: shardsvr
EOF

4. Create the config files on db02

master node config file:

cat >/opt/master/conf/mongod.conf<<EOF   
systemLog:
  destination: file 
  logAppend: true 
  path: /opt/master/log/mongodb.log

storage:
  journal:
    enabled: true
  dbPath: /data/master/
  directoryPerDB: true

  wiredTiger:
    engineConfig:
      cacheSizeGB: 0.5
      directoryForIndexes: true
    collectionConfig:
      blockCompressor: zlib
    indexConfig:
      prefixCompression: true

processManagement:
  fork: true
  pidFilePath: /opt/master/pid/mongodb.pid
  timeZoneInfo: /usr/share/zoneinfo

net:
  port: 28100
  bindIp: 127.0.0.1,$(ifconfig eth0|awk 'NR==2{print $2}')

replication:
  oplogSizeMB: 1024 
  replSetName: shard2

sharding:
  clusterRole: shardsvr
EOF

slave node config file:

cat >/opt/slave/conf/mongod.conf<<EOF   
systemLog:
  destination: file 
  logAppend: true 
  path: /opt/slave/log/mongodb.log

storage:
  journal:
    enabled: true
  dbPath: /data/slave/
  directoryPerDB: true

  wiredTiger:
    engineConfig:
      cacheSizeGB: 0.5
      directoryForIndexes: true
    collectionConfig:
      blockCompressor: zlib
    indexConfig:
      prefixCompression: true

processManagement:
  fork: true
  pidFilePath: /opt/slave/pid/mongodb.pid
  timeZoneInfo: /usr/share/zoneinfo

net:
  port: 28200
  bindIp: 127.0.0.1,$(ifconfig eth0|awk 'NR==2{print $2}')

replication:
  oplogSizeMB: 1024 
  replSetName: shard1

sharding:
  clusterRole: shardsvr
EOF

arbiter node config file:

cat >/opt/arbiter/conf/mongod.conf<<EOF   
systemLog:
  destination: file 
  logAppend: true 
  path: /opt/arbiter/log/mongodb.log

storage:
  journal:
    enabled: true
  dbPath: /data/arbiter/
  directoryPerDB: true

  wiredTiger:
    engineConfig:
      cacheSizeGB: 0.5
      directoryForIndexes: true
    collectionConfig:
      blockCompressor: zlib
    indexConfig:
      prefixCompression: true

processManagement:
  fork: true
  pidFilePath: /opt/arbiter/pid/mongodb.pid
  timeZoneInfo: /usr/share/zoneinfo

net:
  port: 28300
  bindIp: 127.0.0.1,$(ifconfig eth0|awk 'NR==2{print $2}')

replication:
  oplogSizeMB: 1024 
  replSetName: shard3

sharding:
  clusterRole: shardsvr
EOF

5. Create the config files on db03

master node config file:

cat >/opt/master/conf/mongod.conf<<EOF   
systemLog:
  destination: file 
  logAppend: true 
  path: /opt/master/log/mongodb.log

storage:
  journal:
    enabled: true
  dbPath: /data/master/
  directoryPerDB: true

  wiredTiger:
    engineConfig:
      cacheSizeGB: 0.5
      directoryForIndexes: true
    collectionConfig:
      blockCompressor: zlib
    indexConfig:
      prefixCompression: true

processManagement:
  fork: true
  pidFilePath: /opt/master/pid/mongodb.pid
  timeZoneInfo: /usr/share/zoneinfo

net:
  port: 28100
  bindIp: 127.0.0.1,$(ifconfig eth0|awk 'NR==2{print $2}')

replication:
  oplogSizeMB: 1024 
  replSetName: shard3

sharding:
  clusterRole: shardsvr
EOF

slave node config file:

cat >/opt/slave/conf/mongod.conf<<EOF   
systemLog:
  destination: file 
  logAppend: true 
  path: /opt/slave/log/mongodb.log

storage:
  journal:
    enabled: true
  dbPath: /data/slave/
  directoryPerDB: true

  wiredTiger:
    engineConfig:
      cacheSizeGB: 0.5
      directoryForIndexes: true
    collectionConfig:
      blockCompressor: zlib
    indexConfig:
      prefixCompression: true

processManagement:
  fork: true
  pidFilePath: /opt/slave/pid/mongodb.pid
  timeZoneInfo: /usr/share/zoneinfo

net:
  port: 28200
  bindIp: 127.0.0.1,$(ifconfig eth0|awk 'NR==2{print $2}')

replication:
  oplogSizeMB: 1024 
  replSetName: shard2

sharding:
  clusterRole: shardsvr
EOF

arbiter node config file:

cat >/opt/arbiter/conf/mongod.conf<<EOF   
systemLog:
  destination: file 
  logAppend: true 
  path: /opt/arbiter/log/mongodb.log

storage:
  journal:
    enabled: true
  dbPath: /data/arbiter/
  directoryPerDB: true

  wiredTiger:
    engineConfig:
      cacheSizeGB: 0.5
      directoryForIndexes: true
    collectionConfig:
      blockCompressor: zlib
    indexConfig:
      prefixCompression: true

processManagement:
  fork: true
  pidFilePath: /opt/arbiter/pid/mongodb.pid
  timeZoneInfo: /usr/share/zoneinfo

net:
  port: 28300
  bindIp: 127.0.0.1,$(ifconfig eth0|awk 'NR==2{print $2}')

replication:
  oplogSizeMB: 1024 
  replSetName: shard1

sharding:
  clusterRole: shardsvr
EOF

6. Suppress the transparent hugepage warnings - run on all three machines

echo "never"  > /sys/kernel/mm/transparent_hugepage/enabled
echo "never"  > /sys/kernel/mm/transparent_hugepage/defrag

7. Start the services - run on all three machines

mongod -f /opt/master/conf/mongod.conf 
mongod -f /opt/slave/conf/mongod.conf 
mongod -f /opt/arbiter/conf/mongod.conf
ps -ef|grep mongod

8. Initialize the replica sets

Initialize the replica set on the db01 master node (shard1):

mongo --port 28100
rs.initiate()
wait a moment until the prompt becomes PRIMARY
rs.add("10.0.0.52:28200")
rs.addArb("10.0.0.53:28300")
rs.status()

Initialize the replica set on the db02 master node (shard2):

mongo --port 28100
rs.initiate()
wait a moment until the prompt becomes PRIMARY
rs.add("10.0.0.53:28200")
rs.addArb("10.0.0.51:28300")
rs.status()

Initialize the replica set on the db03 master node (shard3):

mongo --port 28100
rs.initiate()
wait a moment until the prompt becomes PRIMARY
rs.add("10.0.0.51:28200")
rs.addArb("10.0.0.52:28300")
rs.status()

Check that health is 1 for every replica set member on all three machines; if so, the setup succeeded:

echo "rs.status()"|mongo --port 28100|grep health

Chapter 6: Deploying the Config Server Replica Set

1. Create directories - run on all three machines

mkdir -p /opt/config/{conf,log,pid}
mkdir -p /data/config/

2. Create the config file - run on all three machines

cat >/opt/config/conf/mongod.conf<<EOF   
systemLog:
  destination: file 
  logAppend: true 
  path: /opt/config/log/mongodb.log

storage:
  journal:
    enabled: true
  dbPath: /data/config/
  directoryPerDB: true

  wiredTiger:
    engineConfig:
      cacheSizeGB: 0.5
      directoryForIndexes: true
    collectionConfig:
      blockCompressor: zlib
    indexConfig:
      prefixCompression: true

processManagement:
  fork: true
  pidFilePath: /opt/config/pid/mongod.pid
  timeZoneInfo: /usr/share/zoneinfo

net:
  port: 40000
  bindIp: 127.0.0.1,$(ifconfig eth0|awk 'NR==2{print $2}')

replication:
  replSetName: configset

sharding:
  clusterRole: configsvr
EOF

3. Start - run on all three machines

mongod -f /opt/config/conf/mongod.conf
ps -ef|grep mongod

4. Initialize the replica set - run on db01 only

mongo --port 40000
rs.initiate()
wait a moment until the prompt becomes PRIMARY
rs.add("10.0.0.52:40000")
rs.add("10.0.0.53:40000")
rs.status()

Chapter 7: Deploying mongos

1. Create directories - run on all three machines

mkdir -p /opt/mongos/{conf,log,pid}

2. Create the config file - run on all three machines

cat >/opt/mongos/conf/mongos.conf<<EOF   
systemLog:
  destination: file 
  logAppend: true 
  path: /opt/mongos/log/mongos.log

processManagement:
  fork: true
  pidFilePath: /opt/mongos/pid/mongos.pid
  timeZoneInfo: /usr/share/zoneinfo

net:
  port: 60000
  bindIp: 127.0.0.1,$(ifconfig eth0|awk 'NR==2{print $2}')

sharding:
  configDB: 
    configset/10.0.0.51:40000,10.0.0.52:40000,10.0.0.53:40000
EOF

3. Start - run on all three machines

mongos -f /opt/mongos/conf/mongos.conf

4. Log in to mongos - run on db01

mongo --port 60000

Chapter 8: Adding the Shard Members

1. Log in to mongos and add the shard member info - run on db01

mongo --port 60000
use admin
db.runCommand({addShard:'shard1/10.0.0.51:28100,10.0.0.52:28200,10.0.0.53:28300'})
db.runCommand({addShard:'shard2/10.0.0.52:28100,10.0.0.53:28200,10.0.0.51:28300'})
db.runCommand({addShard:'shard3/10.0.0.53:28100,10.0.0.51:28200,10.0.0.52:28300'})

2. View the shard member info

db.runCommand({ listshards : 1 })

Chapter 9: Hashed Sharding Configuration

1. Enable sharding on the database

mongo --port 60000
use admin
db.runCommand( { enablesharding : "yongjie" } )

2. Create the index - shard key

Create an index on the hash collection in the yongjie database, on the id field, with the hashed index type:

use yongjie
db.hash.createIndex( { id: "hashed" } )

Fields holding numeric values are recommended for the index.

3. Enable hashed sharding on the collection

use admin
sh.shardCollection( "yongjie.hash",{ id: "hashed" } )

Chapter 10: Writing Test Data

1. Generate test data

mongo --port 60000
use yongjie
for(i=1;i<10000;i++){db.hash.insert({"id":i,"name":"BJ","age":18});}

2. Verify the data on each shard

shard1:
mongo --port 28100
use yongjie
db.hash.count()
3349

shard2:
mongo --port 28100
use yongjie
db.hash.count()
3366

shard3:
mongo --port 28100
use yongjie
db.hash.count()
3284

Chapter 11: Common Sharding Management Commands

1. List full sharding details

mongo --port 60000
db.printShardingStatus()
sh.status()

2. List all shard members

mongo --port 60000
use admin
db.runCommand({ listshards : 1})

3. List databases with sharding enabled

mongo --port 60000
use config
db.databases.find({"partitioned": true })

4. View the shard keys

mongo --port 60000
use config
db.collections.find().pretty()

Chapter 12: Correct Startup Order

All config servers
All mongos instances
All master nodes
All arbiter nodes
All slave nodes
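
With the directory layout used throughout this post, the order above can be captured in a small script. This sketch only prints the commands in the required order; swap echo for direct execution to actually start the daemons:

```shell
# Print the startup commands in the required order:
# config servers first, then mongos, then the shard mongods
# (master, arbiter, slave). Swap echo for direct execution to run them.
for svc in config mongos master arbiter slave; do
  if [ "$svc" = mongos ]; then
    echo "mongos -f /opt/mongos/conf/mongos.conf"
  else
    echo "mongod -f /opt/$svc/conf/mongod.conf"
  fi
done
```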