Ceph Optimization

Operating System Optimization

linux kernel

  1. I/O scheduler (default: deadline)
    Use noop for SSDs and deadline for HDDs (replace sd* with the target device):
    echo noop > /sys/block/sd*/queue/scheduler
    echo deadline > /sys/block/sd*/queue/scheduler
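    To confirm which scheduler is active for a given device (sda used here as a placeholder), read the same sysfs file; the current choice is shown in brackets:
    cat /sys/block/sda/queue/scheduler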
  2. Read-ahead (default: 128)
    The default Linux read-ahead is too small for RADOS workloads; 8192 KB is recommended:
    echo '8192' > /sys/block/sd*/queue/read_ahead_kb
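    This sysfs value is lost on reboot. One way to make it persistent is a udev rule; this is a sketch assuming a udev-based distribution, and the rule file name is arbitrary:
    # re-apply the read-ahead value whenever an sd* disk appears or changes
    echo 'ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/read_ahead_kb}="8192"' \
      > /etc/udev/rules.d/99-read-ahead.rules
    udevadm control --reload-rules && udevadm trigger --subsystem-match=block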
  3. Maximum process count (default: 196608)
    OSD daemons spawn a large number of processes and threads, so raise pid_max:
    echo 4194303 > /proc/sys/kernel/pid_max
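    The echo above only changes the running kernel; to keep the value across reboots, add the equivalent sysctl entry as well (same pattern as the swappiness step below):
    echo "kernel.pid_max = 4194303" | tee -a /etc/sysctl.conf
    sysctl -p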
  4. CPU frequency governor (default: performance)
    Keep the CPUs running in performance mode:
    echo performance | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor >/dev/null
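    To verify the governor that is actually in effect (cpu0 as an example core; the second line assumes the optional cpupower utility is installed):
    cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
    cpupower frequency-info --policy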
  5. Swap
    With default settings the kernel may start swapping even while plenty of memory is still free, which hurts Ceph cluster performance:
    echo "vm.swappiness = 0" | tee -a /etc/sysctl.conf
  6. CPU pinning
    Pin OSD processes to specific CPU cores; this has both benefits and drawbacks (see the sketch below).
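    A minimal sketch, assuming systemd-managed OSDs (default unit name ceph-osd@0) and a systemd recent enough to support --value; the core range 0-3 is a placeholder to adjust to the host's CPU topology:
    # look up the main PID of osd.0
    OSD_PID=$(systemctl show -p MainPID --value ceph-osd@0)
    # pin that OSD to cores 0-3
    taskset -cp 0-3 "$OSD_PID"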

network

  1. Jumbo frames
    Reduce the impact of packet fragmentation when moving large amounts of data:
    ifconfig eth0 mtu 9000
    echo "MTU=9000" | tee -a /etc/sysconfig/network-scripts/ifcfg-eth0
    systemctl restart network
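    A quick end-to-end check that jumbo frames actually pass (172.16.8.242 is a hypothetical peer on the cluster network; 8972 = 9000 minus 28 bytes of IP/ICMP headers):
    ping -M do -s 8972 -c 3 172.16.8.242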

Hardware acceleration

  1. Reduce memory copies
    Enable the NIC's TCP segmentation offload (tcp-segmentation-offload, TSO) feature:
    ethtool -K em1 tso on
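    To check the current offload state of the interface:
    ethtool -k em1 | grep tcp-segmentation-offload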

Ceph configuration tuning

[global]
fsid = c74e7a1b-b4aa-490b-b60e-8a8656d08226
mon initial members = ssd-node241
mon host = 172.16.8.241
public network = 172.16.8.0/24
cluster network = 172.16.8.0/24
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
osd pool default size = 2
osd pool default min size = 1
max open files = 131072
debug bluestore = 0/0
debug bluefs = 0/0
debug bdev = 0/0
debug rocksdb = 0/0
rbd cache = false
osd pool default pg num = 256
osd op num shards = 8
osd_op_num_threads_per_shard = 2
[client]
#rbd cache size = 268435456
#rbd cache max dirty = 134217728
#rbd cache max dirty age = 5
rbd_default_features = 3
rbd cache writethrough until flush = True

[mon]
mon data = /var/lib/ceph/mon/ceph-$id
mon_allow_pool_delete = true

[osd]
enable experimental unrecoverable data corrupting features = bluestore rocksdb zs
bluestore fsck on mount = true
osd objectstore = bluestore
bluestore = true
osd data = /var/lib/ceph/osd/ceph-$id
osd mkfs type = xfs
osd mkfs options xfs = -f
osd max write size = 512
osd client message size cap = 2147483648
osd deep scrub stride = 131072
osd op threads = 8
osd disk threads = 4
osd map cache size = 1024
osd map cache bl size = 128
osd_mount_options_xfs = rw,noatime,inode64,logbsize=256k
osd recovery op priority = 4
osd recovery max active = 10
osd max backfills = 4
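
Most of the [osd] settings above only take effect after the OSD daemons are restarted. To confirm what a running daemon actually picked up, the admin socket can be queried on the OSD host (osd.0 is a placeholder id; osd_op_num_shards is just one example key):

ceph daemon osd.0 config show | grep osd_op_num_shards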

