《clickhouse 集群部署详细教程.docx》由会员分享,可在线阅读,更多相关《clickhouse 集群部署详细教程.docx(10页珍藏版)》请在taowenge.com淘文阁网|工程机械CAD图纸|机械工程制图|CAD装配图下载|SolidWorks_CaTia_CAD_UG_PROE_设计图分享下载上搜索。
1、clickhouse集群部署详细教程前后:写这个文档初衷就是看了 clickhouse的官方文档,只能说写的太简洁了,很多地方知识点 没有完全的阐述清楚,需要自己边做试验边悟,在这期间也看了网上文档,都是东一鳞西一 爪,当然也从中吸取了很多的知识,但没有完全的体系的知识来说明clickhouse集群,所以 在整体消化了知识之后,提供个文档给大家作为学习参考。、clickhouse 集群一问一答1. clickhouse集群与MYSQL集群之类的传统集群有什么区别?MYSQL等传统数据库集群有主从角色,正常场景下都以实例为单位进行数据同步,每个 实例都存储一份完整数据。clickhouse集群相
2、比而言非常灵活,可以在配置文件中配置多个集群。单个集群中配置N 个share (分片)和N个replicate (复制)。每个表在创立的时候可以指定适合自己的集群 (分片和复制方式)。没有主从的概念,每个节点都可以写入。clickhouse的语法对于精通MYSQL的人 几乎没有任何障碍。2. clickhouse 表类型本地表:一般表引擎为mergetree ,数据只是单节点存储,可作为分布表的本地表存储数 据。复制表:引擎为replicatemergetree,单个节点写入,会复制到其他节点的复制表,也可以 作为分布表的存储表。分布表:本身不存储数据,作为路由,对录入数据进行分片复制(具体分
3、片和复制策略由 集群配置决定),下发给存储表,对外是一个整体。3. zookeeper集群对于dickhouse的作用?CREATE TABLE tutorial.testll7 on cluster cluster_3_shard_l_replicas (EventDate DateTime,CounterlD Ulnt32zUserID Ulnt32 ) ENGINE = ReplicatedMergeTree(7clickhouse/tables/shard/testll7, replica) PARTITION BY toYYYYMM(EventDate) ORDER BY (Coun
4、terlD, EventDate, intHash32(UserlD) SAMPLE BY intHash32(UserlD);#on cluster DDL批量创立分布表dcactiond :) CREATE TABLE tutorial.testll7_all on cluster cluster_3_shard_l_replicas AS tutorial.testll7 ENGINE = Distributed(cluster_3_shard_l_replicasz tutorial, testll7z rand();zabbix控制share #分片on cluster #ddl 批
5、量处理dickhouse自实现internal_replication=true 写入单个节点 zabbix 同步到 其他复制云点replicate # 复制internaLreplication=false控制写入所有复制节点#dickhouse 自实现on cluster DDL批量处理作用?clickhouse集群在单个节点上执行DDL语句只会在此节点生效,这就意味着如果对应集群 为100个节点,那么DDL语句就要去每个节点都要执行一次。on cluster关键字解决了以上问题,执行一次所有节点都执行完毕,此方法需要zabbix集 群去同步处理的。123CREATE TABLE tuto
6、rial.hit_v4_all,n cluste(cluster 3 shard l replicasAS tutorial.hit_v4 BIGINE = Distributed(cluster_3_shard_l_replicast tutorial, hit_v2, randO);关键4集群名称4. share如何实现?clickhouse分片是通过distribute引擎实现的,简称分布表。分布表属于路由类型,实际上 不存储数据,可以看成是路由表,底层是本地表存储数据。12345 CREATE TABLE tutorial.hit_v4_all on cluster cluster_3
7、_shard_l_replicas AS tutorial.hit_v4 ENGINE = Distributed(cluster_3_shard_l_replicas, tutorial, hit_v2, randO); / / / / /j关键字集群名 库名本地表名分布方式115. replicate实现方式replicate (复制)有两种实现方式,fflilfalseift行控制。false表示写入的时候写入所有复制节点,缺点是某个节点写入不成功就会造成数据不完 整。true表示只写入当前节点,zookeeper负责同步。 建议在实际生产用第二种。7.如何批量创立replicateme
8、rgetree复制表1 CREATE TABLE testlllzookeeper目录名2 (EventDate DateTime,4 CounterlD UInt32,5 UserID UInt32 7 ENGINE = ReplicatedMergeTree/clickhouse/tables/te?t/testl.l.l4 1 replicationll8 PARTITION BY toYYYYm (EventDctej9 ORDER BY (CounterlD, EventDate, intHash32(UserID)/K10 SAMPLE BY intHash32(UserID);/
9、9 表名、share 名称replicate 名称14上图为创立复制表语句,可以看到replicatemergetree后需要显示指明share名称和 replicate名称,这里就有一个问题,如果利用on cluster DDL批量操作,那么会报错,因为配 置重复。那么怎么解决这个问题,clickhouse给出了宏配置方案。具体配置如下:234 conf.xml56 vshard10;4/shard .repliajdcaction0#/replica9 10111214 SQL语句it CREATE TABLE testlll on cluster cluster_3_shard_l_re
10、plicas17 (- 18 EventDate DateTime,19 CounterlD UInt32,20 UserID UInt3222 ENGINE = ReplicatedMergeTree(,/clickhouse/tableslshardVtestlll, replica) .PARTITION BY toYYYYMMCEventDate)L24 ORDER BY (CounterlD, EventDate, intHash32(UserID)2f SAMPLE BY intHash32(UserID);2627clickhouse如何加载外部配置文件?conf.xml 和 u
11、sers.xml 自动加载conf.d和users.d目录内自动加载/etc/metrika.xml 自动加载除此之外 需要在conf.xml中配置如下 配置文件路径 clickhouse如何从外部配置文件获取的属性生效?、除conf.xml和users.xml以外其他配置文件中的属性如果想生效那么需要在conf.xml或 users.xml中使用inclo incl属性表示可从外部文件中获取节点名为clickhouse_remote_servers 的配置内容。1011121314151617181920conf.xml,/etc/metrika.xmldcaction012181dcact
12、ion022181dcaction03218121 22232425lO.clickhouse hosts文件配置需考前须知hosts文件中配置IP对应的名称必须为主机名 否那么clickhouse复制表利用zookeeper同步时会出错找不到主机。root四cactioqOl clickhouse-server# hostname dcaction01root(adcaction01 clickhouse-server# cat /etc/hosts 127.0.0.1 localhost localhost, localdomain local.host4 local.host4. Ioc
13、aldomain4 :1localhost localhost.localdomain localhost6 localhost6.Iocaldomain6dcactionOl dcaction02 dcaction03(rootdcaction01 clickhouse-server#ll.clickhouse集群如何对外提供服务?clickhouse没有主从概念,可以在前端搭建负载均衡LVS轮询所有节点对外提供服务。二、clickhouse集群搭建1.服务器信息IP主机名dcactionOldcaction02dcaction032.软件信息软件名版本存储路径zookeeper/data/
14、zookeeperclickhouse/opt/clickhouse/3. zookeeper集群安装1) 修改/etc/hostsrootdcaction01 zookeeper# cat /etc/hosts127.0.0.1 localhost localhost.localdomain Iocalhost4 Iocalhost4.localdomain4:1localhost localhost.localdomain Iocalhost6 Iocalhost6.localdomain6dcactionOldcaction02dcaction032)官网下载zookeeper软件 注意
15、下载带bin二进制压缩文件才能符合需求, 3) mkdir/data/4) tar -xzvf apache-zookeeper-3.8.0-bin.tar.gz -C /data/5)6) vi /data/zookeeper/conf/zoo.cfgtickTime=2000initLimit=10syncLimit=5dataDir=/data/zookeeper/data # 数据存储位置clientPort=2181#集群配置server.l=192.168.10.112:2888:3888server.2=192.168.10.113:2888:3888server.3=192.1
16、68.10.114:2888:38887)以上操作三台服务器都处理完后 每台服务器执行第7步8) /data/zookeeper/bin/zkServer.sh start #启动 zookeeper9) /data/zookeeper/bin/zkServer.sh status #查看 zookeeper 集群状态 rootdcaction01 zookeeper# ./bin/zkServer.sh status/usr/bin/javaZooKeeper JMX enabled by defaultUsing config: /data/zookeeper/bin/./conf/zoo
17、.cfgClient port found: 2181. Client address: localhost. Client SSL: false.Mode: followerrootdcaction02 # /data/zookeeper/bin/zkServer.sh status /usr/bin/javaZooKeeper JMX enabled by defaultUsing config: /data/zookeeper/bin/./conf/zoo.cfgClient port found: 2181. Client address: localhost. Client SSL:
18、 false. Mode: leaderrootdcaction03 # /data/zookeeper/bin/zkServer.sh status /usr/bin/javaZooKeeper JMX enabled by defaultUsing config: /data/zookeeper/bin/./conf/zoo.cfgClient port found: 2181. Client address: localhost. Client SSL: false. Mode: follower10) zookeeper集群处理完毕4. clickhouse安装配置1) 每台服务器yu
19、m方式安装yum install -y yum-utilsyum-config-manager -add-repo yum install -y clickhouse-server clickhouse-client2) 配置/etc/clickhouse-server/config.xml (只标出修改局部): # 对外连接没有限制 ttclickhouse 集群配置 incl 表示外部文件 属性生效 #zookeeper 集群配置 incl 表示外部文件属性生效macros incl=macros-servers”/ #宏定义incl表示外部文件属性生效3)酉己置/etc/metri ka
20、.xml #clickhouse 集群酉己置#clickhouse 集群名称 三个分片 单个复制false #定 义复制同步方式192,168.10.1129000192.168.10.1139000192.168.10.1149000 #宏定义配置用于复制表批量DDL配合参数化01#分片的标识 不同分片标识不同replicadcaction01#复制的标识 同一个分片下 复制的标识不同 #zookeeper 酉己置dcaction012181dcaction022181dcaction0321814)以上配置三台服务器都执行后,每台服务器进行第5步5) /etc/init.d/clickho
21、use-server start6)查看集群配置(root9dcaction61 data# clickhouse-clientClickHouse client version 22.2.2.1.Connecting to localhost:%” as user default.Connected to ClickHousc server version 22.2.2 revision 5445s.dcactionBl :) select fro syste(n.clusters;SELECT HWK system, clustersQuery id: db5b6da-dl8d-4419-
22、ble4-ld39816466b8-clustercluste r_3_sha rd_ repl leasi 7shard jwi9ht-replicar-host.address器-loot1-userdefaultr-e r rors_cotmt- -sl(Mdowns_count-1 e-est ifBited_recovery_t211192.168.1.113defaulteeecluster_3_$hard_l_replic$31190Mdefaultee03 rows in set. Elapsed: 0.005 sec.5.复制表创立#三台服务器分别执行dcactiond :)
23、CREATE TABLE tutorial.testll5 (dcactiond :)CREATE TABLE tutorial.testll5 (EventDate DateTime,CounterlD Ulnt32,UserID Ulnt32 )ENGINE = ReplicatedMergeTree(7clickhouse/tables/test/testll5, replicationll2)ENGINE = ReplicatedMergeTree(7clickhouse/tables/test/testll5, replicationll2)PARTITION BY toYYYYMM
24、(EventDate) ORDER BY(CounterlD, EventDate, intHash32(UserlD) SAMPLE BY intHash32(UserlD);dcaction02 :)CREATE TABLE tutorial.testll5 (dcaction02 :)CREATE TABLE tutorial.testll5 (EventDate DateTime,CounterlD Ulnt32,UserID Ulnt32 )ENGINE = ReplicatedMergeTree(7clickhouse/tables/test/testll5, replicatio
25、nll3)ENGINE = ReplicatedMergeTree(7clickhouse/tables/test/testll5, replicationll3)PARTITION BY toYYYYMM(EventDate) ORDER BY(CounterlD, EventDate, intHash32(UserlD) SAMPLE BY intHash32(UserlD);dcaction03 :)CREATE TABLE tutorial.testllS (dcaction03 :)CREATE TABLE tutorial.testllS (EventDate DateTime,C
26、ounterlD Ulnt32zUserID Ulnt32 )PARTITION BY toYYYYMM(EventDate) ORDER BYENGINE = ReplicatedMergeTree(7clickhouse/tables/test/testll5, replicationll4) (CounterlD, EventDate, intHash32(UserlD) SAMPLE BY intHash32(UserlD);#在dcactionOl执行插入dcactionOl :) insert into tutorial.testllS values(2022-01-22 02:0
27、1:01,444,333);#在 dcaction02 dcaction03 查看 dcaction02 :)dcaction02 :) select * from testll5;SELECT * HUM testllSQuery id: 7e2573f4-cc20-4db0-8e2a-877d53131465 EventDatep-Counte rlD-j-Ose rlD-,I 2022-01-22 02:61:01 |444 |333 |dcaction03 :)dcaction63 :) select from testll5;SELECT *FROM testll5Query id:
28、 8a25Be5d-67e2-430b-ae62-6906c8c379a5EventDate-p-Counte rIDjUse rID-)2022-01-22 02:01:01 |444333 |.复制表参数化DDL批量创立#SQL语句CREATE TABLE tutorial.testll? on cluster cluster_3_shard_l_replicas (CREATE TABLE tutorial.testll? on cluster cluster_3_shard_l_replicas (EventDate DateTime,CounterlD Ulnt32zUserID U
29、lnt32 ) ENGINE = ReplicatedMergeTree(7clickhouse/tables/shard/testll7, replica) PARTITION BY toYYYYMM(EventDate) ORDER BY (CounterlD, EventDate, intHash32(UserlD) SAMPLE BY intHash32(UserlD);#宏定义配置#dcaction01rootgdcactionfll cat /etc/clickhouse-server/config.xml|grep macro acres incl= Mcros-servers
30、/ rootdcaction01 # cat /etc/netrika.xml|grep -A6 macro Blreplicaxicaction01#dcaction02rootdcaction02 # cat /etc/clickhouse-server/config.xml|grep macro rootdcaction02 # cat /etc/metrika.xl|grep -A6 macro 02dcaction02#dcaction03rootdcaction03 -# cat /etc/clickhouse-server/config.xml|grep nacrorootedc
31、action03 # cat /etc/metrika.xml|grep -A6 macro 03replicaxicaction036 .分布表+本地表创立#on cluster DDL批量创立本地表dcaCtionOl :) CREATE TABLE testll6 on cluster cluster_3_shard_l_replicas ( EventDate DateTime, CounterlD Ulnt32z UserID Ulnt32 ) ENGINE = MergeTree PARTITION BY toYYYYMM(EventDate) ORDER BY (CounterlD, EventDate, intHash32(UserlD) SAMPLE BYintHash32(UserlD);#on cluster DDL批量创立分布表dcaCtionOl :) CREATE TABLE tutorial.testll6_all on cluster cluster_3_shard_l_repIicas AS tutorial.testll6 ENGINE = Distributed(cluster_3_shard_l_replicas, tutorial, testll6, rand();8.分布表+复制表创立#on cluster DDL复制表 语句