Collecting MySQL Slow Logs with ELK and Alerting on Them
The pipeline works like this: Filebeat collects the slow logs, Redis buffers them, Logstash consumes and parses them and writes the results to ES, Kibana displays the logs, and ElastAlert watches for long-running slow queries and raises alerts.
1. Installing the ELK stack
Reference: https://www.cnblogs.com/98record/p/13648570.html
2. Installing ElastAlert
2.1 The official Git repository
Deployment is done with Docker.
```
[root@centos2 opt]# git clone https://github.com/Yelp/elastalert.git
[root@centos2 opt]# cd elastalert
[root@centos2 elastalert]# ls
changelog.md         docs           Makefile              requirements.txt  tests
config.yaml.example  elastalert     pytest.ini            setup.cfg         tox.ini
docker-compose.yml   example_rules  README.md             setup.py
Dockerfile-test      LICENSE        requirements-dev.txt  supervisord.conf.example

# Create the Dockerfile
[root@centos2 elastalert]# cat Dockerfile
FROM ubuntu:latest
RUN apt-get update && apt-get upgrade -y
RUN apt-get -y install build-essential python3 python3-dev python3-pip libssl-dev git
WORKDIR /home/elastalert
ADD requirements*.txt ./
RUN pip3 install -r requirements-dev.txt

# Build the image and start the container
[root@centos2 elastalert]# docker build -t elastalert:1 .
[root@centos2 elastalert]# docker run -itd --name elastalert -v `pwd`/:/home/elastalert/ elastalert:1
[root@centos2 elastalert]# docker exec -it elastalert bash
root@45f77d2936d4:/home/elastalert# pip install elastalert
```
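Before going further, it doesn't hurt to confirm that the pip install actually put ElastAlert's command-line tools on the PATH (exact output varies with the version you pulled):

```
# Still inside the container
root@45f77d2936d4:/home/elastalert# pip show elastalert
root@45f77d2936d4:/home/elastalert# elastalert-test-rule --help
```

Both tools come up again in the configuration and startup sections below.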
2.2 The integrated Git repository
The official Docker setup has not been updated in years, which causes a number of problems, and it does not include the DingTalk plugin. To fit my own needs I integrated the DingTalk plugin and rewrote the Dockerfile. I have uploaded the result, merged with the official code, to my Gitee repository; if you need it, just clone it.
```
git clone https://gitee.com/rubbishes/elastalert-dingtalk.git
cd elastalert-dingtalk
docker build -t elastalert:1 .
docker run -itd --name elastalert -v `pwd`/:/home/elastalert/ elastalert:1
```
3. Configuration
3.1 Filebeat configuration
```
[root@mysql-178 filebeat-7.6.0-linux-x86_64]# vim filebeat.yml

#=========================== Filebeat inputs =============================
filebeat.inputs:
- type: log

  # Change to true to enable this input configuration.
  enabled: true

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /usr/local/mysql/data/mysql-178-slow.log
    #- c:\programdata\elasticsearch\logs\*

  # Exclude lines. A list of regular expressions to match. It drops the lines that are
  # matching any regular expression from the list.
  #exclude_lines: ['^# Time']
  exclude_lines: ['^# Time|^/usr/local/mysql/bin/mysqld|^Tcp port|^Time']

  multiline.pattern: '^# Time|^# User'
  multiline.negate: true
  multiline.match: after

  # Whether Filebeat re-reads the log from the beginning; by default it does.
  #tail_files: true

  tags: ["mysql-slow-log"]

#============================= Filebeat modules ===============================
filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml

  # Set to true to enable config reloading
  reload.enabled: true

  # Period on which files under path should be checked for changes
  reload.period: 10s

#==================== Elasticsearch template setting ==========================
setup.template.settings:
  index.number_of_shards: 1
  #index.codec: best_compression
  #_source.enabled: false

#================================ General =====================================
# The name of the shipper that publishes the network data. It can be used to group
# all the transactions sent by a single shipper in the web interface.
name: 10.228.81.178

#============================== Dashboards =====================================
# These settings control loading the sample dashboards to the Kibana index. Loading
# the dashboards is disabled by default and can be enabled either by setting the
# options here or by using the `setup` command.
#setup.dashboards.enabled: false

# The URL from where to download the dashboards archive. By default this URL
# has a value which is computed based on the Beat name and version. For released
# versions, this URL points to the dashboard archive on the artifacts.elastic.co
# website.
#setup.dashboards.url:

#============================== Kibana =====================================
# Starting with Beats version 6.0.0, the dashboards are loaded via the Kibana API.
# This requires a Kibana endpoint configuration.
setup.kibana:

  # Kibana Host
  # Scheme and port can be left out and will be set to the default (http and 5601)
  # In case you specify and additional path, the scheme is required: http://localhost:5601/path
  # IPv6 addresses should always be defined as: https://[2001:db8::1]:5601
  #host: "localhost:5601"

  # Kibana Space ID
  # ID of the Kibana Space into which the dashboards should be loaded. By default,
  # the Default Space will be used.
  #space.id:

#================================ Outputs =====================================
# Configure what output to use when sending the data collected by the beat.

#-------------------------- Elasticsearch output ------------------------------
#output.elasticsearch:
  # Array of hosts to connect to.
  # hosts: ["localhost:9200"]

  # Protocol - either `http` (default) or `https`.
  #protocol: "https"

  # Authentication credentials - either API key or username/password.
  #api_key: "id:api_key"
  #username: "elastic"
  #password: "changeme"

#----------------------------- Logstash output --------------------------------
#output.logstash:
  # The Logstash hosts
  # hosts: ["localhost:5044"]

  # Optional SSL. By default is off.
  # List of root certificates for HTTPS server verifications
  #ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]

  # Certificate for SSL client authentication
  #ssl.certificate: "/etc/pki/client/cert.pem"

  # Client Certificate Key
  #ssl.key: "/etc/pki/client/cert.key"

#================================ Processors =====================================
# Configure processors to enhance or manipulate events generated by the beat.
processors:
  - add_host_metadata: ~
  - add_cloud_metadata: ~
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~
  # Drop fields we do not need
  - drop_fields:
      fields: ["beat", "offset", "prospector"]

#================================ Logging =====================================
# Sets log level. The default log level is info.
# Available log levels are: error, warning, info, debug
# Turn debug on while you are still testing, then comment it out again.
#logging.level: debug

# At debug level, you can selectively enable logging only for some components.
# To enable all selectors use ["*"]. Examples of other selectors are "beat",
# "publish", "service".
#logging.selectors: ["*"]

#================================= Migration ==================================
# This allows to enable 6.7 migration aliases
#migration.6_to_7.enabled: true

# Ship to Redis
output.redis:
  hosts: ["10.228.81.51:6379"]
  password: "123456"
  db: "1"
  key: "mysqllog"
  timeout: 5
  datatype: list
```
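For reference, a slow-log entry in the file above looks roughly like this (the values are illustrative); `exclude_lines` drops the `# Time` and server-header noise, and the `multiline` settings glue everything from the `# User@Host` line through the SQL statement into one event:

```
# Time: 2020-09-10T09:12:03.821786Z
# User@Host: root[root] @  [10.228.81.178]  Id:   212
# Query_time: 12.000213  Lock_time: 0.000123 Rows_sent: 1  Rows_examined: 4320000
SET timestamp=1599729123;
select count(*) from big_table where padding like '%x%';
```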
3.2 Logstash configuration
I recommend deploying Logstash with Docker or from the binary tarball; when deployed from the RPM package it complained that the ruby filter statement was not supported.
```
input {
    redis {
        host => "10.228.81.51"
        port => 6379
        password => "123456"
        db => "1"
        data_type => "list"
        key => "mysqllog"
    }
}
filter {
    json {
        source => "message"
    }
    grok {
        match => [ "message", "(?m)^#\s+User@Host:\s+%{USER:user}\[[^\]]+\]\s+@\s+(?:(?<clienthost>\S*) )?\[(?:%{IPV4:clientip})?\]\s+Id:\s+%{NUMBER:row_id:int}\n#\s+Query_time:\s+%{NUMBER:query_time:float}\s+Lock_time:\s+%{NUMBER:lock_time:float}\s+Rows_sent:\s+%{NUMBER:rows_sent:int}\s+Rows_examined:\s+%{NUMBER:rows_examined:int}\n\s*(?:use %{DATA:database};\s*\n)?SET\s+timestamp=%{NUMBER:timestamp};\n\s*(?<sql>(?<action>\w+)\b.*;)\s*(?:\n#\s+Time)?.*$" ]
    }
    # Replace the timestamp
    date {
        locale => "en"
        match => ["timestamp", "UNIX"]
        target => "@timestamp"
    }
    # MySQL logs in UTC, eight hours behind our local time, so add eight hours
    # to the timestamp before sending it to ES.
    ruby {
        code => "event.set('timestamp', event.get('@timestamp').time.localtime + 8*3600)"
    }
}
output {
    stdout {
        # Debug output; useful while testing, but disable it afterwards -- the log
        # volume is huge, especially when shipping something like the MySQL binlog.
        codec => rubydebug
    }
    # If the first tag equals mysql-slow-log, write to ES under an index named
    # mysql-slow-log-YYYY.MM.dd.
    if [tags][0] == "mysql-slow-log" {
        elasticsearch {
            hosts => ["10.228.81.51:9200"]
            index => "%{[tags][0]}-%{+YYYY.MM.dd}"
        }
    }
}
```
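If the grok pattern matches, the fields it pulls out land in the ES document alongside `message`; for the sample entry shown earlier they would look something like this (hand-written for illustration, not captured output):

```
{
  "user": "root",
  "clientip": "10.228.81.178",
  "row_id": 212,
  "query_time": 12.000213,
  "lock_time": 0.000123,
  "rows_sent": 1,
  "rows_examined": 4320000,
  "action": "select",
  "sql": "select count(*) from big_table where padding like '%x%';"
}
```

These are the fields (`clientip`, `query_time`) that the alert rules below reference.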
3.3 ElastAlert configuration
3.3.1 config.yaml configuration
First, copy the shipped example:
```
cp config.yaml.example config.yaml
```
then adjust it as needed. Mine looks like this:
```
# Mainly point it at your ES host and port; the rest can stay at the defaults.

# This is the folder that contains the rule yaml files
# Any .yaml file will be loaded as a rule
rules_folder: example_rules

run_every:
  minutes: 1

buffer_time:
  minutes: 15

# The Elasticsearch hostname for metadata writeback
# Note that every rule can have its own Elasticsearch host
es_host: 10.228.81.51

# The Elasticsearch port
es_port: 9200

# The index on es_host which is used for metadata storage
# This can be a unmapped index, but it is recommended that you run
# elastalert-create-index to set a mapping
writeback_index: elastalert_status
writeback_alias: elastalert_alerts

# If an alert fails for some reason, ElastAlert will retry
# sending the alert until this time period has elapsed
alert_time_limit:
  days: 2
```
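As the comments say, the writeback index should get its mapping before the first run; the helper that ships with ElastAlert creates it. Run it once, inside the container, and point it at the same ES host when it prompts for connection details:

```
root@45f77d2936d4:/home/elastalert# elastalert-create-index
```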
If you pulled from my Git repository, just edit the config.yaml that ships with it; the changes are much the same as above.
3.3.2 rule.yaml configuration
This is where the alert rules themselves are defined.
DingTalk notification
```
cd example_rules
cat mysql_rule.yaml

# ES host and port
es_host: 10.228.81.51
es_port: 9200

# Do not use HTTPS
use_ssl: False

# Unique identifier for the rule; it must not clash with other rules
name: My-Product Exception Alert

# Rule type
## Supported types: any, blacklist, whitelist, change, frequency, spike, flatline, new_term, cardinality
### frequency: under the same query_key, num_events matching events within timeframe
type: frequency

# Index to query; wildcards and regex work the same as in Kibana
index: mysql-*

# Number of matching events that triggers the alert
num_events: 1

# Works together with num_events: one hit within 5 minutes fires an alert
timeframe:
  minutes: 5

# The alert filter
filter:
- query:
    query_string:
      # This uses the ES query-string syntax; while testing, build the query in
      # Kibana until it returns what you want, then paste it here
      query: "user:eopuser OR user:root"

# Fields to include in the alert; all fields by default
include: ["message", "clientip", "query_time"]

# Alert method; I use DingTalk here, email and WeChat Work are also supported
alert:
- "elastalert_modules.dingtalk_alert.DingTalkAlerter"

# Your robot's webhook API
dingtalk_webhook: "https://oapi.dingtalk.com/robot/send?access_token=96eabeeaf956bb26128fed1259cxxxxxxxxxxfa6b2baeb"

# The DingTalk title, which is also the robot's keyword
dingtalk_msgtype: "text"

#alert_subject: "test"

# Alert body format
alert_text: " text: 1 \n IP: {}\n QUERYTIME: {} "
alert_text_args:
- clientip
- query_time
```
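Before relying on the rule, you can poke the robot's webhook directly with curl to confirm the token and keyword are right. Substitute your own access_token; if the content does not contain the robot's keyword, DingTalk answers with an error code instead of posting the message:

```
curl -s 'https://oapi.dingtalk.com/robot/send?access_token=YOUR_TOKEN' \
    -H 'Content-Type: application/json' \
    -d '{"msgtype": "text", "text": {"content": "text: slow query alert test"}}'
```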
Email notification
```
# Much the same as the DingTalk rule, except for the email-related settings
root@45f77d2936d4:/home/elastalert/example_rules# cat myrule_email.yaml
es_host: 10.228.81.51
es_port: 9200
use_ssl: False

# name must be unique; ideally it identifies your product
name: My-Product Exception Alert

# Rule type; "any" sends an email for every matching event
type: any

# Index to monitor; wildcards supported
index: mysql-*

num_events: 50
timeframe:
  hours: 4

filter:
- query:
    query_string:
      query: "user:eopuser OR user:root"

# Alert by email
alert:
- "email"

# Email body
alert_text: "test"

# SMTP server settings (mine is an Alibaba enterprise mailbox)
smtp_host: smtp.mxhichina.com
smtp_port: 25

# Credentials file; needs the user and password attributes
smtp_auth_file: smtp_auth_file.yaml
email_reply_to: test@test.com
from_addr: test@test.com

# Recipient list
email:
- "test@test.com"

# Since the account and password live in a yaml file, create it in the same directory
root@45f77d2936d4:/home/elastalert/example_rules# cat smtp_auth_file.yaml
user: "test@test.com"
password: "123456"
```
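Like any other rule, the email rule can be dry-run before it goes live:

```
root@45f77d2936d4:/home/elastalert/example_rules# cd ..
root@45f77d2936d4:/home/elastalert# elastalert-test-rule example_rules/myrule_email.yaml
```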
Note: if you built from my code, put your rules in example_rules/myrule.yaml; other rule filenames are ignored unless you also edit my run.sh script.
3.3.3 Installing the DingTalk plugin
The stock image has no DingTalk plugin, so it has to be installed by hand. If you built from my Dockerfile it is already included and you can skip this step.
```
git clone https://github.com.cnpmjs.org/xuyaoqiang/elastalert-dingtalk-plugin.git
cd elastalert-dingtalk-plugin/
# Copy the elastalert_modules directory into the elastalert root
cp -r elastalert_modules ../elastalert/
```
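A one-line import check confirms the plugin landed where ElastAlert will look for it; the module path has to match the alert: entry used in the rule file above:

```
cd ../elastalert
python3 -c "from elastalert_modules.dingtalk_alert import DingTalkAlerter; print('ok')"
```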
4. Startup
Startup order:
ES > Kibana > elastalert > Redis > Filebeat > Logstash
In practice the ordering boils down to this: ES must come up first so Kibana can start; ElastAlert goes next so alerting is in place; then Redis, so Filebeat has somewhere to write; Filebeat starts shipping logs into Redis; and Logstash starts last, consuming from Redis and storing into ES.
Starting the other components is covered in the document linked at the beginning, so I won't repeat it here; only the ElastAlert startup deserves a few extra words.
As before, if your container was built from my code, you can skip this step.
```
# Enter the container
[root@centos2 elastalert]# docker exec -it elastalert bash

# First, test that the rule file is valid
root@45f77d2936d4:/home/elastalert# elastalert-test-rule example_rules/myrule.yaml

# If it looks good, run ElastAlert in the background
root@45f77d2936d4:/home/elastalert# nohup python3 -m elastalert.elastalert --verbose --rule example_rules/myrule.yaml &
root@45f77d2936d4:/home/elastalert# exit
```
5. Extras
Writing the ElastAlert Dockerfile
```
FROM ubuntu:latest
RUN apt-get update && apt-get upgrade -y && \
    apt-get install -y build-essential python3 python3-dev python3-pip libssl-dev git && \
    echo "Asia/Shanghai" > /etc/timezone
WORKDIR /home/elastalert
ADD ./* ./
RUN pip3 install elastalert && \
    ln -sf /dev/stdout elastalert.log && \
    ln -sf /dev/stderr elastalert.log
CMD ["/bin/bash", "run.sh"]
```
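The CMD assumes a run.sh in the mounted directory. Mine lives in the Gitee repository; a minimal sketch of what such a script needs to do, assuming a single rule file as in section 3.3.2, could look like this:

```
#!/bin/bash
# Hypothetical run.sh sketch -- adjust the rule path to your own layout.
cd /home/elastalert
# Fail fast if the rule file does not validate.
elastalert-test-rule example_rules/myrule.yaml || exit 1
# Run in the foreground so the container stays alive and logs go to stdout.
exec python3 -m elastalert.elastalert --verbose --rule example_rules/myrule.yaml
```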
Run it:
```
docker run -itd --name elastalert -v /root/elastalert/:/home/elastalert/ -v /etc/localtime:/etc/localtime elastalert:1
```
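Once it is up, a quick look at the container output shows whether the rule loop is running:

```
docker logs -f elastalert
```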