It is always useful to have monitoring – we use mysql-exporter, node-exporter, Prometheus and Grafana. After 1 year of using it I cannot imagine solving any problem on the database without monitoring.
Percona dashboards for Grafana + my own dashboards based on them showing data from mysql-exporter are my “everyday friends” 🙂
If you see lags in monitoring:
- Check output of the command
show slave status\G
- value “Relay_Log_File” shows relay log which is currently processed on replica.
- Look into directory with relay logs (variable “relay-log”) and check if there are relay logs with higher number in extension available.
- If not – problem is with IO_THREAD – meaning your network is too slow. Try:
- set “slave_compressed_protocol=ON” – data sent from master will be compressed so lower amount will be transfered.
- If you have problems with network connectivity check “slave_net_timeout” (=timeout in seconds) and if it is too high set it to some small number like 10 etc.
- Similar setting is “CHANGE MASTER TO MASTER_CONNECT_RETRY = xx” to some low value (value is also in seconds).
- If you see more unapplied relay logs available then problem is with SQL_THREAD which is more tricky.
- binlog_format=MIXED (or even STATEMENT)
- innodb_thread_concurrency to some higher number
- slave_parallel_workers to some non zero or higher number (most likely you will have to set “slave-parallel-type=LOGICAL_CLOCK”)