Here are just some notes about MySQL high availability (HA) based on practical experiences.

  1. Companies have to grow to the point when they finally understand necessity for HA.
    • It usually means their services for customers get so valuable (usually meaning SLAs fees are so severe) so they finally understand HA means redundancy in system and it means to spend some money “for nothing”.
    • Also it means to go through at least one real disaster situation when people finally can see that magic spell “we have backups” is just an illusion. Because restoring database even only several GB big can mean dozens of minutes or several hours. Which will kill all SLAs with absolute certainty.
  2. Of course people tend to forget bad things so after some time there will be always someone who will ask – why are we spending money for redundant servers/ VM instances? So it is always good idea to document old problems properly to be able to argue with such people.
  3. For example if you have MySQL cluster which must be from time to time scaled up to withstand some peaks it is good idea to have some replicas without load available + one special replica which is intended for base backups or cloning (on cloud) in case you need to create more replicas in some reasonable time.
  4. This special replica for backup/ cloning should not be used for other purposes. So you will be able to lock it with read lock or even stop it whenever you need and make backup or clone instance.
  5. Of course if you have no problem to switch master to read_only or stop it any time you need then this special replica is not necessary. But this is most probably not typical scenario…
  6. Replicas should be in read_only mode and you should not use superuser across your applications. Which is usually very often problem….