Deployment Architectureedit

There are multiple ways to deploy Uptime and Heartbeat. Use the information in this section to determine the best deployment for you. A guiding principle is that an outage that takes down the service being monitored should not also take down Heartbeat. You want Heartbeat to be functioning even when your service is not, so the guidelines here help you maximise this possibility.

Heartbeat is generally run as a centralized service within a data center. While it is possible to run it as a separate "sidecar" process paired with each process/container, we recommend against it. Running Heartbeat centrally ensures you will still be able to see monitoring data in the event of an overloaded, disconnected, or otherwise malfunctioning server.

For further redundancy, you may want to deploy multiple Heartbeats across geographic and/or network boundaries to provide more data. Specify Heartbeat’s observer geo options to do so. Some examples might be:

  • A site served from a content delivery network (CDN) with points of presence (POPs) around the globe: In this case you may want to have multiple Heartbeat instances at different data centers around the world checking to see if your site is reachable via local CDN POPs.
  • A service within a single data center that is accessed across multiple VPNs: Set up one Heartbeat instance within the VPN the service operates from, and another within an additional VPN that users access the service from. Having both instances will help pinpoint network errors in the event of an outage.
  • A single service running primarily in a US east coast data center, with a hot failover located in a US west coast data center: In each data center, run a Heartbeat instance that checks both the local copy of the service and its counterpart across the country. Set up two monitors in each region, one for the local service and one for the remote service. In the event of a data center failure it will be immediately obvious if the service had a connectivity issue to the outside world or if the failure was only internal.