36 Minutes: How an Automated Monitor Saved Our Client from a Costly Outage

May 7, 2025

Author: Petro Gordiievych

May 7, 2026 — It was 00:53 when the server went down. No one was awake. No support ticket had arrived. Not a single customer had complained yet.

But within one check cycle of the outage starting, an automated monitor had already detected it, logged the incident, and sent an alert to the responsible team member's email. By 01:30, 36 minutes and 14 seconds later, the service was back online, the incident was closed, and a clean report was waiting in the inbox: what happened, when it started, when it was resolved, and what caused it.

The customer never knew there was a problem.

The Alternative

Here is what happens without uptime monitoring.

The server goes down at 00:53. Nobody knows. At 08:30 that morning, a client sends an email: "Your platform isn't working." By then the application has been unreachable for more than seven and a half hours. The team starts investigating from scratch. The issue is eventually traced and fixed. The damage is already done: missed transactions, frustrated users, a support thread that shouldn't exist.

For an e-commerce platform with modest traffic, eight hours of downtime can mean thousands of euros in lost revenue. For a SaaS product, it can mean cancelled subscriptions and support overhead that takes weeks to recover from. For any business, it means the customer found out before you did — which is never a good look.

What Uptime Monitoring Actually Does

An uptime monitor is a simple concept: an external service sends a request to your application every few minutes from a location outside your own infrastructure. If the response doesn't come back as expected — wrong status code, connection timeout, too slow — it triggers an alert immediately.

The key word is external. Your own internal health checks won't catch a network-level failure. Your server may report itself as healthy while being completely unreachable from the outside world. An external monitor doesn't care what your server thinks — it checks what a real user would experience.

In the incident above, the root cause was a connection timeout on the hosting provider's side. The server was running internally but unreachable externally. An internal health check would have missed it entirely. The external monitor, checking from Ashburn, USA, flagged it within one check cycle.
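
To make the check concrete, here is a minimal sketch of what a single external probe might look like. It is not how UptimeRobot or any other service is implemented; the names classify and check_once, the 5-second latency limit, and the 10-second timeout are all assumptions chosen for illustration. The point it shows is the three failure conditions named above: wrong status code, no response, and too slow.

```python
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Optional


@dataclass
class CheckResult:
    up: bool
    reason: str


def classify(status: Optional[int], elapsed_s: Optional[float],
             expected_status: int = 200,
             max_latency_s: float = 5.0) -> CheckResult:
    """Decide up/down from an observed status code and response time.

    status is None when no HTTP response was received at all.
    The thresholds are illustrative defaults, not values from any real tool.
    """
    if status is None:
        return CheckResult(False, "no response (connection error or timeout)")
    if status != expected_status:
        return CheckResult(False, f"unexpected status {status}")
    if elapsed_s is not None and elapsed_s > max_latency_s:
        return CheckResult(False, f"too slow ({elapsed_s:.1f}s)")
    return CheckResult(True, "ok")


def check_once(url: str, timeout_s: float = 10.0) -> CheckResult:
    """Send one HTTP GET to the URL and classify the outcome."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            status = resp.status
    except urllib.error.HTTPError as exc:
        status = exc.code  # the server answered, but with an error status
    except OSError:
        status = None  # DNS failure, refused connection, timeout, ...
    return classify(status, time.monotonic() - start)
```

Calling check_once with a placeholder address such as "https://example.com/health" from a scheduler every few minutes gives the basic loop. What makes it an external monitor is where it runs: the same code executed on the application server itself would not have seen the outage described here.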

The Cost of Downtime vs. the Cost of Monitoring

The numbers are straightforward.

Industry benchmarks place the average cost of IT downtime for a small or medium business at €300 to €1,500 per hour, accounting for lost revenue, staff investigation time, and customer impact. For businesses running critical operations online, the figure can be significantly higher.

UptimeRobot — the tool we use and recommend — costs €0 per month on the free tier, covering up to 50 monitors with 5-minute check intervals. The paid tier, which adds 1-minute checks and additional features, starts at a few euros per month.

The math doesn't require a spreadsheet.
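
For readers who want the arithmetic anyway, the sketch below applies the €300 to €1,500 per hour range quoted above to two durations: the 36 minutes 14 seconds of the real incident, and the hypothetical unmonitored scenario from 00:53 to 08:30. The function name downtime_cost_range is made up for this example, and the result is only as good as the hourly benchmark behind it.

```python
from datetime import timedelta


def downtime_cost_range(duration: timedelta,
                        low_per_hour: float = 300.0,
                        high_per_hour: float = 1500.0) -> tuple[float, float]:
    """Return the (low, high) estimated cost in euros for an outage."""
    hours = duration.total_seconds() / 3600
    return hours * low_per_hour, hours * high_per_hour


# Actual incident: detected by the monitor and resolved.
monitored = timedelta(minutes=36, seconds=14)

# Hypothetical: outage starts at 00:53, first noticed at 08:30.
unmonitored = timedelta(hours=8, minutes=30) - timedelta(minutes=53)

for label, duration in (("monitored", monitored), ("unmonitored", unmonitored)):
    low, high = downtime_cost_range(duration)
    print(f"{label}: {duration} of downtime, about EUR {low:,.0f} to {high:,.0f}")
```

Under these assumptions the monitored incident lands at roughly €180 to €900, and the unmonitored one at roughly €2,300 to €11,400.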

What a Good Monitoring Setup Covers

At GPO OU, uptime monitoring is a standard part of every production deployment we deliver. A solid baseline setup includes:

1. HTTP/HTTPS endpoint monitoring
Verify that your main application URL returns a 200 OK. If it returns anything else — or nothing — the alert fires.

2. API health checks
For applications with backend APIs, monitor the health endpoint directly. A working frontend with a broken API is still a broken product.

3. SSL certificate expiry alerts
An expired certificate does not stop the server, but browsers show a security warning to every visitor and most API clients refuse to connect, so the site is effectively offline. Monitors can alert you 30 days before expiry, long before it becomes an emergency.

4. Alert routing
Alerts should reach whoever can act on them: email, Telegram, Slack, PagerDuty, or a combination. A notification that reaches the wrong person at 01:00 is no better than no notification at all.

5. Incident reports
A timestamped log of every incident — start time, resolution time, duration, root cause, detection location — gives you the data to understand patterns and report accurately to clients when they ask.
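
As an illustration of item 3, the following sketch shows one way a certificate expiry check could work using only the Python standard library. The function names and the 30-day default are assumptions for this example; hosted monitors do this for you, and their internals may differ.

```python
import socket
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Days between now and a certificate's notAfter timestamp.

    not_after uses the format returned by getpeercert(),
    for example "Jun  1 00:00:00 2026 GMT".
    """
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after),
                                     tz=timezone.utc)
    return (expires - now).total_seconds() / 86400


def should_alert(not_after: str, now: datetime,
                 threshold_days: int = 30) -> bool:
    """True when the certificate expires within the threshold."""
    return days_until_expiry(not_after, now) <= threshold_days


def fetch_not_after(host: str, port: int = 443,
                    timeout_s: float = 10.0) -> str:
    """Read the notAfter field from the certificate a host presents.

    With the default context, an already expired certificate fails the
    handshake with ssl.SSLCertVerificationError, which a caller would
    treat as an alert in its own right.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout_s) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

A daily job could call should_alert(fetch_not_after("example.com"), datetime.now(timezone.utc)), where example.com is a placeholder, and route a positive result through the same channels as the downtime alerts.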

A Note on Infrastructure Reliability

No cloud infrastructure is immune to incidents. AWS, Hetzner, Google Cloud — all of them experience outages. The outage described in this post occurred on Hetzner, our preferred cloud infrastructure provider for most client workloads. Hetzner is reliable, cost-efficient, and GDPR-compliant — and still had a connection timeout event that affected a monitored endpoint for 36 minutes.

The provider didn't change the outcome. The monitoring did.

36 minutes of downtime. Zero customer impact. That is what the right tooling looks like in practice.

How GPO OU Can Help

If your production application is running without uptime monitoring, we can configure it in under an hour. If you are unsure what is or isn't being monitored in your current setup, we can audit it and close the gaps.

We work with UptimeRobot, Better Uptime, and custom monitoring solutions depending on project requirements. All application and cloud deployments from GPO OU include monitoring configuration as a standard deliverable — not an afterthought.

Contact us:

Website: https://gpo-tech.com
Email: contact@gpo-tech.com

GPO OU is a software and hardware development company based in Estonia, specialising in cloud infrastructure, IoT systems, and autonomous drone technology. We build and maintain production systems for clients across Europe.
