From 0e01bc6758e545cd98ad9a47007e5244b1e56554 Mon Sep 17 00:00:00 2001
From: Nicholas Orlowsky
Date: Sat, 28 Dec 2024 13:34:58 -0500
Subject: [PATCH] add new blog

---
 src/main.rs | 10 +++-
 .../blogs/11-28-2024-onward-portmortem.html | 51 +++++++++++++++++++
 templates/index.html | 5 +-
 3 files changed, 63 insertions(+), 3 deletions(-)
 create mode 100644 templates/blogs/11-28-2024-onward-portmortem.html

diff --git a/src/main.rs b/src/main.rs
index d6db50c..10cd001 100644
--- a/src/main.rs
+++ b/src/main.rs
@@ -113,10 +113,18 @@ async fn main() {
     lazy_static! {
         static ref blogs: HashMap<&'static str, BlogInfo<'static>> = {
             let mut m = HashMap::new();
+            m.insert(
+                "11-28-2024-postmortem",
+                BlogInfo {
+                    title: "Downtime Incident Postmortem (Nov 2024 - Present)",
+                    date: "December 28th, 2024",
+                    url: "11-28-2023-postmortem",
+                },
+            );
             m.insert(
                 "11-08-2023-postmortem",
                 BlogInfo {
-                    title: "Downtime Incident Postmortem",
+                    title: "Downtime Incident Postmortem (Nov 2023)",
                     date: "November 11th, 2023",
                     url: "11-08-2023-postmortem",
                 },
diff --git a/templates/blogs/11-28-2024-onward-portmortem.html b/templates/blogs/11-28-2024-onward-portmortem.html
new file mode 100644
index 0000000..6fbcdab
--- /dev/null
+++ b/templates/blogs/11-28-2024-onward-portmortem.html
@@ -0,0 +1,51 @@
+

NWS Incident Postmortem 11/28/2024 - Present

+ +

+    On November 28th, 2024, at approximately 07:37 UTC, NWS suffered
+    a complete outage. The outage took down all
+    services hosted on NWS, along with the NWS Management
+    Engine and the NWS dashboard.
+

+ +

+    The incident lasted 10 days and 15 hours, after which it was manually
+    resolved and all services were restored. This was NWS' first
+    outage of 2024.
+

+ +

+    Since then, similar outages have occurred.
+

+ +

Cause

+

+    NWS uses several tactics to ensure uptime, including load
+    balancing and failover. Due to logistical issues,
+    only one NWS point of presence has been operating since early
+    November 2024, so any issue with the remaining
+    datacenter results in a total outage. More points of presence
+    are expected to be brought online in August 2025. Similar incidents are
+    expected until then.
+

+ +

+    This outage lasted more than 10 days because I was busy with
+    school. I'm not super concerned about maintaining high uptime with
+    only one server, and I'm pretty happy with NWS since we hit 100% uptime
+    for a period of more than 365 days.
+

+ +

+    The outage occurred because the Xfinity (yeah :( ) router that
+    NWS uses at the Pottsville location encountered an issue that caused
+    it to automatically drop all port forwards. To prevent a recurrence, a new
+    Ubiquiti EdgeMax router is scheduled to be installed in December 2024.
+

+ + +

Fix

+

+    The port forwards were restored, and the router is scheduled to be replaced.
+

+ +

Last updated on December 28th, 2024

+
diff --git a/templates/index.html b/templates/index.html
index 614f0c5..1d33226 100644
--- a/templates/index.html
+++ b/templates/index.html
@@ -18,11 +18,12 @@ We operate four datacenters located across three cities in two states. This infr

     This has led to us maintaining four nines availability (99.9931% ; 38 minutes of downtime
-    all year) for 2023 and 100% uptime for 2024 (YTD).
+    all year) for 2023 and 100% uptime for the period from 11/8/2023 to 11/28/2024 (over a year!). This was the original goal of NWS.

-    In 2024, YTD we have surpassed both Vercel and Github Pages in total uptime
+    Currently, NWS can only operate with one point of presence and, as such, will
+    have reduced uptime. This is expected to be resolved around August 2025.

Compare us to our competitors!