add new blog

Nicholas Orlowsky 2024-12-28 13:34:58 -05:00
parent 4b71a0f116
commit 0e01bc6758
3 changed files with 63 additions and 3 deletions

@@ -113,10 +113,18 @@ async fn main() {
lazy_static! {
static ref blogs: HashMap<&'static str, BlogInfo<'static>> = {
let mut m = HashMap::new();
m.insert(
"11-28-2024-postmortem",
BlogInfo {
title: "Downtime Incident Postmortem (Nov 2024 - Present)",
date: "December 28th, 2024",
url: "11-28-2024-postmortem",
},
);
m.insert(
"11-08-2023-postmortem",
BlogInfo {
- title: "Downtime Incident Postmortem",
+ title: "Downtime Incident Postmortem (Nov 2023)",
date: "November 11th, 2023",
url: "11-08-2023-postmortem",
},

@@ -0,0 +1,51 @@
<h1>NWS Incident Postmortem 11/28/2024 - Present</h1>
<p>
On November 28th, 2024 at approximately 07:37 UTC, NWS suffered
a complete outage. The outage took down all services hosted on
NWS, as well as the NWS Management Engine and the NWS dashboard.
</p>
<p>
The incident lasted 10 days and 15 hours, after which it was manually
resolved and all services were restored. This was NWS' first
outage event of 2024.
</p>
<p>
Since then, similar outages have occurred.
</p>
<h2>Cause</h2>
<p>
NWS uses several tactics to ensure uptime, including load
balancing and failover. Due to logistical issues,
only one NWS point of presence has been operating since early
November 2024. This means that any issue with the remaining
datacenter results in a total outage. More points of presence
are expected to be brought online in August 2025. Similar incidents are
expected until then.
</p>
<p>
This outage lasted 10 days because I was busy with
school. I'm not super concerned about maintaining high uptime with
only one server, and I'm pretty happy with NWS since we hit 100% uptime
for a >365 day period.
</p>
<p>
The cause of the outage was that the Xfinity ( yeah :( ) router that
NWS uses in the Pottsville location encountered an issue which caused
it to automatically drop all port forwards. To combat this issue, a new
Ubiquiti EdgeMax router is scheduled to be installed in December 2024.
</p>
<h2>Fix</h2>
<p>
The port forwards were restored and the router is scheduled to be replaced.
</p>
<p>Last updated on December 28th, 2024</p>

@@ -18,11 +18,12 @@ We operate four datacenters located across three cities in two states. This infr
<p>
This has led to us maintaining four nines availability (99.9931% ; 38 minutes of downtime
- all year) for 2023 and 100% uptime for 2024 (YTD).
+ all year) for 2023 and <b>100% uptime for the period from 11/8/2023 to 11/28/2024 (over a year!). This was the original goal of NWS.</b>
</p>
<p>
- In 2024, YTD we have surpassed both Vercel and Github Pages in total uptime
+ Currently, NWS is only able to operate with one point of presence and as such, will
+ have reduced uptime. This is expected to be resolved around August 2025.
</p>
<h2>Compare us to our competitors!</h2>