add new blog
parent
4b71a0f116
commit
0e01bc6758
10
src/main.rs
@@ -113,10 +113,18 @@ async fn main() {
 lazy_static! {
     static ref blogs: HashMap<&'static str, BlogInfo<'static>> = {
         let mut m = HashMap::new();
+        m.insert(
+            "11-28-2024-postmortem",
+            BlogInfo {
+                title: "Downtime Incident Postmortem (Nov 2024 - Present)",
+                date: "December 28th, 2024",
+                url: "11-28-2024-postmortem",
+            },
+        );
         m.insert(
             "11-08-2023-postmortem",
             BlogInfo {
-                title: "Downtime Incident Postmortem",
+                title: "Downtime Incident Postmortem (Nov 2023)",
                 date: "November 11th, 2023",
                 url: "11-08-2023-postmortem",
             },
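Review note: in the existing 2023 entry the map key and the `url` field carry the same slug. Assuming that is the intended invariant for every entry (the commit does not say so), a small check like the sketch below could catch a slug that drifts from its key. The `BlogInfo` definition here is a stand-in, since the real struct is not shown in this commit, and `mismatched_slugs` is a hypothetical helper name.

```rust
use std::collections::HashMap;

// Stand-in for the real struct, which is not shown in this commit.
// The field names follow the hunk above; the derives are assumptions.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy)]
struct BlogInfo<'a> {
    title: &'a str,
    date: &'a str,
    url: &'a str,
}

/// Returns, in sorted order, every key whose entry has a `url` that
/// differs from the key. Hypothetical helper, not part of the commit.
fn mismatched_slugs<'a>(blogs: &HashMap<&'a str, BlogInfo<'a>>) -> Vec<&'a str> {
    let mut bad: Vec<&'a str> = blogs
        .iter()
        .filter(|(key, info)| **key != info.url)
        .map(|(key, _)| *key)
        .collect();
    // HashMap iteration order is unspecified, so sort for stable output.
    bad.sort();
    bad
}
```

Calling `mismatched_slugs(&blogs)` from a unit test and asserting that the result is empty would turn the convention into a build-time check.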
51
templates/blogs/11-28-2024-onward-portmortem.html
Normal file
@@ -0,0 +1,51 @@
+<h1>NWS Incident Postmortem 11/28/2024 - Present</h1>
+
+<p>
+On November 28th, 2024 at approximately 07:37 UTC, NWS suffered
+a complete outage. The outage took down all services hosted on
+NWS, as well as the NWS Management Engine and the NWS
+dashboard.
+</p>
+
+<p>
+The incident lasted 10 days and 15 hours, after which it was manually
+resolved and all services were restored. This was NWS' first
+outage event of 2024.
+</p>
+
+<p>
+Since then, similar outages have occurred.
+</p>
+
+<h2>Cause</h2>
+<p>
+NWS uses several tactics to ensure uptime. One component of
+this is load balancing and failover. Due to logistical issues,
+only one NWS point of presence has been operating since early
+November 2024. This means that any issue with the remaining
+datacenter will result in a total outage. More points of presence
+are expected to be brought online in August 2025. Similar incidents are
+expected until then.
+</p>
+
+<p>
+This outage lasted 10 days because I was busy with
+school. I'm not super concerned about maintaining high uptime with
+only one server, and I'm pretty happy with NWS since we hit 100% uptime
+for a >365 day period.
+</p>
+
+<p>
+The cause of the outage was that the Xfinity ( yeah :( ) router that
+NWS uses in the Pottsville location encountered an issue which caused
+it to automatically drop all port forwards. To combat this issue, a new
+Ubiquiti EdgeMax router is scheduled to be installed in December 2024.
+</p>
+
+
+<h2>Fix</h2>
+<p>
+The port forwards were restored and the router is scheduled to be replaced.
+</p>
+
+<p>Last updated on December 28th, 2024</p>
@@ -18,11 +18,12 @@ We operate four datacenters located across three cities in two states. This infr
 
 <p>
 This has led to us maintaining four nines availability (99.9931% ; 38 minutes of downtime
-all year) for 2023 and 100% uptime for 2024 (YTD).
+all year) for 2023 and <b>100% uptime for the period from 11/8/2023 to 11/28/2024 (over a year!). This was the original goal of NWS.</b>
 </p>
 
 <p>
-In 2024, YTD we have surpassed both Vercel and Github Pages in total uptime
+Currently, NWS is only able to operate with one point of presence and, as such, will
+have reduced uptime. This is expected to be resolved around August 2025.
 </p>
 
 <h2>Compare us to our competitors!</h2>
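Review note: the uptime page quotes availability as a percentage next to a downtime figure, so it may help to have the arithmetic written down. The sketch below is not from the repository; the function names and the choice of minutes as the unit are assumptions made for illustration. It converts between downtime and availability for a given period.

```rust
/// Number of minutes in a 365-day year (a leap year would use 366 days).
const MINUTES_PER_365_DAY_YEAR: f64 = 365.0 * 24.0 * 60.0;

/// Availability as a percentage, given downtime and period length in minutes.
/// Hypothetical helper for illustration only.
fn availability_percent(downtime_minutes: f64, period_minutes: f64) -> f64 {
    (1.0 - downtime_minutes / period_minutes) * 100.0
}

/// Downtime, in minutes, that a target availability percentage allows
/// over a period of the given length. Hypothetical helper.
fn downtime_budget_minutes(target_percent: f64, period_minutes: f64) -> f64 {
    (1.0 - target_percent / 100.0) * period_minutes
}
```

For example, `downtime_budget_minutes(99.99, MINUTES_PER_365_DAY_YEAR)` comes to roughly 52.56 minutes, which is the yearly budget a four nines target leaves.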