From 10782342f299802d9e50f5f0580d7f17537d211b Mon Sep 17 00:00:00 2001
From: Nicholas Orlowsky
Date: Thu, 16 Nov 2023 13:17:56 -0500
Subject: [PATCH] ??

---
 out/blogs/nws-postmortem-11-8-23.html | 111 +++++++++++++++++
 out/blogs/side-project-10-20-23.html | 121 +++++++++++++++++++
 src/blogs/nws-postmortem-11-8-23.filler.html | 89 ++++++++++++++
 3 files changed, 321 insertions(+)
 create mode 100644 out/blogs/nws-postmortem-11-8-23.html
 create mode 100644 out/blogs/side-project-10-20-23.html
 create mode 100644 src/blogs/nws-postmortem-11-8-23.filler.html

diff --git a/out/blogs/nws-postmortem-11-8-23.html b/out/blogs/nws-postmortem-11-8-23.html
new file mode 100644
index 0000000..d15d790
--- /dev/null
+++ b/out/blogs/nws-postmortem-11-8-23.html
@@ -0,0 +1,111 @@
+ + Nicholas Orlowsky + + + + + + +

NWS Incident Postmortem 11/08/2023

+ +

+ On November 8th, 2023 at approximately 09:47 UTC, NWS suffered + a complete outage. This outage resulted in the downtime of all + services hosted on NWS and the downtime of the NWS Management + Engine and the NWS dashboard. +

+ +

+ The incident lasted 28 minutes, after which it was automatically + resolved and all services were restored. This is NWS' first + outage event of 2023. +

+ +

Cause

+

+ NWS utilizes several tactics to ensure uptime. A component of + this is load balancing and failover. This service is currently + provided by Cloudflare at the DNS level. Cloudflare sends + health check requests to NWS servers at specified intervals. If + it detects that one of the servers is down, it will remove the + A record from entry.nws.nickorlow.com for that server (this domain + is where all services on NWS direct their traffic via a + CNAME). +

+ +

+ At around 09:47 UTC, Cloudflare detected that our servers in + Texas (Austin and Hill Country) were down. It did not detect an + error response, but rather an HTTP timeout. A timeout usually indicates that a + server has lost network connectivity. When it detected that the + servers were down, it removed their A records from the + entry.nws.nickorlow.com domain. Since NWS' Pennsylvania servers + have been undergoing maintenance since August 2023, this left no + servers able to serve requests routed to entry.nws.nickorlow.com, + resulting in the outage. +
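The failover behavior described above can be illustrated with a minimal Python sketch. It is a simplified model, not Cloudflare's actual implementation: the server names, the probe callables, and the `healthy_pool` helper are all hypothetical. It only shows why dropping every server that fails its health check leaves the hostname with no A records when the one remaining region is already out of rotation for maintenance.

```python
from typing import Callable, Dict, List


def healthy_pool(
    servers: Dict[str, Callable[[], bool]],
    in_maintenance: List[str],
) -> List[str]:
    """Return the servers that would keep an A record.

    `servers` maps a hypothetical server name to a probe that returns True
    when the health check succeeds and False on an error or a timeout.
    Servers listed in `in_maintenance` are never eligible.
    """
    pool = []
    for name, probe in servers.items():
        if name in in_maintenance:
            continue  # already pulled from rotation
        if probe():
            pool.append(name)  # health check passed, keep the record
    return pool


# Hypothetical model of the incident: both Texas probes time out while
# the Pennsylvania region is still under maintenance.
servers = {
    "austin": lambda: False,
    "hill-country": lambda: False,
    "pennsylvania": lambda: True,
}
pool = healthy_pool(servers, in_maintenance=["pennsylvania"])
print(pool or "no A records left: complete outage")
```

In this model the outage is a property of the pool rather than of any single server: as soon as one independent region is back in rotation, the same two timeouts no longer empty the pool.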

+ +

+ NWS utilizes UptimeRobot for monitoring the uptime statistics of + services on NWS and NWS servers. This is the source of the + statistics shown on the NWS status page. +

+ +

+ UptimeRobot did not detect either of the Texas NWS servers as being + offline at any point during the outage. This is odd, as it means UptimeRobot + and Cloudflare disagreed on the status of the NWS servers. Logs + on the NWS servers showed that requests from UptimeRobot were being + served, while no requests from Cloudflare appeared in the logs. +

+ +

+ No firewall rules existed that could have blocked this traffic + for either of the NWS servers. There was no other configuration + found that would have blocked these requests. As these servers + are on different networks inside different buildings in different + parts of Texas, their networking equipment is entirely separate. + This rules out any hardware failure of networking equipment owned + by NWS. This leads us to believe that the issue may have been + caused by an internet traffic anomaly, although we are currently + unable to confirm this. +

+ +

+ This is being actively investigated to find a more concrete root + cause. This postmortem will be updated if any new information is + found. +

+ +

+ A similar event occurred on November 12th, 2023 lasting for 2 seconds. +

+ +

Fix

+

+ The common factors between these two servers are that both use + Spectrum as their ISP and both are located near Austin, Texas. + The Pennsylvania server maintenance will be expedited so that we have + servers online that share none of these commonalities. +

+ +

+ NWS will also investigate other methods of failover and load + balancing. +

+ +

Last updated on November 16th, 2023

+ + +
diff --git a/out/blogs/side-project-10-20-23.html b/out/blogs/side-project-10-20-23.html
new file mode 100644
index 0000000..d003846
--- /dev/null
+++ b/out/blogs/side-project-10-20-23.html
@@ -0,0 +1,121 @@
+ + Nicholas Orlowsky + + + + + + +

Side Project Log 10/20/2023

+

This side project log covers work done from 8/15/2023 - 10/20/2023

+ +

Anthracite

+[ GitHub Repo ] +

+ Anthracite is a web server written in C++. The site you're reading this on + right now is hosted on Anthracite. I wrote it to deepen my knowledge of C++ and networking protocols. My + main focus with Anthracite is performance. While developing Anthracite, + I have been exploring different optimization techniques and benchmarking + Anthracite against popular web servers such as NGINX and Apache. + Anthracite supports HTTP/1.1 and currently handles only GET requests for + files stored on the server. +

+ +

+ Anthracite currently performs on par with NGINX and Apache when making + 1000 requests for a 50MB file using 100 threads in a Docker container. + To achieve this performance, I used memory profilers to find + out what caused large or repeated memory copies to occur. I then updated + those sections of code to remove or minimize these copies. I also + made it so that Anthracite caches all files it can serve in memory. This + avoids unnecessary and costly disk reads. The current implementation is + subpar: the server must be restarted whenever the files + it serves change, or Anthracite will not detect the updates. +
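The caching trade-off described above can be shown with a small sketch. This is written in Python for brevity and is not Anthracite's actual C++ code; the `PreloadedFileCache` class and the file names are hypothetical. It assumes the simplest form of the idea: read every file once at startup, then serve only from memory.

```python
import tempfile
from pathlib import Path
from typing import Dict, Optional


class PreloadedFileCache:
    """Read every file under `root` into memory once, at startup."""

    def __init__(self, root: Path) -> None:
        self._files: Dict[str, bytes] = {}
        for path in root.rglob("*"):
            if path.is_file():
                # Key each entry by its path relative to the served directory.
                key = path.relative_to(root).as_posix()
                self._files[key] = path.read_bytes()

    def get(self, key: str) -> Optional[bytes]:
        # Served straight from memory; the disk is never consulted again.
        return self._files.get(key)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "index.html").write_bytes(b"v1")

    cache = PreloadedFileCache(root)
    (root / "index.html").write_bytes(b"v2")  # edit made after startup

    stale = cache.get("index.html")  # still the contents read at startup
    fresh = PreloadedFileCache(root).get("index.html")  # simulates a restart
```

Because `get` never touches the disk, the edit made after startup is invisible until a new cache is built, which corresponds to the restart requirement mentioned above.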

+ +

+ I intend to make further performance improvements, specifically in the request + parser. I also plan to implement HTTP/2. +

+ +

Yet Another Chip Eight Emulator (yacemu)

+[ GitHub Repo ] +

+ YACEMU is an interpreter for the CHIP-8 instruction set written in C. My main + goal when writing it was to gain more insight into how emulation works. I had + previous experience with this from when I worked on an emulator for a slimmed-down + version of x86 called Y86. + So far, I've been able to get most instructions working. I still need to add + input support so that users can interact with programs running in yacemu. It has + been fairly straightforward to write thus far. After I complete it, I would + like to work on an emulator for a real device such as the Game Boy (this might be + biting off more than I can chew). +
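For readers unfamiliar with how a CHIP-8 interpreter works, the core is a fetch, decode, execute loop. The sketch below is in Python rather than C and is not taken from yacemu; the `Chip8` class and its field names are hypothetical, and only four of the instructions are handled.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Chip8:
    memory: bytearray = field(default_factory=lambda: bytearray(4096))
    v: List[int] = field(default_factory=lambda: [0] * 16)  # registers V0..VF
    i: int = 0       # index register
    pc: int = 0x200  # programs are conventionally loaded at 0x200

    def load(self, program: bytes) -> None:
        self.memory[0x200:0x200 + len(program)] = program

    def step(self) -> None:
        # Fetch: each opcode is two bytes, stored big-endian.
        opcode = (self.memory[self.pc] << 8) | self.memory[self.pc + 1]
        self.pc += 2

        # Decode the fields shared by most instructions.
        x = (opcode >> 8) & 0xF
        nn = opcode & 0xFF
        nnn = opcode & 0xFFF

        # Execute (only four instructions are handled in this sketch).
        kind = opcode >> 12
        if kind == 0x1:    # 1NNN: jump to address NNN
            self.pc = nnn
        elif kind == 0x6:  # 6XNN: set VX to NN
            self.v[x] = nn
        elif kind == 0x7:  # 7XNN: add NN to VX, wrapping at 8 bits
            self.v[x] = (self.v[x] + nn) & 0xFF
        elif kind == 0xA:  # ANNN: set I to NNN
            self.i = nnn
        else:
            raise NotImplementedError(f"opcode {opcode:04X}")


cpu = Chip8()
# V0 = 0xFA; V0 += 0x0A; I = 0x123; jump back to 0x200
cpu.load(bytes([0x60, 0xFA, 0x70, 0x0A, 0xA1, 0x23, 0x12, 0x00]))
for _ in range(4):
    cpu.step()
```

A full interpreter adds the remaining opcodes, the timers, the display, and the keypad input mentioned above, but each of them hangs off this same loop.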

+ +

Nick VIM

+

+ Over the summer while I was interning, I began using VIM as my primary + text editor. I used a preconfigured version of it (NvChad) to save time, as + setting everything up can take a while. After using it for a few months, I began + making my own configuration for VIM, taking what I liked from NvChad and leaving + behind the parts that I didn't like as much. +

+ +Screenshot of an HTML file open for editing in NickVIM + +

+ One important part of Nick VIM was ensuring that it was portable between different + machines. I wanted it to require as few dependencies on the host machine as possible so that I + could get Nick VIM set up on any computer in a couple of minutes. This will be especially + useful when working on my school's lab machines and when switching to new computers + in the future. I achieved this by dockerizing Nick VIM. This is based on what one of + my co-workers does with their VIM setup. The Docker container contains + all the dependencies for each language server. Whenever you edit a file with Nick VIM, + the following script runs: +

+
+
+echo Starting container...
+# Build a unique container name from the current path, e.g. home_user_site_12345
+cur_dir="$(pwd)"
+container_name="${cur_dir//\//_}"
+container_name="${container_name:1}_$RANDOM"
+# Share the host's X11 socket and bind-mount the current directory at /work
+docker run --name "$container_name" --network host -e DISPLAY="$DISPLAY" -v /tmp/.X11-unix:/tmp/.X11-unix --mount type=bind,source="$(pwd)",target=/work -d nick-vim &> /dev/null
+
+echo Execing into container...
+docker exec -w /work -it "$container_name" bash
+
+echo Stopping container in background...
+docker stop "$container_name" &> /dev/null &
+
+
+

+ This code creates a new container, shares the host's X11 socket with the container (which gives it access to the host's clipboard), and + mounts the current directory inside the container for editing. +

+ +

Secane

+

[ Video Demo ]

+

+ Secane was a simple ChatGPT wrapper that I wrote to practice for the behavioral part of + job interviews. It takes your resume, information about the company, and information about + the role you're interviewing for. It also integrates with OpenAI's Whisper, allowing you + to simulate talking out your answers. I made it with Next.js. +

+ +
+

These projects had minimal/no work done on them: NWS, RingGold, SQUIRREL

+

These projects I will no longer be working on: Olney

+ + +
diff --git a/src/blogs/nws-postmortem-11-8-23.filler.html b/src/blogs/nws-postmortem-11-8-23.filler.html
new file mode 100644
index 0000000..dfccc2b
--- /dev/null
+++ b/src/blogs/nws-postmortem-11-8-23.filler.html
@@ -0,0 +1,89 @@
+

NWS Incident Postmortem 11/08/2023

+ +

+ On November 8th, 2023 at approximately 09:47 UTC, NWS suffered + a complete outage. This outage resulted in the downtime of all + services hosted on NWS and the downtime of the NWS Management + Engine and the NWS dashboard. +

+ +

+ The incident lasted 28 minutes, after which it was automatically + resolved and all services were restored. This is NWS' first + outage event of 2023. +

+ +

Cause

+

+ NWS utilizes several tactics to ensure uptime. A component of + this is load balancing and failover. This service is currently + provided by Cloudflare at the DNS level. Cloudflare sends + health check requests to NWS servers at specified intervals. If + it detects that one of the servers is down, it will remove the + A record from entry.nws.nickorlow.com for that server (this domain + is where all services on NWS direct their traffic via a + CNAME). +

+ +

+ At around 09:47 UTC, Cloudflare detected that our servers in + Texas (Austin and Hill Country) were down. It did not detect an + error response, but rather an HTTP timeout. A timeout usually indicates that a + server has lost network connectivity. When it detected that the + servers were down, it removed their A records from the + entry.nws.nickorlow.com domain. Since NWS' Pennsylvania servers + have been undergoing maintenance since August 2023, this left no + servers able to serve requests routed to entry.nws.nickorlow.com, + resulting in the outage. +

+ +

+ NWS utilizes UptimeRobot for monitoring the uptime statistics of + services on NWS and NWS servers. This is the source of the + statistics shown on the NWS status page. +

+ +

+ UptimeRobot did not detect either of the Texas NWS servers as being + offline at any point during the outage. This is odd, as it means UptimeRobot + and Cloudflare disagreed on the status of the NWS servers. Logs + on the NWS servers showed that requests from UptimeRobot were being + served, while no requests from Cloudflare appeared in the logs. +

+ +

+ No firewall rules existed that could have blocked this traffic + for either of the NWS servers. There was no other configuration + found that would have blocked these requests. As these servers + are on different networks inside different buildings in different + parts of Texas, their networking equipment is entirely separate. + This rules out any hardware failure of networking equipment owned + by NWS. This leads us to believe that the issue may have been + caused by an internet traffic anomaly, although we are currently + unable to confirm this. +

+ +

+ This is being actively investigated to find a more concrete root + cause. This postmortem will be updated if any new information is + found. +

+ +

+ A similar event occurred on November 12th, 2023 lasting for 2 seconds. +

+ +

Fix

+

+ The common factors between these two servers are that both use + Spectrum as their ISP and both are located near Austin, Texas. + The Pennsylvania server maintenance will be expedited so that we have + servers online that share none of these commonalities. +

+ +

+ NWS will also investigate other methods of failover and load + balancing. +

+ +

Last updated on November 16th, 2023