new blog posts

This commit is contained in:
Nicholas Orlowsky 2023-11-16 12:59:25 -05:00
parent 178550a1a8
commit beb0b9ab04
No known key found for this signature in database
GPG key ID: BE7DF0188A405E2B
7 changed files with 373 additions and 2 deletions

Binary file not shown.


@@ -0,0 +1,81 @@
<head>
<title>Nicholas Orlowsky</title>
<link rel="stylesheet" href="/style.css">
<link rel="icon" type="image/x-icon" href="/favicon.ico">
</head>
<body>
<nav>
<a href="/">[ Home ]</a>
<a href="/blog.html">[ Blog ]</a>
<a href="/projects.html">[ Projects ]</a>
<a href="/extra.html">[ Extra ]</a>
<hr/>
</nav>
<h1>Side Project Log 8/15/2023</h1>
<p>This side project log covers work done from 8/8/2023 - 8/15/2023</p>
<h2 id="olney">Olney</h2>
<p>
I added a frontend to Olney and added a feature where it can automatically keep track of your job applications
by monitoring your email.
</p>
<h3>Frontend</h3>
<p>
The frontend was made with Svelte. I chose not to use any UI/CSS libraries as I wanted to keep the number of
dependencies low. This was another good opportunity to learn about Svelte.
</p>
<h3>Automatic Tracking via E-Mail</h3>
<p>
This is the killer feature that I initially set out to build Olney for. It works by having the user forward their
email to an instance of Olney. To receive email, Olney uses <a href="https://inbucket.org">Inbucket</a>, a mail server
easily hostable within Docker. Olney listens on a websocket for incoming mail, and whenever a new message arrives,
it uses the OpenAI API to get a summary of the email in the following format:
</p>
<pre><code class="language-json">
{
isRecruiting: bool, // is the message about recruiting?
recruitingInfo: null | {
    location: string, // Location in City, Province/State, Country format
    company: string, // Casual name of the company, e.g.: Google, Cisco, Apple
    position: string, // Name of the job position
    type: "assessment" | "interview" | "offer" | "rejection" | "applied", // What the message is discussing
    dateTime: string, // DateTime communication was received OR the DateTime being discussed (e.g. an interview date confirmation)
    name: string // Name of the event, giving more detail to type
} // null if the message is not about recruiting, filled with values if it is
}
</code></pre>
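<p>
Downstream handling of a summary in this shape can be sketched as follows. This is an illustrative Python fragment, not Olney's actual Rust code; the sample response values are made up, and real model output would be validated before use:
</p>

```python
import json

# Hypothetical model response following the schema above (written as strict
# JSON, so the keys are quoted; the shape matches the format Olney requests)
raw = """{"isRecruiting": true,
          "recruitingInfo": {"location": "Austin, Texas, USA",
                             "company": "Google",
                             "position": "SWE Intern",
                             "type": "interview",
                             "dateTime": "2023-08-14T15:00:00Z",
                             "name": "Phone screen"}}"""

summary = json.loads(raw)
event = None
if summary["isRecruiting"]:
    info = summary["recruitingInfo"]
    # Only recruiting-related messages produce an event worth recording
    event = (info["company"], info["position"], info["type"], info["name"])
```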
<p>
Olney then takes some details from this data, namely the company, position, and location, and uses the OpenAI API to generate
an <a href="https://www.pinecone.io/learn/vector-embeddings/">embedding</a>. We then query for the closest match among the job applications
in the database (with <a href="https://github.com/pgvector/pgvector">pgvector</a>). Once we have the matching job application, we add
the event to the database, using the job application's id as a foreign key.
</p>
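<p>
The matching step can be illustrated with toy vectors. In this Python sketch, the hard-coded vectors stand in for OpenAI embeddings and the similarity scan stands in for pgvector's indexed nearest-neighbor query; all names and numbers are made up:
</p>

```python
import math

def cosine_similarity(a, b):
    # pgvector's cosine distance is 1 minus this value
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy stand-ins for the stored job-application embeddings
applications = {
    "google-swe-intern": [0.9, 0.1, 0.0],
    "cisco-network-intern": [0.1, 0.8, 0.2],
}

def closest_application(email_embedding):
    # The closest stored application wins, even if the parsed details
    # (company spelling, missing location, etc.) aren't an exact match
    return max(applications,
               key=lambda app: cosine_similarity(applications[app], email_embedding))
```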
<p>
Embeddings were chosen as the lookup method so that the data parsed out of the email doesn't have to be an exact
match for what the user entered. This also allows the lookup to work even when certain details, such as the location, are missing from the
email.
</p>
<p>
Olney should be open-sourced/released within the next week or two.
</p>
<hr>
<p><strong>These projects had minimal/no work done on them:</strong> NWS, RingGold, SQUIRREL</p>
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.8.0/styles/dark.min.css">
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.8.0/highlight.min.js"></script>
<script>hljs.highlightAll();</script>
<footer>
<hr />
<p style="margin-bottom: 0px;">Copyright &#169; Nicholas Orlowsky 2023</p>
<p style="margin-top: 0px;">Hosting provided by <a href="https://nws.nickorlow.com">NWS</a></p>
</footer>
</body>


@@ -0,0 +1,86 @@
<head>
<title>Nicholas Orlowsky</title>
<link rel="stylesheet" href="/style.css">
<link rel="icon" type="image/x-icon" href="/favicon.ico">
</head>
<body>
<nav>
<a href="/">[ Home ]</a>
<a href="/blog.html">[ Blog ]</a>
<a href="/projects.html">[ Projects ]</a>
<a href="/extra.html">[ Extra ]</a>
<hr/>
</nav>
<h1>Side Project Log 8/8/2023</h1>
<p>This side project log covers work done from 7/12/2023 - 8/8/2023</p>
<h2 id="squirrel">SQUIRREL</h2>
<p>
SQUIRREL has been updated to work with INSERT INTO and SELECT queries. I also refactored much of the codebase to do error handling more elegantly and to make the parser
more extensible. Here's a screenshot of table creation, data insertion, and data selection:
</p>
<p>
The biggest challenge of this part was the parser, which has now been written three times. The three approaches were:
</p>
<ol>
<li>
<b>Stepping through whitespace:</b> <p>This was my initial and naive approach to the problem. I split the input string by its whitespace
and then queried values by referencing their indexes in the split string. </p>
</li>
<li>
<b>Regex:</b> <p>This approach was cleaner than the first and led to a small parser; however, it required an external dependency (which I'm
trying to minimize) and would make it hard to add additional features to commands later down the line.</p>
</li>
<li>
<b>Finite state machine:</b> <p>This solution was more verbose than the others, however it allows for easier development. This method works
by splitting the query string into tokens. Tokens are the smallest pieces of data that a parser recognizes. SQUIRREL gets them by splitting
the input by delimiters and using the split list as tokens (excluding whitespace). SQUIRREL recognizes the following characters as delimiters:
</p>
<code>
' ', ',', ';', '(', ')'
</code>
<p>
This means that the string "INSERT INTO test (id) VALUES (12);" would be parsed into the list: "INSERT", "INTO", "test", "(", "id", etc.
</p>
<p>
Once we have our list of tokens, we iterate through them starting at a default state and perform a certain task for the given state, which
usually includes switching to another state. We do this until we reach the end state.
</p>
<p>
For example, with the above insert statement, we would start in the IntoKeyword state, which would ensure that "INTO" is the current token.
We would then transition to the TableName state, which would read the table name and store it in the ParsedCommand struct we're returning. We
would then move to the ColumnListBegin state, which would look for an opening parenthesis and switch the state to ColumnName. This process
continues with the other parts of the query until the Semicolon state is reached, which checks that the statement ends with a semicolon and then
returns the ParsedCommand struct.
</p>
</li>
</ol>
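<p>
The tokenizer and state machine described above can be sketched in Python (SQUIRREL itself is written in Rust; the state names are abbreviated from the description above, and error handling is reduced to assertions):
</p>

```python
DELIMITERS = [" ", ",", ";", "(", ")"]

def tokenize(query):
    # Split on delimiters, keeping every delimiter except whitespace as a token
    tokens, current = [], ""
    for ch in query:
        if ch in DELIMITERS:
            if current:
                tokens.append(current)
                current = ""
            if ch != " ":
                tokens.append(ch)
        else:
            current += ch
    if current:
        tokens.append(current)
    return tokens

def parse_insert(tokens):
    # Walk the token list with an explicit state machine, collecting fields
    # into a dict that stands in for the ParsedCommand struct
    state = "InsertKeyword"
    parsed = {"table": None, "columns": [], "values": []}
    for token in tokens:
        if state == "InsertKeyword":
            assert token == "INSERT"; state = "IntoKeyword"
        elif state == "IntoKeyword":
            assert token == "INTO"; state = "TableName"
        elif state == "TableName":
            parsed["table"] = token; state = "ColumnListBegin"
        elif state == "ColumnListBegin":
            assert token == "("; state = "ColumnName"
        elif state == "ColumnName":
            if token == ")":
                state = "ValuesKeyword"
            elif token != ",":
                parsed["columns"].append(token)
        elif state == "ValuesKeyword":
            assert token == "VALUES"; state = "ValueListBegin"
        elif state == "ValueListBegin":
            assert token == "("; state = "Value"
        elif state == "Value":
            if token == ")":
                state = "Semicolon"
            elif token != ",":
                parsed["values"].append(token)
        elif state == "Semicolon":
            assert token == ";"; state = "End"
    assert state == "End"
    return parsed
```

Feeding the example statement through both steps yields the table name, column list, and value list.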
<p>
Next steps are to add column selection and WHERE clauses to SELECT statements.
</p>
<h2 id="olney">Olney</h2>
<p>
I added a feature to the Olney API which scans the <a href="https://github.com/SimplifyJobs/Summer2024-Internships">pittcsc (now Simplify) summer internships GitHub repo</a>
and parses the data into JSON format. I parsed the markdown file they have using regex, which was relatively simple. There were some issues during development due to the
changing structure of the markdown file. These issues are being fixed on a rolling basis. I expect the changes to slow down now that the transition from pittcsc to Simplify
is complete. You can access the JSON at <a href="https://olney.nickorlow.com/jobs">olney.nickorlow.com/jobs</a>.
</p>
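<p>
The parsing approach can be sketched as follows. This Python fragment is illustrative only: the actual column layout of the repo's markdown has changed over time (which is exactly what caused the breakage mentioned above), and the sample rows here are made up:
</p>

```python
import re

# Hypothetical markdown table rows in the style of the internship repo's README
markdown = """| Company | Role | Location |
| ------- | ---- | -------- |
| Google | SWE Intern | Austin, TX |
| Cisco | Network Intern | Dallas, TX |"""

# One capture group per column, tolerating extra whitespace around cells
ROW_PATTERN = re.compile(r"^\|\s*(.+?)\s*\|\s*(.+?)\s*\|\s*(.+?)\s*\|$")

def parse_jobs(text):
    jobs = []
    for line in text.splitlines()[2:]:  # skip the header and separator rows
        match = ROW_PATTERN.match(line)
        if match:
            company, role, location = match.groups()
            jobs.append({"company": company, "role": role, "location": location})
    return jobs
```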
<hr>
<p><strong>These projects had minimal/no work done on them:</strong> NWS, RingGold</p>
<footer>
<hr />
<p style="margin-bottom: 0px;">Copyright &#169; Nicholas Orlowsky 2023</p>
<p style="margin-top: 0px;">Hosting provided by <a href="https://nws.nickorlow.com">NWS</a></p>
</footer>
</body>


@@ -1,7 +1,8 @@
<div>
<h1 style="margin-bottom: 0px;">Blog</h1>
<p style="margin-top: 0px;">A collection of my thoughts, some of them may be interesting</p>
<p><a href="./blogs/nws-postmortem-11-8-23.html">[ NWS Postmortem 11/08/23 ]</a> - November 8th, 2023</p>
<p><a href="./blogs/side-project-10-20-23.html">[ Side Project Log 10/20/23 ]</a> - October 20th, 2023</p>
<p><a href="./blogs/side-project-8-15-23.html">[ Side Project Log 8/15/23 ]</a> - August 15th, 2023</p>
<p><a href="./blogs/side-project-8-8-23.html">[ Side Project Log 8/08/23 ]</a> - August 8th, 2023</p>
<p><a href="./blogs/side-project-7-12-23.html">[ Side Project Log 7/12/23 ]</a> - July 12th, 2023</p>


@@ -0,0 +1,89 @@
<h1>NWS Incident Postmortem 11/08/2023</h1>
<p>
On November 8th, 2023 at approximately 09:47 UTC, NWS suffered
a complete outage. This outage resulted in the downtime of all
services hosted on NWS and the downtime of the NWS Management
Engine and the NWS dashboard.
</p>
<p>
The incident lasted 28 minutes, after which it was automatically
resolved and all services were restored. This is NWS' first
outage event of 2023.
</p>
<h2>Cause</h2>
<p>
NWS utilizes several tactics to ensure uptime. A component of
this is load balancing and failover. This service is currently
provided by Cloudflare at the DNS level. Cloudflare sends
health check requests to NWS servers at specified intervals. If
it detects that one of the servers is down, it will remove the
A record from entry.nws.nickorlow.com for that server (this domain
is where all services on NWS direct their traffic via a
CNAME).
</p>
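<p>
This failover scheme can be modeled with a short sketch (a simplified illustration, not Cloudflare's actual logic; server names and addresses are made up):
</p>

```python
# Each server is health-checked; only healthy servers keep their
# A records on the entry domain that all NWS services CNAME to.
servers = {
    "austin": {"ip": "203.0.113.10", "healthy": True},
    "hill-country": {"ip": "203.0.113.20", "healthy": True},
}

def active_a_records(fleet):
    # A health-check timeout marks a server unhealthy and drops its record
    return [info["ip"] for info in fleet.values() if info["healthy"]]

records_before = active_a_records(servers)  # both IPs are served

# If every remaining server fails its health check at once, the entry
# domain is left with no A records at all, i.e. a total outage
servers["austin"]["healthy"] = False
servers["hill-country"]["healthy"] = False
total_outage = len(active_a_records(servers)) == 0
```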
<p>
At around 09:47 UTC, Cloudflare detected that our servers in
Texas (Austin and Hill Country) were down. It did not detect an
error, but rather an HTTP timeout, which is an indication that the
servers had lost network connectivity. When Cloudflare detected that the
servers were down, it removed their A records from the
entry.nws.nickorlow.com domain. Since NWS' Pennsylvania servers
have been undergoing maintenance since August 2023, this left no
servers able to serve requests routed to entry.nws.nickorlow.com,
resulting in the outage.
</p>
<p>
NWS utilizes UptimeRobot for monitoring the uptime statistics of
services on NWS and NWS servers. This is the source of the
statistics shown on the NWS status page.
</p>
<p>
UptimeRobot did not detect either of the Texas NWS servers as being
offline for the duration of the outage. This is odd, as UptimeRobot
and Cloudflare did not agree on the status of NWS servers. Logs
on NWS servers showed that requests from UptimeRobot were being
served while no requests from Cloudflare were shown in the logs.
</p>
<p>
No firewall rules existed that could have blocked this traffic
for either of the NWS servers. There was no other configuration
found that would have blocked these requests. As these servers
are on different networks inside different buildings in different
parts of Texas, their networking equipment is entirely separate.
This rules out any hardware failure of networking equipment owned
by NWS. This leads us to believe that the issue may have been
caused by an internet traffic anomaly, although we are currently
unable to confirm that this is the cause of the issue.
</p>
<p>
This is being actively investigated to find a more concrete root
cause. This postmortem will be updated if any new information is
found.
</p>
<p>
A similar event occurred on November 12th, 2023 lasting for 2 seconds.
</p>
<h2>Fix</h2>
<p>
The common factors between these servers are that both use
Spectrum as their ISP and both are located near Austin, Texas.
The Pennsylvania server maintenance will be expedited so that we have
servers online that operate with no commonalities.
</p>
<p>
NWS will also investigate other methods of failover and load
balancing.
</p>
<p>Last updated on November 16th, 2023</p>


@@ -0,0 +1,99 @@
<h1>Side Project Log 10/20/2023</h1>
<p>This side project log covers work done from 8/15/2023 - 10/20/2023</p>
<h2 id="anthracite">Anthracite</h2>
<a href="https://github.com/nickorlow/anthracite">[ GitHub Repo ]</a>
<p>
Anthracite is a web server written in C++. The site you're reading this on
right now is hosted on Anthracite. I wrote it to deepen my knowledge of C++ and networking protocols. My
main focus with Anthracite is performance. While developing Anthracite,
I have been exploring different optimization techniques and benchmarking
it against popular web servers such as NGINX and Apache.
Anthracite supports HTTP/1.1, and it only supports GET requests for
files stored on the server.
</p>
<p>
Anthracite currently performs on par with NGINX and Apache when making
1000 requests for a 50MB file using 100 threads in a Docker container.
To achieve this performance, I used memory profilers to find
out what caused large or repeated memory copies to occur, then updated
those sections of code to remove or minimize the copies. I also
made Anthracite cache every file it can serve in memory, which
avoids unnecessary and costly disk reads. The current implementation is
subpar, as the server must be restarted whenever the files it serves
change in order for Anthracite to detect the updates.
</p>
<p>
I intend to make further performance improvements, specifically in the request
parser. I also plan to implement HTTP/2.0.
</p>
<h2 id="yacemu">Yet Another Chip Eight Emulator (yacemu)</h2>
<a href="https://github.com/nickorlow/yacemu">[ GitHub Repo ]</a>
<p>
YACEMU is an interpreter for the CHIP-8 instruction set written in C. My main
goal when writing it was to gain more insight into how emulation works. I had
previous experience with this from when I worked on an emulator for a slimmed-down
version of X86 called <a href="https://web.cse.ohio-state.edu/~reeves.92/CSE2421sp13/PracticeProblemsY86.pdf">Y86</a>.
So far, I've been able to get most instructions working. I still need to add
input support so that users can interact with programs running in yacemu. It has
been fairly uncomplicated and easy to write thus far. After I complete it, I would
like to work on an emulator for a real device such as the Game Boy (this might be
biting off more than I can chew).
</p>
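<p>
For a flavor of what interpreting CHIP-8 involves (a toy Python fragment, not yacemu's actual C source): each opcode is two bytes, and the high nibble selects the operation.
</p>

```python
def execute(opcode, registers):
    # Decode the common fields: X is a register index, NN an 8-bit immediate
    op = (opcode & 0xF000) >> 12
    x = (opcode & 0x0F00) >> 8
    nn = opcode & 0x00FF
    if op == 0x6:    # 6XNN: set VX to NN
        registers[x] = nn
    elif op == 0x7:  # 7XNN: add NN to VX, wrapping at 8 bits (no carry flag)
        registers[x] = (registers[x] + nn) & 0xFF
    return registers

v = [0] * 16         # registers V0 through VF
execute(0x6A02, v)   # sets VA to 0x02
execute(0x7AFF, v)   # VA becomes (0x02 + 0xFF) & 0xFF = 0x01
```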
<h2 id="nick-vim">Nick VIM</h2>
<p>
Over the summer while I was interning, I began using VIM as my primary
text editor. I used a preconfigured version of it (<a href="https://nvchad.com/">NvChad</a>) to save time, as
setting everything up can take a while. After using it for a few months, I began
making my own configuration for VIM, taking what I liked from NvChad and leaving
behind the parts that I didn't like as much.
</p>
<img src="/blog-images/NickVIM_Screenshot.png" alt="Screenshot of an HTML file open for editing in NickVIM"/>
<p>
One important part of NickVIM was ensuring that it was portable between different
machines. I wanted it to have as few dependencies as possible so that I
could get NickVIM set up on any computer in a couple of minutes. This will be especially
useful when working on my school's lab machines and when switching to new computers
in the future. I achieved this by dockerizing NickVIM, based on what one of
my co-workers does with their VIM setup. The Docker container contains
all the dependencies for each language server. Whenever you edit a file with NickVIM,
the following script runs:
</p>
<code lang="bash">
echo "Starting container..."
# Name the container after the working directory (slashes become
# underscores), with a random suffix so concurrent sessions don't collide
cur_dir=$(pwd)
container_name=${cur_dir//\//_}
container_name="${container_name:1}_$RANDOM"
# Share the X11 socket (clipboard) and mount the current directory at /work
docker run --name "$container_name" --network host -e DISPLAY="$DISPLAY" \
    -v /tmp/.X11-unix:/tmp/.X11-unix \
    --mount type=bind,source="$(pwd)",target=/work -d nick-vim &> /dev/null
echo "Execing into container..."
docker exec -w /work -it "$container_name" bash
echo "Stopping container in background..."
docker stop "$container_name" &> /dev/null &
</code>
<p>
This code creates a new container, forwards the host's clipboard to the container, and
mounts the current directory inside the container for editing.
</p>
<h2 id="secane">Secane</h2>
<p><a href="https://www.youtube.com/watch?v=tKRehO7FH_s">[ Video Demo ]</a></p>
<p>
Secane was a simple ChatGPT wrapper that I wrote to practice for the behavioral part of
job interviews. It takes your resume, information about the company, and information about
the role you're interviewing for. It also integrates with OpenAI's Whisper, allowing you
to simulate talking out your answers. I made it with Next.js.
</p>
<hr/>
<p><strong>These projects had minimal/no work done on them:</strong> NWS, RingGold, SQUIRREL</p>
<p><strong>These projects I will no longer be working on:</strong> Olney</p>


@@ -18,6 +18,21 @@
</p>
</div>
<div>
<h2>Anthracite Web Server</h2>
<p><i>C++ &amp; Python</i></p>
<a href="https://github.com/nickorlow/anthracite">[ GitHub Repo ]</a>
<p>
Anthracite is a simple web server written in C++. It currently supports HTTP/1.0 and HTTP/1.1.
The benchmarking tools for Anthracite are written in Python. Anthracite is optimized for performance
and rivals the performance of NGINX &amp; Apache in our testing. It uses a thread-per-connection
architecture, allowing it to process many requests in parallel. Additionally, it caches all
files that it serves in memory to ensure that added latency from disk reads does not slow down requests.
Through writing Anthracite, I have learned to use different C++ profilers as well as some general
optimization techniques for C++.
</p>
</div>
<div>
<h2>CavCash</h2>
<p><i>C#, Kubernetes, SQL Server, and MongoDB</i></p>
@@ -54,7 +69,7 @@
<div>
<h2>Olney</h2>
<i>Rust, Postgres, Svelte, TypeScript, and OpenAI's API</i>
<p><i>Rust, Postgres, Svelte, TypeScript, and OpenAI's API</i></p>
<a href="https://github.com/nickorlow/olney">[ GitHub Repo ]</a>
<p>
Olney is a job application tracker that aims to be better than using a <a href="https://trello.com">Trello</a> board or a spreadsheet.