new blogs

This commit is contained in:
Nicholas Orlowsky 2023-08-18 01:57:57 -05:00
parent 74af6d1531
commit 053398bad3
No known key found for this signature in database
GPG key ID: 24C84C4DDAD95065
6 changed files with 133 additions and 9 deletions

View file

@ -2,6 +2,8 @@
<h1 style="margin-bottom: 0px;">Blog</h1>
<p style="margin-top: 0px;">A collection of my thoughts; some of them may be interesting.</p>
<p><a href="./blogs/side-project-8-15-23.html">[ Side Project Log 8/15/23 ]</a> - August 15th, 2023</p>
<p><a href="./blogs/side-project-8-8-23.html">[ Side Project Log 8/08/23 ]</a> - August 8th, 2023</p>
<p><a href="./blogs/side-project-7-12-23.html">[ Side Project Log 7/12/23 ]</a> - July 12th, 2023</p>
<p><a href="./blogs/side-project-4-29-23.html">[ Side Project Log 4/29/23 ]</a> - April 29th, 2023</p>
<p><a href="./blogs/side-project-3-27-23.html">[ Side Project Log 3/27/23 ]</a> - March 27th, 2023</p>

View file

@ -0,0 +1,56 @@
<h1>Side Project Log 8/15/2023</h1>
<p>This side project log covers work done from 8/8/2023 - 8/15/2023</p>
<h2 id="olney">Olney</h2>
<p>
I added a frontend to Olney and added a feature where it can automatically keep track of your job applications
by monitoring your email.
</p>
<h3>Frontend</h3>
<p>
The frontend was made with Svelte. I chose not to use any UI/CSS libraries as I wanted to keep the number of
dependencies low. This was another good opportunity to learn about Svelte.
</p>
<h3>Automatic Tracking via E-Mail</h3>
<p>
This is the killer feature that I initially set out to build Olney for. This works by having the user forward their
E-Mail to an instance of Olney. To receive E-Mail, Olney uses <a href="https://inbucket.org">Inbucket</a>, a mail server that is
easily hostable within Docker, and listens on a websocket for incoming mail. Whenever a new mail message is received,
Olney uses the OpenAI API to get a summary of the email in the following format:
</p>
<code>
{
isRecruiting: bool, // is the message about recruiting?
recruitingInfo: null | {
location: string, // Location in City, Providence/State, Country format
company: string, // Casual name of company e.g: Google, Cisco, Apple
position: string, // Name of job position
type: "assessment" | "interview" | "offer" | "rejection" | "applied", // What the message is discussing
dateTime: string, // DateTime communication rec'd OR DateTime that is being discussed (i.e. interview date confirmation)
name: string // Name of event, giving more detail to type
} // null if message is not about recruiting, fill with values if it is
}
</code>
<p>
Olney then takes some details from this data, namely company, position, and location, and uses the OpenAI API to generate
an <a href="https://www.pinecone.io/learn/vector-embeddings/">embedding</a>. We then query for the closest match among the job applications
in the database (with <a href="https://github.com/pgvector/pgvector">pgvector</a>). Once we have the matching job application, we add
the event to the database, using the job application's id as a foreign key.
</p>
<p>
Embeddings were chosen as the lookup method so that data parsed out of the email doesn't need to be an exact
match for what the user entered. This also allows the lookup to work even when certain details, such as location, are missing from the
email.
</p>
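<p>To make the lookup concrete, here is a minimal in-memory sketch of the idea (in production this is a pgvector query; the function names, ids, and three-dimensional vectors below are illustrative, not Olney's actual code):</p>

```rust
// Hypothetical in-memory version of the embedding lookup. pgvector does
// this inside Postgres; here we brute-force it over a small list.

fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}

/// Return the id of the stored application whose embedding is closest
/// to the query embedding (highest cosine similarity).
fn closest_application(query: &[f32], apps: &[(u32, Vec<f32>)]) -> Option<u32> {
    apps.iter()
        .max_by(|(_, a), (_, b)| {
            cosine_similarity(query, a)
                .partial_cmp(&cosine_similarity(query, b))
                .unwrap()
        })
        .map(|(id, _)| *id)
}

fn main() {
    let apps = vec![
        (1, vec![0.9, 0.1, 0.0]),  // e.g. embedding of "Google, SWE Intern, Austin"
        (2, vec![0.0, 0.8, 0.6]),  // e.g. embedding of "Cisco, Network Intern, Dallas"
    ];
    let query = vec![0.85, 0.2, 0.05]; // embedding of details parsed from the email
    println!("{:?}", closest_application(&query, &apps)); // Some(1)
}
```

<p>Even if the email says "Google" where the user typed "Google LLC", the two embeddings land close together, so the nearest-neighbor match still resolves to the right application.</p>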
<p>
Olney should be open-sourced/released within the next week or two.
</p>
<hr>
<p><strong>These projects had minimal/no work done on them:</strong> NWS, RingGold, SQUIRREL</p>

View file

@ -0,0 +1,65 @@
<h1>Side Project Log 8/8/2023</h1>
<p>This side project log covers work done from 7/12/2023 - 8/8/2023</p>
<h2 id="squirrel">SQUIRREL</h2>
<p>
SQUIRREL has been updated to work with INSERT INTO and SELECT queries. I also refactored much of the codebase to handle errors
more elegantly and to make the parser more extensible. Here's a screenshot of table creation, data insertion, and data selection:
</p>
<p>
The biggest challenge of this part was working on the parser, which has now been written three times. The approaches to parsing were:
</p>
<ol>
<li>
<b>Stepping through whitespace:</b> <p>This was my initial, naive approach to the problem. I split the input string on whitespace
and then looked up values by their indices in the resulting list.</p>
</li>
<li>
<b>Regex:</b> <p>This approach was cleaner than the first and led to a small parser; however, it required an external dependency (and I'm
trying to minimize those), and it would make it hard to add features to commands down the line.</p>
</li>
<li>
<b>Finite state machine:</b> <p>This solution was more verbose than the others; however, it allows for easier development. This method works
by splitting the query string into tokens. Tokens are the smallest pieces of data that a parser recognizes. SQUIRREL gets them by splitting
the input on delimiters and using the resulting list as tokens (excluding whitespace). SQUIRREL recognizes the following characters as delimiters:
</p>
<code>
' ', ',', ';', '(', ')'
</code>
<p>
This means that the string "INSERT INTO test (id) VALUES (12);" would be tokenized into the list: "INSERT", "INTO", "test", "(", "id", etc.
</p>
<p>
Once we have our list of tokens, we iterate through them starting at a default state and perform a certain task for the given state, which
usually includes switching to another state. We do this until we reach the end state.
</p>
<p>
For example, with the above insert statement, we would start in the IntoKeyword state, which ensures that "INTO" is the current token.
We would then transition to the TableName state, which reads the table name and stores it in the ParsedCommand struct we're returning. We
would then move to the ColumnListBegin state, which looks for an opening parenthesis and switches the state to ColumnName. This process
continues through the other parts of the query until the Semicolon state is reached, which checks that the statement ends with a semicolon and then
returns the ParsedCommand struct.
</p>
</li>
</ol>
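<p>The tokenizer and state machine described above can be sketched roughly as follows (a simplified sketch: the state names and the <code>parse_insert</code> helper are illustrative and only cover the front of an INSERT statement, not SQUIRREL's actual code):</p>

```rust
// Minimal sketch of delimiter-based tokenizing plus a state-machine parser.

const DELIMITERS: [char; 5] = [' ', ',', ';', '(', ')'];

/// Split the input on delimiters; non-whitespace delimiters become tokens too.
fn tokenize(input: &str) -> Vec<String> {
    let mut tokens = Vec::new();
    let mut current = String::new();
    for c in input.chars() {
        if DELIMITERS.contains(&c) {
            if !current.is_empty() {
                tokens.push(current.clone());
                current.clear();
            }
            if c != ' ' {
                tokens.push(c.to_string()); // keep ',', ';', '(', ')' as tokens
            }
        } else {
            current.push(c);
        }
    }
    if !current.is_empty() {
        tokens.push(current);
    }
    tokens
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum State { InsertKeyword, IntoKeyword, TableName, ColumnListBegin, ColumnName, Done }

/// Walk the tokens, switching states, until the column list is consumed.
/// Returns the table name and column names, or None on malformed input.
fn parse_insert(tokens: &[String]) -> Option<(String, Vec<String>)> {
    let mut state = State::InsertKeyword;
    let mut table = String::new();
    let mut columns = Vec::new();
    for tok in tokens {
        state = match (state, tok.as_str()) {
            (State::InsertKeyword, "INSERT") => State::IntoKeyword,
            (State::IntoKeyword, "INTO") => State::TableName,
            (State::TableName, name) => { table = name.to_string(); State::ColumnListBegin }
            (State::ColumnListBegin, "(") => State::ColumnName,
            (State::ColumnName, ")") => State::Done,
            (State::ColumnName, ",") => State::ColumnName,
            (State::ColumnName, col) => { columns.push(col.to_string()); State::ColumnName }
            (State::Done, _) => break,
            _ => return None, // unexpected token for the current state
        };
    }
    if state == State::Done { Some((table, columns)) } else { None }
}

fn main() {
    let tokens = tokenize("INSERT INTO test (id, name) VALUES (12, 'a');");
    println!("{:?}", parse_insert(&tokens)); // Some(("test", ["id", "name"]))
}
```

<p>The verbosity is the price of extensibility: supporting a new clause means adding states and match arms rather than rewriting a regex.</p>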
<p>
Next steps for this are to add column selection and WHERE clauses to SELECT statements.
</p>
<h2 id="olney">Olney</h2>
<p>
I added a feature to the Olney API that scans the <a href="https://github.com/SimplifyJobs/Summer2024-Internships">pittcsc (now Simplify) summer internships GitHub repo</a>
and parses the data into JSON format. I parsed their markdown file using regex, which was relatively simple. There were some issues during development due to the
changing structure of the markdown file; these are being fixed on a rolling basis. I expect the changes to slow down now that the transition from pittcsc to Simplify
is complete. You can access the JSON at <a href="https://olney.nickorlow.com/jobs">olney.nickorlow.com/jobs</a>.
</p>
<hr>
<p><strong>These projects had minimal/no work done on them:</strong> NWS, RingGold</p>

View file

View file

@ -36,21 +36,21 @@
</tr>
<tr>
<td>Bench Press</td>
<td>275 Lbs</td>
<td>2023</td>
<td>Gregory Gym</td>
<td>280 Lbs</td>
<td>August 10th, 2023</td>
<td>FUTO HQ</td>
</tr>
<tr>
<td>Squat</td>
<td>405 Lbs</td>
<td>2023</td>
<td>Gregory Gym</td>
<td>415 Lbs</td>
<td>August 10th, 2023</td>
<td>FUTO HQ</td>
</tr>
<tr>
<td>Deadlift</td>
<td>405 Lbs</td>
<td>2023</td>
<td>Gregory Gym</td>
<td>415 Lbs</td>
<td>August 10th, 2023</td>
<td>FUTO HQ</td>
</tr>
</table>
</div>

View file

@ -39,6 +39,7 @@
<div>
<h2>SQUIRREL</h2>
<a href="https://github.com/nickorlow/squirrel">[ GitHub Repo ]</a>
<p>
SQUIRREL stands for SQL Query Util-Izing Rust's Reliable and Efficient Logic. It is a SQL database
that I am writing in Rust. Currently, it can parse CREATE TABLE commands, and works with the