CS371p Spring 2022: Jonathan Randall

What did you do this past week?

I turned in Collatz. I worked a bit on a project for my Big Data class, involving Spark SQL. Programming with SQL-like logic activates a different part of the brain than most of the programming I do. We’re implementing a Google-inspired PageRank algorithm in PySpark SQL, but doesn’t this problem call for one of Spark’s graph-processing APIs instead? Also, in M340L (“Matrices”), our professor taught us PageRank in conjunction with Markov chains and repeated matrix multiplication (taking powers of the transition matrix). Are these perspectives on PageRank all mathematically equivalent? I hope to dive deeper into this.
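To make the Markov-chain view concrete, here is a minimal sketch of PageRank by power iteration in plain Python. It is not the class assignment and not PySpark; the toy graph, the `pagerank` function name, the damping factor of 0.85, and the choice to spread a dangling node's rank evenly are all assumptions made for illustration. Each pass is one step of the random-surfer Markov chain, which amounts to one multiplication by the transition matrix.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power iteration on a directed graph given as {node: [out-neighbors]}."""
    nodes = sorted(links)
    n = len(nodes)
    # Start from the uniform distribution over pages.
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every page receives the "random jump" share first.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            # Assumption: a page with no out-links spreads its rank evenly.
            targets = links[node] or nodes
            share = damping * rank[node] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: "d" links out but nobody links to it.
toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_graph)
```

The ranks always sum to 1, so they can be read as the long-run probability of the surfer being on each page. In this toy graph, "c" collects the most rank and "d" only ever receives the random-jump share.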

For my Data Mining course, I wrote code in pandas to create decision trees from labeled data. I tried my best to leverage NumPy and write cute list comprehensions to optimize the execution time, but I realized that I had overlooked a very basic optimization in one of my helper methods. In that class, we learned that decision trees use axis-parallel hyperplanes to divide observations into labeled regions. I want to try out my decision tree code on the same data, but after applying PCA to rotate and scale the data. I hypothesize that this should make it easier for the decision tree to separate the classes when the natural boundary is not aligned with the original axes.
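A rough way to test that hypothesis on a toy case: the sketch below scores the best single axis-parallel split by weighted Gini impurity, before and after a PCA rotation. Everything here is an assumption for illustration, not the code from the course: the synthetic data (a cloud stretched along the diagonal, labeled by which side of the diagonal a point is on), the helper names, and the closed-form 2-D PCA that finds the principal-axis angle directly from the covariance entries. Scaling is left out of the sketch.

```python
from math import atan2, cos, sin


def gini(labels):
    """Gini impurity of a list of boolean labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 1.0 - p * p - (1.0 - p) * (1.0 - p)


def best_split_impurity(points, labels):
    """Lowest weighted Gini impurity reachable with one axis-parallel split."""
    best = gini(labels)
    n = len(points)
    for axis in range(2):
        values = sorted({p[axis] for p in points})
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2.0
            left = [lab for p, lab in zip(points, labels) if p[axis] <= threshold]
            right = [lab for p, lab in zip(points, labels) if p[axis] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            best = min(best, score)
    return best


def pca_rotate_2d(points):
    """Center 2-D points and rotate them onto their principal axes."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    centered = [(x - mx, y - my) for x, y in points]
    cxx = sum(x * x for x, _ in centered) / n
    cyy = sum(y * y for _, y in centered) / n
    cxy = sum(x * y for x, y in centered) / n
    # Angle of the first principal axis of a 2x2 covariance matrix.
    theta = 0.5 * atan2(2.0 * cxy, cxx - cyy)
    c, s = cos(theta), sin(theta)
    return [(c * x + s * y, -s * x + c * y) for x, y in centered]


# Synthetic data: stretched along y = x, labeled by side of that diagonal.
points, labels = [], []
for t in range(-5, 6):
    for s in (-0.5, 0.5):
        points.append((t - s, t + s))
        labels.append(s > 0)

before = best_split_impurity(points, labels)
after = best_split_impurity(pca_rotate_2d(points), labels)
```

On this deliberately friendly data, no single split on the raw axes comes close to separating the classes, while one split on the second principal component separates them completely. Real data will not always cooperate: PCA ignores the labels, so the directions of greatest variance need not be the directions that separate the classes.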

What’s in your way?

I need to spend less money on food. I’m so absorbed in my classes this semester that I’m forgetting to do other things. I need to figure out how to file my income taxes.

What will you do next week?

I have two big ole interview rounds that are going to take up a lot of my attention.

What did you think of Paper #4: Pair Programming?

I like to read the papers after writing each week’s blog entry.

What was your experience of exceptions and StrCmp? (this question will vary, week to week)

I was interested in how C++ added exceptions (among other things) on top of C, and looked into how exceptions are implemented in various compilers. In class, we went over naive ways to signal errors, including return values, global variables, and extra parameters. These methods struggle with nested function call chains, because every caller along the way has to remember to check for the error and pass it along, which amounts to a hand-rolled “try/catch” at each level. In C, one can attempt to use setjmp and longjmp to transfer control up the function call hierarchy, but this is neither for the faint of heart nor recommended by most C style guides. Anyway, I am again reminded of how much I have yet to learn about compilers.
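The contrast between forwarding an error flag by hand and letting an exception unwind the call chain can be sketched as follows. This is written in Python rather than C++ to keep it short, and it is not the code from class; the function names and the Collatz-flavored example are made up for illustration.

```python
def collatz_steps(n):
    """Number of Collatz steps needed to reach 1 from a positive integer n."""
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


# Style 1: return-value checking. Every layer must forward the flag.
def parse_positive_rc(text):
    if not text.isdecimal() or int(text) == 0:
        return 0, False
    return int(text), True


def steps_rc(text):
    n, ok = parse_positive_rc(text)
    if not ok:  # forget this check and the bad value flows onward
        return 0, False
    return collatz_steps(n), True


# Style 2: exceptions. The middle layer contains no error handling at all.
def parse_positive(text):
    if not text.isdecimal() or int(text) == 0:
        raise ValueError(f"not a positive integer: {text!r}")
    return int(text)


def steps(text):
    return collatz_steps(parse_positive(text))


def report(text):
    # Only the layer that can actually respond to the error catches it.
    try:
        return f"{text} -> {steps(text)} steps"
    except ValueError as err:
        return f"error: {err}"
```

In the first style, `steps_rc` exists partly to forward the flag. In the second, `steps` stays a one-liner and the failure travels from `parse_positive` straight to `report`.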

What made you happy this week?

I went to my friend’s birthday party. I also went out to dinner at Suerte, which was pretty good although really expensive. And such small portions!

What’s your pick-of-the-week or tip-of-the-week?

My pick-of-the-week is 3Blue1Brown’s excellent video on solving Wordle using information theory concepts, which I’m sure everybody has already watched since it’s been out for a week. That being said, I tried my hand at making a Wordle solver in Python, using a very naive approach, so I respect any effort to defeat Wordle once and for all. One observation I had is that the narrator uses a greedy algorithm: in each round, guess the word that most reduces the expected entropy of the remaining candidate words. However, don’t we really want to look further into the future than just the next round?
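For reference, the greedy step described above can be sketched in a few lines of Python. This is a minimal sketch under my own assumptions, not the code from the video or from my solver: the function names are hypothetical, all remaining answers are treated as equally likely, and feedback is encoded as "g" (green), "y" (yellow), and "b" (gray).

```python
from collections import Counter
from math import log2


def feedback(guess, answer):
    """Wordle-style feedback string, handling repeated letters."""
    result = ["b"] * len(guess)
    unmatched = Counter()
    # First pass: mark greens and count the answer letters left over.
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "g"
        else:
            unmatched[a] += 1
    # Second pass: hand out yellows while leftover letters remain.
    for i, g in enumerate(guess):
        if result[i] != "g" and unmatched[g] > 0:
            result[i] = "y"
            unmatched[g] -= 1
    return "".join(result)


def expected_information(guess, candidates):
    """Expected bits gained from a guess, assuming equally likely answers."""
    counts = Counter(feedback(guess, answer) for answer in candidates)
    total = len(candidates)
    return sum((c / total) * log2(total / c) for c in counts.values())


def greedy_guess(allowed, candidates):
    """One-step lookahead: pick the guess with the highest expected information."""
    return max(allowed, key=lambda g: expected_information(g, candidates))
```

A guess that splits the candidates into many small, evenly sized feedback groups scores highest. Looking two rounds ahead would mean scoring each guess by this value plus the expected best score within each resulting group, which costs far more to compute.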
