Don't Double Major
Double majoring in college is a very suboptimal strategy. The reason is simple: It adds a substantial set of constraints to the courses you can take, but in return gives you only a very modest extra...
View ArticleA College Waitlisting Model
Suppose a selective college wants $N_0$ students in their freshman class. How many students should they admit, and what's the distribution of the number of students they'll admit off the waitlist? Of...
View ArticleCollege Interview Tips
The college admissions interview is a valuable component of college applications because it provides admissions officers with a holistic evaluation from a source that has no vested interest in your...
View ArticleVisualizing Lasso Polytope Geometry
Some recent research about the lasso exploits a beautiful geometric picture: Suppose you fix the design matrix X and the regularization parameter $\lambda$. For a particular value of y, the...
View ArticleSensitivity of Independence Assumptions
Recently I was considering an interesting problem: Several people interview a potential job candidate, and each of them scores that candidate numerically on some scale. What's the variation associated...
View ArticlePython Subclass Relationships Aren't Transitive
Subclass relationships are not transitive in Python. That is, if A is a subclass of B, and B is a subclass of C, it is not necessarily true that A is a subclass of C. The reason for this is that with...
View ArticleRobust Machine Learning
Real data often has incorrect values in it. Origins of incorrect data include programmer errors, ("oops, we're double counting!"), surprise API changes, (a function used to return proportions, suddenly...
View ArticleT-Tests Aren't Monotonic
R. A. Fisher and Karl Pearson play a heated round of golf. Being Statisticians, they agree before the round to run a two-sided paired T-test to see if either of them is statistically significantly...
View ArticleHow to Forge an Email
Most people don't realize how easy it is to forge an email. Say my brother John Doe uses the email address john.doe@example.com. If I get an email from that address, it's natural to assume that John...
View ArticleHalf the Decimal Trick
If something happened 1,234 out of 10,000 times, we'd estimate that the true probability of occurence is about 0.1234. Of course, we wouldn't expect the true probability to be exactly 0.1234, and to...
View ArticleYou Can't Predict Small Counts
A small restaurant is interested in predicting how many customers will come in on a given night. This is valuable information to know ahead of time, for example, so that the restaurant can figure out...
View ArticleVisualizing DBSCAN Clustering
A previous post covered clustering with the k-means algorithm. In this post, we consider a fundamentally different, density-based approach called DBSCAN. In contrast to k-means, which modeled clusters...
View ArticleMachine Learning over JSON
Supervised machine learning is the problem of approximating functions X -> Y from many example (x, y) pairs. Now, the vast majority of supervised learning algorithms assume that X is p-dimensional...
View ArticleOHMS Lessons Learned
Note: I found the following post as an almost complete draft as I was reading some of my unpublished posts. I wrote it around October 1st, 2013, at the beginning of what would end up being my last year...
View ArticleDesperation Motivated Creativity
I am not the strongest climber. Some of the people I've climbed with are so strong that they can do a one-arm pull-up, and then--while locking off with one arm--sing the "Head, Shoulders, Knees and...
View ArticleWhy I'm Making Tauthon
For the past two months I've been spending half my time on Tauthon. Tauthon is a backwards-compatible Python interpreter that runs Python 2 code and C-extensions exactly as-is, while also allowing...
View ArticleAn Easy Chess Puzzle
I was looking through Markovian, my old chess engine, recently, and came across the first game it won against another chess engine. Stepping through the game, it seems that both engines actually played...
View ArticleContinuous Time Lending
Assume a borrower takes out an installment loan of size $1$ and makes continuous-time payments on it. The installment loan starts at time $0$, ends at time $T$, and has an interest rate of $r$,...
View ArticleDay-to-Day Operations of Palo Alto
Palo Alto runs a pretty open city government, with a number of interesting documents available for download on their website. Of particular interest are their annual budgets and annual financial...
View ArticleImplementing "nonlocal" in Tauthon: Part I
Tauthon is a fork of Python 2.7 with syntax, builtins, and libraries backported from Python 3. It aspires to be able to run all valid Python 2 and 3 code. In this article, I begin discussing how I was...
View Article