This project analyzes over 30 years of grids, answers, and clues that make up the New York Times daily crossword: what they mean, how they change, and who they represent.
- XWordInfo.com: I used this site to obtain the actual data for each crossword (grid, clues, answers)
- Who's in the Crossword (Michelle McGhee): A Pudding article that I took some visual inspiration from, for underlines and the last couple of charts
- 24 Years of NYT Crossword Answers (Johnathan Tan): An article whose content overlaps with mine, especially in measures like type-token ratio
- R (tidyverse, ggplot2, gganimate)
  - I used R and its packages for general data exploration and to make reference versions of most of the plots displayed on the website
- Python (Jupyter Notebook)
  - I only used Python for one application: finding the most-used first and last words in clues, using the NLTK package
- Adobe After Effects
  - To enhance the look of the plots and add animations, I re-created most plots in After Effects (AE)
- Adobe Illustrator
  - Used to design general image assets (minimap, logo, etc.) and some plots
- Adobe Photoshop
  - I used Photoshop to convert the GIFs exported from AE so that they play only once
- Excel
  - I used Excel for additional data exploration and to organize some of the information for the last couple of charts
- JavaScript, TypeScript (with HTML, CSS)
  - I used some basic JavaScript and TypeScript to add interactivity to the website, such as guided scrolling, highlights, GIFs, the minimap, tooltips, and mobile formatting
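
The one Python step listed above, counting the most-used first and last words in clues, could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual notebook: the sample `clues` list and the helper names `tokenize` and `first_last_counts` are hypothetical, and a simple regular expression stands in for NLTK's tokenizer so the example runs with the standard library alone.

```python
import re
from collections import Counter

# Hypothetical sample clues for illustration only;
# the real project drew its clues from XWordInfo.com.
clues = [
    "Capital of Norway",
    "Kind of tide",
    "Capital of Peru",
    "Like some wines",
    "Kind of pie",
    "Opposite of WSW",
]


def tokenize(clue):
    """Lowercase a clue and split it into word tokens.

    Assumption: a regex stands in for NLTK's tokenizer here.
    """
    return re.findall(r"[a-z0-9']+", clue.lower())


def first_last_counts(clues):
    """Count how often each word opens or closes a clue."""
    first, last = Counter(), Counter()
    for clue in clues:
        tokens = tokenize(clue)
        if not tokens:  # skip clues with no word tokens, e.g. "___"
            continue
        first[tokens[0]] += 1
        last[tokens[-1]] += 1
    return first, last


first, last = first_last_counts(clues)
print("Most common first words:", first.most_common(3))
print("Most common last words:", last.most_common(3))
```

Swapping `tokenize` for an NLTK tokenizer would leave the counting logic unchanged, since `first_last_counts` only relies on getting a list of tokens per clue.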
"One Million Crossword Clues" was a project that I worked on over the summer of 2025. From the inception of the idea to the finished product, I estimate it took me about 3 months and around 100 hours of work.
Every day over the summer, I solved (or attempted to solve) a Times crossword with my dad. Over that time, we got significantly faster and better at solving these puzzles, which is something that I partially attribute to this project (though we have yet to crack a Friday or Saturday puzzle). It'll come in time though, no doubt.