I'm a PhD student interested in computer architecture, high-performance computing, and deep learning, with a particular focus on parallel architectures: CPUs, GPUs, and accelerators. My work currently centres on compiling for the RISC-V-based Tenstorrent accelerators.
Implemented Rainbow DQN with CUDA acceleration (via PyTorch) to train a deep reinforcement learning agent to play Super Mario Bros. Some experiments we ran:
- Which methods yielded the most significant performance gains?
- How well do agents generalise across levels? (via transfer learning)
- What is the best transfer learning approach? (single-level specialist vs. generalist)
- What are the best hyperparameters for our agent? (coordinate descent approach)
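The coordinate-descent search mentioned above can be sketched as follows: optimise one hyperparameter at a time while holding the rest fixed, sweeping the space repeatedly. The `evaluate` function here is a hypothetical stand-in for a full training-and-evaluation run, with a toy objective used only to illustrate the loop.

```python
import math

def evaluate(params):
    # Hypothetical stand-in for training an agent and scoring it;
    # this toy objective peaks at lr=1e-3, gamma=0.99.
    return -abs(math.log10(params["lr"]) + 3) - abs(params["gamma"] - 0.99)

def coordinate_descent(search_space, n_sweeps=2):
    """Optimise one hyperparameter at a time, holding the others fixed."""
    # Start from the first candidate value of each hyperparameter.
    best = {name: values[0] for name, values in search_space.items()}
    best_score = evaluate(best)
    for _ in range(n_sweeps):
        for name, values in search_space.items():
            for v in values:
                candidate = {**best, name: v}
                score = evaluate(candidate)
                if score > best_score:
                    best, best_score = candidate, score
    return best, best_score

space = {
    "lr": [1e-2, 1e-3, 1e-4],
    "gamma": [0.9, 0.99, 0.999],
}
best, score = coordinate_descent(space)
```

This trades exhaustiveness for cost: a full grid over the space would need every combination, while coordinate descent only sweeps each axis a few times.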
We collected metrics for each of these, using frames (≈ steps) as our measure of time, which is more rigorous than episodes (whose step counts can vary). Some of our video demos and results are shown below:
Full Game Performance | Multi-level Training 1 | Multi-level Training 2 | Comparing agents (training) | Comparing agents (level completion) | Comparing transfer learning approaches
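The frame-based bookkeeping described above can be sketched as a small logger (a minimal illustration, not the project's actual code): every environment step advances a global frame counter, so curves from agents with very different episode lengths share a directly comparable x-axis.

```python
class FrameMetrics:
    """Log episode returns against a global frame counter."""
    def __init__(self):
        self.frame = 0
        self.episode_return = 0.0
        self.records = []  # (frame, episode_return) at each episode end

    def step(self, reward, done):
        self.frame += 1
        self.episode_return += reward
        if done:
            self.records.append((self.frame, self.episode_return))
            self.episode_return = 0.0

# Two toy "agents" with the same episode count but very different
# episode lengths: plotted by frames, their curves stay comparable.
short_eps = FrameMetrics()
for _ in range(3):
    for t in range(10):
        short_eps.step(reward=1.0, done=(t == 9))

long_eps = FrameMetrics()
for _ in range(3):
    for t in range(100):
        long_eps.step(reward=1.0, done=(t == 99))
```

Indexing by episode instead would make the second agent look three episodes "old" after 300 frames of experience, ten times what the first agent saw in its three episodes.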
Raytracer built from scratch in C++, with the following features:
- Global Lighting (reflections/refraction)
- Accelerated Raytracing (Bounding Boxes, multithreading parallelism)
- Quadratic Surfaces (cones, ellipsoids, curved surfaces in general)
- Constructive Solid Geometries to combine surfaces
- PolyMesh rendering (.obj file parser that constructs triangles; see the teapot in the image)
- Phong material colouring, with photon mapping support
- Global illumination by generating and querying a photon map (see paper)
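The bounding-box acceleration in the feature list above boils down to a cheap ray/AABB rejection test before any expensive surface intersection. The project itself is C++; this is a compact Python sketch of the standard slab method, not the raytracer's actual code.

```python
def ray_aabb_hit(origin, direction, box_min, box_max):
    """Slab-method ray vs. axis-aligned bounding box test.

    Clips the ray against the min/max planes of each axis; the ray
    hits the box iff the clipped interval [t_near, t_far] is non-empty.
    """
    t_near, t_far = 0.0, float("inf")
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            # Ray parallel to this slab: misses unless origin lies inside it.
            if o < lo or o > hi:
                return False
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return False
    return True

# Ray along +x from the origin: hits a box directly ahead,
# misses one offset in y.
hit = ray_aabb_hit((0, 0, 0), (1, 0, 0), (2, -1, -1), (3, 1, 1))
miss = ray_aabb_hit((0, 0, 0), (1, 0, 0), (2, 2, -1), (3, 3, 1))
```

Because the test is a handful of comparisons per axis, rejecting a box early can skip an entire mesh's worth of per-triangle intersection tests.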
Final render with photon mapping
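The quadratic-surface support listed above reduces, for each surface, to solving a quadratic in the ray parameter t. The sphere is the simplest case; this is an illustrative Python sketch (the raytracer itself is C++), substituting the ray o + t·d into |p − c|² = r².

```python
import math

def ray_sphere_t(origin, direction, centre, radius):
    """Nearest positive intersection t of a ray with a sphere, or None."""
    oc = [o - c for o, c in zip(origin, centre)]
    # Coefficients of a*t^2 + b*t + c = 0 from the substitution.
    a = sum(d * d for d in direction)
    b = 2.0 * sum(d * x for d, x in zip(direction, oc))
    c = sum(x * x for x in oc) - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None  # ray misses the sphere entirely
    sqrt_disc = math.sqrt(disc)
    for t in ((-b - sqrt_disc) / (2 * a), (-b + sqrt_disc) / (2 * a)):
        if t > 1e-6:  # ignore hits at or behind the ray origin
            return t
    return None

# Ray down +z from the origin toward a unit sphere centred at z=5:
# the near surface is at t = 4.
t_hit = ray_sphere_t((0, 0, 0), (0, 0, 1), (0, 0, 5), 1.0)
```

Cones and general quadrics follow the same pattern with different coefficient formulas, which is what makes a shared quadratic-surface intersection routine practical.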
I'm happy working in any language, with my main experience being in:
- C, C++
- Python
- Swift
- Java