My projects are organised by deliverable and depth:
Flagship Projects: end-to-end builds that ship a working app or pipeline — reproducible architecture, rigorous evaluation, and documented results you can explore directly.
Research Projects: peer-reviewed modelling work presented as data science case studies, where the primary output is a published finding with real consequences.
Skill Labs: short, targeted builds designed to cement a specific technique through implementation rather than coursework.
Flagship Projects
Working apps/pipelines with reproducible architecture and documented outputs.
A deterministic job-market intelligence system that turns messy postings into interpretable skill demand, salary signals, and clear best_now vs stretch recommendations — delivered as a reproducible Python pipeline + Streamlit app.
An async LLM pipeline that extracts structured job intelligence from raw postings using GPT-4o-mini, chain-of-thought prompt architecture, and a rigorous LLM-as-judge evaluation framework — built as the AI layer for the Job Intelligence Engine.
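The async fan-out pattern behind that pipeline can be sketched with the standard library alone. This is a hypothetical illustration, not the project's code: `extract_one` is a stub standing in for a real GPT-4o-mini request, and the semaphore caps how many postings are in flight at once.

```python
import asyncio

# Hypothetical sketch of the async fan-out pattern: a semaphore caps
# concurrent model calls while all postings are processed in parallel.
# extract_one is a stub standing in for a real GPT-4o-mini request.

async def extract_one(posting: str) -> dict:
    await asyncio.sleep(0)  # placeholder for the network round-trip
    return {"posting": posting, "skills": sorted(set(posting.lower().split()))}

async def extract_all(postings: list[str], max_concurrency: int = 8) -> list[dict]:
    sem = asyncio.Semaphore(max_concurrency)

    async def bounded(p: str) -> dict:
        async with sem:
            return await extract_one(p)

    # gather preserves input order regardless of completion order
    return await asyncio.gather(*(bounded(p) for p in postings))

results = asyncio.run(extract_all(["Python SQL", "Bayesian modelling"]))
```

Bounding concurrency with a semaphore rather than batching keeps the event loop saturated while respecting provider rate limits.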
Eleven production-minded Python projects spanning the full LLM engineering stack — from structured prompting and RAG to QLoRA fine-tuning, autonomous multi-agent systems, and serverless cloud deployment. The flagship project builds a price predictor that trains on 800k products, benchmarks a dozen model architectures, and deploys an autonomous agent that scans for deals and notifies you in real time.
End-to-end SQL analytics project using the Lahman Baseball Database. Designed a complete relational workflow with schema creation, reusable views, advanced CTEs, window functions, and business-focused analyses on talent pipelines, salary dynamics, and player careers.
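The CTE-plus-window-function pattern from that workflow looks like this in miniature. The `salaries` table below is invented for illustration (the real project queries the Lahman schema), run here through Python's built-in `sqlite3`:

```python
import sqlite3

# Toy sketch of the CTE + window-function pattern; the salaries table
# here is invented, not the Lahman schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE salaries (player TEXT, year INTEGER, salary INTEGER);
INSERT INTO salaries VALUES
  ('ruth', 1930, 80000), ('ruth', 1931, 80000),
  ('gehrig', 1930, 25000), ('gehrig', 1931, 27500);
""")
query = """
WITH yearly AS (
  -- rank players by pay within each season
  SELECT year, player, salary,
         RANK() OVER (PARTITION BY year ORDER BY salary DESC) AS pay_rank
  FROM salaries
)
SELECT year, player, salary FROM yearly WHERE pay_rank = 1 ORDER BY year;
"""
top_paid = conn.execute(query).fetchall()
# [(1930, 'ruth', 80000), (1931, 'ruth', 80000)]
```

Materialising the ranking in a CTE keeps the final SELECT readable and makes the view reusable for other top-N questions.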
Research Projects
Peer-reviewed modelling/analytics work presented in a DS case-study format.
Developed a hierarchical Bayesian workflow to quantify partial altitudinal migration and system-wide community reshuffling across elevation and season in the Australian Wet Tropics.
Developed and implemented a holistic Bayesian framework integrating microclimate, mechanistic physiology, biogeochemical processes, and population dynamics to identify causal pathways from climate change to survival and recruitment.
Revealed ecosystem cascades and biogeochemical pathways in tropical systems using Bayesian hierarchical modelling to quantify direct and indirect effects in complex ecological networks.
Applied hierarchical Bayesian models with satellite-derived predictors to identify climate-driven population changes in rainforest birds across space and time.
Developed Bayesian hierarchical models incorporating detection probability to forecast population viability and support elevated conservation status for imperilled species.
Developed a high-throughput spatial forecasting workflow of community turnover under climate change, optimising computational performance for multi-species forecasting across elevational gradients.
Used time-series GLMs and interactive visualisation (Shiny app) to nominate 14 bird species for elevated protection under national and international priority lists.
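The detection-probability idea running through several of these models can be shown with a minimal likelihood sketch. This is pure Python on invented data, not the published hierarchical analyses: `psi` is the probability a site is occupied and `p` the per-visit detection probability, the two quantities an occupancy model separates.

```python
from math import log

# Minimal sketch of detection probability in an occupancy model: psi is the
# probability a site is occupied, p the per-visit detection probability.
# Data are invented; real analyses place hierarchical priors on both.

def site_likelihood(detections: int, visits: int, psi: float, p: float) -> float:
    # likelihood of one specific detection history at one site
    seen = psi * (p ** detections) * ((1 - p) ** (visits - detections))
    if detections == 0:
        # never detected: either occupied but missed every visit, or absent
        seen += (1 - psi)
    return seen

def log_likelihood(histories, psi: float, p: float) -> float:
    return sum(log(site_likelihood(d, k, psi, p)) for d, k in histories)

histories = [(3, 4), (0, 4), (1, 4)]  # (detections, visits) per site
ll = log_likelihood(histories, psi=0.6, p=0.5)
```

Separating `psi` from `p` is what lets a model distinguish "the species is declining" from "the species became harder to detect".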
Skill Labs
Short, targeted builds that cement one technique through hands-on implementation.
Implemented core machine learning algorithms — from regression and classification to clustering and deep learning — through applied projects in Python. Focused on building intuition for model training, evaluation, and interpretability using Scikit-learn and Jupyter Notebooks.
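As a flavour of the from-scratch work, here is a minimal k-means sketch on invented 1-D data; the labs themselves use scikit-learn on richer datasets.

```python
import random

# Minimal from-scratch k-means on 1-D points (invented data): assign each
# point to its nearest centre, then recompute centres as cluster means.

def kmeans_1d(points: list[float], k: int, iters: int = 20, seed: int = 0) -> list[float]:
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for x in points:
            nearest = min(range(k), key=lambda i: abs(x - centers[i]))
            clusters[nearest].append(x)
        # keep an empty cluster's old centre rather than dividing by zero
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sorted(centers)

data = [1.0, 1.2, 0.8, 9.9, 10.1, 10.0]
centers = kmeans_1d(data, k=2)  # converges near [1.0, 10.0]
```

The assign-then-average loop is the whole algorithm; everything else in production implementations is initialisation strategy and vectorisation.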
Developed an applied EDA framework combining real-world case studies — emergency call records and financial time series — to demonstrate data wrangling, feature extraction, and visualisation workflows using pandas, seaborn, and plotly.
Developed a suite of applied Python mini-systems demonstrating the progression from procedural programming to object-oriented design and class interaction across real-world examples.
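The procedural-to-OO progression can be illustrated with one invented example (a library loan, not one of the actual mini-systems): the same logic first as a function mutating a dict, then as state owned by interacting classes.

```python
# Invented illustration of the procedural -> object-oriented progression:
# the same checkout logic as a free function over a dict, then as two
# interacting classes that own their own state.

def checkout_procedural(inventory: dict, title: str) -> bool:
    if inventory.get(title, 0) > 0:
        inventory[title] -= 1
        return True
    return False

class Member:
    def __init__(self, name: str):
        self.name = name
        self.loans: list[str] = []

class Library:
    def __init__(self, inventory: dict):
        self.inventory = dict(inventory)

    def checkout(self, member: Member, title: str) -> bool:
        # the Library updates its own stock and the Member's loan record
        if self.inventory.get(title, 0) > 0:
            self.inventory[title] -= 1
            member.loans.append(title)
            return True
        return False

lib = Library({"Dune": 1})
alice = Member("alice")
ok = lib.checkout(alice, "Dune")     # succeeds; state lives on the objects
again = lib.checkout(alice, "Dune")  # fails; stock exhausted
```

The OO version earns its keep once several kinds of objects must stay consistent with each other, which is exactly the class-interaction step these labs target.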