This page is an overview of my data-science work, organised by deliverable and depth:
Flagship Projects: end-to-end builds that ship a working app/pipeline, with reproducible architecture, evaluation, and documented results.
Research Projects: research-grade modelling and analytics work, formatted as data-science case studies, where the primary outcome is a peer-reviewed publication.
Skill Labs: short, targeted builds designed to practise and demonstrate a specific technique through implementation.
Flagship Projects
Working apps/pipelines with reproducible architecture and documented outputs.
A deterministic job-market intelligence system that turns messy postings into interpretable skill demand, salary signals, and clear best_now vs stretch recommendations, delivered as a reproducible Python pipeline and Streamlit app.
End-to-end SQL analytics project using the Lahman Baseball Database. Designed a complete relational workflow with schema creation, reusable views, advanced CTEs, window functions, and business-focused analyses on talent pipelines, salary dynamics, and player careers.
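To give a flavour of the CTE and window-function pattern mentioned in the SQL project above, here is a minimal sketch run through Python's built-in sqlite3 module. The salaries table, its column names, and every row in it are hypothetical stand-ins invented for illustration; they are not the Lahman schema or data, and the query is not taken from the project itself.

```python
import sqlite3

# Hypothetical table and rows for illustration only (not the Lahman schema or data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE salaries (player_id TEXT, team_id TEXT, year_id INTEGER, salary INTEGER);
INSERT INTO salaries VALUES
  ('p1', 'AAA', 2001, 100),
  ('p2', 'AAA', 2001, 300),
  ('p3', 'BBB', 2001, 200),
  ('p1', 'AAA', 2002, 150),
  ('p3', 'BBB', 2002, 250);
""")

query = """
WITH team_payroll AS (
    -- CTE: collapse player salaries to one payroll figure per team and year
    SELECT team_id, year_id, SUM(salary) AS payroll
    FROM salaries
    GROUP BY team_id, year_id
)
SELECT
    team_id,
    year_id,
    payroll,
    -- Window function: running payroll total within each team over time
    SUM(payroll) OVER (PARTITION BY team_id ORDER BY year_id) AS cumulative_payroll,
    -- Window function: rank teams by payroll within each year
    RANK() OVER (PARTITION BY year_id ORDER BY payroll DESC) AS payroll_rank
FROM team_payroll
ORDER BY team_id, year_id;
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The CTE keeps the aggregation step separate from the ranking logic, so the same intermediate result could also be wrapped in a reusable view. Window functions require SQLite 3.25 or later, which recent Python builds typically bundle.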
Research Projects
Peer-reviewed modelling/analytics work presented in a data-science case-study format.
Developed a hierarchical Bayesian workflow to quantify partial altitudinal migration and system-wide community reshuffling across elevation and season in the Australian Wet Tropics.
Developed and implemented a holistic Bayesian framework integrating microclimate, mechanistic physiology, biogeochemical processes, and population dynamics to identify causal pathways from climate change to survival and recruitment.
Revealed ecosystem cascades and biogeochemical pathways in tropical systems using Bayesian hierarchical modelling to quantify direct and indirect effects in complex ecological networks.
Applied hierarchical Bayesian models with satellite-derived predictors to identify climate-driven population changes in rainforest birds across space and time.
Developed Bayesian hierarchical models incorporating detection probability to forecast population viability and support elevated conservation status for imperilled species.
Developed a high-throughput spatial forecasting workflow of community turnover under climate change, optimising computational performance for multi-species forecasting across elevational gradients.
Used time-series GLMs and interactive visualisation (Shiny app) to nominate 14 bird species for elevated protection under national and international priority lists.
Skill Labs
Implemented core machine learning algorithms, from regression and classification to clustering and deep learning, through applied projects in Python. Focused on building intuition for model training, evaluation, and interpretability using scikit-learn and Jupyter notebooks.
Developed an applied EDA framework built on two real-world case studies (emergency call records and financial time series) to demonstrate data wrangling, feature extraction, and visualisation workflows using pandas, seaborn, and Plotly.
Developed a suite of applied Python mini-systems demonstrating the progression from procedural programming to object-oriented design and class interaction across real-world examples.