📈

rack-stats

Tournament analytics pipeline for the Florida Billiards Circuit

A Python data pipeline that generates synthetic billiards tournament data, stores it in SQLite, computes player, venue, and game-type statistics with SQL aggregations, and presents the results in an interactive Streamlit dashboard.

14 scenarios · pytest

The Problem

Tournament data in billiards is typically tracked in spreadsheets with no analytics layer. Understanding venue performance, player trends, and game type distributions requires manual work.

The Solution

An ETL pipeline with domain-accurate synthetic data (real Florida venue names, realistic Fargo rating distributions, proper handicap vs open tournament separation) feeding an interactive dashboard.
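To make the synthetic-data idea concrete, here is a minimal sketch of how handicap and open tournaments might be generated separately. It uses only the standard library. The entry-fee ranges, the Fargo cap of 650, the race lengths, and the 22-24 field-size bias come from the Key Engineering Decisions on this page; everything else (the function names `make_player_rating` and `make_tournament`, the dictionary keys, the exact field-size weights, the rating mean and spread, and the use of location labels in place of venue names) is an assumption for illustration, not the project's actual code.

```python
import random

# Values taken from this page: fee ranges, Fargo cap, race lengths.
FORMATS = {
    "handicap": {"entry_fee": (20, 50), "max_fargo": 650},
    "open": {"entry_fee": (100, 200), "max_fargo": None},
}
RACE_LENGTHS = {"9-ball": [7], "10-ball": [7], "banks": [3, 5]}

# Assumed for illustration: location labels and the exact weights.
VENUES = ["Davie", "West Palm Beach"]
FIELD_SIZES = [16, 20, 22, 23, 24, 28, 32]
FIELD_WEIGHTS = [1, 2, 5, 5, 5, 2, 1]  # biased toward 22-24 players


def make_player_rating(rng, max_fargo=None):
    """Draw a Fargo-style rating; mean 550 and spread 80 are assumed values."""
    while True:
        rating = round(rng.gauss(550, 80))
        # Reject draws outside a plausible range or above the handicap cap.
        if 300 <= rating <= 800 and (max_fargo is None or rating <= max_fargo):
            return rating


def make_tournament(rng, kind):
    """Build one synthetic tournament record of the given kind."""
    spec = FORMATS[kind]
    game = rng.choice(list(RACE_LENGTHS))
    low, high = spec["entry_fee"]
    field_size = rng.choices(FIELD_SIZES, weights=FIELD_WEIGHTS, k=1)[0]
    return {
        "kind": kind,
        "venue": rng.choice(VENUES),
        "game": game,
        "race_to": rng.choice(RACE_LENGTHS[game]),
        "entry_fee": rng.randrange(low, high + 1, 5),  # $5 steps, inclusive
        "ratings": [
            make_player_rating(rng, spec["max_fargo"]) for _ in range(field_size)
        ],
    }


rng = random.Random(42)  # fixed seed so the synthetic data is reproducible
handicap = make_tournament(rng, "handicap")
open_event = make_tournament(rng, "open")
```

Keeping the two formats in one lookup table means the generator cannot mix them up: a handicap event can never draw an open-level entry fee or a player rated above the cap.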

Screenshots

Circuit overview metrics

12 tournaments, 32 players, 224 matches, $17,050 total payout across the 2025 circuit

Player standings chart

Top 10 players by win percentage with rack efficiency color coding. Marvel vs DC on the felt.

Venue analytics chart

Total payout by venue and game type, with Davie and West Palm Beach leading the circuit

Tournament details table

Full tournament details table with sortable columns

Tech Stack

Python

Primary data engineering language

SQLAlchemy

ORM for database modeling, the same pattern as EF Core

Pandas

DataFrame-based data manipulation and aggregation

Streamlit

Instant web dashboard from Python scripts

Plotly

Interactive charts with hover and zoom

SQLite

Zero-config database, so anyone can run it locally

Key Engineering Decisions

▸ Synthetic data uses real Florida venue names and accurate tournament formats

▸ Handicap tournaments ($20-50 entry, Fargo ≤ 650) and open tournaments ($100-200 entry, any Fargo) are modeled separately

▸ Race lengths are domain-accurate: race to 7 for 9-ball and 10-ball, race to 3 or 5 for Banks

▸ Field size is weighted toward 22-24 players to match real local tournament averages

▸ W/L is computed from match results rather than stored; this caught 2 real query bugs during testing
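Deriving W/L from match rows instead of storing counters can be sketched with the standard-library `sqlite3` module. The `matches` table, its column names, the player names, and the sample rows are assumptions for illustration; the project's real schema and the two bugs it caught are not shown on this page.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE matches (
        id INTEGER PRIMARY KEY,
        player_a TEXT NOT NULL,
        player_b TEXT NOT NULL,
        winner TEXT NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO matches (player_a, player_b, winner) VALUES (?, ?, ?)",
    [
        ("Ann", "Bo", "Ann"),
        ("Ann", "Cy", "Cy"),
        ("Bo", "Cy", "Cy"),
        ("Ann", "Bo", "Ann"),
    ],
)

# Unpivot so each match yields one row per participant, then aggregate.
# Grouping on player_a alone would drop every appearance as player_b.
WIN_LOSS_SQL = """
WITH appearances AS (
    SELECT player_a AS player, winner FROM matches
    UNION ALL
    SELECT player_b AS player, winner FROM matches
)
SELECT
    player,
    SUM(winner = player) AS wins,
    SUM(winner <> player) AS losses,
    ROUND(100.0 * SUM(winner = player) / COUNT(*), 1) AS win_pct
FROM appearances
GROUP BY player
ORDER BY win_pct DESC, player
"""

standings = conn.execute(WIN_LOSS_SQL).fetchall()
for player, wins, losses, win_pct in standings:
    print(f"{player}: {wins}-{losses} ({win_pct}%)")
```

Because nothing is stored, the totals cannot drift from the match table: for every player, wins plus losses always equals the number of matches played, which is an easy invariant to assert in a test.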