The Effects of MLB Integration on Run Scoring and Attendance in Negro League Baseball
For the final project of my edX DSE 200X class, I downloaded Retrosheet’s Negro League Baseball dataset to analyze how Jackie Robinson’s MLB debut in 1947 affected Negro League Baseball. I looked at Negro League games from 1946 through 1949, charting how the run-scoring environment (runs per game) and attendance changed across those four seasons. I found that run scoring, which I used as a crude proxy for quality of play, remained steady, but attendance dropped precipitously as the best Black professional players began to enter the AL and the NL. Here is a slideshow summarizing my major findings, and I’ve also exported my Jupyter Notebook to HTML, which you can view here.
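
For readers who want to try a similar comparison, here is a minimal pandas sketch of the two per-season metrics described above: runs per game and average attendance. It is not the code from the notebook. The column names (`season`, `home_score`, `visitor_score`, `attendance`) and the `summarize_by_season` helper are hypothetical, and the rows are toy values rather than real Retrosheet data.

```python
import pandas as pd

# Toy rows for illustration only; NOT real Retrosheet values.
# Column names are assumptions, not the dataset's actual schema.
games = pd.DataFrame({
    "season": [1946, 1946, 1947, 1947],
    "home_score": [5, 3, 4, 6],
    "visitor_score": [2, 4, 7, 1],
    "attendance": [8000, None, 5000, 3000],  # None marks a game with no recorded attendance
})


def summarize_by_season(df: pd.DataFrame) -> pd.DataFrame:
    """Return games played, combined runs per game, and mean attendance per season."""
    with_totals = df.assign(total_runs=df["home_score"] + df["visitor_score"])
    return with_totals.groupby("season").agg(
        games=("total_runs", "size"),
        runs_per_game=("total_runs", "mean"),
        # mean() skips missing values, so games without attendance are ignored here
        mean_attendance=("attendance", "mean"),
    )


summary = summarize_by_season(games)
print(summary)
```

Averaging attendance only over games where it was recorded is one possible choice. If attendance is missing for many games, the per-season means may not be representative, so it is worth checking how many games contribute to each figure.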


