1. The Hit Song Timeline
DSC 106 · Final Project Proposal
How has the Billboard Hot 100 changed in the streaming era, and has the modern hit song become optimized for instant attention?
Popular music has always changed with technology: radio, albums, MTV, downloads, and now streaming. Streaming platforms make skipping songs effortless, so we investigate whether recent hit songs show signs of being shaped for faster attention.
We will compare Billboard Hot 100 songs across time using duration, danceability, energy, loudness, acousticness, valence, tempo, and chart metadata. Our explorable explanation will let readers move through the history of the hit song, compare pre-streaming and streaming-era tracks, and inspect outliers that became popular despite being long, quiet, acoustic, or otherwise unusual.
The goal is not to prove that streaming caused every musical shift. Instead, the project asks a sharper visual question: what did the streaming era do to the shape of a hit song, and what kinds of musical space may have become rarer?
We use the public Kaggle dataset Billboard Hot weekly charts with Spotify audio features. It includes weekly Billboard chart records and Spotify-derived audio features, satisfying the outside-dataset requirement of at least 100 rows and 5 columns.
The dataset's two files are merged on SongID. The cleaned version keeps one row per song and assigns each song the year of its first Billboard chart appearance, yielding 24,206 songs with duration, chart performance, danceability, energy, loudness, acousticness, valence, and tempo.
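The merge and one-row-per-song step can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not our actual cleaning script. The column names other than SongID (WeekID, duration_ms, peak), the ISO date format, and the sample rows are all assumptions made for the example.

```python
from datetime import date

# Hypothetical rows standing in for the two source files. Only SongID is
# named in the proposal; WeekID, peak, and duration_ms are assumed columns,
# and the ISO date format is an assumption as well.
chart_rows = [
    {"SongID": "a", "WeekID": "1999-05-01", "peak": 3},
    {"SongID": "a", "WeekID": "1998-12-26", "peak": 10},
    {"SongID": "b", "WeekID": "2019-03-09", "peak": 1},
    {"SongID": "c", "WeekID": "2005-07-02", "peak": 40},
]
feature_rows = [
    {"SongID": "a", "duration_ms": 240000},
    {"SongID": "b", "duration_ms": 157000},
]


def first_chart_year(rows):
    """Map each SongID to the year of its earliest chart week."""
    first = {}
    for row in rows:
        week = date.fromisoformat(row["WeekID"])
        song_id = row["SongID"]
        if song_id not in first or week < first[song_id]:
            first[song_id] = week
    return {song_id: week.year for song_id, week in first.items()}


def merge_by_song_id(chart_rows, feature_rows):
    """Join features to chart data on SongID, one output row per song."""
    years = first_chart_year(chart_rows)
    merged = []
    for feat in feature_rows:
        year = years.get(feat["SongID"])
        if year is None:
            continue  # song has audio features but never appears in the chart file
        merged.append({**feat, "year": year})
    return merged


songs = merge_by_song_id(chart_rows, feature_rows)
```

In this sketch, songs that appear in only one of the two files are dropped (an inner join). That is one reasonable choice, not necessarily the one used for the 24,206-song table.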
For the interactive prototype's artist and song comparison, we also use Spotify's 12M+ songs dataset. From it we build per-artist catalog averages and a track lookup, so readers can overlay a specific song and its artist's catalog profile on the era baselines in the radar chart. Each track's Spotify release year places it in a streaming era.
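A rough sketch of how per-artist catalog averages, a track lookup, and era assignment could fit together is shown below. The field names, the sample tracks, and especially the `STREAMING_START` cutoff year are placeholders chosen for illustration; the proposal does not fix the era boundaries.

```python
from collections import defaultdict
from statistics import mean

# Placeholder cutoff, NOT taken from the proposal; the real era boundaries
# are a design decision that this sketch does not settle.
STREAMING_START = 2015


def era_for(release_year, streaming_start=STREAMING_START):
    """Assign a coarse era label from a track's release year."""
    return "streaming" if release_year >= streaming_start else "pre-streaming"


def artist_averages(tracks, features):
    """Average each audio feature over every track in an artist's catalog."""
    by_artist = defaultdict(list)
    for track in tracks:
        by_artist[track["artist"]].append(track)
    return {
        artist: {f: mean([t[f] for t in rows]) for f in features}
        for artist, rows in by_artist.items()
    }


# Hypothetical tracks; field names are assumptions for this example.
tracks = [
    {"artist": "Artist A", "name": "Song 1", "release_year": 2009, "energy": 0.8, "valence": 0.6},
    {"artist": "Artist A", "name": "Song 2", "release_year": 2020, "energy": 0.4, "valence": 0.2},
    {"artist": "Artist B", "name": "Song 3", "release_year": 2018, "energy": 0.5, "valence": 0.5},
]

profiles = artist_averages(tracks, ["energy", "valence"])

# Track lookup keyed by (artist, name), with the era label attached.
lookup = {
    (t["artist"], t["name"]): {**t, "era": era_for(t["release_year"])}
    for t in tracks
}
```

Keying the lookup by artist and title keeps the sketch simple; a stable track ID would be a safer key if the dataset provides one, since titles can repeat within a catalog.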
Our webpage presents interactive charts exploring duration, audio features over time, era comparisons, and outliers that resist the trend. The interactive prototype extends this work with scrollytelling and era comparisons.
Average duration by first Billboard chart year.
Duration distribution before and during streaming.
Average audio features for each listening era.
Use the era buttons to highlight one part of the timeline.
Average spread of core audio features by year.
Recent hits that resist the optimized pattern: longer, quieter, more acoustic, or less danceable.
We have chosen a music-focused topic and built the project around Billboard Hot 100 songs with Spotify audio features. We cleaned the source data into one row per song, using each track's first Billboard chart appearance as its year. We created summary CSVs for yearly trends, era-level duration distributions, audio-feature diversity, scatterplot samples, and modern outlier songs. We also built a public webpage with D3 visualizations that show early patterns in duration, danceability, energy, acousticness, valence, and unusual modern hits.
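As an illustration of the audio-feature diversity summary, one way to compute an average spread per year is sketched below. It assumes "spread" means the population standard deviation of each feature within a year, averaged across features, and that the features share a 0 to 1 scale. Both points, along with the feature list and sample rows, are assumptions for the example rather than a description of our exact summary CSV.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Assumed set of "core" features, all on a 0-1 scale.
CORE_FEATURES = ["danceability", "energy", "acousticness", "valence"]


def yearly_spread(songs, features=CORE_FEATURES):
    """For each year, average the per-feature standard deviations."""
    by_year = defaultdict(list)
    for song in songs:
        by_year[song["year"]].append(song)
    return {
        year: mean([pstdev([s[f] for s in rows]) for f in features])
        for year, rows in sorted(by_year.items())
    }


def make_song(year, value):
    """Build a hypothetical song whose core features all equal `value`."""
    return {"year": year, **{f: value for f in CORE_FEATURES}}


# Hypothetical data: the 2000 songs differ widely, the 2020 songs are identical.
sample = [
    make_song(2000, 0.2),
    make_song(2000, 0.8),
    make_song(2020, 0.5),
    make_song(2020, 0.5),
]

spread = yearly_spread(sample)
```

A lower value for a year means that year's hits sit closer together in feature space. A year with a single song would report a spread of zero, so sparse years may need to be filtered or flagged before plotting.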
The hardest design challenge will be turning the project from a set of charts into a clear explorable explanation. The dataset can show that hit songs changed over time, but it cannot prove that streaming directly caused every change. We will need to design annotations and interactions that make the evidence feel compelling without overstating causality. We also need to help readers compare eras without losing the song-level detail that makes the topic interesting.