DSC 106 · Final Project Proposal

Designed Not to Be Skipped

How has the Billboard Hot 100 changed in the streaming era — and has the modern hit song become optimized for instant attention?

24,206 cleaned Hot 100 songs
1958–2021 first chart appearance
9 audio features tracked

Project Proposal

Popular music has always changed with technology: radio, albums, MTV, downloads, and now streaming. Streaming platforms make skipping songs effortless, so we investigate whether recent hit songs show signs of being shaped to capture attention faster.

We will compare Billboard Hot 100 songs across time using duration, danceability, energy, loudness, acousticness, valence, tempo, and chart metadata. Our explorable explanation will let readers move through the history of the hit song, compare pre-streaming and streaming-era tracks, and inspect outliers that became popular despite being long, quiet, acoustic, or otherwise unusual.

The goal is not to prove that streaming caused every musical shift. Instead, the project asks a sharper visual question: what did the streaming era do to the shape of a hit song, and what kinds of musical space may have become rarer?

Dataset

We use the public Kaggle dataset Billboard Hot weekly charts with Spotify audio features. It includes weekly Billboard chart records and Spotify-derived audio features, satisfying the outside-dataset requirement of at least 100 rows and 5 columns.

The dataset's two files, the weekly chart records and the Spotify audio features, are merged on SongID. The cleaned version keeps one row per song, dated by the year of its first Billboard chart appearance. The result is 24,206 songs with duration, chart performance, danceability, energy, loudness, acousticness, valence, and tempo.
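The merge-and-deduplicate step can be sketched as follows. This is a minimal illustration, not the project's actual cleaning script: only SongID comes from the dataset description, while the `week` column name (an ISO date string), the `first_year` field, and the function name `build_song_table` are assumptions made for the example.

```python
from datetime import date


def build_song_table(chart_rows, feature_rows):
    """Join weekly chart rows to audio features on SongID, one row per song.

    chart_rows: dicts with "SongID" and "week" (ISO date string; assumed name).
    feature_rows: dicts with "SongID" plus audio-feature columns.
    """
    # Track the earliest chart week seen for each song.
    first_week = {}
    for row in chart_rows:
        week = date.fromisoformat(row["week"])
        song_id = row["SongID"]
        if song_id not in first_week or week < first_week[song_id]:
            first_week[song_id] = week

    # Inner join: keep songs that have both chart history and features,
    # and keep only the first feature row if a SongID is duplicated.
    songs = {}
    for row in feature_rows:
        song_id = row["SongID"]
        if song_id in first_week and song_id not in songs:
            songs[song_id] = {**row, "first_year": first_week[song_id].year}
    return list(songs.values())
```

An inner join is one reasonable choice here, since a song without audio features cannot appear in any of the feature charts; a different cleaning pass might instead keep such songs with missing values.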

For the interactive prototype’s artist and song comparison, we also use Spotify’s 12M+ songs dataset. From it we build per-artist catalog averages and a track lookup so readers can overlay a specific song and catalog profile on the era baselines in the radar chart, using each track’s Spotify release year to place it in a streaming era.
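A per-artist catalog average of this kind might be computed along the following lines. This is a sketch under assumptions: the `artist` key, the choice of four features, and the function name `artist_profiles` are hypothetical and may not match the columns in the Spotify dataset or the prototype's code.

```python
from collections import defaultdict
from statistics import mean

# Features averaged for the radar chart overlay (an assumed subset).
FEATURES = ("danceability", "energy", "acousticness", "valence")


def artist_profiles(tracks):
    """Return {artist: {feature: catalog average}} from a list of track dicts."""
    # Group every track under its artist.
    by_artist = defaultdict(list)
    for track in tracks:
        by_artist[track["artist"]].append(track)

    # Average each feature over the artist's whole catalog.
    return {
        artist: {f: mean([t[f] for t in rows]) for f in FEATURES}
        for artist, rows in by_artist.items()
    }
```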

Era Definitions

  • Pre-streaming — before 2010 · 19,360 songs
  • Streaming growth — 2010–2019 · 4,266 songs
  • Streaming native — 2020 onward · 580 songs
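The era boundaries above can be written as a small lookup. This is a minimal sketch; the function name `era_for_year` is illustrative and not taken from the project code.

```python
def era_for_year(year):
    """Map a first-chart year to one of the three listening eras."""
    if year < 2010:
        return "Pre-streaming"
    if year <= 2019:
        return "Streaming growth"
    return "Streaming native"
```

Applying this to the first-chart year of each cleaned song and counting the labels should reproduce the three era sizes listed above.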

Proposal Visualizations

Six interactive charts explore duration, audio features over time, era comparisons, and outliers that resist the trend. The interactive prototype extends this work with scrollytelling and era comparisons.

  • The cleaned data contains 24,206 Hot 100 songs, with far more pre-streaming hits than streaming-native hits.
  • Streaming-native hits average about 3.20 minutes, compared with about 3.66 minutes before 2010.
  • Danceability rises across eras, while average valence (Spotify's measure of how positive a track sounds) drops, suggesting newer hits are more danceable but sound less positive.

1. The Hit Song Timeline

Average duration by first Billboard chart year.

2. Shorter Songs by Era

Duration distribution before and during streaming.

3. Era Sound Profile

Average audio features for each listening era.

4. Duration vs. Energy

Use the era buttons to highlight one part of the timeline.

5. Narrower Sound?

Average spread of core audio features by year.
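One possible way to define "average spread" is the mean of the per-feature standard deviations among the songs that first charted in a given year. The sketch below assumes that definition, a `first_year` field, and four features that share a 0 to 1 scale; the project's actual metric and feature set may differ.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Features on a shared 0-1 scale, so their spreads are comparable (assumed set).
CORE_FEATURES = ("danceability", "energy", "acousticness", "valence")


def spread_by_year(songs):
    """Return {year: mean population std. dev. across CORE_FEATURES}."""
    # Group songs by the year they first charted.
    by_year = defaultdict(list)
    for song in songs:
        by_year[song["first_year"]].append(song)

    # For each year, take the std. dev. of every feature, then average them.
    return {
        year: mean([pstdev([s[f] for s in rows]) for f in CORE_FEATURES])
        for year, rows in sorted(by_year.items())
    }
```

A lower value in recent years would be consistent with a narrower sound, although a year with very few songs yields an unreliable spread and may need to be filtered or annotated.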

6. Weird Modern Hits

Recent hits that resist the optimized pattern: longer, quieter, more acoustic, or less danceable.
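One way such outliers could be flagged is with z-scores computed among recent hits, keeping songs that are unusually far from the mean in the direction that resists the pattern. This is a hedged sketch: the field names (`title`, `duration_min`), the threshold of 2.0, and the function name `find_outliers` are assumptions for illustration, not the selection rule the project uses.

```python
from statistics import mean, pstdev

# +1: an unusually HIGH value resists the pattern; -1: an unusually LOW value does.
DIRECTIONS = {
    "duration_min": +1,   # longer
    "loudness": -1,       # quieter
    "acousticness": +1,   # more acoustic
    "danceability": -1,   # less danceable
}


def find_outliers(songs, threshold=2.0):
    """Return [(title, [features that flagged it])] for songs past the threshold."""
    # Mean and population std. dev. of each feature across the given songs.
    stats = {}
    for feature in DIRECTIONS:
        values = [s[feature] for s in songs]
        stats[feature] = (mean(values), pstdev(values))

    flagged = []
    for song in songs:
        reasons = []
        for feature, sign in DIRECTIONS.items():
            mu, sigma = stats[feature]
            # Skip features with no variation to avoid dividing by zero.
            if sigma > 0 and sign * (song[feature] - mu) / sigma >= threshold:
                reasons.append(feature)
        if reasons:
            flagged.append((song["title"], reasons))
    return flagged
```

Passing only streaming-era songs keeps the comparison within the modern baseline, so a song is flagged for being unusual among its contemporaries rather than relative to older hits.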

Prototype writeup

What have we done so far?

We have chosen a music-focused topic and built the project around Billboard Hot 100 songs with Spotify audio features. We cleaned the source data into one row per song, using each track's first Billboard chart appearance as its year. We created summary CSVs for yearly trends, era-level duration distributions, audio-feature diversity, scatterplot samples, and modern outlier songs. We also built a public webpage with D3 visualizations that show early patterns in duration, danceability, energy, acousticness, valence, and unusual modern hits.

What will be the most challenging part to design and why?

The hardest design challenge will be turning the project from a set of charts into a clear explorable explanation. The dataset can show that hit songs changed over time, but it cannot prove that streaming directly caused every change. We will need to design annotations and interactions that make the evidence feel compelling without overstating causality. We also need to help readers compare eras without losing the song-level detail that makes the topic interesting.