This project analyzes how music performs on Spotify. What is popular in the music industry changes constantly, and Spotify's data tracks those changes well. The goal is to analyze top-performing songs and albums to determine what makes certain music successful. This also lets artists and listeners compare one project against others to help explain why it did or did not succeed.
During the ingestion stage of this project, there are two data sources that are used: one for albums, and one for singles.
The singles data comes from kworb.net, which publishes the top 200 singles globally. The site refreshes every week with the most popular songs and displays information such as how many weeks each single has been on the chart, the streams accrued that week, and total streams.
To collect the data for the top singles, web scraping is used. By reading the values in the table on the website, the artist name, song title, streams from the week, and total streams are collected. These values are then added to a CSV file.
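The scraping step can be sketched with Python's standard library alone. This is a minimal sketch, not the project's actual scraper: it assumes the page's HTML has already been downloaded into a string, and the column positions (`title_col`, `weekly_col`, `total_col`) are hypothetical parameters that would have to be matched to the real table layout on kworb.net.

```python
import csv
import io
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collects the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # header rows made of <th> cells stay empty
                self.rows.append(self._row)
            self._row = None


def parse_chart(html, title_col=0, weekly_col=1, total_col=2):
    """Turn table rows into records; the column indices are assumptions."""
    parser = TableRowParser()
    parser.feed(html)
    records = []
    for cells in parser.rows:
        if len(cells) <= max(title_col, weekly_col, total_col):
            continue
        # Entries are expected to look like "Artist - Title".
        artist, sep, title = cells[title_col].partition(" - ")
        if not sep:
            continue
        try:
            weekly = int(cells[weekly_col].replace(",", ""))
            total = int(cells[total_col].replace(",", ""))
        except ValueError:
            continue
        records.append({"artist": artist, "title": title,
                        "weekly_streams": weekly, "total_streams": total})
    return records


def write_csv(records, out):
    """Write the records to any text file object, header first."""
    fields = ["artist", "title", "weekly_streams", "total_streams"]
    writer = csv.DictWriter(out, fieldnames=fields)
    writer.writeheader()
    writer.writerows(records)
```

Passing an open file (for example `open("singles.csv", "w", newline="")`) as `out` would produce the CSV file; an `io.StringIO()` works the same way for testing.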
To collect the data for popular albums, the process is very similar. Using Spotify's API, calls can be made that determine what album a song is on. Since kworb.net already contains the most popular songs, finding the albums these songs are on (if applicable) will yield good results.
The process starts by collecting all the songs, exactly as for the top singles. Next, API calls are made to determine which album each collected song belongs to. Once the albums are known, every song on each album (the tracklist) is collected so that the album can be analyzed as a whole. The length of each tracklist is also recorded. All of this information is saved in a CSV file.
Data cleaning is minimal when retrieving data from kworb.net. As the data is read, any entry without a valid artist name or song title (i.e., an entry labeled "Unknown Artist - Unknown Song") is discarded and not processed further.
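A minimal sketch of this filter is shown below. The function names are hypothetical, and treating an entry as invalid when either half is a placeholder or empty is an assumption about how strict the check should be.

```python
PLACEHOLDER_ARTIST = "Unknown Artist"
PLACEHOLDER_TITLE = "Unknown Song"


def is_valid_entry(artist, title):
    """Return False for empty or placeholder artist/title values."""
    artist, title = artist.strip(), title.strip()
    if not artist or not title:
        return False
    # Assumption: one placeholder half is enough to discard the entry.
    return artist != PLACEHOLDER_ARTIST and title != PLACEHOLDER_TITLE


def keep_valid(entries):
    """Filter a list of (artist, title) pairs down to the valid ones."""
    return [(a, t) for a, t in entries if is_valid_entry(a, t)]
```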
When transforming an individual song into an album, a song that counts as a single and does not appear on an album is discarded. Likewise, if the JSON object is malformed or null at any point in the song-to-album transformation, the entry is discarded.
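The song-to-album step and its discard rules could look roughly like the sketch below. It assumes each song has already been fetched as a Spotify track object (a JSON dictionary whose `album` field carries `album_type`, `id`, `name`, and `total_tracks`). The helper names are hypothetical, and discarding every `album_type` other than `"album"` (compilations included) is an assumption.

```python
def album_from_track(track):
    """Return a small album record, or None if the track should be discarded."""
    if not isinstance(track, dict):
        return None  # null or malformed track JSON
    album = track.get("album")
    if not isinstance(album, dict):
        return None  # missing or malformed album object
    if album.get("album_type") != "album":
        return None  # singles (and, by assumption, compilations) are skipped
    album_id, name = album.get("id"), album.get("name")
    if not album_id or not name:
        return None
    return {"id": album_id, "name": name,
            "total_tracks": album.get("total_tracks")}


def collect_albums(tracks):
    """Map tracks to albums, keeping each album once in first-seen order."""
    albums = {}
    for track in tracks:
        album = album_from_track(track)
        if album is not None:
            albums.setdefault(album["id"], album)
    return list(albums.values())
```

Keying on the album `id` means several charting songs from the same album yield a single album entry.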
Albums and songs will be compared by analyzing values retrieved from Spotify's API. Spotify provides metrics called audio features that describe the character of each song. These include danceability, loudness, the key the song was written in, the tempo, the time signature, and more. Most of these values are represented as floating-point numbers; a few, such as key and time signature, are integers.
For both singles and albums, API calls would be made on every song to collect these values. For a single, the raw values will be used, and singles will be compared to singles. For an album, an average of all the songs' audio features will be collected instead, and albums will only be compared to albums.
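The album-level averaging is simple arithmetic. A minimal sketch, assuming each song's audio features have already been fetched into a dictionary of numbers (the function name and the feature names in the usage note are illustrative):

```python
from statistics import fmean


def average_features(track_features):
    """Average each feature across an album's tracks.

    Only features present on every track are averaged, so a track with a
    missing value does not silently skew the result.
    """
    if not track_features:
        return {}
    shared = set(track_features[0])
    for features in track_features[1:]:
        shared &= set(features)
    return {name: fmean(f[name] for f in track_features)
            for name in sorted(shared)}
```

For example, two tracks with `danceability` 0.5 and 1.0 and `tempo` 100.0 and 120.0 average to `danceability` 0.75 and `tempo` 110.0.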
UPDATE: Because Spotify's API deprecated the endpoints that provided the data described above, it is no longer possible to retrieve audio features for songs. While this removed half of the data initially intended to be used, extra functionality was added to the application to compensate.
To compare songs and albums, a user can provide a song/album title and the artist name. If the JSON can be found and is formatted correctly, the song/album will be returned. The user's input will be put in a CSV file to compare their music of choice to a collected average of the popular songs/albums from kworb.net.
The data extracted for comparing songs will be the popularity value (calculated by Spotify), the season in which the song was released (summer, fall, winter, or spring), whether the song is explicit, and whether the song is a collaboration with one or more other artists.
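As a sketch of how these song fields might be derived from a Spotify track object (which carries `popularity`, `explicit`, `artists`, and `album.release_date`): the function names are hypothetical, and the season boundaries are an assumption (Northern Hemisphere meteorological seasons, with December through February counted as winter).

```python
def season_of(release_date):
    """Map a 'YYYY-MM-DD' or 'YYYY-MM' date string to a season name.

    Returns None when the month is unavailable (year-only dates) or invalid.
    """
    parts = release_date.split("-")
    if len(parts) < 2:
        return None
    try:
        month = int(parts[1])
    except ValueError:
        return None
    if not 1 <= month <= 12:
        return None
    # Assumed boundaries: Dec-Feb winter, Mar-May spring,
    # Jun-Aug summer, Sep-Nov fall.
    return ("winter", "spring", "summer", "fall")[(month % 12) // 3]


def song_row(track):
    """Extract the comparison fields for a single song."""
    return {
        "popularity": track["popularity"],
        "season": season_of(track["album"]["release_date"]),
        "explicit": track["explicit"],
        # Treated as a collaboration when more than one artist is credited.
        "collab": len(track["artists"]) > 1,
    }
```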
The data extracted for comparing albums will be the popularity value, the season in which the album was released, whether the album contains any explicit songs, the tracklist length, whether the album is a collaboration between artists, and whether any of its songs have featured artists.
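The album-specific fields could be computed from a Spotify album object roughly as follows. This sketch assumes the album JSON includes `artists`, `total_tracks`, and a `tracks.items` list whose entries carry `explicit` and `artists`. The function name is hypothetical, and so are the definitions used here: a collaboration is an album credited to more than one artist, and a feature is any track artist who is not one of the album's credited artists.

```python
def album_flags(album):
    """Derive tracklist length and explicit/collab/feature flags for an album."""
    album_artist_ids = {artist["id"] for artist in album["artists"]}
    tracks = album["tracks"]["items"]
    return {
        # Prefer the reported total; the embedded track list may be paginated.
        "tracklist_length": album.get("total_tracks", len(tracks)),
        "explicit": any(track["explicit"] for track in tracks),
        "collab": len(album_artist_ids) > 1,
        "features": any(artist["id"] not in album_artist_ids
                        for track in tracks
                        for artist in track["artists"]),
    }
```

Because the embedded track list may not contain every track on long albums, the `explicit` and `features` flags are only as complete as the tracks actually present in `tracks.items`.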