This code was created during a university project at the University of Mannheim. The purpose of this crawler is to crawl the Billboard TOP100 charts. You can specify the timeframe and the number of snapshots the crawler should take. The default interval between snapshots is 7 days. The data is written to a Firebase database. This makes it relatively easy to build on the data and add more attributes to each song or artist. Currently, the Spotify Web API is used to add metadata as well as song attributes to each entry.
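To illustrate the snapshot schedule described above, here is a minimal sketch of how a list of snapshot dates with a 7-day default interval could be generated. The function name `snapshotDates` and its parameters are hypothetical and not taken from the crawler's source; the crawler itself may compute its dates differently.

```javascript
// Hypothetical helper, for illustration only (not part of crawler.js).
// Returns the snapshot dates (YYYY-MM-DD) from `start` to `end`, inclusive,
// spaced `stepDays` apart. Dates are handled in UTC to avoid DST shifts.
function snapshotDates(start, end, stepDays = 7) {
  if (!Number.isInteger(stepDays) || stepDays <= 0) {
    throw new RangeError("stepDays must be a positive integer");
  }
  const dates = [];
  const current = new Date(`${start}T00:00:00Z`);
  const last = new Date(`${end}T00:00:00Z`);
  while (current <= last) {
    dates.push(current.toISOString().slice(0, 10));
    current.setUTCDate(current.getUTCDate() + stepDays);
  }
  return dates;
}

console.log(snapshotDates("2019-01-05", "2019-01-26"));
// [ '2019-01-05', '2019-01-12', '2019-01-19', '2019-01-26' ]
```

The number of snapshots follows from the timeframe and the interval, so a shorter interval over the same timeframe yields more snapshots.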
- You need to create a config.json on the top level with the following structure:
{
"serviceAccount" : "/Users/YourUser/FirebaseAdminSDK.json",
"databaseURL" : "https://YourFirebaseProjectID.firebaseio.com",
"spotifyClientID" : "SeeSpotifyDocumentation",
"spotifyClientSecret" : "SeeSpotifyDocumentation",
"spotifyRedirectUri": "SeeSpotifyDocumentation",
"spotifyAccessToken" : "SeeSpotifyDocumentation"
}
- Fill in your IDs and tokens above. For Spotify, please refer to: https://developer.spotify.com/documentation/general/guides/authorization-guide/
- Run npm install inside /crawler
- To run the code, type
node crawler.js
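
Before running the crawler, it can help to check that config.json contains every key listed in the structure above. The following is a minimal sketch of such a check. The functions `missingConfigKeys` and `loadConfig` are hypothetical helpers, not part of the crawler, and the path to config.json is an assumption you may need to adjust.

```javascript
const fs = require("fs");

// Keys taken from the config.json structure shown in this README.
const REQUIRED_KEYS = [
  "serviceAccount",
  "databaseURL",
  "spotifyClientID",
  "spotifyClientSecret",
  "spotifyRedirectUri",
  "spotifyAccessToken",
];

// Hypothetical helper: returns the required keys that are absent or empty.
function missingConfigKeys(config) {
  return REQUIRED_KEYS.filter(
    (key) => typeof config[key] !== "string" || config[key].trim() === ""
  );
}

// Hypothetical helper: reads and validates the config file at `configPath`.
function loadConfig(configPath) {
  const config = JSON.parse(fs.readFileSync(configPath, "utf8"));
  const missing = missingConfigKeys(config);
  if (missing.length > 0) {
    throw new Error(`config.json is missing: ${missing.join(", ")}`);
  }
  return config;
}

// Usage (assumes config.json sits in the current working directory):
// const config = loadConfig("./config.json");
```

Failing early with the list of missing keys is usually easier to debug than an authentication error from Firebase or Spotify later in the run.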