This repository contains the Python Gemsearch server module, which provides:
- a music embedding mechanism that models the music ecosystem as a heterogeneous graph and creates embeddings in which proximity is computed as the cosine distance between vectors
- a flexible query mechanism to search for music using those embeddings
- an evaluation framework that measures the performance of those embeddings on playlist track prediction and classic user-track recommendation
- a REST API to expose the query mechanisms
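To make the proximity measure concrete, here is a minimal sketch of cosine distance and a nearest-neighbor lookup over a small in-memory embedding. The function names, item ids, and vectors are hypothetical and chosen for illustration only; they are not taken from the gemsearch package, which may compute and store embeddings differently.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


def nearest(query, embeddings, k=3):
    """Return the k item ids whose vectors are closest to `query`."""
    ranked = sorted(
        embeddings,
        key=lambda item_id: cosine_distance(query, embeddings[item_id]),
    )
    return ranked[:k]


# Hypothetical toy embedding: ids and values are made up for this example.
toy_embedding = {
    "track:a": [1.0, 0.0],
    "track:b": [0.9, 0.1],
    "artist:c": [0.0, 1.0],
}

closest = nearest([1.0, 0.0], toy_embedding, k=2)
```

Because tracks, artists, and other node types of a heterogeneous graph share one vector space, the same lookup can return items of mixed types.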
The code is optimized for Python 3. Install the Python dependencies by running:
pip3 install -r requirements.txt
Copy .env.example to a new environment file named .env and adapt the following runtime parameters:
- SLACK_API_TOKEN=token
  A valid Slack API token used to send notifications about long-running tasks.
- USE_WINDOWS_BASH=False
  Set to true if you are running this package on a Windows computer. Linux executables (e.g. node2vec) are then executed with the Windows bash subsystem.
- SPOTIFY_CLIENT_ID=id
  A valid Spotify API client id to access the Spotify API.
- SPOTIFY_CLIENT_SECRET=secret
  A valid Spotify API secret to access the Spotify API.
- LASTFM_API_KEY=key
  A valid Last.fm API key.
- GEMSEARCH_API_KEY=secret
  A random API key to authorize special API routes (primarily /api/reload_embedding, which reloads the current embedding). The same token has to be used by the Embedding service to notify about a new embedding.
- GEMSEARCH_API_URL=http://localhost:8080/api
  The URL of the running Gemsearch API (used for the Embedding service).
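As an illustration of how such a file is typically consumed, the following sketch parses KEY=VALUE lines with the standard library only. It is an assumption that the file follows this simple format; the package itself may rely on a dedicated loader, and the helpers `parse_env` and `as_bool` are hypothetical names introduced for this example.

```python
def parse_env(text):
    """Parse KEY=VALUE lines into a dict, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def as_bool(value):
    """Interpret common truthy spellings; everything else counts as False."""
    return value.strip().lower() in ("1", "true", "yes")


# Placeholder values taken from the parameter descriptions above.
sample = """
# runtime parameters
USE_WINDOWS_BASH=False
GEMSEARCH_API_URL=http://localhost:8080/api
"""

config = parse_env(sample)
use_windows_bash = as_bool(config["USE_WINDOWS_BASH"])
```

In a real setup the text would be read from the .env file rather than from an inline string.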
To run the playlist evaluation and auto-suggest service, a running Elasticsearch database is required.
See /gemsearch/runners/ for possible script entry points.
For further explanations, see the README files within the individual folders.
- data: Stores intermediate datasets and computed embeddings
- deepwalk: A fork of the Deepwalk algorithm (https://github.com/phanein/deepwalk/) with adaptations for Python 3 and for extending existing embeddings on the fly
- gem: A fork of the GEM package (https://github.com/palash1992/GEM) with adaptations for Python 3 and bug fixes
- gemsearch: The actual Gemsearch implementation as a Python package (see the Gemsearch package section)
- mongo-queries: MongoDB queries, mostly for statistical analysis
- reports: Log files of performed experiments and scripts to process those log files
Gemsearch package
- api: REST API to access the query service
- core: Core classes to load and generate data
- crawler: Different crawlers for Last.fm and Spotify
- embedding: Classes to create, load and compute results on embeddings
- evaluation: Evaluation methods
- graph: Graph abstraction classes
- query: Methods to construct vector queries from unstructured text
- runners: Executable program entry points for specific tasks
- services: Service runners for the API server
- storage: Data wrapper classes to access MongoDB data and an import script for CSV playlist data
- utils: Utility functions
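The idea of constructing a vector query from unstructured text can be sketched as follows. This is not the implementation in the query folder, which is not described here; it is one simple approach, assumed for illustration, that matches known names in the text and averages their vectors. The function `build_query_vector` and the toy vectors are hypothetical.

```python
def build_query_vector(text, name_to_vector):
    """Average the vectors of all known names that occur in `text`.

    Returns None when no known name is found.
    """
    lowered = text.lower()
    matched = [vec for name, vec in name_to_vector.items() if name in lowered]
    if not matched:
        return None
    dims = len(matched[0])
    return [sum(vec[i] for vec in matched) / len(matched) for i in range(dims)]


# Hypothetical lookup table: names and vectors are made up for this example.
toy_vectors = {
    "rock": [1.0, 0.0],
    "berlin": [0.0, 1.0],
    "jazz": [-1.0, 0.0],
}

query_vector = build_query_vector("Rock music from Berlin", toy_vectors)
```

The resulting vector can then be compared against item embeddings with a distance measure such as cosine distance.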