This project showcases machine learning applied to the problem of sound classification.
Sound classification has a wide range of practical applications across various domains. For example, classifying environmental noise such as traffic, construction sounds, or wildlife sounds helps assess its impact on urban planning, public health, and wildlife conservation. In industrial settings, it can help identify malfunctioning machinery or potential safety hazards. These are only a couple of use cases among thousands that highlight the relevance and versatility of automated sound classification.
This demonstration makes use of a Kaggle dataset containing 50 different sound classes, from which only 10 classes are selected for simplicity.
The selected dataset is used to train a convolutional neural network that classifies audio signals into one of 10 categories. Since the amount of data is limited to only 40 samples per class, this project makes use of data augmentation techniques in order to obtain a more robust model.
For demonstration purposes, the model is then put into a simple Streamlit application and deployed to the cloud to enable easy interaction with end users.
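The augmentation step can be illustrated with a couple of simple waveform transforms. Below is a minimal numpy sketch; the specific transforms and parameter values are illustrative assumptions, not necessarily the exact ones used in this project's notebooks:

```python
import numpy as np

def add_noise(wave: np.ndarray, noise_factor: float = 0.005) -> np.ndarray:
    """Inject Gaussian noise scaled by noise_factor (value is an assumption)."""
    return wave + noise_factor * np.random.randn(len(wave))

def time_shift(wave: np.ndarray, shift_max: int = 1600) -> np.ndarray:
    """Circularly shift the waveform by a random number of samples."""
    shift = np.random.randint(-shift_max, shift_max)
    return np.roll(wave, shift)

# Example: augment a dummy 1-second clip at 16 kHz
clip = np.random.randn(16000).astype(np.float32)
augmented = add_noise(time_shift(clip))
```

Each augmented copy keeps the original label, effectively multiplying the 40 samples available per class.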
Project outline:
The sound data is obtained from this Kaggle dataset.
The data contains a csv file with the following fields:
- filename: reference to the .wav file. type: object
- fold: fold number. type: int64
- category: label representing the type of sound. type: object
- src_file: file number. type: int64
The data contains 50 different sound classes, but for the purpose of this project only the following 10 are selected:
```
{
 'dog': 0,
 'chirping_birds': 1,
 'thunderstorm': 2,
 'keyboard_typing': 3,
 'car_horn': 4,
 'drinking_sipping': 5,
 'rain': 6,
 'breathing': 7,
 'coughing': 8,
 'cat': 9
}
```
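Filtering the dataset CSV down to these 10 classes can be sketched with pandas. The tiny inline DataFrame below stands in for the real CSV, and the `data/esc50.csv` path in the comment is an assumption:

```python
import pandas as pd

# Label mapping from the project
LABELS = {
    'dog': 0, 'chirping_birds': 1, 'thunderstorm': 2, 'keyboard_typing': 3,
    'car_horn': 4, 'drinking_sipping': 5, 'rain': 6, 'breathing': 7,
    'coughing': 8, 'cat': 9,
}

# Dummy rows standing in for the real CSV; in practice you would do
# something like: df = pd.read_csv("data/esc50.csv")  # path is an assumption
df = pd.DataFrame({
    "filename": ["a.wav", "b.wav", "c.wav"],
    "fold": [1, 2, 3],
    "category": ["dog", "airplane", "cat"],
    "src_file": [100, 101, 102],
})

# Keep only the 10 selected classes and attach the integer target
subset = df[df["category"].isin(LABELS.keys())].copy()
subset["target"] = subset["category"].map(LABELS)
```

Rows whose `category` is outside the selected classes (here, `airplane`) are simply dropped.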
In order to run the notebooks and scripts provided in this repository, you should download this Kaggle dataset and save it to the `data` folder in the root of this repository.
For managing the Python dependencies and virtual environments I chose conda. The provided project-dependencies.yml file contains the necessary dependencies to run the notebooks. However, note that this setup has currently only been tested on a MacBook Pro with an M2 chip. If you want to run it on a different system, the tensorflow dependencies have to be changed to match your machine's specific hardware; the rest of the dependencies can stay the same. With the correct tensorflow dependencies, the environment can be created as:
```
conda env create -f project-dependencies.yml
```
Once it is created, it can be activated with `conda activate tf-metal-2`.
In addition to the already installed dependencies, you should install `tflite_runtime`:
```
pip install --extra-index-url https://google-coral.github.io/py-repo/ tflite_runtime
```
If you want to try out the service locally, run:
```
streamlit run src/app.py
```
Also, make sure that you have Docker set up on your system, as well as the aws-cli if you want to deploy the service to the cloud.
The previously described environment configuration is used for development purposes. If all you want is to run the Streamlit app on a Linux virtual machine, these are the steps to follow:
- Create an AWS EC2 instance as instructed in this video.
- Once the connection to the instance is established, we can install the necessary packages.
- Install conda:
```
wget https://repo.anaconda.com/archive/Anaconda3-2023.09-0-Linux-x86_64.sh
```
Run the downloaded installer script, accept the terms and conditions, and accept the defaults.
- Install docker:
```
sudo apt update
sudo apt install docker.io
```
- Install docker compose:
```
wget https://github.com/docker/compose/releases/download/v2.24.0/docker-compose-linux-x86_64 -O docker-compose
```
Check the docker compose GitHub releases page to find recent versions. Place the binary in `~/soft` and make it executable (`chmod +x docker-compose`) so the PATH entry below picks it up.
- Modify the path:
```
nano .bashrc
```
and add the following line:
```
export PATH="${HOME}/soft:${PATH}"
```
Then execute `source .bashrc` and verify with `which docker-compose`. To enable executing docker without sudo:
```
sudo usermod -aG docker $USER
```
Log out and ssh into the server again, and it should work.
- Run the service

First of all, this repository should be cloned to the Linux virtual machine. Given that the previous installation was done successfully, we can build the docker image and run the container:
```
docker build -t sound-img .
docker run -d -p 8501:8501 sound-img
```
With this, the service should be accessible on port 8501. Note that you can set up port forwarding on your local machine to make the service accessible from your browser.
The main notebook contains the code to read the data, preprocess it, and train a model for sound classification. Each of the steps is explained in more detail inside the notebook.
It can be run using the conda environment created in the previous steps.
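As a rough idea of what the preprocessing can look like, here is a numpy sketch of a log-magnitude spectrogram, the kind of 2-D representation typically fed to an audio CNN. The frame length, hop size, and the use of a plain spectrogram at all are illustrative assumptions, not the notebook's actual parameters:

```python
import numpy as np

def log_spectrogram(wave: np.ndarray, frame_len: int = 400, hop: int = 160) -> np.ndarray:
    """Log-magnitude STFT: windowed frames of `frame_len` samples, hop `hop`."""
    n_frames = 1 + (len(wave) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack(
        [wave[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    mag = np.abs(np.fft.rfft(frames, axis=1))
    return np.log(mag + 1e-6)  # small offset avoids log(0)

# A dummy 1-second clip at 16 kHz -> a (frames, frequency bins) "image"
spec = log_spectrogram(np.random.randn(16000))
```

Treating the result as an image is what makes a convolutional architecture a natural fit for audio.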
To train and save the model, first activate the conda environment and then execute the training script:
```
make run-training
```
Here you can observe a short video of how the demo application works:
To run this application, first clone this repository. With Docker installed on your system, you can build and run the image to launch the web service and try the sound classification model:
```
docker build -t sound-img .
docker run -d -p 8501:8501 sound-img
```
Alternatively, if you want to run the service locally, just recreate the conda environment as:
```
conda env create -f project-dependencies.yml
```
and activate it with `conda activate tf-metal-2`. Then the Streamlit application can be run:
```
streamlit run src/app.py
```
Make sure you have an AWS account configured in your system to access AWS cloud resources programmatically. Check the official AWS documentation to learn more about this.
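Inside the app, an uploaded clip typically has to be padded or trimmed to the fixed input length the model expects before inference. A minimal sketch of that idea follows; the helper name and the 5-second / 16 kHz target length are assumptions, not the actual `src/app.py` code:

```python
import numpy as np

TARGET_LEN = 5 * 16000  # assumed fixed input length: 5 s at 16 kHz

def fix_length(wave: np.ndarray, target: int = TARGET_LEN) -> np.ndarray:
    """Zero-pad or trim the clip so it matches the model's input length."""
    if len(wave) >= target:
        return wave[:target]
    return np.pad(wave, (0, target - len(wave)))

short_clip = fix_length(np.ones(1000))    # padded up to TARGET_LEN
long_clip = fix_length(np.ones(100000))   # trimmed down to TARGET_LEN
```

Normalizing every upload to one shape keeps the TFLite interpreter's input tensor fixed regardless of what the user records.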
Initialize the project:
```
eb init -p docker --profile {your_aws_profile} -r {aws_region} star-model-serving
```
Deploy to AWS with:
```
make aws-deploy
```
It can take a few minutes until the necessary resources are created and initialized. When it is done, the console output will provide a URL where the service is available.
Finally, clean up all the resources to avoid undesired aws costs:
```
make aws-delete-env
```