This repository contains the proposed solutions to the laboratory exercises of the Data Visualization Lab course, academic year 2024-2025.
- Data manipulation
- Pandas and Numpy
- Inkscape
- Matplotlib
- Seaborn
- Ggplot
- Dimensionality reduction
- Geospatial mapping
- Bokeh

Here is a more detailed list of the topics covered in each lab:
- Lab 1: Cleaning of data using pandas
  - melt
  - pivot
- Lab 2: Data exploration/modification using pandas
  - lotr
  - concat
  - combining datasets with join and its variants
- Lab 3: Inkscape
- Lab 4: Plotting with matplotlib
  - MATLAB-like interface (state-based)
  - object-oriented interface
  - figure and subplots
  - axes vs axis
  - format string for data points and lines
  - shared x/y axis
  - figure size, aspect ratio, DPI and saving
  - labels and legends
  - formatting: LaTeX, rcParams, plot styles
- Lab 5: Plotting with matplotlib #2
  - colors, line widths, line types (similar to format string)
  - line and marker styles
  - axis appearance: range, ticks, tick labels
  - custom legend
  - texts, arrows and annotations
  - subplots layout: subplot2grid, GridSpec
  - histograms, boxplots, bar plots, error bars and continuous errors
- Lab 6: Plotting with pandas & seaborn
  - indexing and selection (column and row)
  - condition-based filtering
  - plotting using 'kind' and transpose method
  - area chart, histograms, bar charts, pie charts, boxplots, scatter plots
  - 'standard' plots, density plots, joint plots (with hue variable), pair plots
- Lab 7: Plotting with seaborn and further case studies
  - 'standard' plots, strip/swarm plot, catplot, linear regression, heatmaps, cluster maps
  - subplots with seaborn
  - faceted plots (show specific subgroups of the whole data set)
  - grid layouts: pairplot (auto), PairGrid (map plot functions to the empty grids)
  - case studies: encircling, diverging bars (and texts), diverging dot plot, Cleveland dot plot, ridgeline plot, waffle chart, treemap, cluster plot, Andrews plot, parallel coordinates
- Lab 8: Plotting with ggplot (plotnine)
  - data, aes(thetics), geometry (geom_* or stat_*), axes and themes
  - 'standard' plots, aes color
  - trend lines: geom_smooth and possible methods
  - layering
  - axis limits: xlim/ylim (deletion of points) vs. coord_cartesian (zoom region)
- Lab 9: Plotting with ggplot (plotnine) #2
  - scaling and scaling color
  - 'standard' plots
  - statistical transformations
  - position arguments (e.g. stacked/dodged bars)
  - coord_flip for horizontal drawing
  - themes (axis, ticks, legend, background), labels and element_*()
  - saving plots
  - distributions plotting (density, reference lines)
  - faceting
- Lab 10a: Dimensionality reduction
  - PCA
  - t-SNE
  - UMAP
- Lab 10b: Geospatial mapping
  - Geopandas
  - Geoplot
  - Plotly
  - Folium
- Lab 11: Bokeh: quick summary

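
As a quick taste of the Lab 1 topics (melt and pivot), here is a minimal sketch of a wide-to-long-to-wide round trip with pandas. It is not taken from the lab notebooks: the table, the column names (`city`, `year`, `temperature`) and the values are made up for illustration.

```python
import pandas as pd

# Hypothetical wide-format table (illustrative data, not from the labs):
# one row per city, one column per year.
wide = pd.DataFrame(
    {
        "city": ["Trento", "Bolzano"],
        "2023": [10.5, 9.8],
        "2024": [11.1, 10.2],
    }
)

# melt: wide -> long, producing one row per (city, year) observation.
long = wide.melt(id_vars="city", var_name="year", value_name="temperature")

# pivot: long -> wide again, restoring one column per year.
# Note that pivot sorts the index, so the row order may differ from `wide`.
restored = (
    long.pivot(index="city", columns="year", values="temperature")
    .reset_index()
    .rename_axis(None, axis=1)  # drop the leftover "year" columns name
)

print(long)
print(restored)
```

The long format is usually the more convenient input for seaborn and plotnine, while the wide format tends to be easier to read as a table.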
Each notebook includes exercises to test understanding of the topics covered.
For Inkscape (Lab 3), since there is no dedicated notebook, a folder named inkscape contains the original plots shown during the lecture.
In particular, the output-svg folder holds the first version (before_lab03.png) and the final version (lab03.png) generated for the assignment.
The folder Practice_exam contains a sample exam, provided as a notebook file; all instructions are included in the notebook.
The folder Exam contains the exam of 12/06, also provided as a notebook file with all instructions included.
Because the notebooks depend on many modules, a requirements.txt file is provided so that all of them can be installed at once. In your terminal, run:
pip install -r requirements.txt

Dorijan Di Zepp dorijan.dizepp@studenti.unitn.it