WinsonTruong/police
An Investigation of New York Police Department's Practice of Stop, Question, and Frisk

The protests of June 2020 demonstrated that, all across America, people were outraged by the systemic discrimination and violence that Black Americans have experienced for centuries. Of the countless injustices now in mainstream focus thanks to the Black Lives Matter movement, this project focuses on the racial profiling of Black people by the police.

This picture says it all:

Image 1a

Notion Page

This dedicated website explains the background knowledge and walks you through all of my modeling decisions.

Questions

  1. What are the past behaviors and trends of officers who racially profile Black people?
  2. Are officers actually frisking individuals based on reasonable suspicion, or are they racially profiling?
  3. Is SQF fair? Do SQF and policing at large have to be fair to begin with?

Data

The primary data we will be using is made publicly available by the New York Police Department. It contains every recorded stop, question, and frisk encounter in 2019. While data starting from 2003 is available, it does not follow the same layout year-by-year and often contains missing data. The cleaning process can be found in the 'notebooks' directory, in the 'clean_and_impute.ipynb' notebook.

A Glimpse at the Missing Data

Image 2

In order to account for the missing values, we impute via Multiple Imputation by Chained Equations (MICE).

Why impute this way? A single-imputation procedure such as replacing missing values with the sample mean would force me to draw an arbitrary decision boundary for my binary variables, and with no prior knowledge, mean imputation could accentuate bias. To preserve the relationships between the variables, I believe a Bayesian-style method that leverages the other well-defined feature sets is best.
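The chained-equations idea can be sketched with scikit-learn's `IterativeImputer` (an experimental API the library describes as MICE-inspired); the toy array below is illustrative and not the actual SQF schema:

```python
# Sketch of MICE-style imputation with scikit-learn's IterativeImputer.
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

# Toy data: two roughly proportional columns with missing entries.
X = np.array([
    [1.0, 2.0],
    [2.0, np.nan],
    [3.0, 6.0],
    [np.nan, 8.0],
])

# Each column with missing values is modeled as a function of the
# other columns, cycling through them until the estimates stabilize,
# so the imputed values respect the relationships between variables.
imputer = IterativeImputer(max_iter=10, random_state=0)
X_filled = imputer.fit_transform(X)
print(X_filled.shape)  # (4, 2), with no NaNs remaining
```

For multiple imputation proper, the same imputer can be rerun with `sample_posterior=True` and different random seeds to produce several completed datasets.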

EDA

Using the power of Tableau and Python, here are some visualizations of interest.

Police behavior seems to change depending on when their shift is:

Image 3

The heatmap below shows that most stop-and-frisks are conducted by daytime patrol officers.

Image 4

Models and Feature Selection

I will use 3 different feature sets, with increasing complexity, for each of the 4 models.

| Feature Sets | Models                         |
| ------------ | ------------------------------ |
| Race         | Logistic Regression (Baseline) |
| Appearance   | Logistic Regression + LASSO    |
| Context      | SVM with Linear Kernel         |
|              | SVM with Gaussian Kernel       |

| Comparison Statistics | Outputs          |
| --------------------- | ---------------- |
| F1 Score              | Comparison Table |
| False Discovery Rate  | ROC Curves       |

Why the LASSO?

The L1 penalty can shrink potentially insignificant features like hair color to zero, which could lead to better generalizability.
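A minimal sketch of that idea with scikit-learn, on synthetic data (the features here stand in for dummy-coded race/appearance variables and are not the real SQF columns):

```python
# L1-penalized ("LASSO") logistic regression on toy data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))  # 5 candidate features; only the first matters
y = (X[:, 0] + 0.1 * rng.normal(size=200) > 0).astype(int)

# penalty="l1" drives weak coefficients to exactly zero, effectively
# erasing uninformative features from the model.
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
clf.fit(X, y)
print((clf.coef_ == 0).sum(), "of 5 coefficients shrunk to zero")
```

The strength of the penalty is controlled by `C` (smaller `C` means a harsher penalty and sparser coefficients).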

Why SVM with Differing Kernels?

  • the data might not be linearly separable, in which case a kernel SVM can still find a boundary while handling the high dimensionality
  • there is a decently large sample size
  • I want to understand the tradeoff between training speed/burn-in time and accuracy
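The first point can be seen on a toy problem: concentric circles are hopeless for a linear kernel but easy for a Gaussian (RBF) one. The dataset below is illustrative, not the SQF data:

```python
# Linear vs. Gaussian (RBF) kernel on a non-linearly separable toy set.
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two concentric rings: no straight line separates the classes.
X, y = make_circles(n_samples=500, noise=0.1, factor=0.3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

scores = {}
for kernel in ("linear", "rbf"):
    clf = SVC(kernel=kernel).fit(X_tr, y_tr)
    scores[kernel] = clf.score(X_te, y_te)
    print(kernel, round(scores[kernel], 2))
```

On this data the linear kernel hovers near chance while the RBF kernel is near-perfect, though the RBF model is typically slower to train and tune.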

Why the F1 Score?

I'm interested in how sure officers are that the suspect has contraband/weapons (precision), but also in how many suspects they are frisking at large (recall), because I'm interested in the racial profiling of Black people. Taking the harmonic mean of these two metrics seems quite fitting.
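Concretely, F1 is the harmonic mean of precision and recall; a small sketch with made-up labels (1 = frisk found contraband) checks the identity:

```python
# F1 as the harmonic mean of precision and recall, on toy labels.
from sklearn.metrics import f1_score, precision_score, recall_score

y_true = [1, 1, 1, 0, 0, 1, 0, 1]  # hypothetical ground truth
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]  # hypothetical model predictions

p = precision_score(y_true, y_pred)   # of predicted positives, share correct
r = recall_score(y_true, y_pred)      # of actual positives, share found
f1 = f1_score(y_true, y_pred)

# F1 = 2pr / (p + r): the harmonic mean punishes imbalance between the two.
assert abs(f1 - 2 * p * r / (p + r)) < 1e-9
print(round(p, 2), round(r, 2), round(f1, 2))  # 0.8 0.8 0.8
```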

Why the False Discovery Rate?

I'm also interested in when my model is incorrect, because an error in the context of policing could amount to racial profiling by the police. While FDR control is not the point of this project, it is a very interesting idea.

Results

Image 6

Some Observations

  1. The F1 Score hovers around 0.50 to 0.58. While this isn't an ideal score, we can see that adding 'Appearance' features helps by about 0.05 and 'Context' features by about 0.01. While my model doesn't account for all latent variables, this suggests officers are classifying primarily based on race.
  2. In terms of the F1 Score, the Logistic LASSO outperforms the other 3 models on the 'Race' and 'Context' feature sets.
  3. Across all our models and feature sets, the FDR ranges from approximately 0.36 to 0.43. That means our models predict that officers are frisking individuals on unreasonable suspicion about 40% of the time. Read more about the future studies here!
