
Kara Newhouse Data Journalism Portfolio

Overview

I am committed to journalism that holds public officials accountable and shines a light on systemic inequities. This repository includes examples of my work using data to discover, report and tell those kinds of stories.

Jail-to-Deportation Pipeline: Analyzing ICE data in Florida

I analyzed ICE data from the Deportation Data Project to uncover how the jail-to-deportation pipeline in Florida has become one of the main engines of President Trump's mass deportation push. I wrote code in R to find patterns in immigration detainers, arrests and deportations over the past two years. My findings revealed the scope of ICE's practice of taking people from local jails, and how that practice has accelerated since Trump's inauguration in January.
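The analysis itself was written in R; as a rough illustration of the before/after comparison described above, here is a minimal Python sketch using invented sample records (the field names and values are hypothetical, not the Deportation Data Project's actual schema):

```python
from collections import Counter
from datetime import date

# Invented sample of arrest records; real field names and values differ.
records = [
    {"arrest_date": date(2024, 11, 3), "method": "CAP Local Incarceration"},
    {"arrest_date": date(2025, 2, 14), "method": "CAP Local Incarceration"},
    {"arrest_date": date(2025, 3, 2), "method": "Non-Custodial Arrest"},
    {"arrest_date": date(2025, 3, 9), "method": "CAP Local Incarceration"},
]

INAUGURATION = date(2025, 1, 20)

# Keep only jail-based arrests, then count them before vs. after the inauguration.
jail = [r for r in records if "CAP Local" in r["method"]]
by_period = Counter(
    "after" if r["arrest_date"] >= INAUGURATION else "before" for r in jail
)
print(by_period["before"], by_period["after"])  # → 1 2
```

The real work involved two years of records and many more variables, but the core move is the same: filter to jail-based apprehensions and compare volumes across the two periods.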

For this story I also obtained local data on ICE holds through public records requests, which my reporting partner and I used to find affected local families. Through court dockets and interviews, we discovered that people who are transferred to ICE upon paying a bond for their release can end up with fines and a warrant when they fail to appear at their court dates.

⛹️‍♀️ Unlevel Playing Fields: Flaws in federal Title IX data

Through public records requests, data analysis and traditional reporting, I found that federal data often overstates high school girls' participation in sports, making it impossible to tell whether schools comply with Title IX unless someone complains.

This story ran on the front page of The Baltimore Sun. It was part of “Unlevel Playing Fields,” a series that examined the continued obstacles to gender equity in high school sports as Title IX turned 50. I also produced data visualizations for the story. The underlying code for my data analysis and graphics can be seen in the fact check notebook that accompanied the story.

📰 Printing Hate: Newspaper lineage web scraper

For "Printing Hate," a series and news app examining white-owned newspapers' role in promoting racial terror lynchings, a colleague and I built a web scraper that uses the Library of Congress' U.S. Newspaper Directory to access lineage information for more than 150,000 newspapers. We created a "newspaper family tree," which we connected to historians' datasets of lynching victims. We wrote the code for loc_scraper_kara.ipynb in Python using Selenium and exported the resulting files for further manipulation and analysis in R. This scraper laid the foundation for our analysis of historical newspaper coverage.

In addition to working on the archival database, I conducted a gender analysis of the lynching dataset in R for an article on how lynchings of Black women were covered.

⚖️ Courts Slackbot

I created Courts Slackbot to solve a problem I experienced as a local education reporter in Pennsylvania: keeping track of lawsuits against the school districts I covered. The bot uses a Python script to parse entries from a list of RSS feeds for court dockets and search for the phrase "school district" in entry titles. When that phrase appears, it sends a Slack message containing the case title, a link to its docket sheet in PACER and a description of the latest activity.
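The core loop can be sketched roughly as follows, using only the standard library (the webhook URL is a placeholder, and the real bot's feed handling and message formatting differ in the details):

```python
import json
import urllib.request
import xml.etree.ElementTree as ET

KEYWORD = "school district"
SLACK_WEBHOOK = "https://hooks.slack.com/services/T000/B000/XXXX"  # placeholder

def matching_entries(rss_xml: str):
    """Yield (title, link, description) for RSS items whose title
    contains the keyword, case-insensitively."""
    root = ET.fromstring(rss_xml)
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        if KEYWORD in title.lower():
            yield (
                title,
                item.findtext("link", default=""),
                item.findtext("description", default=""),
            )

def notify(entry):
    """Post one matching entry to a Slack incoming webhook."""
    title, link, desc = entry
    payload = {"text": f"New docket activity: {title}\n{link}\n{desc}"}
    req = urllib.request.Request(
        SLACK_WEBHOOK,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)  # fires the webhook
```

In practice the script runs on a schedule, fetches each feed in its list, and tracks which entries it has already posted so it only alerts on new activity.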

🦺 Essential and Exposed: Analysis of workplace COVID-19 outbreaks

Analysis of state health department records led to key findings for "Essential and Exposed," a Howard Center for Investigative Journalism project on worker safety during the first year of the COVID-19 pandemic. We found that essential workers at places like Walmart paid the price for splintered government oversight. The 13 data findings that shaped the final story, along with their underlying code, can be seen in our line-by-line fact check.

From records acquisition to analysis to many late-night rounds of writing, revising and fact checking, I was deeply involved in every aspect of the data work on this project. My responsibilities included:

  • Coordinating a 50-person public records campaign to obtain data on workplace COVID-19 outbreaks from state health departments.
  • Co-leading a team of seven reporters in cleaning, wrangling and analyzing data.
  • Writing code to look for patterns and outliers within states and identify outbreaks at Walmart stores and distribution centers across states.
  • As the most senior reporter on the team, assisting team members in asking journalistic questions about the data, problem-solving in R and transforming data findings into language that the reporting team and readers could digest.
  • Applying news judgment and journalism ethics to decisions about data from states whose definitions of "workplace outbreak" differed across multiple variables; some states did not use the term "outbreak" at all.
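The cross-state Walmart matching described above was done in R; here is a minimal Python sketch of the same idea, with invented sample rows (real state records varied far more in format and required more careful normalization):

```python
import re
from collections import defaultdict

# Invented sample rows; the real records came from state health departments.
rows = [
    {"state": "CO", "employer": "Wal-Mart Supercenter #1234", "cases": 12},
    {"state": "NC", "employer": "WALMART DISTRIBUTION CENTER", "cases": 30},
    {"state": "CO", "employer": "Target Store T-100", "cases": 5},
]

def is_walmart(name: str) -> bool:
    # Match "Wal-Mart", "WALMART", "walmart", etc. despite inconsistent styling.
    return bool(re.search(r"wal-?mart", name, re.IGNORECASE))

by_state = defaultdict(int)
for row in rows:
    if is_walmart(row["employer"]):
        by_state[row["state"]] += row["cases"]

print(dict(by_state))  # → {'CO': 12, 'NC': 30}
```

Normalizing employer names this way is what makes it possible to identify one company's outbreaks across datasets that never used a shared identifier.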
The Associated Press ran the story, and it was picked up by newspapers across the country, including The Washington Post.

📊 Datawrapper & Flourish charts

🏫 Data-driven education reporting

As an education beat reporter in Lancaster, Pennsylvania, I used data to connect public policy to people’s lives.

For example, as frustration over standardized testing surged nationwide, I obtained data from the state Department of Education about families opting children out of mandated assessments and reported on a statewide spike in this practice. I found that Lancaster County was a hotspot for opt-outs — a discovery that complemented my prior reporting on local parents organizing around this issue. After my story was published, other education reporters across Pennsylvania sought the same data I had obtained and localized the topic to their regions.

For that story and others on the education beat, I conducted my analyses using self-taught methods in Google Sheets. I also experimented with building interactive maps in Google Fusion Tables and regularly created my own graphics using free web tools. I bring my love of learning and willingness to try new techniques to every setting.

🖋️ About Me

I am an award-winning data journalist with 15 years of experience across newspapers, public radio and online outlets. I am currently the deputy data editor at the Howard Center for Investigative Journalism at the University of Maryland. Earlier, as a local reporter in Pennsylvania, a reporting partner and I uncovered financial mismanagement and persistent Sunshine Act violations in Lancaster County's wealthiest school district. Our reporting prompted hundreds of taxpayers to attend public meetings and demand changes to school district operations, including a more transparent superintendent search and the restoration of elementary art and music classes.

I have a master’s degree in journalism, with an emphasis on investigative and data journalism, from the Philip Merrill College of Journalism at the University of Maryland. My undergraduate degree is from American University, where I studied anthropology.
