What is the Precog Project?

We first performed exploratory data analysis on the datasets we gathered from The Washington Post and the US Census. This let us clean the data and then calculate the probability of homicide based on gender, age, race, and location, as well as the likelihood of someone being affected by a police shooting based on the same factors. To do this, we wrote a probability function that finds the conditional probability of an incident given a set of attributes. All of this data science work was done in Python. The logic was then ported to a JavaScript/Svelte front end that presents the data and lets visitors see where they fall, given their demographics and our calculated probabilities.
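As a rough illustration of the conditional probability step, the sketch below estimates P(incident | attributes) as the number of recorded incidents matching a set of attributes divided by the population matching the same attributes. This is not the project's actual code: the function name `conditional_probability`, the field names (`gender`, `city`, `count`), and the toy records are all assumptions made for the example.

```python
from typing import Iterable, Mapping, Optional


def conditional_probability(
    incidents: Iterable[Mapping[str, object]],
    population: Iterable[Mapping[str, object]],
    **attributes: object,
) -> Optional[float]:
    """Estimate P(incident | attributes).

    incidents:  one row per recorded incident (hypothetical schema).
    population: one row per demographic group, with a "count" field
                holding the number of people in that group.
    attributes: the conditions to filter on, e.g. gender="M", city="A".
    """
    def matches(row: Mapping[str, object]) -> bool:
        # A row matches when every requested attribute has the requested value.
        return all(row.get(key) == value for key, value in attributes.items())

    incident_count = sum(1 for row in incidents if matches(row))
    population_count = sum(row["count"] for row in population if matches(row))

    # No matching population means the probability is undefined.
    if population_count == 0:
        return None
    return incident_count / population_count


# Toy data, invented for illustration only.
incidents = [
    {"gender": "M", "city": "A"},
    {"gender": "M", "city": "A"},
    {"gender": "F", "city": "A"},
]
population = [
    {"gender": "M", "city": "A", "count": 1000},
    {"gender": "F", "city": "A", "count": 1000},
    {"gender": "M", "city": "B", "count": 500},
]

p_male_in_a = conditional_probability(incidents, population, gender="M", city="A")
p_anyone_in_a = conditional_probability(incidents, population, city="A")
p_unknown_city = conditional_probability(incidents, population, city="C")
```

Returning `None` when no population matches avoids a division by zero and leaves it to the caller (for example, the front end) to decide how to display a combination with no data.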

Datasets Used

The Washington Post Police Shooting Dataset

The Washington Post police shooting dataset contains a record of every person fatally shot by an on-duty police officer in the United States from 2015 to 2023, as well as the agencies involved in each event. It is updated regularly as fatal shootings are reported and as facts emerge about individual cases.

The Washington Post Homicide Dataset

The Washington Post collected data on more than 52,000 criminal homicides over the past 10 years in 50 of the largest American cities. The data includes the location of the homicide, whether an arrest was made, and demographic information about the victim.

2020 US Census Dataset (from Kaggle)

This dataset comes from the US Census website and was cleaned for Kaggle to make it more accessible. It contains no personally identifiable information.

US 2021 Age Distribution Dataset

Population and demographic data in this dataset are based on analysis of the Census Bureau's American Community Survey (ACS), so they may differ from other population estimates by the Census Bureau. The US and state population data displayed on this site cover civilians only. Population numbers are rounded to the nearest 100.