Data

For this study, we sought data at the census tract (CT) level because this spatial unit is fine enough to capture the processes in question and because the census tract tends to be the scale at which survey data are reported. Therefore, the Chicago census tract shapefile was mapped and clipped to the Chicago city shapefile. Income, race, and education data sets were all obtained as CSV files and joined to their respective census tracts.
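The attribute join described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the project's actual workflow; the `geoid` key, the column names, and the sample values are hypothetical placeholders.

```python
import csv
import io

# Hypothetical tract list and attribute table; real inputs would be the
# clipped tract shapefile's attribute table and a downloaded CSV file.
TRACTS_CSV = """geoid,name
17031010100,Tract 101
17031010201,Tract 102.01
17031010202,Tract 102.02
"""

INCOME_CSV = """geoid,median_household_income
17031010100,52000
17031010202,38500
"""


def read_rows(text):
    """Parse CSV text into a list of row dictionaries (all values as text)."""
    return list(csv.DictReader(io.StringIO(text)))


def left_join(tracts, attributes, key="geoid"):
    """Attach attribute columns to every tract, keeping unmatched tracts."""
    lookup = {row[key]: row for row in attributes}
    # Columns contributed by the attribute table, excluding the join key.
    extra_cols = [c for c in attributes[0] if c != key] if attributes else []
    joined = []
    for tract in tracts:
        match = lookup.get(tract[key])
        merged = dict(tract)
        for col in extra_cols:
            # None marks a tract with no record in the attribute table.
            merged[col] = match[col] if match else None
        joined.append(merged)
    return joined


joined_tracts = left_join(read_rows(TRACTS_CSV), read_rows(INCOME_CSV))
for row in joined_tracts:
    print(row["geoid"], row["median_household_income"])
```

Keeping every tract in the output, even those without a matching record, makes gaps in the attribute tables visible before any mapping or analysis is done.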

Income Data: 

Median household income data were collected from censusreporter.org, which sourced them from the 2019 American Community Survey (ACS) compiled by the US Census Bureau. Our analysis also used the metric income 1000, which we calculated by dividing median household income by 1000 so that income is expressed in thousands of dollars.

Median household income by the number of CTs within each bin.
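The income 1000 metric and the per-bin tract counts behind a histogram like the one above can be sketched as follows. The income values and the bin width of 25 (thousand dollars) are assumptions made for illustration; they are not taken from the project data.

```python
from collections import Counter

# Hypothetical median household incomes (USD) for a handful of tracts.
incomes = [24500.0, 38500.0, 52000.0, 61250.0, 97000.0, 104300.0]

# The "income 1000" metric: income expressed in thousands of dollars.
income_1000 = [income / 1000 for income in incomes]

BIN_WIDTH = 25  # assumed bin width, in thousands of dollars


def bin_start(value, width=BIN_WIDTH):
    """Return the lower edge of the bin that contains value."""
    return int(value // width) * width


# Number of tracts falling in each bin, keyed by the bin's lower edge.
counts = Counter(bin_start(v) for v in income_1000)
for start in sorted(counts):
    print(f"{start}-{start + BIN_WIDTH}: {counts[start]} tract(s)")
```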

Demographics Data:

Our demographics data included the percentage of people in each census tract belonging to each ethnic group. The categories were White, Black, Asian, Pacific Islander, Native American or Alaskan, other, and a combination of two. The data were collected from censusreporter.org, which obtained them from the 2019 ACS.

The average percentage of each ethnicity within CTs in Chicago, Illinois.

Diversity

The diversity metric was calculated in Python from the data set containing the proportion of each ethnicity in each census tract. The following function compares each group's proportion against every other group's proportion.

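def diversity_calc(w: float, b: float, a: float, pi: float, na: float, o: float) -> float:
    # Inputs are the proportions (0-1) of White, Black, Asian, Pacific Islander,
    # Native American or Alaskan, and other residents in one census tract.
    # Sum the absolute differences over all 15 pairs of groups, then rescale
    # so that 0 means a single group and 1 means a perfectly even distribution.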
    return ((5-(abs(w-b)+
            abs(w-a)+
            abs(w-pi)+
            abs(w-na)+
            abs(w-o)+
            abs(b-a)+
            abs(b-pi)+
            abs(b-na)+
            abs(b-o)+
            abs(a-pi)+
            abs(a-na)+
            abs(a-o)+
            abs(pi-na)+
            abs(pi-o)+
            abs(na-o)))/5)

The results of the diversity calculation carried out in Python are shown in the histogram below. The function we designed to measure CT diversity produces scores between 0 and 1; the higher the score, the more evenly ethnicities are distributed within the CT. We hoped to compare this statistic with enrollment data to explore a possible connection between the two.

Count of CTs by diversity score (0 means only one ethnicity in the CT; 1 means a perfectly even distribution of ethnicities in the CT).
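The 0 to 1 range of the diversity score can be checked with a compact restatement of the same calculation. The sketch below is our own illustrative rewrite, not the function used in the project: `diversity_score` is a hypothetical name, and it assumes the proportions passed in sum to at most 1. For six groups it computes the same quantity as `diversity_calc`, namely 5 minus the sum of absolute differences over all pairs, divided by 5.

```python
from itertools import combinations


def diversity_score(proportions):
    """Pairwise-difference diversity score for one census tract.

    Sums |p_i - p_j| over every pair of groups, then rescales by (n - 1)
    so that a single-group tract scores 0 and an even split scores 1.
    """
    n = len(proportions)
    total_gap = sum(abs(p - q) for p, q in combinations(proportions, 2))
    return ((n - 1) - total_gap) / (n - 1)


# Three illustrative tracts (hypothetical proportions).
single_group = diversity_score([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])
even_split = diversity_score([1 / 6] * 6)
two_groups = diversity_score([0.5, 0.5, 0.0, 0.0, 0.0, 0.0])
print(single_group, even_split, two_groups)
```

A tract with only one group scores 0 and a perfectly even tract scores 1. A tract split evenly between just two groups scores 0.2, because the metric rewards evenness across all six categories rather than the mere presence of more than one group. Note also that the function takes six proportions, so the combination category listed among the demographics data is not one of its inputs.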

Education: 

CT Data (attained post-secondary education):

We collected post-secondary education data using two different methods. The first was a CSV file from censusreporter.org, which provided data from the 2019 American Community Survey. The data gave the percentage of each CT's population at each level of educational attainment. The categories included high school, GED, < 1-year university (no diploma), > 1-year university (no diploma), associate's degree, bachelor’s degree, professional degree, master’s degree, and PhD. In Python, we combined high school and GED into a single category called “high school and GED”. We also summed every level of post-secondary education into a new column called “higher education”. Below is our code for those two operations:

# Combine high school diplomas and GEDs into one category.
proj_data['Highschool and GED'] = (proj_data['Highschool Diploma'] +
                                   proj_data['GED'])

# Sum every level of post-secondary education into one column.
proj_data['Higher Education'] = (proj_data["Bachelor'Degree"] +
                                 proj_data["Associate's degree"] +
                                 proj_data["Master's Degree"] +
                                 proj_data['Professional Degree'] +
                                 proj_data['Doctorate Degree'] +
                                 proj_data['<1 Year of College'] +
                                 proj_data['>1 Year No Degree'])

School District Data (enrolled post-secondary education):

The second data set, which we used to examine university enrollment rates, was collected from the Chicago Data Portal. Rates of high school graduate enrollment in post-secondary education were reported by high school name in a CSV file. Python was then used to merge school addresses with these statistics so that the schools could be geocoded and plotted on a map of Chicago.
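The merge of school statistics with school addresses can be sketched as a join on school name. This is a minimal standard-library illustration under stated assumptions, not the project's code: the field names, school names, addresses, and the `normalize` helper are all hypothetical, and the geocoding step itself is not shown.

```python
def normalize(name):
    """Lowercase and collapse whitespace so near-identical names match."""
    return " ".join(name.lower().split())


# Hypothetical records; real inputs would be the two CSV files.
enrollment = [
    {"school": "Example High School", "college_enrollment_pct": 61.5},
    {"school": "Sample  Academy HS", "college_enrollment_pct": 48.0},
    {"school": "Placeholder Prep", "college_enrollment_pct": 72.3},
]
addresses = [
    {"school": "example high school", "address": "100 N Example St, Chicago, IL"},
    {"school": "Sample Academy HS", "address": "200 S Sample Ave, Chicago, IL"},
]


def merge_by_school(stats, locations):
    """Join statistics to addresses on the normalized school name."""
    address_by_name = {normalize(r["school"]): r["address"] for r in locations}
    merged, unmatched = [], []
    for record in stats:
        address = address_by_name.get(normalize(record["school"]))
        if address is None:
            unmatched.append(record["school"])  # needs manual review
        else:
            merged.append({**record, "address": address})
    return merged, unmatched


merged, unmatched = merge_by_school(enrollment, addresses)
print(len(merged), "schools ready to geocode; unmatched:", unmatched)
```

Returning the unmatched names separately makes it easy to see which schools would be dropped from the map because their names differ between the two files.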

Demographics Map Explorer of Chicago, Illinois