Methodologies – COVID-19 Vulnerability in Montréal, April/2020: A Speculative Report

After the Data was compiled into a file geodatabase (named CovidMontreal), a new ArcGIS Pro Map was created and the CovidMontreal folder was designated as the project folder. Within this folder was another empty folder named “gwr_raster” designated for the output results from the GWR.

The shapefiles of the Outline of Montréal, Major Road Networks, Green Spaces, and Elderly Homes were then added to the map and stylized according to cartographic principles (i.e., the outline was made hollow with a fine black line and the park polygons were made light green).

Next, I established the working environment. In Environments, the Raster Cell Size was set to 50 meters, the Output Coordinate System was set to NAD 1983 MTM 8 to match the current map, and the Workspace was designated as my geodatabase (CovidMontreal).

The next step was to add the Census Tract shapefile to the map, but it first had to be reprojected to NAD 1983 MTM 8 from Lambert Conformal Conic using the Define Projection tool. It was then clipped to the Outline of Montréal to omit all other Census Tracts and reduce computation time.

The Census Tract data and the total reported COVID-19 case data (both .xls files) were compiled at two different spatial scales: the Census Tract data at the Census Tract level and the total reported COVID-19 cases at a much larger neighbourhood level. I designated which Census Tracts would contain COVID-19 cases by looking at the total number of cases per neighbourhood and the total number of Census Tracts within that neighbourhood, then chose the tracts closest to elderly homes or parks to contain the majority of cases. This was done because COVID-19 cases are not reported individually with their own precise location or metadata, so informed assumptions had to be made about where cases were most likely to originate. After these two .xls files were manually combined, they had to be joined with the Census Tract Boundaries shapefile. The join field is ‘CTUID’; however, the .xls file stores this field as a number while the shapefile stores it as text, and a join requires both key fields to share the same data type. So, I reformatted this field by adding another CTUID column set to text using the formula:

=TEXT(462####,"0000000.00")

This added the trailing zeros (.00) required to match the shapefile; leading zeros, although commonly needed, were not required in this particular case. I then deleted the numeric CTUID column from the spreadsheet. It should be noted that this was performed during the data collection phase, before beginning the project in ArcGIS Pro, to ensure multiple folders were not used. I was then able to add the spreadsheet as a Standalone Table and Join it to the Census Tract Boundaries.
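
The same number-to-text conversion can be sketched outside the spreadsheet. This is an illustration only, not the procedure used in this project: the function name `ctuid_to_text` and the sample values are hypothetical, and the format simply mirrors the `"0000000.00"` pattern above (seven integer digits, a decimal point, and two decimals).

```python
def ctuid_to_text(ctuid):
    """Format a numeric CTUID as text, mirroring TEXT(x, "0000000.00")."""
    # Width 10 = 7 integer digits + decimal point + 2 decimals, zero-padded.
    return f"{float(ctuid):010.2f}"


# Hypothetical sample values, not taken from the project data.
print(ctuid_to_text(4620001))     # 4620001.00
print(ctuid_to_text(4620290.05))  # 4620290.05
```

Because the result is a string, it can be compared against a text key field without a type mismatch.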

The first step of the spatial analysis was to carry out an Exploratory Regression to determine which neighbourhood variables were ‘best’ to include in the subsequent OLS and GWR analyses. I used the Exploratory Regression tool with the following parameters:

- Input Features = Census Tract Boundaries
- The Dependent Variable = COVID-19 Cases (see more on this in Limitations)
- The Candidate Explanatory Variables (see Data for a definition of these variables):
  - PopDen
  - Pop65
  - Jewish
  - Alone
  - LowIncome
  - HC_SA
  - Homes_CT

Within the details of the regression, each variable is given an AdjR² value and an AICc value (see Spatial Statistics for an explanation of the statistics used), where the most important variables have the highest AdjR² values and the lowest AICc values. The most important variables identified were Pop65, Jewish, LowIncome, PopDen, and Homes_CT.
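
For readers who want to see how these two statistics behave, here is a minimal sketch using their standard textbook definitions. It is not ArcGIS Pro's implementation: the function names are hypothetical, the AIC term drops additive constants, and the parameter count `p` is an assumption about what is counted, so absolute values may differ from the tool's report. Only the ranking of candidate models is meaningful.

```python
import math


def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k explanatory variables."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def aicc(rss, n, p):
    """Corrected AIC for a Gaussian least-squares model.

    rss: residual sum of squares, n: number of observations,
    p: number of estimated parameters (assumed count, see lead-in).
    """
    aic = n * math.log(rss / n) + 2 * p          # AIC up to a constant
    return aic + (2 * p * (p + 1)) / (n - p - 1)  # small-sample correction


# Hypothetical numbers: a better fit (lower RSS) gives a lower AICc.
print(aicc(rss=80.0, n=100, p=3) < aicc(rss=100.0, n=100, p=3))  # True
```

Adding a variable always raises plain R², so the adjusted version penalizes `k`; AICc applies a similar penalty that grows as `p` approaches `n`.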

The next step was to carry out the OLS Regression using the variables identified above in the Exploratory Regression. I used the Generalized Linear Regression tool to determine the global model statistics associated with each set of variables. The parameters are as identified below:

- Input Features = Census Tract Boundaries
- The Dependent Variable = COVID-19 Cases
- Model Type = Continuous Gaussian (this is another way to say OLS)
- The Explanatory Variables:
  - Pop65
  - Jewish
  - LowIncome
  - PopDen
  - Homes_CT
- All other fields were left with their default values

The detailed results were then saved to a text file.
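
As a rough illustration of what a Continuous (Gaussian) model fits, the sketch below solves an ordinary least-squares problem with NumPy. It does not call ArcGIS Pro, and the data are synthetic: the count of 200 tracts, the coefficient values, and the noise level are all assumptions made for the example. Only the variable names come from the parameter list above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: random values, NOT the project's census data.
names = ["Pop65", "Jewish", "LowIncome", "PopDen", "Homes_CT"]
X = rng.normal(size=(200, len(names)))
true_beta = np.array([2.0, 1.0, 0.5, -0.3, 1.5])  # hypothetical coefficients
y = 4.0 + X @ true_beta + rng.normal(scale=0.1, size=200)

# Prepend an intercept column and solve the least-squares problem.
A = np.column_stack([np.ones(len(X)), X])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)

# Global goodness of fit: one R-squared for the whole study area.
residuals = y - A @ beta
r2 = 1.0 - (residuals @ residuals) / np.sum((y - y.mean()) ** 2)

for name, b in zip(["Intercept"] + names, beta):
    print(f"{name:>10}: {b: .3f}")
print(f"R-squared: {r2:.3f}")
```

The key property of this global model is that each variable receives a single coefficient for every tract, which is the assumption a local model relaxes.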

After the global model statistics were calculated, the next step was to use the same Explanatory Variables to create a localized model exploring the spatial relationships between the sets of variables. I used the Geographically Weighted Regression tool to do so, with the following parameters:

- Input Features = Census Tract Boundaries
- The Dependent Variable = COVID-19 Cases
- Model Type = Continuous Gaussian
- The Explanatory Variables:
  - Pop65
  - Jewish
  - LowIncome
  - PopDen
  - Homes_CT
- The Neighbourhood Type = Number of Neighbours
- The Neighbourhood Selection Method = Golden Search
- Under Additional Options:
  - Local Weighting Scheme = Bisquare. This specifies the kernel type used to spatially weight the model, that is, how strongly each feature is related to the other features within its neighbourhood. Bisquare is the default and assigns a weight of 0 to any feature outside the specified neighbourhood.
  - Coefficient Raster Workspace = gwr_raster folder
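
To make the Bisquare weighting concrete, here is a minimal sketch of the standard bisquare kernel. The function name `bisquare_weights`, the distances, and the 1000 m bandwidth are hypothetical; with the Number of Neighbours option the bandwidth would vary by location rather than being fixed as it is here, and ArcGIS Pro's internal handling may differ in detail.

```python
import numpy as np


def bisquare_weights(distances, bandwidth):
    """Bisquare kernel: (1 - (d/b)^2)^2 inside the bandwidth, 0 outside."""
    d = np.asarray(distances, dtype=float)
    w = (1.0 - (d / bandwidth) ** 2) ** 2
    # Features at or beyond the bandwidth get no influence at all.
    return np.where(d < bandwidth, w, 0.0)


# Hypothetical distances (in metres) from one tract to five others.
print(bisquare_weights([0, 250, 500, 1000, 1500], bandwidth=1000))
```

Weights fall smoothly from 1 at the regression point to 0 at the edge of the neighbourhood, so nearby tracts dominate each local fit.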

The detailed results were then saved to a text file. Raster layers for each variable were also produced and added to the Map Contents; in essence, each is a vulnerability surface for the spread of COVID-19 cases based on that variable. The classification scheme was changed to Standard Deviation because the statistics are based on a continuous model type, meaning that the values are distributed normally and centred on the mean, which is generally around 0. I wanted to distinguish the positive values from the negative values to show vulnerability, so this scheme works best. The maps were symbolized with a red-green colour palette to represent high to low vulnerability and were Clipped to the Outline of Montréal.
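
The idea behind a standard deviation classification can be sketched as follows. This is a simplified illustration, not ArcGIS Pro's exact break placement: the function name `std_dev_classes`, the breaks at whole standard deviations, and the sample coefficients are all assumptions.

```python
import numpy as np


def std_dev_classes(values):
    """Class index per value, by distance from the mean in standard deviations."""
    v = np.asarray(values, dtype=float)
    z = (v - v.mean()) / v.std()           # z-scores (population std)
    breaks = np.array([-2, -1, 0, 1, 2])   # assumed breaks, in std units
    # Classes 0-2 fall below the mean, classes 3-5 at or above it.
    return np.digitize(z, breaks)


# Hypothetical coefficient values centred on 0.
print(std_dev_classes([-3, -1, 0, 1, 3]))
```

Because the breaks are symmetric about the mean, values below and above it land in separate classes, which is what lets a diverging palette separate low from high vulnerability.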

Although we already broadly know which areas have the highest concentration of COVID-19 cases, based on the COVID-19 field in the Census Tract Boundaries layer, a grouping analysis makes it easier to interpret the GWR results across all variables together. I first used a Definition Query to exclude any Census Tracts where Income = 0, and then ran the Spatially Constrained Multivariate Clustering tool with the following parameters:

- Input Features = Census Tract Boundaries
- Model Type = Continuous Gaussian
- The Analysis Fields:
  - Pop65
  - Jewish
  - LowIncome
  - PopDen
  - Homes_CT
- All other fields = default value

The results added a new layer to the Map Contents containing 5 clusters, which mostly correspond to regions of Montréal: the Montréal North region, the Côte-des-Neiges–Notre-Dame-de-Grâce/Westmount region, the Ahuntsic–Cartierville/Saint Laurent region, the Dollard-des-Ormeaux/Southwestern Montréal region, and the L’Île-Bizard–Sainte-Geneviève region. These regions are not exact because the neighbourhoods are smaller than the entire region, but they indicate the prevalent borough within each cluster.