Project Summary

Home Prices in Toronto, 2011

Abstract

The relationship between Toronto home prices and other variables was examined. Exploratory regression was used to select the most important variables for the geographically weighted regression (GWR) analysis. With those variables, generalized linear regression (GLR) was performed to calculate the regression statistics associated with them. GWR was then conducted to examine how the relationships with those variables vary across space. Finally, a spatially constrained multivariate clustering tool was run and box plots were generated from its output. It should be noted that the variables explored may not accurately predict Toronto home prices, and that statistical analyses may not reflect real-life situations and individual behaviour.

Introduction

Large cities in Ontario such as Toronto, Mississauga, Hamilton and Burlington have seen year-over-year home price increases at twice the Canadian average (Nistor & Reianu, 2018, p. 543). Nistor and Reianu found that interest rates, immigration, unemployment rate, household size and income are correlated with home prices (2018, p. 541). Immigration holds a key position in Canadian population growth and the Canadian economy (Nistor & Reianu, 2018, p. 541). Research suggests that the high influx of immigrants and low mortgage interest rates may cause high home prices in Ontario (Nistor & Reianu, 2018, p. 541).

The Toronto CMA (census metropolitan area) is a main destination for immigrants (Nistor & Reianu, 2018, p. 541). Immigration there grew by 400% from 2001 to 2011, while the non-immigrant population increased by 14% (Nistor & Reianu, 2018, p. 541). During those 10 years, average home prices increased by $158,875, a change attributed to immigrants ($86,701 increase), income ($17,986 increase), unemployment rate ($38,916 decrease) and interest rate ($93,103 increase) (Nistor & Reianu, 2018, pp. 541-542). In Toronto, household size remained stable during this time and did not contribute to home prices (Nistor & Reianu, 2018, p. 542). My study looks at a different set of variables to help explain home prices in Toronto and maps the findings.

Data

The raw data used were all obtained from City of Toronto Open Data (https://open.toronto.ca/). Some data were transformed to ensure that the variables have similar ranges, as ArcGIS regression tools do not provide standardized regression coefficients.
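The report does not say which transformation was applied. As one possible illustration only, the sketch below rescales a variable to z-scores (mean 0, standard deviation 1), a common way to put variables with very different units on a similar range. The function name and the sample incomes are hypothetical.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("cannot standardize a constant variable")
    return [(v - mu) / sigma for v in values]

# Hypothetical median household incomes ($) for four neighbourhoods.
incomes = [52000.0, 61000.0, 75000.0, 98000.0]
scaled = z_scores(incomes)
print([round(s, 2) for s in scaled])
```

After rescaling, regression coefficients for different variables can be compared more directly, because each one describes the effect of a one-standard-deviation change.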

Data included:

  • Early Development Instrument (children’s test scores in 5 areas of competence: physical health and wellbeing, social knowledge and competence, emotional health and maturity, language and cognitive development, communication skills and general knowledge)
  • Employment rate (labour status)
  • Health providers (location counts of health-related businesses such as doctor offices, dentist offices, pharmacies, clinics and other health employers)
  • Immigrants
  • Linguistic Diversity Index (probability that any two people selected at random would have different mother tongues)
  • Median household total income ($)
  • Median income of individuals ($)
  • Median value of dwellings ($)
  • Neighbourhood Equity Score (composite indicator of 15 neighbourhood outcomes related to economic opportunities, social development, participation in decision-making, physical surroundings and healthy lives)
  • Only regular maintenance or minor repairs needed (dwelling condition)
  • Population
  • Regional municipal boundary (administrative boundary of the City of Toronto)
  • Salvation Army donors
  • Total area (in square km)
  • Total private dwellings
  • Walk Score (walkability based on walking routes to destinations such as grocery stores, schools, parks, restaurants and retail)
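The Linguistic Diversity Index in the list above is described as the probability that two randomly selected people have different mother tongues. A minimal sketch of that idea follows, assuming the usual form of one minus the sum of squared language shares (two draws with replacement). The exact formula used by the City of Toronto is not given in this report, and the counts below are hypothetical.

```python
def linguistic_diversity_index(speaker_counts):
    """Probability that two people drawn at random (with replacement)
    have different mother tongues: 1 - sum(share_i ** 2)."""
    total = sum(speaker_counts)
    if total <= 0:
        raise ValueError("need at least one speaker")
    return 1.0 - sum((count / total) ** 2 for count in speaker_counts)

# Hypothetical mother-tongue counts for one neighbourhood.
counts = [6000, 2500, 1000, 500]
ldi = linguistic_diversity_index(counts)
print(round(ldi, 4))
```

An index of 0 means everyone shares one mother tongue; values closer to 1 mean many languages with small shares each.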

Methodology

Exploratory regression was used to analyze different combinations of explanatory variables to find the best variables for use in the GWR analysis. The dependent variable analyzed was the median value of dwellings, while the other variables were the candidate explanatory variables. The 5 most important variables (those in the model with the highest adjusted R-squared value and the lowest AICc value) were found to be the Linguistic Diversity Index, Neighbourhood Equity Score, median household total income, immigrants and employment rate.
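The idea behind exploratory regression can be sketched as follows: fit an ordinary least squares model for every combination of candidate variables and rank the models by AICc and adjusted R-squared. This is not the ArcGIS implementation. The AICc formula is the common least-squares form and may differ from the one ArcGIS reports, and the data and variable names are synthetic.

```python
from itertools import combinations
import numpy as np

def fit_ols(X, y):
    """Return (adjusted R-squared, AICc) for y ~ intercept + X."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])            # add intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    rss = float(resid @ resid)
    tss = float(((y - y.mean()) ** 2).sum())
    p = X.shape[1]                                  # explanatory variables
    k = p + 2                                       # slopes + intercept + error variance
    adj_r2 = 1 - (rss / (n - p - 1)) / (tss / (n - 1))
    aicc = n * np.log(rss / n) + 2 * k + 2 * k * (k + 1) / (n - k - 1)
    return adj_r2, aicc

def exploratory_regression(X, y, names, max_vars):
    """Fit every subset of up to max_vars variables; sort by AICc (lowest first)."""
    results = []
    for size in range(1, max_vars + 1):
        for idx in combinations(range(X.shape[1]), size):
            adj_r2, aicc = fit_ols(X[:, list(idx)], y)
            results.append((tuple(names[i] for i in idx), adj_r2, aicc))
    results.sort(key=lambda r: r[2])
    return results

# Synthetic data: only "income" and "diversity" actually drive y.
rng = np.random.default_rng(0)
names = ["income", "equity", "diversity", "employment"]
X = rng.normal(size=(100, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(scale=0.5, size=100)

ranked = exploratory_regression(X, y, names, max_vars=3)
best_vars, best_adj_r2, best_aicc = ranked[0]
print(best_vars, round(best_adj_r2, 3), round(best_aicc, 1))
```

Sorting by AICc penalizes models that add variables without improving the fit much, which is why it is paired with adjusted R-squared rather than plain R-squared.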

GLR was used to create predictions about the relationship between the variables. The model type used was continuous (Gaussian), which performs an ordinary least squares (OLS) regression, as the dependent variable (the median value of dwellings) can take on a wide range of values. The explanatory variables used were the 5 variables found to be the most important in the exploratory regression.
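A Gaussian GLR model is an OLS fit, and its residuals are often mapped in units of standard deviations. The sketch below, under the assumption of synthetic data and a hypothetical function name, fits OLS with NumPy and flags observations whose standardized residual is beyond 2.5 in absolute value. It is an illustration of the technique, not the ArcGIS tool itself.

```python
import numpy as np

def ols_standardized_residuals(X, y):
    """Fit y ~ intercept + X by OLS; return coefficients and
    residuals divided by the residual standard error."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])        # add intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    dof = n - A.shape[1]                        # degrees of freedom
    std_error = np.sqrt(resid @ resid / dof)
    return beta, resid / std_error

# Synthetic data with known coefficients (intercept 2, slopes 3 and -1).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = 2.0 + 3.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

beta, std_resid = ols_standardized_residuals(X, y)
flagged = np.flatnonzero(np.abs(std_resid) > 2.5)   # strongly over- or under-predicted
print(np.round(beta, 2), len(flagged))
```

Observations with large positive standardized residuals are ones the model under-predicts; large negative values are over-predictions.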

The next step was performing a GWR, which is a tool used to examine spatial relationships among variables. For every feature in the dataset, it constructs a local regression equation. In the analysis, the previous 5 variables were used as explanatory variables in the model.
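To make the "local regression equation for every feature" concrete, here is a minimal GWR sketch: for each location, a weighted least squares fit in which nearby observations get larger weights. The Gaussian kernel, the fixed bandwidth, the weighted local R-squared definition and the synthetic data are all assumptions for illustration; the ArcGIS tool offers other kernels and bandwidth options and may compute its diagnostics differently.

```python
import numpy as np

def gwr(coords, X, y, bandwidth):
    """Fit one weighted least squares model per location.
    Returns local coefficients (n x (p+1)) and local R-squared (n)."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])              # add intercept column
    betas = np.empty((n, A.shape[1]))
    local_r2 = np.empty(n)
    for i in range(n):
        dist = np.linalg.norm(coords - coords[i], axis=1)
        w = np.exp(-0.5 * (dist / bandwidth) ** 2)    # Gaussian distance decay
        sw = np.sqrt(w)
        # Weighted least squares via row scaling by sqrt(weight).
        beta, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
        betas[i] = beta
        fitted = A @ beta
        y_bar = np.sum(w * y) / np.sum(w)             # weighted local mean
        rss = np.sum(w * (y - fitted) ** 2)
        tss = np.sum(w * (y - y_bar) ** 2)
        local_r2[i] = 1.0 - rss / tss
    return betas, local_r2

# Synthetic data: the slope on x grows from west to east.
rng = np.random.default_rng(2)
coords = rng.uniform(0, 10, size=(300, 2))
x = rng.normal(size=300)
y = 1.0 + (1.0 + 0.3 * coords[:, 0]) * x + rng.normal(scale=0.1, size=300)

betas, local_r2 = gwr(coords, x.reshape(-1, 1), y, bandwidth=2.0)
print(np.round(betas[:3], 2), np.round(local_r2[:3], 2))
```

In this synthetic case the estimated local slopes increase with the easting coordinate, which is the kind of spatial variation a single global regression would average away.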

The spatially constrained multivariate clustering tool was used to determine spatially adjacent clusters of features. The analysis fields used to distinguish clusters from each other were the 5 previous variables.

To help identify where the GWR model worked well, the local R-squared values were plotted using coloured symbols in a classification with 3 levels (weak, medium and strong correlation).
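The three-level classification can be expressed as a simple threshold rule. The break values used in the study are not stated, so the cut-offs below (0.4 and 0.7) are hypothetical placeholders, as is the function name.

```python
def classify_local_r2(r2, weak_max=0.4, strong_min=0.7):
    """Assign a local R-squared value to one of three classes.
    The default thresholds are illustrative, not those used in the study."""
    if not 0.0 <= r2 <= 1.0:
        raise ValueError("local R-squared must be between 0 and 1")
    if r2 < weak_max:
        return "weak"
    if r2 < strong_min:
        return "medium"
    return "strong"

# Hypothetical local R-squared values for three neighbourhoods.
labels = [classify_local_r2(v) for v in (0.25, 0.55, 0.85)]
print(labels)
```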

Findings

Looking at the GLR map (GLR Map), there appear to be more dark blue areas (< -2.5 standard deviations) in the southern, especially southeastern, parts of Toronto. There are more dark green areas (> 2.5 standard deviations) in central Toronto.

Looking at the GWR map (GWR Map), the dark blue areas (< -2.5 standard deviations) occur in the west and centre of Toronto. The dark green areas (> 2.5 standard deviations) also occur in the west and centre of Toronto.

The local R-squared values (R-Squared Map) show how well the local regression model fits the observed values. Points of strong correlation appear in north central Toronto, where the local model performs well. Points of medium correlation are found in the west, east and south of Toronto, where the GWR model worked moderately well. Points of weak correlation can be seen at the eastern, western and southern edges of Toronto, where the GWR model did not perform as well.

The spatial clustering analysis map shows five groups: the southwestern, central, northeastern, northwestern and central western parts of Toronto. The southwestern part of Toronto is characterized by fewer immigrants, high income, high employment, low language diversity and high neighbourhood equity. The central part of Toronto generally has the same traits: fewer immigrants, high income, high employment, low language diversity and high neighbourhood equity. Toronto's northeast typically has more immigrants, low employment and high language diversity. The northwest of Toronto generally has more immigrants, low income, low employment, high language diversity and low neighbourhood equity. The central western part of Toronto is typically characterized by low income and low neighbourhood equity. See the Cluster Map and the Box Plots.

Conclusion

The 5 variables found to be the most important for GWR in my study are similar to those examined by Nistor and Reianu (2018). With such high home prices, it is feared that there may be a housing price bubble (Nistor & Reianu, 2018, p. 543). Issues with the housing market mean difficulties for young professionals in purchasing property, thus lowering the quality of life (Nistor & Reianu, 2018, p. 542). In fact, 40% of Canadians think that housing is unaffordable (Nistor & Reianu, 2018, p. 542). A national housing strategy is needed but the housing market is complex, and other factors may contribute to home prices (Nistor & Reianu, 2018, p. 542). Further research on home prices can be done for more recent years or for other locations.

Works Cited

Nistor, A., & Reianu, D. (2018). Determinants of housing prices: Evidence from Ontario cities, 2001-2011. International Journal of Housing Markets and Analysis, 11(3), 541-556. doi:10.1108/IJHMA-08-2017-0078