Lecture 4: Statistics: A review

In this lesson we reviewed several fundamental statistical analysis techniques, starting with summarizing data through measures of central tendency: the mean, median, and mode. We also went over measures of dispersion and of shape, such as skewness, which measures how asymmetric a distribution is. Skewness for a normal distribution is zero, and any symmetric data should have a skewness near zero. Data skewed to the left is sometimes called a negatively skewed distribution because its long tail points in the negative direction on a number line, and its mean is typically less than its median. In a right-skewed (positively skewed) distribution the long tail points in the positive direction, and the mean is typically greater than the median.
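To make these summary measures concrete, here is a minimal sketch using only Python's standard library. The two small data sets and the helper names `skewness` and `summarize` are made up for illustration; the skewness formula is the population version (mean cubed deviation divided by the cubed standard deviation), which is one of several conventions in use.

```python
import statistics

def skewness(values):
    """Population skewness: mean cubed deviation divided by the cubed standard deviation."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((x - mean) ** 3 for x in values) / (n * sd ** 3)

def summarize(values):
    """Central tendency plus skewness for one data set."""
    return {
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "skewness": skewness(values),
    }

# Hypothetical data: one symmetric sample and one with a long right tail.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]
# Mirroring the right-skewed sample gives a left-skewed one.
left_skewed = [-x for x in right_skewed]

for name, data in [("symmetric", symmetric),
                   ("right_skewed", right_skewed),
                   ("left_skewed", left_skewed)]:
    print(name, summarize(data))
```

In the right-skewed sample the single large value pulls the mean above the median, and mirroring the data flips both the sign of the skewness and the order of mean and median.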

I also learned about several statistical analyses, such as grouping analysis. This is a powerful tool in GIS that sorts features into different groups/communities based on a set of quantitative variables. Features within a cluster are similar to one another, and each cluster has characteristics that distinguish it from the other clusters.
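The idea of grouping by quantitative variables can be sketched with plain k-means clustering. This is an illustration of the general technique, not the GIS tool itself; the six areas, their two variables (median income in thousands, percent renters), and the choice to seed the centroids with the first k points are all assumptions made for the example.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means: assign each point to its nearest centroid, then recompute centroids."""
    # Assumption: seed with the first k points so the run is deterministic.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids

# Hypothetical areas: (median income in thousands, percent renters).
areas = [(30, 70), (32, 65), (28, 72), (80, 20), (85, 15), (78, 22)]
labels, centroids = kmeans(areas, k=2)
print(labels)
print(centroids)
```

With real data the variables would normally be standardized first, so that one variable with a large range does not dominate the distance calculation.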

Regression analysis is another method which can be used to understand the relationships between different variables. I learned two regression models: the ordinary least squares (OLS) model and the geographically weighted regression (GWR) model. OLS is a global model that minimizes the sum of squared residuals. It is the proper starting point for spatial regression analysis, creating a single regression equation to represent the variable you are trying to understand. On the other hand, GWR is a local regression used to model spatial relationships in a given data set. It is useful when working with large data sets with many features, for instance, the many enumeration areas in census data. One of the highlights of this model is that, unlike OLS, it adds a level of modeling sophistication by allowing the relationships between the independent and dependent variables to vary by locality. A single global equation can hide relationships that differ from place to place. GWR addresses this problem by constructing a separate regression equation for every location in the data set, fitted to the neighboring features determined by the kernel and bandwidth.
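The difference between the global and local models can be sketched in a few lines of Python. This is a simplified illustration with one explanatory variable, not how any particular GIS package implements GWR; the Gaussian kernel, the fixed bandwidth of 2, and the made-up data (a slope of 1 in the west and 3 in the east) are assumptions chosen for the example.

```python
import math

def weighted_line_fit(x, y, w):
    """Weighted least squares for y = a + b*x; returns (intercept, slope)."""
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope

def ols(x, y):
    """Global model: every observation gets the same weight."""
    return weighted_line_fit(x, y, [1.0] * len(x))

def gwr(coords, x, y, bandwidth):
    """Local model: one weighted fit per location, with weights decaying by distance."""
    fits = []
    for here in coords:
        # Assumption: Gaussian kernel with a fixed bandwidth.
        w = [math.exp(-0.5 * (math.dist(here, other) / bandwidth) ** 2)
             for other in coords]
        fits.append(weighted_line_fit(x, y, w))
    return fits

# Hypothetical data: three western and three eastern locations.
coords = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
x = [1, 2, 3, 1, 2, 3]
y = [1, 2, 3, 3, 6, 9]  # y = x in the west, y = 3x in the east

global_fit = ols(x, y)
local_fits = gwr(coords, x, y, bandwidth=2.0)
print("global slope:", global_fit[1])
print("local slopes:", [round(slope, 3) for _, slope in local_fits])
```

The global fit averages the two regimes into one slope, while the local fits recover a slope close to 1 in the west and close to 3 in the east. A smaller bandwidth makes each local fit depend on fewer neighbors.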

 
