GIS and Crime — Part II

GIS and Crime, Part II, 18 March 2015

Visualization is a very important aspect of criminology.  Police can use daily crime maps to compare the locations of various types of crime.  In general, maps are useful for understanding hierarchical perspectives and obtaining an overview of how things change.  It is important to always consider the modifiable areal unit problem (MAUP), however, because the choice of areal units will affect how readers understand the issues.  Scale is always a vital feature of mapping.  Two other factors to keep in mind when mapping are the use of kernel densities and adjustments for population.  Kernel densities show relative impacts, rather than raw numbers.  Relative changes from low to high are more informative than actual values.  Normalizing data by the underlying population reduces biases in the resulting maps.  Additionally, there are a number of available software packages and datasets that make crime mapping easier with GIS.
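
As a rough illustration of population adjustment and kernel density, here is a minimal Python sketch.  The area names, counts, coordinates, and bandwidth are hypothetical values chosen for the example, and the Gaussian kernel is only one of several kernels a GIS package might offer.

```python
import math

def rate_per_1000(count, population):
    """Crime count normalized by the underlying population."""
    return 1000.0 * count / population

def kernel_density(x, y, points, bandwidth):
    """Gaussian kernel density estimate at (x, y) from incident points."""
    total = 0.0
    for px, py in points:
        d2 = (x - px) ** 2 + (y - py) ** 2
        total += math.exp(-d2 / (2.0 * bandwidth ** 2))
    return total / (2.0 * math.pi * bandwidth ** 2 * len(points))

# Hypothetical areas: (incident count, resident population)
areas = {"Downtown": (300, 60000), "Riverside": (90, 9000)}
rates = {name: rate_per_1000(c, p) for name, (c, p) in areas.items()}
# Downtown has more incidents, but Riverside has the higher rate.

# Hypothetical incident coordinates in map units
incidents = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
near_cluster = kernel_density(0.3, 0.3, incidents, bandwidth=1.0)
far_away = kernel_density(3.0, 3.0, incidents, bandwidth=1.0)
```

The density values are only meaningful relative to one another, which matches the point that kernel densities show relative impacts rather than raw numbers.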

GIS and Crime — Part I

GIS and Crime, Part I, 16 March 2015

The ability to visualize crime is essential for the identification of hotspots and reduction of crime rates.  Not all branches of criminology understand the importance of geography in understanding crime, but environmental criminology does understand that crime is not randomly distributed.  There are three ways to understand the spatial distribution of offenses and offenders: Routine Activity Theory, Rational Choice Theory, and Criminal Pattern Theory.

Routine Activity Theory is described by the following formula:

likely offender + suitable target – capable guardian = crime opportunity

By looking at nodes of activity, the theory takes into account the routine activities of both criminals and victims, and it accounts for socioeconomic variables.

Rational Choice Theory assumes that offenders make a rational decision about whether to commit a crime.  The theory supposes that criminals will balance rewards against the chances of getting caught.

Criminal Pattern Theory states that offenders will be affected by the routines of their daily lives.  For example, they will commit crimes in areas they are familiar with.

Environmental criminology combines these three theories to better understand the links between crime, time, and space.  By comparing the spatial distribution of offenders with that of offenses, researchers can understand how the two factors are related over space.

In Vancouver, GIS is incredibly important.  Because there is so much data and too many classes, GIS is the only way to sort through and utilize the information.  Dr. Kim Rossmo uses GIS in the city to aid in geographic profiling.  Profiling does not solve crimes, per se, but it does help find answers, particularly in cases of serial killers, as people tend to commit crimes in areas they are familiar with.  The Donut Theory states that while people will want to commit crimes in familiar areas, they do not want to do it too close to home, so there will be a ring of possible crime areas around their residence.

Crime analysis is “the qualitative and quantitative study of crime and law enforcement information in combination with socio-demographic and spatial factors to apprehend criminals, prevent crime, reduce disorder, and evaluate organizational procedures.”  GIS can be used to assist crime analysis in many applications, including intelligence analysis, criminal investigative analysis, tactical crime analysis, strategic crime analysis, and administrative crime analysis.  I have listed these analyses in order of level of aggregation, from low to high.

Definitions are as follows:

Intelligence analysis: “the study of ‘organized’ criminal activity in order to assist investigative personnel in linking people, events, and property”

Criminal investigative analysis: “the study of serial criminals, victims and/or crime scenes as well as physical, socio-economic, psychological, and geographic characteristics to develop patterns that will assist in linking together and solving current serial criminal activities”

Tactical crime analysis: “the study of recent criminal incidents and potential criminal activity by examining characteristics”

Strategic crime analysis: “the study of crime and law enforcement information integrated with the socio-economic and spatial factors to determine long term ‘patterns’ of activity”

Administrative crime analysis: “the presentation of interesting findings of crime research and analysis based on legal, political, and practical concerns to inform large audiences within law enforcement, administration, city government, and citizens”

As well, here is a brief history of crime mapping:

  • “Early 1800’s: Social Theorists: Single symbol point and graduated area maps
  • early 1900’s: New York City Police Department and others: Single symbol point maps, ‘pin maps’
  • 1920s and 30’s: Urban sociologists at the University of Chicago: Graduated area maps of crime and delinquency
  • 1960-70’s: First computer-generated maps of crime
  • 1980’s: Desktop computers available for (limited quality) mapping; Environmental Criminology Theory
  • 1990’s: Desktop GIS and integration with law enforcement systems and data; government funding, etc.”

GIS and Health Geography — Part III

GIS and Health Geography, Part III — Epidemiology, continued, 4 March 2015

There are several ways to understand disease.  One way is through manifestational criteria: observing manifestations of the condition.  This requires that each disease has a distinct set of symptoms, and this is how the disease is defined.  Another way to define disease is by causal criteria, which relies upon an understanding of the etiology of the disease.  One challenge in defining disease is the issue of equifinality: there are only so many ways that the body can react, so the same symptoms may be related to multiple diseases.  As well, one causal agent may have multiple manifestations.

In order to study disease, researchers must study its occurrence; two measures of this are prevalence (the proportion of a population that has the disease at a given point in time) and incidence (the rate at which new cases occur within a population over a given period).  These two measures help to determine the occurrence of a disease over a landscape.
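
The difference between the two measures can be sketched in a few lines of Python, taking prevalence as existing cases at one point in time and incidence as new cases per unit of person-time at risk.  The town and its case counts are hypothetical.

```python
def point_prevalence(existing_cases, population):
    """Proportion of the population that has the disease at one point in time."""
    return existing_cases / population

def incidence_rate(new_cases, person_years_at_risk):
    """New cases per unit of person-time at risk."""
    return new_cases / person_years_at_risk

# Hypothetical town of 10,000 people observed for one year
population = 10000
existing_cases = 200                   # people with the disease on the survey date
new_cases = 50                         # cases that developed during the year
at_risk = population - existing_cases  # people who could still become cases

prevalence = point_prevalence(existing_cases, population)
incidence = incidence_rate(new_cases, at_risk * 1.0)  # one year of follow-up each

print(f"Prevalence: {prevalence * 100:.1f}%")
print(f"Incidence: {incidence * 1000:.1f} per 1,000 person-years")
```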

Understanding demography is crucial to understanding epidemiology.  When considering data for studies, researchers must take into account the size of the study, multi-level models, and adjacent geographical areas.  Often they need to strike a balance between “statistical stability of the estimates and geographic precision.”  There can also be many statistical challenges in dealing with epidemiological data and models.  One such challenge is the use of the standardized mortality ratio, which may not work in all scenarios; for example, it can be unstable in areas with small populations.  In the case of pellagra in the US, spatial smoothing (an interpolation process) was applied so there could be more confidence in the data.  Another option for small areas is shrinkage estimation (head-bang interpolation), which uses the information from nearby areas to give confidence to the area being studied.
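
For reference, a standardized mortality ratio is usually computed as observed deaths divided by the deaths expected if the area had the age-specific death rates of a reference population.  The sketch below assumes that indirect-standardization form; the age bands, rates, and counts are hypothetical.

```python
def standardized_mortality_ratio(observed_deaths, local_population, reference_rates):
    """Observed deaths divided by deaths expected under reference age-specific rates."""
    expected = sum(local_population[age] * reference_rates[age]
                   for age in local_population)
    return observed_deaths / expected

# Hypothetical age-specific annual death rates in a reference population
reference_rates = {"0-39": 0.001, "40-64": 0.005, "65+": 0.040}

# Hypothetical study area: residents per age band and observed deaths
local_population = {"0-39": 2000, "40-64": 1000, "65+": 500}
observed = 36

smr = standardized_mortality_ratio(observed, local_population, reference_rates)

# A much smaller hypothetical area expects only about a quarter of a death,
# so a single observed death produces a very large ratio.
tiny_population = {"0-39": 20, "40-64": 10, "65+": 5}
tiny_smr = standardized_mortality_ratio(1, tiny_population, reference_rates)
```

A ratio above 1 means more deaths than expected.  The second, tiny area shows why small-area estimates are often smoothed: one death swings the ratio dramatically.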

GIS and Health Geography — Part II

GIS and Health Geography, Part II — Epidemiology, 2 March 2015

There are many applications for GIS within the field of health geography.  Four of these include the study of spatial epidemiology, the study of environmental hazards, modeling health services, and identifying health inequalities.

Spatial epidemiology investigates the spatial patterns of disease and risk of disease.  The scale of spatial epidemiology is at the individual and small area level, rather than the population level.  Two issues that arise in this field of work are spatial misalignment due to minor differences that have large impacts at the small-area level, and general uncertainty in the data, whether due to collection or the inherent nature of the information.  Best practices have been defined to reduce these issues.  The classic example of spatial epidemiology is Dr. John Snow’s map of cholera deaths in 1854.

Environmental hazards are addressed by the CDC in a stepwise fashion in an attempt to understand and prevent disease.  Their process involves hazard surveillance, exposure surveillance, and outcome surveillance.  GIS can be used to identify causes or mitigating factors.

GIS can also assist governments and other agencies with modeling the distribution and success of health services.  One example of this use is the Accessibility/Remoteness Index of Australia (ARIA).  This is a “generic index of accessibility/remoteness for all populated places that are non-metropolitan.”  It is useful for comparing results across different studies and identifying relationships across Australia.

GIS can be used to identify health inequalities based on previously known relationships.  For example, the relationship between socioeconomic status and heart disease can be mapped to help identify what part location plays in the relationship.

In general, epidemiology is “the study of the distribution and determinants of health and disease-related states in populations, and the application of this study to control health problems.”  There are many different approaches to this field of study, including descriptive and analytic approaches.  Again, the question of health as differentiated from disease arises, and the understanding of these two concepts dictates the decisions people make.

Health Geography — Part I

What is Health Geography? Part I, 23 February 2015

Geography has a great influence on health: people living in different areas will have different health outcomes.  There are many ways to look at the relationship between geography and health, and these have changed over time.  The old perspective was that of “medical geography,” which promotes a “biomedical viewpoint.”  In the 1980s, the naming and approach shifted to “health geography,” which questions the ideas of authority inherent in healthcare.  Ideas from the discipline of geography as well as other social sciences are now incorporated into the field of healthcare.  Health is intimately linked to other factors in society; it is not a self-contained entity.  This hybrid became “post-medical” geography.

Health geography problematizes some of the unquestioned beliefs of medical geography.  These include the assumptions that doctors are “neutral” and that factors such as gender and race do not affect the provision of healthcare for various populations.  In practice, different populations are treated differently and view the system differently.

There are five strands of health geography, on a spectrum from traditional medical to most contemporary:

  1. Spatial patterning of disease and health: medical geography; illness is seen as a “fact”; quantitative
  2. Spatial patterning of service provision: government perspective; emphasis on quantitative approaches, and the utilization of healthcare; assumes that everyone acts from a “rational,” cost-driven thought process
  3. Humanistic approaches to ‘medical geography’: emphasis on lay rationality, qualitative approaches, and the idea that illness is socially constructed.
  4. Structuralist / materialist / critical approaches to ‘medical geography’: assumes inherent inequalities based on the social, political, and economic systems; incorporates Marxist critiques of capitalism; health and power cannot be separated
  5. Cultural approaches to ‘medical geography’: researchers should immerse themselves in a particular community to understand its point of view

These diverse approaches stem from fundamental differences in the understanding of humanity.  These differences include the assumption of disease as “fact” rather than a function of geographical factors that affect access to healthcare; the role of place in health; the scale at which health is investigated; quantitative vs. qualitative analysis; and the centrality of social theory.

Some approaches combine multiple points of view.  Often, both quantitative and qualitative analyses are combined, such as in mixed-mode analyses.  The many-faceted field of health geography continues to overlap and change.

Statistics — Part II

Statistics Part II — 28 January 2015

Simple linear regression has one dependent variable and one independent variable.  It is expressed as the linear equation y = a + bx, where a is the intercept and b is the slope.  The method creates a best-fit line by minimizing the sum of squared residuals.
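
A minimal sketch of fitting y = a + bx by least squares, using the standard closed-form estimates for a single independent variable.  The data points are hypothetical.

```python
def fit_line(xs, ys):
    """Return intercept a and slope b that minimize the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

# Hypothetical observations
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_line(xs, ys)

# Residuals are the vertical distances between each point and the fitted line.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
sum_squared_residuals = sum(r ** 2 for r in residuals)
```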

Multiple regression is useful when there are several factors that affect the dependent variable.  It is modeled with an equation where each X has its own coefficient, plus a random error term.

R squared is the coefficient of determination: the proportion of the variance in the dependent variable that the model explains.  It is calculated from the residuals, and demonstrates the strength of the relationship and the fit of the model.
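
To make the role of the residuals concrete, the following sketch computes R squared as one minus the ratio of the residual sum of squares to the total sum of squares.  The two-variable model, its coefficients, and the data are hypothetical, and the coefficients are assumed to have been estimated already.

```python
def predict(row, intercept, coefficients):
    """Multiple regression prediction: intercept plus one coefficient per X."""
    return intercept + sum(b * x for b, x in zip(coefficients, row))

def r_squared(observed, predicted):
    """Share of the variance in the observed values explained by the predictions."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical model with two independent variables
intercept, coefficients = 1.0, [2.0, 0.5]
X = [[1.0, 4.0], [2.0, 2.0], [3.0, 6.0], [4.0, 8.0]]
y = [5.2, 5.8, 10.1, 12.9]

y_hat = [predict(row, intercept, coefficients) for row in X]
r2 = r_squared(y, y_hat)
```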

There are several ways to examine the utility of the model.  For example, the P values demonstrate the relevance of each variable, while F values show the significance of the model as a whole.  AICc values also demonstrate model simplicity and parsimony.  You should also look at multicollinearity to determine whether your independent variables are related to one another or are telling the same story.  A model could also be demonstrating endogeneity if there are circular or backwards relationships.  As well, it is important to check the model specification, as a model may be biased by omitting a relevant variable.
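
One common multicollinearity diagnostic, not named in these notes, is the variance inflation factor (VIF).  The sketch below covers only the two-predictor case, where the VIF reduces to 1 / (1 - r²) with r the correlation between the two predictors.  The variables are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """Variance inflation factor when the model has exactly two predictors."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)

# Hypothetical predictors
income = [1.0, 2.0, 3.0, 4.0, 5.0]
education = [2.0, 1.0, 4.0, 3.0, 5.0]   # strongly related to income
rainfall = [2.0, 1.0, 0.0, 1.0, 2.0]    # unrelated to income

vif_related = vif_two_predictors(income, education)
vif_unrelated = vif_two_predictors(income, rainfall)
```

A VIF of 1 means the predictor shares no linear information with the other one; larger values indicate that the two are increasingly telling the same story.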

Geography plays a very large role in statistical analysis of data.  As previously mentioned, spatial autocorrelation and MAUP are issues that the researcher should take into account.  Models such as Geographically Weighted Regression create local models that take geography into account. (Ordinary Least Squares, while it is probably the best known type of regression analysis, is useful for non-spatial data, because it provides a global model.)  Manipulation of bandwidth, kernel types, and other parameters can give more control over the creation of these local models.
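
The idea behind a local model can be sketched as a weighted least-squares fit centred on one location, where nearby observations receive more weight than distant ones.  This is a simplified illustration with one independent variable and a Gaussian kernel, not the full method as implemented in GIS software.  The coordinates, values, and bandwidths are hypothetical.

```python
import math

def gaussian_weight(distance, bandwidth):
    """Weight that decays smoothly with distance from the target location."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)

def local_fit(target, observations, bandwidth):
    """Weighted least-squares line y = a + b*x centred on one location.

    observations: list of (x_coord, y_coord, x_value, y_value) tuples.
    """
    tx, ty = target
    weights, xs, ys = [], [], []
    for cx, cy, x_value, y_value in observations:
        distance = math.hypot(cx - tx, cy - ty)
        weights.append(gaussian_weight(distance, bandwidth))
        xs.append(x_value)
        ys.append(y_value)
    total = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total
    sxy = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(weights, xs, ys))
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    b = sxy / sxx
    return mean_y - b * mean_x, b

# Hypothetical data: the relationship is y = x in the west and y = 3x in the east.
observations = [
    (0.0, 0.0, 1.0, 1.0), (0.0, 1.0, 2.0, 2.0), (1.0, 0.0, 3.0, 3.0),
    (10.0, 0.0, 1.0, 3.0), (10.0, 1.0, 2.0, 6.0), (11.0, 0.0, 3.0, 9.0),
]

_, slope_west = local_fit((0.0, 0.0), observations, bandwidth=2.0)
_, slope_east = local_fit((10.0, 0.0), observations, bandwidth=2.0)
# A very large bandwidth weights every observation almost equally,
# which approximates a single global model.
_, slope_global = local_fit((5.0, 0.0), observations, bandwidth=1e6)
```

With a small bandwidth the two local slopes differ, while the very large bandwidth recovers a single averaged slope, which is the contrast between a local and a global model.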

Statistics

Statistics, 26 January 2015

Statistics is a useful tool for GIS because it provides methods to explore data, understand patterns and relationships, and predict the future.  Data can be classified as nominal, ordinal, interval, or ratio, and can also be examined as samples or as populations.

There are many ways to deal with a set of data.  Examinations could be performed graphically or numerically.  Data may also be derived (e.g. percentages) as opposed to raw numbers.

When looking to summarize data, there are many options, depending on the type of information you are looking at.  The following are some of the ways to summarize a set of data:

  • Measures of central tendency: mean, median, mode, etc.
  • Measures of skewness: if you plot it, are there uneven tails on either side?
  • Measures of kurtosis: whether the distribution has a high peak or a low peak
  • Z-score: the number of standard deviations between a score and the mean of the data set
  • Arithmetic mean: average
  • Geometric mean: for percentage data.
  • Harmonic mean: average of rates. N over (sum of (1/variable))
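
The summaries listed above can be sketched with Python's standard library.  The data values are hypothetical, and skewness and kurtosis are computed here as the third and fourth standardized moments, which is one of several conventions.

```python
import math
import statistics

def z_scores(data):
    """Number of standard deviations each value lies from the mean."""
    mean = statistics.mean(data)
    sd = statistics.pstdev(data)
    return [(v - mean) / sd for v in data]

def skewness(data):
    """Third standardized moment: positive when the right tail is longer."""
    return sum(z ** 3 for z in z_scores(data)) / len(data)

def kurtosis(data):
    """Fourth standardized moment: about 3 for a normal distribution."""
    return sum(z ** 4 for z in z_scores(data)) / len(data)

def geometric_mean(data):
    """Suited to multiplicative data such as percentage growth factors."""
    return math.exp(sum(math.log(v) for v in data) / len(data))

def harmonic_mean(data):
    """N over the sum of reciprocals; suited to averaging rates."""
    return len(data) / sum(1.0 / v for v in data)

values = [2.0, 4.0, 4.0, 5.0, 10.0]           # hypothetical data set
center = (statistics.mean(values),
          statistics.median(values),
          statistics.mode(values))

growth_factors = [1.10, 1.20, 0.90]            # +10%, +20%, -10%
speeds = [30.0, 60.0]                          # same distance at two speeds
```

For example, `harmonic_mean(speeds)` gives the average speed over a trip driven half the distance at each speed, which is lower than the arithmetic mean of the two speeds.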

As well, one can look at the relationships between values in a data set.  For these, you can use measures of association, such as various types of correlation:

  • Pearson’s r
  • Spearman’s rho
  • Crosstabulation
  • Chi square statistics
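
As a small illustration of the first two measures, the sketch below computes Pearson's correlation on raw values and Spearman's correlation as Pearson's correlation on ranks.  The data are hypothetical and chosen so that the relationship is perfectly monotonic but not linear.

```python
import math

def pearson_r(xs, ys):
    """Strength of the linear relationship between two variables."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman_rho(xs, ys):
    """Pearson correlation of the ranks: strength of a monotonic relationship."""
    return pearson_r(ranks(xs), ranks(ys))

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 4.0, 9.0, 16.0, 25.0]   # y = x squared

linear_strength = pearson_r(xs, ys)
monotonic_strength = spearman_rho(xs, ys)
```

Because y always increases with x, the rank-based measure is exactly 1, while the linear measure falls slightly short of 1.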

Another way to look at relationships is through regression analysis. One common form of regression analysis is Ordinary Least Squares, which attempts to minimize the sum of squared errors.  The residuals are squared and then summed.

There are several issues to take note of when modeling data.  Ideally, you want simplicity and parsimony, to explain the most with the fewest variables.  You must watch out for multicollinearity: having two or more variables representing the same thing.  You should also look at whether your data is homoskedastic (residuals are scattered evenly along a straight horizontal line) or heteroskedastic (there is variability in the residuals across the range).  One other important factor, when looking at data involving geography, is autocorrelation.  In general, most geographic data is spatially autocorrelated, so in performing analyses, you can use spatial declustering or other methods to reduce bias.
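
Spatial autocorrelation can be measured in several ways.  One widely used statistic, not named in these notes, is Moran's I, sketched below for a hypothetical row of four areas in which each area neighbours the next.

```python
def morans_i(values, weights):
    """Moran's I for values observed in areas linked by a spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    weight_total = sum(sum(row) for row in weights)
    cross_products = sum(weights[i][j] * deviations[i] * deviations[j]
                         for i in range(n) for j in range(n))
    return (n / weight_total) * cross_products / sum(d * d for d in deviations)

# Hypothetical binary adjacency for four areas in a row: 0-1, 1-2, 2-3
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

clustered = morans_i([1.0, 2.0, 8.0, 9.0], weights)     # similar values are neighbours
alternating = morans_i([1.0, 9.0, 1.0, 9.0], weights)   # neighbours are dissimilar
```

Positive values indicate that similar values cluster together, and negative values indicate that neighbouring areas tend to differ.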

When determining which model to use, look at both the R Squared value and the Akaike Information Criterion (AIC).  The AIC rewards goodness of fit but penalizes each additional variable, so you can weigh the complexity of the model against how much it explains; when comparing models, a lower AIC is preferred.
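
A minimal sketch of comparing two models by AIC, assuming least-squares models with normally distributed errors, for which AIC can be written (up to a constant) as n ln(RSS / n) + 2k.  The small-sample correction AICc adds 2k(k + 1) / (n - k - 1).  The residual sums of squares and parameter counts are hypothetical, and conventions for counting k vary between software packages.

```python
import math

def aic_least_squares(rss, n, k):
    """AIC (up to an additive constant) from the residual sum of squares."""
    return n * math.log(rss / n) + 2 * k

def aicc(aic, n, k):
    """Small-sample corrected AIC."""
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)

n = 30  # hypothetical number of observations

# Hypothetical simple model: slightly worse fit, few parameters
simple_model = aicc(aic_least_squares(rss=120.0, n=n, k=3), n, 3)

# Hypothetical complex model: slightly better fit, twice as many parameters
complex_model = aicc(aic_least_squares(rss=118.0, n=n, k=6), n, 6)

preferred = "simple" if simple_model < complex_model else "complex"
```

Here the extra variables barely reduce the residuals, so the penalty outweighs the gain and the simpler model has the lower score.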

Here is some regression analysis terminology:

  • Simple linear: model uses just x and y
  • Multiple: multiple independent variables
  • Multivariate: multiple dependent and independent variables (canonical analysis)
  • SAR (Spatial Autoregressive model)
  • CAR (Conditional Autoregressive model)
  • Logistic: binary data
  • Poisson: count data
  • Ecological: how do we better predict what an individual has, regardless of the group?
  • Hedonic: determine the value of something by assigning value to attributes of that thing
  • Analysis of variance: analysis of differences between means
  • t-test: difference between two means
  • Analysis of covariance: analysis of differences between means while controlling for a continuous covariate

One way to account for spatial autocorrelation and other issues introduced by geography is to use methods such as Geographically Weighted Regression (GWR).  These models account for local variation, as opposed to creating a global model.  One other method of avoiding the problems of spatial autocorrelation is to look at bandwidth.  If you look at small areas, they will display less correlation than the larger area generally.

To test the statistical analysis you have performed, you can use sensitivity analysis to see how the results respond when individual analysis choices are changed.  These choices can include the model, decay function, bandwidth, and point or centroid selection.

Patterns and Processes — Part II

Patterns and Processes Part II, 21 January 2015

There is a fundamental assumption that underlies a study of landscape metrics: “Processes are linked to, and can be predicted from, some (often unknown) broad-scale spatial pattern.”  From the pattern, we can understand the processes.  Therefore, methods must be developed by which we can qualify and quantify the patterns we see.

Within a given landscape, there are three causes of spatial patterning: local uniqueness, phase differences, and dispersal.  Boundaries are created and maintained within the landscape, which leads to the identification of patches.  These boundaries can be distinguished as “sharp, narrow, persistent” or “blurred, wide, transient,” and they may be difficult to identify.  One example of patterning is fragmentation.  Fragmentation may occur as the result of many different processes, and it both increases isolation and leads to declines in rare species.

There are certain criteria one must look at when analyzing a landscape.  According to Riitters et al. (1995), there are five classes of metrics:

  1. “Number of classes or cover types”
  2. “Texture measures (fine or coarse)”
  3. “Degree to which patches are compact or dissected”
  4. “Patches are linear or planar”
  5. “Patch perimeters are complicated or simple in shape”

When examining landscape composition, there are additional metrics:

  • “Relative richness: the proportion of the number of cover types potentially present”
  • “Dominance: the deviation from the maximum possible evenness”
  • “Diversity: a reflection of richness and how evenly the proportions of cover types are distributed”
  • “Connectivity: based on a user-defined threshold, a measure of how connected the patches are”
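
The first three composition metrics can be sketched as follows.  This example assumes Shannon's index as the diversity measure and defines dominance as the gap between the maximum possible Shannon value and the observed one; other formulations exist.  The cover types and cell counts are hypothetical.

```python
import math

def proportions(cover_cells):
    """Share of the landscape occupied by each cover type that is present."""
    total = sum(cover_cells.values())
    return [count / total for count in cover_cells.values() if count > 0]

def relative_richness(cover_cells, potential_types):
    """Cover types present as a proportion of those potentially present."""
    present = sum(1 for count in cover_cells.values() if count > 0)
    return present / potential_types

def shannon_diversity(cover_cells):
    """Shannon's index: higher with more types and more even proportions."""
    return -sum(p * math.log(p) for p in proportions(cover_cells))

def dominance(cover_cells):
    """Deviation from maximum evenness: zero when all types are equally common."""
    type_count = len(proportions(cover_cells))
    return math.log(type_count) - shannon_diversity(cover_cells)

# Hypothetical landscapes of 100 grid cells each
even_landscape = {"forest": 25, "grass": 25, "water": 25, "urban": 25}
skewed_landscape = {"forest": 85, "grass": 5, "water": 5, "urban": 5}

richness = relative_richness(even_landscape, potential_types=8)
```

Both landscapes have the same richness, but the skewed one has lower diversity and higher dominance because forest covers most of it.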

Finally, there are additional measures of spatial configuration:

  • “Probabilities of adjacency–the probability that a grid cell of cover type i is adjacent to cover type j”
  • “Contagion–distinguishes between overall landscape patterns that are clumped or rather dissected”
  • “Connectivity–how fragmented is a habitat type”
  • “Proximity index–the degree to which patches in the landscape are isolated from other patches of the same cover type”
  • “Area-weighted average patch size–to account for the frequently observed skewed distribution in patch sizes”
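
Two of the configuration measures above are simple enough to sketch directly: adjacency probabilities counted from a raster grid, and area-weighted average patch size.  The grid, the cover codes "F" and "G", and the patch areas are hypothetical, and only horizontal and vertical neighbours are counted.

```python
from collections import Counter

def adjacency_probabilities(grid):
    """Probability that a neighbour of a type-i cell is of type j."""
    pair_counts = Counter()
    rows, cols = len(grid), len(grid[0])
    for r in range(rows):
        for c in range(cols):
            # Look right and down so each adjacency is visited once...
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    a, b = grid[r][c], grid[rr][cc]
                    # ...then record it from both cells' points of view.
                    pair_counts[(a, b)] += 1
                    pair_counts[(b, a)] += 1
    totals = Counter()
    for (a, _), count in pair_counts.items():
        totals[a] += count
    return {pair: count / totals[pair[0]] for pair, count in pair_counts.items()}

def area_weighted_mean_patch_size(areas):
    """Mean patch size in which each patch is weighted by its own area."""
    return sum(a * a for a in areas) / sum(areas)

# Hypothetical 3 x 3 landscape: F = forest, G = grass
grid = [
    ["F", "F", "G"],
    ["F", "F", "G"],
    ["G", "G", "G"],
]
probabilities = adjacency_probabilities(grid)

# Hypothetical skewed patch sizes: three tiny patches and one large one
patch_areas = [1.0, 1.0, 1.0, 97.0]
simple_mean = sum(patch_areas) / len(patch_areas)
weighted_mean = area_weighted_mean_patch_size(patch_areas)
```

The weighted mean sits close to the size of the large patch, which better reflects where most of the landscape's area actually lies than the simple mean does.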

One method of measuring the complexity of a landscape is to look at fractals.  It may be easy to identify a patch, but it is more difficult to determine whether that patch is complicated or simple.  Fractals identify the fundamental complexity of an object.
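
The notes do not say which fractal measure was intended.  As one possibility, the sketch below uses a perimeter-area fractal dimension for raster patches, 2 ln(0.25 × perimeter) / ln(area), which is close to 1 for simple shapes such as squares and approaches 2 for highly convoluted ones.  The patch dimensions are hypothetical and measured in cell units.

```python
import math

def patch_fractal_dimension(perimeter, area):
    """Perimeter-area fractal dimension of a raster patch (cell units)."""
    if area <= 1:
        raise ValueError("area must be greater than one cell")
    return 2.0 * math.log(0.25 * perimeter) / math.log(area)

# Two hypothetical patches with the same area of 100 cells
square = patch_fractal_dimension(perimeter=40.0, area=100.0)   # 10 x 10 block
strip = patch_fractal_dimension(perimeter=202.0, area=100.0)   # 1 x 100 strip
```

Both patches cover the same area, but the strip has far more edge per unit of area and therefore the higher value.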