Limitations

Although we took steps to specify our models as well as possible and to choose parameters appropriately, several limitations and sources of error and uncertainty should be noted, regarding both our chosen variables and the models with which we analysed them.

The first is the effect of lag times. Our analysis primarily used data from 1999 and 2000, on the assumption that smoking rates and other independent variables remained fairly constant over time. However, lung cancer mortality rates have a strong temporal component. Indeed, Islami, Torre, and Jemal (2015) found that national lung cancer mortality rates generally followed historical trends in smoking prevalence with a lag of 20-30 years. It is therefore possible that the mortality rates used in our study were more strongly influenced by the smoking rates or other environmental factors of the 1970s and 80s than by those of our study period. For instance, smoking rates peaked in the 1960s and had begun a rapid decline in the 1970s (CDC, 2014). During the time frame of our analysis, long-term American male and female lung cancer mortality rates had reached peak levels and begun to decline. This lag effect points to a possible temporal mismatch between our explanatory variables and the mortality data, which this analysis did not account for.

Sources of uncertainty within our data are also worth noting. For instance, the radon risk surface produced by the EPA, which this study used as a proxy for the radon exposure of the population, was based on numerous characteristics of the underlying geology. However, exceedances can occur even in areas characterized as “low risk” by the EPA. Florida is one such example: radon risk there is characterized as low, but exposure remains high (Florida Department of Health, n.d.). Other determinants may also influence a population’s exposure to higher levels of indoor radiation. In the United States, radon-resistant building regulations can be implemented at the state, county, and municipal levels (EPA, 2018). Our analysis did not consider which areas have adopted this type of legislation. Our analysis may also have lacked variables that could have explained more of the observed lung cancer mortality patterns, such as health care accessibility, gender, or occupational exposure to carcinogens. Data values are also subject to the effects of aggregation and the Modifiable Areal Unit Problem. Even in a county-level analysis, there may be intra-county variations in lung cancer mortality rates and in the explanatory variables that describe them. The effects of local phenomena may be lost when data are aggregated to larger spatial units.

Limitations of our models must also be acknowledged. For instance, our GWR analysis was limited by edge effects. GWR fits a local regression equation at each feature, which depends on the values of neighbouring features. However, our study area had borders along its western and northern edges. Unlike counties in the central region of our study area, counties near these borders did not have neighbours on all sides. We did take measures to limit edge effects on the western side of our study area when creating our atmospheric particulate matter surface, by generating the layer across the entire USA, but edge effects would still have influenced the results of our GWR. We also struggled with the effects of local multicollinearity. When trying to perform GWR on our second model, we experienced severe design failures. This raised concerns about the extent to which some of our explanatory variables may have exhibited multicollinearity. For example, income may be correlated with the proportion of the population that is university educated. Unfortunately, troubleshooting did not get Model 2 to run in our GWR analysis, so questions regarding the extent of multicollinearity could not be resolved. Lastly, our GWR analysis yielded local adjusted R-squared statistics indicating that the model accounted for between 19% and 72% of the variation in observed lung cancer mortality rates. Areas where our R-squared values were lowest, in states such as Pennsylvania (PA) and New York (NY), suggest that important explanatory variables were missing from the model.
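One way the multicollinearity question raised above could be explored outside the GWR tool is with variance inflation factors (VIFs). The following is a minimal sketch only and is not the diagnostic used in this study. It assumes NumPy is available, the function name and variable names are hypothetical, and the data are synthetic values generated purely for illustration (they are not our county data). Each VIF is computed as 1 / (1 - R²), where R² comes from regressing one explanatory variable on all the others; values well above roughly 10 are often treated as a warning sign.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return one VIF per column of X (rows = observations, columns = variables)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    vifs = []
    for j in range(k):
        y = X[:, j]                              # variable being tested
        others = np.delete(X, j, axis=1)         # all remaining variables
        A = np.column_stack([np.ones(n), others])  # add an intercept column
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        ss_res = resid @ resid
        ss_tot = ((y - y.mean()) ** 2).sum()
        r2 = 1.0 - ss_res / ss_tot
        vifs.append(np.inf if r2 >= 1.0 else 1.0 / (1.0 - r2))
    return np.array(vifs)

# Synthetic, illustrative data only: "education" is built to track "income"
# closely, while "radon" is independent of both.
rng = np.random.default_rng(0)
income = rng.normal(size=500)
education = 0.9 * income + 0.1 * rng.normal(size=500)
radon = rng.normal(size=500)

vifs = variance_inflation_factors(np.column_stack([income, education, radon]))
for name, vif in zip(["income", "education", "radon"], vifs):
    print(f"{name}: VIF = {vif:.1f}")
```

In this synthetic case the two deliberately correlated variables receive large VIFs while the independent one stays near 1. Note that this is a global check; local multicollinearity in GWR would require applying the same idea to the subset of neighbouring features used in each local regression.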

It is also important to note that this study considers mortality from, rather than incidence of, lung cancer. Mortality relates to survival, and thus may be influenced by many factors including the presence of other life-threatening conditions, smoking status at the time of diagnosis, and access to healthcare (Tammemagi, Neslund-Dudas, Simoff, & Kvale, 2004; Ward et al., 2009). This may lead to an underestimation of lung cancer risk in areas with a high proportion of former smokers, who have a higher risk of developing lung cancer than never-smokers (Christian, Bin Huang, Rinehart & Hopenhayn, 2011). Access to healthcare varies across the United States and is limited by a number of factors, such as the number of facilities, the affordability of treatment and insurance, and systemic racism (Mayberry, Mili & Ofili, 2000; Shriniwas, Yingkui & Thomas, 2014). Moreover, local and statewide anti-smoking regulations would likely play an influential role in lung cancer mortality by limiting exposure to firsthand and secondhand smoke (Pickett, Schober, Brody, Curtin & Giovino, 2006). This is more relevant to more recent studies, since the time frame of our analysis predates the implementation of the majority of such policies (Hopkins et al., 2012). Several studies have explored the influence of anti-smoking policies on cancer and respiratory health, but few have employed spatial analysis tools such as geographically weighted regression (Barnoya & Glantz, 2004; Tan & Glantz, 2012; Callinan, Clarke, Doherty & Kelleher, 2010).
