This analysis explores agricultural commodity loss at the county level from 1989 to 2015 for the 26-county Palouse region of Washington, Idaho, and Oregon. We examine the full range of commodities and damage causes, identifying the commodities with the highest revenue loss and their most pertinent damage causes, as indicated by the USDA's agricultural commodity loss insurance archive.

Phase 3

In Phase 3, we perform a mixed modeling analysis for CHERRIES, using a two-step hurdle technique, for a selected set of damage causes. This analysis builds on Phases 1 and 2.

Hurdle Mixed Models

Hurdle model techniques allow us to address zero-inflated datasets by first running a logistic regression model to estimate the probability of a zero occurring. We then fit the non-zero values in a separate mixed model. In this instance, we use county as a random effect.
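
One way to write the two steps down is sketched below. The notation is an assumption made for illustration and is not taken from the analysis itself: $y_{ct}$ is the loss for county $c$ in year $t$ (possibly after a transformation), $x_{ct}$ is a vector of predictors, and the county random effect is assumed to enter as a random intercept $u_c$.

```latex
% Step 1: logistic regression for whether any loss occurs
\operatorname{logit} \Pr(y_{ct} > 0) = x_{ct}^{\top} \gamma

% Step 2: mixed model fitted to the non-zero values only,
% with county as a random intercept (assumed form)
y_{ct} \mid (y_{ct} > 0) = x_{ct}^{\top} \beta + u_c + \varepsilon_{ct},
\qquad u_c \sim \mathcal{N}(0, \sigma_u^2),
\qquad \varepsilon_{ct} \sim \mathcal{N}(0, \sigma^2)
```

The two coefficient vectors $\gamma$ and $\beta$ are estimated separately, so a predictor can affect whether a loss occurs differently from how it affects the size of that loss.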

In our two-part hurdle model, we identify zero values: that is, counties and years that have zero loss for particular damage causes for cherries. We previously removed counties that we determined, from known crop yield data, to have no cherries being grown. The counties we are identifying are those where we KNOW cherries are being grown but where, in some years, no loss claims were filed.

As such, these are not missing data but actual zero values that we do not want to exclude from our model. However, we also want to be able to work with an approximately normal distribution that is not positively skewed or zero inflated.
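
The split described above can be sketched in a few lines. This is a minimal illustration in Python, not the code used for the analysis (the output in this report appears to come from R). The county names and loss values are hypothetical; only the `non_zero` indicator name follows the variable that appears in the test output.

```python
# Hypothetical county-year loss records (illustrative values only,
# not taken from the USDA archive).
records = [
    {"county": "County A", "year": 2001, "loss": 0.0},
    {"county": "County A", "year": 2002, "loss": 15200.0},
    {"county": "County B", "year": 2001, "loss": 0.0},
    {"county": "County B", "year": 2002, "loss": 830.0},
    {"county": "County C", "year": 2001, "loss": 0.0},
    {"county": "County C", "year": 2002, "loss": 4100.0},
]

# Step 1 input: a binary indicator that keeps every row, zeros included.
# This is the response for the logistic (binomial) model.
for r in records:
    r["non_zero"] = 1 if r["loss"] > 0 else 0

# Step 2 input: only the rows with a positive loss.
# This subset is the response for the separate mixed model.
positive = [r for r in records if r["non_zero"] == 1]

# Share of true zeros that the first step has to account for.
zero_share = 1 - len(positive) / len(records)
print(f"{len(records)} rows, {len(positive)} non-zero, zero share = {zero_share:.2f}")
```

The zero rows stay in the first step as real observations and are dropped only from the second step, which is what keeps them from being treated as missing data.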

Hurdle Model - CHERRIES

Here we run our hurdle technique for CHERRIES, using a generalized linear model with a binomial family to delineate between zero and non-zero values. Given this model, is our data normally distributed? What outliers, if any, exist? Are the residuals well distributed, indicating normality?

Cherries non-zero goodness of fit: Hosmer-Lemeshow (hoslem) test

##          llh      llhNull           G2     McFadden         r2ML 
## -343.9471505 -487.9400087  287.9857163    0.2951036    0.3241724 
##         r2CU 
##    0.4410982
##  Hosmer and Lemeshow goodness of fit (GOF) test
## data:  alllevs2_cherries$non_zero, fitted(m1)
## X-squared = 23.313, df = 8, p-value = 0.002985
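
The reported values can be cross-checked by hand. The sketch below, in Python for illustration only, recomputes `G2` and `McFadden` from the two log-likelihoods using their standard definitions, and recomputes the Hosmer-Lemeshow p-value from the chi-square statistic. The helper `chi2_sf_even_df` is a hypothetical name introduced here; it uses the closed-form upper tail that holds only for even degrees of freedom.

```python
import math

# Values copied from the output above.
llh = -343.9471505        # log-likelihood of the fitted binomial model
llh_null = -487.9400087   # log-likelihood of the intercept-only model

# Likelihood-ratio statistic and McFadden's pseudo R-squared.
g2 = -2.0 * (llh_null - llh)
mcfadden = 1.0 - llh / llh_null


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = 1.0
    total = 1.0
    # Sum (x/2)^k / k! for k = 0 .. df/2 - 1.
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


# Hosmer-Lemeshow statistic and degrees of freedom from the test output.
p_value = chi2_sf_even_df(23.313, 8)

print(f"G2 = {g2:.4f}, McFadden = {mcfadden:.4f}, p-value = {p_value:.6f}")
```

Under the usual reading of the Hosmer-Lemeshow test, the null hypothesis is that the model fits adequately, so a p-value this small suggests some lack of fit in the binomial step. That is worth keeping in mind when interpreting the zero/non-zero model.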

Cherries zero/non-zero binomial model to see outliers and zero values vs non-zero values