This analysis explores county-level agricultural commodity loss from 1989 to 2015 for the 26-county Palouse region of Washington, Idaho, and Oregon. We examine the entire range of commodities and damage causes, identifying the commodities with the largest revenue losses and their most pertinent damage causes, as indicated by the USDA’s agricultural commodity loss insurance archive.

Phase 3

In Phase 3, we perform a DRY PEAS mixed modeling analysis using a two-step hurdle technique, for a selected set of damage causes. The following analysis builds on Phases 1 and 2.

Hurdle Mixed Models

Hurdle model techniques allow us to address zero-inflated datasets by first running a logistic regression model to estimate the probability of a zero occurring. We then fit the non-zero values in a separate mixed model. In this instance, we use county as a random effect.
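As a rough sketch, the two parts can be written as follows. The notation is an assumption made for illustration and does not come from the analysis itself: $y_{ct}$ is the dry pea loss for county $c$ in year $t$, $x_{ct}$ is a vector of damage-cause covariates, $g$ is whatever transformation (possibly the identity) is applied to the positive losses, and $u_c$ is the county random intercept.

```latex
% Part 1: binomial (logistic) model for whether any loss occurs
\Pr(y_{ct} > 0) = \operatorname{logit}^{-1}\!\left(x_{ct}^{\top}\beta\right)

% Part 2: mixed model fitted only to the non-zero losses,
% with county as a random intercept
g(y_{ct}) = x_{ct}^{\top}\gamma + u_c + \varepsilon_{ct},
\qquad y_{ct} > 0,
\qquad u_c \sim \mathcal{N}(0, \sigma_u^2),
\qquad \varepsilon_{ct} \sim \mathcal{N}(0, \sigma^2)
```

The two parts are fitted separately: the zeros only enter Part 1, so they no longer distort the distribution modelled in Part 2.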

In our two-part hurdle model, we identify zero values, that is, counties and years that have zero loss for particular damage causes for dry peas. Previously, we removed counties where we determined, based on known crop yield data, that no dry peas are being grown. The counties we identify here are those where we KNOW dry peas are being grown, but where, in some years, no loss claims were filed.

As such, these are not missing data, but actual zero values that we do not want to exclude from our model. However, we want the non-zero values to follow a distribution that is closer to normal, rather than one that is positively skewed and zero inflated.
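The zero handling described above can be illustrated with a minimal Python sketch. This is not the code used in this analysis. The function `zero_fill`, the field names, the county labels, and the loss amounts are all hypothetical and invented purely for illustration. County-years with no claim become true zeros, and counties not known to grow dry peas are left out entirely.

```python
from itertools import product


def zero_fill(claims, growing_counties, years):
    """Build one row per (county, year) for counties known to grow the crop.

    claims: dict mapping (county, year) -> loss amount for filed claims.
    A county-year with no filed claim is recorded as a true zero loss.
    Counties outside growing_counties are excluded altogether.
    """
    rows = []
    for county, year in product(sorted(growing_counties), years):
        loss = claims.get((county, year), 0.0)
        rows.append(
            {
                "county": county,
                "year": year,
                "loss": loss,
                # Indicator used as the response in the binomial step
                "non_zero": int(loss > 0),
            }
        )
    return rows


# Invented toy data: county_C has a claim but is not in the growing set.
claims = {
    ("county_A", 1989): 1200.0,
    ("county_B", 1990): 350.0,
    ("county_C", 1989): 90.0,
}
growing = {"county_A", "county_B"}

panel = zero_fill(claims, growing, range(1989, 1991))

# Rows passed on to the second (mixed model) step
positive = [row for row in panel if row["non_zero"]]
```

Here `panel` feeds the zero versus non-zero step, while `positive` holds only the rows that would enter the mixed model.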

Hurdle Model - DRY PEAS

Here we run our hurdle technique for DRY PEAS, using a generalized linear model with a binomial family to distinguish between zero and non-zero values. Given this model, is our data normally distributed? What outliers, if any, exist? Are the residuals well distributed, indicating normality?

Dry Peas non-zero goodness of fit: pseudo R-squared values and Hosmer-Lemeshow (hoslem) test

##           llh       llhNull            G2      McFadden          r2ML 
##  -755.4688136 -1138.2908423   765.6440573     0.3363130     0.3129277 
##          r2CU 
##     0.4653870
##  Hosmer and Lemeshow goodness of fit (GOF) test
## data:  alllevs2_drypeas$non_zero, fitted(m1)
## X-squared = 15.916, df = 8, p-value = 0.04359
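As a sanity check on the output above, the following Python sketch recomputes three of the reported quantities from the other printed values. It is an independent illustration, not the code that produced the output. The helper `chi2_sf_even_df` is a hypothetical name; it uses the closed-form chi-squared tail probability that holds only for even degrees of freedom. Only the standard library is assumed.

```python
import math

# Values copied from the output above
llh = -755.4688136        # log-likelihood of the fitted model
llh_null = -1138.2908423  # log-likelihood of the intercept-only model
hl_stat = 15.916          # Hosmer-Lemeshow X-squared
hl_df = 8                 # Hosmer-Lemeshow degrees of freedom

# Likelihood-ratio statistic: G2 = -2 * (llhNull - llh)
g2 = -2.0 * (llh_null - llh)

# McFadden pseudo R-squared: 1 - llh / llhNull
mcfadden = 1.0 - llh / llh_null


def chi2_sf_even_df(x, df):
    """Upper-tail chi-squared probability, valid only for even, positive df.

    For df = 2k: P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires an even, positive df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


p_value = chi2_sf_even_df(hl_stat, hl_df)

print(round(g2, 4), round(mcfadden, 4), round(p_value, 4))
```

The recomputed values agree with the reported G2, McFadden, and p-value up to rounding. A Hosmer-Lemeshow p-value below 0.05 is conventionally read as evidence of some lack of fit, so the binomial step may deserve a closer look at its covariates or grouping.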

Dry Peas zero/non-zero binomial model, used to examine outliers and zero values vs non-zero values