Agricultural Data Mining Methodology

DMINE is a system of services and code that collect, transform, and model data on climate impacts. We do this using R and Python, along with high-performance computing (HPC) resources.

Study area for analysis: The three-state region of Idaho, Washington, and Oregon serves as the basis for our overall analysis, and we zero in on the three agricultural regions for specific model development.

Our approach uses data extraction and transformation techniques in R and Python to organize and filter data for use in machine learning predictive models.
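
As a rough illustration of the organize-and-filter idea, the sketch below keeps only records from the three-state study region with a given cause of loss, then totals the loss by state. It is not DMINE's actual code: the record layout, the field names (state, commodity, cause, loss_usd), the helper functions, and all values are hypothetical and chosen only for this example.

```python
from collections import defaultdict

# Study region named in the text: Idaho, Washington, and Oregon.
STUDY_STATES = {"ID", "WA", "OR"}

# Hypothetical claim records; field names and values are illustrative only.
claims = [
    {"state": "WA", "commodity": "WHEAT", "cause": "Drought", "loss_usd": 1200.0},
    {"state": "ID", "commodity": "WHEAT", "cause": "Drought", "loss_usd": 800.0},
    {"state": "MT", "commodity": "WHEAT", "cause": "Drought", "loss_usd": 500.0},
    {"state": "OR", "commodity": "APPLES", "cause": "Frost", "loss_usd": 300.0},
    {"state": "WA", "commodity": "WHEAT", "cause": "Drought", "loss_usd": 400.0},
]


def filter_claims(records, states, cause):
    """Keep records inside the study region that match the cause of loss."""
    return [r for r in records if r["state"] in states and r["cause"] == cause]


def total_loss_by(records, key):
    """Sum loss_usd over records, grouped by the given field."""
    totals = defaultdict(float)
    for r in records:
        totals[r[key]] += r["loss_usd"]
    return dict(totals)


drought_claims = filter_claims(claims, STUDY_STATES, "Drought")
loss_by_state = total_loss_by(drought_claims, "state")
print(loss_by_state)
```

A table of this shape (one row per region, with aggregated loss) is the kind of organized input a predictive model could then be trained on; in practice the same steps would more likely be written with a data-frame library in R or Python.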

Our models are presented through data dashboards as well as application programming interfaces (APIs). The dashboards allow a user to review and predict outcomes for a particular area; our initial efforts focus on agricultural systems, specifically commodity insurance losses related to climate.

DMINE Methodology

To describe how DMINE works, we have developed a specific eight-step methodology that walks through how we:

  • organize our hypothesis (Step 1),
  • assemble our data and perform any necessary data preparation and organization (Step 2),
  • perform initial exploratory data analysis (Step 3),
  • perform initial feature extraction and transformation (Step 4),
  • construct an algorithm to filter our data (Step 5),
  • construct our models and tune their hyperparameters (Step 6),
  • implement a model API (Step 7), and
  • perform integration and analysis on the overall results (Step 8).

These methodology steps use agricultural systems, drought, and commodity insurance claims as an example topic area.

Case Example Area: Climate Impacts & Agricultural Systems

Summary: Economic crop loss is closely related to food resilience and security. On this premise, we have been developing a case example that uses data mining and machine learning to explore agricultural commodity loss and its relationship to drought and water scarcity.