CIRC’s data mining group is developing data mining transformations and machine learning models that predict the impact of climate change on agricultural systems. DMINE.io is our research development server.
The Climate Impacts Research Consortium (CIRC) data mining team is a group of researchers and students who use data mining and machine learning techniques to explore the relationships between climate and agricultural systems. As part of this work, we are developing data dashboards and associated APIs, which we use to run our data mining and machine learning processes.
Agricultural systems and their products are essential components of our society. In 2014, the U.S. agricultural sector generated a gross output of more than 835 billion dollars and employed approximately 750,000 people. The U.S. has roughly 2 million farms with an average size of about 435 acres, and total grain production alone was $436 million (USDA Economic Research Service, 2014).
Our DMINE team assembles agricultural data to develop statistical models that can tell us more about the interactions between climate and agricultural commodity systems. We are using the 2.8 million commodity loss records from the USDA’s insurance program for the Pacific Northwest, covering 1989-2015, in combination with related climatic variables.
Data Acquisition for Agriculture
Several key datasets have been initially identified, including:
- The USDA’s agricultural crop loss data archive (1989-2016). The USDA’s Risk Management Agency holds insurance claim records associated with commodity crop loss from 1989 to 2016. Specifically, we are using the cause of loss archive datasets, which are .csv files that summarize insurance claims by month and by county. This data is available for the entire United States. For this analysis we have focused on the three-state region of Idaho, Oregon, and Washington.
- NASS crop commodity results. The USDA’s National Agricultural Statistics Service (NASS) provides extensive, county-based information on commodity outputs nationwide, including variables such as annual area harvested, production, sales, and water applied. We also use NASS Cropland Data Layer (CDL) acreage for cropland commodities by year and county. The NASS Cropland Data Layer commodity codes are available as a csv file.
- Associated climate and geophysical variables. For all dashboards and functional data areas, we combine the core data with climate and geophysical variables to explore patterns. For agriculture, we aggregate climate data at the county level and then compare it with 27 years of monthly agricultural commodity loss claims across the Pacific Northwest. See our Methodology for more information, as well as our Agriculture Data Portal.
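The county-level comparison described above can be sketched roughly as follows. This is a minimal illustration, not the DMINE pipeline itself: the column names (`state`, `county`, `year`, `month`, `commodity`, `cause`, `loss`), the helper functions, and all sample values are hypothetical and made up for the example, and the actual cause of loss files may be laid out differently.

```python
import csv
import io
from collections import defaultdict

# The three-state study region (Idaho, Oregon, Washington).
PNW_STATES = {"ID", "OR", "WA"}


def monthly_county_losses(rows, commodity, cause):
    """Sum losses per (state, county, year, month) for one commodity and cause.

    `rows` is an iterable of dicts using the hypothetical column names above.
    """
    totals = defaultdict(float)
    for row in rows:
        if row["state"] not in PNW_STATES:
            continue  # keep only the Pacific Northwest
        if row["commodity"] != commodity or row["cause"] != cause:
            continue
        key = (row["state"], row["county"], int(row["year"]), int(row["month"]))
        totals[key] += float(row["loss"])
    return dict(totals)


def join_with_climate(losses, climate):
    """Pair each loss total with the climate value for the same county-month."""
    return {key: (loss, climate[key]) for key, loss in losses.items() if key in climate}


# Made-up sample records, for illustration only.
SAMPLE_CSV = """state,county,year,month,commodity,cause,loss
WA,Whitman,2015,7,WHEAT,Drought,1000.0
WA,Whitman,2015,7,WHEAT,Drought,500.0
ID,Latah,2015,7,WHEAT,Drought,250.0
MT,Cascade,2015,7,WHEAT,Drought,900.0
WA,Whitman,2015,7,WHEAT,Hail,75.0
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
losses = monthly_county_losses(rows, "WHEAT", "Drought")

# Made-up county-month climate aggregate (e.g. mean precipitation in mm).
climate = {("WA", "Whitman", 2015, 7): 4.2}
joined = join_with_climate(losses, climate)
```

Keying both tables on the same (state, county, year, month) tuple is one simple way to line up claims with climate aggregates; a real workflow would more likely key on county FIPS codes to avoid name mismatches.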
EXAMPLE: Below is an animation of agricultural commodity loss from 1989 – 2015 for wheat claims due to drought. Our Agricultural Dashboard provides more information on agricultural commodity loss and on projected future commodity outputs.
Example WHEAT Commodity Loss Animation, 1989 – 2015
This work is developed as part of the Pacific Northwest Climate Impacts Research Consortium (CIRC), a climate-science-to-climate-action team funded by the National Oceanic and Atmospheric Administration (NOAA). A mix of scientists from disciplines as varied as atmospheric and social science, CIRC is a proud member of NOAA’s Regional Integrated Sciences and Assessments (RISA) program, a national leader in climate science and adaptation.