Using data mining techniques to model primary productivity from international long-term ecological research (ILTER) agricultural experiments in Austria

Using data mining techniques to model primary productivity from international long-term... Primary productivity is in the foundation of farming communities. Therefore, much effort is invested in understanding the factors that influence the primary productivity potential of different soils. The International Long-Term Ecological Research (ILTER) is a network that enables valuable comparisons of data in understanding environmental change. In this study, we investigate three ILTER cropland sites and one long-term field experiment (LTE) outside of the ILTER network. The focus is on the influence of different management practices (tillage, crop residue incorporation, and compost amendments) on primary productivity. Data mining analyses of the experimental data were carried out in order to investigate trends in the productivity data. We generated predictive models that identify the influential factors that govern primary productivity. The data mining models achieved very high predictive performance (r > 0.80) for each of the sites. Preceding crop and crop of the current year were crucial for primary productivity in the tillage LTE and compost LTE, respectively. For both crop residue incorporation LTEs, plant-available Mg affected productivity the most, followed by properties such as soil pH, SOM, and the crop residue management. The results obtained by data mining are in line with previous studies and enhance our knowledge about the driving forces of primary productivity in arable systems. Hence, the models are considered very suitable and reliable for predicting the primary productivity at these ILTER sites in the future. They may also encourage researcher- farmer-advisor-stakeholder interaction, and thus create enabling environment for cooperation for further research around these ILTER sites. . . . . . Keywords Soil functions Crop yield Plant-available Mg Tillage Compost amendments Crop residue incorporation Introduction * Aneta Trajanov Primary productivity, the capacity of a soil to produce plant aneta.trajanov@ijs.si biomass for human use (as food, feed, fuel, or fiber), is one of Heide Spiegel the cornerstones of prosperous farming communities. adelheid.spiegel@ages.at Accordingly, farmers need to focus on multiple soil functions Marko Debeljak in order to maintain the productivity function of the soil marko.debeljak@ijs.si (Schulte et al. 2014) and to help secure the viability of farms for the next generations. This includes the soils’ provision of Taru Sandén taru.sanden@ages.at clean drinking water, the recycling of nutrients, carbon se- questration, and soil serving as a habitat for biota (Schulte et Department of Knowledge Technologies, Jozef Stefan Institute, al. 2015). To this end, several improved management practices Jamova cesta 39, 1000 Ljubljana, Slovenia are being applied in the field. No-tillage or non-inversion till- Jozef Stefan International Postgraduate School, Jamova cesta 39, age practices are being promoted to reduce the labor and crop 1000 Ljubljana, Slovenia production costs, but also due to their positive effects on soil Department for Soil Health and Plant Nutrition, Institute for organic matter (SOM) and soil aggregate stability (Bauer et al. Sustainable Plant Production, Austrian Agency for Health & Food 2015; Tatzber et al. 2008; Spiegel et al. 2007). Incorporation Safety – AGES, Spargelfeldstrasse 191, 1220 Vienna, Austria A. Trajanov et al. of crop residues and various organic fertilizer amendments, tasks and discover different patterns in the data (e.g., decision such as composts and farmyard manure, is feasible options trees, clusters, equations, rules). These algorithms search for substituting mineral nitrogen fertilizers (Spiegel et al. through the space of patterns (models) to find interesting pat- 2010). Currently, conservation agriculture is becoming more terns that are valid in the given data. wide-spread among the global farming community. Already This study was designed to predict primary productivity approximately 125 million hectares of land are managed by and to identify the driving factors that govern primary produc- the principles of conservation agriculture (Friedrich et al., tivity by means of data mining. To this end, we addressed the 2012; Brouder and Gomez-Macpherson, 2014). The defini- following questions within the framework of the selected field tion of conservation agriculture includes minimum, non- experiments: inversion or reduced tillage, combined with retention of crop residues on the soil surface and crop rotation. The aim is to (1) Can data mining help make reliable predictive models of conserve soil and water for optimum productivity (Hobbs et primary productivity from LTE (long-term field al. 2008; Kertész and Madarász 2014). experiments) data? The International Long-Term Ecological Research (1) What are the driving factors of primary productivity in (ILTER) represents a network that enables valuable compari- the selected arable LTEs that are sufficiently fertilized sons of data for understanding environmental change. with main nutrients? Nonetheless, cropland sites are still underrepresented in the (2) Do the selected management practices influence primary network, and more sites would be needed for global compar- productivity? isons. In Austria, only four agricultural ILTER sites are includ- ed in the network of 38 sites in total (Mirtl et al. 2015). This study focuses on three of the cropland sites (tillage and crop residue incorporation), as well as one long-term field experi- Materials and methods ment (compost amendments) outside of the ILTER network. The previous investigations of the selected LTER sites have International Long-Term Ecological Research (ILTER) focused on how the different improved management practices, experimental sites i.e., tillage (Franko and Spiegel 2016; Tatzber et al. 2008; Spiegeletal. 2007), crop residue incorporation (Spiegel et al. This paper investigated data from four Austrian long-term 2018), cropping systems (Tatzber et al. 2009, 2012, 2015a, b; field experiments (LTEs, Fig. 1). Tatzber 2009), and organic/compost amendments (Lehtinen et al. 2017; Hijbeek et al. 2017; Körschens et al. 2013), affected Tillage the soil properties as well as soil productivity. However, no analyses of the experimental data have been carried out in order The long-term field experiment investigating different tillage to determine patterns in the productivity data. management (tillage LTE) was established in Fuchsenbigl There are many positive examples of using data mining (Table 1). In brief, the experiment included three different techniques for building predictive models in the field of agri- tillage systems (minimum, reduced, and conventional tillage) cultural and environmental sciences (Bondi et al. 2018; Bui et (Spiegel et al. 2002, 2007; Tatzber et al. 2008). The experi- al. 2009;Debeljaket al. 2007, 2008; Goldstein et al. 2017; ment consisted of a randomized block design, the plots mea- Kuzmanovski et al. 2015; Shekoofa et al. 2014; Trajanov suring 60 m × 12 m each. The crop rotation was not fixed and 2011). Their biggest advantage is that they are applied on consisted of the most important crops for the region such as easily obtainable empirical data, and the parametrization of cereals, sugar beet, maize, and grain legumes. the data mining models is done automatically from the data; hence it is not influenced by the subjectivity of the modelers. Crop residue incorporation By applying data mining methods, data sets from long-term field experiments can be turned into an understandable struc- Two long-term field experiments were established to investi- ture, and interpretable patterns (i.e., long-term trends and their gate the management of crop residues, crop residue LTE in drivers) in the data can be identified. Rutzendorf, and crop residue LTE in Grabenegg. Both sites Data mining, as a part of the Knowledge Discovery in have recently been described by Spiegel et al. (2018). The Databases (KDD) process, uses machine learning and statisti- field experiments consisted of a randomized block design with cal methods in order to find interesting patterns in data four replicates, each plot measuring 32 m × 6 m (192 m )in (Fayyad et al. 1996). The goal of data mining is to extract Rutzendorf and 30 m × 7.5 m (225 m ) in Grabenegg. There information from datasets that is intelligible and useful in an were four P-fertilization stages (0, 33, 66, 131 kg P ha-1y-1), understandable and easily interpretable format. Different data and all crop residues were either incorporated or removed in mining algorithms are used to address different data mining the treatments. The crop rotation was not fixed and consisted Using data mining techniques to model primary productivity from international long-term ecological research... Fig. 1 Map of the long-term experiment (LTE) locations in Austria of the most important crops for both regions, such as cereals, compost, manure compost, and sewage sludge compost −1 sugar beet, grain maize, and grain legumes. plots (each treatment corresponding to 175 kg N ha )with a crop rotation of winter wheat, winter barley, maize, and pea (without compost application). In further variants, the Compost amendments four compost amendments are fertilized with 80 kg mineral −1 N(NH NO )ha . The long-term compost LTE was designed in Ritzlhof near 4 3 Linz, Upper Austria, to study the effects of different com- post amendments on chemical, physical, and microbial soil Data mining methods parameters and plants. The compost LTE and its soils have previously been described in Lehtinen et al. (2017)and ref- In our study, we used data mining algorithms for generation erences therein. The field experiment consists of a random- of decision trees (Breiman et al. 1984), in particular model ized block design with four replicates, the plots measuring and regression trees. The algorithms for building decision 5 m ×6 m (=30 m ). The field trial includes a control plot trees are one of the most commonly used data mining algo- (zero N), minerally fertilized plots (40 kg N, 80 kg N, rithms. They predict the value of a dependent variable −1 −1 120 kg N ha y ) and biowaste compost, green waste (termed target attribute) from a set of independent variables Table 1 The long-term agricultural field experiments of AGES (Austrian Agency for Health & Food Safety) Tillage LTE Crop residue LTE - Grabenegg Crop residue LTE - Rutzendorf Compost LTE GPS coordinates 48° 11′ N16° 44′E48°12′ N15° 15′E48°21′ N16° 61′E48°18′ N14° 25′ E ILTER site name LTER_EU_AT_030 LTER_EU_AT_038 LTER_EU_AT_030 n.a. Year established 1988 1986 1982 1991 Soil type (WRB, 2015) Haplic Chernosem Gleyic Luvisol Calcaric Phaeozem Cambisol MAT 9.4 °C 8.5 °C 9.1 °C 8.5 °C MAP 529 mm 836 mm 540 mm 753 mm Texture (% clay/silt/sand) 22/41/37 16/77/7 23/52/26 17/69/14 pH (CaCl ) 7.6 6.6 7.5 6.8 SOC 1.2% 1.0% 2.2% 1.2% A. Trajanov et al. (attributes). They are hierarchical models that contain inter- is to use tenfold cross-validation as a standard method nal and terminal nodes, connected with edges. In each inner for their evaluation. node, the value of an attribute is tested and compared to a Another measure of predictive performance is Root constant value. The edges coming out from the node corre- Mean Square Error (RMSE) (Witten and Frank 2011). It is spond to the outcome of the test. The leaf nodes contain the a measure that reports the average magnitude of the error. It predictions of the target attribute that apply to all samples is the square root of the average of squared differences be- that fall in that leaf. To predict the value of the target attri- tween prediction and actual values of the target attribute. bute for a new sample, it is routed down the tree according to To model the influence of different agricultural manage- the values of the attributes that are tested in each node. ment techniques on primary productivity, we used the data When the sample reaches a leaf, it is given the prediction mining package WEKA (Witten and Frank 2011), which assigned to the leaf. implements a large collection of machine learning algo- When the values of the target attribute are numeric, the rithms for different data mining tasks. In this study, we used leaves of the tree contain models for predicting it. The the model and regression tree algorithm M5P. For generat- models can be piece-wise linear regression equations, in ing regression trees with syntactic constraints, we used the which case the decision trees are known as model trees, or decision and rule induction system CLUS (Blockeel and can be constant values and in this case, the decision trees are Struyf 2002). termed regression trees. When generating regression trees, syntactic constraints can also be used (Džeroski et al. 2010). Data description The syntactic constraints influence the process of building the trees by defining a partial structure of the tree, from The data from the four Austrian long-term field experi- which point on the tree is generated automatically, follow- ments, described in the BInternational Long-Term ing the regular regression tree algorithm. Ecological Research (ILTER) experimental sites^ section, In this study, we generated model and regression trees were organized and preprocessed in order to be analyzed for each of the LTE experimental sites (tillage LTE, crop using data mining techniques. The data comprising the three residue LTEs, and compost LTE). For easier interpretation LTEs datasets are presented in Table 2. The attributes in- of the model trees, we calculated the average predicted cluded the long-term monitoring data available from each of value of the samples that fall into each leaf according to the experiments; thus, the attributes differed slightly be- its piece-wise equation, as well as their average actual tween the LTEs. values. When interpreting decision (classification or re- Although the general structure of all datasets was similar, gression) trees, we start from the top (root) of the tree. each dataset was preprocessed in a unique way in order to The most important factors that influence the target attri- correctly address the goal and obtain the most accurate and bute (primary productivity in our case) appear at the top. interpretable data mining models possible. The structure of the The importance of the attributes decreases as you move separate datasets from each experimental site is explained in towards the lower levels of the tree. the following sections. There are different measures of predictive performance to assess how good the data mining models describe our data. To assess the performance of regression and model Tillage trees, we used the correlation coefficient as a measure: it quantifies the statistical correlation between the predicted The tillage dataset consisted of data from 18 years of exper- and the real values of the target attribute. The values of iments (1998–2015), yielding 162 samples, described with the correlation coefficient can vary between 1 (perfect soil parameters, management techniques (tillage), and crop correlation) and − 1 (perfect negative correlation) through yields. In addition to these original attributes, for each ex- 0 (no correlation at all). In addition, to assess how good ample, we included the soil properties of the preceding year the model performs on new (test) data, we used the ten- (derived attributes) in order to check whether the soil prop- fold cross-validation technique (Witten and Frank 2011). erties of the preceding year and the type of tillage applied on In cross-validation, the dataset is split into n approxi- the field influence the crop yield in the current year. mately equal partitions (folds). Each fold is (in turn) used Three types of crops were grown at the tillage experimental for testing, while the remaining folds are used for train- site: sugar beet, grain maize, and cereals. These crops have ing (building) the model. This procedure is repeated n significantly different absolute yields. Therefore, we divided times and, at the end, the correlation coefficients obtained the dataset into three subsets, according to crop classes: in the different iterations are averaged to obtain the over- all correlation coefficient of the data mining model. A & Sugar beet (number of samples 18) common practice when generating data mining models & Grain maize (number of samples 36) Using data mining techniques to model primary productivity from international long-term ecological research... Table 2 Each long-term experiment dataset comprised of data describing the soil properties of the experimental sites, the management techniques used, and the crop yield Attributes Tillage long-term experiment Crop residue long-term Compost long-term experiment experiments Crop • Sugar beet Cereals (winter wheat, • Maize • Grain maize winter barley) • Cereals (spring wheat, winter wheat, • Cereals (winter wheat, spring wheat, winter barley, and pea) soybean, winter barley, spring barley) + preceding crops Soil • Percentage of total nitrogen (N) • Percentage of total • Percentage of total nitrogen (N) • Mineralized N (mg/kg/7day) nitrogen (N) • Mineralized N (mg/kg/7day) • Percentage of total soil organic • Mineralized N • Percentage of total soil organic carbon (SOC) (mg/kg/7day) carbon (SOC) • Plant available phosphorus (P) (mg/kg) • Percentage of total soil • Plant available phosphorus • Plant available potassium (K) (mg/kg) organic carbon (SOC) (P) (mg/kg) • Plant available magnesium • Plant available phosphorus • Plant available potassium (Mg) (mg/kg) (P) (mg/kg) (K) (mg/kg) • Soil pH • Plant available potassium • Plant available magnesium • Percentage of carbonate content (CAO) (K) (mg/kg) (Mg) (mg/kg) • Exchangeable calcium (Ca) (cmolc/kg) • Plant available magnesium • Soil pH • Exchangeable magnesium (Mg) (mg/kg) • C to N ratio (C/N) (Mg) (cmolc/kg) • Soil pH • Exchangeable potassium (K) (cmolc/kg) • CtoN ratio (C/N) • Exchangeable sodium (Na) (cmolc/kg) • Cation exchange capacity (CEC) (cmolc/kg) • C to N ratio (C/N) + preceding soil attributes Management • Minimum tillage • Removal of crop residues • Mineral fertilization (0, 40, 80 • Reduced tillage • Incorporation of crop residues or 120 kg N/ha/y) • Conventional tillage • Compost amendments (bio-waste, green waste, manure or sewage sludge compost corresponding to 175 kg N/ha/y) • Combination of compost and mineral fertilization (compost with additional 80 kg mineral N/ha/year) Primary Yield (kg/ha) Yield (kg/ha) Yield (kg/ha) productivity & Cereals: winter wheat, spring wheat, soybean, winter bar- The attributes from which CEC and C/N are calculated ley, spring barley (number of samples 108) were excluded in scenarios 2 and 4 in order to avoid correla- tions in the investigated attributes. For the data mining analyses, we generated five scenar- ios using different combinations of original and derived attributes: Crop residue incorporation & Scenario 1: Original attributes WITHOUT CEC and C/N The data from the crop residue incorporation experimental & Scenario 2: Original attributes WITH CEC and C/N (exclud- sites consisted of data from two LTEs—Rutzendorf and ing the attributes from which CEC and C/N are calculated) Grabenegg. Each dataset comprised 5 years of experiments & Scenario 3: Original and derived attributes WITHOUT (2002, 2008, 2010, 2012, and 2014), yielding 160 samples, CEC and C/N described by soil properties, management practices (crop res- & Scenario 4: Original and derived attributes WITH CEC idue incorporation or removal), and the crop yield. Data about and C/N (excluding the attributes from which CEC and preceding years were not included in these datasets because C/N are calculated) the data in these LTEs were not collected for consecutive & Scenario 5: Original and derived attributes WITH CEC years. At these experimental sites, only cereal crops were and C/N and syntactic constraints (forcing the tillage attri- grown, so there was no need to divide the datasets according bute at the top of the decision trees) to crop type. A. Trajanov et al. Here, we performed two scenarios for both Rutzendorf and Crop residue incorporation Grabenegg datasets: The regression trees of both trials highlight that the most impor- & Scenario 1: Using total nitrogen and total soil organic tant attribute for primary productivity in the crop residue incorpo- carbon attributes and excluding the C/N attribute ration long-term experiments was the plant-available Mg (Fig. 3). & Scenario 2: Using C/N and excluding the total nitrogen For modeling the influence of soil and crop properties as and total soil organic carbon attributes well as crop residue incorporation on primary productivity, we had one dataset for each LTE, for which we carried out two scenarios. The predictive performances of the model and Compost amendments regression trees obtained for the two datasets and for both sce- narios are veryhigh(Table 4). This makes them very reliable The compost amendment dataset consisted of 8 years of ex- for predicting and modeling the primary productivity in a field. periments (1998, 2001, 2002, 2003, 2005, 2007, 2012, and The best models were obtained for scenario 1. The regres- 2015), yielding 384 samples, described by soil properties, sion trees for Grabenegg and Rutzendorf are presented in Fig. 3. management practices (type of fertilization and compost amendment), and crop yield. At this experimental site, two Compost amendments classes of crops were grown: maize and cereals (spring wheat, winter wheat, winter barley, and pea). As in the case of the The regression tree for modeling the primary productivity in tillage LTE data and to avoid biasing the data mining models, cereals (Fig. 4) shows that in the compost amendment LTE, we divided the dataset into two subsets because the two types the crop grown on the field and the treatment applied on the of crops have significantly different absolute crop yields in kg/ field were the major drivers of primary productivity. ha: one consisting of data only for maize (144 examples) and As in the crop residue LTE analyses, for the compost the other for cereals (240 examples). Data on preceding crops amendments experimental site, we developed models for grown on the fields were also included in the datasets. two scenarios, using the C/N ratio and using total soil organic Here, we also carried out two scenarios: carbon and total nitrogen as separate attributes. The predictive performances of the obtained models are presented in Table 5. & Scenario 1: Using total nitrogen and total soil organic The correlation coefficients of the models for both types of carbon attributes and excluding the C/N attribute crops were very high, 0.78 and 0.94, respectively. The models & Scenario 2: Using C/N and excluding the total nitrogen obtained for the cereals dataset have especially high correlation and total soil organic carbon attributes coefficients, which make the predictions very reliable. The re- gression tree obtained for the cereals dataset is presented in Fig. 4. Results Discussion Tillage Tillage The results of the obtained model and regression trees for the In the tillage LTE, the crop rotation mimicked the current man- tillage experimental site are presented in Table 3 in terms of agement practices in the area, i.e., the most common agricultural correlation coefficients (r) and Root Mean Square Error crops were grown in 3–6-year crop rotations. Various aspects (RMSE). influence a farmer’s decision which crop to grow, all of which Figure 2 indicates that the preceding crop was of pivotal may act at a local, regional, or even at the global scale (Hazell importance for primary productivity. The predictive perfor- and Wood 2008). They may include the farm type, the economic mance of the models obtained for sugar beet was low. market, the technological opportunities at hand, the possibilities However, due to the low number of samples (18 in total) in for government or EU subsidies, as well as the nature of the this dataset, these results are unreliable. The best results farmer’s soil (Hazell and Wood 2008; Bennett et al. 2012). If (models) were obtained for grain maize and cereals, where economic market trends influence the choice of a crop of the the highest correlation coefficients (0.83 and 0.84, respective- season, the expected yields are probably one of the most impor- ly) were obtained for scenario 4. Overall, the correlation co- tant driving factors for the choice. Our data mining models efficients of the models for grain maize and cereals were clearly showed (Fig. 2) that cereal yields were significantly low- higher in scenarios 3 and 4, where we used soil and crop data er when sugar beet or winter wheat was the preceding crop of the preceding year, compared to the scenarios 1 and 2, compared to e.g., soybean or spring wheat. These differences where we used data only for the current year. may reflect a combination of factors, including how the grown Using data mining techniques to model primary productivity from international long-term ecological research... Table 3 Predictive performance in terms of correlation coefficient (r) and Root Mean Square Error (RMSE) of the model and regression trees obtained for the tillage International Long-Term Ecological Research (ILTER) experiments Scenarios 1: 2: 3: 4: 5: Original attr. Original attr. Original and Original and Original and derived without CEC with CEC derived attr. derived attr. attr. with CEC and and C/N and C/N without CEC with CEC C/N and a syntactic and C/N and C/N constraint Sugar beet Model trees r 0.62 0.69 0.42 0.54 / RMSE (kg/ha) 4552.73 3999.71 5182.84 4784.59 / Regression trees r − 0.22 − 0.22 − 0.14 − 0.25 0.68 RMSE 5928.43 5928.55 5865.84 5896.83 4264.75 (kg/ha) Grain maize Model trees r 0.70 0.72 0.82 0.83 / RMSE 905.21 884.38 730.81 711.35 / (kg/ha) Regression trees r 0.54 0.52 0.78 0.80 0.73 RMSE 1096.85 1108.93 986.36 978.81 877.4 (kg/ha) Cereals Model trees r 0.68 0.77 0.81 0.84 / RMSE 649.20 560.99 520.27 480.83 / (kg/ha) Regression trees r 0.47 0.68 0.79 0.74 0.67 RMSE 782.09 675.00 593.08 627.59 653.78 (kg/ha) The bold values represent the best results obtained for different scenarios. crops influence the soil-plant interphase with regard to soil prop- The importance of the preceding soil and crop properties is erties, pests, pathogens and soil microorganisms (Bennett et al. also evident from the generated model and regression trees. The 2012), and residual effects on the succeeding crops—just to best model tree, obtained for the cereals dataset and scenario 4 mention a fraction of the possibilities. (Fig. 2), shows that the most important attributes for predicting Fig. 2 Model tree for modeling the primary productivity of cereals in the tillage LTE A. Trajanov et al. Fig. 3 Regression trees for modeling the primary productivity in the crop residue incorporation long-term experiments: a regression tree for Grabenegg, and b regression tree for Rutzendorf yield were the preceding crop, the preceding yield, the C/N ratio, We were also interested in determining how the manage- and the preceding CEC and plant-available phosphorus. These ment practice, in this case soil tillage, influences primary pro- soil parameters are well-known to affect plant nutrition. A small- ductivity. However, the attributes describing the management er C/N ratio indicates more rapid decomposition of soil organic practice did not appear in the model and regression trees in matter and, thus, release of plant-available mineral nitrogen scenarios 1 to 4. Thus, in scenario 5, we generated regression (Jarvis et al. 2011). The sum of exchangeable cations—in alka- trees with syntactic constraints. Here, we defined the partial line soils mainly Ca2+, Mg2+, Na+ and K+, which are adsorbed structure of the tree (syntactic constraint) in a way that we on the exchange complex of the soil—also indicate the nutri- Bforced^ the management attributes to be at the top of the tree tional status of the soil and may inform about deficiencies and from there, the tree was generated automatically from the (Kopittke and Menzies 2007). Phosphorus is a key nutrient data. Nonetheless, the correlation coefficients of these regres- and essential for optimizing yields. Spiegel (2001) showed in sion trees were lower than in scenarios 3 and 4. We therefore another long-term experiment at the same site that plant- conclude that soil tillage, as a management practice, is less available P was significantly positively correlated with spring important for primary productivity than the current and pre- barley, sugar beet, and winter wheat yields. Thus, the model ceding soil and crop properties. This is in line with former results are in line with earlier findings that yields were enhanced findings from this field experiment that, on average, yields if CEC and plant-available phosphorus showed higher values. did not differ between the investigated tillage practices (Franko and Spiegel 2016; Spiegel et al. 2002, 2010). Crop residue incorporation Table 4 Predictive performance in terms of correlation coefficient (r) and Root Mean Square Error (RMSE) of the model and regression trees obtained for the crop residue incorporation International Long-Term Gerendás and Führs (2013) recently reviewed the literature on Ecological Research (ILTER) experiments the effect of magnesium on crop quality. Their review on cereals agrees with our results of positive yield response to Scenarios plant-available Mg in the crop residue incorporation experi- 1: 2: ments in Rutzendorf and Grabenegg. Gerendás and Führs Without C/N With C/N (2013), however, also show that there is not necessarily any quality response to Mg beyond the yield maximum. The mag- Grabenegg Model trees r 0.90 0.88 nesium available for plants depends on several factors includ- RMSE 655.82 704.15 ing soil pH, soil moisture, weathering, and microbial activity Regression trees r 0.76 0.84 of the soil (Senbayram et al. 2015). Grzebisz (2013) described RMSE 991.79 870.33 the so-called Bmagnesium-induced nitrogen uptake^ which Rutzendorf Model trees r 0.94 0.93 highlights the positive effect of magnesium on the nitrogen RMSE 537.0 568.88 uptake efficiency of the plants. Many factors, among them Regression trees r 0.92 0.91 source rock material and its properties and grade of RMSE 762.93 769.19 weathering as well as management practices such as crop The bold values represent the best results obtained for different scenarios. rotation and fertilization practices, influence the availability Using data mining techniques to model primary productivity from international long-term ecological research... Fig. 4 Regression tree for modeling the primary productivity of cereals in the compost amendments long-term experiment of Mg to plants (Gransee and Führs 2013). In our study, it was supply may be more prone to challenges due to severe envi- plant-available Mg in the soils, not Mg fertilization, that was ronmental conditions such as heat (Cakmak 2013). investigated and most important for crop yields. All the treat- Besides Mg, also the pH value of the soil, SOC, and po- ments in Grabenegg and Rutzendorf were sufficiently sup- tential N mineralization (only in Grabenegg)—but also the plied with N, P, and K. This may explain why plant- crop residue management and the crop type (only available Mg became so important in our model (Gransee Rutzendorf)—affected primary productivity. In the and Führs 2013). Magnesium is an essential plant nutrient that Grabenegg LTE, soils with slightly acidic pH values had is one of the building blocks of chlorophyll (Gerendás and higher crop yields than soils with lower acidity. In contrast, Führs 2013). Magnesium is also involved in enzyme activa- at Rutzendorf LTE, higher yields were achieved at alkaline tion, ATP formation and utilization, and growth of roots as soil pH levels. In both experiments, with higher soil pH, crop well as in seed formation (Cakmak 2013), making it very residue management influenced crop yields. Yields increased important for the whole life cycle of a plant. In intensively with long-term incorporation of crop residues. SOC was very farmed soils, such as soils of our two long-term field experi- different in the two experiments: low in Grabenegg and high ments, Mg balances become even more important for crop in Rutzendorf. In Grabenegg, higher SOC led to higher yields, yields due to a possible rapid depletion of Mg of the soils whereas this was not the case in the already highly supplied (Cakmak 2013). Moreover, wheat grown under low Mg Rutzendorf soils. This is in line with results from 20 long-term field experiments in Germany (Körschens et al. 2013), where the relevance of initial SOC was emphasized. Both pH and Table 5 Predictive performance in terms of correlation coefficient (r) SOC are fundamental for soil fertility, especially for the bio- and Root Mean Square Error (RMSE) of the model and regression trees obtained for the compost amendment International Long-Term logical activity of soils (Diepenbrock et al. 2009), driving Ecological Research (ILTER) experiments important biogeochemical cycles (e.g., C, N, and P). Former studies have also revealed that long-term crop residue incor- Scenarios poration leads to higher SOC compared to the yearly removal 1: 2: (Lehtinen et al. 2014; Poeplau et al. 2015, 2017; Spiegel et al. Without C/N With C/N 2018). Lehtinen et al. (2014) showed a significant increase in SOC when crop residues were incorporated, but did not find Maize Model trees r 0.78 0.69 significant correlations between SOC and crop yields. A gen- RMSE 901.70 1046.58 eral 6% increase in yields following crop residue incorpora- Regression trees r 0.73 0.69 tion, as compared to crop residue removal, was observed. RMSE 1015.61 1069.30 Poeplau et al. (2015, 2017) have shown similar increase Cereals Model trees r 0.97 0.97 ranges in SOC following crop residue incorporation in RMSE 466.57 469.72 Sweden and Italy. The interplay between soil organic matter Regression trees r 0.93 0.93 and attainable yields also puzzled Hijbeek et al. (2017), who RMSE 720.68 720.68 investigated how different organic inputs affected crop yields. Their assumption was that increased soil organic matter leads The bold values represent the best results obtained for different scenarios. A. Trajanov et al. to increased crop yields, but to their surprise, the increase was 2000; Debeljak and Džeroski 2011;Jiaweietal. 2006; not statistically significant when 20 European long-term ex- Veenadhari et al. 2011). Because of their ability to represent periments were investigated together. They explained this by the relationships between the attributes in a visual way, the differences between experimental sites and the soil properties, discovered patterns and knowledge about the problem can as well as organic inputs used in the various experiments be easily interpreted. Therefore, the created decision trees (Hijbeek et al. 2017). could also be used to strengthen the researcher-farmer- advisor-stakeholder dialog and to foster co-creation of Compost amendments new research questions with high farmer relevance. From the top attributes from the decision trees, the farmers could In general, pea and spring wheat achieved lower yields than disentangle what may be limiting their productivity and winter wheat and winter barley (Diepenbrock et al. 2009), how to improve it. not least because of the shorter growing season. For the The construction of data mining models (model or re- cereal crops, spring and winter wheat and winter barley, gression trees) proceeds automatically from the available fertilization was an important driver for yields. No or low data, minimizing the researchers’ subjectivity during the mineral fertilization and only compost amendments resulted generation of the models. This means that the form of the in lower yields compared to sufficient mineral fertilization models and the interactions among the variables are induced or a combination of compost and mineral fertilization. Pea, a automatically from the data and not set by the experts. Since legume, did not use the given fertilizer either in mineral or in the models were generated from data, they can be easily organic form. Our modeling results coincide with the results validated using different validation techniques such as ten- of conventional statistical analyses of a similar data set ob- fold cross-validation, train-test sets, or leave-one-out vali- tained from this long-term compost experiment (Lehtinen et dation (Witten and Frank 2011). The limitations are con- al. 2017). In farm management, caution should be exercised nected to availability of long-term data. In case data on the with short crop rotations or when focusing on only the most role of soil aggregation or soil microbiology in productivity yielding crops. This is because the long-term maintained is not available, their importance in producing biomass can- crop yields are more important from a sustainability point not be shown. This calls for extending the attributes that are of view than maximum profit for a single cropping season. being monitored in LTEs. In addition, the essential micronutrients may be neglected The data mining models are predictive models; therefore, from the fertilization schemes when only a few crops are the validated models that achieve high predictive perfor- beingconsidered(RashidandRyan 2004). Production of mance can be used to predict future scenarios of the same only a few different crops may require less technical equip- type and under similar conditions as the ones that were used for constructing the models. Finally, the data mining models ment at the farm (Bullock 1992), although the farmer may observe yield declines after several years (Bennett et al. are usually presented in a form, such as a decision tree, 2012). The effects of fertilization on crop yields are which is intuitive and easy to interpret by the researchers. known, and the effect of compost combined with mineral The data mining models generated in this study achieved fertilizer was also shown by Lehtinen et al. (2017)fromthe better predictive performance for each of the LTEs than the same compost LTE. The current models confirm the previ- statistical studies previously carried out on the same data ous results and show that compost amendment alone may be (Spiegel et al. 2002, 2007, 2018; Lehtinen et al. 2017). insufficient to match the yields reached with mineral fertil- They are therefore very suitable and reliable for predicting izer. This may reflect slow nutrient release from the com- the primary productivity at the experimental sites in the posts (Amlinger et al. 2003; Alluvione et al. 2013), which is future. A great advantage of using data mining methodolo- ca. 5–15% in the year of compost amendment and only 2– gy over classical statistical or mechanistic models is the 8% in the following years (Amlinger et al. 2003). simple and fast construction of models that can be easily adapted to new data. Accordingly, having an established Applicability and scalability of the results data collection system at the LTE sites would simplify upscaling the predictive data mining models to newest data. There are several advantages of using data mining methods Having more data from these sites will make the models over classical statistical methods. First, the analyses are not more general and might further improve the predictive per- limited to only a few attributes or pair-wise comparisons for formances. Collecting data on a regional basis and covering modeling a certain soil function, but all available data can be important farming regions would improve regional model- used. Using all the available data enables discovering inter- ing efforts. The farmers could, for example, include their esting and new—often unexpected—patterns from the data soil monitoring data into the models, in order to find more (Buczko et al. 2018). This can provide new knowledge and regional patterns. The created decision trees could support researcher-farmer-advisor dialog on productivity insights about the problem at hand (De’ath and Fabricius Using data mining techniques to model primary productivity from international long-term ecological research... Funding information LANDMARK has received funding from the management. However, obtaining empirical data from ad- European Union’s Horizon 2020 research and innovation programme ditional experimental sites is most often a difficult and time- under grant agreement No 635201. consuming task and presents a limiting factor when apply- Open Access This article is distributed under the terms of the Creative ing data mining technologies in the field of agronomy. Commons Attribution 4.0 International License (http:// Incomplete or inconsistent data can bias the data mining creativecommons.org/licenses/by/4.0/), which permits unrestricted use, models, so the completeness and quality of the data is also distribution, and reproduction in any medium, provided you give appro- an important factor in this approach. priate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. Conclusions References Our study has generated primary production data mining Alluvione F, Fiorentino N, Bertora C, Zavattaro L, Fagnano M, models with high predictive performance for all four LTEs Chiarandà FQ, Grignani C (2013) Short-term crop and soil response selected. The most important driving factors for productivity to C-friendly strategies in two contrasting environments. Eur J were preceding crop, plant-available Mg and crop of the Agron 45:114–123. https://doi.org/10.1016/j.eja.2012.09.003 growing year for tillage LTE, crop residue LTEs, and compost Amlinger F, Götz B, Dreher P, Geszti J, Weissteiner C (2003) Nitrogen in biowaste and yard waste compost: dynamics of mobilisation and LTE, respectively. In addition, soil properties such as soil pH, availability—a review. Eur J Soil Biol 39(3):107–116. https://doi. SOC, C/N ratio, preceding CEC, and preceding plant- org/10.1016/S1164-5563(03)00026-8 available P played a role. Crop residue incorporation as well Bauer T, Strauss P, Grims M, Kamptner E, Mansberger R, Spiegel H as sufficient mineral fertilization or combined compost and (2015) Long-term agricultural management effects on surface mineral fertilizer treatments of the soils proved to be effective roughness and consolidation of soils. Soil Till Res 151:28–38. https://doi.org/10.1016/j.still.2015.01.017 measures to optimize crop yields. Bennett AJ, Bending GD, Chandler D, Hilton S, Mills P (2012) Meeting In this study, data mining techniques were used for the the demand for crop production: the challenge of yield decline in first time in these LTEs to discover knowledge and patterns crops grown in short rotations. Biol Rev 87(1):52–71. https://doi. from the data. While the model and regression trees gener- org/10.1111/j.1469-185X.2011.00184.x Blockeel H, Struyf J (2002) Efficient algorithms for decision tree cross- ated in this study are region specific, the data mining ap- validation. J Mach Learn Res 3:621–650 proach enabled the effects of changing management and Bondi G, Creamer R, Ferrari A, Fenton O, Wall D (2018) Using machine soil, along with soil fertility parameters over time, to be learning to predict soil bulk density on the basis of visual parame- assessed in the context of crop yields and productivity at ters: tools for in-field and post-field evaluation. Geoderma 318:137– 147. https://doi.org/10.1016/j.geoderma.2017.11.035 the sites. The knowledge obtained from our predictive Breiman L, Freidman JH, Olshen RA, Stone CJ (1984) Classification and models can be utilized by farmers in this region to predict regression trees. Wadsworth & Brooks, Monterey ISBN: how future management will affect the productivity of their soils. In a more general context, this methodology can be Brouder SM, Gomez-Macpherson H (2014) The impact of conservation employed in other regions, where long-term data sets com- agriculture on smallholder agricultural yields: a scoping review of the evidence. Agric Ecosyst Environ 187:11–32. https://doi.org/10. prising a few critical but widely measured soil and crop 1016/j.agee.2013.08.010 parameters are available. This approach enables performing Buczko U, van Laak M, Eichler-Löbermann B, Gans W, Merbach I, Panten structural dynamic modeling, which is one of the main K, Peiter E, Reitz T, Spiegel H, von Tucher S (2018) Re-evaluation of methodological goals in ecological modeling when dynam- the yield response to phosphorus fertilization based on meta-analyses of long-term field experiments. Ambio 47(Supplement 1):50–62. https:// ic, unpredictable systems are involved. doi.org/10.1007/s13280-017-0971-1 These results are also important in understanding the driv- Bui E, Henderson B, Viergever K (2009) Using knowledge discovery ing forces of primary productivity in arable systems that are with data mining from the Australian soil resource information sys- sufficiently fertilized with main nutrients (nitrogen, phospho- tem database to inform soil carbon mapping in Australia. Glob rous, and potassium). We can highlight which management Biogeochem Cycles 23:GB4033. https://doi.org/10.1029/ 2009GB003506 practices positively influence crop yields. This calls for further Bullock DG (1992) Crop rotation. Crit Rev Plant Sci 11(4):309–326. investigations on other agricultural management practices, as https://doi.org/10.1080/07352689209382349 well as for upscaling the results to a larger geographical area. Cakmak I (2013) Magnesium in crop production, food quality and human health. Plant Soil 368:1–4. https://doi.org/10.1007/s11104-013-1781-2 Acknowledgements This study was conducted as a part of the De’ath G, Fabricius KE (2000) Classification and regression trees: a LANDMARK (LAND Management: Assessment, Research, powerful yet simple technique for ecological data analysis. Knowledge Base) project. The authors wish to thank David Wall from Ecology 81(11):3178–3192 http://epubs.aims.gov.au/11068/5812 Teagasc, Ireland, Bhim Bahadur Ghaley from the University of Debeljak M, Džeroski S (2011) Decision trees in ecological Modelling. Copenhagen, Denmark, and Kirsten Madena from the The Chamber of In: Jopp F, Reuter H, Breckling B (eds) Modelling complex ecolog- Agriculture of Lower Saxony (CALS), Germany, for proofreading the ical dynamics. Springer, Berlin. https://doi.org/10.1007/978-3-642- manuscript and providing useful comments. 05029-9_14 A. Trajanov et al. Debeljak M, Squire G, Demšar D, Young MW, Džeroski S (2008) Relations Lopez-Fando C, Merbach I, Merbach W, Pardor MT, Rogasik J, Rühlmann J, Spiegel H, Schulz E, Tajnsek A, Toth Z, Wegener H, between the oilseed rape volunteer seedbank, and soil factors, weed functional groups and geographical location in the UK. Ecol Model Zorn W (2013) Effect of mineral and organic fertilization on crop 212:138–146. https://doi.org/10.1016/j.ecolmodel.2007.10.019 yield, nitrogen uptake, carbon and nitrogen balances, as well as soil Debeljak M, Cortet J, Demšar D, Krogh PH, Džeroski S (2007) organic carbon content and dynamics: results from 20 European Hierarchical classification of environmental factors and agricultural long-term field experiments of the twenty-first century. Arch practices affecting soil fauna under cropping systems using Bt Agron Soil Sci 59(8):1017–1040. https://doi.org/10.1080/ maize. Pedobiologia 51:229–238. https://doi.org/10.1016/j.pedobi. 03650340.2012.704548 2007.04.009 Kuzmanovski V, Trajanov A, Leprince F, Džeroski S, Debeljak M (2015) Diepenbrock W, Ellmer F, Léon J (2009) Ackerbau, Pflanzenbau und Modeling water outflow from tile-drained agricultural fields. Sci Pflanzenzüchtung. UTB Verlag Eugen Ulmer Stuttgart. ISBN: Total Environ 505:390–401. https://doi.org/10.1016/j.scitotenv. 9783825238438 2014.10.009 Džeroski S, Goethals B, Panov P (2010) Inductive databases and Lehtinen T, Dersch G, Söllinger J, Baumgarten A, Schlatter N, constraint-based data mining. Springer, New York ISBN: 978-1- Aichberger K, Spiegel H (2017) Long-term amendment of four dif- 4419-7738-0 ferent compost types on a loamy silt Cambisol: impact on soil or- Fayyad U, Piatetsky-Shapiro G, Smyth P (1996) From data mining to ganic matter, nutrients and yields. Arch Agron Soil Sci 63(5):663– knowledge discovery in databases. AI Mag 17:37–54. https://doi. 673. https://doi.org/10.1080/03650340.2016.1235264 org/10.1609/aimag.v17i3.1230 Lehtinen T, Schlatter N, Baumgarten A, Bechini L, Krüger J, Grignani C, Franko U, Spiegel H (2016) Modeling soil organic carbon dynamics in an Zavattaro L, Costamagna C, Spiegel H (2014) Effect of crop residue Austrian long-term tillage field experiment. Soil Till Res 156:83–90. incorporation on soil organic carbon and greenhouse gas emissions https://doi.org/10.1016/j.still.2015.10.003 in European agricultural soils. Soil Use Manage 30(4):524–538. Friedrich T, Derpsch R, Kassam A (2012) Overview of the global spread https://doi.org/10.1111/sum.12151 of conservation agriculture. Field actions science reports, special Mirtl M, Bahn M, Battin T, Borsdorf A, Dirnböck T, English M, issue 6, Reconciling Poverty Eradication and Protection of the Erschbamer B, Fuchsberger J, Gaube V, Grabherr G, Gratzer Environment, http://factsreports.revues.org/1941 G, Haberl H, Klug H, Kreiner D, Mayer R, Peterseil J, Richter Gerendás J, Führs H (2013) The significance of magnesium for crop A, Schindler S, Stocker-Kiss A, Tappeiner U, Weisse T, quality. Plant Soil 368:101–128. https://doi.org/10.1007/s11104- Winiwarter V, Wohlfahrt G, Zink R (2015) Research for the 012-1555-2 future - LTER-Austria white paper 2015 - on the status and Goldstein A, Fink L, Meitin A, Bohadana S, Lutenberg O, Ravid G orientation of process oriented ecosystem research, biodiversity (2017) Applying machine learning on sensor data for irrigation rec- and conservation research and socio-ecological research in ommendations: revealing the agronomist’s tacit knowledge. Precis Austria. In: LTER-Austria series, Austrian Society for Long- Agric 47(4):1–24. https://doi.org/10.1007/s11119-017-9527-4 Term Ecological Research, Vienna, Austria, pp. 74. ISBN:978- Gransee A, Führs H (2013) Magnesium mobility in soils as a challenge 3-9503986-1-8 for soil and plant analysis, magnesium fertilization and root uptake Poeplau C, Reiter L, Berti A, Kätterer T (2017) Qualitative and quantita- under adverse growth conditions. Plant Soil 368(1):5–21. https:// tive response of soil organic carbon to 40 years of crop residue doi.org/10.1007/s11104-012-1567-y incorporation under contrasting nitrogen fertilisation regimes. Soil Grzebisz W (2013) Crop response to magnesium fertilization as affected Res 55(1):1–9. https://doi.org/10.1071/SR15377 by nitrogen supply. Plant Soil 368:23–39. https://doi.org/10.1007/ Poeplau C, Kätterer T, Bolinder MA, Börjesson G, Berti A, Lugato E s11104-012-1574-z (2015) Low stabilization of aboveground crop residue carbon in Hazell P, Wood S (2008) Drivers of change in global agriculture. Philos sandy soils of Swedish long-term experiments. Geoderma 237: Trans R Soc Lond B Biol Sci 363(1491):495–515. https://doi.org/ 246–255. https://doi.org/10.1016/j.geoderma.2014.09.010 10.1098/rstb.2007.2166 Rashid A, Ryan J (2004) Micronutrient constraints to crop production in HijbeekR,van Ittersum MK,ten Berge HFM, Gort G,Spiegel H, soils with Mediterranean-type characteristics: a review. J Plant Nutr Whitmore AP (2017) Do organic inputs matter—a meta-analysis 27(6):959–975. https://doi.org/10.1081/PLN-120037530 of additional yield effects for arable crops in Europe. Plant Soil Schulte RPO, Bampa F, Bardy M, Coyle C, Creamer RE, Fealy R, Gardi 411(1):293–303. https://doi.org/10.1007/s11104-016-3031-x C, Ghaely BB, Jordan P, Laudon H, O’Donoghue C, O’hUallacháin Hobbs PR, Sayre K, Gupta R (2008) The role of conservation agriculture D, O’Sullivan L, Rutgers M, Six J, Toth GL, Vrebos D (2015) in sustainable agriculture. Philos Trans R Soc Lond B Biol Sci 363: Making the most of our land: managing soil functions from local 543–555. https://doi.org/10.1098/rstb.2007.2169 to continental scale. Front Environ Sci 3(81). https://doi.org/10. Jarvis S, Hutchings N, Brentrup F, Olesen JE, Van de Hoek KW (2011) 3389/fenvs.2015.00081 Nitrogen flows in farming systems across Europe. In: Sutton MA, Schulte RPO, Creamer RE, Donnellan T, Farrelly N, Fealy R, Howard CM, Erisman JW, Billen G, Bleeker A, Grennfelt P, Van O’Donoghue C, O’hUallachain D (2014) Functional land manage- Grinsven H, Grizzetti B (eds) The European Nitrogen Assessment, ment: a framework for managing soil-based ecosystem services for 211–228. Cambridge University Press. https://doi.org/10.1017/ the sustainable intensification of agriculture. Environ Sci Pol 38:45– CBO9780511976988.013 58. https://doi.org/10.1016/j.envsci.2013.10.002 Jiawei H, Kamber M, Pei J (2006) Data mining: concepts and techniques. Senbayram M, Gransee A, Wahle V, Thiel H (2015) Role of magnesium Morgan Kaufmann. ISBN: 978012381479 fertilisers in agriculture: plant–soil continuum. Crop Pasture Sci Ke 66(12):1219–1229. https://doi.org/10.1071/CP15104 rtész A, Madarász B (2014) Conservation agriculture in Europe. Int Soil Water Conserv Res 2:91–96. https://doi.org/10.1016/S2095- Shekoofa A, Emam Y, Shekoufa N, Ebrahimi M, Ebrahimie E (2014) 6339(15)30016-2 Determining the most important physiological and agronomic traits Kopittke PM, Menzies NW (2007) A review of the use of the basic cation contributing to maize grain yield through machine learning algo- rithms: a new avenue in intelligent agriculture. PLoS One 9(5): saturation ratio and the BIdeal^ Soil. Soil Sci Soc Am J 71(2):259– 265. https://doi.org/10.2136/sssaj2006.0186 e97288. https://doi.org/10.1371/journal.pone.0097288 Körschens M, Albert E, Armbruster M, Barkusky D, Baumecker M, Spiegel H, Sandén T, Dersch G, Baumgarten A, Gründling R, Franko U Behle-Schalk L, Bischoff R, Cergan Z, Ellmer F, Herbst F, (2018). Soil organic matter and nutrient dynamics following differ- Hoffmann S, Hofmann B, Kismanyoky T, Kubat J, Kunzova E, ent management of crop residues at two sites in Austria. Book Using data mining techniques to model primary productivity from international long-term ecological research... Chapter in BSoil Management and Climate Change: Effects on carbon for laboratory routines: three long-term field experiments in Austria. Soil Res 53(2):190–204. https://doi.org/10.1071/SR14200 Organic Carbon, Nitrogen Dynamics and Greenhouse Gas Emissions^,253–265, Elsevier. ISBN: 978-0-12-812128-3 Tatzber M, Stemmer M, Spiegel H, Katzlberger C, Landstetter C, Haberhauer G, Gerzabek MH (2012) 14C-labeled organic amend- Spiegel H, Dersch G, Baumgarten A, Hösch J (2010) The international ments: characterization in different particle size fractions and humic organic nitrogen long-term fertilisation experiment (IOSDV) at acids in a long-term field experiment. Geoderma 177–178:39–48. Vienna after 21 years. Arch Agron Soil Sci 56:405–420. https:// https://doi.org/10.1016/j.geoderma.2012.01.028 doi.org/10.1080/03650341003645624 Tatzber M (2009) Decomposition of Carbon-14-labeled organic Spiegel H, Dersch G, Hösch J, Baumgarten A (2007) Tillage effects on amendments and humic acids in a long-term field experiment. soil organic carbon and nutrient availability in a long-term field Soil Sci Soc Am J 73(3):744–750. https://doi.org/10.2136/ experiment in Austria. Die Bodenkultur 58:47–58 sssaj2008.0235 Spiegel H, Pfeffer M, Hösch J (2002) N dynamics under reduced tillage. Tatzber M, Stemmer M, Spiegel H, Katzlberger C, Zehetner F, Arch Agron Soil Sci 48:503–512. https://doi.org/10.1080/ Haberhauer G, Garcia-Garcia E, Gerzabek MH (2009) Spectroscopic behaviour of 14C-labeled humic acids in a long- Spiegel H (2001) Results of three long-term P-field experiments in term field experiment with three cropping systems. Soil Res 47(5): Austria: 1 Report: Effects of different types and quantities of P- 459–469. https://doi.org/10.1071/SR08231 fertiliser on yields and P CAL-contents in soils | Ergebnisse von drei Tatzber M, Stemmer M, Spiegel H, Katzlberger C, Haberhauer G, 40-jährigen P-Dauerversuchen in Österreich: 1. Mitteilung: Gerzabek MH (2008) Impact of different tillage practices on molec- Auswirkungen ausgewählter P-Düngerformen und -mengen auf ular characteristics of humic acids in a long-term field experiment— den Ertrag und die P CAL-Gehalte im Boden. Bodenkultur 52(1): an application of three different spectroscopic methods. Sci Total 3–17 Environ 406:256–268. https://doi.org/10.1016/j.scitotenv.2008.07. Tatzber M, Klepsch S, Soja G, Reichenauer T, Spiegel H, Gerzabek M (2015a) Determination of soil organic matter features of extractable Trajanov A (2011) Machine learning in agroecology: from simulation fractions using capillary electrophoresis: an organic matter stabiliza- models to co-existence rules. Lambert Academic Publishing tion study in a Carbon-14-labeled long-term field experiment. In: He (LAP), Germany ISBN:978-3845471334 Z, Wu F (eds) Labile organic matter—chemical compositions, func- Veenadhari S, Mishra B, Singh CD (2011) Soybean productivity model- tion, and significance in soil and the environment. SSSA Special ling using decision tree algorithms. Int J Comput Appl T 27(7):11– Publication 62. © 2015. SSSA, Madison. https://doi.org/10.2136/ 15. https://doi.org/10.13140/RG.2.1.3852.1846 sssaspecpub62.2014.0033 Witten IH, Frank E (2011) Data mining: practical machine learning tools Tatzber M, Schlatter N, Baumgarten A, Dersch G, Körner R, Lehtinen T, and techniques - 3rd edition. Morgan Kaufmann. ISBN:978-0-12- Unger G, Mifek E, Spiegel H (2015b) KMnO4 determination of active 374856-0 http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png Regional Environmenal Change Springer Journals

Using data mining techniques to model primary productivity from international long-term ecological research (ILTER) agricultural experiments in Austria

Free
13 pages

Loading next page...
 
/lp/springer_journal/using-data-mining-techniques-to-model-primary-productivity-from-FSZ0Qna066
Publisher
Springer Journals
Copyright
Copyright © 2018 by The Author(s)
Subject
Environment; Climate Change; Climate Change/Climate Change Impacts; Oceanography; Geography, general; Regional/Spatial Science; Nature Conservation
ISSN
1436-3798
eISSN
1436-378X
D.O.I.
10.1007/s10113-018-1361-3
Publisher site
See Article on Publisher Site

Abstract

Primary productivity is in the foundation of farming communities. Therefore, much effort is invested in understanding the factors that influence the primary productivity potential of different soils. The International Long-Term Ecological Research (ILTER) is a network that enables valuable comparisons of data in understanding environmental change. In this study, we investigate three ILTER cropland sites and one long-term field experiment (LTE) outside of the ILTER network. The focus is on the influence of different management practices (tillage, crop residue incorporation, and compost amendments) on primary productivity. Data mining analyses of the experimental data were carried out in order to investigate trends in the productivity data. We generated predictive models that identify the influential factors that govern primary productivity. The data mining models achieved very high predictive performance (r > 0.80) for each of the sites. Preceding crop and crop of the current year were crucial for primary productivity in the tillage LTE and compost LTE, respectively. For both crop residue incorporation LTEs, plant-available Mg affected productivity the most, followed by properties such as soil pH, SOM, and the crop residue management. The results obtained by data mining are in line with previous studies and enhance our knowledge about the driving forces of primary productivity in arable systems. Hence, the models are considered very suitable and reliable for predicting the primary productivity at these ILTER sites in the future. They may also encourage researcher- farmer-advisor-stakeholder interaction, and thus create enabling environment for cooperation for further research around these ILTER sites. . . . . . Keywords Soil functions Crop yield Plant-available Mg Tillage Compost amendments Crop residue incorporation Introduction * Aneta Trajanov Primary productivity, the capacity of a soil to produce plant aneta.trajanov@ijs.si biomass for human use (as food, feed, fuel, or fiber), is one of Heide Spiegel the cornerstones of prosperous farming communities. adelheid.spiegel@ages.at Accordingly, farmers need to focus on multiple soil functions Marko Debeljak in order to maintain the productivity function of the soil marko.debeljak@ijs.si (Schulte et al. 2014) and to help secure the viability of farms for the next generations. This includes the soils’ provision of Taru Sandén taru.sanden@ages.at clean drinking water, the recycling of nutrients, carbon se- questration, and soil serving as a habitat for biota (Schulte et Department of Knowledge Technologies, Jozef Stefan Institute, al. 2015). To this end, several improved management practices Jamova cesta 39, 1000 Ljubljana, Slovenia are being applied in the field. No-tillage or non-inversion till- Jozef Stefan International Postgraduate School, Jamova cesta 39, age practices are being promoted to reduce the labor and crop 1000 Ljubljana, Slovenia production costs, but also due to their positive effects on soil Department for Soil Health and Plant Nutrition, Institute for organic matter (SOM) and soil aggregate stability (Bauer et al. Sustainable Plant Production, Austrian Agency for Health & Food 2015; Tatzber et al. 2008; Spiegel et al. 2007). Incorporation Safety – AGES, Spargelfeldstrasse 191, 1220 Vienna, Austria A. Trajanov et al. of crop residues and various organic fertilizer amendments, tasks and discover different patterns in the data (e.g., decision such as composts and farmyard manure, is feasible options trees, clusters, equations, rules). These algorithms search for substituting mineral nitrogen fertilizers (Spiegel et al. through the space of patterns (models) to find interesting pat- 2010). Currently, conservation agriculture is becoming more terns that are valid in the given data. wide-spread among the global farming community. Already This study was designed to predict primary productivity approximately 125 million hectares of land are managed by and to identify the driving factors that govern primary produc- the principles of conservation agriculture (Friedrich et al., tivity by means of data mining. To this end, we addressed the 2012; Brouder and Gomez-Macpherson, 2014). The defini- following questions within the framework of the selected field tion of conservation agriculture includes minimum, non- experiments: inversion or reduced tillage, combined with retention of crop residues on the soil surface and crop rotation. The aim is to (1) Can data mining help make reliable predictive models of conserve soil and water for optimum productivity (Hobbs et primary productivity from LTE (long-term field al. 2008; Kertész and Madarász 2014). experiments) data? The International Long-Term Ecological Research (1) What are the driving factors of primary productivity in (ILTER) represents a network that enables valuable compari- the selected arable LTEs that are sufficiently fertilized sons of data for understanding environmental change. with main nutrients? Nonetheless, cropland sites are still underrepresented in the (2) Do the selected management practices influence primary network, and more sites would be needed for global compar- productivity? isons. In Austria, only four agricultural ILTER sites are includ- ed in the network of 38 sites in total (Mirtl et al. 2015). This study focuses on three of the cropland sites (tillage and crop residue incorporation), as well as one long-term field experi- Materials and methods ment (compost amendments) outside of the ILTER network. The previous investigations of the selected LTER sites have International Long-Term Ecological Research (ILTER) focused on how the different improved management practices, experimental sites i.e., tillage (Franko and Spiegel 2016; Tatzber et al. 2008; Spiegeletal. 2007), crop residue incorporation (Spiegel et al. This paper investigated data from four Austrian long-term 2018), cropping systems (Tatzber et al. 2009, 2012, 2015a, b; field experiments (LTEs, Fig. 1). Tatzber 2009), and organic/compost amendments (Lehtinen et al. 2017; Hijbeek et al. 2017; Körschens et al. 2013), affected Tillage the soil properties as well as soil productivity. However, no analyses of the experimental data have been carried out in order The long-term field experiment investigating different tillage to determine patterns in the productivity data. management (tillage LTE) was established in Fuchsenbigl There are many positive examples of using data mining (Table 1). In brief, the experiment included three different techniques for building predictive models in the field of agri- tillage systems (minimum, reduced, and conventional tillage) cultural and environmental sciences (Bondi et al. 2018; Bui et (Spiegel et al. 2002, 2007; Tatzber et al. 2008). The experi- al. 2009;Debeljaket al. 2007, 2008; Goldstein et al. 2017; ment consisted of a randomized block design, the plots mea- Kuzmanovski et al. 2015; Shekoofa et al. 2014; Trajanov suring 60 m × 12 m each. The crop rotation was not fixed and 2011). Their biggest advantage is that they are applied on consisted of the most important crops for the region such as easily obtainable empirical data, and the parametrization of cereals, sugar beet, maize, and grain legumes. the data mining models is done automatically from the data; hence it is not influenced by the subjectivity of the modelers. Crop residue incorporation By applying data mining methods, data sets from long-term field experiments can be turned into an understandable struc- Two long-term field experiments were established to investi- ture, and interpretable patterns (i.e., long-term trends and their gate the management of crop residues, crop residue LTE in drivers) in the data can be identified. Rutzendorf, and crop residue LTE in Grabenegg. Both sites Data mining, as a part of the Knowledge Discovery in have recently been described by Spiegel et al. (2018). The Databases (KDD) process, uses machine learning and statisti- field experiments consisted of a randomized block design with cal methods in order to find interesting patterns in data four replicates, each plot measuring 32 m × 6 m (192 m )in (Fayyad et al. 1996). The goal of data mining is to extract Rutzendorf and 30 m × 7.5 m (225 m ) in Grabenegg. There information from datasets that is intelligible and useful in an were four P-fertilization stages (0, 33, 66, 131 kg P ha-1y-1), understandable and easily interpretable format. Different data and all crop residues were either incorporated or removed in mining algorithms are used to address different data mining the treatments. The crop rotation was not fixed and consisted Using data mining techniques to model primary productivity from international long-term ecological research... Fig. 1 Map of the long-term experiment (LTE) locations in Austria of the most important crops for both regions, such as cereals, compost, manure compost, and sewage sludge compost −1 sugar beet, grain maize, and grain legumes. plots (each treatment corresponding to 175 kg N ha )with a crop rotation of winter wheat, winter barley, maize, and pea (without compost application). In further variants, the Compost amendments four compost amendments are fertilized with 80 kg mineral −1 N(NH NO )ha . The long-term compost LTE was designed in Ritzlhof near 4 3 Linz, Upper Austria, to study the effects of different com- post amendments on chemical, physical, and microbial soil Data mining methods parameters and plants. The compost LTE and its soils have previously been described in Lehtinen et al. (2017)and ref- In our study, we used data mining algorithms for generation erences therein. The field experiment consists of a random- of decision trees (Breiman et al. 1984), in particular model ized block design with four replicates, the plots measuring and regression trees. The algorithms for building decision 5 m ×6 m (=30 m ). The field trial includes a control plot trees are one of the most commonly used data mining algo- (zero N), minerally fertilized plots (40 kg N, 80 kg N, rithms. They predict the value of a dependent variable −1 −1 120 kg N ha y ) and biowaste compost, green waste (termed target attribute) from a set of independent variables Table 1 The long-term agricultural field experiments of AGES (Austrian Agency for Health & Food Safety) Tillage LTE Crop residue LTE - Grabenegg Crop residue LTE - Rutzendorf Compost LTE GPS coordinates 48° 11′ N16° 44′E48°12′ N15° 15′E48°21′ N16° 61′E48°18′ N14° 25′ E ILTER site name LTER_EU_AT_030 LTER_EU_AT_038 LTER_EU_AT_030 n.a. Year established 1988 1986 1982 1991 Soil type (WRB, 2015) Haplic Chernosem Gleyic Luvisol Calcaric Phaeozem Cambisol MAT 9.4 °C 8.5 °C 9.1 °C 8.5 °C MAP 529 mm 836 mm 540 mm 753 mm Texture (% clay/silt/sand) 22/41/37 16/77/7 23/52/26 17/69/14 pH (CaCl ) 7.6 6.6 7.5 6.8 SOC 1.2% 1.0% 2.2% 1.2% A. Trajanov et al. (attributes). They are hierarchical models that contain inter- is to use tenfold cross-validation as a standard method nal and terminal nodes, connected with edges. In each inner for their evaluation. node, the value of an attribute is tested and compared to a Another measure of predictive performance is Root constant value. The edges coming out from the node corre- Mean Square Error (RMSE) (Witten and Frank 2011). It is spond to the outcome of the test. The leaf nodes contain the a measure that reports the average magnitude of the error. It predictions of the target attribute that apply to all samples is the square root of the average of squared differences be- that fall in that leaf. To predict the value of the target attri- tween prediction and actual values of the target attribute. bute for a new sample, it is routed down the tree according to To model the influence of different agricultural manage- the values of the attributes that are tested in each node. ment techniques on primary productivity, we used the data When the sample reaches a leaf, it is given the prediction mining package WEKA (Witten and Frank 2011), which assigned to the leaf. implements a large collection of machine learning algo- When the values of the target attribute are numeric, the rithms for different data mining tasks. In this study, we used leaves of the tree contain models for predicting it. The the model and regression tree algorithm M5P. For generat- models can be piece-wise linear regression equations, in ing regression trees with syntactic constraints, we used the which case the decision trees are known as model trees, or decision and rule induction system CLUS (Blockeel and can be constant values and in this case, the decision trees are Struyf 2002). termed regression trees. When generating regression trees, syntactic constraints can also be used (Džeroski et al. 2010). Data description The syntactic constraints influence the process of building the trees by defining a partial structure of the tree, from The data from the four Austrian long-term field experi- which point on the tree is generated automatically, follow- ments, described in the BInternational Long-Term ing the regular regression tree algorithm. Ecological Research (ILTER) experimental sites^ section, In this study, we generated model and regression trees were organized and preprocessed in order to be analyzed for each of the LTE experimental sites (tillage LTE, crop using data mining techniques. The data comprising the three residue LTEs, and compost LTE). For easier interpretation LTEs datasets are presented in Table 2. The attributes in- of the model trees, we calculated the average predicted cluded the long-term monitoring data available from each of value of the samples that fall into each leaf according to the experiments; thus, the attributes differed slightly be- its piece-wise equation, as well as their average actual tween the LTEs. values. When interpreting decision (classification or re- Although the general structure of all datasets was similar, gression) trees, we start from the top (root) of the tree. each dataset was preprocessed in a unique way in order to The most important factors that influence the target attri- correctly address the goal and obtain the most accurate and bute (primary productivity in our case) appear at the top. interpretable data mining models possible. The structure of the The importance of the attributes decreases as you move separate datasets from each experimental site is explained in towards the lower levels of the tree. the following sections. There are different measures of predictive performance to assess how good the data mining models describe our data. To assess the performance of regression and model Tillage trees, we used the correlation coefficient as a measure: it quantifies the statistical correlation between the predicted The tillage dataset consisted of data from 18 years of exper- and the real values of the target attribute. The values of iments (1998–2015), yielding 162 samples, described with the correlation coefficient can vary between 1 (perfect soil parameters, management techniques (tillage), and crop correlation) and − 1 (perfect negative correlation) through yields. In addition to these original attributes, for each ex- 0 (no correlation at all). In addition, to assess how good ample, we included the soil properties of the preceding year the model performs on new (test) data, we used the ten- (derived attributes) in order to check whether the soil prop- fold cross-validation technique (Witten and Frank 2011). erties of the preceding year and the type of tillage applied on In cross-validation, the dataset is split into n approxi- the field influence the crop yield in the current year. mately equal partitions (folds). Each fold is (in turn) used Three types of crops were grown at the tillage experimental for testing, while the remaining folds are used for train- site: sugar beet, grain maize, and cereals. These crops have ing (building) the model. This procedure is repeated n significantly different absolute yields. Therefore, we divided times and, at the end, the correlation coefficients obtained the dataset into three subsets, according to crop classes: in the different iterations are averaged to obtain the over- all correlation coefficient of the data mining model. A & Sugar beet (number of samples 18) common practice when generating data mining models & Grain maize (number of samples 36) Using data mining techniques to model primary productivity from international long-term ecological research... Table 2 Each long-term experiment dataset comprised of data describing the soil properties of the experimental sites, the management techniques used, and the crop yield Attributes Tillage long-term experiment Crop residue long-term Compost long-term experiment experiments Crop • Sugar beet Cereals (winter wheat, • Maize • Grain maize winter barley) • Cereals (spring wheat, winter wheat, • Cereals (winter wheat, spring wheat, winter barley, and pea) soybean, winter barley, spring barley) + preceding crops Soil • Percentage of total nitrogen (N) • Percentage of total • Percentage of total nitrogen (N) • Mineralized N (mg/kg/7day) nitrogen (N) • Mineralized N (mg/kg/7day) • Percentage of total soil organic • Mineralized N • Percentage of total soil organic carbon (SOC) (mg/kg/7day) carbon (SOC) • Plant available phosphorus (P) (mg/kg) • Percentage of total soil • Plant available phosphorus • Plant available potassium (K) (mg/kg) organic carbon (SOC) (P) (mg/kg) • Plant available magnesium • Plant available phosphorus • Plant available potassium (Mg) (mg/kg) (P) (mg/kg) (K) (mg/kg) • Soil pH • Plant available potassium • Plant available magnesium • Percentage of carbonate content (CAO) (K) (mg/kg) (Mg) (mg/kg) • Exchangeable calcium (Ca) (cmolc/kg) • Plant available magnesium • Soil pH • Exchangeable magnesium (Mg) (mg/kg) • C to N ratio (C/N) (Mg) (cmolc/kg) • Soil pH • Exchangeable potassium (K) (cmolc/kg) • CtoN ratio (C/N) • Exchangeable sodium (Na) (cmolc/kg) • Cation exchange capacity (CEC) (cmolc/kg) • C to N ratio (C/N) + preceding soil attributes Management • Minimum tillage • Removal of crop residues • Mineral fertilization (0, 40, 80 • Reduced tillage • Incorporation of crop residues or 120 kg N/ha/y) • Conventional tillage • Compost amendments (bio-waste, green waste, manure or sewage sludge compost corresponding to 175 kg N/ha/y) • Combination of compost and mineral fertilization (compost with additional 80 kg mineral N/ha/year) Primary Yield (kg/ha) Yield (kg/ha) Yield (kg/ha) productivity & Cereals: winter wheat, spring wheat, soybean, winter bar- The attributes from which CEC and C/N are calculated ley, spring barley (number of samples 108) were excluded in scenarios 2 and 4 in order to avoid correla- tions in the investigated attributes. For the data mining analyses, we generated five scenar- ios using different combinations of original and derived attributes: Crop residue incorporation & Scenario 1: Original attributes WITHOUT CEC and C/N The data from the crop residue incorporation experimental & Scenario 2: Original attributes WITH CEC and C/N (exclud- sites consisted of data from two LTEs—Rutzendorf and ing the attributes from which CEC and C/N are calculated) Grabenegg. Each dataset comprised 5 years of experiments & Scenario 3: Original and derived attributes WITHOUT (2002, 2008, 2010, 2012, and 2014), yielding 160 samples, CEC and C/N described by soil properties, management practices (crop res- & Scenario 4: Original and derived attributes WITH CEC idue incorporation or removal), and the crop yield. Data about and C/N (excluding the attributes from which CEC and preceding years were not included in these datasets because C/N are calculated) the data in these LTEs were not collected for consecutive & Scenario 5: Original and derived attributes WITH CEC years. At these experimental sites, only cereal crops were and C/N and syntactic constraints (forcing the tillage attri- grown, so there was no need to divide the datasets according bute at the top of the decision trees) to crop type. A. Trajanov et al. Here, we performed two scenarios for both Rutzendorf and Crop residue incorporation Grabenegg datasets: The regression trees of both trials highlight that the most impor- & Scenario 1: Using total nitrogen and total soil organic tant attribute for primary productivity in the crop residue incorpo- carbon attributes and excluding the C/N attribute ration long-term experiments was the plant-available Mg (Fig. 3). & Scenario 2: Using C/N and excluding the total nitrogen For modeling the influence of soil and crop properties as and total soil organic carbon attributes well as crop residue incorporation on primary productivity, we had one dataset for each LTE, for which we carried out two scenarios. The predictive performances of the model and Compost amendments regression trees obtained for the two datasets and for both sce- narios are veryhigh(Table 4). This makes them very reliable The compost amendment dataset consisted of 8 years of ex- for predicting and modeling the primary productivity in a field. periments (1998, 2001, 2002, 2003, 2005, 2007, 2012, and The best models were obtained for scenario 1. The regres- 2015), yielding 384 samples, described by soil properties, sion trees for Grabenegg and Rutzendorf are presented in Fig. 3. management practices (type of fertilization and compost amendment), and crop yield. At this experimental site, two Compost amendments classes of crops were grown: maize and cereals (spring wheat, winter wheat, winter barley, and pea). As in the case of the The regression tree for modeling the primary productivity in tillage LTE data and to avoid biasing the data mining models, cereals (Fig. 4) shows that in the compost amendment LTE, we divided the dataset into two subsets because the two types the crop grown on the field and the treatment applied on the of crops have significantly different absolute crop yields in kg/ field were the major drivers of primary productivity. ha: one consisting of data only for maize (144 examples) and As in the crop residue LTE analyses, for the compost the other for cereals (240 examples). Data on preceding crops amendments experimental site, we developed models for grown on the fields were also included in the datasets. two scenarios, using the C/N ratio and using total soil organic Here, we also carried out two scenarios: carbon and total nitrogen as separate attributes. The predictive performances of the obtained models are presented in Table 5. & Scenario 1: Using total nitrogen and total soil organic The correlation coefficients of the models for both types of carbon attributes and excluding the C/N attribute crops were very high, 0.78 and 0.94, respectively. The models & Scenario 2: Using C/N and excluding the total nitrogen obtained for the cereals dataset have especially high correlation and total soil organic carbon attributes coefficients, which make the predictions very reliable. The re- gression tree obtained for the cereals dataset is presented in Fig. 4. Results Discussion Tillage Tillage The results of the obtained model and regression trees for the In the tillage LTE, the crop rotation mimicked the current man- tillage experimental site are presented in Table 3 in terms of agement practices in the area, i.e., the most common agricultural correlation coefficients (r) and Root Mean Square Error crops were grown in 3–6-year crop rotations. Various aspects (RMSE). influence a farmer’s decision which crop to grow, all of which Figure 2 indicates that the preceding crop was of pivotal may act at a local, regional, or even at the global scale (Hazell importance for primary productivity. The predictive perfor- and Wood 2008). They may include the farm type, the economic mance of the models obtained for sugar beet was low. market, the technological opportunities at hand, the possibilities However, due to the low number of samples (18 in total) in for government or EU subsidies, as well as the nature of the this dataset, these results are unreliable. The best results farmer’s soil (Hazell and Wood 2008; Bennett et al. 2012). If (models) were obtained for grain maize and cereals, where economic market trends influence the choice of a crop of the the highest correlation coefficients (0.83 and 0.84, respective- season, the expected yields are probably one of the most impor- ly) were obtained for scenario 4. Overall, the correlation co- tant driving factors for the choice. Our data mining models efficients of the models for grain maize and cereals were clearly showed (Fig. 2) that cereal yields were significantly low- higher in scenarios 3 and 4, where we used soil and crop data er when sugar beet or winter wheat was the preceding crop of the preceding year, compared to the scenarios 1 and 2, compared to e.g., soybean or spring wheat. These differences where we used data only for the current year. may reflect a combination of factors, including how the grown Using data mining techniques to model primary productivity from international long-term ecological research... Table 3 Predictive performance in terms of correlation coefficient (r) and Root Mean Square Error (RMSE) of the model and regression trees obtained for the tillage International Long-Term Ecological Research (ILTER) experiments Scenarios 1: 2: 3: 4: 5: Original attr. Original attr. Original and Original and Original and derived without CEC with CEC derived attr. derived attr. attr. with CEC and and C/N and C/N without CEC with CEC C/N and a syntactic and C/N and C/N constraint Sugar beet Model trees r 0.62 0.69 0.42 0.54 / RMSE (kg/ha) 4552.73 3999.71 5182.84 4784.59 / Regression trees r − 0.22 − 0.22 − 0.14 − 0.25 0.68 RMSE 5928.43 5928.55 5865.84 5896.83 4264.75 (kg/ha) Grain maize Model trees r 0.70 0.72 0.82 0.83 / RMSE 905.21 884.38 730.81 711.35 / (kg/ha) Regression trees r 0.54 0.52 0.78 0.80 0.73 RMSE 1096.85 1108.93 986.36 978.81 877.4 (kg/ha) Cereals Model trees r 0.68 0.77 0.81 0.84 / RMSE 649.20 560.99 520.27 480.83 / (kg/ha) Regression trees r 0.47 0.68 0.79 0.74 0.67 RMSE 782.09 675.00 593.08 627.59 653.78 (kg/ha) The bold values represent the best results obtained for different scenarios. crops influence the soil-plant interphase with regard to soil prop- The importance of the preceding soil and crop properties is erties, pests, pathogens and soil microorganisms (Bennett et al. also evident from the generated model and regression trees. The 2012), and residual effects on the succeeding crops—just to best model tree, obtained for the cereals dataset and scenario 4 mention a fraction of the possibilities. (Fig. 2), shows that the most important attributes for predicting Fig. 2 Model tree for modeling the primary productivity of cereals in the tillage LTE A. Trajanov et al. Fig. 3 Regression trees for modeling the primary productivity in the crop residue incorporation long-term experiments: a regression tree for Grabenegg, and b regression tree for Rutzendorf yield were the preceding crop, the preceding yield, the C/N ratio, We were also interested in determining how the manage- and the preceding CEC and plant-available phosphorus. These ment practice, in this case soil tillage, influences primary pro- soil parameters are well-known to affect plant nutrition. A small- ductivity. However, the attributes describing the management er C/N ratio indicates more rapid decomposition of soil organic practice did not appear in the model and regression trees in matter and, thus, release of plant-available mineral nitrogen scenarios 1 to 4. Thus, in scenario 5, we generated regression (Jarvis et al. 2011). The sum of exchangeable cations—in alka- trees with syntactic constraints. Here, we defined the partial line soils mainly Ca2+, Mg2+, Na+ and K+, which are adsorbed structure of the tree (syntactic constraint) in a way that we on the exchange complex of the soil—also indicate the nutri- Bforced^ the management attributes to be at the top of the tree tional status of the soil and may inform about deficiencies and from there, the tree was generated automatically from the (Kopittke and Menzies 2007). Phosphorus is a key nutrient data. Nonetheless, the correlation coefficients of these regres- and essential for optimizing yields. Spiegel (2001) showed in sion trees were lower than in scenarios 3 and 4. We therefore another long-term experiment at the same site that plant- conclude that soil tillage, as a management practice, is less available P was significantly positively correlated with spring important for primary productivity than the current and pre- barley, sugar beet, and winter wheat yields. Thus, the model ceding soil and crop properties. This is in line with former results are in line with earlier findings that yields were enhanced findings from this field experiment that, on average, yields if CEC and plant-available phosphorus showed higher values. did not differ between the investigated tillage practices (Franko and Spiegel 2016; Spiegel et al. 2002, 2010). Crop residue incorporation Table 4 Predictive performance in terms of correlation coefficient (r) and Root Mean Square Error (RMSE) of the model and regression trees obtained for the crop residue incorporation International Long-Term Gerendás and Führs (2013) recently reviewed the literature on Ecological Research (ILTER) experiments the effect of magnesium on crop quality. Their review on cereals agrees with our results of positive yield response to Scenarios plant-available Mg in the crop residue incorporation experi- 1: 2: ments in Rutzendorf and Grabenegg. Gerendás and Führs Without C/N With C/N (2013), however, also show that there is not necessarily any quality response to Mg beyond the yield maximum. The mag- Grabenegg Model trees r 0.90 0.88 nesium available for plants depends on several factors includ- RMSE 655.82 704.15 ing soil pH, soil moisture, weathering, and microbial activity Regression trees r 0.76 0.84 of the soil (Senbayram et al. 2015). Grzebisz (2013) described RMSE 991.79 870.33 the so-called Bmagnesium-induced nitrogen uptake^ which Rutzendorf Model trees r 0.94 0.93 highlights the positive effect of magnesium on the nitrogen RMSE 537.0 568.88 uptake efficiency of the plants. Many factors, among them Regression trees r 0.92 0.91 source rock material and its properties and grade of RMSE 762.93 769.19 weathering as well as management practices such as crop The bold values represent the best results obtained for different scenarios. rotation and fertilization practices, influence the availability Using data mining techniques to model primary productivity from international long-term ecological research... Fig. 4 Regression tree for modeling the primary productivity of cereals in the compost amendments long-term experiment of Mg to plants (Gransee and Führs 2013). In our study, it was supply may be more prone to challenges due to severe envi- plant-available Mg in the soils, not Mg fertilization, that was ronmental conditions such as heat (Cakmak 2013). investigated and most important for crop yields. All the treat- Besides Mg, also the pH value of the soil, SOC, and po- ments in Grabenegg and Rutzendorf were sufficiently sup- tential N mineralization (only in Grabenegg)—but also the plied with N, P, and K. This may explain why plant- crop residue management and the crop type (only available Mg became so important in our model (Gransee Rutzendorf)—affected primary productivity. In the and Führs 2013). Magnesium is an essential plant nutrient that Grabenegg LTE, soils with slightly acidic pH values had is one of the building blocks of chlorophyll (Gerendás and higher crop yields than soils with lower acidity. In contrast, Führs 2013). Magnesium is also involved in enzyme activa- at Rutzendorf LTE, higher yields were achieved at alkaline tion, ATP formation and utilization, and growth of roots as soil pH levels. In both experiments, with higher soil pH, crop well as in seed formation (Cakmak 2013), making it very residue management influenced crop yields. Yields increased important for the whole life cycle of a plant. In intensively with long-term incorporation of crop residues. SOC was very farmed soils, such as soils of our two long-term field experi- different in the two experiments: low in Grabenegg and high ments, Mg balances become even more important for crop in Rutzendorf. In Grabenegg, higher SOC led to higher yields, yields due to a possible rapid depletion of Mg of the soils whereas this was not the case in the already highly supplied (Cakmak 2013). Moreover, wheat grown under low Mg Rutzendorf soils. This is in line with results from 20 long-term field experiments in Germany (Körschens et al. 2013), where the relevance of initial SOC was emphasized. Both pH and Table 5 Predictive performance in terms of correlation coefficient (r) SOC are fundamental for soil fertility, especially for the bio- and Root Mean Square Error (RMSE) of the model and regression trees obtained for the compost amendment International Long-Term logical activity of soils (Diepenbrock et al. 2009), driving Ecological Research (ILTER) experiments important biogeochemical cycles (e.g., C, N, and P). Former studies have also revealed that long-term crop residue incor- Scenarios poration leads to higher SOC compared to the yearly removal 1: 2: (Lehtinen et al. 2014; Poeplau et al. 2015, 2017; Spiegel et al. Without C/N With C/N 2018). Lehtinen et al. (2014) showed a significant increase in SOC when crop residues were incorporated, but did not find Maize Model trees r 0.78 0.69 significant correlations between SOC and crop yields. A gen- RMSE 901.70 1046.58 eral 6% increase in yields following crop residue incorpora- Regression trees r 0.73 0.69 tion, as compared to crop residue removal, was observed. RMSE 1015.61 1069.30 Poeplau et al. (2015, 2017) have shown similar increase Cereals Model trees r 0.97 0.97 ranges in SOC following crop residue incorporation in RMSE 466.57 469.72 Sweden and Italy. The interplay between soil organic matter Regression trees r 0.93 0.93 and attainable yields also puzzled Hijbeek et al. (2017), who RMSE 720.68 720.68 investigated how different organic inputs affected crop yields. Their assumption was that increased soil organic matter leads The bold values represent the best results obtained for different scenarios. A. Trajanov et al. to increased crop yields, but to their surprise, the increase was 2000; Debeljak and Džeroski 2011;Jiaweietal. 2006; not statistically significant when 20 European long-term ex- Veenadhari et al. 2011). Because of their ability to represent periments were investigated together. They explained this by the relationships between the attributes in a visual way, the differences between experimental sites and the soil properties, discovered patterns and knowledge about the problem can as well as organic inputs used in the various experiments be easily interpreted. Therefore, the created decision trees (Hijbeek et al. 2017). could also be used to strengthen the researcher-farmer- advisor-stakeholder dialog and to foster co-creation of Compost amendments new research questions with high farmer relevance. From the top attributes from the decision trees, the farmers could In general, pea and spring wheat achieved lower yields than disentangle what may be limiting their productivity and winter wheat and winter barley (Diepenbrock et al. 2009), how to improve it. not least because of the shorter growing season. For the The construction of data mining models (model or re- cereal crops, spring and winter wheat and winter barley, gression trees) proceeds automatically from the available fertilization was an important driver for yields. No or low data, minimizing the researchers’ subjectivity during the mineral fertilization and only compost amendments resulted generation of the models. This means that the form of the in lower yields compared to sufficient mineral fertilization models and the interactions among the variables are induced or a combination of compost and mineral fertilization. Pea, a automatically from the data and not set by the experts. Since legume, did not use the given fertilizer either in mineral or in the models were generated from data, they can be easily organic form. Our modeling results coincide with the results validated using different validation techniques such as ten- of conventional statistical analyses of a similar data set ob- fold cross-validation, train-test sets, or leave-one-out vali- tained from this long-term compost experiment (Lehtinen et dation (Witten and Frank 2011). The limitations are con- al. 2017). In farm management, caution should be exercised nected to availability of long-term data. In case data on the with short crop rotations or when focusing on only the most role of soil aggregation or soil microbiology in productivity yielding crops. This is because the long-term maintained is not available, their importance in producing biomass can- crop yields are more important from a sustainability point not be shown. This calls for extending the attributes that are of view than maximum profit for a single cropping season. being monitored in LTEs. In addition, the essential micronutrients may be neglected The data mining models are predictive models; therefore, from the fertilization schemes when only a few crops are the validated models that achieve high predictive perfor- beingconsidered(RashidandRyan 2004). Production of mance can be used to predict future scenarios of the same only a few different crops may require less technical equip- type and under similar conditions as the ones that were used for constructing the models. Finally, the data mining models ment at the farm (Bullock 1992), although the farmer may observe yield declines after several years (Bennett et al. are usually presented in a form, such as a decision tree, 2012). The effects of fertilization on crop yields are which is intuitive and easy to interpret by the researchers. known, and the effect of compost combined with mineral The data mining models generated in this study achieved fertilizer was also shown by Lehtinen et al. (2017)fromthe better predictive performance for each of the LTEs than the same compost LTE. The current models confirm the previ- statistical studies previously carried out on the same data ous results and show that compost amendment alone may be (Spiegel et al. 2002, 2007, 2018; Lehtinen et al. 2017). insufficient to match the yields reached with mineral fertil- They are therefore very suitable and reliable for predicting izer. This may reflect slow nutrient release from the com- the primary productivity at the experimental sites in the posts (Amlinger et al. 2003; Alluvione et al. 2013), which is future. A great advantage of using data mining methodolo- ca. 5–15% in the year of compost amendment and only 2– gy over classical statistical or mechanistic models is the 8% in the following years (Amlinger et al. 2003). simple and fast construction of models that can be easily adapted to new data. Accordingly, having an established Applicability and scalability of the results data collection system at the LTE sites would simplify upscaling the predictive data mining models to newest data. There are several advantages of using data mining methods Having more data from these sites will make the models over classical statistical methods. First, the analyses are not more general and might further improve the predictive per- limited to only a few attributes or pair-wise comparisons for formances. Collecting data on a regional basis and covering modeling a certain soil function, but all available data can be important farming regions would improve regional model- used. Using all the available data enables discovering inter- ing efforts. The farmers could, for example, include their esting and new—often unexpected—patterns from the data soil monitoring data into the models, in order to find more (Buczko et al. 2018). This can provide new knowledge and regional patterns. The created decision trees could support researcher-farmer-advisor dialog on productivity insights about the problem at hand (De’ath and Fabricius Using data mining techniques to model primary productivity from international long-term ecological research... Funding information LANDMARK has received funding from the management. However, obtaining empirical data from ad- European Union’s Horizon 2020 research and innovation programme ditional experimental sites is most often a difficult and time- under grant agreement No 635201. consuming task and presents a limiting factor when apply- Open Access This article is distributed under the terms of the Creative ing data mining technologies in the field of agronomy. Commons Attribution 4.0 International License (http:// Incomplete or inconsistent data can bias the data mining creativecommons.org/licenses/by/4.0/), which permits unrestricted use, models, so the completeness and quality of the data is also distribution, and reproduction in any medium, provided you give appro- an important factor in this approach. priate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. Conclusions References Our study has generated primary production data mining Alluvione F, Fiorentino N, Bertora C, Zavattaro L, Fagnano M, models with high predictive performance for all four LTEs Chiarandà FQ, Grignani C (2013) Short-term crop and soil response selected. The most important driving factors for productivity to C-friendly strategies in two contrasting environments. Eur J were preceding crop, plant-available Mg and crop of the Agron 45:114–123. https://doi.org/10.1016/j.eja.2012.09.003 growing year for tillage LTE, crop residue LTEs, and compost Amlinger F, Götz B, Dreher P, Geszti J, Weissteiner C (2003) Nitrogen in biowaste and yard waste compost: dynamics of mobilisation and LTE, respectively. In addition, soil properties such as soil pH, availability—a review. Eur J Soil Biol 39(3):107–116. https://doi. SOC, C/N ratio, preceding CEC, and preceding plant- org/10.1016/S1164-5563(03)00026-8 available P played a role. Crop residue incorporation as well Bauer T, Strauss P, Grims M, Kamptner E, Mansberger R, Spiegel H as sufficient mineral fertilization or combined compost and (2015) Long-term agricultural management effects on surface mineral fertilizer treatments of the soils proved to be effective roughness and consolidation of soils. Soil Till Res 151:28–38. https://doi.org/10.1016/j.still.2015.01.017 measures to optimize crop yields. Bennett AJ, Bending GD, Chandler D, Hilton S, Mills P (2012) Meeting In this study, data mining techniques were used for the the demand for crop production: the challenge of yield decline in first time in these LTEs to discover knowledge and patterns crops grown in short rotations. Biol Rev 87(1):52–71. https://doi. from the data. While the model and regression trees gener- org/10.1111/j.1469-185X.2011.00184.x Blockeel H, Struyf J (2002) Efficient algorithms for decision tree cross- ated in this study are region specific, the data mining ap- validation. J Mach Learn Res 3:621–650 proach enabled the effects of changing management and Bondi G, Creamer R, Ferrari A, Fenton O, Wall D (2018) Using machine soil, along with soil fertility parameters over time, to be learning to predict soil bulk density on the basis of visual parame- assessed in the context of crop yields and productivity at ters: tools for in-field and post-field evaluation. Geoderma 318:137– 147. https://doi.org/10.1016/j.geoderma.2017.11.035 the sites. The knowledge obtained from our predictive Breiman L, Freidman JH, Olshen RA, Stone CJ (1984) Classification and models can be utilized by farmers in this region to predict regression trees. Wadsworth & Brooks, Monterey ISBN: how future management will affect the productivity of their soils. In a more general context, this methodology can be Brouder SM, Gomez-Macpherson H (2014) The impact of conservation employed in other regions, where long-term data sets com- agriculture on smallholder agricultural yields: a scoping review of the evidence. Agric Ecosyst Environ 187:11–32. https://doi.org/10. prising a few critical but widely measured soil and crop 1016/j.agee.2013.08.010 parameters are available. This approach enables performing Buczko U, van Laak M, Eichler-Löbermann B, Gans W, Merbach I, Panten structural dynamic modeling, which is one of the main K, Peiter E, Reitz T, Spiegel H, von Tucher S (2018) Re-evaluation of methodological goals in ecological modeling when dynam- the yield response to phosphorus fertilization based on meta-analyses of long-term field experiments. Ambio 47(Supplement 1):50–62. https:// ic, unpredictable systems are involved. doi.org/10.1007/s13280-017-0971-1 These results are also important in understanding the driv- Bui E, Henderson B, Viergever K (2009) Using knowledge discovery ing forces of primary productivity in arable systems that are with data mining from the Australian soil resource information sys- sufficiently fertilized with main nutrients (nitrogen, phospho- tem database to inform soil carbon mapping in Australia. Glob rous, and potassium). We can highlight which management Biogeochem Cycles 23:GB4033. https://doi.org/10.1029/ 2009GB003506 practices positively influence crop yields. This calls for further Bullock DG (1992) Crop rotation. Crit Rev Plant Sci 11(4):309–326. investigations on other agricultural management practices, as https://doi.org/10.1080/07352689209382349 well as for upscaling the results to a larger geographical area. Cakmak I (2013) Magnesium in crop production, food quality and human health. Plant Soil 368:1–4. https://doi.org/10.1007/s11104-013-1781-2 Acknowledgements This study was conducted as a part of the De’ath G, Fabricius KE (2000) Classification and regression trees: a LANDMARK (LAND Management: Assessment, Research, powerful yet simple technique for ecological data analysis. Knowledge Base) project. The authors wish to thank David Wall from Ecology 81(11):3178–3192 http://epubs.aims.gov.au/11068/5812 Teagasc, Ireland, Bhim Bahadur Ghaley from the University of Debeljak M, Džeroski S (2011) Decision trees in ecological Modelling. Copenhagen, Denmark, and Kirsten Madena from the The Chamber of In: Jopp F, Reuter H, Breckling B (eds) Modelling complex ecolog- Agriculture of Lower Saxony (CALS), Germany, for proofreading the ical dynamics. Springer, Berlin. https://doi.org/10.1007/978-3-642- manuscript and providing useful comments. 05029-9_14 A. Trajanov et al. Debeljak M, Squire G, Demšar D, Young MW, Džeroski S (2008) Relations Lopez-Fando C, Merbach I, Merbach W, Pardor MT, Rogasik J, Rühlmann J, Spiegel H, Schulz E, Tajnsek A, Toth Z, Wegener H, between the oilseed rape volunteer seedbank, and soil factors, weed functional groups and geographical location in the UK. Ecol Model Zorn W (2013) Effect of mineral and organic fertilization on crop 212:138–146. https://doi.org/10.1016/j.ecolmodel.2007.10.019 yield, nitrogen uptake, carbon and nitrogen balances, as well as soil Debeljak M, Cortet J, Demšar D, Krogh PH, Džeroski S (2007) organic carbon content and dynamics: results from 20 European Hierarchical classification of environmental factors and agricultural long-term field experiments of the twenty-first century. Arch practices affecting soil fauna under cropping systems using Bt Agron Soil Sci 59(8):1017–1040. https://doi.org/10.1080/ maize. Pedobiologia 51:229–238. https://doi.org/10.1016/j.pedobi. 03650340.2012.704548 2007.04.009 Kuzmanovski V, Trajanov A, Leprince F, Džeroski S, Debeljak M (2015) Diepenbrock W, Ellmer F, Léon J (2009) Ackerbau, Pflanzenbau und Modeling water outflow from tile-drained agricultural fields. Sci Pflanzenzüchtung. UTB Verlag Eugen Ulmer Stuttgart. ISBN: Total Environ 505:390–401. https://doi.org/10.1016/j.scitotenv. 9783825238438 2014.10.009 Džeroski S, Goethals B, Panov P (2010) Inductive databases and Lehtinen T, Dersch G, Söllinger J, Baumgarten A, Schlatter N, constraint-based data mining. Springer, New York ISBN: 978-1- Aichberger K, Spiegel H (2017) Long-term amendment of four dif- 4419-7738-0 ferent compost types on a loamy silt Cambisol: impact on soil or- Fayyad U, Piatetsky-Shapiro G, Smyth P (1996) From data mining to ganic matter, nutrients and yields. Arch Agron Soil Sci 63(5):663– knowledge discovery in databases. AI Mag 17:37–54. https://doi. 673. https://doi.org/10.1080/03650340.2016.1235264 org/10.1609/aimag.v17i3.1230 Lehtinen T, Schlatter N, Baumgarten A, Bechini L, Krüger J, Grignani C, Franko U, Spiegel H (2016) Modeling soil organic carbon dynamics in an Zavattaro L, Costamagna C, Spiegel H (2014) Effect of crop residue Austrian long-term tillage field experiment. Soil Till Res 156:83–90. incorporation on soil organic carbon and greenhouse gas emissions https://doi.org/10.1016/j.still.2015.10.003 in European agricultural soils. Soil Use Manage 30(4):524–538. Friedrich T, Derpsch R, Kassam A (2012) Overview of the global spread https://doi.org/10.1111/sum.12151 of conservation agriculture. Field actions science reports, special Mirtl M, Bahn M, Battin T, Borsdorf A, Dirnböck T, English M, issue 6, Reconciling Poverty Eradication and Protection of the Erschbamer B, Fuchsberger J, Gaube V, Grabherr G, Gratzer Environment, http://factsreports.revues.org/1941 G, Haberl H, Klug H, Kreiner D, Mayer R, Peterseil J, Richter Gerendás J, Führs H (2013) The significance of magnesium for crop A, Schindler S, Stocker-Kiss A, Tappeiner U, Weisse T, quality. Plant Soil 368:101–128. https://doi.org/10.1007/s11104- Winiwarter V, Wohlfahrt G, Zink R (2015) Research for the 012-1555-2 future - LTER-Austria white paper 2015 - on the status and Goldstein A, Fink L, Meitin A, Bohadana S, Lutenberg O, Ravid G orientation of process oriented ecosystem research, biodiversity (2017) Applying machine learning on sensor data for irrigation rec- and conservation research and socio-ecological research in ommendations: revealing the agronomist’s tacit knowledge. Precis Austria. In: LTER-Austria series, Austrian Society for Long- Agric 47(4):1–24. https://doi.org/10.1007/s11119-017-9527-4 Term Ecological Research, Vienna, Austria, pp. 74. ISBN:978- Gransee A, Führs H (2013) Magnesium mobility in soils as a challenge 3-9503986-1-8 for soil and plant analysis, magnesium fertilization and root uptake Poeplau C, Reiter L, Berti A, Kätterer T (2017) Qualitative and quantita- under adverse growth conditions. Plant Soil 368(1):5–21. https:// tive response of soil organic carbon to 40 years of crop residue doi.org/10.1007/s11104-012-1567-y incorporation under contrasting nitrogen fertilisation regimes. Soil Grzebisz W (2013) Crop response to magnesium fertilization as affected Res 55(1):1–9. https://doi.org/10.1071/SR15377 by nitrogen supply. Plant Soil 368:23–39. https://doi.org/10.1007/ Poeplau C, Kätterer T, Bolinder MA, Börjesson G, Berti A, Lugato E s11104-012-1574-z (2015) Low stabilization of aboveground crop residue carbon in Hazell P, Wood S (2008) Drivers of change in global agriculture. Philos sandy soils of Swedish long-term experiments. Geoderma 237: Trans R Soc Lond B Biol Sci 363(1491):495–515. https://doi.org/ 246–255. https://doi.org/10.1016/j.geoderma.2014.09.010 10.1098/rstb.2007.2166 Rashid A, Ryan J (2004) Micronutrient constraints to crop production in HijbeekR,van Ittersum MK,ten Berge HFM, Gort G,Spiegel H, soils with Mediterranean-type characteristics: a review. J Plant Nutr Whitmore AP (2017) Do organic inputs matter—a meta-analysis 27(6):959–975. https://doi.org/10.1081/PLN-120037530 of additional yield effects for arable crops in Europe. Plant Soil Schulte RPO, Bampa F, Bardy M, Coyle C, Creamer RE, Fealy R, Gardi 411(1):293–303. https://doi.org/10.1007/s11104-016-3031-x C, Ghaely BB, Jordan P, Laudon H, O’Donoghue C, O’hUallacháin Hobbs PR, Sayre K, Gupta R (2008) The role of conservation agriculture D, O’Sullivan L, Rutgers M, Six J, Toth GL, Vrebos D (2015) in sustainable agriculture. Philos Trans R Soc Lond B Biol Sci 363: Making the most of our land: managing soil functions from local 543–555. https://doi.org/10.1098/rstb.2007.2169 to continental scale. Front Environ Sci 3(81). https://doi.org/10. Jarvis S, Hutchings N, Brentrup F, Olesen JE, Van de Hoek KW (2011) 3389/fenvs.2015.00081 Nitrogen flows in farming systems across Europe. In: Sutton MA, Schulte RPO, Creamer RE, Donnellan T, Farrelly N, Fealy R, Howard CM, Erisman JW, Billen G, Bleeker A, Grennfelt P, Van O’Donoghue C, O’hUallachain D (2014) Functional land manage- Grinsven H, Grizzetti B (eds) The European Nitrogen Assessment, ment: a framework for managing soil-based ecosystem services for 211–228. Cambridge University Press. https://doi.org/10.1017/ the sustainable intensification of agriculture. Environ Sci Pol 38:45– CBO9780511976988.013 58. https://doi.org/10.1016/j.envsci.2013.10.002 Jiawei H, Kamber M, Pei J (2006) Data mining: concepts and techniques. Senbayram M, Gransee A, Wahle V, Thiel H (2015) Role of magnesium Morgan Kaufmann. ISBN: 978012381479 fertilisers in agriculture: plant–soil continuum. Crop Pasture Sci Ke 66(12):1219–1229. https://doi.org/10.1071/CP15104 rtész A, Madarász B (2014) Conservation agriculture in Europe. Int Soil Water Conserv Res 2:91–96. https://doi.org/10.1016/S2095- Shekoofa A, Emam Y, Shekoufa N, Ebrahimi M, Ebrahimie E (2014) 6339(15)30016-2 Determining the most important physiological and agronomic traits Kopittke PM, Menzies NW (2007) A review of the use of the basic cation contributing to maize grain yield through machine learning algo- rithms: a new avenue in intelligent agriculture. PLoS One 9(5): saturation ratio and the BIdeal^ Soil. Soil Sci Soc Am J 71(2):259– 265. https://doi.org/10.2136/sssaj2006.0186 e97288. https://doi.org/10.1371/journal.pone.0097288 Körschens M, Albert E, Armbruster M, Barkusky D, Baumecker M, Spiegel H, Sandén T, Dersch G, Baumgarten A, Gründling R, Franko U Behle-Schalk L, Bischoff R, Cergan Z, Ellmer F, Herbst F, (2018). Soil organic matter and nutrient dynamics following differ- Hoffmann S, Hofmann B, Kismanyoky T, Kubat J, Kunzova E, ent management of crop residues at two sites in Austria. Book Using data mining techniques to model primary productivity from international long-term ecological research... Chapter in BSoil Management and Climate Change: Effects on carbon for laboratory routines: three long-term field experiments in Austria. Soil Res 53(2):190–204. https://doi.org/10.1071/SR14200 Organic Carbon, Nitrogen Dynamics and Greenhouse Gas Emissions^,253–265, Elsevier. ISBN: 978-0-12-812128-3 Tatzber M, Stemmer M, Spiegel H, Katzlberger C, Landstetter C, Haberhauer G, Gerzabek MH (2012) 14C-labeled organic amend- Spiegel H, Dersch G, Baumgarten A, Hösch J (2010) The international ments: characterization in different particle size fractions and humic organic nitrogen long-term fertilisation experiment (IOSDV) at acids in a long-term field experiment. Geoderma 177–178:39–48. Vienna after 21 years. Arch Agron Soil Sci 56:405–420. https:// https://doi.org/10.1016/j.geoderma.2012.01.028 doi.org/10.1080/03650341003645624 Tatzber M (2009) Decomposition of Carbon-14-labeled organic Spiegel H, Dersch G, Hösch J, Baumgarten A (2007) Tillage effects on amendments and humic acids in a long-term field experiment. soil organic carbon and nutrient availability in a long-term field Soil Sci Soc Am J 73(3):744–750. https://doi.org/10.2136/ experiment in Austria. Die Bodenkultur 58:47–58 sssaj2008.0235 Spiegel H, Pfeffer M, Hösch J (2002) N dynamics under reduced tillage. Tatzber M, Stemmer M, Spiegel H, Katzlberger C, Zehetner F, Arch Agron Soil Sci 48:503–512. https://doi.org/10.1080/ Haberhauer G, Garcia-Garcia E, Gerzabek MH (2009) Spectroscopic behaviour of 14C-labeled humic acids in a long- Spiegel H (2001) Results of three long-term P-field experiments in term field experiment with three cropping systems. Soil Res 47(5): Austria: 1 Report: Effects of different types and quantities of P- 459–469. https://doi.org/10.1071/SR08231 fertiliser on yields and P CAL-contents in soils | Ergebnisse von drei Tatzber M, Stemmer M, Spiegel H, Katzlberger C, Haberhauer G, 40-jährigen P-Dauerversuchen in Österreich: 1. Mitteilung: Gerzabek MH (2008) Impact of different tillage practices on molec- Auswirkungen ausgewählter P-Düngerformen und -mengen auf ular characteristics of humic acids in a long-term field experiment— den Ertrag und die P CAL-Gehalte im Boden. Bodenkultur 52(1): an application of three different spectroscopic methods. Sci Total 3–17 Environ 406:256–268. https://doi.org/10.1016/j.scitotenv.2008.07. Tatzber M, Klepsch S, Soja G, Reichenauer T, Spiegel H, Gerzabek M (2015a) Determination of soil organic matter features of extractable Trajanov A (2011) Machine learning in agroecology: from simulation fractions using capillary electrophoresis: an organic matter stabiliza- models to co-existence rules. Lambert Academic Publishing tion study in a Carbon-14-labeled long-term field experiment. In: He (LAP), Germany ISBN:978-3845471334 Z, Wu F (eds) Labile organic matter—chemical compositions, func- Veenadhari S, Mishra B, Singh CD (2011) Soybean productivity model- tion, and significance in soil and the environment. SSSA Special ling using decision tree algorithms. Int J Comput Appl T 27(7):11– Publication 62. © 2015. SSSA, Madison. https://doi.org/10.2136/ 15. https://doi.org/10.13140/RG.2.1.3852.1846 sssaspecpub62.2014.0033 Witten IH, Frank E (2011) Data mining: practical machine learning tools Tatzber M, Schlatter N, Baumgarten A, Dersch G, Körner R, Lehtinen T, and techniques - 3rd edition. Morgan Kaufmann. ISBN:978-0-12- Unger G, Mifek E, Spiegel H (2015b) KMnO4 determination of active 374856-0

Journal

Regional Environmenal ChangeSpringer Journals

Published: May 28, 2018

References

You’re reading a free preview. Subscribe to read the entire article.


DeepDyve is your
personal research library

It’s your single place to instantly
discover and read the research
that matters to you.

Enjoy affordable access to
over 18 million articles from more than
15,000 peer-reviewed journals.

All for just $49/month

Explore the DeepDyve Library

Search

Query the DeepDyve database, plus search all of PubMed and Google Scholar seamlessly

Organize

Save any article or search result from DeepDyve, PubMed, and Google Scholar... all in one place.

Access

Get unlimited, online access to over 18 million full-text articles from more than 15,000 scientific journals.

Your journals are on DeepDyve

Read from thousands of the leading scholarly journals from SpringerNature, Elsevier, Wiley-Blackwell, Oxford University Press and more.

All the latest content is available, no embargo periods.

See the journals in your area

DeepDyve

Freelancer

DeepDyve

Pro

Price

FREE

$49/month
$360/year

Save searches from
Google Scholar,
PubMed

Create lists to
organize your research

Export lists, citations

Read DeepDyve articles

Abstract access only

Unlimited access to over
18 million full-text articles

Print

20 pages / month

PDF Discount

20% off