Classification and regression tree analysis in modeling the milk yield and conformation traits for Holstein cows in Bulgaria

A. Yordanova1*, S. Gocheva-Ilieva1, H. Kulina1, L. Yordanova2, I. Marinov3

1Department of Applied Mathematics and Modeling, Faculty of Mathematics and Informatics, Plovdiv University Paisii Hilendarski, 24 Tzar Asen, 4000 Plovdiv, Bulgaria
2Department of Mathematics and Informatics, Faculty of Economics, Trakia University, 6000 Stara Zagora, Bulgaria
3Department of Animal Science – Ruminants and Dairy Farming, Faculty of Agriculture, Trakia University, 6000 Stara Zagora, Bulgaria

Abstract. In livestock breeding, identifying the factors that most strongly influence efficiency (e.g. milk yield) is essential for determining the conditions under which overall production results can be improved. Appropriate mathematical methods are very useful for extracting relevant information from the data. The aim of this work is to demonstrate the capabilities of the Classification and Regression Trees (CART) method for statistical data processing, including data of ordinal and nominal type. For a sample of 97 observations of cattle from 4 farms in Bulgaria, two decision trees are built to study the dependence of the 305-day milk yield of Holstein cows on 13 independent variables: 12 conformation traits and farm. The model with the 12 conformation traits as independent variables describes 48% of the data and identifies the main factors for milk quantity: udder width, locomotion, stature and chest width, with normalized importance of 100%, 48.1%, 41.2% and 39%, respectively. The second model includes the farm where the cattle are reared as a 13th independent variable, and this expanded model accounts for 70% of the data. Following the rules obtained for both models, predictions for new data can be made before the end of lactation.
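Two ideas the abstract relies on can be illustrated with a minimal Python sketch: choosing a binary split on one predictor so that milk yield varies as little as possible within the resulting nodes, and rescaling variable importance so that the top variable scores 100%. This is an illustration under stated assumptions, not the authors' software or procedure. The function names are hypothetical, the toy values are invented and are not the study's data, and the least-squares split criterion is the usual choice for CART regression trees but is not stated in the abstract.

```python
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean (0 for an empty node)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(x, y):
    """Find the threshold on one ordinal predictor that most reduces SSE.

    Cases with x <= threshold go to the left child, the rest to the right.
    Returns (threshold, improvement), or (None, 0.0) if no split helps.
    """
    parent = sse(y)
    best_t, best_gain = None, 0.0
    for t in sorted(set(x))[:-1]:  # the largest value cannot split the node
        left = [yi for xi, yi in zip(x, y) if xi <= t]
        right = [yi for xi, yi in zip(x, y) if xi > t]
        gain = parent - (sse(left) + sse(right))
        if gain > best_gain:
            best_t, best_gain = t, gain
    return best_t, best_gain


def normalized_importance(raw):
    """Rescale raw importances so that the largest one equals 100%."""
    top = max(raw.values())
    return {name: 100.0 * value / top for name, value in raw.items()}


# Invented toy values (NOT the study's data): an ordinal trait score
# and a 305-day milk yield in kg for six hypothetical cows.
trait_score = [3, 4, 5, 6, 7, 8]
milk_yield = [6000, 6100, 6050, 8000, 8100, 7900]

threshold, gain = best_split(trait_score, milk_yield)
print(f"split: score <= {threshold}, SSE reduction = {gain:.0f}")

# Hypothetical raw importances, rescaled relative to the largest.
print(normalized_importance({"trait_a": 2.0, "trait_b": 1.0}))
```

A full tree repeats the split search over all predictors and then recursively inside each child node; the rules read off the terminal nodes are what allow a prediction for a new animal.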