Detection of correct pregnancy status in lactating dairy cattle using MARS data mining algorithm
Date
2022
Authors
Journal Title
Journal ISSN
Volume Title
Publisher
Tubitak Scientific & Technological Research Council Turkey
Access Rights
info:eu-repo/semantics/openAccess
Abstract
This study aimed to determine pregnancy outcomes using the multivariate adaptive regression splines (MARS) algorithm for classification-type problems. For this purpose, data collected in 2020 from a private dairy farm in the Konya region of Türkiye were used to determine pregnancy outcomes in Holstein dairy cattle. The study shows how to carry out the statistical analyses needed to solve classification-type problems with the MARS algorithm and how to use the R packages caret and earth, by means of an R script file. After the analysis, the MARS estimation equation was constructed. For the probability of being pregnant, lactation period, cow age, number of lactations, insemination number, and total lactation milk yield were important, whereas 7-day mean milk yield and last lactation milk yield were not significant. The train function of the caret package was used to determine the number of terms and the degree of interaction that produce the highest accuracy. Goodness-of-fit tests of the optimum model were calculated. To evaluate the generalization ability of the model, training and test sets were created, the classification success graph of the MARS algorithm and the model-building phase were summarized, and the generalization ability of the fitted model was measured. With pregnant status taken as the positive reference class, the correct classification rate of pregnant animals (sensitivity) was 0.9574, and the correct classification rate of non-pregnant animals (specificity) was 0.8370. The overall correct classification rate of the training set (accuracy) was 0.8777. The area under the ROC curve (AUC) was 0.947, which indicates that the optimum specificity value is close to 1.
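The three classification rates reported in the abstract are simple ratios of confusion-matrix counts. The sketch below is written in Python purely as an arithmetic illustration; it is not the R (caret and earth) workflow used in the study. The function name `classification_rates` and the counts passed to it are hypothetical: the abstract does not report a confusion matrix, and the counts were chosen only because their ratios round to the reported values.

```python
def classification_rates(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp: pregnant animals classified as pregnant
    fn: pregnant animals classified as non-pregnant
    tn: non-pregnant animals classified as non-pregnant
    fp: non-pregnant animals classified as pregnant
    """
    sensitivity = tp / (tp + fn)                 # true positive rate
    specificity = tn / (tn + fp)                 # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # overall correct classification rate
    return sensitivity, specificity, accuracy


# Hypothetical counts, NOT taken from the source; they merely reproduce
# the rounded rates quoted in the abstract.
sens, spec, acc = classification_rates(tp=45, fn=2, tn=77, fp=15)
print(f"sensitivity={sens:.4f} specificity={spec:.4f} accuracy={acc:.4f}")
# prints: sensitivity=0.9574 specificity=0.8370 accuracy=0.8777
```

With pregnancy as the positive class, sensitivity uses only the pregnant animals in its denominator and specificity uses only the non-pregnant animals, which is why the two rates can differ substantially while accuracy lies between them.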
Description
Keywords
Logistic regression, classification, binary data, train and test set, Holstein breed
Source
Turkish Journal of Veterinary & Animal Sciences
WoS Q Value
Q4
Scopus Q Value
Q3
Volume
46
Issue
6