Why is XGBoost better than SVM

Predicting a Cold from Speech Using Fisher Vectors; SVM and XGBoost as Classifiers

2020 | OriginalPaper | Book chapter

Abstract

Screening for a cold may help limit its spread. In this study, we present a technique for identifying subjects who have a cold from their speech. To this end, we use frame-level representations of the subjects' recordings. A generative Gaussian Mixture Model (GMM) is fitted to these representations, and the Fisher Vector (FV) approach then encodes each recording as a fixed-length vector, i.e. a Fisher vector. We then compare the classification performance of two algorithms: a linear-kernel SVM and an XGBoost classifier. Because the data sets are highly class-imbalanced, we undersample the majority class. Applying Power Normalization (PN) and Principal Component Analysis (PCA) to the FV features proved effective at improving the classification score: the SVM achieved an Unweighted Average Recall (UAR) of 67.81% on the test set. XGBoost, however, gave better results on the test set using only raw Fisher vectors, achieving a UAR of 70.43%. This latter approach outperformed the original (non-fused) baseline score given in ‘The INTERSPEECH 2017 Computational Paralinguistics Challenge’.
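Three of the building blocks named in the abstract (UAR, Power Normalization, and undersampling of the majority class) are simple enough to sketch in plain Python. The block below is an illustrative sketch, not the authors' implementation: the function names are hypothetical, the exponent `alpha=0.5` is a commonly used value that the abstract does not state, and balancing every class down to the size of the smallest one is an assumption about how the undersampling was done.

```python
import math
import random
from collections import Counter, defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of the per-class recalls, so every class counts equally."""
    totals = Counter(y_true)
    hits = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            hits[truth] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def power_normalize(vector, alpha=0.5):
    """Signed power normalization: sign(x) * |x| ** alpha (alpha is assumed)."""
    return [math.copysign(abs(x) ** alpha, x) for x in vector]


def undersample_majority(samples, labels, seed=0):
    """Randomly keep as many samples per class as the smallest class has."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    n_min = min(len(items) for items in by_class.values())
    kept_samples, kept_labels = [], []
    for label, items in by_class.items():
        for sample in rng.sample(items, n_min):
            kept_samples.append(sample)
            kept_labels.append(label)
    return kept_samples, kept_labels


# Toy data: 8 "healthy" and 2 "cold" recordings, classifier always says "healthy".
y_true = ["healthy"] * 8 + ["cold"] * 2
y_pred = ["healthy"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
uar = unweighted_average_recall(y_true, y_pred)
print(accuracy, uar)  # accuracy is 0.8, but UAR is only 0.5

print(power_normalize([4.0, -9.0, 0.0]))  # [2.0, -3.0, 0.0]

balanced_x, balanced_y = undersample_majority(list(range(10)), y_true)
print(Counter(balanced_y))  # two samples per class remain
```

The toy example shows why UAR is the metric of choice for imbalanced data: a classifier that never predicts "cold" still reaches 80% accuracy, while its UAR stays at chance level (50%) for two classes.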

About this chapter
Title: Predicting a Cold from Speech Using Fisher Vectors; SVM and XGBoost as Classifiers

Book: Speech and Computer

Print ISBN: 978-3-030-60275-8

Electronic ISBN: 978-3-030-60276-5

Copyright Year: 2020

https://doi.org/10.1007/978-3-030-60276-5

DOI: https://doi.org/10.1007/978-3-030-60276-5_15
Authors:
José Vicente Egas-López
Gábor Gosztolya
