Project Description
Speech Emotion Recognition (SER) uses acoustic and prosodic features of speech to classify words, sentences, or whole audio files into emotions such as happiness, anger, and sadness [1]. Emotions can also be mapped into a 2-dimensional physiological space of emotional positivity (valence) and strength (arousal) [2].
Semantic or lexical information can also be used, either on its own or in addition to the acoustic features, but it is not the primary focus here (for a more lexical approach, see another Hack Week project [3]).
Goal for this Hackweek
I will attempt to use pyAudioAnalysis [4] and PyTorch/scikit-learn for supervised learning on audio files (from well-known audio/emotion databases [5]), mapping them to emotion classes. I will also study how continuous variables such as valence and arousal can be extracted. I will investigate SVMs and neural networks. My initial focus will be on understanding the methods at a conceptual level, but I will also try to use a GPU (NVIDIA CUDA or AMD ROCm [6]) instead of a CPU if possible.
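As a rough illustration of the supervised step, here is a minimal sketch that trains an SVM classifier with scikit-learn. It makes several assumptions: the feature vectors are synthetic random data standing in for per-file acoustic features (no real audio or pyAudioAnalysis call is involved), the three labels are just the example emotions named in the description, and the helper make_synthetic_features and all parameter values are hypothetical choices for this sketch.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Example emotion classes taken from the project description.
EMOTIONS = ["happiness", "anger", "sadness"]


def make_synthetic_features(n_per_class=60, n_features=8, seed=0):
    """Hypothetical stand-in for one acoustic feature vector per audio file.

    Each class is a Gaussian cluster around a random center. Real features
    would come from an audio feature extractor instead.
    """
    rng = np.random.default_rng(seed)
    X, y = [], []
    for label in range(len(EMOTIONS)):
        center = rng.normal(loc=0.0, scale=4.0, size=n_features)
        X.append(center + rng.normal(scale=1.0, size=(n_per_class, n_features)))
        y.append(np.full(n_per_class, label))
    return np.vstack(X), np.concatenate(y)


X, y = make_synthetic_features()

# Hold out a quarter of the files for evaluation, keeping classes balanced.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Standardize the features, then fit an RBF-kernel SVM.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
predicted = [EMOTIONS[i] for i in model.predict(X_test[:3])]
print(f"held-out accuracy: {accuracy:.2f}")
print("first predictions:", predicted)
```

The synthetic clusters are deliberately well separated, so the accuracy printed here says nothing about real emotional speech. With a real database, X would hold features extracted from the recordings and y the annotated emotion labels.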
Unsupervised learning would be a further step. An even further step would be to do this for online/real-time voice recordings, as in [7].
Resources
[1] https://hackernoon.com/intro-to-audio-analysis-recognizing-sounds-using-machine-learning-qy2r3ufl
[2] Posner, J., Russell, J., & Peterson, B. (2005). The circumplex model of affect: An integrative approach to affective neuroscience, cognitive development, and psychopathology. Development and Psychopathology, 17(3), 715-734. doi:10.1017/S0954579405050340
[3] https://hackweek.suse.com/20/projects/sentiment-analyzer
[4] https://github.com/tyiannak/pyAudioAnalysis/
[5] http://emodb.bilderbar.info/start.html
[6] https://rocmdocs.amd.com/en/latest/
[7] https://ai.googleblog.com/2019/03/an-all-neural-on-device-speech.html
This project is part of:
Hack Week 20