Speech Emotion Recognition (SER) uses acoustic/prosodic features of speech to classify words, sentences, or whole audio files into emotions such as happiness, anger, or sadness. Emotions can also be mapped into a two-dimensional physiological space of emotional positivity (valence) and strength (arousal) (Posner et al., 2005).
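To make the valence/arousal idea concrete, here is a tiny illustrative mapping. The coordinates are my own rough placements on the circumplex (Posner et al., 2005), not values taken from any database:

```python
# Rough, illustrative circumplex coordinates (not from any dataset):
# valence = how positive the emotion is, arousal = how strong/activated it is,
# both scaled to [-1, 1].
EMOTION_TO_VALENCE_AROUSAL = {
    "happiness": (0.8, 0.5),    # positive valence, fairly high arousal
    "anger":     (-0.6, 0.8),   # negative valence, high arousal
    "sadness":   (-0.7, -0.4),  # negative valence, low arousal
    "calm":      (0.5, -0.7),   # positive valence, low arousal
}
```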
Semantic/lexical information may also be used on its own or in combination with acoustic features, but that is not the primary focus here (for a more lexical approach, there is another hackweek project).
Goal for this Hackweek
I will attempt to use pyAudioAnalysis and PyTorch/scikit-learn for supervised learning on audio files (from well-known audio/emotion databases), mapping them to emotion classes. I will also study how continuous variables such as valence/arousal can be extracted. I will investigate SVMs and neural networks. My initial focus is to understand the methods at a conceptual level, but I will also try to use a GPU (NVIDIA CUDA or AMD ROCm) instead of a CPU where possible.
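As a first concrete step, a minimal sketch of such a pipeline could look like the following. It assumes pyAudioAnalysis >= 0.3 (older releases use different module and function names), and the directory layout `data/<emotion>/*.wav` is a hypothetical stand-in for whichever emotion database I end up using:

```python
# Minimal SER sketch: pyAudioAnalysis features + a scikit-learn SVM baseline.
import os
import numpy as np
from pyAudioAnalysis import audioBasicIO, ShortTermFeatures
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def file_to_feature_vector(path):
    """Read a wav file and reduce its short-term features to one fixed-length vector."""
    fs, signal = audioBasicIO.read_audio_file(path)
    signal = audioBasicIO.stereo_to_mono(signal)
    # 50 ms windows with 25 ms steps; returns an (n_features x n_frames) matrix
    feats, _names = ShortTermFeatures.feature_extraction(
        signal, fs, int(0.050 * fs), int(0.025 * fs))
    # Averaging over frames gives a simple utterance-level descriptor
    return feats.mean(axis=1)

# Hypothetical layout: one subdirectory per emotion label, e.g. data/anger/x.wav
X, y = [], []
for label in os.listdir("data"):
    for name in os.listdir(os.path.join("data", label)):
        if name.endswith(".wav"):
            X.append(file_to_feature_vector(os.path.join("data", label, name)))
            y.append(label)

X_train, X_test, y_train, y_test = train_test_split(
    np.array(X), np.array(y), test_size=0.2, random_state=0)
clf = SVC(kernel="rbf", C=1.0)  # baseline SVM classifier
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```

For the neural-network side, PyTorch selects the accelerator with `torch.device("cuda" if torch.cuda.is_available() else "cpu")`; conveniently, ROCm builds of PyTorch expose the GPU through the same `torch.cuda` interface, so one check covers both vendors.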
Unsupervised learning would be a further step. An even further step would be to do this for online/real-time voice recordings.
Posner, J., Russell, J. A., & Peterson, B. S. (2005). The circumplex model of affect: An integrative approach to affective neuroscience, cognitive development, and psychopathology. Development and Psychopathology, 17(3), 715-734. doi:10.1017/S0954579405050340
This project is one of a kind!