Speech emotion recognition promises to play an important role in fields such as healthcare, security, and human-computer interaction (HCI). This talk examines several convolutional neural network architectures for recognizing emotion in Chinese-language utterances. Experiments are conducted with log-Mel spectrum features, pitch, and energy, along with voice activity detection. Further experiments are conducted with spectrograms of the speech utterances. Different pooling operations are also investigated. Finally, preliminary experiments are conducted on cross-language emotion recognition between Chinese and English.
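
As a rough illustration of why the choice of pooling operation matters for utterances of varying length, the sketch below applies two common options, global max pooling and global average pooling over time, to toy feature maps. The abstract does not say which pooling operations the talk compares, so these two are an assumption, and the function names and feature values are hypothetical.

```python
from typing import List

# A feature map as a CNN might produce it: one row per time frame,
# one column per channel. The values used below are made up.
FeatureMap = List[List[float]]


def global_max_pool(fmap: FeatureMap) -> List[float]:
    """Keep the strongest activation of each channel across all time frames."""
    return [max(channel) for channel in zip(*fmap)]


def global_avg_pool(fmap: FeatureMap) -> List[float]:
    """Average each channel's activations across all time frames."""
    n_frames = len(fmap)
    return [sum(channel) / n_frames for channel in zip(*fmap)]


# Hypothetical feature maps for two utterances of different duration.
short_utt: FeatureMap = [[0.1, 0.9], [0.4, 0.2]]
long_utt: FeatureMap = [[0.0, 0.5], [0.8, 0.1], [0.3, 0.3], [0.5, 0.7]]

for name, fmap in (("short", short_utt), ("long", long_utt)):
    print(name, global_max_pool(fmap), global_avg_pool(fmap))
```

Both operations reduce a variable number of frames to one value per channel, so utterances of any duration yield a fixed-length vector for the classifier. Max pooling keeps only the single strongest frame per channel, while average pooling reflects the whole utterance.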