Mental-health monitoring and biometric security analysis have a large impact in the smart world, and speech recognition and voice identification are key technologies in these fields. However, these are very challenging research areas because voice features vary with gender, physical or mental condition, and environmental noise. In this paper, we identify emotional status based on cepstral and jitter coefficients. The cepstral coefficients play an important role because they carry most of the information in the voice signal. Rather than using the entire voice signal, we use short-time significant frames, which are enough to identify the emotional condition of the speaker. Our hybrid framework is practical and computationally inexpensive because it processes only part of the voice signal together with jitter. The method is supported by improved accuracy and true acceptance rate.
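As an illustration of the two feature types named above, the following is a minimal sketch, not the implementation used in this paper. It assumes the standard real cepstrum (inverse DFT of the log magnitude spectrum) for a single short-time frame and the common local (relative) jitter definition over a sequence of pitch periods. The function names `real_cepstrum` and `local_jitter` are hypothetical, and pitch-period extraction is assumed to be done elsewhere.

```python
import cmath
import math


def real_cepstrum(frame):
    """Real cepstrum of one short-time frame (assumed definition):
    inverse DFT of the log magnitude spectrum."""
    n = len(frame)
    # Naive O(n^2) forward DFT; adequate for a short illustrative frame.
    spectrum = [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    # A small floor avoids log(0) for empty frequency bins.
    log_mag = [math.log(abs(s) + 1e-12) for s in spectrum]
    # Inverse DFT of the log magnitude; the real part is kept.
    return [
        sum(log_mag[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def local_jitter(periods):
    """Local (relative) jitter (assumed definition): mean absolute difference
    between consecutive pitch periods, divided by the mean period."""
    if len(periods) < 2:
        raise ValueError("need at least two pitch periods")
    diffs = [abs(b - a) for a, b in zip(periods[:-1], periods[1:])]
    return (sum(diffs) / len(diffs)) / (sum(periods) / len(periods))
```

A per-frame feature vector could then be formed, for example, by concatenating the first few cepstral coefficients with the jitter value; how many coefficients to keep and how frames are selected as "significant" are choices this sketch leaves open.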