Degree

Bachelor of Science (Computer Science)

Department

Department of Computer Science

School

School of Mathematics and Computer Science (SMCS)

Advisor

Dr. Faisal Iradat, Assistant Professor, Department of Computer Science

Co-Advisor

Dr. Sana Durvesh

Keywords

Emotion Recognition, Sentiment Analysis, Natural Language Processing, Computer Vision, Audio Recognition, Psychiatric Assistant, Mental Health Technology

Abstract

Mental health assessment is often a subjective and time-intensive process that relies heavily on a psychiatrist’s interpretation of patient behavior, speech, and written communication. This project presents a Psychiatric Assistant application designed to support psychiatrists by providing objective, data-driven insights through multimodal emotion recognition. The system integrates three core components: text analytics, computer vision, and audio recognition, enabling the analysis of patient input across written text, facial expressions, and vocal tones. The primary objective of this project is to enhance psychiatric evaluation by classifying emotions with high accuracy and generating automated reports that summarize a patient’s emotional state. Advanced models such as BERT for text, Convolutional Neural Networks (CNNs) for facial analysis, and audio feature extraction techniques for speech recognition were employed to ensure reliable classification. The contribution of this work lies in creating a unified platform that bridges machine learning with clinical utility, offering psychiatrists a tool that reduces bias, improves consistency, and saves time during consultations. By combining multiple modalities, the Psychiatric Assistant delivers a more comprehensive understanding of emotional states, supporting early detection of mental health issues and informed decision-making. This project demonstrates how AI-driven solutions can play a vital role in modern psychiatric care, ultimately aiming to improve patient outcomes and assist professionals in delivering more effective treatment.

Tools and Technologies Used

  • Programming Languages: Python, JavaScript

  • Frameworks & Libraries (ML/NLP): TensorFlow, PyTorch, Transformers (Hugging Face), Scikit-learn, NLTK, OpenCV, Librosa

  • Frontend & Web Development: React (for UI)

  • Backend & APIs: Flask / FastAPI

  • Databases: MongoDB (for storing reports & patient data)

Methodology

The development of the Psychiatric Assistant followed a multimodal machine learning approach, integrating text analytics, computer vision, and audio recognition to classify emotions from diverse patient inputs.

  1. Data Collection & Preprocessing:

    Text data was cleaned, tokenized, and lemmatized for sentiment and emotion classification.

    Facial expression datasets were preprocessed using image augmentation and normalization.

    Audio signals were converted into spectrograms and Mel-frequency cepstral coefficients (MFCCs) for feature extraction.
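
The text-preprocessing step above can be illustrated with a minimal cleaning-and-tokenization pass in plain Python; this is a sketch only, since the project itself used NLTK for tokenization and lemmatization, which is not reproduced here.

```python
import re

# Minimal (hypothetical) cleaning pass: lowercase the input and strip
# everything except letters and whitespace, then split into tokens.
# The actual pipeline used NLTK tokenization and lemmatization.
def clean_and_tokenize(text: str) -> list[str]:
    cleaned = re.sub(r"[^a-z\s]", " ", text.lower())
    return cleaned.split()

print(clean_and_tokenize("I felt really anxious today!!"))
# ['i', 'felt', 'really', 'anxious', 'today']
```
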

  2. Model Development:

    Text Analytics: Fine-tuned transformer-based models (e.g., BERT) were used for emotion classification from textual input.

    Computer Vision: Convolutional Neural Networks (CNNs) were employed for detecting facial expressions from images and video frames.

    Audio Recognition: Deep learning models were trained on extracted MFCC features to classify emotions from voice patterns.
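
Regardless of modality, each model ends in the same final step: converting raw output scores (logits) into a probability distribution over emotion classes and picking the most likely label. The sketch below shows that step in plain Python; the five-way label set is assumed for illustration, as the report does not list the exact emotion classes used.

```python
import math

# Assumed label set for illustration only.
EMOTIONS = ["anger", "fear", "joy", "neutral", "sadness"]

def softmax(logits):
    # Subtract the max logit for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_emotion(logits):
    # Map the highest-probability index back to an emotion label.
    probs = softmax(logits)
    return EMOTIONS[probs.index(max(probs))], probs

label, probs = predict_emotion([0.2, -1.0, 2.5, 0.3, -0.5])
print(label)  # joy
```
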

  3. Integration:

    A web-based platform was developed where the three modalities were combined. Predictions from each model were aggregated to provide a comprehensive emotional profile.
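
One simple way to aggregate the three modalities is late fusion: each model emits a probability vector over the same emotion classes, and the vectors are averaged. The sketch below illustrates this with a weighted average; the equal weighting and the label set are assumptions, not details confirmed by the report.

```python
# Assumed label set for illustration only.
EMOTIONS = ["anger", "fear", "joy", "neutral", "sadness"]

def fuse(modality_probs, weights=None):
    # Weighted average of per-modality probability vectors
    # (equal weights by default, a common late-fusion baseline).
    n = len(modality_probs)
    weights = weights or [1.0 / n] * n
    fused = [0.0] * len(modality_probs[0])
    for w, probs in zip(weights, modality_probs):
        for i, p in enumerate(probs):
            fused[i] += w * p
    return fused

# Hypothetical per-modality outputs for one patient interaction.
text_p  = [0.10, 0.10, 0.60, 0.10, 0.10]
face_p  = [0.20, 0.10, 0.50, 0.10, 0.10]
audio_p = [0.10, 0.20, 0.40, 0.20, 0.10]

fused = fuse([text_p, face_p, audio_p])
print(EMOTIONS[fused.index(max(fused))])  # joy
```

Weights could be tuned per modality (for example, trusting the text model more when audio quality is poor), but the report does not specify the aggregation rule actually used.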

  4. Evaluation & Validation:

    Models were evaluated using metrics such as accuracy, precision, recall, and F1-score. Cross-validation was applied to ensure generalizability.
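
The metrics named above can be computed per class in a one-vs-rest fashion; a minimal sketch on made-up predictions follows (the project's tool list includes Scikit-learn, whose `precision_recall_fscore_support` does this in practice).

```python
# Per-class precision, recall, and F1 (one-vs-rest), plus overall accuracy,
# computed from scratch on hypothetical labels for illustration.
def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def prf1(y_true, y_pred, positive):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical evaluation data.
y_true = ["joy", "joy", "sadness", "joy", "anger"]
y_pred = ["joy", "sadness", "sadness", "joy", "joy"]

print(accuracy(y_true, y_pred))       # 0.6
print(prf1(y_true, y_pred, "joy"))    # precision, recall, F1 for the "joy" class
```
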

Document Type

Restricted Access

Submission Type

BSCS Final Year Project
