Degree

Bachelor of Science (Computer Science)

Department

Department of Computer Science

School

School of Mathematics and Computer Science (SMCS)

Advisor

Dr. Muhammad Saeed, Assistant Professor (Visiting Faculty), IBA Karachi

Abstract

NeuralTrace is an AI-powered memory assistance system designed to support individuals with cognitive impairments such as dementia and Alzheimer’s disease. The project's main objective is to provide real-time, context-aware memory augmentation by combining speech recognition, object detection, and scene understanding within a unified mobile platform. By capturing and processing spoken input, visual cues, and spatial information, the system enables natural language recall of past conversations, object locations, and scheduled tasks. The solution is composed of four core modules: an audio analysis pipeline using Whisper and semantic embeddings; a hybrid scene classification model integrating YOLOv8 and ResNet-50; a React Native-based mobile frontend; and a FastAPI backend that supports asynchronous machine learning operations and caregiver alerts via geofencing. Experimental evaluations demonstrate strong performance, including 92.2% transcription accuracy, 89% query match rate, and 84.1% scene classification accuracy. Overall, NeuralTrace presents a scalable and privacy-conscious framework that addresses critical gaps in memory support tools. It lays the foundation for future developments such as multilingual support, emotion-aware recall, and integration with wearable devices.

Tools and Technologies Used

Python, LangChain, FAISS, Hugging Face, YOLOv8, React Native, PyTorch, FastAPI, PostgreSQL

Methodology

NeuralTrace was developed using a modular, component-based approach to integrate speech processing, computer vision, and mobile computing into a unified memory assistance system. The project is divided into four main modules: Audio Analysis, Object Detection, Frontend Interface, and Backend Infrastructure.

The Audio Analysis module captures spoken input from the user and processes it using a multi-stage pipeline. Audio is first cleaned through bandpass filtering and noise reduction. It is then transcribed using the Whisper model. Irrelevant sentences are filtered out using stopword ratios, and meaningful segments are embedded into semantic vectors using SentenceTransformers. These vectors, along with contextual metadata, are stored for fast retrieval using FAISS.
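A minimal sketch of this pipeline is given below, assuming the Whisper "base" checkpoint, the "all-MiniLM-L6-v2" SentenceTransformers model, a small illustrative stopword list, and an assumed stopword-ratio threshold; these are not the project's confirmed settings.

```python
# Hedged sketch of the audio analysis pipeline: clean -> transcribe ->
# filter -> embed -> index. Model names and thresholds are assumptions.
import numpy as np
import whisper
import faiss
from scipy.signal import butter, sosfilt
from sentence_transformers import SentenceTransformer

STOPWORDS = {"the", "a", "an", "is", "are", "i", "you", "it", "and", "or", "to", "of"}

def bandpass(audio: np.ndarray, sr: int, low: float = 300.0, high: float = 3400.0) -> np.ndarray:
    """Keep the speech band; stands in for the cleaning/noise-reduction step."""
    sos = butter(4, [low, high], btype="band", fs=sr, output="sos")
    return sosfilt(sos, audio).astype(np.float32)

def is_meaningful(sentence: str, max_stopword_ratio: float = 0.7) -> bool:
    """Drop sentences dominated by stopwords (threshold is an assumption)."""
    words = sentence.lower().split()
    if not words:
        return False
    return sum(w in STOPWORDS for w in words) / len(words) <= max_stopword_ratio

asr = whisper.load_model("base")
embedder = SentenceTransformer("all-MiniLM-L6-v2")
index = faiss.IndexFlatIP(embedder.get_sentence_embedding_dimension())
memory_store: list[dict] = []  # metadata rows aligned with index positions

def ingest(audio: np.ndarray, sr: int, metadata: dict) -> None:
    """Clean, transcribe, filter, embed, and index one audio clip.
    Audio is assumed to be 16 kHz mono float32 (Whisper's expected input)."""
    cleaned = bandpass(audio, sr)
    result = asr.transcribe(cleaned)
    sentences = [s.strip() for s in result["text"].split(".") if s.strip()]
    keep = [s for s in sentences if is_meaningful(s)]
    if not keep:
        return
    vecs = embedder.encode(keep, normalize_embeddings=True)
    index.add(np.asarray(vecs, dtype=np.float32))
    memory_store.extend({"text": s, **metadata} for s in keep)
```

In practice, ingestion would run as a background task so recording is never blocked by transcription or embedding.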

The Object Detection module classifies indoor scenes by combining object-level and global image features. YOLOv8 is used to detect objects, and a co-occurrence matrix captures their spatial relationships. A fusion model integrates these with scene features extracted via ResNet-50, enabling accurate scene recognition. Temporal smoothing and contextual tracking are used to enhance prediction stability.
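The sketch below illustrates the hybrid design under stated assumptions: a COCO-class YOLOv8 detector, a flattened object co-occurrence matrix, an MLP fusion head over ResNet-50 features, and averaging over recent frames as the temporal smoothing step. The actual fusion architecture, scene classes, and smoothing window may differ.

```python
# Hedged sketch of the hybrid scene classifier: YOLOv8 object co-occurrence
# features fused with ResNet-50 global features. Dimensions are assumptions.
from collections import deque

import torch
import torch.nn as nn
import torchvision.models as models
from ultralytics import YOLO

NUM_OBJECT_CLASSES = 80   # COCO classes detected by YOLOv8 (assumed)
NUM_SCENE_CLASSES = 10    # e.g. kitchen, bedroom, office ... (assumed)

detector = YOLO("yolov8n.pt")
resnet = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
resnet.fc = nn.Identity()  # expose the 2048-d global feature vector
resnet.eval()

class FusionHead(nn.Module):
    """Concatenate the co-occurrence vector with ResNet-50 scene features."""
    def __init__(self):
        super().__init__()
        cooc_dim = NUM_OBJECT_CLASSES * NUM_OBJECT_CLASSES
        self.mlp = nn.Sequential(
            nn.Linear(2048 + cooc_dim, 512), nn.ReLU(),
            nn.Linear(512, NUM_SCENE_CLASSES),
        )

    def forward(self, scene_feat, cooc_feat):
        return self.mlp(torch.cat([scene_feat, cooc_feat], dim=1))

def cooccurrence_vector(class_ids: list[int]) -> torch.Tensor:
    """Flattened symmetric co-occurrence matrix of detected object classes."""
    mat = torch.zeros(NUM_OBJECT_CLASSES, NUM_OBJECT_CLASSES)
    for i in class_ids:
        for j in class_ids:
            if i != j:
                mat[i, j] += 1
    return mat.flatten().unsqueeze(0)

fusion = FusionHead().eval()       # illustrative, untrained head
recent = deque(maxlen=5)           # temporal smoothing over recent frames

@torch.no_grad()
def classify_scene(frame_bgr, frame_tensor):
    """frame_bgr: HxWx3 image for YOLO; frame_tensor: 1x3x224x224 normalized."""
    det = detector(frame_bgr, verbose=False)[0]
    class_ids = [int(c) for c in det.boxes.cls.tolist()]
    logits = fusion(resnet(frame_tensor), cooccurrence_vector(class_ids))
    recent.append(logits.softmax(dim=1))
    return torch.stack(list(recent)).mean(dim=0).argmax(dim=1).item()
```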

The Frontend is built using React Native for cross-platform support on Android and iOS. It offers voice command input, camera access for visual detection, reminder management, and real-time caregiver alerts. Local processing is used where possible to protect user privacy.

The Backend is implemented using FastAPI and handles media uploads, model inference, and communication with the frontend. It supports asynchronous execution for low-latency operations and includes geofencing logic for caregiver alerts. Semantic memory vectors are indexed using pgvector and retrieved through similarity search.
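The following sketch shows how such an asynchronous FastAPI backend might combine a media upload endpoint, a pgvector similarity query, and a haversine-based geofence check; the table and column names, connection string, and geofence radius are hypothetical placeholders, not the project's actual schema.

```python
# Illustrative FastAPI backend sketch: async upload, pgvector nearest-neighbour
# search, and a simple geofence check for caregiver alerts. Names are assumed.
import math

import asyncpg
from fastapi import FastAPI, UploadFile

app = FastAPI()
SAFE_RADIUS_M = 200  # assumed geofence radius around the home location

@app.post("/memories/audio")
async def upload_audio(file: UploadFile):
    """Accept an audio clip and hand it to the (async) ML pipeline."""
    data = await file.read()
    # In the real system this would be queued for transcription/embedding.
    return {"filename": file.filename, "bytes": len(data)}

@app.get("/memories/search")
async def search_memories(query_vector: str, limit: int = 5):
    """Nearest-neighbour search over memory embeddings stored with pgvector."""
    conn = await asyncpg.connect("postgresql://user:pass@localhost/neuraltrace")
    try:
        rows = await conn.fetch(
            "SELECT text, created_at FROM memories "
            "ORDER BY embedding <-> $1::vector LIMIT $2",
            query_vector, limit,
        )
        return [dict(r) for r in rows]
    finally:
        await conn.close()

def outside_geofence(lat, lon, home_lat, home_lon) -> bool:
    """Haversine distance check used to decide whether to alert a caregiver."""
    r = 6371000  # Earth radius in metres
    p1, p2 = math.radians(lat), math.radians(home_lat)
    dp, dl = math.radians(home_lat - lat), math.radians(home_lon - lon)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a)) > SAFE_RADIUS_M
```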

The overall development process was iterative, with each module designed, tested, and optimized independently before full system integration. This ensured scalability, real-time responsiveness, and user privacy while enabling future extensions such as wearable support and multilingual capabilities.

Document Type

Restricted Access

Submission Type

BSCS Final Year Project
