Degree

Bachelor of Science (Computer Science)

Department

Department of Computer Science

School

School of Mathematics and Computer Science (SMCS)

Advisor

Dr. Syed Ali Raza, Assistant Professor, Department of Computer Science

Keywords

eKYC, Customer Onboarding, Know Your Customer

Abstract

Know-Your-Customer (KYC) verification is mandatory for regulated onboarding, yet real-world electronic KYC (eKYC) flows are frequently slow, brittle and inconsistent, and the tooling that exists is affordable only to a handful of large institutions. The gap is acute in fast-digitising markets such as Pakistan, the context of this project, where digital channels already carry the majority of retail payments and new regulation mandates digital onboarding for small and medium enterprises, yet millions of such businesses have no configurable, embeddable identity-verification product built for their documents, languages and price point. This project designs and implements an AI-powered eKYC platform that closes that gap through three ideas working together: a no-code, drag-and-drop flow builder that lets product and compliance teams compose and version onboarding journeys; a durable workflow orchestrator that runs each onboarding as a persisted, resumable instance bound to an immutable flow version; and a modular verification pipeline (guided capture, image quality, OCR with CNIC and MRZ parsing, document authenticity, face matching, liveness, and camera-based fingerprint imaging) whose signals feed an explainable, weighted risk-scoring engine and a configurable rules engine. The system is delivered as three integrated components: a React Native mobile capture SDK, a React/Vite admin and manual-review console, and a Node/Express backend with a PostgreSQL data model, optional queue-based asynchronous workers, and a multi-tenant security and audit layer. Verification primitives reuse established, openly available models (PaddleOCR PP-OCRv5 for text recognition, face-api.js for face embeddings, and Google ML Kit for on-device face/liveness cues), integrated behind tenant-configurable thresholds, with Pakistan-specific handling (CNIC parsing and validation, and Urdu capture guidance).
The result is a working end-to-end prototype that takes a customer from capture to an auditable approve / reject / manual-review decision, with every decision accompanied by per-module scores, applied rule identifiers and an immutable audit trail. We report the system’s architecture and methods, its implementation completeness against seventeen functional requirements, and the limitations that remain before it is a fully production-grade, sellable product.
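To illustrate how per-module scores, applied rule identifiers and a three-way decision can fit together, the following is a minimal sketch and not the project's implementation. The module names, weights, thresholds and the rule identifier `RULE_LIVENESS_MIN` are hypothetical, chosen only for illustration; the abstract states only that weights and thresholds are tenant-configurable.

```python
from dataclasses import dataclass, field

# Hypothetical weights per verification module (assumed, not from the project).
WEIGHTS = {"ocr": 0.2, "authenticity": 0.3, "face_match": 0.3, "liveness": 0.2}


@dataclass
class Decision:
    outcome: str                      # "approve", "reject" or "manual_review"
    risk_score: float                 # 0.0 (low risk) to 1.0 (high risk)
    module_scores: dict               # per-module scores kept for explainability
    applied_rules: list = field(default_factory=list)


def decide(module_scores, approve_below=0.3, reject_above=0.7):
    """Combine module scores (0..1, higher = more trustworthy) into a decision."""
    if not module_scores:
        raise ValueError("at least one module score is required")

    # Weighted average over the modules that actually ran.
    total_weight = sum(WEIGHTS[name] for name in module_scores)
    confidence = sum(WEIGHTS[name] * score
                     for name, score in module_scores.items()) / total_weight
    risk = round(1.0 - confidence, 4)

    applied = []
    # Hypothetical hard rule: a weak liveness signal rejects regardless of score.
    if module_scores.get("liveness", 1.0) < 0.5:
        applied.append("RULE_LIVENESS_MIN")
        return Decision("reject", risk, dict(module_scores), applied)

    if risk <= approve_below:
        outcome = "approve"
    elif risk >= reject_above:
        outcome = "reject"
    else:
        outcome = "manual_review"
    return Decision(outcome, risk, dict(module_scores), applied)


if __name__ == "__main__":
    result = decide({"ocr": 0.9, "authenticity": 0.9,
                     "face_match": 0.9, "liveness": 0.9})
    print(result.outcome, result.risk_score, result.applied_rules)
```

Returning the module scores and rule identifiers alongside the outcome is what lets a reviewer see why a case was approved, rejected or routed to manual review.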

Tools and Technologies Used

Python, TensorFlow/PyTorch (deep learning models), ArcFace/FaceNet (face matching), OCR models (multi-language), React Native, iOS SDK, Android SDK, JavaScript/Web SDK, REST APIs, gRPC, RabbitMQ/Kafka (message queues), Docker/Kubernetes (containerization and orchestration), PostgreSQL/Document DB/Graph DB/Time-series DB (polyglot persistence), AES-256, TLS 1.3, CI/CD pipelines, Vector Database (face embeddings).

Methodology

The project follows an Agile development methodology with short, focused sprints delivering incremental and testable value. Development is organized into eight sequential phases: (1) Foundation and CI/CD setup, (2) Core microservices and data persistence, (3) Capture flows and data pipeline, (4) Baseline ML model development (MVP), (5) Model tuning and risk/decision engine integration, (6) QA, security hardening, and penetration testing, (7) User Acceptance Testing (UAT) and production deployment with observability, and (8) Final documentation and stakeholder handover. Testing is layered (unit → integration → end-to-end) with automated regression runs in the CI pipeline, and sprint planning, daily stand-ups, and retrospectives are used for project management and risk control.

Document Type

Restricted Access

Submission Type

BSCS Final Year Project
