Degree

Bachelor of Science (Computer Science)

Department

Department of Computer Science

School

School of Mathematics and Computer Science (SMCS)

Advisor

Dr. S.M. Faisal Iradat, Assistant Professor, Department of Computer Science

Keywords

Invoice Processing, Federated Learning, Optical Character Recognition (OCR), Deep Learning, Natural Language Processing

Abstract

This project is an AI-Based Cost Management System built to automate the procurement document processing pipeline for organizations. The system takes scanned invoices, Purchase Orders (POs) and Goods Receipts (GRs) in PDF or image format and extracts structured financial fields using a multi-engine OCR pipeline that combines Tesseract and GOT-OCR with fine-tuned LayoutLMv3 and Donut transformer models. It then performs three-way matching to validate each invoice against its corresponding PO and GR. Anomaly detection combines a rule-based checker with an unsupervised Isolation Forest model. A privacy-preserving Federated Learning framework extends this by training a neural anomaly detector across distributed clients using FedAvg with Differential Privacy, so that raw financial data never leaves client premises. The full system is exposed through a FastAPI backend and visualized through a Streamlit analytics dashboard. It is trained and evaluated on a mix of real-world scanned document datasets, with the aim of making it robust enough for enterprise deployment.
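
The three-way matching step described above can be illustrated with a minimal sketch. This is not the project's implementation: the `Document` fields, the thresholds and the function names are hypothetical, and the standard-library `difflib` stands in for RapidFuzz so that the example has no external dependencies.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Document:
    """Hypothetical extracted fields shared by invoice, PO and GR."""
    vendor: str
    po_number: str
    quantity: int
    total: float = 0.0  # not every document type carries a total


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] (difflib stand-in)."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def three_way_match(invoice, po, gr, name_threshold=0.85, amount_tolerance=0.01):
    """Return a list of discrepancies; an empty list means the invoice matches."""
    issues = []
    # All three documents should reference the same purchase order.
    if invoice.po_number != po.po_number or gr.po_number != po.po_number:
        issues.append("po_number mismatch")
    # Vendor names are compared fuzzily to tolerate OCR noise.
    if similarity(invoice.vendor, po.vendor) < name_threshold:
        issues.append("vendor mismatch")
    # Billed quantity should equal the quantity actually received.
    if invoice.quantity != gr.quantity:
        issues.append("quantity differs from goods receipt")
    # Billed amount should agree with the PO within a relative tolerance.
    if abs(invoice.total - po.total) > amount_tolerance * po.total:
        issues.append("total differs from purchase order")
    return issues


po = Document("Acme Trading Co", "PO-1001", 10, 1000.0)
gr = Document("Acme Trading Co", "PO-1001", 10)
invoice = Document("ACME Trading Co.", "PO-1001", 10, 1000.0)
print(three_way_match(invoice, po, gr))
```

In this sketch the vendor name on the invoice differs from the PO only in case and a trailing period, so the fuzzy comparison accepts it, while quantities and totals are checked exactly or within a tolerance.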

Tools and Technologies Used

Python, FastAPI, Streamlit, PyTorch, Hugging Face Transformers, LayoutLMv3, Donut (Document Understanding Transformer), PEFT/LoRA, Tesseract OCR (pytesseract), GOT-OCR, scikit-learn, RapidFuzz, Federated Learning (FedAvg), Differential Privacy, Isolation Forest, OpenCV, Pillow, pdfplumber, pandas, SQLite, openpyxl, ReportLab, Kaggle (GPU training environment)

Methodology

The project follows a pipeline-based development methodology split across five phases. Phase 1 covers data preparation for Purchase Orders (POs), Goods Receipts (GRs) and invoices. Phase 2 builds the document extraction layer, in which a multi-engine OCR pipeline attempts spatial PDF parsing, Tesseract and GOT-OCR in sequence, while a LayoutLMv3 model fine-tuned with LoRA (via PEFT) acts as the primary field extractor for structured key-value recognition. Phase 3 implements three-way matching, using fuzzy string matching to reconcile invoice fields against PO and GR records and flag any discrepancies. Phase 4 adds anomaly detection by combining a rule-based checker with an Isolation Forest model trained on historical invoice features. Phase 5 brings in the Federated Learning layer, where an MLP anomaly detector is trained using FedAvg with Gaussian Differential Privacy across simulated distributed clients, so the model keeps improving without any raw financial data ever being centralized. The whole system is then served through a FastAPI REST API and visualized on a Streamlit dashboard.
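
The server-side aggregation in Phase 5 can be sketched as follows. This is an illustrative outline under stated assumptions, not the project's code: model weights are represented as flat Python lists rather than PyTorch tensors, local client training is omitted, clients are weighted uniformly, and the names `clip_update`, `fedavg_dp_round`, `clip_norm` and `noise_multiplier` are hypothetical.

```python
import math
import random


def clip_update(update, clip_norm):
    """Scale an update so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def fedavg_dp_round(global_weights, client_weights, clip_norm=1.0,
                    noise_multiplier=0.5, rng=None):
    """One FedAvg round with clipped updates and Gaussian noise.

    client_weights holds each client's weights after local training
    (assumed here); only these weights, never raw data, reach the server.
    """
    rng = rng or random.Random()
    n = len(client_weights)
    clipped = []
    for weights in client_weights:
        # Each client's contribution is its change relative to the global model.
        update = [w - g for w, g in zip(weights, global_weights)]
        clipped.append(clip_update(update, clip_norm))
    # After clipping, one client can shift the average by at most clip_norm / n,
    # so the noise standard deviation is scaled to that sensitivity.
    sigma = noise_multiplier * clip_norm / n
    new_weights = []
    for i, g in enumerate(global_weights):
        mean_update = sum(u[i] for u in clipped) / n
        new_weights.append(g + mean_update + rng.gauss(0.0, sigma))
    return new_weights


global_model = [0.0, 0.0]
clients = [[0.3, 0.4], [0.5, 0.1], [0.2, 0.2]]
print(fedavg_dp_round(global_model, clients, rng=random.Random(0)))
```

Clipping bounds how much any single client can influence the averaged model, which is what lets the Gaussian noise scale be tied to `clip_norm`. The privacy guarantee actually achieved depends on the noise multiplier, the number of rounds and the client sampling scheme, none of which are specified in this record.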

Document Type

Restricted Access

Submission Type

BSCS Final Year Project

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.
