Degree
Bachelor of Science (Computer Science)
Department
Department of Computer Science
School
School of Mathematics and Computer Science (SMCS)
Advisor
Dr. Solat Jabeen Sheikh, Lecturer, Department of Computer Science
Keywords
Agentic AI, Retrieval-Augmented Generation, Legal Natural Language Processing, LangGraph, Multi-Agent Systems, Hybrid Retrieval
Abstract
Legal research and petition drafting in Pakistan remain manual processes that cost advocates three to five hours per case and lock out litigants without specialist support. Global legal AI tools such as Westlaw and LexisNexis cannot fill this gap, since none are trained on Pakistani case law, citation hierarchy, or PPC/CrPC statutes. Legal Sahara closes the gap with a jurisdiction-specific, three-agent system: a hybrid RAG engine for grounded case research, an 11-node pipeline that drafts court-ready petitions across five filing types, and a document summarizer that converts dense judgments into structured legal briefs. Unlike single-prompt LLM tools, each agent is built to self-correct: it replans failed retrievals, flags high-severity offences before drafting, and validates output against quality rubrics before delivery. Tested on 10,482 Supreme Court judgments, the system retrieves relevant precedent with 0.79 Precision@3, drafts petitions scoring 8.2/10 under expert LLM review, and produces case summaries averaging 79.3/100 against human-expert benchmarks. The results show that purpose-built agentic architectures can deliver practically usable first-draft legal research and drafting support for junior advocates, pro se litigants, and rural practitioners, a population that current legal AI tools were never designed to serve.
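For readers unfamiliar with the retrieval metric quoted in the abstract, Precision@3 is the fraction of the top three retrieved judgments that are relevant, averaged over the evaluation queries. The following is a minimal sketch of that standard definition, not the project's evaluation code; the function names and the case IDs are hypothetical.

```python
def precision_at_k(retrieved_ids, relevant_ids, k=3):
    """Fraction of the top-k retrieved IDs that appear in the relevant set."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    top_k = retrieved_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    # Divide by k (not len(top_k)) so short result lists are penalized.
    return hits / k


def mean_precision_at_k(runs, k=3):
    """Average Precision@k over (retrieved_ids, relevant_ids) pairs."""
    if not runs:
        raise ValueError("runs must not be empty")
    return sum(precision_at_k(r, rel, k) for r, rel in runs) / len(runs)


# Hypothetical query results; the IDs are illustrative placeholders.
runs = [
    (["case_a", "case_b", "case_c"], {"case_a", "case_c"}),  # 2 of 3 relevant
    (["case_d", "case_e", "case_f"], {"case_d", "case_e", "case_f"}),  # 3 of 3
]
score = mean_precision_at_k(runs, k=3)
```

Under this definition, a reported value of 0.79 means that, on average, roughly 2.4 of the top three results per query were judged relevant.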
Tools and Technologies Used
Python, FastAPI, React.js, LangGraph, LangChain, LLaMA 3.3 70B (Groq), LLaMA 3.1 8B (Groq), Gemma 3 27B (Google Gemini), ChromaDB, BAAI/bge-large-en-v1.5, rank_bm25, Reciprocal Rank Fusion (RRF), pypdf, pytesseract, pdf2image, python-docx, Pillow, Docker, Docker Compose v3.9, AWS EC2, Nginx, Git, GitHub
Methodology
Legal Sahara was developed using an agentic AI architecture orchestrated through LangGraph, with three specialized agents each built as a stateful graph with conditional edges, retry logic, and self-correction capabilities. The RAG Agent follows a plan-execute-replan loop where Gemma 3 routes queries, LLaMA 3.3 generates answers, and Gemma 3 independently verifies outputs through a two-gate faithfulness check, preventing hallucinated citations by retrieving case IDs directly from ChromaDB metadata rather than generating them. The Petition Drafter operates as an 11-node LangGraph pipeline that classifies the petition type, checks for missing information, detects PPC severity red flags, retrieves supporting precedents via hybrid search, ranks citations by legal hierarchy, drafts the petition using LLaMA 3.3 70B, and then validates the output using a 5-dimensional LLM-as-judge scoring rubric, triggering an automatic revision if thresholds are not met. The Summarizer Agent processes uploaded documents through an 8-node sequential pipeline covering classification, facts, issues, holding, reasoning, ratio decidendi, synthesis, and evaluation, with OCR auto-detection for scanned PDFs at 300 DPI and exponential backoff retry logic throughout. All agents are exposed via a FastAPI REST backend, containerized with Docker Compose, and deployed on AWS EC2 with persistent ChromaDB and HuggingFace model cache volumes.
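The hybrid search step combines a keyword ranking (BM25) with a dense-embedding ranking through Reciprocal Rank Fusion. The following is a minimal sketch of the generic RRF formula, in which each document scores the sum of 1 / (k + rank) over the rankings that contain it. It is not the project's implementation: the function name, the case IDs, and the constant k = 60 (a commonly used default) are assumptions, since the record does not state the value used.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked ID lists into one list using RRF.

    rankings: iterable of lists, each ordered best-first.
    k: smoothing constant (assumed here; not given in the source).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending fused score; break ties by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical outputs of the two retrievers for one query.
bm25_ranking = ["case_1", "case_2", "case_3"]
dense_ranking = ["case_2", "case_4", "case_1"]

fused = reciprocal_rank_fusion([bm25_ranking, dense_ranking])
```

Because RRF uses only rank positions, it needs no score normalization between BM25 and cosine similarity, which is one reason it is a common choice for fusing lexical and dense retrievers.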
Document Type
Restricted Access
Submission Type
BSCS Final Year Project
Recommended Citation
., R., Masood, Z., & Abro, H. I. (2026). Legal Sahara: A Legal Assistant for Pakistani Law. Retrieved from https://ir.iba.edu.pk/fyp-bscs/60