Degree
Bachelor of Science (Computer Science)
Department
Department of Computer Science
School
School of Mathematics and Computer Science (SMCS)
Advisor
Ms. Abeera Tariq, Lecturer and Academic Director, Department of Computer Science IBA Karachi
Co-Advisor
Muhammad Naeem (Industry Advisor)
Keywords
Natural Language to SQL, Agentic AI, Conversational BI
Abstract
IntelliQuery is an AI-powered natural language to SQL agent built on a multi-node LangGraph workflow that enables non-technical business users to query a PostgreSQL (Northwind) database hosted on AWS RDS simply by typing plain English questions. The system contributes a fully automated pipeline: Schema Inspector, SQL Planner, Query Validator, SQL Executor, and Output Formatter, where each node is intelligently connected by conditional routing edges that handle retries, error recovery, and security enforcement; the Planner node leverages Google Gemini LLM combined with ChromaDB semantic vector search over 23 curated few-shot examples to generate contextually accurate SQL, while the Validator enforces security (blocking injections and destructive statements) and business-logic correctness before any query reaches the database, ultimately democratizing data access by letting users get formatted, business-ready results from complex relational data without writing a single line of SQL.
Tools and Technologies Used
Python, LangChain, LangGraph, LangChain-Google-GenAI, Google Gemini (LLM), LangChain-Groq, Groq, Ollama, FastAPI, Uvicorn, Pydantic, PostgreSQL, AWS RDS, psycopg2, ChromaDB, FastEmbed, Redis, PyJWT, MCP (Model Context Protocol), LangChain-MCP-Adapters, python-dotenv, asyncssh, DuckDuckGo Search, pypdf, docx2txt, React (frontend), Vite, Node.js
Methodology
The project follows an Agentic AI development methodology built around a phased, iterative approach. The architecture is designed as a multi-node LangGraph directed graph, where each node (Schema Inspector, Planner, Validator, Executor, Output Formatter) is developed, tested, and validated independently before being wired into the full workflow, a modular-first strategy that allows unit-level testing at the tool, node, flow, and business-logic layers progressively. The development is structured across 7 planned stages, starting with a working NLP-to-SQL baseline (Stage 1, current) and incrementally adding self-correction, hierarchical memory, multi-tool ecosystems, context optimization, and finally PowerBI visualization integration, ensuring each stage delivers a functional system before complexity is added. The SQL generation uses a Retrieval-Augmented Generation (RAG) pattern, where ChromaDB with FastEmbed performs semantic similarity search over 23 curated few-shot examples to ground the Gemini LLM's output in domain-specific query patterns. A multi-layer validation strategy (security, syntax, business logic, database EXPLAIN) acts as a quality gate, and conditional edge routing enables automatic retry loops back to the Planner on failures, making the system selfcorrecting without human intervention. The overall design philosophy is domain-aware AI, where the agent is pre-loaded with hand-crafted business context, table relationship mappings, and JOIN patterns specific to the Northwind dataset, reducing hallucination and improving SQL accuracy over a generic LLM approach.
Document Type
Restricted Access
Submission Type
BSCS Final Year Project
Recommended Citation
Khan, I., Ahmad, S., & Umair, S. M. (2026). IntelliQuery: AI-Powered Conversational Business Intelligence Dashboard Platform. Retrieved from https://ir.iba.edu.pk/fyp-bscs/33
Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License
COinS
