Degree
Bachelor of Science (Computer Science)
Department
Department of Computer Science
School
School of Mathematics and Computer Science (SMCS)
Advisor
Dr. Imran Rauf, Assistant Professor & Program Coordinator, BS (CS) and PhD (CS) Programs
Co-Advisor
Saad Mughal - CTO AlphaVenture
Keywords
Vision Language Models, Design Automation, Visual Quality Assurance
Abstract
AI For Visual QA is an AI-powered design quality assurance platform that automates the detection of visual discrepancies between Figma design mockups and live web implementations. Design drift, the gradual accumulation of undetected layout, style, and content deviations between intended designs and deployed interfaces, erodes brand fidelity and creates a slow, error-prone feedback cycle for development teams. The platform addresses this with an automated QA workflow: a custom Figma plugin exports design frame data to the system, which captures screenshots of the live website, performs a multi-layer comparative analysis using vision language models, and produces a structured report with issues classified by severity, category, and confidence score. Beyond detection, an autonomous code-fixing agent locates the source of each discrepancy, applies targeted changes, and submits a GitHub pull request, enabling complete resolution without manual developer involvement. AI For Visual QA delivers a scalable, production-ready framework that reduces the cost of maintaining design fidelity across iterative software development.
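The structured report described in the abstract could be modelled roughly as follows. This is a minimal sketch, not the project's actual schema: the `Issue` fields, the `Severity` levels, the `build_report` helper, and the 0.5 confidence cutoff are all assumptions introduced for illustration.

```python
from dataclasses import dataclass, asdict
from enum import IntEnum
import json


class Severity(IntEnum):
    # Hypothetical severity scale; the source does not name its levels.
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class Issue:
    section: str        # design section the issue was found in
    category: str       # e.g. "layout", "style", "content"
    severity: Severity
    confidence: float   # assumed to lie in [0.0, 1.0]
    description: str


def build_report(issues, min_confidence=0.5):
    """Drop low-confidence issues and order the rest by severity, then confidence."""
    kept = [i for i in issues if i.confidence >= min_confidence]
    kept.sort(key=lambda i: (-i.severity, -i.confidence))
    return {
        "total": len(kept),
        # Store the severity by name so the report is readable as JSON.
        "issues": [{**asdict(i), "severity": i.severity.name} for i in kept],
    }


if __name__ == "__main__":
    sample = [
        Issue("hero", "style", Severity.LOW, 0.9, "Heading font weight differs"),
        Issue("nav", "layout", Severity.HIGH, 0.8, "Menu items wrap to two lines"),
    ]
    print(json.dumps(build_report(sample), indent=2))
```

Sorting by severity first puts the most damaging deviations at the top of the report, while the confidence cutoff is one simple way to keep uncertain model output from crowding it.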
Tools and Technologies Used
Frontend: React 19, Vite, Laravel Echo, Pusher.js
Backend: Laravel 10, PHP 8.2
Database: MySQL
Real-time & WebSocket: Laravel Reverb
Browser Automation: Playwright, Chromium
AI & Vision Models: Claude Sonnet 4.6, Claude Sonnet 4.5
API & Integration: OpenRouter API, GitHub REST API
Figma Plugin: TypeScript, esbuild
Methodology
The system is structured around a four-phase sequential job pipeline. In the first phase, a large language model processes exported Figma JSON to identify and extract named design sections. The second phase uses Playwright to launch a headless Chromium browser and capture screenshots of the target website, segmented to match each section. The third phase runs a two-layer VLM analysis per section pair: Layer A evaluates structural and content fidelity using screenshots alongside extracted HTML and Figma JSON, while Layer B assesses visual style, including typography, colors, and spacing, using computed CSS and Figma JSON. A deduplication call then removes cross-layer duplicate findings before the report is compiled. The optional fourth phase runs an autonomous agent loop that iterates over flagged issues, using file system and code search tools to locate and fix the responsible code, with GitHub integration to commit changes and open a pull request. Each phase broadcasts real-time progress to the frontend via WebSocket.
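The control flow of the pipeline can be sketched as below. This is an illustrative outline under stated assumptions, not the project's Laravel implementation: `Finding`, `Phases`, `deduplicate`, and `run_pipeline` are hypothetical names, the model and browser calls are passed in as plain callables, and the deduplication step is replaced here by a deterministic key match rather than the model call the methodology describes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Finding:
    section: str
    layer: str          # "A" (structure/content) or "B" (visual style)
    category: str
    description: str
    confidence: float


@dataclass
class Phases:
    # Hypothetical hooks standing in for the LLM, Playwright, and VLM calls.
    extract_sections: Callable[[dict], List[str]]
    capture: Callable[[str], bytes]
    layer_a: Callable[[str, bytes], List[Finding]]
    layer_b: Callable[[str, bytes], List[Finding]]


def deduplicate(findings: List[Finding]) -> List[Finding]:
    """Keep the most confident finding per (section, category, normalized text)."""
    best: Dict[tuple, Finding] = {}
    for f in findings:
        key = (f.section, f.category, " ".join(f.description.lower().split()))
        if key not in best or f.confidence > best[key].confidence:
            best[key] = f
    return list(best.values())


def run_pipeline(
    figma_json: dict,
    phases: Phases,
    notify: Callable[[str], None],
    fix: Optional[Callable[[List[Finding]], None]] = None,
) -> List[Finding]:
    # Phase 1: identify named design sections in the exported Figma JSON.
    sections = phases.extract_sections(figma_json)
    notify("phase 1: sections extracted")

    # Phase 2: capture one screenshot per section.
    shots = {name: phases.capture(name) for name in sections}
    notify("phase 2: screenshots captured")

    # Phase 3: run both analysis layers per section, then merge duplicates.
    findings: List[Finding] = []
    for name, shot in shots.items():
        findings += phases.layer_a(name, shot)
        findings += phases.layer_b(name, shot)
    report = deduplicate(findings)
    notify("phase 3: analysis complete")

    # Phase 4 (optional): hand the flagged issues to a fixing agent.
    if fix is not None:
        fix(report)
        notify("phase 4: fixes submitted")
    return report
```

Passing the phases in as callables keeps the sequencing testable without a browser or model, and `notify` stands in for the per-phase WebSocket broadcast.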
Document Type
Restricted Access
Submission Type
BSCS Final Year Project
Recommended Citation
Agha, S., Sheikh, S., & Sharjeel, S. M. (2026). AI For Visual QA. Retrieved from https://ir.iba.edu.pk/fyp-bscs/54
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.