Abstract/Description
Data is the fundamental building block for advancements in artificial intelligence (AI), general AI (GAI), machine learning (ML), and large language models (LLMs). This study emphasizes the critical need for robust data infrastructure, arguing that without it, countries cannot fully benefit from technological advancements across economic sectors. Governments hold vast repositories of both structured and unstructured data in domains such as the judiciary, parliaments, and the civil bureaucracy. However, these potential goldmines remain untapped because of inadequate data management capabilities and a lack of appreciation for the necessity of high-quality data. The research identifies key issues in public data management, including the non-uniform representation of key data sets and the prevalence of non-machine-readable formats, both of which complicate data utilization. By analyzing examples of inconsistencies in standard data conventions within public datasets, this study underscores the challenges posed by messy data, which takes specific skills to transform into a tidy format in which each feature is clearly delineated and consistently formatted. The objectives of this research are twofold: to explore the effective utilization of public policy data, and to harness natural language processing (NLP) and LLMs to analyze critical policy documents, such as the monetary policy statements issued by the State Bank of Pakistan. This study aims to demonstrate how large volumes of unstructured policy documents can be leveraged to analyze policy objectives and to enhance public policy formulation and implementation, thereby realizing the potential of data as a strategic asset in governance.
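To illustrate the kind of messy-to-tidy transformation the abstract refers to, here is a minimal Python sketch. The records, field names, and date formats are hypothetical and are not taken from the study's datasets; the sketch only shows how non-uniform names, dates, and numbers might be brought to one consistent convention.

```python
import re
from datetime import datetime

# Hypothetical raw records: the same kind of entity, date, and number
# written under inconsistent conventions (illustrative only).
RAW_ROWS = [
    {"district": " LAHORE ", "date": "10-12-2024", "value": "1,250"},
    {"district": "lahore", "date": "2024/12/11", "value": "1300"},
    {"district": "D.G.  Khan", "date": "12.12.2024", "value": "980 "},
]

# Assumed set of date layouts found in the raw data.
DATE_FORMATS = ("%d-%m-%Y", "%Y/%m/%d", "%d.%m.%Y")


def parse_date(text):
    """Try each assumed layout and return an ISO 8601 date string."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognised date format: {text!r}")


def tidy_row(row):
    """Return one record with consistently formatted features."""
    # Collapse repeated whitespace, trim, and normalise capitalisation.
    name = re.sub(r"\s+", " ", row["district"]).strip().title()
    return {
        "district": name,
        "date": parse_date(row["date"]),
        # Drop thousands separators before converting to an integer.
        "value": int(row["value"].replace(",", "").strip()),
    }


TIDY_ROWS = [tidy_row(r) for r in RAW_ROWS]

for record in TIDY_ROWS:
    print(record)
```

After this step each feature has a single representation, so records that refer to the same district can be grouped or joined without manual cleanup. Real public datasets would likely need a richer mapping of name variants than simple capitalisation.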
Keywords
Public policy, Natural-language processing (NLP), Large Language Models (LLMs), Machine Learning, Artificial Intelligence (AI), General AI (GAI)
Location
MAV 1 room, Adamjee building
Session Theme
Cities and Infrastructure
Session Type
Parallel Technical Session
Session Chair
Qazi Masood, Institute of Business Administration
Session Discussant
Demetrio Panarello, Link Campus University; Zahid Asghar, Quaid-i-Azam University
Start Date
10-12-2024 3:15 PM
End Date
10-12-2024 5:15 PM
Recommended Citation
Asghar, Z. (2024). Unlocking the Power of Data: Enhancing Public Policy through Advanced Data Infrastructure and Language Model Analysis. CBER Conference. Retrieved from https://ir.iba.edu.pk/esdcber/2024/program/29
Included in
Data Science Commons, Public Policy Commons, Statistical Models Commons, Urban Studies and Planning Commons