Degree

Master of Science in Computer Science

Department

Department of Computer Science

School

School of Mathematics and Computer Science (SMCS)

Date of Submission

Fall 2022

Supervisor

Dr. Sajjad Haider, Professor, Department of Computer Science, School of Mathematics and Computer Science (SMCS)

Abstract

A Question Answering System (QAS), an advanced form of Information Retrieval (IR), returns comprehensive information as an answer to a human question. Community Question Answering (cQA), a newer variant of QAS, provides a platform where users share knowledge by asking and answering questions on any topic. Our research problem is to find, for any new query, similar questions that have already been asked in the forum. Existing methods rely on traditional IR techniques based on the bag-of-words (BOW) model, which loses the semantics of the text. The main challenges are vocabulary mismatch between past and new questions, question length, the relationship between questions and answers, and the ranking of matched questions. In particular, a new question can use different words and still be related to an already asked question in semantics and context. We therefore run experiments comparing different text encoding and word embedding approaches that vectorize existing and new questions, and we calculate the similarity between the resulting vectors in vector space. Our results on the TrecQA dataset show relatively higher performance than traditional IR methods. To scale further, hashed embeddings could be computed so that embeddings can be compared faster on larger datasets. In the future, the proposed solution could be integrated into chatbots to find frequently asked questions (FAQs) on websites, and into public or internal enterprise search engines to find relevant questions from forums.
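
The retrieval step described in the abstract (vectorize past and new questions, then rank by similarity in vector space) can be illustrated with a minimal Python sketch. This is not the project's implementation: the function names `bow_vector`, `cosine`, and `rank_similar`, the sample questions, and the choice of cosine similarity are assumptions made for illustration only. The default encoder is a plain bag-of-words count, which also shows the limitation the abstract mentions: a paraphrase that shares no words with a stored question scores zero.

```python
import math
from collections import Counter


def bow_vector(text):
    """Hypothetical encoder: bag-of-words counts (lowercase token -> count)."""
    return Counter(text.lower().replace("?", " ").split())


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_similar(new_question, past_questions, encode=bow_vector):
    """Return (score, question) pairs sorted from most to least similar."""
    query_vec = encode(new_question)
    scored = [(cosine(query_vec, encode(q)), q) for q in past_questions]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Illustrative data only; not taken from the TrecQA dataset.
past = [
    "How do I reset my password?",
    "What is the refund policy?",
]

# Shares most words with the first stored question, so it ranks first.
ranked = rank_similar("How can I reset my password?", past)

# Related in meaning to the first question, but no words in common.
paraphrase = rank_similar("Forgot login credentials", past)
```

Because `rank_similar` accepts any `encode` function, a semantic encoder that returns sparse dict vectors could be swapped in for `bow_vector` without changing the ranking logic; dense embeddings would need a matching similarity function.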

Document Type

Restricted Access

Submission Type

Research Project

Available for download on Monday, June 15, 2026

The full text of this document is only accessible to authorized users.
