Author

Ahmed Nasir

Degree

Master of Science in Computer Science

Faculty / School

Faculty of Computer Sciences (FCS)

Department

Department of Computer Science

Date of Submission

2017-12-15

Advisor

Shams Naveed Zia, Faculty of Computer Science, Institute of Business Administration, Karachi

Project Type

MSCS Survey Report

Abstract

Distributed query processing involves querying big data stored in files and other formats. There are several approaches to query processing: some systems query the data directly using in-situ processing, while others require the data to be loaded before it can be queried. This report studies different query processing systems and concludes which systems and techniques are suitable for which types of workloads.

Notes

Several techniques for improving the performance of distributed query processing are discussed. It is concluded that each system has its advantages and shortcomings, and that no single system can be used for all use cases. For instance, using GPUs for warehousing queries works well if the data can be easily loaded into memory; otherwise, the majority of the time is spent loading it. Many systems and models are available to improve performance, but the choice of system should be based on the use case.

The full text of this document is only accessible to authorized users.
