Open-Domain Question Answering over Tables and Charts

Abstract:

In the age of information, the ability to extract meaningful insights from structured data in the form of tables and charts is invaluable. This project aims to develop an Open-Domain Question Answering (ODQA) system that can process and answer questions based on data presented in tables and charts across a wide range of domains. The system will build on Machine Learning (ML) and Natural Language Processing (NLP) techniques, and the project gives Master’s students an opportunity to contribute to research in open-domain, data-driven QA.

Project Objectives:

  1. Data Collection: Gather a diverse dataset containing tables and charts from various domains, along with corresponding questions and answers. Emphasize a broad range of topics and document sources.
  2. Model Development: Develop and fine-tune an open-domain QA model that can process text-based questions and questions related to tables and charts. Explore the use of pre-trained vision-language (VL) models and adapt them for this specific task.
  3. Model Explainability: Investigate methods to make the model’s reasoning process more interpretable, especially when answering questions based on charts and tables.
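
To make the first two objectives concrete, the sketch below shows one possible shape for an annotated example and a simple way to flatten a table into text so that a text-only QA model can read it. The schema, the field names, and the `row i: header: cell` format are illustrative assumptions, not part of the project specification; charts would additionally need an image or an extracted data table.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TableQAExample:
    """One annotated example. Hypothetical schema, for illustration only."""
    domain: str
    header: List[str]
    rows: List[List[str]]
    question: str
    answer: str


def linearize_table(header: List[str], rows: List[List[str]]) -> str:
    """Flatten a table row by row, pairing each cell with its column name."""
    lines = []
    for i, row in enumerate(rows, start=1):
        cells = " | ".join(f"{h}: {c}" for h, c in zip(header, row))
        lines.append(f"row {i}: {cells}")
    return "\n".join(lines)


def build_model_input(example: TableQAExample) -> str:
    """Combine the linearized table and the question into one input string."""
    table_text = linearize_table(example.header, example.rows)
    return f"{table_text}\nquestion: {example.question}"


# Made-up example data, used only to show the format.
example = TableQAExample(
    domain="sports",
    header=["Team", "Wins"],
    rows=[["Lions", "3"], ["Hawks", "5"]],
    question="Which team has the most wins?",
    answer="Hawks",
)
model_input = build_model_input(example)
```

Linearization is only one option; a vision-language model could instead consume a rendered image of the table or chart directly.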

Deliverables:

  1. An annotated dataset with tables/charts, questions, and answers from various domains.
  2. An open-domain QA model capable of answering questions based on tables and charts across multiple domains.
  3. Evaluation results demonstrating the model’s open-domain performance.
  4. A final report summarizing the project’s findings and contributions.
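
The announcement does not name evaluation metrics. As one possible starting point, the sketch below implements exact match and token-level F1 in the style commonly used for extractive QA benchmarks. The normalization rules are an assumption and may need adjusting for table answers (for example, stripping punctuation turns "3.5" into "35").

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

Averaging these scores per domain, rather than only over the whole test set, would show how well the model transfers across domains.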

Skills and Knowledge Required:

  • Strong background in Machine Learning, including experience with deep learning and NLP.
  • Proficiency in Python and relevant libraries such as PyTorch, PyTorch Lightning, and Hugging Face Transformers.
  • Experience with data collection, preprocessing, and annotation.
  • Strong communication and collaboration skills.

Note: It is crucial to stay up to date with the latest research in QA, NLP, and ML, as these fields are evolving rapidly. Students should build on recent advances and adapt their methods accordingly to achieve state-of-the-art results.

Send CV and grades to [email protected]