Domain-Specific Question and Answer

At this point, you have an understanding of LLMs, prompting, and LangChain for building your own local application.

But a question remains: how would I use this in my own work or organization? Can’t I just use ChatGPT?

As a research scientist, you may have a set of documents, GitHub repositories, research papers, and domain-specific knowledge bases that you might want to search through quickly. You might have exploratory questions about the following:

  • How to use a particular method in a Python package in your specific context

  • What the current state of research is on a particular subtopic of your choice

  • How to build expertise in a new subdomain of your field by drawing on a variety of knowledge bases

Or you may be part of an organization where:

  • You want to create an assistive tool for onboarding new team members

  • You have teams of engineers in sizeable, diverse groups, and you want to be transparent about what’s happening across different projects

  • You have unpublished internal documents from labs or research projects, and you want a privacy-preserving way to expose them to your team

  • You have multiple codebases, and you want ways to help people understand how different portions of your code work

In this module, we will demonstrate how to use research abstracts to power a question-answering application built on OLMo.

We will use the arXiv Dataset, specifically the abstracts of all papers classified under the astrophysics category, i.e., those whose category value is astro-ph.

We will also demonstrate how to integrate the documentation of Astropy, a common core package for astronomy in Python, from its GitHub repository into the question-answering application.
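As a rough illustration of what selecting the astrophysics abstracts could look like, here is a minimal sketch. It assumes the metadata is available as JSON lines in which each record has an `id`, a space-separated `categories` string, and an `abstract` field; these field names, the helper functions, and the sample records are illustrative assumptions rather than the exact pipeline used later in this module.

```python
import json


def is_astro_ph(categories: str) -> bool:
    """Return True if any listed category is astro-ph or one of its subcategories."""
    return any(
        cat == "astro-ph" or cat.startswith("astro-ph.")
        for cat in categories.split()
    )


def astro_ph_abstracts(lines):
    """Yield (id, abstract) pairs for astro-ph records from JSON-lines input."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        if is_astro_ph(record.get("categories", "")):
            # Collapse newlines and repeated spaces inside the abstract.
            yield record["id"], " ".join(record["abstract"].split())


# Made-up sample records standing in for the real metadata file (assumption).
sample_lines = [
    json.dumps({"id": "0001", "categories": "astro-ph.GA",
                "abstract": "  Galaxy\n rotation curves. "}),
    json.dumps({"id": "0002", "categories": "cs.CL",
                "abstract": "Language models."}),
    json.dumps({"id": "0003", "categories": "hep-th astro-ph",
                "abstract": "Early-universe cosmology."}),
]

results = list(astro_ph_abstracts(sample_lines))
for paper_id, abstract in results:
    print(paper_id, abstract)
```

With a real metadata file, the same generator could be fed an open file handle instead of `sample_lines`, so the full dataset never has to be loaded into memory at once.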

By the end of this module, you will understand the building blocks required to create a domain-specific question-answering application using OLMo, and you will be able to use this knowledge to create an application for your own domain.