Retrieval-Augmented Text Generation

The moment that we’ve all been waiting for has finally arrived! The Retrieval-Augmented Text Generation (RAG) Framework is here! 🎉

Throughout this notebook we will be exploring RAG, what it is, how it works, and why it’s so exciting.

Why RAG?#

Although LLMs are trained on large datasets, stale data can severely limit them. They face several challenges:

The models are trained on internet content, so they might not generate relevant output when prompted for information that is not publicly available on the internet.
The models are trained up to a certain date, so they might not generate relevant output when prompted about events or information from after the training completion date of the model.
The models are trained to be general-purpose. This means that they tend to produce generic outputs and might not perform as expected when prompted for specific deep-dive concepts related to a particular topic.

One way to dynamically integrate relevant external information is retrieval-augmented generation (RAG), which can help improve the reliability of LLM outputs.

Going back to our original question from section 1 of this module, how can this be utilized in our own work or organization? The RAG framework can be really useful when you have a set of documents, GitHub repositories, research papers, or domain-specific knowledge bases that you want to search through quickly.

RAG Framework#

RAG addresses these limitations by supplementing the prompt sent to the LLM with information from external sources. A retrieval model finds that information via vector embeddings (more on this later), thereby providing the LLM with more relevant input for generating responses. It allows you to use pre-trained LLMs without fine-tuning them or training your own LLM on your data.

Figure: RAG Workflow (Image Source: Medium Blog)

Three concepts make up the RAG pipeline:

Retrieval
Augmentation
Generation

Retrieval#

The retrieval phase can also be considered the data and query/prompt preparation phase, focusing on efficient information retrieval or data access. This phase contains tasks such as (1) Indexing, (2) Query Manipulation, (3) Data Modification, (4) Search, and (5) Ranking, and tuning any of them can improve your RAG pipeline. In this tutorial, we primarily focus on indexing and search.

Indexing enables fast and accurate information retrieval that sets up the context for any LLM to improve its response to a given user prompt or query.

We will be indexing abstracts for all astrophysics papers and Astropy’s documentation, a common core package for Astronomy in Python.

Embeddings#

Embeddings, also called “vector embeddings,” help LLMs develop a semantic understanding of the textual data they are trained on. In simpler terms, embedding models lay the groundwork for LLMs to perform tasks like sentence completion, similarity search, question answering, etc.

Embedding vs Fine-tuning#

	Embedding	Fine-tuning
Definition	Use pre-trained LLM as feature extractor	Update parameters of pre-trained LLM during task-specific training
Process	Input Encoding > tokenized > Embedding Extraction > Downstream Task	Initialization > Task-specific Training > Fine-tuning Layers (optional)
Advantages	Efficient use of pre-trained knowledge, Faster inference	Adaptability to task-specific nuances, May require less labeled data than from scratch
Considerations	N/A	Risk of overfitting, Computational cost can be high
When to use	Limited computational resources, Limited labeled data	Significant computational resources, Large corpus of labeled data
Performance	Performs well, especially with limited data	Can achieve state-of-the-art results on a wide range of tasks

In a nutshell#

Embedding models are typically small in size and less computationally intensive.
Regular updates of embedding vectors are faster, cheaper, and simpler compared to fine-tuning a model.

Vector#

At the lowest level, machines only understand numeric values. For LLMs to work, natural language is converted into arrays of numeric values before it is fed into the models. These arrays of numeric values are called “vectors.”

An example of a vector: [2.5, 1.7, 3.3, 7.8]

The above is an example of a vector of size 4.

import numpy as np

vector = np.array([2.5, 1.7, 3.3, 7.8])
print(f"Vector: {vector}")

Vector: [2.5 1.7 3.3 7.8]

Tokens#

We stated above that “texts are converted into an array of numeric values called vectors”.

But depending on your use case, each word, sentence, paragraph, or entire document can be represented as a vector.

Tokens are the smallest natural language units converted into a vector. It could be at the character level, sub-word level, word level, sentence level, paragraph level, or document level.

Example: Consider the text below.

Earth is a planet of the solar system. There are 8 planets in the solar system. All planets revolve around the sun. Sun is a star.

Case 1.) Tokenizing the entire paragraph into a vector.
Tokenization: The entire paragraph is a single token.
Vectorization: A single vector.
Sample Vector Representation: [3.1, 6.8, 5.4, 8.0, 7.1]

Case 2.) Tokenizing each sentence into vectors.
Tokenization: One token for each sentence (total 4 tokens)
Vectorization: One vector for each sentence (total 4 vectors).
Sample Vector Representation: [[1.2, 2.3, 3.8, 7.9, 0.8], [2.5, 3.0, 8.2, 6.6, 4.1], [3.2, 6.5, 8.1, 9.3, 1.4], [1.1, 0.7, 7.2, 3.5, 8.5]]

Case 3.) Tokenizing each word in the paragraph into a vector. There are 26 words in the paragraph, ignoring punctuation. Each word gets converted into a vector.
Tokenization: One token for each word in the paragraph (26 tokens)
Vectorization: One vector for each token (total 26 vectors).
Sample Vector Representation: [[2.1, 3.2, 4.1, 9.8, 7.0], [8.2, 4.2, 7.1, 3.8, 2.0]…..total 26 such representations]
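
The three cases above can be reproduced with a few lines of standard-library Python. This is only a sketch of the granularity idea: the regular expressions are a simplifying assumption (real tokenizers are more careful about abbreviations, punctuation, and sub-words), and the variable names are ours.

```python
import re

text = (
    "Earth is a planet of the solar system. "
    "There are 8 planets in the solar system. "
    "All planets revolve around the sun. Sun is a star."
)

# Case 1: the whole paragraph is a single token.
paragraph_tokens = [text]

# Case 2: one token per sentence (split on whitespace that follows a period).
sentence_tokens = [s.strip() for s in re.split(r"(?<=\.)\s+", text) if s.strip()]

# Case 3: one token per word, ignoring punctuation.
word_tokens = re.findall(r"\w+", text)

print(len(paragraph_tokens), len(sentence_tokens), len(word_tokens))  # 1 4 26
```

Each token would then be mapped to one vector, so the same paragraph yields 1, 4, or 26 vectors depending on the level chosen.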

Tokenizers#

Tokenizers are components responsible for converting large texts into tokens (tokenization). Different types of pre-trained tokenizers are available. You can even train your own tokenizers. But for the scope of this tutorial, we will use a pre-trained one.

Generally, each tokenizer follows the following steps:

Break down the original text into tokens. These tokens could again be at the character, sub-word, word, sentence, paragraph, or document levels.
Assign a unique identifier to each of the tokens created.

# For example, here is how you can split a short sentence into chunks of text
from langchain_text_splitters import CharacterTextSplitter

text_splitter = CharacterTextSplitter(
    separator=" ",
    chunk_size=10,
    chunk_overlap=0,
)
text_splitter.split_text(text="Earth is a planet in the solar system.")

['Earth is a', 'planet in', 'the solar', 'system.']

Learn more about how to split text into tokens in LangChain here.
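
The text splitter above only covers the first tokenizer step. The second step, assigning a unique identifier to each token, can be sketched as follows. The helper names `build_vocabulary` and `encode` are hypothetical and used for illustration only; pre-trained tokenizers ship with a fixed vocabulary rather than building one on the fly.

```python
def build_vocabulary(tokens):
    """Map each distinct token to an integer ID, in order of first appearance."""
    vocabulary = {}
    for token in tokens:
        if token not in vocabulary:
            vocabulary[token] = len(vocabulary)
    return vocabulary


def encode(tokens, vocabulary):
    """Replace every token with its ID from the vocabulary."""
    return [vocabulary[token] for token in tokens]


# Step 1: break the text into word-level tokens.
tokens = "the sun is a star and the earth is a planet".split()

# Step 2: assign a unique identifier to each distinct token.
vocabulary = build_vocabulary(tokens)
token_ids = encode(tokens, vocabulary)

print(vocabulary)
print(token_ids)  # [0, 1, 2, 3, 4, 5, 0, 6, 2, 3, 7]
```

Repeated tokens such as "the" and "is" share one ID, which is what lets a model look up a single vector per distinct token.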

Embedding Models#

A language model needs to understand how tokens are related to each other in the context of human language. To understand this semantic relationship, these tokens are converted into numerical vectors.

Embedding Models are trained upon these tokens to develop an “embedding space.”

Before the training, the embedding model initializes an N-dimensional ‘vector’ corresponding to each ‘token’ with random values. (Value of N depends on the embedding model)
During the embedding model training, the values for these vectors are updated across iterations. In this process, similar or related tokens are updated to have similarly valued vectors.
After the training, the collection of all the ‘vectors’ corresponding to all the tokens is called the “embedding space.”
“Embedding Space” is an encoded representation of meanings of tokens and inter-token relationships.
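
To make these steps concrete, here is a deliberately simplified toy sketch. It is not how real embedding models are trained (they minimize a loss over large corpora); it only illustrates the idea that vectors start random and that related tokens are moved closer together over many iterations. The token list, the `related_pairs`, and the update rule are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
tokens = ["earth", "mars", "planet", "hello"]
index = {token: i for i, token in enumerate(tokens)}

# Before training: every token gets a random N-dimensional vector (N = 8 here).
embeddings = rng.normal(size=(len(tokens), 8))


def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


before = cosine(embeddings[index["earth"]], embeddings[index["mars"]])

# During training: tokens that occur in similar contexts (hypothetical pairs)
# are nudged toward each other a little on every iteration.
related_pairs = [("earth", "planet"), ("mars", "planet")]
learning_rate = 0.1
for _ in range(100):
    for left, right in related_pairs:
        i, j = index[left], index[right]
        difference = embeddings[i] - embeddings[j]
        embeddings[i] -= learning_rate * difference
        embeddings[j] += learning_rate * difference

# After training: the rows of `embeddings` form the toy "embedding space".
after = cosine(embeddings[index["earth"]], embeddings[index["mars"]])
print(f"earth/mars similarity before: {before:.2f}, after: {after:.2f}")
```

"earth" and "mars" are never paired directly, yet they end up close because both are pulled toward "planet", while the untouched "hello" vector stays where it started.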

See Word Embeddings Resource for more conceptual details on embeddings.

To understand this further, let’s take a look at how it all works using a pre-trained embedding model.

For simplicity, this tutorial uses the LangChain Hugging Face integration, which is available in the langchain-huggingface package. To use an embedding model available on Hugging Face, we will simply use the HuggingFaceEmbeddings class.

from langchain_huggingface import HuggingFaceEmbeddings

We are using the all-MiniLM-L12-v2 sentence-transformers embedding model for this tutorial. After some evaluation that we did, we found that this model works well for our use case as it is lightweight and provides good performance.

This model “maps sentences & paragraphs to a 384 dimensional dense vector space and can be used for tasks like clustering or semantic search”.

However, you can use any other embedding model available in Hugging Face, and we recommend going to MTEB Leaderboard to find embedding models and see how they compare to each other.

# Setup the embedding, we are using the MiniLM model here
embeddings_model = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L12-v2"
)

query_result = embeddings_model.embed_query("Earth is a planet in the solar system.")

# Dimension of vector
len(query_result)

query_result[-3:]

[0.05409970134496689, 0.07589352875947952, -0.04195248335599899]

In an embedding space, you can find how similar two vectors are using dot product or using cosine similarity.
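
As a small reference sketch, both measures can be written directly with NumPy. The dot product grows with vector length, while cosine similarity divides the lengths out and depends only on direction. The function names and the two-dimensional vectors are ours, chosen so the results are easy to check by hand.

```python
import numpy as np


def dot_product(a, b):
    return float(np.dot(a, b))


def cosine_similarity(a, b):
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Dot product divided by the product of the two vector lengths.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


a = [1.0, 0.0]
b = [2.0, 0.0]  # same direction as a, twice the length
c = [0.0, 3.0]  # perpendicular to a

print(dot_product(a, b), cosine_similarity(a, b))  # 2.0 1.0
print(dot_product(a, c), cosine_similarity(a, c))  # 0.0 0.0
```

Note that `scipy.spatial.distance.cosine` returns a cosine distance, which is why the similarity is computed as one minus that value.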

from scipy import spatial

print(
    "Similarity:",
    1
    - spatial.distance.cosine(
        query_result,
        embeddings_model.embed_query("Mars is a planet in the solar system."),
    ),
)

Similarity: 0.7926038246541188

print(
    "Similarity:",
    1
    - spatial.distance.cosine(
        query_result, embeddings_model.embed_query("Hello Tacoma.")
    ),
)

Similarity: 0.08770959442668558

What we have demonstrated above in finding similarity between vectors is essentially what’s happening in the retrieval phase of the RAG pipeline within a Vector Database.
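
A brute-force version of that retrieval step might look like the sketch below: score every stored vector against the query and keep the best `k`. The function `top_k_by_cosine` and the three-dimensional vectors are hypothetical; production vector databases typically use approximate nearest-neighbor indexes instead of scanning every vector.

```python
import numpy as np


def top_k_by_cosine(query_vector, document_vectors, k=2):
    """Return (row index, cosine similarity) for the k most similar rows."""
    query = query_vector / np.linalg.norm(query_vector)
    documents = document_vectors / np.linalg.norm(
        document_vectors, axis=1, keepdims=True
    )
    scores = documents @ query  # one cosine similarity per document
    best = np.argsort(-scores)[:k]  # indices sorted by descending score
    return [(int(i), float(scores[i])) for i in best]


document_vectors = np.array(
    [
        [0.9, 0.1, 0.0],  # document 0
        [0.0, 1.0, 0.0],  # document 1
        [0.8, 0.2, 0.1],  # document 2
    ]
)
query_vector = np.array([1.0, 0.0, 0.0])

print(top_k_by_cosine(query_vector, document_vectors, k=2))
```

Here documents 0 and 2 point in nearly the same direction as the query, so they are returned ahead of document 1.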

Vector Stores#

Once the embeddings are created for our relevant documents or knowledge base, we need to store these embeddings in the database for fast retrieval.

Databases that store these vector embeddings are called “Vector Stores.” We will use a vector store called “Qdrant,” as shown below.

In the code below:

Vector store works along with the embedding model to create vector embeddings.
Vector embeddings are stored in the Qdrant Vector database collection.

We have already created a vector database that contains the astrophysics paper abstracts and Astropy’s documentation. For details on how it was built, please refer to the notebook in the Appendix.

The ssec_tutorials utility package contains a download_qdrant_data function that downloads the existing Qdrant database that we’ve created for this tutorial. Additionally, there’s a QDRANT_COLLECTION_NAME constant variable that contains the name of the collection in the Qdrant database.

from ssec_tutorials import download_qdrant_data, QDRANT_COLLECTION_NAME

QDRANT_PATH = download_qdrant_data()

Qdrant data already exists at /Users/lsetiawan/.cache/ssec_tutorials/scipy_qdrant

QDRANT_PATH

PosixPath('/Users/lsetiawan/.cache/ssec_tutorials/scipy_qdrant')

QDRANT_COLLECTION_NAME

'arxiv_astro-ph_abstracts_astropy_github_documentation'

With the Qdrant path, the collection name, and the embeddings model in hand, we can now use the LangChain Qdrant integration package, langchain-qdrant, to interact with the Qdrant database through the Qdrant class.

from langchain_qdrant import Qdrant

qdrant = Qdrant.from_existing_collection(
    embedding=embeddings_model,
    collection_name=QDRANT_COLLECTION_NAME,
    path=QDRANT_PATH,
)

Search#

Now that we have the Qdrant database instance, we are ready to search for the relevant documents based on the user query. However, before we can simply search, we will need a VectorStoreRetriever object.

To get the VectorStoreRetriever object, we can simply call the .as_retriever() method on the Qdrant object.

In this example, we will be setting the search_type to "mmr" and search_kwargs to {"k": 2}.

“mmr” stands for Maximal Marginal Relevance.

MMR selects examples based on a combination of which examples are most similar to the inputs, while also optimizing for diversity. It does this by finding the examples with the embeddings that have the greatest cosine similarity with the inputs, and then iteratively adding them while penalizing them for closeness to already selected examples.

The k parameter in search_kwargs specifies the number of documents to retrieve.
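
The MMR selection described above can be sketched in NumPy as follows. This is an illustrative implementation under our own assumptions, not the code that LangChain or Qdrant runs: `mmr_select` is a hypothetical function, and `lambda_mult` is borrowed as the name of the relevance/diversity trade-off (1.0 means pure relevance, lower values favor diversity).

```python
import numpy as np


def cosine_matrix(a, b):
    """Pairwise cosine similarities between the rows of a and the rows of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T


def mmr_select(query_vector, document_vectors, k=2, lambda_mult=0.5):
    relevance = cosine_matrix(query_vector[None, :], document_vectors)[0]
    pairwise = cosine_matrix(document_vectors, document_vectors)

    # Start with the single most relevant document.
    selected = [int(np.argmax(relevance))]
    while len(selected) < min(k, len(document_vectors)):
        best_index, best_score = None, -np.inf
        for i in range(len(document_vectors)):
            if i in selected:
                continue
            # Penalize closeness to anything that was already selected.
            redundancy = max(pairwise[i][j] for j in selected)
            score = lambda_mult * relevance[i] - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_index, best_score = i, score
        selected.append(best_index)
    return selected


document_vectors = np.array(
    [
        [1.0, 0.10],  # document 0: very relevant
        [1.0, 0.12],  # document 1: near-duplicate of document 0
        [0.7, -0.7],  # document 2: less relevant, but different
    ]
)
query_vector = np.array([1.0, 0.0])

print(mmr_select(query_vector, document_vectors, k=2, lambda_mult=0.5))  # [0, 2]
print(mmr_select(query_vector, document_vectors, k=2, lambda_mult=1.0))  # [0, 1]
```

With `lambda_mult=0.5` the near-duplicate is skipped in favor of the more diverse document, whereas `lambda_mult=1.0` reduces to a plain similarity ranking.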

# Setup the retriever for later step
retriever = qdrant.as_retriever(search_type="mmr", search_kwargs={"k": 2})

Let’s invoke this retriever object with one of the questions from the previous section and see what we get.

documents = retriever.invoke("What is dark matter?")

We got the relevant documents from the Qdrant database for the given question. Let’s see what these documents look like.

document = documents[0]

type(document)

langchain_core.documents.base.Document

dict(document)

{'page_content': '  One of the great scientific enigmas still unsolved, the existence of dark\nmatter, is reviewed. Simple gravitational arguments imply that most of the mass\nin the Universe, at least 90%, is some (unknown) non-luminous matter. Some\nparticle candidates for dark matter are discussed with particular emphasis on\nthe neutralino, a particle predicted by the supersymmetric extension of the\nStandard Model of particle physics. Experiments searching for these relic\nparticles, carried out by many groups around the world, are also discussed.\nThese experiments are becoming more sensitive every year and in fact one of the\ncollaborations claims that the first direct evidence for dark matter has\nalready been observed.\n',
 'metadata': {'id': 'hep-ph/0110122',
  'title': 'The Enigma of the Dark Matter',
  '_id': '4ab99f7c922747d9a6a34b855d959779',
  '_collection_name': 'arxiv_astro-ph_abstracts_astropy_github_documentation'},
 'type': 'Document'}

We see that this is a core LangChain Document object that contains the document’s metadata and content.

Later we will see how we can use this document to generate the response. For now, let’s create a utility formatting function that retrieves just the content of each document so that we can pass it into our prompt template. This step is also known as “Augmentation”.

# Post-processing
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

print(format_docs(documents))

  One of the great scientific enigmas still unsolved, the existence of dark
matter, is reviewed. Simple gravitational arguments imply that most of the mass
in the Universe, at least 90%, is some (unknown) non-luminous matter. Some
particle candidates for dark matter are discussed with particular emphasis on
the neutralino, a particle predicted by the supersymmetric extension of the
Standard Model of particle physics. Experiments searching for these relic
particles, carried out by many groups around the world, are also discussed.
These experiments are becoming more sensitive every year and in fact one of the
collaborations claims that the first direct evidence for dark matter has
already been observed.


  Dark matter could be composed of compact dark objects (CDOs). These objects
may interact very weakly with normal matter and could move freely {\it inside}
the Earth. A CDO moving in the inner core of the Earth will have an orbital
period near 55 min and produce a time dependent signal in a gravimeter. Data
from superconducting gravimeters rule out such objects moving inside the Earth
unless their mass $m_D$ and or orbital radius $a$ are very small so that $m_D\,
a < 1.2\times 10^{-13}M_\oplus R_\oplus$. Here $M_\oplus$ and $R_\oplus$ are
the mass and radius of the Earth.

Augmentation & Generation#

Now that we can retrieve the most relevant document based on a question, we can use the retrieved document and send it along with the prompt to increase the context for the LLM.

This can also be referred to as the retrieval-augmented prompt.

from langchain_community.llms import LlamaCpp
from langchain_core.prompts import PromptTemplate
from ssec_tutorials import download_olmo_model

OLMO_MODEL = download_olmo_model()

Model already exists at /Users/lsetiawan/.cache/ssec_tutorials/OLMo-7B-Instruct-Q4_K_M.gguf

olmo = LlamaCpp(
    model_path=str(OLMO_MODEL),
    temperature=0.8,
    verbose=False,
    n_ctx=2048,
    max_tokens=512,
)

# Create a prompt template using OLMo's tokenizer chat template we saw in module 1.
prompt_template = PromptTemplate.from_template(
    template=olmo.client.metadata["tokenizer.chat_template"],
    template_format="jinja2",
    partial_variables={"add_generation_prompt": True, "eos_token": "<|endoftext|>"},
)

# Test the prompt you want to send to OLMo.
question = "What is dark matter?"
context = format_docs(retriever.invoke(question))

final_prompt_content = prompt_template.format(
    messages=[
        {
            "role": "user",
            "content": f"""\
                You are an astrophysics expert. Please answer the question on astrophysics based on the following context:

                Context: {context}

                Question: {question}
            """,
        }
    ]
)

print(final_prompt_content)

<|endoftext|>

<|user|>
                You are an astrophysics expert. Please answer the question on astrophysics based on the following context:

                Context:   One of the great scientific enigmas still unsolved, the existence of dark
matter, is reviewed. Simple gravitational arguments imply that most of the mass
in the Universe, at least 90%, is some (unknown) non-luminous matter. Some
particle candidates for dark matter are discussed with particular emphasis on
the neutralino, a particle predicted by the supersymmetric extension of the
Standard Model of particle physics. Experiments searching for these relic
particles, carried out by many groups around the world, are also discussed.
These experiments are becoming more sensitive every year and in fact one of the
collaborations claims that the first direct evidence for dark matter has
already been observed.


  Dark matter could be composed of compact dark objects (CDOs). These objects
may interact very weakly with normal matter and could move freely {\it inside}
the Earth. A CDO moving in the inner core of the Earth will have an orbital
period near 55 min and produce a time dependent signal in a gravimeter. Data
from superconducting gravimeters rule out such objects moving inside the Earth
unless their mass $m_D$ and or orbital radius $a$ are very small so that $m_D\,
a < 1.2\times 10^{-13}M_\oplus R_\oplus$. Here $M_\oplus$ and $R_\oplus$ are
the mass and radius of the Earth.


                Question: What is dark matter?
            


<|assistant|>

You can see above that we now have a context input within the prompt. This context is the content of the document(s) that we retrieved from the Qdrant database. With this context, the LLM can generate more relevant responses. So let’s see how it does!
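
One detail visible in the printed prompt is that the instruction and question lines carry the indentation of the triple-quoted f-string. Whether this affects the model's answer is not evaluated here, but if you prefer a clean prompt, one option is to assemble the user message from separate parts. The helper `build_user_message` below is a hypothetical sketch, not part of the tutorial's utilities.

```python
def build_user_message(context, question):
    """Assemble the augmented user message without stray indentation."""
    parts = [
        "You are an astrophysics expert. Please answer the question on "
        "astrophysics based on the following context:",
        f"Context: {context.strip()}",
        f"Question: {question}",
    ]
    return "\n\n".join(parts)


message = build_user_message(
    context="  Dark matter could be composed of compact dark objects (CDOs).\n",
    question="What is dark matter?",
)
print(message)
```

The returned string can be used as the `content` of the user message in the same way as the f-string above.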

from langchain_core.callbacks import StreamingStdOutCallbackHandler

OLMo with context#

olmo.invoke(
    final_prompt_content, config={"callbacks": [StreamingStdOutCallbackHandler()]}
)

 Dark matter is a theoretical component of the Universe that has yet to be directly observed but its presence can be inferred from the gravitational effects it exerts on visible matter such as stars, gas, and galaxies. According to current astrophysical data, approximately 90% of the content in the universe is dark matter, while only 5% is made up of visible, or baryonic, matter (stars, gas, and dust). Dark matter particles are yet to be directly observed; their existence is inferred from their gravitational effects on visible objects.

One candidate for dark matter is the neutralino, a particle predicted by the supersymmetric extension of the Standard Model of particle physics. Neutralinos have several properties that make them an appealing choice as dark matter candidates. First, they are stable and can persist in the universe until today. Second, they are non-interacting with ordinary matter particles, allowing them to move freely within planets like Earth without being affected by their gravity.

CDOs (Compact Dark Objects) are a possible explanation for the gravitational effects attributed to dark matter. CDOs are hypothetical objects that have been proposed to interact weakly with normal matter and can move freely inside a planet or other large celestial bodies, such as the Earth's inner core. However, their presence in the Earth's inner core is constrained by the data from superconducting gravimeters; they must have masses $m_D\, a < 1.2\times 10^{-13}M_\oplus R_\oplus$, where M_\oplus and R_\oplus are the mass and radius of the Earth, respectively.

It's important to remember that this is just one possibility for the existence of dark matter and there might be other types or forms of dark matter in the universe. As more data becomes available from experiments searching for dark matter relics, scientists will continue to refine their models and searches to better understand its nature.

OLMo without context#

olmo.invoke(question, config={"callbacks": [StreamingStdOutCallbackHandler()]})

 What is dark energy?
These are two of the most interesting, and perhaps most important, questions in modern physics. In this episode of SciFri we explore what these mysterious substances might be, and why they have such a big impact on our understanding of the universe.
Dark Matter is a mysterious substance that seems to make up about 85% of the matter in the Universe. It doesn't interact with light or energy, but it does interact with gravity. This means that it affects how galaxies spin, and it also leaves its mark on the way that stars move through the Milky Way. Scientists believe that Dark Matter is a substance made from something called WIMPs - Weakly Interacting Massive Particles.
Dark Energy, on the other hand, is a mysterious force in the Universe that seems to be pushing the cosmos apart at an ever-increasing rate. It makes up about 68% of the energy in the universe. Scientists think that Dark Energy might be caused by something called "chocolate" - a substance that acts like a negative pressure, pulling things apart even faster than gravity.
Join SciFri's Gerry Canavan and Dr Emma Krumholz as they dive deep into the mysteries of Dark Matter and Dark Energy to see what we can learn about the universe.

From the responses above, we can see that the response with context is more relevant and informative than the response without context. This shows the power of the RAG framework, even with just a few documents.

One way to generate the response with OLMo is to build the context from the question beforehand, as shown above, then create an llm_chain and invoke it with messages.

However, we can further use LangChain’s convenience functions to streamline our pipeline using create_stuff_documents_chain and create_retrieval_chain from the main langchain package.

The main langchain package contains chains, agents, and retrieval strategies that make up an application’s cognitive architecture.

create_stuff_documents_chain specifies how retrieved context is fed into a prompt and LLM.

Looking at its signature, notice that it accepts a prompt argument of type BasePromptTemplate, and that the prompt needs context and input as its input keys.

from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain

To use the helper functions, we’ll need to set up our template string to use the context and input keys as variables.

# Create a new prompt_template
# so that it accepts `context` and `input` as input_variables
input_string_template = """\
You are an astrophysics expert. Please answer the question on astrophysics based on the following context.
Context: {context}
Question: {input}
"""
transformed_prompt_template = PromptTemplate.from_template(
    prompt_template.partial(
        messages=[{"role": "user", "content": input_string_template}]
    ).format()
)
transformed_prompt_template

PromptTemplate(input_variables=['context', 'input'], template='<|endoftext|>\n\n<|user|>\nYou are an astrophysics expert. Please answer the question on astrophysics based on the following context.\nContext: {context}\nQuestion: {input}\n\n\n\n<|assistant|>\n\n')

document_chain = create_stuff_documents_chain(
    llm=olmo, prompt=transformed_prompt_template
)

We can run this by passing in the context directly:

question = "What is dark matter?"
document_chain.invoke(
    {
        "input": question,
        "context": retriever.invoke(question),
    },
    config={"callbacks": [StreamingStdOutCallbackHandler()]},
)

Dark matter is a theoretical entity that is still not directly observed or detected in our universe. Based on current scientific understanding, it makes up approximately 90% of the matter content in the observable Universe (1). The term "dark" refers to its invisible nature; it does not emit, reflect, nor absorb light and cannot be seen with conventional telescopes.

One possible explanation for dark matter is that it comprises weakly interacting massive particles (WIMPs), which are hypothetical particles predicted by theoretical physics. Although they have yet to be discovered, WIMPs could account for the missing mass in our Universe. Some particle candidates for dark matter include the neutralino mentioned in the context, which is a stable superluminous particle that can naturally acquire a tiny electric charge without violating energy conservation laws (2).

The existence of dark matter is supported by several astronomical and cosmological observations. These include the rotation curves of galaxies, where dark matter's gravitational pull dominates over visible matter; the cosmic microwave background radiation (CMBR) mapping, which reveals the distribution of matter in the Universe based on its temperature; and the large-scale structure formation, which highlights the non-uniform distribution of visible matter clusters.

Although direct detection of dark matter particles remains a challenge, ongoing experiments searching for these particles have become more sensitive every year (3). For instance, a collaboration reported the first indirect evidence of dark matter's existence via gravitational lensing effects on starlight (4). Moreover, superconducting gravimeters can detect signals from compact dark objects (CDOs) within Earth's inner core. However, they require small mass ($m_D\approx 1.2\times 10^{-13}M_\oplus R_\oplus$) and orbital radius ($a\approx 1\times10^{23}-1\times10^{24} m$) to satisfy the constraints (5).

In conclusion, dark matter is a fundamental concept in modern cosmology that remains an active research topic. Its nature remains unknown due to its invisible properties, yet it continues to be studied and sought after to better understand our cosmic composition and structure.

However, we want the context to be retrieved dynamically, based on the input or question that is passed in.

From LangChain’s documentation: create_retrieval_chain adds the retrieval step and propagates the retrieved context through the chain, providing it alongside the final answer. It takes a single input key, input, and its output includes the input, context, and answer keys.

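# Wrap the retriever and the document chain so context is retrieved for each question
retrieval_chain = create_retrieval_chain(retriever, document_chain)

# Ask a question; the callback streams the generated answer to stdout
response = retrieval_chain.invoke(
    {"input": "What is dark matter?"},
    config={"callbacks": [StreamingStdOutCallbackHandler()]},
)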

Dark matter is a theoretical entity that is still not directly observed or detected in our universe, despite its significant presence based on gravitational arguments and astronomical observations. According to astrophysics, approximately 90% of the visible mass in the Universe is believed to be non-luminous dark matter, which does not emit, reflect, or absorb light. Dark matter particles are yet to be discovered, but they have been predicted by particle physics based on supersymmetry (SUSY). SUSY predicts that dark matter can take various forms such as the neutralino, a particle that is an inert supersymmetric partner of the known photon and lepton particles.

The term "dark" refers to its inability to be detected or observed using electromagnetic radiation, making it difficult to study directly. Dark matter could exist in various forms like compact dark objects (CDOs) - hypothetical particles with no interaction with normal matter, moving freely inside Earth, and producing a time-dependent signal in a gravimeter due to their low mass and orbital radius.

Despite numerous experiments searching for evidence of dark matter, it remains an enigmatic scientific mystery. However, recent observations claim the first direct evidence of dark matter, which highlights the ongoing efforts to uncover this mysterious substance.

response

{'input': 'What is dark matter?',
 'context': [Document(page_content='  One of the great scientific enigmas still unsolved, the existence of dark\nmatter, is reviewed. Simple gravitational arguments imply that most of the mass\nin the Universe, at least 90%, is some (unknown) non-luminous matter. Some\nparticle candidates for dark matter are discussed with particular emphasis on\nthe neutralino, a particle predicted by the supersymmetric extension of the\nStandard Model of particle physics. Experiments searching for these relic\nparticles, carried out by many groups around the world, are also discussed.\nThese experiments are becoming more sensitive every year and in fact one of the\ncollaborations claims that the first direct evidence for dark matter has\nalready been observed.\n', metadata={'id': 'hep-ph/0110122', 'title': 'The Enigma of the Dark Matter', '_id': '4ab99f7c922747d9a6a34b855d959779', '_collection_name': 'arxiv_astro-ph_abstracts_astropy_github_documentation'}),
  Document(page_content='  Dark matter could be composed of compact dark objects (CDOs). These objects\nmay interact very weakly with normal matter and could move freely {\\it inside}\nthe Earth. A CDO moving in the inner core of the Earth will have an orbital\nperiod near 55 min and produce a time dependent signal in a gravimeter. Data\nfrom superconducting gravimeters rule out such objects moving inside the Earth\nunless their mass $m_D$ and or orbital radius $a$ are very small so that $m_D\\,\na < 1.2\\times 10^{-13}M_\\oplus R_\\oplus$. Here $M_\\oplus$ and $R_\\oplus$ are\nthe mass and radius of the Earth.\n', metadata={'id': 1912.0094, 'title': 'Gravimeter search for compact dark matter objects moving in the Earth', '_id': '97fa0dcbd2aa45d28dfccaa150e724e2', '_collection_name': 'arxiv_astro-ph_abstracts_astropy_github_documentation'})],
 'answer': 'Dark matter is a theoretical entity that is still not directly observed or detected in our universe, despite its significant presence based on gravitational arguments and astronomical observations. According to astrophysics, approximately 90% of the visible mass in the Universe is believed to be non-luminous dark matter, which does not emit, reflect, or absorb light. Dark matter particles are yet to be discovered, but they have been predicted by particle physics based on supersymmetry (SUSY). SUSY predicts that dark matter can take various forms such as the neutralino, a particle that is an inert supersymmetric partner of the known photon and lepton particles.\n\nThe term "dark" refers to its inability to be detected or observed using electromagnetic radiation, making it difficult to study directly. Dark matter could exist in various forms like compact dark objects (CDOs) - hypothetical particles with no interaction with normal matter, moving freely inside Earth, and producing a time-dependent signal in a gravimeter due to their low mass and orbital radius.\n\nDespite numerous experiments searching for evidence of dark matter, it remains an enigmatic scientific mystery. However, recent observations claim the first direct evidence of dark matter, which highlights the ongoing efforts to uncover this mysterious substance.'}

One of the nice things about the LangChain helper function is that the result is a dictionary containing the input, context, and answer keys, so you can easily see what you asked and the context that was used to generate the answer.
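Because the result is a plain dictionary, it is straightforward to check which documents backed an answer. The snippet below is a minimal sketch under stated assumptions: list_sources is a hypothetical helper (not part of LangChain), and demo_response uses SimpleNamespace stand-ins that only mimic the page_content and metadata attributes seen in the output above, so that it runs without the retriever or the LLM.

```python
from types import SimpleNamespace


def list_sources(response):
    """Return (id, title) pairs for each retrieved document in a chain response."""
    return [
        (str(doc.metadata.get("id")), doc.metadata.get("title"))
        for doc in response["context"]
    ]


# Stand-in for the real `response`; only the fields used here are mimicked.
demo_response = {
    "input": "What is dark matter?",
    "context": [
        SimpleNamespace(
            page_content="(abstract text)",
            metadata={"id": "hep-ph/0110122", "title": "The Enigma of the Dark Matter"},
        ),
    ],
    "answer": "(generated answer)",
}

for doc_id, title in list_sources(demo_response):
    print(f"{doc_id}: {title}")
```

With the real chain output, list_sources(response) should behave the same way, assuming the retrieved documents carry id and title metadata as they do above.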

This way of creating the RAG pipeline is quick, but it is less customizable. If we need more control over the input variables, we’ll need to create our own chain.
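To make concrete what a hand-written chain has to do, here is a dependency-free sketch of the retrieve, augment, and generate steps. Everything in it is an assumption for illustration: simple_rag_chain, fake_retriever, and fake_llm are hypothetical stand-ins rather than LangChain APIs, and the prompt template is invented. The point is only that writing the steps yourself lets you decide exactly which variables go into the prompt.

```python
def simple_rag_chain(question, retriever, llm, template):
    """Retrieve context for a question, fill the prompt, and call the model."""
    docs = retriever(question)            # retrieval: fetch relevant text
    context = "\n\n".join(docs)           # augmentation: build the context string
    prompt = template.format(context=context, input=question)
    answer = llm(prompt)                  # generation: call the model
    # Mirror the keys returned by the helper-built chain.
    return {"input": question, "context": docs, "answer": answer}


# Hypothetical stand-ins so the sketch runs without a vector store or an LLM.
def fake_retriever(question):
    return ["(placeholder abstract about dark matter)"]


def fake_llm(prompt):
    return f"(model output for a prompt of {len(prompt)} characters)"


TEMPLATE = "Answer using only this context:\n{context}\n\nQuestion: {input}"

result = simple_rag_chain("What is dark matter?", fake_retriever, fake_llm, TEMPLATE)
print(result["answer"])
```

In a real pipeline, the retriever and model calls would replace the two stand-in functions, and the template could take any extra variables the application needs.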

In the next module, we’ll explore how to do this to create a simple Panel application that uses the RAG pipeline to generate responses to user questions.

For now, let’s clean up by closing the Qdrant client before the next module; otherwise, we’ll run into errors!

qdrant.client.close()