Algorithms Prompting with OLMo

At the University of Washington SSEC, we are very fortunate to have an in-house group of software engineers who have come up with a very interesting set of questions for us to try out with OLMo:

  • What is the best method for multiplying large numbers?

  • What are the steps for solving dynamic programming problems?

  • What data structures can be used to represent graphs?

  • What’s a minimum spanning tree? How can one find it for a given graph?

  • How can you prove that a problem is NP-hard?

  • What is the runtime of MergeSort?

  • Which search algorithms are better choices for sorted data vs for unsorted data, and why?

  • Explain why binary search’s time complexity is O(log n) on a sorted set.

  • Does recursion make space complexity a more or less important consideration, and why?

  • Which data structures have the most efficient lookup time?

  • What is circuit satisfiability?

Let’s see how OLMo handles these questions, or questions from your own domain, without any additional context.

Set up the OLMo Model and Prompt

We’ll begin with a recap of the previous module, setting up the OLMo model and prompt.

from langchain_community.llms import LlamaCpp
from langchain_core.callbacks import StreamingStdOutCallbackHandler
from ssec_tutorials import download_olmo_model
OLMO_MODEL = download_olmo_model()

This time during model setup, we’ll increase n_ctx, the context window size, to 2048 tokens and max_tokens, the maximum number of tokens the model generates, to 512 tokens. This lets us ask longer questions later and get more detailed answers.

olmo = LlamaCpp(
    model_path=str(OLMO_MODEL),
    callbacks=[StreamingStdOutCallbackHandler()],
    temperature=0.8,
    verbose=False,
    n_ctx=2048,  # context window size, in tokens
    max_tokens=512,  # max tokens to generate
)
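As a rough sketch of how these two settings interact: assuming, as in llama.cpp, that n_ctx is a single window shared by the prompt and the generated tokens, the room left for the prompt is the difference between the two. The helper name prompt_token_budget is hypothetical and is not part of LangChain or llama.cpp.

```python
def prompt_token_budget(n_ctx: int, max_tokens: int) -> int:
    """Return how many prompt tokens fit if the window must also hold the reply.

    Assumption: n_ctx counts prompt tokens and generated tokens together.
    """
    if max_tokens >= n_ctx:
        raise ValueError("max_tokens must be smaller than n_ctx")
    return n_ctx - max_tokens


budget = prompt_token_budget(n_ctx=2048, max_tokens=512)
print(budget)  # 1536
```

Under that assumption, a prompt of up to about 1536 tokens still leaves space for a full 512-token answer.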

Now that we have the model ready, let’s set up the prompt template like before, using the internal chat template.

from langchain_core.prompts import PromptTemplate
# Create a prompt template using OLMo's tokenizer chat template we saw in module 1.
prompt_template = PromptTemplate.from_template(
    template=olmo.client.metadata["tokenizer.chat_template"],
    template_format="jinja2",
    partial_variables={"add_generation_prompt": True, "eos_token": "<|endoftext|>"},
)

We again use partial variables here to fill in the add_generation_prompt and eos_token fields, so that we’re left with messages as the only input variable.

prompt_template.input_variables

Now that the prompt template is ready, let’s move on to the next step and create a prompt for the model. To keep this tutorial simple, we’ll use only one message: the user input to the model. This means we’ll ask the model a single question at a time, rather than a series of questions that can feed off each other.

import textwrap  # a module to wrap text to make it more readable

Just like before, we’ll start by checking what our full prompt text is going to look like. In this example, we also use a handy built-in Python module called textwrap. Its dedent function strips the common leading indentation from every line, which makes the prompt look cleaner.
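As a small standalone illustration of what dedent does, the sketch below compares an indented triple-quoted string before and after the call. The variable names raw and clean are only for this example.

```python
import textwrap

question = "What is the runtime of MergeSort?"

# The text is indented to line up with surrounding code; the backslash after
# the opening quotes suppresses the leading newline.
raw = f"""\
    You are an algorithms expert.
    Question: {question}
"""

# dedent removes the whitespace prefix that all lines have in common.
clean = textwrap.dedent(raw)

print(repr(raw.splitlines()[0]))    # still starts with four spaces
print(repr(clean.splitlines()[0]))  # indentation removed
```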

# Test the prompt you want to send to OLMo.
question = "What is the best method for multiplying large numbers?"
input_content = textwrap.dedent(
    f"""\
    You are an algorithms expert. Please answer the following question on algorithms.
    Question: {question}
"""
)
input_messages = [
    {
        "role": "user",
        "content": input_content,
    }
]

full_prompt_text = prompt_template.format(messages=input_messages)
print(full_prompt_text)

Our prompt looks good. Let’s now make a chain and invoke it.

# Chain the prompt template and olmo
llm_chain = prompt_template | olmo
# Invoke the chain with a question and other parameters.
captured_answer = llm_chain.invoke({"messages": input_messages})

Great! At this point we have essentially reviewed the first 3 notebooks of this module. To ask different questions, though, we need a way to pass new questions into the chain. We could assign new values to the question, input_content, and input_messages variables each time, but that is a lot of repeated formatting work for every new question. So what can we do?

Partial prompt

We will now introduce a new concept called partial formatting. By using this feature, we can expand the input variables to be ones that we can easily change and pass in new values to. Essentially, we are creating a new prompt template from the underlying model template.

We’ve seen this feature in module 1, and again above with the use of partial_variables in the prompt template setup. This time, since we know that we’re only using one message, we can simplify the prompt template to take just two variables: question and instruction.

First, let’s create a simple prompt template string that takes in the variables we want to pass in.

input_prompt_template = textwrap.dedent(
    """\
{instruction}

Question: {question}
"""
)

Notice that the above prompt template string is NOT an f-string but a plain string, like the ones you created in module 1. Its placeholders stay unfilled until we format it later.
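The difference can be sketched with the standard library alone. An f-string substitutes its values immediately, while a plain string keeps its placeholders until .format() is called. The snippet also uses string.Formatter().parse to list the placeholder names, which is only an illustration of the idea and not a claim about how LangChain finds its input variables.

```python
import string

question = "What is circuit satisfiability?"

# f-string: {question} is substituted right away, so the text is fixed.
filled_now = f"Question: {question}"

# Plain string: the placeholders survive until .format() is called.
template = "{instruction}\n\nQuestion: {question}\n"

# parse() yields (literal_text, field_name, format_spec, conversion) tuples.
placeholders = [name for _, name, _, _ in string.Formatter().parse(template) if name]

filled_later = template.format(
    instruction="You are an algorithms expert.",
    question="What is the runtime of MergeSort?",
)

print(placeholders)
print(filled_later)
```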

Now that we have the prompt template string ready, let’s apply partial formatting with it. Remember that prompt_template is a string PromptTemplate object that contains the original Jinja2 template string, with the variables add_generation_prompt and eos_token already filled in. The only variable left is messages, which we will now fill in with partial formatting.

prompt_template
partial_prompt_template = prompt_template.partial(
    messages=[
        {
            "role": "user",
            "content": input_prompt_template,
        }
    ]
)
partial_prompt_template

As you can see above, partial formatting simply fills in the messages variable, so we’re left with no input_variables. At this point, how can we create a new prompt template from this?

The answer is pretty straightforward. Let’s call .format() with no arguments to get the “final” prompt template string.

new_prompt_string = partial_prompt_template.format()
new_prompt_string

Now we have a simple prompt string that we can create a String PromptTemplate from.

new_prompt_template = PromptTemplate.from_template(new_prompt_string)
new_prompt_template

You can see now that the new prompt template takes in instruction and question. Let’s create a new chain and invoke it with this new prompt template.
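Before moving on, the two-stage fill can be sketched with the standard library only. The function render_chat is a hypothetical stand-in for a chat template and is NOT OLMo’s real template. It only shows the order of operations: fix the template options, wrap a message whose content still holds placeholders, then fill those placeholders last.

```python
from functools import partial


def render_chat(messages, add_generation_prompt, eos_token):
    """Hypothetical stand-in for a chat template; not OLMo's real template."""
    parts = [eos_token]
    for message in messages:
        parts.append(f"<|{message['role']}|>\n{message['content']}")
    if add_generation_prompt:
        parts.append("<|assistant|>\n")
    return "".join(parts)


# Stage 1: fix the template options, leaving only `messages` open.
chat = partial(render_chat, add_generation_prompt=True, eos_token="<|endoftext|>")

# Stage 2: wrap one user message whose content still contains placeholders.
inner = "{instruction}\n\nQuestion: {question}\n"
wrapped = chat(messages=[{"role": "user", "content": inner}])

# Stage 3: the wrapped string is itself a template over instruction/question.
final = wrapped.format(
    instruction="You are an algorithms expert.",
    question="What is the runtime of MergeSort?",
)
print(final)
```

The placeholders pass through the chat wrapper untouched, which is why the wrapped string can serve as a new template.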

Q&A Session with OLMo

We’ll first create a single domain instruction, since we know that we’re asking questions about algorithms.

domain_instruction = (
    "You are an algorithms expert. Please answer the following question on algorithms."
)
question = "How can you prove that a problem is NP-hard?"
llm_chain = new_prompt_template.partial(instruction=domain_instruction) | olmo
llm_chain.invoke({"question": question})
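If you want to ask several questions in a row, one possible pattern is a small loop that invokes the chain once per question. In the sketch below, EchoChain is a hypothetical stand-in that only mimics the .invoke(dict) call, so the snippet runs without the model. The helper ask_all is also an illustrative name, not part of LangChain.

```python
class EchoChain:
    """Hypothetical stand-in exposing the same .invoke(dict) call as llm_chain."""

    def invoke(self, inputs):
        return f"(answer to: {inputs['question']})"


def ask_all(chain, questions):
    """Invoke the chain once per question and collect the answers by question."""
    return {q: chain.invoke({"question": q}) for q in questions}


questions = [
    "What is the runtime of MergeSort?",
    "What is circuit satisfiability?",
]

# In the notebook, pass llm_chain instead of EchoChain().
answers = ask_all(EchoChain(), questions)
for q, a in answers.items():
    print(q, "->", a)
```

With the real chain, each call is independent, so the model does not see earlier questions or answers.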

Your Turn 😎

You have two options:

  1. Use the questions provided at the beginning of this notebook and reuse llm_chain to ask questions about algorithms.

  2. Using the new prompt template new_prompt_template and the olmo model, create a new chain with a different domain instruction, and ask questions about that domain.

Feel free to ask any questions you like, and see how OLMo responds to them! If you’re open to sharing, we’d love to hear about the questions you asked and the responses you received in the Zoom chat.

# Write your code here