1. Introduction to RAG
In my last post on RAG I discussed how to ingest President Kennedy's speeches into a Pinecone vector database and perform semantic search using both Pinecone's API and the Langchain API. I used Pinecone as the vector database since it's cloud-based, fully managed and, of course, has a free tier. In this post I will expand upon my prior work and build out a Retrieval Augmented Generation (RAG) pipeline using Langchain. I will deploy this as a Streamlit application that can answer questions about President Kennedy.
You may ask, what is the point of RAG pipelines? Don't Large Language Models (LLMs) know the answers to everything? The answer is that most LLMs take a long time to train and are often trained on data that is already out of date when people begin to use the model. In order to incorporate more recent data into our LLM we could use fine-tuning, but this can still be time consuming and costly. The other option is to use Retrieval Augmented Generation (RAG). RAG takes your original question and "retrieves" the documents from a vector database that are most semantically related to your question. RAG is able to do semantic search by converting the text of your question and of the documents into numerical vectors using an embedding. The closeness of the document vectors to the question vector (with respect to a distance or similarity measure, such as cosine similarity) measures the semantic similarity. The original question and the retrieved documents are incorporated into a prompt, which is fed into the LLM, where the documents are used as "context" to generate an answer. The entire process is depicted below,

I'll note that building a RAG pipeline was actually much easier than I originally thought which is a testament to the power and simplicity of the Langchain framework!
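To make the idea of "closeness" between a question vector and document vectors concrete, here is a minimal sketch in plain Python. The three-dimensional vectors and the document names are made up for illustration; real embeddings come from an embedding model and have many more dimensions. The helper names cosine_similarity and top_k are my own and not part of any library.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def top_k(query_vec, doc_vecs, k=2):
    """Brute-force nearest neighbors: score every document, keep the k best."""
    scored = [(cosine_similarity(query_vec, vec), name)
              for name, vec in doc_vecs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


# Made-up "embeddings" purely for illustration.
question_vec = [0.9, 0.1, 0.0]
doc_vecs = {
    "berlin_speech": [0.8, 0.2, 0.1],
    "cuba_speech": [0.1, 0.9, 0.2],
    "space_speech": [0.0, 0.2, 0.9],
}

print(top_k(question_vec, doc_vecs, k=2))
```

A vector database does the same kind of ranking, but typically with an index that avoids scoring every stored vector.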
Let's get started!
I'll start out with all the necessary imports:
# LangChain
from langchain.chains.retrieval import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.documents import Document
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_pinecone import PineconeVectorStore
# Pinecone VectorDB
from pinecone import Pinecone
from pinecone import ServerlessSpec
import os
# API Keys
from dotenv import load_dotenv
load_dotenv()
True
2. Retrieving Documents With Vector (Semantic) Search
The first thing we'll do is review retrieval with semantic search again. This is important since I will discuss a more useful way to interact with the vector database using a so-called "retriever." This functionality will be particularly helpful for a RAG pipeline.
First I need to connect to the Pinecone database and make sure the index of vectors corresponding to President Kennedy's speeches exists:
index_name = "prez-speeches"
pc = Pinecone(api_key=os.getenv("PINECONE_API_KEY"))
pc.list_indexes()
{'indexes': [{'deletion_protection': 'disabled', 'dimension': 1536, 'host': 'prez-speeches-2307pwa.svc.aped-4627-b74a.pinecone.io', 'metric': 'cosine', 'name': 'prez-speeches', 'spec': {'serverless': {'cloud': 'aws', 'region': 'us-east-1'}}, 'status': {'ready': True, 'state': 'Ready'}}]}
Now that we have confirmed the index exists and is ready for querying we can create the initial connection to the Vector database using the Langchain PineconeVectorStore class. Note that we have to pass the name of the index as well as the embeddings to the class' constructor. It's important that we use the same embeddings here that we used to convert the speeches to numerical vectors in the Pinecone index.
embedding = OpenAIEmbeddings(model='text-embedding-ada-002')
vectordb = PineconeVectorStore(
pinecone_api_key=os.getenv("PINECONE_API_KEY"),
embedding=embedding,
index_name=index_name
)
Now we can perform vector similarity search using the similarity search function in Langchain. Under the hood this function creates a vector embedding of your question (query) and finds the closest documents using the cosine similarity score between the embedded question vector and the embedded document vectors. The closest documents to the question are determined by a "nearest neighbors" algorithm. This process is depicted in the image below,

The one thing to note is that I use the async similarity search for funsies and set it to return the top 5 documents.
question = "How did President Kennedy feel about the Berlin Wall?"
results = await vectordb.asimilarity_search(query=question, k=5)
I'll print out the document IDs since the actual text for the top 5 will be too long for the screen.
for document in results:
    print("Document ID:", document.id)
Document ID: 64fc63a1-79fd-4b40-bf8c-09f0617b9f0f
Document ID: 0fa5431f-a374-429e-a622-a1ed1c2b0a21
Document ID: 121366d4-9f46-4f52-8e56-2523bf1c9c8f
Document ID: 2da0bf3a-9adc-4dd0-a697-117bc3f0d8b9
Document ID: 4df626ad-0034-45cb-8144-88a21576785d
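One caveat on the await call above: a bare top-level await works in a Jupyter notebook, but in a plain Python script the coroutine has to be driven by an event loop, for example with asyncio.run. Here is a small sketch of that pattern. The fake_similarity_search coroutine is a hypothetical stand-in I made up so the example runs without a Pinecone index; in a script you would await vectordb.asimilarity_search inside main instead.

```python
import asyncio


async def fake_similarity_search(query: str, k: int = 5) -> list[str]:
    """Hypothetical stand-in for an async vector search call."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return [f"doc-{i}" for i in range(k)]


async def main() -> list[str]:
    # In a script, `await` is only valid inside an async function.
    return await fake_similarity_search(
        "How did President Kennedy feel about the Berlin Wall?", k=5
    )


# asyncio.run starts an event loop, runs main() to completion and closes the loop.
results = asyncio.run(main())
print(len(results))
```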
Now that we understand how to use the vector database to perform "retrieval" using similarity search, let's create a chain that will allow us to query the database and generate a response from the LLM. This will form the basis of a so-called "RAG Pipeline."
3. Building A RAG Pipeline
Now we can use the vector database as a retriever which is a special Langchain Runnable object that takes in a string (query) and returns a list of Langchain Documents. This is depicted below,

We can see this in action,
retriever = vectordb.as_retriever()
print(type(retriever))
<class 'langchain_core.vectorstores.base.VectorStoreRetriever'>
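Now we can query the vector database using the invoke method of the retriever. Note that the retriever returns the top 4 documents by default, rather than the 5 I asked for above: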
documents = retriever.invoke(input=question)
for document in documents:
    print("Document ID:", document.id)
Document ID: 64fc63a1-79fd-4b40-bf8c-09f0617b9f0f
Document ID: 0fa5431f-a374-429e-a622-a1ed1c2b0a21
Document ID: 121366d4-9f46-4f52-8e56-2523bf1c9c8f
Document ID: 2da0bf3a-9adc-4dd0-a697-117bc3f0d8b9
Now let's talk about our prompt for the RAG pipeline.
I used the classic rlm/rag-prompt from LangSmith. I couldn't use the original one as the function create_retrieval_chain expects the human input to be a variable named input, while the original prompt names it question. The whole prompt is,
from langchain.prompts import PromptTemplate
template = """You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: {input}
Context: {context}
Answer:
"""
prompt = PromptTemplate(
template=template,
input_variables=["input", "context"],
)
Now I'll give an example of how to use this prompt. I'll use the question from the user as well as the documents retrieved from Pinecone as context:
print(
prompt.invoke({
"input": question,
"context": [document.id for document in documents]
}).text
)
You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: How did President Kennedy feel about the Berlin Wall?
Context: ['64fc63a1-79fd-4b40-bf8c-09f0617b9f0f', '0fa5431f-a374-429e-a622-a1ed1c2b0a21', '121366d4-9f46-4f52-8e56-2523bf1c9c8f', '2da0bf3a-9adc-4dd0-a697-117bc3f0d8b9']
Answer:
Note I only used the document ids as context in the prompt. This is because printing the actual Langchain Documents would be a lot of text for the screen. However, in a real RAG pipeline we would pass the actual documents to the LLM.
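To give a feel for what passing the actual documents looks like, here is a rough sketch of "stuffing" document text into the prompt. It assumes the documents' page_content values are joined with a blank line between them, which I believe is the default behaviour of create_stuff_documents_chain, but treat that as an assumption. The Doc class, the shortened TEMPLATE and both helper functions are made up for illustration and are not Langchain APIs.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for a Langchain Document."""
    page_content: str


# Shortened version of the prompt template, for illustration only.
TEMPLATE = (
    "Question: {input}\n"
    "Context: {context}\n"
    "Answer:"
)


def stuff_documents(docs, separator="\n\n"):
    """Join the text of every retrieved document into one context string."""
    return separator.join(doc.page_content for doc in docs)


def build_prompt(question, docs):
    """Fill the template with the question and the stuffed context."""
    return TEMPLATE.format(input=question, context=stuff_documents(docs))


docs = [Doc("First retrieved passage."), Doc("Second retrieved passage.")]
prompt_text = build_prompt("What was said?", docs)
print(prompt_text)
```

Because every retrieved document ends up in a single prompt, the number and length of documents is bounded by the context window of the LLM.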
Now we'll move on to create our LLM ChatModel as this object will be needed to write the response to our question.
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
The LLM will be used as the generative part of the RAG pipeline.
The generative component in our RAG pipeline will be created by a function called create_stuff_documents_chain. This function will return a Runnable object and we'll give this object the name generate_chain:
generate_chain = create_stuff_documents_chain(llm=llm, prompt=prompt)
We can see what makes up this composite Runnable and the components of the chain:
print(generate_chain)
bound=RunnableBinding(bound=RunnableAssign(mapper={ context: RunnableLambda(format_docs) }), kwargs={}, config={'run_name': 'format_inputs'}, config_factories=[]) | PromptTemplate(input_variables=['context', 'input'], input_types={}, partial_variables={}, template="You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\nQuestion: {input} \nContext: {context} \nAnswer:\n") | ChatOpenAI(client=<openai.resources.chat.completions.completions.Completions object at 0x11b1cc490>, async_client=<openai.resources.chat.completions.completions.AsyncCompletions object at 0x11c833310>, root_client=<openai.OpenAI object at 0x11cd33e10>, root_async_client=<openai.AsyncOpenAI object at 0x11c86df90>, model_name='gpt-4o-mini', temperature=0.0, model_kwargs={}, openai_api_key=SecretStr('**********')) | StrOutputParser() kwargs={} config={'run_name': 'stuff_documents_chain'} config_factories=[]
Now we can call the chain using the invoke method and see the answer to our question.
The chain takes in the question and the retrieved documents, formats them into the prompt, and passes the prompt to the LLM and then to the StrOutputParser, which returns a string from the LLM instead of an AIMessage (which is the usual return type of a ChatModel).
answer = generate_chain.invoke(
{
'context': documents,
"input": question
}
)
print(answer)
President Kennedy viewed the Berlin Wall as a significant symbol of the failures of the Communist system and an offense against humanity, as it separated families and divided people. He expressed pride in the resilience of West Berlin and emphasized the importance of freedom and the right to make choices. Kennedy's speeches reflected a commitment to supporting the people of Berlin and a broader struggle for freedom worldwide.
Now we can put this all together as a RAG chain by passing the Pinecone vector database retriever and the generative chain to the create_retrieval_chain function. The retriever will take in the input question, perform similarity search and return the documents. These documents, along with the input question, will be passed to the generate_chain to return the answer output.
The full RAG chain is below:
rag_chain = create_retrieval_chain(
retriever=retriever,
combine_docs_chain=generate_chain)
The definition of the rag_chain is a bit different from generate_chain above and we can see its components,
print(rag_chain)
bound=RunnableAssign(mapper={ context: RunnableBinding(bound=RunnableLambda(lambda x: x['input']) | VectorStoreRetriever(tags=['PineconeVectorStore', 'OpenAIEmbeddings'], vectorstore=<langchain_pinecone.vectorstores.PineconeVectorStore object at 0x11c830cd0>, search_kwargs={}), kwargs={}, config={'run_name': 'retrieve_documents'}, config_factories=[]) }) | RunnableAssign(mapper={ answer: RunnableBinding(bound=RunnableBinding(bound=RunnableAssign(mapper={ context: RunnableLambda(format_docs) }), kwargs={}, config={'run_name': 'format_inputs'}, config_factories=[]) | PromptTemplate(input_variables=['context', 'input'], input_types={}, partial_variables={}, template="You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\nQuestion: {input} \nContext: {context} \nAnswer:\n") | ChatOpenAI(client=<openai.resources.chat.completions.completions.Completions object at 0x11b1cc490>, async_client=<openai.resources.chat.completions.completions.AsyncCompletions object at 0x11c833310>, root_client=<openai.OpenAI object at 0x11cd33e10>, root_async_client=<openai.AsyncOpenAI object at 0x11c86df90>, model_name='gpt-4o-mini', temperature=0.0, model_kwargs={}, openai_api_key=SecretStr('**********')) | StrOutputParser(), kwargs={}, config={'run_name': 'stuff_documents_chain'}, config_factories=[]) }) kwargs={} config={'run_name': 'retrieval_chain'} config_factories=[]
We can see the prompts that make up this chain:
rag_chain.get_prompts()
[PromptTemplate(input_variables=['page_content'], input_types={}, partial_variables={}, template='{page_content}'), PromptTemplate(input_variables=['context', 'input'], input_types={}, partial_variables={}, template="You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\nQuestion: {input} \nContext: {context} \nAnswer:\n")]
And then test it out,
response = rag_chain.invoke({"input": question})
response
{'input': 'How did President Kennedy feel about the Berlin Wall?', 'context': [Document(id='64fc63a1-79fd-4b40-bf8c-09f0617b9f0f', metadata={'filename': 'berlin-crisis-19610725', 'seq_num': 1.0, 'source': 'gs://prezkennedyspeches/berlin-crisis-19610725.json', 'title': 'Radio and Television Report to the American People on the Berlin Crisis, July 25, 1961', 'url': 'https://www.jfklibrary.org//archives/other-resources/john-f-kennedy-speeches/berlin-crisis-19610725'}, page_content='Listen to the speech. \xa0\xa0 View related documents. \nPresident John F. Kennedy\nThe White House\nJuly 25, 1961\nGood evening:\nSeven weeks ago tonight I returned from Europe to report on my meeting with Premier Khrushchev and the others. His grim warnings about the future of the world, his aide memoire on Berlin, his subsequent speeches and threats which he and his agents have launched, and the increase in the Soviet military budget that he has announced, have all prompted a series of decisions by the Administration and a series of consultations with the members of the NATO organization. In Berlin, as you recall, he intends to bring to an end, through a stroke of the pen, first our legal rights to be in West Berlin --and secondly our ability to make good on our commitment to the two million free people of that city. That we cannot permit.\nWe are clear about what must be done--and we intend to do it. I want to talk frankly with you tonight about the first steps that we shall take. These actions will require sacrifice on the part of many of our citizens. More will be required in the future. They will require, from all of us, courage and perseverance in the years to come. But if we and our allies act out of strength and unity of purpose--with calm determination and steady nerves--using restraint in our words as well as our weapons--I am hopeful that both peace and freedom will be sustained.\nThe immediate threat to free men is in West Berlin. 
But that isolated outpost is not an isolated problem. The threat is worldwide. Our effort must be equally wide and strong, and not be obsessed by any single manufactured crisis. We face a challenge in Berlin, but there is also a challenge in Southeast Asia, where the borders are less guarded, the enemy harder to find, and the dangers of communism less apparent to those who have so little. We face a challenge in our own hemisphere, and indeed wherever else the freedom of human beings is at stake.'), Document(id='0fa5431f-a374-429e-a622-a1ed1c2b0a21', metadata={'filename': 'berlin-w-germany-rudolph-wilde-platz-19630626', 'seq_num': 1.0, 'source': 'gs://prezkennedyspeches/berlin-w-germany-rudolph-wilde-platz-19630626.json', 'title': 'Remarks of President John F. Kennedy at the Rudolph Wilde Platz, Berlin, June 26, 1963', 'url': 'https://www.jfklibrary.org//archives/other-resources/john-f-kennedy-speeches/berlin-w-germany-rudolph-wilde-platz-19630626'}, page_content='Listen to speech. \xa0\xa0 View related documents. \nPresident John F. Kennedy\nWest Berlin\nJune 26, 1963\n[This version is published in the Public Papers of the Presidents: John F. Kennedy, 1963. Both the text and the audio versions omit the words of the German translator. The audio file was edited by the White House Signal Agency (WHSA) shortly after the speech was recorded. The WHSA was charged with recording only the words of the President. The Kennedy Library has an audiotape of a network broadcast of the full speech, with the translator\'s words, and a journalist\'s commentary. Because of copyright restrictions, it is only available for listening at the Library.]\nI am proud to come to this city as the guest of your distinguished Mayor, who has symbolized throughout the world the fighting spirit of West Berlin. 
And I am proud to visit the Federal Republic with your distinguished Chancellor who for so many years has committed Germany to democracy and freedom and progress, and to come here in the company of my fellow American, General Clay, who has been in this city during its great moments of crisis and will come again if ever needed.\nTwo thousand years ago the proudest boast was "civis Romanus sum." Today, in the world of freedom, the proudest boast is "Ich bin ein Berliner."\nI appreciate my interpreter translating my German!\nThere are many people in the world who really don\'t understand, or say they don\'t, what is the great issue between the free world and the Communist world. Let them come to Berlin. There are some who say that communism is the wave of the future. Let them come to Berlin. And there are some who say in Europe and elsewhere we can work with the Communists. Let them come to Berlin. And there are even a few who say that it is true that communism is an evil system, but it permits us to make economic progress. Lass\' sie nach Berlin kommen. Let them come to Berlin.'), Document(id='121366d4-9f46-4f52-8e56-2523bf1c9c8f', metadata={'filename': 'berlin-w-germany-rudolph-wilde-platz-19630626', 'seq_num': 1.0, 'source': 'gs://prezkennedyspeches/berlin-w-germany-rudolph-wilde-platz-19630626.json', 'title': 'Remarks of President John F. Kennedy at the Rudolph Wilde Platz, Berlin, June 26, 1963', 'url': 'https://www.jfklibrary.org//archives/other-resources/john-f-kennedy-speeches/berlin-w-germany-rudolph-wilde-platz-19630626'}, page_content='Freedom has many difficulties and democracy is not perfect, but we have never had to put a wall up to keep our people in, to prevent them from leaving us. I want to say, on behalf of my countrymen, who live many miles away on the other side of the Atlantic, who are far distant from you, that they take the greatest pride that they have been able to share with you, even from a distance, the story of the last 18 years. 
I know of no town, no city, that has been besieged for 18 years that still lives with the vitality and the force, and the hope and the determination of the city of West Berlin. While the wall is the most obvious and vivid demonstration of the failures of the Communist system, for all the world to see, we take no satisfaction in it, for it is, as your Mayor has said, an offense not only against history but an offense against humanity, separating families, dividing husbands and wives and brothers and sisters, and dividing a people who wish to be joined together.\nWhat is true of this city is true of Germany--real, lasting peace in Europe can never be assured as long as one German out of four is denied the elementary right of free men, and that is to make a free choice. In 18 years of peace and good faith, this generation of Germans has earned the right to be free, including the right to unite their families and their nation in lasting peace, with good will to all people. You live in a defended island of freedom, but your life is part of the main. So let me ask you as I close, to lift your eyes beyond the dangers of today, to the hopes of tomorrow, beyond the freedom merely of this city of Berlin, or your country of Germany, to the advance of freedom everywhere, beyond the wall to the day of peace with justice, beyond yourselves and ourselves to all mankind.'), Document(id='2da0bf3a-9adc-4dd0-a697-117bc3f0d8b9', metadata={'filename': 'american-society-of-newspaper-editors-19610420', 'seq_num': 1.0, 'source': 'gs://prezkennedyspeches/american-society-of-newspaper-editors-19610420.json', 'title': 'Address before the American Society of Newspaper Editors, Washington, D.C., April 20, 1961', 'url': 'https://www.jfklibrary.org//archives/other-resources/john-f-kennedy-speeches/american-society-of-newspaper-editors-19610420'}, page_content='Listen to the speech.\xa0 \xa0 View related documents. \nPresident John F. 
Kennedy\nStatler Hilton Hotel, Washington, D.C.\nApril 20, 1961\nMr. Catledge, members of the American Society of Newspaper Editors, ladies and gentlemen:\nThe President of a great democracy such as ours, and the editors of great newspapers such as yours, owe a common obligation to the people: an obligation to present the facts, to present them with candor, and to present them in perspective. It is with that obligation in mind that I have decided in the last 24 hours to discuss briefly at this time the recent events in Cuba.\nOn that unhappy island, as in so many other arenas of the contest for freedom, the news has grown worse instead of better. I have emphasized before that this was a struggle of Cuban patriots against a Cuban dictator. While we could not be expected to hide our sympathies, we made it repeatedly clear that the armed forces of this country would not intervene in any way.\nAny unilateral American intervention, in the absence of an external attack upon ourselves or an ally, would have been contrary to our traditions and to our international obligations. But let the record show that our restraint is not inexhaustible. Should it ever appear that the inter-American doctrine of non-interference merely conceals or excuses a policy of nonaction-if the nations of this Hemisphere should fail to meet their commitments against outside Communist penetration-then I want it clearly understood that this Government will not hesitate in meeting its primary obligations which are to the security of our Nation!')], 'answer': "President Kennedy viewed the Berlin Wall as a significant symbol of the failures of the Communist system, stating that it was an offense against humanity that separated families and divided people. He expressed pride in the resilience of West Berlin and emphasized the importance of freedom and the right to make free choices. 
Kennedy's speeches reflected a commitment to supporting the people of Berlin and a determination to uphold democratic values in the face of Communist threats."}
The response is a dictionary that contains the input question and the answer generated by the model. It also includes the context, which consists of the documents that were most semantically related to our question and were passed to the LLM to generate the answer.
We can also see the data associated with the context (reference) documents, which will be important for our deployment.
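The shape of that dictionary follows from how the chain is composed: the input is kept, a context key is assigned from the retriever, and then an answer key is assigned from the generative chain. Here is a toy sketch of that data flow in plain Python. All three functions are hypothetical stand-ins I wrote for illustration; they mimic the flow of data, not Langchain's actual implementation.

```python
def toy_retriever(question):
    """Stand-in for the retriever: question in, list of passages out."""
    return ["passage about Berlin", "passage about freedom"]


def toy_generate(inputs):
    """Stand-in for the generative chain: reads 'input' and 'context'."""
    return f"Answer based on {len(inputs['context'])} passages."


def toy_rag_chain(inputs):
    """Keep the input, then add 'context', then add 'answer'."""
    state = dict(inputs)                              # copy so the caller's dict is untouched
    state["context"] = toy_retriever(state["input"])  # retrieval step
    state["answer"] = toy_generate(state)             # generation step
    return state


response_sketch = toy_rag_chain(
    {"input": "How did President Kennedy feel about the Berlin Wall?"}
)
print(list(response_sketch))
```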
references = [(doc.metadata["title"],
doc.page_content, doc.metadata["url"])
for doc in response['context']]
references
[('Radio and Television Report to the American People on the Berlin Crisis, July 25, 1961', 'Listen to the speech. \xa0\xa0 View related documents. \nPresident John F. Kennedy\nThe White House\nJuly 25, 1961\nGood evening:\nSeven weeks ago tonight I returned from Europe to report on my meeting with Premier Khrushchev and the others. His grim warnings about the future of the world, his aide memoire on Berlin, his subsequent speeches and threats which he and his agents have launched, and the increase in the Soviet military budget that he has announced, have all prompted a series of decisions by the Administration and a series of consultations with the members of the NATO organization. In Berlin, as you recall, he intends to bring to an end, through a stroke of the pen, first our legal rights to be in West Berlin --and secondly our ability to make good on our commitment to the two million free people of that city. That we cannot permit.\nWe are clear about what must be done--and we intend to do it. I want to talk frankly with you tonight about the first steps that we shall take. These actions will require sacrifice on the part of many of our citizens. More will be required in the future. They will require, from all of us, courage and perseverance in the years to come. But if we and our allies act out of strength and unity of purpose--with calm determination and steady nerves--using restraint in our words as well as our weapons--I am hopeful that both peace and freedom will be sustained.\nThe immediate threat to free men is in West Berlin. But that isolated outpost is not an isolated problem. The threat is worldwide. Our effort must be equally wide and strong, and not be obsessed by any single manufactured crisis. We face a challenge in Berlin, but there is also a challenge in Southeast Asia, where the borders are less guarded, the enemy harder to find, and the dangers of communism less apparent to those who have so little. 
We face a challenge in our own hemisphere, and indeed wherever else the freedom of human beings is at stake.', 'https://www.jfklibrary.org//archives/other-resources/john-f-kennedy-speeches/berlin-crisis-19610725'), ('Remarks of President John F. Kennedy at the Rudolph Wilde Platz, Berlin, June 26, 1963', 'Listen to speech. \xa0\xa0 View related documents. \nPresident John F. Kennedy\nWest Berlin\nJune 26, 1963\n[This version is published in the Public Papers of the Presidents: John F. Kennedy, 1963. Both the text and the audio versions omit the words of the German translator. The audio file was edited by the White House Signal Agency (WHSA) shortly after the speech was recorded. The WHSA was charged with recording only the words of the President. The Kennedy Library has an audiotape of a network broadcast of the full speech, with the translator\'s words, and a journalist\'s commentary. Because of copyright restrictions, it is only available for listening at the Library.]\nI am proud to come to this city as the guest of your distinguished Mayor, who has symbolized throughout the world the fighting spirit of West Berlin. And I am proud to visit the Federal Republic with your distinguished Chancellor who for so many years has committed Germany to democracy and freedom and progress, and to come here in the company of my fellow American, General Clay, who has been in this city during its great moments of crisis and will come again if ever needed.\nTwo thousand years ago the proudest boast was "civis Romanus sum." Today, in the world of freedom, the proudest boast is "Ich bin ein Berliner."\nI appreciate my interpreter translating my German!\nThere are many people in the world who really don\'t understand, or say they don\'t, what is the great issue between the free world and the Communist world. Let them come to Berlin. There are some who say that communism is the wave of the future. Let them come to Berlin. 
And there are some who say in Europe and elsewhere we can work with the Communists. Let them come to Berlin. And there are even a few who say that it is true that communism is an evil system, but it permits us to make economic progress. Lass\' sie nach Berlin kommen. Let them come to Berlin.', 'https://www.jfklibrary.org//archives/other-resources/john-f-kennedy-speeches/berlin-w-germany-rudolph-wilde-platz-19630626'), ('Remarks of President John F. Kennedy at the Rudolph Wilde Platz, Berlin, June 26, 1963', 'Freedom has many difficulties and democracy is not perfect, but we have never had to put a wall up to keep our people in, to prevent them from leaving us. I want to say, on behalf of my countrymen, who live many miles away on the other side of the Atlantic, who are far distant from you, that they take the greatest pride that they have been able to share with you, even from a distance, the story of the last 18 years. I know of no town, no city, that has been besieged for 18 years that still lives with the vitality and the force, and the hope and the determination of the city of West Berlin. While the wall is the most obvious and vivid demonstration of the failures of the Communist system, for all the world to see, we take no satisfaction in it, for it is, as your Mayor has said, an offense not only against history but an offense against humanity, separating families, dividing husbands and wives and brothers and sisters, and dividing a people who wish to be joined together.\nWhat is true of this city is true of Germany--real, lasting peace in Europe can never be assured as long as one German out of four is denied the elementary right of free men, and that is to make a free choice. In 18 years of peace and good faith, this generation of Germans has earned the right to be free, including the right to unite their families and their nation in lasting peace, with good will to all people. You live in a defended island of freedom, but your life is part of the main. 
So let me ask you as I close, to lift your eyes beyond the dangers of today, to the hopes of tomorrow, beyond the freedom merely of this city of Berlin, or your country of Germany, to the advance of freedom everywhere, beyond the wall to the day of peace with justice, beyond yourselves and ourselves to all mankind.', 'https://www.jfklibrary.org//archives/other-resources/john-f-kennedy-speeches/berlin-w-germany-rudolph-wilde-platz-19630626'), ('Address before the American Society of Newspaper Editors, Washington, D.C., April 20, 1961', 'Listen to the speech.\xa0 \xa0 View related documents. \nPresident John F. Kennedy\nStatler Hilton Hotel, Washington, D.C.\nApril 20, 1961\nMr. Catledge, members of the American Society of Newspaper Editors, ladies and gentlemen:\nThe President of a great democracy such as ours, and the editors of great newspapers such as yours, owe a common obligation to the people: an obligation to present the facts, to present them with candor, and to present them in perspective. It is with that obligation in mind that I have decided in the last 24 hours to discuss briefly at this time the recent events in Cuba.\nOn that unhappy island, as in so many other arenas of the contest for freedom, the news has grown worse instead of better. I have emphasized before that this was a struggle of Cuban patriots against a Cuban dictator. While we could not be expected to hide our sympathies, we made it repeatedly clear that the armed forces of this country would not intervene in any way.\nAny unilateral American intervention, in the absence of an external attack upon ourselves or an ally, would have been contrary to our traditions and to our international obligations. But let the record show that our restraint is not inexhaustible. 
Should it ever appear that the inter-American doctrine of non-interference merely conceals or excuses a policy of nonaction-if the nations of this Hemisphere should fail to meet their commitments against outside Communist penetration-then I want it clearly understood that this Government will not hesitate in meeting its primary obligations which are to the security of our Nation!', 'https://www.jfklibrary.org//archives/other-resources/john-f-kennedy-speeches/american-society-of-newspaper-editors-19610420')]
4. Deploying A RAG Application
Now in order to deploy this in a Streamlit App I'll create a function called ask_question that takes in a question and an index_name for the vector database. It then runs all the logic we went through above and returns the response dictionary. I'll then print the answer from the LLM and print out the retrieved documents as sources, with the title of each speech and its url as a hyperlink. The entire Streamlit app with an example is shown below,

I won't go through the process of deploying this app to Google Cloud Run as I have covered that pretty extensively in a prior post.
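For the source-listing step of the app, here is a small sketch of how the retrieved documents could be turned into Markdown hyperlinks. It is not the code from my app: the format_sources helper is hypothetical, it uses plain dictionaries in place of Langchain Documents, and dropping repeated urls is a design choice I added because the same speech can be retrieved more than once (as happened with the Berlin speech above).

```python
def format_sources(response):
    """Build one Markdown link per unique source url in the response context."""
    seen_urls = set()
    lines = []
    for doc in response["context"]:
        title = doc["metadata"]["title"]
        url = doc["metadata"]["url"]
        if url in seen_urls:
            continue  # the same speech was retrieved more than once
        seen_urls.add(url)
        lines.append(f"- [{title}]({url})")
    return lines


# Made-up response with placeholder titles and urls, for illustration only.
example_response = {
    "answer": "An example answer.",
    "context": [
        {"metadata": {"title": "Speech A", "url": "https://example.com/a"}},
        {"metadata": {"title": "Speech A", "url": "https://example.com/a"}},
        {"metadata": {"title": "Speech B", "url": "https://example.com/b"}},
    ],
}

for line in format_sources(example_response):
    print(line)
```

In a Streamlit app, strings like these could be passed to st.markdown to render clickable links.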
5. Conclusions
In this post I covered the basics of creating a Retrieval Augmented Generation (RAG) app using Langchain and deploying it as a Streamlit app. The RAG application is based on speeches made by President Kennedy, which were stored in a Pinecone vector database. In a future post I will go over methods of evaluating and testing the RAG pipeline, but this is enough for now. Hope you enjoyed it!