Commit 93b26fb: PR feedback
1 parent 3a7fd22 commit 93b26fb

File tree

1 file changed: +14, -7 lines changed


presidential-speeches-rag/presidential-speeches-rag.ipynb

Lines changed: 14 additions & 7 deletions
@@ -9,11 +9,12 @@
    ]
   },
   {
+   "attachments": {},
    "cell_type": "markdown",
    "id": "d7a4fc92-eb9a-4273-8ff6-0fc5b96236d7",
    "metadata": {},
    "source": [
-    "Retrieval-Augmented Generation (RAG) is a widely-used technique that enables us to gather pertinent information from an external data source and provide it to our large language model (LLM). It helps solve two of the biggest limitations of LLMs: knowledge cutoffs, in which information after a certain date or for a specific source is not available to the LLM, and hallucination, in which the LLM makes up an answer to a question it doesn't have the information for. With RAG, we can ensure that the LLM has relevant information to answer the question at hand."
+    "Retrieval-Augmented Generation (RAG) is a widely-used technique that enables us to gather pertinent information from an external data source and provide it to our Large Language Model (LLM). It helps solve two of the biggest limitations of LLMs: knowledge cutoffs, in which information after a certain date or for a specific source is not available to the LLM, and hallucination, in which the LLM makes up an answer to a question it doesn't have the information for. With RAG, we can ensure that the LLM has relevant information to answer the question at hand."
    ]
   },
   {
@@ -87,11 +88,12 @@
    ]
   },
   {
+   "attachments": {},
    "cell_type": "markdown",
    "id": "283183cd-ba64-4e98-a0d9-a6165e88494e",
    "metadata": {},
    "source": [
-    "The presidential speeches we'll be using are stored are in this [.csv file](https://github.com/groq/groq-api-cookbook/blob/main/presidential-speeches-rag/presidential_speeches.csv). Each row of the .csv contains fields for the date, president, party, speech title, speech summary and speech transcript, and includes every recorded presidential speech through the Trump presidency:"
+    "The presidential speeches we'll be using are stored in this [.csv file](https://github.com/groq/groq-api-cookbook/blob/main/presidential-speeches-rag/presidential_speeches.csv). Each row of the .csv contains fields for the date, president, party, speech title, speech summary and speech transcript, and includes every recorded presidential speech through the Trump presidency:"
    ]
   },
   {
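The loading step described in this hunk can be sketched with the standard library's csv module. The column names follow the fields listed in the notebook's prose; the two rows below are invented stand-ins for the real file, not actual entries from it:

```python
import csv
import io

# Two invented rows standing in for presidential_speeches.csv; the columns
# mirror the fields named above: date, president, party, title, summary,
# and transcript.
sample_csv = """date,president,party,title,summary,transcript
1881-03-04,James A. Garfield,Republican,Inaugural Address,A summary.,Fellow-Citizens ...
1933-03-04,Franklin D. Roosevelt,Democratic,First Inaugural Address,A summary.,I am certain ...
"""

# DictReader maps each row to a dict keyed by the header fields.
rows = list(csv.DictReader(io.StringIO(sample_csv)))
for row in rows:
    print(f"{row['date']}: {row['president']} ({row['party']}) - {row['title']}")
```

The notebook itself reads the hosted .csv (e.g. with pandas); the stdlib version above just illustrates the row structure.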
@@ -252,11 +254,12 @@
    ]
   },
   {
+   "attachments": {},
    "cell_type": "markdown",
    "id": "8e1a9811-2fd6-4c99-ba11-a9df2df33ec0",
    "metadata": {},
    "source": [
-    "A challenge with prompting LLMs can be running into limits with their context window. While this speech is not extremely long and would actually in Mixtral's context window, it is not always great practice to use way more of the context window than you need, so when using RAG we want to split up the text to provide only relevant parts of it to the LLM. To do so, we first need to ```tokenize``` the transcript. We'll use the Mixtral 8x7b tokenzier with the transformers AutoTokenizer class for this - this will show the number of tokens the Mixtral 8x7b model counts in Garfield's Inaugural Address:"
+    "A challenge with prompting LLMs can be running into limits with their context window. While this speech is not extremely long and would actually fit in Mixtral's context window, it is not always great practice to use way more of the context window than you need, so when using RAG we want to split up the text to provide only relevant parts of it to the LLM. To do so, we first need to ```tokenize``` the transcript. We'll use the Mixtral 8x7b tokenzier with the transformers AutoTokenizer class for this - this will show the number of tokens the Mixtral 8x7b model counts in Garfield's Inaugural Address:"
    ]
   },
   {
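The notebook tokenizes with the Mixtral 8x7b tokenizer via transformers' AutoTokenizer, which requires downloading the model's tokenizer files. As a self-contained stand-in, a whitespace tokenizer illustrates the same split-into-overlapping-chunks logic; the chunk size and overlap below are invented for illustration, not taken from the notebook:

```python
# Whitespace stand-in for the Mixtral tokenizer: the notebook counts subword
# tokens, but words approximate the idea well enough to show the chunking.
def chunk_text(text: str, max_tokens: int = 450, overlap: int = 50) -> list[str]:
    tokens = text.split()
    chunks = []
    step = max_tokens - overlap  # consecutive chunks share `overlap` tokens
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + max_tokens]))
        if start + max_tokens >= len(tokens):
            break
    return chunks

# A 1000-"token" transcript splits into three overlapping chunks.
transcript = ("word " * 1000).strip()
chunks = chunk_text(transcript)
```

Overlap keeps a sentence that straddles a chunk boundary fully visible in at least one chunk, which helps retrieval later.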
@@ -335,11 +338,12 @@
    ]
   },
   {
+   "attachments": {},
    "cell_type": "markdown",
    "id": "ce723eea-7e69-48c1-8452-957709d117db",
    "metadata": {},
    "source": [
-    "Next, we will embed each chunk into a semantic vector space using the all-MiniLM-L6-v2 model, through LangChain's implementation of Sentence Transformers from [HuggingFace](https://huggingface.co/). Note that each embedding has a length of 384."
+    "Next, we will embed each chunk into a semantic vector space using the all-MiniLM-L6-v2 model, through LangChain's implementation of Sentence Transformers from [HuggingFace](https://huggingface.co/sentence-transformers). Note that each embedding has a length of 384."
    ]
   },
   {
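The real embedding step uses the all-MiniLM-L6-v2 sentence transformer, which needs the model weights. Purely to illustrate the shape of the output (one fixed-length, L2-normalized 384-dimensional vector per text), here is a hashing-trick toy embedder; it is not semantically meaningful and is not the notebook's method:

```python
import hashlib
import math

EMBED_DIM = 384  # matches the output dimension of all-MiniLM-L6-v2

def toy_embed(text: str) -> list[float]:
    # Each word hashes into one of 384 buckets; the count vector is then
    # L2-normalized, so every embedding is a unit vector of length 384.
    vec = [0.0] * EMBED_DIM
    for word in text.lower().split():
        bucket = int(hashlib.md5(word.encode()).hexdigest(), 16) % EMBED_DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

emb = toy_embed("Four score and seven years ago")
```

The point is only the contract: same input, same vector; fixed dimension regardless of text length.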
@@ -368,11 +372,12 @@
    ]
   },
   {
+   "attachments": {},
    "cell_type": "markdown",
    "id": "4a52f835-69f8-465d-a06e-fb5e31656b37",
    "metadata": {},
    "source": [
-    "Finally, we will embed our prompt and use cosine similarity to find the most relevant chunk to the question we'd like answered"
+    "Finally, we will embed our prompt and use cosine similarity to find the most relevant chunk to the question we'd like answered:"
    ]
   },
   {
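Cosine similarity itself is small enough to write out directly. A minimal sketch, using toy 2-d vectors in place of the 384-d chunk embeddings:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # cos(a, b) = (a . b) / (|a| * |b|): 1.0 for parallel vectors,
    # 0.0 for orthogonal ones.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 2-d embeddings standing in for the chunk vectors.
chunk_vectors = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
prompt_vector = [0.9, 0.1]

# The most relevant chunk is the one whose vector is closest in angle.
best = max(range(len(chunk_vectors)),
           key=lambda i: cosine_similarity(prompt_vector, chunk_vectors[i]))
```

The winning index is then used to look up the corresponding text chunk, which is what gets passed to the LLM as context.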
@@ -505,11 +510,12 @@
    ]
   },
   {
+   "attachments": {},
    "cell_type": "markdown",
    "id": "eec976bc-5f33-49bc-a61a-5f2ee4a293d6",
    "metadata": {},
    "source": [
-    "I will be using a Pinecone index called `presidential-speeches` for this demo. As mentioned above, you can sign up for Pinecone's Starter plan for free and have access to a single index, which is ideal for a small personal project. You can also use Chroma DB as an open source alternative. Note that either Vector DB will use the same embedding function we've defined above"
+    "I will be using a Pinecone index called `presidential-speeches` for this demo. As mentioned above, you can sign up for Pinecone's Starter plan for free and have access to a single index, which is ideal for a small personal project. You can also use Chroma DB as an open source alternative. Note that either Vector DB will use the same embedding function we've defined above:"
    ]
   },
   {
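Conceptually, either vector DB does two things: store (id, vector) pairs and return the ids nearest a query vector. An in-memory toy index makes that contract concrete; the upsert/query method names loosely echo Pinecone's interface, but this is not the Pinecone client API, and the ids below are invented:

```python
import math

class ToyVectorIndex:
    """In-memory stand-in for a vector DB index (not a real client)."""

    def __init__(self):
        self._vectors = {}

    def upsert(self, items):
        # items: iterable of (id, vector) pairs; re-upserting an id overwrites it.
        self._vectors.update(items)

    def query(self, vector, top_k=1):
        # Rank all stored ids by cosine similarity to the query vector.
        def cos(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            return dot / (math.sqrt(sum(x * x for x in a)) *
                          math.sqrt(sum(y * y for y in b)))
        ranked = sorted(self._vectors, key=lambda k: cos(vector, self._vectors[k]),
                        reverse=True)
        return ranked[:top_k]

index = ToyVectorIndex()
index.upsert([("garfield-chunk-0", [1.0, 0.0]), ("fdr-chunk-0", [0.0, 1.0])])
matches = index.query([0.9, 0.1], top_k=1)
```

A real index adds persistence and approximate nearest-neighbor search so queries stay fast over thousands of vectors, but the retrieval semantics are the same.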
@@ -623,11 +629,12 @@
    ]
   },
   {
+   "attachments": {},
    "cell_type": "markdown",
    "id": "c9b86556-d31d-4896-a342-8eff9d9fb48b",
    "metadata": {},
    "source": [
-    "In this notebook we've shown how to implement a RAG system using Groq API, LangChain and Pinecone by embedding, storing and searching over nearly 1,000 speeches from US presidents. By embedding speech transcripts into a vector database and leveraging the power of semantic search, we have demonstrated how to overcome two of the most significant challenges faced by large language models (LLMs): the knowledge cutoff and hallucination issues.\n",
+    "In this notebook we've shown how to implement a RAG system using Groq API, LangChain and Pinecone by embedding, storing and searching over nearly 1,000 speeches from US presidents. By embedding speech transcripts into a vector database and leveraging the power of semantic search, we have demonstrated how to overcome two of the most significant challenges faced by LLMs: the knowledge cutoff and hallucination issues.\n",
     "\n",
     "You can interact with this RAG application here: https://presidential-speeches-rag.streamlit.app/"
    ]
