Do You Need an Integration Library? A Look at RAG and LLM Chatbot Development
In recent months, the Retrieval Augmented Generation (RAG) and Large Language Model (LLM) space has gained significant attention. Many developers believe that they require integration libraries such as LangChain or LlamaIndex to build chatbots in this domain. However, this is not entirely accurate. In this blog post, we will explore the need for integration libraries and alternative approaches for building RAG/LLM chatbots.
Integration Libraries: The Pros and Cons
Integration libraries like LangChain and LlamaIndex can significantly reduce the time to market for chatbot development by coordinating all the necessary components. They handle tasks such as ingesting documents, calling text embedding models, storing embeddings in a vector store, performing semantic search, augmenting the LLM with search results, and calling the LLM itself. While these libraries can be highly productive when used as intended, they also come with a complexity tax.
If you plan to deviate from the standard workflow, for example by using an unsupported text ingestion, summarization, or chunking method, or by injecting custom guardrails into the LLM prompt or response, you may need to override parts of these libraries. Doing so can be challenging because of their complex internal structures.
Alternative Approach: Building RAG Chatbots without Integration Libraries
As a developer, you can build RAG chatbots using the same programming languages, API calling, and basic string manipulation you already know. This alternative approach offers greater control over your application and eliminates the complexity tax associated with integration libraries.
To achieve this, follow these steps; minimal code sketches for each step appear after the list:
- Learn how to call text embedding models and LLMs through REST endpoints, or use a higher-level wrapper like the official Python OpenAI library.
- Parse the vectors returned by text embedding models and store them directly in your vector store or database.
- Augment LLM inputs with text chunks from your semantic search results using string manipulation.
- Handle LLM outputs, which are also just text. Streaming output adds slightly more complexity than the completion API, which returns the full response at once; for streaming to the browser, a JavaScript front end with WebSockets works well.
- Manage the conversational history using string manipulation.
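Here is a minimal sketch of the first two steps: calling an embedding model and storing the vectors. It assumes the OpenAI Python library (`openai>=1.0`) and the `text-embedding-3-small` model; the in-memory `store` and the `cosine_similarity` helper are illustrative stand-ins for a real vector store.

```python
import math
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def embed(text: str) -> list[float]:
    # One REST call, wrapped by the library; the result is just a list of floats.
    resp = client.embeddings.create(model="text-embedding-3-small", input=text)
    return resp.data[0].embedding

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# A plain list works as a toy "vector store"; swap in a real database later.
store: list[tuple[str, list[float]]] = []

def ingest(chunks: list[str]) -> None:
    for chunk in chunks:
        store.append((chunk, embed(chunk)))

def search(query: str, k: int = 3) -> list[str]:
    # Semantic search is just "rank stored chunks by similarity to the query".
    query_vector = embed(query)
    ranked = sorted(store, key=lambda item: cosine_similarity(query_vector, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]
```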
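Augmenting the LLM input is plain string manipulation. A sketch, again assuming the OpenAI chat completions API and an illustrative model name:

```python
def answer(question: str) -> str:
    context = "\n\n".join(search(question))
    # The "augmentation" in RAG is just string formatting.
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content
```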
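Streaming adds only a little more work on the server side: the same call with `stream=True` yields chunks you can forward over a WebSocket to a JavaScript front end. A sketch:

```python
def stream_answer(question: str):
    stream = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[{"role": "user", "content": question}],
        stream=True,
    )
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:
            yield delta  # forward each fragment over your WebSocket

# Usage: for token in stream_answer("What is RAG?"): print(token, end="", flush=True)
```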
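Conversational history is also just strings: keep a list of turns, append to it, and drop the oldest turns when the prompt grows too long. A sketch with an arbitrary character budget:

```python
history: list[dict[str, str]] = []

def chat(user_message: str) -> str:
    history.append({"role": "user", "content": user_message})
    # Crude length management: drop the oldest turns past a character budget.
    while sum(len(m["content"]) for m in history) > 8000 and len(history) > 2:
        history.pop(0)
    resp = client.chat.completions.create(model="gpt-4o-mini", messages=history)
    reply = resp.choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    return reply
```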
By following this approach, you can leverage the power of RAG and LLM technologies without relying on integration libraries, giving you full control over your application.
Call to Action
Don’t be intimidated by the advanced technology in the RAG and LLM space. At their core, these systems are composed of REST APIs, strings, and arrays that you already know how to use. Embrace these building blocks and take control of your chatbot development journey.
In conclusion, while integration libraries can be useful for reducing development time and simplifying the process, they may not be necessary for building RAG/LLM chatbots. By understanding the underlying components and leveraging your existing skills, you can create a custom, powerful chatbot solution without the complexity tax associated with integration libraries.
- Human Intervention: Minor. It decided to add “This blog post is in Markdown format” at the bottom, which is super weird.
Facts Used:
- Most users researching the RAG (retrieval augmented generation)/LLM (large language model) space think they need an integration library like LangChain or LlamaIndex to build a chatbot. This is not entirely true.
- Integration libraries can coordinate all the necessary pieces to build a chatbot, like ingesting documents, calling the text embedding model, storing embeddings in a vector store, performing semantic search, augmenting the LLM with the results, and calling the LLM itself and getting results back.
- While integration libraries can drastically reduce your time to market, they also come at the price of a complexity tax.
- If you build with them as they are intended, they can be very productive.
- However, if you need to go off the rails and do something custom, like using a text chunking method that is not supported or injecting custom guardrails into the LLM prompt or the response, you will need to override some of what these libraries do.
- These libraries tend to be complex internally, and modifying them to fit your purpose can be a daunting task.
- As an alternative, you can build RAG chatbots with the exact same programming languages, API calling, and basic string manipulation you already know how to do, as a developer!
- Most text embedding models and LLMs are called through REST endpoints. Learn how to call them directly with REST or use a higher-level library like the Python OpenAI library.
- Text embedding models return vectors, which can be parsed easily and stored directly in your vector store or database.
- LLM inputs are just text. Augmenting them with text chunks from your semantic search results is just string manipulation.
- LLM outputs are also just text. There is slightly more complexity when you want to use the streaming output vs. the completion API, which produces the full response without streaming. For streaming purposes, a JavaScript front end and WebSockets is ideal.
- The conversational history can be managed, again, with string manipulation.
- Call to action: Don’t be scared of the fancy new technology, it’s all just REST APIs, strings, and arrays that you already know how to use.
- You can be in 100% control of your application, no integration libraries required.