Transactional Knowledge: Build Better Chatbots

Chatbots have become an essential tool for businesses that need to provide quick, accurate answers to questions from customers and internal users. One of the keys to building an effective chatbot is having high-quality knowledge sources that ground its responses. In this blog post, we will explore how transactional knowledge leads to better chatbots by enabling real-time updates, avoiding downtime, and empowering business users.

LLM-driven RAG Chatbots and Knowledge Sources

LLM (large language model) driven RAG (retrieval augmented generation) chatbots rely on knowledge sources to provide accurate answers. These sources can include Word documents, PDFs, text files, and website scrapes. Chatbots retrieve this knowledge using semantic search, or a combination of vector search and lexical search called hybrid search.
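To make hybrid search concrete, here is a minimal sketch of one common way to merge the two ranked result lists: reciprocal rank fusion (RRF). The function name and the `k=60` constant are illustrative assumptions, not something prescribed by any particular chatbot stack:

```python
def reciprocal_rank_fusion(vector_hits, lexical_hits, k=60):
    """Merge two ranked lists of document ids into one hybrid ranking.

    Each list contributes 1 / (k + rank + 1) per document, so documents
    that rank well in BOTH lists accumulate the highest combined score.
    """
    scores = {}
    for hits in (vector_hits, lexical_hits):
        for rank, doc_id in enumerate(hits):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# "b" appears near the top of both lists, so it wins the hybrid ranking.
merged = reciprocal_rank_fusion(["a", "b", "c"], ["b", "c", "d"])
```

RRF is attractive here because it only needs ranks, not raw scores, and vector and lexical engines rarely produce scores on a comparable scale.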

To enable efficient retrieval, the unstructured text is “chunked” into smaller pieces. Each chunk of knowledge is then meaningfully represented by a text embedder, which takes text as input and produces the dense vectors used later for similarity search. In a hybrid setup, a lexical search engine such as Lucene also searches the raw text of each chunk.
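The most common default here is the token-limit-with-overlap method: store a fixed number of tokens per chunk, with some overlap on each side. A minimal sketch, using whitespace-split words as a stand-in for a real tokenizer (the limits below are toy values, not recommendations):

```python
def chunk_tokens(tokens, limit=200, overlap=40):
    """Split a token list into fixed-size chunks that overlap on each side.

    The overlap keeps sentences that straddle a chunk boundary
    retrievable from at least one chunk.
    """
    step = limit - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + limit])
        if start + limit >= len(tokens):
            break
    return chunks

# A toy "document" of 10 tokens, chunked with limit=5 and overlap=2.
words = "the hr policy grants twenty days of paid leave annually".split()
chunks = chunk_tokens(words, limit=5, overlap=2)
```

Note how the last two tokens of each chunk reappear at the start of the next one; that shared window is exactly what shifts when the source text changes, which matters for the update discussion below.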

Ingesting these sources allows a RAG workflow to ground the LLM’s answers with retrieved chunks, resulting in more accurate and relevant responses.
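Grounding can be as simple as pasting the retrieved chunks ahead of the user’s question. The template below is an illustrative sketch of that assembly step, not a prescribed prompt format:

```python
def build_grounded_prompt(question, retrieved_chunks):
    """Assemble an LLM prompt that restricts answers to retrieved chunks."""
    context = "\n\n".join(
        f"[{i + 1}] {chunk}" for i, chunk in enumerate(retrieved_chunks)
    )
    return (
        "Answer the question using ONLY the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

prompt = build_grounded_prompt(
    "How many days of paid leave do employees get?",
    ["Employees accrue 20 days of paid leave per year.",
     "Leave requests are submitted through the HR portal."],
)
```

Because the model is instructed to answer only from the numbered context, its responses stay close to the source material instead of being hallucinated from pretraining.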

Knowledge Changes Over Time

However, knowledge sources are not static; they change over time. For example, if an HR document is used as a knowledge source, HR policies may change. To keep chatbot responses up-to-date, it is crucial to update the knowledge sources.

Current Chatbot Update Method: Nuke and Pave

In most current chatbot implementations, updates involve a “nuke and pave” operation. This method involves deleting the existing chunks in the vector index and reingesting, chunking, and vectorizing the documents. While this approach ensures that the chatbot’s knowledge is up-to-date, it has some pitfalls:

  1. Impact on Recall and Precision: The default chunking strategy counts tokens in the source text and stores a fixed number of them, usually with some overlap on each side (the token-limit-with-overlap method). Changing the source text therefore changes the chunk boundaries, which can hurt the recall and precision of both vector and lexical search results.
  2. Downtime: The chatbot cannot answer questions during the index rebuild process. Workarounds like Blue-Green environment setups or multiple indexes can mitigate this issue but still require replacing the entire knowledge base.

Alternative: Transactional Knowledge

Transactional knowledge offers a better solution for updating chatbot knowledge sources. In this approach, chunks are stored in a database like MongoDB, and a simple CRUD (create, read, update, delete) application is built on top of the chunks. This allows for real-time editing and updating of the knowledge without needing to nuke the entire index.
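The sketch below shows the idea with an in-memory stand-in for the chunk collection; the class name and the `{"text": ..., "vector": ...}` schema are illustrative assumptions. A real deployment would back the same four methods with MongoDB via pymongo’s `insert_one`, `find_one`, `update_one`, and `delete_one`, and only the edited chunk would be re-embedded:

```python
class ChunkStore:
    """CRUD over knowledge chunks; in-memory stand-in for a MongoDB collection."""

    def __init__(self, embed):
        self.embed = embed      # any text -> dense-vector function
        self.chunks = {}        # _id -> {"text": ..., "vector": ...}
        self._next_id = 0

    def create(self, text):
        _id = self._next_id
        self._next_id += 1
        self.chunks[_id] = {"text": text, "vector": self.embed(text)}
        return _id

    def read(self, _id):
        return self.chunks[_id]

    def update(self, _id, new_text):
        # Only this one chunk is re-embedded; every other chunk stays
        # untouched and keeps serving live queries -- no nuke and pave.
        self.chunks[_id] = {"text": new_text, "vector": self.embed(new_text)}

    def delete(self, _id):
        del self.chunks[_id]

# Toy embedder so the example runs without a model: vector = [length of text].
store = ChunkStore(embed=lambda text: [float(len(text))])
policy = store.create("Employees accrue 15 days of paid leave per year.")
store.update(policy, "Employees accrue 20 days of paid leave per year.")
```

Because each edit touches a single document, the chatbot keeps answering from all the other chunks while the change lands.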

The benefits of transactional knowledge include:

  1. Real-time Updates: Business users can update the chatbot knowledge anytime it needs updating, without waiting for scheduled downtime.
  2. No Downtime: Chatbots do not experience downtime during updates, enabling continuous service for users.
  3. Empowering Business Users: Business users can directly contribute to the chatbot’s knowledge base, streamlining the update process and eliminating the need for coordination among stakeholders.

Conclusion

Transactional knowledge transforms the chatbot’s knowledge management into a content management system, a familiar paradigm for business users who create content regularly. By adopting this approach, businesses can build better chatbots with real-time updates, no downtime, and empowered business users. It’s time to ditch the nuke and pave method and embrace the future of chatbot knowledge management.

  • Human Intervention: Minor. This was applicable to internal and external chatbots. The LLM used here assumed it was external only.

Facts Used:

    • LLM (large language model) driven RAG (retrieval augmented generation) chatbots rely on high-quality knowledge sources to give accurate, grounded answers.
    • Chatbots retrieve their knowledge using semantic search or a combination of vector search and lexical search, called hybrid search.
    • Chatbot knowledge sources (word docs, pdf docs, text files, site scrapes) are typically unstructured blobs of text that get broken up into pieces (a process called “chunking”) to allow each chunk of knowledge to be meaningfully represented by a text embedder, which inputs text and produces dense vectors which are used for similarity search later. In a hybrid scenario we also search the raw text with a search engine like Lucene.
    • In a pilot or PoC environment, we will typically use an integration library like LangChain to ingest and chunk our document sources, which are usually just sitting in a folder on a server or on our laptop.
    • Ingesting all these sources results in our ability to run a RAG workflow and ground our LLM’s answers with retrieved chunks, so the answers are closer to the source material and not hallucinated from the LLM’s pretraining.
    • However, knowledge changes over time. If we ingested an HR doc, maybe the HR policies have changed over time. How do we update knowledge?
    • In most current chatbot implementations users will do a nuke/pave operation: The current chunks in the vector index will be deleted, and the documents will be reingested, chunked and vectorized.
    • This style of update has some pitfalls: The default chunking method counts tokens in the source text and stores a fixed number of them, usually with some overlap on each side. This is called token limit with overlap method. If you change your source text, you change your chunk boundaries, and can potentially impact your recall and precision of your vector and lexical search results.
    • Nuke and pave also results in some downtime in your chatbot, meaning it can’t answer questions while the index is being rebuilt. You can get around this with a Blue-Green environment setup or multiple indexes, but again you’re replacing the entire knowledge base!
    • An alternative to this is Transactional Knowledge. Imagine you store your chunks in a regular database, like MongoDB, and you build a simple CRUD application on top of them. You can edit and update the knowledge in real time, without having to nuke the entire index.
    • This also means you no longer need to schedule downtime for your chatbot, or coordinate all the business users and stakeholders to make all their updates in the same nightly window. Business users can be empowered to update the chatbot knowledge anytime it needs updating.
    • Transactional Knowledge is treating your chatbot’s knowledge like a content management system, which is a familiar paradigm for any business user who creates content right now.
    • Ditch the nuke and pave! Make your chatbot update in real-time!