LLM - Generative AI (Artificial Intelligence)
Bots (chatbots, callbots) can be administered through a fully integrated, proprietary stack thanks to the Dydu solution.
The underlying engine implements a mechanism that allows it to understand a wide variety of formulations of the same intent, in order to avoid misunderstandings as much as possible. However, this demands meticulous work on the knowledge structure and the matching groups used.
Dydu incorporates generative AI into its solution to maximize the bot's scope while minimizing the effort required on the knowledge base.
The LLM integrations already implemented or currently on the Dydu product roadmap act as a complement, not a replacement: Dydu technology and LLMs are complementary even though they take different approaches.
NLU (Natural Language Understanding) is a part of NLP (Natural Language Processing); it covers the understanding of the user's question.
In the Dydu knowledge base, Dydu NLU is reflected by the blue bubbles which define user intentions:
Matching groups are crucial to maximizing understanding, and the knowledge base administrator has to use them.
In order to enhance the bot's understanding, the Dydu engine will rely on an LLM with the following logic:
Dydu NLU is used first: if there is a direct match, a response is provided.
Otherwise the LLM is called: in case of a direct match via the LLM, a response is given; if not, a Dydu reformulation is attempted to obtain a response from Dydu NLU, if possible.
The Dydu NLU block corresponds to the analysis of the user's question. The analysis is based on a distance calculation over the formulations and matching groups. If the score is too low, the LLM is called upon to try to find the answer.
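As an illustration, here is a minimal sketch of this routing logic. All names (nlu_match, llm_match, reformulate) and the threshold value are hypothetical stand-ins, not the Dydu engine's actual API:

```python
from difflib import SequenceMatcher

# Toy knowledge base: formulation -> (technical id, BMS answer).
KB = {
    "how do i reset my password": ("kb-42", "Go to Settings > Security > Reset."),
    "what are the support hours": ("kb-43", "Support is open 9am to 6pm."),
}
ANSWER_BY_ID = {kid: ans for kid, ans in KB.values()}

NLU_THRESHOLD = 0.75  # assumed value; the real engine threshold is internal

def nlu_match(question: str):
    """Stand-in for Dydu NLU: a distance score over the formulations."""
    best_answer, best_score = None, 0.0
    for formulation, (_, ans) in KB.items():
        score = SequenceMatcher(None, question.lower(), formulation).ratio()
        if score > best_score:
            best_answer, best_score = ans, score
    return best_answer, best_score

def llm_match(question: str):
    """Stand-in for the RAG call: returns a knowledge id, or None."""
    return None  # a real implementation queries the embeddings index + LLM

def reformulate(question: str) -> str:
    """Stand-in for the Dydu reformulation step."""
    return question.replace("pwd", "password")

def answer(question: str):
    ans, score = nlu_match(question)                # 1. Dydu NLU first
    if score >= NLU_THRESHOLD:
        return ans                                  # direct match
    kid = llm_match(question)                       # 2. LLM fallback
    if kid is not None:
        return ANSWER_BY_ID[kid]                    # answer still comes from the BMS
    ans, score = nlu_match(reformulate(question))   # 3. reformulation retry
    return ans if score >= NLU_THRESHOLD else None  # None -> misunderstanding

print(answer("How do I reset my pwd?"))
```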
In this logic, the LLM relies on an index of vectors (embeddings) which is regularly regenerated from the contents of the Dydu knowledge base.
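A minimal sketch of such a regeneration step, with scikit-learn's TfidfVectorizer standing in for the real embedding engine (the knowledge rows and helper below are hypothetical):

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical knowledge rows: (technical id, question, answer).
KB_ROWS = [
    ("kb-42", "How do I reset my password?", "Go to Settings > Security."),
    ("kb-43", "What are the support hours?", "Support is open 9am to 6pm."),
]

def rebuild_index(rows):
    texts = [f"{question} {answer}" for _, question, answer in rows]
    vectorizer = TfidfVectorizer().fit(texts)  # stand-in embedding engine
    vectors = vectorizer.transform(texts)      # one vector per knowledge item
    ids = [kid for kid, _, _ in rows]
    return vectorizer, vectors, ids            # persisted together as the index

# Re-run on a schedule so the index follows knowledge-base edits.
vectorizer, vectors, ids = rebuild_index(KB_ROWS)
```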
The mechanism used is called RAG (Retrieval Augmented Generation). It consists of finding, in the index, the knowledge item(s) related to the question, along with their technical IDs.
A prompt is generated from these questions, answers and IDs, together with a series of instructions so that the model relies only on the data provided in the prompt. The model returns the ID of the closest knowledge item, or nothing if no knowledge matches.
Thus, the response formulated by the Dydu engine at the end of the chain is the response written in the Dydu BMS, not a text generated by the LLM. The point is to maximize understanding while avoiding any drift in the formulated responses, such as hallucinations.
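As a sketch, such an ID-matching prompt could look like the following; the prompt wording and the call_llm wrapper are assumptions, not the actual Dydu prompts:

```python
def build_prompt(question, candidates):
    lines = [
        "Using ONLY the knowledge items below, return the id of the item",
        "that answers the question, or the word NONE if no item matches.",
        "",
    ]
    for kid, q, a in candidates:  # candidates come from the vector index
        lines.append(f"[{kid}] Q: {q} A: {a}")
    lines += ["", f"Question: {question}"]
    return "\n".join(lines)

def match_knowledge_id(question, candidates, call_llm):
    raw = call_llm(build_prompt(question, candidates)).strip()
    return None if raw == "NONE" else raw  # the engine then answers from the BMS

# Demo with a fake model that always picks the first candidate.
fake_llm = lambda prompt: "kb-42"
candidates = [("kb-42", "How do I reset my password?", "Go to Settings > Security.")]
print(match_knowledge_id("Password reset?", candidates, fake_llm))  # -> kb-42
```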
Another use of generative AI in our chatbots is to index external documents in order to enrich the bot's scope of knowledge. The general principle consists of interrogating a language model in a given situation, typically when the bot fails to understand the user's question. This LLM relies on an index generated from the source documents.
The first step in implementing semantic indexing is determining the sources. We must define together which documents you want to make accessible to the bot.
Possible sources are:
files: PDF, MS Word (doc, docx), MS PowerPoint (ppt, pptx)
HTML pages: public site, intranet site, SharePoint site, etc.
Depending on the nature of the documents, a preliminary study can be carried out to verify that the indexing tool will work correctly on your documents. Indeed, a PDF document can contain text, but also images (e.g. scanned contracts, images in SVG format, etc.). This preliminary analysis makes it possible to adjust the content extraction tool to the specificities of your documents.
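For example, a quick preliminary check can measure how many pages of a PDF yield text directly, here using the pypdf library as one possible extraction tool (an assumption; the actual tooling is chosen during the study):

```python
from pypdf import PdfReader

def extractable_text_ratio(path: str) -> float:
    """Share of pages from which text can be extracted directly."""
    reader = PdfReader(path)
    pages_with_text = sum(
        1 for page in reader.pages if (page.extract_text() or "").strip()
    )
    return pages_with_text / max(len(reader.pages), 1)

# A low ratio (e.g. under 0.5) suggests scanned pages needing OCR before indexing.
# print(extractable_text_ratio("contract.pdf"))
```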
Documents can be hosted in different locations:
public sites
intranet sites
SharePoint sites
...
Depending on the case, service accounts may be required and the necessary permissions may need to be discussed with your security team. The tool must be able to access documents regularly to update the index.
The objective of semantic indexing is to enrich the knowledge base of the Dydu bot.
⚠️ We strongly advise against creating a bot without its own knowledge, where Dydu uses semantic indexing results exclusively: the results will be less controlled and the costs potentially higher depending on the model you use.
✅ A “hybrid” use is recommended: the Dydu knowledge base will be less substantial than it would be without indexing, and it will be able to handle the questions:
the most frequent
the most sensitive
requiring integration with your IS
requiring escalation to a human agent
You can choose to use the answers from semantic indexing in the following cases:
upon misunderstanding and/or reformulation by the Dydu bot
as a complement to a Dydu response, following an explicit request from the user
for specific questions/topics
according to user profile
...
When the LLM is asked to find answers in the document base, it uses the RAG (Retrieval Augmented Generation) mechanism to first find zero, one or more paragraphs (or chunks) related to the question asked.
A prompt is generated from the retrieved chunks, together with a series of instructions so that the model relies only on the data provided in the prompt. The response is then formulated by the language model from all these elements.
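A minimal sketch of this second kind of prompt, where the retrieved chunks become the context and the model writes the final answer (the prompt wording and helper functions are assumptions):

```python
def build_rag_prompt(question: str, chunks: list[str]) -> str:
    context = "\n\n".join(f"[chunk {i + 1}]\n{c}" for i, c in enumerate(chunks))
    return (
        "Answer using ONLY the excerpts below. If they do not contain "
        "the answer, say that you do not know.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )

def answer_from_documents(question, retrieve, call_llm):
    chunks = retrieve(question)  # zero, one or more chunks from the index
    if not chunks:
        return None              # nothing related was found
    return call_llm(build_rag_prompt(question, chunks))
```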
The diagram below describes the general logic:
The following diagram presents the case where the entire Dydu + LLM solution is hosted by Dydu:
The Dydu server contains the entire Dydu bot software stack. We can provide, upon request, detailed technical documentation of how a Dydu solution server is configured.
The objective of this architecture is to offer our customers a solution entirely hosted in France, by us, and compatible with the GDPR.
For LLM uses, we need another server, which is responsible for hosting several services:
The embedding engine: a language model responsible for vectorizing sentences.
The LLM (Llama2): the language model used through prompts, which generates the responses.
The Crawler: a tool developed by Dydu that browses external sites or Dydu knowledge bases to download data (files, web pages, Dydu knowledge bases, etc.). It is driven by configuration files giving it instructions for the documents to download and index (URLs, file types, credentials, index update frequency, ...); see the configuration sketch after this list.
The RAG: an API open only to the Dydu engine, which searches the index and generates a response with a call to the LLM.
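Purely as an illustration, the kind of instructions such a crawler configuration carries could look like this (keys and values are hypothetical; the real configuration format is not documented here):

```python
# Hypothetical crawler configuration: which sources to browse, what to
# download, how to authenticate, and how often to refresh the index.
CRAWLER_CONFIG = {
    "sources": [
        {"url": "https://intranet.example.com/docs", "auth": "service-account"},
        {"url": "https://www.example.com/faq"},
    ],
    "file_types": ["pdf", "doc", "docx", "ppt", "pptx", "html"],
    "update_frequency": "daily",  # how often downloaded data is re-indexed
}
```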
All requests made between the two Dydu servers are over HTTPS.
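For illustration, an engine-side call to the RAG API over HTTPS could resemble the following; the endpoint path, payload and response shape are assumptions, not the documented Dydu API:

```python
import requests

# Hypothetical internal endpoint exposed only to the Dydu engine.
resp = requests.post(
    "https://llm-server.internal/rag/search",
    json={"question": "How do I reset my password?", "top_k": 3},
    timeout=10,
)
resp.raise_for_status()
print(resp.json())  # e.g. the matched knowledge id, or a generated answer
```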
Our architecture is not tied to a particular LLM. It is entirely possible to use it with models hosted by third parties. This is the same architecture as before, but using the OpenAI embeddings engine and GPT:
This is also possible with Azure OpenAI, the third party in this case being Microsoft, which can offer guarantees on compatibility with the GDPR.
Many security questions must be addressed before starting a project using language models on your databases:
the crawler must be able to access your documents
-> you must open a network flow between our server and the sites hosting these documents
-> the crawler must have a service account to authenticate if necessary
the vectorization of documents is stored in our index
-> a vector alone is unusable, but with the embeddings engine that created it, the original text can be found; this implies that the texts of your documents are hosted on our server (see the sketch after this list).
It's important to question the level of confidentiality of the data in these documents and the personal nature of the contents.
in the case of an LLM hosted by a third party, all documents and all questions are transmitted to that third party, both for vectorization and for querying the LLM.
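A small sketch illustrating the point about vectors above: given the stored vectors and the same embedding engine, a nearest-neighbour search maps a vector back to its source text. TfidfVectorizer stands in for the real embedding engine:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

texts = ["Contract clause about termination.", "Salary grid for 2023."]
vectorizer = TfidfVectorizer().fit(texts)      # the embedding engine
index = vectorizer.transform(texts).toarray()  # vectors stored in our index

query_vector = index[1]                        # a stored vector "alone"
scores = index @ query_vector                  # similarity to each known text
print(texts[int(np.argmax(scores))])           # -> "Salary grid for 2023."
```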