dydu documentation

All rights reserved © 2023 dydu.


LLM - Generative AI (Artificial Intelligence)


Last updated 7 months ago


Integration of LLM (large language models) into the Dydu solution.

Thanks to the Dydu solution, the administration of bots (chatbots, callbots) relies on a fully integrated and proprietary stack.

The underlying engine implements a mechanism that allows it to understand a wide variety of formulations of the same intent, in order to avoid misunderstandings as much as possible. However, this demands meticulous work on the knowledge structure and on the matching groups used.

Dydu incorporates generative AI into its solution to maximize the bot's scope while minimizing the effort spent on the knowledge base.

The LLM integrations that have been implemented, or that are currently on the Dydu product roadmap, complement Dydu technology rather than replace it: the two have different approaches but work together.

In support of Natural Language Understanding (NLU)

NLU (Natural Language Understanding) is a part of NLP (Natural Language Processing). It covers the understanding of the user's question.

DYDU NLU

In the Dydu knowledge base, Dydu NLU is reflected by the blue bubbles which define user intentions:

Dydu NLU + LLM:

In order to enhance the bot's understanding, the Dydu engine will rely on an LLM with the following logic:

  • If Dydu NLU finds a direct match -> a response is provided.

  • Otherwise, the LLM is queried: in case of a direct match via the LLM, a response is given; if not, a Dydu reformulation is attempted in order to obtain a response from Dydu NLU, if possible.

The Dydu NLU block corresponds to the analysis of the user's question. The analysis is based on a distance calculation over the formulations and matching groups. If the score is too low, the LLM is called upon to try to find the answer.
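The fallback logic described above can be sketched as follows. This is a minimal illustration, not the actual Dydu engine API: the `nlu` and `llm` objects, their method names, and the score threshold are all hypothetical.

```python
# Sketch of the NLU-first, LLM-fallback matching flow.
# All names and the threshold are illustrative assumptions.

DIRECT_MATCH_THRESHOLD = 0.85  # hypothetical minimum score for a direct match

def answer(question, nlu, llm):
    """Try Dydu NLU first, then fall back to the LLM, then to reformulation."""
    match, score = nlu.best_match(question)
    if score >= DIRECT_MATCH_THRESHOLD:
        return match.answer                      # direct match via Dydu NLU

    knowledge_id = llm.find_knowledge_id(question)
    if knowledge_id is not None:
        return nlu.answer_for(knowledge_id)      # direct match via the LLM

    reworded = nlu.reformulate(question)         # Dydu reformulation attempt
    if reworded is not None:
        match, score = nlu.best_match(reworded)
        if score >= DIRECT_MATCH_THRESHOLD:
            return match.answer
    return None                                  # misunderstanding
```

Note that every path ends either in an answer written in the knowledge base or in an explicit misunderstanding, never in free-form generated text.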

LLM supply

In this logic, the LLM relies on an index of vectors (embeddings) which is regularly regenerated from the contents of the Dydu knowledge base.
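Such an index can be illustrated with a minimal sketch: a toy `embed` function stands in for the real embedding model, the index maps knowledge IDs to vectors regenerated from the knowledge base, and lookup is a cosine-similarity search. Everything here (names, the toy embedding, the threshold) is an illustrative assumption.

```python
import math

def embed(text):
    # Toy deterministic "embedding": a character-frequency vector.
    # A real embedding model would return a dense semantic vector.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def build_index(formulations):
    # knowledge_id -> embedding, regenerated whenever the base changes
    return {kid: embed(text) for kid, text in formulations.items()}

def closest(index, question, min_score=0.5):
    """Return the ID of the closest knowledge, or None below the threshold."""
    qv = embed(question)
    best_id, best = None, min_score
    for kid, vec in index.items():
        score = cosine(qv, vec)
        if score > best:
            best_id, best = kid, score
    return best_id
```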

Semantic indexing

Another use of generative AI in our chatbots is to index external documents to enrich the bot's scope of knowledge. The general principle consists of interrogating a language model in a given situation, typically when the bot fails to understand the user's question. This LLM is based on an index generated from source documents.

Identifying sources

The first step in implementing semantic indexing is determining the sources. We must define together which documents you want to make accessible to the bot.

Possible sources are:

  • files: PDF, MS Word (doc, docx), MS PowerPoint (ppt, pptx)

  • HTML pages: public site, intranet site, sharepoint site, etc.

Depending on the nature of the documents, a preliminary study can be carried out to verify that the indexing tool will work correctly on your documents. Indeed, a PDF document can contain text, but also images (e.g. scanned contracts, images in SVG format, etc.). The preliminary analysis will make it possible to adjust the content extraction tool to the specificities of your documents.
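A typical pre-flight check during that preliminary study is to flag pages whose extracted text is suspiciously short, which usually indicates scanned images rather than real text. The sketch below assumes the per-page texts have already been produced by your extraction tool of choice; the function name and threshold are illustrative.

```python
# Illustrative pre-flight check on extracted page texts.

def pages_needing_ocr(page_texts, min_chars=50):
    """Return the indices of pages whose text is too short,
    i.e. pages that are likely scans and may need OCR."""
    return [i for i, text in enumerate(page_texts)
            if len(text.strip()) < min_chars]
```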

Access to sources

Documents can be hosted in different locations:

  • public sites

  • intranet sites

  • sharepoint sites

  • ...

Depending on the case, service accounts may be required and the necessary permissions may need to be discussed with your security team. The tool must be able to access documents regularly to update the index.

Implementation of the LLM strategy

The objective of semantic indexing is to enrich the knowledge base of the Dydu bot. The Dydu knowledge base continues to handle the questions that are:

  • the most frequent

  • the most sensitive

  • requiring integration with your IS

  • requiring escalation to a human agent

You can choose to use the answers from semantic indexing in the following cases:

  • when the Dydu bot misunderstands and/or reformulates

  • as a complement to a Dydu response, following an explicit request from the user

  • for specific questions or themes

  • according to the user profile

  • ...
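The routing choices above can be sketched as a single decision function. This is a hedged illustration: the parameter names, the allowed themes, and the allowed profiles are all hypothetical examples, not Dydu configuration.

```python
# Illustrative routing: when may an answer come from semantic indexing?

def use_semantic_indexing(nlu_understood, explicit_request,
                          theme, user_profile,
                          allowed_themes=("technical docs",),
                          allowed_profiles=("employee",)):
    """Decide whether the answer may come from semantic indexing."""
    if not nlu_understood:                 # misunderstanding / reformulation
        return True
    if explicit_request:                   # user explicitly asked for more
        return True
    if theme in allowed_themes:            # specific questions or themes
        return True
    return user_profile in allowed_profiles
```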

Generate a response

The diagram below describes the general logic:
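The retrieve-then-generate chain can be sketched as follows. `retrieve` and `complete` are placeholders for the index search and the LLM call; the prompt wording is an illustrative assumption, not Dydu's actual prompt.

```python
# Sketch of retrieval-augmented response generation:
# retrieve chunks, build a constrained prompt, ask the model.

PROMPT_TEMPLATE = (
    "Answer using ONLY the context below. "
    "If the context is insufficient, answer 'I don't know'.\n\n"
    "Context:\n{context}\n\nQuestion: {question}\nAnswer:"
)

def generate_answer(question, retrieve, complete, top_k=3):
    chunks = retrieve(question, top_k)       # zero, one or more chunks
    if not chunks:
        return None                          # nothing relevant in the index
    prompt = PROMPT_TEMPLATE.format(
        context="\n---\n".join(chunks), question=question)
    return complete(prompt)                  # the LLM formulates the answer
```

The instructions embedded in the prompt are what constrain the model to the retrieved data and limit hallucinations.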

Architecture

Hosting of LLMs by Dydu

The following diagram presents the case where the entire Dydu + LLM solution is hosted by Dydu:

The Dydu server contains the entire Dydu bot software brick. We can provide, upon request, detailed and technical documentation of how a Dydu solution server is configured.

The objective of this architecture is to offer our customers a solution entirely hosted in France, by us, and compatible with the GDPR.

For LLM uses, we need another server, which is responsible for hosting several services:

  • The embedding engine: a language model responsible for vectorizing sentences.

  • The crawler: a tool developed by Dydu that browses external sites or Dydu knowledge bases to download data (files, web pages, Dydu knowledge bases, etc.). It is driven by configuration files giving it instructions for the documents to download and index (URLs, file types, credentials, index update frequency, ...).

  • The RAG: an API, open only to the Dydu engine, that searches the index and generates a response with a call to the LLM.

All requests between the two Dydu servers are made over HTTPS.
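To make the crawler's role concrete, here is a hypothetical shape for its configuration, expressed as a Python structure. The real Dydu crawler uses its own configuration files, so every key, value, and the URL here is purely illustrative.

```python
# Hypothetical crawler configuration: which sources to download and
# index, and how often. All keys and values are illustrative.

CRAWLER_CONFIG = {
    "sources": [
        {"url": "https://intranet.example.com/docs",   # example URL
         "file_types": ["pdf", "docx", "html"],
         "auth": "service-account"},                   # credentials, if required
    ],
    "index_update_frequency": "daily",                 # how often to re-crawl
}

def validate_config(config):
    """Minimal sanity check before launching a crawl."""
    assert config["sources"], "at least one source is required"
    for source in config["sources"]:
        assert source["url"].startswith("https://"), "HTTPS only"
    return True
```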

Hosting of LLMs by a third party

Our architecture is not tied to a particular LLM; it's entirely possible to use it with models hosted by third parties. The architecture below is the same as before, but uses the OpenAI embedding engine and GPT:

This is also possible with Azure OpenAI, the third party in this case being Microsoft, which can offer guarantees on compatibility with the GDPR.

Security and Privacy

Many security questions must be addressed before starting a project using language models on your databases:

  • the crawler must be able to access your documents

-> a network flow must be opened between our server and your sites hosting these documents

-> the crawler must have a service account to authenticate, if necessary

  • the vectorization of documents is stored in our index

-> a vector alone is unusable, but the embedding engine that created it can be used to recover the original text; this implies that the texts of your documents are hosted on our server.

It's important to assess the level of confidentiality of the data in these documents and the personal nature of their contents.

  • in the case of an LLM hosted by a third party, all documents and all questions are transmitted to this third party, both for vectorization and for querying the LLM.

Matching groups are crucial to maximize understanding, and the knowledge base administrator has to use them.

The mechanism used is called RAG (retrieval augmented generation). It consists of finding the knowledge item(s), with their technical IDs, related to the question in the index.

A prompt is generated from the questions, answers and IDs, along with a series of instructions so that the model relies only on the data provided in the prompt. It returns the ID of the closest knowledge item, or nothing if no knowledge matches.

Thus, the response formulated by the Dydu engine at the end of the chain is the one written in the Dydu BMS, not a text generated by the LLM. The benefit is to maximize understanding and avoid any drift in the formulated responses, such as hallucinations.

⚠️ We strongly advise against creating a bot without knowledge, where Dydu uses semantic indexing results exclusively: the results will be less controlled and the costs potentially higher, depending on the model you use.

✅ A "hybrid" use is recommended: the Dydu knowledge base will be less substantial than if it did not rely on indexing, and it will still be able to handle the most important questions (the most frequent, the most sensitive, etc.).

When the LLM is requested to find answers in the documentary database, it uses the RAG (retrieval augmented generation) mechanism to first find zero, one or more paragraphs (chunks) related to the question asked.

A prompt is generated from the chunks obtained, along with a series of instructions so that the model relies only on the data provided in the prompt. The response is then formulated by the language model from all these elements.

The LLM (Llama2): the language model, used through prompts, which generates the responses.