External Contents
All rights reserved @ 2023 dydu.
As a Dydu bot manager, you can centralize and organize your external content sources from the BMS interface. The bot can then generate instant responses based on these sources, which improves the quality of the answers given to end users. To open the External Content page, use the BMS navigation menu: Content > External Content.
You will arrive on the RAG Edition page, where you can create a new collection.
In the "Your Collections" modal, enter a name for the collection you want to create, then click on "Create."
A page for your collection is displayed as shown below:
On this page, you have the ability to:
It is possible to import multiple documents of the following types: PDF, DOCX, PPTX, TXT.
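Before uploading a large batch, it can help to check locally which files match the accepted types. The snippet below is a minimal sketch that only looks at file extensions; the function name split_by_support is hypothetical and is not part of the BMS.

```python
from pathlib import Path

# File types listed as importable in this documentation.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".pptx", ".txt"}


def split_by_support(paths):
    """Return (importable, rejected) lists based on file extension only."""
    importable, rejected = [], []
    for p in paths:
        # Compare case-insensitively so "guide.PDF" is accepted too.
        if Path(p).suffix.lower() in SUPPORTED_EXTENSIONS:
            importable.append(p)
        else:
            rejected.append(p)
    return importable, rejected


ok, bad = split_by_support(["guide.PDF", "slides.pptx", "data.xlsx"])
print("importable:", ok)
print("rejected:", bad)
```

The check is by extension only, so a renamed file would still pass; the BMS remains the authority on what it actually accepts.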
The Dydu SharePoint reader enables the indexing of pages and files.
Please refer to this documentation.
You need to register a new application with read permissions in your tenant. The tutorial below explains the process: https://learn.microsoft.com/en-us/azure/healthcare-apis/register-application
When you reach the "API permissions" step, the two necessary permissions are:
Microsoft Graph -> Application Permissions -> Files.Read.All (Grant Admin Consent)
Microsoft Graph -> Application Permissions -> BrowserSiteLists.Read.All (Grant Admin Consent)
Set the permissions Files.Read.All and BrowserSiteLists.Read.All for the Dydu application.
The required elements for the configuration are:
a. Client ID
b. Client secret (the value)
c. Tenant ID
d. SharePoint site ID
Go to the Azure portal:
Click on App registrations
Click on New registration
Enter a name and click on "Register"
The Application (client) ID shown for the new application is the client_id
Click on Certificates & secrets. Then, in the "Client secrets" tab, click on New client secret.
Copy the generated secret value (client_secret)
Click on API permissions. Then click on Add a permission.
Click on Microsoft Graph
Then click on "Application permissions". Then add the Sites.Selected and Files.Read.All permissions.
Click on Grant admin consent for XXXX
To find the tenant ID:
Go to the website: https://entra.microsoft.com/
Click on "Overview":
The tenant ID is displayed on this page. It is a different value from the client ID obtained during app registration.
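Once the client ID, client secret, and tenant ID are known, one way to check the registration outside the BMS is to request a token through the Microsoft identity platform client-credentials flow. The sketch below only builds the request and does not send it; the helper name build_token_request and the placeholder values are assumptions for illustration, and it does not describe how Dydu itself authenticates.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(tenant_id, client_id, client_secret):
    """Build (without sending) an OAuth 2.0 client-credentials token request."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        # ".default" asks for the application permissions already granted
        # to the app registration (after admin consent).
        "scope": "https://graph.microsoft.com/.default",
    }).encode("ascii")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Placeholder values: replace them with the ones collected in the steps above.
req = build_token_request("<tenant-id>", "<client-id>", "<client-secret>")
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would send it; a JSON response containing an access_token field indicates that the three values are consistent.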
To find the SharePoint ID:
Compose the following URL: https://<tenant>.sharepoint.com/sites/<site-url>/_api/site/id
The SharePoint ID appears in the response returned by this URL.
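The URL pattern above can be composed programmatically, and the ID can be pulled out of the response text. This is a minimal sketch: the function names are hypothetical, the sample response body is made up for illustration, and the extraction simply looks for the first GUID-shaped string rather than assuming a particular response format.

```python
import re

# Matches a GUID such as 01234567-89ab-cdef-0123-456789abcdef.
GUID_RE = re.compile(r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}")


def site_id_url(tenant, site_url):
    """Compose the URL that returns the ID of a SharePoint site."""
    return f"https://{tenant}.sharepoint.com/sites/{site_url}/_api/site/id"


def extract_site_id(response_text):
    """Return the first GUID found in the response text, or None."""
    match = GUID_RE.search(response_text)
    return match.group(0) if match else None


# "contoso" and "hr" are example values, not real identifiers.
print(site_id_url("contoso", "hr"))

# Made-up response body, used only to show the extraction step.
sample = "<d:Id>01234567-89ab-cdef-0123-456789abcdef</d:Id>"
print(extract_site_id(sample))
```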
Features:
1. Indexing pages and/or files from an entire SharePoint site:
Standard RAG
Displaying the original SharePoint URL in the result provided by the RAG
Possible integration with SAML authentication:
A user must be authenticated via SAML
We retrieve their group memberships
Document permission filtering is possible: access to a subset depending on the access rights.
Not indexed:
"Embedded" files in pages
Videos, and some other types (Excel, WMF, ...)
Currently, retrieving and indexing documents takes several minutes, and sources can be refreshed at most once a day.
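The permission filtering mentioned in the features above (a SAML-authenticated user only reaches the subset of documents their groups allow) can be pictured with the sketch below. It illustrates the idea only: the IndexedDocument type, its fields, and the rule that a document with no allowed group is hidden are assumptions, not Dydu's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class IndexedDocument:
    """Hypothetical indexed item: its SharePoint URL and the groups allowed to read it."""
    url: str
    allowed_groups: frozenset = field(default_factory=frozenset)


def visible_documents(documents, user_groups):
    """Keep only documents that share at least one group with the user."""
    groups = set(user_groups)
    # Assumption: a document with no allowed group is visible to nobody.
    return [doc for doc in documents if doc.allowed_groups & groups]


docs = [
    IndexedDocument("https://example.invalid/hr-policy", frozenset({"hr"})),
    IndexedDocument("https://example.invalid/it-runbook", frozenset({"it"})),
]
# Groups as they might be retrieved from the SAML session (example values).
for doc in visible_documents(docs, ["hr", "sales"]):
    print(doc.url)
```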
Type of Websites:
Domain
Sitemap
Specific URLs
The information about each source added to your collection is displayed as follows:
Name: the name of your source that you added
Added by: the bot manager's identifier
Created at: the date when you added your source
Status: the status of your source
There are four possible statuses:
"Waiting for action": status when no action has been taken.
"Completed": status when the operation (indexation or suggestion) is successful.
"Completed with errors": status when the operation (indexation or suggestion) was completed, but there are errors.
"Processing": status when the operation is ongoing.
Action: the actions you can perform on your added source (delete, edit).
Suggest knowledge from the collection
Indexation: index the content of the collection