In this lab, you will learn how to configure data sources using the Web crawler, Upload files, and Amazon S3 connectors to build knowledge bases, and then interact with an Amazon Q Business application to retrieve tailored answers to specific questions.
-
Click Select retriever. Under Retrievers select Native (Recommended). For Index provisioning, choose Enterprise with Number of units set to 1 and then click Confirm.
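For reference, the same retriever and index choices can be sketched with the AWS SDK for Python. This is a minimal sketch, not part of the lab: it assumes you pass in a client created with boto3.client("qbusiness") and the ID of your application, and the display names are hypothetical. Index creation is asynchronous, so the index may need to become active before the retriever call succeeds.

```python
def provision_native_retriever(qbusiness, application_id, index_name="workshop-index"):
    """Create an Enterprise index with 1 unit and attach a native retriever.

    `qbusiness` is assumed to be a boto3 "qbusiness" client.
    `index_name` is a hypothetical display name chosen for this sketch.
    """
    # Index provisioning: Enterprise, Number of units = 1
    index = qbusiness.create_index(
        applicationId=application_id,
        displayName=index_name,
        type="ENTERPRISE",
        capacityConfiguration={"units": 1},
    )

    # Retriever: Native, backed by the index created above
    retriever = qbusiness.create_retriever(
        applicationId=application_id,
        type="NATIVE_INDEX",
        displayName=f"{index_name}-retriever",
        configuration={"nativeIndexConfiguration": {"indexId": index["indexId"]}},
    )
    return index["indexId"], retriever["retrieverId"]
```

The function returns the index ID and retriever ID, which later API calls for data sources and sync jobs would need.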
-
First, we are going to configure a Web crawler data source.
Select Web crawler. For Data source name, enter webcrawler.
Select Source URLs and enter the following URL:
https://docs.aws.amazon.com/prescriptive-guidance/latest/retrieval-augmented-generation-options/choosing-option.html
In the Authentication and Web proxy - optional sections, leave the settings as default.
-
For IAM role, select the role whose name contains -DataSourceRole-.
In the Sync scope section, set Sync range to Sync domains with subdomains only, Crawl depth to 0, and Maximum links per page to 5.
-
For Sync mode, select Full sync.
For Sync run schedule, select Run on demand from the dropdown menu and click Add data source.
-
After the data source is created, click Sync now to start the sync.
It will take around 10-20 minutes to finish.
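Instead of watching the console, the sync can also be started and monitored from a script. The following is a minimal sketch, assuming a boto3 "qbusiness" client and the IDs of your application, index, and data source; the polling interval, timeout, and the set of statuses treated as final are choices made for this sketch.

```python
import time

# Statuses this sketch treats as final; other values (for example SYNCING)
# mean the job is still in progress.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "INCOMPLETE", "ABORTED"}


def sync_and_wait(qbusiness, application_id, index_id, data_source_id,
                  poll_seconds=60, timeout_seconds=1800, sleep=time.sleep):
    """Start a data source sync job and poll until it reaches a final status."""
    ids = {
        "applicationId": application_id,
        "indexId": index_id,
        "dataSourceId": data_source_id,
    }
    execution_id = qbusiness.start_data_source_sync_job(**ids)["executionId"]

    waited = 0
    while waited <= timeout_seconds:
        # Assumes the job we just started appears on the first page of history.
        history = qbusiness.list_data_source_sync_jobs(**ids)["history"]
        status = next(
            (job["status"] for job in history if job.get("executionId") == execution_id),
            None,
        )
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
        waited += poll_seconds
    raise TimeoutError(f"Sync job {execution_id} did not finish in {timeout_seconds} seconds")
```

The default timeout of 30 minutes leaves some headroom over the 10-20 minutes the sync usually takes.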
-
Next, we are going to configure an Upload files data source.
First, download bankingFAQ.docx and save it locally.
Go back to the Data sources page, and click Add data source.
This time, select Upload files. Click Choose files, pick bankingFAQ.docx, and then click Upload.
-
Once the upload finishes, navigate to the application details page. Under the Index tab, check that the document count is 2.
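The document count shown on the Index tab can also be read programmatically. A minimal sketch, assuming a boto3 "qbusiness" client and the IDs of your application and index:

```python
def indexed_document_count(qbusiness, application_id, index_id):
    """Return the number of text documents currently indexed."""
    index = qbusiness.get_index(applicationId=application_id, indexId=index_id)
    return index["indexStatistics"]["textDocumentStatistics"]["indexedTextDocumentCount"]
```

At this point in the lab the function would be expected to return 2: one page from the web crawler and one uploaded file. The count can lag briefly behind a finished sync or upload.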
-
The last data source we are going to configure is an Amazon S3 data source.
Before adding the data source, we are going to upload files into an S3 bucket.
You can find the S3 URL in the CloudFormation stack that you provisioned during the infrastructure setup.
Download the files below and upload them to the S3 bucket.
Click Upload and wait until the upload succeeds.
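The lookup and upload can also be scripted. The following is a minimal sketch, assuming boto3 "cloudformation" and "s3" clients. The stack name and output key are hypothetical, and the sketch assumes the stack output holds the bucket name; check the names your stack actually uses.

```python
from pathlib import Path


def find_stack_output(cloudformation, stack_name, output_key):
    """Return the value of one output of a CloudFormation stack."""
    stack = cloudformation.describe_stacks(StackName=stack_name)["Stacks"][0]
    for output in stack.get("Outputs", []):
        if output["OutputKey"] == output_key:
            return output["OutputValue"]
    raise KeyError(f"Stack {stack_name} has no output named {output_key}")


def upload_files(s3, bucket, paths):
    """Upload local files to the bucket root, using each file name as the object key."""
    keys = []
    for path in map(Path, paths):
        s3.upload_file(str(path), bucket, path.name)
        keys.append(path.name)
    return keys
```

For example, find_stack_output(cfn, "my-workshop-stack", "S3BucketName") followed by upload_files(s3, bucket, downloaded_paths), where both names in the first call are placeholders.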
-
Go back to the Data sources page, and click Add data source.
Then select Amazon S3. For Data source name, enter restaurant-s3.
For IAM role, select the role whose name contains -DataSourceRole-.
-
For Sync scope, in the field Enter the data source location, enter the name of the S3 bucket to which you just uploaded the files.
Under Sync mode, select Full sync.
NOTE
The Amazon S3 connector supports Amazon Q Business access control through an ACL file specified in the data source configuration. The ACL file specifies which users and groups are allowed or denied access to particular documents or to an S3 prefix (folder). See the AWS documentation for details.
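To give a sense of what such an ACL file might contain, here is a small sketch that builds one as JSON. The field names (keyPrefix, aclEntries, Name, Type, Access) reflect our understanding of the S3 connector's ACL format and should be verified against the AWS documentation before use; the bucket name, prefix, user, and group are hypothetical. This lab does not require an ACL file.

```python
import json


def build_acl(bucket, rules):
    """Build ACL entries from a mapping of S3 prefix -> list of (name, type, access).

    type is "USER" or "GROUP"; access is "ALLOW" or "DENY".
    The JSON layout is an assumption to verify against the AWS documentation.
    """
    return [
        {
            "keyPrefix": f"s3://{bucket}/{prefix}",
            "aclEntries": [
                {"Name": name, "Type": kind, "Access": access}
                for name, kind, access in entries
            ],
        }
        for prefix, entries in rules.items()
    ]


# Hypothetical example: one user allowed and one group denied on a folder.
acl = build_acl(
    "example-bucket",
    {"menus/": [("alice@example.com", "USER", "ALLOW"), ("contractors", "GROUP", "DENY")]},
)
acl_json = json.dumps(acl, indent=2)
```

The resulting acl_json string would be saved as a file in the bucket and referenced from the data source configuration.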
-
For Sync run schedule, select Run on demand, and then click Add data source.
-
After the data source is created successfully, click Sync now to start the sync, which takes around 5-10 minutes.
-
Once complete, go to the Data sources page. Now you have 3 data sources configured.
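The list of configured data sources can also be retrieved with the SDK. A minimal sketch, assuming a boto3 "qbusiness" client and the IDs of your application and index. Whether the Upload files source is returned by this call alongside the connector-based sources is not confirmed here.

```python
def data_source_names(qbusiness, application_id, index_id):
    """Return the display names of all data sources attached to an index."""
    names = []
    kwargs = {"applicationId": application_id, "indexId": index_id}
    while True:
        page = qbusiness.list_data_sources(**kwargs)
        names.extend(ds["displayName"] for ds in page.get("dataSources", []))
        token = page.get("nextToken")
        if not token:
            return names
        # Follow pagination until no further token is returned.
        kwargs["nextToken"] = token
```

After this lab, the returned names should include webcrawler and restaurant-s3.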
In the next part, we will chat with our knowledge bases through the Amazon Q Business application.