Sanchit Dilip Jain/Amazon Bedrock Knowledge Bases - Overview

Amazon Bedrock Knowledge Bases - Overview

Introduction

What is Amazon Bedrock?

Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies like Stability AI, Anthropic, and Meta via a single API. It also provides the broad capabilities needed to build Generative AI applications with security, privacy, and responsible AI.
Since Amazon Bedrock is serverless, you don’t have to manage any infrastructure, and you can securely integrate and deploy Generative AI capabilities into your applications using the AWS services you already know.

What are embeddings in a RAG workflow?

An embedding is a way of representing documents as vectors in a high-dimensional space. These vectors capture the essence of the document’s content in a form that machines can process. By converting text into embeddings, we enable the computer to ‘understand’ and compare different pieces of text based on their contextual similarities.
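
To make "contextual similarity" concrete, here is a minimal sketch of how two embeddings can be compared using cosine similarity. The three-dimensional vectors are made up for illustration; real embedding models return vectors with hundreds or thousands of dimensions, and the variable names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, invented for this example only.
vpc_doc = [0.9, 0.1, 0.0]
subnet_doc = [0.8, 0.2, 0.1]
billing_doc = [0.0, 0.2, 0.9]

# Documents about related topics should score closer to 1.0.
print("vpc vs subnet: ", cosine_similarity(vpc_doc, subnet_doc))
print("vpc vs billing:", cosine_similarity(vpc_doc, billing_doc))
```

A score near 1.0 means the vectors point in almost the same direction, which is how two pieces of text are judged to be similar.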

Vector Databases: Organizing and Accessing Embeddings

A vector database allows us to store and query embeddings, facilitating quick and relevant retrieval of documents based on their vector representations. In essence, it acts as a bridge between the raw data and the actionable insights we seek from our language models.
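
The idea of storing and querying embeddings can be sketched with a toy in-memory store. This is only an illustration of the concept: `ToyVectorStore` is a hypothetical class, the vectors are invented, and it scans every item linearly, whereas production vector databases typically rely on approximate nearest-neighbor indexes.

```python
import math


class ToyVectorStore:
    """In-memory stand-in for a vector database (illustration only)."""

    def __init__(self):
        self._items = []  # list of (doc_id, unit-length vector)

    @staticmethod
    def _normalize(vector):
        norm = math.sqrt(sum(x * x for x in vector))
        return [x / norm for x in vector]

    def add(self, doc_id, vector):
        """Store a document ID together with its normalized embedding."""
        self._items.append((doc_id, self._normalize(vector)))

    def query(self, vector, top_k=2):
        """Return the top_k (doc_id, score) pairs by cosine similarity."""
        q = self._normalize(vector)
        scored = [
            (doc_id, sum(a * b for a, b in zip(q, v)))
            for doc_id, v in self._items
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]


store = ToyVectorStore()
store.add("vpc.txt", [0.9, 0.1, 0.0])
store.add("subnets.txt", [0.8, 0.2, 0.1])
store.add("billing.txt", [0.0, 0.2, 0.9])

results = store.query([1.0, 0.0, 0.0], top_k=2)
print([doc_id for doc_id, _ in results])  # ['vpc.txt', 'subnets.txt']
```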

What is Amazon Bedrock Knowledge Bases?

Knowledge Bases for Amazon Bedrock provides a managed Retrieval Augmented Generation (RAG) service to query uploaded data. By pointing to the location of the data in Amazon S3, the service automatically fetches the documents, divides them into blocks of text, converts the text into embeddings, and stores the embeddings in a vector database. There is also an API that allows us to build applications with the Knowledge Base.
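
The "divides them into blocks of text" step is often called chunking. The sketch below shows one simple approach, fixed-size character windows with overlap. It is an assumption made for illustration, not the service's actual implementation; the function name and default sizes are hypothetical.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size character windows that overlap."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "Amazon VPC lets you launch resources in an isolated virtual network. " * 5
for index, chunk in enumerate(chunk_text(sample, chunk_size=120, overlap=20)):
    print(index, len(chunk))
```

The overlap keeps a sentence that straddles a boundary at least partly present in both neighboring chunks, so it can still be matched at query time.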

Demo

In this demo, we will create a Knowledge Base with a subset of the AWS Well-Architected Framework.
- Prerequisite
  - Download well_arch_text_sample.csv and upload it to an S3 bucket in the same AWS Region you will use for the demo. For this demo, we will use a bucket with bedrockdemoapp in its name.
- Creating a Knowledge Base in the AWS Console
  - To begin, navigate to the Knowledge Base Console
  - Select the orange Create Knowledge base button.
  - You can keep the default name or enter your own. Then, select “Next” at the bottom right of the screen.
  - Select the Browse S3 button and the bucket with bedrockdemoapp in its name. Then press Next.
  - Select the Titan Embeddings model and leave the default selection for the Vector store. Then, select “Next.”
  - On the next screen, scroll down and select “Create Knowledge Base.”
- Querying a Knowledge Base: When your Knowledge Base is ready, you can test it in the console.
  - Click the Sync button to start the data sync.
  - Click the Select Model button and choose Claude 3 Sonnet, then press Apply.
  - From here, you can enter questions in the chat window where it says Enter your message here. For example, we can ask, “Can you explain what a VPC is?”
  - Click Run. The model responds, and you can view the sources retrieved from the Knowledge Base by selecting Show result details.
- Using the Knowledge Base API: You can also query the Knowledge Base through the API. There are two supported methods:
  - retrieve: Returns documents related to a query
  - retrieve_and_generate: Performs the full RAG workflow, retrieving the related documents and passing them to the model to generate a response.
  - To try them out:
    - Head back to your IDE and open kb_rag.py.
    - Update KB_ID with the ID of your Knowledge Base. You can find it in the Knowledge base overview section of the Knowledge Base you created.
    - Run the code with python3 kb_rag.py.
    - Try changing QUERY on line 4 to see what type of responses you get. The code performs the RAG workflow: the query is converted into an embedding, the most relevant documents are retrieved, and the model uses them to generate a response.
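
For reference, the two API methods can be called roughly as follows with boto3's bedrock-agent-runtime client. This is a minimal sketch and not the contents of kb_rag.py: the helper names, the REGION value, and the placeholder KB_ID are assumptions, and the model ARN assumes the Claude 3 Sonnet model chosen in the console step.

```python
# Placeholders and assumptions: replace with your own values.
KB_ID = "YOUR_KB_ID"
REGION = "us-east-1"  # assumption: use the Region that hosts your Knowledge Base
MODEL_ARN = (
    f"arn:aws:bedrock:{REGION}::foundation-model/"
    "anthropic.claude-3-sonnet-20240229-v1:0"
)
QUERY = "Can you explain what a VPC is?"


def retrieve_documents(client, kb_id, query, top_k=3):
    """Call retrieve and return only the text of each matching chunk."""
    response = client.retrieve(
        knowledgeBaseId=kb_id,
        retrievalQuery={"text": query},
        retrievalConfiguration={
            "vectorSearchConfiguration": {"numberOfResults": top_k}
        },
    )
    return [item["content"]["text"] for item in response["retrievalResults"]]


def retrieve_and_generate(client, kb_id, query, model_arn):
    """Call retrieve_and_generate and return the generated answer text."""
    response = client.retrieve_and_generate(
        input={"text": query},
        retrieveAndGenerateConfiguration={
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": kb_id,
                "modelArn": model_arn,
            },
        },
    )
    return response["output"]["text"]


def main():
    # Imported here so the helpers above can be exercised without AWS access.
    import boto3

    client = boto3.client("bedrock-agent-runtime", region_name=REGION)
    for text in retrieve_documents(client, KB_ID, QUERY):
        print("SOURCE:", text[:100])
    print("ANSWER:", retrieve_and_generate(client, KB_ID, QUERY, MODEL_ARN))


# Skip the AWS calls until the KB_ID placeholder has been replaced.
if __name__ == "__main__" and KB_ID != "YOUR_KB_ID":
    main()
```

Passing the client in as an argument keeps the helpers easy to test with a stub. Running this against a real Knowledge Base requires AWS credentials with permission to call both operations.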

Resources

Visit this page to find the latest documentation.