An Idea (by Ingenious Piece)

No Matter What People Tell You, Words And Ideas Can Change The World.

Embedding: Types, Use cases and Evaluation (Part 3 of RAG Series)

Making computers understand Text

Chandan Durgia
7 min read · Feb 1, 2024

Photo by Nick Hillier on Unsplash

This is part 3 of the “Retrieval-Augmented Generation (RAG): Basics to Advanced” series. Links to the other posts in the series are at the bottom of this post. Building on part 1 (RAG Basics) and part 2 (Chunking), this post focuses on the “Embedding” component, which applies both to the chunks of the source content and to the query (highlighted in blue in the figure below). Since the concept is fundamentally the same in both cases, we will cover them together.

Image by Author

What is Embedding?

In the last post (Chunking), we discussed how to break the source content (S) and the query into small chunks using various chunking strategies.

Computers operate on numbers, so these chunks of text have to be encoded into a numerical (mathematical) form that a computer can read and process. We would also expect these numbers to preserve the relationships between each word/chunk and the other words/chunks.

This conversion of chunks of text into a mathematical form is called embedding.

Since models use these numbers for computation, the numerical representation should be meaningful and structured. To enable this, embedding techniques rest on a few core principles:

  • Ensure that the numbers retain the contextual understanding of the text
  • Ensure that the numbers retain the semantic and syntactic properties of the text
  • Ensure that the numbers retain the linear relationships between words
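
To make the third principle concrete, here is a minimal sketch of what a "linear relationship between words" can look like. The two-dimensional vectors are hand-picked for illustration (they do not come from a trained model), and the `analogy` helper and the dimension meanings are assumptions made for this example.

```python
import math

# Hand-picked toy vectors (illustrative only, not from a trained model).
# Dimension 0 loosely stands for "royalty", dimension 1 for "gender".
toy_embeddings = {
    "king": (0.9, 0.8),
    "queen": (0.9, -0.8),
    "man": (0.1, 0.9),
    "woman": (0.1, -0.9),
    "apple": (-0.7, 0.0),
}

def analogy(a, b, c, embeddings):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding a, b and c."""
    target = tuple(
        x - y + z
        for x, y, z in zip(embeddings[a], embeddings[b], embeddings[c])
    )
    # Exclude the input words so the answer is a different word.
    candidates = {w: v for w, v in embeddings.items() if w not in (a, b, c)}
    # Pick the candidate with the smallest Euclidean distance to the target.
    return min(candidates, key=lambda w: math.dist(candidates[w], target))

result = analogy("king", "man", "woman", toy_embeddings)
print(result)
```

With these toy values, subtracting "man" from "king" and adding "woman" lands closest to "queen". Real embedding models have hundreds of dimensions with no such neat labels, but the same kind of vector arithmetic is what the principle refers to.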

To adhere to these principles, embedding algorithms need to do more than assign a single numerical value to each word/chunk. They need a richer representation, which is achieved through high-dimensional vectors.
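
A small sketch can show why a single number per word is not enough. The integer IDs and three-dimensional vectors below are invented for illustration, and cosine similarity is used as one common way to compare vectors (an assumption here, not something this post has prescribed).

```python
import math

# Option 1: one arbitrary integer per word (hypothetical IDs).
word_ids = {"cat": 1, "dog": 2, "car": 3}

# Option 2: a small vector per word (hand-picked toy values, not from a real model).
word_vectors = {
    "cat": (0.9, 0.8, 0.1),
    "dog": (0.8, 0.9, 0.2),
    "car": (0.1, 0.2, 0.9),
}

def cosine_similarity(u, v):
    """Cosine of the angle between u and v: values near 1.0 mean similar direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# With IDs, the gaps only reflect the order in which IDs were assigned.
id_gap_cat_dog = abs(word_ids["cat"] - word_ids["dog"])
id_gap_dog_car = abs(word_ids["dog"] - word_ids["car"])

# With vectors, related words can be placed close together.
sim_cat_dog = cosine_similarity(word_vectors["cat"], word_vectors["dog"])
sim_cat_car = cosine_similarity(word_vectors["cat"], word_vectors["car"])

print(f"ID gap cat-dog: {id_gap_cat_dog}, dog-car: {id_gap_dog_car}")
print(f"cosine cat-dog: {sim_cat_dog:.2f}, cat-car: {sim_cat_car:.2f}")
```

The integer IDs make "dog" exactly as far from "car" as it is from "cat", which carries no meaning. The vectors have room to place "cat" and "dog" close together and "car" further away.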

The illustration below shows how embedding works in real-world applications. As mentioned, embedding converts words/chunks/sentences into vectors, and the values of these vectors are such that the…
