Embeddings

Last edit: May 19, 2024

What is an Embedding?

In artificial intelligence (AI), machine learning (ML), and natural language processing (NLP), an embedding is a representation of data, often high-dimensional and complex, as a vector in a lower-dimensional space, which makes the data easier to analyze and process.

This transformation preserves the essential relationships and structure of the original data. For example, in NLP, word embedding models such as Word2Vec or GloVe convert words into numerical vectors so that semantically similar words map to nearby vectors, which lets algorithms compare and manipulate language numerically.
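To make "semantically similar words map to nearby vectors" concrete, the sketch below compares vectors with cosine similarity, one common closeness measure for embeddings. The three-dimensional vectors and the `toy_embeddings` name are invented for illustration; real models such as Word2Vec or GloVe produce vectors with hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical 3-dimensional vectors, made up for this example only.
toy_embeddings = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.85, 0.9, 0.15],
    "banana": [0.1, 0.05, 0.95],
}

king_queen = cosine_similarity(toy_embeddings["king"], toy_embeddings["queen"])
king_banana = cosine_similarity(toy_embeddings["king"], toy_embeddings["banana"])

print(f"king vs queen:  {king_queen:.3f}")
print(f"king vs banana: {king_banana:.3f}")
```

With these toy values, "king" scores much closer to "queen" than to "banana", which is the property that search and recommendation features rely on.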

platformOS supports embeddings via pgvector, a PostgreSQL extension for storing and searching vectors.

Embeddings use cases

Common use cases for embeddings include recommendation systems (e.g., suggesting products based on user preferences), search engines (improving the relevance of search results), and NLP tasks such as sentiment analysis, translation, and text summarization.

How can I work with embeddings in platformOS?

Embeddings are generated by machine learning models that learn to map data into a continuous vector space. OpenAI is one widely used provider of such models. We have created an OpenAI module that lets you call the OpenAI Embeddings API. Depending on the chosen model, the API returns an embedding (a vector) of a specific length. You can then, for example, persist the returned embeddings with the embedding_create_rc GraphQL mutation and search for relevant embeddings with the embeddings_rc GraphQL query.
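As a rough illustration of what an Embeddings API exchange looks like, the sketch below builds a request body for the public OpenAI endpoint and pulls the vector out of a response. It shows the underlying HTTP API rather than the platformOS OpenAI module, whose interface is not covered here. The helper names `build_embedding_request` and `extract_embedding` are hypothetical, and `sample_response` is a shortened, made-up payload, not real model output.

```python
import json

# Public OpenAI endpoint; a real call also needs an "Authorization: Bearer <key>" header.
OPENAI_EMBEDDINGS_URL = "https://api.openai.com/v1/embeddings"


def build_embedding_request(text, model="text-embedding-ada-002"):
    """Return the JSON body for a single-input embeddings request."""
    return json.dumps({"model": model, "input": text})


def extract_embedding(response_body):
    """Return the first embedding vector from an Embeddings API response."""
    payload = json.loads(response_body)
    return payload["data"][0]["embedding"]


# Made-up response, truncated to 3 values; a real ada-002 vector has 1536 entries.
sample_response = json.dumps({
    "object": "list",
    "data": [{"object": "embedding", "index": 0, "embedding": [0.01, -0.02, 0.03]}],
    "model": "text-embedding-ada-002",
})

request_body = build_embedding_request("How do I reset my password?")
vector = extract_embedding(sample_response)
print(request_body)
print(len(vector), "values extracted")
```

The extracted list of floats is the value you would then store and later search against.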

Note

You do not have to use OpenAI to create embeddings; you can use any provider you like. By default, we expect an embedding to be a vector of length 1536 (which is compatible with, for example, the text-embedding-ada-002 model). However, if you require embeddings of a different length, contact us and we can adjust the settings in your dedicated stack.
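Because the default length is fixed, it can help to check a vector from any provider before trying to persist it. The following is a minimal sketch of such a check; `validate_embedding` and `EXPECTED_DIMENSIONS` are hypothetical names, and the value 1536 should be changed if your stack is configured for a different length.

```python
import math

# Default vector length stated above; adjust if your stack uses another value.
EXPECTED_DIMENSIONS = 1536


def validate_embedding(vector, expected=EXPECTED_DIMENSIONS):
    """Return the vector as a list of floats, or raise if it is not storable."""
    if len(vector) != expected:
        raise ValueError(
            f"expected {expected} dimensions, got {len(vector)}"
        )
    cleaned = []
    for value in vector:
        # Reject booleans explicitly: bool is a subclass of int in Python.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"non-numeric component: {value!r}")
        if not math.isfinite(value):
            raise ValueError(f"non-finite component: {value!r}")
        cleaned.append(float(value))
    return cleaned


# Placeholder vector of the right length, used only to exercise the check.
placeholder = [0.0] * EXPECTED_DIMENSIONS
checked = validate_embedding(placeholder)
print(len(checked), "dimensions accepted")
```

Running a check like this on the client side surfaces a model or configuration mismatch early, before the vector is sent for storage.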

Do you have any examples of using Embeddings in platformOS?

As a showcase, we have developed a search based on embeddings. You can check the code for the search page and the code for embedding generation.

We leverage the same technique in DocsKit.

Questions?

We are always happy to help with any questions you may have.

Contact us