Edge SQL Vector Search
Vector Search is an Azion Edge SQL feature that enables customers to implement semantic search engines. While traditional search models aim to find exact matches, such as keyword matches, vector search models use specialized algorithms to identify similar items based on their mathematical representations, or vector embeddings.
By using Vector Search, you can implement various use cases:
- Enhancing search systems and offering personalized recommendations by finding items with similar characteristics or based on users’ preferences, such as related products in ecommerce or content in streaming platforms.
- Creating text embeddings to search for semantically similar text, where words or phrases are represented as vectors.
- Building AI-based applications, leveraging Natural Language Processing (NLP) for voice assistants and chatbots.
Because it is distributed across the Azion global edge network, this feature delivers more relevant search results, real-time recommendations, and insights with lower latency, which improves user satisfaction. It does so while maintaining data locality and reducing dependence on a centralized database.
Implementation
| Scope | Resource |
| --- | --- |
| Implement Vector Search | Guide explaining the basics of implementing Vector Search |
| Get to know Azion Edge SQL and its features | Edge SQL reference |
Databases and storage
By leveraging Edge SQL, vector search databases are optimized to handle high-dimensional vector data at the edge. This enables fast, localized processing as well as reduced latency, allowing complex tasks for advanced data-intensive applications to run efficiently.
Edge SQL uses a Main/Replicas model, distributed within the Azion Edge Network, to enable ultra-low latency querying at the edge. Because the database can be accessed from any edge location, this approach facilitates real-time processing and data analysis while providing availability and fault tolerance. Edge SQL uses SQLite's dialect.
Columns
To store vectors in a vector search database, you can add a column specifically for the vector data. Edge SQL Vector Search supports embedding models without dimension restrictions.
The column should be declared to hold an array of 32-bit floating-point (F32) numbers stored as a binary large object (BLOB) type, and the number in parentheses after the type sets how many elements the vector has. For example, embeddings from the `text-embedding-3-small` model with `1536` dimensions need a column of 1536 elements, whereas a `(3)` specifies three F32 elements, indicating a 3-dimensional vector.
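As a minimal sketch of such a declaration, the statement below creates a table with a 3-dimensional vector column. The table name `teams` and the column names are hypothetical, chosen only for illustration, and `F32_BLOB(n)` is the vector column type provided by libSQL, on the assumption that Edge SQL exposes it in the same way.

```sql
-- Hypothetical table for illustration: one row per team and season.
CREATE TABLE teams (
    name TEXT,
    year INT,
    -- F32_BLOB(3): a vector of three 32-bit floats, stored as a BLOB.
    stats_embedding F32_BLOB(3)
);
```

For embeddings produced by a 1536-dimension model, the same declaration would presumably use `F32_BLOB(1536)` instead.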
Then, you can insert rows into the table, including their vector embeddings. For instance, each embedding can represent a team's stats for the 2023 season.
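An insert along these lines could look like the sketch below. It assumes a hypothetical `teams` table whose `stats_embedding` column is declared as a 3-dimensional F32 vector, and it relies on libSQL's `vector()` function to convert a text literal into the binary vector format. Apart from the `[80, 30, 60]` embedding, the team names and values are made up for illustration, with each vector read as goals scored, goals conceded, and possession percentage.

```sql
-- Assumes: CREATE TABLE teams (name TEXT, year INT, stats_embedding F32_BLOB(3));
-- Each vector is [goals scored, goals conceded, possession %]; values are illustrative.
INSERT INTO teams (name, year, stats_embedding)
VALUES
    ('Team A', 2023, vector('[80, 30, 60]')),
    ('Team B', 2023, vector('[45, 60, 42]')),
    ('Team C', 2023, vector('[78, 28, 61]'));
```

The number of elements in each literal has to match the dimension declared for the column.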
Embeddings
Embeddings are numerical vector representations of complex data (like words or images) that capture essential characteristics, enabling similarity-based searches. For example, given an embedding of [80, 30, 60] for a team, a query can retrieve other teams with similar embeddings, helping identify teams with comparable performance stats.
Using embeddings, you can query for teams with similar stats. For example, you can search for the teams whose stats are closest to 82 goals scored, 25 goals conceded, and 63% possession, expressed as the query vector [82, 25, 63].
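One way to express such a similarity query is sketched below. It assumes the hypothetical `teams` table with a 3-dimensional `stats_embedding` column, and it uses cosine distance through libSQL's `vector_distance_cos` function as the similarity measure, which is a choice made for this sketch rather than something the text prescribes. `vector_extract` converts the stored BLOB back into readable text.

```sql
-- Assumes the hypothetical teams table with a 3-dimensional stats_embedding column.
SELECT
    name,
    vector_extract(stats_embedding) AS stats,   -- vector rendered back as text
    vector_distance_cos(stats_embedding, vector('[82, 25, 63]')) AS distance
FROM teams
ORDER BY distance ASC   -- smaller cosine distance means more similar
LIMIT 3;
```

Cosine distance is 1 minus the cosine similarity, so for a stored embedding of [80, 30, 60] the distance to this query vector works out to roughly 0.0015, which would place that team near the top of the results. Without an index, this query compares the query vector against every row.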
Indexing
Because Vector Search often works with large databases and datasets, it supports indexing through Approximate Nearest Neighbors (ANN). You create the index in SQL by wrapping the vector column in the `libsql_vector_idx` function.
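A minimal sketch of creating such an index follows. The table `teams`, the column `stats_embedding`, and the index name `teams_idx` are hypothetical names used for illustration.

```sql
-- Assumes the hypothetical teams table and its stats_embedding vector column.
-- teams_idx is an illustrative index name.
CREATE INDEX teams_idx ON teams (libsql_vector_idx(stats_embedding));
```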
The index is not used automatically, because it is internally represented as a separate table. To make sure the index is consulted, modify your query so that it reads the nearest candidates from the index and then joins them back to the original table.
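As a sketch of an index-backed query, the statement below uses libSQL's `vector_top_k` table-valued function, which takes the index name, the query vector, and the number of neighbors to return, and yields the row IDs of the approximate nearest neighbors. The `teams` table and `teams_idx` index are hypothetical names, and the availability of `vector_top_k` in Edge SQL is assumed from its libSQL basis.

```sql
-- Assumes the hypothetical teams table and the teams_idx vector index.
SELECT
    name,
    vector_extract(stats_embedding) AS stats,
    vector_distance_cos(stats_embedding, vector('[82, 25, 63]')) AS distance
FROM vector_top_k('teams_idx', vector('[82, 25, 63]'), 3)  -- up to 3 approximate neighbors
JOIN teams ON teams.rowid = id
ORDER BY distance ASC;
```

Because the search is approximate, the rows returned can differ slightly from those of an exact scan over the whole table.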