How to Use Turso Vector Search and Embedding Functions in SQLite

Turso vector search enables you to store, query, and compare ML-generated embeddings directly inside SQLite-compatible SQL using built-in vector functions that operate on BLOB columns.

Turso extends SQLite with a native vector extension that handles dense and compact embedding formats without requiring external vector databases. According to the tursodatabase/turso source code, this implementation registers SQL functions at runtime to parse JSON arrays into binary formats, compute distances, and manipulate vectors while storing data in ordinary BLOB columns.

Understanding Turso's Vector Architecture

Turso's vector implementation centers on a flexible binary format defined in core/vector/vector_types.rs. The Vector struct and VectorType enum support multiple precision levels to balance accuracy against storage costs.

Supported Vector Formats

The extension recognizes four distinct storage formats:

  • Float32Dense – 32-bit floating-point arrays, the usual choice for embeddings (created with vector32)
  • Float64Dense – 64-bit doubles for maximum numerical precision (created with vector64)
  • Float8 – 8-bit quantized values for compact storage (created with vector8)
  • Float1Bit – Binary vectors that store one bit per dimension, useful for hash-based similarity (created with vector1bit)

When you call creation functions like vector32 or vector1bit, the underlying operations::text::vector_from_text parses the JSON array and serializes it via operations::serialize::vector_serialize into a Blob that includes type metadata and dimension information.
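
To make the JSON-to-binary step concrete, here is a minimal Python sketch of the general idea: parse a JSON array and pack it as 32-bit floats. The function name pack_f32 is hypothetical, and the little-endian layout is an assumption made for illustration. Turso's actual serialized format also carries type metadata and dimension information, which this sketch does not reproduce.

```python
import json
import struct


def pack_f32(json_text: str) -> bytes:
    """Parse a JSON array of numbers and pack it as little-endian float32 values.

    Illustrative only: this is NOT Turso's on-disk layout, which also
    includes type metadata.
    """
    values = json.loads(json_text)
    if not isinstance(values, list):
        raise ValueError("expected a JSON array")
    # "<" selects little-endian byte order, "f" is a 4-byte float.
    return struct.pack(f"<{len(values)}f", *values)


blob = pack_f32("[0.12, -0.34, 0.56, 0.78]")
print(len(blob))  # 4 values * 4 bytes each
```

The payload grows by four bytes per dimension, which is why the compact formats matter for large embedding tables.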

Creating and Storing Vector Embeddings

Vectors fit into ordinary SQLite schemas. You store them in BLOB columns with no special column type, because the vector functions decode the binary format at query time.

Create a Table for Embeddings

CREATE TABLE articles (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    content TEXT,
    embedding BLOB
);

Insert 32-bit Float Embeddings

Use vector32() to convert JSON arrays into dense binary format:

INSERT INTO articles (id, title, content, embedding) VALUES (
    1,
    'Introduction to Machine Learning',
    'Machine learning is a subset of artificial intelligence...',
    vector32('[0.12, -0.34, 0.56, 0.78, -0.11, 0.45, -0.23, 0.67]')
);

Insert Compact and Binary Formats

For storage-constrained environments, use vector8 for quantized values or vector1bit for binary fingerprints:

-- 8-bit quantized vector
INSERT INTO articles (id, title, embedding) VALUES (
    2,
    'Compact model output',
    vector8('[128, 64, 255, 0, 12, 34, 56, 78]')
);

-- 1-bit binary vector for hash-based similarity
INSERT INTO articles (id, title, embedding) VALUES (
    3,
    'Binary fingerprint',
    vector1bit('[1,0,1,1,0,0,1,0,1,0,0,1,1,0,0,1]')
);

The SQL expression bindings in core/translate/expr/vectors.rs handle these function calls and route them to the appropriate serialization logic in core/vector/vector_types.rs.
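
The storage savings of a 1-bit format come from packing one dimension into each bit. The Python sketch below shows one way an application might binarize a float embedding by sign and pack the bits into bytes. Both helper names are hypothetical, and the least-significant-bit-first ordering is an assumption for illustration, not a description of Turso's internal layout.

```python
def binarize(values):
    """Map each float to 1 if it is positive, else 0 (a common sign-based scheme)."""
    return [1 if v > 0 else 0 for v in values]


def pack_bits(bits):
    """Pack a list of 0/1 values into bytes, 8 dimensions per byte.

    Assumption: bit i of the vector goes to bit (i % 8) of byte (i // 8).
    """
    out = bytearray((len(bits) + 7) // 8)
    for i, bit in enumerate(bits):
        if bit:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)


bits = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1]
packed = pack_bits(bits)
print(len(packed))  # 16 dimensions fit in 2 bytes
```

Sixteen dimensions take 2 bytes here, compared with 64 bytes for the same vector stored as 32-bit floats.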

Computing Vector Distances

Turso provides distance functions that extract raw float slices from the binary format and apply standard mathematical formulas. These functions are documented in docs/sql-reference/functions/vector.mdx and implemented in the operations::distance_* modules.

Cosine Distance

Cosine distance (vector_distance_cos) measures the angular difference between vectors, making it ideal for semantic text similarity:

SELECT
    id,
    title,
    vector_distance_cos(
        embedding, 
        vector32('[0.1, -0.3, 0.5, 0.8, -0.1, 0.4, -0.2, 0.6]')
    ) AS distance
FROM articles
ORDER BY distance ASC
LIMIT 5;
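
For reference, the standard cosine distance formula is one minus the cosine similarity of the two vectors. The Python sketch below implements that textbook definition. It is not Turso's code, and raising an error for zero-length vectors is a choice made for this example rather than documented Turso behavior.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cos(angle between a and b) for two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Assumption for this sketch: treat the zero vector as an error.
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # orthogonal vectors
```

Results range from 0 (same direction) to 2 (opposite direction), so sorting in ascending order returns the most similar rows first.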

Euclidean (L2) Distance

For geometric proximity measurements, use vector_distance_l2:

SELECT
    id,
    title,
    vector_distance_l2(
        embedding, 
        vector32('[0.1, -0.3, 0.5, 0.8, -0.1, 0.4, -0.2, 0.6]')
    ) AS distance
FROM articles
ORDER BY distance ASC
LIMIT 5;
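
The L2 metric is the straight-line distance between two points. As a reference for what the query above ranks by, here is the textbook formula in Python; the function name is hypothetical and this is not Turso's implementation.

```python
import math


def l2_distance(a, b):
    """Return the Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


print(l2_distance([0.0, 0.0], [3.0, 4.0]))  # the 3-4-5 right triangle
```

Unlike cosine distance, L2 is sensitive to vector magnitude, so the two metrics can rank the same rows differently unless the embeddings are normalized.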

Alternative Distance Metrics

The extension also exposes:

  • vector_distance_dot – Negative dot product for maximum inner product search
  • vector_distance_jaccard – Jaccard distance for binary vector comparison

Each function handles special cases internally; for example, operations::distance_cos::vector_distance_cos automatically applies Hamming distance calculations when processing Float1Bit vectors.
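
The following Python sketch shows the textbook definitions behind these metrics: negative dot product, set-style Jaccard distance over binary vectors, and Hamming distance. The function names are hypothetical, and the exact definitions Turso uses (for instance, how Jaccard treats non-binary values or two all-zero vectors) are assumptions here, not confirmed behavior.

```python
def negative_dot(a, b):
    """Negated inner product: smaller values mean a larger inner product."""
    return -sum(x * y for x, y in zip(a, b))


def jaccard_distance(a, b):
    """1 - |intersection| / |union| over the set bits of two binary vectors."""
    intersection = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    if union == 0:
        # Assumption for this sketch: two all-zero vectors are identical.
        return 0.0
    return 1.0 - intersection / union


def hamming_distance(a, b):
    """Count the positions where two binary vectors differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


print(negative_dot([1, 2], [3, 4]))
print(jaccard_distance([1, 1, 0, 0], [1, 0, 1, 0]))
print(hamming_distance([1, 0, 1, 1], [1, 1, 0, 1]))
```

Negating the dot product lets maximum inner product search reuse the same ORDER BY distance ASC pattern as the other metrics.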

Manipulating Vectors with Utility Functions

Turso includes utility functions for debugging and transforming embeddings without exporting data to application code.

Extract Vectors for Debugging

Convert binary storage back to JSON arrays using vector_extract:

SELECT vector_extract(embedding) FROM articles WHERE id = 1;
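
Conceptually, extraction is the reverse of serialization: read the packed numbers and print them as a JSON array. The Python sketch below decodes a bare little-endian float32 payload. The name unpack_f32 is hypothetical, and the sketch assumes there is no header, whereas Turso's real format includes a type tag that vector_extract reads first.

```python
import json
import struct


def unpack_f32(blob: bytes) -> str:
    """Decode a headerless little-endian float32 payload into JSON text."""
    if len(blob) % 4 != 0:
        raise ValueError("payload length must be a multiple of 4 bytes")
    count = len(blob) // 4
    values = struct.unpack(f"<{count}f", blob)
    # Round to hide float32-to-float64 conversion noise in the printed output.
    return json.dumps([round(v, 6) for v in values])


print(unpack_f32(struct.pack("<3f", 1.0, -2.5, 0.5)))
```

Values such as 0.12 are not exactly representable in 32 bits, so a round trip through a 32-bit format can return slightly different digits than the ones you inserted.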

Concatenate Embeddings

Merge vectors from different rows with vector_concat:

SELECT vector_extract(
    vector_concat(
        (SELECT embedding FROM articles WHERE id = 1),
        (SELECT embedding FROM articles WHERE id = 2)
    )
);

Slice Sub-Vectors

Extract specific dimensions using vector_slice (zero-indexed):

SELECT vector_extract(
    vector_slice(embedding, 0, 4)
) FROM articles WHERE id = 1;
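
In list terms, concatenation appends one vector's dimensions to another's, and slicing selects a contiguous range. The Python sketch below mirrors those two operations with hypothetical helper names. It assumes a half-open [start, end) range for slicing, which the text above does not confirm for vector_slice, so check the Turso function reference before relying on the end index.

```python
def concat(a, b):
    """Return a new vector holding the dimensions of a followed by those of b."""
    return list(a) + list(b)


def slice_vector(v, start, end):
    """Return dimensions start..end-1 (assumed half-open range, zero-indexed)."""
    if not 0 <= start <= end <= len(v):
        raise ValueError("slice bounds out of range")
    return list(v[start:end])


embedding = [0.12, -0.34, 0.56, 0.78, -0.11, 0.45, -0.23, 0.67]
print(slice_vector(embedding, 0, 4))
print(len(concat(embedding, embedding)))
```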

The test suite in testing/sqltests/turso-tests/vector.sqltest validates these operations across edge cases and format combinations.

Summary

  • Turso vector search stores embeddings as BLOBs using dense (32/64-bit) and compact (8-bit/1-bit) binary formats defined in core/vector/vector_types.rs.
  • Creation functions (vector32, vector64, vector8, vector1bit) parse JSON and serialize to binary via operations::serialize::vector_serialize.
  • Distance functions (vector_distance_cos, vector_distance_l2, vector_distance_dot, vector_distance_jaccard) compute similarity directly in SQL.
  • Utility functions (vector_extract, vector_concat, vector_slice) enable debugging and manipulation without leaving the database.
  • The implementation requires no special column types or separate vector tables; vectors live in standard BLOB columns, and the functions are registered at runtime.

Frequently Asked Questions

What vector formats does Turso support?

Turso supports Float32Dense, Float64Dense, Float8 (quantized), and Float1Bit (binary) formats. You create these using vector32(), vector64(), vector8(), and vector1bit() respectively, all of which serialize JSON arrays into optimized binary BLOBs with appropriate type metadata.

How does Turso calculate vector distances?

Distance calculations occur in specialized operation modules like operations::distance_cos::vector_distance_cos. These functions extract raw slices from the binary format (e.g., as_f32_slice or as_f64_slice) and apply the standard mathematical formulas. For binary vectors, the code automatically switches to Hamming distance calculations.

Can I use Turso vector search with existing SQLite tables?

Yes. Turso's vector extension stores data in standard BLOB columns, so you can add an embedding column to an existing table without restructuring or copying its data. The vector functions operate on these BLOBs at query time, so no special column type or separate vector table is required.

How do I debug vector data in Turso?

Use the vector_extract function to convert binary BLOBs back to JSON arrays for inspection. For example: SELECT vector_extract(embedding) FROM articles WHERE id = 1. This function reads the type tag from the binary header and reconstructs the numeric representation, making it easy to verify stored embeddings during development.
