From Keyword Search to Vector Search: A Practical Guide with SQLite

2026-02-28

A hands-on comparison of keyword search vs. vector search using SQLite's FTS5 and sqlite-vec, with real examples showing when each approach wins.


SQLite ships with FTS5, a full-text search engine that's fast, battle-tested, and good enough for most apps. So why would you bolt on vector search?

Because keyword search has a fundamental limitation: it can only find what you literally typed. If your notes say "deployment pipeline" but you search for "CI/CD workflow", FTS5 returns nothing. Vector search understands that those phrases mean roughly the same thing.

In this post, I'll compare both approaches using real examples from a note-taking app I built, and show you exactly how to add vector search to an existing SQLite database using sqlite-vec.

Keyword Search with FTS5: The Baseline

SQLite's FTS5 is a virtual table that builds an inverted index over text:

CREATE VIRTUAL TABLE notes_fts USING fts5(
    content
);

-- Search
SELECT * FROM notes_fts
WHERE notes_fts MATCH 'meeting notes';

FTS5 supports prefix queries, phrase matching, boolean operators (AND, OR, NOT), and ranking with BM25. For many apps, this is all you need.
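For concreteness, here's what those operators look like against the notes_fts table above (the note contents are hypothetical):

```sql
-- Phrase match: terms must appear adjacent, in order
SELECT * FROM notes_fts WHERE notes_fts MATCH '"meeting notes"';

-- Boolean operators
SELECT * FROM notes_fts WHERE notes_fts MATCH 'python AND async NOT javascript';

-- Prefix query: matches "deploy", "deployment", "deploying", ...
SELECT * FROM notes_fts WHERE notes_fts MATCH 'deploy*';

-- Rank by relevance: bm25() returns lower values for better matches
SELECT *, bm25(notes_fts) AS score
FROM notes_fts
WHERE notes_fts MATCH 'meeting'
ORDER BY bm25(notes_fts);
```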

Where FTS5 excels:

  • Exact phrase matching ("error code 404")
  • Known-item search (you remember exact words)
  • Structured queries ("python AND async NOT javascript")
  • Zero external dependencies
  • Sub-millisecond queries on tens of thousands of documents

Where FTS5 fails:

  • Synonyms: "car" won't match "automobile"
  • Paraphrasing: "how to fix a bug" won't match "debugging techniques"
  • Conceptual search: "productivity tips" won't match "getting more done"
  • Typo tolerance is limited to prefix matching

Vector Search with sqlite-vec: The Upgrade

sqlite-vec is a SQLite extension by Alex Garcia that adds vector operations through virtual tables. It's a single C file with no external dependencies, and it works anywhere SQLite works.

Setting Up

First, create a virtual table for your vectors:

CREATE VIRTUAL TABLE vec_note_chunks
    USING vec0(
        sentence_embedding float[384] distance_metric=cosine
    );

The 384 is the embedding dimension, determined by whichever model you use. I used all-MiniLM-L12-v2, which outputs 384-dimensional vectors. If you use OpenAI's text-embedding-3-small, that's 1536 dimensions. The distance_metric=cosine option makes queries report cosine distance instead of the default Euclidean (L2) distance.

Storing Embeddings

You need two things per document: the raw text (for display) and its embedding (for search).

-- Regular table for text and metadata
CREATE TABLE note_chunks (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    sentence TEXT NOT NULL,
    sentence_embedding BLOB,
    note_id INTEGER,
    FOREIGN KEY(note_id)
        REFERENCES notes(id) ON DELETE CASCADE
);

-- Vector index
CREATE VIRTUAL TABLE vec_note_chunks
    USING vec0(
        sentence_embedding float[384] distance_metric=cosine
    );

The pattern is: store the embedding in both the regular table (as a blob, for later retrieval) and the virtual table (for indexed search). The rowid links them:

WITH related_chunks AS (
    SELECT rowid, sentence_embedding
    FROM note_chunks
    WHERE note_id = ?1
)
INSERT INTO vec_note_chunks (
    rowid, sentence_embedding
)
SELECT rowid, sentence_embedding
FROM related_chunks;

Querying

Here's the core search query:

WITH matches AS (
    SELECT rowid, distance
    FROM vec_note_chunks
    WHERE sentence_embedding MATCH ?1
        AND distance < 0.2
    ORDER BY distance
    LIMIT 10
)
SELECT DISTINCT
    notes.id,
    notes.content,
    notes.created_at
FROM matches
JOIN note_chunks
    ON note_chunks.id = matches.rowid
LEFT JOIN notes
    ON notes.id = note_chunks.note_id;

Let's break this down:

  • MATCH ?1 — The query embedding (as a JSON array of floats)
  • distance < 0.2 — sqlite-vec reports cosine distance, where lower means more similar (cosine similarity = 1 - distance). A distance below 0.2 corresponds to a similarity above 0.8, which I found to be a good cutoff for "clearly related" content
  • The CTE pattern is important: do the expensive vector search first, then join for metadata. This avoids scanning the full notes table
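How the query embedding gets into ?1 depends on your driver: sqlite-vec accepts the vector either as a JSON array string or as a compact float blob built with its vec_f32() helper. A schematic example with a literal in place of the parameter:

```sql
-- Query vector as a JSON array string (all 384 values in practice)
SELECT rowid, distance
FROM vec_note_chunks
WHERE sentence_embedding MATCH '[0.021, -0.104, 0.387, ...]'
ORDER BY distance
LIMIT 10;
```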
Head-to-Head: When Does Each Win?

Here are real examples from my note-taking app:

| Query | FTS5 Result | Vector Result | Winner |
| --- | --- | --- | --- |
| "meeting notes" | Exact matches only | Also finds "standup recap", "team sync" | Vector |
| "error code 502" | Finds the exact error | Finds vaguely related server errors | FTS5 |
| "how to deploy" | Notes containing "deploy" | Also finds "CI/CD", "release process" | Vector |
| "React useState" | Exact match on API name | Fuzzy matches on state management | FTS5 |
| "feeling stuck" | Nothing (no notes use that phrase) | Finds notes about blockers, debugging frustration | Vector |

The pattern: FTS5 wins when precision matters (exact terms, code, identifiers). Vector search wins when recall matters (exploratory search, fuzzy concepts, rediscovery).

The Distance Threshold Problem

The hardest part of vector search isn't the infrastructure — it's choosing the right similarity threshold.

Set it too high (e.g., 0.95) and you only get near-exact matches, defeating the purpose. Set it too low (e.g., 0.5) and you get noisy, irrelevant results.

I landed on a similarity of 0.8 after manual testing with my own notes. Here's what I observed:

  • > 0.9: Very similar content, almost paraphrases
  • 0.8 - 0.9: Clearly related topics, good for "related notes"
  • 0.7 - 0.8: Loosely related, sometimes useful for discovery
  • < 0.7: Usually noise

These numbers are specific to all-MiniLM-L12-v2. Different models have different distance distributions. Always calibrate with your own data.
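Since sqlite-vec's distance column reports cosine distance rather than similarity (similarity = 1 - distance, assuming the table was declared with distance_metric=cosine), you can surface the familiar similarity scale directly in SQL:

```sql
-- Expose cosine similarity alongside each match
SELECT rowid,
       1.0 - distance AS similarity
FROM vec_note_chunks
WHERE sentence_embedding MATCH ?1
ORDER BY distance
LIMIT 10;
```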

The Centroid Trick

One useful pattern: compute a centroid embedding for each note (the average of all its chunk embeddings). This gives you a single vector that roughly represents the whole note:

// `sum_vector` holds the element-wise sum of the note's chunk
// embeddings; `num_vectors` is the number of chunks. Dividing
// each component by the count yields the element-wise mean.
let centroid: Vec<f32> = sum_vector
    .iter()
    .map(|&sum| sum / num_vectors as f32)
    .collect();

With centroids, you can do note-to-note similarity ("show me notes similar to this one") without comparing every chunk pair. It's an approximation, but a fast and useful one.
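One way to wire that up, sketched with an illustrative vec0 table (vec_note_centroids is a hypothetical name, not from the app above):

```sql
-- One centroid per note; rowid doubles as the note id
CREATE VIRTUAL TABLE vec_note_centroids
    USING vec0(
        centroid_embedding float[384]
    );

-- Notes most similar to note ?1
SELECT rowid AS note_id, distance
FROM vec_note_centroids
WHERE centroid_embedding MATCH (
    SELECT centroid_embedding
    FROM vec_note_centroids
    WHERE rowid = ?1
)
ORDER BY distance
LIMIT 6;
```

The source note itself comes back with distance 0, so fetch one extra row and drop it in application code.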

The Best of Both Worlds: Hybrid Search

In practice, you probably want both. A hybrid approach:

  • Run the FTS5 query and the vector query in parallel
  • Merge the result sets, deduplicating by note id
  • Rank the merged results, boosting notes that appear in both lists

This gives you precision when the user types exact terms, and recall when they're exploring. Many production search systems (Elasticsearch with kNN, PostgreSQL with pgvector) work this way.

I didn't implement hybrid search in my app, but if I were starting over, I would. The two approaches are complementary, not competing.
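A minimal sketch of that hybrid shape, assuming the notes_fts and chunk tables from earlier and treating the FTS rowid as the note id (the scoring here is deliberately crude):

```sql
-- Favor notes surfaced by both engines
WITH keyword_hits AS (
    SELECT rowid AS note_id
    FROM notes_fts
    WHERE notes_fts MATCH ?1
    LIMIT 20
),
vector_hits AS (
    SELECT note_chunks.note_id
    FROM vec_note_chunks
    JOIN note_chunks ON note_chunks.id = vec_note_chunks.rowid
    WHERE sentence_embedding MATCH ?2
    ORDER BY distance
    LIMIT 20
)
SELECT note_id, COUNT(*) AS sources
FROM (
    SELECT note_id FROM keyword_hits
    UNION ALL
    SELECT note_id FROM vector_hits
)
GROUP BY note_id
ORDER BY sources DESC;
```

Real systems usually replace the COUNT with something like reciprocal rank fusion, but the structure is the same.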

Performance

On a dataset of ~500 notes (~2,000 chunks), using an M1 MacBook:

  • FTS5 query: < 1ms
  • Vector search: ~5-15ms (including embedding the query)
  • Insertion (chunk + embed + index): ~50-100ms per note

For a personal note-taking app, both are effectively instant. For larger datasets (100k+ documents), you'd want to look at HNSW indexes or purpose-built vector databases.

Getting Started

If you want to add vector search to an existing SQLite app:

  • Load the sqlite-vec extension alongside your existing database
  • Pick an embedding model and chunk your documents
  • Create a vec0 virtual table matching the model's output dimensions
  • Embed each chunk on insert and index it by rowid
  • Query with MATCH, and calibrate the distance threshold against your own data

The entire vector search layer in my app is about 100 lines of SQL and 150 lines of Rust. SQLite + sqlite-vec gives you 80% of what a dedicated vector database offers, with zero operational overhead.
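For instance, from the sqlite3 CLI (the .load path depends on where you built or downloaded sqlite-vec):

```sql
-- Load the extension and confirm it's available
.load ./vec0
SELECT vec_version();

-- A tiny four-dimensional table to verify everything works
CREATE VIRTUAL TABLE vec_demo USING vec0(embedding float[4]);
INSERT INTO vec_demo(rowid, embedding)
    VALUES (1, '[0.1, 0.2, 0.3, 0.4]');
SELECT rowid, distance FROM vec_demo
WHERE embedding MATCH '[0.1, 0.2, 0.3, 0.4]'
ORDER BY distance
LIMIT 1;
```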


SQLite is quietly becoming a legitimate platform for AI-powered applications. Between FTS5, sqlite-vec, and the upcoming official vector search support, you can build surprisingly sophisticated search without ever leaving the SQLite ecosystem.