From Keyword Search to Vector Search: A Practical Guide with SQLite

2026-02-28

A hands-on comparison of keyword search vs. vector search using SQLite's FTS5 and sqlite-vec, with real examples showing when each approach wins.


SQLite ships with FTS5, a full-text search engine that's fast, battle-tested, and good enough for most apps. So why would you bolt on vector search?

Because keyword search has a fundamental limitation: it can only find what you literally typed. If your notes say "deployment pipeline" but you search for "CI/CD workflow", FTS5 returns nothing. Vector search understands that those phrases mean roughly the same thing.

In this post, I'll compare both approaches using real examples from a note-taking app I built, and show you exactly how to add vector search to an existing SQLite database using sqlite-vec.

Keyword Search with FTS5: The Baseline

SQLite's FTS5 is a virtual table that builds an inverted index over text:

CREATE VIRTUAL TABLE notes_fts USING fts5(
    content
);

-- Search
SELECT * FROM notes_fts
WHERE notes_fts MATCH 'meeting notes';

FTS5 supports prefix queries, phrase matching, boolean operators (AND, OR, NOT), and ranking with BM25. For many apps, this is all you need.
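For concreteness, here's what those operators look like against the notes_fts table above (the note contents are hypothetical):

```sql
-- Phrase match: terms must appear adjacent, in order
SELECT * FROM notes_fts WHERE notes_fts MATCH '"meeting notes"';

-- Boolean operators
SELECT * FROM notes_fts WHERE notes_fts MATCH 'python AND async NOT javascript';

-- Prefix query: matches "deploy", "deployment", "deploying", ...
SELECT * FROM notes_fts WHERE notes_fts MATCH 'deploy*';

-- Rank by relevance: bm25() returns lower values for better matches
SELECT *, bm25(notes_fts) AS score
FROM notes_fts
WHERE notes_fts MATCH 'meeting'
ORDER BY bm25(notes_fts);
```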

Where FTS5 excels:

  • Exact phrase matching ("error code 404")
  • Known-item search (you remember exact words)
  • Structured queries ("python AND async NOT javascript")
  • Zero external dependencies
  • Sub-millisecond queries on tens of thousands of documents

Where FTS5 fails:

  • Synonyms: "car" won't match "automobile"
  • Paraphrasing: "how to fix a bug" won't match "debugging techniques"
  • Conceptual search: "productivity tips" won't match "getting more done"
  • Typo tolerance is limited to prefix matching

Vector Search with sqlite-vec: The Upgrade

sqlite-vec is a SQLite extension by Alex Garcia that adds vector operations through virtual tables. It's a single C file with no external dependencies, and it works anywhere SQLite works.

Setting Up

First, create a virtual table for your vectors:

CREATE VIRTUAL TABLE vec_note_chunks
    USING vec0(
        sentence_embedding float[384] distance_metric=cosine
    );

The 384 is the embedding dimension, determined by whichever model you use. I used all-MiniLM-L12-v2, which outputs 384-dimensional vectors. If you use OpenAI's text-embedding-3-small, that's 1536 dimensions. The distance_metric=cosine option makes queries report cosine distance instead of the default Euclidean (L2) distance.

Storing Embeddings

You need two things per document: the raw text (for display) and its embedding (for search).

-- Regular table for text and metadata
CREATE TABLE note_chunks (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    sentence TEXT NOT NULL,
    sentence_embedding BLOB,
    note_id INTEGER,
    FOREIGN KEY(note_id)
        REFERENCES notes(id) ON DELETE CASCADE
);

-- Vector index
CREATE VIRTUAL TABLE vec_note_chunks
    USING vec0(
        sentence_embedding float[384] distance_metric=cosine
    );

The pattern is: store the embedding in both the regular table (as a blob, for later retrieval) and the virtual table (for indexed search). The rowid links them:

WITH related_chunks AS (
    SELECT rowid, sentence_embedding
    FROM note_chunks
    WHERE note_id = ?1
)
INSERT INTO vec_note_chunks (
    rowid, sentence_embedding
)
SELECT rowid, sentence_embedding
FROM related_chunks;

Querying

Here's the core search query:

WITH matches AS (
    SELECT rowid, distance
    FROM vec_note_chunks
    WHERE sentence_embedding MATCH ?1
        AND distance < 0.2
    ORDER BY distance
    LIMIT 10
)
SELECT DISTINCT
    notes.id,
    notes.content,
    notes.created_at
FROM matches
JOIN note_chunks
    ON note_chunks.id = matches.rowid
LEFT JOIN notes
    ON notes.id = note_chunks.note_id;

Let's break this down:

  • MATCH ?1 — The query embedding (as a JSON array of floats)
  • distance < 0.2 — sqlite-vec reports cosine distance, where lower means more similar (cosine similarity = 1 - distance). A distance below 0.2 corresponds to a similarity above 0.8, which I found to be a good cutoff for "clearly related" content
  • The CTE pattern is important: do the expensive vector search first, then join for metadata. This avoids scanning the full notes table
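How the query embedding gets into ?1 depends on your driver: sqlite-vec accepts the vector either as a JSON array string or as a compact float blob built with its vec_f32() helper. A schematic example with a literal in place of the parameter:

```sql
-- Query vector as a JSON array string (all 384 values in practice)
SELECT rowid, distance
FROM vec_note_chunks
WHERE sentence_embedding MATCH '[0.021, -0.104, 0.387, ...]'
ORDER BY distance
LIMIT 10;
```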
Head-to-Head: When Does Each Win?

Here are real examples from my note-taking app:

| Query | FTS5 Result | Vector Result | Winner |
| --- | --- | --- | --- |
| "meeting notes" | Exact matches only | Also finds "standup recap", "team sync" | Vector |
| "error code 502" | Finds the exact error | Finds vaguely related server errors | FTS5 |
| "how to deploy" | Notes containing "deploy" | Also finds "CI/CD", "release process" | Vector |
| "React useState" | Exact match on API name | Fuzzy matches on state management | FTS5 |
| "feeling stuck" | Nothing (no notes use that phrase) | Finds notes about blockers, debugging frustration | Vector |

The pattern: FTS5 wins when precision matters (exact terms, code, identifiers). Vector search wins when recall matters (exploratory search, fuzzy concepts, rediscovery).

The Distance Threshold Problem

The hardest part of vector search isn't the infrastructure — it's choosing the right similarity threshold.

Set it too high (e.g., 0.95) and you only get near-exact matches, defeating the purpose. Set it too low (e.g., 0.5) and you get noisy, irrelevant results.

I landed on a similarity of 0.8 after manual testing with my own notes. Here's what I observed:

  • > 0.9: Very similar content, almost paraphrases
  • 0.8 - 0.9: Clearly related topics, good for "related notes"
  • 0.7 - 0.8: Loosely related, sometimes useful for discovery
  • < 0.7: Usually noise

These numbers are specific to all-MiniLM-L12-v2. Different models have different distance distributions. Always calibrate with your own data.
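Since sqlite-vec's distance column reports cosine distance rather than similarity (similarity = 1 - distance, assuming the table was declared with distance_metric=cosine), you can surface the familiar similarity scale directly in SQL:

```sql
-- Expose cosine similarity alongside each match
SELECT rowid,
       1.0 - distance AS similarity
FROM vec_note_chunks
WHERE sentence_embedding MATCH ?1
ORDER BY distance
LIMIT 10;
```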

The Centroid Trick

One useful pattern: compute a centroid embedding for each note (the average of all its chunk embeddings). This gives you a single vector that roughly represents the whole note:

// `sum_vector` holds the element-wise sum of the note's chunk
// embeddings; `num_vectors` is the number of chunks. Dividing
// each component by the count yields the element-wise mean.
let centroid: Vec<f32> = sum_vector
    .iter()
    .map(|&sum| sum / num_vectors as f32)
    .collect();

With centroids, you can do note-to-note similarity ("show me notes similar to this one") without comparing every chunk pair. It's an approximation, but a fast and useful one.
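One way to wire that up, sketched with an illustrative vec0 table (vec_note_centroids is a hypothetical name, not from the app above):

```sql
-- One centroid per note; rowid doubles as the note id
CREATE VIRTUAL TABLE vec_note_centroids
    USING vec0(
        centroid_embedding float[384]
    );

-- Notes most similar to note ?1
SELECT rowid AS note_id, distance
FROM vec_note_centroids
WHERE centroid_embedding MATCH (
    SELECT centroid_embedding
    FROM vec_note_centroids
    WHERE rowid = ?1
)
ORDER BY distance
LIMIT 6;
```

The source note itself comes back with distance 0, so fetch one extra row and drop it in application code.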

The Best of Both Worlds: Hybrid Search

In practice, you probably want both. A hybrid approach:

  • Run the FTS5 query and the vector query in parallel
  • Merge the result sets, deduplicating by note id
  • Rank the merged results, boosting notes that appear in both lists

This gives you precision when the user types exact terms, and recall when they're exploring. Many production search systems (Elasticsearch with kNN, PostgreSQL with pgvector) work this way.

I didn't implement hybrid search in my app, but if I were starting over, I would. The two approaches are complementary, not competing.
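A minimal sketch of that hybrid shape, assuming the notes_fts and chunk tables from earlier and treating the FTS rowid as the note id (the scoring here is deliberately crude):

```sql
-- Favor notes surfaced by both engines
WITH keyword_hits AS (
    SELECT rowid AS note_id
    FROM notes_fts
    WHERE notes_fts MATCH ?1
    LIMIT 20
),
vector_hits AS (
    SELECT note_chunks.note_id
    FROM vec_note_chunks
    JOIN note_chunks ON note_chunks.id = vec_note_chunks.rowid
    WHERE sentence_embedding MATCH ?2
    ORDER BY distance
    LIMIT 20
)
SELECT note_id, COUNT(*) AS sources
FROM (
    SELECT note_id FROM keyword_hits
    UNION ALL
    SELECT note_id FROM vector_hits
)
GROUP BY note_id
ORDER BY sources DESC;
```

Real systems usually replace the COUNT with something like reciprocal rank fusion, but the structure is the same.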

Performance

On a dataset of ~500 notes (~2,000 chunks), using an M1 MacBook:

  • FTS5 query: < 1ms
  • Vector search: ~5-15ms (including embedding the query)
  • Insertion (chunk + embed + index): ~50-100ms per note

For a personal note-taking app, both are effectively instant. For larger datasets (100k+ documents), you'd want to look at HNSW indexes or purpose-built vector databases.

Getting Started

If you want to add vector search to an existing SQLite app:

  • Load the sqlite-vec extension alongside your existing database
  • Pick an embedding model and chunk your documents
  • Create a vec0 virtual table matching the model's output dimensions
  • Embed each chunk on insert and index it by rowid
  • Query with MATCH, and calibrate the distance threshold against your own data

The entire vector search layer in my app is about 100 lines of SQL and 150 lines of Rust. SQLite + sqlite-vec gives you 80% of what a dedicated vector database offers, with zero operational overhead.
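For instance, from the sqlite3 CLI (the .load path depends on where you built or downloaded sqlite-vec):

```sql
-- Load the extension and confirm it's available
.load ./vec0
SELECT vec_version();

-- A tiny four-dimensional table to verify everything works
CREATE VIRTUAL TABLE vec_demo USING vec0(embedding float[4]);
INSERT INTO vec_demo(rowid, embedding)
    VALUES (1, '[0.1, 0.2, 0.3, 0.4]');
SELECT rowid, distance FROM vec_demo
WHERE embedding MATCH '[0.1, 0.2, 0.3, 0.4]'
ORDER BY distance
LIMIT 1;
```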


SQLite is quietly becoming a legitimate platform for AI-powered applications. Between FTS5, sqlite-vec, and the upcoming official vector search support, you can build surprisingly sophisticated search without ever leaving the SQLite ecosystem.