Landbase

Semantic Search

Describe what you want in plain language — "fintech," "revenue leaders," "GTM engineers" — and Landbase returns the right entities. It understands intent, not just the literal keywords.

Examples

How customers use it

01
Finding a whole role family with AI
Ask for every revenue leader in SaaS and the titles fan out — CRO, VP RevOps, Head of GTM, SVP Go-to-Market. Semantic Search interprets the concept and returns the whole role family in one query, no manual enumeration required.
02
Targeting by concept, not keywords
Say "let's target fintech" without specifying industry codes or filter logic. Semantic Search treats "fintech" as a meaning cluster and pulls every company that fits — payments, lending, neobanks, B2B fintech infrastructure.
03
Finding people by career arc
Look for engineers who became founders, with both ends of the arc captured in a single search. Semantic Search recognizes the trajectory and surfaces the right people, even when the current title reads "Founder & CEO" and the prior one was "Staff Software Engineer."
04
Capturing a whole function in one query
Ask for "data leaders" and you are asking for a function that goes by a hundred names: Head of Data, Analytics Lead, VP Insights, Chief Data Officer, BI Director, Data Science Manager. One semantic query returns all of them.
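One way to picture the concept-level matching in these examples is as a nearest-neighbor lookup in an embedding space. The sketch below is illustrative only: the three-dimensional vectors, the `TITLE_VECTORS` table, the `expand_titles` helper, and the 0.9 threshold are all hypothetical, hand-written stand-ins, not Landbase's model or API.

```python
import math

# Hypothetical toy "embeddings", hand-written for illustration.
# Assumed axes: [data focus, leadership, engineering].
TITLE_VECTORS = {
    "Head of Data": [0.9, 0.8, 0.2],
    "Chief Data Officer": [0.9, 0.9, 0.1],
    "BI Director": [0.8, 0.7, 0.2],
    "Data Science Manager": [0.8, 0.6, 0.4],
    "Staff Software Engineer": [0.2, 0.2, 0.9],
    "Account Executive": [0.0, 0.3, 0.0],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def expand_titles(query_vector, threshold=0.9):
    """Return every known title whose vector is close enough to the query."""
    scored = [(cosine(query_vector, vec), title) for title, vec in TITLE_VECTORS.items()]
    return [title for score, title in sorted(scored, reverse=True) if score >= threshold]


# Assumed vector for the phrase "data leaders".
DATA_LEADERS_QUERY = [0.9, 0.8, 0.2]
matches = expand_titles(DATA_LEADERS_QUERY)
print(matches)
```

With these toy vectors, the four data-leadership titles clear the threshold and the engineering and sales titles do not, even though none of the matches contains the literal string "data leaders."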
Under the hood
How it's used

It runs underneath every Landbase workflow. Title expansion before list builds. Concept filters where keyword search misses synonyms. Audience scoping when the customer can't enumerate every label that should count.

Why it matters

Most "filter UI" queries are quietly semantic-search calls underneath. Without semantic search, every workflow narrows to the literal strings the customer remembered to type and silently drops the variants they didn't.

The Similar Company Graph
  • Company Data: 21.6M companies
  • Embedding: 1,024-dim vectors
  • Indexing: Random Partition Forest
  • Retrieval: ANN + neighbor exploration
  • Reranking: with external signals
  • Similar Company Graph: 21.5B edges
A naive pairwise comparison would be 233 trillion operations. The optimized pipeline does it in 361 billion — about 0.15% of brute force.
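The figures above can be checked with a few lines of arithmetic. This sketch assumes the 233 trillion figure counts unordered pairs, n(n-1)/2, which is the reading that matches the stated numbers; the variable names are ours.

```python
# Figures taken from the text.
n_companies = 21_600_000
optimized_ops = 361_000_000_000
graph_edges = 21_500_000_000

# Assumption: "naive pairwise" means every unordered pair is compared once.
brute_force_pairs = n_companies * (n_companies - 1) // 2

fraction_of_brute_force = optimized_ops / brute_force_pairs
avg_edges_per_company = graph_edges / n_companies

print(f"brute-force pairs: {brute_force_pairs:,}")
print(f"optimized share:   {fraction_of_brute_force:.4%}")
print(f"edges per company: {avg_edges_per_company:.0f}")
```

The pair count comes out at roughly 233 trillion, the optimized share at roughly 0.15%, and the graph averages just under 1,000 edges per company.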
Stage by stage
  • Embedding. Every company is turned into a 1,024-dimensional vector that captures its meaning — description, services, signals, the whole picture compressed into a numeric fingerprint.
  • Random Partition Forest. A spatial index over the vectors. Lets us look up "close" without comparing every pair.
  • Approximate nearest neighbors + neighbor exploration. Two retrieval stages stacked. ANN gets the rough candidate set; a follow-up exploration step adds 140B candidates to fill in what ANN missed.
  • Reranking with external data. Top 1K neighbors get re-scored using signals the embedding alone doesn't see. Final recall is 100% at k=1, 99.7% at k=10, and 99.5% at k=100.
  • Similar Company Graph. The output of the pipeline — a graph with 21.5 billion edges connecting every company to its closest peers. Lookups against it are instant.
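The indexing and retrieval stages above can be sketched at toy scale. The source does not describe the internals, so everything here is an assumption: the "forest" is modeled as random-hyperplane median splits, "neighbor exploration" as one round of checking neighbors of neighbors, and the reranking stage is omitted because the external signals are not specified. All function names are hypothetical.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def random_partition_leaves(ids, vectors, rng, leaf_size):
    """Split ids at the median along a random direction until leaves are small."""
    if len(ids) <= leaf_size:
        return [ids]
    direction = [rng.gauss(0.0, 1.0) for _ in range(len(vectors[0]))]
    ordered = sorted(ids, key=lambda i: dot(vectors[i], direction))
    mid = len(ordered) // 2
    return (random_partition_leaves(ordered[:mid], vectors, rng, leaf_size)
            + random_partition_leaves(ordered[mid:], vectors, rng, leaf_size))


def forest_candidates(vectors, n_trees, leaf_size, rng):
    """Points sharing a leaf in any tree become candidate neighbors."""
    candidates = {i: set() for i in range(len(vectors))}
    for _ in range(n_trees):
        all_ids = list(range(len(vectors)))
        for leaf in random_partition_leaves(all_ids, vectors, rng, leaf_size):
            for i in leaf:
                candidates[i].update(j for j in leaf if j != i)
    return candidates


def top_k(i, candidate_ids, vectors, k):
    """Keep the k candidates closest to point i (exact distance on a small pool)."""
    return sorted(candidate_ids, key=lambda j: sq_dist(vectors[i], vectors[j]))[:k]


def explore_neighbors(graph, vectors, k):
    """One exploration round: also consider each neighbor's neighbors."""
    improved = {}
    for i, neighbors in graph.items():
        pool = set(neighbors)
        for j in neighbors:
            pool.update(graph[j])
        pool.discard(i)
        improved[i] = top_k(i, pool, vectors, k)
    return improved


def build_graph(vectors, k=5, n_trees=4, leaf_size=16, seed=0):
    rng = random.Random(seed)
    candidates = forest_candidates(vectors, n_trees, leaf_size, rng)
    graph = {i: top_k(i, c, vectors, k) for i, c in candidates.items()}
    return explore_neighbors(graph, vectors, k)


# Toy data: 200 random 8-dimensional vectors standing in for company embeddings.
data_rng = random.Random(42)
vectors = [[data_rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(200)]
graph = build_graph(vectors, k=5)
print(graph[0])
```

The point of the design is that distances are only computed inside small candidate pools (shared leaves, then neighbors of neighbors), never across all pairs, which is how a pipeline of this shape stays far below the brute-force comparison count.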
By the numbers
21.6M companies indexed
1,024-dim embedding vectors
21.5B edges in the graph
99.7% recall @ k=10
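For readers unfamiliar with the metric, recall @ k is commonly computed as the share of the true top-k neighbors that the approximate result recovers. The sketch below uses that common definition; the source does not say exactly how Landbase measures it, and the `recall_at_k` helper and the toy neighbor lists are hypothetical.

```python
def recall_at_k(approx_neighbors, exact_neighbors, k):
    """Fraction of the true top-k neighbors found in the approximate top-k."""
    hits = 0
    total = 0
    for node, exact in exact_neighbors.items():
        truth = set(exact[:k])
        found = set(approx_neighbors.get(node, [])[:k])
        hits += len(truth & found)
        total += len(truth)
    return hits / total if total else 0.0


# Toy ground truth (exact) and approximate neighbor lists, nearest first.
exact = {"a": ["b", "c", "d"], "b": ["a", "c", "d"]}
approx = {"a": ["b", "d", "c"], "b": ["a", "e", "c"]}

recall_1 = recall_at_k(approx, exact, 1)
recall_2 = recall_at_k(approx, exact, 2)
recall_3 = recall_at_k(approx, exact, 3)
print(recall_1, recall_2, recall_3)
```

Note that under this definition recall need not fall steadily as k grows: in the toy data it is 1.0 at k=1, 0.5 at k=2, and 5/6 at k=3, because order within the top k does not matter.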