ScyllaDB Brings Massive-Scale Vector Search to Real-Time AI

ScyllaDB’s integrated Vector Search can handle datasets of 1 billion vectors with P99 latency as low as 1.7 ms and throughput up to 252,000 QPS

SUNNYVALE, Calif., Jan. 20, 2026 (GLOBE NEWSWIRE) -- ScyllaDB today announced the general availability of its new Vector Search. This high-performance vector search supports the industry’s largest models with low TCO.

ScyllaDB is commonly used for real-time AI workloads such as latency-sensitive machine learning, predictive analytics, and fraud detection. It is trusted by high-growth companies such as Tripadvisor, ShareChat, and Freshworks to power large-scale latency-sensitive feature stores. As ScyllaDB’s customers began adopting vector search, many found standalone vector databases to be overly complex and costly at scale. In response, ScyllaDB added Vector Search to its ScyllaDB Cloud offering.

ScyllaDB Vector Search is built on ScyllaDB’s shard-per-core architecture with a Rust-based extension that leverages the USearch approximate-nearest-neighbor (ANN) search library. The architecture separates storage and indexing responsibilities while keeping the system unified from the user’s perspective.

The ScyllaDB nodes store both the structured attributes and the vector embeddings in the same distributed table.
The dedicated Vector Store service consumes updates from ScyllaDB via Change Data Capture (CDC) and builds ANN indexes in memory.
Queries are issued to the database, then internally routed to the Vector Store.

This design allows each layer to scale independently, optimizing for its own workload characteristics and eliminating resource interference.

The combination of ScyllaDB’s hardware-optimized shard-per-core architecture for high op/s queries and USearch’s C++ implementation with 10x performance gains over FAISS is a perfect fit for massive real-time AI workloads. Together, this enables industry-leading time-to-first-token.

In recent 1-billion-vector benchmarks, ScyllaDB Vector Search achieved sub-2 ms P99 latency with up to ~250,000 queries per second on large-scale similarity search workloads. These tests were performed with the publicly available yandex-deep 1b dataset, which contains 1 billion vectors, and the 3 + 3 node setup reflects realistic production deployments.

"ScyllaDB supports the most scalable vector search deployments at monstrous speed," said Dor Laor, co-founder and CEO at ScyllaDB. "Based on publicly available benchmarks, ScyllaDB currently demonstrates the fastest vector search performance at billion vector scale. It also offers excellent TCO across all model sizes. That means teams can support and scale their largest AI inference workloads without the traditional performance-cost tradeoffs."

About ScyllaDB
ScyllaDB is a specialty database for workloads that require predictable performance at scale. It’s adopted by organizations that require ultra-low latency, even at millions of features or operations per second, billions of embeddings, or petabytes of storage. ScyllaDB’s shard-per-core architecture taps the full power of modern infrastructure, translating to fewer nodes, less admin, and lower costs. Over 400 companies such as Disney+, Discord, Tripadvisor, Expedia, Zillow, Starbucks, and Comcast use ScyllaDB for their toughest database challenges. For more information: https://www.scylladb.com/

Media Contact
Wayne Ariola
wayne.ariola@scylladb.com

Legal Disclaimer:

EIN Presswire provides this news content "as is" without warranty of any kind. We do not accept any responsibility or liability for the accuracy, content, images, videos, licenses, completeness, legality, or reliability of the information contained in this article. If you have any complaints or copyright issues related to this article, kindly contact the author above.