One engine, built for real time at massive scale.

Architecture · GPU-accelerated · Distributed · Columnar · Scale-out · Tiered

A distributed, columnar, memory-first database with a vectorized engine that orchestrates work across GPU and CPU. Tables shard across worker ranks as compressed columnar chunks; data tiers from GPU VRAM down to object storage; streams are queried the same instant they land.

Distributed · columnar · memory-first · tiered
The shape

Streams in. Every query shape out — one data path, no copies, no ETL window.

[Diagram: Sources → engine → query surfaces. Sources: Kafka / streams, CDC / DB sync, S3 / data lake, REST / SDK load. Vectorized distributed engine (simultaneous query and ingest): Ranks 1 to 3, each GPU + CPU with its own shard, over tiered storage (VRAM › RAM › disk cache › SSD › cold S3). Query surfaces: SQL · TPC-DS, REST / MCP, vector search, graph solve, spatial / time.]
GPU-accelerated

The engine routes each operation to the right silicon.

Kinetica compiles each query into vectorized operations: parallel work — scans, joins, distance math — runs across thousands of GPU cores at once, while sequential logic stays on the CPU.

The same primitives serve every model — vector search and spatial joins share one path, not a bolted-on library.

  • Vectorized, not row-by-row — one instruction over many values, the way GPUs want work
  • GPU + CPU orchestration — intensive compute on GPU, sequential logic on CPU
  • GPU VRAM as the hottest tier — the active working set sits closest to the cores
[Diagram: Query orchestration. A SQL query is split between the CPU (sequential control: parse, plan, branch, merge; a few fast cores) and the GPU (parallel, vectorized work; thousands of cores).]
Intensive computation fans out to the GPU; sequential logic stays on the CPU.
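To make "vectorized, not row-by-row" concrete, here is a minimal sketch of column-at-a-time evaluation. It is not Kinetica's engine or API: the helper names (`col_compare`, `col_binary`, `masked_sum`) are hypothetical, and plain Python lists stand in for the device arrays a GPU kernel would process in parallel. The point is only the shape of the work: each step applies one operation across a whole column.

```python
from operator import gt, mul

def col_compare(column, op, constant):
    """Apply one comparison across a whole column; return a selection mask."""
    return [op(value, constant) for value in column]

def col_binary(left, right, op):
    """Apply one binary operation element-wise across two columns."""
    return [op(a, b) for a, b in zip(left, right)]

def masked_sum(column, mask):
    """Sum only the values whose mask entry is True."""
    return sum(value for value, keep in zip(column, mask) if keep)

# Two columns of the same (hypothetical) table.
price = [4.0, 12.0, 30.0, 8.0]
qty = [10, 2, 1, 5]

# SELECT SUM(price * qty) WHERE price > 10, expressed as three column-wide steps.
mask = col_compare(price, gt, 10.0)    # filter step
revenue = col_binary(price, qty, mul)  # arithmetic step
total = masked_sum(revenue, mask)      # reduction step
```

Each of the three steps has no dependency between elements, which is what lets a real engine hand it to many cores at once; the parsing and planning that decide which steps to run remain sequential.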
Distributed / shared-nothing

Sharded across ranks, with nothing in the middle.

A cluster is worker ranks, each owning a slice of every table and computing it in parallel — no shared disk, no bottleneck. Within a rank, TOMs split tables into shards by a shard key.

A high-cardinality key spreads rows evenly; tables sharing a key join locally, no network reshuffle. Add ranks to scale on demand.

  • Worker ranks → TOMs → shards — a clear ownership hierarchy, no shared state
  • Shard-key locality — co-located joins avoid cross-network shuffles
  • Replicated dimensions — broadcast small tables so every join stays local
[Diagram: Shared-nothing cluster. Worker ranks 0 through N, each with its own TOM holding T1 · shard n (cols · chunks), T2 · shard n, dim (replicated), and a GPU.]
Each rank owns its shards and its own GPU — no shared disk, no central bottleneck.
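Shard-key locality can be sketched in a few lines. The example below is an illustration under assumptions, not Kinetica's implementation: `shard_for`, `distribute`, and `local_join` are hypothetical helpers, and CRC32 is used only as a convenient stable hash. Two tables are distributed on the same key, so each shard can join its own rows without consulting any other shard.

```python
import zlib
from collections import defaultdict

NUM_SHARDS = 3  # illustrative cluster size

def shard_for(key, num_shards=NUM_SHARDS):
    """Map a shard-key value to a shard index with a stable hash (assumed scheme)."""
    return zlib.crc32(str(key).encode("utf-8")) % num_shards

def distribute(rows, key_field, num_shards=NUM_SHARDS):
    """Place each row on the shard chosen by its shard-key value."""
    shards = defaultdict(list)
    for row in rows:
        shards[shard_for(row[key_field], num_shards)].append(row)
    return shards

def local_join(fact_shards, dim_shards, key_field, num_shards=NUM_SHARDS):
    """Join shard by shard; no row is ever looked up on another shard."""
    joined = []
    for shard_id in range(num_shards):
        by_key = {d[key_field]: d for d in dim_shards.get(shard_id, [])}
        for fact in fact_shards.get(shard_id, []):
            match = by_key.get(fact[key_field])
            if match is not None:
                joined.append({**match, **fact})
    return joined

customers = [{"customer_id": i, "name": f"customer-{i}"} for i in range(1, 6)]
orders = [{"order_id": 100 + i, "customer_id": (i % 5) + 1, "amount": 10 * i}
          for i in range(10)]

customer_shards = distribute(customers, "customer_id")
order_shards = distribute(orders, "customer_id")
joined = local_join(order_shards, customer_shards, "customer_id")
```

Because both tables hash the same key the same way, every order finds its customer on its own shard. A replicated dimension table achieves the same effect by placing a full copy on every shard instead.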
Columnar / chunked

Columns, compressed, in skippable chunks.

Data is stored column-by-column, so a query reads only the columns it touches, and same-type values compress tightly.

Each column splits into chunks; in-memory min/max metadata lets the engine skip any chunk that can't match — most queries scan a fraction of the table.

  • Columnar layout — read only the columns the query needs
  • Dictionary encoding + compression — less memory, less I/O, more in VRAM
  • Chunk-skipping — min/max metadata prunes scans automatically
[Diagram: Chunk skipping. Query: WHERE ts BETWEEN 14:00 AND 15:00 over columns ts, region, value. Chunks by ts range: 08:00–10:00 skip, 10:00–12:00 skip, 13:00–15:00 read, 15:00–17:00 skip, 17:00–19:00 skip. Per-chunk min/max → 1 of 5 chunks read.]
In-memory min/max metadata lets the engine read only the chunks that can match.
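The pruning rule itself is simple enough to sketch. This is a minimal illustration, assuming timestamps stored as minutes since midnight and a hypothetical `Chunk` record; it is not Kinetica's storage format. A chunk is read only if its [min, max] range overlaps the query range.

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    min_ts: int   # smallest value in the chunk
    max_ts: int   # largest value in the chunk
    values: list  # the chunk's column values

def build_chunks(sorted_ts, chunk_size):
    """Split a column into fixed-size chunks and record min/max per chunk."""
    chunks = []
    for start in range(0, len(sorted_ts), chunk_size):
        part = sorted_ts[start:start + chunk_size]
        chunks.append(Chunk(min(part), max(part), part))
    return chunks

def scan_between(chunks, lo, hi):
    """Return matching values and the number of chunks actually read."""
    matches, chunks_read = [], 0
    for chunk in chunks:
        # Skip any chunk whose range cannot overlap [lo, hi].
        if chunk.max_ts < lo or chunk.min_ts > hi:
            continue
        chunks_read += 1
        matches.extend(v for v in chunk.values if lo <= v <= hi)
    return matches, chunks_read

# One reading every 30 minutes from 08:00 to 17:30, as minutes since midnight.
ts_column = list(range(8 * 60, 18 * 60, 30))
chunks = build_chunks(ts_column, chunk_size=4)

# WHERE ts BETWEEN 14:00 AND 15:00
matches, chunks_read = scan_between(chunks, 14 * 60, 15 * 60)
```

With this sample data, only one of the five chunks overlaps the query range, so the other four are never touched. Pruning works best when the filtered column is roughly ordered on arrival, as timestamps usually are.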
Real-time / low latency

Query the data while it's still arriving.

Kinetica is streaming-first: Kafka and CDC records are queryable the instant they land — one table takes writes and reads at once, no load window.

Streaming materialized views keep derived results current as records arrive — no recompute.

  • Simultaneous query & ingest — no ETL window, no staleness gap
  • Streaming materialized views — derived state updates as data arrives
  • Vectorized stencils — continuous lightweight compute over the stream
[Diagram: Simultaneous query & ingest. Stream in → live table (writes + reads, same instant) → query out; materialized view always fresh.]
One table accepts the stream and answers queries at the same time — no load window.
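The idea behind a streaming materialized view, updating derived state per record instead of recomputing it, can be sketched as follows. `LiveTable` and `SumByKeyView` are hypothetical classes written for this illustration; they are not Kinetica objects, and a real view would be declared in SQL rather than attached in application code.

```python
from collections import defaultdict

class SumByKeyView:
    """Incrementally maintained SUM(value) GROUP BY key."""

    def __init__(self, key_field, value_field):
        self.key_field = key_field
        self.value_field = value_field
        self.totals = defaultdict(float)

    def apply(self, row):
        # Fold one new record into the running total; no rescan of the table.
        self.totals[row[self.key_field]] += row[self.value_field]

class LiveTable:
    """A table that is readable the moment a row is inserted."""

    def __init__(self):
        self.rows = []
        self._views = []

    def attach(self, view):
        self._views.append(view)

    def insert(self, row):
        self.rows.append(row)      # the row is immediately visible to readers
        for view in self._views:   # derived results update in the same step
            view.apply(row)

table = LiveTable()
view = SumByKeyView("region", "value")
table.attach(view)

for record in [{"region": "east", "value": 2.0},
               {"region": "west", "value": 5.0},
               {"region": "east", "value": 3.5}]:
    table.insert(record)
```

After each insert, `view.totals` already reflects the new record, so a reader never sees a gap between the base table and the derived result.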
Massive scale / tiered

Hot data in VRAM. Cold data in S3. One query plan.

Tables shard across nodes; joins on a shared key stay local. On top sits configurable tiered storage.

The working set lives in GPU VRAM and RAM, warm data on disk and SSD, cold data in S3, HDFS, or Azure Blob — moved transparently between tiers, so capacity is bounded only by the cheapest one.

  • Shard-key locality — co-located joins, no network reshuffle
  • Tiered storage — VRAM → RAM → disk → SSD → cold object store
  • Petabyte capacity — guaranteed query completion at cold-storage size
[Diagram: Tiered storage, hot to cold: GPU VRAM (query buffer) → RAM (active + recent) → Disk Cache (less-used) → Persist · SSD → Cold · S3 / HDFS / Azure Blob. Transparent movement; capacity bounded only by the cheapest tier.]
Data tiers from GPU VRAM down to object storage; the engine moves it transparently.
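Transparent movement between tiers can be pictured as a chain of bounded caches. The sketch below is an assumption-laden illustration, not Kinetica's tier manager: `TieredStore` is a hypothetical class, capacities are counted in objects rather than bytes, and least-recently-used demotion is chosen only because it is easy to follow. The coldest tier is unbounded, which is why total capacity is set by that tier.

```python
from collections import OrderedDict

class TieredStore:
    """Hottest tier first; overflow demotes the least recently used object."""

    def __init__(self, tiers):
        # tiers: list of (name, capacity); the last tier must have capacity None.
        if tiers[-1][1] is not None:
            raise ValueError("the coldest tier must be unbounded")
        self.names = [name for name, _ in tiers]
        self.caps = [cap for _, cap in tiers]
        self.levels = [OrderedDict() for _ in tiers]

    def put(self, key, value):
        for level in self.levels:      # drop any stale copy first
            level.pop(key, None)
        self._place(0, key, value)

    def get(self, key):
        for level in self.levels:
            if key in level:
                value = level.pop(key)
                self._place(0, key, value)  # promote to the hottest tier on access
                return value
        raise KeyError(key)

    def tier_of(self, key):
        for name, level in zip(self.names, self.levels):
            if key in level:
                return name
        return None

    def _place(self, index, key, value):
        level = self.levels[index]
        level[key] = value             # most recently used sits at the end
        cap = self.caps[index]
        if cap is not None and len(level) > cap:
            old_key, old_value = level.popitem(last=False)
            self._place(index + 1, old_key, old_value)  # demote one tier down

store = TieredStore([("VRAM", 1), ("RAM", 1), ("COLD", None)])
for name in ("a", "b", "c"):
    store.put(name, f"chunk-{name}")
placement_before = {k: store.tier_of(k) for k in ("a", "b", "c")}

store.get("a")  # touching cold data pulls it back to the hottest tier
placement_after = {k: store.tier_of(k) for k in ("a", "b", "c")}
```

The caller only ever uses `put` and `get`; which tier an object lives in is an internal detail, which is the sense in which movement is transparent.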
Workload management

Many users, one cluster, no one starves.

A shared database can't let one heavy query crowd out a live dashboard. Kinetica governs resources per user and group — how fast a request runs, how much it consumes, how long its data stays hot.

Scheduling priority

Who runs first

Each user or group carries a queue priority, so high-priority requests run ahead of competing work.

Resource limits

How much they use

Caps on CPU threads, memory, and tier usage stop any one user from overloading the cluster — a guard against accidental denial of service.

Eviction priority

How long data stays hot

Higher-priority data holds a fast tier longer; lower-priority data is evicted first as the working set shifts, keeping critical workloads warm.
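Two of these controls, scheduling priority and eviction priority, reduce to simple orderings. The sketch below is illustrative only: `PriorityScheduler` and `pick_eviction_victims` are hypothetical names, and the conventions used (a lower number runs first; a lower eviction priority is evicted first) are assumptions made for the example rather than documented Kinetica semantics.

```python
import heapq
import itertools

class PriorityScheduler:
    """Run higher-priority requests first; ties run in arrival order."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker that preserves FIFO order

    def submit(self, user, priority, query):
        # Assumption: a lower number means a higher scheduling priority.
        heapq.heappush(self._heap, (priority, next(self._arrival), user, query))

    def next_request(self):
        _, _, user, query = heapq.heappop(self._heap)
        return user, query

def pick_eviction_victims(objects, bytes_needed):
    """objects: (name, eviction_priority, size). Evict lowest priority first."""
    victims, freed = [], 0
    for name, _, size in sorted(objects, key=lambda obj: obj[1]):
        if freed >= bytes_needed:
            break
        victims.append(name)
        freed += size
    return victims

scheduler = PriorityScheduler()
scheduler.submit("analyst", 5, "heavy_report")
scheduler.submit("dashboard", 1, "live_tiles")
scheduler.submit("analyst", 5, "adhoc_scan")
run_order = [scheduler.next_request() for _ in range(3)]

resident = [("audit_log", 1, 40), ("dashboard_facts", 9, 30), ("staging", 2, 50)]
victims = pick_eviction_victims(resident, bytes_needed=60)
```

The dashboard request jumps ahead of the earlier heavy report, and the dashboard's data is the last candidate for eviction, which is the behavior the three controls above are meant to produce together.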

Multi-model · one copy of the data

Every data model runs over the same tables.

Every retrieval mode compiles to the same primitives — no separate vector store, graph DB, or spatial engine to sync. Each has its own page.

SQL · TPC-DS | Key-value | Vector | Graph | Spatial | Time-series

The architecture is the advantage.

Real-time ingest, petabyte scale, every data model, GPU speed — in one engine. Spin up a free instance and run your own workload against it.

Frequently asked questions

What kind of database is Kinetica architecturally?
Kinetica is a distributed, vectorized, memory-first, columnar database with tiered storage, optimized for high-speed performance on streaming data. It exposes the table/view/schema model of a relational database while running vectorized kernels purpose-built for modern CPUs and GPUs.
How does Kinetica scale across nodes, and what is the head node's role?
Standard Kinetica clusters are identical commodity nodes in a shared-nothing layout, with one designated as the head aggregation node. The head node breaks queries into small tasks, distributes them across workers, and assembles the results. Adding nodes delivers near-linear scale-out. Data is distributed via automatic or user-specified sharding.
How does tiered storage work in Kinetica?
Kinetica manages an entire data corpus across GPU memory (VRAM), system memory, disk/SSD, HDFS, and cloud storage like S3. Recent data can stay in GPU memory for rapid processing while historical data lives on disk or in cloud storage, and external table support extends to HDFS, S3, and Azure.
What column types and security controls does Kinetica support?
Core column types include int, long, float, double, string, and bytes, with rich date/time and geospatial type families layered on top. Cell-level security supports dynamic obfuscation, redaction, and column-level access rules, and Kinetica integrates with LDAP, Active Directory, and Kerberos for enterprise authentication.
How is Kinetica deployed, and what HA options exist?
Kinetica runs on your own hardware, in a self-managed cloud environment, as a managed service on AWS, or as a fully managed Kinetica Cloud service, with KAgent provisioning instances. For high availability, Kinetica offers in-cluster node and process failover, plus the option to group multiple clusters in a ring with eventual consistency.
