One engine, built for real time at massive scale.

Architecture · GPU-accelerated · Distributed · Columnar · Scale-out · Tiered

A distributed, columnar, memory-first database with a vectorized engine that orchestrates work across GPU and CPU. Tables shard across worker ranks as compressed columnar chunks; data tiers from GPU VRAM down to object storage; streams are queried the same instant they land.

Distributed · columnar · memory-first · tiered

The shape

Streams in. Every query shape out — one data path, no copies, no ETL window.

Sources → engine → query surfaces

GPU-accelerated

The engine routes each operation to the right silicon.

Kinetica compiles each query into vectorized operations: parallel work — scans, joins, distance math — runs across thousands of GPU cores at once, while sequential logic stays on the CPU.

The same primitives serve every model — vector search and spatial joins share one path, not a bolted-on library.

Vectorized, not row-by-row — one instruction over many values, the way GPUs want work
GPU + CPU orchestration — intensive compute on GPU, sequential logic on CPU
GPU VRAM as the hottest tier — the active working set sits closest to the cores

Intensive computation fans out to the GPU; sequential logic stays on the CPU.
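To make "vectorized, not row-by-row" concrete, here is a minimal Python sketch of the two shapes of the same distance calculation. It is an illustration only: the function names are hypothetical, and plain Python lists do not run on SIMD lanes or GPU cores. The point is the shape of the work, where each step applies one operation to a whole column.

```python
import math


def distances_row_by_row(rows, query):
    """Row-at-a-time: every row walks through the whole expression."""
    out = []
    for row in rows:
        out.append(math.sqrt((row["x"] - query[0]) ** 2 + (row["y"] - query[1]) ** 2))
    return out


def distances_columnar(xs, ys, query):
    """Column-at-a-time: each step is one operation over a whole column."""
    qx, qy = query
    dx2 = [(x - qx) ** 2 for x in xs]  # one operation over the x column
    dy2 = [(y - qy) ** 2 for y in ys]  # one operation over the y column
    return [math.sqrt(a + b) for a, b in zip(dx2, dy2)]


rows = [{"x": 3, "y": 4}, {"x": 0, "y": 0}, {"x": 6, "y": 8}]
xs = [r["x"] for r in rows]
ys = [r["y"] for r in rows]

by_row = distances_row_by_row(rows, (0, 0))
by_column = distances_columnar(xs, ys, (0, 0))
```

Both functions return the same distances. The columnar form is the one that maps naturally onto hardware that applies a single instruction to many values at once.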

Distributed / shared-nothing

Sharded across ranks, with nothing in the middle.

A cluster is worker ranks, each owning a slice of every table and computing it in parallel — no shared disk, no bottleneck. Within a rank, TOMs split tables into shards by a shard key.

A high-cardinality key spreads rows evenly; tables sharing a key join locally, no network reshuffle. Add ranks to scale on demand.

Worker ranks → TOMs → shards — a clear ownership hierarchy, no shared state
Shard-key locality — co-located joins avoid cross-network shuffles
Replicated dimensions — broadcast small tables so every join stays local

Each rank owns its shards and its own GPU — no shared disk, no central bottleneck.
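Shard-key locality can be sketched in a few lines of Python. This is a toy model, not Kinetica's implementation: the shard count, the CRC32 hash, and the function names are all assumptions chosen for illustration. It shows why two tables sharded on the same key can be joined shard by shard with no data crossing between shards.

```python
import zlib
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical shard count for this sketch


def shard_of(key, num_shards=NUM_SHARDS):
    """Map a shard-key value to a shard using a stable hash."""
    return zlib.crc32(str(key).encode("utf-8")) % num_shards


def shard_table(rows, key, num_shards=NUM_SHARDS):
    """Split a list of dict rows into per-shard lists by the shard key."""
    shards = defaultdict(list)
    for row in rows:
        shards[shard_of(row[key], num_shards)].append(row)
    return shards


def local_join(left_shards, right_shards, key):
    """Join shard by shard: equal keys always hash to the same shard."""
    out = []
    for shard_id, left_rows in left_shards.items():
        index = defaultdict(list)
        for right_row in right_shards.get(shard_id, []):
            index[right_row[key]].append(right_row)
        for left_row in left_rows:
            for right_row in index.get(left_row[key], []):
                out.append({**left_row, **right_row})
    return out


orders = [{"customer_id": c, "amount": a} for c, a in [(1, 10), (2, 25), (1, 5), (3, 40)]]
customers = [{"customer_id": c, "name": n} for c, n in [(1, "Ada"), (2, "Lin"), (3, "Sol")]]

joined = local_join(
    shard_table(orders, "customer_id"),
    shard_table(customers, "customer_id"),
    "customer_id",
)
```

Every order finds its customer without looking outside its own shard, which is the property that removes the network reshuffle.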

Columnar / chunked

Columns, compressed, in skippable chunks.

Data is stored column-by-column, so a query reads only the columns it touches, and same-type values compress tightly.

Each column splits into chunks; in-memory min/max metadata lets the engine skip any chunk that can't match — most queries scan a fraction of the table.

Columnar layout — read only the columns the query needs
Dictionary encoding + compression — less memory, less I/O, more in VRAM
Chunk-skipping — min/max metadata prunes scans automatically
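Dictionary encoding, named in the list above, is simple to sketch. The version below is a generic textbook form in Python with hypothetical function names, not Kinetica's on-disk or in-memory format: each distinct value is stored once and the column itself becomes a list of small integer codes.

```python
def dict_encode(column):
    """Replace each value with a small integer code; store each distinct value once."""
    dictionary = {}
    codes = []
    for value in column:
        # setdefault assigns the next free code the first time a value is seen
        codes.append(dictionary.setdefault(value, len(dictionary)))
    values = list(dictionary)  # insertion order matches code order
    return values, codes


def dict_decode(values, codes):
    """Rebuild the original column from the dictionary and the codes."""
    return [values[code] for code in codes]


cities = ["Oslo", "Lima", "Oslo", "Oslo", "Lima", "Pune"]
values, codes = dict_encode(cities)
restored = dict_decode(values, codes)
```

Six strings collapse to three dictionary entries plus six integer codes, and the saving grows as values repeat more often.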

In-memory min/max metadata lets the engine read only the chunks that can match.
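A minimal sketch of chunk-skipping, assuming a single integer column and a fixed chunk size. The `Chunk` class and `scan_range` function are hypothetical names for illustration. Each chunk records its minimum and maximum when it is built, and a range scan reads only the chunks whose range overlaps the predicate.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Chunk:
    values: List[int]
    lo: int = field(init=False)
    hi: int = field(init=False)

    def __post_init__(self):
        # Min/max metadata is computed once, when the chunk is created.
        self.lo = min(self.values)
        self.hi = max(self.values)


def make_chunks(column, chunk_size):
    """Split a column into fixed-size chunks."""
    return [Chunk(column[i:i + chunk_size]) for i in range(0, len(column), chunk_size)]


def scan_range(chunks, low, high):
    """Return values in [low, high] and the number of chunks actually read."""
    hits, chunks_read = [], 0
    for chunk in chunks:
        if chunk.hi < low or chunk.lo > high:
            continue  # metadata alone proves no value in this chunk can match
        chunks_read += 1
        hits.extend(v for v in chunk.values if low <= v <= high)
    return hits, chunks_read


timestamps = list(range(1000))
chunks = make_chunks(timestamps, 100)
hits, chunks_read = scan_range(chunks, 250, 260)
```

In this example the scan reads one chunk out of ten. Pruning works best when the column is roughly ordered, as timestamps in a stream tend to be.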

Real-time / low latency

Query the data while it's still arriving.

Kinetica is streaming-first: Kafka and CDC records are queryable the instant they land — one table takes writes and reads at once, no load window.

Streaming materialized views keep derived results current as records arrive, updating incrementally rather than recomputing from scratch.

Simultaneous query & ingest — no ETL window, no staleness gap
Streaming materialized views — derived state updates as data arrives
Vectorized stencils — continuous lightweight compute over the stream

One table accepts the stream and answers queries at the same time — no load window.
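The idea behind a streaming materialized view can be sketched as an aggregate that is updated per record instead of recomputed per query. The class below is a hypothetical Python illustration, not Kinetica's API: it keeps a running sum and count per key, so the average is always current and each new record costs constant work.

```python
from collections import defaultdict


class RunningAverageView:
    """Toy incrementally maintained view: average value per key."""

    def __init__(self):
        self._sum = defaultdict(float)
        self._count = defaultdict(int)

    def on_record(self, key, value):
        """Fold one arriving record into the derived state."""
        self._sum[key] += value
        self._count[key] += 1

    def query(self, key):
        """Read the current average without rescanning past records."""
        if self._count.get(key, 0) == 0:
            return None
        return self._sum[key] / self._count[key]


view = RunningAverageView()
for sensor, reading in [("a", 10.0), ("b", 4.0), ("a", 20.0)]:
    view.on_record(sensor, reading)
    # The view can be queried between any two records.

average_a = view.query("a")
```

Reads and writes interleave freely here, which is the single-table, no-load-window behavior described above in miniature.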

Massive scale / tiered

Hot data in VRAM. Cold data in S3. One query plan.

Tables shard across nodes; joins on a shared key stay local. On top sits configurable tiered storage.

The working set lives in GPU VRAM and RAM, warm data on disk and SSD, cold data in S3, HDFS, or Azure Blob — moved transparently between tiers, so capacity is bounded only by the cheapest one.

Shard-key locality: co-located joins, no network reshuffle
Tiered storage: VRAM → RAM → SSD/disk → cold object store
Petabyte capacity: queries run to completion even at cold-storage data sizes

Data tiers from GPU VRAM down to object storage; the engine moves it transparently.
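Transparent movement between tiers can be modeled with a small Python sketch. Everything here is an assumption made for illustration: the `TieredStore` class, the capacities counted in chunks, and the least-recently-used demotion rule. The last tier must be unbounded (capacity `None`) so that demotion always has somewhere to land.

```python
from collections import OrderedDict


class TieredStore:
    """Toy tiered store: a full tier demotes its least recently used chunk."""

    def __init__(self, tiers):
        # tiers: list of (name, capacity); capacity None means unbounded
        self.names = [name for name, _ in tiers]
        self.caps = [cap for _, cap in tiers]
        self.data = [OrderedDict() for _ in tiers]

    def _place(self, level, key, value):
        tier = self.data[level]
        tier[key] = value
        tier.move_to_end(key)
        cap = self.caps[level]
        if cap is not None and len(tier) > cap:
            old_key, old_value = tier.popitem(last=False)  # least recently used
            self._place(level + 1, old_key, old_value)     # demote one tier down

    def put(self, key, value):
        for tier in self.data:
            tier.pop(key, None)  # make sure the key lives in exactly one tier
        self._place(0, key, value)

    def get(self, key):
        for tier in self.data:
            if key in tier:
                value = tier[key]
                self.put(key, value)  # promote to the hottest tier on access
                return value
        raise KeyError(key)

    def tier_of(self, key):
        for name, tier in zip(self.names, self.data):
            if key in tier:
                return name
        return None


store = TieredStore([("vram", 2), ("ram", 2), ("object_store", None)])
for i in range(5):
    store.put(f"chunk{i}", i)

before = store.tier_of("chunk0")  # oldest chunk has been pushed to the coldest tier
value = store.get("chunk0")       # reading it pulls it back to the hottest tier
after = store.tier_of("chunk0")
```

The caller only ever calls `put` and `get`. Which tier holds a chunk is the store's concern, which is what "moved transparently" means in practice.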

Workload management

Many users, one cluster, no one starves.

A shared database can't let one heavy query crowd out a live dashboard. Kinetica governs resources per user and group — how fast a request runs, how much it consumes, how long its data stays hot.

Scheduling priority

Who runs first

Each user or group carries a queue priority, so high-priority requests run ahead of competing work.
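Queue priority can be illustrated with a standard priority queue. This Python sketch uses a hypothetical `RequestQueue` class and assumes that a lower number means higher priority. Requests at the same priority run in arrival order.

```python
import heapq
import itertools


class RequestQueue:
    """Toy scheduler queue: lowest priority number runs first, FIFO within a level."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker that preserves arrival order

    def submit(self, priority, request):
        heapq.heappush(self._heap, (priority, next(self._seq), request))

    def next_request(self):
        _, _, request = heapq.heappop(self._heap)
        return request

    def __len__(self):
        return len(self._heap)


queue = RequestQueue()
queue.submit(5, "nightly report")
queue.submit(1, "live dashboard")
queue.submit(5, "ad hoc export")
queue.submit(1, "alerting")

run_order = [queue.next_request() for _ in range(len(queue))]
```

The dashboard and alerting requests run first even though the nightly report was submitted before them.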

Resource limits

How much they use

Caps on CPU threads, memory, and tier usage stop any one user from overloading the cluster — a guard against accidental denial of service.
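A per-user cap amounts to an admission check before work starts. The sketch below is a hypothetical Python model that tracks a single resource, memory in arbitrary units. The `ResourceGovernor` name and its methods are illustrative only.

```python
class ResourceGovernor:
    """Toy per-user memory cap: a request is admitted only if it fits the budget."""

    def __init__(self, limits):
        self.limits = dict(limits)               # user -> maximum units
        self.in_use = {user: 0 for user in limits}

    def try_reserve(self, user, amount):
        """Admit the request if it keeps the user under the cap."""
        if self.in_use[user] + amount > self.limits[user]:
            return False
        self.in_use[user] += amount
        return True

    def release(self, user, amount):
        """Return resources when a request finishes."""
        self.in_use[user] = max(0, self.in_use[user] - amount)


governor = ResourceGovernor({"analyst": 100, "dashboard": 500})
first = governor.try_reserve("analyst", 80)     # fits under the cap
second = governor.try_reserve("analyst", 30)    # would exceed the cap, so refused
other = governor.try_reserve("dashboard", 30)   # a different user is unaffected
governor.release("analyst", 80)
retry = governor.try_reserve("analyst", 30)     # fits once the first request ends
```

One user hitting the cap never reduces what another user can reserve, which is the guard against accidental overload.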

Eviction priority

How long data stays hot

Higher-priority data holds a fast tier longer; lower-priority data is evicted first as the working set shifts, keeping critical workloads warm.
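Eviction priority can be sketched as a sort order over candidates. In this hypothetical Python example a higher number means higher priority, and ties are broken by least recent access. Both conventions, and the `pick_evictions` name, are assumptions for illustration.

```python
def pick_evictions(entries, units_needed):
    """Choose what to evict from a fast tier.

    entries: list of (name, priority, size, last_access).
    Lowest priority goes first; within a priority, the least recently used goes first.
    """
    victims, freed = [], 0
    for name, priority, size, last_access in sorted(entries, key=lambda e: (e[1], e[3])):
        if freed >= units_needed:
            break
        victims.append(name)
        freed += size
    return victims


entries = [
    ("clickstream_archive", 1, 40, 100),
    ("fraud_features", 9, 30, 50),
    ("daily_rollup", 1, 20, 10),
    ("orders_recent", 5, 30, 200),
]
victims = pick_evictions(entries, 50)
```

The high-priority `fraud_features` table stays in the fast tier even though it was accessed less recently than `clickstream_archive`.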

Multi-model · one copy of the data

Every data model runs over the same tables.

Every retrieval mode compiles to the same primitives — no separate vector store, graph DB, or spatial engine to sync. Each has its own page.

SQL · TPC-DS, Key-value, Vector, Graph, Spatial, Time-series

The architecture is the advantage.

Real-time ingest, petabyte scale, every data model, GPU speed — in one engine. Spin up a free instance and run your own workload against it.

Frequently asked questions

What kind of database is Kinetica architecturally?

Kinetica is a distributed, vectorized, memory-first, columnar database with tiered storage, optimized for high-speed performance on streaming data. It exposes the table/view/schema model of a relational database while running vectorized kernels purpose-built for modern CPUs and GPUs.

How does Kinetica scale across nodes, and what is the head node's role?

Standard Kinetica clusters are identical commodity nodes in a shared-nothing layout, with one designated as the head aggregation node. The head node breaks each query into small tasks, distributes them across workers, and assembles the results. Adding nodes delivers near-linear scale-out. Data is distributed via automatic or user-specified sharding.
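The head node's scatter-and-assemble role can be sketched with a thread pool standing in for worker nodes. This is a simplified Python model with hypothetical function names, not how Kinetica schedules tasks: each worker returns a partial aggregate over its own shard, and the head merges the partials into the final answer.

```python
from concurrent.futures import ThreadPoolExecutor


def worker_partial(rows):
    """A worker computes a partial aggregate over its own shard only."""
    return sum(rows), len(rows)


def head_node_average(shards):
    """Scatter one task per shard, then merge the partials into one result."""
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        partials = list(pool.map(worker_partial, shards))
    total = sum(part_sum for part_sum, _ in partials)
    count = sum(part_count for _, part_count in partials)
    return total / count if count else None


shards = [[3, 5, 7], [10, 20], [1, 2, 3, 4]]
average = head_node_average(shards)
```

Only two numbers per shard travel back to the head, however many rows each worker scanned. That small merge step is what lets added workers translate into close to proportional speedup.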

How does tiered storage work in Kinetica?

Kinetica manages the entire data corpus across GPU memory (VRAM), system memory, disk/SSD, HDFS, and cloud storage such as S3. Recent data can stay in GPU memory for rapid processing while historical data lives on disk or in cloud storage, and external table support extends to HDFS, S3, and Azure.

What column types and security controls does Kinetica support?

Core column types include int, long, float, double, string, and bytes, with rich date/time and geospatial type families layered on top. Cell-level security supports dynamic obfuscation, redaction, and column-level access rules, and Kinetica integrates with LDAP, Active Directory, and Kerberos for enterprise authentication.

How is Kinetica deployed, and what HA options exist?

Kinetica runs on your own hardware, in a self-managed cloud environment, as a managed service on AWS, or as a fully managed Kinetica Cloud service, with KAgent provisioning instances. For high availability, Kinetica offers in-cluster node and process failover, plus the option to group multiple clusters in a ring with eventual consistency.