PostGIS vs GeoMesa vs Kinetica

Working with real-time geospatial data at scale?
You’ll want a system that’s flexible, fast, scalable and easy to use.

Which Tools for Geospatial Analysis at Scale?

Location data is exploding. Location-enabled chips for cellular connectivity are being used in more and more devices, while new apps and the expansion of 5G networks are aiding the collection of greater volumes of geospatial data. Add to that the increasing affordability of analyzable imagery from satellites, drones and surveillance cameras.

But designing systems to keep up with the volume and speed that modern location intelligence workloads require can be tough. Traditional GIS tools and databases are no longer up to the task. To manage IoT at scale, you’ll need to turn to a new generation of data-centric tools for tracking objects and events.

So, what are some options?

Kinetica is a high-performance real-time database designed from the ground up for analysis of geospatial data at speed and scale.

Kinetica scales horizontally with a memory-first distributed architecture, which allows you to handle larger geospatial data sets with ease. An eventually consistent, no-locking design allows for simultaneous ingest and analysis, so queries do not have to wait on writes.

Kinetica has been built from the ground up to take full advantage of modern processors: it is able to vectorize queries for significantly faster results on ad-hoc exploratory queries. Spatial joins, which can be particularly complex, benefit most from this capability.

Kinetica can be accessed through SQL, ODBC/JDBC, REST APIs, or with Kinetica Workbench, an interactive SQL-based notebook for sequential analysis. Over 130 spatial functions are available. Workbench makes it easy for business users to get up and running and to explore large volumes of data interactively, quickly and simply, on a single system.

PostGIS is a library that provides geospatial data types, functions, and queries for use with PostgreSQL.

PostGIS offers geospatial functionality for use at a moderate scale. It works with PostgreSQL, an easy-to-understand relational database that is well suited to many OLTP-type applications. Its architecture is general purpose and flexible, but not specifically optimized for analytics. As a result, non-indexed queries and aggregations will be slower to return results.

PostGIS provides a very robust geospatial function library covering 2D, 3D, and raster functions. It has full support for many spatial reference systems (SRS), and plugs into popular third-party tools to provide missing functionality, such as visualization or routing algorithms. It is SQL compliant and also allows users to bring in User Defined Functions.

Without in-built support for visualization, users of PostGIS may face difficulties when trying to display large quantities of real-time data on a map. Many popular visualization vendors rely on client-side rendering, which limits visualizations to small and manageable datasets.

Users will also run into limitations with PostGIS when working with geospatial data at scale. PostgreSQL is not inherently designed to scale out, it does not offer advanced vectorized queries, and it is not ready for high-velocity data ingestion from streaming feeds. This makes it unsuitable for IoT-style use cases where data volumes can be expected to increase quickly.

GeoMesa is an open-source suite of tools for large-scale geospatial querying and analytics on distributed computing systems – such as HBase, Accumulo, Cassandra, Redis, Kafka and Spark.

GeoMesa will enable you to store, index, query, and transform spatio-temporal data at scale with systems that are horizontally scalable, but weren’t intrinsically designed for using geospatial and temporal data. GeoMesa has become known as a good addition to the stack for big data use-cases which need to add geospatial queries on top of an established data management structure.
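To make the indexing idea concrete: one widely used technique for storing two-dimensional positions in a sorted key-value store is a space-filling curve, which turns a longitude/latitude pair into a single sortable key so that nearby points tend to land near each other in key order. The sketch below shows Z-order (Morton) bit interleaving in plain Python. It is an illustration of the general technique under simplifying assumptions (fixed 16-bit resolution, no time dimension), not GeoMesa's code or API, and the function names are hypothetical.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Z-order value."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)       # x bits occupy even positions
        z |= ((y >> i) & 1) << (2 * i + 1)   # y bits occupy odd positions
    return z


def z_order_key(lon: float, lat: float, bits: int = 16) -> int:
    """Map a lon/lat pair to a single sortable integer key (hypothetical helper)."""
    cells = (1 << bits) - 1
    # Normalise each coordinate to an integer cell index in [0, 2^bits - 1].
    x = int((lon + 180.0) / 360.0 * cells)
    y = int((lat + 90.0) / 180.0 * cells)
    return interleave_bits(x, y, bits)
```

Because the key is a plain integer, a range scan over sorted keys can serve as a coarse spatial filter before exact geometry checks are applied.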

GeoMesa provides support for stream processing of spatial data on top of Apache Kafka, and it works with data stores including Redis, Cassandra, S3, HDFS, Accumulo and others. In addition, it supports Spark Analytics through various APIs.

But GeoMesa is just a collection of technologies to aid in building a large-scale geospatial system. Effective usage requires mastery of multiple underlying data systems and technologies. Stitching together these pieces is often time-consuming and error-prone, leading to less than optimal results. Performance is limited by choices made on the underlying data infrastructure.

Apache Sedona is another distributed computing framework for processing geospatial data at scale.

Apache Sedona (incubating), formerly GeoSpark, extends Apache Spark with distributed Spatial Datasets and Spatial SQL that efficiently load, process, transform and analyze large-scale spatial data across machines. It can be deployed on cloud platforms such as AWS, Azure and GCP. You can work with data using either Spark’s RDD API in popular languages like Java, Scala, Python and R, or by using SQL and ST_ functions.

Sedona leans on Apache Flink for stream processing, and can be hooked up to tools such as Zeppelin for data analysis. However, this also requires stitching together multiple technologies to create a working system.

Sedona is best suited to adding geospatial functionality to existing Spark projects. But it is earlier in its development lifecycle than GeoMesa, and still lacks features and capabilities. Consider it as a component for large-scale geospatial analysis use cases with Spark.

Kinetica: Designed for the Next-Generation of Geospatial Applications

Kinetica is a blisteringly fast, scalable database, designed for real-time analysis of spatial and temporal data at scale.

Consume Sensor Data at Volume

Kinetica is capable of ingesting feeds from billions of objects that frequently update their position. Multi-head ingest distributes ingestion across all nodes. Work with feeds from a variety of streaming sources, including Kafka.

Spatial & Temporal Joins

Kinetica includes a rich library of over 130 geospatial functions, including geo-joins, which enable you to derive context or alerts from noisy data. Identify when objects cross over thresholds, get close to each other, or deviate from course.
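The kinds of checks described here (an object crossing a boundary, or two objects coming close) can be sketched in a few lines of plain Python. This is a conceptual illustration only, not Kinetica's functions or API: the names `haversine_m`, `point_in_polygon` and `geofence_events` are hypothetical, the polygon test is a standard ray-casting check on planar coordinates, and the distance uses a spherical Earth with an assumed radius of 6,371 km.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    r = 6371000.0  # assumed mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon's vertex list."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def geofence_events(track, fence):
    """Return (index, 'enter' or 'exit') each time the track changes side."""
    events = []
    was_inside = None
    for i, (x, y) in enumerate(track):
        now_inside = point_in_polygon(x, y, fence)
        if was_inside is not None and now_inside != was_inside:
            events.append((i, "enter" if now_inside else "exit"))
        was_inside = now_inside
    return events
```

For a square fence `[(0, 0), (10, 0), (10, 10), (0, 10)]` and a track `[(-5, 5), (5, 5), (15, 5)]`, `geofence_events` returns `[(1, 'enter'), (2, 'exit')]`. A database performs the same kind of test as a join between a stream of positions and a table of fences, rather than one object at a time.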

Scale Out

Kinetica’s distributed architecture and tiered storage enable users to be prepared for growth. Kinetica scales horizontally on commodity hardware. Sharding of data can be done automatically, or as specified and optimized by the user.
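As a rough picture of what a user-specified shard key means, the sketch below assigns each record to a shard by hashing a chosen key. This is a generic hash-sharding illustration, assumed for the sake of example; the source does not describe Kinetica's actual placement scheme, and `shard_for` is a hypothetical helper.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Pick a shard for a record by hashing its shard key (illustrative only)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap to the shard count.
    return int.from_bytes(digest[:8], "big") % num_shards
```

Records that share a key, such as all positions reported by one vehicle ID, always map to the same shard, which is what lets joins and aggregations on that key run locally on each node.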

Data instantly available for analysis

Kinetica is a no-locking, eventually consistent database with writes automatically distributed. New data is available for query the moment it lands. No waiting for batch uploads or for indexes to update.

Ad-hoc Joins across Large Datasets

Explore your data in ways that were never possible before. Kinetica’s vectorized kernels allow for complex analysis orders of magnitude faster than traditional systems.

Solve Routes & Relationships

What is the shortest route to guide an object to 10 different destinations? How do you match points to roads on a map? Kinetica’s Graph capabilities make this possible.
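Routing questions like these build on shortest-path search over a road graph. The sketch below is a minimal Dijkstra implementation in plain Python for a single source and target; it is a generic building block shown for illustration, not Kinetica's graph API. Visiting many destinations in the best order is a harder problem that repeatedly uses searches like this one. The graph format (a dict of node to `{neighbour: cost}`) is an assumption made for the example.

```python
import heapq


def shortest_path(graph, source, target):
    """Dijkstra's algorithm. Returns (total_cost, path) or (inf, []) if unreachable."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry, a shorter route was already found
        for nbr, weight in graph.get(node, {}).items():
            nd = d + weight
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    if target not in dist:
        return float("inf"), []
    # Walk the predecessor links back from the target to rebuild the path.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[target], path
```

With `graph = {"A": {"B": 1, "C": 4}, "B": {"C": 2, "D": 5}, "C": {"D": 1}, "D": {}}`, calling `shortest_path(graph, "A", "D")` returns `(4.0, ['A', 'B', 'C', 'D'])`.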

SQL

Kinetica’s robust SQL access is easily mastered by analysts and developers alike. With Kinetica Workbench you can develop repeatable interactive recipes to connect, analyze and create outputs with simple declarative SQL statements.

Connectors & Programmatic APIs

Kinetica can also be queried programmatically through ODBC/JDBC connectors, the REST API, or by using language-specific libraries for Java, C++, JavaScript, Node.js, Python and more.

Custom Functions and Machine Learning

Kinetica’s User Defined Function capabilities enable custom functions and machine learning models to be applied to make predictions, spot anomalies and deduce insights from noisy data.

Fast Lookup and High Concurrency

Kinetica can build high-performance key-value lookup tables for fast lookups under high concurrency.

Server Based Visualizations

Create server-side rendered visualizations from geospatial queries. Plot billions of points on a map, create heat maps, color code by area or generate animations.
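One reason server-side rendering copes with very large point sets is that the points can be reduced to a small grid of counts before anything is sent to the browser. The sketch below shows that aggregation step for a heat map in plain Python. It is a conceptual illustration with a hypothetical `bin_points` helper and a fixed square cell size, not a description of how Kinetica renders images.

```python
from collections import Counter


def bin_points(points, min_x, min_y, cell_size):
    """Count points per grid cell; the result is what a heat map would colour."""
    counts = Counter()
    for x, y in points:
        col = int((x - min_x) // cell_size)
        row = int((y - min_y) // cell_size)
        counts[(col, row)] += 1
    return counts
```

However many points go in, the output size is bounded by the number of grid cells, so the client only ever receives a fixed-size image or tile.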

In the Cloud or On-Premise

Kinetica is available as-a-service in the cloud on AWS and Azure infrastructure, or can be deployed on your own hardware.

Try Kinetica Now:

Kinetica Cloud is free for projects up to 10GB

Book a Demo!

The best way to appreciate the possibilities that Kinetica brings to high-performance real-time analytics is to see it in action.

Contact us, and we’ll give you a tour of Kinetica. We can also help you get started using it with your own data, your own schemas and your own queries.