Sensors have evolved from taking readings over time to taking readings over space and time. Understanding this trend and its impacts is essential for innovators seeking to create value in the next wave of IoT products and services.
From Transactions to Interactions to Observations
The earliest form of data used for analytics described transactions. Examples of transactions include when an order is placed, inventory is replenished, or revenue is collected. The era of big data emerged when organizations began to harness interaction data. Interactions are records of decisions made by humans. Examples of interactions include liking something on social media, watching a show through a streaming service, browsing the web, or playing a video game. What drives data volume growth today is the proliferation of devices that capture observations. Curt Monash defines observation data as data that is more about observing humans and machines than recording the choices of humans. For example, sensors are able to monitor our pulse, the flow of traffic at an intersection, drone traffic in the air, the temperature of a mechanical part, the location of goods as they move through the supply chain, the energy consumption of a building, and so on. Sensors that observe humans and machines are synonymous with the Internet of Things (IoT), as the readings are available online for processing at the edge or in the cloud.
From Readings Over Time to Readings Over Time & Space
The first generation of IoT data consisted of readings over time. A sensor capturing a reading such as the temperature of a device would do so at regular time intervals, allowing monitoring for anomalies. The next generation of IoT data consists of readings over time and space. Sensors increasingly stamp each reading with both the time and the location of the item. Most items are in motion: inventory, vehicles, planes, people, etc. Even things that are fixed in location need coordinates: crops whose soil moisture is read by sensors require a longitude and latitude so that irrigation equipment can be sent to the right location when needed.
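As a rough illustration of a reading over time and space, the sketch below models one observation as a record stamped with a timestamp and coordinates, then picks out the locations that need irrigation. The Reading class, its field names, the dry_spots helper, the sample coordinates, and the 0.20 moisture threshold are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Reading:
    """One sensor observation stamped with both time and location (hypothetical schema)."""
    sensor_id: str
    timestamp: datetime  # when the reading was taken (UTC)
    latitude: float      # degrees
    longitude: float     # degrees
    value: float         # here: soil moisture as a fraction between 0 and 1


def dry_spots(readings, threshold):
    """Return the (latitude, longitude) of every reading whose value is below threshold."""
    return [(r.latitude, r.longitude) for r in readings if r.value < threshold]


# Two made-up soil-moisture probes reporting at the same time from different spots.
readings = [
    Reading("probe-1", datetime(2023, 5, 1, 6, 0, tzinfo=timezone.utc), 36.7783, -119.4179, 0.31),
    Reading("probe-2", datetime(2023, 5, 1, 6, 0, tzinfo=timezone.utc), 36.7790, -119.4165, 0.12),
]

# Only probe-2 is below the assumed threshold, so only its location is returned.
irrigation_targets = dry_spots(readings, threshold=0.20)
```

Without the latitude and longitude fields, the same query could say that something is too dry but not where to send the equipment.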
The cost of sensors and devices that generate geospatial data is falling rapidly, and their numbers are growing accordingly. The cost of location-enabled chips for cellular connectivity is expected to decline by 70% from 2017 to 2023. The cost of launching a satellite has fallen sharply over the past decade on a per-kilogram basis, which means more data-collecting satellites will be launched over the next few years. The expansion of 5G networks is aiding the collection of greater volumes of geospatial data. Bluetooth tags with integrated power harvesting are expected to drop in price by two-thirds.
Critical Capabilities for Analyzing Things across Time & Space
With even greater data volumes and the growth of geospatial tags, new capabilities are needed to extract the full value from IoT data. Prior-generation databases were never designed to handle non-explicit joins and real-time advanced analytics. Trying to use a prior-generation database for modern IoT analytics results in excessive costs and needless latency.
Non-Explicit Data Fusion
Traditional databases are optimized for joining on primary and foreign keys, such as customer_numbers, session_ids, and order_numbers. That approach breaks down when joining time and space data with other data to provide context, because the rows share no exact key to match on. Geo-joins bring the power of location to a query. For instance, how many drones came within 500 meters of an airplane? Examples of geo-joins include Intersect, Within a Distance, Completely Within, Closest, and many others. Temporal joins answer questions like, “What was the last stock price at the time of order?” Examples of temporal joins include Start, Finish, Meet, Right Overlap, As-of, Left Overlap, Contain, Intersection, and many others.
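To make the two join types concrete, here is a minimal Python sketch of a Within a Distance geo-join and an As-of temporal join. It is an illustration under stated assumptions, not how any particular database implements them: the function names, the row layouts, and the sample data are invented for this example, distance is computed on a sphere with the haversine formula (ignoring altitude), and the geo-join is a brute-force comparison of every pair where a real engine would use a spatial index.

```python
import math
from bisect import bisect_right

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_distance_join(left, right, max_m):
    """Pair each left row with each right row at most max_m meters away.

    Rows are (id, lat, lon) tuples. Brute force: compares every pair.
    """
    return [
        (l_id, r_id)
        for l_id, l_lat, l_lon in left
        for r_id, r_lat, r_lon in right
        if haversine_m(l_lat, l_lon, r_lat, r_lon) <= max_m
    ]


def asof_join(orders, prices):
    """For each (order_id, t), find the latest price with timestamp <= t.

    prices is a list of (timestamp, price) sorted by timestamp.
    Orders placed before the first price get None.
    """
    times = [t for t, _ in prices]
    result = []
    for order_id, t in orders:
        i = bisect_right(times, t)  # number of prices at or before t
        result.append((order_id, prices[i - 1][1] if i else None))
    return result


# Made-up positions: drone-a is roughly 330 m from the plane, drone-b over 1 km.
planes = [("plane-1", 40.000, -75.000)]
drones = [("drone-a", 40.003, -75.000), ("drone-b", 40.010, -75.000)]
close_calls = within_distance_join(drones, planes, max_m=500)

# Made-up price ticks and orders, with timestamps as plain integers.
prices = [(100, 10.0), (200, 10.5), (300, 10.2)]
orders = [("o1", 250), ("o2", 300), ("o3", 50)]
priced_orders = asof_join(orders, prices)
```

Neither join matches on equal key values: the first matches on proximity and the second on "most recent at or before", which is why key-oriented join machinery does not help.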
Data Freshness
Traditional databases are optimized for transactional guarantees (ACID) at the expense of data freshness. The majority of IoT use cases gladly trade the possibility of a missing reading (among billions) for real-time insights. Traditional databases lock tables while they are being loaded. They also require indexes and summary tables to be rebuilt to reflect the new data, further adding to the latency. Next-generation IoT databases enable simultaneous query and ingest. They also take advantage of data-level parallelism (aka vectorization) to remove the need for time-consuming hacks and tricks to overcome performance limitations.
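The idea of querying while ingest continues can be sketched with a toy in-memory store. This is only an assumption-laden illustration, not a description of how any real database works: the AppendOnlyStore class and its methods are hypothetical, it relies on rows never being changed once appended, and it does not attempt to show vectorization.

```python
import threading


class AppendOnlyStore:
    """Toy append-only store: writers append, readers scan a committed prefix."""

    def __init__(self):
        self._rows = []
        self._committed = 0            # number of rows visible to queries
        self._lock = threading.Lock()  # held only briefly, never during a scan

    def ingest(self, row):
        with self._lock:
            self._rows.append(row)
            self._committed += 1

    def query_max(self):
        """Return (rows_seen, max_value) over the rows committed so far."""
        with self._lock:
            n = self._committed        # snapshot how many rows are visible
        # Scan outside the lock: rows below n are never modified or removed,
        # so ingest can keep appending while this query runs.
        seen = self._rows[:n]
        return n, max(seen, default=None)


store = AppendOnlyStore()


def writer():
    # Ingest the values 0..9999 in order, standing in for a sensor feed.
    for value in range(10_000):
        store.ingest(value)


snapshots = []
thread = threading.Thread(target=writer)
thread.start()
while thread.is_alive():
    snapshots.append(store.query_max())  # queries run while ingest continues
thread.join()

final_count, final_max = store.query_max()
```

Each query sees a consistent prefix of the data and is at most a few rows behind the writer, rather than waiting for a bulk load to finish and release a table lock.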
Spatio-Temporal Analytics
Value comes not just from managing IoT data but from analyzing it. Because modern IoT data is both time series and geospatial, in-database analytics that are purpose-built for spatio-temporal insights are required. These include functions such as ST_Geometry, Entity Tracks, Heatmaps & Contours, Window Functions, As-Of Joins, Shortest Path, Centrality, Map Matching, and more.
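As one small example of what such a function computes, the sketch below builds the counts behind a heatmap by binning points into a latitude/longitude grid. The heatmap_counts function, the 0.01 degree cell size, and the sample points are assumptions made for illustration; an in-database implementation would typically run this kind of aggregation in parallel next to the data and would also handle rendering and contours.

```python
import math
from collections import Counter


def heatmap_counts(points, cell_deg):
    """Count points per grid cell.

    points is an iterable of (lat, lon) in degrees; cell_deg is the cell size
    in degrees. Each key is the (row, column) index of a grid cell.
    """
    counts = Counter()
    for lat, lon in points:
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        counts[cell] += 1
    return counts


# Three made-up readings: the first two fall in the same 0.01 degree cell.
points = [(40.003, -75.001), (40.007, -75.004), (40.012, -75.001)]
grid = heatmap_counts(points, cell_deg=0.01)
hottest_cell, hottest_count = grid.most_common(1)[0]
```

Adding a time bucket to the cell key (for example, the hour of the reading) would turn the same aggregation into a heatmap per time window.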