Interactive Geospatial Analysis with Massive Datasets
As more and more data becomes available from sensors, from customers, from transactions—much of it with time and location information—there is growing demand to analyze these datasets and visualize the results on maps. But today’s geospatial toolsets are hardly up to the task. Spatial databases weren’t designed for a world where IoT systems might be tracking millions of sensors generating frequent updates. If you want to do analysis on large datasets with any sort of interactivity—if you want to visualize millions or billions of records, and then filter them and see results based on different groupings—then you’re going to need to solve a couple of fundamental challenges. The first challenge is that most databases are simply not designed to perform large-scale geospatial analytics in a reasonable amount of time. Imagine trying to analyze millions of customer purchases and aggregating them based on ad-hoc proximity to retail stores. The polygon-intersection calculations needed for this kind of analysis are expensive. Multiply that by millions or billions of records, and […]
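To make the cost concrete, here is a minimal sketch of the kind of point-in-polygon aggregation described above, written with the Python Shapely library. This is illustrative only, not Kinetica code, and the store trade areas and purchase points are made up. The naive approach tests every purchase against every polygon, which is exactly the work that becomes prohibitive at millions or billions of records.

```python
# Naive point-in-polygon aggregation: count purchases falling inside
# each store's trade-area polygon. Illustration only -- the store
# polygons and purchase points below are made up.
from shapely.geometry import Point, Polygon

# Hypothetical trade areas around two retail stores
trade_areas = {
    "store_a": Polygon([(0, 0), (0, 10), (10, 10), (10, 0)]),
    "store_b": Polygon([(20, 0), (20, 10), (30, 10), (30, 0)]),
}

# Hypothetical purchase locations (in practice: millions or billions of rows)
purchases = [Point(2, 3), Point(5, 5), Point(25, 4), Point(40, 40)]

# O(stores x purchases) point-in-polygon tests -- the cost that
# explodes at scale and motivates hardware-parallel spatial joins
counts = {store: 0 for store in trade_areas}
for p in purchases:
    for store, area in trade_areas.items():
        if area.contains(p):
            counts[store] += 1

print(counts)  # {'store_a': 2, 'store_b': 1}
```

Spatial indexes can prune candidate pairs, but for ad-hoc questions over billions of rows the remaining geometry tests still dominate, which is where massively parallel hardware helps.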
Seeing is Believing – Kinetica Trial Edition Now Available
We’re excited to announce that a developer trial edition of Kinetica is now available. People who’ve seen Kinetica are consistently astounded at the performance of a database built from the ground up to leverage GPUs. With customers, we repeatedly see scenarios where queries that once took several minutes on traditional analytics systems now return in under a second. Batch jobs that used to run overnight can be replaced with real-time queries. Server sprawl, where systems have spread across hundreds of nodes to overcome processing bottlenecks, can be reduced to a single-digit node count. But it’s one thing to hear about it; it’s another to experience it for yourself. Kinetica is quick and easy to install — you’ll probably be up and running in five minutes! All you need is a Linux box (or VM), and installation is a ‘yum’ or ‘apt’ command away. Administration is simple via a web-based admin panel, and we’ve bundled some demo data sets to get you started — or you can use data that you […]
Machine Learning and Predictive Analytics in Finance: Observations from the Field
Financial institutions have long been on the cutting edge of quantitative analytics. Trade decisioning, risk calculations, and fraud prevention are all now heavily driven by data. But as the volume of data has grown, as analysis has become ever more sophisticated, and as pressure builds for timely results, computation has become more and more of a challenge. Increasingly, computer scientists and engineers are being called on to tackle the problems of scale and complexity common in finance. Machine learning offers new opportunities, such as informing trade decisions made throughout the day or running more advanced risk calculations. The problem, however, is that massive compute resources and advanced data science libraries are required to take advantage of this paradigm, which inherently prevents organizations from expanding this area of the business to the scale they would like. How can financial services organizations get to the point where predictive models, optimized through machine learning, are made available to business users, ideally with up-to-the-moment data and sub-second response? Over the past […]
Advanced In-Database Analytics on the GPU
With Version 6.0, Kinetica introduces user-defined functions (UDFs), enabling GPU-accelerated data science logic to power advanced business analytics on a single database platform. UDFs enable compute as well as data processing within the database. Such ‘in-database processing’ is available on several high-end databases such as Oracle, Teradata, and Vertica, but this is the first time such functionality has been made available on a database that fully utilizes the parallel compute power of the GPU on a distributed platform. In-database processing in Kinetica creates a highly flexible means of doing advanced compute-to-grid analytics. This industry-first functionality stands to help democratize data science. Until now, organizations have typically needed to extract data to specialized environments to take advantage of GPU acceleration for data science workloads such as machine learning and deep learning. Kinetica now makes it possible for sophisticated data science models to be developed and made available on the same database platform as is used for business analytics.
How it Works
UDFs and the associated orchestration API […]
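To give a feel for the pattern, here is a rough, self-contained Python sketch of a table-in/table-out UDF. The helper names read_input_table and write_output_table are hypothetical stand-ins rather than Kinetica’s actual UDF API; the point is simply that the function executes inside the database, next to the data, instead of after an export step.

```python
import numpy as np

# Hypothetical stand-ins for a database's UDF I/O interface --
# not Kinetica's actual API. Inside a distributed engine, each
# worker would receive only its local shard of the table.
def read_input_table(name):
    # Pretend the engine hands us a columnar shard as numpy arrays
    return {"price": np.array([10.0, 20.0, 30.0]),
            "qty":   np.array([3,    1,    2])}

def write_output_table(name, columns):
    for col, values in columns.items():
        print(f"{name}.{col} -> {values}")

def udf_revenue():
    """Table-in/table-out UDF: computes revenue per row, in-database."""
    shard = read_input_table("sales")
    # The compute runs where the data lives -- no extract step
    revenue = shard["price"] * shard["qty"]
    write_output_table("sales_revenue", {"revenue": revenue})

udf_revenue()  # sales_revenue.revenue -> [30. 20. 60.]
```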
The Top 6 Most Common Questions about Kinetica
You have questions. We have answers! As we pave a new market for hardware-accelerated databases, we frequently run into the same questions from eager-to-learn prospects and researchers. Here are some of the most popular ones, and you can find many more on the ‘Frequently Asked Questions’ page.
So, what exactly is Kinetica?
Kinetica is a distributed, in-memory database accelerated by GPUs that can simultaneously ingest, analyze, and visualize streaming data for truly real-time actionable intelligence. Kinetica leverages the power of many-core devices (such as GPUs) to deliver results orders of magnitude faster than traditional databases, on a fraction of the hardware.
OLAP? OLTP? Or Both?
Kinetica is a vectorized columnar database designed for analytics (OLAP) workloads. It was built from the ground up to leverage the parallel compute power of the GPU for fast response to analytic queries on large datasets, and it stands out when used with streaming data and high-cardinality data. Kinetica is not typically used as a system of record, but is a great analytics […]
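For readers new to the OLAP/OLTP distinction, here is a tiny illustration (made-up data, and not Kinetica’s actual storage format) of why a columnar layout suits analytic workloads: an aggregate such as SUM(amount) scans one contiguous column rather than touching every row.

```python
# Illustration only: the same three records in row-oriented vs
# column-oriented form (made-up data; not Kinetica's storage format)

# Row-oriented: values for one record sit together -- good for OLTP,
# where whole records are read and written one at a time
rows = [
    (1, "2017-03-01", 19.99),
    (2, "2017-03-01",  5.00),
    (3, "2017-03-02", 42.50),
]

# Column-oriented: values for one column sit together -- good for
# OLAP, where a query like SUM(amount) scans a single column
columns = {
    "id":     [1, 2, 3],
    "date":   ["2017-03-01", "2017-03-01", "2017-03-02"],
    "amount": [19.99, 5.00, 42.50],
}

# An analytic aggregate touches one contiguous column, not every row
print(round(sum(columns["amount"]), 2))  # 67.49
```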
GPU Computing Is Revolutionizing Real-Time Analytics For Retail, CPG, Logistics and Supply Chain
Kinetica’s recent webinar with NVIDIA discussed using GPUs to efficiently and quickly ingest, explore, and visualize streaming datasets in new commercial use cases. Mark Brooks, Principal Systems Engineer, and NVIDIA’s John Barco, Senior Director of Partner Solutions, detailed these use cases in their webinar. Their presentation is below. Kinetica was designed to meet the needs of U.S. Army Intelligence Programs, and it was originally developed as a database with a pretty extreme set of requirements, intended to assess national security threats. At the time the U.S. intelligence organizations were evaluating database technologies, they went through all of the existing options (standard relational systems, Hadoop, NoSQL), and there was just nothing that could scale. The requirements involved 250 high-velocity data feeds, and they needed to do analytics in real time across those feeds. Our company founders, who were consulting for the government at the time, had backgrounds in both geo-temporal capabilities and GPUs, which were just starting to be recognized as general-purpose compute devices, […]
Vectorized processing is the secret sauce behind the fastest, most powerful analytics database, ever
How do you uncover security threats from billions of different signals when time is of the essence? This was the challenge faced by the US Army Intelligence and Security Command eight years ago, and it is the challenge that became the genesis for the Kinetica vectorized analytics database. Threat identification requires searching for patterns within large volumes of data and across a wide variety of streaming data sources. The military needed a system that would provide for ad-hoc exploration of that data while it was fresh, without knowing in advance what questions would need to be asked. The challenges faced by the military were in many ways a forerunner to those being encountered by all large businesses today. Cyber data, IoT data, web data, and business transaction data are being generated at massive rates. Patterns and insights need to be recognized quickly, and time relevance is more important than ever. The variety and high cardinality of streaming data pose particular challenges for analytical systems. While […]
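To give a flavor of what ‘vectorized’ means in practice, here is a small, self-contained Python/NumPy comparison. It is illustrative only; Kinetica’s engine vectorizes across GPU cores rather than via NumPy. The same filter-and-sum is expressed per element in the first version and per column in the second, letting the hardware apply each operation to many values at once.

```python
import time
import numpy as np

# Made-up stream of signal records: one severity score per event
rng = np.random.default_rng(1)
n = 1_000_000
severity = rng.random(n).astype(np.float32)

# Row-at-a-time processing: one branch and one add per record
t0 = time.perf_counter()
total_scalar = 0.0
for s in severity:
    if s > 0.99:
        total_scalar += float(s)
t1 = time.perf_counter()

# Vectorized processing: the predicate and the sum are each applied
# to the whole column at once, so the hardware (SIMD units here,
# GPU cores in Kinetica's case) handles many values per instruction
t2 = time.perf_counter()
total_vector = float(severity[severity > 0.99].sum())
t3 = time.perf_counter()

print(f"scalar:     {t1 - t0:.3f}s  total={total_scalar:.1f}")
print(f"vectorized: {t3 - t2:.3f}s  total={total_vector:.1f}")
```

On typical hardware the column-at-a-time version is dramatically faster than the element-at-a-time loop, for the same reason a vectorized database can answer analytic queries interactively rather than in batch.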