Overview
Datalake is an accelerator specific to power utilities. When enabled, it can be accessed via the menu button in the top-left corner of the Utilihive Console.
Utilihive Datalake is a big data store and analytics solution that integrates seamlessly with the energy sector, enabling utilities to deliver data-driven solutions and innovate energy services.
Datalake provides a centralized data repository that focuses on handling these domain-specific entities:
- Master data: complete history of assets and business entities.
- Readings: time series data from sensors and end devices such as smart meters.
- Events: alerts and notifications.
In addition, Utilihive Datalake can store and index arbitrary files, and maintains structural metadata and lineage over all data objects.
Concept
The main concept behind a data lake is to provide low-cost, scalable storage of data at different levels of structure and refinement, ranging from its natural (raw) form to data that business processes can consume directly. The motivation is to capture and exploit information across a big data domain, whose characteristics are commonly defined as high data volume, velocity, variety, and veracity.
Hence, the key challenge of a data lake typically comes down to finding a good balance between flexibility and performance: the data must be sufficiently structured to enable efficient exploitation and analytics, yet the lake must also support flexible operations across a broad range of data entities.
Utilihive Datalake addresses this challenge by providing constructs for partitioning data along two orthogonal axes:
- Data refinement, ranging from unstructured file objects to highly specific tabular structures.
- Data access tier, ranging from cold and infrequently accessed data to hot and frequently accessed data.
Using this principle, Utilihive Datalake is able to offer real-time and low-latency analytics as well as batch processing capabilities across a vast data repository.
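To make the two-axis idea more concrete, the sketch below models a partition as a pair of independent values, one per axis. It is an illustration only and not the Utilihive Datalake API: the names `Refinement`, `AccessTier`, `Partition`, and `choose_partition`, as well as the 30-day threshold, are assumptions made for this example, and only the end points of each axis named above are modelled.

```python
from dataclasses import dataclass
from enum import Enum


class Refinement(Enum):
    """Hypothetical end points of the data refinement axis."""
    UNSTRUCTURED_FILE = "unstructured_file"
    TABULAR = "tabular"


class AccessTier(Enum):
    """Hypothetical end points of the data access tier axis."""
    COLD = "cold"
    HOT = "hot"


@dataclass(frozen=True)
class Partition:
    """A partition is identified by one value from each axis."""
    refinement: Refinement
    tier: AccessTier


# Assumed threshold for this sketch; a real deployment would configure this.
HOT_WINDOW_DAYS = 30


def choose_partition(is_tabular: bool, days_since_last_access: int) -> Partition:
    """Pick a partition; each axis is decided independently of the other."""
    refinement = Refinement.TABULAR if is_tabular else Refinement.UNSTRUCTURED_FILE
    tier = AccessTier.HOT if days_since_last_access <= HOT_WINDOW_DAYS else AccessTier.COLD
    return Partition(refinement, tier)
```

Because the axes are orthogonal, any refinement level can be combined with any access tier. For example, `choose_partition(True, 1)` yields a hot tabular partition, while `choose_partition(False, 365)` yields a cold partition of unstructured files.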
Capabilities
The main capabilities of Utilihive Datalake are summarized below:
- Storage and indexing of data at scale for a low cost.
- Classical data warehouse capabilities for domain entities.
- APIs that follow industry standards.
- Highly configurable storage tiers and data lifecycle policies.
- Flexible schema for inbound data objects: records are linked and structured at time of use (schema-on-read).
- Cloud native and cloud agnostic.
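The schema-on-read item in the list above can be illustrated with a small, generic sketch: records are stored exactly as received, and a schema is applied only when they are read. This shows the general technique, not how Utilihive Datalake implements it. The names `raw_store`, `READING_SCHEMA`, and `read_with_schema`, and the sample meter records, are hypothetical.

```python
import json

# Raw records are kept exactly as received; nothing is validated on write.
# The sample data is made up for this sketch.
raw_store = [
    '{"meter": "m-1", "ts": "2024-01-01T00:00:00Z", "kwh": "1.5"}',
    '{"meter": "m-2", "ts": "2024-01-01T00:00:00Z", "kwh": 2, "note": "extra field"}',
]

# The schema lives with the reader: field name -> type conversion.
READING_SCHEMA = {"meter": str, "ts": str, "kwh": float}


def read_with_schema(raw_records, schema):
    """Parse raw JSON lines and project them onto the schema at read time."""
    rows = []
    for line in raw_records:
        record = json.loads(line)
        row = {}
        for field, cast in schema.items():
            # Fields missing from the raw record become None instead of failing.
            row[field] = cast(record[field]) if field in record else None
        rows.append(row)
    return rows


rows = read_with_schema(raw_store, READING_SCHEMA)
```

The two raw records differ in shape (one stores `kwh` as a string, the other carries an extra `note` field), yet both are accepted on write. The reader decides which fields matter and how to type them, so a different consumer could apply a different schema to the same stored data.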