Integration

Datalake is an accelerator specific to power utilities. When enabled, it can be accessed via the menu button ( ) in the top-left corner of the Utilihive Console.

Data from Utilihive Datalake can be queried using standard drivers and programming language libraries.

SQL Database - Clickhouse

ClickHouse supports a SQL dialect close to ANSI SQL with extensions for online analytical processing. Standard drivers and client libraries are available to connect to ClickHouse and run read-only SQL queries.

ODBC

The ODBC driver is available for Windows, macOS, and Linux. Download and install the latest stable driver from the project’s Releases page.

After the driver has been installed, configure the DSN using a ClickHouse HTTPS endpoint. For example:

  • Name: assign a name for DSN

  • Host: e.g., clickhouse.domain.com

  • Port: e.g., 443

  • Database: default (unless otherwise noted)

  • User: username

  • Password: password

  • SSLMode: leave blank

  • Other fields: leave blank

JDBC

Perform the following steps to query data using JDBC drivers:

  • Download the latest release driver (*-all.jar) from the project’s Releases page. Save the *-all.jar file in a permanent location (e.g., c:\drivers\ on Windows).

  • Configure the software that will run queries using a ClickHouse HTTPS endpoint. For example:

    • Class Path: path of .jar file (e.g., c:\drivers\clickhouse-jdbc-0.3.2-patch10-all.jar)

    • Class Name: com.clickhouse.jdbc.ClickHouseDriver

    • If required, set example connection URL: jdbc:ch:https://clickhouse.domain.com:443

    • Set connection URL with host and port: jdbc:ch:https://clickhouse.domain.com:443

Python Language

Perform the following steps to query data using Python:

  • Install the clickhouse-driver package with the following pip command:

    pip install clickhouse-driver
  • Import the dependencies into your Python script with the following statement:

    from clickhouse_driver import Client
  • Create a client and establish a connection using the ClickHouse Native (TCP) SSL endpoint. For example:

    client = Client(host = "clickhouse.tpc.domain.com", port = "9440", user = "username", password = "password", secure = True)
  • Run a query. For example:

    result = client.execute("select reading_value, reading_date, reading_time from reading limit 5")
    print(result)

R Language

ClickHouse’s HTTP interface can be directly used for SQL queries. Currently, the R library over native connections does not support SSL, and the library over the HTTP interface does not support a read-only mode.

  • Start R and install the following packages from CRAN:

    install.packages("httr")
    install.packages("readr")
  • Load the required libraries in the following manner:

    library(httr)
    library(readr)
  • Set connection parameters using a ClickHouse HTTPS endpoint. For example:

    url <- "https://clickhouse.domain.com:443/query"
    user <- "username"
    pass <- "password"
  • Format and run a query. For example:

    # Define a query
    query <- 'select reading_value, reading_date, reading_time from reading limit 5'
    
    # Run the query
    r <- GET(url, authenticate(user, pass), query = list(query = query))
    
    # Convert response to data frame
    # Content-Type: text/tab-separated-values
    content = content(r, "raw")
    result <- read_tsv(content, col_names = FALSE)
    
    # Set column names
    colnames(result) <- c('reading_value', 'reading_date', 'reading_time')
    
    # Get data frame spec
    spec(result)
    
    # Get a single result
    result$reading_value[1]

Other languages

For information on integrating with other programming languages, check out the ClickHouse documentation on Interfaces and Client Libraries.

Graph Database - Dgraph

Datalake uses Dgraph, a native GraphQL database with a graph backend. Dgraph can be queried using one of the following query language options:

DQL extends and modifies the GraphQL standard to support deep queries. DQL is more flexible for some use cases. GraphQL can present limitations compared to DQL. With DQL, any data fields and edges can be queried and used as a filter.

GraphQL - Python language

Install the required packages using the following pip command:

pip install 'gql[all]' requests

Import the dependencies into your Python script with the following statements:

from gql import gql, Client
from gql.transport.requests import RequestsHTTPTransport
from requests.auth import HTTPBasicAuth

Create a client, and set the connection parameters using a GraphQL basic auth endpoint. For example:

auth = HTTPBasicAuth('username', 'password')
transport = RequestsHTTPTransport(url='https://graph.domain.com:443', auth=auth)
client = Client(transport=transport, fetch_schema_from_transport=True)

Format and run a query. For example:

query = """
{
  queryUsagePointDto(filter: { source: {eq: "nominal"} }, first: 5) {
    mrId,
    source
  }
}
"""

gql_query = gql(query)
result = graphql_client.execute(gql_query)

GraphQL - R language

Start R and install the following ghql package from CRAN:

install.packages("ghql")

Load the required libraries in the following manner:

library("ghql")
library("jsonlite")

Create a client, and set the connection parameters using a GraphQL basic auth endpoint. For example:

url <- "https://graph.domain.com:443/"
user <- "username"
pass <- "password"

auth <- jsonlite::base64_enc(paste0(user, ":", pass))
basic_auth <- paste0("Authorization = Basic ", auth)

cli <- GraphqlClient$new(url = url, headers = list(Authorization = basic_auth))

Define and run a query. For example:

qry <- Query$new()

qry$query('mydata', '{
  queryUsagePointDto(filter: { source: {eq: "nominal"} }, first: 5) {
    mrId,
    source
  }
}')

r <- cli$exec(qry$queries$mydata)
res <- jsonlite::fromJSON(r)

res$data$query
Check out the Introduction to ghql for more information.

DQL - Python language

Install the required packages with the following pip command:

pip install pydgraph

Import the dependencies into your Python script with the following statements:

import pydgraph
import grpc

Create a client, and set the connection parameters using a Dgraph gRPC auth token endpoint. For example:

client_stub = pydgraph.DgraphClientStub('dgraph.domain.com:443', credentials=grpc.ssl_channel_credentials())
client = pydgraph.DgraphClient(client_stub)
metadata=[("auth-token", 'token')])

Define a query function in the following manner:

def dgraph_query(query, variables={}):
    txn = client.txn(read_only=True)
    try:
        res = txn.query(query, variables=variables, metadata=metadata)
        return json.loads(res.json)
    finally:
      txn.discard()

Format and run a query. For example:

query = """{
    result(func: eq(UsagePointDto.type, "RES"), first: 5)
      @filter(eq(IdentifiedObject.source, "nominal")) {
        mrId: IdentifiedObject.mrId
        source: IdentifiedObject.source
        type: UsagePointDto.type
    }
}"""

result = dgraph_query(query)

DQL - R language

Direct HTTP requests can be used for DQL queries, as there is currently no stable R library available.

First, start R and install the following httr package from CRAN:

install.packages("httr")

Load the required libraries in the following manner:

library(httr)

Set the connection parameters using a Dgraph HTTP auth token endpoint. For example:

url <- "https://dgraph.domain.com:443/query"
authtoken <- "token"

Format and run a query. For example:

query <- '{
    result(func: eq(UsagePointDto.type, "RES"), first: 5)
      @filter(eq(IdentifiedObject.source, "nominal")) {
        mrId: IdentifiedObject.mrId
        source: IdentifiedObject.source
        type: UsagePointDto.type
    }
}
'

r <- POST(url, content_type("application/dql"), add_headers("auth-token" = authtoken), body = query)
res <- content(r)
res$data$result
Check out Dgraph’s Raw HTTP docs and the httr Quickstart Guide for more information.

Other languages

For help on integrating with other programming languages, refer to the Dgraph documentation.