Access IRIS database with ODBC or JDBC using Python

Front page > Programming > Access IRIS database with ODBC or JDBC using Python

Access IRIS database with ODBC or JDBC using Python

Published on 2024-11-15

Browse:468

Access IRIS database with ODBC or JDBC using Python

Problems with Strings

I am accessing IRIS databases with JDBC (or ODBC) using Python. I want to fetch the data into a pandas dataframe to manipulate the data and create charts from it. I ran into a problem with string handling while using JDBC. This post is to help if anyone else has the same issues. Or, if there is an easier way to solve this, let me know in the comments!

I am using OSX, so I am unsure how unique my problem is. I am using Jupyter Notebooks, although the code would generally be the same if you used any other Python program or framework.

The JDBC problem

When I fetch data from the database the column descriptions and any string data are returned as data type java.lang.String. If you print string data data it will look like: "(p,a,i,n,i,n,t,h,e,r,e,a,r)" instead of the expected "painintherear".

This is probably because character strings of data type java.lang.String are coming through as an iterable or array when fetched using JDBC. This can happen if the Python-Java bridge you're using (e.g., JayDeBeApi, JDBC) is not automatically converting java.lang.String to a Python str in a single step.

Python's str string representation, in contrast, has the whole string as a single unit. When Python retrieves a normal str (e.g. via ODBC), it doesn't split into individual characters.

The JDBC Solution

To fix this issue, you must ensure that the java.lang.String is correctly converted into Python's str type. You can explicitly handle this conversion when processing the fetched data so it is not interpreted as an iterable or list of characters.

There are many ways to do this string manipulation; this is what I did.

import pandas as pd

import pyodbc

import jaydebeapi
import jpype

def my_function(jdbc_used)

    # Some other code to create the connection goes here

    cursor.execute(query_string)

    if jdbc_used:
        # Fetch the results, convert java.lang.String in the data to Python str
        # (java.lang.String is returned "(p,a,i,n,i,n,t,h,e,r,e,a,r)" Convert to str type "painintherear"
        results = []
        for row in cursor.fetchall():
            converted_row = [str(item) if isinstance(item, jpype.java.lang.String) else item for item in row]
            results.append(converted_row)

        # Get the column names and ensure they are Python strings 
        column_names = [str(col[0]) for col in cursor.description]

        # Create the dataframe
        df = pd.DataFrame.from_records(results, columns=column_names)

        # Check the results
        print(df.head().to_string())

    else:  
        # I was also testing ODBC
        # For very large result sets get results in chunks using cursor.fetchmany(). or fetchall()
        results = cursor.fetchall()
        # Get the column names
        column_names = [column[0] for column in cursor.description]
        # Create the dataframe
        df = pd.DataFrame.from_records(results, columns=column_names)

    # Do stuff with your dataframe

The ODBC problem

When using an ODBC connection, strings are not returned or are NA.

If you're connecting to a database that contains Unicode data (e.g., names in different languages) or if your application needs to store or retrieve non-ASCII characters, you must ensure that the data remains correctly encoded when passed between the database and your Python application.

The ODBC solution

This code ensures that string data is encoded and decoded using UTF-8 when sending and retrieving data to the database. It's especially important when dealing with non-ASCII characters or ensuring compatibility with Unicode data.

def create_connection(connection_string, password):
    connection = None

    try:
        # print(f"Connecting to {connection_string}")
        connection = pyodbc.connect(connection_string   ";PWD="   password)

        # Ensure strings are read correctly
        connection.setdecoding(pyodbc.SQL_CHAR, encoding="utf8")
        connection.setdecoding(pyodbc.SQL_WCHAR, encoding="utf8")
        connection.setencoding(encoding="utf8")

    except pyodbc.Error as e:
        print(f"The error '{e}' occurred")

    return connection

connection.setdecoding(pyodbc.SQL_CHAR, encoding="utf8")

Tells pyodbc how to decode character data from the database when fetching SQL_CHAR types (typically, fixed-length character fields).

connection.setdecoding(pyodbc.SQL_WCHAR, encoding="utf8")

Sets the decoding for SQL_WCHAR, wide-character types (i.e., Unicode strings, such as NVARCHAR or NCHAR in SQL Server).

connection.setencoding(encoding="utf8")

Ensures that any strings or character data sent from Python to the database will be encoded using UTF-8,
meaning Python will translate its internal str type (which is Unicode) into UTF-8 bytes when communicating with the database.

Putting it all together

Install JDBC

Install JAVA - use dmg

https://www.oracle.com/middleeast/java/technologies/downloads/#jdk23-mac

Update shell to set default version

$ /usr/libexec/java_home -V
Matching Java Virtual Machines (2):
    23 (arm64) "Oracle Corporation" - "Java SE 23" /Library/Java/JavaVirtualMachines/jdk-23.jdk/Contents/Home
    1.8.421.09 (arm64) "Oracle Corporation" - "Java" /Library/Internet Plug-Ins/JavaAppletPlugin.plugin/Contents/Home
/Library/Java/JavaVirtualMachines/jdk-23.jdk/Contents/Home
$ echo $SHELL
/opt/homebrew/bin/bash
$ vi ~/.bash_profile

Add JAVA_HOME to your path

export JAVA_HOME=$(/usr/libexec/java_home -v 23)
export PATH=$JAVA_HOME/bin:$PATH

Get the JDBC driver

https://intersystems-community.github.io/iris-driver-distribution/

Put the jar file somewhere... I put it in $HOME

$ ls $HOME/*.jar
/Users/myname/intersystems-jdbc-3.8.4.jar

Sample code

It assumes you have set up ODBC (an example for another day, the dog ate my notes...).

Note: this is a hack of my real code. Note the variable names.

import os

import datetime
from datetime import date, time, datetime, timedelta

import pandas as pd
import pyodbc

import jaydebeapi
import jpype

def jdbc_create_connection(jdbc_url, jdbc_username, jdbc_password):

    # Path to JDBC driver
    jdbc_driver_path = '/Users/yourname/intersystems-jdbc-3.8.4.jar'

    # Ensure JAVA_HOME is set
    os.environ['JAVA_HOME']='/Library/Java/JavaVirtualMachines/jdk-23.jdk/Contents/Home'
    os.environ['CLASSPATH'] = jdbc_driver_path

    # Start the JVM (if not already running)
    if not jpype.isJVMStarted():
        jpype.startJVM(jpype.getDefaultJVMPath(), classpath=[jdbc_driver_path])

    # Connect to the database
    connection = None

    try:
        connection = jaydebeapi.connect("com.intersystems.jdbc.IRISDriver",
                                  jdbc_url,
                                  [jdbc_username, jdbc_password],
                                  jdbc_driver_path)
        print("Connection successful")
    except Exception as e:
        print(f"An error occurred: {e}")

    return connection


def odbc_create_connection(connection_string):
    connection = None

    try:
        # print(f"Connecting to {connection_string}")
        connection = pyodbc.connect(connection_string)

        # Ensure strings are read correctly
        connection.setdecoding(pyodbc.SQL_CHAR, encoding="utf8")
        connection.setdecoding(pyodbc.SQL_WCHAR, encoding="utf8")
        connection.setencoding(encoding="utf8")

    except pyodbc.Error as e:
        print(f"The error '{e}' occurred")

    return connection

# Parameters

odbc_driver = "InterSystems ODBC"
odbc_host = "your_host"
odbc_port = "51773"
odbc_namespace = "your_namespace"
odbc_username = "username"
odbc_password = "password"

jdbc_host = "your_host"
jdbc_port = "51773"
jdbc_namespace = "your_namespace"
jdbc_username = "username"
jdbc_password = "password"

# Create connection and create charts

jdbc_used = True

if jdbc_used:
    print("Using JDBC")
    jdbc_url = f"jdbc:IRIS://{jdbc_host}:{jdbc_port}/{jdbc_namespace}?useUnicode=true&characterEncoding=UTF-8"
    connection = jdbc_create_connection(jdbc_url, jdbc_username, jdbc_password)
else:
    print("Using ODBC")
    connection_string = f"Driver={odbc_driver};Host={odbc_host};Port={odbc_port};Database={odbc_namespace};UID={odbc_username};PWD={odbc_password}"
    connection = odbc_create_connection(connection_string)


if connection is None:
    print("Unable to connect to IRIS")
    exit()

cursor = connection.cursor()

site = "SAMPLE"
table_name = "your.TableNAME"

desired_columns = [
    "RunDate",
    "ActiveUsersCount",
    "EpisodeCountEmergency",
    "EpisodeCountInpatient",
    "EpisodeCountOutpatient",
    "EpisodeCountTotal",
    "AppointmentCount",
    "PrintCountTotal",
    "site",
]

# Construct the column selection part of the query
column_selection = ", ".join(desired_columns)

query_string = f"SELECT {column_selection} FROM {table_name} WHERE Site = '{site}'"

print(query_string)
cursor.execute(query_string)

if jdbc_used:
    # Fetch the results
    results = []
    for row in cursor.fetchall():
        converted_row = [str(item) if isinstance(item, jpype.java.lang.String) else item for item in row]
        results.append(converted_row)

    # Get the column names and ensure they are Python strings (java.lang.String is returned "(p,a,i,n,i,n,t,h,e,a,r,s,e)"
    column_names = [str(col[0]) for col in cursor.description]

    # Create the dataframe
    df = pd.DataFrame.from_records(results, columns=column_names)
    print(df.head().to_string())
else:
    # For very large result sets get results in chunks using cursor.fetchmany(). or fetchall()
    results = cursor.fetchall()
    # Get the column names
    column_names = [column[0] for column in cursor.description]
    # Create the dataframe
    df = pd.DataFrame.from_records(results, columns=column_names)

    print(df.head().to_string())

# # Build charts for a site
# cf.build_7_day_rolling_average_chart(site, cursor, jdbc_used)

cursor.close()
connection.close()

# Shutdown the JVM (if you started it)
# jpype.shutdownJVM()

Release Statement This article is reproduced at: https://dev.to/intersystems/access-iris-database-with-odbc-or-jdbc-using-python-54ok?1 If there is any infringement, please contact [email protected] to delete it

Latest tutorial More>

Beyond `if` Statements: Where Else Can a Type with an Explicit `bool` Conversion Be Used Without Casting?
Contextual Conversion to bool Allowed Without a CastYour class defines an explicit conversion to bool, enabling you to use its instance 't' di...

Programming Published on 2024-11-16
What Happened to Column Offsetting in Bootstrap 4 Beta?
Bootstrap 4 Beta: The Removal and Restoration of Column OffsettingBootstrap 4, in its Beta 1 release, introduced significant changes to the way column...

Programming Published on 2024-11-16
How do I combine two associative arrays in PHP while preserving unique IDs and handling duplicate names?
Combining Associative Arrays in PHPIn PHP, combining two associative arrays into a single array is a common task. Consider the following request:Descr...

Programming Published on 2024-11-16
Using WebSockets in Go for Real-Time Communication
Building apps that require real-time updates—like chat applications, live notifications, or collaborative tools—requires a communication method faster...

Programming Published on 2024-11-16
$How Can I Find Users with Today\'s Birthdays Using MySQL?$
How Can I Find Users with Today\'s Birthdays Using MySQL?
How to Identify Users with Today's Birthdays Using MySQLDetermining if today is a user's birthday using MySQL involves finding all rows where ...

Programming Published on 2024-11-16
$How to Fix \"ImproperlyConfigured: Error loading MySQLdb module\" in Django on macOS?$
How to Fix \"ImproperlyConfigured: Error loading MySQLdb module\" in Django on macOS?
MySQL Improperly Configured: The Problem with Relative PathsWhen running python manage.py runserver in Django, you may encounter the following error:I...

Programming Published on 2024-11-16
Why does floating-point arithmetic differ between x86 and x64 in Visual Studio 2010?
Floating Point Arithmetic Discrepancy between x86 and x64In Visual Studio 2010, a noticeable difference in floating-point arithmetic between x86 and x...

Programming Published on 2024-11-15
How can I improve the performance of MySQL LIKE operator with wildcards?
MySQL LIKE Operator OptimizationQuestion: Can the performance of the MySQL LIKE operator be improved when using wildcards (e.g., '%test%')?Ans...

Programming Published on 2024-11-15
How can I send data via POST to an external website using PHP?
Redirecting and Sending Data via POST in PHPIn PHP, you may encounter a situation where you need to redirect a user to an external website and pass da...

Programming Published on 2024-11-15
How can I Catch Segmentation Faults in Linux Using GCC?
Catching Segmentation Faults in LinuxQ: I'm experiencing segmentation faults in a third-party library, but I'm unable to resolve the underlyin...

Programming Published on 2024-11-15
How Can I Access the Type of a Go Struct Without Creating an Instance?
Accessing Reflect.Type Without Physical Struct CreationIn Go, dynamically loading solutions to problems requires accessing the type of structs without...

Programming Published on 2024-11-15
How to Efficiently Convert Integers to Byte Arrays in Java?
Efficient Conversion of Integers to Byte Arrays in JavaConverting an integer to a byte array can be useful for various purposes, such as network trans...

Programming Published on 2024-11-15
How to Sort a Slice of Structs by Multiple Fields in Go?
Sorting Slice Objects by Multiple FieldsSorting by Multiple CriteriaConsider the following Parent and Child structs:type Parent struct { id ...

Programming Published on 2024-11-15
Qt Threads vs. Python Threads: Which Should I Use in PyQt Applications?
Threading in PyQt Applications: Qt Threads vs. Python ThreadsDevelopers seeking to create responsive GUI applications using PyQt often encounter the c...

Programming Published on 2024-11-15
Why Doesn't My PHP Submit Button Trigger Echoes and Table Display?
PHP Submit Button Dilemma: Unavailable Echoes and TableYour code intends to display echoes and a table when the "Submit" button is clicked o...

Programming Published on 2024-11-15