Giter VIP home page Giter VIP logo

sqlalchemy_dremio's Introduction

SQLAlchemy Dremio

PyPI Build

A SQLAlchemy dialect for Dremio via ODBC and Flight interfaces.

Installation

From pip:

pip install sqlalchemy_dremio

Or from conda:

conda install sqlalchemy-dremio

To install from source: python setup.py install

Usage

Connection String example:

Dremio Software:

dremio+flight://user:password@host:port/dremio

Dremio Cloud:

dremio+flight://data.dremio.cloud:443/?Token=<TOKEN>UseEncryption=true&disableCertificateVerification=true

Options:

Schema - (Optional) The schema to use

TLS:

UseEncryption=true|false - (Optional) Enables TLS connection. Must be enabled on Dremio to use it. DisableCertificateVerification=true|false - (Optional) Disables certificate verirication.

WLM:

https://docs.dremio.com/software/advanced-administration/workload-management/#query-tagging--direct-routing-configuration

routing_queue - (Optional) The queue in which queries should run routing_tag - (Optonal) Routing tag to use. routing_engine - (Optional) The engine in which the queries should run

Superset Integration

The ODBC connection to superset is now deprecated. Please update sqlalchemy_dremio to 3.0.2 to use the flight connection.

Release Notes

3.0.4

  • Addressing issue #34 and #37: Add driver name to dialects

3.0.3

  • Add back missing routing_engine property.

3.0.2

  • Add implementations of has_table and get_view_names.

3.0.1

  • Made connection string property keys case-insensitive
  • Fix incorrect lookup of the token property
  • Fix incorrect lookup of the DisableCertificateVerification property

sqlalchemy_dremio's People

Contributors

chufe-dremio avatar craustin avatar jduo avatar mcdeck avatar naren-dremio avatar narendrans avatar villebro avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar

sqlalchemy_dremio's Issues

Extend _type_map with Array

I have some tables containing Array columns in a database. When DremioDialect.get_columns() is called ctype = _type_map[col[1]] throws an KEYERROR 'ARRAY'.
This can be solved with extending _type_map in base.py and query.py with:

_type_map = {
     ...
    'ARRAY': types.ARRAY,
    'array': types.ARRAY,
}

sqlalchemy_dremio arrow flight ssl certificates

I'm having same issue as #22 , since drivers are not available (I have been told they are "legacy"), I'm trying to do the same with new driver and Dremio 22.0.3

However I am still getting connection issues, which (I think) are caused self signed certificate. Unfortunately the only message I'm getting is this:

File "/home/superset/.local/lib/python3.8/site-packages/sqlalchemy_dremio/db.py", line 20, in connect return Connection(c) File "/home/superset/.local/lib/python3.8/site-packages/sqlalchemy_dremio/db.py", line 53, in __init__ client.authenticate(HttpDremioClientAuthHandler(splits[0].split("=")[1], splits[1].split("=")[1])) File "pyarrow/_flight.pyx", line 1177, in pyarrow._flight.FlightClient.authenticate File "pyarrow/_flight.pyx", line 69, in pyarrow._flight.check_flight_status pyarrow._flight.FlightUnavailableError: gRPC returned unavailable error, with message: failed to connect to all addresses

I would like to confirm this, but before that, could you advise me with:
1 - Whether sqlalchemy_dremio supports Apache Arrow Flight SQL Driver ?
2 - How to disable cert verification in Superset or specify crt/pem file ?

Since I'm trying to run this on Debian, I'm afraid I can't use default store until its supported there.

I tried with your test:
DREMIO_CONNECTION_URL="dremio+flight://user:pass@server:32010/dremio;DisableCertificateVerification=0&DisableCertificateValidation=0"

AttributeError: 'DremioDialect_flight' object has no attribute 'driver'

Hi,
Thank you for creating this!
I am hoping the following may be an easy fix. I think there was a change made to SQLAlchemy as it attempts to print driver and name in a warning and this causes the module to fail.

/opt/anaconda3/lib/python3.9/site-packages/sqlalchemy/engine/base.py in execute(self, statement, *multiparams, **params)
   1304             )
   1305         else:
-> 1306             return meth(self, multiparams, params, _EMPTY_EXECUTION_OPTS)
   1307 
   1308     def _execute_function(self, func, multiparams, params, execution_options):

/opt/anaconda3/lib/python3.9/site-packages/sqlalchemy/sql/elements.py in _execute_on_connection(self, connection, multiparams, params, execution_options, _force)
    330     ):
    331         if _force or self.supports_execution:
--> 332             return connection._execute_clauseelement(
    333                 self, multiparams, params, execution_options
    334             )

/opt/anaconda3/lib/python3.9/site-packages/sqlalchemy/engine/base.py in _execute_clauseelement(self, elem, multiparams, params, execution_options)
   1488         )
   1489 
-> 1490         compiled_sql, extracted_params, cache_hit = elem._compile_w_cache(
   1491             dialect=dialect,
   1492             compiled_cache=compiled_cache,

/opt/anaconda3/lib/python3.9/site-packages/sqlalchemy/sql/elements.py in _compile_w_cache(self, dialect, compiled_cache, column_keys, for_executemany, schema_translate_map, **kw)
    510         **kw
    511     ):
--> 512         if compiled_cache is not None and dialect._supports_statement_cache:
    513             elem_cache_key = self._generate_cache_key()
    514         else:

/opt/anaconda3/lib/python3.9/site-packages/sqlalchemy/util/langhelpers.py in __get__(self, obj, cls)
   1111         if obj is None:
   1112             return self
-> 1113         obj.__dict__[self.__name__] = result = self.fget(obj)
   1114         return result
   1115 

/opt/anaconda3/lib/python3.9/site-packages/sqlalchemy/engine/default.py in _supports_statement_cache(self)
    363                 "for SQLAlchemy 1.4 caching support.   Alternatively, this "
    364                 "attribute may be set to False which will disable this "
--> 365                 "warning." % (self.name, self.driver),
    366                 code="cprf",
    367             )

AttributeError: 'DremioDialect_flight' object has no attribute 'driver'

Flight support for Passwords with = in them

We are using Dremio with OAuth.
In order to access the API we then need to create an Access Token in Dremio. As these Dremio Tokens are base64 encoded by default, they always end with equal signs.

The Problem is now in:

client.authenticate(HttpDremioClientAuthHandler(splits[0].split("=")[1], splits[1].split("=")[1]))

Here, we split the password at the = sign and only send the item number 1. The problem is that the split does not only act on the first equal sign, but all of them.

A simple fix would be (I think):

client.authenticate(HttpDremioClientAuthHandler(splits[0].split("=", 1)[1], splits[1].split("=", 1)[1])) 

thus limiting the number of splits to one and not splitting up the password at any occuring = further.

Connection pool - The default implementation is not recommended for production use

Hello,

For both the Dremio and Dremio+Flight dialects, the SingletonThreadPool connection pool implementation is used by default.

While this configuration can be easily overridden, it would possibly be safer to use a different connection pool implementation as this one is not recommended for production use and can cause some inconvenient behaviors in a production setup (e.g., connection randomly closed in the middle of a query execution).
See https://docs.sqlalchemy.org/en/14/core/pooling.html#sqlalchemy.pool.SingletonThreadPool for further detail.

class DremioDialect(default.DefaultDialect):
name = 'dremio'
supports_sane_rowcount = False
supports_sane_multi_rowcount = False
poolclass = pool.SingletonThreadPool
statement_compiler = DremioCompiler

class DremioDialect_flight(default.DefaultDialect):
name = 'dremio+flight'
supports_sane_rowcount = False
supports_sane_multi_rowcount = False
poolclass = pool.SingletonThreadPool
statement_compiler = DremioCompiler

Thanks again and keep up with the good work ๐Ÿ™‚

dremio+flight connection issue

Hi,

I'm struggling to connect to dremio running in the community edition dremio/dremio-oss docker container.

I've seen many similar issues like #11 and #23 but none of those solution seem to be working for me.

I first encountered this when trying to connect from Superset but I can reproduce it running in a simple Python script.

$ python --version
Python 3.8.10
$ pip install sqlalchemy_dremio
Requirement already satisfied: sqlalchemy_dremio in /home/tobias/.local/lib/python3.8/site-packages (3.0.3)
Requirement already satisfied: sqlalchemy>=1.3.24 in /home/tobias/.local/lib/python3.8/site-packages (from sqlalchemy_dremio) (2.0.8)
Requirement already satisfied: pyarrow>=5.0.0 in /home/tobias/.local/lib/python3.8/site-packages (from sqlalchemy_dremio) (11.0.0)
Requirement already satisfied: typing-extensions>=4.2.0 in /home/tobias/.local/lib/python3.8/site-packages (from sqlalchemy>=1.3.24->sqlalchemy_dremio) (4.5.0)
Requirement already satisfied: greenlet!=0.4.17; platform_machine == "aarch64" or (platform_machine == "ppc64le" or (platform_machine == "x86_64" or (platform_machine == "amd64" or (platform_machine == "AMD64" or (platform_machine == "win32" or platform_machine == "WIN32"))))) in /usr/lib/python3/dist-packages (from sqlalchemy>=1.3.24->sqlalchemy_dremio) (0.4.15)
Requirement already satisfied: numpy>=1.16.6 in /home/tobias/.local/lib/python3.8/site-packages (from pyarrow>=5.0.0->sqlalchemy_dremio) (1.24.2)
>>> import sqlalchemy as sa
>>> engine = sa.create_engine('dremio+flight://dremio:dremio123@localhost:31010/dremio?UseEncryption=false&disableCertificateVerification=true', echo=True)
>>> con = engine.connect()
...
 File "pyarrow/_flight.pyx", line 1398, in pyarrow._flight.FlightClient.authenticate_basic_token
  File "pyarrow/_flight.pyx", line 71, in pyarrow._flight.check_flight_status
pyarrow._flight.FlightUnavailableError: Flight returned unavailable error, with message: failed to connect to all addresses; last error: UNAVAILABLE: ipv4:127.0.0.1:31010: Socket closed
>>>

The relevant section of my dremio.conf looks as follows:

services: {
  coordinator.enabled: true,
  coordinator.master.enabled: true,
  executor.enabled: true,
  flight.enabled: true,
  #flight.auth.mode: "legacy.arrow.flight.auth"
  flight.auth.mode: "arrow.flight.auth2"
  #flight.use_session_service: true,
  #flight.ssl.enabled: true,
  #flight.ssl.auto-certificate.enabled: true
}

running converted dremio rpm

Hi,

Followed your instructions running on ubuntu/debian. After converting to .deb and running using dpkg, I got the following, not sure whether it is a warning or error.

/var/lib/dpkg/info/dremio-odbc.postrm: line 2: [: upgrade: integer expression expected

I tested running alien -i dremio-odbc-LATEST.x86_64.rpm, without converting, gave the same message.

I am using -
alien version 8.95
dremio-odbc-LATEST.x86_64.rpm

Parametrized statements - Issue with quoted strings

Hello,

We are currently experimenting with the parametrized statements support recently brought by b39604e and faced an issue with the generated queries when the parameter's value contains single quotes.

As an example, the statement SELECT * FROM cereal WHERE brand IN (?, ?) having Kellog's and Nestle as values for the parameters will result in the following SQL statement, which syntax is invalid SELECT * FROM cereal WHERE brand IN ('Kellog's', 'Nestle').

The observed behavior is likely due to the following code, which does not seem to properly escape string parameters :

# Workaround since Dremio does not support parameterized stmts
# Old queries should not have used queries with parameters, since Dremio does not support it
# and these queries failed. If there is no parameter, everything should work as before.
def do_execute(self, cursor, statement, parameters, context):
replaced_stmt = statement
for v in parameters:
if isinstance(v, (int, float)):
replaced_stmt = replaced_stmt.replace('?', str(v), 1)
else:
replaced_stmt = replaced_stmt.replace('?', "'" + str(v) + "'", 1)

Beyond the syntax issue with the generated queries, this implementation could also make the code vulnerable to SQL injection.

dremio+flight connection issue

Hi,

For ODBC connection with dremio://user:password@host:port/dremio;SSL=0 and querying SELECT * FROM Samples."samples.dremio.com"."NYC-taxi-trips" limit 10 and SELECT * FROM Samples."samples.dremio.com"."NYC-taxi-trips" limit 1000000 gives almost same time (0.014818636999999968 s). I am wondering if it's working correctly! Both Dremio and this client are local.

When I set connection string to dremio+flight://user:password@host:port/dremio it gives error:

/dremio/sqlalchemy_dremio$ py.test test
=========================================================================================== test session starts ============================================================================================
platform linux -- Python 3.6.9, pytest-6.2.3, py-1.10.0, pluggy-0.13.1
rootdir: /home/tahmad/ogits/dremio/sqlalchemy_dremio/test, configfile: pytest.ini
collected 3 items                                                                                                                                                                                          

test/test_dremio.py::test_connect_args ERROR                                                                                                                                                         [ 33%]
test/test_dremio.py::test_simple_sql ERROR                                                                                                                                                           [ 66%]
test/test_dremio.py::test_row_count ERROR                                                                                                                                                            [100%]

================================================================================================== ERRORS ==================================================================================================
___________________________________________________________________________________ ERROR at setup of test_connect_args ____________________________________________________________________________________

request = <SubRequest 'init_test_schema' for <Function test_connect_args>>

    @pytest.fixture(scope='session', autouse=True)
    def init_test_schema(request):
        test_sql = Path("scripts/sample.sql")
>       get_engine().execute(open(test_sql).read())

test/conftest.py:32: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../../.local/lib/python3.6/site-packages/sqlalchemy/util/deprecations.py:390: in warned
    return fn(*args, **kwargs)
../../../.local/lib/python3.6/site-packages/sqlalchemy/engine/base.py:3036: in execute
    connection = self.connect(close_with_result=True)
../../../.local/lib/python3.6/site-packages/sqlalchemy/engine/base.py:3095: in connect
    return self._connection_cls(self, close_with_result=close_with_result)
../../../.local/lib/python3.6/site-packages/sqlalchemy/engine/base.py:91: in __init__
    else engine.raw_connection()
../../../.local/lib/python3.6/site-packages/sqlalchemy/engine/base.py:3174: in raw_connection
    return self._wrap_pool_connect(self.pool.connect, _connection)
../../../.local/lib/python3.6/site-packages/sqlalchemy/engine/base.py:3141: in _wrap_pool_connect
    return fn()
../../../.local/lib/python3.6/site-packages/sqlalchemy/pool/impl.py:383: in connect
    return _ConnectionFairy._checkout(self, self._fairy)
../../../.local/lib/python3.6/site-packages/sqlalchemy/pool/base.py:755: in _checkout
    fairy = _ConnectionRecord.checkout(pool)
../../../.local/lib/python3.6/site-packages/sqlalchemy/pool/base.py:419: in checkout
    rec = pool._do_get()
../../../.local/lib/python3.6/site-packages/sqlalchemy/pool/impl.py:365: in _do_get
    c = self._create_connection()
../../../.local/lib/python3.6/site-packages/sqlalchemy/pool/base.py:247: in _create_connection
    return _ConnectionRecord(self)
../../../.local/lib/python3.6/site-packages/sqlalchemy/pool/base.py:362: in __init__
    self.__connect(first_connect_check=True)
../../../.local/lib/python3.6/site-packages/sqlalchemy/pool/base.py:605: in __connect
    pool.logger.debug("Error on connect(): %s", e)
../../../.local/lib/python3.6/site-packages/sqlalchemy/util/langhelpers.py:72: in __exit__
    with_traceback=exc_tb,
../../../.local/lib/python3.6/site-packages/sqlalchemy/util/compat.py:211: in raise_
    raise exception
../../../.local/lib/python3.6/site-packages/sqlalchemy/pool/base.py:599: in __connect
    connection = pool._invoke_creator(self)
../../../.local/lib/python3.6/site-packages/sqlalchemy/engine/create.py:578: in connect
    return dialect.connect(*cargs, **cparams)
sqlalchemy_dremio/flight.py:196: in connect
    return self.dbapi.connect(*cargs, **cparams)
sqlalchemy_dremio/db.py:20: in connect
    return Connection(c)
sqlalchemy_dremio/db.py:53: in __init__
    client.authenticate(HttpDremioClientAuthHandler(splits[0].split("=")[1], splits[1].split("=")[1]))
pyarrow/_flight.pyx:1155: in pyarrow._flight.FlightClient.authenticate
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   ???
E   pyarrow._flight.FlightUnavailableError: gRPC returned unavailable error, with message: failed to connect to all addresses

pyarrow/_flight.pyx:70: FlightUnavailableError
____________________________________________________________________________________ ERROR at setup of test_simple_sql _____________________________________________________________________________________

request = <SubRequest 'init_test_schema' for <Function test_connect_args>>

    @pytest.fixture(scope='session', autouse=True)
    def init_test_schema(request):
        test_sql = Path("scripts/sample.sql")
>       get_engine().execute(open(test_sql).read())

test/conftest.py:32: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../../.local/lib/python3.6/site-packages/sqlalchemy/util/deprecations.py:390: in warned
    return fn(*args, **kwargs)
../../../.local/lib/python3.6/site-packages/sqlalchemy/engine/base.py:3036: in execute
    connection = self.connect(close_with_result=True)
../../../.local/lib/python3.6/site-packages/sqlalchemy/engine/base.py:3095: in connect
    return self._connection_cls(self, close_with_result=close_with_result)
../../../.local/lib/python3.6/site-packages/sqlalchemy/engine/base.py:91: in __init__
    else engine.raw_connection()
../../../.local/lib/python3.6/site-packages/sqlalchemy/engine/base.py:3174: in raw_connection
    return self._wrap_pool_connect(self.pool.connect, _connection)
../../../.local/lib/python3.6/site-packages/sqlalchemy/engine/base.py:3141: in _wrap_pool_connect
    return fn()
../../../.local/lib/python3.6/site-packages/sqlalchemy/pool/impl.py:383: in connect
    return _ConnectionFairy._checkout(self, self._fairy)
../../../.local/lib/python3.6/site-packages/sqlalchemy/pool/base.py:755: in _checkout
    fairy = _ConnectionRecord.checkout(pool)
../../../.local/lib/python3.6/site-packages/sqlalchemy/pool/base.py:419: in checkout
    rec = pool._do_get()
../../../.local/lib/python3.6/site-packages/sqlalchemy/pool/impl.py:365: in _do_get
    c = self._create_connection()
../../../.local/lib/python3.6/site-packages/sqlalchemy/pool/base.py:247: in _create_connection
    return _ConnectionRecord(self)
../../../.local/lib/python3.6/site-packages/sqlalchemy/pool/base.py:362: in __init__
    self.__connect(first_connect_check=True)
../../../.local/lib/python3.6/site-packages/sqlalchemy/pool/base.py:605: in __connect
    pool.logger.debug("Error on connect(): %s", e)
../../../.local/lib/python3.6/site-packages/sqlalchemy/util/langhelpers.py:72: in __exit__
    with_traceback=exc_tb,
../../../.local/lib/python3.6/site-packages/sqlalchemy/util/compat.py:211: in raise_
    raise exception
../../../.local/lib/python3.6/site-packages/sqlalchemy/pool/base.py:599: in __connect
    connection = pool._invoke_creator(self)
../../../.local/lib/python3.6/site-packages/sqlalchemy/engine/create.py:578: in connect
    return dialect.connect(*cargs, **cparams)
sqlalchemy_dremio/flight.py:196: in connect
    return self.dbapi.connect(*cargs, **cparams)
sqlalchemy_dremio/db.py:20: in connect
    return Connection(c)
sqlalchemy_dremio/db.py:53: in __init__
    client.authenticate(HttpDremioClientAuthHandler(splits[0].split("=")[1], splits[1].split("=")[1]))
pyarrow/_flight.pyx:1155: in pyarrow._flight.FlightClient.authenticate
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   ???
E   pyarrow._flight.FlightUnavailableError: gRPC returned unavailable error, with message: failed to connect to all addresses

pyarrow/_flight.pyx:70: FlightUnavailableError
_____________________________________________________________________________________ ERROR at setup of test_row_count _____________________________________________________________________________________

request = <SubRequest 'init_test_schema' for <Function test_connect_args>>

    @pytest.fixture(scope='session', autouse=True)
    def init_test_schema(request):
        test_sql = Path("scripts/sample.sql")
>       get_engine().execute(open(test_sql).read())

test/conftest.py:32: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../../.local/lib/python3.6/site-packages/sqlalchemy/util/deprecations.py:390: in warned
    return fn(*args, **kwargs)
../../../.local/lib/python3.6/site-packages/sqlalchemy/engine/base.py:3036: in execute
    connection = self.connect(close_with_result=True)
../../../.local/lib/python3.6/site-packages/sqlalchemy/engine/base.py:3095: in connect
    return self._connection_cls(self, close_with_result=close_with_result)
../../../.local/lib/python3.6/site-packages/sqlalchemy/engine/base.py:91: in __init__
    else engine.raw_connection()
../../../.local/lib/python3.6/site-packages/sqlalchemy/engine/base.py:3174: in raw_connection
    return self._wrap_pool_connect(self.pool.connect, _connection)
../../../.local/lib/python3.6/site-packages/sqlalchemy/engine/base.py:3141: in _wrap_pool_connect
    return fn()
../../../.local/lib/python3.6/site-packages/sqlalchemy/pool/impl.py:383: in connect
    return _ConnectionFairy._checkout(self, self._fairy)
../../../.local/lib/python3.6/site-packages/sqlalchemy/pool/base.py:755: in _checkout
    fairy = _ConnectionRecord.checkout(pool)
../../../.local/lib/python3.6/site-packages/sqlalchemy/pool/base.py:419: in checkout
    rec = pool._do_get()
../../../.local/lib/python3.6/site-packages/sqlalchemy/pool/impl.py:365: in _do_get
    c = self._create_connection()
../../../.local/lib/python3.6/site-packages/sqlalchemy/pool/base.py:247: in _create_connection
    return _ConnectionRecord(self)
../../../.local/lib/python3.6/site-packages/sqlalchemy/pool/base.py:362: in __init__
    self.__connect(first_connect_check=True)
../../../.local/lib/python3.6/site-packages/sqlalchemy/pool/base.py:605: in __connect
    pool.logger.debug("Error on connect(): %s", e)
../../../.local/lib/python3.6/site-packages/sqlalchemy/util/langhelpers.py:72: in __exit__
    with_traceback=exc_tb,
../../../.local/lib/python3.6/site-packages/sqlalchemy/util/compat.py:211: in raise_
    raise exception
../../../.local/lib/python3.6/site-packages/sqlalchemy/pool/base.py:599: in __connect
    connection = pool._invoke_creator(self)
../../../.local/lib/python3.6/site-packages/sqlalchemy/engine/create.py:578: in connect
    return dialect.connect(*cargs, **cparams)
sqlalchemy_dremio/flight.py:196: in connect
    return self.dbapi.connect(*cargs, **cparams)
sqlalchemy_dremio/db.py:20: in connect
    return Connection(c)
sqlalchemy_dremio/db.py:53: in __init__
    client.authenticate(HttpDremioClientAuthHandler(splits[0].split("=")[1], splits[1].split("=")[1]))
pyarrow/_flight.pyx:1155: in pyarrow._flight.FlightClient.authenticate
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   ???
E   pyarrow._flight.FlightUnavailableError: gRPC returned unavailable error, with message: failed to connect to all addresses

pyarrow/_flight.pyx:70: FlightUnavailableError
========================================================================================= short test summary info ==========================================================================================
ERROR test/test_dremio.py::test_connect_args - pyarrow._flight.FlightUnavailableError: gRPC returned unavailable error, with message: failed to connect to all addresses
ERROR test/test_dremio.py::test_simple_sql - pyarrow._flight.FlightUnavailableError: gRPC returned unavailable error, with message: failed to connect to all addresses
ERROR test/test_dremio.py::test_row_count - pyarrow._flight.FlightUnavailableError: gRPC returned unavailable error, with message: failed to connect to all addresses
======================================================================================= 3 errors in 77.79s (0:01:17) =======================================================================================

NotImplementedError for dremio+flight dialect

Hi @narendrans

First of all thanks for you contribution to the oss community.

I am trying to configure superset using sqlalchemy_dremio with dremio+flight.Connection test looks good and I was able to run queries using sql lab but when I tried add dataset superset raises below error.

AFAIU issue occurs because of the missing implementations. I see with the following commit, you've already implemented the has_tabe method for dremio dialect but it's missing in dremio+flight: a004b1c

Error:

| superset     return f(self, *args, **kwargs)                                                           
โ”‚ superset   File "/app/superset/views/base_api.py", line 85, in wraps                                   
โ”‚ superset     raise ex                                                                                  
โ”‚ superset   File "/app/superset/views/base_api.py", line 82, in wraps                                   
โ”‚ superset     duration, response = time_function(f, self, *args, **kwargs)                              
โ”‚ superset   File "/app/superset/utils/core.py", line 1468, in time_function                             
โ”‚ superset     response = func(*args, **kwargs)                                                          
โ”‚ superset   File "/app/superset/utils/log.py", line 242, in wrapper                                     
โ”‚ superset     value = f(*args, **kwargs)                                                                
โ”‚ superset   File "/app/superset/datasets/api.py", line 252, in post                                     
โ”‚ superset     new_model = CreateDatasetCommand(g.user, item).run()                                      
โ”‚ superset   File "/app/superset/datasets/commands/create.py", line 51, in run                           
โ”‚ superset     dataset.fetch_metadata(commit=False)                                                      
โ”‚ superset   File "/app/superset/connectors/sqla/models.py", line 1531, in fetch_metadata                
โ”‚ superset     new_columns = self.external_metadata()                                                    
โ”‚ superset   File "/app/superset/connectors/sqla/models.py", line 666, in external_metadata              
โ”‚ superset     return get_physical_table_metadata(                                                       
โ”‚ superset   File "/app/superset/connectors/sqla/utils.py", line 48, in get_physical_table_metadata      
โ”‚ superset     database.has_table_by_name(table_name=table_name, schema=_schema_name)                    
โ”‚ superset   File "/app/superset/models/core.py", line 734, in has_table_by_name                         
โ”‚ superset     return engine.has_table(table_name, schema)                                               
โ”‚ superset   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 2331, in has_ 
โ”‚ table                                                                                                  
โ”‚ superset     return self.run_callable(self.dialect.has_table, table_name, schema)                      
โ”‚ superset   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 2212, in run_ 
โ”‚ callable                                                                                               
โ”‚ superset     return conn.run_callable(callable_, *args, **kwargs)                                      
โ”‚ superset   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1653, in run_ 
โ”‚ callable                                                                                               
โ”‚ superset     return callable_(self, *args, **kwargs)                                                   
โ”‚ superset   File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/interfaces.py", line 479, in 
โ”‚  has_table                                                                                             
โ”‚ superset     raise NotImplementedError()                                                               
โ”‚ superset NotImplementedError

Superset Dremio Connectivity

Hi,

We have hosted Dremio and Superset on an AKS Cluster in Azure and we are trying to connect Superset to the Dremio Database(Lakehouse) for fetching some dashboards. We have installed all the required drivers(arrowflight, sqlalchemy_dremio and unixodc/dev) to establish the connection.

Strangely we are able not able to connect to Dremio from the Superset UI using the connection strings:

dremio+flight://admin:[email protected]:32010/dremio
dremio://admin:[email protected]:31010/databaseschema.dataset/dremio?SSL=0
Here's the error:

(builtins.NoneType) None\n(Background on this error at: https://sqlalche.me/e/14/dbapi)", "error_type": "GENERIC_DB_ENGINE_ERROR", "level": "error", "extra": {"engine_name": "Dremio", "issue_codes": [{"code": 1002, "message": "Issue 1002 - The database returned an unexpected error."}]}}]

However, while trying from inside the Superset pod, using this python script here, the connection goes through without any issues.

All queries sent from client query INFORMATION_SCHEMA."TABLES"

Dremio is receiving the following query along with every request:
SELECT COUNT(*) FROM INFORMATION_SCHEMA."TABLES" WHERE TABLE_NAME = '{query text here}'

Is this expected behavior?

I installed the master branch from the repo, running dremio CE 24.

AttributeError: 'DremioDialect_flight' object has no attribute 'driver' #37

Querybook:3.28.0
sqlalchemy-dremio 3.0.3

'DremioDialect_flight' object has no attribute 'driver'
Stack trace:
  File "/opt/querybook/querybook/server/lib/query_executor/base_executor.py", line 552, in poll
    self.start()
  File "/opt/querybook/querybook/server/lib/query_executor/base_executor.py", line 547, in start
    self._run_next_statement()
  File "/opt/querybook/querybook/server/lib/query_executor/base_executor.py", line 600, in _run_next_statement
    self._execute(statement)
  File "/opt/querybook/querybook/server/lib/query_executor/base_executor.py", line 640, in _execute
    self._cursor.run(statement)
  File "/opt/querybook/querybook/server/lib/query_executor/clients/sqlalchemy.py", line 47, in run
    self._cursor = self._connection.execute(sqlalchemy.sql.text(query))
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1306, in execute
    return meth(self, multiparams, params, _EMPTY_EXECUTION_OPTS)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/sql/elements.py", line 332, in _execute_on_connection
    return connection._execute_clauseelement(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1490, in _execute_clauseelement
    compiled_sql, extracted_params, cache_hit = elem._compile_w_cache(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/sql/elements.py", line 512, in _compile_w_cache
    if compiled_cache is not None and dialect._supports_statement_cache:
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/util/langhelpers.py", line 1113, in __get__
    obj.__dict__[self.__name__] = result = self.fget(obj)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 365, in _supports_statement_cache
    "warning." % (self.name, self.driver),

Performance optimization of flight

Summary

Performance of queries via flight protocol is not optimal.

Test case

Running a query that returns 1M records. Both Dremio and the client is local.

Dataset: Samples."samples.dremio.com"."NYC-taxi-trips" (LIMIT 1000000)

Flight: 3.375s - Execution time in Dremio: <1s
ODBC: 3.703125

It looks like the time is spent entirely on the client side processing even though the data has been sent from the server in less than 1 second. Could it be this? https://github.com/narendrans/sqlalchemy_dremio/blob/master/sqlalchemy_dremio/query.py#L17

Statistics:

image

Call Graph:

flight_perf_profiler

Superset - NaTType does not support utcoffset

Hi team,

I have issue about select data dremio from superset.

Column type in table is date but some have null value. Then my superset got this error

Dremio Error
NaTType does not support utcoffset

This may be triggered by:
Issue 1002 - The database returned an unexpected error.

See less

Pls help me. tks

Problem installing version 2.0.0 with pip

Hello,

I'm trying to install the 2.0.0 version of the library via pip, which is giving me the following error:

$ pip install sqlalchemy-dremio==2.0.0
ERROR: Could not find a version that satisfies the requirement sqlalchemy-dremio==2.0.0 (from versions: 1.1.0, 1.1.1, 1.1.2, 1.1.3, 1.1.4, 1.1.5, 1.2.0, 1.2.1)
ERROR: No matching distribution found for sqlalchemy-dremio==2.0.0

It seems that pip can't find the latest version, running the same command but in verbose mode (-vv) I got the current log (truncated for readability):

$ pip install sqlalchemy-dremio==2.0.0 -vv
https://pypi.org:443 "GET /simple/sqlalchemy-dremio/ HTTP/1.1" 304 0
Fetched page https://pypi.org/simple/sqlalchemy-dremio/ as application/vnd.pypi.simple.v1+json
[...]
   Skipping link: unsupported archive format: .egg: https://files.pythonhosted.org/packages/93/ef/bd72a44fe3a611f9f0992eb018fd6f40bcbbb689dc2501cbed312e950959/sqlalchemy_dremio-2.0.0-py3.9.egg (from https://pypi.org/simple/sqlalchemy-dremio/)
[...]

I'm no expert in python packaging, but it looks like that the 2.0.0 version of the lib was published as .egg format which pip does not support (as far I'm aware).

I tried installing the lib via git, I'm using the Superset tutorial for adding new database drivers in docker, but the Superset Docker image doesn't contains git installed by default, so I installed git in the Superset Docker image and successfully installed the master branch of the sqlalchemy-dremio in the Superset container, but when I got to create the new Dremio connection, the Dremio option doesn't seems to be available in the Superset Connect Database interface.

Am I doing something wrong? Is there something more that I can do to install the Dremio connector in the Superset image?

pytest fails sqlalchemy.exc.StatementError: (builtins.KeyError) 'object' for test query

['UID=dremio', 'PWD=dremio123', 'HOST=localhost', 'PORT=32010', 'Schema=dremio', 'SSL=0']
['UID', 'dremio']
['PWD', 'dremio123']
['HOST', 'localhost']
['PORT', '32010']
['Schema', 'dremio']
['SSL', '0']
Traceback (most recent call last):
  File "/Users/naren/opt/anaconda3/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1351, in _cursor_execute
    self.dialect.do_execute(cursor, statement, parameters, context)
  File "/Users/naren/opt/anaconda3/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 608, in do_execute
    cursor.execute(statement, parameters)
  File "/Users/naren/dev/sqlalchemy_dremio/sqlalchemy_dremio/db.py", line 30, in g
    return f(self, *args, **kwargs)
  File "/Users/naren/dev/sqlalchemy_dremio/sqlalchemy_dremio/db.py", line 149, in execute
    self._results, self.description = execute(
  File "/Users/naren/dev/sqlalchemy_dremio/sqlalchemy_dremio/query.py", line 68, in execute
    o = (x, _type_map[str(y.name)], None, None, True)
KeyError: 'object'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/naren/dev/sqlalchemy_dremio/test.py", line 9, in <module>
    result = engine.execute(sql)
  File "/Users/naren/opt/anaconda3/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 2234, in execute
    connection = self._contextual_connect(close_with_result=True)
  File "/Users/naren/opt/anaconda3/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 2302, in _contextual_connect
    self._wrap_pool_connect(self.pool.connect, None),
  File "/Users/naren/opt/anaconda3/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 2336, in _wrap_pool_connect
    return fn()
  File "/Users/naren/opt/anaconda3/lib/python3.9/site-packages/sqlalchemy/pool/impl.py", line 366, in connect
    return _ConnectionFairy._checkout(self, self._fairy)
  File "/Users/naren/opt/anaconda3/lib/python3.9/site-packages/sqlalchemy/pool/base.py", line 778, in _checkout
    fairy = _ConnectionRecord.checkout(pool)
  File "/Users/naren/opt/anaconda3/lib/python3.9/site-packages/sqlalchemy/pool/base.py", line 495, in checkout
    rec = pool._do_get()
  File "/Users/naren/opt/anaconda3/lib/python3.9/site-packages/sqlalchemy/pool/impl.py", line 349, in _do_get
    c = self._create_connection()
  File "/Users/naren/opt/anaconda3/lib/python3.9/site-packages/sqlalchemy/pool/base.py", line 309, in _create_connection
    return _ConnectionRecord(self)
  File "/Users/naren/opt/anaconda3/lib/python3.9/site-packages/sqlalchemy/pool/base.py", line 440, in __init__
    self.__connect(first_connect_check=True)
  File "/Users/naren/opt/anaconda3/lib/python3.9/site-packages/sqlalchemy/pool/base.py", line 664, in __connect
    pool.dispatch.first_connect.for_modify(
  File "/Users/naren/opt/anaconda3/lib/python3.9/site-packages/sqlalchemy/event/attr.py", line 314, in exec_once_unless_exception
    self._exec_once_impl(True, *args, **kw)
  File "/Users/naren/opt/anaconda3/lib/python3.9/site-packages/sqlalchemy/event/attr.py", line 285, in _exec_once_impl
    self(*args, **kw)
  File "/Users/naren/opt/anaconda3/lib/python3.9/site-packages/sqlalchemy/event/attr.py", line 322, in __call__
    fn(*args, **kw)
  File "/Users/naren/opt/anaconda3/lib/python3.9/site-packages/sqlalchemy/util/langhelpers.py", line 1406, in go
    return once_fn(*arg, **kw)
  File "/Users/naren/opt/anaconda3/lib/python3.9/site-packages/sqlalchemy/engine/strategies.py", line 199, in first_connect
    dialect.initialize(c)
  File "/Users/naren/opt/anaconda3/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 330, in initialize
    self.returns_unicode_strings = self._check_unicode_returns(connection)
  File "/Users/naren/opt/anaconda3/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 433, in _check_unicode_returns
    results = {check_unicode(test) for test in tests}
  File "/Users/naren/opt/anaconda3/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 433, in <setcomp>
    results = {check_unicode(test) for test in tests}
  File "/Users/naren/opt/anaconda3/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 402, in check_unicode
    connection._cursor_execute(cursor, statement, parameters)
  File "/Users/naren/opt/anaconda3/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1353, in _cursor_execute
    self._handle_dbapi_exception(
  File "/Users/naren/opt/anaconda3/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1510, in _handle_dbapi_exception
    util.raise_(
  File "/Users/naren/opt/anaconda3/lib/python3.9/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
    raise exception
  File "/Users/naren/opt/anaconda3/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1351, in _cursor_execute
    self.dialect.do_execute(cursor, statement, parameters, context)
  File "/Users/naren/opt/anaconda3/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 608, in do_execute
    cursor.execute(statement, parameters)
  File "/Users/naren/dev/sqlalchemy_dremio/sqlalchemy_dremio/db.py", line 30, in g
    return f(self, *args, **kwargs)
  File "/Users/naren/dev/sqlalchemy_dremio/sqlalchemy_dremio/db.py", line 149, in execute
    self._results, self.description = execute(
  File "/Users/naren/dev/sqlalchemy_dremio/sqlalchemy_dremio/query.py", line 68, in execute
    o = (x, _type_map[str(y.name)], None, None, True)
sqlalchemy.exc.StatementError: (builtins.KeyError) 'object'
[SQL: SELECT CAST('test plain returns' AS VARCHAR(60)) AS anon_1]```

Change driver path for Mac OS X

Fix: Unexpected error (pyodbc.Error) ('01000', "[01000] [unixODBC][Driver Manager]Can't open lib 'Dremio Connector' : file not found (0) (SQLDriverConnect)")

Driver files unavailable

I know this is a long shot, but dremio ODBC driver files are no longer available for download on the default dremio page and they refuse to supply them again and don't give us a reason: https://community.dremio.com/t/odbc-drivers-download/9220/8

Would it be possible for you to supply them in this project? Assuming you have some version of them locally stored.

I need to connect to an existing dremio instance via Apache superset and it uses this project

SSL=0 still try to verify SSL cert

I try to test the new driver without SSL, but looks not work. Is this expected?

>>> from sqlalchemy import *
>>> db_uri = "dremio+flight://dremio:dremio123@localhost:32010/dremio;SSL=0"
>>> engine = create_engine(db_uri)
>>> sql = 'SELECT * FROM sys.options limit 5 -- SQL Alchemy Flight Test '
>>>
>>> result = engine.execute(sql)

['UID=dremio', 'PWD=dremio123', 'HOST=localhost', 'PORT=32010', 'Schema=dremio', 'SSL=0']


E0922 22:19:30.986260000 4618952128 ssl_utils.cc:570] load_file: {"created":"@1663856370.986224000","description":"Failed to load file","file":"/Users/runner/work/crossbow/crossbow/vcpkg/buildtrees/grpc/src/17cc203898-db2679e7f2.clean/src/core/lib/iomgr/load_file.cc","file_line":72,"filename":"/usr/share/grpc/roots.pem","referenced_errors":[{"created":"@1663856370.986220000","description":"No such file or directory","errno":2,"file":"/Users/runner/work/crossbow/crossbow/vcpkg/buildtrees/grpc/src/17cc203898-db2679e7f2.clean/src/core/lib/iomgr/load_file.cc","file_line":45,"os_error":"No such file or directory","syscall":"fopen"}]}
E0922 22:19:30.986294000 4618952128 ssl_security_connector.cc:413] Could not get default pem root certs.
E0922 22:19:30.986304000 4618952128 secure_channel_create.cc:108] Failed to create secure subchannel for secure name 'localhost:32010'
E0922 22:19:30.986313000 4618952128 secure_channel_create.cc:50] Failed to create channel args during subchannel creation.
E0922 22:19:30.986352000 4618952128 ssl_security_connector.cc:413] Could not get default pem root certs.
E0922 22:19:30.986360000 4618952128 secure_channel_create.cc:108] Failed to create secure subchannel for secure name 'localhost:32010'
E0922 22:19:30.986367000 4618952128 secure_channel_create.cc:50] Failed to create channel args during subchannel creation.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.