
clickhouse-sqlalchemy's Issues

AttributeError: 'DropTable' object has no attribute 'if_exists'

subj

Code to reproduce:

import pandas as pd
from sqlalchemy import create_engine

engine = create_engine('clickhouse://.../testing')
df = pd.DataFrame({'value': [0, 1], 'str': ['qwer', 'asdf']})

conn = engine.connect()
df.to_sql('test_drop', conn, if_exists='replace', index=False)
conn.close()

Clickhouse:

CREATE TABLE testing.test_drop (value Int32, str String) ENGINE = TinyLog;

Full exception:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-20-56f4dbb057f1> in <module>()
      1 with DBConnection('ch_credentials.pkl', dbname='testing') as conn:
----> 2     test_df.to_sql('test_drop', conn, if_exists='replace', index=False)

/usr/local/lib/python3.6/site-packages/pandas/core/generic.py in to_sql(self, name, con, flavor, schema, if_exists, index, index_label, chunksize, dtype)
   1532         sql.to_sql(self, name, con, flavor=flavor, schema=schema,
   1533                    if_exists=if_exists, index=index, index_label=index_label,
-> 1534                    chunksize=chunksize, dtype=dtype)
   1535 
   1536     def to_pickle(self, path, compression='infer',

/usr/local/lib/python3.6/site-packages/pandas/io/sql.py in to_sql(frame, name, con, flavor, schema, if_exists, index, index_label, chunksize, dtype)
    471     pandas_sql.to_sql(frame, name, if_exists=if_exists, index=index,
    472                       index_label=index_label, schema=schema,
--> 473                       chunksize=chunksize, dtype=dtype)
    474 
    475 

/usr/local/lib/python3.6/site-packages/pandas/io/sql.py in to_sql(self, frame, name, if_exists, index, index_label, schema, chunksize, dtype)
   1153                          if_exists=if_exists, index_label=index_label,
   1154                          schema=schema, dtype=dtype)
-> 1155         table.create()
   1156         table.insert(chunksize)
   1157         if (not name.isdigit() and not name.islower()):

/usr/local/lib/python3.6/site-packages/pandas/io/sql.py in create(self)
    592                 raise ValueError("Table '%s' already exists." % self.name)
    593             elif self.if_exists == 'replace':
--> 594                 self.pd_sql.drop_table(self.name, self.schema)
    595                 self._execute_create()
    596             elif self.if_exists == 'append':

/usr/local/lib/python3.6/site-packages/pandas/io/sql.py in drop_table(self, table_name, schema)
   1203         if self.has_table(table_name, schema):
   1204             self.meta.reflect(only=[table_name], schema=schema)
-> 1205             self.get_table(table_name, schema).drop()
   1206             self.meta.clear()
   1207 

/usr/local/lib/python3.6/site-packages/sqlalchemy/sql/schema.py in drop(self, bind, checkfirst)
    792         bind._run_visitor(ddl.SchemaDropper,
    793                           self,
--> 794                           checkfirst=checkfirst)
    795 
    796     def tometadata(self, metadata, schema=RETAIN_SCHEMA,

/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py in _run_visitor(self, visitorcallable, element, **kwargs)
   1547     def _run_visitor(self, visitorcallable, element, **kwargs):
   1548         visitorcallable(self.dialect, self,
-> 1549                         **kwargs).traverse_single(element)
   1550 
   1551 

/usr/local/lib/python3.6/site-packages/sqlalchemy/sql/visitors.py in traverse_single(self, obj, **kw)
    119             meth = getattr(v, "visit_%s" % obj.__visit_name__, None)
    120             if meth:
--> 121                 return meth(obj, **kw)
    122 
    123     def iterate(self, obj):

/usr/local/lib/python3.6/site-packages/sqlalchemy/sql/ddl.py in visit_table(self, table, drop_ok, _is_metadata_operation)
    953                 self.traverse_single(column.default)
    954 
--> 955         self.connection.execute(DropTable(table))
    956 
    957         table.dispatch.after_drop(

/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py in execute(self, object, *multiparams, **params)
    946             raise exc.ObjectNotExecutableError(object)
    947         else:
--> 948             return meth(self, multiparams, params)
    949 
    950     def _execute_function(self, func, multiparams, params):

/usr/local/lib/python3.6/site-packages/sqlalchemy/sql/ddl.py in _execute_on_connection(self, connection, multiparams, params)
     66 
     67     def _execute_on_connection(self, connection, multiparams, params):
---> 68         return connection._execute_ddl(self, multiparams, params)
     69 
     70     def execute(self, bind=None, target=None):

/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py in _execute_ddl(self, ddl, multiparams, params)
   1001             dialect=dialect,
   1002             schema_translate_map=self.schema_for_object
-> 1003             if not self.schema_for_object.is_default else None)
   1004         ret = self._execute_context(
   1005             dialect,

<string> in <lambda>(self, bind, dialect, **kw)

/usr/local/lib/python3.6/site-packages/sqlalchemy/sql/elements.py in compile(self, default, bind, dialect, **kw)
    440             else:
    441                 dialect = default.StrCompileDialect()
--> 442         return self._compiler(dialect, bind=bind, **kw)
    443 
    444     def _compiler(self, dialect, **kw):

/usr/local/lib/python3.6/site-packages/sqlalchemy/sql/ddl.py in _compiler(self, dialect, **kw)
     24         Dialect."""
     25 
---> 26         return dialect.ddl_compiler(dialect, self, **kw)
     27 
     28 

/usr/local/lib/python3.6/site-packages/sqlalchemy/sql/compiler.py in __init__(self, dialect, statement, bind, schema_translate_map, compile_kwargs)
    217             if self.can_execute:
    218                 self.execution_options = statement._execution_options
--> 219             self.string = self.process(self.statement, **compile_kwargs)
    220 
    221     @util.deprecated("0.7", ":class:`.Compiled` objects now compile "

/usr/local/lib/python3.6/site-packages/sqlalchemy/sql/compiler.py in process(self, obj, **kwargs)
    243 
    244     def process(self, obj, **kwargs):
--> 245         return obj._compiler_dispatch(self, **kwargs)
    246 
    247     def __str__(self):

/usr/local/lib/python3.6/site-packages/sqlalchemy/sql/visitors.py in _compiler_dispatch(self, visitor, **kw)
     79                     raise exc.UnsupportedCompilationError(visitor, cls)
     80                 else:
---> 81                     return meth(self, **kw)
     82         else:
     83             # The optimization opportunity is lost for this case because the

~/Projects/Git/clickhouse-sqlalchemy/clickhouse_sqlalchemy/drivers/base.py in visit_drop_table(self, drop)
    293         text = '\nDROP TABLE '
    294 
--> 295         if drop.if_exists:
    296             text += 'IF EXISTS '
    297 

AttributeError: 'DropTable' object has no attribute 'if_exists'

Library version: '0.0.5', installed from the master branch.
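A minimal sketch of a defensive fix (an assumption, not the project's actual patch): pandas emits a plain sqlalchemy DropTable, which lacks the dialect's custom `if_exists` attribute, so the compiler should read it with getattr() and a default instead of direct attribute access. The classes below are toy stand-ins, not the library's.

```python
class DropTableSketch:
    """Stand-in for sqlalchemy's DropTable; carries only a table name."""
    def __init__(self, name, if_exists=None):
        self.name = name
        if if_exists is not None:           # only set when explicitly requested
            self.if_exists = if_exists

def visit_drop_table(drop):
    text = 'DROP TABLE '
    if getattr(drop, 'if_exists', False):   # tolerates plain DropTable objects
        text += 'IF EXISTS '
    return text + drop.name

print(visit_drop_table(DropTableSketch('test_drop')))                  # DROP TABLE test_drop
print(visit_drop_table(DropTableSketch('test_drop', if_exists=True)))  # DROP TABLE IF EXISTS test_drop
```

With this shape, pandas' `if_exists='replace'` path (which drops via a stock DropTable) would no longer raise AttributeError.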

Select count() generator raised StopIteration

Hi,
trying a select count() from a ClickHouse database raises "generator raised StopIteration":

clickhouse_sqlalchemy/drivers/http/connector.py", line 193, in _process_response
    self._columns = next(response, None)
RuntimeError: generator raised StopIteration
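This looks like PEP 479 mechanics (an assumption from the trace: the http connector's response generator lets an inner StopIteration escape). Since Python 3.7, a StopIteration leaking out of a generator body is converted to RuntimeError; the fix is to catch it and return instead:

```python
def leaky(rows):
    """Pre-PEP-479 style: lets StopIteration escape the generator body."""
    it = iter(rows)
    while True:
        yield next(it)          # raises StopIteration when rows run out

def fixed(rows):
    """Catches the inner StopIteration and exits the generator cleanly."""
    it = iter(rows)
    while True:
        try:
            yield next(it)
        except StopIteration:   # PEP 479: convert to a clean generator exit
            return

try:
    list(leaky([]))
except RuntimeError as e:
    print('RuntimeError:', e)   # generator raised StopIteration

print(list(fixed([1, 2])))      # [1, 2]
```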

Unknown type varchar error on Nullable columns

I'm using the native driver with ClickHouse 20.1.4.14 and the latest clickhouse-sqlalchemy from pip.

The issue seems to have appeared with the ClickHouse update (we were on 19.1.x before), but that is a guess: along with the move to 20.1 we also switched drivers from sqlalchemy-clickhouse to clickhouse-sqlalchemy due to some other issues.

So when running a very basic selection of a varchar field, we get an error:

select description from my_database.creatives;

The corresponding error is attached below with the trace log. Interestingly, I found that wrapping the field in toString() makes the query work:

select toString(description) from my_database.creatives;

It might be related to Nullable columns, as the column is defined as follows. Note that the values returned by this query are not null, so real data is coming back!

CREATE TABLE IF NOT EXISTS `my_database`.`creatives` (
  `id` Int64,
  `description` Nullable(varchar),
  ...
)
ENGINE = MergeTree
PRIMARY KEY (`id`)
ORDER BY id;

I'd be happy about any feedback. Thanks!
PS: Sorry for the bad formatting of the trace

2020-02-19 17:49:44,069:DEBUG:clickhouse_driver.connection:Query: select version()
2020-02-19 17:49:44,069:DEBUG:clickhouse_driver.connection:Block send time: 0.000063
2020-02-19 17:49:44,071:INFO:root:Running statement 1 out of 1
2020-02-19 17:49:44,083:INFO:root:Parsing with sqlparse statement select description from my_database.creatives
2020-02-19 17:49:44,085:INFO:root:Parsing with sqlparse statement select description from my_database.creatives
2020-02-19 17:49:44,086:INFO:root:Running query: 
select description from spearad_data.creatives
LIMIT 1000
2020-02-19 17:49:44,086:DEBUG:clickhouse_driver.connection:Connecting. Database: my_database. User: default
2020-02-19 17:49:44,089:DEBUG:clickhouse_driver.connection:Connected to ClickHouse server version 20.1.4, revision: 54431
2020-02-19 17:49:44,089:DEBUG:clickhouse_driver.connection:Query: select description from my_database.creatives
LIMIT 1000
2020-02-19 17:49:44,089:DEBUG:clickhouse_driver.connection:Block send time: 0.000060
2020-02-19 17:49:44,091:DEBUG:root:[stats_logger] (timing) sqllab.query.time_executing_query | 4.578125
2020-02-19 17:49:44,091:ERROR:root:Orig exception: Code: 50. Unknown type varchar
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/clickhouse_driver/columns/service.py", line 74, in get_column_by_spec
    cls = column_by_type[spec]
KeyError: 'varchar'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/clickhouse_sqlalchemy/drivers/native/connector.py", line 156, in execute
    **execute_kwargs
  File "/usr/local/lib/python3.6/site-packages/clickhouse_driver/client.py", line 214, in execute
    columnar=columnar
  File "/usr/local/lib/python3.6/site-packages/clickhouse_driver/client.py", line 337, in process_ordinary_query
    columnar=columnar)
  File "/usr/local/lib/python3.6/site-packages/clickhouse_driver/client.py", line 83, in receive_result
    return result.get_result()
  File "/usr/local/lib/python3.6/site-packages/clickhouse_driver/result.py", line 48, in get_result
    for packet in self.packet_generator:
  File "/usr/local/lib/python3.6/site-packages/clickhouse_driver/client.py", line 95, in packet_generator
    packet = self.receive_packet()
  File "/usr/local/lib/python3.6/site-packages/clickhouse_driver/client.py", line 109, in receive_packet
    packet = self.connection.receive_packet()
  File "/usr/local/lib/python3.6/site-packages/clickhouse_driver/connection.py", line 386, in receive_packet
    packet.block = self.receive_data()
  File "/usr/local/lib/python3.6/site-packages/clickhouse_driver/connection.py", line 449, in receive_data
    return self.block_in.read()
  File "/usr/local/lib/python3.6/site-packages/clickhouse_driver/streams/native.py", line 75, in read
    self.fin)
  File "/usr/local/lib/python3.6/site-packages/clickhouse_driver/columns/service.py", line 83, in read_column
    column = get_column_by_spec(column_spec, column_options=column_options)
  File "/usr/local/lib/python3.6/site-packages/clickhouse_driver/columns/service.py", line 63, in get_column_by_spec
    return create_nullable_column(spec, create_column_with_options)
  File "/usr/local/lib/python3.6/site-packages/clickhouse_driver/columns/nullablecolumn.py", line 5, in create_nullable_column
    nested = column_by_spec_getter(inner)
  File "/usr/local/lib/python3.6/site-packages/clickhouse_driver/columns/service.py", line 45, in create_column_with_options
    return get_column_by_spec(x, column_options)
  File "/usr/local/lib/python3.6/site-packages/clickhouse_driver/columns/service.py", line 78, in get_column_by_spec
    raise errors.UnknownTypeError('Unknown type {}'.format(e.args[0]))
clickhouse_driver.errors.UnknownTypeError: Code: 50. Unknown type varchar
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/superset/superset/sql_lab.py", line 204, in execute_sql_statement
    db_engine_spec.execute(cursor, sql, async_=True)
  File "/home/superset/superset/db_engine_specs/base.py", line 493, in execute
    cursor.execute(query)
  File "/usr/local/lib/python3.6/site-packages/clickhouse_sqlalchemy/drivers/native/connector.py", line 160, in execute
    raise DatabaseException(orig)
clickhouse_sqlalchemy.exceptions.DatabaseException: Orig exception: Code: 50. Unknown type varchar
10.192.10.21 - - [19/Feb/2020 17:49:44] "POST /superset/sql_json/ HTTP/1.1" 500 -
2020-02-19 17:49:44,111:INFO:werkzeug:10.192.10.21 - - [19/Feb/2020 17:49:44] "POST /superset/sql_json/ HTTP/1.1" 500 -
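Judging from the trace (this is a guess, not a confirmed diagnosis): clickhouse-driver resolves the column spec string it receives over the native protocol against native type names only, so a column declared with the SQL-standard alias varchar comes back as Nullable(varchar) and hits the KeyError. A sketch of normalising aliases to native names before they reach the driver, with a hypothetical (assumed, incomplete) alias table:

```python
# Assumed alias subset; ClickHouse accepts these in DDL but the native
# protocol reader in the trace above only knows the native names.
ALIASES = {'varchar': 'String', 'text': 'String', 'integer': 'Int32'}

def normalize(spec):
    """Rewrite Nullable(alias) / bare alias specs to native ClickHouse names."""
    if spec.startswith('Nullable(') and spec.endswith(')'):
        return 'Nullable(%s)' % normalize(spec[9:-1])
    return ALIASES.get(spec.lower(), spec)

print(normalize('Nullable(varchar)'))  # Nullable(String)
print(normalize('Int64'))              # Int64
```

In practice the simplest workaround on the schema side would be to declare the column as Nullable(String) in the CREATE TABLE rather than Nullable(varchar).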

Null date values cannot be inserted using pandas.DataFrame.to_sql

If I use pandas.DataFrame.to_sql to insert a dataframe containing an empty date value into ClickHouse, an error occurs.

code

def main():
    d2 = pd.DataFrame({
        'date': [pd.to_datetime('2018-01-01'), None, pd.to_datetime('2018-01-03')]
    })
    print(d2)
    """
            date
    0 2018-01-01
    1        NaT
    2 2018-01-03
    """
    ch_engine = sqla.create_engine(Const.CH_URI)
    d2.to_sql(name='t_date', con=ch_engine,
              if_exists='append', index=False)

error message

.............
  File "/anaconda3/envs/study/lib/python3.6/site-packages/clickhouse_driver/columns/service.py", line 80, in write_column
    column.write_data(items, buf)
  File "/anaconda3/envs/study/lib/python3.6/site-packages/clickhouse_driver/columns/base.py", line 79, in write_data
    self._write_data(items, buf)
  File "/anaconda3/envs/study/lib/python3.6/site-packages/clickhouse_driver/columns/base.py", line 82, in _write_data
    prepared = self.prepare_items(items)
  File "/anaconda3/envs/study/lib/python3.6/site-packages/clickhouse_driver/columns/base.py", line 69, in prepare_items
    x = before_write(x)
  File "/anaconda3/envs/study/lib/python3.6/site-packages/clickhouse_driver/columns/datetimecolumn.py", line 26, in before_write_item
    if value.tzinfo is None:
AttributeError: 'NoneType' object has no attribute 'tzinfo'

So I edited clickhouse_driver/columns/datetimecolumn.py as follows:

    def before_write_item(self, value):
        if self.timezone:
            # Set server's timezone for offset-naive datetime.
            # if value.tzinfo is None:
            #     value = self.timezone.localize(value)
            # value = value.astimezone(utc)
            # return int(timegm(value.timetuple()))
            """
            my code
            """
            if hasattr(value, 'tzinfo'):
                if value.tzinfo is None:
                    value = self.timezone.localize(value)
                value = value.astimezone(utc)
                return int(timegm(value.timetuple()))
            else:
                return 0
        else:
        # If datetime is offset-aware use its timezone.
            if value.tzinfo is not None:
                value = value.astimezone(utc)
                return int(timegm(value.timetuple()))

            return int(mktime(value.timetuple()))

With this change, null dates are inserted into ClickHouse as 0000-00-00 00:00:00.
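An alternative, caller-side workaround sketch that needs no driver patch (assuming an explicit sentinel date is acceptable for the data): fill NaT before calling to_sql(), so the driver never sees a non-datetime value.

```python
import pandas as pd

d2 = pd.DataFrame({
    'date': [pd.to_datetime('2018-01-01'), None, pd.to_datetime('2018-01-03')]
})

# Replace NaT with an explicit sentinel before to_sql(); the epoch here is
# an arbitrary choice standing in for ClickHouse's zero-date semantics.
d2['date'] = d2['date'].fillna(pd.Timestamp('1970-01-01'))
print(d2['date'].isna().sum())  # 0
```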

Cannot use foreignkey

Hello, I am trying to create a relationship between two tables, as I would with MySQL for example, and I get the following error. I think this is more a newbie question than a real issue.

Code

#!/usr/bin/env python3
# coding=utf-8
from sqlalchemy import create_engine, Column, MetaData, literal, ForeignKey
from clickhouse_sqlalchemy import Table, make_session, get_declarative_base, types, engines
from sqlalchemy.orm import relationship, sessionmaker

uri = 'clickhouse://default:@localhost/test'

engine = create_engine(uri)
session = make_session(engine)
metadata = MetaData(bind=engine)

Base = get_declarative_base(metadata=metadata)

class Tableas(Base):
    __tablename__ = "tableas"
    uid = Column(types.Int64, primary_key=True)
    item = Column(types.String)
    __table_args__ = (
        engines.Memory(),
    )

class Datas(Base):
    __tablename__ = "datas"
    uid = Column(types.Int64, primary_key=True)
    insert_date = Column(types.Date, primary_key=True)

    # Relation to the tables
    tablea_uid = Column(types.Int64, ForeignKey('tableas.uid'))
    tablea = relationship(Tableas)

    __table_args__ = (
        engines.Memory(),
    )


if __name__ == "__main__":
    Base.metadata.create_all(engine)

Error :

 clickhouse_sqlalchemy.exceptions.DatabaseException: Orig exception: Code: 62, e.displayText() = DB::Exception: Syntax error: failed at position 77 (line 6, col 2): ,
        FOREIGN KEY(tablea_uid) REFERENCES tableas (uid)
) ENGINE = Memory

. Expected one of: identifier, column declaration, e.what() = DB::Exception

The tableas table itself is created correctly:

clickhouse :) describe tableas;

DESCRIBE TABLE tableas

┌─name─┬─type───┬─default_type─┬─default_expression─┬─comment_expression─┬─codec_expression─┐
│ uid  │ Int64  │              │                    │                    │                  │
│ item │ String │              │                    │                    │                  │
└──────┴────────┴──────────────┴────────────────────┴────────────────────┴──────────────────┘

2 rows in set. Elapsed: 0.009 sec.

Is this a limitation, or a user error?
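ClickHouse simply has no FOREIGN KEY DDL, so the constraint must stay out of CREATE TABLE. One workaround sketch (shown with stock SQLAlchemy types so it runs standalone; with clickhouse-sqlalchemy the columns would keep types.Int64 and the engines.* table args) is to declare a plain column and give the ORM the join condition explicitly instead of a ForeignKey() schema object:

```python
from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import (configure_mappers, declarative_base, foreign,
                            relationship)

Base = declarative_base()

class Tableas(Base):
    __tablename__ = 'tableas'
    uid = Column(Integer, primary_key=True)
    item = Column(String)

class Datas(Base):
    __tablename__ = 'datas'
    uid = Column(Integer, primary_key=True)
    tablea_uid = Column(Integer)          # plain column: no ForeignKey(), no DDL constraint
    tablea = relationship(
        Tableas,
        primaryjoin=lambda: foreign(Datas.tablea_uid) == Tableas.uid,
        viewonly=True,                    # ORM-side join only; nothing enforced in ClickHouse
    )

configure_mappers()                       # verifies the relationship resolves
```

The CREATE TABLE emitted for datas then contains no REFERENCES clause, so ClickHouse accepts it, while the ORM can still navigate Datas.tablea.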

Strange behaviour with ClickHouse 20.1 using Superset

OK, so here comes a nice one: we are using API mode, not native, with ClickHouse 20.1.4.1; clickhouse-sqlalchemy was installed today with pip, so it should be the latest official version available there.

When querying the DB, it seems there is data missing and the column headers are corrupt:

The first two rows are broken and are not interpreted as real data, so results with fewer than two rows show no data at all. This does not depend on the query, table, or data; it is an overall behaviour. As a cross-check I used JDBC/DBeaver, and the results look fine there.

Following table has 7 entries with IDs from 1 to 7:

Screenshot 2020-02-18 at 17 46 54

A count query even shows no data at all:

Screenshot 2020-02-18 at 17 47 10

Verification with DBeaver and JDBC driver looks good:

Screenshot 2020-02-18 at 17 47 34

Also, sometimes the column header is broken and seems to "merge" some columns. Note that "1_1", "0_1", "0_2" and "1_2" are int values from the first two columns of the result set, concatenated with an underscore!

Screenshot 2020-02-18 at 17 39 47

Some of this seems to be related to varchars (these are also shown as NullType in the column overview, so some metadata seems broken or at least affected). Hash is a varchar:

Screenshot 2020-02-18 at 17 40 20

Columns prefixed with table name are not supported

When joining two tables, I found that columns are never prefixed with the table name. This does not work for ClickHouse, which fails with the following error:

    org.jkiss.dbeaver.model.sql.DBSQLException: SQL Error [352]:
ClickHouse exception, code: 352, host: 172.0.0.1, port: xxxx; Code: 352, 
e.displayText() = DB::Exception: Column 'user_id' is ambiguous 
(version 19.16.7.24 (official build))

This is because my SQL has the same column name in both joined tables.

The relevant code in this repository is in clickhouse_sqlalchemy/drivers/base.py:

    def visit_column(self, column, include_table=True, **kwargs):
        # Columns prefixed with table name are not supported
        return super(ClickHouseCompiler, self).visit_column(
            column, include_table=False, **kwargs
        )

The include_table argument is always overridden to False in this function.
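A standalone sketch of the suggested change (an assumption, not the project's actual fix): forward the caller's include_table instead of hard-coding False, so joins can qualify ambiguous columns. The compiler classes here are toy stand-ins for the SQLAlchemy ones.

```python
class BaseCompiler:
    """Toy stand-in for SQLAlchemy's SQLCompiler column rendering."""
    def visit_column(self, column, include_table=True):
        table, name = column
        return '%s.%s' % (table, name) if include_table else name

class PatchedClickHouseCompiler(BaseCompiler):
    def visit_column(self, column, include_table=True):
        # was: include_table=False unconditionally, which makes joined
        # columns with the same name ambiguous on the server side
        return super().visit_column(column, include_table=include_table)

c = PatchedClickHouseCompiler()
print(c.visit_column(('users', 'user_id')))                       # users.user_id
print(c.visit_column(('users', 'user_id'), include_table=False))  # user_id
```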

SQL queries with pd.read_sql drop first two records

For some reason (unknown to me), using pd.read_sql drops the second record and uses the first one as a header. Any idea why this is happening? Here's sample code run against the Rate example:

import pandas as pd
query = """
select day, value from rate limit 2;
"""
df = pd.read_sql(query, engine)
print(df)

returns

Empty DataFrame
Columns: [2017-12-01, 200]
Index: []

Thanks!

Cursor execution not including names and types

Hi,

I've started to play with your lib and I've noticed one major bug when using the HTTP driver:

When calling the execute() method of a Cursor, the SQL is not appended with ' FORMAT TabSeparatedWithNamesAndTypes' as you might expect from the drivers/http/base.py file. This causes the result not to include the column names, so the TSV post-processing
(drivers/http/connector.py#195) goes wrong, because the first two rows it consumes are actually data rows.

It seems to me that the following method of ClickHouseDialect_http

    def _execute(self, connection, sql):
        sql += ' FORMAT TabSeparatedWithNamesAndTypes'
        return connection.execute(sql)

is actually never called (Cursor or not), but I could be wrong.

One quick and easy fix could be to move this to the execute() method of Cursor.
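A sketch of that quick fix as a helper (an assumption about the shape, not the project's code): append the FORMAT clause in the cursor itself, so it applies regardless of whether the dialect's _execute() helper is on the code path.

```python
def with_names_and_types(sql):
    """Append the HTTP interface's self-describing output format to a query,
    unless the caller already specified a FORMAT clause."""
    if 'FORMAT' not in sql.upper():
        sql += ' FORMAT TabSeparatedWithNamesAndTypes'
    return sql

print(with_names_and_types('SELECT day, value FROM rate'))
# SELECT day, value FROM rate FORMAT TabSeparatedWithNamesAndTypes
```

With the header and type rows present, the TSV post-processing can consume them as metadata instead of misreading the first two data rows.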

Thanks,
Timothé

Errors when passing None to Nullable column using orm table

Hi!
I have the following table:

class Rate(Base):
    day = Column(types.Date, primary_key=True)
    value = Column(types.Int32, nullable=True)
    
    __table_args__ = (
        engines.MergeTree(date_col='day', key_expressions=('day',)),
    )

Then I try to insert data to this table:

session.execute(Rate.__table__.insert(), [{'day':datetime.now(), 'value': None}, {'day':datetime.now(), 'value': 1}])

And it causes the following error:

---------------------------------------------------------------------------
error                                     Traceback (most recent call last)
~/dev/kafka_sink/venv/lib/python3.7/site-packages/clickhouse_driver/columns/base.py in write_items(self, items, buf)
    141         try:
--> 142             buf.write(s.pack(*items))
    143 

error: required argument is not an integer

During handling of the above exception, another exception occurred:

StructPackException                       Traceback (most recent call last)
~/dev/kafka_sink/venv/lib/python3.7/site-packages/clickhouse_driver/columns/service.py in write_column(context, column_name, column_spec, items, buf, types_check)
     90         column.write_state_prefix(buf)
---> 91         column.write_data(items, buf)
     92 

~/dev/kafka_sink/venv/lib/python3.7/site-packages/clickhouse_driver/columns/base.py in write_data(self, items, buf)
     78 
---> 79         self._write_data(items, buf)
     80 

~/dev/kafka_sink/venv/lib/python3.7/site-packages/clickhouse_driver/columns/base.py in _write_data(self, items, buf)
     82         prepared = self.prepare_items(items)
---> 83         self.write_items(prepared, buf)
     84 

~/dev/kafka_sink/venv/lib/python3.7/site-packages/clickhouse_driver/columns/base.py in write_items(self, items, buf)
    144         except struct_error as e:
--> 145             raise exceptions.StructPackException(e)
    146 

StructPackException: required argument is not an integer

During handling of the above exception, another exception occurred:

TypeMismatchError                         Traceback (most recent call last)
~/dev/kafka_sink/venv/lib/python3.7/site-packages/clickhouse_sqlalchemy/drivers/native/connector.py in executemany(self, operation, seq_of_parameters, context)
    164                 operation, params=seq_of_parameters,
--> 165                 external_tables=external_tables, settings=settings
    166             )

~/dev/kafka_sink/venv/lib/python3.7/site-packages/clickhouse_driver/client.py in execute(self, query, params, with_column_types, external_tables, query_id, settings, types_check, columnar)
    204                     query, params, external_tables=external_tables,
--> 205                     query_id=query_id, types_check=types_check
    206                 )

~/dev/kafka_sink/venv/lib/python3.7/site-packages/clickhouse_driver/client.py in process_insert_query(self, query_without_data, data, external_tables, query_id, types_check)
    355         if sample_block:
--> 356             self.send_data(sample_block, data, types_check=types_check)
    357             packet = self.connection.receive_packet()

~/dev/kafka_sink/venv/lib/python3.7/site-packages/clickhouse_driver/client.py in send_data(self, sample_block, data, types_check)
    379                           types_check=types_check)
--> 380             self.connection.send_data(block)
    381 

~/dev/kafka_sink/venv/lib/python3.7/site-packages/clickhouse_driver/connection.py in send_data(self, block, table_name)
    478 
--> 479         self.block_out.write(block)
    480         self.block_out.reset()

~/dev/kafka_sink/venv/lib/python3.7/site-packages/clickhouse_driver/streams/native.py in write(self, block)
     40                 write_column(self.context, col_name, col_type, items,
---> 41                              self.fout, types_check=block.types_check)
     42 

~/dev/kafka_sink/venv/lib/python3.7/site-packages/clickhouse_driver/columns/service.py in write_column(context, column_name, column_spec, items, buf, types_check)
    106             'Column {}: {}'.format(
--> 107                 column_name, str(error)
    108             )

TypeMismatchError: Code: 53. Type mismatch in VALUES section. Repeat query with types_check=True for detailed info. Column value: required argument is not an integer

During handling of the above exception, another exception occurred:

DatabaseException                         Traceback (most recent call last)
<ipython-input-210-be61f4dbb946> in <module>
----> 1 session.execute(Rate.__table__.insert(), [{'day':datetime.now(), 'value': None}, {'day':datetime.now(), 'value': 1}])

~/dev/kafka_sink/venv/lib/python3.7/site-packages/sqlalchemy/orm/session.py in execute(self, clause, params, mapper, bind, **kw)
   1267 
   1268         return self._connection_for_bind(bind, close_with_result=True).execute(
-> 1269             clause, params or {}
   1270         )
   1271 

~/dev/kafka_sink/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py in execute(self, object_, *multiparams, **params)
    986             raise exc.ObjectNotExecutableError(object_)
    987         else:
--> 988             return meth(self, multiparams, params)
    989 
    990     def _execute_function(self, func, multiparams, params):

~/dev/kafka_sink/venv/lib/python3.7/site-packages/sqlalchemy/sql/elements.py in _execute_on_connection(self, connection, multiparams, params)
    285     def _execute_on_connection(self, connection, multiparams, params):
    286         if self.supports_execution:
--> 287             return connection._execute_clauseelement(self, multiparams, params)
    288         else:
    289             raise exc.ObjectNotExecutableError(self)

~/dev/kafka_sink/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py in _execute_clauseelement(self, elem, multiparams, params)
   1105             distilled_params,
   1106             compiled_sql,
-> 1107             distilled_params,
   1108         )
   1109         if self._has_events or self.engine._has_events:

~/dev/kafka_sink/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py in _execute_context(self, dialect, constructor, statement, parameters, *args)
   1246         except BaseException as e:
   1247             self._handle_dbapi_exception(
-> 1248                 e, statement, parameters, cursor, context
   1249             )
   1250 

~/dev/kafka_sink/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py in _handle_dbapi_exception(self, e, statement, parameters, cursor, context)
   1466                 util.raise_from_cause(sqlalchemy_exception, exc_info)
   1467             else:
-> 1468                 util.reraise(*exc_info)
   1469 
   1470         finally:

~/dev/kafka_sink/venv/lib/python3.7/site-packages/sqlalchemy/util/compat.py in reraise(tp, value, tb, cause)
    151         if value.__traceback__ is not tb:
    152             raise value.with_traceback(tb)
--> 153         raise value
    154 
    155     def u(s):

~/dev/kafka_sink/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py in _execute_context(self, dialect, constructor, statement, parameters, *args)
   1222                 if not evt_handled:
   1223                     self.dialect.do_executemany(
-> 1224                         cursor, statement, parameters, context
   1225                     )
   1226             elif not parameters and context.no_parameters:

~/dev/kafka_sink/venv/lib/python3.7/site-packages/clickhouse_sqlalchemy/drivers/base.py in do_executemany(self, cursor, statement, parameters, context)
    584 
    585     def do_executemany(self, cursor, statement, parameters, context=None):
--> 586         cursor.executemany(statement, parameters, context=context)
    587 
    588     def do_execute(self, cursor, statement, parameters, context=None):

~/dev/kafka_sink/venv/lib/python3.7/site-packages/clickhouse_sqlalchemy/drivers/native/connector.py in executemany(self, operation, seq_of_parameters, context)
    167 
    168         except DriverError as orig:
--> 169             raise DatabaseException(orig)
    170 
    171         self._process_response(response, context)

DatabaseException: Orig exception: Code: 53. Type mismatch in VALUES section. Repeat query with types_check=True for detailed info. Column value: required argument is not an integer

Could you please tell me what I am doing wrong? I just want to insert a None value into nullable columns, the way I would with the original Python clickhouse_driver.

DatabaseException: "orig" parameter not provided to the base exception class

At the moment, DatabaseException constructor now looks like this:

class DatabaseException(Exception):
    def __init__(self, orig):
        self.orig = orig
        super(DatabaseException, self).__init__()

It can cause problems when a DatabaseException is thrown from a multiprocessing worker process (e.g. while working with big data in ClickHouse in parallel), with ForkingPickler complaining about the missing orig parameter. So I suggest passing orig into the super constructor, since orig is non-optional:

class DatabaseException(Exception):
    def __init__(self, orig):
        self.orig = orig
        super(DatabaseException, self).__init__(orig)
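The pickling failure is easy to reproduce with a minimal stdlib-only sketch (the class names here are stand-ins, not the library's actual classes):

```python
import pickle

class Broken(Exception):
    def __init__(self, orig):
        self.orig = orig
        super().__init__()      # args is (), so unpickling calls Broken() with no arguments

class Fixed(Exception):
    def __init__(self, orig):
        self.orig = orig
        super().__init__(orig)  # args is (orig,), so the round-trip works

# The fixed class survives a pickle round-trip, as it must when crossing
# process boundaries in multiprocessing.
restored = pickle.loads(pickle.dumps(Fixed('server error')))
print(restored.orig)  # server error

# The broken class raises TypeError during unpickling: __init__() is
# missing the required positional argument 'orig'.
try:
    pickle.loads(pickle.dumps(Broken('server error')))
except TypeError as e:
    print(type(e).__name__)  # TypeError
```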

Query execution raises exception when using raw connection

Hi,

We're having some issues with the driver in a use case that relies on a raw connection.
How to reproduce:

from sqlalchemy import create_engine, Column, MetaData, literal
from datetime import date, timedelta
from clickhouse_sqlalchemy import make_session, get_declarative_base, types, engines
uri = 'clickhouse+native://default:xxx@localhost/test'
engine = create_engine(uri)
session = make_session(engine)
metadata = MetaData(bind=engine)
Base = get_declarative_base(metadata=metadata)

class Rate(Base):
    day = Column(types.Date, primary_key=True)
    value = Column(types.Int32)
    __table_args__ = (
        engines.Memory(),
    )
table = Rate.__table__
table.create()
today = date.today()
rates = [{'day': today - timedelta(i), 'value': 200 - i} for i in range(100)]

# Emits single INSERT statement.
session.execute(table.insert(), rates) # This is fine
session.execute("SELECT * FROM rate") # Still OK

# now using a raw connection
raw = engine.raw_connection()
cur = raw.cursor()

qs = cur.execute("SELECT * FROM rate") # Boum

Traceback (most recent call last):
  File "/home/achille/.virtualenvs/clickhouse/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 3267, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-14-392e6e940555>", line 1, in <module>
    qs = cur.execute("SELECT * FROM rate")
  File "/home/achille/clickhouse-sqlalchemy/clickhouse_sqlalchemy/drivers/native/connector.py", line 142, in execute
    external_tables, execute, settings = self._prepare(context)
  File "/home/achille/clickhouse-sqlalchemy/clickhouse_sqlalchemy/drivers/native/connector.py", line 115, in _prepare
    execution_options = context.execution_options
AttributeError: 'NoneType' object has no attribute 'execution_options'

So the problem is in fact simple and related to the last update of drivers/native/connector.py:

the execute and executemany methods of Cursor can take context=None as argument but _prepare does not handle this case and the very first line of it raises the exception:

    def _prepare(self, context):
        execution_options = context.execution_options  # Here

        external_tables = self.make_external_tables(
            context.dialect, execution_options
        )

        transport = self._connection.transport
        execute = transport.execute
        execute_iter = getattr(transport, 'execute_iter', None)

        self._stream_results = execution_options.get('stream_results', False)
        settings = execution_options.get('settings')

        if self._stream_results and execute_iter:
            execute = execute_iter
            settings = settings or {}
            settings['max_block_size'] = execution_options['max_row_buffer']

        return external_tables, execute, settings

    def execute(self, operation, parameters=None, context=None):
        self._reset_state()
        self._begin_query()

        try:
            external_tables, execute, settings = self._prepare(context)

I have a working patch for this (it's just a check on context, no big deal), but since I'm not using any external tables I cannot test their usage with a raw connection. However, I think that particular usage may be ignored, so I'll make a PR to fix this.
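A minimal sketch of such a guard, using a stripped-down stand-in rather than the library's actual cursor class:

```python
from types import SimpleNamespace

class Cursor:
    def _prepare(self, context=None):
        # Raw connections call execute()/executemany() with context=None,
        # so fall back to empty execution options instead of dereferencing
        # the missing context.
        if context is not None:
            execution_options = context.execution_options
        else:
            execution_options = {}

        stream_results = execution_options.get('stream_results', False)
        settings = execution_options.get('settings')
        return stream_results, settings

cursor = Cursor()
print(cursor._prepare(None))  # (False, None)

# A stand-in for an SQLAlchemy execution context carrying options.
ctx = SimpleNamespace(execution_options={'settings': {'max_threads': 2}})
print(cursor._prepare(ctx))   # (False, {'max_threads': 2})
```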

Thanks for your work

Query compilation error?

I've run into a problem with query compilation.

engine_clickhouse = create_engine(config['sqlalchemy']['conn_string_clickhouse'],
                                          convert_unicode=True,
                                          echo=config.getboolean('sqlalchemy', 'ECHO_DB'))
SessionClickhouse = ScopedSession(sessionmaker(bind=engine_clickhouse))

from clickhouse_sqlalchemy import types, engines
from clickhouse_sqlalchemy.ext.declarative import get_declarative_base
from sqlalchemy import Column

CHBase = get_declarative_base()

# a snippet of the model, nothing unusual
class PixelStats(CHBase):

    ...
    ts_spawn = Column(types.UInt32, primary_key=True)
    ...

This is what I get in the console:

>>> from proj.core.model.meta import SessionClickHouse
>>> from proj.core.model_clickhouse import PixelStats
>>> from proj.core.orm_extensions.compiler import Compiler
>>> SessionClickHouse.bind.dialect.name
'clickhouse'

>>> print Compiler.compile_query(SessionClickHouse.query(PixelStats.ts_spawn - PixelStats.ts_spawn % 3600), dialect=SessionClickHouse.bind.dialect)
SELECT ts_spawn - ts_spawn %(ts_spawn_1)s AS anon_1 
FROM pixel_stats
>>> SessionClickHouse.query(PixelStats.ts_spawn - PixelStats.ts_spawn % 3600).first()
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/home/anton/Projects/proj/.venv/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2700, in first
    ret = list(self[0:1])
  File "/home/anton/Projects/proj/.venv/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2498, in __getitem__
    return list(res)
  File "/home/anton/Projects/proj/core/proj/core/model/cachingquery.py", line 106, in __iter__
    return Query.__iter__(self)
  File "/home/anton/Projects/proj/.venv/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2802, in __iter__
    return self._execute_and_instances(context)
  File "/home/anton/Projects/proj/.venv/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2817, in _execute_and_instances
    result = conn.execute(querycontext.statement, self._params)
  File "/home/anton/Projects/proj/.venv/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 914, in execute
    return meth(self, multiparams, params)
  File "/home/anton/Projects/proj/.venv/lib/python2.7/site-packages/sqlalchemy/sql/elements.py", line 323, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/home/anton/Projects/proj/.venv/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1010, in _execute_clauseelement
    compiled_sql, distilled_params
  File "/home/anton/Projects/proj/.venv/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1146, in _execute_context
    context)
  File "/home/anton/Projects/proj/.venv/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1344, in _handle_dbapi_exception
    util.reraise(*exc_info)
  File "/home/anton/Projects/proj/.venv/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1139, in _execute_context
    context)
  File "/home/anton/Projects/proj/.venv/lib/python2.7/site-packages/clickhouse_sqlalchemy/drivers/base.py", line 435, in do_execute
    cursor.execute(statement, parameters, context=context)
  File "/home/anton/Projects/proj/.venv/lib/python2.7/site-packages/clickhouse_sqlalchemy/drivers/http/connector.py", line 105, in execute
    self._process_response(response_gen)
  File "/home/anton/Projects/proj/.venv/lib/python2.7/site-packages/clickhouse_sqlalchemy/drivers/http/connector.py", line 197, in _process_response
    self._columns = next(response, None)
  File "/home/anton/Projects/proj/.venv/lib/python2.7/site-packages/clickhouse_sqlalchemy/drivers/http/transport.py", line 39, in execute
    r = self._send(query, params=params, stream=True)
  File "/home/anton/Projects/proj/.venv/lib/python2.7/site-packages/clickhouse_sqlalchemy/drivers/http/transport.py", line 77, in _send
    raise DatabaseException(orig)
DatabaseException: Orig exception: Code: 62, e.displayText() = DB::Exception: Syntax error: failed at position 43 (line 1, col 43): AS anon_1 
FROM pixel_stats 
 LIMIT 1 FORMAT TabSeparatedWithNamesAndTypes. Expected one of: Comma, INTO OUTFILE, FROM, WITH, HAVING, SETTINGS, GROUP BY, ORDER BY, UNION ALL, PREWHERE, WHERE, token, LIMIT, FORMAT, e.what() = DB::Exception

The hand-rolled "compiler":


from psycopg2.extensions import adapt
from sqlalchemy.dialects import postgresql

_pg_dialect = postgresql.dialect()


class Compiler(object):
    encoding = 'utf8'

    @classmethod
    def compile_statement(cls, statement, dialect=_pg_dialect):
        comp = statement.compile(dialect=dialect)
        params = {}
        for k, v in comp.params.items():
            if isinstance(v, unicode):
                v = v.encode(cls.encoding)
            if dialect.name == _pg_dialect.name:
                params[k] = adapt(v)
        return (comp.string.encode(cls.encoding) % params).decode(cls.encoding)

    @classmethod
    def compile_query(cls, query, dialect=_pg_dialect):
        return cls.compile_statement(query.statement, dialect=dialect)

The same operation works perfectly with PostgreSQL (with a different model, the standard SQLAlchemy declarative_base() and a PostgreSQL engine):

>>> from proj.core.model import Stat
>>> from proj.core.model.meta import Session
>>> Session.bind.dialect.name
'postgresql' 
>>> print Compiler.compile_query(Session.query(Stat.ts_spawn - Stat.ts_spawn % 3600), dialect=Session.bind.dialect)
SELECT stats.ts_spawn - stats.ts_spawn % 3600 AS anon_1 
FROM stats
>>> Session.query(Stat.ts_spawn - Stat.ts_spawn % 3600).first()
(1508162400,)

I understand there is a mod function in PostgreSQL and modulo in ClickHouse, but I'd like to use the % operator, precisely because it is the same across different DBMSes.
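Until % compiles correctly, one possible workaround (my suggestion, not a library feature) is to spell the operation out via func.modulo, which renders as a plain function call instead of the ambiguous % token:

```python
from sqlalchemy import column, func

ts_spawn = column('ts_spawn')

# func.modulo(...) compiles to a regular function call, so the dialect
# never has to render the % operator that collides with pyformat params.
expr = ts_spawn - func.modulo(ts_spawn, 3600)
print(expr)
```

The downside is that the expression is no longer portable to PostgreSQL, which spells the function mod() instead.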

Error in get_columns with ClickHouse >=18.15

Getting error when calling function clickhouse_sqlalchemy.drivers.base.ClickHouseDialect#get_columns:

packages/clickhouse_sqlalchemy/drivers/base.py", line 424, in get_columns
    for name, type_, default_type, default_expression in rows:
ValueError: too many values to unpack (expected 4)

This happens in newer versions of ClickHouse because of the newly added comment_expression column - https://github.com/yandex/ClickHouse/blame/7beb25eaaf6a31399c8ad9355fd93789ea068b6f/dbms/src/Interpreters/InterpreterDescribeQuery.cpp#L48
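A forward-compatible sketch of the unpacking (the row shape below is an assumed example of newer DESCRIBE output): slicing off the first four fields keeps the loop working no matter how many trailing columns the server adds:

```python
rows = [
    # Newer ClickHouse DESCRIBE TABLE output: five (or more) fields per row.
    ('id', 'Int64', '', '', 'some comment'),
    ('name', 'String', 'DEFAULT', "''", ''),
]

for row in rows:
    # Take only the first four fields; ignore comment_expression and any
    # columns added in future server versions.
    name, type_, default_type, default_expression = row[:4]
    print(name, type_)
```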

Table engine does not reflected

I'm trying to reflect the existing table using SQL Expression Language:

from sqlalchemy import create_engine, MetaData, Table

engine = create_engine('clickhouse://default:@localhost:8123/MyDatabase')
metadata = MetaData()
MyTable = Table('MyTable', metadata, autoload=True, autoload_with=engine)

But the newly created Table() object does not contain any info about the table engine (the engine of the actual table is ReplacingMergeTree). I tried to get the engine info with the Reflection Inspector:

from sqlalchemy import create_engine, MetaData, Table
from sqlalchemy.engine import reflection

engine = create_engine('clickhouse://default:@localhost:8123/MyDatabase')
insp = reflection.Inspector.from_engine(engine)

print(insp.get_table_options('MyTable'))

It's possible to get e.g. the MySQL table engine this way (get_table_options() returns {'mysql_engine': 'InnoDB'}), but I can't do the same for ClickHouse. Is there any way?
I'm new to SQLAlchemy and clickhouse-sqlalchemy, so please forgive me if I misunderstood something.
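As a workaround (outside SQLAlchemy reflection), ClickHouse itself exposes the engine of every table in system.tables, which can be queried directly:

```sql
SELECT engine
FROM system.tables
WHERE database = 'MyDatabase' AND name = 'MyTable'
```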

Conda package enum34 issue

Hey @xzkostyan !

It misses enum34, even on Python 3.6.
AFAIK it should not, according to the driver's setup.py.

Ideas?

Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
  File "/Users/krisz/.pyenv/versions/ml3/lib/python3.6/site-packages/sqlalchemy/engine/__init__.py", line 387, in create_engine
    return strategy.create(*args, **kwargs)
  File "/Users/krisz/.pyenv/versions/ml3/lib/python3.6/site-packages/sqlalchemy/engine/strategies.py", line 56, in create
    entrypoint = u._get_entrypoint()
  File "/Users/krisz/.pyenv/versions/ml3/lib/python3.6/site-packages/sqlalchemy/engine/url.py", line 139, in _get_entrypoint
    cls = registry.load(name)
  File "/Users/krisz/.pyenv/versions/ml3/lib/python3.6/site-packages/sqlalchemy/util/langhelpers.py", line 214, in load
    return impl.load()
  File "/Users/krisz/.pyenv/versions/ml3/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2315, in load
    self.require(*args, **kwargs)
  File "/Users/krisz/.pyenv/versions/ml3/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2338, in require
    items = working_set.resolve(reqs, env, installer, extras=self.extras)
  File "/Users/krisz/.pyenv/versions/ml3/lib/python3.6/site-packages/pkg_resources/__init__.py", line 858, in resolve
    raise DistributionNotFound(req, requirers)
pkg_resources.DistributionNotFound: The 'enum34' distribution was not found and is required by clickhouse-driver

Creating the ClickHouse engine fails

I want to migrate the data from SqlServer to ClickHouse.

but...

I tried to connect to ClickHouse with sqlalchemy.create_engine, but it looks like engine creation failed.

Can you tell me what's wrong with it?

Thank you for your reply.

This is my code:

import sqlalchemy as sqla
import pymssql
import pandas as pd

SQL_DB_CONN_STR = 'mssql+pymssql://sa:[email protected]:1433/UFDATA_008_2015'
CH_DB_CONN_STR = 'clickhouse://default:@192.168.1.188/zy'


table_name = 'AA_PinYin'
sql = 'select * from dbo.{}'.format(table_name)
sql_engine = sqla.create_engine(SQL_DB_CONN_STR)
ch_engine = sqla.create_engine(CH_DB_CONN_STR)
sql_df = pd.read_sql(sql, sql_engine, chunksize=5000)
num = 1
for i in sql_df:
    if num == 1:
        i.to_sql(name=table_name, con=ch_engine, if_exists='replace')
    else:
        i.to_sql(name=table_name, con=ch_engine, if_exists='append')
    num= num + 1

---------------------------------------------------------------------------
CompileError                              Traceback (most recent call last)
<ipython-input-7-efbf2225776b> in <module>()
     18     print(j)
     19     if num == 1:
---> 20         i.to_sql(name=table_name, con=ch_engine, if_exists='replace')
     21     else:
     22         i.to_sql(name=table_name, con=ch_engine, if_exists='append')
.......
c:\users\jeff\appdata\local\programs\python\python35\lib\site-packages\sqlalchemy\sql\compiler.py in visit_create_table(self, create)
   2481             text += separator + "\t" + const
   2482 
-> 2483         text += "\n)%s\n\n" % self.post_create_table(table)
   2484         return text
   2485 

c:\users\jeff\appdata\local\programs\python\python35\lib\site-packages\clickhouse_sqlalchemy\drivers\base.py in post_create_table(self, table)
    287 
    288         if not engine:
--> 289             raise exc.CompileError("No engine for table '%s'" % table.name)
    290 
    291         return ' ENGINE = ' + self.process(engine)

CompileError: No engine for table 'AA_PinYin'
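The underlying cause is that if_exists='replace' makes pandas emit a CREATE TABLE statement, and the ClickHouse dialect requires an explicit ENGINE clause that pandas never supplies. One workaround is to create the table manually first (the column list below is a placeholder, substitute the real schema from SQL Server) and then call to_sql with if_exists='append' only:

```sql
CREATE TABLE zy.AA_PinYin (
    -- placeholder columns: replace with the actual SQL Server schema
    `index` Int64,
    some_column String
) ENGINE = Log
```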

INSERT FROM SELECT statement

Hi, I am trying to insert values from a select statement into a table. Let's say I have a table Table1 with columns A and B. I want to select all records from Table1, add a new column C with some value, and insert the result into a second table Table2.

But I don't know how to add new column C.

My raw SQL query would look like

INSERT INTO Table2
SELECT A, B, 'some_value' AS C
FROM Table1

My python code:

from sqlalchemy import cast

# Labels must be present.
select_query =   session.query(Table1).filter(Table.A.in_(ids)).subquery() # How to add new column C?

session.execute(
    Table2.insert()
    .from_select(['A', 'B', 'C'], select_query)
)

Thanks in advance.
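One way to add the constant column, assuming the SQLAlchemy 1.4+ select() signature and using lightweight table()/column() stand-ins for the mapped classes, is to label a literal:

```python
from sqlalchemy import column, literal, select, table

# Lightweight stand-ins for the mapped Table1/Table2 classes.
table1 = table('Table1', column('A'), column('B'))
table2 = table('Table2', column('A'), column('B'), column('C'))

# literal(...) becomes a bound constant; .label('C') names the extra column
# so that from_select can line it up with Table2's column list.
sel = select(table1.c.A, table1.c.B, literal('some_value').label('C'))
ins = table2.insert().from_select(['A', 'B', 'C'], sel)
print(ins)
```

With a session-based query the same idea applies: add literal('some_value').label('C') as an extra entity before calling subquery().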

INSERT is broken

I found this code

pos = rv.rfind('VALUES (')
# Remove (%s)-templates from VALUES clause if exists.
# ClickHouse server since version 19.3.3 parse query after VALUES and
# allows inplace parameters.
# Example: INSERT INTO test (x) VALUES (1), (2).
if pos != -1:
    rv = rv[:pos + 6]

in base.py in function visit_insert

It truncates rv from INSERT INTO data (param) VALUES (%(param)s) to INSERT INTO data (param) VALUES

And I didn't find a test that checks this.
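The effect of that block is easy to reproduce with plain string manipulation:

```python
# Reproduces the truncation performed by visit_insert in base.py.
rv = 'INSERT INTO data (param) VALUES (%(param)s)'

pos = rv.rfind('VALUES (')
if pos != -1:
    rv = rv[:pos + 6]  # keeps the six characters of "VALUES", drops the template

print(rv)  # INSERT INTO data (param) VALUES
```

For servers older than 19.3.3, which do not parse inplace parameters after VALUES, the truncated statement is no longer valid on its own.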

SQL execution - First two rows always missing

I am trying to run this with ClickHouse, starting with just a select 1 and a select * from mytable limit 1.

When I do this or even more complex queries in postman using the HTTP api directly, it works:

POST /?database=mydatabase&query_id=a HTTP/1.1
Host: localhost:8123
Authorization: Basic .....
Cache-Control: no-cache
Postman-Token: 2fa39919-4195-42e7-a182-b61ba28596c5

select * from mytable limit 1

But when I do this it just gives me an empty set:

In [5]: import sqlalchemy
In [6]: eng = sqlalchemy.create_engine("clickhouse://default:@localhost:8123/mydatabase")
In [7]: res = eng.execute("select 1")
In [8]: list(res)
Out[8]: []

On further checking, when I do a:

  • LIMIT 2 I get back 0 records
  • LIMIT 3 I get back 1 record
  • LIMIT 4 I get back 2 record
  • LIMIT 5 I get back 3 record
    ...

When I check using APIs, I can see that the first 2 rows for any query I write seem to be missing. I even did a show tables and validated this.

When I do res.keys() I can see that the first row is being treated as the header / column names.

ClickHouse server version 1.1.54380
python 2.7.14
clickhouse-sqlalchemy (0.0.5)
sqlalchemy (1.2.6)

Getting rows back in reverse order

Hello.
With the HTTP version of the protocol I get rows back in reverse order. The problem appears in version 0.1.0 and above.

from sqlalchemy import create_engine, text
from sqlalchemy.orm import sessionmaker


engine = create_engine(uri)
Session = sessionmaker()
Session.configure(bind=engine)
session = Session()
print(list(session.execute(text('select x from test_table order by x'))))

Result with version 0.0.10:
[(1,), (2,), (3,), (4,)]
Result with version 0.1.2:
[(4,), (3,), (2,), (1,)]

Table definition

create table test_table
(
  x Int32
)
  engine = Memory;

The cause of this behavior may be a change in the clickhouse_sqlalchemy.drivers.http.connector.Cursor._process_response method. Previously the ClickHouse response was reversed; in newer versions that step was removed.

python3.6
SQLAlchemy==1.3.13

Thanks.

can't use raw sql with http driver

If I call any query using http-session and execute method, got that traceback:

  File "/home/anton/Projects/clickhouse-sqlalchemy/tests/sql/test_schema.py", line 51, in test_reflect
    session.execute(text('select 1'))
  File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/orm/session.py", line 1187, in execute
    bind, close_with_result=True).execute(clause, params or {})
  File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/orm/session.py", line 1051, in _connection_for_bind
    engine, execution_options)
  File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/orm/session.py", line 409, in _connection_for_bind
    conn = bind.contextual_connect()
  File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/engine/base.py", line 2123, in contextual_connect
    self._wrap_pool_connect(self.pool.connect, None),
  File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/engine/base.py", line 2158, in _wrap_pool_connect
    return fn()
  File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/pool/base.py", line 355, in connect
    return _ConnectionFairy._checkout(self)
  File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/pool/base.py", line 743, in _checkout
    fairy = _ConnectionRecord.checkout(pool)
  File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/pool/base.py", line 484, in checkout
    rec = pool._do_get()
  File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/pool/impl.py", line 128, in _do_get
    self._dec_overflow()
  File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/util/langhelpers.py", line 66, in __exit__
    compat.reraise(exc_type, exc_value, exc_tb)
  File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/util/compat.py", line 249, in reraise
    raise value
  File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/pool/impl.py", line 125, in _do_get
    return self._create_connection()
  File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/pool/base.py", line 302, in _create_connection
    return _ConnectionRecord(self)
  File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/pool/base.py", line 429, in __init__
    self.__connect(first_connect_check=True)
  File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/pool/base.py", line 636, in __connect
    exec_once(self.connection, self)
  File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/event/attr.py", line 274, in exec_once
    self(*args, **kw)
  File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/event/attr.py", line 284, in __call__
    fn(*args, **kw)
  File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/util/langhelpers.py", line 1336, in go
    return once_fn(*arg, **kw)
  File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/engine/strategies.py", line 184, in first_connect
    dialect.initialize(c)
  File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/engine/default.py", line 264, in initialize
    self._get_server_version_info(connection)
  File "/home/anton/Projects/clickhouse-sqlalchemy/clickhouse_sqlalchemy/drivers/base.py", line 698, in _get_server_version_info
    version = connection.scalar('select version()')
  File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/engine/base.py", line 880, in scalar
    return self.execute(object, *multiparams, **params).scalar()
  File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/engine/base.py", line 942, in execute
    return self._execute_text(object, multiparams, params)
  File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1104, in _execute_text
    statement, parameters
  File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1200, in _execute_context
    context)
  File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1416, in _handle_dbapi_exception
    util.reraise(*exc_info)
  File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/util/compat.py", line 249, in reraise
    raise value
  File "/home/anton/Projects/venvs/clickhouse-sqlalchemy/lib/python3.7/site-packages/SQLAlchemy-1.3.0b1-py3.7-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1193, in _execute_context
    context)
  File "/home/anton/Projects/clickhouse-sqlalchemy/clickhouse_sqlalchemy/drivers/base.py", line 689, in do_execute
    cursor.execute(statement, parameters, context=context)
  File "/home/anton/Projects/clickhouse-sqlalchemy/clickhouse_sqlalchemy/drivers/http/connector.py", line 105, in execute
    self._process_response(response_gen)
  File "/home/anton/Projects/clickhouse-sqlalchemy/clickhouse_sqlalchemy/drivers/http/connector.py", line 193, in _process_response
    self._columns = next(response, None)
RuntimeError: generator raised StopIteration

It happens with both session.execute(text('select 1')) and session.execute('select 1') calls.
The reason is here: clickhouse_sqlalchemy.drivers.http.transport.RequestsTransport#execute.
Looks like the statement format TabSeparatedWithNamesAndTypes must be specified for all queries.

I see two options:

  1. force users to use that statement in their queries
  2. append it manually... somehow.
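A sketch of option 2 (ensure_format is a hypothetical name, not the library's API); the naive substring check would need refinement for queries that merely mention the word FORMAT:

```python
def ensure_format(statement, fmt='TabSeparatedWithNamesAndTypes'):
    # Hypothetical helper: append a FORMAT clause so the HTTP transport
    # can always parse column names and types from the response.
    if 'FORMAT' not in statement.upper():
        return '{} FORMAT {}'.format(statement.rstrip().rstrip(';'), fmt)
    return statement

print(ensure_format('select 1'))
# select 1 FORMAT TabSeparatedWithNamesAndTypes
print(ensure_format('select 1 FORMAT JSON'))
# select 1 FORMAT JSON
```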

AttributeError: 'Select' object has no attribute '_array_join'

When I updated clickhouse-sqlalchemy from the repo I got an exception on the select query OrmObject.query.filter_by(id=1).first():

  File "/home/vasily/PycharmProjects/np_diff/venv/lib/python3.6/site-packages/clickhouse_sqlalchemy/drivers/base.py", line 232, in _compose_select_body
    if select._array_join is not None:
AttributeError: 'Select' object has no attribute '_array_join'

If I just comment out these two lines in /driver/base.py

 if select._array_join is not None:
           text += select._array_join._compiler_dispatch(self, **kwargs)

or add the check
if "_array_join" in select.__dict__.keys() and select._array_join is not None:
the exception disappears. But I'm sure that's the wrong way to fix it. Do I have to update SQLAlchemy (I use 1.3.4) or something else?
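A slightly more defensive variant of that check is a getattr guard, sketched here with dummy stand-ins rather than real Select objects:

```python
# A getattr-based guard works whether or not the Select class of the
# installed SQLAlchemy version defines _array_join at all.
class OldSelect:          # stand-in for a Select lacking the attribute
    pass

class NewSelect:          # stand-in for a Select carrying it
    _array_join = None

def wants_array_join(select):
    return getattr(select, '_array_join', None) is not None

print(wants_array_join(OldSelect()))  # False
print(wants_array_join(NewSelect()))  # False
```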

Can't reflect database with Array field in a table

sqlalchemy can't reflect a database if it has a table with an Array field.

CREATE TABLE test_db.test(
    id Int64,
    array_field Array(Float64)
) ENGINE = Memory()
import sqlalchemy as sa
from clickhouse_sqlalchemy import make_session

engine = sa.create_engine('clickhouse+native://user:password@host:9000/test_db')
ch_session = make_session(engine)
metadata = sa.MetaData(bind=engine, quote_schema='')
metadata.reflect()

Raises:

TypeError                                 Traceback (most recent call last)
<ipython-input-81-1a8e5f1aa1a4> in <module>()
      2 ch_session = make_session(engine)
      3 metadata = sa.MetaData(bind=engine, quote_schema='')
----> 4 metadata.reflect()

~/.pyenv/versions/jupyter3.6.4/lib/python3.6/site-packages/sqlalchemy/sql/schema.py in reflect(self, bind, schema, views, only, extend_existing, autoload_replace, **dialect_kwargs)
   3907 
   3908             for name in load:
-> 3909                 Table(name, self, **reflect_opts)
   3910 
   3911     def append_ddl_listener(self, event_name, listener):

~/.pyenv/versions/jupyter3.6.4/lib/python3.6/site-packages/sqlalchemy/sql/schema.py in __new__(cls, *args, **kw)
    437             except:
    438                 with util.safe_reraise():
--> 439                     metadata._remove_table(name, schema)
    440 
    441     @property

~/.pyenv/versions/jupyter3.6.4/lib/python3.6/site-packages/sqlalchemy/util/langhelpers.py in __exit__(self, type_, value, traceback)
     64             self._exc_info = None   # remove potential circular references
     65             if not self.warn_only:
---> 66                 compat.reraise(exc_type, exc_value, exc_tb)
     67         else:
     68             if not compat.py3k and self._exc_info and self._exc_info[1]:

~/.pyenv/versions/jupyter3.6.4/lib/python3.6/site-packages/sqlalchemy/util/compat.py in reraise(tp, value, tb, cause)
    185         if value.__traceback__ is not tb:
    186             raise value.with_traceback(tb)
--> 187         raise value
    188 
    189 else:

~/.pyenv/versions/jupyter3.6.4/lib/python3.6/site-packages/sqlalchemy/sql/schema.py in __new__(cls, *args, **kw)
    432             metadata._add_table(name, schema, table)
    433             try:
--> 434                 table._init(name, metadata, *args, **kw)
    435                 table.dispatch.after_parent_attach(table, metadata)
    436                 return table

~/.pyenv/versions/jupyter3.6.4/lib/python3.6/site-packages/sqlalchemy/sql/schema.py in _init(self, name, metadata, *args, **kwargs)
    512             self._autoload(
    513                 metadata, autoload_with,
--> 514                 include_columns, _extend_on=_extend_on)
    515 
    516         # initialize all the column, etc. objects.  done after reflection to

~/.pyenv/versions/jupyter3.6.4/lib/python3.6/site-packages/sqlalchemy/sql/schema.py in _autoload(self, metadata, autoload_with, include_columns, exclude_columns, _extend_on)
    525                 autoload_with.dialect.reflecttable,
    526                 self, include_columns, exclude_columns,
--> 527                 _extend_on=_extend_on
    528             )
    529         else:

~/.pyenv/versions/jupyter3.6.4/lib/python3.6/site-packages/sqlalchemy/engine/base.py in run_callable(self, callable_, *args, **kwargs)
   1532 
   1533         """
-> 1534         return callable_(self, *args, **kwargs)
   1535 
   1536     def _run_visitor(self, visitorcallable, element, **kwargs):

~/.pyenv/versions/jupyter3.6.4/lib/python3.6/site-packages/sqlalchemy/engine/default.py in reflecttable(self, connection, table, include_columns, exclude_columns, **opts)
    370         insp = reflection.Inspector.from_engine(connection)
    371         return insp.reflecttable(
--> 372             table, include_columns, exclude_columns, **opts)
    373 
    374     def get_pk_constraint(self, conn, table_name, schema=None, **kw):

~/.pyenv/versions/jupyter3.6.4/lib/python3.6/site-packages/sqlalchemy/engine/reflection.py in reflecttable(self, table, include_columns, exclude_columns, _extend_on)
    596 
    597         for col_d in self.get_columns(
--> 598                 table_name, schema, **table.dialect_kwargs):
    599             found_table = True
    600 

~/.pyenv/versions/jupyter3.6.4/lib/python3.6/site-packages/sqlalchemy/engine/reflection.py in get_columns(self, table_name, schema, **kw)
    372             coltype = col_def['type']
    373             if not isinstance(coltype, TypeEngine):
--> 374                 col_def['type'] = coltype()
    375         return col_defs
    376 

TypeError: __init__() missing 1 required positional argument: 'item_type'
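The traceback boils down to reflection instantiating the stored type class with no arguments; a minimal stdlib sketch of the failure mode (the Array class here is an illustrative stand-in for the library's type):

```python
class Array:
    # Mirrors the shape of clickhouse_sqlalchemy's Array type:
    # the item type is a mandatory constructor argument.
    def __init__(self, item_type):
        self.item_type = item_type

coltype = Array  # reflection keeps the class, not an instance
try:
    coltype()    # what sqlalchemy.engine.reflection does for non-instances
except TypeError as e:
    print(type(e).__name__)  # TypeError
```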

Merging both connectors?

Hi 👋, I wrote the https://github.com/cloudflare/sqlalchemy-clickhouse dialect and was wondering if it made sense to merge both of these projects? I see you use pretty much the same dialect, but with a way nicer connector. The HTTPS and support for native driver is neat. I'd very much appreciate help maintaining the project instead of having to cherry-pick patches between two very similar projects.
What do you think?

Join support for sqlalchemy core api

Hi, it seems the current join support only applies to the ORM API. I'm using the SQLAlchemy Core API for a query-generation task, and I need joins in my queries. But I always get this error in drivers/base.py:

join has no attribute global_

Could you add support for core api queries? Thx!
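For reference, a minimal core-API query of the kind described, built with plain SQLAlchemy constructs and no ClickHouse dialect involved (the table and column names here are made up):

```python
from sqlalchemy import column, table, select

# Two ad-hoc tables; names are illustrative only.
events = table("events", column("user_id"), column("ts"))
users = table("users", column("id"), column("name"))

# Core-API join: no ORM session or mapped classes involved.
stmt = (
    select(events.c.ts, users.c.name)
    .select_from(events.join(users, events.c.user_id == users.c.id))
)

print(stmt)  # the default dialect renders a plain JOIN ... ON ... clause
```

Compiling this with the ClickHouse dialect instead of the default one is what reportedly fails.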

Wrong Enum usage

At clickhouse_sqlalchemy/drivers/base.py:583, it looks like this code is never called.
Code: type_enum = enum.Enum('%s_enum' % name, options)
Enum accepts only one arg.
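For reference, the stdlib functional API does accept a member specification as a second positional argument, so the two-argument call shape itself is valid Python; whether the surrounding dialect code ever reaches it is a separate question. A stdlib-only demonstration (`status_enum` and the options list are made up):

```python
import enum

# Functional API: a class name plus an iterable of (member, value) pairs.
options = [("a", 1), ("b", 2)]
type_enum = enum.Enum("status_enum", options)

print(type_enum.a.value)  # 1
```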

Support `CODEC` column expressions

Currently, it doesn't appear to be possible to specify column compression codecs when using declarative tables. For many columns, monotonically increasing DateTime columns in particular, such expressions can yield dramatic compression and performance improvements.

Would you in general like to support these? I'd be willing to contribute the required changes, but my experience with SQLAlchemy is rather limited, so I'd need some guidance on how it would have to be designed. It appears to be possible to handle extra arguments passed to the Column constructor, and adding an argument such as codec=['DoubleDelta', 'ZSTD'] looks like the way to go to me personally, but I might be horribly wrong.

Migration Support

Hello,

I'm looking to support migrations in my app. Is this on the roadmap, or do you have any suggestions for creating migrations based on changes to my SQLAlchemy Base classes?

Thanks!

pip install in editable mode not working

Hi,

The package is not properly installed when using pip install -e . (but works totally fine when not using -e)
This is because of the following part in the setup.py:

packages=[
        p.replace('src', 'clickhouse_sqlalchemy')
        for p in find_packages(exclude=['tests'])
        if p.startswith('src')
    ],

This is only a limitation when working on clickhouse-sqlalchemy code, so it's more of a dev experience issue.

May I suggest renaming src/ to clickhouse_sqlalchemy/ (and changing the setup.py to match) to make it work; this should have no side effects (I successfully tried the refactor on my local repo).

Let me know if you're fine with this and if you want me to create the PR.

Thanks,
Timothé
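The mismatch can be seen without pip at all: the comprehension renames the packages, but the src/ directory on disk keeps its old name, which is what an editable install trips over. A minimal reproduction with a hard-coded stand-in for what find_packages() might return:

```python
# Simulate what find_packages(exclude=['tests']) might return for this layout.
found = ["src", "src.drivers", "src.ext", "tests.unit"]

packages = [
    p.replace("src", "clickhouse_sqlalchemy")
    for p in found
    if p.startswith("src")
]

# The declared package names no longer match any directory on disk,
# so `pip install -e .` cannot link them back to src/.
print(packages)  # ['clickhouse_sqlalchemy', 'clickhouse_sqlalchemy.drivers', 'clickhouse_sqlalchemy.ext']
```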

Is it possible to use it for our special engine "Graph"?

Our project uses a database extended from ClickHouse. In our database there is an engine named Graph that supports graph data, and we can use a statement like this to create a graph table:
create table graph1 (w Float64) engine=Graph;
Is it possible to use clickhouse-sqlalchemy in this case?

"http" driver execute sql error

clickhouse-sqlalchemy==0.0.5

sqlalchemy==1.1.9

pandas==0.22.0

When I use the "http" driver to execute SQL via SQLAlchemy, the returned data is incomplete, like this:

(screenshot omitted)

It works fine when I use the "native" driver. How can I fix this error?

Besides, when I use the pandas.read_sql() method to get data, it doesn't get the column names; the first row of data ends up as the columns, like this:

(screenshot omitted)

Initial Update

The bot created this issue to inform you that pyup.io has been set up on this repo.
Once you have closed it, the bot will open pull requests for updates as soon as they are available.

custom partition key

In 1.1.54310 the ability to specify an arbitrary partitioning key was added; it would be great if DDL generation for it were implemented.

Possible ENGINE variants:

... ) ENGINE = MergeTree 
PARTITION BY toDate(ts_spawn) 
ORDER BY (user_id, country_id);

... ) ENGINE = MergeTree 
PARTITION BY ts_spawn
ORDER BY user_id;

... ) ENGINE = MergeTree 
PARTITION BY ts_spawn - ts_spawn % 86400
ORDER BY (user_id, country_id);

Raised StopIteration on insert

When I execute code like this

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

engine = create_engine(db_opt)
Session = sessionmaker(bind=engine)
session = Session()
item = Item(data='test')  # Item is a mapped model class
session.add(item)
session.flush()

the following error appears:

RuntimeError: generator raised StopIteration

I'm using Python 3.7. On 3.6 it works fine.
Looks like it's because of how Python 3.7 handles StopIteration:
https://www.python.org/dev/peps/pep-0479/

Full error log: error.log
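The PEP 479 behaviour can be reproduced without the driver at all: from Python 3.7 on, a StopIteration that escapes inside a generator body is converted into a RuntimeError, which matches the reported traceback:

```python
def broken_generator():
    # next() on an empty iterator raises StopIteration; under PEP 479
    # (the default from Python 3.7) it escapes the generator as RuntimeError.
    yield next(iter([]))

try:
    list(broken_generator())
    err = None
except RuntimeError as e:
    err = e

print(err)  # generator raised StopIteration
```

Code that was correct on 3.6 by letting StopIteration bubble up through a generator needs an explicit `return` instead.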

flask integration

Can you provide an example of how integrate with flask-sqlalchemy?

Best regards.

Support of ClickHouse server version 18.12.17 revision 54407

Hi,

Sorry for such a basic question: I am building a setup using the native interface (non-HTTP) with a ClickHouse server running 18.12.17 revision 54407.

I get this message in return while testing connectivity (using Superset).
The source db config looks like:

clickhouse+native://[email protected]
ERROR: {"error": "Connection failed!\n\nThe error message returned was:\n'NoneType' object has no attribute 'encode'"}

On the server side, I do have (hiding my IP address):

2018.09.21 09:26:16.777616 [ 732 ] <Trace> TCPHandlerFactory: TCP Request. Address: [::ffff:XX.XXX.XXX.XXX]:55812
2018.09.21 09:26:16.777857 [ 732 ] <Warning> TCPHandler: Client has gone away.
2018.09.21 09:26:16.777933 [ 732 ] <Information> TCPHandler: Done processing connection.

When I do a basic telnet on port 9000 to check connectivity I do have this:
telnet side:

Escape character is '^]'.

DB::NetException: Unexpected packet from client
0. /usr/bin/clickhouse-server(StackTrace::StackTrace()+0x16) [0x9491e46]
1. /usr/bin/clickhouse-server(DB::Exception::Exception(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int)+0x22) [0x3019952]
2. /usr/bin/clickhouse-server(DB::TCPHandler::receiveHello()+0xce) [0x3023fae]
3. /usr/bin/clickhouse-server(DB::TCPHandler::runImpl()+0x1c4) [0x3026e64]
4. /usr/bin/clickhouse-server(DB::TCPHandler::run()+0x2b) [0x302836b]
5. /usr/bin/clickhouse-server(Poco::Net::TCPServerConnection::start()+0xf) [0x96107df]
6. /usr/bin/clickhouse-server(Poco::Net::TCPServerDispatcher::run()+0x16a) [0x9610bba]
7. /usr/bin/clickhouse-server(Poco::PooledThread::run()+0x77) [0x972d6b7]
8. /usr/bin/clickhouse-server(Poco::ThreadImpl::runnableEntry(void*)+0x38) [0x9729228]
9. /usr/bin/clickhouse-server() [0x9e659ef]
10. /lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba) [0x7f2ea6e166ba]
11. /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f2ea643741d]
Connection closed by foreign host.

server side:

2018.09.21 09:35:27.908995 [ 29 ] <Error> ServerErrorHandler: Code: 101, e.displayText() = DB::NetException: Unexpected packet from client, e.what() = DB::NetException, Stack trace:

0. /usr/bin/clickhouse-server(StackTrace::StackTrace()+0x16) [0x9491e46]
1. /usr/bin/clickhouse-server(DB::Exception::Exception(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int)+0x22) [0x3019952]
2. /usr/bin/clickhouse-server(DB::TCPHandler::receiveHello()+0xce) [0x3023fae]
3. /usr/bin/clickhouse-server(DB::TCPHandler::runImpl()+0x1c4) [0x3026e64]
4. /usr/bin/clickhouse-server(DB::TCPHandler::run()+0x2b) [0x302836b]
5. /usr/bin/clickhouse-server(Poco::Net::TCPServerConnection::start()+0xf) [0x96107df]
6. /usr/bin/clickhouse-server(Poco::Net::TCPServerDispatcher::run()+0x16a) [0x9610bba]
7. /usr/bin/clickhouse-server(Poco::PooledThread::run()+0x77) [0x972d6b7]
8. /usr/bin/clickhouse-server(Poco::ThreadImpl::runnableEntry(void*)+0x38) [0x9729228]
9. /usr/bin/clickhouse-server() [0x9e659ef]
10. /lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba) [0x7f2ea6e166ba]
11. /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f2ea643741d]

2018.09.21 09:35:29.654793 [ 732 ] <Trace> TCPHandlerFactory: TCP Request. Address: [::ffff:XX.XXX.XXX.XXX]:55822

Which tends to demonstrate that connectivity is fine...

Do you think it could be a version support issue?

Thanks, Matt.

ARRAY JOIN

Is there a way to add ARRAY JOIN clause to my query?

get_schema_names() doesn't work

Trying to inspect schema and table names from an engine fails as the column name seems to be missing from the ResultProxy instance. Accessing it via the integer position works, but it seems accessing via name doesn't work as described by the API: https://docs.sqlalchemy.org/en/13/core/connections.html#sqlalchemy.engine.ResultProxy

Example code:

from sqlalchemy import create_engine, inspect

uri = 'clickhouse://default:@localhost'
engine = create_engine(uri)

print('Running manually')
res = engine.execute('SHOW DATABASES')
for row in res:
    print(row[0])  # has to be referenced by integer position to work

print('\nRunning via inspector')
insp = inspect(engine)
insp.get_schema_names()

Result:

Running manually
test

Running via inspector
Traceback (most recent call last):
  File "click.py", line 16, in <module>
    insp.get_schema_names()
  File "/Users/ville/.local/share/virtualenvs/incubator-superset-Q6iPzijF/lib/python3.7/site-packages/sqlalchemy/engine/reflection.py", line 160, in get_schema_names
    self.bind, info_cache=self.info_cache
  File "<string>", line 2, in get_schema_names
  File "/Users/ville/.local/share/virtualenvs/incubator-superset-Q6iPzijF/lib/python3.7/site-packages/sqlalchemy/engine/reflection.py", line 56, in cache
    ret = fn(self, con, *args, **kw)
  File "/Users/ville/.local/share/virtualenvs/incubator-superset-Q6iPzijF/lib/python3.7/site-packages/clickhouse_sqlalchemy/drivers/base.py", line 560, in get_schema_names
    return [row.name for row in connection.execute('SHOW DATABASES')]
  File "/Users/ville/.local/share/virtualenvs/incubator-superset-Q6iPzijF/lib/python3.7/site-packages/clickhouse_sqlalchemy/drivers/base.py", line 560, in <listcomp>
    return [row.name for row in connection.execute('SHOW DATABASES')]
AttributeError: Could not locate column in row for column 'name'

Why does it demand a primary key, and why does it say a table already exists when it doesn't?

py3.7
clickhouse-driver 0.0.19
clickhouse-sqlalchemy 0.0.10

primary key

from clickhouse_sqlalchemy.ext.declarative import declarative_base


class BaseModel(declarative_base()):
    __abstract__ = True
    __table_args__ = {'extend_existing': True}


class MachineOp(BaseModel):
    __tablename__ = 'machine_op'
    date = Column(Date)
    time = Column(UInt32)
    ip = Column(String)
    behavior_type = Column(UInt8)
    file_name = Column(String)
    file_path = Column(String)
    detail = Column(String)

ERROR

sqlalchemy.exc.ArgumentError: Mapper mapped class MachineOp->machine_op could not assemble any primary key columns for mapped table 'machine_op'

table exist

class BaseModel(declarative_base()):
    __abstract__ = True
    # __table_args__ = {'extend_existing': True}

class MachineOp(BaseModel):
    __tablename__ = 'xxxxdfaf'
    date = Column(Date)
    time = Column(UInt32)
    ip = Column(String)
    behavior_type = Column(UInt8)
    file_name = Column(String)
    file_path = Column(String)
    detail = Column(String)

I don't have any table named 'xxxxdfaf', which is strange.

ERROR: Failure: InvalidRequestError (Table 'xxxxdfaf' is already defined for this MetaData instance.  Specify 'extend_existing=True' to redefine options and columns on an existing Table object.)

My table setting

CREATE TABLE machine_op (
	`date` Date,
	`time` UInt32,
	`ip` String,
	`behavior_type` UInt8,
	`file_name` String,
	`file_path` String,
	`detail` String
) ENGINE = MergeTree()
PARTITION BY date
ORDER BY date;
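A common workaround for the "could not assemble any primary key columns" error, sketched here with stock SQLAlchemy types standing in for the ClickHouse ones so the snippet is self-contained, is to mark the ordering columns as primary keys purely for the ORM mapper's benefit; this does not have to match anything ClickHouse enforces server-side:

```python
from sqlalchemy import Column, Date, Integer, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class MachineOp(Base):
    __tablename__ = "machine_op"

    # Marked primary_key only so the mapper can assemble a row identity;
    # ClickHouse's actual ordering comes from the table's ORDER BY clause.
    date = Column(Date, primary_key=True)
    time = Column(Integer, primary_key=True)
    ip = Column(String)
    behavior_type = Column(Integer)
    file_name = Column(String)
    file_path = Column(String)
    detail = Column(String)

print([c.name for c in MachineOp.__table__.primary_key])  # ['date', 'time']
```

The second error is unrelated to the database: `Table 'xxxxdfaf' is already defined for this MetaData instance` refers to SQLAlchemy's in-process MetaData registry, which can retain a table definition from an earlier import of the same module.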
