
clickhouse-java's Introduction

ClickHouse Java Client & JDBC Driver

About the Project

This is the official Java client and JDBC driver for the ClickHouse database (https://github.com/ClickHouse/ClickHouse). The Java client is the core component and provides an API to interact with the database. In 2023 this component and its API were refactored into a new component, client-v2. Both versions are available, but the older one will be deprecated soon; it will, however, keep receiving security and critical bug fixes. The new client-v2 has a stable API, and we are working on performance and feature parity to make it production ready.
The JDBC driver component is an implementation of the JDBC API. It uses the Java client API to interact with the database server.

Benefits of using Client-V2:

  • Stable API.
  • Core functionality is implemented:
    • SSL & mTLS support
    • RowBinary* formats support for reading
    • Proxy support
    • HTTP protocol
  • New Insert API that accepts a list of POJOs
  • New Query API that returns a list of GenericRecords that can be used as DTOs
  • Native format reader
  • Performance improvements
    • Fewer internal buffers compared to the old client
    • More configuration options for performance tuning
    • Fewer object allocations
  • Upcoming new features

The old client should still be used when:

  • using the JDBC driver (we are working on refactoring it)

Important

Upcoming deprecations:

Component | Version | Comment
ClickHouse CLI Client (Java) | 0.7.0 | Please use clickhouse-client instead (see https://clickhouse.com/docs/en/interfaces/cli#clickhouse-client)
ClickHouse gRPC Client | 0.7.0 | Please use the ClickHouse HTTP client instead. The gRPC protos are still available at https://github.com/ClickHouse/ClickHouse/tree/master/src/Server/grpc_protos

Installation

Releases: Maven Central (website: https://mvnrepository.com/artifact/com.clickhouse)

Nightly Builds: https://s01.oss.sonatype.org/content/repositories/snapshots/com/clickhouse/

Client V2

Artifacts

Component | Maven Central Link
ClickHouse Java Client V2 | Maven Central

Compatibility

ClickHouse Version | Client Version | Comment
Server >= 23.0 | 0.6.2 |

Features

  • HTTP API support for ClickHouse
  • Bi-directional compression (see the configuration sketch after this list)
    • LZ4
  • Insert from POJO (data is provided as a list of Java objects)
  • Query formats support:
    • RowBinary readers
    • Native format reader
  • Apache HTTP Client as the HTTP client
    • Connection pooling
    • Retry on failures
  • SSL support
  • Cloud support
  • Proxy support
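
For illustration, a minimal sketch of enabling LZ4 compression in both directions via the client-v2 builder. The builder method names compressServerResponse(...) and compressClientRequest(...) are assumptions and may differ between client versions:

Client client = new Client.Builder()
        .addEndpoint("https://localhost:8443")
        .setUsername("default")
        .setPassword("")
        // assumed builder toggles; exact names may vary by version
        .compressServerResponse(true) // server LZ4-compresses responses
        .compressClientRequest(true)  // client LZ4-compresses request bodies
        .build();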

Examples

Begin-with Usage Examples

Spring Demo Service

Minimal client setup:

String endpoint = "https://<db-instance hostname>:8443/";
Client client = new Client.Builder()
        .addEndpoint(endpoint)
        .setUsername(user)
        .setPassword(password)
        .setDefaultDatabase(database)
        .build();

Insert POJOs example:

client.register(
  ArticleViewEvent.class, // your DTO class  
  client.getTableSchema(TABLE_NAME)); // corresponding table

List<ArticleViewEvent> events = ...; // load data

try (InsertResponse response = client.insert(TABLE_NAME, events).get(1, TimeUnit.SECONDS)) {
  // process results 
}
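
For reference, a minimal sketch of what the registered DTO might look like, assuming a table with columns id (Float64), title (String) and url (String) as in the query example below; the exact mapping rules (fields vs. getters, naming, annotations) depend on the client version:

// Illustrative sketch only: property names are assumed to match the column names.
public class ArticleViewEvent {
    private double id;
    private String title;
    private String url;

    public double getId() { return id; }
    public void setId(double id) { this.id = id; }
    public String getTitle() { return title; }
    public void setTitle(String title) { this.title = title; }
    public String getUrl() { return url; }
    public void setUrl(String url) { this.url = url; }
}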

Query results reader example:

// The default format is RowBinaryWithNamesAndTypes, so the reader has all information about columns
try (QueryResponse response = client.query(sql).get(3, TimeUnit.SECONDS)) {

    // Create a reader to access the data in a convenient way
    ClickHouseBinaryFormatReader reader = new RowBinaryWithNamesAndTypesFormatReader(response.getInputStream(),
            response.getSettings());

    while (reader.hasNext()) {
        reader.next(); // Read the next record from stream and parse it

        double id = reader.getDouble("id");
        String title = reader.getString("title");
        String url = reader.getString("url");

        // result processing 
    }
}

Query result as a list of objects example:

// Data is read completely and returned as a list of objects.
client.queryAll(sql).forEach(row -> {
    double id = row.getDouble("id");
    String title = row.getString("title");
    String url = row.getString("url");

    // result processing
});

Connecting to a ClickHouse Cloud instance, or to a DB server with a CA-signed (not self-signed) certificate:

Client client = new Client.Builder()
  .addEndpoint("https://" + dbHost + ":8443")
  .setUsername("default")
  .setPassword("")
  .build();

Connecting to a database instance with a self-signed certificate:

Client client = new Client.Builder()
  .addEndpoint("https://" + dbHost + ":8443")
  .setUsername("default")
  .setPassword("")
  .setRootCertificate("localhost.crt") // path to the CA certificate
  //.setClientKey("user.key") // user private key 
  //.setClientCertificate("user.crt") // user public certificate
  .build();

Client V1

Artifacts

Component | Maven Central Link
ClickHouse Java HTTP Client | Maven Central
ClickHouse JDBC Driver | Maven Central

Features

  • HTTP API support for ClickHouse
  • Bi-directional compression
    • LZ4
  • Apache HTTP Client as the HTTP client
    • Connection pooling
    • Retry on failures
  • SSL & mTLS support
  • Cloud support
  • Proxy support

Examples

See the Java client examples

See the JDBC examples

Compatibility

  • All projects in this repo are tested with all active LTS versions of ClickHouse.
  • Support policy
  • We recommend upgrading the client regularly so you don't miss security fixes and new improvements.
    • If you have an issue with migration, create an issue and we will respond!

Documentation

Java Client V1 Docs :: ClickHouse website

JDBC Docs :: ClickHouse website.

Contributing

Please see our contributing guide.

clickhouse-java's Issues

Spark SQLException: Unsupported type 2003 - Array(String)

Spark 2.1 (Hadoop 2.7)
Java/Python
ClickHouse server: 1.1.54236

A table in ClickHouse has a column of type Array(String).
Spark fails with java.sql.SQLException: Unsupported type 2003.

Java code:

HashMap<String, String> clickhouseOptions = new HashMap<String, String>() {{
    put("driver", "ru.yandex.clickhouse.ClickHouseDriver");
    put("url", "jdbc:clickhouse://clickhouse:8123");
    put("dbtable", "(select dimensions.name from testtable)");
}};

SparkSession spark = SparkSession.builder().getOrCreate();
Dataset<Row> test = spark.read().format("jdbc").options(clickhouseOptions).load();

The load() method fails while loading the schema into the DataFrame.

Spark Stacktrace :

Exception in thread "main" java.sql.SQLException: Unsupported type 2003
	at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$getCatalystType(JdbcUtils.scala:209)
	at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$5.apply(JdbcUtils.scala:246)
	at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$5.apply(JdbcUtils.scala:246)
	at scala.Option.getOrElse(Option.scala:121)
	at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.getSchema(JdbcUtils.scala:245)
	at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:64)
	at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:113)
	at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:45)
	at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:330)
	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:152)
	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:125)
	at com.activitystream.demos.SparkDemo.testClickhouseJDBC(SparkDemo.java:29)
	at com.activitystream.demos.SparkDemo.main(SparkDemo.java:55)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

executeBatch implementation

@serebrserg @AndreevDm

Currently I am working on an integration between Metabase and ClickHouse.

It seems that the clickhouse-jdbc driver doesn't support statements like:
INSERT INTO test.batch_insert (s, i) VALUES (?, ?), (?, ?)
and
INSERT INTO test.batch_insert (s, i) VALUES (?, 101), (?, 102)

It looks like this is supported by ClickHouse itself (I've written some tests for these formats and fixed them), so can it be supported in JDBC? A sketch of the desired usage follows.
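
For clarity, this is the usage the issue asks for, expressed as a JDBC sketch; whether the driver parses a multi-row VALUES clause depends on the driver version:

PreparedStatement ps = connection.prepareStatement(
        "INSERT INTO test.batch_insert (s, i) VALUES (?, ?), (?, ?)");
ps.setString(1, "a"); // first row
ps.setInt(2, 1);
ps.setString(3, "b"); // second row
ps.setInt(4, 2);
ps.execute();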

Table select doesn't work in JetBrains IDE

I successfully compiled the latest master and installed it in the PyCharm settings. The connection is OK: I can see the schema, databases, tables and their fields. Queries to ClickHouse via the query interface work fine.
But when I try to fetch the data of any table by clicking on it, I receive:

[47] ClickHouse exception, code: 47, host: localhost, port: 58270; Code: 47, e.displayText() = DB::Exception: Unknown identifier: t, e.what() = DB::Exception

java.lang.Throwable: Code: 47, e.displayText() = DB::Exception: Unknown identifier: t, e.what() = DB::Exception

	at ru.yandex.clickhouse.except.ClickHouseExceptionSpecifier.specify(ClickHouseExceptionSpecifier.java:53)
	at ru.yandex.clickhouse.except.ClickHouseExceptionSpecifier.specify(ClickHouseExceptionSpecifier.java:28)
	at ru.yandex.clickhouse.ClickHouseStatementImpl.getInputStream(ClickHouseStatementImpl.java:507)
	at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:96)
	at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:80)
	at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:76)
	at ru.yandex.clickhouse.ClickHouseStatementImpl.execute(ClickHouseStatementImpl.java:162)
	at com.intellij.database.remote.jdbc.impl.RemoteStatementImpl.execute(RemoteStatementImpl.java:71)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:324)
	at sun.rmi.transport.Transport$1.run(Transport.java:200)
	at sun.rmi.transport.Transport$1.run(Transport.java:197)
	at java.security.AccessController.doPrivileged(Native Method)
	at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
	at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568)
	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:683)
	at java.security.AccessController.doPrivileged(Native Method)
	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

Several databases(shards)

Hi!
Could you please explain how you work with shards (the Distributed engine) using your JDBC driver?
How do you do service discovery? Do you create several data sources and use them, or do you do it some other way?
Thanks!

Error when try to connect through Apache Spark

When I try to load a table using the current driver version (built with mvn package) through PySpark, I get the error below:

def load_table(table_name):
    return spark.read.format("jdbc").options(
        driver=JDBC_DRIVER,
        url=CONNECTION_URL,
        dbtable=table_name
    ).load()
17/06/22 18:38:08 INFO clickhouse.ClickHouseDriver: Driver registered
17/06/22 18:38:08 INFO clickhouse.ClickHouseDriver: Creating connection

........

py4j.protocol.Py4JJavaError: An error occurred while calling o47.load.
: java.lang.NoSuchFieldError: INSTANCE
	at org.apache.http.impl.io.DefaultHttpRequestWriterFactory.<init>(DefaultHttpRequestWriterFactory.java:52)
	at org.apache.http.impl.io.DefaultHttpRequestWriterFactory.<init>(DefaultHttpRequestWriterFactory.java:56)
	at org.apache.http.impl.io.DefaultHttpRequestWriterFactory.<clinit>(DefaultHttpRequestWriterFactory.java:46)
	at org.apache.http.impl.conn.ManagedHttpClientConnectionFactory.<init>(ManagedHttpClientConnectionFactory.java:82)
	at org.apache.http.impl.conn.ManagedHttpClientConnectionFactory.<init>(ManagedHttpClientConnectionFactory.java:95)
	at org.apache.http.impl.conn.ManagedHttpClientConnectionFactory.<init>(ManagedHttpClientConnectionFactory.java:104)
	at org.apache.http.impl.conn.ManagedHttpClientConnectionFactory.<clinit>(ManagedHttpClientConnectionFactory.java:62)
	at org.apache.http.impl.conn.PoolingHttpClientConnectionManager$InternalConnectionFactory.<init>(PoolingHttpClientConnectionManager.java:572)
	at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.<init>(PoolingHttpClientConnectionManager.java:174)
	at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.<init>(PoolingHttpClientConnectionManager.java:158)
	at ru.yandex.clickhouse.util.ClickHouseHttpClientBuilder.getConnectionManager(ClickHouseHttpClientBuilder.java:77)
	at ru.yandex.clickhouse.util.ClickHouseHttpClientBuilder.buildClient(ClickHouseHttpClientBuilder.java:54)
	at ru.yandex.clickhouse.ClickHouseConnectionImpl.<init>(ClickHouseConnectionImpl.java:69)
	at ru.yandex.clickhouse.ClickHouseDriver.connect(ClickHouseDriver.java:58)
	at ru.yandex.clickhouse.ClickHouseDriver.connect(ClickHouseDriver.java:50)
	at ru.yandex.clickhouse.ClickHouseDriver.connect(ClickHouseDriver.java:32)

Default connection timeout is way too low (50ms).

ClickHouseConnectionSettings.CONNECTION_TIMEOUT("connection_timeout", 50)

I traced this to the Apache HTTP client being used -- the value is in milliseconds even though it is specified as an int.

With this default, anything that is not very local will fail (in my case, trying to connect to a database in an EU DC from Canada).

How can I get the WITH TOTALS result?

Hi!

Example query:

"
SELECT EventDate, count() AS c
FROM test.hits
GROUP BY EventDate
WITH TOTALS
ORDER BY EventDate
"

Where in the ResultSet can I get the totals?

java.lang.RuntimeException: Parse exception: ByteFragment

I have a query:

SELECT
date,
groupArray([widget, toString(actionCount)]) AS widgetList
FROM
(
SELECT
date,
arrayJoin(queries) AS widget,
count(*) AS actionCount
FROM prodlog_temp.log
WHERE mark = 'page updated'
GROUP BY
date,
widget
ORDER BY
date ASC,
actionCount DESC
LIMIT 5 BY date
)
GROUP BY date
LIMIT 10

which executes fine in the native client:
http://joxi.ru/n2Yv7nNcjnPk0A

But via the JDBC driver I get an error:
http://joxi.ru/DmBqenoHN8aqxr
(in Apache Zeppelin)

Driver version: ru.yandex.clickhouse:clickhouse-jdbc:0.1.14

ClickHouseStatementImpl.getUpdateCount() doesn't signal no more results

The implementation of ClickHouseStatementImpl.getUpdateCount() always returns 0 if the previous statement didn't return a result set, no matter how often getMoreResults() is called. This causes any loop that tries to read all results from a Statement to run forever, since the following expression from the JavaDoc for Statement is always false:

((stmt.getMoreResults() == false) && (stmt.getUpdateCount() == -1))

Setting password

I cannot find a way to set the password with the JDBC driver.
And as far as I can see, the password is not used when you create a connection:

public Connection getConnection(String username, String password) throws SQLException {
    return driver.connect(url, properties);
}

@serebrserg, maybe there is something I have not noticed? Can you help me with this?
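
As a possible workaround, a sketch that passes credentials via Properties instead of getConnection(username, password); this assumes the driver honors the standard user/password property keys:

Properties props = new Properties();
props.setProperty("user", "default");    // assumed property key
props.setProperty("password", "secret"); // assumed property key
Connection conn = DriverManager.getConnection(
        "jdbc:clickhouse://localhost:8123/default", props);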

JDBC doesn't throw an exception when `CREATE TABLE AS SELECT` times out

ClickHouse version 1.1.54140.
Checked against JDBC driver versions 0.1.12 and 0.1.17.


Steps to reproduce:

  • create a user with max_execution_time = 1
  • execute using JDBC: CREATE TABLE table_name ENGINE = Memory AS select ...,
    where ... is a query that is guaranteed to time out
  • verify that:
    • JDBC didn't throw an exception for the timed-out CREATE TABLE ... query
    • table_name was created but is empty

Test (scala):

  test("create table from select timeout") {
    withConn { implicit conn =>
      withStatement { st =>

        val tableName = "tmp_test_table_" + Math.abs(Random.nextInt()) + "_" + (System.currentTimeMillis() % 1000)
        try {
          st.executeUpdate(
            s"""CREATE TABLE $tableName ENGINE = Memory  AS
               select count(*) as a from dw.ad_aggr_internal_bid_by_time_shard"""
          )

          val rs = st.executeQuery(s"select a from $tableName")
          assert(rs.next() === true)
          rs.close()

        } finally {
          Try(st.executeUpdate(s"drop table $tableName"))
        }
      }
    }
  }

The test fails with the assertion error false did not equal true, but it should instead fail at st.executeUpdate(...), which times out.


This is a JDBC-specific bug, as the same request over HTTP correctly returns an exception:

echo 'CREATE TABLE qqq ENGINE = Memory  AS select count(*) as a from dw.ad_aggr_internal_bid_by_time_shard' | curl 'http://localhost:8123/?user=test&database=dw' -d @-
Code: 159, e.displayText() = DB::Exception: Timeout exceeded: elapsed 1.080002066 seconds, maximum: 1, e.what() = DB::Exception

BigInteger value throws an exception

If we set a BigInteger value in a prepared statement, it throws an exception like this:

PreparedStatement st = null;
try {
    st = new ClickHouseDataSource(
            connections.get(0),
            properties
    ).getConnection().prepareStatement(
            "INSERT INTO form_test(response_id,submit_date) VALUES (123, ?)");
    st.setObject(1, new BigInteger("123"));

ClickHouse exception, code: 53, host: Code: 53, e.displayText() = DB::Exception: Type mismatch in IN or VALUES section: Int64 expected, String got: (at row 1), e.what() = DB::Exception

And a question: is Int64 a BigInteger or a Long?
Thanks!
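
ClickHouse's Int64 is a signed 64-bit integer, which maps to a Java Long/long. A workaround sketch for the code above (Java 8+):

// Convert explicitly; longValueExact() throws ArithmeticException
// if the value doesn't fit into a signed 64-bit long.
st.setLong(1, new BigInteger("123").longValueExact());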

Nested

It is not possible to insert data into fields of type Nested through a statement. On insert it says "No such column".

Problem with spaces

If there are spaces at the beginning of a "select" query, like " SELECT FROM 1", then the executeQuery method returns null.
I found that in the isSelect method you check the query with startsWith, without trimming leading spaces.
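
A hypothetical fix along those lines (a sketch, not the actual driver code):

// Trim leading whitespace before testing whether the statement is a SELECT.
static boolean isSelect(String sql) {
    return sql.trim().regionMatches(true, 0, "SELECT", 0, "SELECT".length());
}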

Cannot override setting (max_execution_time) in readonly mode

I cannot connect through the JDBC driver as a readonly user, because "socketTimeout + dataTransferTimeout is sent to ClickHouse as max_execution_time", and readonly users cannot run SET statements.
Any query throws an exception:
DB::Exception: Cannot override setting (max_execution_time) in readonly mode

Error "Incompatible Jackson version" when using with PySpark 2.2.0

Hello.
When I try to use this lib to dump a PySpark DataFrame to ClickHouse, I get an error.

  1. Clone and compile the repo:
cd /var/bigdata/
git clone https://github.com/yandex/clickhouse-jdbc
cd clickhouse-jdbc
mvn package assembly:single -DskipTests=true
mv target/clickhouse-jdbc-0.1-SNAPSHOT-jar-with-dependencies.jar $SPARK_PATH/jars/
  2. My PySpark code:
df = load_df(...)
url = "jdbc:clickhouse://default:@abc.def.ru/db"
properties = {"driver": "ru.yandex.clickhouse.ClickHouseDriver"}
df.write.jdbc(url=config['ch_url'], table="data.posdata_test", mode="append", properties=properties)
  3. Run it from bash:
$SPARK_PATH/bin/spark-submit import.py --jars clickhouse-jdbc-0.1-SNAPSHOT-jar-with-dependencies.jar 
  4. I get this error:
[2017-09-20 11:09:01,474] [root] [ERROR] [-] [-] exception during process
Traceback (most recent call last):
  File "/home/vsmelov/PycharmProjects/etl/spark_test/import.py", line 76, in main
    process_sources(source_list)
  File "/home/vsmelov/PycharmProjects/etl/spark_test/import.py", line 61, in process_sources
    df = load_df_from_sources(source_list, first_row_is_header=True)
  File "/home/vsmelov/PycharmProjects/etl/spark_test/load_df.py", line 193, in load_df_from_sources
    rdd = load_rdd_from_file(source.source_file_name, first_row_is_header=first_row_is_header)
  File "/home/vsmelov/PycharmProjects/etl/spark_test/load_df.py", line 157, in load_rdd_from_file
    rdd = df.rdd.mapPartitionsWithIndex(
  File "/var/bigdata/spark-2.2.0-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/sql/dataframe.py", line 85, in rdd
    jrdd = self._jdf.javaToPython()
  File "/var/bigdata/spark-2.2.0-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/var/bigdata/spark-2.2.0-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/sql/utils.py", line 63, in deco
    return f(*a, **kw)
  File "/var/bigdata/spark-2.2.0-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip/py4j/protocol.py", line 319, in get_return_value
    format(target_id, ".", name), value)
py4j.protocol.Py4JJavaError: An error occurred while calling o27.javaToPython.
: java.lang.ExceptionInInitializerError
	at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135)
	at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116)
	at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
	at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92)
	at org.apache.spark.sql.Dataset.javaToPython(Dataset.scala:2794)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
	at py4j.Gateway.invoke(Gateway.java:280)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.GatewayConnection.run(GatewayConnection.java:214)
	at java.lang.Thread.run(Thread.java:748)
Caused by: com.fasterxml.jackson.databind.JsonMappingException: Incompatible Jackson version: 2.7.3
	at com.fasterxml.jackson.module.scala.JacksonModule$class.setupModule(JacksonModule.scala:64)
	at com.fasterxml.jackson.module.scala.DefaultScalaModule.setupModule(DefaultScalaModule.scala:19)
	at com.fasterxml.jackson.databind.ObjectMapper.registerModule(ObjectMapper.java:722)
	at org.apache.spark.rdd.RDDOperationScope$.<init>(RDDOperationScope.scala:82)
	at org.apache.spark.rdd.RDDOperationScope$.<clinit>(RDDOperationScope.scala)
	... 16 more
Stack (most recent call last):
  File "/home/vsmelov/PycharmProjects/etl/spark_test/import.py", line 89, in <module>
    main()
  File "/home/vsmelov/PycharmProjects/etl/spark_test/import.py", line 78, in main
    logger.error('exception during process', exc_info=1, stack_info=1)

My Spark version:

vsmelov@vsmelov:~/PycharmProjects/etl/spark_test$ $SPARK_PATH/bin/spark-submit --version
version 2.2.0
Using Scala version 2.11.8, Java HotSpot(TM) 64-Bit Server VM, 1.8.0_131
Branch 
Compiled by user jenkins on 2017-06-30T22:58:04Z

What can I do about this?
Thanks in advance.

Error: Could not initialize class ru.yandex.clickhouse.ClickHouseUtil when using with PySpark

When I dump a PySpark DataFrame containing columns of types Integer, DateTime, Float and Date to ClickHouse, everything works fine.
But when I try to dump a DataFrame with a String-typed column, I get an error.

My PySpark code:

    df = spark.range(0, 10).withColumnRenamed('id', 'field_int')
    from pyspark.sql.functions import lit
    df = df.withColumn('field_str', lit('abcdef'))
    df.write.jdbc(url=config['ch_url'], table="test_with_string", mode="append",
                  properties=config["ch_properties"])

Error:

Caused by: java.lang.NoClassDefFoundError: Could not initialize class ru.yandex.clickhouse.ClickHouseUtil
	at ru.yandex.clickhouse.ClickHousePreparedStatementImpl.setString(ClickHousePreparedStatementImpl.java:214)
	at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$makeSetter$8.apply(JdbcUtils.scala:525)
	at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$makeSetter$8.apply(JdbcUtils.scala:524)
	at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.savePartition(JdbcUtils.scala:629)
	at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:782)
	at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:782)
	at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:926)
	at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:926)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2062)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2062)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
	at org.apache.spark.scheduler.Task.run(Task.scala:108)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	... 1 more

What can I do about this?
I am not a Java developer, so I don't know how to compile and "import" the class ru.yandex.clickhouse.ClickHouseUtil into my project.

There are some files in the compiled repository:

vsmelov@vsmelov:~/PycharmProjects/etl/spark_test$ find /var/bigdata/clickhouse-jdbc/ -name '*ClickHouseUtil*'
/var/bigdata/clickhouse-jdbc/src/test/java/ru/yandex/clickhouse/ClickHouseUtilTest.java
/var/bigdata/clickhouse-jdbc/src/main/java/ru/yandex/clickhouse/ClickHouseUtil.java
/var/bigdata/clickhouse-jdbc/target/classes/ru/yandex/clickhouse/ClickHouseUtil.class
/var/bigdata/clickhouse-jdbc/target/test-classes/ru/yandex/clickhouse/ClickHouseUtilTest.class
/var/bigdata/clickhouse-jdbc/target/apidocs/ru/yandex/clickhouse/class-use/ClickHouseUtil.html
/var/bigdata/clickhouse-jdbc/target/apidocs/ru/yandex/clickhouse/ClickHouseUtil.html

But there are no .jar files.
Thanks in advance.

Ignite cache Store

I would like to import models from ClickHouse into the Ignite Console via the JDBC driver, but the console failed to collect schemas:

java.lang.NoClassDefFoundError: com/google/common/collect/MapMaker
        at ru.yandex.clickhouse.ClickHouseDriver.<clinit>(ClickHouseDriver.java:36)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:348)
        at org.apache.ignite.console.agent.db.DbMetadataReader.connect(DbMetadataReader.java:126)
        at org.apache.ignite.console.agent.handlers.DatabaseListener.connect(DatabaseListener.java:203)
        at org.apache.ignite.console.agent.handlers.DatabaseListener.schemas(DatabaseListener.java:220)
        at org.apache.ignite.console.agent.handlers.DatabaseListener$1.execute(DatabaseListener.java:82)
        at org.apache.ignite.console.agent.handlers.AbstractListener$1.run(AbstractListener.java:67)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: com.google.common.collect.MapMaker
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:814)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 13 more


AutoCommit

Hi!
As I can see in the code, you have

public void setAutoCommit(boolean autoCommit) throws SQLException {
}

Does this mean that ClickHouse supports autocommit by default, or that there is no commit in ClickHouse at all?
Or maybe I don't understand something? :)
Thanks!

Remote connectivity issue

new ClickHouseDataSource("jdbc:clickhouse://clickhouse:8123").getConnection().createStatement();
When I try to connect like this, I get the following error:

Exception in thread "main" ru.yandex.clickhouse.except.ClickHouseException: ClickHouse exception, code: 210, host: clickhouse, port: 8123; Connect to clickhouse:8123 [clickhouse/40.78.***.***] failed: connect timed out at ru.yandex.clickhouse.except.ClickHouseExceptionSpecifier.specify(ClickHouseExceptionSpecifier.java:42) at ru.yandex.clickhouse.except.ClickHouseExceptionSpecifier.specify(ClickHouseExceptionSpecifier.java:23) at ru.yandex.clickhouse.ClickHouseStatementImpl.getInputStream(ClickHouseStatementImpl.java:461) at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:73) at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:69) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at ru.yandex.clickhouse.util.LogProxy.invoke(LogProxy.java:42) at com.sun.proxy.$Proxy3.executeQuery(Unknown Source) at am.galstyan.ClickhouseJdbc.main(ClickhouseJdbc.java:17) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)

It seems that the connection works fine only if the host is local; otherwise it throws this exception.

Even when I try to connect to, say, fb.com, it throws the very same exception, yet when I connect to localhost, it works fine.
Could you help me resolve the issue?

TCP based connector

Is it possible to implement a TCP-based driver instead of HTTP? It should be more stable and faster.

Error when using Spark - Syntax error: failed at position 8

Spark: 2.1.0 (Hadoop 2.7)
Java/Python
clickhouse server/client 1.1

When querying using Spark and the ClickHouse JDBC driver (newest source, head 164d10c), we get this error:

Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: 
Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, 192.168.1.179, executor 2): 
ru.yandex.clickhouse.except.ClickHouseException: 
ClickHouse exception, code: 62, host: clickhouse, port: 8123; 
Code: 62, e.displayText() = DB::Exception: 
Syntax error: failed at position 8: "name","city","country" FROM default.spark FORMAT TabSeparatedWithNamesAndTypes;, 
expected not empty list of expressions, e.what() = DB::Exception

Caused by this line in the code:
https://goo.gl/73N8Ya

Our java spark code looks like this:

HashMap<String, String> clickhouseOptions = new HashMap<String, String>() {{
    put("driver", "ru.yandex.clickhouse.ClickHouseDriver");
    put("url", "jdbc:clickhouse://clickhouse:8123");
    put("dbtable", "default.spark");
}};

SparkSession spark = SparkSession.builder().getOrCreate();
Dataset<Row> test = spark.read().format("jdbc").options(clickhouseOptions).load();
test.show();

When calling show() the error happens.

The default.spark table has three columns: name, city, country.
The SELECT query, coming from Spark, looks like this:
SELECT "name","city","country" FROM default.spark

After clickhousifySql(sql), https://goo.gl/1wAuWb:
SELECT "name","city","country" FROM default.spark FORMAT TabSeparatedWithNamesAndTypes;

The problem lies in the quotation marks; removing them, or replacing them with backquotes (to support spaces in column names), resolves the error:
SELECT `name`,`city`,`country` FROM default.spark FORMAT TabSeparatedWithNamesAndTypes;

Translate Statement#setMaxRows() into LIMIT clause

Using graphical JDBC clients to browse ClickHouse tables, I noticed that the row limit / pagination features of the clients (and thus their table browsing interfaces) practically do not work. What I mean is that on sufficiently large tables, the queries take so much time that they are useless in practice (and eventually fail anyway).

The reason is that the graphical clients use Statement#setMaxRows() to set the number of rows they are interested in before executing the query, as per JDBC specs. But the driver does not send that information to the server, sending instead a limitless query via HTTP POST.

By its nature, being throughput-oriented, ClickHouse will interpret a limitless query as really wanting the entire table, and so will take a long time to prepare the result set, which in many cases will not fit in the server memory and thus will eventually fail.

I believe ClickHouseStatementImpl should add maxRows to the query source as a LIMIT clause (before adding FORMAT) if that option was set by the JDBC client, instead of sending a limitless query and then truncating the result set.
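
As an illustration, a minimal sketch of the proposed behavior (a hypothetical helper, not the actual driver code); a real implementation would also need to respect queries that already contain a LIMIT clause:

// Append maxRows as a LIMIT clause before the FORMAT clause is added.
private static String applyMaxRows(String sql, int maxRows) {
    return maxRows > 0 ? sql + " LIMIT " + maxRows : sql;
}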

getTables() does not list .inner. tables

I suppose it happened when different types were assigned to different objects.
I can see .inner. tables if I use 0.1.14, where all objects are listed as 'table'.
With 0.1.25 or 0.1.26 there are tables, views and others, but no .inner. tables.

ClickHouseRowBinaryStream doesn't support arrays

The implementation is missing. I tried to implement it myself but had no luck. Here is a sample method for a string array:

public void writeArray(String[] array) throws IOException {
    Preconditions.checkNotNull(array);
    writeUnsignedLeb128(array.length);
    for (String el : array) {
        writeString(el);
    }
}

With this implementation I get DB::Exception: Cannot read all data (ClickHouse server version 1.1.54236).
Do you see what could be wrong with this code?

Incorrect parsing of complex types (Arrays and Tuples)

Here are some cases that I've found (rs is a ResultSet instance):

  1. Tuple:

    select ('1', 2) as a;
    

    can only be extracted as String:

    rs.getMetaData.getColumnTypeName(1) = "Tuple(String, UInt8)"
    rs.getObject(1) = "('1',2)"
    
  2. Nested arrays:

    select [[1,2], [3,4]] as a;
    

    cannot be extracted as Array:

    rs.getMetaData.getColumnTypeName(1) = "Array(Array(UInt8))"
    rs.getObject(1) // Method threw 'java.lang.RuntimeException' exception.
                               // Parse exception: ByteFragment{[[[1,2],[3,4]]], start=0, len=13}
    rs.getArray(1) // Method threw 'java.lang.IllegalStateException' exception.
    
  3. Tuples nested in Array:

    select [('a',1), ('b', 2)] as a;
    

    strange parsing behavior when parsed as Array:

    rs.getMetaData.getColumnTypeName(1) = Array(Tuple(String, UInt8))
    rs.getArray(1)={ru.yandex.clickhouse.ClickHouseArray}:
           array = {java.lang.String[4]@3644} 
           0 = "('a'"
           1 = "1)"
           2 = "('b'"
           3 = "2)"
    

Return types incorrect for nullable values

When we select a value that is nullable (e.g. Nullable(Float)), what we get back from the JDBC driver is a value of type String. Looking at the code, it seems that only non-nullable Float, Int and UInt are currently supported as return types, with everything else being converted to String. It would be great to get a fix for this, so that when selecting a Nullable(Float) value we actually get back a value of type Float.

PreparedStatement.execute() fails on null values

PreparedStatement.execute() fails if a null value has been bound to a parameter with setNull:

Caused by: java.lang.Throwable: Code: 62, e.displayText() = DB::Exception: Cannot parse expression of type Nullable(String) here: \N), e.what() = DB::Exception

	at ru.yandex.clickhouse.except.ClickHouseExceptionSpecifier.specify(ClickHouseExceptionSpecifier.java:53)

PreparedStatement.executeBatch() however works.

Example code:

        Connection connection = DriverManager.getConnection("jdbc:clickhouse://clickhouse:8123/default");
        Statement st = connection.createStatement();
        st.executeUpdate("create table if not exists test_null_insert(value Nullable(String)) engine = Memory");

        try {
            PreparedStatement ps = connection.prepareStatement("insert into test_null_insert(value) values(?)");

            // Works
            ps.setNull(1, Types.VARCHAR);
            ps.addBatch();
            ps.executeBatch();

            // Doesn't work
            ps.setNull(1, Types.VARCHAR);
            ps.execute();
        } finally {
            ResultSet rs = st.executeQuery("select value from test_null_insert");
            while (rs.next()) {
                String value = rs.getString(1);
                System.out.println(value != null ? "\"" + value + "\"" : "null");
            }

            st.executeUpdate("drop table test_null_insert");
            connection.close();
        }

Same Connection executes queries on different nodes under certain DNS settings

This is the setup that we have in our company:

  • CH cluster (multiple nodes with different IPs)
  • DNS server that resolves an internal hostname to the list of IPs of the CH nodes
  • CH clients that have a low setting for networkaddress.cache.ttl and sun.net.inetaddr.ttl to enable DNS-based service discovery and load balancing
  • CH client creates a Connection and executes a bunch of dependent requests:
    • create "temporary" † tables
    • fill these tables with data
    • query these tables
    • drop tables

† "temporary" tables are not really temporary due to the restrictions of the CH HTTP API, so these tables have to be dropped manually.

And here is what happened:
Sometimes during the long "session" java's DNS cache was flushed, and since certain point of time CH JDBC requests started to query another CH node, despite the fact that all queries were made within the same CH Connection.

We found a workaround: currently we resolve all IPs for the hostname ourselves and create the CH Connection with one specific IP (sketched below).
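
A sketch of that workaround (the hostname here is illustrative):

// Resolve the hostname once and pin the Connection to a single node.
InetAddress[] nodes = InetAddress.getAllByName("ch.internal.example.com");
String pinnedIp = nodes[0].getHostAddress(); // pick one node (e.g. at random)
Connection conn = DriverManager.getConnection(
        "jdbc:clickhouse://" + pinnedIp + ":8123/default");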

So, here is the question: what are the semantics of a CH Connection? Shouldn't it provide "visibility" guarantees (like: if you successfully created a temporary table and then try to query it within the same Connection, you are guaranteed to see that table)? If it should, then the JDBC driver probably has to somehow guarantee that all queries within the same Connection go to the same CH node, regardless of the client DNS configuration.

This is not a bug report, it's more of a question/suggestion/discussion.

Thanks.

Can we move database names from schemas to catalogs?

I have been working on the driver, trying to get a few SQL management tools to work (DBeaver and DBVisualizer), but I'm having a hard time with the "current database" popup menus.

Apparently they work well with catalogs, but are very buggy with schemas. Probably because getSchema / setSchema was introduced relatively late into the JDBC standard. (I know, it was years ago with Java 7, but many tools are still behind.)

Since ClickHouse only has one level of namespace (the database name), can we change this driver to expose databases as catalogs instead of schemas? Would you accept a pull request with this change?

Batch insert with 30 or more columns causes SQLException: Not all parameters binded

Hi!
I get this exception when 30 or more columns are passed in a batch insert:

java.sql.SQLException: Not all parameters binded
	at ru.yandex.clickhouse.ClickHousePreparedStatementImpl.checkBinded(ClickHousePreparedStatementImpl.java:120)
	at ru.yandex.clickhouse.ClickHousePreparedStatementImpl.buildBinds(ClickHousePreparedStatementImpl.java:126)
	at ru.yandex.clickhouse.ClickHousePreparedStatementImpl.addBatch(ClickHousePreparedStatementImpl.java:327)
	at org.easybatch.jdbc.JdbcRecordWriter.writeRecords(JdbcRecordWriter.java:86)
	at org.easybatch.core.job.BatchJob.writeBatch(BatchJob.java:203)
	at org.easybatch.core.job.BatchJob.call(BatchJob.java:79)
	at org.easybatch.core.job.BatchJob.call(BatchJob.java:22)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)

Example code - https://github.com/greysap/randombatch2clickhouse/blob/master/src/main/java/house/greysap/App.java
This code will work after commenting out lines 74, 186 and 298 (i.e. lowering the column count from 30 to 29).

What's the best way to insert a batch with many columns via the JDBC driver?

Single-column ResultSet gets truncated on empty string

Here is the minimal example:

 select c from (
    select 'a' as c
    UNION ALL select 'b' as c
    UNION ALL select '' as c
    UNION ALL select 'd' as c
 )

ResultSet is read using something like this:

val s = conn.createStatement()
val r = s.executeQuery(sql)

while (r.next()) {
   println(r.getString(1))
}

Expected result should be:

a
b

d

Actual result is:

a
b

Here is the full test snippet (scala):
https://gist.github.com/Aivean/fe8c4f90c183f08aa908baed7ca89478

P.S. Both the TCP and HTTP interfaces return the correct result.

Unable to set connection_timeout

Hi,
I am trying to open a connection with special properties, but my properties don't get applied.
Neither with ClickHouseProperties

Class.forName(CLICKHOUSE_DRIVER_NAME);
ClickHouseProperties clickHouseProperties = new ClickHouseProperties();
clickHouseProperties.setConnectionTimeout(3000);
ClickHouseDataSource dataSource = new ClickHouseDataSource(CLICKHOUSE_URL, clickHouseProperties);

nor with normal Properties.

Class.forName(CLICKHOUSE_DRIVER_NAME);
Properties properties = new Properties();
properties.setProperty("connection_timeout", "3000");
ClickHouseDataSource dataSource = new ClickHouseDataSource(CLICKHOUSE_URL, properties);

Every time I debug, 50 ms is set for connection_timeout:

Maybe this is caused by the values being saved as Integers inside the hashtable; later, when calling getSetting(...) -> getProperty(...), null is returned because the value isn't an instance of String, so the default value is used.

Maybe in the following method inside ClickHouseProperties you should use get() instead of getProperty() (see the sketch after this method):

private <T> T getSetting(Properties info, String key, Object defaultValue, Class clazz){
    Object val = info.getProperty(key);
    if (val == null)
        return (T)defaultValue;
    if (clazz == int.class || clazz == Integer.class) {
        return (T) clazz.cast(Integer.valueOf((String) val));
    }
    if (clazz == long.class || clazz == Long.class) {
        return (T) clazz.cast(Long.valueOf((String) val));
    }
    if (clazz == boolean.class || clazz == Boolean.class) {
        return (T) clazz.cast(Boolean.valueOf((String) val));
    }
    return (T) clazz.cast(val);
}
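
A sketch of that suggested fix (not the actual driver code): get() also sees non-String values stored via put(), and string parsing only happens when a String was actually supplied:

private <T> T getSetting(Properties info, String key, Object defaultValue, Class clazz) {
    Object val = info.get(key); // get() instead of getProperty(): sees non-String values too
    if (val == null)
        return (T) defaultValue;
    if (val instanceof String) {
        String s = (String) val;
        if (clazz == int.class || clazz == Integer.class)
            return (T) Integer.valueOf(s);
        if (clazz == long.class || clazz == Long.class)
            return (T) Long.valueOf(s);
        if (clazz == boolean.class || clazz == Boolean.class)
            return (T) Boolean.valueOf(s);
        return (T) s;
    }
    return (T) val; // already an Integer/Long/Boolean, return as-is
}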

Several statements in one query

Hi, @serebrserg !
Can we execute several statements in one query?
For example: "CREATE TABLE .....; CREATE TABLE .....;"
For now it is not working. Is it bug of jdbc-connector or normal behavior of database?
Thanks!

Float64 gets sorted as string

It started in 0.1.26 actually.
I use SQL Workbench/J (http://www.sql-workbench.net/).
Float64 gets sorted as a string:
[screenshot]
Float32 gets sorted as numbers:
[screenshot]
This is the difference that I can see in the application:
[screenshot]
Unfortunately I cannot easily see how the types were interpreted in 0.1.25 because of the 'Unknown identifier: default_type' error, but I checked with a JDBC driver for another database: sorting works correctly for data types that map to 'Double'.

jdk version

Is exactly JDK 1.6 needed?
Why does this not compile on a higher version?

Support ssl/tls/https connections

Right now there is no way (or we didn't find a way in the docs or by reading the code) to connect to ClickHouse servers using SSL/TLS.

Maybe because the protocol "http" is hardcoded here:

https://github.com/yandex/clickhouse-jdbc/blob/8d2e9fd7f1f6342e9baaa4b8e7a601d29afd1e8d/src/main/java/ru/yandex/clickhouse/ClickHouseStatementImpl.java#L544

and then the wrong type of factory is created here:

https://github.com/yandex/clickhouse-jdbc/blob/1d7561c23feae11eda13165ed08232bf5bac2988/src/main/java/ru/yandex/clickhouse/util/ClickHouseHttpClientBuilder.java#L48

It could be enabled by passing parameters in the JDBC connection URL, like:

String url = "jdbc:clickhouse://clickhouse-000.example.com:8443/default?ssl=true&sslrootcert=ca.pem";

as PostgreSQL does.

SimpleDateFormat is not thread safe

In the class ClickHouseResultSet there is a static SimpleDateFormat, which is not thread safe.
Could you replace it with something else, or use synchronization or a ThreadLocal for this?
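
One common fix, shown as a sketch: keep a per-thread instance via ThreadLocal (Java 8+), so no synchronization is needed:

private static final ThreadLocal<SimpleDateFormat> DATE_TIME_FORMAT =
        ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"));

// usage: each thread gets its own instance
Date parsed = DATE_TIME_FORMAT.get().parse(value);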

Failed to respond

Sometimes I get this exception (when running SELECT/INSERT/ALTER queries):

Caused by: org.apache.http.NoHttpResponseException: clickhouse2:8123 failed to respond
	at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:143)
	at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
	at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:261)
	at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:165)
	at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:167)
	at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:272)
	at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:124)
	at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:271)
	at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184)
	at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:88)
	at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
	at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
	at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
	at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:107)
	at ru.yandex.clickhouse.ClickHouseStatementImpl.getInputStream(ClickHouseStatementImpl.java:492)
	... 26 more

Do you know why this can happen? A wrong configuration, or something else?
Thanks!

Add support for decompress=1 (compression of ClickHouseStatement.sendStream(..))

Please consider adding this. For example, our current use case (not yet on ClickHouse) involves sending live data batches from multiple servers to the database; compression should help to significantly reduce the amount of traffic, since this data is alphanumeric.

In the short term (as a workaround): is there information available about how to implement this compression myself?

ClickHouseLZ4Stream contains decompression code. Assuming the compression format is the same, this code seems to contain all the required information EXCEPT the checksum format (which is only read, not checked). So what is the checksum format? Is it actually checked by the server?

getTimestamp and getTime don't use info about CH server's time zone.

Hello.

While working with dates I ran into a time zone problem.
More specifically: ClickHouse doesn't include a time zone in its DateTime representation, OK. But when the JDBC driver converts a DateTime string with the SimpleDateFormat here, it uses the time zone of the host where the JDBC driver is running, not that of the ClickHouse server!
So it was hard to track down the bug in our system, where data is stored in UTC in the DB but clients work in, let's say, CST. This is the place where UTC data was shifted to CST, which led to a terrible mess.

I think it would be better to set the time zone in the JDBC driver as a property and apply it at ClickHouseResultSet initialization time (that is actually what I do now):

sdf.setTimeZone(TimeZone.getTimeZone("UTC")); // or property value

But I also believe that there is a way to get the proper time zone of the ClickHouse server and use it instead of the local host's time zone, as sketched below.
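
For example, a sketch that asks the server for its time zone (ClickHouse exposes it via the timezone() SQL function) and applies it to the formatter; statement and sdf are assumed from the surrounding code:

try (ResultSet rs = statement.executeQuery("SELECT timezone()")) {
    if (rs.next()) {
        // use the server's time zone instead of the JVM default
        sdf.setTimeZone(TimeZone.getTimeZone(rs.getString(1)));
    }
}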

Thanks.
