
clickhouse-java's Introduction

ClickHouse Java Client & JDBC Driver

About the Project

This is the official Java client and JDBC driver for the ClickHouse database (https://github.com/ClickHouse/ClickHouse). The Java client is the core component and provides an API to interact with the database. In 2023 this component and its API were refactored into a new component, client-v2. Both versions are available, but the older one will be deprecated soon; it will, however, keep receiving security and critical bug fixes. The new client-v2 has a stable API, and we are working on performance and feature parity to make it production ready.
The JDBC driver component is an implementation of the JDBC API. It uses the Java client API to interact with the database server.

Benefits of using Client-V2:

  • Stable API.
  • Core functionality is implemented:
    • SSL & mTLS support
    • RowBinary* formats support for reading
    • Proxy support
    • HTTP protocol
  • New Insert API that accepts a list of POJOs
  • New Query API that returns a list of GenericRecords that can be used as DTOs
  • Native format reader
  • Performance improvements
    • Fewer internal buffers compared to the old client
    • More configuration options for performance tuning
    • Fewer object allocations
  • Upcoming new features

The old client should still be used when:

  • using the JDBC driver (we are working on refactoring it)

Important

Upcoming deprecations:

Component | Version | Comment
ClickHouse CLI Client (Java) | 0.7.0 | Please use clickhouse-client instead (see https://clickhouse.com/docs/en/interfaces/cli#clickhouse-client)
ClickHouse gRPC Client | 0.7.0 | Please use the ClickHouse HTTP client instead. The gRPC protos are still available at https://github.com/ClickHouse/ClickHouse/tree/master/src/Server/grpc_protos

Installation

Releases: Maven Central (website: https://mvnrepository.com/artifact/com.clickhouse)

Nightly Builds: https://s01.oss.sonatype.org/content/repositories/snapshots/com/clickhouse/

Client V2

Artifacts

Component | Maven Central Link
ClickHouse Java Client V2 | Maven Central

Compatibility

ClickHouse Version | Client Version | Comment
Server >= 23.0 | 0.6.2 |

Features

  • HTTP API support for ClickHouse
  • Bi-directional compression (see the configuration sketch after this list)
    • LZ4
  • Insert from POJO (data is provided as a list of Java objects)
  • Query formats support:
    • RowBinary readers
    • Native format reader
  • Apache HTTP Client as the HTTP client
    • Connection pooling
    • Retry on failures
  • SSL support
  • Cloud support
  • Proxy support
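
For illustration, a minimal sketch of enabling LZ4 compression in both directions via the client-v2 builder. The builder method names compressServerResponse(...) and compressClientRequest(...) are assumptions and may differ between client versions:

Client client = new Client.Builder()
        .addEndpoint("https://localhost:8443")
        .setUsername("default")
        .setPassword("")
        // assumed builder toggles; exact names may vary by version
        .compressServerResponse(true) // server LZ4-compresses responses
        .compressClientRequest(true)  // client LZ4-compresses request bodies
        .build();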

Examples

Begin-with Usage Examples

Spring Demo Service

Minimal client setup:

String endpoint = "https://<db-instance hostname>:8443/";
Client client = new Client.Builder()
        .addEndpoint(endpoint)
        .setUsername(user)
        .setPassword(password)
        .setDefaultDatabase(database)
        .build();

Insert POJOs example:

client.register(
  ArticleViewEvent.class, // your DTO class  
  client.getTableSchema(TABLE_NAME)); // corresponding table

List<ArticleViewEvent> events = ...; // load data

try (InsertResponse response = client.insert(TABLE_NAME, events).get(1, TimeUnit.SECONDS)) {
  // process results 
}
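
For reference, a minimal sketch of what the registered DTO might look like, assuming a table with columns id (Float64), title (String) and url (String) as in the query example below; the exact mapping rules (fields vs. getters, naming, annotations) depend on the client version:

// Illustrative sketch only: property names are assumed to match the column names.
public class ArticleViewEvent {
    private double id;
    private String title;
    private String url;

    public double getId() { return id; }
    public void setId(double id) { this.id = id; }
    public String getTitle() { return title; }
    public void setTitle(String title) { this.title = title; }
    public String getUrl() { return url; }
    public void setUrl(String url) { this.url = url; }
}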

Query results reader example:

// The default format is RowBinaryWithNamesAndTypes, so the reader has all information about columns
try (QueryResponse response = client.query(sql).get(3, TimeUnit.SECONDS)) {

    // Create a reader to access the data in a convenient way
    ClickHouseBinaryFormatReader reader = new RowBinaryWithNamesAndTypesFormatReader(response.getInputStream(),
            response.getSettings());

    while (reader.hasNext()) {
        reader.next(); // Read the next record from stream and parse it

        double id = reader.getDouble("id");
        String title = reader.getString("title");
        String url = reader.getString("url");

        // result processing 
    }
}

Query result as a list of objects example:

// Data is read completely and returned as a list of objects.
client.queryAll(sql).forEach(row -> {
    double id = row.getDouble("id");
    String title = row.getString("title");
    String url = row.getString("url");

    // result processing
});

Connecting to a ClickHouse Cloud instance, or to a DB server with a CA-signed (not self-signed) certificate:

Client client = new Client.Builder()
  .addEndpoint("https://" + dbHost + ":8443")
  .setUsername("default")
  .setPassword("")
  .build();

Connecting to a database instance with a self-signed certificate:

Client client = new Client.Builder()
  .addEndpoint("https://" + dbHost + ":8443")
  .setUsername("default")
  .setPassword("")
  .setRootCertificate("localhost.crt") // path to the CA certificate
  //.setClientKey("user.key") // user private key 
  //.setClientCertificate("user.crt") // user public certificate
  .build();

Client V1

Artifacts

Component | Maven Central Link
ClickHouse Java HTTP Client | Maven Central
ClickHouse JDBC Driver | Maven Central

Features

  • HTTP API support for ClickHouse
  • Bi-directional compression
    • LZ4
  • Apache HTTP Client as the HTTP client
    • Connection pooling
    • Retry on failures
  • SSL & mTLS support
  • Cloud support
  • Proxy support

Examples

See the Java client examples

See the JDBC examples

Compatibility

  • All projects in this repo are tested with all active LTS versions of ClickHouse.
  • Support policy
  • We recommend upgrading the client regularly so you don't miss security fixes and new improvements.
    • If you have an issue with migration, create an issue and we will respond!

Documentation

Java Client V1 Docs :: ClickHouse website

JDBC Docs :: ClickHouse website.

Contributing

Please see our contributing guide.

clickhouse-java's Issues

Spark SQLException: Unsupported type 2003 - Array(String)

Spark 2.1 (Hadoop 2.7)
Java/Python
ClickHouse server: 1.1.54236

A table in ClickHouse has a column of type Array(String).
Spark fails with java.sql.SQLException: Unsupported type 2003.

Java code:

HashMap<String, String> clickhouseOptions = new HashMap<String, String>() {{
    put("driver", "ru.yandex.clickhouse.ClickHouseDriver");
    put("url", "jdbc:clickhouse://clickhouse:8123");
    put("dbtable", "(select dimensions.name from testtable)");
}};

SparkSession spark = SparkSession.builder().getOrCreate();
Dataset<Row> test = spark.read().format("jdbc").options(clickhouseOptions).load();

The load() method fails while loading the schema into the DataFrame.

Spark Stacktrace :

Exception in thread "main" java.sql.SQLException: Unsupported type 2003
	at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$getCatalystType(JdbcUtils.scala:209)
	at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$5.apply(JdbcUtils.scala:246)
	at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$5.apply(JdbcUtils.scala:246)
	at scala.Option.getOrElse(Option.scala:121)
	at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.getSchema(JdbcUtils.scala:245)
	at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:64)
	at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:113)
	at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:45)
	at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:330)
	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:152)
	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:125)
	at com.activitystream.demos.SparkDemo.testClickhouseJDBC(SparkDemo.java:29)
	at com.activitystream.demos.SparkDemo.main(SparkDemo.java:55)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

executeBatch implementation

@serebrserg @AndreevDm

Currently I am working on an integration between Metabase and ClickHouse.

It seems that the clickhouse-jdbc driver doesn't support statements like:
INSERT INTO test.batch_insert (s, i) VALUES (?, ?), (?, ?)
and
INSERT INTO test.batch_insert (s, i) VALUES (?, 101), (?, 102)

It looks like this is supported by ClickHouse itself (I've written some tests for these formats and fixed them), so can it be supported in JDBC? A sketch of the desired usage follows.
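
For clarity, this is the usage the issue asks for, expressed as a JDBC sketch; whether the driver parses a multi-row VALUES clause depends on the driver version:

PreparedStatement ps = connection.prepareStatement(
        "INSERT INTO test.batch_insert (s, i) VALUES (?, ?), (?, ?)");
ps.setString(1, "a"); // first row
ps.setInt(2, 1);
ps.setString(3, "b"); // second row
ps.setInt(4, 2);
ps.execute();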

Table select doesn't work in JetBrains IDE

I successfully compiled the latest master and installed it in the PyCharm settings. The connection is OK: I can see the schema, databases, tables and their fields. Queries to ClickHouse via the query interface work fine.
But when I try to fetch the data of any table by clicking on it, I receive:

[47] ClickHouse exception, code: 47, host: localhost, port: 58270; Code: 47, e.displayText() = DB::Exception: Unknown identifier: t, e.what() = DB::Exception

java.lang.Throwable: Code: 47, e.displayText() = DB::Exception: Unknown identifier: t, e.what() = DB::Exception

	at ru.yandex.clickhouse.except.ClickHouseExceptionSpecifier.specify(ClickHouseExceptionSpecifier.java:53)
	at ru.yandex.clickhouse.except.ClickHouseExceptionSpecifier.specify(ClickHouseExceptionSpecifier.java:28)
	at ru.yandex.clickhouse.ClickHouseStatementImpl.getInputStream(ClickHouseStatementImpl.java:507)
	at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:96)
	at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:80)
	at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:76)
	at ru.yandex.clickhouse.ClickHouseStatementImpl.execute(ClickHouseStatementImpl.java:162)
	at com.intellij.database.remote.jdbc.impl.RemoteStatementImpl.execute(RemoteStatementImpl.java:71)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:324)
	at sun.rmi.transport.Transport$1.run(Transport.java:200)
	at sun.rmi.transport.Transport$1.run(Transport.java:197)
	at java.security.AccessController.doPrivileged(Native Method)
	at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
	at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568)
	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:683)
	at java.security.AccessController.doPrivileged(Native Method)
	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

Several databases(shards)

Hi!
Could you please explain how you work with shards (the Distributed engine) using your JDBC driver?
How do you do service discovery? Do you create several data sources and use them, or do you do it some other way?
Thanks!

Error when try to connect through Apache Spark

When I try to load a table using the current driver version (built with mvn package) through PySpark, I get the error below:

def load_table(table_name):
    return spark.read.format("jdbc").options(
        driver=JDBC_DRIVER,
        url=CONNECTION_URL,
        dbtable=table_name
    ).load()
17/06/22 18:38:08 INFO clickhouse.ClickHouseDriver: Driver registered
17/06/22 18:38:08 INFO clickhouse.ClickHouseDriver: Creating connection

........

py4j.protocol.Py4JJavaError: An error occurred while calling o47.load.
: java.lang.NoSuchFieldError: INSTANCE
	at org.apache.http.impl.io.DefaultHttpRequestWriterFactory.<init>(DefaultHttpRequestWriterFactory.java:52)
	at org.apache.http.impl.io.DefaultHttpRequestWriterFactory.<init>(DefaultHttpRequestWriterFactory.java:56)
	at org.apache.http.impl.io.DefaultHttpRequestWriterFactory.<clinit>(DefaultHttpRequestWriterFactory.java:46)
	at org.apache.http.impl.conn.ManagedHttpClientConnectionFactory.<init>(ManagedHttpClientConnectionFactory.java:82)
	at org.apache.http.impl.conn.ManagedHttpClientConnectionFactory.<init>(ManagedHttpClientConnectionFactory.java:95)
	at org.apache.http.impl.conn.ManagedHttpClientConnectionFactory.<init>(ManagedHttpClientConnectionFactory.java:104)
	at org.apache.http.impl.conn.ManagedHttpClientConnectionFactory.<clinit>(ManagedHttpClientConnectionFactory.java:62)
	at org.apache.http.impl.conn.PoolingHttpClientConnectionManager$InternalConnectionFactory.<init>(PoolingHttpClientConnectionManager.java:572)
	at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.<init>(PoolingHttpClientConnectionManager.java:174)
	at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.<init>(PoolingHttpClientConnectionManager.java:158)
	at ru.yandex.clickhouse.util.ClickHouseHttpClientBuilder.getConnectionManager(ClickHouseHttpClientBuilder.java:77)
	at ru.yandex.clickhouse.util.ClickHouseHttpClientBuilder.buildClient(ClickHouseHttpClientBuilder.java:54)
	at ru.yandex.clickhouse.ClickHouseConnectionImpl.<init>(ClickHouseConnectionImpl.java:69)
	at ru.yandex.clickhouse.ClickHouseDriver.connect(ClickHouseDriver.java:58)
	at ru.yandex.clickhouse.ClickHouseDriver.connect(ClickHouseDriver.java:50)
	at ru.yandex.clickhouse.ClickHouseDriver.connect(ClickHouseDriver.java:32)

Default connection timeout is way too low (50ms).

ClickHouseConnectionSettings.CONNECTION_TIMEOUT("connection_timeout", 50)

I traced this to the Apache HTTP client being used -- the value is in milliseconds even though it is specified as an int.

With this default, anything that is not very local will fail (in my case, trying to connect to a database in an EU DC from Canada).

How can I get the WITH TOTALS result?

Hi!

Example query:

"
SELECT EventDate, count() AS c
FROM test.hits
GROUP BY EventDate
WITH TOTALS
ORDER BY EventDate
"

Where in the ResultSet can I get the totals?

java.lang.RuntimeException: Parse exception: ByteFragment

I have a query:

SELECT
date,
groupArray([widget, toString(actionCount)]) AS widgetList
FROM
(
SELECT
date,
arrayJoin(queries) AS widget,
count(*) AS actionCount
FROM prodlog_temp.log
WHERE mark = 'page updated'
GROUP BY
date,
widget
ORDER BY
date ASC,
actionCount DESC
LIMIT 5 BY date
)
GROUP BY date
LIMIT 10

which executes fine in the native client:
http://joxi.ru/n2Yv7nNcjnPk0A

But via the JDBC driver I get an error:
http://joxi.ru/DmBqenoHN8aqxr
(in Apache Zeppelin)

Driver version: ru.yandex.clickhouse:clickhouse-jdbc:0.1.14

ClickHouseStatementImpl.getUpdateCount() doesn't signal no more results

The implementation of ClickHouseStatementImpl.getUpdateCount() always returns 0 if the previous statement didn't return a result set, no matter how often getMoreResults() is called. This causes any loop that tries to read all results from a Statement to run forever, since the following expression from the JavaDoc for Statement is always false:

((stmt.getMoreResults() == false) && (stmt.getUpdateCount() == -1))

Setting password

I cannot find a way to set the password with the JDBC driver.
And as far as I can see, the password is not used when you create a connection:

public Connection getConnection(String username, String password) throws SQLException {
    return driver.connect(url, properties);
}

@serebrserg, maybe there is something I have not noticed? Can you help me with this?
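
As a possible workaround, a sketch that passes credentials via Properties instead of getConnection(username, password); this assumes the driver honors the standard user/password property keys:

Properties props = new Properties();
props.setProperty("user", "default");    // assumed property key
props.setProperty("password", "secret"); // assumed property key
Connection conn = DriverManager.getConnection(
        "jdbc:clickhouse://localhost:8123/default", props);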

JDBC doesn't throw an exception when `CREATE TABLE AS SELECT` times out

ClickHouse version 1.1.54140.
Checked against JDBC driver versions 0.1.12 and 0.1.17.


Steps to reproduce:

  • create a user with max_execution_time = 1
  • execute using JDBC: CREATE TABLE table_name ENGINE = Memory AS select ...,
    where ... is a query that is guaranteed to time out
  • verify that:
    • JDBC didn't throw an exception for the timed-out CREATE TABLE ... query
    • table_name was created but is empty

Test (scala):

  test("create table from select timeout") {
    withConn { implicit conn =>
      withStatement { st =>

        val tableName = "tmp_test_table_" + Math.abs(Random.nextInt()) + "_" + (System.currentTimeMillis() % 1000)
        try {
          st.executeUpdate(
            s"""CREATE TABLE $tableName ENGINE = Memory  AS
               select count(*) as a from dw.ad_aggr_internal_bid_by_time_shard"""
          )

          val rs = st.executeQuery(s"select a from $tableName")
          assert(rs.next() === true)
          rs.close()

        } finally {
          Try(st.executeUpdate(s"drop table $tableName"))
        }
      }
    }
  }

The test fails with the assertion error false did not equal true, but it should instead fail at st.executeUpdate(...), which times out.


This is a JDBC-specific bug, as the same request over HTTP correctly returns an exception:

echo 'CREATE TABLE qqq ENGINE = Memory  AS select count(*) as a from dw.ad_aggr_internal_bid_by_time_shard' | curl 'http://localhost:8123/?user=test&database=dw' -d @-
Code: 159, e.displayText() = DB::Exception: Timeout exceeded: elapsed 1.080002066 seconds, maximum: 1, e.what() = DB::Exception

BigInteger value throws an exception

If we set a BigInteger value in a prepared statement, it throws an exception like this:

PreparedStatement st = null;
try {
    st = new ClickHouseDataSource(
            connections.get(0),
            properties
    ).getConnection().prepareStatement(
            "INSERT INTO form_test(response_id,submit_date) VALUES (123, ?)");
    st.setObject(1, new BigInteger("123"));

ClickHouse exception, code: 53, host: Code: 53, e.displayText() = DB::Exception: Type mismatch in IN or VALUES section: Int64 expected, String got: (at row 1), e.what() = DB::Exception

And a question: is Int64 a BigInteger or a Long?
Thanks!
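
ClickHouse's Int64 is a signed 64-bit integer, which maps to a Java Long/long. A workaround sketch for the code above (Java 8+):

// Convert explicitly; longValueExact() throws ArithmeticException
// if the value doesn't fit into a signed 64-bit long.
st.setLong(1, new BigInteger("123").longValueExact());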

Nested

It is not possible to insert data into fields of type Nested through a statement. On insert it says "No such column".

Problem with spaces

If there are spaces at the beginning of a "select" query, like " SELECT FROM 1", then the executeQuery method returns null.
I found that in the isSelect method you check the query with startsWith, without trimming leading spaces.
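
A hypothetical fix along those lines (a sketch, not the actual driver code):

// Trim leading whitespace before testing whether the statement is a SELECT.
static boolean isSelect(String sql) {
    return sql.trim().regionMatches(true, 0, "SELECT", 0, "SELECT".length());
}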

Cannot override setting (max_execution_time) in readonly mode

I cannot connect through the JDBC driver as a readonly user, because "socketTimeout + dataTransferTimeout is sent to ClickHouse as max_execution_time", and readonly users cannot run SET statements.
Any query throws an exception:
DB::Exception: Cannot override setting (max_execution_time) in readonly mode

Error "Incompatible Jackson version" when using with PySpark 2.2.0

Hello.
When I try to use this lib to dump a PySpark DataFrame to ClickHouse, I get an error.

  1. Clone and compile the repo:
cd /var/bigdata/
git clone https://github.com/yandex/clickhouse-jdbc
cd clickhouse-jdbc
mvn package assembly:single -DskipTests=true
mv target/clickhouse-jdbc-0.1-SNAPSHOT-jar-with-dependencies.jar $SPARK_PATH/jars/
  2. My PySpark code:
df = load_df(...)
url = "jdbc:clickhouse://default:@abc.def.ru/db"
properties = {"driver": "ru.yandex.clickhouse.ClickHouseDriver"}
df.write.jdbc(url=config['ch_url'], table="data.posdata_test", mode="append", properties=properties)
  3. Run it from bash:
$SPARK_PATH/bin/spark-submit import.py --jars clickhouse-jdbc-0.1-SNAPSHOT-jar-with-dependencies.jar 
  4. I get this error:
[2017-09-20 11:09:01,474] [root] [ERROR] [-] [-] exception during process
Traceback (most recent call last):
  File "/home/vsmelov/PycharmProjects/etl/spark_test/import.py", line 76, in main
    process_sources(source_list)
  File "/home/vsmelov/PycharmProjects/etl/spark_test/import.py", line 61, in process_sources
    df = load_df_from_sources(source_list, first_row_is_header=True)
  File "/home/vsmelov/PycharmProjects/etl/spark_test/load_df.py", line 193, in load_df_from_sources
    rdd = load_rdd_from_file(source.source_file_name, first_row_is_header=first_row_is_header)
  File "/home/vsmelov/PycharmProjects/etl/spark_test/load_df.py", line 157, in load_rdd_from_file
    rdd = df.rdd.mapPartitionsWithIndex(
  File "/var/bigdata/spark-2.2.0-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/sql/dataframe.py", line 85, in rdd
    jrdd = self._jdf.javaToPython()
  File "/var/bigdata/spark-2.2.0-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/var/bigdata/spark-2.2.0-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/sql/utils.py", line 63, in deco
    return f(*a, **kw)
  File "/var/bigdata/spark-2.2.0-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip/py4j/protocol.py", line 319, in get_return_value
    format(target_id, ".", name), value)
py4j.protocol.Py4JJavaError: An error occurred while calling o27.javaToPython.
: java.lang.ExceptionInInitializerError
	at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135)
	at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116)
	at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
	at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92)
	at org.apache.spark.sql.Dataset.javaToPython(Dataset.scala:2794)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
	at py4j.Gateway.invoke(Gateway.java:280)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.GatewayConnection.run(GatewayConnection.java:214)
	at java.lang.Thread.run(Thread.java:748)
Caused by: com.fasterxml.jackson.databind.JsonMappingException: Incompatible Jackson version: 2.7.3
	at com.fasterxml.jackson.module.scala.JacksonModule$class.setupModule(JacksonModule.scala:64)
	at com.fasterxml.jackson.module.scala.DefaultScalaModule.setupModule(DefaultScalaModule.scala:19)
	at com.fasterxml.jackson.databind.ObjectMapper.registerModule(ObjectMapper.java:722)
	at org.apache.spark.rdd.RDDOperationScope$.<init>(RDDOperationScope.scala:82)
	at org.apache.spark.rdd.RDDOperationScope$.<clinit>(RDDOperationScope.scala)
	... 16 more
Stack (most recent call last):
  File "/home/vsmelov/PycharmProjects/etl/spark_test/import.py", line 89, in <module>
    main()
  File "/home/vsmelov/PycharmProjects/etl/spark_test/import.py", line 78, in main
    logger.error('exception during process', exc_info=1, stack_info=1)

My Spark version:

vsmelov@vsmelov:~/PycharmProjects/etl/spark_test$ $SPARK_PATH/bin/spark-submit --version
version 2.2.0
Using Scala version 2.11.8, Java HotSpot(TM) 64-Bit Server VM, 1.8.0_131
Branch 
Compiled by user jenkins on 2017-06-30T22:58:04Z

What can I do about this?
Thanks in advance.

Error: Could not initialize class ru.yandex.clickhouse.ClickHouseUtil when using with PySpark

When I dump a PySpark DataFrame containing columns of types Integer, DateTime, Float and Date to ClickHouse, everything works fine.
But when I try to dump a DataFrame with a String-typed column, I get an error.

My PySpark code:

    df = spark.range(0, 10).withColumnRenamed('id', 'field_int')
    from pyspark.sql.functions import lit
    df = df.withColumn('field_str', lit('abcdef'))
    df.write.jdbc(url=config['ch_url'], table="test_with_string", mode="append",
                  properties=config["ch_properties"])

Error:

Caused by: java.lang.NoClassDefFoundError: Could not initialize class ru.yandex.clickhouse.ClickHouseUtil
	at ru.yandex.clickhouse.ClickHousePreparedStatementImpl.setString(ClickHousePreparedStatementImpl.java:214)
	at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$makeSetter$8.apply(JdbcUtils.scala:525)
	at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$makeSetter$8.apply(JdbcUtils.scala:524)
	at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.savePartition(JdbcUtils.scala:629)
	at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:782)
	at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:782)
	at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:926)
	at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:926)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2062)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2062)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
	at org.apache.spark.scheduler.Task.run(Task.scala:108)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	... 1 more

What can I do about this?
I am not a Java developer, so I don't know how to compile and "import" the class ru.yandex.clickhouse.ClickHouseUtil into my project.

There are some files in the compiled repository:

vsmelov@vsmelov:~/PycharmProjects/etl/spark_test$ find /var/bigdata/clickhouse-jdbc/ -name '*ClickHouseUtil*'
/var/bigdata/clickhouse-jdbc/src/test/java/ru/yandex/clickhouse/ClickHouseUtilTest.java
/var/bigdata/clickhouse-jdbc/src/main/java/ru/yandex/clickhouse/ClickHouseUtil.java
/var/bigdata/clickhouse-jdbc/target/classes/ru/yandex/clickhouse/ClickHouseUtil.class
/var/bigdata/clickhouse-jdbc/target/test-classes/ru/yandex/clickhouse/ClickHouseUtilTest.class
/var/bigdata/clickhouse-jdbc/target/apidocs/ru/yandex/clickhouse/class-use/ClickHouseUtil.html
/var/bigdata/clickhouse-jdbc/target/apidocs/ru/yandex/clickhouse/ClickHouseUtil.html

But there are no .jar files.
Thanks in advance.

Ignite cache Store

I would like to import models from ClickHouse into the Ignite Console via the JDBC driver, but the console failed to collect schemas:

java.lang.NoClassDefFoundError: com/google/common/collect/MapMaker
        at ru.yandex.clickhouse.ClickHouseDriver.<clinit>(ClickHouseDriver.java:36)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:348)
        at org.apache.ignite.console.agent.db.DbMetadataReader.connect(DbMetadataReader.java:126)
        at org.apache.ignite.console.agent.handlers.DatabaseListener.connect(DatabaseListener.java:203)
        at org.apache.ignite.console.agent.handlers.DatabaseListener.schemas(DatabaseListener.java:220)
        at org.apache.ignite.console.agent.handlers.DatabaseListener$1.execute(DatabaseListener.java:82)
        at org.apache.ignite.console.agent.handlers.AbstractListener$1.run(AbstractListener.java:67)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: com.google.common.collect.MapMaker
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:814)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 13 more


AutoCommit

Hi!
As I can see in the code, you have

public void setAutoCommit(boolean autoCommit) throws SQLException {
}

Does this mean that ClickHouse supports autocommit by default, or that there is no commit in ClickHouse at all?
Or maybe I don't understand something? :)
Thanks!

Remote connectivity issue

new ClickHouseDataSource("jdbc:clickhouse://clickhouse:8123").getConnection().createStatement();
When I try to connect like this, I get the following error:

Exception in thread "main" ru.yandex.clickhouse.except.ClickHouseException: ClickHouse exception, code: 210, host: clickhouse, port: 8123; Connect to clickhouse:8123 [clickhouse/40.78.***.***] failed: connect timed out at ru.yandex.clickhouse.except.ClickHouseExceptionSpecifier.specify(ClickHouseExceptionSpecifier.java:42) at ru.yandex.clickhouse.except.ClickHouseExceptionSpecifier.specify(ClickHouseExceptionSpecifier.java:23) at ru.yandex.clickhouse.ClickHouseStatementImpl.getInputStream(ClickHouseStatementImpl.java:461) at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:73) at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:69) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at ru.yandex.clickhouse.util.LogProxy.invoke(LogProxy.java:42) at com.sun.proxy.$Proxy3.executeQuery(Unknown Source) at am.galstyan.ClickhouseJdbc.main(ClickhouseJdbc.java:17) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)

It seems that the connection works fine only if the host is local; otherwise it throws this exception.

Even when I try to connect to, say, fb.com, it throws the very same exception, yet when I connect to localhost, it works fine.
Could you help me resolve the issue?

TCP based connector

Is it possible to implement a TCP-based driver instead of HTTP? It should be more stable and faster.

Error when using Spark - Syntax error: failed at position 8

Spark: 2.1.0 (Hadoop 2.7)
Java/Python
clickhouse server/client 1.1

When querying using Spark and the ClickHouse JDBC driver (newest source, head 164d10c), we get this error:

Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: 
Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, 192.168.1.179, executor 2): 
ru.yandex.clickhouse.except.ClickHouseException: 
ClickHouse exception, code: 62, host: clickhouse, port: 8123; 
Code: 62, e.displayText() = DB::Exception: 
Syntax error: failed at position 8: "name","city","country" FROM default.spark FORMAT TabSeparatedWithNamesAndTypes;, 
expected not empty list of expressions, e.what() = DB::Exception

Caused by this line in the code:
https://goo.gl/73N8Ya

Our java spark code looks like this:

HashMap<String, String> clickhouseOptions = new HashMap<String, String>() {{
    put("driver", "ru.yandex.clickhouse.ClickHouseDriver");
    put("url", "jdbc:clickhouse://clickhouse:8123");
    put("dbtable", "default.spark");
}};

SparkSession spark = SparkSession.builder().getOrCreate();
Dataset<Row> test = spark.read().format("jdbc").options(clickhouseOptions).load();
test.show();

When calling show() the error happens.

The default.spark table has three columns: name, city, country.
The SELECT query, coming from Spark, looks like this:
SELECT "name","city","country" FROM default.spark

After clickhousifySql(sql), https://goo.gl/1wAuWb:
SELECT "name","city","country" FROM default.spark FORMAT TabSeparatedWithNamesAndTypes;

The problem lies in the quotation marks; removing them, or replacing them with backquotes (to support spaces in column names), resolves the error:
SELECT `name`,`city`,`country` FROM default.spark FORMAT TabSeparatedWithNamesAndTypes;

Translate Statement#setMaxRows() into LIMIT clause

Using graphical JDBC clients to browse ClickHouse tables, I noticed that the row limit / pagination features of the clients (and thus their table browsing interfaces) practically do not work. What I mean is that on sufficiently large tables, the queries take so much time that they are useless in practice (and eventually fail anyway).

The reason is that the graphical clients use Statement#setMaxRows() to set the number of rows they are interested in before executing the query, as per JDBC specs. But the driver does not send that information to the server, sending instead a limitless query via HTTP POST.

By its nature, being throughput-oriented, ClickHouse will interpret a limitless query as really wanting the entire table, and so will take a long time to prepare the result set, which in many cases will not fit in the server memory and thus will eventually fail.

I believe ClickHouseStatementImpl should add maxRows to the query source as a LIMIT clause (before adding FORMAT) if that option was set by the JDBC client, instead of sending a limitless query and then truncating the result set.
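
As an illustration, a minimal sketch of the proposed behavior (a hypothetical helper, not the actual driver code); a real implementation would also need to respect queries that already contain a LIMIT clause:

// Append maxRows as a LIMIT clause before the FORMAT clause is added.
private static String applyMaxRows(String sql, int maxRows) {
    return maxRows > 0 ? sql + " LIMIT " + maxRows : sql;
}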

getTables() does not list .inner. tables

I suppose it happened when different types were assigned to different objects.
I can see .inner. tables if I use 0.1.14, where all objects are listed as 'table'.
With 0.1.25 or 0.1.26 there are tables, views and others, but no .inner. tables.

ClickHouseRowBinaryStream doesn't support arrays

The implementation is missing. I tried to implement it myself but had no luck. Here is a sample method for a string array:

public void writeArray(String[] array) throws IOException {
    Preconditions.checkNotNull(array);
    writeUnsignedLeb128(array.length);
    for (String el : array) {
        writeString(el);
    }
}

With this implementation I get DB::Exception: Cannot read all data (ClickHouse server version 1.1.54236).
Do you see what could be wrong with this code?

Incorrect parsing of complex types (Arrays and Tuples)

Here are some cases that I've found (rs is a ResultSet instance):

  1. Tuple:

    select ('1', 2) as a;
    

    can only be extracted as String:

    rs.getMetaData.getColumnTypeName(1) = "Tuple(String, UInt8)"
    rs.getObject(1) = "('1',2)"
    
  2. Nested arrays:

    select [[1,2], [3,4]] as a;
    

    cannot be extracted as Array:

    rs.getMetaData.getColumnTypeName(1) = "Array(Array(UInt8))"
    rs.getObject(1) // Method threw 'java.lang.RuntimeException' exception.
                               // Parse exception: ByteFragment{[[[1,2],[3,4]]], start=0, len=13}
    rs.getArray(1) // Method threw 'java.lang.IllegalStateException' exception.
    
  3. Tuples nested in Array:

    select [('a',1), ('b', 2)] as a;
    

    strange parsing behavior when parsed as Array:

    rs.getMetaData.getColumnTypeName(1) = Array(Tuple(String, UInt8))
    rs.getArray(1)={ru.yandex.clickhouse.ClickHouseArray}:
           array = {java.lang.String[4]@3644} 
           0 = "('a'"
           1 = "1)"
           2 = "('b'"
           3 = "2)"
    

Return types incorrect for nullable values

When we select a value that is nullable (e.g. Nullable(Float)), what we get back from the JDBC driver is a value of type String. Looking at the code, it seems that only non-nullable Float, Int and UInt are currently supported as return types, with everything else being converted to String. It would be great to get a fix for this, so that when selecting a Nullable(Float) value we actually get back a value of type Float.

PreparedStatement.execute() fails on null values

PreparedStatement.execute() fails if a null value has been bound to a parameter with setNull:

Caused by: java.lang.Throwable: Code: 62, e.displayText() = DB::Exception: Cannot parse expression of type Nullable(String) here: \N), e.what() = DB::Exception

	at ru.yandex.clickhouse.except.ClickHouseExceptionSpecifier.specify(ClickHouseExceptionSpecifier.java:53)

PreparedStatement.executeBatch() however works.

Example code:

        Connection connection = DriverManager.getConnection("jdbc:clickhouse://clickhouse:8123/default");
        Statement st = connection.createStatement();
        st.executeUpdate("create table if not exists test_null_insert(value Nullable(String)) engine = Memory");

        try {
            PreparedStatement ps = connection.prepareStatement("insert into test_null_insert(value) values(?)");

            // Works
            ps.setNull(1, Types.VARCHAR);
            ps.addBatch();
            ps.executeBatch();

            // Doesn't work
            ps.setNull(1, Types.VARCHAR);
            ps.execute();
        } finally {
            ResultSet rs = st.executeQuery("select value from test_null_insert");
            while (rs.next()) {
                String value = rs.getString(1);
                System.out.println(value != null ? "\"" + value + "\"" : "null");
            }

            st.executeUpdate("drop table test_null_insert");
            connection.close();
        }

Same Connection executes queries on different nodes under certain DNS settings

This is the setup that we have in our company:

  • CH cluster (multiple nodes with different IPs)
  • DNS server that resolves an internal hostname to the list of IPs of the CH nodes
  • CH clients that have a low setting for networkaddress.cache.ttl and sun.net.inetaddr.ttl to enable DNS-based service discovery and load balancing
  • CH client creates a Connection and executes a bunch of dependent requests:
    • create "temporary" † tables
    • fill these tables with data
    • query these tables
    • drop tables

† "temporary" tables are not really temporary due to the restrictions of the CH HTTP API, so these tables have to be dropped manually.

And here is what happened:
Sometimes during the long "session" java's DNS cache was flushed, and since certain point of time CH JDBC requests started to query another CH node, despite the fact that all queries were made within the same CH Connection.

We found a workaround: currently we resolve all IPs for the hostname ourselves and create the CH Connection with one specific IP (sketched below).
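
A sketch of that workaround (the hostname here is illustrative):

// Resolve the hostname once and pin the Connection to a single node.
InetAddress[] nodes = InetAddress.getAllByName("ch.internal.example.com");
String pinnedIp = nodes[0].getHostAddress(); // pick one node (e.g. at random)
Connection conn = DriverManager.getConnection(
        "jdbc:clickhouse://" + pinnedIp + ":8123/default");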

So, here is the question: what are the semantics of a CH Connection? Shouldn't it provide "visibility" guarantees (like: if you successfully created a temporary table and then try to query it within the same Connection, you are guaranteed to see that table)? If it should, then the JDBC driver probably has to somehow guarantee that all queries within the same Connection go to the same CH node, regardless of the client DNS configuration.

This is not a bug report, it's more of a question/suggestion/discussion.

Thanks.

Can we move database names from schemas to catalogs?

I have been working on the driver, trying to get a few SQL management tools to work (DBeaver and DBVisualizer), but I'm having a hard time with the "current database" popup menus.

Apparently they work well with catalogs, but are very buggy with schemas. Probably because getSchema / setSchema was introduced relatively late into the JDBC standard. (I know, it was years ago with Java 7, but many tools are still behind.)

Since ClickHouse only has one level of namespace (the database name), can we change this driver to expose databases as catalogs instead of schemas? Would you accept a pull request with this change?

Batch insert with 30 or more columns causes SQLException: Not all parameters binded

Hi!
I get this exception when 30 or more columns are passed in a batch insert:

java.sql.SQLException: Not all parameters binded
	at ru.yandex.clickhouse.ClickHousePreparedStatementImpl.checkBinded(ClickHousePreparedStatementImpl.java:120)
	at ru.yandex.clickhouse.ClickHousePreparedStatementImpl.buildBinds(ClickHousePreparedStatementImpl.java:126)
	at ru.yandex.clickhouse.ClickHousePreparedStatementImpl.addBatch(ClickHousePreparedStatementImpl.java:327)
	at org.easybatch.jdbc.JdbcRecordWriter.writeRecords(JdbcRecordWriter.java:86)
	at org.easybatch.core.job.BatchJob.writeBatch(BatchJob.java:203)
	at org.easybatch.core.job.BatchJob.call(BatchJob.java:79)
	at org.easybatch.core.job.BatchJob.call(BatchJob.java:22)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)

Example code - https://github.com/greysap/randombatch2clickhouse/blob/master/src/main/java/house/greysap/App.java
This code will work after commenting out lines 74, 186 and 298 (i.e. lowering the column count from 30 to 29).

What's the best way to insert a batch with many columns via the JDBC driver?

Single-column ResultSet gets truncated on empty string

Here is the minimal example:

 select c from (
    select 'a' as c
    UNION ALL select 'b' as c
    UNION ALL select '' as c
    UNION ALL select 'd' as c
 )

ResultSet is read using something like this:

val s = conn.createStatement()
val r = s.executeQuery(sql)

while (r.next()) {
   println(r.getString(1))
}

Expected result should be:

a
b

d

Actual result is:

a
b

Here is the full test snippet (scala):
https://gist.github.com/Aivean/fe8c4f90c183f08aa908baed7ca89478

P.S. Both the TCP and HTTP interfaces return the correct result.

Unable to set connection_timeout

Hi,
I am trying to open a connection with special properties, but my properties don't get applied.
Neither with ClickHouseProperties

Class.forName(CLICKHOUSE_DRIVER_NAME);
ClickHouseProperties clickHouseProperties = new ClickHouseProperties();
clickHouseProperties.setConnectionTimeout(3000);
ClickHouseDataSource dataSource = new ClickHouseDataSource(CLICKHOUSE_URL, clickHouseProperties);

nor with normal Properties.

Class.forName(CLICKHOUSE_DRIVER_NAME);
Properties properties = new Properties();
properties.setProperty("connection_timeout", "3000");
ClickHouseDataSource dataSource = new ClickHouseDataSource(CLICKHOUSE_URL, properties);

Every time I debug, 50 ms is set for connection_timeout:

Maybe this is caused by the values being saved as Integers inside the hashtable; later, when calling getSetting(...) -> getProperty(...), null is returned because the value isn't an instance of String, so the default value is used.

Maybe in the following method inside ClickHouseProperties you should use get() instead of getProperty() (see the sketch after this method):

private <T> T getSetting(Properties info, String key, Object defaultValue, Class clazz){
    Object val = info.getProperty(key);
    if (val == null)
        return (T)defaultValue;
    if (clazz == int.class || clazz == Integer.class) {
        return (T) clazz.cast(Integer.valueOf((String) val));
    }
    if (clazz == long.class || clazz == Long.class) {
        return (T) clazz.cast(Long.valueOf((String) val));
    }
    if (clazz == boolean.class || clazz == Boolean.class) {
        return (T) clazz.cast(Boolean.valueOf((String) val));
    }
    return (T) clazz.cast(val);
}
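
A sketch of that suggested fix (not the actual driver code): get() also sees non-String values stored via put(), and string parsing only happens when a String was actually supplied:

private <T> T getSetting(Properties info, String key, Object defaultValue, Class clazz) {
    Object val = info.get(key); // get() instead of getProperty(): sees non-String values too
    if (val == null)
        return (T) defaultValue;
    if (val instanceof String) {
        String s = (String) val;
        if (clazz == int.class || clazz == Integer.class)
            return (T) Integer.valueOf(s);
        if (clazz == long.class || clazz == Long.class)
            return (T) Long.valueOf(s);
        if (clazz == boolean.class || clazz == Boolean.class)
            return (T) Boolean.valueOf(s);
        return (T) s;
    }
    return (T) val; // already an Integer/Long/Boolean, return as-is
}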

Several statements in one query

Hi, @serebrserg !
Can we execute several statements in one query?
For example: "CREATE TABLE .....; CREATE TABLE .....;"
For now it is not working. Is it bug of jdbc-connector or normal behavior of database?
Thanks!

Float64 gets sorted as string

It started in 0.1.26 actually.
I use SQL Workbench/J (http://www.sql-workbench.net/).
Float64 gets sorted as a string:
[screenshot]
Float32 gets sorted as numbers:
[screenshot]
This is the difference that I can see in the application:
[screenshot]
Unfortunately I cannot easily see how the types were interpreted in 0.1.25 because of the 'Unknown identifier: default_type' error, but I checked with a JDBC driver for another database: sorting works correctly for data types that map to 'Double'.

jdk version

Is exactly JDK 1.6 needed?
Why does this not compile on a higher version?

Support ssl/tls/https connections

Right now there is no way (or we didn't find a way in the docs or by reading the code) to connect to ClickHouse servers using SSL/TLS.

Maybe because the protocol "http" is hardcoded here:

https://github.com/yandex/clickhouse-jdbc/blob/8d2e9fd7f1f6342e9baaa4b8e7a601d29afd1e8d/src/main/java/ru/yandex/clickhouse/ClickHouseStatementImpl.java#L544

and then the wrong type of factory is created here:

https://github.com/yandex/clickhouse-jdbc/blob/1d7561c23feae11eda13165ed08232bf5bac2988/src/main/java/ru/yandex/clickhouse/util/ClickHouseHttpClientBuilder.java#L48

It could be enabled by passing parameters in the JDBC connection URL, like:

String url = "jdbc:clickhouse://clickhouse-000.example.com:8443/default?ssl=true&sslrootcert=ca.pem";

as PostgreSQL does.

SimpleDateFormat is not thread safe

In the class ClickHouseResultSet there is a static SimpleDateFormat, which is not thread safe.
Could you replace it with something else, or use synchronization or a ThreadLocal for this?
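
One common fix, shown as a sketch: keep a per-thread instance via ThreadLocal (Java 8+), so no synchronization is needed:

private static final ThreadLocal<SimpleDateFormat> DATE_TIME_FORMAT =
        ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"));

// usage: each thread gets its own instance
Date parsed = DATE_TIME_FORMAT.get().parse(value);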

Failed to respond

Sometimes I get this exception (when running SELECT/INSERT/ALTER queries):

Caused by: org.apache.http.NoHttpResponseException: clickhouse2:8123 failed to respond
	at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:143)
	at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
	at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:261)
	at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:165)
	at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:167)
	at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:272)
	at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:124)
	at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:271)
	at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184)
	at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:88)
	at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
	at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
	at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
	at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:107)
	at ru.yandex.clickhouse.ClickHouseStatementImpl.getInputStream(ClickHouseStatementImpl.java:492)
	... 26 more

Do you know why this can happen? A wrong configuration, or something else?
Thanks!

Add support for decompress=1 (compression of ClickHouseStatement.sendStream(..))

Please consider adding this. For example, our current use case (not yet on ClickHouse) involves sending live data batches from multiple servers to the database; compression should help to significantly reduce the amount of traffic, since this data is alphanumeric.

In the short term (as a workaround): is there information available about how to implement this compression myself?

ClickHouseLZ4Stream contains decompression code. Assuming the compression format is the same, this code seems to contain all the required information EXCEPT the checksum format (which is only read, not checked). So what is the checksum format? Is it actually checked by the server?

getTimestamp and getTime don't use info about CH server's time zone.

Hello.

While working with dates I ran into a time zone problem.
More specifically: ClickHouse doesn't include a time zone in its DateTime representation, OK. But when the JDBC driver converts a DateTime string with the SimpleDateFormat here, it uses the time zone of the host where the JDBC driver is running, not that of the ClickHouse server!
So it was hard to track down the bug in our system, where data is stored in UTC in the DB but clients work in, let's say, CST. This is the place where UTC data was shifted to CST, which led to a terrible mess.

I think it would be better to set the time zone in the JDBC driver as a property and apply it at ClickHouseResultSet initialization time (that is actually what I do now):

sdf.setTimeZone(TimeZone.getTimeZone("UTC")); // or property value

But I also believe that there is a way to get the proper time zone of the ClickHouse server and use it instead of the local host's time zone, as sketched below.
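
For example, a sketch that asks the server for its time zone (ClickHouse exposes it via the timezone() SQL function) and applies it to the formatter; statement and sdf are assumed from the surrounding code:

try (ResultSet rs = statement.executeQuery("SELECT timezone()")) {
    if (rs.next()) {
        // use the server's time zone instead of the JVM default
        sdf.setTimeZone(TimeZone.getTimeZone(rs.getString(1)));
    }
}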

Thanks.
