
crate / crate


CrateDB is a distributed and scalable SQL database for storing and analyzing massive amounts of data in near real-time, even with complex queries. It is PostgreSQL-compatible and based on Lucene.

Home Page: https://cratedb.com/product

License: Apache License 2.0

Java 99.61% Shell 0.03% Python 0.22% ANTLR 0.15%
sql distributed database cluster iot cratedb distributed-database iot-database iot-analytics industrial-iot olap big-data dbms analytics time-series lucene tsdb postgresql elasticsearch distributed-sql-database

crate's Introduction

CrateDB



Help us improve CrateDB by taking our User Survey!

About

CrateDB is a distributed SQL database that makes it simple to store and analyze massive amounts of data in real-time.

CrateDB offers the benefits of an SQL database and the scalability and flexibility typically associated with NoSQL databases. Modest CrateDB clusters can ingest tens of thousands of records per second without breaking a sweat. You can run ad-hoc queries using standard SQL. CrateDB's blazing-fast distributed query execution engine parallelizes query workloads across the whole cluster.

CrateDB is well suited to containerization and can be scaled horizontally using ephemeral virtual machines (e.g., on Kubernetes, AWS, and Azure) with no shared state. You can deploy and run CrateDB on any sort of network, from personal computers to multi-region hybrid clouds and the edge.

Features

  • Use standard SQL via the PostgreSQL wire protocol or an HTTP API.
  • Dynamic table schemas and queryable objects provide document-oriented features in addition to the relational features of SQL.
  • Support for time-series data, real-time full-text search, geospatial data types and search capabilities.
  • Horizontally scalable, highly available and fault-tolerant clusters that run very well in virtualized and containerized environments.
  • Extremely fast distributed query execution.
  • Auto-partitioning, auto-sharding, and auto-replication.
  • Self-healing and auto-rebalancing.
  • User-defined functions (UDFs) can be used to extend the functionality of CrateDB.
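For illustration, here is a minimal UDF sketch (assuming JavaScript-based UDF support as described in the CrateDB documentation; the function name and logic are made up):

-- register a simple scalar function (sketch; requires UDF support in the installed version)
CREATE FUNCTION my_multiply(integer, integer)
RETURNS integer
LANGUAGE JAVASCRIPT
AS 'function my_multiply(a, b) { return a * b; }';

-- use it like any built-in scalar function
SELECT my_multiply(2, 3) AS result;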

Screenshots

CrateDB provides an Admin UI:

Screenshots of the CrateDB Admin UI

Try CrateDB

The fastest way to try CrateDB out is by running:

$ bash -c "$(curl -L try.crate.io)"

Or spin up the official Docker image:

$ docker run --publish 4200:4200 --publish 5432:5432 --env CRATE_HEAP_SIZE=1g crate -Cdiscovery.type=single-node

Visit the installation documentation to see all the available download and install options.

Once you're up and running, head over to the introductory docs. To interact with CrateDB, you can use the Admin UI SQL console or the CrateDB shell CLI tool. Alternatively, review the list of recommended clients and tools that work with CrateDB.
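As a quick smoke test in the SQL console, a statement along the following lines should list the nodes of your cluster (a sketch; it assumes the sys.nodes system table exposes name and version columns, which may differ across versions):

-- inspect the running cluster (sketch)
SELECT name, version['number'] AS version
FROM sys.nodes
ORDER BY name;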

For container-specific documentation, check out the CrateDB on Docker how-to guide or the CrateDB on Kubernetes how-to guide.

Contributing

This project is primarily maintained by Crate.io, but we welcome community contributions!

See the developer docs and the contribution docs for more information.

Security

The CrateDB team and community take security bugs seriously. We appreciate your efforts to responsibly disclose your findings, and will make every effort to acknowledge your contributions.

If you think you discovered a security flaw, please follow the guidelines at SECURITY.md.

Help

Looking for more help?

crate's People

Contributors

amotl, andreidan, autophagy, baurzhansakhariev, burnes, chaudum, dobe, hammerhead, hlcianfagna, infoverload, jayeff, jeeminso, jodok, kovrus, lukasender, m-kharbat, marregui, matriv, mfussenegger, mkleen, msbt, mxm, nomicode, proddata, pustovit, quodt, romseygeek, seut, smakalias, turbo-ele


crate's Issues

Incompatible minimum and maximum heap sizes specified

Hi,
I downloaded Crate and started it from the console with bin/crate. I am running it on Mac OS.
Unfortunately, I get the following error:

Error occurred during initialization of VM
Incompatible minimum and maximum heap sizes specified

Any ideas?

Error while importing gzip file

I'm trying to import a gzip file from within the web console. I hit an error.
The command is the following:

copy audience_bulk from 'file:///home/mping/audience_10.json.gz' with (compression=gzip)

Importing a regular (non-gzip) file does work. I'm on Crate 0.41.1.

Stack trace:

java.lang.NullPointerException
        at io.crate.analyze.AbstractDataAnalysis.getReference(AbstractDataAnalysis.java:508)
        at io.crate.analyze.AbstractDataAnalysis.getReference(AbstractDataAnalysis.java:500)
        at io.crate.analyze.DataStatementAnalyzer.visitQualifiedNameReference(DataStatementAnalyzer.java:467)
        at io.crate.analyze.DataStatementAnalyzer.visitQualifiedNameReference(DataStatementAnalyzer.java:48)
        at io.crate.sql.tree.QualifiedNameReference.accept(QualifiedNameReference.java:49)
        at io.crate.sql.tree.AstVisitor.process(AstVisitor.java:31)
        at io.crate.analyze.CopyStatementAnalyzer.settingsFromProperties(CopyStatementAnalyzer.java:90)
        at io.crate.analyze.CopyStatementAnalyzer.visitCopyFromStatement(CopyStatementAnalyzer.java:42)
        at io.crate.analyze.CopyStatementAnalyzer.visitCopyFromStatement(CopyStatementAnalyzer.java:36)
        at io.crate.sql.tree.CopyFromStatement.accept(CopyFromStatement.java:89)
        at io.crate.analyze.Analyzer.analyze(Analyzer.java:53)
        at io.crate.action.sql.TransportSQLAction.doExecute(TransportSQLAction.java:110)
        at io.crate.action.sql.TransportSQLAction.doExecute(TransportSQLAction.java:68)
        at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:63)
        at org.elasticsearch.client.node.NodeClient.execute(NodeClient.java:92)
        at io.crate.action.sql.SQLRequestBuilder.doExecute(SQLRequestBuilder.java:42)
        at org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:85)
        at io.crate.rest.action.RestSQLAction.handleRequest(RestSQLAction.java:71)
        at org.elasticsearch.rest.RestController.executeHandler(RestController.java:159)
        at org.elasticsearch.rest.RestController$RestHandlerFilter.process(RestController.java:234)
        at org.elasticsearch.rest.RestController$ControllerFilterChain.continueProcessing(RestController.java:215)
        at io.crate.rest.CrateRestFilter.process(CrateRestFilter.java:67)
        at org.elasticsearch.rest.RestController$ControllerFilterChain.continueProcessing(RestController.java:218)
        at org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:152)
        at org.elasticsearch.http.HttpServer.internalDispatchRequest(HttpServer.java:121)
        at org.elasticsearch.http.HttpServer$Dispatcher.dispatchRequest(HttpServer.java:83)
        at io.crate.http.netty.NettyHttpServerTransport.dispatchRequest(NettyHttpServerTransport.java:320)
        at io.crate.http.netty.HttpRequestHandler.messageReceived(HttpRequestHandler.java:45)
        at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
        at org.jboss.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:145)
        at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
        at io.crate.http.netty.HttpBlobHandler.messageReceived(HttpBlobHandler.java:128)
        at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
        at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)
        at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)
        at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
        at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
        at org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:74)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
        at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
        at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
        at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
        at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

Thanks guys.

Add not null constraint

Working on the ActiveRecord adapter, where a "not null" constraint is assumed by the abstract adapter. Currently implementing a workaround to get migrations working.
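For reference, the requested constraint follows standard SQL column syntax; a hypothetical sketch of how it would look in a CrateDB table definition (table and column names are made up, and this syntax was not supported at the time of this report):

-- hypothetical: the requested NOT NULL column constraint
CREATE TABLE posts (
    id string PRIMARY KEY,
    title string NOT NULL   -- reject rows where title is missing
);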

Cross Data Center Replication

I am working on a project that requires cross data center replication for failover in the event of the loss of a complete region. I have reviewed the available documentation and the comments within the config files of crate itself but I am not seeing much information on this topic.

An example of a setup would be:

2 DCs (East and West)
2 Crate Nodes per Data Center linked via a Secure VPN

I can easily set up the replication between nodes over the VPN, but I am not seeing any way to make the replication DC-aware. Ideally, I would want one full set of the data to live in each data center for failover in the event of a full loss of one of the two regions.

Is this possible with Crate? If so, how would you configure Crate to be aware of the multiple regions for replication?

Allow renaming of tables.

The ability to rename tables can make it easier to handle schema changes, restores from backup, etc.

Would be nice to have the same syntax as Postgres:

ALTER TABLE table_name RENAME TO new_name


Want to back this issue? Post a bounty on it! We accept bounties via Bountysource.

0.40.2 still not compatible with elasticsearch-cloud-aws plugin

This plugin is extremely useful and would be great for discovery and cluster allocation awareness.

root@node:/usr/share/crate# bin/plugin install elasticsearch/elasticsearch-cloud-aws/2.2.0
-> Installing elasticsearch/elasticsearch-cloud-aws/2.2.0...
Trying http://download.elasticsearch.org/elasticsearch/elasticsearch-cloud-aws/elasticsearch-cloud-aws-2.2.0.zip...
Downloading ... DONE
Installed elasticsearch/elasticsearch-cloud-aws/2.2.0 into /usr/share/crate/plugins/cloud-aws
root@node:/usr/share/crate# bin/crate
[2014-07-27 04:45:57,912][INFO ][node                     ] [Mar-Vell] version[1.2.2], pid[4026], build[${build/NA]
[2014-07-27 04:45:57,913][INFO ][node                     ] [Mar-Vell] initializing ...
[2014-07-27 04:45:57,996][INFO ][plugins                  ] [Mar-Vell] loaded [blob, sql, udc, crate-core, cloud-aws, admin-ui], sites [bigdesk, head, crate-admin, segmentspy]
{1.2.2}: Initialization Failed ...
- NoClassDefFoundError[org/elasticsearch/common/collect/Lists]
        ClassNotFoundException[org.elasticsearch.common.collect.Lists]

cloud-aws 2.2.0 should be compatible w/ ES 1.2.2 used in Crate 0.40.2? https://github.com/elasticsearch/elasticsearch-cloud-aws/tree/es-1.2

Created table does not show up in web front end

Following the Hello Crate tutorial, I used the following code in the web console to create a table:

create table tweets (
created_at timestamp,
id string primary key,
retweeted boolean,
source string INDEX using fulltext,
text string INDEX using fulltext,
user_id string
);

This returned CREATE TABLE OK (0.428 sec)

I can select against this table with no issues:

select * from tweets limit 100;

SELECT OK (0.003 sec)

(no rows expected, none returned).

However this table does not show up on the following page:

http://192.168.1.15:4200/_plugin/crate-admin/#/tables

Post Twitter authorisation screen issue

On authorising the Twitter application preceding the data import on the Getting Started page, you are returned to the web root rather than somewhere useful, like the screen you started on.

Eg you start at:

http://192.168.1.15:4200/_plugin/crate-admin/#/tutorial

Which takes you through the twitter auth process, which then returns you to:

http://192.168.1.15:4200/?start_twitter=true

The app is indeed authed at that point and can get data with no issue (other than the other issue I have just reported).

Add blob storage option to store files on s3

Having a database means having to deal with data. As long as the data is text, the size is often somewhat manageable (speaking of GB rather than TB), but with blobs it can easily grow a lot larger. It might be reasonable for some setups to store the data on Amazon's S3 (or another data store) rather than alongside the text data, for pricing and stability reasons. Do you think it might be a good idea to add support for this?

Error when node.master: false and node.data: false

Hello,

I have the following configuration:

1 node as

#2. You want this node to only serve as a master: to not store any data and
#    to have free resources. This will be the "coordinator" of your cluster.
#
node.master: true
node.data: false

2 nodes as

# These are the nodes I replicate when I want to scale
node.master: true
node.data: true

Every app node has a local crate node as:

#3. You want this node to be neither master nor data node, but
#    to act as a "search load balancer" (fetching data from nodes,
#    aggregating results, etc.)
node.master: false
node.data: false

But when I try to start Crate with this last configuration (node.master: false, node.data: false), I get the following error:

[2014-04-30 22:32:11,159][WARN ][transport                ] [Boost] Registered two transport handlers for action el-crate-export, handlers: org.elasticsearch.action.support.broadcast.TransportBroadcastOperationAction$TransportHandler@4ff5f9e6, org.elasticsearch.action.support.broadcast.TransportBroadcastOperationAction$TransportHandler@5aa7ff4f
[2014-04-30 22:32:11,159][WARN ][transport                ] [Boost] Registered two transport handlers for action el-crate-export/s, handlers: org.elasticsearch.action.support.broadcast.TransportBroadcastOperationAction$ShardTransportHandler@1de9940d, org.elasticsearch.action.support.broadcast.TransportBroadcastOperationAction$ShardTransportHandler@4b6aca08
[2014-04-30 22:32:11,160][WARN ][transport                ] [Boost] Registered two transport handlers for action el-crate-export, handlers: org.elasticsearch.action.support.broadcast.TransportBroadcastOperationAction$TransportHandler@51f6f0d2, org.elasticsearch.action.support.broadcast.TransportBroadcastOperationAction$TransportHandler@4ff5f9e6
[2014-04-30 22:32:11,160][WARN ][transport                ] [Boost] Registered two transport handlers for action el-crate-export/s, handlers: org.elasticsearch.action.support.broadcast.TransportBroadcastOperationAction$ShardTransportHandler@2b3c9508, org.elasticsearch.action.support.broadcast.TransportBroadcastOperationAction$ShardTransportHandler@1de9940d
[2014-04-30 22:32:11,160][WARN ][transport                ] [Boost] Registered two transport handlers for action el-crate-export, handlers: org.elasticsearch.action.support.broadcast.TransportBroadcastOperationAction$TransportHandler@165c5665, org.elasticsearch.action.support.broadcast.TransportBroadcastOperationAction$TransportHandler@51f6f0d2
[2014-04-30 22:32:11,161][WARN ][transport                ] [Boost] Registered two transport handlers for action el-crate-export/s, handlers: org.elasticsearch.action.support.broadcast.TransportBroadcastOperationAction$ShardTransportHandler@82c0c5b, org.elasticsearch.action.support.broadcast.TransportBroadcastOperationAction$ShardTransportHandler@2b3c9508
{1.0.1}: Initialization Failed ...
1) ElasticsearchIllegalStateException[node is not configured to store local location]

I can't seem to understand the meaning of this error. Maybe it's a bug?

Prevent duplicates in blob store?

So, different users are storing the same file, and one user decides to delete his file. What can be done so that not all users lose that file?

OutOfMemoryError

This happened while importing an 8 million record file using COPY FROM, after about 1.2 million records.

[2014-07-30 16:53:47,808][WARN ][index.engine.internal    ] [Kirigi] [.partitioned.accelerometerdata.043j2d1g6goj0][2] failed engine [out of memory]
[2014-07-30 16:54:32,710][ERROR][io.crate.executor.transport.TransportCollectNodeAction] Error when creating result futures
java.lang.OutOfMemoryError: Java heap space
        at org.elasticsearch.index.get.ShardGetService.stats(ShardGetService.java:92)
        at org.elasticsearch.index.shard.service.InternalIndexShard.getStats(InternalIndexShard.java:515)
        at org.elasticsearch.action.admin.indices.stats.CommonStats.<init>(CommonStats.java:136)
        at org.elasticsearch.action.admin.indices.stats.ShardStats.<init>(ShardStats.java:49)
        at org.elasticsearch.indices.InternalIndicesService.stats(InternalIndicesService.java:211)
        at org.elasticsearch.indices.InternalIndicesService.stats(InternalIndicesService.java:174)
        at org.elasticsearch.node.service.NodeService.stats(NodeService.java:138)
        at io.crate.operation.reference.sys.node.NodePortExpression$2.value(NodePortExpression.java:70)
        at io.crate.operation.reference.sys.node.NodePortExpression$2.value(NodePortExpression.java:67)
        at io.crate.operation.reference.sys.SysObjectReference.value(SysObjectReference.java:45)
        at io.crate.operation.reference.sys.SysObjectReference.value(SysObjectReference.java:31)
        at io.crate.analyze.EvaluatingNormalizer.visitReference(EvaluatingNormalizer.java:92)
        at io.crate.analyze.EvaluatingNormalizer.visitReference(EvaluatingNormalizer.java:49)
        at io.crate.planner.symbol.Reference.accept(Reference.java:76)
        at io.crate.planner.symbol.SymbolVisitor.process(SymbolVisitor.java:30)
        at io.crate.analyze.EvaluatingNormalizer.normalize(EvaluatingNormalizer.java:146)
        at io.crate.analyze.EvaluatingNormalizer.normalize(EvaluatingNormalizer.java:118)
        at io.crate.planner.node.dql.CollectNode.normalize(CollectNode.java:246)
        at io.crate.operation.collect.MapSideDataCollectOperation.handleNodeCollect(MapSideDataCollectOperation.java:173)
        at io.crate.operation.collect.MapSideDataCollectOperation.collect(MapSideDataCollectOperation.java:157)
        at io.crate.executor.transport.TransportCollectNodeAction.nodeOperation(TransportCollectNodeAction.java:115)
        at io.crate.executor.transport.TransportCollectNodeAction.access$700(TransportCollectNodeAction.java:52)
        at io.crate.executor.transport.TransportCollectNodeAction$AsyncAction$1.run(TransportCollectNodeAction.java:173)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)

where clause fails to apply over partitions

I have an audience table partitioned by timestamp. The following query produces 0 results:

select * from audience 
where 1398902400000 < timestamp and timestamp < 1404000000000
limit 1;

But if I split the where clause, both queries return results that would pass the combined condition:

select * from audience where 1398902400000 < timestamp limit 1;
result: ... timestamp: 1403222400000

select * from audience where timestamp < 1404000000000 limit 1;
result: ... timestamp: 1403222400000

I'm guessing this has something to do with the partitions.

copy from without partition clause

I did something silly and imported some data without using the PARTITION clause on a partitioned table.

What can be done to fix the data? Is it possible to export and reimport the data again?

The problem is that the cluster already contained some live data.

Is the documentation still valid?
https://crate.io/docs/stable/sql/reference/copy_from.html#partition-clause

The files that are imported should not contain values for the partitioned columns as the COPY FROM ... PARTITION statement does not check that and will unnecessarily import them.
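One possible recovery path, sketched under the assumption that the installed version supports the PARTITION clause for both COPY TO and COPY FROM (paths and partition values are made up; this is untested and should be tried on a copy of the data first):

-- export the affected partition to files (assumed syntax)
COPY audience PARTITION (timestamp = 1398902400000) TO DIRECTORY '/tmp/audience_export';

-- after cleaning up the wrongly imported rows, re-import with an explicit partition clause
COPY audience PARTITION (timestamp = 1398902400000) FROM '/tmp/audience_export/*';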

Unsupported type 'integer_array' for custom aggregation function

Hi,

I'm trying to write a custom aggregation function for crate 0.41.1. I've managed to get it working for a string type, but my proper use case is an array type. However, if I try to run the aggregation function on an array type, I hit an exception:

io.crate.exceptions.UnhandledServerException: unsupported type 'integer_array'
    at io.crate.operation.reference.doc.lucene.LuceneDocLevelReferenceResolver.getImplementation(LuceneDocLevelReferenceResolver.java:82)
    at io.crate.operation.reference.doc.lucene.LuceneDocLevelReferenceResolver.getImplementation(LuceneDocLevelReferenceResolver.java:31)
    at io.crate.operation.collect.CollectInputSymbolVisitor.visitReference(CollectInputSymbolVisitor.java:67)
    at io.crate.operation.collect.CollectInputSymbolVisitor.visitReference(CollectInputSymbolVisitor.java:39)
    at io.crate.planner.symbol.Reference.accept(Reference.java:76)
    at io.crate.planner.symbol.SymbolVisitor.process(SymbolVisitor.java:30)
    at io.crate.operation.AbstractImplementationSymbolVisitor.process(AbstractImplementationSymbolVisitor.java:59)
    at io.crate.operation.collect.ShardCollectService.getLuceneIndexCollector(ShardCollectService.java:175)
    at io.crate.operation.collect.ShardCollectService.getCollector(ShardCollectService.java:147)
    at io.crate.operation.collect.MapSideDataCollectOperation.handleShardCollect(MapSideDataCollectOperation.java:274)
    at io.crate.operation.collect.MapSideDataCollectOperation.collect(MapSideDataCollectOperation.java:160)
    at io.crate.executor.transport.TransportCollectNodeAction.nodeOperation(TransportCollectNodeAction.java:115)
    at io.crate.executor.transport.TransportCollectNodeAction.access$700(TransportCollectNodeAction.java:52)
    at io.crate.executor.transport.TransportCollectNodeAction$AsyncAction$1.run(TransportCollectNodeAction.java:173)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
[2014-08-08 11:26:13,650][ERROR][io.crate.executor.transport.TransportCollectNodeAction] Error when creating result futures
io.crate.exceptions.UnhandledServerException: io.crate.exceptions.UnhandledServerException: unsupported type 'integer_array'
    at io.crate.operation.collect.MapSideDataCollectOperation.handleShardCollect(MapSideDataCollectOperation.java:285)
    at io.crate.operation.collect.MapSideDataCollectOperation.collect(MapSideDataCollectOperation.java:160)
    at io.crate.executor.transport.TransportCollectNodeAction.nodeOperation(TransportCollectNodeAction.java:115)
    at io.crate.executor.transport.TransportCollectNodeAction.access$700(TransportCollectNodeAction.java:52)
    at io.crate.executor.transport.TransportCollectNodeAction$AsyncAction$1.run(TransportCollectNodeAction.java:173)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: io.crate.exceptions.UnhandledServerException: unsupported type 'integer_array'
    at io.crate.operation.reference.doc.lucene.LuceneDocLevelReferenceResolver.getImplementation(LuceneDocLevelReferenceResolver.java:82)
    at io.crate.operation.reference.doc.lucene.LuceneDocLevelReferenceResolver.getImplementation(LuceneDocLevelReferenceResolver.java:31)
    at io.crate.operation.collect.CollectInputSymbolVisitor.visitReference(CollectInputSymbolVisitor.java:67)
    at io.crate.operation.collect.CollectInputSymbolVisitor.visitReference(CollectInputSymbolVisitor.java:39)
    at io.crate.planner.symbol.Reference.accept(Reference.java:76)
    at io.crate.planner.symbol.SymbolVisitor.process(SymbolVisitor.java:30)
    at io.crate.operation.AbstractImplementationSymbolVisitor.process(AbstractImplementationSymbolVisitor.java:59)
    at io.crate.operation.collect.ShardCollectService.getLuceneIndexCollector(ShardCollectService.java:175)
    at io.crate.operation.collect.ShardCollectService.getCollector(ShardCollectService.java:147)
    at io.crate.operation.collect.MapSideDataCollectOperation.handleShardCollect(MapSideDataCollectOperation.java:274)
    ... 7 more

IllegalIndexShardStateException

  • Environment: 8 m3.large nodes (SSD), Ubuntu 12.04
  • Data: 230GB index with ~ 300M documents
  • Stack trace:
[ERROR][io.crate.executor.transport.TransportCollectNodeAction] Error when creating result futures
io.crate.exceptions.CrateException: [traffic][26] CurrentState[POST_RECOVERY] operations only allowed when started/relocated
    at io.crate.operation.collect.MapSideDataCollectOperation.handleShardCollect(MapSideDataCollectOperation.java:209)
    at io.crate.operation.collect.MapSideDataCollectOperation.collect(MapSideDataCollectOperation.java:115)
    at io.crate.executor.transport.TransportCollectNodeAction.nodeOperation(TransportCollectNodeAction.java:99)
    at io.crate.executor.transport.TransportCollectNodeAction.access$600(TransportCollectNodeAction.java:49)
    at io.crate.executor.transport.TransportCollectNodeAction$AsyncAction$1.run(TransportCollectNodeAction.java:153)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)
Caused by: org.elasticsearch.index.shard.IllegalIndexShardStateException: [traffic][26] CurrentState[POST_RECOVERY] operations only allowed when started/relocated
    at org.elasticsearch.index.shard.service.InternalIndexShard.readAllowed(InternalIndexShard.java:806)
    at org.elasticsearch.index.shard.service.InternalIndexShard.acquireSearcher(InternalIndexShard.java:644)
    at org.elasticsearch.index.shard.service.InternalIndexShard.acquireSearcher(InternalIndexShard.java:639)
    at org.elasticsearch.index.shard.service.InternalIndexShard.docStats(InternalIndexShard.java:493)
    at io.crate.operation.reference.sys.shard.ShardNumDocsExpression.value(ShardNumDocsExpression.java:41)
    at io.crate.operation.reference.sys.shard.ShardNumDocsExpression.value(ShardNumDocsExpression.java:27)
    at io.crate.analyze.EvaluatingNormalizer.visitReference(EvaluatingNormalizer.java:90)
    at io.crate.analyze.EvaluatingNormalizer.visitReference(EvaluatingNormalizer.java:49)
    at io.crate.planner.symbol.Reference.accept(Reference.java:76)
    at io.crate.planner.symbol.SymbolVisitor.process(SymbolVisitor.java:32)
    at io.crate.analyze.EvaluatingNormalizer.normalize(EvaluatingNormalizer.java:167)
    at io.crate.analyze.EvaluatingNormalizer.normalize(EvaluatingNormalizer.java:139)
    at io.crate.planner.node.dql.CollectNode.normalize(CollectNode.java:231)
    at io.crate.operation.collect.ShardCollectService.getCollector(ShardCollectService.java:109)
    at io.crate.operation.collect.MapSideDataCollectOperation.handleShardCollect(MapSideDataCollectOperation.java:201)
    ... 7 more
  • Steps to reproduce
    • Run heavy query like: select count(distinct {high cardinality field}) from {index_alias} ...
    • Wait a couple of minutes (no response, cluster hangs at 100% CPU load)
    • Restart the crate service on the node you queried
      service crate restart
    • _plugin/head/ shows green; go to the logs and check them

_sql Endpoint issues

  1. For ORM implementations, "insert into" should return the generated key.
  2. Option to return column types. Example: a SQL console cannot format dates because it is not aware of timestamp columns; the same function is required for a JDBC driver (result set metadata).

MongoDB JSON Import - not able to group by column with array value

I've imported directly from a JSON Mongo dump. I'm trying to group by a column (type string) which happens to be a MAC address.

There are 20 million+ rows of data, and when I try to group by the MAC, I get an error:

client_mac has a value that is an array

The query:

select count(*), client_mac from streams group by client_mac order by client_mac limit 100;

I was just on IRC, and the suggestion was that there's a rogue MAC which may not be a string; however, with so many rows, I can't scrub the data before importing.

Any suggestions?

KeyError: 'cols'

I am playing with Crate using the following code:

import sqlalchemy as sa
from sqlalchemy import func, desc
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

from crate.client.sqlalchemy.types import Object

hosts = ['localhost:4200']
engine = sa.create_engine('crate://', connect_args={'servers': hosts})
Base = declarative_base()
Session = sessionmaker(engine)
session = Session()

class Foo2(Base):
    __tablename__ = 'foo2'

    user_id = sa.Column(sa.String, primary_key=True)
    created_at = sa.Column(sa.Integer)

    def __repr__(self):
        return '<Tweet {0}>'.format(self.user_id)

for i, row in enumerate(session.query(Foo2).order_by(Foo2.user_id)):
    print i, row

import pdb; pdb.set_trace()
for i in range(100000):
    f = Foo2(user_id=i, created_at=i)
    session.add(f)
    session.flush()

This raises the following error:

-> for i in range(100000):
(Pdb) c
Traceback (most recent call last):
  File "foo.py", line 29, in <module>
    for i in range(100000):
  File "/Users/ajung/src/crate-0.40.2/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 1907, in flush
    self._flush(objects)
  File "/Users/ajung/src/crate-0.40.2/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 2025, in _flush
    transaction.rollback(_capture_exception=True)
  File "/Users/ajung/src/crate-0.40.2/lib/python2.7/site-packages/sqlalchemy/util/langhelpers.py", line 57, in __exit__
    compat.reraise(exc_type, exc_value, exc_tb)
  File "/Users/ajung/src/crate-0.40.2/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 1989, in _flush
    flush_context.execute()
  File "/Users/ajung/src/crate-0.40.2/lib/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 371, in execute
    rec.execute(self)
  File "/Users/ajung/src/crate-0.40.2/lib/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 524, in execute
    uow
  File "/Users/ajung/src/crate-0.40.2/lib/python2.7/site-packages/sqlalchemy/orm/persistence.py", line 64, in save_obj
    mapper, table, insert)
  File "/Users/ajung/src/crate-0.40.2/lib/python2.7/site-packages/sqlalchemy/orm/persistence.py", line 568, in _emit_insert_statements
    execute(statement, multiparams)
  File "/Users/ajung/src/crate-0.40.2/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 727, in execute
    return meth(self, multiparams, params)
  File "/Users/ajung/src/crate-0.40.2/lib/python2.7/site-packages/sqlalchemy/sql/elements.py", line 322, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/Users/ajung/src/crate-0.40.2/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 824, in _execute_clauseelement
    compiled_sql, distilled_params
  File "/Users/ajung/src/crate-0.40.2/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 971, in _execute_context
    result = context.get_result_proxy()
  File "/Users/ajung/src/crate-0.40.2/lib/python2.7/site-packages/sqlalchemy/engine/default.py", line 768, in get_result_proxy
    return result.ResultProxy(self)
  File "/Users/ajung/src/crate-0.40.2/lib/python2.7/site-packages/sqlalchemy/engine/result.py", line 404, in __init__
    self._init_metadata()
  File "/Users/ajung/src/crate-0.40.2/lib/python2.7/site-packages/sqlalchemy/engine/result.py", line 407, in _init_metadata
    metadata = self._cursor_description()
  File "/Users/ajung/src/crate-0.40.2/lib/python2.7/site-packages/sqlalchemy/engine/result.py", line 517, in _cursor_description
    return self._saved_cursor.description
  File "/Users/ajung/src/crate-0.40.2/lib/python2.7/site-packages/crate/client/cursor.py", line 174, in description
    for col in self._result["cols"]:
KeyError: 'cols'

check for 'is not null' in object types

class User(...):
    club = Column(
        'club',
        Object,
        default=None,
    )

select * from users where club is not null

Error!

UnsupportedFeatureException[unknown function: op_isnull(object)]; nested: UnsupportedOperationException[unknown function: op_isnull(object)];

Error with importing demo twitter feed

The demo twitter feed import in the Admin interface is failing to store any of the tweets.

The server is running on JRE7 on Windows Server 2012R2, the client on Windows 8. There are no firewall problems; this was also tested on JRE7 on Windows 7 64-bit with both client and server on the same machine.

The following is the request/response series that is causing the issue:

POST http://192.168.1.15:4200/_sql HTTP/1.1
Host: 192.168.1.15:4200
Connection: keep-alive
Content-Length: 768
Accept: application/json, text/plain, /
Origin: http://192.168.1.15:4200
User-Agent: Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1847.116 Safari/537.36
Content-Type: application/json;charset=UTF-8
Referer: http://192.168.1.15:4200/_plugin/crate-admin/
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8

{"stmt":"create table tweets ( id string primary key, created_at timestamp, text string INDEX using fulltext, source string INDEX using fulltext, retweeted boolean, "user" object(strict) as ( created_at timestamp, verified boolean, followers_count integer, id string, statuses_count integer, description string INDEX using fulltext, friends_count integer, location string INDEX using fulltext ) ) with (number_of_replicas = '0-all')"}

HTTP/1.1 500 Internal Server Error
Access-Control-Allow-Origin: *
Content-Type: application/json; charset=UTF-8
Content-Length: 108

{"error":{"message":"NoClassDefFoundError[Could not initialize class io.airlift.slice.Slices]","code":1000}}

POST http://192.168.1.15:4200/_sql HTTP/1.1
Host: 192.168.1.15:4200
Connection: keep-alive
Content-Length: 795
Accept: application/json, text/plain, /
Origin: http://192.168.1.15:4200
User-Agent: Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1847.116 Safari/537.36
Content-Type: application/json;charset=UTF-8
Referer: http://192.168.1.15:4200/_plugin/crate-admin/
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8

{"stmt":"insert into tweets values ($1, $2, $3, $4, $5, $6)","args":[1398526680000,"460080361267929091",false,"<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone","RT @Nax_Rag: 東方神起9周年おめでとう!\nこれからも二人を信じて付いていきます! \n#東方神起9thAnniversary",{"followers_count":58,"verified":false,"description":"東方神起 の2人が大好き(チャンミン寄り♪)BB♡の曲も好き(テソン寄り♪)2NE1♡洋楽も聞くよ♪ 音楽に元気をもらってます。日常の事もつぶやいてるBigeast&VIP。 オバペンで マイペースですがよろしくです(゚v゚)♪","friends_count":92,"statuses_count":5401,"created_at":1360151496000,"id":"1153801358","location":""}]}

HTTP/1.1 400 Bad Request
Access-Control-Allow-Origin: *
Content-Type: application/json; charset=UTF-8
Content-Length: 575

{"error":{"message":"ValidationException[Validation failed for user_id: Invalid input for type string: {id=1153801358, location=, friends_count=92, description=東方神起 の2人が大好き(チャンミン寄り♪)BB♡の曲も好き(テソン寄り♪)2NE1♡洋楽も聞くよ♪ 音楽に元気をもらってます。日常の事もつぶやいてるBigeast&VIP。 オバペンで マイペースですがよろしくです(゚v゚)♪, verified=false, created_at=1360151496000, statuses_count=5401, followers_count=58}]","code":4003,"args":["user_id"]}}

I was able to create a table from within the online console, but any inserts also failed with the error NoClassDefFoundError[Could not initialize class io.airlift.slice.Slices].

This is the sole Crate server on the network.

Error while importing file with array type

I'm having trouble importing a JSON file with an array type from the web client. I understand I cannot do plain insert into queries from the web client shell, but IMHO importing a file should work.

The top-level message is: SQLActionException[MapperParsingException[object mapping [visitors_register_10] trying to serialize a value with no field associated with it, current value [0]]]

The table is:

create table audience_bulk (
    browser_version string,
    //redacted,
    timestamp long,
    visitors_register_10 array(object as (val integer))
) clustered into 10 shards partitioned by (timestamp);

The statement is: `copy audience_bulk from 'file:///home/mping/file.json'`

Here's a sample of the file:

{"browser_version":"IE 9.0","visitors_register_10":[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,32768,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],"timestamp":"1398902400000"}

The stack trace is:

[2014-08-06 15:21:05,184][DEBUG][io.crate.action.sql.TransportSQLAction] [meo-loucos01] Error processing SQLRequest
org.elasticsearch.common.util.concurrent.UncategorizedExecutionException: Failed execution
    at org.elasticsearch.action.support.AdapterActionFuture.rethrowExecutionException(AdapterActionFuture.java:92)
    at org.elasticsearch.action.support.AdapterActionFuture.actionGet(AdapterActionFuture.java:50)
    at org.elasticsearch.action.support.AbstractListenableActionFuture.executeListener(AbstractListenableActionFuture.java:131)
    at org.elasticsearch.action.support.AbstractListenableActionFuture.done(AbstractListenableActionFuture.java:103)
    at org.elasticsearch.common.util.concurrent.BaseFuture.setException(BaseFuture.java:180)
    at org.elasticsearch.action.support.AdapterActionFuture.onFailure(AdapterActionFuture.java:103)
    at io.crate.executor.transport.TransportCollectNodeAction$1.onFailure(TransportCollectNodeAction.java:138)
    at com.google.common.util.concurrent.Futures$5.run(Futures.java:1222)
    at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
    at com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
    at com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
    at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
    at com.google.common.util.concurrent.SettableFuture.setException(SettableFuture.java:68)
    at io.crate.operation.projectors.CollectingProjector.upstreamFailed(CollectingProjector.java:87)
    at io.crate.operation.projectors.AbstractIndexWriterProjector$2.onFailure(AbstractIndexWriterProjector.java:176)
    at com.google.common.util.concurrent.Futures$5.run(Futures.java:1222)
    at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
    at com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
    at com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
    at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
    at com.google.common.util.concurrent.SettableFuture.setException(SettableFuture.java:68)
    at org.elasticsearch.action.bulk.BulkShardProcessor.setFailure(BulkShardProcessor.java:170)
    at org.elasticsearch.action.bulk.BulkShardProcessor.processResponse(BulkShardProcessor.java:277)
    at org.elasticsearch.action.bulk.BulkShardProcessor.access$200(BulkShardProcessor.java:61)
    at org.elasticsearch.action.bulk.BulkShardProcessor$ResponseListener.onResponse(BulkShardProcessor.java:315)
    at org.elasticsearch.action.bulk.BulkShardProcessor$ResponseListener.onResponse(BulkShardProcessor.java:305)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$2.handleResponse(TransportShardReplicationOperationAction.java:461)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$2.handleResponse(TransportShardReplicationOperationAction.java:447)
    at org.elasticsearch.transport.netty.MessageChannelHandler.handleResponse(MessageChannelHandler.java:154)
    at org.elasticsearch.transport.netty.MessageChannelHandler.messageReceived(MessageChannelHandler.java:125)
    at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
    at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
    at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
    at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
    at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
    at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
    at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
    at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
    at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
    at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
    at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
    at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
    at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
    at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
    at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: MapperParsingException[object mapping [visitors_register_10] trying to serialize a value with no field associated with it, current value [0]]
    at org.elasticsearch.common.util.concurrent.BaseFuture$Sync.getValue(BaseFuture.java:288)
    at org.elasticsearch.common.util.concurrent.BaseFuture$Sync.get(BaseFuture.java:275)
    at org.elasticsearch.common.util.concurrent.BaseFuture.get(BaseFuture.java:113)
    at org.elasticsearch.action.support.AdapterActionFuture.actionGet(AdapterActionFuture.java:45)
    ... 50 more
Caused by: java.lang.RuntimeException: MapperParsingException[object mapping [visitors_register_10] trying to serialize a value with no field associated with it, current value [0]]
    ... 30 more

Thanks.

Query for array(string) type fails - or syntax documentation is missing

select id,ips,domain from netinfos where ips='110.76.141.122' fails with

SQLParseException[wrong type 'string'. expected: 'string_array']; nested: IllegalArgumentException[wrong type 'string'. expected: 'string_array'];

The documentation gives no example to query values in arrays.

to reproduce:
create table netinfo (id string, ips array(string))
insert into netinfo (id,ips) values (?,?) - 'wordpress.com', [' 76.74.254.126','192.0.82.252']
select id,ips from netinfos where ips='76.74.254.126'

CRATE Version 0.31.0 (from website download)
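For what it's worth, later CrateDB versions match a single value against an array column with the ANY operator; a sketch of that form (not necessarily available in 0.31.0):

-- match one element inside the ips array (assumes ANY support)
SELECT id, ips
FROM netinfo
WHERE '76.74.254.126' = ANY(ips);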

Include an insert Object example in the Documentation

The documentation states that one can use a JSON blob for an object field; however, all of my attempts to insert using JSON data result in errors. I think it would be nice if the documentation provided an insert example to go along with the create examples, or if the insert section had an object insert example.
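A hedged sketch of what such an object insert could look like, using CrateDB's object literal syntax (table and column names are made up; depending on the version, the object may also need to be passed as a statement parameter instead):

-- insert a nested object with key = value pairs (assumed literal syntax)
INSERT INTO tweets (id, "user")
VALUES ('1', {name = 'Alice', followers_count = 42});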

Randomly getting an OutOfBoundsException

This error occurs at random when running integration tests against Crate:

[2014-08-13 14:33:22,070][ERROR][io.crate.executor.transport.task.elasticsearch.ESSearchTask$SearchResponseListener] Error executing SELECT statement
org.elasticsearch.action.search.SearchPhaseExecutionException: Failed to execute phase [query], all shards failed
    at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.onFirstPhaseResult(TransportSearchTypeAction.java:233)
    at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$1.onFailure(TransportSearchTypeAction.java:179)
    at org.elasticsearch.search.action.SearchServiceTransportAction$23.run(SearchServiceTransportAction.java:523)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
Exception in thread "elasticsearch[Miles Warren][search][T#4]" java.lang.ArrayIndexOutOfBoundsException: 0
    at io.crate.executor.transport.task.elasticsearch.ESSearchTask$SearchResponseListener.onFailure(ESSearchTask.java:191)
    at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.raiseEarlyFailure(TransportSearchTypeAction.java:330)
    at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.onFirstPhaseResult(TransportSearchTypeAction.java:233)
    at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$1.onFailure(TransportSearchTypeAction.java:179)
    at org.elasticsearch.search.action.SearchServiceTransportAction$23.run(SearchServiceTransportAction.java:523)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)

Add documentation on insert statements for array, object

Hi,
I'm trying to get an array insert working, but all the syntax examples I find on the web seem not to work with Crate.

Postgres Syntax

INSERT INTO rock_band 
VALUES
('Led Zeppelin',
'{"Page", "Plant", "Jones", "Bonham"}'
)

what's the correct statement for Crate?
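For comparison, CrateDB uses bracketed array literals rather than the PostgreSQL brace string shown above; a sketch assuming a members array(string) column:

-- array literal written with square brackets (assumed table definition)
INSERT INTO rock_band (name, members)
VALUES ('Led Zeppelin', ['Page', 'Plant', 'Jones', 'Bonham']);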

COPY TO 's3://...' fails when parsing secret with slashes

When running a 'COPY TO' command with a destination of the form 's3://key:secret@bucket', I get a syntax error if 'secret' contains a slash.

I.e., in the following I'd expect an auth failure instead of the error given:

cr> copy clubs to directory 's3://foo:ba/r@quux';
UncategorizedExecutionException[Failed execution]; nested: ExecutionException[java.lang.IllegalArgumentException: The bucket name parameter must be specified when initiating a multipart upload]; nested: IllegalArgumentException[The bucket name parameter must be specified when initiating a multipart upload]; 

SQL: not like

It looks a "not like" query does not work. See attached example.
bildschirmfoto 2014-05-27 um 17 45 10

Seen on 0.38

Document Crate's Features & Storage Details

Crate looks extremely exciting! While evaluating it (and I'm sure others are doing the same), I found little information on the following:

  • ACID Properties
  • Consistency Guarantees (How is replication done…)
  • Transactional Support (If any…)
  • Storage Durability (In what scenarios could data loss occur due…)
  • Index Types (Geo, Full Text, Composite etc…)

I've tried to find information on any of the above through the web docs or the GitHub repo, but have not found much beyond full-text support.

Sql: Allow array counting

Hi,

it would be really great if we could do some basic array operations like array length or array intersection (like the && operator in PostgreSQL), ...

Thanks a lot for your work
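For reference, later CrateDB versions ship an array_length scalar; a sketch of the kind of operation requested here, reusing the netinfo table from an earlier report (availability depends on the installed version):

-- count the elements of an array column (assumes array_length support)
SELECT id, array_length(ips, 1) AS ip_count
FROM netinfo;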

Support "Create/Drop index" statements

Please add support for CREATE INDEX statements, and for dropping indexes as well.

  CREATE  INDEX "index_posts_on_comment_count" ON "posts" ("comment_count")
  DROP INDEX "index_posts_on_comment_count" ON "posts"

ActiveRecord migrations use this syntax to add indices.

IS NOT NULL in query results in "SQLActionException[null]"

Hello,

A query such as

select * from t1 where t1 is not null;

results in

Error!

SQLActionException[null]

"IS NULL" seems to work OK.


Version:

[2014-08-18 02:49:02,950][INFO ][node ] [Gailyn Bailey] version[1.2.2], pid[5257], build[${build/NA]
[2014-08-18 02:49:02,951][INFO ][node ] [Gailyn Bailey] initializing ...
[2014-08-18 02:49:03,076][INFO ][plugins ] [Gailyn Bailey] loaded [blob, sql, udc, crate-core, admin-ui], sites [head, bigdesk, segmentspy, crate-admin]
[2014-08-18 02:49:03,880][INFO ][io.crate.module.CrateCoreModule] configuring crate. version: 0.41.2

order by nested field does not work on all fields

I'm getting the following exception:

UnsupportedFeatureException[unsupported name reference: user.followers_count]; nested: UnsupportedOperationException[unsupported name reference: user.followers_count];

when I execute the following query:

select * from tweets order by user.followers_count limit 100

But the query

select * from tweets order by user.id limit 100

works fine

_sql endpoint should return a row when creating a BLOB Table

After creating the blob table, the response should include a rows property in the JSON; it exists for CREATE TABLE, but not for CREATE BLOB TABLE.

CREATE BLOB TABLE behaves differently than CREATE TABLE.
I had to put this workaround into node-crate (the Node.js driver) to make CREATE BLOB TABLE work with the new Mocha tests (which now include BLOBs). Other drivers based on the _sql endpoint might have the same issue (perhaps not yet discovered, or also handled with a workaround).

response.on('end', function() {
    var result = {};
    try {
        result = JSON.parse(str);
    } catch (ex) {
        console.log('error:' + sql)
        if (cb) cb(ex, null, null);
        return;
    }
    if (/CREATE BLOB/im.test(sql)) {
        // workaround: CRATE does not return rows when it creates a BLOB table
        result.rows = [];
    }

    var jsons = result.rows.map(function(e) {
Commit of 100k small records takes almost 2 minutes

I have the following code for inserting 100k small rows into a local Crate instance.
The commit phase takes almost 2 minutes with Python 2.7, which appears horribly slow, and even worse: more than 5 minutes with Python 3.3.

import time
import sqlalchemy as sa
from sqlalchemy import func, desc
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

from crate.client.sqlalchemy.types import Object

hosts = ['localhost:4200']
engine = sa.create_engine('crate://', connect_args={'servers': hosts})
Base = declarative_base()
Session = sessionmaker(engine)
session = Session()

class Foo2(Base):
    __tablename__ = 'foo2'

    user_id = sa.Column(sa.String, primary_key=True)
    created_at = sa.Column(sa.Integer)

    def __repr__(self):
        return '<Tweet {0}>'.format(self.user_id)

for i in range(100000):
    f = Foo2(user_id=i, created_at=i)
    if i % 10000 == 0:
        print i
    session.add(f)

print 'flush'
session.flush()

print 'commit'
ts = time.time()
session.commit()
print time.time() - ts
print 'done'

Better Batch Inserts

I was just testing batch insert statements as described in the Crate Java Client documentation and came across the interesting result that a batch is slower than multiple individual insert executions.

In my testing it took ~300 ms to insert 1000 items individually, but ~1600 ms to insert the same items in a single statement.

I have ruled out the statement string generation (?, ?, ?, ?, ?, ?), ... (3 ms) and the args generation (30 ms) as the issue in my code, and it should be easy to reproduce.

It would be good if you could investigate this and provide batch inserts that are quicker, and that perhaps also don't require a really long parameter substitution string to be generated.

Thanks,
Alex

Columns w/ INDEX OFF cannot be aggregated.

Just discovered this when attempting to sum() some statistics columns that I had set to INDEX OFF.

At a minimum the documentation should be updated to reflect that this is the case.
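To make the report concrete, a sketch of the failing pattern (table and column names are made up):

-- a non-indexed statistics column (assumed definition)
CREATE TABLE stats (
    day timestamp,
    clicks long INDEX OFF
);

-- aggregating the INDEX OFF column failed at the time of this report
SELECT sum(clicks) FROM stats;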

Limit on Update and Delete

I'm not sure if this has already been pointed out, but LIMIT doesn't work on some basic query operations, such as:

_Queries_

delete from sample limit 100;
update sample set bar = 2 where admin = 1 limit 10;

_Sample error_

SQLParseException[line 1:20: mismatched input 'limit' expecting EOF]; nested: ParsingException[line 1:20: mismatched input 'limit' expecting EOF]; nested: MismatchedTokenException;
