
indexr's People

Contributors

flowbehappy, radbrawler


indexr's Issues

Maven compile problem

I tried to compile IndexR today and got a "cannot find BhcompressLibrary" package error. How can this be solved?
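
A likely fix (an assumption based on the setup scripts listed later on this page, which generate BhcompressLibrary among other sources) is to run them from the project root before the Maven build:

# generate the compression bindings and other generated sources first
sh script/setup_lib.sh
sh script/setup_indexr-segment.sh
sh script/setup_indexr-query.sh
# then build
mvn clean install -DskipTests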

Hi, I got an error again: java.lang.IndexOutOfBoundsException: null

01:33:11.696 [RT-Ingest-auth_data] DEBUG i.i.server.rt.fetcher.Kafka08Fetcher - Illegal data: {"ydbpartion":"20161229","tablename":"auth_data","SOURCE_LINENO":"6579","ZIPNAME":"20161229","COLLECTION_EQUIPMENT_LONGITUDE":"119.6525360030066","CAPTURE_TIME":1482940891,"AREA_CODE":"330799","APPID":"","APP_SOFTWARE_NAME":"","COLLECTION_EQUIPMENT_ADRESS":"ZG5200161212_1010","COLLECTION_EQUIPMENT_MAC_LONG":68216227659,"ACCESS_AP_ENCRYPTION_TYPE":"","SSID_POSITION":"","PARSER_END_TIME":1483941482275,"X_COORDINATE":"","Y_COORDINATE":"","NETBAR_WACODE":"33070135100001","FILE_TIME":1483005406000,"FILETIME":"2016-12-29 17:56:46.000","NAME":"","ACCESS_AP_CHANNEL":"","NATIVE_PLACE":"","PARSER_START_TIME":1483941442234,"TYPE_NAME":"121212","PARSERENDTIME":"2017-01-09 13:58:02.275","TERMINAL_FIELDSTRENGTH":"0","CAPTURE_HMS":"00:01:31","MODEL":"","ATTRIBUTION":"","REGISTER_PHOTO":"","AUTH_ACCOUNT":"","MINUTE1":24715681,"INGESTTIME":"2017-01-09 13:52:38.095","COLLECTION_EQUIPMENT_MAC":"000FE201074B","COLLECTION_EQUIPMENT_NAME":"ZG520016121212_1010","CACHE_SSID":"","LOGNAME":"146-330701-1482941174-63113-WA_SOURCE_FJ_1001-0.bcp","IMEI_ESN_MEID":"","SESSIONID":"","TYPE_CODE":"skynet","ACCESS_AP_MAC":"","COLLECTION_EQUIPMENT_LATITUDE":"29.084175110946724","IMSI":"","MAC_LONG":240385446749710,"PARSERSTARTTIME":"2017-01-09 13:57:22.234","OS_NAME":"","PARSERTIME":324180,"COLLECTION_EQUIPMENTID":"728489494000FE201074B","CAPTURE_YMD":"2016-12-29","CERTIFICATE_CODE":"","APP_COMPANY_NAME":"","INGEST_TIME":1483941158095,"AREA_NAME":"江南区","CURRENT_PHOTO":"","AUTH_TYPE":"","IDENTIFICATION_TYPE":"","APP_VERSION":"","END_TIME":"","MAC":"DAA11929A60E","BRAND":""}
java.lang.IndexOutOfBoundsException: null
at java.nio.Buffer.checkIndex(Buffer.java:540) ~[na:1.8.0_40]
at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:139) ~[na:1.8.0_40]
at io.indexr.util.UTF8JsonDeserializer.basicNumberCheck(UTF8JsonDeserializer.java:297) ~[indexr-common-0.1.0.jar:na]
at io.indexr.util.UTF8JsonDeserializer.onNumber(UTF8JsonDeserializer.java:303) ~[indexr-common-0.1.0.jar:na]
at io.indexr.util.UTF8JsonDeserializer.parse(UTF8JsonDeserializer.java:186) ~[indexr-common-0.1.0.jar:na]
at io.indexr.segment.rt.UTF8JsonRowCreator.create(UTF8JsonRowCreator.java:148) ~[indexr-segment-0.1.0.jar:na]
at io.indexr.segment.rt.UTF8JsonRowCreator.create(UTF8JsonRowCreator.java:136) ~[indexr-segment-0.1.0.jar:na]
at io.indexr.server.rt.fetcher.Kafka08Fetcher.parseUTF8Row(Kafka08Fetcher.java:159) [indexr-server-0.1.0.jar:na]
at io.indexr.server.rt.fetcher.Kafka08Fetcher.next(Kafka08Fetcher.java:148) [indexr-server-0.1.0.jar:na]
at io.indexr.segment.rt.RealtimeSegment.ingestRows(RealtimeSegment.java:276) [indexr-segment-0.1.0.jar:na]
at io.indexr.segment.rt.RealtimeSegment.realtimeIngest(RealtimeSegment.java:186) [indexr-segment-0.1.0.jar:na]
at io.indexr.segment.rt.RealtimeTable.doIngest(RealtimeTable.java:313) [indexr-segment-0.1.0.jar:na]
at io.indexr.segment.rt.RealtimeTable.ingest(RealtimeTable.java:255) [indexr-segment-0.1.0.jar:na]
at io.indexr.segment.rt.RealtimeTable.access$300(RealtimeTable.java:29) [indexr-segment-0.1.0.jar:na]
at io.indexr.segment.rt.RealtimeTable$1.lambda$run$13(RealtimeTable.java:178) [indexr-segment-0.1.0.jar:na]
at io.indexr.segment.rt.RealtimeTable$1$$Lambda$73/762015796.f(Unknown Source) [indexr-segment-0.1.0.jar:na]
at io.indexr.util.Try.on(Try.java:38) [indexr-common-0.1.0.jar:na]
at io.indexr.segment.rt.RealtimeTable$1.run(RealtimeTable.java:177) [indexr-segment-0.1.0.jar:na]
at io.indexr.util.DelayRepeatTask.runTaskInThread(DelayRepeatTask.java:39) [indexr-common-0.1.0.jar:na]
at io.indexr.segment.rt.RealtimeTable.lambda$startIngest$14(RealtimeTable.java:196) [indexr-segment-0.1.0.jar:na]
at io.indexr.segment.rt.RealtimeTable$$Lambda$72/1479243425.run(Unknown Source) [indexr-segment-0.1.0.jar:na]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_40]

Creating an indexR storage plugin in Drill fails

[screenshot]
Creating the indexR storage plugin in Drill fails with:
Please retry: Error while creating/ updating storage : org.apache.drill.exec.store.AbstractStoragePlugin: method <init>()V not found

spark 2.4 compatible

Hi @flowbehappy,
is there any plan to support Spark 2.4 and future releases?
IndexR is a great tool for big data, but it seems not to be active now?

Thanks.

unable to create storage plugin

Hi, I am testing indexR on MapR v6.0. I managed to get Hive to work with your example, but I could not create the Drill storage plugin in the web console. I tried the following:

{
"type": "indexr",
"enabled": true
}

but the web console kept reporting an error (unable to create/update storage). Did I miss any step?
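
A plausible first check (an assumption; the plugin type can only be registered if its jar is on the classpath) is whether the IndexR storage jars were deployed to every Drillbit before creating the plugin:

# the jar name matches the one reported in another issue on this page; adjust version/path to your install
ls ${drill_home}/jars/drill-indexr-storage-*.jar
# after copying the jars to all nodes, restart the Drillbits and retry the web console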

I could not consume data into my own table, what should I do?

I found these messages in the log:
11:26:31.575 [RT-Ingest-auth_data] INFO io.indexr.segment.rt.RealtimeSegment - Table [auth_data]: consume msgs [6], fail msgs [5], ignore rows [1], produce rows [0], ingest rows [0], final rows [0], took time [1905.71s], consume memory [0K], total rt memoryusage [0M], avaliable memory [50822M], segment: [rts.201701111054.af068ccd-79f4-427c-b24b-e4047db6b6f5.seg]
11:26:31.575 [RT-Ingest-auth_data] INFO io.indexr.segment.rt.RealtimeTable - Finish ingest segment. [ingested: 0, got: 0 (NaN%)]. [table: auth_data, rtsg: rt/rtsg.201701111047.26de00b2-dc8c-4c47-846f-283595f69f78.seg, segment: rts.201701111054.af068ccd-79f4-427c-b24b-e4047db6b6f5.seg]
What can I do to find the reason for fail msgs [5]?
fail msgs [5], ignore rows [1], produce rows [0]

In the table schema, I turned off grouping and removed the dims and metrics.

Thank you very much.
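
For reference, a minimal sketch of what the agg block might look like with grouping turned off (the key names come from the realtime schema quoted further down this page; whether dims/metrics should be empty lists or omitted entirely is an assumption):

"agg": {
  "grouping": false,
  "dims": [],
  "metrics": []
}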

compile error

I'd like to ask about the order in which this source code should be compiled. I hit quite a few compile errors, for example AstListener.java in the indexr-query-opt project,
which reports:
import io.indexr.query.parsers.RQLBaseListener;
import io.indexr.query.parsers.RQLParser;
These two packages do not exist.
Is there a specific guide for compiling the source code?
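
A guess, not confirmed by the project docs: RQLBaseListener and RQLParser look like generated parser sources, so running the provided setup scripts from the project root before compiling indexr-query-opt may produce them:

sh script/setup_lib.sh
sh script/setup_indexr-segment.sh
sh script/setup_indexr-query.sh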

OOM in indexr hive

The current implementation does not manually free packs' off-heap memory after use. Relying on GC for this is not reliable. We should reimplement it with the column API, not the iterator API.
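
A minimal sketch of the deterministic-free pattern this note asks for (DataPack and its methods are hypothetical placeholders, not IndexR's actual API):

// Hypothetical illustration only: DataPack is a placeholder, not IndexR's real class.
public class OffHeapPackExample {

    interface DataPack extends AutoCloseable {
        void readInto(byte[] dest);   // consume the pack's off-heap data
        @Override
        void close();                 // free the off-heap allocation immediately
    }

    static void scanPack(DataPack pack, byte[] dest) {
        // try-with-resources frees the pack's off-heap memory deterministically after use,
        // instead of relying on GC/finalizers to reclaim it eventually (the OOM cause described above).
        try (DataPack p = pack) {
            p.readInto(dest);
        }
    }
}

The point is simply that whoever reads a pack also closes it in the same scope, so the off-heap buffer never waits for a GC cycle.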

indexr + drill 1.8.0 keeps reporting the following error after starting in standalone mode

My problem: after creating a table with the indexr tool, the Drill console keeps printing the error below.
I used this JSON string to create the table:
{
    "schema": {
        "columns": [
            {"name": "date", "dataType": "int"},
            {"name": "d1", "dataType": "string"},
            {"name": "m1", "dataType": "int"},
            {"name": "m2", "dataType": "long"},
            {"name": "m3", "dataType": "float"},
            {"name": "m4", "dataType": "double"}
        ]
    }
}
and in the Drill console I can find the table test:
[screenshot]

I don't know why this occurs, please help me. Thank you.

22:43:52.383 [IndexR-Refresh-Notify] ERROR i.i.server.rt.RealtimeSegmentPool -
java.lang.NoClassDefFoundError: org/apache/spark/unsafe/Platform
at io.indexr.server.rt.RealtimeSegmentPool.refreshLocalRT(RealtimeSegmentPool.java:227) [indexr-server-0.1.0.jar:na]
at io.indexr.server.rt.RealtimeSegmentPool$$Lambda$39/859028208.f(Unknown Source) ~[na:na]
at io.indexr.util.Try.on(Try.java:12) ~[indexr-common-0.1.0.jar:na]
at io.indexr.server.rt.RealtimeSegmentPool.refresh(RealtimeSegmentPool.java:198) [indexr-server-0.1.0.jar:na]
at io.indexr.server.rt.RealtimeSegmentPool.lambda$new$4(RealtimeSegmentPool.java:103) [indexr-server-0.1.0.jar:na]
at io.indexr.server.rt.RealtimeSegmentPool$$Lambda$42/1944303858.run(Unknown Source) [indexr-server-0.1.0.jar:na]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_40]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_40]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_40]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_40]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_40]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_40]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_40]
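
org.apache.spark.unsafe.Platform is provided by Spark's spark-unsafe artifact, so one plausible check (the cause is an assumption) is whether that jar actually made it onto the Drillbit classpath, for example:

# list the third-party jars shipped with the IndexR Drill plugin; adjust the path to your install
ls ${drill_home}/jars/3rdparty | grep -i spark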

Did I miss something? I still get the same error: java.lang.NoClassDefFoundError: io.indexr.compress.bh.BhcompressLibrary

I ran the shell scripts:
sh script/setup_lib.sh
sh script/setup_indexr-segment.sh
sh script/setup_indexr-query.sh

and checked that BhcompressLibrary.java was generated, then recompiled the source.
indexr-segment-0.1.0.jar also contains the class file.
[screenshot]

So I replaced the jar file on every node, including the indexr tools,
and restarted the Drill cluster. After a period of time, the error appears when saving in-memory rows to disk.
Did I miss some step?

The error message:

16:47:28.981 [IndexR-RT-Handle-4] ERROR io.indexr.segment.rt.RTSGroup - Save in memory rows to disk failed. [table: auth_data4, rtsg: rt/rtsg.201701111551.87dee7ce-877d-471c-a9f3-dc4197dcef5d.seg, segment: rts.201701111557.d1838032-d9df-4f96-bdd4-be32434e587e.seg]
java.lang.NoClassDefFoundError: Could not initialize class io.indexr.compress.bh.BhcompressLibrary
at io.indexr.compress.bh.BHCompressor.compressInt(BHCompressor.java:68) ~[indexr-segment-0.1.0.jar:na]

Hi, I get an error: could not scan file /opt/app/drill/jars/drill-indexr-storage-1.8.0.jar

Before I used the HDFS file system, this error did not happen.
I modified indexr.config.properties and restarted the Drill cluster.
I checked ${drill_home}/jars/drill-indexr-storage-1.8.0.jar; it is there.
[screenshot]

14:45:04.491 [main] WARN org.reflections.Reflections - could not scan file /opt/app/drill/jars/drill-indexr-storage-1.8.0.jar!/org/apache/drill/exec/store/indexr/DrillIndexRTable.class with scanner SubTypesScanner
org.reflections.ReflectionsException: could not create class file from DrillIndexRTable.class
at org.reflections.scanners.AbstractScanner.scan(AbstractScanner.java:30) ~[reflections-0.9.8.jar:na]
at org.reflections.Reflections.scan(Reflections.java:217) [reflections-0.9.8.jar:na]
at org.reflections.Reflections.scan(Reflections.java:166) [reflections-0.9.8.jar:na]
at org.reflections.Reflections.<init>(Reflections.java:94) [reflections-0.9.8.jar:na]
at org.apache.drill.common.scanner.ClassPathScanner.scan(ClassPathScanner.java:402) [drill-common-1.9.0.jar:1.9.0]
at org.apache.drill.common.scanner.RunTimeScan.fromPrescan(RunTimeScan.java:62) [drill-common-1.9.0.jar:1.9.0]
at org.apache.drill.common.scanner.ClassPathScanner.fromPrescan(ClassPathScanner.java:452) [drill-common-1.9.0.jar:1.9.0]
at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:283) [drill-java-exec-1.9.0.jar:1.9.0]
at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:272) [drill-java-exec-1.9.0.jar:1.9.0]
at org.apache.drill.exec.server.Drillbit.main(Drillbit.java:268) [drill-java-exec-1.9.0.jar:1.9.0]
Caused by: org.reflections.ReflectionsException: could not create class file from DrillIndexRTable.class
at org.reflections.adapters.JavassistAdapter.createClassObject(JavassistAdapter.java:128) ~[reflections-0.9.8.jar:na]
at org.reflections.adapters.JavassistAdapter.getOfCreateClassObject(JavassistAdapter.java:118) ~[reflections-0.9.8.jar:na]
at org.reflections.adapters.JavassistAdapter.getOfCreateClassObject(JavassistAdapter.java:26) ~[reflections-0.9.8.jar:na]
at org.reflections.scanners.AbstractScanner.scan(AbstractScanner.java:27) ~[reflections-0.9.8.jar:na]
... 9 common frames omitted
Caused by: java.io.IOException: invalid constant type: 18
at javassist.bytecode.ConstPool.readOne(ConstPool.java:1090) ~[javassist-3.12.1.GA.jar:na]
at javassist.bytecode.ConstPool.read(ConstPool.java:1033) ~[javassist-3.12.1.GA.jar:na]
at javassist.bytecode.ConstPool.<init>(ConstPool.java:149) ~[javassist-3.12.1.GA.jar:na]
at javassist.bytecode.ClassFile.read(ClassFile.java:737) ~[javassist-3.12.1.GA.jar:na]
at javassist.bytecode.ClassFile.<init>(ClassFile.java:108) ~[javassist-3.12.1.GA.jar:na]
at org.reflections.adapters.JavassistAdapter.createClassObject(JavassistAdapter.java:126) ~[reflections-0.9.8.jar:na]
... 12 common frames omitted

The data in the table doesn't match the columns

Hi flowbehappy, my problem is here:
[screenshot]
In the top two rows, the values end up in the wrong table schema columns.
Description: the JSON data has 60 keys while the table schema has 63 columns, and the JSON key order does not match the table schema.
In the bottom row, the values are in the correct table schema columns.
Description: the JSON data has 63 keys, which matches the table schema, and the order matches as well.

Could I add you on QQ or email? It would make it easier to contact you.
my qq is 2545392961 email is [email protected]
Glad to hear from you.

Hi, I compiled indexr-0.2.1-bugfix2 and encountered an error!

I compiled indexr-0.2.1-bugfix2 and encountered this error:
Failed to execute goal on project indexr-query-opt: Could not resolve dependencies for project io.indexr:indexr-query-opt:jar:0.2.1: The following artifacts could not be resolved: io.indexr:indexr-common:jar:0.2.1, com.google.guava:guava:jar:16.0.1, org.apache.hadoop:hadoop-client:jar:2.6.0-cdh5.7.1: Could not find artifact io.indexr:indexr-common:jar:0.2.1 in central (https://repo.maven.apache.org/maven2)
I only modified hadoop.version to 2.6.0-cdh5.7.1, but compiling release-0.2.0 succeeded.
Hope to get help, thanks!
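
A likely cause (an assumption): io.indexr:indexr-common:0.2.1 is not published to Maven Central, so it has to come from a local build, and the CDH-flavoured hadoop-client needs Cloudera's repository. A minimal sketch:

# build and install every module from the repository root so indexr-common:0.2.1 lands in the local ~/.m2
mvn clean install -DskipTests

The hadoop-client 2.6.0-cdh5.7.1 artifact additionally requires the Cloudera Maven repository (https://repository.cloudera.com/artifactory/cloudera-repos/) to be declared in the pom or settings.xml.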

IndexR nodes disappear from ZK after running for a period of time

Running the bin/tools.sh -cmd listnode command produces no output.

This could be a ZooKeeper issue; my network experienced a disruption during that period.
I haven't found any other impact besides listnode.

Restarting the cluster fixes this issue.

java.lang.NoClassDefFoundError: Could not initialize class io.indexr.compress.bh.BhcompressLibrary

Hi, I met an error: could not initialize class BhcompressLibrary.
But I can't find the class in the package io.indexr.compress.bh.
My IDE also shows an error: it cannot resolve the class BhcompressLibrary.

[screenshots]

my error message is:
[IndexR-RT-Handle-0] ERROR io.indexr.segment.rt.RTSGroup - Save in memory rows to disk failed. [table: kafka_test, rtsg: rt/rtsg.201701091144.fe1da5a7-3a36-4b6a-a844-1586103ecacb.seg, segment: rts.201701091145.90316abe-1e6c-40b0-9333-197efcdf440b.seg]
java.lang.NoClassDefFoundError: Could not initialize class io.indexr.compress.bh.BhcompressLibrary
at io.indexr.compress.bh.BHCompressor.compressInt(BHCompressor.java:68) ~[indexr-segment-0.1.0.jar:na]
at io.indexr.segment.pack.NumOp.compress(NumOp.java:68) ~[indexr-segment-0.1.0.jar:na]

Issue while compiling => script/setup_indexr-segment.sh: source: not found

I'm facing issues while compiling on Ubuntu 16.04.

/home/anmol.singh/git/indexr/script/setup_indexr-segment.sh: 5: /home/anmol.singh/git/indexr/script/setup_indexr-segment.sh: **source: not found**

/home/anmol.singh/git/indexr/script/setup_lib.sh: 5: /home/anmol.singh/git/indexr/script/setup_lib.sh: **source: not found**

/home/anmol.singh/git/indexr/script/release_indexr-drill.sh: 15: /home/anmol.singh/git/indexr/script/release_indexr-drill.sh: **function: not found** cp: missing destination file operand after '/home/anmol.singh/git/indexr/distribution/indexr-/indexr-drill/jars/3rdparty/' Try 'cp --help' for more information.
/home/anmol.singh/git/indexr/script/release_indexr-drill.sh: 21: /home/anmol.singh/git/indexr/script/release_indexr-drill.sh: Syntax error: "}" unexpected
/home/anmol.singh/git/indexr/script/release_indexr-hive.sh: 5: /home/anmol.singh/git/indexr/script/release_indexr-hive.sh: source: not found
/home/anmol.singh/git/indexr/script/compile_indexr-hive.sh: 5: /home/anmol.singh/git/indexr/script/compile_indexr-hive.sh: source: not found
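
These symptoms usually mean the scripts are being interpreted by dash (Ubuntu's default /bin/sh), which does not understand bash-only keywords such as source and function. A simple workaround sketch, using the script names from the output above:

# invoke the scripts with bash explicitly instead of sh
bash script/setup_lib.sh
bash script/setup_indexr-segment.sh
bash script/compile_indexr-hive.sh

If one script calls another through sh internally, switching /bin/sh back to bash (sudo dpkg-reconfigure dash, answering "No") is another option.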

Here are some problems I found: (1) missing data, (2) the same SQL sometimes returns an error, (3) the web query result does not match the console query result.

(1) Missing data
I sent 47 rows to the table auth_data5 at night,
[screenshot]
but found only 16 in the morning.
[screenshot]

(2) The same SQL sometimes gets an error
[screenshot]

(3) The web query result does not match the console query result
The web query result is 33:
[screenshot]
but the console result is 16:
[screenshot]


Finally, I want to know how to use the HDFS filesystem to store indexr data.
I found the data was saved on local disk, not in HDFS.
[screenshot]
I also copied core-site.xml and hdfs-site.xml to ${drill_home}/conf and ${index_home}/conf.

Can't parallelize fragment: Every mandatory node has exhausted the maximum width per node limit.

Evaluating IndexR on an existing HDP 2.5, Drill 1.9.0 cluster.
Based on the instructions at https://github.com/shunfei/indexr/wiki/Deployment, I created the following table:

{ "schema": { "columns": [ { "name": "host_id", "dataType": "int" }, { "name": "host_name", "dataType": "string" }, { "name": "cpu_count", "dataType": "int" } ] }, "location": "/indexr/segment/hosts", "mode": "vlt", "agg": { "grouping": true, "dims": [ "host_id", "host_name" ], "metrics": [ { "name": "cpu_count", "agg": "min" } ] }, "realtime": { "save.period.minutes": 20, "upload.period.minutes": 60, "max.row.memory": 500000, "max.row.realtime": 10000000, "fetcher": { "type": "kafka-0.8", "topic": "pg_imp1", "number.empty.as.zero": false, "properties": { "zookeeper.connect": "drsahmd2.is.com:2181,drsahmd3.is.com:2181,drsahmd4.is.com:2181", "zookeeper.connection.timeout.ms": "15000", "zookeeper.session.timeout.ms": "40000", "zookeeper.sync.time.ms": "5000", "fetch.message.max.bytes": "1048586", "auto.offset.reset": "largest", "auto.commit.enable": "true", "auto.commit.interval.ms": "5000", "group.id": "c2" } } } }
All OK, no errors in the logs, for the deployment or for enabling the realtime setup.
But QUERIES fail with the following error, with or without realtime enabled.
Our use case: we dynamically create tables and expect queries to return empty results IF there is NO DATA yet.

select * from indexr.hosts limit 10

SHOULD return NO RESULTS
but instead FAILS

2017-09-13 18:40:56,232 [2646d19f-bd5d-5eb5-ee18-90b2022076e2:foreman] ERROR o.a.d.e.p.f.HardAffinityFragmentParallelizer - Can't parallelize fragment: Every mandatory node has exhausted the maximum width per node limit. Endpoint pool: {address: "drsahmd3.is.com" user_port: 31010 control_port: 31011 data_port: 31012 =EndpointAffinity [endpoint=address: "drsahmd3.is.com" user_port: 31010 control_port: 31011 data_port: 31012, affinity=0.0, mandatory=true, maxWidth=2147483647]} Assignment so far: {address: "drsahmd3.is.com" user_port: 31010 control_port: 31011 data_port: 31012 =1} Width: 2
2017-09-13 18:40:56,300 [2646d19f-bd5d-5eb5-ee18-90b2022076e2:foreman] ERROR o.a.drill.exec.work.foreman.Foreman - SYSTEM ERROR: PhysicalOperatorSetupException: Can not parallelize fragment.

We have 3 data nodes, replication factor 3, and 3 colocated Drillbits.

bin/tools.sh -cmd listrs -t hosts
rt/rtsg.201709131831.94ecc793-b374-4294-bfe5-64183d62fdb2.seg:

host: drsahmd3.is.com
rowCount: 0
version: 8
mode: vlt

bin/tools.sh -cmd lisths -t hosts
bin/tools.sh -cmd rttnode -t hosts
hosts:

drsahmd3.is.com

bin/tools.sh -cmd listnode
drsahmd3.is.com
drsahmd4.is.com
drsahmd5.is.com

Issue: Is this expected behaviour for EMPTY tables?

P.S.: Thank you guys for open sourcing such amazing work. (We know the hard work that goes into this, as we have also built and evolved a system with the same objective over the past few years.)

Maintenance inquiry

Hello, may I ask whether this project is still being maintained?

Can I query with Impala?

Impala can query Hive,
and Hive can query indexR,
so can I query through impala -> hive -> indexR?
Also, could a Chinese version of the user manual be provided?

About indexr index

I saw in an article that indexr supports three kinds of indexes: packRSIndex, packExtIndex, and outIndex. How is each of these indexes specified? I did not find any explanation in the documentation. Thanks!

About Indexr data

I can see indexr table files on HDFS, and I also see many indexr files locally; files like indexr_dictmerge9059835984776223950 are generated automatically, and there are a lot of them.
Also, regarding indexr's rt directory: how often is it uploaded to HDFS?
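
For reference, the realtime table schema quoted elsewhere on this page contains two period settings; my reading (an assumption) is that save.period.minutes controls how often in-memory rows are persisted locally and upload.period.minutes controls how often realtime segments are uploaded to HDFS:

"realtime": {
  "save.period.minutes": 20,
  "upload.period.minutes": 60
}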

Does the indexr SQL process output the wrong type?

  • drill-1.10.0
  • indexr-0.6.1

When I run some queries with Drill, I try this SQL:

select tb2.col11, count(1) from 
(
select FLATTEN(convert_fromJSON('[1,1,3,4,5]')) as col11 from {table_name}
) tb2 group by tb2.col11
  • If {table_name} is "(values(1))", it works.
  • if {table_name} is a parquet table such as dfs./opt/drill/apache-drill-1.10.0/sample-data/region.parquet, it still works.
  • if {table_name} is an indexr table, there is a problem:
java.sql.SQLException: SYSTEM ERROR: UnsupportedOperationException: Map, Array, Union or repeated scalar type should not be used in group by, order by or in a comparison operator. Drill does not support compare between MAP:REPEATED and MAP:REPEATED.

Then I did another test. This one works:

select tb2.col11 from (
select FLATTEN(split(col1, '0')) as col11 from 
    (select `col` as col1 from <indexr table> where ... limit 1000 ) as tb1
) tb2 group by tb2.col11

This one (without 'limit') hits the same problem:

select tb2.col11 from (
select FLATTEN(split(col1, '0')) as col11 from 
    (select `col` as col1 from <indexr table> where ... ) as tb1
) tb2 group by tb2.col11

This one also hits the same problem:

select tb2.col11 from (
select FLATTEN(split(`col`, '0')) as col11 from <indexr table>  where ... limit 1000 
) tb2 group by tb2.col11

Does this problem come from indexr-query-opt? It seems the indexr SQL process outputs the wrong type.
