vearch / vearch

Distributed vector search for AI-native applications

License: Apache License 2.0

Shell 0.66% Go 39.89% Dockerfile 0.02% Python 23.37% CMake 0.98% C++ 33.42% C 0.20% SWIG 0.47% Jupyter Notebook 0.96% Makefile 0.03%

vectors vector-search cloud-native document-retrieval embeddings vector-database hybrid-search rag retrieval-augmented-generation ai-native

vearch's Introduction


Overview

Vearch is a cloud-native distributed vector database for efficient similarity search of embedding vectors in your AI applications.

Key features

Hybrid search: Both vector search and scalar filtering.
Performance: Fast vector retrieval - search from millions of objects in milliseconds.
Scalability & Reliability: Replication and elastic scaling out.
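As a rough illustration of what hybrid search means, the sketch below combines a scalar filter with a vector similarity ranking over an in-memory list. It is not Vearch's implementation or API: the function hybrid_search, the document layout, and the use of an inner-product score are all assumptions made for this example.

```python
def hybrid_search(docs, query_vec, predicate, k=3):
    """Return the top-k (score, id) pairs among docs that pass the scalar filter.

    Hypothetical helper: filters on scalar fields first, then ranks the
    survivors by inner product with the query vector (higher is better).
    """
    candidates = [d for d in docs if predicate(d["fields"])]
    scored = [
        (sum(a * b for a, b in zip(d["vector"], query_vec)), d["id"])
        for d in candidates
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy data: two "zebra" documents and one "horse" document.
docs = [
    {"id": "a", "vector": [1.0, 0.0], "fields": {"label": "zebra"}},
    {"id": "b", "vector": [0.9, 0.1], "fields": {"label": "horse"}},
    {"id": "c", "vector": [0.0, 1.0], "fields": {"label": "zebra"}},
]
hits = hybrid_search(docs, [1.0, 0.0], lambda f: f["label"] == "zebra", k=2)
```

Document "b" is close to the query but is excluded by the filter, so only "a" and "c" are ranked.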

Document

Restful APIs

Tutorial | Reference documentation (Chinese)

OpenAPIs

Tutorial

SDK

Usage cases

Use Vearch as a memory backend

Real world Demos

VisualSearch: Vearch can be leveraged to build a complete visual search system to index billions of images. The image retrieval plugin for object detection and feature extraction is also required.

Quick start

Deploy vearch cluster on k8s

Add charts through the repo

$ helm repo add vearch https://vearch.github.io/vearch-helm
$ helm repo update && helm install my-release vearch/vearch

Add charts from local

$ git clone https://github.com/vearch/vearch-helm.git && cd vearch-helm
$ helm install my-release ./charts -f ./charts/values.yaml

Start by docker-compose

standalone mode

$ cd cloud
$ cp ../config/config.toml .
$ docker-compose --profile standalone up -d

cluster mode

$ cd cloud
$ cp ../config/config_cluster.toml .
$ docker-compose --profile cluster up -d

Deploy by docker: To get started quickly with the vearch Docker image, see DeployByDocker.

Compile from source: To build vearch from source code, see SourceCompileDeployment.

Components

Vearch Architecture

Master: Responsible for schema management, cluster-level metadata, and resource coordination.

Router: Provides the RESTful API (upsert, delete, search, and query) and handles request routing and result merging.

PartitionServer (PS): Hosts document partitions with Raft-based replication. Gamma, the core vector search engine, is built on Faiss and stores, indexes, and retrieves vectors and scalars.
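The result merging done by the Router can be pictured as a scatter/gather step: each partition returns its own best hits, and the per-partition lists are merged into a single global top-k. The sketch below is a minimal illustration of that idea, not Vearch's code. The function name merge_partition_results, the (score, doc_id) tuple layout, and the assumption that a higher score is better are all hypothetical.

```python
import heapq


def merge_partition_results(partition_hits, k):
    """Merge per-partition hit lists of (score, doc_id) into a global top-k.

    Assumes higher scores are better, as with an inner-product metric.
    """
    all_hits = (hit for hits in partition_hits for hit in hits)
    return heapq.nlargest(k, all_hits, key=lambda hit: hit[0])


# Hypothetical results returned by two partitions for the same query.
partition_hits = [
    [(0.95, "p0-doc3"), (0.80, "p0-doc7")],
    [(0.99, "p1-doc1"), (0.60, "p1-doc4")],
]
top = merge_partition_results(partition_hits, k=3)
```

For a distance metric such as L2, where smaller is better, heapq.nsmallest would be the matching choice.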

Reference

Reference to cite when you use Vearch in a research paper:

@misc{li2019design,
      title={The Design and Implementation of a Real Time Visual Search System on JD E-commerce Platform},
      author={Jie Li and Haifeng Liu and Chuanghua Gui and Jianyu Chen and Zhenyun Ni and Ning Wang},
      year={2019},
      eprint={1908.07389},
      archivePrefix={arXiv},
      primaryClass={cs.IR}
}

Community

You can report bugs or ask questions in the issues page of the repository.

For public discussion of Vearch or for questions, you can also send email to [email protected].

Our Slack: https://vearchwrokspace.slack.com

Known Users

You are welcome to register your company name in this issue: #230 (listed in order of registration)

License

Licensed under the Apache License, Version 2.0. For detail see LICENSE and NOTICE.

vearch's People

Contributors

Stargazers

Watchers

Forkers

chenjianyu wding109 fossabot zhuzhengyi ares2013 shineer ssscottt xiedake bill-wangbiao skypigltp vivian7755 guichuanghua trendingtechnology zhangzhenhua freewind2016 mervinkid ouyangchucai ling-cv shengzhang90 dreamerzp prpankajsingh qf8505 zhouyonglong allensmile jingmouren zxlzr tchen7 hswuhao jangocheng kinkir tchigher hannson gtrevg fayeeshine lujiawena gitter-badger hpsdy 0xqq qbzenker lindafeiiiiii zakra xealml beesitech f0ffff paladinzh songlei150 alirezabayatmk jiaqiang awesome-archive boluoyu raceli sunyaofei d1jk wanfangdata liuwqiang lynnapan lucifer1978 guoanwu hu-abc zhyon404 bruinxiong xiaming9880 zhxmzm xeron56 fengbeihong aelous whatisnull xusk suresv xqtbox jdkbean svanschalkwyk cocodee fpzh2011 supuchun wallaceliu jokechild 292388900 lxx1220 chengduozh zej0709 ethan199111 13meimei nickwwww cydxx lishitong1702 layningning puper kevintony001 regalius qianlinjun jupanlee feifeibear jianbotang blake86 kioco codingwangfeng zhanghaohit ericdoug-qi paul1989889

vearch's Issues

dynamic rebalancing and resharding

Implement dynamic shard splitting and migration.

Benchmark conditions question

Hello!
I'm interested in your system, however I have some questions about your benchmark conditions.

Did you use 5 bare-metal servers for the VGG100M tests and obtain ~120 QPS without filtering by an integer field?
What is "integer field filtering"? :)
Which CPU and how much RAM do your servers have?
How much RAM do the indexes use, and how large is the database?
Is it true that your system maintains a full copy of the database in each PartitionServer?

Change max_size for a space after it is created?

When I create a space, the system requires a max_size for the space. I set max_size to 1000.
Example:
"engine": {"name": "gamma", "metric_type": method,"max_size":1000}

When I index items into the system, the total number of documents is limited to 1000.
How can I raise the max_size limit so that I can index new items (for example, max_size=50000)?

Importing a CSV: setting detection to false has no effect, detection still runs

curl -XPOST -H "content-type:application/json" -d '{
"method": "bulk",
"imageurl": "../images/image_retrieval/test.csv",
"detection": false
}' http://127.0.0.1:4101/test/test/_insert

Compile Question

When I run "go build -o vearch",

I get "ps/engine/gammacb/util.go:94:26: too many arguments in call to _Cfunc_MakeVectorInfo
have (*_Ctype_struct_ByteArray, uint32, _Ctype_int, *_Ctype_struct_ByteArray, *_Ctype_struct_ByteArray, *_Ctype_struct_ByteArray, _Ctype_struct_ByteArray)
want (_Ctype_struct_ByteArray, uint32, _Ctype_int, *_Ctype_struct_ByteArray, *_Ctype_struct_ByteArray, *_Ctype_struct_ByteArray)
"

Look forward to your reply.

Can I use vearch to search disk-based data?

Dear team,

I want to build a disk-based system (1 TB-scale datasets). I checked the vearch configuration, and it has these settings for creating a partition:

max_size: maximum number of documents for each partition
index_size: default 100000; if index_size == 0, the partition is not indexed automatically; if the number of inserted documents >= index_size, it is indexed automatically

max_size depends on the amount of memory available on the partition server at initialization.
With index_size = 0 the partition is not indexed automatically. Does that mean it does not use the server's RAM?

How can I configure vearch to search disk-based data?
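The index_size rule quoted in this question can be written down as a small predicate. This is only a restatement of the description above, not code taken from vearch, and the function name should_auto_index is hypothetical.

```python
def should_auto_index(doc_count, index_size=100000):
    """Restate the rule quoted in the question.

    index_size == 0 disables automatic indexing; otherwise indexing is
    triggered once the number of inserted documents reaches index_size.
    """
    if index_size == 0:
        return False
    return doc_count >= index_size


below_threshold = should_auto_index(99_999)       # default index_size
at_threshold = should_auto_index(100_000)         # default index_size
disabled = should_auto_index(1_000_000, index_size=0)
```

Note that the rule only says when indexing starts. It says nothing about where vectors are stored, so it does not by itself answer the RAM question.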

not enough PS: how do I resolve this so that the hello world example runs?

About to connect() to 127.0.0.1 port 8817 (#0)
Trying 127.0.0.1...
Connected to 127.0.0.1 (127.0.0.1) port 8817 (#0)
Server auth using Basic with user 'root'

PUT /space/test_vector_db/_create HTTP/1.1
Authorization: Basic cm9vdDpzZWNyZXQ=
User-Agent: curl/7.29.0
Host: 127.0.0.1:8817
Accept: */*
content-type: application/json
Content-Length: 794

upload completely sent off: 794 out of 794 bytes
< HTTP/1.1 586 status code 586
< Content-Type: application/json
< Date: Fri, 29 Nov 2019 11:24:13 GMT
< Content-Length: 34
<
Connection #0 to host 127.0.0.1 left intact
{"code":586,"msg":"not enough PS"}#

1129 19:24:13.779994 8686 master_cache.go:376] [INFO] remove space cache dbID:[1] space:[7]
D1129 19:24:13.780293 8686 cluster_api.go:294] [DEBUG] create space, db:
E1129 19:24:13.780307 8686 cluster_api.go:295] [ERROR] createSpaceService err: not enough PS
[GIN] 2019/11/29 - 19:24:13 | 586 | 5.004712717s | 127.0.0.1 | PUT /space/test_vector_db/_create
D1129 19:24:18.293195 8686 startup.go:108] [DEBUG] mem.Alloc:14567416 mem.TotalAlloc:28089928 mem.HeapAlloc:14567416 mem.HeapSys:63766528 routing :106
D1129 19:24:24.281897 8686 schedule_job.go:75] [DEBUG] PartitionIds not change, do nothing!
D1129 19:24:28.293387 8686 startup.go:108] [DEBUG] mem.Alloc:11770096 mem.TotalAlloc:28271904 mem.HeapAlloc:11770096 mem.HeapSys:63766528 routing :106
D1129 19:24:30.076020 8686 schedule_job.go:136] [DEBUG] Start clean task
I1129 19:24:30.076214 8686 schedule_job.go:36] [INFO] Start Walking Partitions!
I1129 19:24:30.076226 8686 schedule_job.go:58] [INFO] Complete Walking Partitions!
I1129 19:24:30.076427 8686 schedule_job.go:63] [INFO] Start Walking Spaces!
I1129 19:24:30.076440 8686 schedule_job.go:76] [INFO] Complete Walking Spaces!
I1129 19:24:30.076638 8686 schedule_job.go:86] [INFO] Start Walking Servers!
I1129 19:24:30.076648 8686 schedule_job.go:100] [INFO] Complete Walking Servers!
D1129 19:24:38.293590 8686 startup.go:108] [DEBUG] mem.Alloc:12016000 mem.TotalAlloc:28517808 mem.HeapAlloc:12016000 mem.HeapSys:63766528 routing :106
D1129 19:24:48.293766 8686 startup.go:108] [DEBUG] mem.Alloc:12213984 mem.TotalAlloc:28715792 mem.HeapAlloc:12213984 mem.HeapSys:63766528 routing :106
I1129 19:24:48.355768 8686 schedule_job.go:104] [INFO] Receive keepalive, leaseId: 6509230568158846988, ttl:188
D1129 19:24:58.293949 8686 startup.go:108] [DEBUG] mem.Alloc:12379072 mem.TotalAlloc:28880880 mem.HeapAlloc:12379072 mem.HeapSys:63766528 routing :106
D1129 19:25:08.294122 8686 startup.go:108] [DEBUG] mem.Alloc:12545384 mem.TotalAlloc:29047192 mem.HeapAlloc:12545384 mem.HeapSys:63766528 routing :106
D1129 19:25:18.294299 8686 startup.go:108] [DEBUG] mem.Alloc:12740464 mem.TotalAlloc:29242272 mem.HeapAlloc:12740464 mem.HeapSys:63766528 routing :106
D1129 19:25:24.282022 8686 schedule_job.go:75] [DEBUG] PartitionIds not change, do nothing!

Add MAINTAINERS.md

Wonderful! A quick question: how can I use vearch in a C++ project?

"invalid operation" error when compiling with Go

Hi, I ran into the following errors while trying to compile. How can I fix them?

[root@localhost vearch]# go build -o vearch
github.com/vearch/vearch/master/store
/root/go/pkg/mod/github.com/vearch/[email protected]/master/store/distlock.go:164:15: invalid operation: ev.Type == "go.etcd.io/etcd/mvcc/mvccpb".DELETE (mismatched types "github.com/coreos/etcd/mvcc/mvccpb".Event_EventType and "go.etcd.io/etcd/mvcc/mvccpb".Event_EventType)
/root/go/pkg/mod/github.com/vearch/[email protected]/master/store/etcdstore.go:201:39: cannot use store.cli (type *"go.etcd.io/etcd/clientv3".Client) as type *"github.com/coreos/etcd/clientv3".Client in argument to concurrency.NewSTM

github.com/vearch/vearch/config
/root/go/pkg/mod/github.com/vearch/[email protected]/config/config.go:222:5: cfg.PreVote undefined (type *embed.Config has no field or method PreVote)

Data persistence

After I create a table in vearch and insert data, is the data still there if the machine restarts?

Add version/release strategy

Question about queries combined with filtering

  if (condition->range_query_result &&
      condition->range_query_result->GetAllResult().size() == 1 &&
      condition->range_query_result->GetAllResult()[0].Size() < 50000) {
    const std::vector<int> docid_list = condition->range_query_result->ToDocs();

gamma_index_ivfpq.cc L429
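Is filtering applied only when the range query returns a single result set and that set has fewer than 50000 entries? If it is larger, does the query bypass the filter entirely?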

    if (range_index_ptr_ != nullptr) {
      is_filterable = [this](int doc_id) -> bool {
        return (bitmap::test(docids_bitmap_, doc_id) ||
                (not range_index_ptr_->Has(doc_id)));
      };
    } else {
      is_filterable = [this](int doc_id) -> bool {
        return (bitmap::test(docids_bitmap_, doc_id));
      };
    }

gamma_index_ivfpq.h L541

is_filterable is set, but it is never called anywhere. Is that expected?

Visual Search API : insert data into space

I am getting the error below from the Visual Search insert API:

INFO:main:request -- /search/search_space/_insert -- { "imageurl": "COCO_val2014_000000000042.jpg"}
DEBUG:main: File "main.py", line 323, in pkg_result assert isinstance(response_body_list, list), response_body_list AssertionError: {"status": 550, "error": {"index": "", "index_uuid": "", "shard": "0", "type": "", "reason": "descriptor 'init' requires a 'super' object but received a 'str'"}}
DEBUG:main:response -- /search/search_space/_insert -- {"status": 550, "error": {"index": "", "index_uuid": "", "shard": "0", "type": "", "reason": "descriptor 'init' requires a 'super' object but received a 'str'"}}

I also got the error below when I ran bash ./bin/run.sh video:

FileNotFoundError: [Errno 2] No such file or directory: '/root/go/src/github.com/vearch/vearch/plugin/model/20180402-114759'
INFO:main:request -- /video/video/_create -- {"db": true, "method": 1, "columns": {"imageurl": {"type": "keyword"}, "boundingbox": {"type": "keyword"}, "label": {"type": "keyword"}}, "feature": {"type": "vector", "filed": "imageurl", "model_id": "vgg16", "dimension": 512}}
DEBUG:main:response -- /video/video/_create -- {'code': 200, 'db_msg': 'success', 'space_msg': 'success'}
{'code': 200, 'db_msg': 'success', 'space_msg': 'success'}
ffmpeg version 3.4.6-0ubuntu0.18.04.1 Copyright (c) 2000-2019 the FFmpeg developers
built with gcc 7 (Ubuntu 7.3.0-16ubuntu3)
configuration: --prefix=/usr --extra-version=0ubuntu0.18.04.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --enable-gpl --disable-stripping --enable-avresample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librubberband --enable-librsvg --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-omx --enable-openal --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libopencv --enable-libx264 --enable-shared
libavutil 55. 78.100 / 55. 78.100
libavcodec 57.107.100 / 57.107.100
libavformat 57. 83.100 / 57. 83.100
libavdevice 57. 10.100 / 57. 10.100
libavfilter 6.107.100 / 6.107.100
libavresample 3. 7. 0 / 3. 7. 0
libswscale 4. 8.100 / 4. 8.100
libswresample 2. 9.100 / 2. 9.100
libpostproc 54. 7.100 / 54. 7.100
pycache: Is a directory
ffmpeg process is killed!

Please help me out

Python plugin visual search API returns "Expecting value: line 1 column 1 (char 0)" error

Hi,
After deploying the plugin with bash ./bin/run.sh image, I tried sending the create DB request from the example:

curl -XPOST -H "content-type:application/json" -d '{
    "db": true,
    "method": 0,
    "columns": {
        "imageurl": {
            "type": "keyword"
        },
        "boundingbox": {
            "type": "keyword"
        },
        "label": {
            "type": "keyword"
        }
    },
    "feature": {
        "type": "vector",
        "filed": "imageurl",
        "model_id": "vgg16",
        "dimension": 512
    }
}' http://127.0.0.1:4101/test/test/_create

But it returns the following error:

INFO:main:request -*- /test/test/_create -*- {
    "db": true,
    "method": 0,
    "columns": {
        "imageurl": {
            "type": "keyword"
        },
        "boundingbox": {
            "type": "keyword"
        },
        "label": {
            "type": "keyword"
        }
    },
    "feature": {
        "type": "vector",
        "filed": "imageurl",
        "model_id": "vgg16",
        "dimension": 512
    }
}
DEBUG:main:  File "/usr/lib64/python3.6/json/decoder.py", line 357, in raw_decode     raise JSONDecodeError("Expecting value", s, err.value) from None json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
DEBUG:main:response -*- /test/test/_create -*- Expecting value: line 1 column 1 (char 0)

I get the same error when I put the JSON in a file and pass it to curl.
Could this be a Python version issue? Is there a version requirement?
Thanks!

unexpected end of JSON input

Creating a space fails with this error, although the JSON is correctly formatted.

create schema problem

Hi,
I am trying to create a schema, following the example in 'APILowLevel.md'.
Create schema:
curl -v --user "root:secret" -H "content-type: application/json" -XPUT -d'
{
"name": "tpy",
"dynamic_schema": "strict",
"partition_num": 2,
"replica_num": 1,
"engine": {"name":"gamma", "max_size":1000000,"nprobe":10,"metric_type":-1,"ncentroids":-1,".....

fails with the message:
E1103 13:16:54.846912 21292 cluster_api.go:340] [ERROR] createSpaceService err: revoke lease 6509229987518300959 error :context deadline exceeded

Can anyone help me fix this error?

Thanks,
celine

two types of tables: indexed and stored

We need to provide two classes of tables: indexed and stored. Indexed tables have automatic vector indexing, while stored tables have no vector indexing and only store key-vector pairs for quick key lookup.
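The distinction proposed here can be sketched with two small classes. This is an illustration of the idea only. The class names StoredTable and IndexedTable are hypothetical, and a linear scan stands in for a real vector index.

```python
class StoredTable:
    """Key -> vector store with no vector index; supports key lookup only."""

    def __init__(self):
        self._rows = {}

    def put(self, key, vector):
        self._rows[key] = list(vector)

    def get(self, key):
        return self._rows.get(key)


class IndexedTable(StoredTable):
    """Adds similarity search on top of key lookup.

    A brute-force scan over squared L2 distance is used here as a
    placeholder for an actual vector index.
    """

    def search(self, query, k=1):
        def squared_l2(vector):
            return sum((a - b) ** 2 for a, b in zip(vector, query))

        ranked = sorted(self._rows.items(), key=lambda kv: squared_l2(kv[1]))
        return [key for key, _ in ranked[:k]]


stored = StoredTable()
stored.put("u1", [0.1, 0.2])

indexed = IndexedTable()
indexed.put("a", [0.0, 0.0])
indexed.put("b", [1.0, 1.0])
nearest = indexed.search([0.9, 0.9], k=1)
```

Both tables answer get(key), but only the indexed table can answer "which keys are closest to this vector".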

Feature request: _delete_by_query api

Seriously awesome project! Thank you for building this and putting this out.

Was wondering if you're planning to implement a "delete_by_query" api, akin to Elasticsearch's? I am hoping to delete documents by an indexed keyword field.

Thanks!

Add Travis CI to the repo

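Compilation and installation problem

Compilation fails.
The file vearch/engine/gamma/index/gamma_index_ivfpq.h contains #include "faiss/AuxIndexStructures.h", but the header is actually located at faiss/impl/AuxIndexStructures.h. Is this a faiss version issue, or did I get one of the steps wrong?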

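Abnormal score returned by a query

The space I created uses metric_type 0. My understanding is that the score returned by a vector search should be a number less than 1, but the search actually returned values like 19068.90234375. Is my understanding wrong, or is there a problem with how I am using it?

Command used to create the space: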

curl -XPUT -H "content-type: application/json" -d' { "name": "vgg_space", "partition_num": 1, "replica_num": 1, "engine": { "name": "gamma", "index_size": 70000, "max_size": 2000000, "nprobe": 30, "metric_type": 0, "ncentroids": 256, "nsubvector": 64 }, "properties": { "imgpath": { "type": "keyword", "index": "true" }, "feature": { "type": "vector", "dimension": 128, "store_type": "RocksDB", "store_param": { "cache_size": 2000 } } } } ' http://127.0.0.1:8817/space/afdb/_create

Vector search result:
{ "took": 1, "timed_out": false, "_shards": { "total": 1, "failed": 0, "successful": 1 }, "hits": { "total": 3, "max_score": 19068.90234375, "hits": [ { "_index": "afdb", "_type": "vgg_space", "_id": "AW7FmgiAoEgcpgFzKImk", "_score": 19068.90234375, "_extra": { "vector_result": [ { "field": "feature", "source": "", "score": 19068.90234375 } ] }, "_version": 1, "_source": { "imgpath": "F:/dataset/antifraud/segmented/5FJ-2-3.jpg" } },
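The issue does not say whether the feature vectors were L2-normalized before insertion. As a general property (and not a statement about how vearch defines metric_type 0), an inner product is bounded by 1 in absolute value only when both vectors have unit length, so un-normalized features can produce scores in the thousands. The sketch below shows the effect with made-up vectors.

```python
import math


def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(v):
    """Scale v to unit length (assumes v is not the zero vector)."""
    norm = math.sqrt(inner_product(v, v))
    return [x / norm for x in v]


# Made-up, un-normalized feature vectors.
raw_a = [120.0, 45.0, 80.0]
raw_b = [110.0, 50.0, 75.0]

raw_score = inner_product(raw_a, raw_b)
cosine = inner_product(l2_normalize(raw_a), l2_normalize(raw_b))
```

Here raw_score is far above 1, while the score of the normalized vectors is a cosine similarity and cannot exceed 1.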

Docker example issue

Hi,

After starting the docker example with

./run_docker.sh

I get the following output:

... Successfully tagged ansj/vearch:0.2 Start service by all in one model good luck service is ready you can visit http://127.0.0.1:9001 to use it

The following cURL

curl -XPOST -H "content-type:application/json" -d '{ "db": true, "method": 0, "columns": { "imageurl": { "type": "keyword" }, "boundingbox": { "type": "keyword" }, "label": { "type": "keyword" } }, "feature": { "type": "vector", "filed": "imageurl", "model_id": "vgg16", "dimension": 512 } }' http://127.0.0.1:9001/test/test/_create

fails with the message

{"_index":"test","_type":"test","_id":"_create","status":550,"error":{"index":"","index_uuid":"","shard":"0","type":"","reason":"db:[test] space:[test] err:[space not exists]"}}

Please note that the example cURL on https://github.com/vearch/vearch/blob/master/docs/Quickstart.md mentions port 4101, is that correct?

Any idea how I can debug this?

Thanks,
Bart

Compilation problem

cannot find package "github.com/tiglabs/log"

This package can no longer be found on GitHub.

None of the masters has the same ip address as current local master server's ip

When I use vearch on two servers, I get the following error message:

None of the masters has the same ip address as current local master server's ip

Please help me!

Questions about partition servers

I have two questions about deploying the system. I deployed vearch on 2 partition servers with 30 million items and ran into these issues:

1. Of the 2 partition servers, server A has a dump.done file, while server B has a dumping.done file. When I reload partition server B, I get this error:
ERROR 2019-11-01 15:16:56,806 gamma_engine.cc:736 dump.done cannot be found in [GO_PATH/data_vearch/data/data/29/2019-10-27-23:04:35]
2. When I run a search through the system, each request takes about 1000 ms with the default configuration (IVFPQ parameters: metric_type=1, nprobe=10, ncentroids=256, nsubvector=32, nbits_per_idx=8) and a vector dimension of 512. I think it is slow because the 512-dimensional vectors are split into 32 sub-vectors. Can you recommend a suitable configuration for this amount of data?
Query log from the router: [ps_space_service.go:410] [DEBUG] Max search partitionID:[30] use time:[1118]
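Some back-of-the-envelope arithmetic for the parameters in this question may help frame the discussion. The sketch uses the standard IVFPQ relationships (code size = nsubvector * nbits_per_idx / 8 bytes, and roughly nprobe inverted lists scanned per query). It assumes the 30 million vectors are split evenly across the 2 partitions and that inverted lists are uniformly sized. Neither assumption is confirmed by the issue, and the helper pq_code_bytes is hypothetical.

```python
def pq_code_bytes(dimension, nsubvector, nbits_per_idx):
    """Return (dimensions per sub-vector, bytes per PQ code)."""
    if dimension % nsubvector != 0:
        raise ValueError("dimension must be divisible by nsubvector")
    sub_dim = dimension // nsubvector
    code_bytes = nsubvector * nbits_per_idx / 8
    return sub_dim, code_bytes


sub_dim, code_bytes = pq_code_bytes(512, 32, 8)

# Raw float32 vector size versus compressed code size.
raw_bytes = 512 * 4
total_code_gb = 30_000_000 * code_bytes / 1e9

# Assumed even split over 2 partitions and uniform inverted lists.
vectors_per_partition = 30_000_000 // 2
vectors_per_list = vectors_per_partition / 256
scanned_per_query = 10 * vectors_per_list
```

Under these assumptions each query scans roughly 586,000 codes per partition, because 256 centroids leave about 58,600 vectors in every inverted list.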

Router adds gRPC interface

Simplify the docker deployment

We may consider:

add docker compose
simplify the entire deployment process by just running a single script like docker/run_docker.sh

If possible, write some case studies and post them here

Go compilation problem

The last step failed while compiling the Go code

APIVisualSearch.md: problem with "Search similar result from space"

request

curl -XPOST -H "content-type:application/json" -d '{
"imageurl": "../images/image_retrieval/test/COCO_val2014_000000123599.jpg",
"detection": true,
"score": 0.5,
"filter": [
{
"term": {
"label": {
"value": "zebra"
}
}
}
],
"size": 5
}' http://127.0.0.1:4101/test/test/_search

The request above returns no results; changing it to the form below works:
curl -XPOST -H "content-type:application/json" -d '{
"imageurl": "../images/image_retrieval/test/COCO_val2014_000000123599.jpg",
"detection": true,
"score": 0.5,
"filter": [
{
"term": {
"label": "zebra"
}
}
],
"size": 5
}' http://127.0.0.1:4101/test/test/_search

Also, the label field must be indexed; otherwise this error is returned: "errors": {"internal error": {}}}
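For scripted requests, the working form reported above can be built programmatically. The helper build_search_body below is hypothetical and only assembles the JSON payload with the flat "term" filter that the reporter found to work. It does not send anything to a server.

```python
import json


def build_search_body(imageurl, label, size=5, score=0.5, detection=True):
    """Assemble a _search payload using the flat term form: {"term": {"label": value}}."""
    return {
        "imageurl": imageurl,
        "detection": detection,
        "score": score,
        "filter": [{"term": {"label": label}}],
        "size": size,
    }


body = build_search_body(
    "../images/image_retrieval/test/COCO_val2014_000000123599.jpg", "zebra"
)
payload = json.dumps(body)
```

The resulting payload string can be passed to curl with -d or to any HTTP client, targeting the same _search URL used in the requests above.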

Hello, how is brute-force search implemented?

Provide link to vearch's website

Add ROADMAP.md

Add Initial committers

name, email, organization, and how long have they been working on project

Provide mailing list

Add CONTRIBUTING.md

Explain how others can contribute to the project

you can use hnsw as ivfpq coarse_quantizer

from faiss
HNSW obtains much better speed / precision operating points than IVFFlat (eg. 0.020 ms vs. 0.140 ms to get > 0.9 recall at 1), at a higher memory cost

Help, I cannot build successfully because of undefined reference.

host@host:~/Practice/go/src/github.com/vearch/vearch$ go1.11.2 build -a --tags=vector -o vearch
# github.com/vearch/vearch/ps/engine/gammacb
/home/host/workplace/Practice/go/src/github.com/vearch/vearch/ps/engine/gammacb/lib/lib/libgamma.so: undefined reference to `faiss::IndexIVFPQ::encode_vectors(long, float const*, long const*, unsigned char*) const'
collect2: error: ld returned 1 exit status

OS: ubuntu 18.04
GCC: 7.4.0 (os)
Golang: go1.11.2 (go1.10 get)
Openblas: latest on github (compile && installation)
Faiss: v1.5.3 release (compile && installation, --without-cuda due to no GPU)
vearch: latest on github

gamma.so is built successfully and is definitely at the target location.
Could anyone give me some advice? Thanks very much!

create partition err: create gamma table has err:[4294967294]

I tried to create a table using
"curl -XPOST -H "content-type:application/json" -d '{
"db": true,
"method": 0,
"columns": {
"imageurl": {
"type": "keyword"
},
"boundingbox": {
"type": "keyword"
},
"label": {
"type": "keyword"
}
},
"feature": {
"type": "vector",
"filed": "imageurl",
"model_id": "resnet",
"dimension": 2048
}
}' http://127.0.0.1:4101/test/test/_create "

It returned "{"code": 550, "msg": "create partition err: create gamma table has err:[4294967294] "}"

The problem appears when I change the dimension. If I use 512 as the dimension, it succeeds.

Why?
Look forward to your reply.
Thank you ~

Add CODE_OF_CONDUCT.md

"filter" query potential regression

Hi,

I noticed what appears to be a regression in the "filter" query between version 0.2 and 0.2.1. For example, when I issue the query:

query = {
    "query": {
        "filter":[{
            "term":{
                    "requirement_id":"abc_123"

            }
        }],
        "sum": [{
            "field": "txt_embedding",
            "feature": [0,0,0,0,0],
        }]
    }
}
response = requests.get(f'{ip_data}/{db_name}/{space_name}/_search?size=100')

My response length is 0 in version 0.2.1 but non-zero, as expected, in version 0.2. I am using the docker hub images.

Thanks in advance!

gamma fails to compile

This is the output after running the cmake command; running make directly afterwards fails.

(base) is@is-B365M-D2V:~/Documents/GO/git/src/github.com/vearch/vearch/engine/gamma/build$ cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=$vearch/ps/engine/gammacb/lib ..
-- The C compiler identification is GNU 5.4.0
-- The CXX compiler identification is GNU 5.4.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Faiss libraries: /usr/local/lib/libfaiss.so
-- Found Faiss include: /usr/local/include
-- RocksDB home isn't set, so store_type=RocksDB is not supported!
-- Release Mode
-- Flags: -std=c++11 -fPIC -m64 -Wall -O3 -mavx2 -msse4 -mpopcnt -fopenmp -D_FILE_OFFSET_BITS=64 -D_LARGE_FILE -Werror=narrowing -Wno-deprecated
-- Configuring done
-- Generating done
-- Build files have been written to: /home/is/Documents/GO/git/src/github.com/vearch/vearch/engine/gamma/build

make error output:

CMakeFiles/gamma.dir/build.make:254: recipe for target 'CMakeFiles/gamma.dir/index/gamma_index_ivfpq.cc.o' failed
make[2]: *** [CMakeFiles/gamma.dir/index/gamma_index_ivfpq.cc.o] Error 1
CMakeFiles/Makefile2:67: recipe for target 'CMakeFiles/gamma.dir/all' failed
make[1]: *** [CMakeFiles/gamma.dir/all] Error 2
Makefile:127: recipe for target 'all' failed
make: *** [all] Error 2

Compiling the Go project reports a missing API header file

ps/engine/gammacb/gamma.go:20:23: fatal error: gamma_api.h: No such file or directory

[space not exists] warning~~~

I built a vearch instance using Docker.
However, I can never create a database or a space.

unexpected signal during runtime execution

After running the command ./vearch -conf conf.toml, I get the error below.

fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x7fe08dc49ae0]

runtime stack:
runtime.throw(0x1574410, 0x2a)
/root/.go/src/runtime/panic.go:774 +0x72
runtime.sigpanic()
/root/.go/src/runtime/signal_unix.go:378 +0x47c

goroutine 254 [syscall]:
runtime.cgocall(0x11e53c0, 0xc000febb98, 0x0)
/root/.go/src/runtime/cgocall.go:128 +0x5b fp=0xc000febb68 sp=0xc000febb30 pc=0x4067fb
github.com/vearch/vearch/ps/engine/gammacb._Cfunc_Load(0x7fe064004b10, 0x7fe000000000)
_cgo_gotypes.go:528 +0x49 fp=0xc000febb98 sp=0xc000febb68 pc=0x11b11d9
github.com/vearch/vearch/ps/engine/gammacb.New.func8(0xc0001bd3b0, 0xc)
/root/go/src/github.com/vearch/vearch/ps/engine/gammacb/gamma.go:102 +0x5f fp=0xc000febbd8 sp=0xc000febb98 pc=0x11be0cf
github.com/vearch/vearch/ps/engine/gammacb.New(0xc000613774, 0xc, 0x0, 0xc000ab4f30, 0x1, 0x1, 0x0, 0x0, 0x0, 0x0)
/root/go/src/github.com/vearch/vearch/ps/engine/gammacb/gamma.go:102 +0x61a fp=0xc000febd38 sp=0xc000febbd8 pc=0x11b2cfa

I have installed Go (go version go1.13.4 linux/amd64), gcc version 7.4.0, and faiss 1.5.3.

I also often get the error below:

./vearch -conf conf.toml
./vearch: error while loading shared libraries: libgamma.so.0.1: cannot open shared object file: No such file or directory
However, that file exists inside the gamma folder:
[100%] Linking CXX shared library libgamma.so
[100%] Built target gamma
[100%] Built target gamma
Install the project...
-- Install configuration: "Release"
-- Installing: /root/go/src/github.com/vearch/vearch/ps/engine/gammacb/lib/lib/libgamma.so.0.1
-- Installing: /root/go/src/github.com/vearch/vearch/ps/engine/gammacb/lib/lib/libgamma.so
-- Set runtime path of "/root/go/src/github.com/vearch/vearch/ps/engine/gammacb/lib/lib/libgamma.so.0.1" to ""
-- Installing: /root/go/src/github.com/vearch/vearch/ps/engine/gammacb/lib/include/gamma_api.h

Could anyone please help me fix this issue?

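Standalone mode: the service cannot restart normally after being stopped

I deployed vearch locally in standalone mode. After performing some data operations, I stop the service with Ctrl+C. Running ./vearch -conf conf.toml again then fails to start; it only starts after I delete all the locally generated files.
Looking at the startup log, these two sections seem to be related, but I am not sure why the error occurs.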

I1202 12:09:05.507985 4440 server.go:190] [INFO] to register master, nodeId:[1], times : 1
D1202 12:09:05.508045 4440 master_api.go:242] [DEBUG] master api Register url: http://127.0.0.1:8817/register?clusterName=antifraud&nodeId=1
D1202 12:09:05.508431 4440 master_api.go:244] [DEBUG] master api Register response:
D1202 12:09:05.508456 4440 master_api.go:248] [DEBUG] master api Register err: Post http://127.0.0.1:8817/register?clusterName=antifraud&nodeId=1: dial tcp 127.0.0.1 :8817: connect: connection refused
E1202 12:09:05.508479 4440 server.go:193] [ERROR] to register master error, nodeId:[1], err : master server all down
I1202 12:09:07.508639 4440 server.go:190] [INFO] to register master, nodeId:[1], times : 2
D1202 12:09:07.508692 4440 master_api.go:242] [DEBUG] master api Register url: http://127.0.0.1:8817/register?clusterName=antifraud&nodeId=1
D1202 12:09:07.508951 4440 master_api.go:244] [DEBUG] master api Register response:
D1202 12:09:07.508970 4440 master_api.go:248] [DEBUG] master api Register err: Post http://127.0.0.1:8817/register?clusterName=antifraud&nodeId=1: dial tcp 127.0.0.1 :8817: connect: connection refused
E1202 12:09:07.508993 4440 server.go:193] [ERROR] to register master error, nodeId:[1], err : master server all down
{"level":"info","ts":1575259748.0113544,"caller":"raft/raft.go:922","msg":"29acf840a60482e0 is starting a new election at term 8"}
{"level":"info","ts":1575259748.0113983,"caller":"raft/raft.go:741","msg":"29acf840a60482e0 became pre-candidate at term 8"}
{"level":"info","ts":1575259748.011429,"caller":"raft/raft.go:820","msg":"29acf840a60482e0 received MsgPreVoteResp from 29acf840a60482e0 at term 8"}
{"level":"info","ts":1575259748.0114489,"caller":"raft/raft.go:725","msg":"29acf840a60482e0 became candidate at term 9"}
{"level":"info","ts":1575259748.0114625,"caller":"raft/raft.go:820","msg":"29acf840a60482e0 received MsgVoteResp from 29acf840a60482e0 at term 9"}
{"level":"info","ts":1575259748.011485,"caller":"raft/raft.go:777","msg":"29acf840a60482e0 became leader at term 9"}
{"level":"info","ts":1575259748.0115013,"caller":"raft/node.go:330","msg":"raft.node: 29acf840a60482e0 elected leader 29acf840a60482e0 at term 9"}
2019-12-02 12:09:08.011790 I | etcdserver: published {Name:m1 ClientURLs:[http://127.0.0.1:2370]} to cluster 5ba2b0f90a66d753
I1202 12:09:08.011828 4440 server.go:76] [INFO] Server is ready!

I1202 12:09:09.514171 4440 util.go:118] [INFO] add field:[_slot] option:[%!s(gammacb._Ctype_int=0)]
I1202 12:09:09.514181 4440 util.go:118] [INFO] add field:[imgpath] option:[%!s(gammacb._Ctype_int=1)]
fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x7f6d4373fcb8]

runtime stack:
runtime.throw(0x157fa3f, 0x2a)
/usr/local/go/src/runtime/panic.go:774 +0x72
runtime.sigpanic()
/usr/local/go/src/runtime/signal_unix.go:378 +0x47c

goroutine 290 [syscall]:
runtime.cgocall(0x11ef410, 0xc000f8db98, 0x0)

How do I find the vectors most similar to a given vector?

I have some vectors. How do I find the ones closest to a given vector?

Invalid result data when searching

After indexing data successfully, I search vearch and some records come back with invalid metadata in several fields (_id, imageurl, boundingbox):

"_id":"f21.photo.",
"_score":0.9498358964920044,
"_version":1,
"_source":{
"boundingbox":"dn/a029e43973939bcd",
"imageurl":"c282.jpg1561476103594084352198.0,58.0,630.0,710.0http://f",
"uid":212077
}
It looks like the values of the "keyword" fields got mixed up.

Space info:
"columns": {
"imageurl": {
"type": "keyword"
},
"boundingbox": {
"type": "keyword"
},
"uid": {
"type": "integer"
}
}

Rename anything related to "baud" or "Baud"

Using internal naming in an open-source project could confuse people.