echoprint-server's Introduction

echoprint-server

Platforms supported: Linux and OS X

Note: This project is no longer actively maintained

A C library, with a Python extension module and Java bindings, for fast indexing and querying of echoprint data.

Installation

The standalone C library is built with CMake. This step is required for using the Java (but not for the Python) bindings.

To build the Python extension module, run python setup.py install.

Usage

The rest of this file documents the usage of echoprint-server via the Python extension module, through a set of convenience scripts in the bin/ directory.

For the Java bindings, please refer to the UsageExample.java file.

The echoprint code generator, used to convert audio files into echoprint strings, can be found here: echoprint-codegen.

WARNING

The library uses a custom binary format for speed. At this point, ENDIANNESS IS NOT CHECKED so moving index files between machines with different architectures might cause problems. The code has been tested on little endian machines.

The Java code for creating indices explicitly assumes a little endian architecture.

echoprint-decode

Convert a codestring as output by echoprint-codegen into the corresponding list of codes represented as comma-separated integers.

Usage:

echoprint-codegen song.ogg > codegen_output.json
cat codegen_output.json | jq -r '.[0].code' | echoprint-decode > codes.txt

codes.txt will look like:

150555,1035718,621673,794882,40662,955768,96899,166055,...

This script outputs only the echoprint codes, not the offsets. jq is a command-line tool for processing JSON; it can be found here.
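For illustration, the decoding step can be sketched in Python as below. The sketch assumes that a codegen string is URL-safe base64 over a zlib-compressed ASCII payload whose first half holds the offsets and whose second half holds the codes, each written as 5 hexadecimal digits. That layout is an assumption about the echoprint-codegen output, and `decode_echoprint_sketch` is a hypothetical name, not the library's API.

```python
import base64
import zlib


def decode_echoprint_sketch(codestring):
    """Return (offsets, codes) as lists of ints.

    Assumed layout: URL-safe base64 -> zlib -> ASCII hex digits, with the
    first half holding offsets and the second half holding codes,
    5 hex digits per value.
    """
    raw = zlib.decompress(base64.urlsafe_b64decode(codestring))
    text = raw.decode("ascii")
    half = len(text) // 2
    offsets = [int(text[i:i + 5], 16) for i in range(0, half, 5)]
    codes = [int(text[i:i + 5], 16) for i in range(half, len(text), 5)]
    return offsets, codes


if __name__ == "__main__":
    # Build a tiny synthetic codestring (not real codegen output) and decode it.
    payload = "".join("%05x" % v for v in [10, 20, 150555, 1035718])
    demo = base64.urlsafe_b64encode(zlib.compress(payload.encode("ascii")))
    _, codes = decode_echoprint_sketch(demo)
    print(",".join(str(c) for c in codes))
```

Only the codes are printed, mirroring the statement that the script does not output the offsets.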

echoprint-inverted-index

Takes a series of echoprint strings (one per line) and an output path. Writes a compact index to disk.

Usage:

cat ... | ./echoprint-inverted-index index.bin

The index.bin format is binary; see the implementation details below.

If more than 65535 songs are indexed, the output will be split into blocks with the following naming scheme:

index.bin_0000
index.bin_0001
...

Optionally, the -i switch changes the input format to a comma-separated list of integer codes (one song per line).
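The block splitting described above can be sketched as follows. The batch size of 65535 and the index.bin_0000 naming come from the text; the helper names `split_into_blocks` and `block_path` are hypothetical, and the sketch does not cover how a single-block index is named.

```python
import itertools


def split_into_blocks(songs, block_size=65535):
    """Yield successive lists of at most block_size songs from any iterable."""
    it = iter(songs)
    while True:
        batch = list(itertools.islice(it, block_size))
        if not batch:
            return
        yield batch


def block_path(output_path, block_index):
    """Name a block following the index.bin_0000, index.bin_0001, ... scheme."""
    return "%s_%04d" % (output_path, block_index)


if __name__ == "__main__":
    # Synthetic input: 70000 songs, each a short list of integer codes.
    songs = ([n, n + 1, n + 2] for n in range(70000))
    for i, batch in enumerate(split_into_blocks(songs)):
        print(block_path("index.bin", i), len(batch))
```

Because the input is consumed lazily, at most one block's worth of songs is held in memory at a time.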

echoprint-inverted-query

Takes a series of echoprint strings (one per line) and a list of index blocks. For each query, it writes the results to stdout as a JSON-encoded object.

Usage:

cat ... | ./echoprint-inverted-query index-file-1 [index-file-2 ...]

where the input is one echoprint string per line.

Each output line looks like the following:

{
  "results": [
    {
      "index": 0,
      "score": 0.69340412080287933
    },
    {
      "index": 8,
      "score": 0.56301175890117883
    },
    {
      "index": 120,
      "score": 0.31826272477954626
    },
    ...
  ]
}

The index field represents the position of the matched song in the index.

Optionally, the -i switch changes the input format to a comma-separated list of integer codes (one song per line).

REST service

The echoprint-rest-service script listens for POST requests (by default on port 5678), with an echoprint string passed as the echoprint parameter. The test-rest.sh script shows how to query it using curl.

The request is made to host:port/query/<METHOD>, with <METHOD> one of

  • jaccard
  • set_int
  • set_int_norm_length_first
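As a rough sketch, a query request could be assembled in Python like this. It assumes the URL has the form http://host:port/query/<METHOD> and that the echoprint string is sent as a form-encoded field named echoprint; the function name `build_query_request` is hypothetical, and the request is only built, not sent.

```python
import urllib.parse
import urllib.request

# The three method names listed above.
METHODS = ("jaccard", "set_int", "set_int_norm_length_first")


def build_query_request(host, echoprint, method="jaccard", port=5678):
    """Build (but do not send) a POST request carrying the echoprint parameter."""
    if method not in METHODS:
        raise ValueError("unknown method: %r" % (method,))
    # Assumed URL layout: host, port, then /query/<METHOD>.
    url = "http://%s:%d/query/%s" % (host, port, method)
    body = urllib.parse.urlencode({"echoprint": echoprint}).encode("ascii")
    return urllib.request.Request(url, data=body, method="POST")


if __name__ == "__main__":
    # The echoprint value here is a placeholder, not a real codegen string.
    req = build_query_request("localhost", "PLACEHOLDER", "set_int")
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req).read()
```

Form-encoding the value matters because echoprint strings may contain characters such as `=` that need escaping in a request body.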

Usage:

echoprint-rest-service index-file-1 [index-file-2 ...]

The optional --ids-file accepts a path to a text file where each line represents an id for the correspondingly-indexed track in the index. If specified, the returned results will have an id field.
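The mapping from index positions to ids can be pictured with the short sketch below. It assumes that line n of the ids file (counting from 0) names the song at index n; `load_ids` and `attach_ids` are hypothetical helpers, not part of the service.

```python
import os
import tempfile


def load_ids(path):
    """Read one id per line; line n is taken as the id of the song at index n."""
    with open(path) as f:
        return [line.rstrip("\n") for line in f]


def attach_ids(results, ids):
    """Return copies of the result dicts with an added "id" field."""
    return [dict(r, id=ids[r["index"]]) for r in results]


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
        f.write("0005dad8\nee59c151\n")  # made-up ids, one per indexed song
    try:
        ids = load_ids(f.name)
    finally:
        os.remove(f.name)
    print(attach_ids([{"index": 1, "score": 0.5}], ids))
```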

Example: querying from audio

Assuming 0005dad86d4d4c6fb592d42d767e117f.ogg is in the current directory, let's cut it from 00:30 to 4:30 and re-encode it as 128 kbps mp3 (to show that echoprint is robust to alterations in the file):

ffmpeg -i 0005dad86d4d4c6fb592d42d767e117f.ogg \
    -ss 30 -t 240 \
	0005dad86d4d4c6fb592d42d767e117f_cut_lowrate.mp3

Run the echoprint codegen, extract the echoprint string:

../echoprint-codegen/echoprint-codegen \
    0005dad86d4d4c6fb592d42d767e117f_cut_lowrate.mp3 \
    | jq -r '.[0].code' \
    > 0005dad86d4d4c6fb592d42d767e117f_cut_lowrate.echoprint

Query the service:

curl -s --data \
    echoprint=`cat 0005dad86d4d4c6fb592d42d767e117f_cut_lowrate.echoprint` \
    <server-path>:5678/query

Results should be similar to

{
  "results": [
    {
      "id": "0005dad86d4d4c6fb592d42d767e117f",
      "index": 0,
      "score": 0.34932565689086914
    },
    {
      "id": "ee59c151d679413a80ac4e49ac92c662",
      "index": 698096,
      "score": 0.033668458461761475
    },
    {
      "id": "026526e6a02648668ff9f410faab15be",
      "index": 312466,
      "score": 0.015930989757180214
    },
    ...
  ]
}

Implementation details

Similarity

The similarity between two echoprints is computed on their bag-of-words representations. This means that the codes' offsets are not considered, nor are the codes' multiplicities.
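A set-based comparison of this kind can be sketched in Python as follows. The two functions are generic illustrations of a Jaccard index and a plain set intersection over code sets; whether the library's jaccard and set_int methods compute exactly these quantities, and how set_int_norm_length_first normalizes, is not stated here, so treat the names and formulas as assumptions.

```python
def jaccard(query_codes, song_codes):
    """Jaccard index of the two code sets (offsets and multiplicities ignored)."""
    q, s = set(query_codes), set(song_codes)
    union = len(q | s)
    return len(q & s) / union if union else 0.0


def set_intersection_size(query_codes, song_codes):
    """Number of distinct codes shared by the two echoprints."""
    return len(set(query_codes) & set(song_codes))


if __name__ == "__main__":
    query = [150555, 1035718, 1035718, 621673]  # repeated code counts once
    song = [1035718, 621673, 794882]
    print(jaccard(query, song), set_intersection_size(query, song))
```

Converting to sets first is what discards the multiplicities: a code that occurs many times in a song contributes no more than one that occurs once.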

Inverted index binary format

The inverted index is serialized as several blocks, each being a memory dump of the EchoprintInvertedIndexBlock struct defined in the header file.
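A reader for one such block might look like the sketch below. The field order (n_codes, n_songs, codes, code_lengths, song_lengths, song_indices) and the integer widths are inferred from the fread calls visible in the compiler output quoted in the issues further down this page, not from the header file itself, and the data is assumed to be little-endian. The assumption that song_indices has one entry per unit of code_lengths is likewise an inference. `read_block` is a hypothetical name.

```python
import io
import struct


def read_block(fp):
    """Read one index block from a binary file object (little-endian assumed)."""
    def read_array(fmt_char, count):
        size = struct.calcsize("<" + fmt_char) * count
        data = fp.read(size)
        if len(data) != size:
            raise ValueError("truncated index block")
        return list(struct.unpack("<%d%s" % (count, fmt_char), data))

    n_codes, n_songs = read_array("I", 2)          # two uint32 counters
    codes = read_array("I", n_codes)               # uint32 per distinct code
    code_lengths = read_array("I", n_codes)        # uint32 per distinct code
    song_lengths = read_array("I", n_songs)        # uint32 per song
    song_indices = read_array("H", sum(code_lengths))  # uint16 entries (assumed count)
    return {
        "codes": codes,
        "code_lengths": code_lengths,
        "song_lengths": song_lengths,
        "song_indices": song_indices,
    }


if __name__ == "__main__":
    # Synthetic block: 2 codes, 2 songs, 3 song-index entries.
    raw = struct.pack("<2I2I2I2I3H", 2, 2, 5, 9, 2, 1, 2, 1, 0, 1, 0)
    print(read_block(io.BytesIO(raw)))
```

Spelling out the byte order with `<` sidesteps the unchecked-endianness problem on the reading side, and 16-bit song indices would be consistent with the 65535-song limit per block.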

License 📝

The project is available under the Apache 2.0 license.

Contributing 📬

Contributions are welcome; have a look at the CONTRIBUTING.md document for more information.

echoprint-server's People

Contributors

8w9ag, baschdl, mpaolino, nicolamontecchio, perploug

echoprint-server's Issues

Can't compile.

I'm attempting to install this repository's Python library through:

pip install git+https://github.com/spotify/echoprint-server.git

I receive this compiler error from gcc in return:

/usr/lib/python3.7/distutils/dist.py:274: UserWarning: Unknown distribution option: 'install_requires'
  warnings.warn(msg)
running install
running build
running build_py
running build_ext
building 'echoprint_server_c' extension
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/kyleanthonywilliams2/Active/eee/env/include -I/usr/include/python3.7m -c libechoprintserver.c -o build/temp.linux-x86_64-3.7/libechoprintserver.o
libechoprintserver.c: In function ‘_load_echoprint_inverted_index_block’:
libechoprintserver.c:210:3: warning: ignoring return value of ‘fread’, declared with attribute warn_unused_result [-Wunused-result]
   fread(&(block->n_codes), sizeof(uint32_t), 1, fp);
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
libechoprintserver.c:211:3: warning: ignoring return value of ‘fread’, declared with attribute warn_unused_result [-Wunused-result]
   fread(&(block->n_songs), sizeof(uint32_t), 1, fp);
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
libechoprintserver.c:214:3: warning: ignoring return value of ‘fread’, declared with attribute warn_unused_result [-Wunused-result]
   fread(block->codes, sizeof(uint32_t), block->n_codes, fp);
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
libechoprintserver.c:216:3: warning: ignoring return value of ‘fread’, declared with attribute warn_unused_result [-Wunused-result]
   fread(block->code_lengths, sizeof(uint32_t), block->n_codes, fp);
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
libechoprintserver.c:218:3: warning: ignoring return value of ‘fread’, declared with attribute warn_unused_result [-Wunused-result]
   fread(block->song_lengths, sizeof(uint32_t), block->n_songs, fp);
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
libechoprintserver.c:225:3: warning: ignoring return value of ‘fread’, declared with attribute warn_unused_result [-Wunused-result]
   fread(block->song_indices, sizeof(uint16_t), n_tot_song_indices, fp);
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
libechoprintserver.c: In function ‘echoprint_inverted_index_block_similarity’:
libechoprintserver.c:100:11: warning: ‘den’ may be used uninitialized in this function [-Wmaybe-uninitialized]
     float den;
           ^~~
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/kyleanthonywilliams2/Active/eee/env/include -I/usr/include/python3.7m -c echoprint_server_python.c -o build/temp.linux-x86_64-3.7/echoprint_server_python.o
echoprint_server_python.c: In function ‘initechoprint_server_c’:
echoprint_server_python.c:63:17: warning: implicit declaration of function ‘Py_InitModule3’; did you mean ‘Py_Initialize’? [-Wimplicit-function-declaration]
   PyObject *m = Py_InitModule3(
                 ^~~~~~~~~~~~~~
                 Py_Initialize
echoprint_server_python.c:63:17: warning: initialization of ‘PyObject *’ {aka ‘struct _object *’} from ‘int’ makes pointer from integer without a cast [-Wint-conversion]
echoprint_server_python.c:66:5: warning: ‘return’ with no value, in function returning non-void
     return;
     ^~~~~~
echoprint_server_python.c:61:16: note: declared here
 PyMODINIT_FUNC initechoprint_server_c(void)
                ^~~~~~~~~~~~~~~~~~~~~~
echoprint_server_python.c: In function ‘echoprint_py_load_inverted_index’:
echoprint_server_python.c:97:9: warning: implicit declaration of function ‘PyString_Check’; did you mean ‘PyMapping_Check’? [-Wimplicit-function-declaration]
     if(!PyString_Check(py_path))
         ^~~~~~~~~~~~~~
         PyMapping_Check
echoprint_server_python.c:102:27: warning: implicit declaration of function ‘PyString_AsString’; did you mean ‘PyBytes_AsString’? [-Wimplicit-function-declaration]
     index_file_paths[n] = PyString_AsString(
                           ^~~~~~~~~~~~~~~~~
                           PyBytes_AsString
echoprint_server_python.c:102:25: warning: assignment to ‘char *’ from ‘int’ makes pointer from integer without a cast [-Wint-conversion]
     index_file_paths[n] = PyString_AsString(
                         ^
echoprint_server_python.c: In function ‘echoprint_py_inverted_index_size’:
echoprint_server_python.c:129:10: warning: implicit declaration of function ‘PyInt_FromLong’; did you mean ‘PyLong_FromLong’? [-Wimplicit-function-declaration]
   return PyInt_FromLong((long) echoprint_inverted_index_get_n_songs(index));
          ^~~~~~~~~~~~~~
          PyLong_FromLong
echoprint_server_python.c:129:10: warning: returning ‘int’ from a function with return type ‘PyObject *’ {aka ‘struct _object *’} makes pointer from integer without a cast [-Wint-conversion]
   return PyInt_FromLong((long) echoprint_inverted_index_get_n_songs(index));
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
echoprint_server_python.c: In function ‘echoprint_py_query_inverted_index’:
echoprint_server_python.c:150:13: warning: passing argument 1 of ‘strcmp’ makes pointer from integer without a cast [-Wint-conversion]
   if(strcmp(PyString_AsString(arg_sim_fun), "jaccard") == 0)
             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /usr/include/python3.7m/Python.h:30,
                 from echoprint_server_python.c:22:
/usr/include/string.h:136:32: note: expected ‘const char *’ but argument is of type ‘int’
 extern int strcmp (const char *__s1, const char *__s2)
                    ~~~~~~~~~~~~^~~~
echoprint_server_python.c:152:18: warning: passing argument 1 of ‘strcmp’ makes pointer from integer without a cast [-Wint-conversion]
   else if(strcmp(PyString_AsString(arg_sim_fun), "set_int") == 0)
                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /usr/include/python3.7m/Python.h:30,
                 from echoprint_server_python.c:22:
/usr/include/string.h:136:32: note: expected ‘const char *’ but argument is of type ‘int’
 extern int strcmp (const char *__s1, const char *__s2)
                    ~~~~~~~~~~~~^~~~
echoprint_server_python.c:154:18: warning: passing argument 1 of ‘strcmp’ makes pointer from integer without a cast [-Wint-conversion]
   else if(strcmp(PyString_AsString(arg_sim_fun),
                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /usr/include/python3.7m/Python.h:30,
                 from echoprint_server_python.c:22:
/usr/include/string.h:136:32: note: expected ‘const char *’ but argument is of type ‘int’
 extern int strcmp (const char *__s1, const char *__s2)
                    ~~~~~~~~~~~~^~~~
echoprint_server_python.c:177:9: warning: implicit declaration of function ‘PyInt_Check’; did you mean ‘PySet_Check’? [-Wimplicit-function-declaration]
     if(!PyInt_Check(code_obj))
         ^~~~~~~~~~~
         PySet_Check
echoprint_server_python.c:185:23: warning: implicit declaration of function ‘PyInt_AsLong’; did you mean ‘PyLong_AsLong’? [-Wimplicit-function-declaration]
     code = (uint32_t) PyInt_AsLong(code_obj);
                       ^~~~~~~~~~~~
                       PyLong_AsLong
echoprint_server_python.c:201:5: error: unknown type name ‘PyStringObject’; did you mean ‘PySliceObject’?
     PyStringObject* score_k = (PyStringObject*)PyString_FromString("score");
     ^~~~~~~~~~~~~~
     PySliceObject
echoprint_server_python.c:201:32: error: ‘PyStringObject’ undeclared (first use in this function); did you mean ‘PySliceObject’?
     PyStringObject* score_k = (PyStringObject*)PyString_FromString("score");
                                ^~~~~~~~~~~~~~
                                PySliceObject
echoprint_server_python.c:201:32: note: each undeclared identifier is reported only once for each function it appears in
echoprint_server_python.c:201:47: error: expected expression before ‘)’ token
     PyStringObject* score_k = (PyStringObject*)PyString_FromString("score");
                                               ^
echoprint_server_python.c:206:21: error: ‘index_k’ undeclared (first use in this function); did you mean ‘index’?
     PyStringObject* index_k = (PyStringObject*)PyString_FromString("index");
                     ^~~~~~~
                     index
echoprint_server_python.c:206:47: error: expected expression before ‘)’ token
     PyStringObject* index_k = (PyStringObject*)PyString_FromString("index");
                                               ^
echoprint_server_python.c:207:5: error: unknown type name ‘PyIntObject’; did you mean ‘PySetObject’?
     PyIntObject* index_v = (PyIntObject*)PyInt_FromLong((long) output_indices[n]);
     ^~~~~~~~~~~
     PySetObject
echoprint_server_python.c:207:29: error: ‘PyIntObject’ undeclared (first use in this function); did you mean ‘PySetObject’?
     PyIntObject* index_v = (PyIntObject*)PyInt_FromLong((long) output_indices[n]);
                             ^~~~~~~~~~~
                             PySetObject
echoprint_server_python.c:207:41: error: expected expression before ‘)’ token
     PyIntObject* index_v = (PyIntObject*)PyInt_FromLong((long) output_indices[n]);
                                         ^
echoprint_server_python.c: In function ‘echoprint_py_inverted_index_create_block’:
echoprint_server_python.c:245:12: warning: assignment to ‘char *’ from ‘int’ makes pointer from integer without a cast [-Wint-conversion]
   path_out = PyString_AsString(arg_output_path);
            ^
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1

I'm on Python 3.7.3.

How to produce index.bin

Hi, I don't know how to produce index.bin. Does anyone know? Thanks very much.

echoprint-inverted-index gives an error when piping a large dump file to it

I wanted to test echoprint with its database: http://echoprint-data.s3.amazonaws.com/echoprint-dump-1.json
I tried this:
cat echoprint-dump-1.json | jq -r '.[].code' | echoprint-inverted-index index.bin
and it gives this error:
Traceback (most recent call last):
  File "/usr/local/bin/echoprint-inverted-index", line 19, in <module>
    create_inverted_index(streamer(sys.stdin), args.indexfile)
  File "/usr/local/lib/python2.7/dist-packages/echoprint_server/lib.py", line 57, in create_inverted_index
    for batch_index, batch in enumerate(split_seq(songs, 65535)):
  File "/usr/local/lib/python2.7/dist-packages/echoprint_server/lib.py", line 30, in split_seq
    item = list(itertools.islice(it, size))
  File "/usr/local/lib/python2.7/dist-packages/echoprint_server/lib.py", line 78, in parsing_code_streamer
    yield decode_echoprint(line.strip())[1]
  File "/usr/local/lib/python2.7/dist-packages/echoprint_server/lib.py", line 42, in decode_echoprint
    unzipped = zlib.decompress(zipped)
zlib.error: Error -5 while decompressing data: incomplete or truncated stream

I think it only happens when the file gets larger; I tested it with small JSON files and it works.
Has anyone else encountered this error?
Is this a bug, or is the problem just on my side?

Installation issue

Hi! I just tried to install on a DigitalOcean Ubuntu machine and I am still getting an error:
UserWarning: Unknown distribution option: 'install_requires'

python --version: 2.7.6

Can you give some instructions on how to install this?

Error running setup.py

When running setup.py as instructed in the README, I am getting this:

/usr/lib/python3.8/distutils/dist.py:274: UserWarning: Unknown distribution option: 'install_requires'
  warnings.warn(msg)
running install
running build
running build_py
creating build
creating build/lib.linux-x86_64-3.8
creating build/lib.linux-x86_64-3.8/echoprint_server
copying echoprint_server/lib.py -> build/lib.linux-x86_64-3.8/echoprint_server
copying echoprint_server/__init__.py -> build/lib.linux-x86_64-3.8/echoprint_server
running build_ext
building 'echoprint_server_c' extension
creating build/temp.linux-x86_64-3.8
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/usr/include/python3.8 -c libechoprintserver.c -o build/temp.linux-x86_64-3.8/libechoprintserver.o
libechoprintserver.c: In function ‘echoprint_inverted_index_block_similarity’:
libechoprintserver.c:64:14: warning: comparison of integer expressions of different signedness: ‘int’ and ‘uint32_t’ {aka ‘unsigned int’} [-Wsign-compare]
   64 |   for(n=0; n < index_block->n_songs; n++)
      |              ^
libechoprintserver.c:70:11: warning: comparison of integer expressions of different signedness: ‘int’ and ‘uint32_t’ {aka ‘unsigned int’} [-Wsign-compare]
   70 |   while(j < query_length && i < index_block->n_codes)
      |           ^
libechoprintserver.c:70:31: warning: comparison of integer expressions of different signedness: ‘int’ and ‘uint32_t’ {aka ‘unsigned int’} [-Wsign-compare]
   70 |   while(j < query_length && i < index_block->n_codes)
      |                               ^
libechoprintserver.c:77:20: warning: comparison of integer expressions of different signedness: ‘int’ and ‘uint32_t’ {aka ‘unsigned int’} [-Wsign-compare]
   77 |       for(n = 0; n < codeblock_length; n++)
      |                    ^
libechoprintserver.c:98:14: warning: comparison of integer expressions of different signedness: ‘int’ and ‘uint32_t’ {aka ‘unsigned int’} [-Wsign-compare]
   98 |   for(n=0; n < index_block->n_songs; n++)
      |              ^
libechoprintserver.c: In function ‘echoprint_inverted_index_query’:
libechoprintserver.c:159:16: warning: comparison of integer expressions of different signedness: ‘int’ and ‘uint32_t’ {aka ‘unsigned int’} [-Wsign-compare]
  159 |   for(b = 0; b < index->n_blocks; b++)
      |                ^
libechoprintserver.c:160:43: warning: comparison of integer expressions of different signedness: ‘int’ and ‘uint32_t’ {aka ‘unsigned int’} [-Wsign-compare]
  160 |     max_block_n_songs = max_block_n_songs > index->blocks[b].n_songs ?
      |                                           ^
libechoprintserver.c:161:7: warning: operand of ?: changes signedness from ‘int’ to ‘uint32_t’ {aka ‘unsigned int’} due to unsignedness of other operand [-Wsign-compare]
  161 |       max_block_n_songs : index->blocks[b].n_songs;
      |       ^~~~~~~~~~~~~~~~~
libechoprintserver.c:165:16: warning: comparison of integer expressions of different signedness: ‘int’ and ‘uint32_t’ {aka ‘unsigned int’} [-Wsign-compare]
  165 |   for(n = 0; n < n_results; n++)
      |                ^
libechoprintserver.c:174:16: warning: comparison of integer expressions of different signedness: ‘int’ and ‘uint32_t’ {aka ‘unsigned int’} [-Wsign-compare]
  174 |   for(b = 0; b < index->n_blocks; b++)
      |                ^
libechoprintserver.c:178:18: warning: comparison of integer expressions of different signedness: ‘int’ and ‘uint32_t’ {aka ‘unsigned int’} [-Wsign-compare]
  178 |     for(i = 0; i < index->blocks[b].n_songs; i++)
      |                  ^
libechoprintserver.c:183:18: warning: comparison of integer expressions of different signedness: ‘int’ and ‘uint32_t’ {aka ‘unsigned int’} [-Wsign-compare]
  183 |       if(ith_pos < n_results)
      |                  ^
libechoprintserver.c:195:16: warning: comparison of integer expressions of different signedness: ‘int’ and ‘uint32_t’ {aka ‘unsigned int’} [-Wsign-compare]
  195 |   for(n = 0; n < n_results; n++)
      |                ^
libechoprintserver.c: In function ‘_load_echoprint_inverted_index_block’:
libechoprintserver.c:221:16: warning: comparison of integer expressions of different signedness: ‘int’ and ‘uint32_t’ {aka ‘unsigned int’} [-Wsign-compare]
  221 |   for(n = 0; n < block->n_codes; n++)
      |                ^
libechoprintserver.c: In function ‘echoprint_inverted_index_free’:
libechoprintserver.c:254:16: warning: comparison of integer expressions of different signedness: ‘int’ and ‘uint32_t’ {aka ‘unsigned int’} [-Wsign-compare]
  254 |   for(n = 0; n < index->n_blocks; n++)
      |                ^
libechoprintserver.c: In function ‘echoprint_inverted_index_block_serialize’:
libechoprintserver.c:289:16: warning: comparison of integer expressions of different signedness: ‘int’ and ‘uint32_t’ {aka ‘unsigned int’} [-Wsign-compare]
  289 |   for(n = 0; n < block->n_codes; n++)
      |                ^
libechoprintserver.c: In function ‘echoprint_inverted_index_block_similarity’:
libechoprintserver.c:100:11: warning: ‘den’ may be used uninitialized in this function [-Wmaybe-uninitialized]
  100 |     float den;
      |           ^~~
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/usr/include/python3.8 -c echoprint_server_python.c -o build/temp.linux-x86_64-3.8/echoprint_server_python.o
echoprint_server_python.c: In function ‘initechoprint_server_c’:
echoprint_server_python.c:63:17: warning: implicit declaration of function ‘Py_InitModule3’ [-Wimplicit-function-declaration]
   63 |   PyObject *m = Py_InitModule3(
      |                 ^~~~~~~~~~~~~~
echoprint_server_python.c:63:17: warning: initialization of ‘PyObject *’ {aka ‘struct _object *’} from ‘int’ makes pointer from integer without a cast [-Wint-conversion]
echoprint_server_python.c:66:5: warning: ‘return’ with no value, in function returning non-void [-Wreturn-type]
   66 |     return;
      |     ^~~~~~
echoprint_server_python.c:61:16: note: declared here
   61 | PyMODINIT_FUNC initechoprint_server_c(void)
      |                ^~~~~~~~~~~~~~~~~~~~~~
echoprint_server_python.c: In function ‘echoprint_py_load_inverted_index’:
echoprint_server_python.c:97:9: warning: implicit declaration of function ‘PyString_Check’; did you mean ‘PyMapping_Check’? [-Wimplicit-function-declaration]
   97 |     if(!PyString_Check(py_path))
      |         ^~~~~~~~~~~~~~
      |         PyMapping_Check
echoprint_server_python.c:102:27: warning: implicit declaration of function ‘PyString_AsString’ [-Wimplicit-function-declaration]
  102 |     index_file_paths[n] = PyString_AsString(
      |                           ^~~~~~~~~~~~~~~~~
echoprint_server_python.c:102:25: warning: assignment to ‘char *’ from ‘int’ makes pointer from integer without a cast [-Wint-conversion]
  102 |     index_file_paths[n] = PyString_AsString(
      |                         ^
echoprint_server_python.c: In function ‘echoprint_py_inverted_index_size’:
echoprint_server_python.c:129:10: warning: implicit declaration of function ‘PyInt_FromLong’; did you mean ‘PyLong_FromLong’? [-Wimplicit-function-declaration]
  129 |   return PyInt_FromLong((long) echoprint_inverted_index_get_n_songs(index));
      |          ^~~~~~~~~~~~~~
      |          PyLong_FromLong
echoprint_server_python.c:129:10: warning: returning ‘int’ from a function with return type ‘PyObject *’ {aka ‘struct _object *’} makes pointer from integer without a cast [-Wint-conversion]
  129 |   return PyInt_FromLong((long) echoprint_inverted_index_get_n_songs(index));
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
echoprint_server_python.c: In function ‘echoprint_py_query_inverted_index’:
echoprint_server_python.c:150:13: warning: passing argument 1 of ‘strcmp’ makes pointer from integer without a cast [-Wint-conversion]
  150 |   if(strcmp(PyString_AsString(arg_sim_fun), "jaccard") == 0)
      |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |             |
      |             int
In file included from /usr/include/python3.8/Python.h:30,
                 from echoprint_server_python.c:22:
/usr/include/string.h:137:32: note: expected ‘const char *’ but argument is of type ‘int’
  137 | extern int strcmp (const char *__s1, const char *__s2)
      |                    ~~~~~~~~~~~~^~~~
echoprint_server_python.c:152:18: warning: passing argument 1 of ‘strcmp’ makes pointer from integer without a cast [-Wint-conversion]
  152 |   else if(strcmp(PyString_AsString(arg_sim_fun), "set_int") == 0)
      |                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                  |
      |                  int
In file included from /usr/include/python3.8/Python.h:30,
                 from echoprint_server_python.c:22:
/usr/include/string.h:137:32: note: expected ‘const char *’ but argument is of type ‘int’
  137 | extern int strcmp (const char *__s1, const char *__s2)
      |                    ~~~~~~~~~~~~^~~~
echoprint_server_python.c:154:18: warning: passing argument 1 of ‘strcmp’ makes pointer from integer without a cast [-Wint-conversion]
  154 |   else if(strcmp(PyString_AsString(arg_sim_fun),
      |                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                  |
      |                  int
In file included from /usr/include/python3.8/Python.h:30,
                 from echoprint_server_python.c:22:
/usr/include/string.h:137:32: note: expected ‘const char *’ but argument is of type ‘int’
  137 | extern int strcmp (const char *__s1, const char *__s2)
      |                    ~~~~~~~~~~~~^~~~
echoprint_server_python.c:177:9: warning: implicit declaration of function ‘PyInt_Check’; did you mean ‘PySet_Check’? [-Wimplicit-function-declaration]
  177 |     if(!PyInt_Check(code_obj))
      |         ^~~~~~~~~~~
      |         PySet_Check
echoprint_server_python.c:185:23: warning: implicit declaration of function ‘PyInt_AsLong’; did you mean ‘PyLong_AsLong’? [-Wimplicit-function-declaration]
  185 |     code = (uint32_t) PyInt_AsLong(code_obj);
      |                       ^~~~~~~~~~~~
      |                       PyLong_AsLong
echoprint_server_python.c:201:5: error: unknown type name ‘PyStringObject’; did you mean ‘PySliceObject’?
  201 |     PyStringObject* score_k = (PyStringObject*)PyString_FromString("score");
      |     ^~~~~~~~~~~~~~
      |     PySliceObject
echoprint_server_python.c:201:32: error: ‘PyStringObject’ undeclared (first use in this function); did you mean ‘PySliceObject’?
  201 |     PyStringObject* score_k = (PyStringObject*)PyString_FromString("score");
      |                                ^~~~~~~~~~~~~~
      |                                PySliceObject
echoprint_server_python.c:201:32: note: each undeclared identifier is reported only once for each function it appears in
echoprint_server_python.c:201:47: error: expected expression before ‘)’ token
  201 |     PyStringObject* score_k = (PyStringObject*)PyString_FromString("score");
      |                                               ^
echoprint_server_python.c:206:21: error: ‘index_k’ undeclared (first use in this function); did you mean ‘index’?
  206 |     PyStringObject* index_k = (PyStringObject*)PyString_FromString("index");
      |                     ^~~~~~~
      |                     index
echoprint_server_python.c:206:47: error: expected expression before ‘)’ token
  206 |     PyStringObject* index_k = (PyStringObject*)PyString_FromString("index");
      |                                               ^
echoprint_server_python.c:207:5: error: unknown type name ‘PyIntObject’; did you mean ‘PySetObject’?
  207 |     PyIntObject* index_v = (PyIntObject*)PyInt_FromLong((long) output_indices[n]);
      |     ^~~~~~~~~~~
      |     PySetObject
echoprint_server_python.c:207:29: error: ‘PyIntObject’ undeclared (first use in this function); did you mean ‘PySetObject’?
  207 |     PyIntObject* index_v = (PyIntObject*)PyInt_FromLong((long) output_indices[n]);
      |                             ^~~~~~~~~~~
      |                             PySetObject
echoprint_server_python.c:207:41: error: expected expression before ‘)’ token
  207 |     PyIntObject* index_v = (PyIntObject*)PyInt_FromLong((long) output_indices[n]);
      |                                         ^
echoprint_server_python.c: In function ‘echoprint_py_inverted_index_create_block’:
echoprint_server_python.c:245:12: warning: assignment to ‘char *’ from ‘int’ makes pointer from integer without a cast [-Wint-conversion]
  245 |   path_out = PyString_AsString(arg_output_path);
      |            ^
echoprint_server_python.c:263:20: warning: comparison of integer expressions of different signedness: ‘int’ and ‘uint32_t’ {aka ‘unsigned int’} [-Wsign-compare]
  263 |       for(m = 0; m < song_length; m++)
      |                    ^
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1

This happens on Ubuntu 20.04.3. Any idea what could be wrong here?

I have a question: what can I do?

I have a folder with many JSON files that were converted with echoprint-decode.
When I try to create index.bin, I do not get index_001.bin, index002.bin as you describe. Instead I get:
-bash: /bin/cat: the parameter list is too long....

This means that the cat command receives too many arguments. How can this be resolved?

Python scripts not able to find C library

I installed the python extension module with

sudo python setup.py install

and echoprint-decode is working correctly. But echoprint-inverted-index does not work, because it cannot find the C library. What am I doing wrong?

$ cat codes.txt | echoprint-inverted-index index.bin

Traceback (most recent call last):
  File "/usr/local/bin/echoprint-inverted-index", line 19, in <module>
    create_inverted_index(streamer(sys.stdin), args.indexfile)
  File "/home/sam/.local/lib/python2.7/site-packages/echoprint_server/lib.py", line 55, in create_inverted_index
    _create_index_block(list(batch), batch_output_path)
NameError: global name '_create_index_block' is not defined

QUESTION: How to compare Spotify json dump vs Echoprint MP3 dump

Hey,

I don't know if this is the right place to ask, but I am struggling to figure this out. I have two dumps. Both are of the song "Blinding Lights" by "The Weeknd", for no particular reason other than testing.
One is of the MP3 I have on my computer, the other is from Spotify when I request the track echoprint data.
mp3.txt
spotify.txt

My question is, how can I use the dump from spotify to correctly identify the corresponding MP3 in my collection? I hope someone can push me in the right direction.

Thx in advance
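
As a rough first check, the two dumps can be compared by the overlap of their code sets. The sketch below assumes that both dumps have already been decoded into the comma-separated integer format that echoprint-decode produces; whether the Spotify dump can be decoded that way is not confirmed here. The helper names `parse_codes` and `jaccard` are made up for this example, and the set-overlap score is an illustrative measure, not necessarily the scoring that echoprint-server itself applies.

```python
def parse_codes(text):
    """Parse a comma-separated list of integer codes into a set."""
    return {int(token) for token in text.split(",") if token.strip()}


def jaccard(codes_a, codes_b):
    """Return |A intersect B| / |A union B|, or 0.0 when both sets are empty."""
    union = codes_a | codes_b
    if not union:
        return 0.0
    return len(codes_a & codes_b) / float(len(union))


if __name__ == "__main__":
    # Tiny made-up code lists; real dumps contain far more codes.
    mp3_codes = parse_codes("150555,1035718,621673,794882")
    spotify_codes = parse_codes("150555,621673,40662,955768")
    print(jaccard(mp3_codes, spotify_codes))
```

A score close to 1 means the two code sets largely coincide, while a score near 0 means they share almost nothing. Because echoprint-decode drops the offsets, this comparison ignores where in the track each code occurs.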

memory leaks

Hi, I found some possible memory leaks.

src: echoprint-server/libechoprintserver.c

Line 318:
  out = (uint32_t *) malloc(sizeof(uint32_t) * out_len);
  i = 0;
  for(n = 0; n < n_sequences; n++)
  {
    uint32_t len_n = sequence_lengths[n];
    memcpy(out + i, sequences[n], sizeof(uint32_t) * len_n);
    i += len_n;
  }
  _sequence_to_set_inplace(out, &out_len);
  *output_length = out_len;
  *output = out;
}

==============================

Line 345:
  code_lengths = (uint32_t *) malloc(sizeof(uint32_t) * n_codes);
  for(i = 0; i < n_codes; i++)
    code_lengths[i] = 0;
  for(i = 0; i < n_songs; i++)
  {
    int offset = 0;
    for(c = 0; c < song_lengths[i]; c++)
    {
      while(codes[offset] != songs_codes[i][c])
        offset++;
      code_lengths[offset]++;
    }
  }

  code_lengths_sum = 0;
  for(c = 0; c < n_codes; c++)
    code_lengths_sum += code_lengths[c];
  song_indices = (uint16_t *) malloc(
    sizeof(uint16_t) * code_lengths_sum);

  code_offsets = (uint32_t *) malloc(sizeof(uint32_t) * n_codes);
  code_offsets[0] = 0;
  for(c = 1; c < n_codes; c++)
    code_offsets[c] = code_offsets[c-1] + code_lengths[c-1];
  for(i = 0; i < n_songs; i++)
  {
    int offset = 0;
    for(c = 0; c < song_lengths[i]; c++)
    {
      uint32_t code = songs_codes[i][c];
      while(codes[offset] != code)
        offset++;
      song_indices[code_offsets[offset]] = i;
      code_offsets[offset]++;
    }
  }
  free(code_offsets);
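
For readers following the second excerpt, it appears to fill the index in three steps: count how many songs contain each code, turn those counts into start offsets with a running sum, then write each song number into the slot for its code. In the quoted lines, code_offsets is released with free, while no matching free for code_lengths is visible. Below is a minimal Python sketch of the same three steps. The function name `build_inverted_index` and the dictionary lookup (used here instead of the linear scan over a sorted code array) are choices made for this illustration.

```python
def build_inverted_index(codes, songs_codes):
    """Build a flat inverted index.

    codes:       list of distinct codes, defining the slot order
    songs_codes: list of per-song code lists (each code appears in `codes`)
    Returns (lengths, offsets, song_indices).
    """
    position = {code: k for k, code in enumerate(codes)}

    # Step 1: count how many times each code is referenced.
    lengths = [0] * len(codes)
    for song in songs_codes:
        for code in song:
            lengths[position[code]] += 1

    # Step 2: a running sum of the counts gives each code's start offset.
    offsets = [0] * len(codes)
    for k in range(1, len(codes)):
        offsets[k] = offsets[k - 1] + lengths[k - 1]

    # Step 3: write each song number at its code's next free slot.
    song_indices = [0] * sum(lengths)
    cursor = list(offsets)  # working copy, so `offsets` stays intact
    for song_number, song in enumerate(songs_codes):
        for code in song:
            k = position[code]
            song_indices[cursor[k]] = song_number
            cursor[k] += 1
    return lengths, offsets, song_indices
```

The songs containing codes[k] are then song_indices[offsets[k] : offsets[k] + lengths[k]].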

Installation issue

Hi, I'm a .NET developer and I don't know much about Python and Java (just a weak ability to read the code). I want to port this server-side logic to C# and WCF. Can you explain a bit more about how to build the project? I have tried CMake to build the C++ code, but it gives an error every time. Building the Python extension fails with: error: command 'C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\link.exe' failed with exit status 1120 and ....
Could you explain how to build the project step by step, something like the Codegen Windows compilation instructions:
https://github.com/spotify/echoprint-codegen/tree/master/windows

Thanks
