Text to Visual Search Tool (AI4EU Component)

This repository collects the code for building the Text to Visual search component for use in Acumos. It is compliant with the AI4EU specifications. The code lets you:

Extract visual features from a given image folder
Construct an image index using FAISS
Build the Acumos-ready Docker images with gRPC interfaces

This code is based on the TERN and TERAN cross-modal retrieval frameworks, published at ICPR 2020 and in ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM) 2021, respectively.

Installation

You need Conda and Docker (with the NVIDIA Container Toolkit).
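
As a quick sanity check before going further, the following minimal sketch reports which of the two command-line tools are missing from the PATH. The helper name missing_tools is hypothetical and not part of this repository, and the check does not verify that the NVIDIA Container Toolkit is installed.

```python
import shutil

# Command-line tools named in the prerequisites above.
REQUIRED_TOOLS = ["conda", "docker"]


def missing_tools(tools):
    """Return the tools from `tools` that are not found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing prerequisites:", ", ".join(missing))
    else:
        print("Conda and Docker were both found on the PATH.")
```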

Then, download this repo and move into it:

git clone https://github.com/mesnico/ai4eu-text-to-visual-component
cd ai4eu-text-to-visual-component

Create the Conda environment with the Python dependencies, activate it, and add the project root to PYTHONPATH:

conda env create --file environment.yml
conda activate ai4eu
export PYTHONPATH=.

Image Features Extraction

Pull the following Docker image to extract bottom-up features from the images:

docker pull mesnico/bottom-up-extractor

At this point, you can follow the instructions reported here to run the container. At the end of this process, you will find the extracted features in OUT_PATH/bu_features.tsv.0.

You then need to run the following command to convert the bottom-up features to a suitable format:

python bu_features_convert.py --in_file OUT_PATH/bu_features.tsv.0 --output_dir OUT_PATH/bu_features

Image Indexing

The next step is to extract and index the TERN cross-modal features. First, download the tar archive containing the TERN pre-trained models (download link) into the project root. Then extract it:

tar -xvf ai4eu_tern_data.tar

At this point, you can run the extraction and indexing code (OUT_PATH is the directory that contains the bu_features folder):

python index_bu_features.py --features_dir OUT_PATH/
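
The conversion and indexing commands can also be assembled programmatically, which helps when the same OUT_PATH is reused across steps. The sketch below only builds the argument lists, using the script names and flags shown above; the helper pipeline_commands is hypothetical and not part of this repository.

```python
from pathlib import Path


def pipeline_commands(out_path):
    """Build the argument lists for the conversion and indexing scripts."""
    out = Path(out_path)
    convert = [
        "python", "bu_features_convert.py",
        "--in_file", str(out / "bu_features.tsv.0"),
        "--output_dir", str(out / "bu_features"),
    ]
    index = ["python", "index_bu_features.py", "--features_dir", str(out)]
    return [convert, index]


if __name__ == "__main__":
    # Print the commands instead of running them.
    for command in pipeline_commands("OUT_PATH"):
        print(" ".join(command))
```

Each list can be passed to subprocess.run(command, check=True) from the project root, with the Conda environment activated and the pre-trained models already extracted.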

At the end of this process, you should find a faiss_index folder in the project root containing the necessary FAISS index files.
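
To illustrate what the index is used for at query time, here is a conceptual brute-force search in plain Python. It assumes cosine similarity between non-zero feature vectors as the ranking measure, which this README does not state; the actual FAISS index type and metric used by index_bu_features.py may differ, and FAISS performs this kind of search far more efficiently on large collections.

```python
import math


def normalize(vector):
    """Scale a non-zero vector to unit length, so dot product = cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def search(index, query, k):
    """Return the k (image_id, score) pairs whose vectors best match the query."""
    q = normalize(query)
    scored = [
        (image_id, sum(a * b for a, b in zip(normalize(vec), q)))
        for image_id, vec in index.items()
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Toy 3-dimensional vectors; real feature vectors are much longer.
    toy_index = {
        "img_a": [1.0, 0.0, 0.0],
        "img_b": [0.0, 1.0, 0.0],
        "img_c": [0.7, 0.7, 0.0],
    }
    for image_id, score in search(toy_index, [1.0, 0.1, 0.0], k=2):
        print(image_id, round(score, 3))
```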

Customize the URLs in the index

By default, the image URLs stored in the index are simply the original image names (without extension). You can customize how a URL is generated from an image name by modifying the implementation of the ids_to_urls(ids) function in index_bu_features.py. This is useful for retrieving the matched images from a remote web server on the Internet.
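
One possible customization is sketched below. It assumes that ids is an iterable of image names and that the function should return a list of strings; check the existing implementation in index_bu_features.py for the exact contract. The base URL and the file extension are hypothetical placeholders.

```python
from urllib.parse import quote

# Hypothetical base URL and file extension; replace them with your own.
BASE_URL = "https://images.example.org/collection"
EXTENSION = ".jpg"


def ids_to_urls(ids):
    """Map image names (without extension) to full image URLs."""
    # quote() percent-encodes characters that are not safe in a URL path.
    return [f"{BASE_URL}/{quote(str(image_id))}{EXTENSION}" for image_id in ids]


if __name__ == "__main__":
    print(ids_to_urls(["000000123456", "beach 01"]))
```

Because the URLs are generated when the index is built, the index has to be rebuilt after changing this function.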

Test the gRPC interface

To test the retrieval system, including the gRPC interfaces, start the gRPC server:

python app.py

Then, you can perform a toy call to this server using the following script:

python client_test.py --query "A tennis player serving the ball on the court"

If all goes well, this script should print the URLs of the images most similar to the given query text.

Packing the Acumos-ready Docker Image

You can pack the component, together with the built image index, into an Acumos-ready Docker image that exposes the gRPC interfaces:

docker build -t text-to-visual-search-component .

The built image can be uploaded to Docker Hub and then onboarded on AI4EU Acumos.
