Hiya! Thanks for making this library out in the open! I've been tryi

pulling in <a class="user-mention notranslate" data-hovercard-type="user" data-hoverca

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

anyway here is the official recommendation <a href="https://github.com/facebookresearc

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

<a class="user-mention notranslate" data-hovercard-type="user" data-hover

Error Reconstructing FAISS Index about retro-pytorch HOT 18 CLOSED

lucidrains commented on June 6, 2024 1

Error Reconstructing FAISS Index

from retro-pytorch.

Comments (18)

lucidrains commented on June 6, 2024 1

pulling in @rom1504 because he might know 😂

from retro-pytorch.

ncoop57 commented on June 6, 2024 1

@lucidrains fixed the issue with this PR using the guidance from @rom1504: #16 As he alluded to, we lose the memmaping ability :(. I did find a few more errors after that in the generator function, but I'll open a separate issue to discuss it and fixes.

from retro-pytorch.

rom1504 commented on June 6, 2024 1

I recommend never using faiss gpu

from retro-pytorch.

rom1504 commented on June 6, 2024 1

anyway here is the official recommendation https://github.com/facebookresearch/faiss/wiki/Comparing-GPU-vs-CPU

but in practice I've never seen a case where it's beneficial to use GPU for faiss

faiss cpu is very fast

from retro-pytorch.

lucidrains commented on June 6, 2024

@ncoop57 ohh interesting, i wonder if this is an issue with autofaiss 🤔

from retro-pytorch.

ncoop57 commented on June 6, 2024

@ncoop57 ohh interesting, i wonder if this is an issue with autofaiss 🤔

@lucidrains looked into issues in autofaisa and no one's brought it up. Might be due to faiss itself as I found this issue: facebookresearch/faiss#709

Is there a specific environment that you've done to get it working?

from retro-pytorch.

rom1504 commented on June 6, 2024

Hey,
The answer is exactly the error.
Some index do not implement reconstructing embeddings, that's not a bug.

Why do you need to reconstruct?

Anyway you have these choices:

Don't reconstruct
Use reconstruct_n method
Use index.make_direct_map() then you can use reconstruct. If you're using autofaiss there is an option of the same name

Now for the technical explanation: reconstructing an embedding using an index and an id requires having a way to go from id to embedding location in the index.
If the index is a flat index IE a simple embedding matrix, the solution is obvious: simply slice the matrix with the id.
However for example if your index is IVF which organizes the embeddings as N clusters then there is no obvious way to do it.
If you call reconstruct_n without the make_direct_map with an IVF index, faiss will scan the whole index looking for the index you want.
If you do call make_direct_map beforehand, faiss will prebuild a dict between embedding id to position in the index (ie cluster id + position in the cluster)

Hope that's clears things up!

from retro-pytorch.

rom1504 commented on June 6, 2024

Note a small drawback of index.make_direct_map() : it does not currently support memory mapping so the mapping will go to ram. That's about 12 * number of embeddings in ram. Eg 1.2GB for 100M embeddings
Could clearly be memory mapped as well but faiss didn't implement it
If you find that necessary, I'd gladly guide anyone wanting to implement that either outside of fais in python or inside of faiss in c++

from retro-pytorch.

rom1504 commented on June 6, 2024

Ah yeah i forgot the last and best option: if your use case is to reconstruct embeddings that are coming out of a search, then don't use reconstruct at all!
Instead use search_and_reconstruct which give you embeddings directly. No need for the direct map in that case

That's what I use eg in clip retrieval

from retro-pytorch.

lucidrains commented on June 6, 2024

oh I see, thanks for the lengthy explanation

do you know which setting for the build index function would disable the reconstructing? https://github.com/lucidrains/RETRO-pytorch/blob/main/retro_pytorch/retrieval.py#L288 I don't believe I'm using it anywhere else in the app and the error is being emitted at this step

from retro-pytorch.

rom1504 commented on June 6, 2024

do you know which setting for the build index function would disable the reconstructing? https://github.com/lucidrains/RETRO-pytorch/blob/main/retro_pytorch/retrieval.py#L288 I don't believe I'm using it anywhere else in the app and the error is being emitted at this step

that doesn't make sense no.
This error would be emitted only if index.reconstruct is called

from retro-pytorch.

ncoop57 commented on June 6, 2024

do you know which setting for the build index function would disable the reconstructing? https://github.com/lucidrains/RETRO-pytorch/blob/main/retro_pytorch/retrieval.py#L288 I don't believe I'm using it anywhere else in the app and the error is being emitted at this step

that doesn't make sense no.
This error would be emitted only if index.reconstruct is called

I can confirm that this error arising with the build_index function. Specifically when should_be_mem_mappable flag is True @rom1504

from retro-pytorch.

rom1504 commented on June 6, 2024

Ok that is wrong, can you share the whole stack trace ?

from retro-pytorch.

rom1504 commented on June 6, 2024

So the bug is only that faiss-gpu is installed instead of faiss-cpu.
Just change that line in the colab and things work.

from retro-pytorch.

ncoop57 commented on June 6, 2024

So the bug is only that faiss-gpu is installed instead of faiss-cpu.
Just change that line in the colab and things work.

Oh is there any speed difference in creating the index if we used faiss-cpu vs. GPU? My assumption was that it would be fast if we accelerated using the GPU. And I think I was getting a separate error when I didn't install the faiss-gpu something like "faiss has no 'StandardGPU...' attribute".

from retro-pytorch.

ncoop57 commented on June 6, 2024

@rom1504 in the build_index, the flag use_gpu = torch.cuda.is_available(), is enabled. Do you recommend turning that off when building the index and wanting to have the mem mappable?

from retro-pytorch.

rom1504 commented on June 6, 2024

Oh is there any speed difference in creating the index if we used faiss-cpu vs. GPU?

yes it's usually slower on gpu

from retro-pytorch.

ncoop57 commented on June 6, 2024

@lucidrains I can make an additional PR reverting the changes and removing the GPU flag to only use CPU if you'd like

from retro-pytorch.

Error Reconstructing FAISS Index about retro-pytorch HOT 18 CLOSED

Comments (18)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent