Comments (9)

yshui avatar yshui commented on June 3, 2024

Same problem with fetchTree. And documentation for tarball-ttl doesn't mention fetchTree. I wonder if fetchTarball is broken because it uses the same underlying implementation as fetchTree.

In that sense maybe fetchTree should respect tarball-ttl as well.

Looks like fetchurl is using the same code as well 🤔

from nix.

yshui avatar yshui commented on June 3, 2024

OK, this has nothing to do with nix-daemon. The problem is in fetcher-cache.sqlite: this particular url has immutable (or locked, as it is referred to in code) set to 1. (I was running as root to make nix not use nix-daemon, and root coincidentally uses a different fetcher-cache.)

What does immutable mean? And what will cause it to be set to 1?


yshui avatar yshui commented on June 3, 2024

Ah I see, if I fetch in pure mode with a sha256, it will cause immutable to be set to 1. Later nix will never fetch the url again even if I try to fetch with a different sha256.

This would break the workflow of: changing sha256 to something invalid -> run nix -> copy the new hash in error message to sha256, which is used to update fixed-output derivations. I think this is undesirable.


tomberek avatar tomberek commented on June 3, 2024

Seems like there are two behaviors and use-cases:

  1. This is expected to be a stable URL, so the cached hash and content are correct and there is no need to re-fetch: you already have the desired content.
  2. This is known to be unstable, and thus the subsequent "--impure" request should be re-checked.


Ericson2314 avatar Ericson2314 commented on June 3, 2024

It seems we need conceptually two tables:

  • For URLs without a specified hash, the hash is not part of the primary key.

    • We look up just the URL
    • We might find a previous fetch and hash
    • If we think the URL should not change, we can keep that entry forever; otherwise we should expire it after some amount of time. (I think we already do such expiring.)
  • Conversely, for URLs with a specified hash, the hash is part of the primary key.

    • We look up the URL and the hash
    • If the specified hash changes, we can't just silently succeed with the wrong hash. We either need to try fetching again, or complain that we think this will fail because of the other entry (in either table) with a different hash.

The bug is that pure fetching (the second table) is interfering with impure fetching cache lookups (the first table).

Conversely, it probably is fine if the first table entry (impure with TTL) affects the second one.

We can think of the two tables as asking these two questions:

  1. "What is the current contents at this URL?" (impure, with TTL)

    • "URL -> Hash" shape
  2. "Did this URL ever contain the specified hash?" (pure, no TTL)

    • "(URL, Hash) -> bool" shape

Then it makes sense when each can affect queries of the other:

  • 1 implies 2: what the URL currently points to is definitely something that the URL once pointed to. So 2 includes 1, and 1 can be used to answer queries against 2.

  • 2 does not imply 1: past values are not necessarily current values.

(The implementation could have 2 be just an overlay over 1, e.g. using SQL unions, or the implementation could simply copy 1 things into 2.)
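
To make the two-table idea above concrete, here is a minimal sketch in Python with an in-memory SQLite database. It is not Nix's actual fetcher-cache schema: the table names `url_current` and `url_seen`, the helper functions, and the `TTL_SECONDS` value are all assumptions made up for illustration. It takes the "copy 1 into 2" option rather than a SQL union.

```python
import sqlite3

TTL_SECONDS = 3600  # hypothetical TTL, standing in for a setting like tarball-ttl


def open_cache():
    db = sqlite3.connect(":memory:")
    # Table 1: "what is the current content at this URL?" (URL -> hash, with TTL)
    db.execute(
        "CREATE TABLE url_current ("
        " url TEXT PRIMARY KEY, hash TEXT NOT NULL, fetched_at REAL NOT NULL)"
    )
    # Table 2: "did this URL ever contain this hash?" ((URL, hash) -> bool, no TTL)
    db.execute(
        "CREATE TABLE url_seen ("
        " url TEXT NOT NULL, hash TEXT NOT NULL, PRIMARY KEY (url, hash))"
    )
    return db


def record_impure_fetch(db, url, hash_, now):
    # An impure fetch tells us the current content of the URL...
    db.execute(
        "INSERT OR REPLACE INTO url_current VALUES (?, ?, ?)", (url, hash_, now)
    )
    # ...and "1 implies 2": the current content is also content the URL once had.
    db.execute("INSERT OR IGNORE INTO url_seen VALUES (?, ?)", (url, hash_))


def record_pure_fetch(db, url, hash_):
    # A pure fetch only touches table 2; it says nothing about "current".
    db.execute("INSERT OR IGNORE INTO url_seen VALUES (?, ?)", (url, hash_))


def lookup_impure(db, url, now):
    """Return the cached hash for url, or None if missing or older than the TTL."""
    row = db.execute(
        "SELECT hash, fetched_at FROM url_current WHERE url = ?", (url,)
    ).fetchone()
    if row is None or now - row[1] > TTL_SECONDS:
        return None
    return row[0]


def lookup_pure(db, url, hash_):
    """Return True if url is known to have served content with this hash."""
    row = db.execute(
        "SELECT 1 FROM url_seen WHERE url = ? AND hash = ?", (url, hash_)
    ).fetchone()
    return row is not None
```

In this sketch a pure fetch never writes to `url_current`, so it cannot pin an impure lookup the way the bug describes, while an impure fetch still feeds later pure lookups.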


yshui avatar yshui commented on June 3, 2024

> It seems we need conceptually two tables:

This makes sense to me.

But practically we can probably just store table 2. --impure queries can have a "sentinel" hash that always triggers a fetch.


Ericson2314 avatar Ericson2314 commented on June 3, 2024

> But practically we can probably just store table 2. --impure queries can have a "sentinel" hash that always triggers a fetch.

How would that work with the TTL? If you claim a different hash than the most recent entry (which, I agree, can perhaps just be an index on table 2), then you have to redownload. On the other hand, if you have no hash (or the sentinel hash), then you will succeed within the TTL with the current entry's hash.

So I think I almost agree, but it's the other way around: it's the non-sentinel hash that always triggers a fetch (if that entry isn't cached).
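
The lookup policy described here can be sketched with a single store keyed by (URL, hash), where "no hash" plays the role of the sentinel. This is only an illustration under stated assumptions: the `FetchCache` class, its methods, and `TTL_SECONDS` are hypothetical names, not anything in Nix.

```python
TTL_SECONDS = 3600.0  # hypothetical TTL


class FetchCache:
    """Single table keyed by (url, hash); the value is the last fetch time."""

    def __init__(self):
        self.entries = {}

    def record(self, url, hash_, now):
        self.entries[(url, hash_)] = now

    def lookup(self, url, expected_hash, now):
        """Return a usable hash, or None if the caller has to fetch.

        expected_hash=None stands for the sentinel (no hash specified).
        """
        if expected_hash is not None:
            # A specific hash hits only if exactly that entry is cached.
            # There is no TTL here: if the URL ever had this content, that
            # stays true. Any other hash is a miss and triggers a fetch.
            if (url, expected_hash) in self.entries:
                return expected_hash
            return None
        # No hash: use the most recent entry for this URL, but only within the TTL.
        candidates = [(t, h) for (u, h), t in self.entries.items() if u == url]
        if not candidates:
            return None
        fetched_at, hash_ = max(candidates)
        if now - fetched_at > TTL_SECONDS:
            return None
        return hash_
```

The "most recent entry per URL" scan in `lookup` is what an index on table 2 would provide in a real database.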


yshui avatar yshui commented on June 3, 2024

Actually, I found another problem that can be solved by this proposal as well.

Right now cache lookup doesn't try to match the hash. So it's possible to have this scenario:

  1. You added a fixed url source to a flake.
  2. You then generated a flake.lock. This caused the url to be downloaded into the store at storePathA; let's say its hash is hashA. url -> hashA + storePathA was stored into the cache, and url -> hashA is locked in flake.lock.
  3. Later, maybe for unrelated reasons, you fetched the same url again, and its content had changed. So the cache entry was updated to url -> hashB + storePathB.
  4. You now try to evaluate the flake again. nix will try to fetch the fixed url and get storePathB from the cache, so it will complain that hashB doesn't match hashA. The desired result is for nix to use storePathA, because that's what's locked in flake.lock.
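
The four steps above can be replayed with two toy caches in Python: one keyed by URL only, as the scenario describes, and one keyed by (URL, hash), as the proposal suggests. The function names and the example URL are hypothetical, and plain dictionaries stand in for the real cache.

```python
def lookup_by_url(cache, url, locked_hash):
    """Lookup that ignores the hash, then checks it afterwards."""
    hash_, store_path = cache[url]
    if hash_ != locked_hash:
        raise ValueError(f"hash mismatch: got {hash_}, expected {locked_hash}")
    return store_path


def lookup_by_url_and_hash(cache, url, locked_hash):
    """Lookup where the hash is part of the key; None means 'must fetch'."""
    return cache.get((url, locked_hash))


url = "https://example.invalid/source.tar.gz"  # hypothetical URL

# Cache keyed by URL only: step 3 overwrites the entry written in step 2.
by_url = {}
by_url[url] = ("hashA", "storePathA")  # step 2: flake.lock locks hashA
by_url[url] = ("hashB", "storePathB")  # step 3: content changed upstream

# Cache keyed by (URL, hash): both entries survive side by side.
by_pair = {}
by_pair[(url, "hashA")] = "storePathA"  # step 2
by_pair[(url, "hashB")] = "storePathB"  # step 3

# Step 4: evaluate the flake again with hashA locked.
try:
    lookup_by_url(by_url, url, "hashA")
    mismatch = False
except ValueError:
    mismatch = True

result = lookup_by_url_and_hash(by_pair, url, "hashA")
```

With the URL-only cache, step 4 ends in a hash mismatch; with the (URL, hash) key, the same lookup still finds storePathA.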


nixos-discourse avatar nixos-discourse commented on June 3, 2024

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2024-01-22-nix-team-meeting-minutes-117/38838/1

