Giter VIP home page Giter VIP logo

Comments (5)

scrabsha avatar scrabsha commented on May 18, 2024 1

TLDR: my previous comment was very wrong.

I failed to understand what you meant by implementing equality thanks to hash. Your idea is valid.

You're also right about trait object safety. Trait objects are things I try to avoid in my own code. As such, their rules are not always clear to me.

It may also be useful to quote my apologies from the Discord server:

I see that most of my comments where irrelevant and/or impossible to do.
Additionally, the tone I used to explain my point of view was not good. I apologise for that
Thank you for staying constructive despite my not very useful feedback.

I am quite sure that the Hashing algorithm used in the Rust standard library is not designed to be fast, but rather to generate at least hash collisions as possible. So this may decrease performances, which is fine since we are in early phases of development.

I think that it's the only thing I can add to the conversation right now. I definitely lack some Clojure knowledge.

from clojurers.

scrabsha avatar scrabsha commented on May 18, 2024

Some rectifications here: PartialEq and Hash are completely independent. However, Eq, which is required to implement Hash, is directly derived from PartialEq. The difference between Eq and PartialEq is that a type which implements PartialEq but no Eq represents a partial equivalence relation (source), and having Eq implemented tells the compiler that this partial equivalence relation defined with the PartialEq implementation is in fact an equivalence relation (source).

In other words, if T implements PartialEq but not Eq, then there could be a of type T for which a != a. This is for example the case with floating-point values, in which NaN != NaN. On the other hand, having both PartialEq and Eq implemented for T means that for every a of T, we have a == a. This is the case for integers, for example.

Some kind of solution:

I think the best way to implement this is to slightly modify how Ifns are constructed. Currently, the IFn trait is implemented by some structures, such as Fn, AddFn, ConcatFn, DoFn, and so on. A great way to allow hashing would be to divide the problem as much as possible.

As far as I know, special forms are empty structures, so we would just have to add #[derive(Hash)] before declaration.

Regular forms (represented by Fn) are more tricky. A way to hash them would be to assign to each form an unique ID (perhaps wrapped in an usize), and when asked to hash the form, return the hash of the unique ID instead. This comes with a downside: we are limited in the number of Ifn we declare. To be exact, we would be limited to "only" usize::max Fns for each program execution.

Now we need to tell the compiler that every kind of structure which implements Ifn also implements Hash and Eq. We can modify the definition of the IFn trait, so that it is only implementable on types which also implement them. The following definition of Ifn could work:

pub trait IFn: Debug + DynClone + Hash + Eq {
    // ...
}

Note:

Please do not automatically derive Eq and Hash for Fn. This data structure contains a lot of data in it. It would slow dramatically the program to have to check for each field for equality, or to hash them.

Note (2):

It may be noted that I am a Rustacean who like LISPs, but never used them. As such, the implementation described in the paragraph before may be entirely wrong. I am making the assumption that two functions declared at two different places in the code, even if they arguments and bodies are the same, won't result in the same hash. Please correct me if I'm wrong.

Edit: fix typo.

from clojurers.

Tko1 avatar Tko1 commented on May 18, 2024

I appreciate this, but you don't have to explain what a partial equivalence is, or the relationship between Eq and Hash; I know these things and nothing about my message was meaning to imply otherwise. Me mentioning PartialEq and Hash together is not implying that PartialEq is the trait bound on Hash, I realize that's Eq and that PartialEq is only a transitive dependence.
However, there is reason for me mentioning defining PartialEq in terms of Hash; as mentioned, I cannot derive Hash and PartialEq and Eq

[derive(...PartialEq,Eq,Hash)

, so I need to implement them manually as

impl PartialEq for Value { 
   ....
} 
impl Eq for Value {} 
impl Hash for Value {
   ...
}

The thing is, as we know, if a == b, then hash(a) == hash(b). So what I was wondering is whether or not I should be really just use the hashes of Values to determine equality , thereby really only defining equality once (ie,

impl Hash for Value { 
    ... Still defining this
}
// I'm just pulling this example function straight from the std::Hash docs 
fn calculate_hash<T: Hash>(t: &T) -> u64 {
    let mut s = DefaultHasher::new();
    t.hash(&mut s);
    s.finish()
impl PartialEq for Value { 
        fn eq(&self, other: &Value) -> bool {
             // Here we just reuse our hashes
             calculate_hash(self) == calculate_hash(other) 
        }
}
impl Eq for Value {} 

Another reason I was specifically thinking this is I believe Clojure proper works this way as well, although I have to check again; that it uses an efficient hashing algorithm on objects, and uses that directly for checking equality.

And indeed, two functions with the same definition are not equal / do not hash the same.
Finally,

Now we need to tell the compiler that every kind of structure which implements Ifn also implements Hash and Eq. We can modify the definition of the IFn trait, so that it is only implementable on types which also implement them. The following definition of Ifn could work:

pub trait IFn: Debug + DynClone + Hash + Eq {
    // ...
}

Again, unless I'm missing something, give me more credit ahaha (EDIT: not literally, I'm glad to hear suggestions. But I likewise wanted to do this rather than resorting to what's there now). You can't do this outright because IFn is used as a trait object, and both Eq and Hash will break object safety. Hash has a generic type parameter, and Eq uses Self as a type parameter.

from clojurers.

arbroween avatar arbroween commented on May 18, 2024

First of all I just want to say that maybe I'm missing something as I have never written Hash and PartialEq manually.

Nevertheless if I have understood correctly, could implementing PartialEq in term of Hash not introduce problems in case of hash collision ? In this case it would be true that hash(a) == hash(b) but you would not want that a == b, or would you ?

Perhaps Hash could be used in PartialEq to speed up the cases where hash(a) != hash(b) but if hash(a) == hash(b) then it would have to check whether a and b are really equals. I cannot tell if it would be worth it or not.

from clojurers.

Tko1 avatar Tko1 commented on May 18, 2024

could implementing PartialEq in term of Hash not introduce problems in case of hash collision

I've thought about that but I don't know yet. I suspect so, and its a case where I wanted to also look at the clojure base and see if that sort of thing's a problem, and how its handled, or see if I was mistaken about just how hashes are used there. I did forget about this when I wrote this issue, though, as I forgot I was considering dropping this idea altogether because of this. Regardless, I want to look into it

from clojurers.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.