same-file's Introduction

same-file

A safe and cross platform crate to determine whether two files or directories are the same.

Dual-licensed under MIT or the UNLICENSE.

Documentation

https://docs.rs/same-file

Usage

Add this to your Cargo.toml:

[dependencies]
same-file = "1"

Example

The simplest use of this crate is to use the is_same_file function, which takes two file paths and returns true if and only if they refer to the same file:

use same_file::is_same_file;

fn main() {
    assert!(is_same_file("/bin/sh", "/usr/bin/sh").unwrap());
}

Minimum Rust version policy

This crate's minimum supported rustc version is 1.60.0.

The current policy is that the minimum Rust version required to use this crate can be increased in minor version updates. For example, if crate 1.0 requires Rust 1.20.0, then crate 1.0.z for all values of z will also require Rust 1.20.0 or newer. However, crate 1.y for y > 0 may require a newer minimum version of Rust.

In general, this crate will be conservative with respect to the minimum supported version of Rust.

same-file's People

Contributors

alexcrichton, andygauge, burntsushi, cjubb39, freemasen, guillaumegomez, gurgalex, ignatenkobrain, jackpot51, kodraus, levicole, yandexx

same-file's Issues

Unknown Targets

I am curious if you would be willing to add some default implementations to this crate to allow for compilation on unknown targets?

These implementations could panic on use but would allow this crate to compile.

The reason I bring this up is that a few libraries seem to pull in this crate even though they aren't using its functionality. For example, mdbook depends on handlebars (with default features), which depends on walkdir, which depends on this crate. That means that if I want to use the structs exported by the mdbook library in a project that compiles to, say, wasm32-unknown-unknown, the compiler stops me. This specific example could be solved by getting mdbook's handlebars dependency updated to use the no_dir_source feature, but since this isn't the first time I've run into this problem, I thought it might be worth exploring a solution at this level.

In most situations there would still be an issue at a higher level but it would at least alleviate a small amount of pain.
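As a rough illustration of the request, a fallback implementation could return an error instead of panicking, so the crate compiles everywhere and the gap only shows up at run time. The sketch below is hypothetical: the `fallback` module, its `Handle` type, and the error message are invented for illustration and are not taken from same-file. In a real crate the module would presumably be gated with something like `#[cfg(not(any(unix, windows)))]`.

```rust
// Hypothetical fallback module: none of these names come from same-file.
mod fallback {
    use std::io;
    use std::path::Path;

    /// Stand-in for a platform handle on targets without a real implementation.
    #[derive(Debug)]
    pub struct Handle;

    impl Handle {
        /// Always fails, so callers get an error rather than a panic.
        pub fn from_path<P: AsRef<Path>>(_path: P) -> io::Result<Handle> {
            Err(io::Error::new(
                io::ErrorKind::Unsupported,
                "file identity is not supported on this target (sketch)",
            ))
        }
    }
}
```

Returning `io::ErrorKind::Unsupported` keeps the public signatures unchanged, which is the main appeal of this approach over a panicking stub.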

Note why `Handle` methods in `win` won't panic

Relevant API Guidelines:

The win implementation of Handle calls unwrap in the following methods:

  • Drop::drop
  • AsRawFd::as_raw_fd
  • IntoRawFd::into_raw_fd
  • as_file
  • as_file_mut

These calls should either have a comment inline documenting why they can't fail, or a Panics section should be added to the docs documenting when they will fail.

Release 1.0 on crates.io

The libz blitz evaluation is finished 🎉 (a big thanks to everyone who contributed!). I guess the next steps are:

  • Final pass over documentation
  • Changelog (#13)
  • Release 1.0

Does that sound right to you @BurntSushi?

Include license files

The MIT license, at least, needs to be published alongside the source code to really be in effect. Although I believe you don't need to include a copy of the Unlicense, it would be nice to include it as well.

Also, there's a COPYING file that contains the BSD-3-Clause license. Is that intended to be part of this repository, or is it just left over from the initial setup? (I notice that it was introduced in the first commit and the readme included it, but the first commit also had Unlicense/MIT in the Cargo.toml, and the readme was changed very soon after.)

`is_same_file` and `Handle::from_path` hang for named pipes on linux

How to reproduce:

  1. Create named pipe: mkfifo my_pipe

  2. Execute following program:

    use same_file::is_same_file;
    
    fn main() {
        let r = is_same_file("/path/to/my_pipe", "/path/to/my_pipe");
        println!("{r:?}");
    }

The program prints nothing and does not exit.

same-file: 1.0.6
rustc 1.58.1 (db9d1b20b 2022-01-20)
Linux 5.16.5-arch1-1 x86_64 GNU/Linux (Arch Linux)
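
On Linux, opening a FIFO for reading blocks until some process opens it for writing. If the handle is created by opening the path (an assumption about the crate's internals, not confirmed here), that would be consistent with the hang above. A caller-side workaround, sketched below for Unix only, is to inspect the file type before handing the path to `is_same_file`. The helper name `is_fifo` is hypothetical.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::FileTypeExt;
use std::path::Path;

/// Returns true if `path` names a FIFO. Uses stat-style metadata,
/// so the FIFO itself is never opened and the call does not block on it.
fn is_fifo<P: AsRef<Path>>(path: P) -> io::Result<bool> {
    Ok(fs::metadata(path)?.file_type().is_fifo())
}
```

A caller could skip or special-case any path for which this returns `Ok(true)`. Note that the check and the later open are separate steps, so the file type can still change in between.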

Do a pass over documentation

Before the stable release, we should do a full pass over the crate documentation and make sure:

  • There aren't any grammatical errors in the prose
  • It all flows nicely
  • The examples are easy to follow and all make sense
  • The links to other items all work

return errors on unsupported targets (e.g., wasm)

error[E0433]: failed to resolve. Use of undeclared type or module `imp`
   --> C:\Users\harmony\.cargo\registry\src\github.com-1ecc6299db9ec823\same-file-1.0.3\src\lib.rs:105:19
    |
105 | pub struct Handle(imp::Handle);
    |                   ^^^ Use of undeclared type or module `imp`

version

nightly-x86_64-pc-windows-msvc (default)
rustc 1.30.0-nightly (39e6ba821 2018-08-25)

Note why `Handle` methods in `unix` won't panic

Relevant API Guidelines:

The unix implementation of Handle calls unwrap in the following methods:

  • Drop::drop
  • AsRawFd::as_raw_fd
  • IntoRawFd::into_raw_fd
  • as_file
  • as_file_mut

These calls should either have a comment inline documenting why they can't fail, or a Panics section should be added to the docs documenting when they will fail.

Interest check: adding a serialize, deserialize impl for `Handle`

I need to compare Handle values between program runs.

Constraints

  • I can’t store file paths but do store metadata as contained in a Handle value (perhaps except the file field when it is Some).
  • The program is cross-platform. Handle values need not be comparable across platforms. Handle values do need to be constructed using deserialization, across platforms.
  • The theoretical issue that Windows file handles are only guaranteed unique as long as both are open is not prohibitive for me, assuming they do ‘tend’ to be unique and same-file has coded around this already using the file size metadata.

Proposal

Add a serde feature to enable this functionality.

Plan

After finishing discussion and getting a go-ahead, I’ll file a PR.

Issues with this crate over samba.

I have a samba share mounted with cifs-utils. I'm doing some file operations in the share and it is case insensitive, which is fine. However, it seems to use virtual inode numbers, because the Handles have different numbers. Here's a sample of the issue:

Handle { file: Some(File { fd: 3, path: "GoodFellas (1990)", read: true, write: false }), is_std: false, dev: 46, ino: 10248 }
Handle { file: Some(File { fd: 3, path: "Goodfellas (1990)", read: true, write: false }), is_std: false, dev: 46, ino: 11984 }

Only the "GoodFellas (1990)" directory exists, but both uppercase and lowercase paths work to access it. But they report different inode numbers, which makes this library report a false negative.

I'm not really sure if anything can be done about this...

Note that `dev` and `ino` methods are unix only

The dev and ino methods on Handle are unix-only. This should be called out in the method docs.

Currently, Handle wraps a platform-specific implementation so I don't think we need an extension trait. Just mentioning that they're unix-only should be enough. But @BurntSushi might have other ideas.

Support stdin, stdout, and stderr in Windows Powershell

Testing handle equality between file paths and the standard streams works on Windows under cmd, but not under PowerShell (tested with versions 5 and 6):

PS G:\dev\lib\same-file> cargo run --example is_stderr 2> examples/stderr
"examples/is_same_file.rs" is NOT stderr!
"examples/is_stderr.rs" is NOT stderr!
"examples/stderr" is NOT stderr!

Tracking issue for libz blitz evaluation of same-file

File size is not actually checked on Windows

I was checking out the source of this library purely out of interest, and while the comments and docs say at a few places that file size is compared on windows to avoid some false positives, this does not actually seem to be the case.

The key used for comparison on Windows is

struct Key {
    volume: u64,
    index: u64,
}

which does not include the file size, nor could I find any other place where file size is taken into account.

Erroneous `impl Eq` on Windows Handle

Hi! It seems that the Eq trait is erroneously defined on win::Handle (and thus on the top-level wrapper struct).

The comments mention the following:

        match file_info(&file) {
            Ok(info) => Ok(Handle::from_file_info(file, true, info)),
            // In a Windows console, if there is no pipe attached to a STD
            // handle, then GetFileInformationByHandle will return an error.
            // We don't really care. The only thing we care about is that
            // this handle is never equivalent to any other handle, which is
            // accomplished by setting key to None.
            Err(_) => Ok(Handle { file: Some(file), is_std: true, key: None }),
        }

If the Err branch is hit, we'll produce a struct that doesn't satisfy the Eq requirement that a == a.

I'd suggest either:

  1. Removing the impl. This seems less desirable as it would be an externally breaking change.
  2. Altering the PartialEq::eq implementation to look at the file and is_std fields iff the key is None for both self and other.

I'm happy to write the pull request but I'm unsure which direction you'd prefer.
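
To make the reflexivity problem concrete, here is a simplified model of the two behaviours. It is only a sketch: `ModelHandle`, its `raw` field (a stand-in for the underlying OS handle value), and both comparison functions are hypothetical and do not reproduce same-file's actual code.

```rust
// Simplified model of the issue; these types are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Key {
    volume: u64,
    index: u64,
}

#[derive(Debug)]
struct ModelHandle {
    raw: usize, // stand-in for the underlying OS handle value
    is_std: bool,
    key: Option<Key>,
}

/// Behaviour described in the issue: a missing key never compares equal,
/// not even to itself.
fn eq_current(a: &ModelHandle, b: &ModelHandle) -> bool {
    match (a.key, b.key) {
        (Some(x), Some(y)) => x == y,
        _ => false,
    }
}

/// Option 2: fall back to `raw` and `is_std` only when both keys are missing.
fn eq_proposed(a: &ModelHandle, b: &ModelHandle) -> bool {
    match (a.key, b.key) {
        (Some(x), Some(y)) => x == y,
        (None, None) => a.raw == b.raw && a.is_std == b.is_std,
        _ => false,
    }
}
```

With a handle whose `key` is `None`, `eq_current(&h, &h)` is false while `eq_proposed(&h, &h)` is true, which is the `a == a` property that `Eq` requires.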

Question about methodology

Sorry to be opening an issue for a question. I was curious why this library exists when std::fs::canonicalize is available, which Rust guarantees will return the real path. Is the concern a race condition, where the sequence

  • open file
  • move file
  • move other file to that location

would cause distinct files to return as matches due to having the same path across time, an edge case this lib handles? What stronger guarantees exactly does this library aim to offer?

This may benefit from a follow-up to the README clarifying its utility.
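
One difference that can be shown without the crate: canonicalize compares path strings, so two hard links to one file yield two distinct canonical paths even though they name the same file. Comparing the identity reported by open file handles does not have that limitation. The Unix-only sketch below illustrates the general technique with a hypothetical helper, `same_by_inode`. It is not same-file's implementation, although the Handle debug output elsewhere on this page does show `dev` and `ino` fields.

```rust
use std::fs::File;
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::Path;

/// Unix-only sketch: two paths name the same file if their open handles
/// report the same (device, inode) pair.
fn same_by_inode<P: AsRef<Path>, Q: AsRef<Path>>(a: P, b: Q) -> io::Result<bool> {
    // Both files stay open until the comparison has finished.
    let fa = File::open(a)?;
    let fb = File::open(b)?;
    let (ma, mb) = (fa.metadata()?, fb.metadata()?);
    Ok(ma.dev() == mb.dev() && ma.ino() == mb.ino())
}
```

Keeping both files open during the comparison means the metadata describes the files actually opened, rather than whatever happens to live at those paths a moment later.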

Handle’s std::os::unix::io::IntoRawFd implementation panics on stdin/stdout/stderr

This panics:

std::os::unix::io::IntoRawFd::into_raw_fd(same_file::Handle::stdin().unwrap())

The Option<File> field is treated carelessly, and so the reassuring “unwrapping won’t panic” comment in the Drop implementation is wrong.

Really, Option is the wrong tool to use here; it should be ManuallyDrop these days (✓ available within the current MSRV).
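
A minimal, Unix-only sketch of the ManuallyDrop approach suggested above follows. Everything in it is hypothetical (`SketchHandle`, `from_file`, and this `stdin` constructor are invented for illustration), and it is not a patch against the crate. The idea is that the File is always present, so there is nothing to unwrap, and the `is_std` flag alone decides whether the descriptor is closed on drop.

```rust
use std::fs::File;
use std::mem::ManuallyDrop;
use std::os::unix::io::{FromRawFd, IntoRawFd, RawFd};

// Hypothetical type for illustration; not same-file's Handle.
struct SketchHandle {
    file: ManuallyDrop<File>,
    is_std: bool,
}

impl SketchHandle {
    fn from_file(file: File) -> Self {
        SketchHandle { file: ManuallyDrop::new(file), is_std: false }
    }

    fn stdin() -> Self {
        // SAFETY: fd 0 is only borrowed here; Drop never closes it
        // because `is_std` is true.
        let file = unsafe { File::from_raw_fd(0) };
        SketchHandle { file: ManuallyDrop::new(file), is_std: true }
    }
}

impl Drop for SketchHandle {
    fn drop(&mut self) {
        if !self.is_std {
            // SAFETY: the field is dropped exactly once, here.
            unsafe { ManuallyDrop::drop(&mut self.file) }
        }
    }
}

impl IntoRawFd for SketchHandle {
    fn into_raw_fd(self) -> RawFd {
        // Suppress our own Drop, then move the File out of its slot.
        let mut this = ManuallyDrop::new(self);
        // SAFETY: `this` is never dropped, so the File is taken only once.
        let file = unsafe { ManuallyDrop::take(&mut this.file) };
        file.into_raw_fd()
    }
}
```

In this sketch, `SketchHandle::stdin().into_raw_fd()` returns descriptor 0 without panicking and without closing it.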

Support `ino` and `dev` on windows

Both the inode and device numbers are u64 on Unix-like systems.

On Windows, you can get the BY_HANDLE_FILE_INFORMATION structure, which contains:

  • dwVolumeSerialNumber: The dword of the volume (device)
  • nFileIndexHigh: The high-order part of a unique identifier that is associated with a file.
  • nFileIndexLow: The low-order part of a unique identifier that is associated with a file.

It would be nice, if possible, if same-file could map:

  • dwVolumeSerialNumber to dev()
  • nFileIndexHigh << 32 | nFileIndexLow to ino()

This would give users of this library an analogue of inodes on Windows. Alternatively, these values could be exposed through their own functions to make it explicit that they are Windows-only.
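
The proposed mapping is plain integer arithmetic and can be sketched without any Windows API calls. The function names below are hypothetical. The one detail worth noting is that the 32-bit halves must be widened to u64 before shifting.

```rust
/// Combines the two 32-bit halves of a Windows file index into one u64.
fn file_index(high: u32, low: u32) -> u64 {
    // Widen first: shifting a u32 left by 32 bits would overflow.
    (u64::from(high) << 32) | u64::from(low)
}

/// The volume serial number is a DWORD, so it widens losslessly to u64.
fn volume_as_dev(serial: u32) -> u64 {
    u64::from(serial)
}
```

For example, `file_index(1, 2)` yields `0x0000_0001_0000_0002`.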

Clarify Approach to comparing files in docs

Clarify the statement about false positives a little more in the crate docs for the same_file method. This should be enough to encourage users to check the implementation if they have specific requirements, but not complicate things for users who aren't concerned.
