evrignaud / fim
File Integrity Manager
Home Page: https://evrignaud.github.io/fim
License: GNU General Public License v3.0
File sizes should be compared first, before hashing, in order to reduce the risk of hash collisions.
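A minimal sketch of the size-first idea, not Fim's actual implementation: files are grouped by size, and only sizes shared by at least two files are hashed. The function name `find_duplicates` and the choice of SHA-512 are assumptions made for illustration.

```python
import hashlib
import os
from collections import defaultdict


def find_duplicates(paths):
    """Group files by size first, then hash only sizes seen more than once."""
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)

    duplicates = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # a unique size cannot have a duplicate
        by_digest = defaultdict(list)
        for path in same_size:
            digest = hashlib.sha512()
            with open(path, "rb") as handle:
                # Read in 1 MiB chunks so large files are not loaded at once
                for chunk in iter(lambda: handle.read(1 << 20), b""):
                    digest.update(chunk)
            by_digest[digest.hexdigest()].append(path)
        duplicates.extend(
            sorted(group) for group in by_digest.values() if len(group) > 1
        )
    return duplicates
```

Two files can only collide if they have the same size and the same digest, so the size check both narrows the collision window and avoids hashing files that cannot match.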
When using the 1.2.3 distribution:
$ /usr/local/bin/fim/fim fdup
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.rits.cloning.Cloner (file:/usr/local/bin/fim/bin/fim-1.2.3.jar) to field java.util.TreeSet.m
WARNING: Please consider reporting this to the maintainers of com.rits.cloning.Cloner
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
Exception in thread "main" java.lang.UnsupportedOperationException: Path not associated with default file system.
at java.base/java.nio.file.Path.toFile(Path.java:771)
at org.fim.internal.SettingsManager.load(SettingsManager.java:62)
at org.fim.internal.SettingsManager.<init>(SettingsManager.java:49)
at org.fim.command.AbstractCommand.checkHashMode(AbstractCommand.java:51)
at org.fim.command.FindDuplicatesCommand.execute(FindDuplicatesCommand.java:48)
at org.fim.Fim.run(Fim.java:258)
at org.fim.Fim.main(Fim.java:75)
When trying to install from sources:
[INFO] Copying file images/hands-on-little.png
Jul 09, 2020 9:15:02 PM org.asciidoctor.internal.JRubyAsciidoctor renderFile
SEVERE: (RuntimeError) asciidoctor: FAILED: /home/lupe/recolhidos/fim-1.2.3/src/main/asciidoc/slides/fr.adoc: Failed to load AsciiDoc document - Could not find a converter to handle transform: inline_quoted
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 6.690 s
[INFO] Finished at: 2020-07-09T21:15:02+01:00
[INFO] Final Memory: 49M/118M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.asciidoctor:asciidoctor-maven-plugin:1.5.5:process-asciidoc (generate-slides) on project fim: Execution generate-slides of goal org.asciidoctor:asciidoctor-maven-plugin:1.5.5:process-asciidoc failed: org.jruby.exceptions.RaiseException: (RuntimeError) asciidoctor: FAILED: /home/lupe/recolhidos/fim-1.2.3/src/main/asciidoc/slides/fr.adoc: Failed to load AsciiDoc document - Could not find a converter to handle transform: inline_quoted -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/PluginExecutionException
As I don't know Java or Maven, I'd appreciate some step-by-step help on how to solve this issue.
I'm using OpenJDK 11 on 64-bit Linux.
Hello,
I wanted to build Fim myself, not that I can maintain a fork, but in order to use a build which resolves issue #9.
However, the build fails and I don't know what is wrong.
Thank you for your help!
Here is the end of the output; I can send more information if needed.
- State #1: 2017/06/03 21:57:28 (10 files - 121,4 MB - generated using hash mode hashAll)
Comment: Using hash mode hashAll
Added: file01
Added: file02
Added: file03
Added: file04
Added: file05
Added: file06
Added: file07
Added: file08
Added: file09
Added: file10
10 added
- State #2: 2017/06/03 21:57:50 (20 files - 230,3 MB - generated using hash mode hashSmallBlock)
Comment: Using hash mode hashSmallBlock
Added: .fimignore
Added: dir01/.fimignore
Added: dir01/ignored_type1
Added: file12
Added: ignored_type2
Added: media.mp4
Copied: file11 (was file05)
Duplicated: file03.dup1 = file03
Duplicated: file03.dup2 = file03
Duplicated: file07.dup1 = file07
Date modified: file02 last modified: 2017/06/03 21:57:27 -> 2017/06/03 21:57:30
Content modified: file04 last modified: 2017/06/03 21:57:27 -> 2017/06/03 21:57:29
Content modified: file05 last modified: 2017/06/03 21:57:28 -> 2017/06/03 21:57:30
Renamed: file01 -> dir01/file01
Deleted: file06
6 added, 1 copied, 3 duplicated, 1 date modified, 2 content modified, 1 renamed, 1 deleted
- State #3: 2017/06/03 21:57:53 (20 files - 234,3 MB - generated from dir01 using hash mode hashSmallBlock)
Comment: Using hash mode hashSmallBlock
Added: dir01/file15
1 added
- State #4: 2017/06/03 21:57:54 (22 files - 267,3 MB - generated using hash mode hashSmallBlock)
Comment: From from sub directory dir01
Added: file13
Added: file14
2 added
2017/06/03 21:57:57 - Info - Searching for duplicate files
2017/06/03 21:57:57 - Info - Scanning recursively local files, using 'full' mode and automatic scaling
(Hash progress legend for files grouped 10 by 10: # > 1 GB, @ > 200 MB, O > 100 MB, 8 > 50 MB, o > 20 MB, . otherwise)
OO
2017/06/03 21:57:59 - Info - Scanned 22 files (267,3 MB), using 2 threads, hashed 267,3 MB (avg 267,3 MB/s), during 00:00:01
- Duplicate set #1: duplicated 2 times, 11,4 MB each, 22,8 MB of wasted space
file03
[-] file03.dup1
[-] file03.dup2
- Duplicate set #2: duplicated 1 time, 12,6 MB each, 12,6 MB of wasted space
file07
[-] file07.dup1
Removed 3 files and freed 35,4 MB
No duplicate file remains
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 61.791 sec
Results :
Tests in error:
fimDoesNotExist(org.fim.FimTest): Unexpected exception, expected<org.fim.command.exception.BadFimUsageException> but was<org.fim.command.exception.RepositoryException>
Tests run: 249, Failures: 0, Errors: 1, Skipped: 6
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 01:50 min
[INFO] Finished at: 2017-06-03T21:57:59+02:00
[INFO] Final Memory: 69M/580M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.9:test (default-test) on project fim: There are test failures.
[ERROR]
[...]
As the "readlink" command is not the GNU version on this platform, we have to replace it with "greadlink".
The content of the script should be:
currentDir=$(pwd)
baseDir=$(dirname "$(greadlink -f "$0")")
JAVA_OPTIONS="-Xmx4g -XX:MaxMetaspaceSize=4g"
# Pick the application jar, skipping the sources jar
JAR_FILE=$(ls -1 "${baseDir}/target"/fim-*.jar | grep -v sources)
# JAVA_OPTIONS is left unquoted on purpose so that it splits into separate options
java ${JAVA_OPTIONS} -jar "${JAR_FILE}" "$@"
I manage my image collection with digiKam, and I often modify image metadata:
Would it be possible to let fim hash only the part of the file that holds the image, and not the part that holds the metadata? This request is relevant for many file formats such as jpeg, tif, mp3, etc.
I have no idea whether it's possible, but I think some file formats probably have one part reserved for content and another for metadata.
Thanks for the good work.
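As a rough illustration of the request, here is a sketch for one of the formats mentioned (MP3), where the metadata layout is simple: an optional ID3v2 tag at the start of the file and an optional 128-byte ID3v1 tag at the end. The function name `mp3_audio_digest` is hypothetical, SHA-256 is an arbitrary choice, and nothing here reflects how Fim hashes files. JPEG and TIFF would need their own format-specific parsing.

```python
import hashlib


def mp3_audio_digest(data: bytes) -> str:
    """Hash MP3 bytes, skipping a leading ID3v2 tag and a trailing ID3v1 tag."""
    start, end = 0, len(data)
    if len(data) >= 10 and data[:3] == b"ID3":
        # ID3v2 size: four "syncsafe" bytes, 7 useful bits each, header excluded
        size = 0
        for byte in data[6:10]:
            size = (size << 7) | (byte & 0x7F)
        start = 10 + size
        if data[5] & 0x10:  # footer-present flag adds a 10-byte footer
            start += 10
    if end - start >= 128 and data[end - 128:end - 125] == b"TAG":
        end -= 128  # the ID3v1 tag is a fixed 128-byte trailer
    return hashlib.sha256(data[start:end]).hexdigest()
```

With such a digest, editing a title or artist tag would leave the hash unchanged, while any change to the audio frames would still be detected. The sketch takes the whole file in memory for brevity; a real implementation would stream it.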
Hello,
Currently, the text output of the fdup sub-command is not easily usable by other tools. It would be useful to provide other formats using a dedicated switch (e.g. --output-type csv). CSV is a good candidate. Columns could be:
Thank you.
I tried on Linux and Windows, and the remove-duplicates command "fim rdup -y" asks me which file I want to remove despite the --always-yes option. Is this normal? Did I misunderstand the documentation?
To ensure that file headers don't increase the collision probability when doing a rapid check, it would be more effective to hash the second 1 MB / 4 KB block of the file.
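A sketch of what hashing the second block could look like, under the assumption that short files fall back to their last available block. The function name `hash_block`, its parameters, and the use of SHA-512 are illustrative and not taken from Fim.

```python
import hashlib
import os


def hash_block(path, block_size=4096, block_index=1):
    """Hash one block of a file; block_index=1 selects the second block.

    Files too short to contain that block fall back to their last block,
    so a file smaller than one block is hashed in full.
    """
    with open(path, "rb") as handle:
        size = handle.seek(0, os.SEEK_END)
        last_index = max(0, (size - 1) // block_size)
        handle.seek(min(block_index, last_index) * block_size)
        data = handle.read(block_size)
    return hashlib.sha512(data).hexdigest()
```

Files that share an identical header (for example the same container format preamble) would produce the same digest for block 0 but usually differ in block 1, which is the point of the suggestion.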
Hi,
I removed several files, but the used space is not updated in the commit.
I removed the files in state 2, so the used space should be updated in state 2, not in state 3.
λ fim log -q
- State #1: 2018/05/03 22:34:38 (1801 files - 250.3 MB - generated using hash mode hashMediumBlock)
1801 added
- State #2: 2018/05/03 23:20:01 (1950 files - 252.6 MB - generated using hash mode hashAll)
149 added, 1768 content modified, 33 deleted
- State #3: 2018/05/05 21:47:20 (1935 files - 163.5 MB - generated using hash mode hashMediumBlock)
18 added, 2 content modified, 2 deleted
Note: the two deleted files in state 3 are small.
Hello,
Maybe a small bug?
With the command fim fdup -l, .fimignore isn't taken into account;
with the command fim fdup, .fimignore is used.
And a small difficulty on Windows:
to create .fimignore, I have to enter ".fimignore.{space}".
It took me some time to find that out.
Is it possible to access the JSON database directly?
Thank you, this is a very, very useful program.
I like your idea, and it's a very well cared-for repo with a good README. Thank you, evrignaud, for your work.
I would like to see a git push/pull kind of interface inside fim, in order to synchronise between repos.
I'm facing this problem:
It would be cool to have a git push/pull kind of interface.
Maybe, a UUID or network path could be used as an identifier:
On backup:
fim remote add workspace uuid://13152fae-d25a-4d78-b318-74397eb08184
On the workstation:
fim remote add backup smb://home-nas/backup/A
And then run:
fim push backup
fim pull origin
How could this work?
fim remote COMMAND [ALIAS] [LOCATION]
COMMAND
add [ALIAS] [LOCATION]
It will check location for a .fim folder
If so use it, otherwise initialise a new work space
delete [ALIAS]
Delete saved locations
set [ALIAS] [LOCATION]
Change location of ALIAS
ALIAS
An alias name for a repo
LOCATION
ftp://, smb://, etc for network location
uuid:// for a disk location (generally, uuid for linux fs, sn for ntfs)
why use uuid? because drives can be mounted in different locations
using the mount point will create the need for the drive to be in the correct position each time
fim remote add backup smb://home-nas/backup/A
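The alias and location scheme proposed above could be modelled roughly as follows. This is only a sketch of the proposal: `Remote` and `parse_remote` are hypothetical names, and no such feature exists in fim according to this issue.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class Remote:
    alias: str
    scheme: str  # "uuid", "smb", "ftp", ...
    target: str  # disk UUID or network host
    path: str    # path below the host; empty for uuid:// locations


def parse_remote(alias: str, location: str) -> Remote:
    """Split a LOCATION such as smb://host/path into its components."""
    parts = urlsplit(location)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"expected scheme://target[/path], got {location!r}")
    return Remote(alias, parts.scheme, parts.netloc, parts.path)
```

For example, `parse_remote("backup", "smb://home-nas/backup/A")` yields the host `home-nas` and the path `/backup/A`, while a plain mount point such as `/mnt/disk` is rejected, in line with the argument that mount points are not stable identifiers.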
fim push/pull [SOURCE][/SUBDIR] [ORIGIN][/SUBDIR]
push/pull will:
push and pull are the same command, although they work in different directions.
push from SOURCE to ORIGIN
pull from ORIGIN to SOURCE
SOURCE and ORIGIN are ALIAS
If SOURCE is omitted, it will default to the current work space
ORIGIN cannot be omitted <-- this differs from Git, but because there is no rollback possibility, it ensures the user actually wants to perform the operation
SUBDIR
It can be used to restrict the operation to a given subdirectory of the repo
fim pull server/subfolder1
If omitted, it defaults to the current position relative to the repo root
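One possible reading of these defaulting rules, written as a sketch. The function name `resolve_transfer` and the default alias `localrepo` for the current work space are assumptions made for illustration; the proposal leaves several details open.

```python
def resolve_transfer(args, cwd_in_repo, local_alias="localrepo"):
    """Expand `[SOURCE][/SUBDIR] ORIGIN[/SUBDIR]` into two explicit endpoints.

    cwd_in_repo is the current directory relative to the repo root
    ("" when the command is run at the root).
    """
    if len(args) == 1:
        source, origin = local_alias, args[0]  # SOURCE omitted
    elif len(args) == 2:
        source, origin = args
    else:
        raise ValueError("ORIGIN is mandatory; at most two endpoints are allowed")

    def expand(endpoint):
        alias, _, subdir = endpoint.partition("/")
        subdir = subdir or cwd_in_repo  # SUBDIR omitted
        return f"{alias}/{subdir}" if subdir else alias

    return expand(source), expand(origin)
```

Under this reading, calling `resolve_transfer(["server"], "subfolder1")` returns `("localrepo/subfolder1", "server/subfolder1")`, and calling it with no endpoint at all is refused because ORIGIN is mandatory.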
Other use case:
A media production company has a server at smb://server/media-root
A new employee, Mr. Smith, starts working with the company.
He wants to get a subfolder of media-root into his local folder, such as %USERPROFILE%\My Documents\WorkingDir or ~/Documents/WorkingDir:
Smith ~/Documents/WorkingDir $ fim init
Smith ~/Documents/WorkingDir $ fim remote add server smb://server/media-root
Smith ~/Documents/WorkingDir $ fim pull server/subfolder1
Mr Smith goes to his assignment
Smith ~/Documents/WorkingDir $ cd subfolder1
When finished he will commit his work:
Smith ~/Documents/WorkingDir/subfolder1 $ fim ci -m "My job done"
Check that his work will not conflict with that of others
Smith ~/Documents/WorkingDir/subfolder1 $ fim pull server
Because of the defaulting rules for omitted arguments, this actually translates to
fim pull localrepo/subfolder1 server/subfolder1
Push the data to the server
Smith ~/Documents/WorkingDir/subfolder1 $ fim push server
And it is really cool now because fim has taken care that everything is in place and will not conflict with the work of Mr. Smith's colleagues!
At the same time, Mr. Smith's office will know which files Mr. Smith has used and why!
Hi,
I don't know the Java ecosystem. I did a git clone of this repository. When I tried to run fim:
$ ./fim
ls: cannot access '/wd0/home/lpvm/recol/fim/target/fim-*.jar': No such file or directory
Error: Unable to access jarfile
$ which java
/sbin/java
$ java -version
openjdk version "1.8.0_121"
OpenJDK Runtime Environment (build 1.8.0_121-b13)
OpenJDK 64-Bit Server VM (build 25.121-b13, mixed mode)
What should I do?
Hello,
it would be great to be able to ignore a sub-folder directly in the .fimignore at the root of the repository/directory you are monitoring.
Currently, to ignore, let's say, foo/sub/, you must create a .fimignore file in foo/ and write sub inside. I don't know if it would be possible to allow foo/sub/ in the "root" .fimignore (so that you can keep only one ignore file per repository)?
Thanks for reading,
N4th4
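The requested behaviour could be sketched like this, assuming that rules in the root .fimignore are paths relative to the repository root. The function name `is_ignored` is hypothetical and the sketch deliberately ignores wildcards.

```python
from pathlib import PurePosixPath


def is_ignored(relative_path, root_rules):
    """True when the path equals a rule or lies below a rule's directory."""
    path = PurePosixPath(relative_path)
    for rule in root_rules:
        rule_path = PurePosixPath(rule.rstrip("/"))
        if path == rule_path or rule_path in path.parents:
            return True
    return False
```

Comparing whole path components rather than string prefixes means that a rule such as `foo/sub/` ignores `foo/sub/a.txt` but leaves `foo/subway/a.txt` alone.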
Hello,
I run Fim version 1.2.2 on Ubuntu 16.04.2 LTS with java -version
openjdk version "1.8.0_121"
OpenJDK Runtime Environment (build 1.8.0_121-8u121-b13-0ubuntu1.16.04.2-b13)
OpenJDK 64-Bit Server VM (build 25.121-b13, mixed mode)
I had this error:
fim st -n
2017/03/26 17:39:18 - Info - Scanning recursively local files, using 'do not hash' mode and automatic scaling
2017/03/26 17:39:40 - Info - Scanned 274436 files (1 TB), during 00:00:21
Exception in thread "main" java.lang.IllegalStateException: Duplicated entries: Size=273632, MapSize=273585
at org.fim.util.FileStateUtil.buildHashCodeMap(FileStateUtil.java:52)
at org.fim.internal.StateComparator.searchForAddedOrModified(StateComparator.java:191)
at org.fim.internal.StateComparator.compare(StateComparator.java:162)
at org.fim.command.StatusCommand.execute(StatusCommand.java:56)
at org.fim.Fim.run(Fim.java:258)
at org.fim.Fim.main(Fim.java:75)
Another strange thing is that the directory has a size of 1.7 TB, which is larger than the 1 TB displayed.
Hello,
Currently duplicate results are [...] sort command. Is it possible to add a --sort option to the fdup sub-command with a parameter that will sort results by [...]
Thank you
fim is good at finding which files changed through normal use.
It would be nice to have a way to find changes most likely caused by hardware corruption or a filesystem bug: a change in content, but not in any of the timestamps (ctime, mtime). This could be done by filtering fim diff to only show changes to files whose timestamps are unmodified since the last commit.
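The proposed filter amounts to a simple comparison between two states. The sketch below assumes each state is a mapping from path to a record holding a content digest and the two timestamps; `FileState` and `suspected_corruption` are illustrative names, not Fim's data model.

```python
from typing import NamedTuple


class FileState(NamedTuple):
    digest: str   # content hash recorded for the file
    mtime: float  # last modification time
    ctime: float  # inode change time


def suspected_corruption(committed, current):
    """Paths whose content hash changed while both timestamps stayed identical."""
    suspects = []
    for path, old in committed.items():
        new = current.get(path)
        if new is None:
            continue  # deleted files are an ordinary change
        same_times = (new.mtime, new.ctime) == (old.mtime, old.ctime)
        if new.digest != old.digest and same_times:
            suspects.append(path)
    return sorted(suspects)
```

A file edited by a user shows up with a new digest and a new mtime and is therefore skipped; only files whose bytes changed "silently" are reported.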
Hello,
For a lot of reasons, I have some legitimate duplicates inside my fim-managed repository, so running rdup on the whole repository is not an option. I would like to run it against two specific directories. Furthermore, rdup keeps the first entry, which is sometimes not the right one to keep.
I would like to be able to use the -M option of the rdup sub-command inside a sub(-sub-sub) directory of a fim repository.
E.g., based on your documentation terminology (section 10.2), I would like to do the following:
$ pwd
source #contains the .fim directory
$ cd some/sub/dir
$ pwd
source/some/sub/dir
$ fim rdup -M ../../other/sub/dir
[...]
Could be very useful
Thank you.
I initialized a fim repository and added everything. I'm trying to specifically ignore the file keys/pwdb.kdbx and the directory notebook/todo, and even though I have an entry for them in my .fimignore, they still show up in the status.
With git, it'd be git rm --cached. Is there an equivalent for fim, or do I need to reinitialize the repository and start over?
/data
├── archives
...
├── keys
│ └── pwdb.kdbx
...
├── notebook
│ ├── misc
│ └── todo
...
# Wildcard directories
**/.stversions
**/.tmsu
# Wildcard files
**/*.kdbx.lock
# Specific files and directories
# (also tried /data/...)
keys/pwdb.kdbx
notebook/todo
2017/08/11 22:31:16 - Info - Scanning recursively local files, using 'do not hash' mode and automatic scaling
2017/08/11 22:31:19 - Info - Scanned 37277 files (69 GB), using 1 thread, during 00:00:02
Comparing with the last committed state from 2017/08/10 17:07:55
Comment: Fix up permissions and set up syncthing
Added: notebook/misc/howto/make-goo-gone.txt
Added: tmp/.stfolder
Content modified: .fimignore last modified: 2017/08/09 21:18:50 -> 2017/08/11 22:31:12
Content modified: keys/pwdb.kdbx PosixFilePermissions: rw-rw-r-- -> rw-r--r--
last modified: 2017/08/08 00:13:38 -> 2017/08/11 08:33:48
Content modified: notebook/todo/todo.txt last modified: 2017/08/10 14:47:41 -> 2017/08/11 11:33:50
2 added, 3 content modified
Once the fim dcor command has detected hardware corruption, it would be nice to be able to repair the damaged files using par2 files.
More details here (in French): http://linuxfr.org/news/parchive-les-pr%C3%A9mices-dune-norme
Hello, thanks for this good software.
Maybe I use it poorly, but I want to save the report generated during the commit.
For example
Added: test_1.txt
Added: test_2.txt
Content modified: test_3.txt
It seems that the log command doesn't return it. Only this message appears:
2 added, 1 content modified
My current, clumsy workaround: run the commit command once, copy the report, cancel the commit, then run the commit again with the report pasted into the comment.
Hello!
My question is related to fim init taking nearly 3 hours to complete, as seen in the documentation.
Scenario
fim init is taking up a lot of CPU power and slowing down the user's computer. I want to 'pause' the scanning and resume later.
My Testing
[...] fim init, I am not able to 'restart' the init without removing the .fim directory.
Question
1.) Is it possible to 'pause' the initial fim init indexing and resume at a later time?
Thanks
When attempting to create a fim repository, I get the error message "Not allowed to create the 'X:.fim' directory that holds the Fim repository".
This happens when accessing a CIFS file located on a Synology NAS with an admin read/write account.
fim finds duplicates, but cannot remove them, probably because of 'strange' (accented - bad UTF) characters in file or directory names.
I include one of those files for your tests.
Hello,
Using the fdup sub-command, it would be nice to have an option to filter results to exclude some directories/file types inside the repository. The --exclude / --include options could use Ant-like patterns to define paths, separated by ":" for example.
Thank you
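To make the suggestion concrete, here is a sketch of Ant-like pattern matching with ':'-separated lists. The helper names `ant_to_regex` and `keep` are hypothetical, only a subset of Ant's pattern rules is covered, and the choice that exclusion wins over inclusion is an assumption.

```python
import re


def ant_to_regex(pattern):
    """Translate an Ant-style glob into an anchored regular expression."""
    out, i = [], 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append("(?:.*/)?")  # zero or more directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(".*")        # anything, across directories
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")     # anything inside one path segment
            i += 1
        elif pattern[i] == "?":
            out.append("[^/]")      # exactly one character of a segment
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out) + r"\Z")


def keep(path, include="", exclude=""):
    """Apply ':'-separated include and exclude lists; exclusion wins."""
    includes = [ant_to_regex(p) for p in include.split(":") if p]
    excludes = [ant_to_regex(p) for p in exclude.split(":") if p]
    if any(rx.match(path) for rx in excludes):
        return False
    return not includes or any(rx.match(path) for rx in includes)
```

For instance, `keep("tmp/cache/a.jpg", include="**/*.jpg", exclude="tmp/**")` is false because the exclude list matches first. Note that ':' as a separator would clash with Windows drive letters in absolute paths, so the sketch assumes repository-relative paths.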
Issue created from #20
SELinux is used to label files which allows them to be accessible from certain processes and not others.
Since files can contain sensitive information and are checked for tampering (integrity), metadata such as SELinux labels is also worth keeping an eye on.
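On Linux, the SELinux label of a file is stored in the `security.selinux` extended attribute, so recording it alongside the other file attributes could look roughly like the sketch below. The function name `selinux_label` is hypothetical, and whether Fim would store the label this way is an open design question.

```python
import os


def selinux_label(path):
    """Return the SELinux context of a file, or None when it is unavailable."""
    getxattr = getattr(os, "getxattr", None)  # only defined on Linux
    if getxattr is None:
        return None
    try:
        raw = getxattr(path, "security.selinux")
    except OSError:
        return None  # no label, missing file, or no xattr support
    # The stored value is usually NUL-terminated
    return raw.rstrip(b"\x00").decode("utf-8", errors="replace")
```

Comparing the returned string between two states would reveal a relabelled file even when its content and timestamps are unchanged.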