Comments (18)
Wow, it's so pretty! This is some really nice work Jonas. I don't know what your preference is here, whether you'd like paperless-ng to supplant this project (take over the name, merge into this repo, etc) or if you're just promoting the project as a literal next-generation, but I just wanted to congratulate you on a nice job.
I haven't had time for a technical assessment (assuming you wanted one?) as I've got my hands full with presentations, another side project, and a 2 year old, but as far as I'm concerned this is a community project now. If there's strong support for full adoption of paperless-ng over the current core for v3.0, I'm cool with it. The one thing I'd mention though is that one of the strengths of the current system is that it runs well on low-powered (read Raspberry Pi) systems. If -ng requires more than that and can't be stripped down for such cases, that'd be a good argument for keeping yours as a separate fork.
from paperless.
Let's not derail the conversation too much. The discussion of "proper" encryption is a big (separate) one, but I think anyone who looks at this closely would agree that the encryption as it stands in paperless gives a false sense of security, which is why @jonaswinkler chose to remove it (a decision I agree with). The point is, IMHO, that -ng having removed encryption should not be a barrier to using -ng as the continuation of the project; it's not a feature removal if the feature wasn't truly implemented in the first place.
As for the other apparent issue, does someone who uses a RPi as their primary host want to try it out?? Seems like we're so worried about low-resourced systems but most of the people commenting here aren't actually using one. If it's a major part of the user base then we should be able to find some folks and find out?!
All NG is missing is the userbase of paperless. I kind of feel sorry for all the users who find paperless today and start with it, not knowing there's NG.
Well, I am doing that right now... :-) I am fully aware of the existence of NG, however.
As for the other apparent issue, does someone who uses a RPi as their primary host want to try it out?? Seems like we're so worried about low-resourced systems but most of the people commenting here aren't actually using one. If it's a major part of the user base then we should be able to find some folks and find out?!
Or would there be any reason for anybody to prefer paperless over paperless-ng?
Just after learning about paperless, I found this issue and decided to try out NG directly on my RPi4. I didn't manage to set it up, however. I tried it with and without a virtual environment. Version 0.9.11 would not work at all; some Python dependency hell, apparently. The dependency problems disappeared in versions 0.9.12 and later, but it throws a missing-module error in PIL when importing documents. After battling with it for some time, I gave up and installed paperless instead. I was able to get it up and running perfectly within 10 minutes. Now I should say that this RPi4 is running Debian sid and Python 3.9, so this might be the source of the problems with NG. Paperless works perfectly, however, so I am sticking with it for the time being. I am not well versed in Python programming, but it seems strict PR review does have its benefits after all.
See: jonaswinkler/paperless-ng#456 (comment)
Just wanted to add I love this :) Big thanks for sharing @jonaswinkler
I've been using Paperless OG for quite a while but have just switched. Running on a RPi4 via the latest multi-arch image through K8S and working perfectly.
Up to you. This fork will see some active development in the foreseeable future and I'm pushing for a first stable release. The last thing I want to get into there before that is the ability to add selectable text to scanned documents, both for new documents as well as documents that are already in the system.
So all the things said here make me support the idea of paperless-ng replacing this project, which of course would mean making Jonas the owner.
Paperless really was the basis of my motivation to get rid of all the papers, but paperless-ng was the piece still missing, seeing that paperless only approves PRs slowly and does not change very frequently.
All NG is missing is the userbase of paperless. I kind of feel sorry for all the users who find paperless today and start with it, not knowing there's NG.
Or would there be any reason for anybody to prefer paperless over paperless-ng?
Hey!
I am personally very excited by paperless-ng. Several weeks ago I was wondering whether I should migrate from paperless to papermerge (https://github.com/ciur/papermerge), but your project seems to be a good competitor (and will save me from writing a papermerge/paperless mapping)!
Thanks for your amazing work!
My opinion is just as an end user and not a dev (Edit: am now contributing, still feel strongly it should some day become the next version of paperless) on this project, but I have to say Jonas' work and enthusiasm suggest to me paperless-ng should be merged into the core. There's a lot of work on that fork under the hood that I think is important to the longevity of the project too.
Very valid concern regarding low-powered devices, but just my +1 for adopting paperless-ng for v3.0. Bravo Jonas
Thank you :)
The entire process of making this pretty has been incredibly fun. I also learned a couple of things. I've never done any kind of UX work or front end design; I just took a couple of libraries, mixed them together and tried to make it work. The Bootstrap CSS framework has some pretty nifty stuff.
Oh, I certainly did not expect a technical assessment, that would be quite a task. I should have made that clear.
I'd rather get a feel for what the community thinks is best for the future of the project and respect that. I'm fine either way!
Edit for the statement above: This is especially true since the new project does a couple things quite differently and I've chopped off a few things, such as encryption.
Regarding low-powered devices. I've got some good and some not-so-good news. The good news is that the new front end runs entirely in the browser and just uses the API to fetch data. Therefore, the server has to do much less work when serving the pages. The not-so-good news is that one of the new features does occasionally require a little bit more computing power, but that could be scheduled to run during the night. I've made this with the RPi in mind, but haven't extensively tested it on that platform.
Someone got it running on an RPi 4, but I haven't heard anything about performance yet.
Hi @jonaswinkler
Thank you so much for your work and effort.
I will put paperless-ng to the test and report to your repo.
Thanks. I really need some more feedback on what's workable and what needs improvement. We're currently working on making the central filtering tools nice; the present implementation is rather bulky.
but as far as I'm concerned this is a community project now. ... one of the strengths of the current system is that it runs well on low-powered (read Raspberry Pi) systems.
This is a little bit cheesy, isn't it?
@danielquinn, you set the rule that two (2) people have to approve a pull request. How many people in your 'community' project have the permission to approve? You included three (3), but two of you never approve.
The one and only strength (RPi) holds only IF the software runs, which is in doubt because PRs with fixes are not being approved.
Calling it community, doesn't make it so. I think this is unfair towards people who spent time writing PRs.
In total 8 people can approve, as I see it. But I've got the same feeling. I'd like to write a PR at times, but since I feel like we can't make it over the limit of 2 people if one of 2 (sometimes) active reviewers writes the PR, I refrain from doing so. So yeah, it's not so much fun, if you can't fix anything yourself and are limited to only looking at other people's code all the time.
Or would there be any reason for anybody to prefer paperless over paperless-ng?
Maybe only the better (?) support of low-powered devices and the use of encryption via GPG?
Yes, we should figure out if it's really better in every meaning:
- low-powered device
@jonaswinkler, what is the referred function? Some AI bit? Would it be possible to deactivate that in case anybody doesn't want it to block the Pi at night?
- encryption
That's what I thought, too, when I read that Jonas removed it. But then I looked at the reason and understood that the solution currently implemented by paperless is not really a secure thing, rather a bit pseudo-secure (key under doormat). And from what I understood, too, Jonas would be willing to bring in encryption again once there is a working idea on how to do it
I hope, Daniel, you don't get me wrong when I say that NG might be better in every meaning! I absolutely adore what you have created, but I am super happy that Jonas continued your work instead of starting from scratch like many others. I am sure that is why this is the best solution from my point of view.
- low-powered device
@jonaswinkler, what is the referred function? Some AI bit? Would it be possible to deactivate that in case anybody doesn't want it to block the Pi at night?
If you don't use "Auto" matching, the logic in question won't be invoked at all. I don't run this on a Pi, so I have no idea about performance. My gut feeling is that the web UI should be much more responsive.
- encryption
That's what I thought, too, when I read that Jonas removed it. But then I looked at the reason and understood that the solution currently implemented by paperless is not really a secure thing, rather a bit pseudo-secure (key under doormat). And from what I understood, too, Jonas would be willing to bring in encryption again once there is a working idea on how to do it
Apart from that, the database stores unencrypted content for searching, even if encryption was enabled. That contains all your personal information from your documents, credit card numbers, addresses, maybe even passwords if sent via postal mail, all the things you purchased, your bank account history, etc.
The way you'd implement security in a system like this would be as follows
- Encrypt all information with a public/private key system, where documents are encrypted with a public key, and the private key is never sent to the server. All decryption is done in the browser on the client. This is how LastPass works, for example.
- However, this would mean that even the server itself does not have access to clear text information. This in turn means that
- No auto matching, since the server cannot access clear text content to update the algorithm.
- No full text search index, searching will be slow (always decrypt all content on every request and search within there)
A system like that has to be designed with this concept in mind from the very beginning. It's very unlikely I'll add something like that to paperless. For example, we can't just encrypt all the database fields as well, since
- This still allows someone to figure out how many documents there are and how many documents come from one particular (yet unknown) correspondent. It's possible to derive information even from encrypted data. This is similar to how it's possible to derive information from improperly encrypted file systems by examining unused areas.
- How do we handle file names? These need to be encrypted as well.
There's lots of things involved in doing this properly.
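The point about deriving information from encrypted data can be made concrete with a small sketch. It is only an illustration under stated assumptions: the row layout, the `correspondent_id` field and the `leaked_metadata` helper are hypothetical and not taken from paperless, and random bytes stand in for real ciphertext.

```python
import os
from collections import Counter


def fake_ciphertext(length: int = 32) -> bytes:
    """Stand-in for an encrypted field: opaque bytes the attacker cannot read."""
    return os.urandom(length)


# Hypothetical rows as someone with database access would see them.
# Every text field is opaque, but the relational structure (row count,
# foreign keys) is still in the clear.
documents = [
    {"id": 1, "title": fake_ciphertext(), "correspondent_id": 7},
    {"id": 2, "title": fake_ciphertext(), "correspondent_id": 7},
    {"id": 3, "title": fake_ciphertext(), "correspondent_id": 2},
    {"id": 4, "title": fake_ciphertext(), "correspondent_id": 7},
]


def leaked_metadata(rows):
    """Return what can be learned without any decryption key."""
    per_correspondent = Counter(row["correspondent_id"] for row in rows)
    return {
        "document_count": len(rows),
        "documents_per_correspondent": dict(per_correspondent),
    }


info = leaked_metadata(documents)
print(info["document_count"])
print(info["documents_per_correspondent"])
```

Without decrypting anything, the total number of documents and the number per (unnamed) correspondent are visible, which is the kind of leak described above.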
I have only started reading up on paperless and intend to start using it, but I'd like to comment on the encryption topic.
There are multiple attack vectors; here are four off the top of my head:
1) someone getting access to the hardware (e.g. computer stolen)
2) someone getting access to the file system (by attaching a keyboard to your RasPi, through a remote shell, ...)
3) someone getting access to the database (locally or remotely)
4) someone getting access to your documents by privilege escalation (i.e. a bug in paperless)
There's also the possibility of transport-level attacks (e.g. MITM) or malicious admins, but these are separate topics.
To protect against 1), you could use an encrypted filesystem so that someone stealing your computer could not mount it to read the contents. This can be done by everyone already without needing any change in paperless.
For 2), however, an encrypted filesystem does not help, because when the filesystem is mounted, the contents are nicely decrypted. To protect against this, you would need to encrypt the files themselves separately (also the database storage). You would need to decrypt them in-memory only and you would need to make sure that the encryption key is not available to the attacker, e.g. by keeping the key only in memory (if at all). It might still be possible to read the key from memory, but that's a different topic. You would need to ask for the key on every start of paperless, of course.
To protect against 3), you could encrypt the database, so that the contents are unreadable without access to the key. This also covers the database part of 2). See e.g. https://stackoverflow.com/a/5877130 for sqlite encryption.
Protection against 4) on encryption-level is hard. You would need to use separate keys per user, essentially making it impossible for paperless itself to access the data (as you mentioned yourself).
IMHO, an encrypted filesystem (e.g. https://en.wikipedia.org/wiki/EncFS) for the documents and an encrypted database would be sensible options with a "master key" to be provided on startup. If you don't want to protect against 2), you could even store the password for the database encryption inside the encrypted filesystem. That way the user would not need to provide the password for starting paperless (only when mounting the encrypted filesystem). EncFS also encrypts filenames, btw.
Good encryption also comes with the price of making sure to never lose the master key, of course.
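As a rough illustration of the "master key provided on startup" idea, here is a minimal sketch using only the Python standard library. It is not how paperless or paperless-ng works; the function names, the salt handling and the iteration count are assumptions chosen for the example. The key is derived from a passphrase with PBKDF2 and would be held in memory only.

```python
import hashlib
import os


def new_salt() -> bytes:
    """Random salt; it is not secret and would be stored next to the data."""
    return os.urandom(16)


def derive_master_key(passphrase: str, salt: bytes, iterations: int = 600_000) -> bytes:
    """Derive a 32-byte key from a passphrase with PBKDF2-HMAC-SHA256.

    The iteration count is an assumed default for this sketch, not a
    recommendation from the thread.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=32
    )


# Hypothetical startup flow (commented out so the sketch does not block on input):
#   import getpass
#   salt = new_salt()  # on later starts, load the stored salt instead
#   key = derive_master_key(getpass.getpass("Master passphrase: "), salt)
#   ... keep `key` in memory only, never write it to disk ...
```

The same passphrase and salt always yield the same key, so nothing but the salt has to be stored; losing the passphrase means losing access, which is the trade-off mentioned above.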