shreevatsa / ambuda
This project forked from ambuda-org/ambuda
Fork of https://github.com/sanskrit/ambuda: A Sanskrit reader
Home Page: https://ambuda.org
License: MIT License
Right now, the line-by-line view works immediately after "Run OCR", but after saving the page and returning after navigating away, it's all gone. Make it work, i.e. save the info to the backend too.
After reading/watching “Coping strategies for the serial project hoarder” by simonw (and also skimming his earlier stuff like this/this/this), I'm inspired to try out the approach here. Namely:
As of 8f04ed9, works only intermittently.
Right now, the textarea editor already implements quite a few formatting/markup features:
<error> and <fix>
<flag>
Need to add these to the PM schema, so that it's strictly an improvement over the textarea editor and not a regression.
(I plan to get to this eventually myself, based on the needs of the books I encounter, but listing this anyway…)
Right now in ambuda-org#132, the ProseMirror editor simply replaces the textarea editor. We instead need to modify the PR to have the ProseMirror editor apply only optionally. I think it should be an option at (proofreading) project creation time.
Originally posted by @shreevatsa in #49 (comment)
Right now I'm just saving the ProseMirror doc in JSON to the backend:
ambuda/ambuda/static/js/pm-editor/pm-editor.ts
Lines 298 to 300 in 9469f16
and
ambuda/ambuda/static/js/proofer.js
Lines 287 to 294 in 9469f16
Doing this—saving the JSON as the page's contents in the db—creates a strong coupling between the ProseMirror editor's schema and the backend. I'm strongly convinced from experience that this is actually the right thing to do, at least until we're sure the PM editor does not need further changes to the schema (that's a long way away), but we either
Right now, cannot select across a line boundary; need to figure out why.
Somewhere outside this project, should be able to use the page images and get something useful.
python3 get_proof_lg_regions.py
50 [{'page_id': 128, 'xmin': 576, 'xmax': 2479, 'ymin': 440, 'ymax': 769}]
51 [{'page_id': 128, 'xmin': 573, 'xmax': 2615, 'ymin': 769, 'ymax': 1063.5}]
52 [{'page_id': 128, 'xmin': 568, 'xmax': 2619, 'ymin': 1063.5, 'ymax': 1395.5}]
53 [{'page_id': 128, 'xmin': 567, 'xmax': 2275, 'ymin': 1395.5, 'ymax': 1723.5}, {'page_id': 129, 'xmin': 581, 'xmax': 2611, 'ymin': 427, 'ymax': 786.5}]
50f [{'page_id': 128, 'xmin': 282, 'xmax': 2619, 'ymin': 1723.5, 'ymax': 2667}]
51f [{'page_id': 128, 'xmin': 282, 'xmax': 2622, 'ymin': 2667, 'ymax': 3500.5}]
52f [{'page_id': 128, 'xmin': 278, 'xmax': 2609, 'ymin': 3500.5, 'ymax': 4188.5}]
53f [{'page_id': 128, 'xmin': 276, 'xmax': 2607, 'ymin': 4188.5, 'ymax': 4678}, {'page_id': 129, 'xmin': 282, 'xmax': 2619, 'ymin': 1815, 'ymax': 2623.5}]
54 [{'page_id': 129, 'xmin': 564, 'xmax': 2615, 'ymin': 786.5, 'ymax': 1113.5}]
55 [{'page_id': 129, 'xmin': 704, 'xmax': 2434, 'ymin': 1113.5, 'ymax': 1449}]
56 [{'page_id': 129, 'xmin': 562, 'xmax': 2613, 'ymin': 1449, 'ymax': 1815}]
54f [{'page_id': 129, 'xmin': 276, 'xmax': 2620, 'ymin': 2623.5, 'ymax': 3565.5}]
55f [{'page_id': 129, 'xmin': 272, 'xmax': 2625, 'ymin': 3565.5, 'ymax': 4678}]
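As a rough sketch of how output like the above could be consumed outside the project: the snippet below turns the regions for one label into integer crop boxes, one per page. The `REGIONS` dict and the `crop_boxes` helper are hypothetical names, and the dict-of-lists shape is only assumed from the printed output. The `(left, upper, right, lower)` order is the one Pillow's `Image.crop` expects, on the assumption that Pillow would do the actual cropping.

```python
import math

# Two entries copied from the output above; the dict-of-lists shape is assumed.
REGIONS = {
    "53": [
        {"page_id": 128, "xmin": 567, "xmax": 2275, "ymin": 1395.5, "ymax": 1723.5},
        {"page_id": 129, "xmin": 581, "xmax": 2611, "ymin": 427, "ymax": 786.5},
    ],
    "54": [
        {"page_id": 129, "xmin": 564, "xmax": 2615, "ymin": 786.5, "ymax": 1113.5},
    ],
}


def crop_boxes(label, regions=REGIONS):
    """Return a list of (page_id, (left, upper, right, lower)) for a label.

    Fractional coordinates are rounded outwards so no pixel rows are lost.
    Unknown labels give an empty list.
    """
    boxes = []
    for region in regions.get(label, []):
        box = (
            math.floor(region["xmin"]),
            math.floor(region["ymin"]),
            math.ceil(region["xmax"]),
            math.ceil(region["ymax"]),
        )
        boxes.append((region["page_id"], box))
    return boxes
```

A label that spans a page break, such as 53, yields two boxes, one for each page image.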
53
it will show the two regions from the corresponding pages. This is (close to) the ultimate goal of the #32 project.
Before merging (#49), the PM editor needs to be extensively tested, to make sure all flows work etc. Either adding unit/integration tests or even simply manual testing may be enough.
I already know that e.g. the edit conflict UI becomes unusable (#44) but that may be ok, as edit conflicts seem to be rare currently.
Correctly split a page image into individual lines.
See #32 (comment)
If a page hasn't had OCR run on it, then just do it, instead of asking the user to run Tools -> Run OCR.
Decide.
Some background on my goals here.
The obvious cost of using ProseMirror (to me) is that it comes with some conceptual cost—one has to spend an hour or two reading the docs https://prosemirror.net/docs/ (I don't even remember what was useful for me initially, but I think the blog post, looking at examples, and definitely reading the guide). This can be a barrier for casual contributors / keep the bus factor low. Some things I'm trying to do about this:
Wrote a quick-start guide: https://github.com/shreevatsa/ambuda/blob/9469f16995035889823a429b2bc9908141cc98d6/ambuda/static/js/pm-editor/README.md
Am trying to leave it to others to understand and merge the code, while I just work on it independently and post screenshots. This should give an idea of how much work is needed before others can easily make changes, e.g. maybe how much the quick-start guide needs to be expanded by or the code refactored.
(Note to self: I have other reasons for working on my own: (1) my previous approach of trying to sell the vision and get agreement on the value first before further work, e.g. here/here, wasn't as successful as actually producing (screenshots of) even barely working initial code, which is fair enough, and (2) it avoids having to deal with consensus issues like #53, and (3) I actually want to start using this for books of interest to me. But the "increase bus factor" reason is the most virtuous one :P)
Anyway, writing down this cost as a barrier for adoption / reason not to use ProseMirror (at least yet).
Need some changes to schema, so that for a page like the below, we can select individual lines and:
This will also be a good start for other schema changes we need to make (for commentaries etc).
Right now, if the label given to a group (verse/footnote) is incorrect, can only fix it (if it's not a local undo away) by redoing the entire page with Tools->Run OCR (and losing all other proofreading one may have done). Should be easier.
Maybe hitting Ctrl-V again on a group will ungroup it? See toggleLink at https://prosemirror.net/examples/schema/ (may be helpful).
After the basic breaking into lines and verifying that the lines make sense, we don't really need the individual line images until we get to the actual proofreading step. So it would be nice to turn it off, and turn it back on when actually proofreading.
Right now there's a fair bit of Ambuda documentation on how to use the textarea editor:
All of this needs to be updated to apply to the PM editor and its features. (May be hard to do while the editor is still in flux… also I think recording a video may be good too.)
Whether to debug/highlight how the line recognition is going or to fix it manually, it may be good to allow the user to draw/move lines.
Originally posted by @shreevatsa in #33 (comment)
We'd like words (or lines, at least) to correspond to regions of the image.
May help
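One possible way to get a line-level region, sketched under assumptions: if each word comes with a bounding polygon given as a list of `{"x": ..., "y": ...}` vertices (the shape Google Cloud Vision uses for `boundingBox.vertices`), the line's region can be taken as the union of its words' boxes. The helper names `bounds` and `line_region` are hypothetical, and treating a missing coordinate as 0 is an assumption about how the JSON response is serialized.

```python
def bounds(vertices):
    """Axis-aligned bounds of one polygon given as [{"x": .., "y": ..}, ...]."""
    # Assumption: a coordinate omitted from the JSON means 0.
    xs = [v.get("x", 0) for v in vertices]
    ys = [v.get("y", 0) for v in vertices]
    return {"xmin": min(xs), "xmax": max(xs), "ymin": min(ys), "ymax": max(ys)}


def line_region(word_polygons):
    """Union of the word boxes on one line, or None if there are no words."""
    boxes = [bounds(vertices) for vertices in word_polygons]
    if not boxes:
        return None
    return {
        "xmin": min(b["xmin"] for b in boxes),
        "xmax": max(b["xmax"] for b in boxes),
        "ymin": min(b["ymin"] for b in boxes),
        "ymax": max(b["ymax"] for b in boxes),
    }
```

The result uses the same `xmin`/`xmax`/`ymin`/`ymax` keys as the region output shown earlier, so the two could be compared directly.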
Now that marking into groups (#43) is working, I think I already have most of what's needed for a (bare minimum) "Bookchop" (#32) MVP for personal use. The only thing to add is some way to use the manually marked groups outside of the proofreading UI. This is probably simple—I'm already saving the PM doc as JSON to the db, so probably just need to get out the doc and parse it—but creating this tracking issue as there may be things to think about.
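A minimal sketch of the "get out the doc and parse it" step, assuming the saved value is the standard ProseMirror JSON form (nested objects with `type`, optional `attrs`, optional `content`, and `text` on text nodes). The node type names `group` and `line` and the `label` attribute are hypothetical placeholders; the real schema may use different names.

```python
import json


def node_text(node):
    """Concatenate the text of a node and all of its descendants."""
    if node.get("type") == "text":
        return node.get("text", "")
    return "".join(node_text(child) for child in node.get("content", []))


def collect_groups(doc, group_type="group"):
    """Return [(label, [line_text, ...]), ...] for every group node in the doc.

    `group_type` and the "label" attribute are assumed names, not the
    project's confirmed schema.
    """
    found = []

    def walk(node):
        if node.get("type") == group_type:
            label = (node.get("attrs") or {}).get("label")
            lines = [node_text(child) for child in node.get("content", [])]
            found.append((label, lines))
            return
        for child in node.get("content", []):
            walk(child)

    walk(doc)
    return found


def groups_from_saved_page(saved_json):
    """Parse the JSON string stored for a page and extract its groups."""
    return collect_groups(json.loads(saved_json))
```

Because this walks plain dicts, it does not need ProseMirror itself on the Python side, though it still depends on whatever node names the editor's schema ends up using.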
This can be a visual way of debugging them.
What we need first is a UI to visualize the response from Google OCR (the four different ways to get the text out).
Make a small tweak to the way lines are detected currently, for #33
I think now that we have something basic working (see #59 (comment)) the next step is "process": complete "chopping" one book beginning to end; that should throw up lots of issues.
Originally posted by @shreevatsa in #32 (comment)
No time to leave details here, but see https://prosemirror.net/docs/guide/#view.node_views
Was looking at the DOMOutputSpec and it shouldn't be too hard.
Remove hard-coding of page here:
It's very annoying.