
ambuda's People

Contributors

akprasad, akshaypall, dependabot[bot], epicfaace, kiranlakkur, kvchitrapu, pmarathe25, shreevatsa, thefestest, vvasuki

ambuda's Issues

Retain OCR-structural info after page save

Right now, the line-by-line view works immediately after "Run OCR", but if you save the page, navigate away, and come back, it's all gone. Fix this by saving the structural info to the backend too.

Project tracking

After reading/watching “Coping strategies for the serial project hoarder” by simonw (and also skimming his earlier posts, like this/this/this), I'm inspired to try out the approach here. Namely:

  • “issue driven development”, where everything is done via GitHub issues with comments of you talking to yourself, and
  • leave everything documented, in a state where you can walk away at any point, and thus avoid guilt which is what kills personal projects.

Implement all the current markup features

Right now, the textarea editor already implements quite a few formatting/markup features:

  • Errors in the printed text (<error> and <fix>)
  • Things unclear in the printed text (<flag>)
  • Footnotes ^
  • Bold, italics, and possibly more; I don't know what else is supported.

We need to add these to the PM schema, so that the PM editor is strictly an improvement over the textarea editor and not a regression.

(I plan to get to this eventually myself, based on the needs of the books I encounter, but listing this anyway…)

Give names to the blocks

We now have grouping into lines (#43), but it would be nice to be able to give names to the groups (e.g. maybe "48", "49", "48f" and "49f" for the four sections in the screenshots in #43). This is faster to do when actually proofreading the page, instead of doing it manually later.

What gets saved to the DB?

Right now I'm just saving the ProseMirror doc in JSON to the backend:

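import { EditorView } from 'prosemirror-view';
// Serialize the editor's current document as a JSON string.
export function toText(view: EditorView): string {
  const doc = view.state.doc.toJSON();
  return JSON.stringify(doc);
}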

and

// Before the form is submitted, copy contents of the ProseMirror editor back to the textarea.
syncPMToTextarea() {
  document.querySelector('textarea').value = this.textValue();
},
textValue() {
  return toText(this.editorView());
},

Saving the JSON as the page's contents in the db creates a strong coupling between the ProseMirror editor's schema and the backend. I'm strongly convinced from experience that this is actually the right thing to do, at least until we're sure the PM editor needs no further schema changes (that's a long way away). But we either

  • need consensus from the rest of the team that this is ok (Previous instance of non-consensus about this: https://groups.google.com/g/ambuda-discuss/c/ZnbfIML10-Y), or
  • build functions to save the PM structured doc to some appropriate XML on the backend and translate back from it so that we can recover the same doc on the frontend. IMO this may be conceptually clean but is just the same coupling with more steps.
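To make the second option concrete, here is a minimal sketch in Python of a lossless round trip between a ProseMirror JSON doc and XML. It relies only on the standard shape of ProseMirror's JSON (`type`, `attrs`, `content`, `text`). The node types `lineGroup` and `line` and the attributes `name` and `pageId` are hypothetical, not the actual schema, and marks (bold, italics, etc.) are ignored for brevity.

```python
import json
import xml.etree.ElementTree as ET


def pm_to_xml(node: dict) -> ET.Element:
    """Convert one ProseMirror JSON node (and its children) to an XML element."""
    elem = ET.Element(node["type"])
    # JSON-encode attribute values so numbers and strings survive the round trip.
    for key, value in node.get("attrs", {}).items():
        elem.set(key, json.dumps(value))
    if node["type"] == "text":
        elem.text = node["text"]
    for child in node.get("content", []):
        elem.append(pm_to_xml(child))
    return elem


def xml_to_pm(elem: ET.Element) -> dict:
    """Inverse of pm_to_xml: rebuild the ProseMirror JSON node from XML."""
    node = {"type": elem.tag}
    if elem.attrib:
        node["attrs"] = {k: json.loads(v) for k, v in elem.attrib.items()}
    if elem.tag == "text":
        node["text"] = elem.text or ""
    children = [xml_to_pm(child) for child in elem]
    if children:
        node["content"] = children
    return node


# Hypothetical doc: one named group containing one line of text.
doc = {
    "type": "doc",
    "content": [{
        "type": "lineGroup",
        "attrs": {"name": "48", "pageId": 128},
        "content": [{"type": "line",
                     "content": [{"type": "text", "text": "a < b & c"}]}],
    }],
}
xml_string = ET.tostring(pm_to_xml(doc), encoding="unicode")
restored = xml_to_pm(ET.fromstring(xml_string))
```

Note that the XML element names here are just the schema's node type names, so any schema change still changes the stored format. That is the sense in which this is the same coupling with more steps.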

Turn page regions data into something useful (images of the corresponding page regions)

Somewhere outside this project, we should be able to use the saved regions together with the page images to get something useful: images of the corresponding page regions.

  • Recall the script's output from #46
python3 get_proof_lg_regions.py
50 [{'page_id': 128, 'xmin': 576, 'xmax': 2479, 'ymin': 440, 'ymax': 769}]
51 [{'page_id': 128, 'xmin': 573, 'xmax': 2615, 'ymin': 769, 'ymax': 1063.5}]
52 [{'page_id': 128, 'xmin': 568, 'xmax': 2619, 'ymin': 1063.5, 'ymax': 1395.5}]
53 [{'page_id': 128, 'xmin': 567, 'xmax': 2275, 'ymin': 1395.5, 'ymax': 1723.5}, {'page_id': 129, 'xmin': 581, 'xmax': 2611, 'ymin': 427, 'ymax': 786.5}]
50f [{'page_id': 128, 'xmin': 282, 'xmax': 2619, 'ymin': 1723.5, 'ymax': 2667}]
51f [{'page_id': 128, 'xmin': 282, 'xmax': 2622, 'ymin': 2667, 'ymax': 3500.5}]
52f [{'page_id': 128, 'xmin': 278, 'xmax': 2609, 'ymin': 3500.5, 'ymax': 4188.5}]
53f [{'page_id': 128, 'xmin': 276, 'xmax': 2607, 'ymin': 4188.5, 'ymax': 4678}, {'page_id': 129, 'xmin': 282, 'xmax': 2619, 'ymin': 1815, 'ymax': 2623.5}]
54 [{'page_id': 129, 'xmin': 564, 'xmax': 2615, 'ymin': 786.5, 'ymax': 1113.5}]
55 [{'page_id': 129, 'xmin': 704, 'xmax': 2434, 'ymin': 1113.5, 'ymax': 1449}]
56 [{'page_id': 129, 'xmin': 562, 'xmax': 2613, 'ymin': 1449, 'ymax': 1815}]
54f [{'page_id': 129, 'xmin': 276, 'xmax': 2620, 'ymin': 2623.5, 'ymax': 3565.5}]
55f [{'page_id': 129, 'xmin': 272, 'xmax': 2625, 'ymin': 3565.5, 'ymax': 4678}]
  • This means we should be able to have a webpage that lists all the regions, e.g. for 53 it will show the two regions from the corresponding pages. This is (close to) the ultimate goal of the #32 project.
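As a rough sketch of the first step, the output above can be parsed into a mapping from region name to boxes. This assumes each line has exactly the form shown (a name, a space, then a Python-style list of dicts); the function names `parse_regions` and `crop_box` are made up for illustration and are not part of `get_proof_lg_regions.py`.

```python
import ast
import math


def parse_regions(output: str) -> dict[str, list[dict]]:
    """Parse lines of the form '<name> <list of region dicts>'."""
    regions = {}
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        name, _, rest = line.partition(" ")
        # literal_eval safely parses the Python-literal list of dicts.
        regions[name] = ast.literal_eval(rest)
    return regions


def crop_box(region: dict) -> tuple[int, int, int, int]:
    """Return (left, upper, right, lower) in whole pixels, rounding outward."""
    return (
        math.floor(region["xmin"]),
        math.floor(region["ymin"]),
        math.ceil(region["xmax"]),
        math.ceil(region["ymax"]),
    )


sample = (
    "53 [{'page_id': 128, 'xmin': 567, 'xmax': 2275, 'ymin': 1395.5, 'ymax': 1723.5}, "
    "{'page_id': 129, 'xmin': 581, 'xmax': 2611, 'ymin': 427, 'ymax': 786.5}]\n"
    "50f [{'page_id': 128, 'xmin': 282, 'xmax': 2619, 'ymin': 1723.5, 'ymax': 2667}]\n"
)
regions = parse_regions(sample)
boxes_53 = [(r["page_id"], crop_box(r)) for r in regions["53"]]
```

The tuple order (left, upper, right, lower) is the one Pillow's `Image.crop` accepts, so if the page images are opened with Pillow, each box could be passed to it directly to produce the region images for the listing page.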

Testing

Before merging (#49), the PM editor needs to be extensively tested to make sure all flows work. This could mean adding unit tests or integration tests, though even simple manual testing may be enough.

I already know that e.g. the edit conflict UI becomes unusable (#44) but that may be ok, as edit conflicts seem to be rare currently.

Bookchop

Some background on my goals here.

Cost of learning a new framework

The obvious cost of using ProseMirror (to me) is conceptual: one has to spend an hour or two reading the docs at https://prosemirror.net/docs/ (I don't remember exactly what was useful for me initially, but I think it was the blog post, the examples, and definitely the guide). This can be a barrier for casual contributors and keep the bus factor low. Some things I'm trying to do about this:

  • Wrote a quick-start guide: https://github.com/shreevatsa/ambuda/blob/9469f16995035889823a429b2bc9908141cc98d6/ambuda/static/js/pm-editor/README.md

  • Am trying to leave it to others to understand and merge the code, while I just work on it independently and post screenshots. This should give an idea of how much work is needed before others can easily make changes, e.g. maybe how much the quick-start guide needs to be expanded by or the code refactored.

    (Note to self: I have other reasons for working on my own: (1) my previous approach of trying to sell the vision and get agreement on the value first before further work, e.g. here/here, wasn't as successful as actually producing (screenshots of) even barely working initial code, which is fair enough, and (2) it avoids having to deal with consensus issues like #53, and (3) I actually want to start using this for books of interest to me. But the "increase bus factor" reason is the most virtuous one :P)

Anyway, writing down this cost as a barrier for adoption / reason not to use ProseMirror (at least yet).

Mark lines as headers, groups, footnote

Need some changes to schema, so that for a page like the below, we can select individual lines and:

  • Mark a single line as a header
  • Mark a bunch of lines as a group (verse)
  • Mark a bunch of lines as a footnote

This will also be a good start for other schema changes we need to make (for commentaries etc).

image

Allow ungrouping

Right now, if the label given to a group (verse/footnote) is incorrect, the only way to fix it (if it's not a local undo away) is to redo the entire page with Tools->Run OCR, losing all other proofreading one may have done. This should be easier.

Maybe hitting Ctrl-V again on a group will ungroup it? See toggleLink at https://prosemirror.net/examples/schema/ — may be helpful.

Button to turn off line images

After the basic breaking into lines and verifying that the lines make sense, we don't really need the individual line images until we get to the actual proofreading step. So it would be nice to be able to turn them off, and turn them back on when actually proofreading.

Don't allocate empty space to first and last lines

This should be a quick fix, but I'm writing it down instead of pursuing it: right now I allocate the empty space at the top of the page to the first line, and the empty space at the bottom of the page to the last line. These empty areas should be split off into new, separate lines.

First line:

image

Last line:

image
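One way the split could look, sketched in Python under assumptions: the OCR gives each line a vertical extent `(ymin, ymax)`, adjacent lines are separated at the midpoint of the gap between them, and `split_into_bands` is a hypothetical name. The actual allocation code in the editor may work differently.

```python
def split_into_bands(lines, page_height):
    """Split a page into vertical bands.

    lines: non-empty list of (ymin, ymax) per OCR line, sorted top to bottom.
    Returns a list of (top, bottom, line_index), where line_index is None
    for the empty bands above the first line and below the last line.
    """
    bands = []
    first_top = lines[0][0]
    if first_top > 0:
        # Empty space at the top becomes its own band, not part of line 0.
        bands.append((0, first_top, None))
    for i, (ymin, ymax) in enumerate(lines):
        # Interior boundaries sit halfway between neighbouring lines.
        top = ymin if i == 0 else (lines[i - 1][1] + ymin) / 2
        bottom = ymax if i == len(lines) - 1 else (ymax + lines[i + 1][0]) / 2
        bands.append((top, bottom, i))
    last_bottom = lines[-1][1]
    if last_bottom < page_height:
        # Likewise, empty space at the bottom is separate from the last line.
        bands.append((last_bottom, page_height, None))
    return bands


bands = split_into_bands([(440, 760), (778, 1060)], page_height=1500)
```

With this shape, the first and last line images would cover only the text itself, and the two margin bands could be shown as blank lines or hidden entirely.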

Highlighting region of image

We'd like words (or lines, at least) to correspond to regions of the image.

  • For now, this will help with debugging OCR response: #35
  • Even eventually, i.e. even after a page is proofread, it would be nice to retain this association so that anyone reading the text can check suspected typos by seeing the word in its context on the page.

Export doc (at least line groups) in usable format

Now that marking into groups (#43) is working, I think I already have most of what's needed for a (bare minimum) "Bookchop" (#32) MVP for personal use. The only thing to add is some way to use the manually marked groups outside of the proofreading UI. This is probably simple—I'm already saving the PM doc as JSON to the db, so probably just need to get out the doc and parse it—but creating this tracking issue as there may be things to think about.
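A minimal sketch of what getting out the doc and parsing it might involve, assuming the saved value is the JSON string produced by ProseMirror's `toJSON`. The node type names `lineGroup` and `line` and the `name` attribute are placeholders; the real schema's names would need to be substituted.

```python
import json


def node_text(node: dict) -> str:
    """Concatenate the text of all text nodes under a ProseMirror JSON node."""
    if node.get("type") == "text":
        return node.get("text", "")
    return "".join(node_text(child) for child in node.get("content", []))


def extract_groups(doc_json: str, group_type: str = "lineGroup",
                   line_type: str = "line") -> list[dict]:
    """Return every group in the saved doc with its attrs and its lines' text."""
    groups = []

    def walk(node: dict) -> None:
        if node.get("type") == group_type:
            lines = [node_text(child) for child in node.get("content", [])
                     if child.get("type") == line_type]
            groups.append({"attrs": node.get("attrs", {}), "lines": lines})
            return  # groups are assumed not to nest
        for child in node.get("content", []):
            walk(child)

    walk(json.loads(doc_json))
    return groups


# Hypothetical saved page: one ungrouped line, then a named two-line group.
saved = json.dumps({
    "type": "doc",
    "content": [
        {"type": "line", "content": [{"type": "text", "text": "header"}]},
        {"type": "lineGroup", "attrs": {"name": "48"}, "content": [
            {"type": "line", "content": [{"type": "text", "text": "first"}]},
            {"type": "line", "content": [{"type": "text", "text": "second"}]},
        ]},
    ],
})
groups = extract_groups(saved)
```

Things that may still need thought include ungrouped lines (skipped here), groups that continue across pages, and what to do when the schema changes after pages have been saved.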
