Comments (3)
We do not differentiate between an OcrdFile's .local_filename (which may be empty) and its .url. The latter could still be downloaded into the document.directory under some name and returned here. Or perhaps one could somehow make this downloading a lazy operation, only to be triggered when actually needed.
BTW, that's also how most OCR-D processors handle this. They rely on Workspace.download_file, which for non-local files will automatically download from the URL and store the result in the workspace (without actually changing the METS, but with a reproducible local path, so subsequent attempts will use the local copy).
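To make the lazy variant concrete, here is a rough sketch using only the Python standard library. It is not the browse-ocrd or OCR-D core implementation: the function name resolve_local_path, the downloaded subdirectory, and the hash-based file name are all assumptions for illustration. The idea is just that the URL is only fetched when no usable local file exists, and that the target path is derived from the URL so later calls reuse the copy.

```python
import hashlib
import os
import shutil
import urllib.parse
import urllib.request
from typing import Optional


def resolve_local_path(local_filename: Optional[str],
                       url: Optional[str],
                       directory: str) -> Optional[str]:
    """Return a local path for a file, downloading from `url` only if needed.

    Hypothetical helper: names and layout are illustrative assumptions.
    """
    # Prefer an existing local file (relative names are resolved
    # against the workspace directory).
    if local_filename:
        path = (local_filename if os.path.isabs(local_filename)
                else os.path.join(directory, local_filename))
        if os.path.exists(path):
            return path
    if not url:
        return None
    # Derive a reproducible name from the URL, so repeated calls
    # find the earlier copy instead of downloading again.
    suffix = os.path.splitext(urllib.parse.urlparse(url).path)[1]
    name = hashlib.sha1(url.encode('utf-8')).hexdigest() + suffix
    target = os.path.join(directory, 'downloaded', name)
    if not os.path.exists(target):
        os.makedirs(os.path.dirname(target), exist_ok=True)
        partial = target + '.part'
        # Write to a temporary name first, so an interrupted download
        # is never mistaken for a complete local copy.
        with urllib.request.urlopen(url) as response, open(partial, 'wb') as out:
            shutil.copyfileobj(response, out)
        os.replace(partial, target)
    return target
```

Note that nothing here touches the METS; the caller only gets a path back, which matches the behaviour described above for Workspace.download_file.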
See support_remote_images branch for progress
One additional feature wish: A graceful way to handle failing downloads, e.g. showing just a placeholder image instead of crashing outright. This does happen in our collection for files in the PRESENTATION fileGrp, which references files by file:// URLs that are not actually usable outside the network:
<mets:fileGrp USE="PRESENTATION">
  <mets:file ID="FILE_0001_PRESENTATION" MIMETYPE="image/tiff">
    <mets:FLocat xmlns:xlink="http://www.w3.org/1999/xlink" LOCTYPE="URL" xlink:href="file:///goobi/tiff001/sbb/PPN680203753/00000001.tif"/>
  </mets:file>
</mets:fileGrp>
I know that we should fix that on our side, but that is not trivial to do and we're probably not the only ones (mis)using mets:FLocat like this.
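For what it's worth, the fallback could be as simple as the following sketch. It uses only the standard library and is not taken from browse-ocrd; fetch_or_placeholder and PLACEHOLDER are hypothetical names, and the placeholder bytes merely stand in for a real placeholder image.

```python
import logging
import urllib.request

log = logging.getLogger(__name__)

# Hypothetical stand-in: a real viewer would keep an actual image here.
PLACEHOLDER = b'placeholder'


def fetch_or_placeholder(url, placeholder=PLACEHOLDER, timeout=10.0):
    """Return (data, ok): the file content, or the placeholder on failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read(), True
    except (OSError, ValueError) as err:
        # urllib.error.URLError is a subclass of OSError, so this covers
        # unreachable hosts and missing file:// paths; ValueError covers
        # malformed URLs.
        log.warning('Cannot retrieve %s: %s', url, err)
        return placeholder, False
```

With the file:///goobi/... URL from the METS above, this would log a warning and return the placeholder on any machine where that path does not exist, instead of raising.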