
Comments (7)

itkach avatar itkach commented on August 14, 2024

I propose adding pronunciation support to Aard so that new words are easier to learn by hearing them.

I see four ways to achieve this:

Include audio in the slob format.
Implementation: the format needs to be changed.

It doesn't. Slob can contain data of any type.

Audio can be quite heavy.
Pronunciations are not really connected to dicts themselves. Non-orthogonal solution.

They kind of are. The context in which audio appears is important, and correlating an abstract set of pronunciation audio back to "words" is rather non-trivial. Which pronunciation variant is it? Which language, which meaning, are there spelling variants?

Use separate audio-dictionary.
Implementation: archive of audio files, multi-track audio format (like mogg), brand new format?

It's possible to package media (or any content type) into a separate .slob and treat it as if it were part of some other dictionary (well, almost) by specifying a uri tag in the .slob that matches the other dictionary's uri.

Audio can be quite heavy.
Use online services.
Implementation: use API or crawl sites.

"Crawl sites" means downloading audio resources and including them in the dictionary. The only "API" needed, though, is an HTTP link to the audio file.

Use an external program.
Implementation: easy.
UX: depends on the program, additional click.

Which program for example?

Audio is really not much different from images, just another resource that is identified by a URL, dictionary-local or otherwise. Perhaps a good start would be to compile a Wiktionary without filtering out audio-related markup and see if we can get audio to play using Wiktionary's online hosted audio-files.

from aard2-android.

rominf avatar rominf commented on August 14, 2024

It doesn't. Slob can contain data of any type.

Cool! Didn't know that. I had wrong assumptions about what the Slob format is. Now, after reading https://github.com/itkach/slob, I have a better understanding.

Which program for example?

https://play.google.com/store/apps/details?id=ru.o2genum.howtosay

It's possible to package media (or any content type) into a separate .slob and treat it as if it were part of some other dictionary (well, almost) by specifying a uri tag in the .slob that matches the other dictionary's uri.

OK, I like the idea of using a separate Slob with audio for dictionaries. Didn't know that .slob files can be connected via uris.
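A minimal sketch of the uri-matching idea, assuming resources are looked up first in the dictionary that shows the article and then in any other dictionary carrying the same uri tag. The `Dictionary` class, `group_by_uri`, and `find_resource` are hypothetical stand-ins written for illustration; they are not the slob library's API, and the thread does not describe how Aard 2 resolves such lookups internally.

```python
from collections import defaultdict
from typing import Dict, List, Optional


class Dictionary:
    """Hypothetical stand-in for an opened .slob: tags plus key -> bytes content."""

    def __init__(self, name: str, uri: str, content: Dict[str, bytes]):
        self.name = name
        self.tags = {"uri": uri}
        self.content = content


def group_by_uri(dictionaries: List[Dictionary]) -> Dict[str, List[Dictionary]]:
    """Group dictionaries that declare the same uri tag."""
    groups = defaultdict(list)
    for d in dictionaries:
        groups[d.tags["uri"]].append(d)
    return dict(groups)


def find_resource(key: str, origin: Dictionary,
                  groups: Dict[str, List[Dictionary]]) -> Optional[bytes]:
    """Look up key in the originating dictionary, then in same-uri siblings."""
    siblings = [d for d in groups[origin.tags["uri"]] if d is not origin]
    for d in [origin] + siblings:
        if key in d.content:
            return d.content[key]
    return None


# Illustrative data: a text dictionary and a separate audio-only companion.
text = Dictionary("enwiktionary", "http://en.wiktionary.org",
                  {"test": b"<p>test</p>"})
audio = Dictionary("enwiktionary-audio", "http://en.wiktionary.org",
                   {"~/audio/test.ogg": b"OggS"})
other = Dictionary("ruwiktionary", "http://ru.wiktionary.org",
                   {"~/audio/test.ogg": b"other"})
groups = group_by_uri([text, audio, other])
```

With this layout, an article from the text dictionary can reference `~/audio/test.ogg` and get the bytes from the audio companion, while a dictionary with a different uri is never consulted.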

Perhaps a good start would be to compile a Wiktionary without filtering out audio-related markup and see if we can get audio to play using Wiktionary's online hosted audio-files.

That would be cool. Another thing that could be done with Wiktionary is to crawl it like this (pseudocode):

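audio_dicts = {}
for lang in wiktionary.langs:
    # One audio-only slob per Wiktionary language edition.
    audio_dicts[lang] = Slob()
    for word in wiktionary[lang].articles:
        # Key each entry by headword; the value is an article built from the word's audio files.
        audio_dicts[lang][str(word)] = generate_article(extract_audio_files(word))
for lang, audio_dict in audio_dicts.items():
    audio_dict.save('wiktionary_{}.slob'.format(lang))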

The biggest problem I see is that pronunciation sections are structured inconsistently across language editions (compare https://en.wiktionary.org/wiki/test and https://ru.wiktionary.org/wiki/%D0%BF%D1%80%D0%BE%D0%B2%D0%B5%D1%80%D0%BA%D0%B0, for example). Probably the easiest way around this is to wait until Wikimedia pushes Wiktionary's words into Wikidata.
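One possible way to sidestep the per-language differences in the `extract_audio_files` step of the pseudocode is to ignore template names and scan every template argument for an audio file extension. The sketch below takes raw wikitext as a string, which is an assumption (the pseudocode passes a word object). The extension list and the sample wikitext used with it are illustrative, not copied from live pages.

```python
import re
from typing import List

# Innermost {{...}} templates only; nested templates are matched from the inside.
TEMPLATE = re.compile(r"\{\{([^{}]*)\}\}")
# Assumed set of audio extensions; extend as needed.
AUDIO_EXT = (".ogg", ".oga", ".opus", ".wav", ".mp3", ".flac")


def extract_audio_files(wikitext: str) -> List[str]:
    """Return audio file names found in any template argument, in order, without duplicates."""
    found: List[str] = []
    for match in TEMPLATE.finditer(wikitext):
        for arg in match.group(1).split("|"):
            # Handle both positional ("x.ogg") and named ("audio=x.ogg") arguments.
            value = arg.split("=", 1)[-1].strip()
            if value.lower().endswith(AUDIO_EXT) and value not in found:
                found.append(value)
    return found


# Illustrative samples in the style of two language editions.
en_sample = "* {{audio|en|en-us-test.ogg|Audio (US)}}\n* {{IPA|en|/tɛst/}}"
ru_sample = "{{transcriptions-ru|прове́рка|Ru-проверка.ogg}}"
```

This only finds file names. It does not say which pronunciation variant or meaning a file belongs to, so the correlation problem raised earlier in the thread remains.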


qnga avatar qnga commented on August 14, 2024

Is it possible to include audio in slob files so that it is accessible over HTTP but not by any key? I can see in xdxf2slob that CSS and JS resources are actually included with a key starting with "~/". Is this the only way? How does this work? I have difficulty understanding this from the code bases.


itkach avatar itkach commented on August 14, 2024

Is it possible to include audio in slob files so that it is accessible over HTTP but not by any key?

Not sure what you're asking.

Slob is a simple key-value store: text for keys and arbitrary bytes for values, along with a content type specifying how to interpret the bytes. In aard2 (android and web), content is served by a custom embedded web server that interprets request URLs and translates them into keys to look up. "~/" is just a convention; there's nothing special about it. You can type ~/css in lookup and see a list of all resources starting with that key across all dictionaries.
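The model described above can be sketched in a few lines, assuming an in-memory dict in place of a real slob and a made-up `/slob/` URL prefix. `Blob`, `STORE`, `url_to_key`, `lookup`, and `keys_with_prefix` are hypothetical names for illustration and say nothing about how aard2's server actually lays out its URLs.

```python
from typing import List, NamedTuple, Optional
from urllib.parse import unquote, urlsplit


class Blob(NamedTuple):
    content_type: str  # tells the client how to interpret the bytes
    content: bytes


# Stand-in for a slob: text keys mapped to typed byte values.
STORE = {
    "test": Blob("text/html; charset=utf-8", b"<p>test</p>"),
    "~/css/default.css": Blob("text/css", b"p { margin: 0 }"),
    "~/audio/en-us-test.ogg": Blob("audio/ogg", b"OggS"),
}


def url_to_key(url: str, prefix: str = "/slob/") -> Optional[str]:
    """Translate a request URL into a lookup key (the URL layout is an assumption)."""
    path = unquote(urlsplit(url).path)
    if not path.startswith(prefix):
        return None
    return path[len(prefix):]


def lookup(url: str) -> Optional[Blob]:
    """Serve a request: the same path handles articles and "~/" resources."""
    key = url_to_key(url)
    return STORE.get(key) if key is not None else None


def keys_with_prefix(prefix: str) -> List[str]:
    """What typing a prefix such as "~/css" into lookup would list."""
    return sorted(k for k in STORE if k.startswith(prefix))
```

Nothing in the sketch treats "~/" specially: an article, a stylesheet, and an audio file are all reached through the same key lookup, which is the point made above.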


qnga avatar qnga commented on August 14, 2024

Sorry, I was too elliptical.
I would like to use locally stored HTML resources in a dictionary, i.e. images and audio files that are not meant to be accessed as separate articles. The most natural way I can think of is adding them to the slob without any key and linking to them by blob id rather than by key. Is this allowed by both the lib and Aard's web server? I don't think so. Otherwise, I can use keys prefixed by "~/" and rely on them for linking. Is this the preferred way?


itkach avatar itkach commented on August 14, 2024

without any key and linking to them by blob id rather than by key. Is this allowed by both the lib and Aard's web server? I don't think so.

Right, there has to be a key.

keys prefixed by "~/" and rely on them for linking. Is this the preferred way?

Yes.

It's quite similar to regular web sites: images, CSS and JavaScript all have URLs and are downloaded in the same way as the main HTML content. Users just don't typically type those URLs into the browser address bar or look directly at those resources. But they can.


qnga avatar qnga commented on August 14, 2024

Okay, I had understood keys as entry points rather than URLs. However, a list of all contents is never actually shown in Aard, and a user is unlikely to search for words beginning with ~/.

Thank you for this clarification.

