dict-fr-abu's Introduction

Installation

pip install dict-fr-ABU

French dictionaries from Association des Bibliophiles Universels (ABU)

DESCRIPTION

This package contains several dictionaries processed from those made available by the Association des Bibliophiles Universels (ABU) organization before 2003.

FILES

All files are installed under share/dict, relative to Python's installation prefix (its equivalent of /usr/local).
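
As a rough sketch of how a program might locate these files, the snippet below looks for share/dict under a few candidate prefixes. The helper names `dict_dirs` and `find_dict_file` are hypothetical, and the assumption that the files sit under `sys.prefix` or the per-user base directory may not hold for every installation (virtual environments and distribution packaging can differ).

```python
import site
import sys
from pathlib import Path


def dict_dirs(prefixes=None):
    """Return candidate share/dict directories, most likely first.

    Assumption: the package data lands under the interpreter prefix or,
    for `pip install --user`, under the per-user base directory.
    """
    if prefixes is None:
        prefixes = [sys.prefix, site.getuserbase()]
    return [Path(prefix) / "share" / "dict" for prefix in prefixes]


def find_dict_file(name, prefixes=None):
    """Return the path of the first existing file called `name`, or None."""
    for directory in dict_dirs(prefixes):
        candidate = directory / name
        if candidate.is_file():
            return candidate
    return None
```

Calling `find_dict_file("dict-fr-ABU-pays.ascii")` would then return a `Path` if the file is found under one of the candidate prefixes, and `None` otherwise.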

Original files

| Filename | Description |
| --- | --- |
| dict-fr-ABU-cites | List of 39,076 French cities (accented, with compound words), with their postal codes |
| dict-fr-ABU-Header-cites | French cities list (mandatory header) |
| dict-fr-ABU-dicorth | 1,500 French orthographical difficulties, by decreasing frequency (with compound words) |
| dict-fr-ABU-Header-dicorth | French orthographical difficulties (mandatory header) |
| dict-fr-ABU-mots_communs | 255,282 French common words (including feminine and plural forms, as well as conjugated verbs), each with its singular or unconjugated form and its type |
| dict-fr-ABU-pays | 170 countries and regions (with compound words) |
| dict-fr-ABU-Header-pays | Countries and regions (mandatory header) |
| dict-fr-ABU-prenoms | 12,437 first names (unaccented) |
| dict-fr-ABU-Header-prenoms | First names (mandatory header) |
| dict-fr-ABU-License | ABU 1.1 license |

Generated files

| Filename | Description |
| --- | --- |
| dict-fr-ABU-cites.ascii | French cities list (unaccented) |
| dict-fr-ABU-cites.unicode | French cities list (accented) |
| dict-fr-ABU-cites.combined | French cities list (with both accented and unaccented words) |
| dict-fr-ABU-mots_communs.ascii | French common words (unaccented) |
| dict-fr-ABU-mots_communs.unicode | French common words (accented) |
| dict-fr-ABU-mots_communs.combined | French common words (with both accented and unaccented words) |
| dict-fr-ABU-pays.ascii | Countries and regions (unaccented) |
| dict-fr-ABU-pays.unicode | Countries and regions (accented) |
| dict-fr-ABU-pays.combined | Countries and regions (with both accented and unaccented words) |
| dict-fr-ABU-prenoms.ascii | First names (unaccented) |
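
To illustrate how a spell(1)-like tool might consume one of the generated lists, here is a minimal sketch. It assumes the generated files hold one UTF-8 entry per line, which the package does not state explicitly; `load_words` and `unknown_words` are hypothetical helper names, and the matching is deliberately naive (whitespace tokenization, case-sensitive).

```python
def load_words(path):
    """Read a word list (assumed: one UTF-8 entry per line) into a set."""
    with open(path, encoding="utf-8") as handle:
        return {line.strip() for line in handle if line.strip()}


def unknown_words(text, words):
    """Return the whitespace-separated tokens of `text` missing from `words`.

    Tokens are compared as-is: no case folding, no punctuation stripping.
    """
    return [token for token in text.split() if token not in words]
```

Loading the list into a set keeps each membership test cheap, which matters for a list the size of the common words file.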

These generated files went through the following transformations:

  • extraction of the headers into the dict-fr-ABU-Header-* files above
  • conversion from ISO-Latin-1 to UTF-8
  • sort
  • removal of duplicates
  • removal of lemma and grammatical info from dict-fr-ABU-mots_communs
  • removal of the zip codes from dict-fr-ABU-cites
  • conversion of accented characters to their unaccented equivalents for the *.ascii versions
  • combination of the *.ascii and *.unicode versions into the *.combined ones (without duplicates)
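
The core of these steps can be sketched as follows. This is not the script actually used to build the package, only an approximation under stated assumptions: the input is taken to be one entry per line, headers, lemma information and zip codes are assumed to be already removed, accents are stripped through Unicode NFD decomposition, and sorting follows Python's default code point order, which may differ from the order in the shipped files. The function names are hypothetical.

```python
import unicodedata


def strip_accents(word):
    """Drop combining marks after NFD decomposition (e.g. 'é' becomes 'e').

    Characters without a canonical decomposition, such as 'æ', pass
    through unchanged, so the result is not guaranteed to be pure ASCII.
    """
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_variants(latin1_bytes):
    """Return (unicode_words, ascii_words, combined) as sorted lists."""
    # Conversion from ISO-Latin-1: decode to str, written out later as UTF-8.
    text = latin1_bytes.decode("iso-8859-1")
    # Building a set removes duplicates; empty lines are skipped.
    words = {line.strip() for line in text.splitlines() if line.strip()}
    unicode_words = sorted(words)
    ascii_words = sorted({strip_accents(word) for word in words})
    # Union of both versions, again without duplicates.
    combined = sorted(set(unicode_words) | set(ascii_words))
    return unicode_words, ascii_words, combined
```

Each returned list could then be written out with `"\n".join(...)` and `encoding="utf-8"` to produce the corresponding file.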

SEE ALSO

spell(1)-like tools, anagram(6)

HISTORY

These data files were originally intended to be used with the PNU project's anagram command, as well as many other text processing tools.

I wrote a history of Unix and French dictionaries (in French only), which covers these dictionaries and many others.

LICENSE

The original contents, as well as this package, are licensed under the ABU 1.1 license.

Some source files had mandatory headers that were kept under data/dict-fr-ABU-Header-* rather than in the files themselves, in order to ease direct processing with other tools.

AUTHORS

Association des Bibliophiles Universels (ABU) for the original contents.

Hubert Tournier for the package.
