Comments (3)

hkage commented on June 14, 2024

Hi,

With the current version of pganonymize, the Faker library is always initialized with the default locale en_US.

To use localized providers, the locale would have to be configurable, either as an optional argument in the YAML schema definition or as an additional property of the FakeProvider. This is not supported at the moment, but access to the localized providers is a great idea: we would probably also use localized data such as VAT IDs or states.

I will look into it. Thank you for reporting and requesting this feature.

Regards,
Henning

from pganonymize.

hkage commented on June 14, 2024

I suppose the main difficulty for the implementation is performance: if the locale is passed at a table's field level in the YAML schema and the Faker class is instantiated for each table record (instead of once, module-wide), execution time suffers badly, e.g.:

>>> import timeit
>>> # One Faker instance, created once in setup and reused for every call
>>> timeit.timeit('fake.first_name()', setup="import faker; fake = faker.Faker()", number=1000)
0.3215181827545166
>>> # A new Faker instance for every call
>>> timeit.timeit('faker.Faker().first_name()', setup="import faker", number=1000)
14.740003108978271

So I guess the only way to avoid initialization at record level is to provide something like global provider options in the YAML schema. These would be passed to a single, reusable Faker instance that is used for all records, like this:

tables:
  - address:
      fields:
        - first_name:
            provider:
              name: fake.first_name
        - last_name:
            provider:
              name: fake.last_name
        - vat_id:
            provider:
              name: fake.ssn

options:
  faker:
    locales:
      - de_DE
      - fr_FR
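
A minimal sketch of how such global options could feed one reusable instance. This is not pganonymize's actual implementation: `SharedFaker`, `locales_from_schema` and `default_factory` are hypothetical names, and the `schema` dict only stands in for the parsed YAML shown above.

```python
from typing import Any, Callable, Dict, Optional, Tuple

Locales = Optional[Tuple[str, ...]]


def locales_from_schema(schema: Dict[str, Any]) -> Locales:
    """Read options.faker.locales from a parsed schema; None means Faker's default."""
    locales = schema.get("options", {}).get("faker", {}).get("locales")
    return tuple(locales) if locales else None


def default_factory(locales: Locales) -> Any:
    """Build a real Faker generator (imported lazily, only when first needed)."""
    from faker import Faker

    return Faker(list(locales) if locales else None)


class SharedFaker:
    """Hypothetical holder that creates the generator once and reuses it."""

    def __init__(
        self,
        schema: Dict[str, Any],
        factory: Callable[[Locales], Any] = default_factory,
    ) -> None:
        self._locales = locales_from_schema(schema)
        self._factory = factory
        self._instance: Any = None

    def get(self) -> Any:
        # The expensive construction happens on the first call only.
        if self._instance is None:
            self._instance = self._factory(self._locales)
        return self._instance


# Stand-in for the parsed YAML schema (assumption: only the options matter here).
schema = {"options": {"faker": {"locales": ["de_DE", "fr_FR"]}}}
```

With Faker installed, `SharedFaker(schema).get()` would be called once per record and return the same generator each time, so the construction cost measured above is paid only once.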

Faker's multi-locale mode could also be used to provide more than one locale. However, common generator methods such as first_name or last_name would then return names drawn from any of the configured locales rather than from one specific locale.
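
One possible way around the mixed results, sketched under assumptions: a multi-locale Faker proxy can be indexed by locale name to get the generator for that locale, so a field could be pinned to one locale while the instance stays shared. The helper `generator_for` and the idea of a per-field locale are hypothetical and not part of the proposal above.

```python
from typing import Any, Optional


def generator_for(fake: Any, locale: Optional[str] = None) -> Any:
    """Return the generator pinned to `locale`, or the multi-locale proxy itself."""
    return fake[locale] if locale else fake


def demo() -> None:
    from faker import Faker  # third-party; imported only when the demo runs

    fake = Faker(["de_DE", "fr_FR"])
    print(fake.first_name())                          # German or French name
    print(generator_for(fake, "de_DE").first_name())  # always from de_DE
    print(generator_for(fake, "fr_FR").last_name())   # always from fr_FR
```

Calling `demo()` requires Faker to be installed. The lookup happens on an already constructed instance, so no new Faker object is created per record.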

hkage commented on June 14, 2024

The localization feature will be part of the upcoming release 0.10.0 - thanks to @BuddhaOhneHals for the contribution.
