datafaker's Introduction

Data Faker

This library is a modern fork of java-faker with up-to-date libraries and several newly added fake data generators.

Datafaker 2.x has Java 17 as the minimum requirement.

If Java 17 is not an option for you, you can use Datafaker 1.x instead. Datafaker 1.x is built on Java 8, but it is no longer maintained. We recommend that all users upgrade to Datafaker 2.x.

This library generates fake data, similar to other fake data generators. It's useful when you're developing a new project and need some realistic-looking data for a showcase.

Usage

In the pom.xml, add the following fragment to the dependencies section:

<dependency>
    <groupId>net.datafaker</groupId>
    <artifactId>datafaker</artifactId>
    <version>2.1.0</version>
</dependency>

For Gradle users, add the following to your build.gradle file.

dependencies {
    implementation 'net.datafaker:datafaker:2.1.0'
}

You can also use the snapshot version (2.1.1-SNAPSHOT), which is published automatically after every push to the main branch of this repository. The binary repository URL for snapshot downloads is https://s01.oss.sonatype.org/content/repositories/snapshots/.

Get started

In your Java code:

Faker faker = new Faker();

String name = faker.name().fullName(); // Miss Samanta Schmidt
String firstName = faker.name().firstName(); // Emory
String lastName = faker.name().lastName(); // Barton

String streetAddress = faker.address().streetAddress(); // 60018 Sawayn Brooks Suite 449

Or in your Kotlin code:

val faker = Faker()

val name = faker.name().fullName() // Miss Samanta Schmidt
val firstName = faker.name().firstName() // Emory
val lastName = faker.name().lastName() // Barton

val streetAddress = faker.address().streetAddress() // 60018 Sawayn Brooks Suite 449

JShell

# from project root folder
jshell --class-path $(ls -d target/*.jar | tr '\n' ':')
|  Welcome to JShell -- Version 17.0.4
|  For an introduction type: /help intro

jshell> import net.datafaker.Faker;

jshell> var faker = new Faker();
faker ==> net.datafaker.Faker@c4437c4

jshell> faker.address().city();
$3 ==> "Brittneymouth"

jshell> faker.name().fullName();
$5 ==> "Vernie Schmidt"

Expressions

Faker faker = new Faker();
faker.expression("#{letterify 'test????test'}"); // testqwastest
faker.expression("#{numerify '#test#'}"); // 3test5
faker.expression("#{templatify 'test','t','q','@'}"); // @esq
faker.expression("#{examplify 'test'}"); // ghjk
faker.expression("#{regexify '[a-z]{4,10}'}"); // wbevoa
faker.expression("#{options.option '23','2','5','$','%','*'}"); // *
faker.expression("#{date.birthday 'yy DDD hh:mm:ss'}"); // 61 327 08:11:45
faker.expression("#{csv '1','name_column','#{Name.first_name}','last_name_column','#{Name.last_name}'}");
// "name_column","last_name_column"
// "Sabrina","Kihn"
faker.expression("#{json 'person','#{json ''first_name'',''#{Name.first_name}'',''last_name'',''#{Name.last_name}''}','address','#{json ''country'',''#{Address.country}'',''city'',''#{Address.city}''}'}");
// {"person": {"first_name": "Barbie", "last_name": "Durgan"}, "address": {"country": "Albania", "city": "East Catarinahaven"}}

More examples are available at https://www.datafaker.net/documentation/expressions/

Collections

Faker faker = new Faker();
List<String> names = faker.collection(
                              () -> faker.name().firstName(),
                              () -> faker.name().lastName())
                         .len(3, 5)
                         .generate();
System.out.println(names);
// [Skiles, O'Connell, Lorenzo, West]

More examples are available at https://www.datafaker.net/documentation/sequences/

Streams

Faker faker = new Faker();
// generate an infinite stream
Stream<String> names = faker.stream(
                              () -> faker.name().firstName(),
                              () -> faker.name().lastName())
                         .generate();

Formats

Schema

There are two ways to generate data in a specific format:

  1. Generate it from scratch.
  2. Take an existing sequence of objects, extract values from them, and return those values in the required format.

Both cases need a Schema that describes the fields and how their data is produced. When generating from scratch, Suppliers are enough; when transforming existing objects, Functions are required.

CSV

// transformer could be the same for both
CsvTransformer<Name> transformer =
        CsvTransformer.<Name>builder().header(true).separator(",").build();
// Schema for from scratch
Schema<Name, String> fromScratch =
    Schema.of(field("firstName", () -> faker.name().firstName()),
        field("lastname", () -> faker.name().lastName()));
System.out.println(transformer.generate(fromScratch, 2));
// POSSIBLE OUTPUT
// "firstName","lastname"
// "Kimberely","Considine"
// "Mariela","Krajcik"
// ----------------------
// Schema for transformations
Schema<Name, String> schemaForTransformations =
    Schema.of(field("firstName", Name::firstName),
        field("lastname", Name::lastName));
// Here we pass a collection of Name objects and extract first and lastnames from each element
System.out.println(
    transformer.generate(
        faker.collection(faker::name).maxLen(2).generate(), schemaForTransformations));
// POSSIBLE OUTPUT
// "firstName","lastname"
// "Kimberely","Considine"
// "Mariela","Krajcik"

JShell

# from project root folder
jshell --class-path $(ls -d target/*.jar | tr '\n' ':')
|  Welcome to JShell -- Version 17.0.4
|  For an introduction type: /help intro

jshell> import net.datafaker.Faker;

jshell> import net.datafaker.providers.base.Name;

jshell> import net.datafaker.transformations.Schema;

jshell> import net.datafaker.transformations.CsvTransformer;

jshell> import static net.datafaker.transformations.Field.field;

jshell> var faker = new Faker();
faker ==> net.datafaker.Faker@c4437c4

jshell> Schema fromScratch =
   ...>     Schema.of(field("firstName", () -> faker.name().firstName()),
   ...>         field("lastname", () -> faker.name().lastName()));
fromScratch ==> net.datafaker.transformations.Schema@306a30c7

jshell> CsvTransformer<Name> transformer =
   ...>     CsvTransformer.<Name>builder().header(true).separator(",").build();
transformer ==> net.datafaker.transformations.CsvTransformer@506c589e

jshell> System.out.println(transformer.generate(fromScratch, 2));
"firstName","lastname"
"Darcel","Schuppe"
"Noelle","Smitham"

JSON

Schema<Object, ?> schema = Schema.of(
    field("firstName", () -> faker.name().firstName()),
    field("lastName", () -> faker.name().lastName())
    );

JsonTransformer<Object> transformer = JsonTransformer.builder().build();
String json = transformer.generate(schema, 2);
// [{"firstName": "Oleta", "lastName": "Toy"},
// {"firstName": "Gerard", "lastName": "Windler"}]

More complex examples and other formats, such as YAML and XML, can be found at https://www.datafaker.net/documentation/formats/

Unique Values

Faker faker = new Faker();

// The values returned in the following lines will never be the same.
String firstUniqueInstrument = faker.unique().fetchFromYaml("music.instruments"); // "Flute"
String secondUniqueInstrument = faker.unique().fetchFromYaml("music.instruments"); // "Clarinet"

More examples can be found in https://www.datafaker.net/documentation/unique-values

Custom provider

Add your own custom provider in your app following steps from https://www.datafaker.net/documentation/custom-providers/

Documentation

Getting started.

Contributions

See CONTRIBUTING.md

If this is your first time contributing then you may find it helpful to read FIRST_TIME_CONTRIBUTOR.md

Providers

The list below is not complete and shows only a part of available providers. To view the full list of providers, please follow the link: Full list of providers.

  • Address
  • Ancient
  • Animal
  • App
  • Appliance
  • Aqua Teen Hunger Force
  • Artist
  • Australia
  • Avatar
  • Aviation
  • AWS
  • Azure
  • Babylon 5
  • Back To The Future
  • Barcode
  • Baseball
  • Basketball
  • Battlefield 1
  • Beer
  • Big Bang Theory
  • Blood Type
  • Bojack Horseman
  • Book
  • Bool
  • Bossa Nova
  • Brand
  • Breaking Bad
  • Brooklyn Nine-Nine
  • Buffy
  • Business
  • CNPJ (Brazilian National Registry of Legal Entities)
  • CPF (Brazilian individual taxpayer registry identification)
  • Camera
  • Cat
  • Chuck Norris
  • Clash of Clans
  • Code
  • Coin
  • Color
  • Commerce
  • Community
  • Company
  • Compass
  • Computer
  • Control
  • Country
  • Credit Card Type
  • Cricket
  • Crypto
  • Currency
  • Date and Time
  • DC Comics
  • Demographic
  • Departed
  • Dessert
  • Device
  • Disease
  • Doctor Who
  • Dog
  • Domain
  • Doraemon
  • Dragon Ball
  • Driving License
  • Dumb and Dumber
  • Dune
  • Durations
  • Educator
  • Elden Ring
  • Elder Scrolls
  • Electrical Components
  • Emoji
  • England Football
  • Esports
  • Fallout
  • Family Guy
  • Famous Last Words
  • File
  • Final Space
  • Finance
  • Food
  • Formula 1
  • Friends
  • Fullmetal Alchemist: Brotherhood
  • Funny Name
  • Futurama
  • Game Of Thrones
  • Garment Size
  • Gender
  • Ghostbusters
  • Grateful Dead
  • Greek Philosopher
  • Hacker
  • Harry Potter
  • Hashing
  • Hearthstone
  • Heroes of the Storm
  • Hey Arnold
  • Hipster
  • Hitchhiker's Guide To The Galaxy
  • Hobbit
  • Hobby
  • Horse
  • House
  • How I Met Your Mother
  • IdNumber
  • Industry Segments
  • Internet
  • Job
  • Joke
  • K-pop (Korean popular music)
  • Kaamelott
  • Language Code
  • League Of Legends
  • Lebowski
  • Locality
  • Lord Of The Rings
  • Lorem
  • Marketing
  • Marvel Snap
  • Mass Effect
  • Matz
  • MBTI
  • Measurement
  • Medical
  • Military
  • Minecraft
  • Money
  • Money Heist
  • Mood
  • Mountaineering
  • Mountains
  • Movie
  • Music
  • Name
  • Naruto
  • Nation
  • Nato Phonetic Alphabet
  • Nigeria
  • Number
  • One Piece
  • Options
  • Oscar Movie
  • Overwatch
  • Passport
  • Password
  • Phone Number
  • Photography
  • Planet
  • Pokemon
  • Princess Bride
  • Programming Language
  • Red Dead Redemption 2
  • Relationship Terms
  • Resident Evil
  • Restaurant
  • Rick and Morty
  • Robin
  • Rock Band
  • RuPaul's Drag Race
  • Science
  • Seinfeld
  • Shakespeare
  • Silicon Valley
  • Simpsons
  • Sip
  • Size
  • Slack Emoji
  • Soul Knight
  • Space
  • StarCraft
  • StarTrek
  • Stock
  • Studio Ghibli
  • Subscription
  • Super Mario
  • Superhero
  • Tea
  • Team
  • The IT Crowd
  • Time
  • Touhou
  • Tron
  • Twin Peaks
  • Twitter
  • University
  • Vehicle
  • Verb
  • Volleyball
  • Weather
  • Witcher
  • Yoda
  • Zelda

Usage with Locales

Faker faker = new Faker(new Locale("YOUR_LOCALE"));

For example:

new Faker(new Locale("en", "US")).address().zipCodeByState("CA");

Supported Locales

  • ar
  • bg
  • ca
  • ca-CAT
  • cs
  • da-DK
  • de
  • de-AT
  • de-CH
  • el-GR
  • en
  • en-AU
  • en-au-ocker
  • en-BORK
  • en-CA
  • en-GB
  • en-IND
  • en-MS
  • en-NEP
  • en-NG
  • en-NZ
  • en-PAK
  • en-SG
  • en-UG
  • en-US
  • en-ZA
  • en-PH
  • es
  • es-MX
  • fa
  • fi-FI
  • fr
  • he
  • hu
  • in-ID
  • it
  • ja
  • ka
  • ko
  • nb-NO
  • nl
  • pl
  • pt
  • pt-BR
  • ru
  • sk
  • sv
  • sv-SE
  • tr
  • uk
  • vi
  • zh-CN
  • zh-TW

LICENSE

Copyright (c) 2024 Datafaker.net. See the LICENSE file for license rights and limitations.

datafaker's Issues

Generate password with both lower and upper case guaranteed

Is your feature request related to a problem? Please describe.
The password generator doesn't allow generating passwords with both lower and upper case in a consistent way.
Right now, whether a character is upper or lower case is decided by this.faker.bool().bool().
This will regularly generate an incorrect password, especially if it's short.

Describe the solution you'd like
I want to be able to consistently generate a password with at least:

  • a given minimum & maximum length
  • 1 lower case
  • 1 upper case
  • 1 special
  • 1 number
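
The requirement list above could be met with a construction like the following sketch. It uses only java.util.Random rather than Datafaker's own password generator; the class name PasswordSketch, the method generate, and the special-character set are assumptions made for illustration. The idea is to pick one character from every required class first, fill the rest from the combined pool, and then shuffle.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class PasswordSketch {
    private static final String LOWER = "abcdefghijklmnopqrstuvwxyz";
    private static final String UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    private static final String DIGITS = "0123456789";
    // Assumed special-character set; adjust to your own policy.
    private static final String SPECIAL = "!@#$%^&*";

    /** Returns a password whose length is in [minLen, maxLen] and that contains every class. */
    static String generate(Random random, int minLen, int maxLen) {
        if (minLen < 4 || maxLen < minLen) {
            throw new IllegalArgumentException("need 4 <= minLen <= maxLen");
        }
        int length = minLen + random.nextInt(maxLen - minLen + 1);
        List<Character> chars = new ArrayList<>(length);

        // Step 1: one guaranteed character from each required class.
        for (String pool : List.of(LOWER, UPPER, DIGITS, SPECIAL)) {
            chars.add(pool.charAt(random.nextInt(pool.length())));
        }
        // Step 2: fill the remaining positions from the combined pool.
        String all = LOWER + UPPER + DIGITS + SPECIAL;
        while (chars.size() < length) {
            chars.add(all.charAt(random.nextInt(all.length())));
        }
        // Step 3: shuffle so the guaranteed characters are not always at the front.
        Collections.shuffle(chars, random);

        StringBuilder sb = new StringBuilder(length);
        for (char c : chars) {
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(generate(new Random(), 8, 12));
    }
}
```

Because the guaranteed characters are chosen before the random fill, even a 4-character password satisfies all four constraints, which a per-character coin flip cannot promise.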

Locale "ar" - Generating Male FullName with Female Title and Vice Versa.

Describe the bug
Using Locale("ar") to generate a fake full name, the generated male full name is associated with a female title and vice versa.

To Reproduce

Faker fake = new Faker(new Locale("ar")); 
String fullName = fake.name().fullName(); 

Expected behavior

Female Example 1:
Generated By FakeData : السيّد الخنساء زايد
With Correct Title: السيّدة الخنساء زايد

Female Example 2:
Generated By FakeData: الدكتور فاطمة العديني
With Correct Title: الدكتورة فاطمة العديني

Male Example:
Generated By FakeData : السيّدة عبدالعزيز بامحفوظ
With Correct Title: السيّد عبدالعزيز بامحفوظ

Versions:

  • OS: Windows 10
  • JDK: Java 17
  • Faker Version 1.1.0

Provide a (better) changelog

First off, thanks a lot for this well maintained fork of javafaker. Very much appreciated!

IMO, this project is lacking a "proper" changelog. There is https://www.datafaker.net/releases/1.4.0/, but it's not linked from the Github releases page, so it's not that easily found when one has a very Github-centric workflow.
In fact, there are just tags, not "proper" releases here on Github.

I suggest creating a release entry per actual release and placing a link to the respective website page in it.
And to get a detailed list of changes, I suggest letting GH auto-generate it (there is a button on the release creation page).

Also, being the maintainer of a GH project myself (https://github.com/gitflow-incremental-builder/gitflow-incremental-builder), I found Github milestones to be very handy.

Normalized phone_number

Is your feature request related to a problem? Please describe.
At my job I often use the phone_number feature, but I work with phone numbers without any signs (only country code and digits), so I have to normalize the phone numbers myself.

Describe the solution you'd like
It would be perfect to add a new method, normalizedPhoneNumber(), to the PhoneNumber class that returns a normalized phone number.

Of course, I can implement it myself.
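
Until such a method exists, the normalization described here can be done on the caller's side. The sketch below is an assumption about what "normalized" means (digits only, all punctuation and spaces removed); PhoneNormalizer and normalize are hypothetical names, not part of Datafaker.

```java
public class PhoneNormalizer {
    /** Strips every character that is not an ASCII digit, e.g. '+', spaces, dashes and parentheses. */
    static String normalize(String phoneNumber) {
        return phoneNumber.replaceAll("\\D", "");
    }

    public static void main(String[] args) {
        System.out.println(normalize("+1 (555) 010-9999")); // prints 15550109999
    }
}
```

Note that an extension such as "x123" would lose its "x" and have its digits appended, so formats with extensions need separate handling.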

weapons feature request

I read the issues of javafaker and know that this is the fork that gets the most updates, so I am creating the same issue here, hoping someone will address it.

Is your feature request related to a problem? Please describe.

There are many weapons in modern society. Please consider adding this feature.

Describe the solution you'd like

Weapon type, bullet diameter, manufacturer, etc.

Apply spotless autoformating with google-java-format

Is your feature request related to a problem? Please describe.
This is a follow-up task for #329
The issue is that if we blindly apply spotless google-java-format, then the formatting required by spotless will differ from the formatting done by the IDE.

Describe the solution you'd like
The idea is to set up the google-java-format plugin for IntelliJ IDEA and Eclipse, as mentioned at https://github.com/google/google-java-format,
and make it the default for the project.
Then we could enable autoformatting for spotless, since both spotless and the IDE will produce the same formatting.

WDYT?

// cc @bodiam @yuokada

Add uuidv3

Is your feature request related to a problem? Please describe.
There is currently no way of generating repeatable/reproducible UUIDs. They are always random, even if a constant random seed is provided.

Describe the solution you'd like
Add internet().uuidv3() which is based on a parameter we can control. We could simply generate random bytes using the random service (which in turn uses the constant random seed). The output would be of the same structure as the existing uuid() method. It would merely be uuid v3 instead of v4.

This has been discussed in #278. Created a new issue to give this issue/solution a bit more "exposure/visibility".
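
The proposal can be illustrated with the JDK alone. In the sketch below, a seeded java.util.Random stands in for Datafaker's random service (an assumption), and UUID.nameUUIDFromBytes turns the random bytes into a version 3 UUID. SeededUuidSketch and nextUuidV3 are hypothetical names.

```java
import java.util.Random;
import java.util.UUID;

public class SeededUuidSketch {
    /** Draws 16 bytes from the given Random and derives a name-based (version 3) UUID from them. */
    static UUID nextUuidV3(Random random) {
        byte[] name = new byte[16];
        random.nextBytes(name);
        return UUID.nameUUIDFromBytes(name);
    }

    public static void main(String[] args) {
        UUID first = nextUuidV3(new Random(42));
        UUID second = nextUuidV3(new Random(42));
        System.out.println(first.equals(second)); // prints true: same seed, same UUID
        System.out.println(first.version());      // prints 3
    }
}
```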

Unclear on how to use Format.toCsv(FakeCollection<T> collection)

This is neither a bug nor a feature request, that's why I'm writing in free form, I hope that's okay.

I'm trying to do the following:

  1. Generate valid CSV data (using datafaker)
  2. Give CSV string to a parser (the one under test)
  3. Compare parsed entities with the generated CSV data

How could one go about doing this? I saw that there is a Format.toCSV() overload that accepts a collection but it's unclear to me how you'd use that method.
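
One way to approach the three steps, independent of the Format API, is to keep the generated rows as the ground truth, render the CSV from them, and compare the parser's output against those same rows. The sketch below uses only the JDK; CsvRoundTripSketch, toCsv and parse are hypothetical names, and parse is a deliberately naive stand-in for the parser under test. In a real test, the rows would be filled with Datafaker values.

```java
import java.util.ArrayList;
import java.util.List;

public class CsvRoundTripSketch {
    /** Renders rows as CSV with every cell quoted and embedded quotes doubled. */
    static String toCsv(List<List<String>> rows) {
        StringBuilder sb = new StringBuilder();
        for (List<String> row : rows) {
            List<String> cells = new ArrayList<>();
            for (String value : row) {
                cells.add("\"" + value.replace("\"", "\"\"") + "\"");
            }
            sb.append(String.join(",", cells)).append('\n');
        }
        return sb.toString();
    }

    /** Naive stand-in for the parser under test; it does not support cells containing "," sequences. */
    static List<List<String>> parse(String csv) {
        List<List<String>> rows = new ArrayList<>();
        for (String line : csv.split("\n")) {
            if (line.isEmpty()) {
                continue;
            }
            String body = line.substring(1, line.length() - 1); // drop the outer quotes
            List<String> row = new ArrayList<>();
            for (String cell : body.split("\",\"", -1)) {
                row.add(cell.replace("\"\"", "\""));
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Step 1: the generated data (fixed values here; generated values in practice).
        List<List<String>> expected = List.of(
                List.of("Sabrina", "Kihn"),
                List.of("Barbie", "O'Connell"));
        // Step 2: hand the CSV string to the parser.
        List<List<String>> actual = parse(toCsv(expected));
        // Step 3: compare the parsed rows with the generated rows.
        System.out.println(expected.equals(actual)); // prints true
    }
}
```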

How to generate JSON with different data providers

[{
    "firstName": "Tesha", 
    "lastName": "Wyman", 
    "address": "Algeria"
}]
[{
    "firstName": "Brenton", 
    "lastName": "Schumm", 
    "address": ["Saint Pierre and Miquelon"]
}]

Currently, the only way I have found to generate JSON with different providers is an approach like .set("phones", name -> faker.collection(() -> faker.phoneNumber().phoneNumber()).len(2).build().get()), but this outputs the value as a list (code snippet 2). Is it possible to generate JSON as in code snippet 1?

"address": "Algeria" and "address": ["Saint Pierre and Miquelon"], I need to achieve the first result.

Data generation repeatability and Faker attribute access

Is your feature request related to a problem? Please describe.
In order to support data generation repeatability, it is important to be able to easily set the Random seed. Currently, the Faker constructor signatures make this a difficult task when a Faker uses another Faker internally.

Faker A should be able to easily pass along its Random when constructing Faker B. While you can get access to the RandomService, you cannot easily obtain the Random object within it. Using the RandomService to construct Faker B then requires other parameters to do so.

A similar problem exists with Locales last I recall, where it is difficult to access the current configuration of a given Faker.

Describe the solution you'd like
Attributes like Random and Locale should be accessible after a given Faker is constructed. This would make it easier to use Fakers internally within other Fakers, and would improve logging where it is important to trace these elements for test repeatability.
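
The request can be pictured with a small stand-in. MiniFaker below is a hypothetical class, not Datafaker's Faker; it only shows how exposing the Random and Locale lets a second instance reuse them, so that both draw from a single seeded sequence.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Random;

public class SeedSharingSketch {
    /** Hypothetical stand-in for a faker that exposes its configuration. */
    static final class MiniFaker {
        private final Locale locale;
        private final Random random;

        MiniFaker(Locale locale, Random random) {
            this.locale = locale;
            this.random = random;
        }

        Locale locale() { return locale; }
        Random random() { return random; }
        int digit() { return random.nextInt(10); }
    }

    /** Interleaves values from two fakers that share one seeded Random. */
    static List<Integer> generate(long seed) {
        MiniFaker a = new MiniFaker(Locale.US, new Random(seed));
        // Faker B is built from the attributes that faker A exposes.
        MiniFaker b = new MiniFaker(a.locale(), a.random());
        List<Integer> values = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            values.add(a.digit());
            values.add(b.digit());
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(generate(42L).equals(generate(42L))); // prints true
    }
}
```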

A bug in the job method

Describe the bug
The bug with Chinese output still exists, just as in the https://github.com/DiUS/java-faker/ library.
To Reproduce

import net.datafaker.Faker;
import org.junit.jupiter.api.Test;
import org.springframework.boot.test.context.SpringBootTest;

import java.util.Locale;

@SpringBootTest
class SpringBootStudyApplicationTests {

    @Test
    void testFaker() {
        Faker faker = new Faker(Locale.CHINA);
        System.out.println(faker.name().name());
        System.out.println(faker.job().field());
        System.out.println(faker.job().position());
        System.out.println(faker.job().title());
    }
}

result
image

Versions:

  • OS: windows
  • JDK 8
  • Faker 1.6.0

Code Style issue

Issue:

Most of the source files are using '4 indentation code style'.
But at the same time some of the files are written with '2 indentation code style'.
It's quite inconvenient to make changes in 2 different files, with 2 different code styles.

Describe the solution you'd like

Define the rules of code style and probably add some static code analyzer, which could check a number of rules, including indentations...

Describe alternatives you've considered

I don't see any alternatives, as many people use different IDEs with different code style configurations.

Example

image

image

Architectural changes creating either modules or abstractions for the generators

Context

This is an excellent fork of JavaFaker that adds everything the original project's committers aren't approving. Most of the requests either improve the library or add new generators.

Problem state

One of the problems with the fork is that we got a lot of "not so focused and useful" generators, such as the many movie-related ones (in JavaFaker almost half of the generators are movie-related).

Proposition

Either create modules for those kinds of generators (movies, for example) or the creation of one more abstraction layer to group different generators.

New Module

It would be a new library that can be added on the top of the main one. For example:

Main library with "most important generators"

<dependency>
    <groupId>net.datafaker</groupId>
    <artifactId>datafaker</artifactId>
    <version>1.3.0</version>
</dependency>

Library extension

<dependency>
    <groupId>net.datafaker</groupId>
    <artifactId>datafaker-movies-module</artifactId>
    <version>1.0.0</version>
</dependency>

Pros

  • See only generators that make sense in some context
  • Isolation during development, fixes, and releases
  • Centralized contributions

Cons

  • Hard to define the "most important generators"
  • Possible changes in the current architecture
  • Different places to maintain, as most of the generators would be modules
  • Breaking change added for this adoption
  • Time-consuming to implement

New abstraction

It consists of the addition of one more object during the Fluent Interface usage, like:

faker.medical().bloodType()...
faker.business().creditCardType()...
faker.movies().avatar()...
faker.personal().address()...
faker.sports().formula1()...

Pros

  • Better code usage from the consumer side, where consumers can focus on finding and using the generators
  • More consistent architecture
  • Code isolation

Cons

  • Breaking change added for this adoption

Benefits

Also, in a corporate world, it's sad trying to use a generator and see a lot of those generators.

Final comments

I believe that the adoption of one of the approaches will bring benefits to this project.
And I am open to helping in the implementation.

The edit link in public docs pages get 404, going to master not main

Describe the bug
If you visit a page like https://www.datafaker.net/documentation/usage/, and try to use the edit button (such as to suggest a typo), that edit button's link goes to:
https://github.com/datafaker-net/datafaker/edit/master/docs/documentation/usage.md

which gets a 404. That should instead be main, not master, so:
https://github.com/datafaker-net/datafaker/blob/main/docs/documentation/usage.md

I tried to find any reference for that in the code base here (to suggest a pr) but could not. Perhaps some process outside this repo is what controls creating the publication of the docs on the datafaker.net site.

I do understand why and how this problem came to be, and how this could easily slip through the cracks. I just share it to help you help others... to help you. :-)

I can tell you that on a mobile, it wasn't so easy to connect the dots of what was happening. Other casual readers who, on mobile, might have tripped on this may not have had the time/gumption to dig in, let alone report the issue.

Hope it's a really simple fix for you to do. Thanks for the efforts on the project. I just learned of it today.

Make locale use more consistent with the Java Locale API

The use of locales as suggested by README.md seems to be inconsistent with the Java Locale API. Namely, it gives as an example

new Faker(new Locale("en-us")).address().zipCodeByState("CA"));

Here Locale("en-us") is an incorrect locale with language en-us. This is not the same as Locale.US, which is Locale("en", "US"), i.e. language en and country US.
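
The difference can be checked with the JDK alone. The sketch below compares the single-argument constructor with Locale.forLanguageTag; LocaleSketch and fromTag are hypothetical names, and accepting underscores in fromTag is an assumption added for convenience. The Locale constructors are deprecated since Java 19, so the code compiles there with a deprecation warning.

```java
import java.util.Locale;

public class LocaleSketch {
    /** Builds a Locale from a tag such as "en-US"; underscores are accepted as well. */
    static Locale fromTag(String tag) {
        return Locale.forLanguageTag(tag.replace('_', '-'));
    }

    public static void main(String[] args) {
        // The single-argument constructor treats the whole string as the language.
        Locale wrong = new Locale("en-us");
        System.out.println(wrong.getLanguage());            // prints en-us
        System.out.println(wrong.getCountry().isEmpty());   // prints true

        // forLanguageTag splits the tag into language and country.
        System.out.println(fromTag("en-US").equals(Locale.US)); // prints true
        System.out.println(Locale.US);                           // prints en_US
    }
}
```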

It seems like Faker normalizes the incorrect locales:

var locale = new Locale("en-us");  // Incorrect locale
var faker = new Faker(locale);

faker.getLocale();  // The correct Locale("en", "US")

This behavior is not reflected in the javadoc. Probably the documentation in README.md could give as an example one of these two:

new Faker(Locale.US).address().zipCodeByState("CA");
// OR
new Faker(new Locale("en", "US")).address().zipCodeByState("CA");

and the list of supported locales could follow the same format as used by Locale::toString, with underscores _ instead of hyphens -:

- ar
- bg
- ca
- ca_CAT
- cs
- da_DK
. . . . .

Another inconsistency with the Java Locale API that I noticed is that java.util.Locale uses ISO 3166 alpha-2 for country codes, and datafaker doesn't. Not everything supported by datafaker can be mapped to ISO 3166 alpha-2, but probably datafaker could recognize ISO 3166 alpha-2 codes whenever possible:

var fakerNp = new Faker(new Locale("en", "NP"));
var fakerNep = new Faker(new Locale("en", "NEP"));

fakerNp.getLocale().getDisplayCountry();  // "Nepal"
fakerNep.getLocale().getDisplayCountry();  // "NEP"

fakerNp.name().fullName();  // "Clora Douglas"
fakerNep.name().fullName();  // "Laxmi Basynat"

As you see, the Java API is not aware of NEP while datafaker is not aware of NP. The same issue arises for IN and IND.

The same value is returned for random expressions within single run

Describe the bug
The same value is returned for random expressions within single run

To Reproduce

import net.datafaker.Faker;

public class Main
{
    public static void main(String[] args)
    {
        Faker faker = new Faker();

        System.out.println(faker.expression("#{regexify '[a-z]{5}[A-Z]{5}'}"));
        System.out.println(faker.expression("#{regexify '[a-z]{5}[A-Z]{5}'}"));
        System.out.println(faker.expression("#{Address.city}"));
        System.out.println(faker.expression("#{Address.city}"));
    }
}

Actual result:

brmbnQKCAJ
brmbnQKCAJ
Port Quinn
Port Quinn

Expected behavior
Random data should be generated for each expression invocation.

Versions:

  • Faker Version: 1.2.0

Additional context
This is a regression issue. It's not reproduced for faker 1.1.0

Add more passport number generators

Is your feature request related to a problem? Please describe.
At the moment, there are only 2 passport number generators: Am (American) and Ch (Chinese).

Describe the solution you'd like
Add support for more countries, at least for the EU.

Support extensible model to register custom external faker service

Support the ability to create a new faker service externally and provide support from within Faker class to use the external faker data.

The alternative is to make a PR to this code repo; however, the data in the faker service may be specific to an industry and not apply to everyone.

Maybe consider an interface and service loader approach in which subtypes of the interface are registered with the Faker service using the Java ServiceLoader pattern.
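
A minimal sketch of the ServiceLoader idea follows. ExternalProvider is a hypothetical extension interface invented for this example and is not part of Datafaker; only java.util.ServiceLoader is a real API here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

public class ProviderRegistrySketch {
    /** Hypothetical extension point that external libraries would implement. */
    public interface ExternalProvider {
        String name();
        String generate();
    }

    /** Collects every implementation registered under META-INF/services on the classpath. */
    static List<ExternalProvider> discover() {
        List<ExternalProvider> found = new ArrayList<>();
        for (ExternalProvider provider : ServiceLoader.load(ExternalProvider.class)) {
            found.add(provider);
        }
        return found;
    }

    public static void main(String[] args) {
        // Without a registration file on the classpath, no providers are discovered.
        System.out.println("providers found: " + discover().size());
    }
}
```

An implementing library would register its class in a META-INF/services file named after the interface's binary name, so industry-specific data could ship in a separate JAR without a PR to the main repository.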

The generated image address cannot be accessed, and the same url is generated every time

Describe the bug
faker.avatar().image()

Expected behavior
I hope that each time the generated image address is different and can be accessed normally.

Versions:

  • OS: macOS
  • JDK: 11

Replace Lorempixel with Lorem Picsum (or other image provider)

Lorempixel no longer exists, which is used in Internet.image(), so might be good to replace it by a different service, such as Lorem Picsum.

There are a few things to take into account:

  • There are categories used in internet.yml. They might need to go?
  • There are 2 methods, one for images and one for images with parameters such as gray and text. Maybe the second method should go or be deprecated, since I'm not sure we should provide options for grayscale etc.
  • To be fully repeatable, Lorem Picsum allows sending a seed so that the same image is always returned. It would be nice to support that too.
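
As a rough illustration of the seed idea, the sketch below builds a URL in the seed/width/height layout that Lorem Picsum documents at the time of writing, with an optional grayscale query parameter. The URL layout should be checked against the service's current documentation; PicsumUrlSketch and seededImageUrl are hypothetical names.

```java
public class PicsumUrlSketch {
    /** Builds a seeded image URL; the same seed is expected to map to the same image. */
    static String seededImageUrl(String seed, int width, int height, boolean grayscale) {
        // Restricting the seed to URL-safe characters avoids the need for path encoding.
        if (!seed.matches("[A-Za-z0-9_-]+")) {
            throw new IllegalArgumentException("seed must be alphanumeric, '_' or '-'");
        }
        String url = "https://picsum.photos/seed/" + seed + "/" + width + "/" + height;
        return grayscale ? url + "?grayscale" : url;
    }

    public static void main(String[] args) {
        System.out.println(seededImageUrl("datafaker", 200, 300, false));
        // prints https://picsum.photos/seed/datafaker/200/300
    }
}
```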

Improve the quality of phone numbers generated

At the moment, the quality of some generated phone numbers is quite poor. To demonstrate this, there's a test called PhoneNumberValidityFinderTest.testAllPhoneNumbers, which generates the following output:

en_NZ=53
fr_CH=57
lv_LV=60
pt_PT=66
en_MS=69
tr_TR=84
zh_CN=86
hu_HU=93
en_PH=93
by_BY=94
ar_AR=97
no_NO=99
zh_TW=100
sv_SV=100
es_AR=100
bg_BG=100

This means that for the locales zh_TW, sv_SV, es_AR and bg_BG currently all phone numbers are generated incorrectly.

It would be great if we can improve the quality here a little bit.

Rename master to main

Any objections to following the current convention for naming the base branch?

Better ways of testing

While working on some extensions for the Vehicle provider I noticed that the way many of the current tests are set up is fairly shallow and prone to flakiness.

I realized that most tests, like this example, merely test that a value is returned.

@Test
void testMake() {
    assertThat(faker.vehicle().make()).matches(INTERNATIONAL_WORDS_MATCH);
}

Now, what does this test? It tests whether the provider will return a String value that adheres to a very permissive pattern. The catch is of course that these values are (should be) randomly selected. What this does not test at all is whether the correct values are returned.
Also, it primarily tests things that are outside the bounds of the provider.

So instead I started thinking about what should be tested. In this specific example the implementation will use the FakeValuesService to resolve values from the key vehicle.makes. I think it adds more value to test whether the correct key is being resolved.

Other things are clearly outside the scope of this test. The random value selection is not part of the provider; it is the duty of the FakeValuesService. The specific values are essentially data. Can one test data?

Given all that I think a more appropriate (and less flaky) test could look something like this:

@Test
void testMake() {
    // Given
    FakeValuesService fakeValuesServiceMock = mock(FakeValuesService.class);
    FakerContext fakerContextMock = mock(FakerContext.class);
    Faker faker = new Faker(fakeValuesServiceMock, fakerContextMock);

     // When
    faker.vehicle().make();

    // Then
    verify(fakeValuesServiceMock).resolve(eq("vehicle.makes"), any(Vehicle.class), any(FakerContext.class));
}

Of course, some of that can be moved to base classes or helper methods:

@Test
void testMake() {
    // Given
    Vehicle vehicle = givenAVehicleProvider();

    // When
    vehicle.make();

    // Then
    verify(fakeValuesServiceMock).resolve(eq("vehicle.makes"), any(Vehicle.class), any(FakerContext.class));
}

What do you think about such an approach?

Ability to generate unique values

Currently there isn't a way to enforce that random values from faker are different. This is mainly an issue when writing tests with ids or keys that cannot be the same. The data produced by faker is usually unique enough, but there is still a small chance that tests will randomly fail if we're not careful.

My solution is to have a unique faker that keeps a store of every value that it has generated. It has a base method that takes in a supplier and ensures that the value from the supplier has not been generated before during the unique faker's lifespan. For example:

// These two names will never be the same
faker.unique().get(() -> faker.name().firstName());
faker.unique().get(() -> faker.name().firstName());

// If the last name has the possibility of being the same as the first name, it will be 
// regenerated and guaranteed to be unique as well
faker.unique().get(() -> faker.name().lastName());

The store is kept at the unique faker level, so the uniqueness is only persisted during the lifespan of the faker object. If there are two different fakers they could potentially generate the same values.

Would there be any issues with having a faker like this? Here is the full implementation that I had in mind:

public class Unique {
    private final Faker faker;
    private final Set<Object> uniqueValueStore;

    private static final long LOOP_TIMEOUT_MILLIS = 10000;

    public Unique(Faker faker) {
        this.faker = faker;
        this.uniqueValueStore = new HashSet<>();
    }

    public <T> T get(Supplier<T> supplier) {
        T value = supplier.get();
        long millisBeforeCheck = currentTimeMillis();
        while (uniqueValueStore.contains(value)) {
            handleInfiniteLoop(millisBeforeCheck);
            value = supplier.get();
        }
        uniqueValueStore.add(value);
        return value;
    }

    public String resolve(String key) {
        return get(() -> faker.resolve(key));
    }

    public String expression(String expression) {
        return get(() -> faker.expression(expression));
    }

    public int nextInt() {
        return get(() -> faker.random().nextInt());
    }

    public int nextInt(int n) {
        return get(() -> faker.random().nextInt(n));
    }

    public int nextInt(int min, int max) {
        return get(() -> faker.random().nextInt(min, max));
    }

    public long nextLong() {
        return get(() -> faker.random().nextLong());
    }

    public long nextLong(long n) {
        return get(() -> faker.random().nextLong(n));
    }

    public long nextLong(long min, long max) {
        return get(() -> faker.random().nextLong(min, max));
    }

    private void handleInfiniteLoop(long initialMillis) {
        if (currentTimeMillis() - initialMillis > LOOP_TIMEOUT_MILLIS) {
            throw new RuntimeException("Unable to get unique value from supplier");
        }
    }
}

Test not in line with actual implementation of DateAndTime.past()

Describe the bug
An unrelated test failed for #358:

Error: 8.168 [ERROR] Tests run: 40, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.29 s <<< FAILURE! - in net.datafaker.service.FakeValuesServiceTest
Error: 8.169 [ERROR] net.datafaker.service.FakeValuesServiceTest.pastDateExpression  Time elapsed: 0.012 s  <<< FAILURE!
java.lang.AssertionError: 

Expecting actual:
  1663204567147L
to be between:
  ]1663204568040L, 1663222568040L[

The value is out of bounds by 893 millis.

To Reproduce
Run net.datafaker.service.FakeValuesServiceTest.pastDateExpression multiple times until it produces a really low value.

Expected behavior
The test passes.

Additional context
A little investigation shows that the expression in the test resolves to the DateAndTime.past(int, TimeUnit) method. The implementation (and its Javadoc) shows that it applies a 1000 millisecond offset from now. The test does not, hence this test will be flaky. Given the large range, it will fail only rarely.
I'm not sure what the thought is behind the 1 second offset / slack of the implementation. Should the test be fixed to match the implementation, or should the implementation be fixed to match the test?

And as a side-note: this kind of flaky behavior is obviously a risk that's in the nature of this library. Can we think of a way to more reliably test randomly generated values?
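One way to make such a test more reliable is to pass the clock value and the random source in explicitly, and to widen the lower bound by the same slack the implementation uses. The sketch below is not Datafaker code: `past` is a hypothetical stand-in that only mimics the behaviour described above (a 1000 ms offset from now), and `withinBounds` is an assumed helper name.

```java
import java.util.Random;
import java.util.concurrent.TimeUnit;

class PastDateBounds {
    // Assumption: mirrors the 1000 ms offset mentioned in the issue.
    static final long SLACK_MILLIS = 1000;

    // Hypothetical stand-in for DateAndTime.past(int, TimeUnit):
    // returns a timestamp at least 1 ms before (now - slack)
    // and less than atMost units before it.
    static long past(int atMost, TimeUnit unit, long nowMillis, Random random) {
        long upperBound = unit.toMillis(atMost);
        long reference = nowMillis - SLACK_MILLIS;
        long offset = 1 + (long) (random.nextDouble() * (upperBound - 1));
        return reference - offset;
    }

    // 'before' and 'after' are clock readings taken around the call under test,
    // so the assertion does not depend on how long the call itself took.
    static boolean withinBounds(long value, int atMost, TimeUnit unit, long before, long after) {
        long lower = before - SLACK_MILLIS - unit.toMillis(atMost);
        long upper = after - SLACK_MILLIS;
        return value >= lower && value < upper;
    }
}
```

With a seeded `Random` and a fixed `nowMillis`, the same values are produced on every run, so a failure can be reproduced instead of waiting for an unlucky draw.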

Missing data for ru and bg locales?

I upgraded from com.github.javafaker:javafaker to net.datafaker:datafaker:1.5.0, and it seems like some data is missing now.

The following worked as expected for javafaker:

new com.github.javafaker.Faker(new Locale("ru")).address().streetAddress();
// Example result: "Шевченко улица, 845"

new com.github.javafaker.Faker(new Locale("bg")).address().streetAddress();
// Example result: "бул. Лъчезар Станчев, 17"

However, this crashes with RuntimeException:

new net.datafaker.Faker(new Locale("ru")).address().streetAddress();
// RuntimeException: Unable to resolve #{Address.street_title} directive

new net.datafaker.Faker(new Locale("bg")).address().streetAddress();
// RuntimeException: Unable to resolve #{Address.street_title} directive

I went through all the locales listed in README.md, and this issue seems to arise precisely for ru and bg.

Add transformation schemas

The problem is that, to produce JSON, CSV or any other format, rules currently have to be configured independently for each format, and each in its own way. This makes the functionality more complicated to use.

The idea is to create a transformation schema that defines the rules for retrieving/supplying data, and transformers that define the rules for transforming it into specific formats.
E.g. there could be a schema

final Schema.SchemaBuilder<Name, Object> schemaBuilder =
        new Schema.SchemaBuilder<Name, Object>()
            .of(
                field("first", Name::firstName),
                compositeField(
                    "second",
                    new Field[] {field("third", Name::username)}),
                field("fourth", Name::fullName)).build();

and a transformer like

public interface Transformer<IN extends AbstractProvider<?>, OUT> {
    OUT apply(IN input, Schema<IN, ? extends OUT> schema);
}

Then each transformer only needs to care about how to transform data based on a schema, without dealing with the schema definition.
Finally, e.g. with a JsonTransformer, it could generate

{"first":"Dania", "second":{"third":"lindsay.wiza"}, "fourth":"Kelvin West"},
{"first":"Tad", "second":{"third":"usha.greenholt"}, "fourth":"Ronald Lakin"},
{"first":"Evelyne", "second":{"third":"regan.maggio"}, "fourth":"Claire Wehner"},
{"first":"Jefferson", "second":{"third":"pam.bayer"}, "fourth":"Willie Jacobson"}

This would generalize and simplify support for different transformations.
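To make the separation concrete, here is a minimal, self-contained sketch of the idea using only the JDK. It is not the proposed Datafaker API: `Schema`, `Transformer` and `JsonTransformer` below are simplified, hypothetical stand-ins, and the JSON rendering does no escaping.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class SchemaSketch {
    /** Ordered field name -> extractor; a nested Schema models a composite field. */
    static final class Schema<IN> {
        final Map<String, Function<IN, Object>> fields = new LinkedHashMap<>();

        Schema<IN> field(String name, Function<IN, Object> extractor) {
            fields.put(name, extractor);
            return this;
        }

        Schema<IN> composite(String name, Schema<IN> nested) {
            fields.put(name, in -> nested);
            return this;
        }
    }

    /** A transformer knows the output format, not how the schema was defined. */
    interface Transformer<IN, OUT> {
        OUT apply(IN input, Schema<IN> schema);
    }

    static final class JsonTransformer<IN> implements Transformer<IN, String> {
        @Override
        public String apply(IN input, Schema<IN> schema) {
            return schema.fields.entrySet().stream()
                .map(e -> "\"" + e.getKey() + "\":" + render(input, e.getValue().apply(input)))
                .collect(Collectors.joining(", ", "{", "}"));
        }

        @SuppressWarnings("unchecked")
        private String render(IN input, Object value) {
            if (value instanceof Schema) {
                return apply(input, (Schema<IN>) value); // recurse into composite field
            }
            return "\"" + value + "\""; // sketch only: every value is rendered as a string
        }
    }
}
```

A CSV transformer could reuse the same `Schema` instance and differ only in how it joins the extracted values, which is the point of keeping the two concerns apart.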

Incorrect ZhCN ID number generation

Describe the bug
The Chinese (zh_CN) ID number generator does not produce the correct format. 'Date of Birth' comes first in the generated number; according to the format, however, 'Address Code' should come first.

To Reproduce

Faker faker = new Faker();
faker.idNumber().validZhCNSsn();

Expected behavior
(Resource: https://en.wikipedia.org/wiki/Resident_Identity_Card#Identity_card_number)

According to the format, the first 6 digits are the 'Address code', the next 8 are the 'Date of Birth' in the form YYYYMMDD, the next 3 are the 'Order Code' and the last character is the 'Checksum'.
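The layout above can be sketched as follows. This is an illustration rather than the library's implementation: the class and method names are hypothetical, and the checksum step (not spelled out in this report) assumes the ISO 7064 MOD 11-2 scheme described on the linked Wikipedia page.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

class ZhCnIdSketch {
    // Position weights and check characters of the MOD 11-2 scheme.
    private static final int[] WEIGHTS = {7, 9, 10, 5, 8, 4, 2, 1, 6, 3, 7, 9, 10, 5, 8, 4, 2};
    private static final String CHECK_CHARS = "10X98765432";

    /** Computes the 18th character from the first 17 digits. */
    static char checksum(String first17) {
        int sum = 0;
        for (int i = 0; i < 17; i++) {
            sum += (first17.charAt(i) - '0') * WEIGHTS[i];
        }
        return CHECK_CHARS.charAt(sum % 11);
    }

    /** Address code (6 digits) + birth date (YYYYMMDD) + order code (3 digits) + checksum. */
    static String build(String addressCode, LocalDate birthDate, int orderCode) {
        String body = addressCode
            + birthDate.format(DateTimeFormatter.BASIC_ISO_DATE)
            + String.format("%03d", orderCode);
        return body + checksum(body);
    }
}
```

For example, `ZhCnIdSketch.build("110105", LocalDate.of(1949, 12, 31), 2)` returns `11010519491231002X`, with the address code first and the date of birth second.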

Versions:

  • OS: Linux Ubuntu
  • JDK: 11
  • Faker Version: 1.4.0

`ConcurrentModificationException` when using parallel test execution

Describe the bug
With the upgrade from version 1.4.0 to 1.5.0 one of my tests started to fail with a ConcurrentModificationException.

To Reproduce
This could also be reproduced within an empty gradle project with only this test:

import net.datafaker.Faker;
import org.junit.jupiter.api.Test;

public class FakerConcurrencyModificationExceptionTest {

    private final Faker faker = new Faker();

    @Test
    public void test1() {
        faker.random().nextLong();
    }

    @Test
    public void test2() {
        faker.random().nextLong();
    }
}

junit-platform.properties

junit.jupiter.execution.parallel.enabled=true
junit.jupiter.execution.parallel.mode.default=concurrent

Expected behavior
Tests don't fail with this exception.

Versions:

  • OS: OSX 12.3.1
  • JDK 17.0.4
  • Faker Version 1.5.0

Java 8 compatibility of 1.5.0

Describe the bug
Using datafaker 1.5.0 with java 8 throws java.lang.UnsupportedClassVersionError: net/datafaker/Faker has been compiled by a more recent version of the Java Runtime (class file version 55.0), this version of the Java Runtime only recognizes class file versions up to 52.0

To Reproduce
Use net.datafaker:datafaker:1.5.0 with Java 8

Expected behavior
That the library can be used with Java 8 as that is stated on the Readme: This library is a modern port of java-faker, built on Java 8, with up to date libraries and several newly added Fake Generators.

Versions:

  • OS: Linux
  • JDK 1.8
  • Faker Version: 1.5.0

Additional context
Maybe release a 1.5.1 version that is compiled with target Java 8.

Generator of unique values for file based generators

After thinking about #232, I came up with an idea that should work without the issues mentioned in #232, under the assumption that we only consider bounded sets of values, such as data from files or enums.

During resolution of a provider, net.datafaker.service.FakeValuesService#safeFetch retrieves a list of possible values and then picks a random one.
The idea is to keep track of the retrieved values, removing the picked one each time, so the list does not have to be retrieved again on every call. Since every generated value is removed, only values that are still allowed remain, so there is no need for multiple retries. Once all unique values have been generated, we could throw an exception.

The downside is that it requires a deep dive into the core implementation and will probably take time to implement.
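Independent of where it would live in the core, the "remove what was picked" idea can be sketched with plain JDK collections. `UniquePool` is a hypothetical name, and the sketch says nothing about how it would be wired into FakeValuesService.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Random;

class UniquePool<T> {
    private final List<T> remaining;
    private final Random random;

    UniquePool(Collection<T> values, Random random) {
        // De-duplicate up front so each distinct value can be handed out exactly once.
        this.remaining = new ArrayList<>(new LinkedHashSet<>(values));
        this.random = random;
    }

    /** Returns a value that has not been returned before, without any retries. */
    T next() {
        if (remaining.isEmpty()) {
            throw new NoSuchElementException("All unique values have been generated");
        }
        int index = random.nextInt(remaining.size());
        int last = remaining.size() - 1;
        T picked = remaining.get(index);
        // Move the last element into the freed slot so removal is O(1).
        remaining.set(index, remaining.get(last));
        remaining.remove(last);
        return picked;
    }
}
```

Each call costs constant time, and exhaustion is reported explicitly instead of through a timeout.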

Error involving `Csv.Column.of()`

I am trying to generate fake data in CSV format. The way I do it is by having an enum that looks like this:

public enum Provider {
        FIRST_NAME(() -> "first_name", faker -> faker.name().firstName());

        private final Supplier<String> header;
        private final Function<Faker, String> provider;

        Provider(Supplier<String> header, Function<Faker, String> provider) {
            this.header = header;
            this.provider = provider;
        }
    }

The problem is in this piece of code:

            result.add(Csv.Column.of(() -> provider.getHeader(), faker -> provider.getProvider().apply(this.faker)));

The first argument of Csv.Column.of() produces an error which says:
Required type: Column
Provided: CollectionColumn<T>
Supplier<String> is not compatible with String

Compiler output:

java: no suitable method found for of(()->provid[...]der(),(faker)->p[...]aker))
    method net.datafaker.fileformats.Csv.CollectionColumn.<T>of(java.util.function.Supplier<java.lang.String>,java.util.function.Function<T,java.lang.String>) is not applicable
      (cannot infer type-variable(s) T
        (argument mismatch; bad return type in lambda expression
          java.util.function.Supplier<java.lang.String> cannot be converted to java.lang.String))
    method net.datafaker.fileformats.Csv.Column.of(java.lang.String,java.util.function.Supplier<java.lang.String>) is not applicable
      (argument mismatch; java.lang.String is not a functional interface)
    method net.datafaker.fileformats.Csv.Column.of(java.util.function.Supplier<java.lang.String>,java.util.function.Supplier<java.lang.String>) is not applicable
      (argument mismatch; bad return type in lambda expression
          java.util.function.Supplier<java.lang.String> cannot be converted to java.lang.String)

Maybe I'm doing it wrong and there is another way to implement this.
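Reading the compiler output, two things seem to go wrong: `() -> provider.getHeader()` wraps a value that is already a `Supplier<String>` in a second supplier, and the second lambda takes a parameter although the `of(Supplier<String>, Supplier<String>)` overload expects a zero-argument supplier. The sketch below reproduces the shape with JDK types only. `Column` and `FakeSource` are hypothetical stand-ins (not the library classes), and the getters are assumed to return the enum fields as declared.

```java
import java.util.function.Function;
import java.util.function.Supplier;

class CsvColumnSketch {
    /** Hypothetical stand-in for Faker; only what the example needs. */
    static final class FakeSource {
        String firstName() { return "Emory"; }
    }

    /** Stand-in mirroring the of(Supplier<String>, Supplier<String>) overload from the compiler output. */
    static final class Column {
        final Supplier<String> name;
        final Supplier<String> value;

        private Column(Supplier<String> name, Supplier<String> value) {
            this.name = name;
            this.value = value;
        }

        static Column of(Supplier<String> name, Supplier<String> value) {
            return new Column(name, value);
        }
    }

    enum Provider {
        FIRST_NAME(() -> "first_name", source -> source.firstName());

        private final Supplier<String> header;
        private final Function<FakeSource, String> provider;

        Provider(Supplier<String> header, Function<FakeSource, String> provider) {
            this.header = header;
            this.provider = provider;
        }

        Supplier<String> getHeader() { return header; }
        Function<FakeSource, String> getProvider() { return provider; }
    }

    static Column toColumn(Provider provider, FakeSource source) {
        // Pass the header supplier directly; wrap the function call in a zero-argument lambda.
        return Column.of(provider.getHeader(), () -> provider.getProvider().apply(source));
    }
}
```

Whether the same call compiles against the real `Csv.Column.of` depends on the overloads of the version in use, so treat this as a reading of the error message rather than a confirmed fix.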

Create positive / negative number support

For one of the projects I'm working on, I need a random positive, and a random negative number. I can use the random (min, max) option, but that's less expressive than just saying "positive()"
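As a rough sketch of what such methods could look like on top of `java.util.Random` (the class name `SignedNumbers` and the method names are assumptions for illustration, not an existing Datafaker API):

```java
import java.util.Random;

class SignedNumbers {
    private final Random random;

    SignedNumbers(Random random) {
        this.random = random;
    }

    /** Returns a value in [1, Integer.MAX_VALUE]. */
    int positive() {
        return random.nextInt(Integer.MAX_VALUE) + 1;
    }

    /** Returns a value in [-Integer.MAX_VALUE, -1]. */
    int negative() {
        return -positive();
    }
}
```

Zero is excluded from both ranges, which is the main behaviour a caller would need to know about.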

related fake data needed

I generate some data with this:
Address address = faker.address();
System.out.println(address.country());
System.out.println(address.city());
System.out.println(address.streetAddress());
System.out.println(address.longitude());
System.out.println(address.latitude());
The country and the city are not related. Can I get them related?
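One way to get related values is to pick the country first and then pick the city from that country's own list. The sketch below uses a tiny hard-coded map purely for illustration; the class name and data are assumptions, and a real provider would read such data from locale files.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

class RelatedAddress {
    // Illustrative data only.
    private static final Map<String, List<String>> CITIES_BY_COUNTRY = new LinkedHashMap<>();
    static {
        CITIES_BY_COUNTRY.put("France", Arrays.asList("Paris", "Lyon", "Marseille"));
        CITIES_BY_COUNTRY.put("Japan", Arrays.asList("Tokyo", "Osaka", "Kyoto"));
        CITIES_BY_COUNTRY.put("Germany", Arrays.asList("Berlin", "Hamburg", "Munich"));
    }

    /** Returns {country, city}, where the city is drawn from the chosen country's list. */
    static String[] countryAndCity(Random random) {
        List<String> countries = new ArrayList<>(CITIES_BY_COUNTRY.keySet());
        String country = countries.get(random.nextInt(countries.size()));
        List<String> cities = CITIES_BY_COUNTRY.get(country);
        String city = cities.get(random.nextInt(cities.size()));
        return new String[] {country, city};
    }

    static boolean isRelated(String country, String city) {
        List<String> cities = CITIES_BY_COUNTRY.get(country);
        return cities != null && cities.contains(city);
    }
}
```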

Dark Soul Feature Request

Is your feature request related to a problem? Please describe.
[From] Java-faker's Feature Request about Dark Soul.

Describe the solution you'd like
Add stuff in Dark Soul to datafaker.

Baseball feature request

Is your feature request related to a problem? Please describe.
We were looking for a provider that could be used frequently, Baseball, but could not find one in this repository.

Describe the solution you'd like
To solve this problem, we created a Baseball provider under the Sports category, which provides team names, coaches, positions and player names.

Additional context
We hope our code can contribute to improving the data-faker project. Thank you in advance, and thanks for maintaining this project.

Add FakeStream similar to FakeCollection

Feature description:

I think it's worth adding the ability to create a Stream of finite/infinite size with fake data, similar to the FakeCollection generator.

Usage:

Faker faker = new Faker();
Stream<String> infiniteNames = faker.stream(
                              () -> faker.name().firstName(),
                              () -> faker.name().lastName())
                         .generate();

Stream<String> finiteNames = faker.stream(
                              () -> faker.name().firstName(),
                              () -> faker.name().lastName())
                         .len(10, 20)
                         .generate();

Possible solution:

Similar to FakeCollection.
FakeStream.java

public class FakeStream<T> {
    private final RandomService randomService;
    private final List<Supplier<T>> suppliers;
    private final double nullRate;
    private final int minLength;
    private final int maxLength;

    private FakeStream(List<Supplier<T>> suppliers, int minLength, int maxLength, RandomService randomService, double nullRate) {
        this.suppliers = suppliers;
        this.minLength = minLength;
        this.maxLength = maxLength;
        this.randomService = randomService;
        this.nullRate = nullRate;
    }

    public T singleton() {
        if (nullRate == 0d || randomService.nextDouble() >= nullRate) {
            return suppliers.get(randomService.nextInt(suppliers.size())).get();
        }
        return null;
    }

    public Stream<T> get() {
        if (isInfinite()) {
            return Stream.generate(this::singleton);
        }

        int size = randomService.nextInt(minLength, maxLength);
        return Stream.generate(this::singleton).limit(size);
    }

    private boolean isInfinite() {
        return maxLength < 0;
    }

    public static class Builder<T> {...}
}

BaseFaker.java

...
    /**
     * @return builder to build {@code FakeStream}
     */
    public <T> FakeStream.Builder<T> stream() {
        return new FakeStream.Builder<T>().faker(this);
    }

    @SafeVarargs
    public final <T> FakeStream.Builder<T> stream(Supplier<T>... suppliers) {
        return new FakeStream.Builder<>(suppliers).faker(this);
    }

    public final <T> FakeStream.Builder<T> stream(List<Supplier<T>> suppliers) {
        return new FakeStream.Builder<>(suppliers).faker(this);
    }
...

What do you think?

Cannot call the fakerValuesServices method In a New Class

Hello, this is my first time contributing. I am running on Java 17 and had to download v8 for this project.

I created a new class, and could not reference the fakerValuesServices method in it.

protected Nigeria(Faker faker) {
    this.faker = faker;
}

public String locations() {
    return faker.fakeValueServices().resolve("nigeria.locations", this, faker);
}

Tried fixing this with the Java Reflection API to call a private method, but still had issues with the return input.
Please help, thank you.

Strange generation of 'PhoneNumber' faker

Describe the bug
I've been generating phone numbers in my project and noticed some strange and incorrect results.

To Reproduce

Faker faker = new Faker();
faker.phoneNumber().phoneNumber();

The result of 50 repeated generations:

755-482-2234
085-863-3865 x3172
1-065-305-4959 x8047
314.976.7963 x734
330-099-4886 x670
916.255.0603 x527
725-431-2485 x2418
212-342-1988
342-146-2804 x670
(085) 866-9161 x22174
952-205-3782 x28369
(116) 263-6437 x19870
(767) 394-5938 x16583
106-499-4984 x0279
742-522-5959 x09797
(369) 671-5077
(228) 198-2966 x883
1-308-449-2513 x34799
1-941-533-0634 x464
1-473-386-5272 x7908
(609) 208-8335 x1751
1-569-094-9768 x527
559.581.5058 x13411
372-047-9918 x8116
942.389.2238
955.419.1339 x350
(860) 978-1565 x11785
906-710-1553 x40711
1-226-999-7832 x309
510.265.6294 x3466
1-009-496-8327
1-070-132-9366 x88195
292.279.7325 x65075
1-891-998-7425
619-452-1971 x0025
(195) 188-3605
1-099-630-3627 x726
520-057-5025 x683
1-614-603-4667
356.356.3680 x787
(610) 582-2496 x3466
733.494.4426
746.041.2609 x2698
281-908-1136
(227) 044-7571 x869
1-922-095-8040 x7974
371.030.3800 x756
1-173-703-9945 x73683
1-011-256-6217 x0112
860.554.2778 x555
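The report does not say which results are considered incorrect. One possible reading, assuming US-style numbers are intended: under the North American Numbering Plan, neither the three-digit area code nor the three-digit exchange code may start with 0 or 1, yet outputs such as 085-863-3865 and (228) 198-2966 do. The check below is a simplified sketch of that single rule for the formats seen in the list; the class name is hypothetical and it does not cover other numbering-plan restrictions.

```java
import java.util.regex.Pattern;

class NanpCheck {
    // Optional "1-" prefix, area code and exchange starting with 2-9,
    // four-digit line number, optional " x<extension>".
    private static final Pattern NANP = Pattern.compile(
        "^(?:1-)?\\(?([2-9]\\d{2})\\)?[-. ]([2-9]\\d{2})[-.](\\d{4})(?: x\\d+)?$");

    static boolean looksValid(String phoneNumber) {
        return NANP.matcher(phoneNumber).matches();
    }
}
```

Applied to the list above, `755-482-2234` and `(609) 208-8335 x1751` pass, while `085-863-3865 x3172`, `1-065-305-4959 x8047` and `(228) 198-2966 x883` are rejected.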

Versions:

  • OS: Windows 11
  • JDK: 11
  • Faker Version 1.5.0

How about we drop Java 8?

Java 8 is getting quite old these days, and I'm not sure if keeping Java 8 as a baseline is going to be helpful in the long run.

My proposal is to move to Java 11 instead for the next year(s) or so, and maybe call it Datafaker 2.0 to make the distinction. People on Java 8 can still use Datafaker 1.x, but the main version of Datafaker would have Java 11 as a minimum.

Thoughts, concerns?

European makes and models for cars

I'm currently using the Vehicle faker to generate car makes and models. What I noticed, is that currently primarily US models and makes are included in the default configuration.

Now I could go and add some more European makes and models to one or more locale files, but that does not feel right. While there are differences in markets at large (US / Europe / Asia, probably some more), this is a subject that's not strictly bound to a single locale. Also, in our use case we use several locales from around Europe.

What would be the best way forward here?

  • Use the locales as they are right now and just reproduce makes and models over many locale files?
  • Add specific methods to the Vehicle faker for European makes and models?
  • Something completely different...?

To be clear: I'd be happy to contribute and file a PR, I'm just wondering what direction I should / could take with this one.

I hope to create an api that can specifically generate pictures, because some data scenarios require many different pictures

Is your feature request related to a problem? Please describe.
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

Describe the solution you'd like
A clear and concise description of what you want to happen.

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
Add any other context or screenshots about the feature request here.

Street name faker for 'bg' locale broken

Describe the bug
PR that describes bug: #250

To Reproduce

Faker localFaker = new Faker(new Locale("bg"));
assertThat(localFaker.address().streetName()).isNotEmpty();

Versions:

  • OS: Linux Ubuntu
  • JDK
  • Faker Version 15

Improve credit card format

Improve formats in business.yml.

At the moment, credit_card_numbers and credit_card_expiry_dates in business.yml are predefined and there are only a few of them, which is not enough for a normal result.
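One alternative to a short predefined list is to generate the digits and append a Luhn check digit, which is the checksum scheme used for payment card numbers. The sketch below is only an illustration of that technique; the class and method names are hypothetical, and nothing here reflects how business.yml is currently processed.

```java
import java.util.Random;

class LuhnSketch {
    /** Computes the Luhn check digit for a payload that does not yet include it. */
    static int checkDigit(String payload) {
        int sum = 0;
        boolean doubleIt = true; // the rightmost payload digit is doubled
        for (int i = payload.length() - 1; i >= 0; i--) {
            int digit = payload.charAt(i) - '0';
            if (doubleIt) {
                digit *= 2;
                if (digit > 9) {
                    digit -= 9;
                }
            }
            sum += digit;
            doubleIt = !doubleIt;
        }
        return (10 - sum % 10) % 10;
    }

    /** Generates a number of the given length that starts with prefix and passes the Luhn check. */
    static String generate(String prefix, int length, Random random) {
        StringBuilder sb = new StringBuilder(prefix);
        while (sb.length() < length - 1) {
            sb.append(random.nextInt(10));
        }
        return sb.append(checkDigit(sb.toString())).toString();
    }

    static boolean isValid(String number) {
        String payload = number.substring(0, number.length() - 1);
        return checkDigit(payload) == number.charAt(number.length() - 1) - '0';
    }
}
```

For instance, `LuhnSketch.checkDigit("7992739871")` is 3, so `79927398713` passes the check.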
