Comments (12)

Belval avatar Belval commented on May 21, 2024

Hi!

First, you need to identify the type of transformation applied in the Oxford Synthetic Word Dataset.

From your example, I'd say it's mostly various fonts with an angle added to them. This should be easy enough to do.

My first bet would be to do: python3 run.py -w 1 -c 1000 -k 20 -rk

-w 1 means "1 word per image"
-c 1000 means "1000 images"
-k 20 means "skewed 20 degrees"
-rk means "random skew"
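As a rough illustration of what -k and -rk do together, here is a minimal sketch, assuming the angle is drawn uniformly between the -k value and its opposite. The function name `sample_skew` is made up for this example and is not part of run.py.

```python
import random


def sample_skew(max_angle, random_skew=True, rng=random):
    """Return a skew angle in degrees (hypothetical helper).

    With random_skew, the angle is drawn uniformly from
    [-max_angle, max_angle]; otherwise max_angle is used as-is.
    """
    if random_skew:
        return rng.randint(-max_angle, max_angle)
    return max_angle


# Five example angles for "-k 20 -rk"
print([sample_skew(20) for _ in range(5)])
```

So with -k 20 -rk each image would get its own angle somewhere between -20 and 20 degrees, rather than a fixed 20-degree tilt.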

You can add -t X, where X is the number of threads your CPU supports. This is useful to speed up the process.

However, this will generate every image with the same font color, which is not what the Oxford dataset looks like!

You can add -tc "#282828,#444444" which means "A font color between #282828 (gray) and #444444 (lighter gray)"
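To make the "between two colors" idea concrete, here is a small sketch of one way such a range could be sampled: each RGB channel is picked independently between the two bounds. This is an assumption for illustration, not necessarily how run.py does it, and `hex_to_rgb` / `sample_text_color` are hypothetical names.

```python
import random


def hex_to_rgb(color):
    """Convert '#RRGGBB' to an (r, g, b) tuple of ints."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def sample_text_color(spec, rng=random):
    """Pick a color from 'HEX' or 'HEX,HEX' (assumed format, as in -tc).

    Each channel is drawn independently between the two bounds.
    A single color is treated as a range with identical bounds.
    """
    bounds = [hex_to_rgb(part.strip()) for part in spec.split(",")]
    low, high = bounds[0], bounds[-1]
    return tuple(rng.randint(min(a, b), max(a, b)) for a, b in zip(low, high))


print(sample_text_color("#282828,#444444"))
```

With these two bounds every channel stays between 0x28 and 0x44, so the text is always some shade of dark gray.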

Putting everything together you get python3 run.py -w 1 -c 1000 -k 20 -rk -tc "#282828,#444444"
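If you would rather drive this from a script, the same command can be assembled in Python. This is only a sketch: `build_command` is a hypothetical helper, and it uses only the flags mentioned above (-w, -c, -k, -rk, -tc, -t).

```python
import shlex


def build_command(words=1, count=1000, skew=20, random_skew=True,
                  text_color=None, threads=None):
    """Build the argv list for run.py from the options discussed above."""
    cmd = ["python3", "run.py",
           "-w", str(words), "-c", str(count), "-k", str(skew)]
    if random_skew:
        cmd.append("-rk")
    if text_color:
        cmd += ["-tc", text_color]
    if threads:
        cmd += ["-t", str(threads)]
    return cmd


cmd = build_command(text_color="#282828,#444444")
# Print a shell-safe version of the command for copy/paste
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` from the repository directory should then launch the generation.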

For more options, you can simply run python3 run.py -h for a short-yet-thoughtful description of each parameter.

I will try it on my own computer a bit later and post examples of the result.

from textrecognitiondatagenerator.

Belval avatar Belval commented on May 21, 2024

Here are 8 samples I generated with the above command:

festal_938
habilimented_400
horsegate_591
munchausenism_168
parsonages_176
phlebotomise_767
recategorize_194
raticides_662

If you want to remove the Gaussian noise behind the words, you can add -b 1. For now the only alternative is a plain white background, with more options planned.

kevgeo avatar kevgeo commented on May 21, 2024

@Belval Thanks for the swift response :). The images you have generated are fine, but the problem arises when I want to include numbers and punctuation. I am trying to generate images similar to the ones below:
oxford1
oxford2

But my generated images look like the ones below, where the font of the words does not look similar to those in the dataset or to your generated images.
newsynth4
newsynth6

Belval avatar Belval commented on May 21, 2024

What exact command are you using?

kevgeo avatar kevgeo commented on May 21, 2024

@Belval Sorry for the late reply. I have edited the arguments in run.py. Here is the list of arguments with their values for clarity.

import argparse


def parse_arguments():
    # Parse the command line arguments of the program.

    parser = argparse.ArgumentParser(description='Generate synthetic text data for text recognition.')
    parser.add_argument(
        "--output_dir",
        type=str,
        nargs="?",
        help="The output directory",
        default="out/",
    )
    parser.add_argument(
        "-i",
        "--input_file",
        type=str,
        nargs="?",
        help="When set, this argument uses a specified text file as source for the text",
        default=""
    )
    parser.add_argument(
        "-l",
        "--language",
        type=str,
        nargs="?",
        help="The language to use, should be fr (French), en (English), es (Spanish), de (German), or cn (Chinese).",
        default="en"
    )
    parser.add_argument(
        "-c",
        "--count",
        type=int,
        nargs="?",
        help="The number of images to be created.",
        default=1000
    )
    parser.add_argument(
        "-rs",
        "--random_sequences",
        action="store_true",
        help="Use random sequences as the source text for the generation. Set '-let','-num','-sym' to use letters/numbers/symbols. If none specified, using all three.",
        default=True
    )
    parser.add_argument(
        "-let",
        "--include_letters",
        action="store_true",
        help="Define if random sequences should contain letters. Only works with -rs",
        default=True
    )
    parser.add_argument(
        "-num",
        "--include_numbers",
        action="store_true",
        help="Define if random sequences should contain numbers. Only works with -rs",
        default=True
    )
    parser.add_argument(
        "-sym",
        "--include_symbols",
        action="store_true",
        help="Define if random sequences should contain symbols. Only works with -rs",
        default=True
    )
    parser.add_argument(
        "-w",
        "--length",
        type=int,
        nargs="?",
        help="Define how many words should be included in each generated sample. If the text source is Wikipedia, this is the MINIMUM length",
        default=1
    )
    parser.add_argument(
        "-r",
        "--random",
        action="store_true",
        help="Define if the produced string will have variable word count (with --length being the maximum)",
        default=False
    )
    parser.add_argument(
        "-f",
        "--format",
        type=int,
        nargs="?",
        help="Define the height of the produced images",
        default=32,
    )
    parser.add_argument(
        "-t",
        "--thread_count",
        type=int,
        nargs="?",
        help="Define the number of thread to use for image generation",
        default=3,
    )
    parser.add_argument(
        "-e",
        "--extension",
        type=str,
        nargs="?",
        help="Define the extension to save the image with",
        default="jpg",
    )
    parser.add_argument(
        "-k",
        "--skew_angle",
        type=int,
        nargs="?",
        help="Define skewing angle of the generated text. In positive degrees",
        default=10,
    )
    parser.add_argument(
        "-rk",
        "--random_skew",
        action="store_true",
        help="When set, the skew angle will be randomized between the value set with -k and its opposite",
        default=True,
    )
    parser.add_argument(
        "-wk",
        "--use_wikipedia",
        action="store_true",
        help="Use Wikipedia as the source text for the generation, using this parameter ignores -r, -n, -s",
        default=False,
    )
    parser.add_argument(
        "-bl",
        "--blur",
        type=int,
        nargs="?",
        help="Apply gaussian blur to the resulting sample. Should be an integer defining the blur radius",
        #default = 0.7,
        default=0,
    )
    parser.add_argument(
        "-rbl",
        "--random_blur",
        action="store_true",
        help="When set, the blur radius will be randomized between 0 and -bl.",
        # NOTE: a store_true flag normally has a bool default; 0.5 is truthy,
        # so this counts as enabled wherever the value is tested for truthiness.
        default=0.5,
        #default=False,
    )
    parser.add_argument(
        "-b",
        "--background",
        type=int,
        nargs="?",
        help="Define what kind of background to use. 0: Gaussian Noise, 1: Plain white, 2: Quasicrystal, 3: Pictures",
        default=0,
    )
    parser.add_argument(
        "-hw",
        "--handwritten",
        action="store_true",
        help="Define if the data will be \"handwritten\" by an RNN",
    )
    parser.add_argument(
        "-na",
        "--name_format",
        type=int,
        help="Define how the produced files will be named. 0: [TEXT]_[ID].[EXT], 1: [ID]_[TEXT].[EXT] 2: [ID].[EXT] + one file labels.txt containing id-to-label mappings",
        default=0,
    )
    parser.add_argument(
        "-d",
        "--distorsion",
        type=int,
        nargs="?",
        help="Define a distorsion applied to the resulting image. 0: None (Default), 1: Sine wave, 2: Cosine wave, 3: Random",
        default=0
    )
    parser.add_argument(
        "-do",
        "--distorsion_orientation",
        type=int,
        nargs="?",
        help="Define the distorsion's orientation. Only used if -d is specified. 0: Vertical (Up and down), 1: Horizontal (Left and Right), 2: Both",
        default=0
    )
    parser.add_argument(
        "-wd",
        "--width",
        type=int,
        nargs="?",
        help="Define the width of the resulting image. If not set it will be the width of the text + 10. If the width of the generated text is bigger that number will be used",
        default=-1
    )
    parser.add_argument(
        "-al",
        "--alignment",
        type=int,
        nargs="?",
        help="Define the alignment of the text in the image. Only used if the width parameter is set. 0: left, 1: center, 2: right",
        default=1
    )
    parser.add_argument(
        "-tc",
        "--text_color",
        type=str,
        nargs="?",
        help="Define the text's color, should be either a single hex color or a range in the ?,? format.",
        default="#282828,#444444",
        #default='#282828'
    )
    return parser.parse_args()

Belval avatar Belval commented on May 21, 2024

Multiple reasons exist to explain the differences you see between the Oxford dataset and the generated examples.

  1. There is a 5px margin on top and bottom of the generated text. I am currently looking into fixing this by adding a parameter to let people manage this themselves.
  2. Random strings means random strings. If you wish to have common words with symbols in them, I would suggest generating your own dictionary and feeding it to run.py using the -i parameter.
  3. White text on a dark background is not currently supported; this will probably change in the future.
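For point 2, a custom dictionary could be built along these lines. This is a sketch under assumptions: it supposes the file passed with -i holds one text entry per line, and the helper names (`decorate`, `build_entries`, `write_dictionary`) and the file name `my_dict.txt` are made up for the example.

```python
import random
import string


def decorate(word, rng=random):
    """Append an optional punctuation mark and up to three digits to a word."""
    mark = rng.choice(["", ".", ",", "!", "?", "-", ":"])
    digits = "".join(rng.choice(string.digits) for _ in range(rng.randint(0, 3)))
    return f"{word}{mark}{digits}"


def build_entries(words, per_word=3, seed=0):
    """Return per_word decorated variants of every word (seeded, so repeatable)."""
    rng = random.Random(seed)
    return [decorate(word, rng) for word in words for _ in range(per_word)]


def write_dictionary(words, path, per_word=3, seed=0):
    """Write the decorated entries to path, one per line (assumed -i format)."""
    entries = build_entries(words, per_word, seed)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(entries) + "\n")
    return entries


if __name__ == "__main__":
    write_dictionary(["festal", "parsonages", "raticides"], "my_dict.txt")
```

The resulting file could then be passed as python3 run.py -i my_dict.txt together with the other options, so the generated images contain real words mixed with digits and punctuation instead of fully random strings.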

Belval avatar Belval commented on May 21, 2024

About 1, see my reply on issue #31

kevgeo avatar kevgeo commented on May 21, 2024

@Belval Hi, yes, I understood that the main difference is that the generated text is black on a white background and not vice versa. If I am able to do it, I will definitely open a PR. Thanks for the help.

Belval avatar Belval commented on May 21, 2024

Oh that's great news, I didn't know you would be interested in working on it.

I would suggest adding a -bc / --background_color that takes the same parameters as --text_color.
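To give an idea of what I mean, here is a rough sketch of how the option could be declared and validated. Nothing here exists in run.py yet: the -bc / --background_color flag is only the proposal above, and the white default and the `parse_color_spec` helper are assumptions for illustration.

```python
import argparse
import re

HEX_COLOR = re.compile(r"#[0-9a-fA-F]{6}")


def build_parser():
    """Parser with the existing -tc option and the proposed -bc option."""
    parser = argparse.ArgumentParser(description="Sketch of the proposed option")
    parser.add_argument(
        "-tc", "--text_color", type=str, nargs="?", default="#282828",
        help="A single hex color or a range in the ?,? format.",
    )
    parser.add_argument(
        "-bc", "--background_color", type=str, nargs="?",
        default="#FFFFFF",  # assumed default: plain white
        help="Proposed: same format as --text_color.",
    )
    return parser


def parse_color_spec(spec):
    """Validate 'HEX' or 'HEX,HEX' and return a (low, high) pair of hex strings."""
    parts = [part.strip() for part in spec.split(",")]
    if len(parts) not in (1, 2) or not all(HEX_COLOR.fullmatch(p) for p in parts):
        raise ValueError(f"invalid color specification: {spec!r}")
    return parts[0], parts[-1]


args = build_parser().parse_args(["-tc", "#FFFFFF", "-bc", "#000000,#202020"])
print(parse_color_spec(args.text_color), parse_color_spec(args.background_color))
```

Reusing one parsing helper for both options would keep the two flags consistent, and white text on a dark background would then just be a matter of swapping the two values.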

I am open to ideas!

Belval avatar Belval commented on May 21, 2024

Closing as the issue has gone cold. Feel free to reopen.

bharatkumarreddy avatar bharatkumarreddy commented on May 21, 2024

@kevgeo Just curious, were you able to generate text like the Synthetic Word Dataset?

abdksyed avatar abdksyed commented on May 21, 2024

Is the background color supported now?
