Hi, I am trying to generate images containing single words similar to that in the

Generating Images similar to Oxford Synthetic Word Dataset about textrecognitiondatagenerator HOT 12 CLOSED

belval commented on May 21, 2024

Generating Images similar to Oxford Synthetic Word Dataset

from textrecognitiondatagenerator.

Comments (12)

Belval commented on May 21, 2024

Hi!

First, you need to identify the type of transformation applied in the Oxford Synthetic Word Dataset.

From your example, I'd say it's mostly various fonts with an angle added to them. This should be easy enough to do.

My first bet would be to do: python3 run.py -w 1 -c 1000 -k 20 -rk

-w 1 means "1 word per image"
-c 1000 means "1000 images"
-k 20 means "skewed 20 degrees"
-rk means "random skew"
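As a rough illustration of how -k and -rk combine, the sketch below picks one skew angle per image. `pick_skew_angle` is a hypothetical helper, not a function from the project, and drawing a uniform integer angle between the -k value and its opposite is an assumption.

```python
import random

def pick_skew_angle(skew_angle, random_skew, rng=random):
    """Return the skew (in degrees) to apply to one image.

    Hypothetical helper: with random_skew the angle is drawn uniformly
    from [-skew_angle, +skew_angle]; otherwise the fixed value is used.
    """
    if random_skew:
        return rng.randint(-skew_angle, skew_angle)
    return skew_angle

# Equivalent of "-k 20 -rk": every image gets its own angle in [-20, 20]
angles = [pick_skew_angle(20, True) for _ in range(5)]
print(angles)
```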

You can add -t X, where X is the number of threads in your CPU. This is useful to speed up the process.

However, this generates images that always use the same font color, which is not what the Oxford dataset looks like!

You can add -tc "#282828,#444444" which means "A font color between #282828 (gray) and #444444 (lighter gray)"
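To make the "color between two hex values" idea concrete, here is a minimal sketch of how a value such as "#282828,#444444" could be turned into one color per image. `parse_hex` and `pick_text_color` are hypothetical names, and sampling each channel independently and uniformly is an assumption, not necessarily what run.py does internally.

```python
import random

def parse_hex(color):
    """Turn '#rrggbb' into an (r, g, b) tuple of ints."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))

def pick_text_color(text_color, rng=random):
    """Pick one color from a -tc value: a single hex color or 'low,high'.

    Assumption: each channel is sampled independently and uniformly
    between the two endpoints.
    """
    parts = [parse_hex(c.strip()) for c in text_color.split(",")]
    low, high = parts[0], parts[-1]
    r, g, b = (rng.randint(min(a, z), max(a, z)) for a, z in zip(low, high))
    return "#{:02x}{:02x}{:02x}".format(r, g, b)

print(pick_text_color("#282828,#444444"))  # a dark color between the two endpoints
print(pick_text_color("#282828"))          # a single color comes back unchanged
```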

Putting everything together you get python3 run.py -w 1 -c 1000 -k 20 -rk -tc "#282828,#444444"

For more options, you can simply look at python3 run.py -h for a short-yet-thoughtful description of the parameters.
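If you prefer to drive the generator from a script, the command above can be assembled as an argument list. This is only a sketch: `build_command` is a hypothetical helper, it uses only the flags mentioned in this comment, and it assumes run.py is in the current directory.

```python
import shlex

def build_command(count=1000, skew=20, text_color="#282828,#444444", threads=None):
    """Assemble the run.py invocation discussed above as an argv list."""
    cmd = ["python3", "run.py", "-w", "1", "-c", str(count),
           "-k", str(skew), "-rk", "-tc", text_color]
    if threads is not None:
        cmd += ["-t", str(threads)]  # optional thread count
    return cmd

cmd = build_command(threads=4)
print(shlex.join(cmd))
# To actually run it from the directory that contains run.py:
#   import subprocess; subprocess.run(cmd, check=True)
```

Passing a list instead of a single string avoids shell-quoting problems with the "#" characters in the color range.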

I will try it on my own computer a bit later and post examples of the results.

Belval commented on May 21, 2024

Here are 8 samples I generated with the above command:

If you want to remove the Gaussian noise behind the words, you can add -b 1. For now the only alternative is a plain white background, with more options planned.

kevgeo commented on May 21, 2024

@Belval Thanks for the swift response :). The images you have generated are fine, but the problem arises when I want to include numbers and punctuation. I am trying to generate images similar to the ones below:

But my generated images look like the ones below, where the font of the words does not look similar to those in the dataset or to your generated images.

Belval commented on May 21, 2024

What exact command are you using?

kevgeo commented on May 21, 2024

@Belval Sorry for the late reply. I have edited the arguments in run.py. Here is the list of arguments with their values, for clarity.

import argparse

def parse_arguments():
    # Parse the command line arguments of the program.

    parser = argparse.ArgumentParser(description='Generate synthetic text data for text recognition.')
    parser.add_argument(
        "--output_dir",
        type=str,
        nargs="?",
        help="The output directory",
        default="out/",
    )
    parser.add_argument(
        "-i",
        "--input_file",
        type=str,
        nargs="?",
        help="When set, this argument uses a specified text file as source for the text",
        default=""
    )
    parser.add_argument(
        "-l",
        "--language",
        type=str,
        nargs="?",
        help="The language to use, should be fr (French), en (English), es (Spanish), de (German), or cn (Chinese).",
        default="en"
    )
    parser.add_argument(
        "-c",
        "--count",
        type=int,
        nargs="?",
        help="The number of images to be created.",
        default=1000
    )
    parser.add_argument(
        "-rs",
        "--random_sequences",
        action="store_true",
        help="Use random sequences as the source text for the generation. Set '-let','-num','-sym' to use letters/numbers/symbols. If none specified, using all three.",
        default=True
    )
    parser.add_argument(
        "-let",
        "--include_letters",
        action="store_true",
        help="Define if random sequences should contain letters. Only works with -rs",
        default=True
    )
    parser.add_argument(
        "-num",
        "--include_numbers",
        action="store_true",
        help="Define if random sequences should contain numbers. Only works with -rs",
        default=True
    )
    parser.add_argument(
        "-sym",
        "--include_symbols",
        action="store_true",
        help="Define if random sequences should contain symbols. Only works with -rs",
        default=True
    )
    parser.add_argument(
        "-w",
        "--length",
        type=int,
        nargs="?",
        help="Define how many words should be included in each generated sample. If the text source is Wikipedia, this is the MINIMUM length",
        default=1
    )
    parser.add_argument(
        "-r",
        "--random",
        action="store_true",
        help="Define if the produced string will have variable word count (with --length being the maximum)",
        default=False
    )
    parser.add_argument(
        "-f",
        "--format",
        type=int,
        nargs="?",
        help="Define the height of the produced images",
        default=32,
    )
    parser.add_argument(
        "-t",
        "--thread_count",
        type=int,
        nargs="?",
        help="Define the number of thread to use for image generation",
        default=3,
    )
    parser.add_argument(
        "-e",
        "--extension",
        type=str,
        nargs="?",
        help="Define the extension to save the image with",
        default="jpg",
    )
    parser.add_argument(
        "-k",
        "--skew_angle",
        type=int,
        nargs="?",
        help="Define skewing angle of the generated text. In positive degrees",
        default=10,
    )
    parser.add_argument(
        "-rk",
        "--random_skew",
        action="store_true",
        help="When set, the skew angle will be randomized between the value set with -k and its opposite",
        default=True,
    )
    parser.add_argument(
        "-wk",
        "--use_wikipedia",
        action="store_true",
        help="Use Wikipedia as the source text for the generation, using this parameter ignores -r, -n, -s",
        default=False,
    )
    parser.add_argument(
        "-bl",
        "--blur",
        type=int,
        nargs="?",
        help="Apply gaussian blur to the resulting sample. Should be an integer defining the blur radius",
        #default=0.7,
        default=0,
    )
    parser.add_argument(
        "-rbl",
        "--random_blur",
        action="store_true",
        help="When set, the blur radius will be randomized between 0 and -bl.",
        # Note: this is a store_true flag, so a default of 0.5 is simply truthy
        default=0.5,
        #default=False,
    )
    parser.add_argument(
        "-b",
        "--background",
        type=int,
        nargs="?",
        help="Define what kind of background to use. 0: Gaussian Noise, 1: Plain white, 2: Quasicrystal, 3: Pictures",
        default=0,
    )
    parser.add_argument(
        "-hw",
        "--handwritten",
        action="store_true",
        help="Define if the data will be \"handwritten\" by an RNN",
    )
    parser.add_argument(
        "-na",
        "--name_format",
        type=int,
        help="Define how the produced files will be named. 0: [TEXT]_[ID].[EXT], 1: [ID]_[TEXT].[EXT] 2: [ID].[EXT] + one file labels.txt containing id-to-label mappings",
        default=0,
    )
    parser.add_argument(
        "-d",
        "--distorsion",
        type=int,
        nargs="?",
        help="Define a distorsion applied to the resulting image. 0: None (Default), 1: Sine wave, 2: Cosine wave, 3: Random",
        default=0
    )
    parser.add_argument(
        "-do",
        "--distorsion_orientation",
        type=int,
        nargs="?",
        help="Define the distorsion's orientation. Only used if -d is specified. 0: Vertical (Up and down), 1: Horizontal (Left and Right), 2: Both",
        default=0
    )
    parser.add_argument(
        "-wd",
        "--width",
        type=int,
        nargs="?",
        help="Define the width of the resulting image. If not set it will be the width of the text + 10. If the width of the generated text is bigger that number will be used",
        default=-1
    )
    parser.add_argument(
        "-al",
        "--alignment",
        type=int,
        nargs="?",
        help="Define the alignment of the text in the image. Only used if the width parameter is set. 0: left, 1: center, 2: right",
        default=1
    )
    parser.add_argument(
        "-tc",
        "--text_color",
        type=str,
        nargs="?",
        help="Define the text's color, should be either a single hex color or a range in the ?,? format.",
        default="#282828,#444444"
        #default='#282828'
    )
    return parser.parse_args()

Belval commented on May 21, 2024

There are several reasons for the differences you see between the Oxford dataset and the generated examples.

1. There is a 5px margin on top and bottom of the generated text. I am currently looking into fixing this by adding a parameter to let people manage this themselves.
2. Random strings means random strings. If you wish to have common words with symbols in them, I would suggest generating your own dictionary and feeding it to run.py using the -i parameter.
3. White text on a dark background is not currently supported. This will probably change in the future.
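The dictionary idea from the second point could be sketched as follows. Everything here is illustrative: the word list, the file name `my_dictionary.txt` and the decoration rules are made up, and the sketch assumes the -i file is plain text with one entry per line.

```python
import random
from pathlib import Path

def decorate(word, rng):
    """Return the word, sometimes with digits or punctuation attached."""
    choice = rng.random()
    if choice < 0.3:
        return word + str(rng.randint(0, 999))   # e.g. a trailing number
    if choice < 0.6:
        return word + rng.choice(".,!?:;")       # trailing punctuation
    if choice < 0.8:
        return rng.choice("(\"'") + word         # leading bracket or quote
    return word

def write_dictionary(words, path, count, seed=0):
    """Write `count` decorated words to `path`, one per line (assumed format)."""
    rng = random.Random(seed)
    lines = [decorate(rng.choice(words), rng) for _ in range(count)]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines

words = ["street", "open", "sale", "hotel", "exit"]  # placeholder vocabulary
lines = write_dictionary(words, "my_dictionary.txt", count=1000)
# Then, for example: python3 run.py -i my_dictionary.txt -c 1000 -k 20 -rk
```

A fixed seed keeps the generated dictionary reproducible between runs.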

Belval commented on May 21, 2024

About 1, see my reply on issue #31

kevgeo commented on May 21, 2024

@Belval Hi, yes, I understood that the main difference is that the generated text is black on a white background and not vice versa. If I manage to do it, I will definitely open a PR. Thanks for the help.

Belval commented on May 21, 2024

Oh that's great news, I didn't know you would be interested in working on it.

I would suggest adding a -bc / --background_color that takes the same parameters as --text_color.

I am open to ideas!
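One way the proposed option might be declared, mirroring the --text_color argument from the parser posted earlier in this thread. This is only a sketch: -bc / --background_color is a proposal here, not an existing option, and the "#ffffff" default is an assumption.

```python
import argparse

def build_parser():
    parser = argparse.ArgumentParser(description="Sketch of the proposed option.")
    parser.add_argument(
        "-tc",
        "--text_color",
        type=str,
        nargs="?",
        help="Text color: a single hex color or a range in the ?,? format.",
        default="#282828",
    )
    # Proposed in this thread, not an existing option: same format as --text_color.
    parser.add_argument(
        "-bc",
        "--background_color",
        type=str,
        nargs="?",
        help="Background color: a single hex color or a range in the ?,? format.",
        default="#ffffff",  # assumed default (plain white)
    )
    return parser

# Light text on a dark background, the case reported as unsupported above
args = build_parser().parse_args(["-tc", "#dddddd,#ffffff", "-bc", "#000000,#222222"])
print(args.text_color, args.background_color)
```

Reusing the same "single color or range" format would let one color-parsing routine serve both options.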

Belval commented on May 21, 2024

Closing as the issue has gone cold. Feel free to reopen.

bharatkumarreddy commented on May 21, 2024

@kevgeo Just curious, were you able to generate text like the Synthetic Word Dataset?

abdksyed commented on May 21, 2024

Is the background-color supported now???
