
mxfont's Introduction

MX-Font (ICCV 2021)

NOTICE: We release the unified few-shot font generation repository (clovaai/fewshot-font-generation). If you are interested in using our implementation, please visit the unified repository.

PyTorch implementation of Multiple Heads are Better than One: Few-shot Font Generation with Multiple Localized Experts. | paper

Song Park1, Sanghyuk Chun2, 3, Junbum Cha3, Bado Lee3, Hyunjung Shim1
1 School of Integrated Technology, Yonsei University
2 NAVER AI Lab
3 NAVER CLOVA

A few-shot font generation (FFG) method has to satisfy two objectives: the generated images should preserve the underlying global structure of the target character and present the diverse local reference style. Existing FFG methods aim to disentangle content and style either by extracting a universal style representation or by extracting multiple component-wise style representations. However, previous methods either fail to capture diverse local styles or cannot be generalized to characters with unseen components, e.g., unseen language systems. To mitigate these issues, we propose a novel FFG method, named Multiple Localized Experts Few-shot Font Generation Network (MX-Font). MX-Font extracts multiple style features that are not explicitly conditioned on component labels but are learned automatically by multiple experts, each representing a different local concept, e.g., the left-side sub-glyph. Owing to the multiple experts, MX-Font can capture diverse local concepts and generalizes to unseen languages. During training, we utilize component labels as weak supervision to guide each expert to specialize in different local concepts. We formulate the problem of assigning components to experts as a graph matching problem and solve it with the Hungarian algorithm. We also employ an independence loss and a content-style adversarial loss to impose content-style disentanglement. In our experiments, MX-Font outperforms previous state-of-the-art FFG methods on Chinese generation and on cross-lingual generation, e.g., Chinese to Korean.

You can find more related projects on few-shot font generation at the following links:


Prerequisites

conda install numpy scipy scikit-image tqdm jsonlib-python3 fonttools
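
Note that PyTorch itself is not installed by the command above; install a version matching your CUDA setup first, for example (illustrative):

conda install pytorch torchvision -c pytorch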

Usage

Note that we only provide example font files, not the font files used for training the provided weights (generator.pth). The example font files were downloaded from here.

Preparing Data

  • Examples of the datasets are in ./data

Font files (.ttf)

  • Prepare the TrueType font files (.ttf) to use for training and validation.
  • Put the training font files and validation font files into separate directories.

Text files listing the available characters of the .ttf files (.txt)

  • If you have the list of available characters of a .ttf file, save it to a text file (.txt) with the same name, in the same directory as the .ttf file.
    • (example) TTF file: data/ttfs/train/MaShanZheng-Regular.ttf, its available characters: data/ttfs/train/MaShanZheng-Regular.txt
  • You can also generate the available-characters files automatically using get_chars_from_ttf.py:
# Generating the available characters file

python get_chars_from_ttf.py --root_dir path/to/ttf/dir
  • --root_dir: the root directory to search for the .ttf files. All the .ttf files under this directory and its subdirectories will be processed.
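
For reference, here is a minimal sketch of how such a character list can be extracted with fontTools. This only illustrates the idea; it is not the repository's code, and the directory path is a placeholder.

from pathlib import Path

from fontTools.ttLib import TTFont

def defined_chars(ttf_path):
    """Return the characters that have a glyph mapping in the font's cmap."""
    font = TTFont(ttf_path)
    # getBestCmap() returns {codepoint: glyph_name} from the preferred cmap subtable.
    return "".join(chr(cp) for cp in sorted(font.getBestCmap()))

for ttf in Path("path/to/ttf/dir").rglob("*.ttf"):
    # Save the character list next to the font file, using the same stem.
    ttf.with_suffix(".txt").write_text(defined_chars(str(ttf)), encoding="utf-8")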

The json files with decomposition information (.json)

  • Files containing the decomposition information are needed.
    • The files for Chinese characters are provided. (data/chn_decomposition.json, data/primals.json)
    • If you want to train the model on a language other than Chinese, you also need the decomposition rule and primals files in the formats below (a sketch for deriving the primals file from the decomposition rule follows this list).
      • Decomposition rule
        • structure: dict (in json format)
        • format: {char: [list of components]}
        • example: {"㐬": ["亠", "厶", "川"], "㐭": ["亠", "囗", "口"]}
      • Primals
        • structure: list (in json format)
        • format: [All the components in the decomposition rule file]
        • example: ["亠", "厶", "川", "囗", "口"]
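
If you write your own decomposition rule file, the primals file can be derived from it mechanically. Below is a minimal sketch assuming the formats described above; the file paths are placeholders, not files shipped with this repository.

import json

# Load the decomposition rule: {char: [list of components]}
with open("path/to/decomposition.json", encoding="utf-8") as f:
    decomposition = json.load(f)

# The primals are all distinct components appearing in any decomposition.
primals = sorted({comp for comps in decomposition.values() for comp in comps})

with open("path/to/primals.json", "w", encoding="utf-8") as f:
    json.dump(primals, f, ensure_ascii=False)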

Training

Modify the configuration file (cfgs/train.yaml)

- use_ddp:  whether to use DistributedDataParallel, for multi-GPU training.
- port:  the port for the DistributedDataParallel training.

- work_dir:  the directory to save checkpoints, validation images, and the log.
- decomposition:  path to the "decomposition rule" file.
- primals:  path to the "primals" file.

- dset:  (leave blank)
  - train:  (leave blank)
    - data_dir : path to .ttf files for the training
  - val: (leave blank)
    - data_dir : path to .ttf files for the validation
    - source_font : path to .ttf file used as the source font during the validation
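
For illustration, a filled-in cfgs/train.yaml might look like the following. The paths and the port are examples, not the repository's defaults; source.ttf is a placeholder name.

use_ddp: False
port: 12241
work_dir: ./results
decomposition: data/chn_decomposition.json
primals: data/primals.json
dset:
  train:
    data_dir: data/ttfs/train
  val:
    data_dir: data/ttfs/val
    source_font: data/ttfs/val/source.ttf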

Run training

python train.py cfgs/train.yaml
  • arguments
    • path/to/config (first argument): path to the configuration file.
    • --resume (optional): path to the checkpoint to resume from.

Test

Preparing the reference images

  • Prepare the reference images and the .ttf file to use as the source font.
  • The reference images should be placed in this layout:
    data_dir
    |-- font1
    |   |-- char1.png
    |   |-- char2.png
    |   |-- char3.png
    |-- font2
    |   |-- char1.png
    |   |-- char2.png
    |   ...
  • The names of the directories and image files are not important; however, images with the same reference style should be grouped in the same directory.
  • If you want to generate only specific characters, prepare a file containing the list of characters to generate.
    • An example file is provided. (data/chn_gen.json)
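
If your reference style is only available as a .ttf file, the reference images can be rendered from it. Below is a minimal sketch using Pillow; it is not part of this repository, and the paths, characters, and image size are illustrative.

from pathlib import Path

from PIL import Image, ImageDraw, ImageFont

def render_refs(ttf_path, chars, out_root, size=128):
    """Render one PNG per character into a per-style directory."""
    font = ImageFont.truetype(ttf_path, int(size * 0.8))
    out_dir = Path(out_root) / Path(ttf_path).stem  # one directory per reference style
    out_dir.mkdir(parents=True, exist_ok=True)
    for ch in chars:
        img = Image.new("L", (size, size), color=255)  # white background
        draw = ImageDraw.Draw(img)
        # Center the glyph using its rendered bounding box.
        left, top, right, bottom = draw.textbbox((0, 0), ch, font=font)
        x = (size - (right - left)) / 2 - left
        y = (size - (bottom - top)) / 2 - top
        draw.text((x, y), ch, font=font, fill=0)
        img.save(out_dir / f"{ch}.png")

render_refs("path/to/style_font.ttf", "亠厶川", "data_dir")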

Modify the configuration file (cfgs/eval.yaml)

- dset:  (leave blank)
  - test:  (leave blank)
    - data_dir: path to reference images
    - source_font: path to .ttf file used as the source font during the generation
    - gen_chars_file: path to the file listing the characters to generate. Leave blank to generate all the characters available in the source font.
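
For illustration, a filled-in cfgs/eval.yaml might look like this (the paths are examples):

dset:
  test:
    data_dir: path/to/reference/images
    source_font: path/to/source.ttf
    gen_chars_file: data/chn_gen.json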

Run test

python eval.py \
    cfgs/eval.yaml \
    --weight generator.pth \
    --result_dir path/to/save/images
  • arguments
    • path/to/config (first argument): path to the configuration file.
    • --weight : path to the saved weights to test.
    • --result_dir: path to save generated images.

Code license

This project is distributed under the MIT license, except modules.py, which is adapted from https://github.com/NVlabs/FUNIT.

MX-Font
Copyright (c) 2021-present NAVER Corp.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

Acknowledgement

This project is based on clovaai/dmfont and clovaai/lffont.

How to cite

@inproceedings{park2021mxfont,
    title={Multiple Heads are Better than One: Few-shot Font Generation with Multiple Localized Experts},
    author={Park, Song and Chun, Sanghyuk and Cha, Junbum and Lee, Bado and Shim, Hyunjung},
    year={2021},
    booktitle={International Conference on Computer Vision (ICCV)},
}

mxfont's People

Contributors

8uos, clovaaiadmin, sanghyukchun


mxfont's Issues

About FID

Do you measure the style-aware and content-aware FID using all the generated images, rather than measuring the FID for each generated style and averaging the results?

Training strategy

Hello, thank you for your impressive work.
I noticed that you set 'max_iter' to 800,000, and the training dataset contains 439 different styles (each with nearly 6,000 characters).
When I set 'batch' to 8, this works out to only 2 to 3 epochs, which seems strange to me.
Am I right?

What are the meanings of style_facts_s, style_facts_c, char_facts_s and char_facts_c?

What is the difference between style_facts_s and style_facts_c?
What is the difference between char_facts_s and char_facts_c?

style_facts_s = self.gen.factorize(style_feats, 0)  # (B*n_s, n_exp, *feat_shape)
style_facts_c = self.gen.factorize(style_feats, 1)
char_facts_s = self.gen.factorize(char_feats, 0)
char_facts_c = self.gen.factorize(char_feats, 1)

Program aborts during training

Hello, I collected 313 fonts and made them into a dataset. While training the mxfont model, the program aborted.

Some losses were very abnormal. The log was recorded in the following screenshot.

[training log screenshot]

Got a blank image during inference

Thank you very much for your excellent work. I've spent some time studying and learning from it. However, I've encountered a problem that I'd like to ask about. While testing with your publicly available generator.pth model, I found that it sometimes generates blank images. Moreover, for different style_img inputs, it may produce blank results for different predicted fonts. Could you please explain what might be the cause of this?

Consult "B.5. Training details" in the paper

I don't understand the description "Then, we randomly select n style glyphs with the same style as the target glyph, and n content glyphs with the same character as the target glyph for each target glyph". Does this mean that the style fonts and content fonts must contain the same characters? Is it OK if my style font has 556 characters and my content font has 6,763?

An error is reported when reducing the training set characters

When reducing the training set characters, an error is reported:

File "/home/jmt/MXfont/mxfont-main/trainer/trainer_utils.py", line 101, in expert_assign
    r_in, c_in = linear_sum_assignment(prob_in)
ValueError: matrix contains invalid numeric entries

It seems the cost matrix contains an invalid entry. Increasing the batch size makes the error go away, but I don't want to increase the batch size. Is there any solution?

Training set consultation

Hello,
The training set consists of content and style fonts: 6,763 characters for the content font and 556 characters for the style font. After training and testing, I found that generation quality was poor for characters outside the 556. Is this dataset combination reasonable?

Problem reproducing MX-Font and FUNIT

Thanks for your wonderful work. I am trying to reproduce the results of MX-Font and FUNIT. Following your reply in #6, I chose the best AC_g_acc_c and AC_g_acc_s as the stopping iteration. When I try to reproduce FUNIT (128x128), mode collapse happens after about 10,000 iterations. Have you encountered this problem?

font_indice

Excuse me, where is the font_indice parameter in discriminator.py passed from, and what does it mean?

Classifier

How exactly is the classifier trained? How does the style classifier identify style features?

The Role of Factorize and Defactorize

Hello, I am very interested in your paper. While reading the code, I wondered: what are the functions of factorize and defactorize, and what are the roles of 'last' and 'skip'?

Pre-trained Model

Hi! Thanks for your excellent work!
I am wondering if it is possible to share a pre-trained model for this project.

get_defined_chars function seems to not work for some ttf files

Hello, thanks for this amazing work on few-shot font generation!

When I wanted to train the model on my own ttf files, I ran into a problem. When I use get_chars_from_ttf.py, some of my ttf files only produce the alphabet and some punctuation, and no Chinese characters are produced. However, the ttfs do include Chinese characters. I am not sure if there is some limitation in fontTools when it reads the ttf files. Could you help explain the reason? Thanks for your help!


Dataset

Thank you for your excellent work. I would like to ask about dataset preparation. My dataset consists of 30 fonts, each with 1,000 characters. 20 of the fonts are used for training, with 800 characters each. The test set consists of two parts: the remaining 200 characters from the 20 training fonts, and the remaining 200 characters from the other 10 fonts. The source font is a font from the training set. How should I place my dataset in the project's folders, how should I adjust the training parameters, and how can I make each font use only its 800 training characters? I also noticed that the training and val folders in your project are the same. Does the txt file list all the characters contained in each font's ttf? Why should all the characters be listed, and what is the purpose of the available-characters list? (A question from a beginner.)

OSError: symbolic link privilege not held

Thanks for your fantastic work.
I got this error when I tried to train the model with my dataset. Here is the setting that I changed:

etc

save: all-last
print_freq: 100
val_freq: 1000
save_freq: 5000
tb_freq: 100
##############################################################################################
INFO::07/20 00:14:51 | Step 5000
|D 2.925 |G 0.963 |FM 0.086 |R_font 0.710 |F_font 0.760 |R_uni 0.710 |F_uni 0.725
|AC_s 0.110 |AC_c 0.959 |cr_AC_s -3.527 |cr_AC_c -2.403 |AC_acc_s 95.8% |AC_acc_c 60.0%
|AC_g_s 0.642 |AC_g_c 1.269 |cr_AC_g_s -3.527 |cr_AC_g_c -2.403 |AC_g_acc_s 81.4% |AC_g_acc_c 48.2%
|L1 0.012 |INDP_EXP 0.0266 |INDP_FACT 0.0617
INFO::07/20 00:14:51 | Validation at Epoch = 30.303
Traceback (most recent call last):
  File "train.py", line 185, in <module>
    main()
  File "train.py", line 181, in main
    train(args, cfg)
  File "train.py", line 167, in train
    trainer.train(trn_loader, st_step, cfg.max_iter)
  File "D:\mxfont\trainer\fact_trainer.py", line 198, in train
    self.save(loss_dic['g_total'], self.cfg.save, self.cfg.get('save_freq', self.cfg.val_freq))
  File "D:\mxfont\trainer\base_trainer.py", line 192, in save
    last_ckpt_path.symlink_to(step_ckpt_path)
  File "C:\Users\user\anaconda3\envs\stargan-v2-torch\lib\pathlib.py", line 1352, in symlink_to
    self._accessor.symlink(target, self, target_is_directory)
OSError: symbolic link privilege not held

About training

Thanks for your amazing work. I tried to train the mxfont model following the default configuration, but how do I get the reference images, and what is the dataset format?

Please give a detailed description.
Thanks.

Single-GPU bug report

Traceback (most recent call last):
  File "train.py", line 184, in <module>
    main()
  File "train.py", line 180, in main
    train(args, cfg)
  File "train.py", line 166, in train
    trainer.train(trn_loader, st_step, cfg.max_iter)
  File "/202102014/Fy/mxfont/trainer/fact_trainer.py", line 145, in train
    self.add_ac_losses_and_update_stats(
  File "/202102014/Fy/mxfont/trainer/fact_trainer.py", line 281, in add_ac_losses_and_update_stats
    ac_loss_c, cross_ac_loss_c, acc_c = self.infer_comp_ac(char_facts, comp_ids)
  File "/202102014/Fy/mxfont/trainer/fact_trainer.py", line 251, in infer_comp_ac
    acc = T_probs[cids, eids].sum() / n_experts
RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)

RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)

Could you help with this error please?

Traceback (most recent call last):
  File "C:\Users\user\Desktop\mxfont\train.py", line 185, in <module>
    main()
  File "C:\Users\user\Desktop\mxfont\train.py", line 181, in main
    train(args, cfg)
  File "C:\Users\user\Desktop\mxfont\train.py", line 167, in train
    trainer.train(trn_loader, st_step, cfg.max_iter)
  File "C:\Users\user\Desktop\mxfont\trainer\fact_trainer.py", line 145, in train
    self.add_ac_losses_and_update_stats(
  File "C:\Users\user\Desktop\mxfont\trainer\fact_trainer.py", line 281, in add_ac_losses_and_update_stats
    ac_loss_c, cross_ac_loss_c, acc_c = self.infer_comp_ac(char_facts, comp_ids)
  File "C:\Users\user\Desktop\mxfont\trainer\fact_trainer.py", line 251, in infer_comp_ac
    acc = T_probs[cids, eids].sum() / n_experts
RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)

Training iterations

Hi, thanks for your impressive work.
I am trying to train your code from the beginning, so I was wondering how many GPUs you used and how long the training takes.

I noticed that you set 'max_iter' to 800,000 and the batch size to 8, but in the paper you said the iteration count is 650,000 and the mini-batch size is 24.
So which is correct?

If I set the batch size to 24, can I reduce the iterations to 650k/3 and get the same result?
I am looking forward to your response. Thanks.

About inference...

I noticed that the source_path in inference.ipynb needs a ttf that your data folder does not contain. Does it use the same font as lffont's content_font? If not, how do I get it?
