design2code's People

Contributors

noviscl, stevenyzzhang

design2code's Issues

better eval interface

Improve the API design of the eval code so users can provide a single filename, a single directory, or multiple directories when running eval. We also need to benchmark eval speed for people's reference.
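
A minimal sketch of what such an interface could look like, assuming eval operates on .html files. The names collect_html_files and time_call are hypothetical and are not part of the current eval code:

```python
import time
from pathlib import Path
from typing import Iterable, List, Union

PathLike = Union[str, Path]


def collect_html_files(inputs: Union[PathLike, Iterable[PathLike]]) -> List[Path]:
    """Expand one file, one directory, or several directories into a list of .html files.

    Hypothetical helper: the real eval code may organize its inputs differently.
    """
    # A single str/Path is wrapped so the loop below handles every case uniformly.
    if isinstance(inputs, (str, Path)):
        inputs = [inputs]
    found: List[Path] = []
    for item in inputs:
        path = Path(item)
        if path.is_dir():
            # Sorted so runs are reproducible across file systems.
            found.extend(sorted(path.glob("*.html")))
        elif path.is_file():
            found.append(path)
        else:
            raise FileNotFoundError(f"No such file or directory: {path}")
    return found


def time_call(fn, *args):
    """Return (result, elapsed_seconds) for a single call, for rough speed benchmarks."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start
```

With this shape, the caller passes whatever it has (a filename, a directory, or a list of directories) and the eval loop always iterates over one flat list of files.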

Error when multithreading

Fatal Python error: init_sys_streams: can't initialize sys standard streams
Python runtime state: core initialized
OSError: [Errno 9] Bad file descriptor

Better Scoring Function

I looked through some examples, and it seems the scoring can still be improved, especially for cases where certain elements are entirely missing from the generation.

Metric Improvement Ideas

  • Include image matching (size, positions, etc.). Also consider missing / extra image files.
  • Consider visual similarity of bounding box elements: colors, fonts (for text), etc.
  • Generate a more explainable multi-dimensional report apart from the final aggregate score. E.g., text element score & image element score / content score & layout score. We can ask humans to judge along the same dimensions.
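
The multi-dimensional report idea could be sketched roughly as below. The class name ScoreReport, its three dimensions, and the weighting scheme are assumptions chosen for illustration, not the metric's actual design:

```python
from dataclasses import asdict, dataclass
from typing import Dict, Optional


@dataclass
class ScoreReport:
    """Hypothetical per-dimension scores, each assumed to lie in [0, 1]."""
    text_score: float
    image_score: float
    layout_score: float

    def aggregate(self, weights: Optional[Dict[str, float]] = None) -> float:
        """Weighted mean of the dimensions; equal weights when none are given."""
        scores = asdict(self)
        weights = weights or {name: 1.0 for name in scores}
        total = sum(weights[name] for name in scores)
        return sum(scores[name] * weights[name] for name in scores) / total


report = ScoreReport(text_score=1.0, image_score=0.5, layout_score=0.0)
print(asdict(report), report.aggregate())
```

Keeping the dimensions separate until the last step means human judges can be asked to rate the same named dimensions, and the aggregate stays a derived value rather than the only output.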

Fix evaluation score

Need to fix the overlapping and merging issue of text blocks (the right one below):

image

DATA LICENSE

We need to check what would be an appropriate license for our test data. Also, are we OK with releasing the data directly as part of the repo (e.g., should we somehow keep it from becoming part of future GPT training data)?

Preprocessing details

@NoviScl
Can we remove something like

@import url("http://fonts.googleapis.com/css?family=Open+Sans");

during preprocessing? I find it can sometimes cause a render failure while taking a screenshot:

image
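
One way to drop such rules during preprocessing, as a rough sketch. The function name strip_css_imports is hypothetical, and the regular expression only covers the common url(...) and quoted-string forms of @import:

```python
import re

# Matches CSS @import rules such as:
#   @import url("http://fonts.googleapis.com/css?family=Open+Sans");
#   @import "theme.css" screen;
IMPORT_RULE = re.compile(
    r"""@import\s+(?:url\([^)]*\)|"[^"]*"|'[^']*')[^;]*;""",
    re.IGNORECASE,
)


def strip_css_imports(html: str) -> str:
    """Remove @import rules so rendering does not wait on remote stylesheets."""
    return IMPORT_RULE.sub("", html)


page = '<style>@import url("http://fonts.googleapis.com/css?family=Open+Sans");\nbody{color:red}</style>'
print(strip_css_imports(page))
```

Removing the rule changes the fonts the page renders with, so screenshots taken after this step may differ slightly from the original page.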

inquiry about its usage

After taking a screenshot (python3 data_utils/screenshot.py), how do I input that screenshot to the model and get the generated code?

Minimize code duplicates

There are some duplicates right now (e.g., screenshot code is also in the metrics modules; image rescaling code is copied over in the GPT-4V module). Would be nice to refactor the code to avoid these.

Potential Bug in Metric v3

Example 11625.png in gpt4v_visual_revision_prompting, error below:

Traceback (most recent call last):
  File "eval.py", line 46, in <module>
    matched, final_score, multi_score = visual_eval_v3(os.path.join(predictions_dir, filename.replace(".html", ".png")), os.path.join(reference_dir, filename.replace(".html", ".png")))
  File "/Users/clsi/Desktop/Pix2Code/Pix2Code/metrics/visual_score.py", line 963, in visual_eval_v3
    blocks1 = get_blocks_ocr_free(gpt_img)
  File "/Users/clsi/Desktop/Pix2Code/Pix2Code/metrics/ocr_free_utils.py", line 229, in get_blocks_ocr_free
    different_pixels = find_different_pixels(p_png, p_png_1)
  File "/Users/clsi/Desktop/Pix2Code/Pix2Code/metrics/ocr_free_utils.py", line 74, in find_different_pixels
    raise ValueError("Images are not the same size")
ValueError: Images are not the same size

@StevenyzZhang

Demo Idea

Add a demo where users give a web link, and we generate webpages with different models and methods and compare all of them side by side.

Normalize the Final Score

Right now the final_score from visual_score is not within the range [0, 1]; it would be nice to normalize it.

Question about the metrics

For the low-level metric, do blocks contain only the text elements? Do the size and position of image elements also need to be added to the blocks for comparison?

TODO on visual score

  1. We need high resolutions AND flexible aspect ratio.
    1. playwright: flexible aspect ratio (full page option) but low res(?)
    2. html2image: can set to high res, but need to set aspect ratio.
  2. Default background colors can lead to unrecognizable words by OCR.
    1. --default-background-color flag in browser settings
    2. Color differences pose challenges to OCR (the dark-colored footnote in 1390.png)
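
On the resolution question in item 1, Playwright can combine the full-page option with a device scale factor above 1, which renders more pixels per CSS pixel. A minimal sketch, assuming Playwright for Python and its Chromium build are installed; the function names and default sizes are hypothetical:

```python
from pathlib import Path


def page_options(scale: float = 2.0, width: int = 1280, height: int = 720) -> dict:
    """Keyword arguments for browser.new_page(); scale > 1 raises the output resolution."""
    return {
        "viewport": {"width": width, "height": height},
        "device_scale_factor": scale,
    }


def take_full_page_screenshot(html_path: str, png_path: str, scale: float = 2.0) -> None:
    """Render a local HTML file and save a full-page PNG at the requested scale."""
    # Imported lazily so page_options stays usable without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page(**page_options(scale))
        page.goto(Path(html_path).resolve().as_uri())
        # full_page=True captures the whole scrollable page, so the aspect
        # ratio follows the content instead of being fixed in advance.
        page.screenshot(path=png_path, full_page=True)
        browser.close()
```

Whether this is sharp enough for OCR on small text would still need to be checked against the problem cases listed above.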
