
uied's Introduction

UIED - UI element detection, detecting UI elements from UI screenshots or drawings

This project is still ongoing, and this repo may be updated irregularly. I developed a web app for UIED at http://uied.online

Related Publications:

1. UIED: a hybrid tool for GUI element detection

2. Object Detection for Graphical User Interface: Old Fashioned or Deep Learning or a Combination?

The repo has been upgraded with Google OCR for GUI text detection. To use the original version from our paper (using EAST as the text detector), check release v2.3 and download the pre-trained model from this link.

What is it?

UI Element Detection (UIED) is an old-fashioned computer vision (CV) based element detection approach for graphical user interfaces.

The input of UIED can be various kinds of UI images, such as mobile app or web page screenshots, UI designs drawn in Photoshop or Sketch, and even hand-drawn UI designs. The approach then detects and classifies text and graphic UI elements, and exports the detection result as a JSON file for future applications.
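As a rough illustration of consuming the exported JSON downstream, here is a minimal sketch. The field names (compos, class, column_min, and so on) and the sample values are assumptions made for this example, not UIED's documented output schema; check a real output file before relying on them.

```python
import json

# Hypothetical detection result. The keys below are illustrative
# assumptions, not the schema UIED is guaranteed to produce.
sample_result = {
    "compos": [
        {"id": 0, "class": "Text", "column_min": 10, "row_min": 20,
         "column_max": 110, "row_max": 40},
        {"id": 1, "class": "Button", "column_min": 10, "row_min": 60,
         "column_max": 90, "row_max": 100},
        {"id": 2, "class": "Button", "column_min": 120, "row_min": 60,
         "column_max": 200, "row_max": 100},
    ]
}


def count_by_class(result):
    """Count detected elements per class label."""
    counts = {}
    for compo in result["compos"]:
        counts[compo["class"]] = counts.get(compo["class"], 0) + 1
    return counts


def box_area(compo):
    """Area of an element's bounding box in pixels."""
    width = compo["column_max"] - compo["column_min"]
    height = compo["row_max"] - compo["row_min"]
    return width * height


# Round-trip through a JSON string, as if the result were read from a file.
loaded = json.loads(json.dumps(sample_result))
print(count_by_class(loaded))
print([box_area(c) for c in loaded["compos"]])
```

With a real output file, json.load on the opened file would replace the round-trip line.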

UIED comprises two parts that detect UI text and graphic elements, such as buttons, images and input bars.

  • For text, it leverages Google OCR to perform detection.

  • For graphical elements, it uses old-fashioned CV approaches to locate the elements and a CNN classifier to classify them.
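To give a feel for what "old-fashioned CV" element location can look like, the sketch below finds bounding boxes of connected foreground regions in a small binary grid using a breadth-first flood fill. It is a simplified stand-in written in pure Python, not the algorithm in detect_compo/; the function name find_boxes and the min_area parameter are hypothetical.

```python
from collections import deque


def find_boxes(grid, min_area=1):
    """Return (row_min, col_min, row_max, col_max) for each 4-connected
    region of 1s whose pixel count is at least min_area."""
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or seen[r][c]:
                continue
            # Flood-fill one region, tracking its extent and pixel count.
            queue = deque([(r, c)])
            seen[r][c] = True
            r0, c0, r1, c1 = r, c, r, c
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                r0, c0 = min(r0, y), min(c0, x)
                r1, c1 = max(r1, y), max(c1, x)
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if area >= min_area:
                boxes.append((r0, c0, r1, c1))
    return boxes


# Toy binarized "screenshot": a 2x2 block and a 2-pixel vertical bar.
binary = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
print(find_boxes(binary))
print(find_boxes(binary, min_area=3))
```

A minimum-area filter like this is one kind of parameter that typically needs tuning per image type.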

UIED is highly customizable: you can replace either part with an approach of your choice (e.g. another text detection approach). Unlike a black-box end-to-end deep learning approach, you can easily revise the algorithms in non-text detection and merging (partially or entirely) to fit your task.

UIED Approach

How to use?

Dependency

  • Python 3.5
  • OpenCV 3.4.2
  • Pandas

Installation

The new version of UIED, equipped with Google OCR, is easy to deploy and needs no pre-trained model. Simply download the repo and install the dependencies.

Please replace the Google OCR key at detect_text/ocr.py line 28 with your own (apply for one on the Google website).
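If you would rather not hard-code the key in the source file, one option is to read it from an environment variable. This is only a sketch of that idea: the variable name GOOGLE_OCR_KEY and the helper load_ocr_key are hypothetical, and detect_text/ocr.py would need a small manual change to call something like it.

```python
import os


def load_ocr_key(env_var="GOOGLE_OCR_KEY", environ=None):
    """Return the OCR key stored in env_var, or raise if it is missing.

    env_var is a hypothetical name chosen for this example. environ
    defaults to os.environ and can be replaced by a dict for testing.
    """
    environ = os.environ if environ is None else environ
    key = environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            "Set the {} environment variable to your Google OCR key".format(env_var)
        )
    return key


# Usage with an explicit mapping, so the example runs without a real key.
print(load_ocr_key(environ={"GOOGLE_OCR_KEY": "  abc123 "}))
```

Keeping the key out of the file also avoids committing it by accident.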

Usage

To test your own image(s):

  • To test a single image, change input_path_img in run_single.py to your input image; the results will be written to output_root.
  • To test multiple images, change input_img_root in run_batch.py to your input directory; the results will be written to output_root.
  • To adjust the parameters live, use run_testing.py.
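The batch case boils down to mapping each image under an input directory to an output location. The sketch below shows that mapping only; it does not reproduce run_batch.py, and the helper name plan_batch, the accepted extensions, and the ".json" output naming are assumptions for illustration.

```python
import os

# Extensions treated as images in this sketch (an assumption).
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg")


def plan_batch(input_img_root, output_root, filenames=None):
    """Pair every image in input_img_root with a JSON path in output_root.

    filenames can be passed explicitly (useful for testing); otherwise
    the directory is listed from disk.
    """
    if filenames is None:
        filenames = os.listdir(input_img_root)
    plan = []
    for name in sorted(filenames):
        stem, ext = os.path.splitext(name)
        if ext.lower() in IMAGE_EXTENSIONS:
            plan.append((
                os.path.join(input_img_root, name),
                os.path.join(output_root, stem + ".json"),
            ))
    return plan


for src, dst in plan_batch("in", "out", ["b.PNG", "notes.txt", "a.jpg"]):
    print(src, "->", dst)
```

Non-image files such as notes.txt are skipped rather than passed to the detector.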

Note: The best set of parameters varies for different types of GUI image (Mobile App, Web, PC). I highly recommend first experimenting with run_testing.py to pick a good set of parameters for your data.
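Once you have found parameters that work for each GUI type, it can help to keep them as named presets. The sketch below shows one way to organize that; the parameter names and every numeric value are placeholders invented for this example, not UIED's real parameters or recommended settings.

```python
# Placeholder parameter names and values, for illustration only.
DEFAULT_PARAMS = {
    "min_element_area": 50,
    "binarization_threshold": 4,
    "merge_contained": True,
}

# Per-type overrides you would fill in after experimenting.
PRESETS = {
    "mobile": {"min_element_area": 50},
    "web": {"min_element_area": 25, "merge_contained": False},
    "pc": {"binarization_threshold": 10},
}


def params_for(gui_type, overrides=None):
    """Combine defaults, the preset for gui_type, and ad hoc overrides."""
    if gui_type not in PRESETS:
        raise ValueError("Unknown GUI type: {}".format(gui_type))
    params = dict(DEFAULT_PARAMS)
    params.update(PRESETS[gui_type])
    params.update(overrides or {})
    return params


print(params_for("web"))
print(params_for("pc", {"min_element_area": 100}))
```

Later layers win: an explicit override beats the preset, which beats the default.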

Folder structure

cnn/

  • Used to train classifier for graphic UI elements
  • Set path of the CNN classification model

config/

  • Set data paths
  • Set parameters for graphic elements detection

data/

  • Input UI images and output detection results

detect_compo/

  • Non-text GUI component detection

detect_text/

  • GUI text detection using Google OCR

detect_merge/

  • Merge the detection results of non-text and text GUI elements

The major detection algorithms are in detect_compo/, detect_text/ and detect_merge/.
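To make the merging step more concrete, here is a minimal sketch of one plausible rule: drop a non-text box when it lies entirely inside a detected text box (it is probably a glyph or word fragment), and keep everything else. This is an assumption about what merging could involve, not the logic in detect_merge/; the box layout (col_min, row_min, col_max, row_max) and the class labels are also illustrative.

```python
def contains(outer, inner):
    """True if box inner lies entirely within box outer.

    Boxes are (col_min, row_min, col_max, row_max) tuples.
    """
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def merge_detections(text_boxes, compo_boxes):
    """Combine text and non-text boxes, discarding non-text boxes that
    are fully covered by some text box."""
    merged = [{"class": "Text", "box": box} for box in text_boxes]
    for box in compo_boxes:
        if any(contains(text_box, box) for text_box in text_boxes):
            continue  # likely part of the text itself
        merged.append({"class": "Compo", "box": box})
    return merged


text_boxes = [(10, 10, 100, 30)]
compo_boxes = [(12, 12, 20, 28), (0, 0, 200, 50)]
for element in merge_detections(text_boxes, compo_boxes):
    print(element)
```

Here the small box inside the text region is dropped, while the large container that surrounds the text is kept.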

Demo

GUI element detection result for web screenshot

UI Components detection result

Contributors

mulongxie
