Huozi

Huozi is a package of automated document manipulators, tailored for the efficient production of electronic digest-type weekly magazines that need little fancy formatting. Huozi grew out of my experience editing and producing 1510 Weekly, which is why, for the moment, it is customized for 1510 Weekly production and oriented toward Chinese text. The package is under development and still far from decently usable. In Chinese, Huozi means 'movable type', 'lively words', or 'animated words'.

Tool Simple

Tool Simple is the GUI.

AEP

AEP, or Automated E-Digest Preprocessor, includes these tools:

  • Issue Class and Article Class, corresponding to an issue of a magazine and an article in it;
  • Text cleaner that removes redundant spaces and blank lines and unifies punctuation marks;
  • HTML analyser that extracts the main text of a page and guesses its title, author, and where the sub-headlines lie; and
  • Grabber that handles .
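
As a rough illustration of what the text cleaner does, here is a minimal sketch in Python 3. It is not Huozi's actual code (the package targets Python 2.7); the function name `clean_text`, the mapping `PUNCTUATION_MAP`, and the rule of converting ASCII punctuation only when it follows a Chinese character are all assumptions made for this example.

```python
import re

# Hypothetical mapping from ASCII punctuation to full-width Chinese
# punctuation; the marks Huozi really unifies may differ.
PUNCTUATION_MAP = {
    ",": "，",
    ";": "；",
    ":": "：",
    "?": "？",
    "!": "！",
}


def clean_text(text):
    """Collapse redundant spaces, drop blank lines, unify punctuation."""
    cleaned_lines = []
    for line in text.splitlines():
        # Collapse runs of spaces, tabs, and full-width spaces (U+3000)
        # into a single space, then trim both ends of the line.
        line = re.sub(r"[ \t\u3000]+", " ", line).strip()
        if not line:
            continue  # skip blank lines entirely
        # Replace ASCII punctuation that follows a CJK character with its
        # full-width form, swallowing any spaces around the mark.
        line = re.sub(
            r"(?<=[\u4e00-\u9fff])\s*([,;:?!])\s*",
            lambda match: PUNCTUATION_MAP[match.group(1)],
            line,
        )
        cleaned_lines.append(line)
    return "\n".join(cleaned_lines)
```

Calling `clean_text` on a string with stray spaces and empty lines returns the same text with one space between words, no blank lines, and full-width punctuation after Chinese characters. English sentences such as "Hello, world" are left untouched because the punctuation rule only fires after a CJK character.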

The Bride

The Bride is the Microsoft Word document formatter. The name has nothing to do with Kill Bill.

Requirements

  • Python 2.7 (not tested on other versions)
  • lxml, BeautifulSoup 3
  • PIL

For auto-detection of charset:

  • chardet

For doc export:

  • MS Windows XP or 7; Vista should probably work but has not been tested.
  • MS Word 2007 or 2010; versions 2013 and 2003 may work as well.
  • win32com

Software Structure

For a closer look at the structure of the package, see package structure.

Author

My name is Andy Shu. I am a Hong Kong-based journalist and a volunteer with the non-profit organization Co-China Forum. I picked up programming in January 2013, so please expect nothing better than chaotic design, horrible mistakes, and spaghetti code.
