htmllaundry's Introduction

Introduction

This package contains several handy Python functions to clean up HTML markup or perform other common changes. The cleanup is strict enough to clean HTML pasted from MS Word or Apple Pages. The package also includes integration code for z3c.form, providing fields that automatically sanitize HTML on save.

The implementation is based on the Cleaner class from lxml.

Cleanup routines

All cleanup routines can be invoked through the single sanitize function. This function takes a string as input and returns a cleaned-up version of that string. Here is a simple example:

>>> from htmllaundry import sanitize
>>> sanitize('Hello, <em>world</em>')
'<p>Hello, <em>world</em></p>'

The sanitize function takes an optional extra parameter, an lxml Cleaner instance, which lets you apply different filtering rules. htmllaundry includes three cleaners:

  • htmllaundry.cleaners.DocumentCleaner, which is the default cleaner. This cleaner will allow most safe tags, while stripping out inline styles and insecure markup.
  • htmllaundry.cleaners.LineCleaner is a stricter cleaner which only allows a few inline elements. This is useful in places where you only want to accept single-line input, for example in document titles.
  • htmllaundry.cleaners.CommentCleaner only allows a very limited set of HTML elements, and is designed for user-provided comments. It will also force all external links to open in a new browser window.
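
The three cleaners differ mainly in which elements they let through. The sketch below illustrates that idea using only the Python standard library. It is not htmllaundry's implementation (which builds on the lxml Cleaner class), and the names AllowlistFilter, filter_tags, DOCUMENT_TAGS and LINE_TAGS are hypothetical, chosen for illustration. It discards all attributes and is not meant as a security-grade sanitizer.

```python
from html import escape
from html.parser import HTMLParser


class AllowlistFilter(HTMLParser):
    """Keep text plus the tags named in `allowed`; drop every other tag."""

    def __init__(self, allowed):
        super().__init__(convert_charrefs=True)
        self.allowed = frozenset(allowed)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Attributes (style, onclick, ...) are discarded on purpose.
        if tag in self.allowed:
            self.parts.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in self.allowed:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Re-escape text so a stray "<" or "&" cannot form new markup.
        self.parts.append(escape(data, quote=False))


def filter_tags(markup, allowed):
    """Return `markup` with only the allowed tags kept (hypothetical helper)."""
    parser = AllowlistFilter(allowed)
    parser.feed(markup)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.parts)


# Hypothetical tag sets: a permissive "document" profile and a strict "line" one.
DOCUMENT_TAGS = {"p", "em", "strong", "ul", "ol", "li", "a"}
LINE_TAGS = {"em", "strong"}

sample = '<p style="color:red">Hello, <em>world</em></p>'
print(filter_tags(sample, DOCUMENT_TAGS))  # <p>Hello, <em>world</em></p>
print(filter_tags(sample, LINE_TAGS))      # Hello, <em>world</em>
```

With the real package, the equivalent choice is made by passing one of the cleaners listed above to sanitize.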

If you want to go all the way you can also use strip_markup to strip all markup from your input:

>>> from htmllaundry import strip_markup
>>> strip_markup('Hello, <em>world</em>')
'Hello, world'
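
To make the behaviour concrete, here is a minimal standard-library sketch of what stripping all markup amounts to: keep the text nodes and ignore every tag. The TextOnly class and strip_tags function are hypothetical names for illustration. How strip_markup itself is implemented is not shown here and may differ, for example in whitespace handling.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect text nodes and ignore every tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def strip_tags(markup):
    """Return only the text content of `markup` (hypothetical helper)."""
    parser = TextOnly()
    parser.feed(markup)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.chunks)


print(strip_tags("Hello, <em>world</em>"))  # Hello, world
```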

z3c.form integration

If you want to use the z3c.form integration you should use the z3cform extra for this package:

install_requires=[
     ...
     'htmllaundry [z3cform]',
     ...
     ],

In addition you will need to load the ZCML. In your configure.zcml add a line like this:

<include package="htmllaundry" />

You can then use the HtmlText field type in your schemas. For example:

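from zope.i18nmessageid import MessageFactory
from zope.interface import Interface
from zope import schema
from htmllaundry.z3cform import HtmlText

# Assumption: "my.package" is a placeholder i18n domain; use your own.
_ = MessageFactory("my.package")

class IDocument(Interface):
    # Plain single-line text field; no HTML cleanup is applied here.
    title = schema.TextLine(
            title=_(u"Title"),
            required=True)

    # HTML field whose value is sanitized automatically on save.
    description = HtmlText(
            title=_(u"Description"),
            required=True)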

Please note that using HtmlText will not automatically give you a WYSIWYG widget.

htmllaundry's People

Contributors

ale-rt, jcbrand, pysailor, rafaelbco, wichert
