Giter VIP home page Giter VIP logo

kses's Introduction

This is the initial checkin of code in the process of being copied from its source at sourceforge and getting ready to be updated many years later.

I was surprised to see that kses is still being used these days, and that has inspired me to pick it back up and see what might be needed to be added to it to see if it can be made to work even better.

Updates to follow, soon.


kses 0.2.2 README  [kses strips evil scripts!]
=================

* INTRODUCTION *


Welcome to kses - an HTML/XHTML filter written in PHP. It removes all unwanted
HTML elements and attributes, no matter how malformed HTML input you give it.
It also does several checks on attribute values. kses can be used to avoid
Cross-Site Scripting (XSS), Buffer Overflows and Denial of Service attacks,
among other things.

The program is released under the terms of the GNU General Public License. You
should look into what that means, before using kses in your programs. You can
find the full text of the license in the file COPYING.


* FEATURES *


Some of kses' current features are:

* It will only allow the HTML elements and attributes that it was explicitly
told to allow.

* Element and attribute names are case-insensitive (a href vs A HREF).

* It will understand and process whitespace correctly.

* Attribute values can be surrounded with quotes, apostrophes or nothing.

* It will accept valueless attributes with just names and no values (selected).

* It will accept XHTML's closing " /" marks.

* Attribute values that are surrounded with nothing will get quotes to avoid
producing non-W3C conforming HTML
(<a href=http://sourceforge.net/projects/kses> works but isn't valid HTML).

* It handles lots of types of malformed HTML, by interpreting the existing
code the best it can and then rebuilding new code from it. That's a better
approach than trying to process existing code, as you're bound to forget about
some weird special case somewhere. It handles problems like never-ending
quotes and tags gracefully.

* It will remove additional "<" and ">" characters that people may try to
sneak in somewhere.

* It supports checking attribute values for minimum/maximum length and
minimum/maximum value, to protect against Buffer Overflows and Denial of
Service attacks against WWW clients and various servers. You can stop
<iframe src= width= height=> from having too high values for width and height,
for instance.

* It has got a system for whitelisting URL protocols. You can say that
attribute values may only start with http:, https:, ftp: and gopher:, but no
other URL protocols (javascript:, java:, about:, telnet:..). The functions that
do this work handle whitespace, upper/lower case, HTML entities
("jav&#97;script:") and repeated entries ("javascript:javascript:alert(57)").
It also normalizes HTML entities as a nice side effect.

* It removes Netscape 4's JavaScript entities ("&{alert(57)};").

* It handles NULL bytes and Opera's chr(173) whitespace characters.

* There is a procedural version and two object-oriented versions (for PHP 4
  and PHP 5) of kses.


* USE IT *


It's very easy to use kses in your own PHP web application! Basic usage looks
like this:


<?php

include 'kses.php';

$allowed = array('b' => array(),
                 'i' => array(),
                 'a' => array('href' => 1, 'title' => 1),
                 'p' => array('align' => 1),
                 'br' => array());

$val = $_POST['val'];
if (get_magic_quotes_gpc())
  $val = stripslashes($val);
# You must strip slashes from magic quotes, or kses will get confused.

$val = kses($val, $allowed); # The filtering takes place here.

# Do something with $val.

?>


This definition of $allowed means that only the elements B, I, A, P and BR are
allowed (along with their closing tags /B, /I, /A, /P and /BR). B, I and BR
may not have any attributes. A may only have the attributes HREF and TITLE,
while P may only have the attribute ALIGN. You can list the elements and
attributes in the array in any mixture of upper and lower case. kses will also
recognize HTML code that uses both lower and upper case.

It's important to select the right allowed attributes, so you won't open up
an XSS hole by mistake. Some important attributes that you mustn't allow
include but are not limited to: 1) style, and 2) all intrinsic events
attributes (onMouseOver and so on, on* really). I'll write more about this in
the documentation that will be distributed with future versions of kses.

It's also important to note that kses' HTML input must be cleaned of all
slashes coming from magic quotes. If the rest of your code requires these
slashes to be present, you can always add them again after calling kses with
a simple addslashes() call.

You should take a look at the documentation in the docs/ directory and the
examples in the examples/ directory, to get more information on how to use
kses. The object-oriented versions of kses are also worth checking out, and
they're included in the oop/ directory.


* UPGRADING TO 0.2.2 *


kses 0.2.2 is backwards compatible with all previous releases, so upgrading
should just be a matter of using a new version of kses.php instead of an old
one.


* NEW VERSIONS, MAILING LISTS AND BUG REPORTS *


If you want to download new versions, subscribe to the kses-general mailing
list or even take part in the development of kses, we refer you to its
homepage at  http://sourceforge.net/projects/kses . New developers and beta
testers are more than welcome!

If you have any bug reports, suggestions for improvement or simply want to tell
us that you use kses for some project, feel free to post to the kses-general
mailing list. If you have found any security problems (particularly XSS,
naturally) in kses, please contact Ulf privately at  metaur at users dot
sourceforge dot net  so he can correct it before you or someone else tells the
public about it.

(No, it's not a security problem in kses if some program that uses it allows a
bad attribute, silly. If kses is told to accept the element body with the
attributes style and onLoad, it will accept them, even if that's a really bad
idea, securitywise.)


* OTHER HTML FILTERS *


Here are the other stand-alone, open source HTML filters that we currently know
of:

* Htmlfilter for PHP - the filter from Squirrelmail
  PHP
  Konstantin Riabitsev
  http://linux.duke.edu/projects/mini/htmlfilter/

* HTML::StripScripts and related CPAN modules
  Perl
  Nick Cleaton
  http://search.cpan.org/perldoc?HTML%3A%3AStripScripts

* SafeHtmlChecker [is this really open source?]
  PHP
  Simon Willison
  http://simon.incutio.com/archive/2003/02/23/safeHtmlChecker

There are also a lot of HTML filters that were written specifically for some
program. Some of them are better than others.

Please write to the kses-general mailing list if you know of any other
stand-alone, open-source filters.


* DEDICATION *


kses 0.2.2 is dedicated to Audrey Tautou and Jean-Pierre Jeunet.


* MISC *


The kses code is based on an HTML filter that Ulf wrote on his own back in 2002
for the open-source project Gnuheter ( http://savannah.nongnu.org/projects/
gnuheter ). Gnuheter is a fork from PHP-Nuke. The HTML filter has been
improved a lot since then.

To stop people from having sleepless nights, we feel the urgent need to state
that kses doesn't have anything to do with the KDE project, despite having a
name that starts with a K.

In case someone was wondering, Ulf is available for kses-related consulting.

Finally, the name kses comes from the terms XSS and access. It's also a
recursive acronym (every open-source project should have one!) for "kses
strips evil scripts".


// Ulf and the kses development group, February 2005

kses's People

Contributors

linusu avatar richardvasquez avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

kses's Issues

Thanks.

Just found your repo on here, and wanted to say thank you for creating the kses project, and taking it back up again. Your project has been a life saver for many people, and sites over the years.

I'm currently using a modified (I minified, and included html5 cleaning) in my own project, LibreCMS.

oop php 5 "Undefined variable: string2"

Hi,
Decided to give this class a try and after using parse on an "a" tag that has a "href" attribute with http link it gives me an error:

Notice: Undefined variable: string2 in /*****/php5.class.kses.php on line 958

I'm I right to assume that that $string2 should be $string?

oop php5 deprecated on 5.5+ versions

Hi, it seems that some bits of the code were deprecated in 5.6 and 5.5 versions.

Deprecated: preg_replace(): The /e modifier is deprecated, use preg_replace_callback instead in /var/www/...path.../php5.class.kses.php on line 542

Deprecated: preg_replace(): The /e modifier is deprecated, use preg_replace_callback instead in /var/www/...path.../php5.class.kses.php on line 158

Not Working in PHP 7

Warning: preg_replace_callback(): Requires argument 2, 'kses_normalize_entities2("\1")', to be a valid callback in F:\xampp\htdocs\XSS\kses-master\kses.php on line 522

Warning: preg_replace(): The /e modifier is no longer supported, use preg_replace_callback instead in F:\xampp\htdocs\XSS\kses-master\kses.php on line 81

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.