Opened 14 years ago

Last modified 13 years ago

#92 closed defect (FIXED)

Need a util.html_cleaner() method with good tests

Reported by: Christopher Allan Webber Owned by:
Priority: minor Milestone: 0.0.3
Component: programming Keywords:
Cc: Parent Tickets:

Description

There's no way around it, we're going to need a good sanitizer for
comment/description/etc html **regardless** of
`whether or not we use markdown <http://bugs.foocorp.net/issues/363#note-5>`_
... so I think the html cleaner should:


-  use
   `lxml.html.clean <http://lxml.de/lxmlhtml.html#cleaning-up-html>`_
-  Use **only whitelisted tags**... this is possible through lxml,
   we need to do it right. Tags I think we'll need to allow: b, i, em,
   strong, p, ul, ol, li, a, br. (any others?)
-  Only whitelisted attributes
-  XSS attribute attack prevention, other XSS prevention stuff...
   see the lxml.html.clean docs.
-  have tests that try to attack each one of these components.

In the future it might be a good idea to also prevent certain other
annoying things... deeply nested

.. raw:: html

   <p>
   
's, etc. But for now I think this will be good enough.



Change History (5)

comment:1 by Christopher Allan Webber, 14 years ago

Milestone: 0.0.3

comment:1 by Christopher Allan Webber, 14 years ago

I've added a cleaner method to util, util.clean\_html(). I even
added tests for the two biggest concerns, blocking images and any
sort of javascript. Could use a decent Elrond review... and better
tests ;)



comment:2 by Christopher Allan Webber, 14 years ago

Status: NewClosed
We have a couple of pretty good tests in here; I'm satisfied
enough. It looks like lxml has itself some good tests too for more
graunular stuff.

Maybe in the future we could explicitly test all of:


-  `http://htmlpurifier.org/live/smoketests/xssAttacks.php <http://htmlpurifier.org/live/smoketests/xssAttacks.php>`_
-  `http://ha.ckers.org/xss.html <http://ha.ckers.org/xss.html>`_

but I don't think we have time, and it really looks like we've
covered all of those cases (even with the few tests we have now)
but just not in a granular way.

Marking this as closed.



comment:3 by Elrond, 13 years ago

Component: Programming

comment:3 by Will Kahn-Greene, 13 years ago

The original url for this bug was http://bugs.foocorp.net/issues/379 .

Note: See TracTickets for help on using tickets.