Does Java lack a good HTML sanitizer?

I have been looking for a good Java HTML sanitizer that would let me sanitize user input based on a whitelist and escape everything else. There are some examples here and there, but none seems as complete as what is available for PHP, Ruby, Python and Perl.
Does Java really lack a robust HTML sanitizer?
Do you know of any good open source HTML sanitizer for Java?
Update
I have released a Grails plugin, html-cleaner, which is a whitelist-based HTML sanitizer built on Jsoup.

mvn eclipse:eclipse gives error – Request to merge when ‘filtering’ is not identical

Description: You get the error “Request to merge when ‘filtering’ is not identical” when running mvn eclipse:eclipse on a Maven project.

Root cause: A change in the Maven Eclipse plugin.

Solution: Run a specific version of the plugin explicitly:

mvn org.apache.maven.plugins:maven-eclipse-plugin:2.6:eclipse

FreeMarker sucks: it depends on javax.swing and won’t run on GAE

I would say FreeMarker sucks.
Yes, it does. A template engine has a dependency on javax.swing!
Isn’t that surprising?

Today I was trying to deploy a Stripes/FreeMarker hello world app on GAE and I got the following:


java.lang.NoClassDefFoundError: javax.swing.tree.TreeNode is a restricted class. Please see the Google App Engine developer's guide for more details.
at com.google.appengine.tools.development.agent.runtime.Runtime.reject(Runtime.java:51)
at freemarker.core.TextBlock.isIgnorable(TextBlock.java:375)
at freemarker.core.TemplateElement.postParseCleanup(TemplateElement.java:240)
at freemarker.core.MixedContent.postParseCleanup(MixedContent.java:76)
at freemarker.core.FMParser.Root(FMParser.java:2961)
at freemarker.template.Template.<init>(Template.java:149)
at freemarker.cache.TemplateCache.loadTemplate(TemplateCache.java:448)
at freemarker.cache.TemplateCache.getTemplate(TemplateCache.java:361)
at freemarker.cache.TemplateCache.getTemplate(TemplateCache.java:235)
at freemarker.template.Configuration.getTemplate(Configuration.java:487)
at freemarker.core.Environment.getTemplateForInclusion(Environment.java:1465)
at freemarker.core.Include.accept(Include.java:157)
at freemarker.core.Environment.visit(Environment.java:210)
at freemarker.core.MixedContent.accept(MixedContent.java:92)
at freemarker.core.Environment.visit(Environment.java:210)
at freemarker.core.Environment.process(Environment.java:190)
at freemarker.template.Template.process(Template.java:237)
at freemarker.ext.servlet.FreemarkerServlet.process(FreemarkerServlet.java:452)
at freemarker.ext.servlet.FreemarkerServlet.doGet(FreemarkerServlet.java:391)
.........
........

Yes, you cannot use FreeMarker on GAE, at least not without hacking it.

I found a patch here; it seems it would solve the above exception.

http://groups.google.com/group/google-appengine-java/browse_thread/thread/dd84e44f604498c4

Who knows whether FreeMarker depends on more restricted classes? We can only hope for the best.

Updates
No, the above solution isn’t going to work all the time.
I had a quick look at the code, and I think the culprit is here:


abstract public class TemplateElement extends TemplateObject implements TreeNode

From the code, it seems that TemplateElement implements TreeNode only because it needs a contract similar to TreeNode’s; it has nothing to do with the Swing API. (You already knew that.)
Instead of implementing TreeNode, the author could have created a new interface that exposes a similar contract.
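
A minimal sketch of that idea, assuming a hypothetical TemplateNode interface. The interface and the SimpleElement class are made up for illustration and are not FreeMarker’s actual code. It copies the tree-navigation contract of javax.swing.tree.TreeNode but depends only on java.util:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical interface: the same navigation contract as
// javax.swing.tree.TreeNode, with no dependency on Swing.
interface TemplateNode {
    TemplateNode getParent();
    int getChildCount();
    TemplateNode getChildAt(int index);
    int getIndex(TemplateNode child);
    boolean isLeaf();
    List<TemplateNode> children();
}

// Illustrative element only; this is not FreeMarker's real TemplateElement.
class SimpleElement implements TemplateNode {
    private final String name;
    private TemplateNode parent;
    private final List<TemplateNode> children = new ArrayList<>();

    SimpleElement(String name) { this.name = name; }

    // Attaches a child and records this element as its parent.
    SimpleElement add(SimpleElement child) {
        child.parent = this;
        children.add(child);
        return this;
    }

    String getName() { return name; }
    @Override public TemplateNode getParent() { return parent; }
    @Override public int getChildCount() { return children.size(); }
    @Override public TemplateNode getChildAt(int index) { return children.get(index); }
    @Override public int getIndex(TemplateNode child) { return children.indexOf(child); }
    @Override public boolean isLeaf() { return children.isEmpty(); }
    @Override public List<TemplateNode> children() {
        return Collections.unmodifiableList(children);
    }
}
```

Code that only walks the template tree can be written against TemplateNode, so nothing in javax.swing has to be loaded at runtime.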

Should we implement an interface just because it exposes a contract similar to the one we need? Context matters. The fact that the EJB spec has some interface with methods a servlet wants doesn’t mean a servlet should implement it.

I hope the devs fix it in the next release.
The issue is discussed on the mailing list here: http://n4.nabble.com/Dependeny-on-javax-swing-td978818.html

In the end, I would like to say that I like FreeMarker, and that’s why I even wrote a hello world Stripes application with it.

Update 26/03/2010

The folks at FreeMarker have released a FreeMarker GAE prerelease, which can be downloaded here. It should work on GAE. Anyone interested in running FreeMarker on GAE should try it and report any issues.

See FreeMarker on GAE too.

Update 28/04/2010

No, freemarker-gae-pre2 still does not work on GAE; I have tested it on GAE 1.3.2. You will get one or both of the following exceptions:

java.lang.VerifyError: (class: freemarker/ext/jsp/FreeMarkerJspApplicationContext, method: <init> signature: ()V) Incompatible argument to function

OR

java.lang.NoClassDefFoundError: Could not initialize class freemarker.ext.jsp.PageContextFactory

See this thread and this thread. The issue has been reported to the GAE team here; vote for it.

Waiting for the FreeMarker or GAE team to come up with an explanation or a solution.

DDD without any ORM tool: is it possible?


While reading the DDD book and trying it out on a sample project that doesn’t use any ORM tool, I came across a question: is it possible to strictly implement DDD without any ORM tool?

Is there anyone here who has implemented DDD on a real project that uses no ORM, only pure JDBC?

A few days ago, I asked how to paginate and lazy load non-root objects: http://tech.groups.yahoo.com/group/domaindrivendesign/message/15925

Various alternatives and answers came up, but why do I need to find an alternative? With Hibernate I don’t need to do anything specific; I just rely on its lazy loading support.

I want to paginate because I don’t want to load an entire graph of a few thousand objects into memory; with Hibernate and lazy loading, that’s not an issue. Without an ORM, I can achieve something similar by hand-coding dynamic proxies for lazy loading and letting the repositories handle it.
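
A rough sketch of such a hand-coded proxy, using java.lang.reflect.Proxy. LazyList is a hypothetical helper name, and the Supplier stands in for whatever JDBC query a repository would run; neither comes from the mailing-list discussion:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical helper: wraps a loader in a List proxy that touches the
// data source only when the list is first used.
final class LazyList {
    private LazyList() {}

    @SuppressWarnings("unchecked")
    static <T> List<T> of(Supplier<List<T>> loader) {
        InvocationHandler handler = new InvocationHandler() {
            private List<T> target; // stays null until the first method call

            @Override
            public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
                if (target == null) {
                    target = loader.get(); // e.g. a paged JDBC query in a real repository
                }
                try {
                    return method.invoke(target, args);
                } catch (InvocationTargetException e) {
                    throw e.getCause(); // surface the real exception to the caller
                }
            }
        };
        return (List<T>) Proxy.newProxyInstance(
                LazyList.class.getClassLoader(), new Class<?>[] {List.class}, handler);
    }
}
```

A repository could hand the aggregate a LazyList instead of a loaded collection, so the domain class still sees a plain java.util.List. Note that this sketch is not thread-safe and loads the whole collection on first access; pagination would have to live inside the loader.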

However, the second point is more important.

The second question is related to repositories, transitive persistence and dirty checking.
The DDD book (Chapter 6 – repositories) says “it allows freely switching persistence strategies at any time.”

Is that really possible, and easy, without modifying the domain at all?

How about dirty checking and persistence by reachability without any ORM?
When you save or update an aggregate root, the entire graph should be saved and/or updated. How will the repository determine which objects were modified, inserted or deleted?

You will probably need something like isNew() and isStale() inside every entity to find out whether it needs to be included in an insert or update, and you will need to set flags whenever a setter or any other method changes the state of the object. I have seen a similar approach in a project that uses DDD with JDBC.

Several answers came up:

Get the data from the database and determine updates, deletes, and inserts in code.

This may not be the best routine, but I use a boolean flag to determine what needs persisted. I have a property in each entity called .IsDirty. When anything inside the aggregate root needs persisted, I use the root’s repository and call root.Persist and I go through the root. The root repository then loops through each entity and tests the .IsDirty flag
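
The flag-based approach described in that answer might look roughly like the following in Java. All names (TrackedMessage, TrackedMessageRepository, the executed list standing in for JDBC calls) are hypothetical, and the sketch covers inserts and updates only:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical entity that tracks its own persistence state.
class TrackedMessage {
    private Long id;          // null until the row has been inserted
    private String subject;
    private boolean dirty;

    TrackedMessage(Long id, String subject) {
        this.id = id;
        this.subject = subject;
    }

    void rename(String newSubject) {
        if (!newSubject.equals(subject)) {
            subject = newSubject;
            dirty = true;     // every state-changing method must remember this
        }
    }

    boolean isNew() { return id == null; }
    boolean isDirty() { return dirty; }
    void assignId(long newId) { id = newId; }
    void markClean() { dirty = false; }
    String getSubject() { return subject; }
}

// Hypothetical repository: picks INSERT, UPDATE or nothing per entity.
class TrackedMessageRepository {
    final List<String> executed = new ArrayList<>(); // stands in for JDBC statements
    private long nextId = 100;                       // stands in for generated keys

    void saveAll(List<TrackedMessage> messages) {
        for (TrackedMessage m : messages) {
            if (m.isNew()) {
                executed.add("INSERT " + m.getSubject());
                m.assignId(nextId++);
            } else if (m.isDirty()) {
                executed.add("UPDATE " + m.getSubject());
            }
            m.markClean();
        }
    }
}
```

The persistence bookkeeping (id, dirty, markClean) lives inside the entity, which is exactly the leak into the domain model that this post is questioning.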

Do you see any pattern in these answers?

  • They say either modify your domain model or write something like your own little ORM to do these things.
  • That suggests DDD is tightly bound to an ORM, and that “freely switching persistence technology” means switching from one ORM to another (which could be your own).

What about delete?
Let’s say the root has a collection of associated entities (in my example, MessageFolder has a collection of messages).

I can do something like
MessageFolder.remove(Message message)

That would remove the message from the collection. When saving the MessageFolder, how can the repository find out which messages were deleted? Either the message folder needs to maintain a collection of removed messages, or the repository needs the old message folder for comparison. The first option forces me to modify MessageFolder, which means the domain model becomes aware of the underlying persistence technology. The second option needs to load the old entity, which would affect performance.
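
A minimal sketch of the second option, simplified so that the folder holds message ids rather than Message objects. MessageFolderDiff is a hypothetical helper, and storedIds is assumed to come from an id-only query against the database:

```java
import java.util.LinkedHashSet;
import java.util.Set;

// The domain object stays persistence-ignorant: no "removed" list, no flags.
class MessageFolder {
    private final Set<Long> messageIds = new LinkedHashSet<>();

    void add(long messageId) { messageIds.add(messageId); }
    void remove(long messageId) { messageIds.remove(messageId); }

    // Returns a copy so callers cannot modify the folder's own set.
    Set<Long> getMessageIds() { return new LinkedHashSet<>(messageIds); }
}

// Hypothetical repository helper: compares the ids currently stored for the
// folder with the ids the in-memory folder holds at save time.
class MessageFolderDiff {
    final Set<Long> toInsert;
    final Set<Long> toDelete;

    MessageFolderDiff(Set<Long> storedIds, MessageFolder current) {
        Set<Long> now = current.getMessageIds();
        toInsert = new LinkedHashSet<>(now);
        toInsert.removeAll(storedIds);   // in the folder, missing from storage
        toDelete = new LinkedHashSet<>(storedIds);
        toDelete.removeAll(now);         // in storage, removed from the folder
    }
}
```

Fetching only the ids is lighter than reloading the whole old entity, but it still costs one extra query per save, so the performance concern does not go away.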

It might be possible to switch from a no-ORM application to an ORM, but what about switching from an ORM to no ORM? Does that mean I will have to add custom behavior to my entities to figure out whether they are new, modified or deleted?

These are a few questions that I am not able to answer myself, and I am seeking some expert thoughts. If you have done DDD on a real-life project that does not use any ORM, please comment.

Some people said CQRS and domain events can solve this problem; however, I have yet to find a working example.

There is a sample DDD application created by the DDD community. It uses Hibernate in the persistence layer.
I wish that some day the DDD community would release a sample application that works on pure JDBC, as there are still lots of companies that use bare JDBC and no ORM at all.

See the thread: http://tech.groups.yahoo.com/group/domaindrivendesign/message/16021