A week ago, we encountered a funny problem where our Tapestry 3.0 application seemed to screw up the encoding of form posts. Every time we tried to post a form with diacritics in the input fields, the data got mangled before reaching the application code.
As it turned out, somebody had turned on the RequestDumperValve in the Tomcat configuration file. The request dumper not only dumps the request, but is also kind enough to mangle the data before handing it over to the servlet for further processing:
“Enabling the RequestDumperValve in both 5.5.12 and 5.0.16 (!) messes up the parsing of other-than-ISO-8859-1 incoming parameters.
After using a rather huge bunch of hours, this came down as the result: when this “debug valve” is turned on, it seems to default to ISO-8859-1 when it parses and log-outputs the incoming parameters, thus also implicitly setting the entire Request-object to this enc, so any subsequnt setting to UTF-8 doesn’t matter at all. At least this is true for POST parameters.
For GET parameters, the situation is a little different. Here an explicit setting of URIEncoding to UTF-8 seems to work as it should, while useBodyEncodingForURI doesn’t – it picks up the wrong already implicitly set encoding. (For 5.0.16 I can’t seem to get the latter version to work, and have to use the explicit setting.)
Sorry if my analysis doesn’t hold water, but at least the bug seems to be very consistent.”
(quoted from a post by Endre Stølsvik found in the mail archive)
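To make the effect concrete, here is a minimal Java sketch of what happens when UTF-8 form bytes get decoded as ISO-8859-1. This is my own illustration, not Tomcat code: the class and method names are made up, and it only simulates the decoding mismatch described in the quote.

```java
import java.nio.charset.StandardCharsets;

public class EncodingMismatchDemo {

    // Simulates a container that receives UTF-8 bytes from the browser
    // but decodes them as ISO-8859-1 (hypothetical helper, for illustration).
    public static String decodeAsLatin1(String original) {
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    // Reverses the damage. This round trip is lossless because ISO-8859-1
    // maps every byte value to exactly one character.
    public static String repair(String mangled) {
        byte[] rawBytes = mangled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(rawBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Stølsvik", written with an escape so the source file encoding does not matter
        String name = "St\u00f8lsvik";
        String mangled = decodeAsLatin1(name);

        // The two-byte UTF-8 sequence for "ø" turns into two separate characters
        System.out.println(mangled.length());              // prints 9 (the original has 8 characters)
        System.out.println(repair(mangled).equals(name));  // prints true
    }
}
```

The repair step is only useful as a diagnostic: if re-encoding a mangled parameter this way gives back the text you typed, the request was most likely parsed as ISO-8859-1 somewhere before your code saw it.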
So if you have strange UTF-8 problems with Tomcat, see if the following line is in your server.xml file:
<Valve className="org.apache.catalina.valves.RequestDumperValve"/>
If this line is there, comment it out, restart Tomcat and try again. Chances are that your problem will have disappeared.
The most irritating part of this whole adventure is that the culprit was spotted by a colleague with whom I had used the RequestDumperValve a year or so ago to fix some other problem, and back then we had already discovered exactly this behaviour. If only I had blogged about it then… Good thing Joris’ memory is better than mine.