Searches, charset, and Words.pm
The input of non-ASCII characters only works as intended
if the reader has the correct charset selected in their browser.
For example, a user who has UTF-8 selected and inputs non-US-ASCII
characters will get strange results.
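To make the "strange results" concrete, here is a minimal sketch of what happens when a browser submits UTF-8 and the server reads the bytes as ISO-8859-1. It uses the core Encode module. The variable names are only for illustration and are not taken from Faq-O-Matic.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# One character as the user typed it: e-acute (U+00E9).
my $typed = "\x{e9}";

# A browser set to UTF-8 submits it as two bytes, 0xC3 0xA9.
my $submitted = encode('UTF-8', $typed);

# A server that assumes ISO-8859-1 reads each byte as its own character.
my $misread = decode('ISO-8859-1', $submitted);

my @code_points = map { sprintf('U+%04X', ord($_)) } split(//, $misread);

printf "bytes submitted: %d\n", length($submitted);    # 2
printf "characters seen: %d\n", length($misread);      # 2
print  "code points: @code_points\n";                  # U+00C3 U+00A9
```

The single accented letter arrives as two unrelated Latin-1 characters, so a search for the original word no longer matches.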
By default, Faq-O-Matic sends out documents without a charset specification. This means they are displayed using whatever default encoding ("charset") the user has selected in their browser, and anything the user submits is also encoded in that charset. This is a bad idea, especially after CERT advisory CA-2000-02: http://www.cert.org/advisories/CA-2000-02.html
Admins should configure their server to send out .html files
with their selected encoding, e.g. in Apache:

    AddType "text/html; charset=iso-8859-1" html

Care should also be taken that dynamically generated documents send an appropriate HTTP charset attribute (recent versions of CGI.pm will do this). I'm still working on this aspect with my own FAQomatics, but I thought this should be on record, as I hadn't seen it mentioned.
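For dynamically generated pages, the idea is simply that the script's Content-Type header names the charset. A minimal sketch follows, assuming a plain CGI script that writes its own header. The helper html_header is hypothetical and not part of Faq-O-Matic. Where CGI.pm is installed, its header() call accepts a -charset argument for the same purpose.

```perl
use strict;
use warnings;

# Hypothetical helper (not from Faq-O-Matic): build a CGI response header
# that states the charset explicitly. Defaults to iso-8859-1, matching the
# Apache example; this default is an assumption, pick what your data uses.
sub html_header {
    my ($charset) = @_;
    $charset = 'iso-8859-1' unless defined $charset;
    return "Content-Type: text/html; charset=$charset\r\n\r\n";
}

# The header must come before any page content.
print html_header('iso-8859-1');
print "<html><body><p>caf\351</p></body></html>\n";   # \351 is e-acute in ISO-8859-1
```

Whatever charset is declared here should match both the static pages and the bytes the script actually emits.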
In Words.pm the following code appears:

    $string =~ s/[()'-]//g;
    $string =~ tr/A-Z/a-z/; # 7-bit ASCII

This (including the introductory s///) can be replaced by a single tr, as follows. IMHO it is much clearer and more compact (and incidentally more efficient, though I wouldn't brag about efficiency if it made the code pointlessly inscrutable...):

    $string =~ tr/A-Z\300-\326\330-\336()'-/a-z\340-\366\370-\376/d;

Besides deleting the same punctuation, it also folds the ISO-8859-1 accented capitals (octal 300-326 and 330-336) to lower case, which the 7-bit version leaves untouched.

cheers
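For anyone who wants to check the two Words.pm variants side by side, here is a small sketch. The sub names normalize_ascii and normalize_latin1 are made up for the comparison, and it assumes the strings are ISO-8859-1 bytes rather than UTF-8.

```perl
use strict;
use warnings;

# Existing two-step code from Words.pm: strip ( ) ' - and then
# lower-case the 7-bit ASCII letters only.
sub normalize_ascii {
    my ($string) = @_;
    $string =~ s/[()'-]//g;
    $string =~ tr/A-Z/a-z/;
    return $string;
}

# Suggested single tr: the first 56 search characters map to lower case,
# and the 4 trailing ones ( ) ' - have no counterpart, so /d deletes them.
sub normalize_latin1 {
    my ($string) = @_;
    $string =~ tr/A-Z\300-\326\330-\336()'-/a-z\340-\366\370-\376/d;
    return $string;
}

my $sample = "(Don't) RE-\311DIT";    # \311 is E-acute in ISO-8859-1

my $old = normalize_ascii($sample);   # accented capital is left as is
my $new = normalize_latin1($sample);  # accented capital becomes \351

print "old leaves the accented capital alone\n" if $old eq "dont re\311dit";
print "new folds it to lower case\n"            if $new eq "dont re\351dit";
```

The ranges skip octal 327 and 337 (the multiplication sign and sharp s), which have no upper/lower counterpart 32 positions away.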