You want to make sure that data flowing into your program has a consistent character encoding so you can handle it properly.
For example, you want to treat all incoming submitted form data as UTF-8.
You can’t guarantee that browsers will respect the instructions you give them with regard to character encoding, but you can do a number of things that make well-behaved browsers generally follow the rules.
First, send a Content-Type header that includes a charset, such as Content-Type: text/html; charset=utf-8. This is a good hint to a browser that submitted forms should be encoded using the character encoding the header specifies.
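As a minimal sketch of the header hint, the following Python helper pairs a UTF-8 encoded body with a matching Content-Type header. The function name build_html_response is hypothetical and not tied to any framework; a real application would hand these values to whatever server or framework API it uses.

```python
def build_html_response(body_text):
    """Return (headers, body_bytes) for an HTML page served as UTF-8.

    build_html_response is a hypothetical helper used for illustration only.
    """
    # Encode the body with the same encoding the header advertises.
    body = body_text.encode("utf-8")
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        # Content-Length counts bytes, not characters.
        ("Content-Length", str(len(body))),
    ]
    return headers, body
```

The point of the sketch is that the declared charset and the actual byte encoding of the page have to agree; a header that says UTF-8 over a body encoded some other way gives the browser a misleading hint.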
Second, include an accept-charset="utf-8" attribute in the <form> elements that you output.
Although it’s not supported by all web browsers, it instructs the browser to encode the user-entered data in the form as UTF-8 before sending it to the server.
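A form carrying this attribute might be generated as in the sketch below. The render_form function, the field name comment, and the action URL passed to it are all assumptions made for illustration; only the accept-charset attribute itself comes from the text above.

```python
from html import escape


def render_form(action):
    """Return an HTML form that asks the browser to submit data as UTF-8.

    render_form and the "comment" field are hypothetical examples.
    """
    # Escape the action URL so characters such as & and " stay valid
    # inside the attribute value.
    safe_action = escape(action, quote=True)
    return (
        f'<form method="post" action="{safe_action}" accept-charset="utf-8">\n'
        '  <input type="text" name="comment">\n'
        '  <input type="submit" value="Send">\n'
        "</form>"
    )
```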
In general, browsers send back form data with the same encoding used to generate the page containing the form. So if you standardize on UTF-8 output, you can be reasonably sure that you’re always getting UTF-8 input.
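Because "reasonably sure" is not a guarantee, it can help to decode incoming data strictly so that input that is not valid UTF-8 is detected instead of silently corrupted. The sketch below assumes the form arrives as an application/x-www-form-urlencoded request body; parse_form_body is a hypothetical helper name, while urllib.parse.parse_qs is part of the Python standard library.

```python
from urllib.parse import parse_qs


def parse_form_body(raw_body):
    """Decode a URL-encoded form body, insisting on valid UTF-8.

    parse_form_body is a hypothetical helper. It raises UnicodeDecodeError
    if the body contains raw non-ASCII bytes or percent-encoded bytes that
    are not valid UTF-8.
    """
    # A correctly URL-encoded body is pure ASCII; other bytes are escaped as %XX.
    text = raw_body.decode("ascii")
    # errors="strict" makes invalid byte sequences raise instead of being
    # replaced with U+FFFD.
    return parse_qs(text, keep_blank_values=True, encoding="utf-8", errors="strict")
```

For instance, a body of name=caf%C3%A9 decodes to the value café, while name=caf%E9 (the Latin-1 encoding of the same word) raises an error that the application can catch and report.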
The accept-charset <form/> attribute is part of the HTML 4.0 specification, but is not implemented everywhere.