Thursday, 28 February 2013

Specifying the Character Set | HTML5 Tutorial pdf

Problem
You need to define the character encoding of your web page.
Solution
In your document head, add a meta declaration for the character set:
<meta charset="UTF-8" />
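As a minimal sketch of where the declaration sits, the page below places it first in the head, ahead of the title. The title and body text are placeholders and not part of the recipe. Placing the declaration early matters because the HTML specification requires it to fall within the first 1024 bytes of the document.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <!-- Declare the encoding before any other head content -->
  <meta charset="UTF-8" />
  <title>Placeholder page title</title>
</head>
<body>
  <p>Placeholder content: café, naïve, 日本語</p>
</body>
</html>
```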

Discussion

The character encoding tells browsers and validators how to interpret the bytes of your page as characters when processing and rendering it.
If you do not declare the character set in your HTML, browsers first try to determine the character set from your server’s HTTP response headers (specifically, the “Content-Type” header).
The character set declared in the response headers is generally taken in preference over the character set specified in your document, and is thus the preferred method.
However, if you cannot control what headers your server sends, declaring the character set in your HTML document is the next best option.
If a character set is declared neither in the document nor in the response headers, the browser has to guess one, and the guess may be wrong for your site’s content.
A wrong guess can not only cause rendering problems but also pose a security risk, because text that looks harmless in one encoding can be interpreted as markup or script in another.
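One way to check which encoding the browser actually settled on is to read the standard `document.characterSet` property. The sketch below is only an illustration: the `encoding-report` id and the page text are hypothetical names chosen for this example.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8" />
  <title>Encoding check</title>
</head>
<body>
  <!-- Hypothetical element used only to display the result -->
  <p id="encoding-report"></p>
  <script>
    // document.characterSet reports the encoding the browser used to decode this page
    document.getElementById("encoding-report").textContent =
      "This page was decoded as: " + document.characterSet;
  </script>
</body>
</html>
```

If the server sends a different character set in its Content-Type header, the reported value should reflect the header rather than the meta declaration, which makes this a quick way to spot a mismatch.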
In previous versions of HTML, the character set needed to be declared with additional attributes and values:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
But, as with the DOCTYPE, HTML5 needs only the minimum information required by browsers. Again, this helps with backwards compatibility and makes the declaration easier for authors to write.
Special Characters
In this recipe, the page was declared as Unicode (UTF-8) because it is a versatile encoding that covers most web builders’ needs. Sometimes, though, you need to include a character that is outside the encoding or that is difficult to type directly.
You can specify such characters with a Numeric Character Reference (NCR) or a named entity to help browsers render them correctly. If you wanted a copyright symbol, for example, you could include it in your HTML as an NCR:
&#169;
Or you could include it as a named entity:
&copy;
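For comparison, the fragment below sketches the two forms side by side, along with a hexadecimal NCR, which the recipe does not mention but which refers to the same code point (U+00A9). The paragraph text is placeholder wording.

```html
<!-- Each paragraph below displays the same copyright sign -->
<p>&#169; decimal numeric character reference</p>
<p>&#xA9; hexadecimal numeric character reference</p>
<p>&copy; named entity</p>
```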
