HTML ASCII Character Sets and Entity Definitions

Posted on November 6, 2008

By default, HTML+ documents are made up of 8-bit characters from the ISO 8859 Latin-1 character set. The network protocol used to retrieve documents may translate the character set into a locally acceptable form, e.g. EBCDIC.
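As a rough illustration of that kind of translation, the sketch below re-encodes Latin-1 bytes for an EBCDIC host using Python's built-in codecs. The choice of `cp500` (one of several EBCDIC code pages) and the sample text are assumptions made for the example, not something the specification prescribes.

```python
# Minimal sketch: the same text as Latin-1 bytes and as EBCDIC bytes.
# "cp500" is an assumed EBCDIC code page; a real gateway might use another.
text = "café"

latin1_bytes = text.encode("latin-1")    # one 8-bit byte per character
ebcdic_bytes = latin1_bytes.decode("latin-1").encode("cp500")

# Translating back recovers the original text unchanged.
round_trip = ebcdic_bytes.decode("cp500")
```

The byte values differ between the two encodings, but the characters they stand for are the same, which is why the translation can happen in transit without changing the document.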

HTTP uses the MIME standard (RFC 1341) to specify the document type and character set. ISO SGML entity definitions are used to include characters which are missing from the character set or which would otherwise be confused with markup elements, e.g.:

&amp;
ampersand &

&lt;
less than sign <

&gt;
greater than sign >

&quot;
the double quote sign "
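The four entities above are the ones needed to stop text from being mistaken for markup. A minimal sketch using Python's standard `html` module follows; the helper name `escape_markup` is made up for this example.

```python
import html

def escape_markup(text: str) -> str:
    """Replace &, <, > and the double quote with their entity references."""
    # quote=True also escapes quote characters, which matters inside attributes.
    return html.escape(text, quote=True)

escaped = escape_markup('if a < b & b > c, print "ok"')
print(escaped)
```

Note that the ampersand has to be handled first, otherwise the ampersands introduced by the other replacements would themselves be escaped; `html.escape` takes care of that ordering.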

Some other useful characters that can be written as entity definitions using only 7-bit ASCII are:

&ndash;
en dash – (half the width of an em unit)

&mdash;
em dash — (equal to width of an “m” character)

&ensp;
en space

&emsp;
em space

&nbsp;
non-breaking space

&shy;
soft hyphen (normally invisible)

&copy;
copyright sign ©

&trade;
trade mark sign ™

&reg;
registered sign ®
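To see which characters these names stand for, the sketch below looks each one up in Python's standard entity table and decodes a short sample string. The helper `code_point` and the sample string are invented for the example; the table itself reflects the HTML entity set that Python ships, which is assumed here to agree with the ISO names listed above.

```python
import html
from html.entities import name2codepoint

# Entity names taken from the list above.
NAMES = ["ndash", "mdash", "ensp", "emsp", "nbsp", "shy", "copy", "trade", "reg"]

def code_point(name: str) -> str:
    """Return the code point for an entity name, formatted as U+XXXX."""
    return f"U+{name2codepoint[name]:04X}"

for name in NAMES:
    print(f"&{name};", code_point(name))

# html.unescape replaces entity references with the characters they name.
decoded = html.unescape("&copy; 2008 Example&trade;")
```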

The ISO defines a large number of entities, covering most languages as well as symbols for publishing and mathematics. Requiring all browsers to support these would be impractical: how, for example, should a dumb terminal show such symbols? In some cases there will be accepted ways of mapping them to normal characters, e.g. æ as ae and è as e. Perhaps the safest recommendation is that where authors need to use a specialised character or symbol, they should use ISO entity names rather than inventing their own. Browsers should leave unrecognised entity names untranslated.
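One way a text-only browser might follow that advice is sketched below. Everything here is an assumption for illustration: the function name `render_for_terminal`, the tiny fallback table (only the two mappings mentioned above), and the choice to leave alone any entity that has no ASCII rendering.

```python
import re
from html.entities import name2codepoint

# Hypothetical fallback table: just the two mappings mentioned in the text.
ASCII_FALLBACKS = {"aelig": "ae", "egrave": "e"}

ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")

def render_for_terminal(text: str) -> str:
    """Translate entities a 7-bit terminal can show; leave the rest as-is."""
    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name in ASCII_FALLBACKS:
            return ASCII_FALLBACKS[name]
        code = name2codepoint.get(name)
        if code is not None and code < 128:
            return chr(code)      # e.g. &amp; or &lt;
        return match.group(0)     # unrecognised or unshowable: untranslated
    return ENTITY_RE.sub(replace, text)

print(render_for_terminal("Encyclop&aelig;dia &amp; cr&egrave;me &bogus;"))
```

Leaving the unknown name in place means the reader at least sees what the author intended, rather than a silently dropped character.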

Category: Webmastering.
