2.2 Text Representation


Three types of text are used to produce pages of documents (Halsall, 2001):
Unformatted (plain) text:
This text consists of fixed-size characters drawn from a limited character set. The American
Standard Code for Information Interchange (ASCII) is an example of a character set used to
represent plain text.
Teletext is one application of plain text. It is used to broadcast pages of information to a
standard television set.
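ASCII is a 7-bit code with only 128 characters, which is not enough for international use. The
Unicode character set was originally designed with 16 bits per character; it now defines code
points up to U+10FFFF, which are stored using the UTF-8, UTF-16, or UTF-32 encodings.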
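The difference between the two character sets can be sketched in a few lines of Python. This is only an illustration: the sample strings and variable names are assumptions chosen for the example, not part of the original notes.

```python
# Illustrative sketch: how ASCII and Unicode encodings store text.
# The sample strings below are made up for this example.
plain = "Text"
international = "Grüße €"  # 7 characters, 3 of them outside ASCII

# Every ASCII code fits in 7 bits, so each byte value is below 128.
ascii_bytes = plain.encode("ascii")
all_seven_bit = all(b < 128 for b in ascii_bytes)

# ASCII cannot represent the international sample at all.
try:
    international.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# UTF-16 stores each of these characters in 16 bits (2 bytes);
# UTF-8 uses between 1 and 3 bytes for the same characters.
utf16_len = len(international.encode("utf-16-le"))
utf8_len = len(international.encode("utf-8"))

print(all_seven_bit, ascii_ok, utf16_len, utf8_len)
```

The plain string encodes cleanly as ASCII, while the international string is rejected by ASCII and needs a Unicode encoding.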
Formatted text:
Formatted text is used to produce papers, books, and journals. The characters can have
different sizes and styles (bold, italic, and underline).
Hypertext:
Hypertext makes it possible to produce an integrated set of documents that have defined
linkages between them. Each document is called a page, and the defined linkages between the
pages are called hyperlinks.

The most common example of hypertext is HyperText Markup Language (HTML). HTML commands
are written as tags enclosed in angle brackets (< >), usually in opening and closing pairs.
Extensible Markup Language (XML) is a related markup language that also uses tags. XML is
designed to transport and store data rather than to display it.
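As a minimal sketch of the two markup languages, the Python fragment below reads a hyperlink from a small HTML page and a data record from a small XML document. It uses only the standard-library modules html.parser and xml.etree.ElementTree. The page content, the file name page2.html, the XML record, and the class name LinkCollector are all hypothetical and chosen only for illustration.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical HTML page containing one hyperlink to another page.
PAGE = '<html><body><p>See <a href="page2.html">the next page</a>.</p></body></html>'


class LinkCollector(HTMLParser):
    """Collects the target of every <a href="..."> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)


collector = LinkCollector()
collector.feed(PAGE)
links = collector.links

# Hypothetical XML record: the tags describe the data, not its appearance.
RECORD = "<book><title>Sample Title</title><year>2001</year></book>"
root = ET.fromstring(RECORD)
title = root.find("title").text
year = int(root.find("year").text)

print(links, title, year)
```

In the HTML page the tags mark up a document and its link to another page, whereas in the XML record the tags only label the stored values.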