12.3 The World Wide Web and Uniform Resource...


Most web browsers are capable of using protocols other than HTTP, which is the basic protocol of the Web. For example, these browsers are usually also Gopher and FTP clients or are capable of using your existing Telnet and FTP clients transparently (without it being obvious to the user that an external program is starting). Many of them are also Network News Transfer Protocol (NNTP) and SMTP clients. They use a single, consistent notation called a Uniform Resource Locator (URL) to specify connections of various types (Zwicky, Cooper and Chapman 2000).

Examples

The following example URIs illustrate several URI schemes and variations in their common syntax components (Berners-Lee, et al. 2005):

ftp://ftp.is.co.za/rfc/rfc1808.txt
http://www.ietf.org/rfc/rfc2396.txt
ldap://[2001:db8::7]/c=GB?objectClass?one
mailto:John.Doe@example.com
news:comp.infosystems.www.servers.unix
telnet://192.0.2.16:80/
urn:oasis:names:specification:docbook:dtd:xml:4.1.2
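To make the shared syntax components visible, the example URIs above can be split with a general-purpose parser. The following is a minimal sketch using `urlsplit` from Python's standard library; the helper name `summarize` and the dictionary keys are illustrative choices, not part of the cited specification.

```python
from urllib.parse import urlsplit

# The example URIs listed above, copied verbatim.
EXAMPLE_URIS = [
    "ftp://ftp.is.co.za/rfc/rfc1808.txt",
    "http://www.ietf.org/rfc/rfc2396.txt",
    "ldap://[2001:db8::7]/c=GB?objectClass?one",
    "mailto:John.Doe@example.com",
    "news:comp.infosystems.www.servers.unix",
    "telnet://192.0.2.16:80/",
    "urn:oasis:names:specification:docbook:dtd:xml:4.1.2",
]


def summarize(uri):
    """Return the generic components of a URI (hypothetical helper)."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,     # text before the first ":"
        "authority": parts.netloc,  # present only after "//"
        "path": parts.path,
        "query": parts.query,       # text after the first "?"
    }


for uri in EXAMPLE_URIS:
    print(summarize(uri))
```

Every example has a scheme, but only the URIs written with "//" carry an authority component; for `mailto:`, `news:`, and `urn:` everything after the scheme is reported as the path.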

Uniform Resource Identifier

A Uniform Resource Identifier (URI) is a compact sequence of characters that identifies an abstract or physical resource.
URIs are characterized as follows (Berners-Lee, et al. 2005):

Uniform: Uniformity provides several benefits. It allows different types of resource identifiers to be used in the same context, even when the mechanisms used to access those resources may differ. It allows:
    Uniform semantic interpretation of common syntactic conventions across different types of resource identifiers.
    Introduction of new types of resource identifiers without interfering with the way that existing identifiers are used.
    The identifiers to be reused in many different contexts, thus permitting new applications or protocols to leverage a preexisting, large, and widely used set of resource identifiers.

Resource: The term "resource" is used in a general sense for whatever might be identified by a URI. Familiar
examples include an electronic document, an image, a source of information with a consistent purpose (e.g.,
"today's weather report for Los Angeles"), a service (e.g., an HTTP-to-SMS gateway), and a collection of other
resources. Abstract concepts can be resources, such as the operators and operands of a mathematical equation,
the types of a relationship (e.g., "parent" or "employee"), or numeric values (e.g., zero, one, and infinity).
Identifier: An identifier embodies the information required to distinguish what is being identified from all other
things within its scope of identification. Our use of the terms "identify" and "identifying" refers to this purpose of
distinguishing one resource from all other resources, regardless of how that purpose is accomplished (e.g., by
name, address, or context). These terms should not be mistaken as an assumption that an identifier defines or
embodies the identity of what is referenced, though that may be the case for some identifiers. Nor should it be
assumed that a system using URIs will access the resource identified: in many cases, URIs are used to denote
resources without any intention that they be accessed. Likewise, the "one" resource identified might not be
singular in nature (e.g., a resource might be a named set or a mapping that varies over time).

A URI can be further classified as a locator, a name, or both. The term "Uniform Resource Locator" (URL) refers to the subset of URIs that, in addition to identifying a resource, provides a means of locating the resource by describing its primary access mechanism (e.g., its network "location"). The term "Uniform Resource Name" (URN) has been used historically to refer to both URIs under the "urn" scheme, which are required to remain globally unique and persistent even when the resource ceases to exist or becomes unavailable, and to any other URI with the properties of a name (Berners-Lee, et al. 2005).
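The distinction can be approximated in code, though only roughly. The sketch below is a heuristic assumed for illustration, not a rule taken from the cited specification: it treats URIs under the "urn" scheme as names, treats URIs that carry a network authority as locators, and leaves everything else undetermined. The function name `classify` and its return labels are hypothetical.

```python
from urllib.parse import urlsplit


def classify(uri):
    """Rough locator/name heuristic (an assumption, not a standard rule)."""
    parts = urlsplit(uri)
    if parts.scheme == "urn":
        # URIs under the "urn" scheme are intended as persistent names.
        return "name"
    if parts.netloc:
        # An authority component points at a network location.
        return "locator"
    # Schemes such as mailto: or news: are not decided by this heuristic.
    return "undetermined"


print(classify("urn:oasis:names:specification:docbook:dtd:xml:4.1.2"))
print(classify("http://www.ietf.org/rfc/rfc2396.txt"))
print(classify("mailto:John.Doe@example.com"))
```

Because a URI may be a locator, a name, or both, a three-way label like this is a simplification; it is meant only to show which syntactic cues the two terms usually correspond to.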

URL components

A URL may consist of two or more components, depending on the location of the resource (often a file). These components may include (Banerjee 2004):

Name of the protocol or scheme by which the resource can be located, such as HTTP, FTP, TELNET, File, News, and Mailto.
Name of the server or domain on which the resource is located, such as www.bits-pilani.ac.in, www.mediu.edu.my/, and magazine.mediu.edu.my/.
Path of the file to be located, such as admissions/postgraduate.html.
Filename (postgraduate.html in the previous example).
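The components listed above can be extracted programmatically. The following is a minimal sketch using Python's standard library; the URL is hypothetical, assembled from the server name and path used as examples in the text, and may not correspond to a real page. The helper name `url_components` and its dictionary keys are likewise illustrative.

```python
import posixpath
from urllib.parse import urlsplit


def url_components(url):
    """Split a URL into the components described above (hypothetical helper)."""
    parts = urlsplit(url)
    # URL paths always use "/" as the separator, so posixpath is appropriate.
    directory, filename = posixpath.split(parts.path)
    return {
        "scheme": parts.scheme,    # protocol, e.g. http or ftp
        "server": parts.hostname,  # server or domain name
        "path": parts.path,        # full path of the file on the server
        "directory": directory,    # path without the final segment
        "filename": filename,      # final path segment
    }


# Hypothetical URL combining the example server and path from the text.
example = "http://www.bits-pilani.ac.in/admissions/postgraduate.html"
print(url_components(example))
```

Note that the parser reports the path with a leading "/", whereas the list above writes it without one; both refer to the same location on the server.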