The Internet

What it is

The Internet is a world-wide network of computers which are linked by telephone cables, satellites and microwave transmitters. Internet Service Providers (ISPs) lease the cables from the telephone companies which own them. The availability of the Internet depends on the quality of a country’s phone system and most less developed countries have little or no access to it. As much economic development now depends on communications systems, most countries are now investing heavily in modern telephone systems - Vietnam, for example, is currently spending US$3 billion upgrading its telephone infrastructure. One reason for the privatisation of BT was to improve its competitiveness in world markets.

History

The Internet started out as a small number of computers on military bases in the United States in the 1960s. This system, at first called ARPANET, was designed to withstand military attack by Soviet missiles, not by being bomb-proof but by having lots of links between the various sites. If a direct link between computers A and B was broken, data could be routed via computer C, or computers D and E, and so on. Such networks are said to have a high degree of redundancy because there are many ways of linking any two computers.
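The rerouting idea can be sketched in a few lines of Python. This is an illustration only: the five-site network, the names NETWORK, find_route and without_link, and the use of a breadth-first search are assumptions made for the example, not a description of how ARPANET actually chose routes.

```python
from collections import deque

def find_route(links, start, goal):
    """Breadth-first search: return a route with the fewest hops, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        # sorted() only makes the order of exploration repeatable
        for neighbour in sorted(links.get(node, ())):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None

def without_link(links, a, b):
    """Return a copy of the network with the link between a and b broken."""
    return {node: {n for n in peers if {node, n} != {a, b}}
            for node, peers in links.items()}

# Hypothetical network of five sites; each entry lists a site's direct links.
NETWORK = {
    "A": {"B", "C", "D"},
    "B": {"A", "C", "E"},
    "C": {"A", "B"},
    "D": {"A", "E"},
    "E": {"B", "D"},
}

print(find_route(NETWORK, "A", "B"))                          # direct link
print(find_route(without_link(NETWORK, "A", "B"), "A", "B"))  # via C
```

Breaking the link between A and C as well still leaves the route through D and E, which is the redundancy described above.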

Prodigious Growth

ARPANET, or the Internet as it became, expanded to include research institutions and universities all over America and around 1980 it went international when a link was made to Holland. The rest, as they say, is history, as the Internet grew at an increasing rate. Even in 1990 it was relatively small, but two things happened then to bring it out of its military, academic and geeky backwater into the world of commerce and ordinary people. The first was the development of a hypertext ‘web’ at CERN in Geneva by Tim Berners-Lee which subsequently became the World Wide Web. The second was the development of a ‘web browser’ (‘Mosaic’) at the National Center for Supercomputing Applications (NCSA) in America. The Web looked and behaved like the Windows or Macintosh graphical user interface (GUI), which made it much easier for ordinary people to connect to and use the Internet.

The Backbone

The ‘Internet’ is a series of networks with links between them, hence the name. The links between the networks which make up the Internet consist of high bandwidth ‘data pipes’ which together make the ‘backbone’ of the Internet. There are fast, high bandwidth links between the most developed countries in the world and between the major cities in each country. There are links between Europe, North America, South America, South East Asia, Japan and Australia but currently very little in Africa or Central Asia. A map of the Internet looks like a map of roads or airline routes with heavy traffic along national and international routes falling to a trickle in the boondocks.

Megabits and Gigabits

There are links of 45Mbps (megabits per second; in data rates a megabit is 10^6 bits, not 2^20) between major centres, slower links of 2.048Mbps out to the smaller centres and 64kbps lines to the smallest and most remote places. MCI in North America runs a backbone at 622Mbps, UUNET UK is developing a 155Mbps link and there are 100Mbps links to Europe from London via the European backbone or ‘Ebone’. Lines with a capacity of 2.4Gbps (gigabits per second; a gigabit is 10^9 bits, not 2^30) and 9.6Gbps are under development, though they may cost more than even the biggest companies can afford, hence the need for mergers such as that between BT and American MCI. Investment in such things may keep some countries ahead of the others in the world economic system of the 21st century. Telecommunications infrastructure rates alongside roads, health and education as a key strategic resource for the future well-being of any country.
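To give a feel for what these figures mean, the sketch below estimates how long one megabyte would take to cross links of some of the speeds mentioned. It is a rough calculation: the function name transfer_seconds is hypothetical, a megabyte is assumed to be 1,000,000 bytes, and protocol overhead and congestion are ignored.

```python
def transfer_seconds(size_bytes, link_bits_per_second):
    """Ideal transfer time: 8 bits per byte, no overhead, no congestion."""
    return size_bytes * 8 / link_bits_per_second

ONE_MEGABYTE = 1_000_000  # assumption: decimal megabyte

# Link speeds taken from the text, written out in bits per second.
LINKS = [
    ("64kbps", 64_000),
    ("45Mbps", 45_000_000),
    ("622Mbps", 622_000_000),
    ("2.4Gbps", 2_400_000_000),
]

for label, rate in LINKS:
    print(f"{label:>8}: {transfer_seconds(ONE_MEGABYTE, rate):.4f} seconds")
```

On the 64kbps line the megabyte takes just over two minutes; on the fastest links it takes a few thousandths of a second.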

LINX and D-GIX

The various data cables meet at switches or ‘routers’, which are controlled by powerful computers. Every packet of data sent across the Internet has to be routed to its destination and it is the role of these high speed routers and their computers to decode the address and select the best route currently available. These routers handle hundreds of thousands of such decisions per second so they are very busy. The main routing centre in the UK is Telehouse in London’s docklands, otherwise known as LINX or the London Internet Exchange. Further exchanges are planned, two in London and one in Manchester, to provide the sort of redundancy built into ARPANET at the start - with just one centre the national Internet system would be vulnerable to catastrophic events such as a plane crash. Calls to computers around the world are routed through LINX while calls to computers within the UK may be handled either by LINX or by direct, peer-to-peer links between ISPs. As well as LINX in London there is a ‘D-GIX’ (Distributed Global Internet Exchange) in Paris and Stockholm and similar centres in major American cities.

POPs

Each of the places where the slow lines meet the backbone is a ‘Point of Presence’ (POP) which is where ISPs connect their clients to the Internet proper. The larger ISPs join LINX and provide their customers with direct links to the fastest part of the Internet. LINX currently has 33 members including BT, Cable Online, Compuserve, Demon, Direct Connection, Easynet, IBM Global Network, Mercury, Netcom, Planet Online, UUNET Pipex and the educational JANET network. These larger organisations have their own customers and they also let out bandwidth to smaller ISPs.

TCP/IP

All computer networks use protocols for exchange of data between different physical devices. The protocols used on the Internet are TCP/IP (Transmission Control Protocol and Internet Protocol). Data on the Internet is broken into ‘packets’ which are transmitted independently and reassembled at the computer which receives them. The IP level of the TCP/IP protocol provides the addressing functions which enable routers to send the packets to the next stage of their journey. Every computer using the Internet has a unique IP address which allows it to communicate with other computers. IP addresses may be fixed to a single computer or they may be allocated dynamically from a block of addresses when a computer requests use of the Internet. The TCP level deals with error detection, such as identifying missing packets, and with requests for retransmission.
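The idea of breaking data into packets, spotting any that are missing and putting the rest back in order can be sketched as follows. This is a much simplified model and not real TCP, which numbers bytes rather than packets and adds checksums and acknowledgements; the function names and the 8-byte packet size are assumptions made for the example.

```python
import random

def split_into_packets(data, size):
    """Break data into (sequence number, chunk) pairs of at most `size` bytes."""
    return [(seq, data[i:i + size])
            for seq, i in enumerate(range(0, len(data), size))]

def missing_packets(packets, expected_count):
    """List the sequence numbers that never arrived."""
    received = {seq for seq, _ in packets}
    return [seq for seq in range(expected_count) if seq not in received]

def reassemble(packets):
    """Sort by sequence number and join the chunks back together."""
    return b"".join(chunk for _, chunk in sorted(packets))

message = b"Data on the Internet is broken into packets"
packets = split_into_packets(message, 8)

# Packets may arrive in any order; a fixed seed keeps the demo repeatable.
arrived = packets[:]
random.Random(1).shuffle(arrived)
print(reassemble(arrived) == message)

# Simulate the loss of packet 2: the receiver can tell which one to ask for.
lossy = [p for p in arrived if p[0] != 2]
print(missing_packets(lossy, len(packets)))
```

The receiver would request retransmission of the packets reported as missing before reassembling the message.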

The current version of the Internet Protocol, IPv4, makes provision for 2^32 addresses (more than 4 billion). An IP address is divided into four parts, each of 256 values (0..255), for example 234.127.68.120. This is a convenient shorthand for a single number in the range 0 to 2^32-1. The right-hand part on its own can represent addresses in the range 0..255; together with the next part to the left it covers the range 0..65,535; adding the third part extends the range to 16,777,215; and all four parts together cover the range 0..4,294,967,295. The total number of addresses available in IPv4 is, therefore, 4,294,967,296, or 256^4 or 2^32 (the largest address is 2^32-1 because counting starts at 0). This is easier to follow if you understand binary and hexadecimal; the decimal number is presented in four parts for those who don't.
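The relationship between the four-part notation and the single underlying number can be shown with a short Python sketch. The function names dotted_to_int and int_to_dotted are hypothetical; Python's standard ipaddress module offers the same conversion and is used here only as a cross-check.

```python
import ipaddress

def dotted_to_int(address):
    """Convert a four-part address such as '234.127.68.120' to one number."""
    parts = [int(p) for p in address.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError(f"not a valid four-part address: {address!r}")
    value = 0
    for part in parts:
        # Each part is one base-256 'digit', most significant first.
        value = value * 256 + part
    return value

def int_to_dotted(value):
    """Convert a number in the range 0..2**32-1 back to four-part form."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))

example = "234.127.68.120"
number = dotted_to_int(example)
print(number, hex(number))
print(int_to_dotted(number))
print(number == int(ipaddress.IPv4Address(example)))  # cross-check
```

In hexadecimal each part becomes exactly two digits (EA, 7F, 44 and 78 for the example), which is why the scheme is easier to follow in that notation.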

The account above refers to 'network devices' because it is not only computers that are connected to TCP/IP networks but also print servers and routers and, most recently, CCTV cameras, all of which are given IP addresses.

The number of devices has exceeded the number of addresses available and a new version of the Internet Protocol is being prepared to address this problem. The initial design allowed for considerable expansion, of an order that has only been realised since 1997 or so. As the Internet has expanded exponentially since then, the number of addresses has been surpassed. Many devices, however (computers, print servers, CCTV cameras, etc.), are on internal company networks and access the Internet through a router. Only the router needs a public IP address; the devices behind it use private addresses which the router translates, a technique known as network address translation (NAT). Organisations and households can therefore support many devices behind their routers, and the combined total of networked devices is far greater than 4,294,967,296.

Packets and Access Speed

Internet data such as web pages, email messages and files are broken into a number of ‘packets’ which are transmitted separately and which may follow different routes through the tangle of lines to their destination. Packet switching was devised in the 1960s, independently by Paul Baran in the United States and Donald Davies in Britain, and is distinct from circuit switching, the way in which telephone calls are traditionally routed. The packets are reassembled into correct order by software on the receiving computer. Each router through which a packet passes subtracts 1 from an 8-bit number, the ‘time to live’ (TTL), which is included in the header. When this number reaches 0 the packet is discarded, so the largest number of ‘hops’ a packet can take is 255. The more hops a packet takes the slower the journey will be, as even the fastest routers slow down transmission. Access is also slowed down by traffic which exceeds the bandwidth available, especially where ISPs sell more access than their own links to the backbone can justify. Like the number of vehicles on the roads, the situation will probably get worse in the future. Internet access is still growing rapidly in developed countries while countries with large populations like China, India and Indonesia have only a few thousand users each at present. Adding these countries to global communications will require those 9.6Gbps lines and more!
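The hop counter can be modelled in a few lines. The sketch below follows the description above rather than any particular router's software; the function name travel and its return values are assumptions made for the example.

```python
MAX_TTL = 255  # the largest value an 8-bit number can hold

def travel(routers_on_path, ttl=MAX_TTL):
    """Simulate a packet crossing a chain of routers.

    Returns (delivered, routers_reached).
    """
    for reached in range(1, routers_on_path + 1):
        ttl -= 1  # each router subtracts 1 from the counter
        if ttl == 0:
            # Counter exhausted: this router discards the packet.
            return False, reached
    return True, routers_on_path

print(travel(10))         # a short journey is delivered
print(travel(300))        # too long a path: the packet is discarded
print(travel(3, ttl=2))   # a small starting value expires early
```

The counter stops a packet caught in a routing loop from circulating for ever.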

Internet Protocols

A protocol is a set of rules for communication. There are various forms of data moving across the Internet between computers, each with its own protocol. The protocol of the Web is http or Hypertext Transfer Protocol, which defines the communication rules for hypertext documents of the sort which are used by Web browsers. Hypertext files, as you can see from any HTML file, are stored in ASCII format as a series of characters. HTML tags are interpreted by a Web browser and the text formatted on-screen according to the commands within. Movement of binary files is governed by ftp or File Transfer Protocol. Email is sent using SMTP (Simple Mail Transfer Protocol) and collected from a mail server using POP3 or IMAP. Another protocol which modern browsers understand but which is not used as much as http is gopher. Gopher was developed at the University of Minnesota in 1991 and named after the university’s mascot.
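Because http is a text-based protocol, the rules are easy to see in a request. The sketch below builds the text a browser might send to ask for a page; it is a minimal illustration, assuming the older HTTP/1.0 form of the request, and the function name build_get_request is hypothetical. Nothing is sent over the network.

```python
def build_get_request(host, path="/"):
    """Build the text of a minimal HTTP/1.0 GET request.

    Lines end with carriage return + line feed, and a blank line
    marks the end of the request headers.
    """
    lines = [
        f"GET {path} HTTP/1.0",  # method, resource wanted, protocol version
        f"Host: {host}",         # which site on the server is wanted
        "",                      # blank line: end of headers
        "",
    ]
    return "\r\n".join(lines)

request = build_get_request("www.bbc.co.uk")
print(request)
print(request.encode("ascii"))  # the bytes that would go down the line
```

The server replies in the same style: a status line, some headers, a blank line and then the HTML itself.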

DNS

DNS, or the Domain Name System, provides a way to translate names, which are easier to remember and can carry personal and brand names, into the numerical IP addresses that computers use. The link between a domain name and the numerical IP address is stored on domain name servers (held on behalf of the internet community in the USA) and may also be stored on more local servers. In a URL the domain name follows the protocol (http) and includes the address of the server and a suffix to indicate the type of service - .com, .mil, .org, .net, .edu, .gov and .co. The domain name may also include the country where the server is located or registered: .uk, .fr, .es, .it, .ca, .de, to name but a few (.tv is for Tuvalu, formerly the Ellice Islands, part of the Gilbert and Ellice Islands colony in the Pacific, a domain which has gained in popularity since owners of TV channels realised the connection). The address of a resource on the internet is represented as a URL or Uniform Resource Locator (such as http://www.bbc.co.uk).
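The parts of a URL can be picked apart with Python's standard urllib.parse module, as in the sketch below. The function name describe_url and the keys of the dictionary it returns are assumptions made for the example, and no name lookup is actually performed.

```python
from urllib.parse import urlsplit

def describe_url(url):
    """Split a URL into its protocol, host name and domain labels."""
    parts = urlsplit(url)
    labels = parts.hostname.split(".")
    return {
        "protocol": parts.scheme,   # e.g. http
        "host": parts.hostname,     # the full domain name
        "labels": labels,           # the name broken at each dot
        "top_level": labels[-1],    # the right-most label, e.g. uk
    }

info = describe_url("http://www.bbc.co.uk")
for key, value in info.items():
    print(f"{key:>10}: {value}")
```

Translating the host name into an IP address needs a network connection; in Python the standard socket.gethostbyname function asks the system's resolver to do it.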

Changes to the Internet

The biggest problems facing the Internet are the lack of bandwidth and the exhaustion of IP addresses. We have seen how bandwidth may be improved, though whether to a satisfactory level remains to be seen. The current method of creating IP addresses (IPv4) uses a 32 bit number which gives potentially 4 billion addresses, but the actual number is much less because they are issued in blocks and many are unused. A new specification, IPv6 or IPng (Internet Protocol Next Generation!), uses 128 bit addressing which provides 2^128 (about 3.4 x 10^38) addresses, vastly more than enough for every individual on the planet. IPv6 may be slightly slower than IPv4 because of the longer address but it was designed with built-in support for security and encryption, and routers will no longer fragment packets. As the Internet improves, however, so we think of new ways to overload it again, with video, live audio, telephone calls and Push technology.
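The difference between 32 bit and 128 bit addressing is easier to appreciate when worked out. The sketch below does the arithmetic; the function name address_count is hypothetical and the world population figure is a rough assumption used only for scale.

```python
def address_count(bits):
    """Number of distinct addresses an address of `bits` bits can express."""
    return 2 ** bits

WORLD_POPULATION = 8_000_000_000  # rough assumption, for scale only

ipv4_total = address_count(32)
ipv6_total = address_count(128)

print(f"IPv4: {ipv4_total:,} addresses")
print(f"IPv6: {ipv6_total:,} addresses")
print(f"IPv4 addresses per person: {ipv4_total / WORLD_POPULATION:.2f}")
print(f"IPv6 addresses per person: {ipv6_total // WORLD_POPULATION:.3e}")
```

With 32 bits there is less than one address per person; with 128 bits the supply is, for practical purposes, unlimited.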

Clients and Servers

A server within the Internet is a computer which makes services available to client computers, such as HTML pages from a web server, text, images and programs from a gopher server, files from an ftp server, and email from a mail server. A server should be permanently connected to the Internet or else its services will not always be available. Permanent connections are usually established by a leased line from a large ISP or a telephone company. A 64k leased line costs around £9000 p.a., 128k around £18,000 and 2Mbps around £40,000, all excluding installation. An Internet server needs a firewall, which is a software protection device designed to prevent illicit entry to the server. ISPs sometimes set up proxy servers which sit between the Internet and the real server and which can be set up to perform tasks such as filtering out undesirable material. Client computers are those which have access to the Internet but do not provide services or resources. They are typically owned by schools, businesses or individuals who want to use Internet services. The resources of a client computer, such as memory or the hard disc, are not normally accessible to other computers on the Internet.

Using the Internet

The main services you are likely to use are the World Wide Web (the ‘Web’), electronic mail (‘email’), file transfer (ftp) and Internet Relay Chat (IRC) - in the latter you ‘chat’ with other users via your keyboard and monitor. Other services include Usenet (discussion groups) and gopher (a forerunner of the Web). Most ISPs provide the software needed to access the most popular services and once you are on-line you can obtain any other items free from sites on the Internet.

Web Browsers

The most popular browsers are now Microsoft Internet Explorer, Mozilla Firefox and Safari (on the Mac). Web pages are written in HTML (Hypertext Mark-up Language) which is interpreted by the browser and the pages displayed according to these instructions. HTML documents can include instructions in script languages such as JavaScript and VBScript and also embedded applets written in Java or as ActiveX controls. Browsers accept ‘plug-ins’ which are instructions for the decoding of new file types such as audio and video which the publishers of the browser did not know about.

Netscape Navigator was the most popular Web browser up to 1996 but it was superseded by Microsoft Internet Explorer. Microsoft, after a legal battle, won the right to include IE as an integral and therefore 'free' part of the Windows operating system. This meant that other browsers had to make themselves free as well. The true cost of IE can be hidden in the general cost of Windows. Netscape provided many of the early innovations in browser technology such as JavaScript and Secure Sockets Layer (SSL) but this was not enough to maintain market share, though you can still get it (for free, of course).

IE is a rather safe piece of software; there was little incentive for Microsoft to improve it beyond version 5, though they have been forced to do so in the last couple of years as security needs have become more serious and tabbed browsing has been introduced.

Using the Internet – Searching

Apart from reading what you have found this is the most challenging task on the Internet. There are millions of pages of information on computers around the world and finding the one or two you want may seem impossible, but help is at hand. Companies such as Yahoo and Alta Vista run programs which search these computers on the Web and compile lists of pages and indices of what they contain.
