WEB SCIENCE | CREATING THE WEB
SECTION 1 | DISTINGUISH BETWEEN THE INTERNET AND WORLD WIDE WEB
Remember that the World Wide Web and the internet are two different things. The World Wide Web is the set of protocols, languages and webpages we access, whereas the internet is the network that provides the medium for transferring data.
WHAT IS THE INTERNET?
The internet is a computer network that has grown to become the largest WAN available to the public worldwide. Cables laid underground and under the oceans carry data across the internet. When someone creates a webpage, the computer or server that hosts the page is made available to the public and an address is provided for accessing the page. The internet is the network and components that provide a means for the data transfer.
Whilst various data transfer media are used, the internet still predominantly uses cables. When you access the internet through your mobile phone's data service, the data travels wirelessly from your phone to the nearest receiving mast, and the rest of the journey is made via cables, including cables laid under the oceans throughout the world.
SECTION 2 | HOW THE WEB IS CONSTANTLY EVOLVING
The World Wide Web has undergone significant transformations since its inception, evolving through various phases, each marked by technological advancements and changes in user interaction.
WEB 1.0 | 1990s TO EARLY 2000s
-
Static Content | Websites were mostly static, serving the same content to all users.
-
Read-Only | The web was primarily informational, with limited user interaction or content creation capabilities.
-
Webmaster Dominance | Content was created and controlled by website owners or webmasters.
-
Limited User Engagement | Interaction was limited to simple forms and email communication.
WEB 2.0 | EARLY 2000s TO EARLY 2010s
-
User-Generated Content | Emphasis on user-generated content, blogs, and social media.
-
Interactivity | Increased interactivity with dynamic content and AJAX-driven web applications.
-
Social Networking | Rise of social media platforms, transforming how users communicate and share information.
-
Collaboration | Enhanced collaboration tools like wikis, allowing users to contribute and edit content.
-
Cloud Computing | Emergence of cloud services, enabling access to data and applications over the Internet.
WEB 3.0 (THE SEMANTIC WEB) | EARLY 2000s TO PRESENT
-
Data Connectivity | Focus on making data more connected and machine-readable.
-
Semantic Understanding | Use of technologies to create a more intelligent and interconnected web. "Semantic understanding" refers to the ability of web technologies to understand and interpret the meaning and context of information, much like humans do. It is about making the web not just display data, but also understand it.
-
Personalization | Enhanced personalization and targeted content delivery based on user preferences and behavior.
-
Artificial Intelligence | Integration of AI and machine learning for smarter search and data analysis.
WEB 4.0 AND BEYOND | PRESENT AND POSSIBLE FUTURE DEVELOPMENTS
-
Internet of Things (IoT) | Expansion of the web to include a vast network of connected devices, from home appliances to industrial equipment.
-
Virtual and Augmented Reality | Integration of VR and AR technologies, creating more immersive web experiences.
-
Blockchain and Decentralization | Use of blockchain for enhanced security, privacy, and decentralized web applications (dApps).
-
Decentralized Finance (DeFi) | Emergence of decentralized financial services, bypassing traditional financial intermediaries.
-
Data Sovereignty | Users gaining more control over their personal data, with blockchain enabling secure and transparent data management.
-
Smart Contracts | Self-executing contracts with the terms of the agreement directly written into code, enhancing trust and efficiency in transactions.
-
Predictive Analytics | Advanced AI algorithms predicting user behavior and preferences for more personalized web experiences.
-
Natural Language Processing (NLP) | Enhanced ability of computers to understand and interpret human language, improving user interactions with AI systems.
-
Automated Content Creation | AI-driven content generation, changing the landscape of digital marketing and online content.
-
Quantum Internet | Potential development of a quantum internet that could revolutionize data security and computational power.
-
Enhanced Security Protocols | Quantum-resistant algorithms to safeguard against the potential threats posed by quantum computing.
-
Green Computing | Increased focus on reducing the environmental impact of web technologies.
-
Ethical AI | Addressing ethical concerns in AI development, ensuring fairness, transparency, and accountability.
-
5G and Advanced Networking:
-
Faster Connectivity | 5G networks provide significantly faster internet speeds, reducing latency, and enabling more connected devices.
-
Edge Computing | Processing data closer to where it is generated for faster and more efficient computing.
The web is continuously evolving, driven by technological advancements and changing user needs. From the static pages of Web 1.0 to the interactive and user-generated content of Web 2.0, the intelligent and interconnected Semantic Web, and the emerging trends of blockchain, AI, and IoT, the web is becoming more dynamic, personalized, and integrated into our daily lives. As we look to the future, the focus will likely be on enhancing connectivity, ensuring security and privacy, and leveraging technology for sustainable and ethical development.
SECTION 3 | FOUNDATIONAL WEB TECHNOLOGIES
The web, as we experience it today, is a complex tapestry woven from various technologies, each playing a crucial role in how information is created, presented, and exchanged. At the heart of this digital ecosystem are foundational technologies like the Hypertext Transfer Protocol (HTTP) and its secure counterpart HTTPS, which facilitate the transfer of web pages and data between servers and clients. Hypertext Markup Language (HTML) serves as the backbone for structuring and presenting content on web pages, while Cascading Style Sheets (CSS) and JavaScript enrich this content with aesthetic styles and interactive functionality. Extensible Markup Language (XML) and Extensible Stylesheet Language Transformations (XSLT) further extend the web's capabilities, allowing for the customization, transformation, and exchange of data in a structured and flexible manner.
Together, these technologies form the building blocks of the web, enabling it to function as an interconnected space of information, services and communication that runs on top of the internet.
HYPERTEXT TRANSFER PROTOCOL (HTTP)
-
Communication Protocol | Used for transmitting web pages over the Internet.
-
Stateless Protocol | Each request from a client to server is independent.
-
Client-Server Model | Used in a request-response model between clients (browsers) and servers.
-
Default Port | Typically uses port 80.
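To make the request-response model concrete, the sketch below builds the text of a minimal HTTP/1.1 GET request and reads the status line of a response. It is an illustration only: the function names `build_get_request` and `parse_status_line` are made up for this example, the response is a hand-written sample rather than real server output, and no network connection is opened.

```python
def build_get_request(host, path="/"):
    """Return the bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",  # request line: method, path, protocol version
        f"Host: {host}",         # the Host header is required in HTTP/1.1
        "Connection: close",     # ask the server to close the connection afterwards
        "",                      # an empty line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(response):
    """Split the first line of an HTTP response into version, code and reason."""
    status_line = response.split(b"\r\n", 1)[0].decode("ascii")
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason


request = build_get_request("www.example.com", "/index.html")
print(request.decode("ascii"))

# Hand-written sample standing in for what a server might send back.
sample_response = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"
print(parse_status_line(sample_response))
```

Because the protocol is stateless, every request has to carry everything the server needs (here the path and the Host header); nothing is remembered from an earlier request.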
HYPERTEXT TRANSFER PROTOCOL SECURE (HTTPS)
-
Secure Version of HTTP | Encrypts data for secure communication over a computer network.
-
SSL/TLS Encryption | Uses Transport Layer Security (TLS) to protect data. TLS is the successor to the older Secure Sockets Layer (SSL), which is now deprecated, although the name "SSL" is still widely used.
-
Authentication | Provides authentication of the accessed website.
-
Default Port | Typically uses port 443.
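As a small illustration of the authentication and port points above, the following sketch inspects the default client settings in Python's standard library. It only looks at configuration values and does not contact any server; it is one way a client might be set up, not a description of how any particular browser works.

```python
import http.client
import ssl

# Client-side TLS settings recommended by the standard library.
context = ssl.create_default_context()

# The server must present a certificate that chains to a trusted authority.
certificate_required = context.verify_mode == ssl.CERT_REQUIRED
# The certificate must also match the host name being visited.
hostname_checked = context.check_hostname

print("certificate required:", certificate_required)
print("host name checked:", hostname_checked)

# Default ports as defined by the standard library's HTTP client.
print("HTTP port:", http.client.HTTP_PORT)
print("HTTPS port:", http.client.HTTPS_PORT)
```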
HYPERTEXT MARKUP LANGUAGE (HTML)
-
Standard Markup Language | Used for creating and structuring sections, paragraphs, and links on web pages.
-
Tags and Elements | Uses a system of tags and elements to define the structure of web content.
-
Web Browsers | Interpreted by web browsers to display content.
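The tag-and-element structure can be seen by walking through a small page with Python's built-in HTML parser. This is a rough sketch of the first step a browser performs (reading the tags), not of rendering; the class name `OutlineParser` and the sample page are invented for the example.

```python
from html.parser import HTMLParser


class OutlineParser(HTMLParser):
    """Record each opening tag and each link target as the parser meets them."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)
        if tag == "a":
            # attrs is a list of (name, value) pairs taken from the tag
            self.links.extend(value for name, value in attrs if name == "href")


page = """
<html>
  <body>
    <h1>Creating the Web</h1>
    <p>The web runs on top of the <a href="https://www.example.com">internet</a>.</p>
  </body>
</html>
"""

parser = OutlineParser()
parser.feed(page)
print(parser.tags)   # the structure of the page, in document order
print(parser.links)  # the hyperlinks found in the page
```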
UNIFORM RESOURCE LOCATOR (URL)
-
Web Address | A reference (an address) to a resource on the Internet.
-
Components | Includes protocol (e.g., HTTP, HTTPS), domain name, and path to a specific page or file.
-
Unique Identifier | Each URL is unique and directs to a specific resource.
EXTENSIBLE MARKUP LANGUAGE (XML)
-
Data Format | Designed to store and transport data.
-
Customizable Tags | Allows users to define their own tags, making it flexible.
-
Structured Data | Emphasizes the structure and storage of data, not its display.
-
Incompatible Formats in Systems | In the real world, computer systems and databases often have data in formats that don't easily work together.
-
XML's Plain Text Storage | XML stores data in a simple text format. This makes it easier to use the data across different software and hardware because it's not tied to any specific technology.
-
Flexibility with Platforms and Upgrades | XML is helpful when you want to switch to new operating systems, applications, web browsers, or platforms. It adapts easily, making transitions smoother.
-
Separation from HTML | XML keeps data separate from HTML, which is used for creating web pages. This separation means that the data can be managed and modified independently of the web page layout.
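The sketch below shows user-defined tags carrying data with no display information at all, and a program reading that data back. The tag names (`library`, `book`, `title`, `pages`) and the values are invented for illustration; any other vocabulary would work the same way.

```python
import xml.etree.ElementTree as ET

# The tags below are made up for this example: XML lets the author choose them.
document = """
<library>
  <book isbn="000-0">
    <title>Web Science Notes</title>
    <pages>120</pages>
  </book>
  <book isbn="000-1">
    <title>Networking Basics</title>
    <pages>85</pages>
  </book>
</library>
""".strip()

root = ET.fromstring(document)

# Pull the plain-text data out into ordinary Python values.
books = [
    {
        "isbn": book.get("isbn"),
        "title": book.findtext("title"),
        "pages": int(book.findtext("pages")),
    }
    for book in root.findall("book")
]

for book in books:
    print(book)
```

Because the document is plain text describing structure only, the same data could be read by a different program, or turned into a web page by a separate stylesheet, without changing the XML itself.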
EXTENSIBLE STYLESHEET LANGUAGE TRANSFORMATIONS (XSLT)
-
Transformation Language | Used for transforming XML documents into other formats like HTML, text, or other XML formats.
-
Stylesheet | An XSLT document is a stylesheet that describes how to display an XML document of a given type.
-
Data Presentation | Enables the separation of data from presentation.
JAVASCRIPT
-
Programming Language | Scripting language used to create and control dynamic website content.
-
Client-Side | Usually runs in a user's web browser, but can also be used on the server-side.
-
Interactivity | Adds interactive features like games, responding to button clicks, data entry forms, etc.
CASCADING STYLE SHEETS (CSS)
-
Style Sheet Language | Used for describing the presentation of a document written in HTML or XML.
-
Layout and Design | Controls the layout of multiple web pages all at once.
-
Separation of Content and Presentation | Allows for the separation of document content from document presentation, including aspects like layouts, colors, and fonts.
SECTION 4 | URL
A Uniform Resource Identifier (URI) is a string of characters used to identify a resource on the Internet. It's a broader term that encompasses all types of names and addresses that refer to objects on the web. A URI typically includes a scheme, such as http, https, ftp, or mailto, followed by a colon and then a scheme-specific part. For example, https://www.example.com.
Its primary purpose is to enable interaction with representations of resources via a network (typically the World Wide Web) using specific protocols. The URL is a subset of the URI.
A URL, or Uniform Resource Locator, is the address of a webpage. The URL for the homepage of this website is www.computersciencecafe.com. The structure of a URL can be broken up into 5 main parts. Looking at the URL of this page, the parts would be:
PART 1 - PROTOCOL: https: - This refers to the protocol used. The two main protocols are HTTP and HTTPS.
PART 2 - DOMAIN NAME: computersciencecafe.com - This could be further divided into TLD (Top Level Domain) and SLD (Second Level Domain).
PART 3 - PATH OR SUB-DIRECTORY: /resources/internetprinciples - This refers to where the requested page is saved.
PART 4 - QUERY: ? - Used to provide dynamic web pages with custom features for the user.
PART 5 - PARAMETERS: The name=value pairs that follow the ? and give a further breakdown of the query.
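The same breakdown can be reproduced with Python's standard URL parser. In the sketch below the protocol, domain and path come from the parts listed above, while the query string `topic=dns&level=hl` is a made-up example added only so that there are parameters to show.

```python
from urllib.parse import parse_qs, urlsplit

# The query string at the end is hypothetical, added for illustration.
url = "https://www.computersciencecafe.com/resources/internetprinciples?topic=dns&level=hl"

parts = urlsplit(url)
parameters = parse_qs(parts.query)

print("protocol:  ", parts.scheme)
print("domain:    ", parts.netloc)
print("path:      ", parts.path)
print("query:     ", parts.query)
print("parameters:", parameters)
```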
When you enter a URL into a web browser, it sends a request to the server specified in the URL, and the server responds with the content of the page. The following process happens.
-
Entering the URL | When you type a URL (like http://www.example.com) into your web browser and press Enter, you are making a request to access a specific resource on the Internet.
-
Browser Processes URL | The browser interprets the URL, understanding it as an instruction to connect to the web server hosting www.example.com. The URL specifies the protocol (http), the domain name (www.example.com), and often a specific path to a resource on that server (like /home or /contact-us).
-
DNS Lookup | The browser performs a DNS (Domain Name System) lookup to convert the domain name into an IP address, which is the numerical address that identifies the server on the Internet.
-
Server Connection | Using the IP address, the browser sends a request over the Internet to the server that hosts the website. This request is typically made using the HTTP or HTTPS protocol, as specified in the URL.
-
Server Response | The server receives the request, processes it, and sends back the requested resource. This could be a web page, an image, a video, or any other type of content hosted on the server.
-
Displaying the Content | The browser receives the response from the server, which usually includes HTML, CSS, and JavaScript files. The browser then renders these files to display the web page as intended by the website designers.
-
Interaction | Once the page is loaded, you can interact with it – click links, submit forms, watch videos, etc. Each of these interactions may involve the browser making additional requests to the server.
Entering a URL is the start of a process where your browser requests a resource from a web server, and the server responds by sending the data back to your browser, which then displays the content to you.
SECTION 5 | DOMAIN NAME SYSTEM (DNS)
The Domain Name System (DNS) is a crucial component of the internet, acting as the directory that translates human-friendly domain names into machine-readable IP addresses. The servers that hold this directory are called domain name servers. Here's how it functions:
1. UNDERSTANDING DOMAIN NAMES AND IP ADDRESSES
-
Domain Names | These are the web addresses that we type into a browser (like www.example.com).
-
IP Addresses | Every device connected to the internet has a unique IP address, which is a series of numbers (and letters in the case of IPv6). It's like the phone number for a computer or server on the internet.
2. THE ROLE OF DNS
-
Translation | When you enter a domain name in your browser, the DNS translates this domain name into its corresponding IP address.
-
Directory Service | Think of DNS as an internet phonebook. It maintains a directory of domain names and translates them to IP addresses.
3. THE DNS QUERY PROCESS
-
Initial Request | When you request a website, your computer first checks if it has the IP address in its cache. If not, it makes a DNS query.
-
Recursive and Iterative Queries | The query usually goes through a series of DNS servers, including recursive resolvers, root name servers, TLD (Top-Level Domain) name servers, and finally the authoritative name server for the domain.
-
Response | The authoritative name server responds with the IP address of the domain.
4. CACHING FOR EFFICIENCY
-
Temporary Storage | Once a DNS server finds the IP address for a domain name, it stores this information for a certain period. This process is known as caching, and it speeds up future access to the same domain.
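The query process and caching described above can be imitated with a toy resolver. Everything in this sketch is an assumption made for illustration: the three dictionaries stand in for the root, TLD and authoritative name servers, the server labels are invented, and the IP address comes from a range reserved for documentation. No real DNS traffic is involved.

```python
import time

# Hypothetical, hard-coded data standing in for real name servers.
ROOT_SERVER = {"com": "tld-com"}
TLD_SERVERS = {"tld-com": {"example.com": "ns-example"}}
AUTHORITATIVE_SERVERS = {"ns-example": {"www.example.com": "192.0.2.10"}}


class Resolver:
    """Toy recursive resolver that remembers answers for a limited time."""

    def __init__(self, ttl_seconds=300.0):
        self.ttl_seconds = ttl_seconds
        self.cache = {}   # name -> (ip address, time at which the entry expires)
        self.lookups = 0  # number of times the full hierarchy had to be walked

    def resolve(self, name):
        cached = self.cache.get(name)
        if cached and cached[1] > time.monotonic():
            return cached[0]  # answered from the cache, no query needed

        labels = name.split(".")
        tld = labels[-1]                 # e.g. "com"
        domain = ".".join(labels[-2:])   # e.g. "example.com"

        tld_server = ROOT_SERVER[tld]                    # root points to the TLD server
        auth_server = TLD_SERVERS[tld_server][domain]    # TLD points to the authoritative server
        ip = AUTHORITATIVE_SERVERS[auth_server][name]    # authoritative server gives the answer

        self.lookups += 1
        self.cache[name] = (ip, time.monotonic() + self.ttl_seconds)
        return ip


resolver = Resolver()
first = resolver.resolve("www.example.com")   # walks root -> TLD -> authoritative
second = resolver.resolve("www.example.com")  # served from the cache
print(first, second, "full lookups:", resolver.lookups)
```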
5. ENSURING RELIABILITY AND SPEED
-
Distributed Database | The DNS system is a distributed database and is spread across many servers worldwide. This distribution ensures high availability and reliability.
-
Load Balancing | DNS can also direct traffic to different servers based on load, location, and other factors, contributing to efficient load balancing and reduced latency.
6. SECURITY MEASURES
-
DNS Security Extensions (DNSSEC) | This adds an extra layer of security by validating the authenticity of the response to prevent DNS spoofing attacks.
The Domain Name System is a fundamental part of the internet's infrastructure, enabling the seamless translation of user-friendly domain names into IP addresses that computers use to identify and communicate with each other on the network. This system not only makes the internet more user-friendly but also contributes to its efficiency and security.
SECTION 6 | IP, TCP AND FTP
INTERNET PROTOCOL (IP)
The Internet Protocol (IP) is a set of rules governing the format of data sent over the Internet or other networks. IP is a fundamental protocol that forms the basis of internet communication. Here are its key characteristics:
-
Data Routing | IP is responsible for routing data packets from the source to the destination. It uses IP addresses to identify sending and receiving devices.
-
Packet Switching | Information is divided into small packets for transmission. Each packet may take a different route to reach the destination.
-
Addressing System | IP addresses are unique numerical labels assigned to each device connected to a network using the IP for communication.
-
Version 4 and 6 | There are two versions in use, IPv4 and IPv6. IPv6 was developed to deal with the long-anticipated problem of IPv4's exhaustion of addresses.
-
Network Layer Protocol | IP operates at the network layer of the OSI model, providing a path for data to travel across networks.
TRANSMISSION CONTROL PROTOCOL (TCP)
Transmission Control Protocol (TCP) is a standard that defines how to establish and maintain a network conversation through which application programs can exchange data. TCP works closely with IP and has several important features:
-
Reliable Delivery | TCP ensures the reliable delivery of a data stream sent from one machine to another without duplication or data loss.
-
Connection-Oriented | It is a connection-oriented protocol, meaning a connection is established and maintained until the application programs at each end have finished exchanging messages.
-
Data Segmentation | TCP divides a message into smaller packets before they are sent over the Internet and reassembles them at the destination.
-
Flow Control | It provides flow control and acknowledgment of data.
-
Error Checking | TCP includes error-checking to ensure data integrity.
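Segmentation and reassembly can be sketched in a few lines. This is a simplified model, not TCP itself: the functions `segment` and `reassemble` are invented for the example, sequence numbers are simply byte offsets, and acknowledgements, retransmission, checksums and connection set-up are all left out.

```python
import random


def segment(message, size):
    """Split a message into (sequence number, chunk) pairs of at most `size` bytes."""
    return [
        (offset, message[offset:offset + size])
        for offset in range(0, len(message), size)
    ]


def reassemble(segments):
    """Discard duplicates and rebuild the message in sequence-number order."""
    unique = dict(segments)  # a repeated sequence number keeps a single copy
    return b"".join(unique[seq] for seq in sorted(unique))


message = b"Packets may arrive out of order, yet the stream is rebuilt."
segments = segment(message, 8)

arrived = segments + [segments[2]]  # simulate one duplicated segment
random.shuffle(arrived)             # simulate out-of-order arrival

rebuilt = reassemble(arrived)
print(rebuilt == message)
```

The sequence numbers are what allow the receiver to put the data back in order and to notice duplicates, whichever route each packet took.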
FILE TRANSFER PROTOCOL (FTP)
File Transfer Protocol (FTP) is a standard network protocol used for the transfer of computer files between a client and server on a computer network. FTP is built on a client-server model architecture and uses separate control and data connections between the client and the server:
-
Data Transfer | FTP is used for transferring files from one host to another over TCP-based networks.
-
Modes of Transfer | It supports two modes of transfer - ASCII mode for text files and binary mode for binary files like images, audio, etc.
-
Authentication | FTP often requires authentication for accessing files, typically in the form of a username and password. However, some servers make their files available without needing authentication.
-
Commands and Responses | FTP clients initiate a connection to a remote server and send FTP commands. The server responds with various status codes to these commands.
-
Passive and Active Modes | FTP operates in two modes, active and passive, which determine how the data connection is established.
SECTION 7 | IP ADDRESSING
Note: IP Addressing is not explicitly part of the specification. But understanding IP addressing does help in the understanding of how networks work.
IP addressing is a critical component of networking that enables devices to communicate on a network and the internet. Here's an overview of how IP addressing works, including the distinction between public and private addresses, the creation of these addresses, device identification, and the role of MAC addresses.
WHAT IS AN IP ADDRESS
An IP (Internet Protocol) address is a unique identifier assigned to each device connected to a computer network that uses the IP for communication. Its primary function is to identify the host or network interface and provide the location of the host in the network.
PUBLIC VS PRIVATE IP ADDRESSES
PUBLIC IP ADDRESS
-
Usage | Used on the internet and assigned to your network by your Internet Service Provider (ISP).
-
Uniqueness | Each public IP address is unique across the entire internet.
-
Accessibility | Allows your network to be recognized by other networks globally.
PRIVATE IP ADDRESS
-
Usage | Used within a private network, such as a home or office network.
-
Local Network | These addresses are not routable on the public internet.
-
NAT (Network Address Translation) | Routers use NAT to translate private IP addresses to a public address for internet communication.
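A toy model can show how a router lets several private addresses share one public address. The sketch assumes the common port-based form of NAT, in which the router keeps a table mapping each private address and port to a port on its public address. The class `NatRouter`, the starting port 40000 and the device addresses are all invented for illustration.

```python
class NatRouter:
    """Toy port-based NAT: many private (ip, port) pairs share one public IP."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.outbound = {}  # (private ip, private port) -> public port
        self.inbound = {}   # public port -> (private ip, private port)

    def translate_outbound(self, private_ip, private_port):
        """Rewrite the source of an outgoing packet to the public address."""
        key = (private_ip, private_port)
        if key not in self.outbound:
            self.outbound[key] = self.next_port
            self.inbound[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.outbound[key]

    def translate_inbound(self, public_port):
        """Find which private device a reply on this public port belongs to."""
        return self.inbound.get(public_port)


router = NatRouter("203.0.113.45")
laptop = router.translate_outbound("192.168.1.10", 51000)
phone = router.translate_outbound("192.168.1.11", 51000)

print(laptop)                           # what the internet sees for the laptop
print(phone)                            # same public IP, different port
print(router.translate_inbound(phone[1]))  # reply is routed back to the phone
```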
CREATION OF THE IP ADDRESS
-
Assignment by ISP | Public IP addresses are assigned by ISPs. They can be static (permanent) or dynamic (changing periodically).
-
Local Assignment | Private IP addresses are assigned by the network administrator or automatically by the router using DHCP (Dynamic Host Configuration Protocol).
IDENTIFYING COMPUTERS ON A NETWORK
-
IP Address Role | Each device on a network is assigned an IP address, which is used to identify it on the network.
-
Communication | Devices use IP addresses to locate and communicate with each other within a network and across the internet.
MAC ADDRESS
A MAC (Media Access Control) address is a hardware identifier that uniquely identifies each device on a network. Each network interface card (NIC) has a unique MAC address hard-coded by the manufacturer. When a device communicates over a network, it uses the MAC address to ensure that the data reaches the correct device within the local network.
While the IP address is used to identify the location of a device on a network, the MAC address is used to identify the specific device itself. Routers and switches use MAC addresses to direct network packets to the correct devices on a local network.
IP addressing, both public and private, plays a crucial role in network communication, ensuring that devices can be uniquely identified and can communicate effectively. The MAC address complements this by providing a unique hardware identifier for each network interface, facilitating accurate data transmission within local networks.
PRIVATE IP EXAMPLE
192.168.1.1: This is a common example of a private IP address. It's often used as the default address for many home routers. In a home network, this address would be assigned to the router, and it would use it to communicate with other devices in the same network. Private IP addresses typically fall within certain ranges:
10.0.0.0 to 10.255.255.255
172.16.0.0 to 172.31.255.255
192.168.0.0 to 192.168.255.255
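The three ranges above can be checked in code. The sketch writes each range as a network block and tests whether an address falls inside one of them; the helper name `in_private_range` is made up for this example. The check is written out explicitly rather than relying on the standard library's `is_private` attribute, which, as far as I am aware, also counts other reserved ranges.

```python
import ipaddress

# The three private ranges listed above, written as network blocks.
PRIVATE_RANGES = [
    ipaddress.ip_network("10.0.0.0/8"),      # 10.0.0.0 to 10.255.255.255
    ipaddress.ip_network("172.16.0.0/12"),   # 172.16.0.0 to 172.31.255.255
    ipaddress.ip_network("192.168.0.0/16"),  # 192.168.0.0 to 192.168.255.255
]


def in_private_range(address):
    """Return True if the IPv4 address lies in one of the three private ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in PRIVATE_RANGES)


for candidate in ["192.168.1.1", "10.20.30.40", "172.31.255.255", "172.32.0.1", "8.8.8.8"]:
    print(candidate, in_private_range(candidate))
```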
PUBLIC IP ADDRESS EXAMPLE
203.0.113.45: This is an example of what a public IP address looks like (this particular address comes from a block reserved for use in documentation). Public IP addresses are assigned by Internet Service Providers (ISPs) and are unique across the entire internet. They are used to identify networks and devices on the internet, allowing them to communicate with each other. Unlike private IP addresses, public IP addresses must be globally unique and are carefully managed to avoid duplication.
The addresses provided here are for example purposes. In real-world scenarios, your specific public IP address would be assigned by your ISP, and your private IP address can be set by your network administrator or automatically by your router.
CIDR Notation: Sometimes, IP addresses are followed by a slash and a number (like 192.168.1.1/24). This is CIDR notation, indicating the network portion and the host portion of the IP address.
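As a short worked example, Python's `ipaddress` module can split the address from the text, 192.168.1.1/24, into its network and host portions. The /24 means the first 24 bits identify the network, leaving 8 bits (256 addresses) for hosts.

```python
import ipaddress

interface = ipaddress.ip_interface("192.168.1.1/24")
network = interface.network

print("host address:     ", interface.ip)
print("network:          ", network)
print("subnet mask:      ", network.netmask)
print("addresses in block:", network.num_addresses)
print("broadcast address:", network.broadcast_address)
```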