Web Technologies - Comprehensive Study Notes (R18A0517)
Introduction to the Internet and World Wide Web
- Definition of the Internet: A global system of interconnected computer networks that use the Internet protocol suite (TCP/IP), a set of standardized communication protocols, to link devices worldwide and provide a variety of information and communication facilities.
- Composition of the Internet: It is a network of networks that consists of private, public, academic, business, and government networks of local to global scope, linked by a broad array of electronic, wireless, and optical networking technologies. It carries a vast range of information resources and services.
- History of the Internet:
* Cold War Context: Roots lie in the need to connect top US universities to share research data without time lags.
* Advanced Research Projects Agency (ARPA): Formed at the end of the 1950s following the Soviet launch of Sputnik.
* 1969: ARPA's network, ARPANET, successfully interconnected its first host computers.
* 1971: Ray Tomlinson created a system for sending electronic mail between networked computers. Remote access to other computers (telnet) also became available around this time.
* 1973: Preparations for TCP/IP and Ethernet services.
* End of 1970s: Usenet groups surfaced.
* 1980s: IBM introduced the PC based on the Intel 8088 processor.
* 1982: Defense Agencies made TCP/IP compulsory, and the term "internet" was coined.
* 1984: The Domain Name System (DNS) was introduced.
* 1988: A worm (the Morris worm) attacked networked computers, disabling roughly 10% of the systems then connected to the Internet. This led to the formation of the Computer Emergency Response Team (CERT).
* World Wide Web (WWW): Invented by Tim Berners-Lee as a service that connects documents on websites using hyperlinks.
- The World Wide Web (WWW):
* Definition: An information space where documents and other web resources are identified by Uniform Resource Locators (URLs), interlinked by hypertext links, and accessible via the Internet.
* Inventor: English scientist Tim Berners-Lee invented the Web in 1989 and wrote the first web browser program in 1990 while at CERN in Switzerland.
* Release Dates: Released to research institutions in January 1991 and to the general public in August 1991.
* Web Pages: Primarily text documents formatted and annotated with Hypertext Markup Language (HTML). They contain multimedia content (images, video, audio) and software components rendered by web browsers.
* Navigation: Embedded hyperlinks permit navigation between pages. A collection of pages with a common theme or domain name forms a website.
* Client/Server Computing: The client (browser) requests documents (text, graphics, sounds, etc.) from a server (Web server) using the Hypertext Transfer Protocol (HTTP).
Web Browsers and URLs
- Browsers (WWW Clients): Programs used to access the WWW. They can be graphical (supporting audio/graphics) or text-only. They understand Internet protocols such as HTTP, FTP, gopher, mail, and news.
- Historical List of Web Browsers:
* 1991: WorldWideWeb (Nexus).
* 1992: ViolaWWW, Erwise, MidasWWW, MacWWW (Samba).
* 1993: Mosaic, Cello, Lynx 2.0, Arena, AMosaic 1.0.
* 1994: Netscape Navigator, IBM WebExplorer, SlipKnot 1.0, MacWeb, IBrowse, Agora (Argo), Minuet.
* 1995: Internet Explorer 1 & 2, Netscape Navigator 2.0, OmniWeb, UdiWWW, Grail.
* 1996: Internet Explorer 3.0, Netscape Navigator 3.0, Opera 2.0, Arachne 1.0, PowerBrowser 1.5, Cyberdog, Amaya 0.9, AWeb, Voyager.
* 1997: Internet Explorer 4.0, Netscape Navigator 4.0, Netscape Communicator 4.0, Opera 3.0, Amaya 1.0.
* 1998: iCab, Mozilla.
* 1999: Internet Explorer 5.0, Amaya 2.0, Mozilla M3.
* 2003: Apple Safari 1.0, Opera 7, Epiphany 1.0.
* 2004: Firefox 1.0.
* 2008: Google Chrome 1, Mozilla Firefox 3, Apple Safari 3.1.
* 2015: Microsoft Edge, Vivaldi.
- Uniform Resource Locators (URLs): The address of a document found on the WWW. Browsers interpret the elements to connect to servers.
* URL Elements: protocol://server's address/filename.
* Protocol Examples: http://, ftp://, telnet://, news:.
- Domains: Divide sites into categories based on ownership.
* Top-Level Domains: .com (commercial), .mil (military), .org (organization), .int (international treaty), .net (network), .biz (commercial/personal), .edu (educational), .info (commercial/personal), .gov (government), .name (personal). Two-letter country domains (e.g., .fr for France) also exist.
- MIME (Multipurpose Internet Mail Extensions):
* An extension of the original e-mail protocol allowing exchange of different data files (audio, video, images, applications) along with ASCII text.
* Proposed in 1991 by Nathaniel Borenstein of Bellcore.
* Servers insert a MIME header at the start of transmission; clients use it to select a "player" application.
* New types are registered with IANA. Defined in RFC 1521 and 1522.
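The URL elements listed above (protocol, server address, filename) can be pulled apart with the standard java.net.URI class. This is only a minimal sketch: the class name UrlParts and the address www.example.com/docs/index.html are placeholders chosen for illustration, not anything taken from the notes.

```java
import java.net.URI;

// Hypothetical helper class, used only to illustrate the URL elements.
class UrlParts {
    /** Returns {protocol, server address, filename} for the given URL. */
    static String[] split(String url) {
        URI uri = URI.create(url); // throws IllegalArgumentException if malformed
        return new String[] { uri.getScheme(), uri.getHost(), uri.getPath() };
    }

    public static void main(String[] args) {
        // Placeholder URL; any http:// or ftp:// address has the same shape.
        String[] parts = split("http://www.example.com/docs/index.html");
        System.out.println("protocol = " + parts[0]);
        System.out.println("server   = " + parts[1]);
        System.out.println("filename = " + parts[2]);
    }
}
```

A browser performs the same split before it opens a connection: the protocol selects how to talk, the server address selects whom to contact, and the filename selects what to ask for.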
Hypertext Transfer Protocol (HTTP) and Secure Protocols
- HTTP Definition: Underlying protocol of the WWW defining how messages are formatted/transmitted and how servers/browsers respond to commands.
- Stateless Protocol: HTTP is stateless: each command is executed independently, without any knowledge of the commands that came before it. This makes it difficult to build sites that react intelligently to a sequence of user inputs.
- HTTPS: Hypertext Transfer Protocol Secure, the secure version of HTTP. Communications are encrypted with Transport Layer Security (TLS), the successor to the older Secure Sockets Layer (SSL).
- HTML: A markup language to describe document layout. It is not a programming language (cannot describe computations). Controls consist of tags and attributes.
- Plugins and Filters: Plugins integrate into word processors as WYSIWYG editors. Filters convert documents from other formats to HTML.
- XML: A meta-markup language used to create new markup languages for specific purposes.
- JavaScript: A client-side HTML-embedded scripting language for dynamic interaction.
- Flash: A system for building movies, sound, and animation. Consists of an authoring environment and a player.
- PHP: A server-side scripting language for form processing and database access.
- Ajax (Asynchronous JavaScript + XML): Uses asynchronous requests to the server to receive small parts of documents, resulting in faster responses.
- Java Web Software: Includes Servlets (server-side Java classes), JSP (JavaServer Pages), and JSF (JavaServer Faces).
- ASP.NET: Used in the .NET environment to allow .NET languages for server-side scripting.
- Ruby on Rails: A development framework using Ruby, a pure object-oriented interpreted scripting language.
- Heading Tags: <h1> (largest) to <h6> (smallest).
- Paragraph Tag: <p>. Use the align attribute ("left", "center", "right"). Browsers ignore indentations/blank lines in source text without <p>.
- Line Break: <br>. Forces a single spacing line break.
- Horizontal Rule: <hr>. Acts as a divider. Attributes include width and align.
- Basic Structure:
* <!DOCTYPE...>: Defines document type/version.
* <html>: Encloses the complete document.
* <head>: Represents the header (contains <title>, <link>).
* <body>: Contains visible content (<h1>, <p>, etc.).
- Lists:
* Unordered List: <ul> and <li>. Collection of items marked with bullets.
* Ordered List: <ol> and <li>. Items are numbered.
* Definition List: <dl> (start), <dt> (term), <dd> (description).
- Tables:
* Tags: <table>, <tr> (row), <td> (data cell), <th> (heading cell).
* Attributes: border, bordercolor, bgcolor, background (image).
- Images: Inserted via <img>. Attributes include src, alt (alternative text), and align ("left", "right", "middle", "top", "bottom").
- Frames: Used to divide the browser into multiple sections (frameset). Note: Not supported in HTML5.
* <frameset>: Tag used instead of <body>. Attributes: rows (horizontal), cols (vertical).
HTML Forms
- Definition: Forms collect data (name, email, etc.) from visitors and post it to back-end applications (PHP, CGI, ASP).
- Form Tag: <form action="URL" method="GET|POST">.
- Input Types:
* Text Input: <input type="text"> (single line).
* Password Input: <input type="password"> (masks characters).
* Text Area: <textarea> (multi-line input).
* Checkbox: <input type="checkbox"> (multiple selections allowed).
* Radio Button: <input type="radio"> (one selection from many).
* Select Box: <select> and <option> (drop-down list).
* File Upload: <input type="file">.
* Hidden Control: <input type="hidden"> (data not visible to user).
* Buttons: <input type="submit">, <input type="reset">, and <input type="button">.
Cascading Style Sheets (CSS)
- Purpose: Describes how HTML elements are displayed on screen or paper. Controls layout of multiple pages at once.
- Ways to Add CSS:
* Inline: Using the style attribute inside HTML elements.
* Internal: Using the <style> tag in the <head> section.
* External: Using a <link> tag in the <head> to reference a .css file.
- CSS Properties:
* Fonts: color, font-family, font-size (e.g., 300%).
* Box Model: border (e.g., 2px solid), padding (space between text and border), margin (space outside border).
JavaScript Overview and Syntax
- Definition: A popular independent scripting language used commonly for client-side validation. It is object-based.
- Comparison with Java:
* JavaScript: Traditionally runs inside a web page (in the browser), needs no compiler, loosely (dynamically) typed, object-based.
* Java: Stand-alone applications, requires compiler, strongly typed, object-oriented.
- Features: Developed by Netscape (originally named LiveScript). JScript is Microsoft's version. It handles dates, time, and events (onSubmit, onClick, etc.).
- Syntax Rules:
* Case-sensitive (variable a is not A).
* Semicolons (;) are optional but required for multiple statements on one line.
* Comments: // (single line), /* ... */ (multi-line).
- Variable Declaration: Declared using the var keyword. Keywords are lowercase. Variables are memory locations where data is stored.
- Functions: Declared using the function keyword. Can be called directly or via event handlers.
- Event Handlers: Not case-sensitive when written as HTML attributes. Examples: onclick, onmouseover, onkeydown, onload, onsubmit, onreset, onchange.
- Popup Boxes:
1. Alert: window.alert("message"). User must click OK to continue.
2. Confirm: window.confirm("message"). Returns true (OK) or false (Cancel).
3. Prompt: window.prompt("message", "default"). Returns input value or null.
- Validation:
* Server-side: Secure, works without JS, but slower response.
* Client-side: Immediate feedback, rich UI, relies on JS (can be bypassed if JS is off).
XML (Extensible Markup Language)
- Definition: Developed by the W3C starting in 1996 (XML 1.0 became a W3C Recommendation in 1998). Derived from SGML. Designed to store and transport data, not to display it.
- XML vs. HTML:
* XML stores data; HTML displays it.
* XML allows custom tags; HTML has built-in tags.
* XML is case-sensitive; HTML is not.
* XML attribute values must be quoted; in HTML the quotes are optional.
- Well-Formed XML: Must have a root element, matching closing tags, proper nesting, and quoted attributes.
- Building Blocks:
* Elements: Defined via <!ELEMENT name (content)>.
* #PCDATA: Parsed Character Data (scanned by the parser for markup and entities).
* CDATA: Character Data (not parsed; any tags inside it are treated as plain text).
* Attributes: Values are specified within quotes.
* Entities: Predefined references such as &lt; (<), &gt; (>), and &amp; (&).
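The well-formedness rules above (single root, matching closing tags, proper nesting, quoted attributes, case sensitivity) can be checked with the XML parser that ships with the JDK. The sketch below is illustrative only; the class name WellFormedCheck and the sample documents are made up for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: reports whether a string is well-formed XML.
class WellFormedCheck {
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the parser rejected the document
        } catch (Exception e) {
            throw new IllegalStateException(e); // configuration or I/O problem
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<note><to>Ann</to></note>"));  // proper nesting
        System.out.println(isWellFormed("<note><to>Ann</note></to>"));  // crossed tags
        System.out.println(isWellFormed("<note id=1></note>"));         // unquoted attribute
        System.out.println(isWellFormed("<Note></note>"));              // case mismatch
    }
}
```

Only the first sample passes; each of the others breaks exactly one of the rules listed above.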
XML DTD and Schemas
- Document Type Definition (DTD): Specifies rules for XML structure.
* Internal DTD: Declared within the XML file using <!DOCTYPE root [ ... ]>. The standalone attribute of the XML declaration is set to "yes".
* External DTD: Declared in a separate .dtd file. The standalone attribute is set to "no". Syntax: <!DOCTYPE root SYSTEM "file-name">.
- XML Schemas (XSD): Alternative to DTD, written in XML syntax. Supports namespaces and data types (string, date, boolean, integer).
* XSD Order Indicators: <all>, <choice>, <sequence>.
* XSD Occurrence Indicators: maxOccurs, minOccurs (e.g., maxOccurs="unbounded").
* XSD Group Indicators: group name, attributeGroup name.
- Schema vs. DTD: Schemas are richer, extensible, support data types and namespaces, and are written in XML.
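To see an internal DTD at work, the following sketch validates two small documents against the same inline <!DOCTYPE note [ ... ]> block using the JDK's validating parser. The note/to/body vocabulary and the class name DtdValidation are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example: validates a document against its internal DTD.
class DtdValidation {
    // Internal DTD: a note must contain exactly one <to> followed by one <body>.
    static final String PROLOG =
          "<?xml version=\"1.0\" standalone=\"yes\"?>\n"
        + "<!DOCTYPE note [\n"
        + "  <!ELEMENT note (to, body)>\n"
        + "  <!ELEMENT to (#PCDATA)>\n"
        + "  <!ELEMENT body (#PCDATA)>\n"
        + "]>\n";

    static boolean isValid(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // turn on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) throws SAXException {
                    throw e; // treat validity errors as failures
                }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid(PROLOG + "<note><to>Ann</to><body>Hi</body></note>"));
        System.out.println(isValid(PROLOG + "<note><body>Hi</body></note>")); // <to> missing
    }
}
```

Both documents are well-formed; only the first is valid, because the second omits the <to> element that the DTD requires.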
XML Parsers and DOM/SAX
- Parsers: Software that accesses or modifies XML data.
- DOM (Document Object Model): Tree-based approach. Loads the entire document into memory. Suitable for small documents requiring random access.
* Methods: getElementsByTagName(), appendChild(), removeChild(), getAttribute().
* Properties: nodeName, nodeValue, parentNode, childNodes.
- SAX (Simple API for XML): Event-based approach. Sequential, forward-only parsing. Low memory consumption; good for large documents.
* Events: startDocument, endElement, characters.
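The contrast between the two parsing styles may be easier to see side by side. In this sketch the same small document is read once through DOM (whole tree in memory, random access) and once through SAX (a stream of events). The <books> document and the class name ParserComparison are invented for the example; the parser classes themselves come from the JDK.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example comparing DOM and SAX on one sample document.
class ParserComparison {
    static final String XML =
        "<books><book id=\"b1\">DOM</book><book id=\"b2\">SAX</book></books>";

    /** DOM: build the whole tree, then query it with getElementsByTagName(). */
    static List<String> titlesWithDom(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList books = doc.getElementsByTagName("book");
            List<String> titles = new ArrayList<>();
            for (int i = 0; i < books.getLength(); i++) {
                Element book = (Element) books.item(i);
                titles.add(book.getAttribute("id") + "=" + book.getTextContent());
            }
            return titles;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** SAX: record each callback in the order the parser fires it. */
    static List<String> eventsWithSax(String xml) {
        List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override public void startDocument() { events.add("startDocument"); }
            @Override public void startElement(String uri, String local, String qName,
                                               Attributes atts) {
                events.add("startElement:" + qName);
            }
            @Override public void characters(char[] ch, int start, int length) {
                events.add("characters:" + new String(ch, start, length));
            }
            @Override public void endElement(String uri, String local, String qName) {
                events.add("endElement:" + qName);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(titlesWithDom(XML));
        eventsWithSax(XML).forEach(System.out::println);
    }
}
```

The DOM version can revisit any node after parsing, while the SAX version sees each element once and keeps only what the handler chooses to store, which is why SAX suits large documents.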
PHP (Hypertext Preprocessor)
- Introduction: Server-side scripting language created in 1994 by Rasmus Lerdorf. Embedded in HTML.
- Variable Rules: Names start with $, followed by a letter or underscore. Loosely typed (no declaration needed before use).
- Variable Scopes:
* Global: Declared outside a function. Accessed inside a function using the global keyword or $GLOBALS[index].
* Local: Declared inside a function.
* Static: A local variable that is not deleted after the function finishes executing; declared with the static keyword.
- Data Types: Scalar (Integer, Double, Boolean, String), Compound (Array, Object), Special (NULL, Resource).
- Operators:
* Arithmetic: +, -, *, /, %, ** (exponentiation).
* Comparison: ==, === (identical), !=, <>, !== (not identical), >, <, >=, <=.
* Logical: and, or, xor, !, &&, ||.
* String: . (concatenation), .= (concatenation assignment).
- Control Structures:
* Decision: if, else, elseif, switch.
* Loops: for, while, do...while, foreach (for arrays).
- Functions: Declared as function name(). Supports parameters, default values, and returning values. Passing by reference uses & (e.g., function addSix(&$num)).
Java Servlets and Web Servers
- Web Server: A computer running server software that answers HTTP requests, such as Apache Tomcat. Tomcat listens on port 8080 by default.
- Servlets: Server-side Java programs acting as a middle layer between browsers and databases. They handle requests via threads rather than forking processes.
- Servlet Lifecycle Methods:
1. init(): Called once when the servlet is created.
2. service(): Dispatches requests to doGet, doPost, etc.
3. doGet(): Handles HTTP GET requests (data is sent in the URL, so its length is limited; 1024 characters is a commonly quoted figure, but the actual limit depends on the browser and server).
4. doPost(): Handles HTTP POST requests (data in request body).
5. destroy(): Called once at the end of the lifecycle for cleanup.
- Servlet API Packages:
* javax.servlet: Generic (protocol-independent) classes and interfaces.
* javax.servlet.http: HTTP-specific classes and interfaces.
- Session Tracking: Techniques to maintain state over stateless HTTP.
* Cookies: Small pieces of text data stored on the client machine by the browser. Methods include addCookie() (on the response), getCookies() (on the request), and setMaxAge() (on the cookie).
* HttpSession: Maintains user data across multiple requests. Methods include setAttribute(), getAttribute(), invalidate().
Database Access (JDBC)
- JDBC (Java Database Connectivity): API to interact with databases.
- Driver Types:
* Type 1: JDBC-ODBC Bridge (uses ODBC drivers).
* Type 2: Native-API (converts calls to native C/C++ API).
* Type 3: Network-Protocol (three-tier, middleware approach).
* Type 4: Native-Protocol (Pure Java, communicates directly with DB socket).
- JDBC Workflow:
1. Load Driver: Class.forName("driver_name").
2. Establish Connection: DriverManager.getConnection(url, user, pass).
3. Create Statement: con.createStatement().
4. Execute Query: stmt.executeQuery("SELECT ...") or stmt.executeUpdate("UPDATE/INSERT ...").
5. Process Results: ResultSet stores database results, accessed via rs.next(), rs.getString(), etc.
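The five workflow steps above can be sketched in one method using only the java.sql API. Everything specific here is an assumption for illustration: the class name JdbcWorkflow, the students table, its name column, and the JDBC URL passed in by the caller. Running it against a real database also requires that database's driver on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example of the JDBC workflow; table and column names are made up.
class JdbcWorkflow {
    static List<String> fetchNames(String url, String user, String pass)
            throws SQLException {
        List<String> names = new ArrayList<>();
        // Step 1: JDBC 4+ drivers on the classpath register themselves;
        // older drivers need an explicit Class.forName("driver_name") first.
        try (Connection con = DriverManager.getConnection(url, user, pass); // step 2
             Statement stmt = con.createStatement();                        // step 3
             ResultSet rs = stmt.executeQuery("SELECT name FROM students")) { // step 4
            while (rs.next()) {                 // step 5: walk the result rows
                names.add(rs.getString("name"));
            }
        } // try-with-resources closes ResultSet, Statement and Connection
        return names;
    }
}
```

If no registered driver recognizes the URL, DriverManager.getConnection throws an SQLException, so callers should be prepared to catch it.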
JSP (Java Server Pages)
- Introduction: Extension of servlets separating logic from presentation. JSP is effectively Java code embedded in HTML.
- JSP Elements:
* Expressions: <%= expression %>. Evaluated at runtime and converted to a string.
* Scriptlets: <% java code %>. Placed in the _jspService method.
* Directives: <%@ directive ... %>. Global instructions (e.g., page, include, taglib).
* Declarations: <%! code %>. Used for defining class-level variables and methods.
- Implicit Objects: Built-in objects provided by the container (request, response, session, out, application, config).
- MVC Architecture:
* Model: Business logic and data manipulation.
* View: Presentation/Display (JSP).
* Controller: Request processing (Servlets).
JavaBeans
- Definition: Reusable software components in Java.
- Conventions:
* Must have a no-argument constructor.
* Must be serializable.
* Must provide getter and setter methods for each property (getXxx and setXxx for a property named xxx).
- Features: Portable, supports introspection (exposing properties/methods), customization, and persistence via Object Serialization.
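A small bean that follows the three conventions might look like the sketch below. The StudentBean class, its name and rollNumber properties, and the roundTrip helper are hypothetical; the helper is there only to show persistence through Object Serialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical bean illustrating the JavaBeans conventions.
class StudentBean implements Serializable {          // convention: serializable
    private static final long serialVersionUID = 1L;

    private String name;
    private int rollNumber;

    public StudentBean() { }                          // convention: no-argument constructor

    // convention: a getter/setter pair per property
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public int getRollNumber() { return rollNumber; }
    public void setRollNumber(int rollNumber) { this.rollNumber = rollNumber; }

    /** Serializes the bean to bytes and reads it back as a new object. */
    static StudentBean roundTrip(StudentBean bean) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(bean);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (StudentBean) in.readObject();
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        StudentBean original = new StudentBean();
        original.setName("Asha");
        original.setRollNumber(17);
        StudentBean copy = roundTrip(original);
        System.out.println(copy.getName() + " " + copy.getRollNumber());
    }
}
```

The restored copy is a distinct object carrying the same property values, which is what lets containers and tools save a bean's state and recreate it later.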