WWW and Semantic Web
Chapter 3: WWW and Semantic Web
Introduction to the World Wide Web: components (URIs, HTTP, HTML)
The World Wide Web (WWW) is a network framework for distributing and retrieving information resources. It relies on the following three elements:
A uniform naming scheme for locating resources on the Web (e.g., URIs).
An access protocol to named resources over the Web (e.g., HTTP).
A document standard to easily navigate among resources (e.g., HTML).
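As a small illustration of the naming scheme, the sketch below splits a URL into its parts with Python's standard urllib.parse module. The URL itself is an assumption made up for this example (www.example.com is a placeholder host); only the path is borrowed from the request line shown later in these notes.

```python
from urllib.parse import urlsplit

# Hypothetical URL, used only for illustration.
url = "http://www.example.com/path/to/file/index.html"

parts = urlsplit(url)

print(parts.scheme)  # the access protocol named by the URL
print(parts.netloc)  # the host name
print(parts.path)    # the local path of the resource on that host
```

The scheme tells the client which access protocol to use, the host name says where to send the request, and the path is what ends up in the HTTP request line.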
HTTP: Hypertext Transfer Protocol
Initial request line, initial response line (status line)
Initial Request Line
• A request line has three parts, separated by spaces: a method
name, the local path of the requested resource, and the
version of HTTP being used.
• A typical request line is:
GET /path/to/file/index.html HTTP/1.0
• GET (uppercase) is the most common HTTP method; it
says "give me this resource". Other methods include
POST and HEAD.
• The path is the part of the URL after the host name, also
called the request URI (a URI is like a URL, but more
general).
• The HTTP version always takes the form "HTTP/x.x",
uppercase.
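The three-part structure described above can be sketched in a few lines of Python. This is a minimal illustration, not a complete HTTP parser; the names RequestLine and parse_request_line are hypothetical and chosen for this example only.

```python
from typing import NamedTuple


class RequestLine(NamedTuple):
    method: str   # e.g. GET, POST, HEAD
    path: str     # local path of the requested resource
    version: str  # always of the form HTTP/x.x


def parse_request_line(line: str) -> RequestLine:
    """Split a request line into its three space-separated parts."""
    parts = line.strip().split(" ")
    if len(parts) != 3:
        raise ValueError(f"expected 3 parts, got {len(parts)}: {line!r}")
    method, path, version = parts
    if not version.startswith("HTTP/"):
        raise ValueError(f"unexpected HTTP version: {version!r}")
    return RequestLine(method, path, version)


request = parse_request_line("GET /path/to/file/index.html HTTP/1.0")
print(request.method)
print(request.path)
print(request.version)
```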
Initial Response Line (Status Line)
• The initial response line has three parts separated by spaces:
the HTTP version, a response status code that gives the
result of the request, and an English reason phrase
describing the status code.
• Typical status lines are:
HTTP/1.0 200 OK
or
HTTP/1.0 404 Not Found
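A status line can be split the same way, with one difference: the reason phrase may itself contain spaces, so only the first two spaces act as separators. The sketch below assumes a well-formed status line; parse_status_line is a hypothetical helper name.

```python
from typing import Tuple


def parse_status_line(line: str) -> Tuple[str, int, str]:
    """Return (HTTP version, status code, reason phrase)."""
    # maxsplit=2 keeps a multi-word reason phrase such as "Not Found" intact.
    version, code, reason = line.strip().split(" ", 2)
    return version, int(code), reason


print(parse_status_line("HTTP/1.0 200 OK"))
print(parse_status_line("HTTP/1.0 404 Not Found"))
```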
HTML, How to add a tag
What is a Web Page?
• Web pages are text files containing HTML
• HTML: HyperText Markup Language
• A notation for describing
• document structure (semantic markup)
• formatting (presentation markup)
• When rendered, a page looks like a formatted document (for example, a Microsoft Word document)
• The markup tags provide information about the page content
structure
HTML Structure
• HTML files are just normal text files.
• They usually have the extension of .htm, .html, or .shtml.
• HTML documents have two parts: the head and the body.
• The head of the document contains the document's title
and similar information, and the body contains most
everything else.
HTML is composed of “elements” and “tags”
• An HTML document begins with <html> and ends with </html>
• Elements (tags) are nested one inside another.
• Tags have attributes:
<img src="logo.jpg" alt="logo" />
• HTML describes structure using two main sections: <head>
and <body>
<html> <head></head> <body></body> </html>
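To make nesting and attributes concrete, the sketch below walks a tiny document with Python's standard html.parser module and records each tag's nesting depth and attributes. The markup combines the skeleton and the img tag shown above; the class name OutlineParser is an assumption for this example.

```python
from html.parser import HTMLParser


class OutlineParser(HTMLParser):
    """Record (depth, tag, attributes) for every element encountered."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.entries = []

    def handle_starttag(self, tag, attrs):
        self.entries.append((self.depth, tag, dict(attrs)))
        self.depth += 1  # children of this element are one level deeper

    def handle_endtag(self, tag):
        self.depth -= 1

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <img ... /> have no children.
        self.entries.append((self.depth, tag, dict(attrs)))


markup = '<html> <head></head> <body><img src="logo.jpg" alt="logo" /></body> </html>'

outline = OutlineParser()
outline.feed(markup)
outline.close()

for depth, tag, attrs in outline.entries:
    print("  " * depth + tag, attrs)
```

The indentation of the printed outline mirrors the nesting: head and body sit inside html, and img sits inside body.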
HTML Tags
• HTML tags are used to mark-up HTML elements
• HTML tags are surrounded by the two characters < and >
• The surrounding characters are called angle brackets
• HTML tags normally come in pairs like <b> and </b>
• The first tag in a pair is the start tag, the second tag is the end
tag
• The text between the start and end tags is the element
content
• HTML tags are not case sensitive; <b> means the same as <B>
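The start tag, element content, and end tag can be observed directly with Python's standard html.parser module, which also reports tag names in lowercase regardless of how they were written. This is a minimal sketch; TagCollector and tag_events are hypothetical names used only here.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collect start tags, element content, and end tags in order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_data(self, data):
        self.events.append(("content", data))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


def tag_events(markup):
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.events


lower = tag_events("<b>bold text</b>")
upper = tag_events("<B>bold text</B>")

print(lower)
print(lower == upper)  # the parser treats <b> and <B> as the same tag
```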
Semantic Web
Introduction to Semantic Web
The Semantic Web is an extension of the current World Wide Web, designed to integrate various resources and devices to facilitate knowledge sharing among machines.
Its purpose is to achieve economies of scale by enabling machines to process, integrate, and analyze information autonomously.
By specifying resources, concepts, and knowledge in a machine-readable format,
the Semantic Web allows machines to acquire and share knowledge, make decisions, and derive insights.
Examples of Semantic Web applications include intelligent personal assistants and improved search engines that understand context and intent.
HTML, XML, RDF and OWL: a progressive comparison
HTML, XML, RDF and OWL
• HTML
• HTML stands for Hyper Text Markup Language
• An HTML file is a text file containing small markup tags
• The markup tags tell the Web browser how to display the
page
• XML
• XML stands for eXtensible Markup Language
• XML is a markup language much like HTML
• XML was designed to carry data, not to display data
• XML tags are not predefined. You must define your own
tags
• XML is designed to be self-descriptive
• XML is a W3C Recommendation
RDF
• RDF stands for Resource Description Framework
• RDF is a framework for describing resources on the web
• RDF provides a model for data, and a syntax so that
independent parties can exchange and use it
• RDF is designed to be read and understood by computers
• RDF is not designed for being displayed to people
• RDF is commonly written in XML (the RDF/XML syntax), although other syntaxes such as Turtle also exist
• RDF is a part of the W3C's Semantic Web Activity
• RDF is a W3C Recommendation
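The data model behind RDF is a set of statements, each a triple of subject, predicate, and object. The sketch below mimics that model with plain Python tuples rather than a real RDF library, so it illustrates the idea only and is not RDF syntax. Every URI and the helper name objects_of are assumptions invented for this example.

```python
# Each statement is a (subject, predicate, object) triple.
# All URIs below are hypothetical placeholders.
PAGE = "http://www.example.com/path/to/file/index.html"
AUTHOR = "http://www.example.com/terms/author"
TITLE = "http://www.example.com/terms/title"

triples = [
    (PAGE, AUTHOR, "Tom"),
    (PAGE, TITLE, "Reminder"),
]


def objects_of(subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [o for (s, p, o) in triples if s == subject and p == predicate]


print(objects_of(PAGE, AUTHOR))
print(objects_of(PAGE, TITLE))
```

Because subjects and predicates are named by URIs, independent parties can publish statements about the same resource and a program can merge them without human help.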
OWL
• OWL stands for Web Ontology Language
• OWL is built on top of RDF
• OWL is for processing information on the web
• OWL was designed to be interpreted by computers
• OWL was not designed for being read by people
• OWL is commonly written in XML (as RDF/XML)
• OWL has three sublanguages: OWL Lite, OWL DL, and OWL Full
• OWL is a W3C Recommendation
XML vs. HTML
Weakness of HTML
• HTML is not a suitable language for making data
meaningful to computer programs.
• This is a serious shortcoming because the whole business
world (banking, insurance, retail, etc.) is dependent on
computer programs interpreting data.
Definition of XML
• XML stands for eXtensible Markup Language. XML was
designed for carrying and storing data, whereas HTML
was designed to display data.
• XML has the following features:
• XML is a markup language much like HTML.
• XML tags are not predefined. You must define your
own tags.
• XML is designed to be self-descriptive
Advantages of XML
• Data awareness: since XML is self-describing, programs
that process XML documents can act more "intelligently".
• Independence of communicating parties: XML is
independent of all machines, operating systems,
programming languages and databases.
• Standard language: many industry sectors and professional
groups use XML to define standard languages and
vocabularies for sharing data (e.g.,
Mathematical Markup Language (MathML), Open
Financial Exchange (OFX), etc.).
XML Document Structure
• XML documents form a tree structure that starts at "the
root" and branches to "the leaves".
• An Example XML Document:
<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
<to>Mary</to>
<from>Tom</from>
<heading>Reminder</heading>
<body>Don't forget the meeting on this Friday!</body>
</note>
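The tree structure of this document can be inspected with Python's standard xml.etree.ElementTree module. The sketch below parses the note example and prints the root followed by its leaves; the variable names are assumptions for this illustration.

```python
import xml.etree.ElementTree as ET

# The example document, encoded to bytes to match its declared encoding.
xml_bytes = """<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
<to>Mary</to>
<from>Tom</from>
<heading>Reminder</heading>
<body>Don't forget the meeting on this Friday!</body>
</note>
""".encode("iso-8859-1")

root = ET.fromstring(xml_bytes)
print(root.tag)  # the root of the tree

# The children of the root are the leaves in this small document.
leaves = {child.tag: child.text for child in root}
for tag, text in leaves.items():
    print(" ", tag, "->", text)
```

Because the tag names (to, from, heading, body) describe the data they carry, a program can pick out the recipient or the message text by name rather than by position.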