Before we go any further, I think it is important to define exactly what HTML is. HTML stands for HyperText Markup Language. Many people believe that HTML is a programming language. This is wrong. HTML is a markup language. More specifically (with more recent specifications of the language), it is a structural markup language, which means that it is used to describe the structure of a document by marking up sections of the document. We will, of course, cover all the aspects of this over the course of this guide.
HTML is viewed in a web browser. The most popular web browsers of today include: Google Chrome, Mozilla Firefox, Microsoft Internet Explorer, Opera, and Apple Safari, among others. These browsers all have their own way of displaying (or, in computer jargon, rendering) web pages. The way browsers do this is by using engines. Firefox uses an engine called Gecko, Microsoft’s engine is called Trident, Opera uses the Presto engine, while Chrome and Safari use an engine called WebKit. So why are there so many different ways of rendering a webpage? For that, we need to learn a little bit of the history of HTML.
HTML was created in late 1990 by physicist Tim Berners-Lee, who was working at CERN at the time. What set HTML apart from other markup languages of the day was its use of the hyperlink, which enabled one document to link to another within the text of the document. This is the reason why the first part of HTML is HyperText. Originally, HTML was completely unstandardized, and it wasn’t until 1995 that a standardized version of the language was published (HTML 2.0). The World Wide Web Consortium (W3C) was founded in 1994 and later took over maintenance of HTML, along with other web standards.
HTML, meanwhile, sat at version 4.01 for a long time, and was replaced in most cases by XHTML (a syntactically stricter reformulation of HTML based on XML, which will be covered in a future guide), which was released in 2000. It was not until 2004 that a new group calling itself the Web Hypertext Application Technology Working Group (WHATWG) began to revive the HTML standard and started working on what became HTML5. In 2007 the W3C, which was working on XHTML 2.0 at the time (since abandoned), adopted the WHATWG's work as the basis for its own HTML5 specification, and it published a first draft in 2008. As of the time of this article, HTML5 is in the final stages of ratification by the W3C and is widely supported already, although some of its features are still being adopted by some web browsers. It is HTML5 that this guide will focus on.
At their core, all HTML files are essentially text documents. What distinguishes them from ordinary text files is their extension. While text files have the extension .txt, HTML files have the extension .html (or the more old-fashioned .htm). This extension tells the browser that the document is to be interpreted as an HTML document.
As mentioned previously, HTML files are text documents. You could, if you wanted to, type only text into a text editor (such as Notepad), save it as a file with the extension .html, and it would be considered an HTML document. However, this document would not be considered well-formed because it lacks certain features which all HTML documents should have according to the standard.
All well-formed HTML documents use markers called tags. These come in three main varieties: normal, self-closing, and special. Tags are recognized by their use of angle brackets (otherwise known as the less-than and greater-than symbols), < and >. Tags are closed with a forward slash (/), indicating that the tag is finished. In this guide, all tags are typed in lowercase. HTML5 itself does not care about the case of tag names, but lowercase is the prevailing convention (XHTML required it, while older HTML code was often written in uppercase).
Normal tags surround text, which can, but does not have to, contain other tags. They have two parts: an opening tag and a closing tag. Here is an example of a normal tag, in this case denoting a paragraph:
<p>This is a paragraph.</p>
As you can see, it has two parts to it, the opening tag (<p>) and the closing tag (</p>). The text (or code) in the middle will be rendered according to the rules governing the tag.
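To make the point about tags containing other tags a little more concrete, here is a small sketch. It assumes the <em> tag (emphasis), which has not been introduced in this guide yet and is used here only for illustration. The line starting with <!-- is a comment, which is explained further down.

```html
<!-- A paragraph (a normal tag) that contains another normal tag, <em>. -->
<p>This is a paragraph with <em>emphasized</em> text inside it.</p>
```

Notice that the inner tag is closed before the outer one. Tags should be closed in the reverse order in which they were opened.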
Most elements are written as normal tags (above). XHTML popularized a shorter form for some elements, called a self-closing tag. It is used for elements which do not need to contain any text within the tags (stand-alone), such as images. Here is an example of a self-closing tag, denoting a line break:
<br />
This tag is closed by using a / before the final >. According to the newest specification of HTML (HTML5), it doesn’t matter if stand-alone tags are closed or not. However, for my money, it is easier to include the slash because it stands out more when reading over your code (and I was taught to code using XHTML, which requires it). Forms such as <br> will therefore be avoided in this guide, but you may use them if you so choose. They are supported in all major browsers.
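As a rough side-by-side sketch, here are both spellings of the line break, followed by an image, the other stand-alone element mentioned above. The file name photo.jpg and the description text are made up for illustration, and the src and alt attributes are not covered in this part of the guide.

```html
<!-- Line break written in the self-closing style used in this guide. -->
<p>First line<br />Second line</p>

<!-- The same line break without the slash; HTML5 accepts this form too. -->
<p>First line<br>Second line</p>

<!-- An image is also stand-alone. "photo.jpg" is a made-up file name. -->
<img src="photo.jpg" alt="A short description of the photo" />
```

Both paragraphs should render identically, with the text split over two lines.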
Special tags are… well, special. They do not follow the same rules as other tags. There are two main types of special tags. The first is the comment tag. Comment tags insert comments, which contain text which is not displayed on the rendered document. Comments are used in this way:
<!-- This text will not be displayed. --> But this text will.
All text between <!-- and --> will be ignored by the browser when rendering the page. So what is the use of it then? Simply, to make the code easier to read. For example, if you have a section of your document that shows a table of contents, you might write <!--Table of Contents--> before the section to let someone else (or yourself, looking over old code) know what that section of the document displays.
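Here is one way the table of contents example might look in practice. The paragraph text is invented for illustration. The sketch also shows a comment that spans more than one line.

```html
<!-- Table of Contents -->
<p>Part 1: What is HTML?</p>
<p>Part 2: The parts of an HTML document</p>

<!--
  A comment can also span several lines.
  None of this text appears on the rendered page.
-->
<p>This paragraph is displayed as usual.</p>
```

Only the three paragraphs would appear in the browser. Keep in mind that comments are still visible to anyone who views the source of the page, so they are no place for secrets.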
The other main type of special tag is the <!DOCTYPE> tag, which will be described in the next part of this guide.
- HTML is a markup language, not a programming language.
- HTML documents are rendered in web browsers using engines.
- Browsers use the .html extension to identify an HTML document.
- Well-formed HTML documents use tags to separate parts of the document.
- There are 3 types of tags: normal, self-closing and special.
- One type of special tag is the comment, which makes the browser ignore its text.
Check back soon for Part 2, when we look at the parts of a well-formed HTML document.