
Get Up to Speed With HTML - Basix


We'll soon be publishing our first HTML5 tutorials here at Activetuts+, but before we start, here's a fast and easy tutorial to get up to speed on the basics of HTML - even if you've never done any before.

The Difference Between HTML and .html

Create a new file, anywhere on your computer, and call it page.html. In Windows, the easiest way to do that is to right-click an empty spot within a folder, click New > Text Document, then make sure you delete the ".txt" at the end of the filename. Windows will probably tell you that it's dangerous to change the file extension like this; just click OK. Alternatively, you can open a text editor (like Notepad or TextEdit), click File > New, and enter page.html as the filename when you save.

Once you've created the file, open it in a text editor; there are plenty of fancy editors that offer all sorts of HTML-friendly features, but for this tutorial, the aforementioned Notepad or TextEdit will do just fine. (If you just double-click the file, it'll probably open in your browser, so you'll either need to open the text editor first and use File > Open to select the file, or (in Windows) right-click the file and choose Open With > Notepad.) Make sure you use a simple text editor rather than a word processor.

Enter this in your file:
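Going by the text the tutorial quotes later on, a single line is all the file needs at this point (no tags yet, just the words):

```html
Is this HTML?
```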

Save it, and open it in your browser - or just click here to open mine. It'll look something like this:

[Image: HTML beginner tutorial guide]

So there you go! You've created a web page - a .html page - and it displays just fine, with no error messages. That's it, end of the tutorial, thanks for reading.

Just kidding, of course.

Browsers Aren't Psychic

Here's an experiment: take an image file, copy it (so that you don't damage the original image), and rename the copy to something.html. Then, try to open this with your browser. Click here to open the screenshot above, after it's been renamed from .png to .html, in your browser.

Yikes. That's a lot of gobbledegook. But if you open the original .png file in your browser, it'll load just fine. Your browser can obviously cope with these files, so what gives?

When you open a .html file with your browser, it says, "Hey, I know how to deal with .html files!" - it assumes that the contents of the file are written using HyperText Markup Language (HTML for short, as you'll have guessed), and tries to use everything it knows about this markup language to display the page. Image files aren't written using HTML, so when it tries to display something that makes any sense using its HTML rules, it fails miserably.

There are other file extensions that browsers automatically associate with HTML - like .htm - and browsers can also be told to assume that other types of files will contain HTML - like ones ending in .php or .aspx.

So, not every file that's written in HTML will end in .html. But does this mean that "Is this HTML?" is perfect HTML in itself? Well... not exactly.

People Make Mistakes

While the actual contents of an image file are usually generated by a program like Paint or Photoshop based on a user's input, the contents of a .html file may be typed directly into the file by the user. And users, being human, make mistakes.

Browsers, in general, err on the side of forgiveness; rather than pettily refusing to display a page if the user has made even a simple mistake - like certain overzealous English teachers - they'll try to guess what the user meant and display the page to the best of their abilities.

"Is this HTML?", you will be shocked to hear, is not ideal HTML, in that it does not contain all of the information a browser would like to see - but it'll be displayed anyway.

This means, then, that files containing HTML don't always end in .html, and files ending in .html don't always contain perfect HTML. To confuse the issue even further, sometimes you won't even save a file at all, but rather type HTML straight into a big text box, the contents of which a computer will later insert in the middle of a bigger HTML file, like I'm doing now:

[Image: HTML beginner tutorial guide]

What Makes a "Proper" HTML Page?

All right, so a single line of English doesn't count as by-the-book HTML, that's no surprise. What should we add to our simple file, then?


Tags are the most important element of HTML. A bit like how blog post tags identify the topics of articles, or hashtags identify which hilarious meme a Tweet is joining in on, HTML tags identify something about whatever it is they're tagging. This is vague, I know, but you'll see why.

For example, in our current page, what is "Is this HTML?"? It's text, sure. Specifically, we could say it's a paragraph of text. And in order to show this, we can tag the paragraph by wrapping it in <p> tags (p for paragraph - and no, before you ask, there's no "sentence" tag):
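Wrapped in its opening and closing tags, the line from the file would read:

```html
<p>Is this HTML?</p>
```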

Each tag starts with an opening chevron < and ends with a closing chevron >, with the name of the tag between the chevrons. Note that the second tag has a slash / directly before the tag name; this indicates that it's a closing tag, and therefore marks the end of the paragraph started by the opening <p> tag. The whole paragraph (including tags), <p>Is this HTML?</p> can be called a p-block.

None of this makes the page display any differently, though; the browser was displaying the contents of the original file as if they were in a paragraph anyway. Some tags that will change the appearance of the text are b - for bold - and i for italic. Try this:
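One plausible way to try both tags at once is sketched here; which word gets which tag is a guess rather than something the text pins down:

```html
<p>Is <i>this</i> <b>HTML</b>?</p>
```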

We've got tags within tags now: the italic-tagged text and the bold-tagged text are both inside the paragraph block; this is called a hierarchical structure. It's like a tree, with the p being the trunk, and b and i being branches. I hope you can see that this wouldn't make sense:
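As a sketch of what "not making sense" could look like, here the b tag opens inside the i-block but closes outside it, so the two branches cross (the particular tags crossed are an illustrative choice):

```html
<p>Is <i>this <b>HTML</i></b>?</p>
```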

...although, your browser would actually display that, because it is so lenient.

Anyway, save this and open it in your browser:

Click here to see mine. You'll see that it displays with bold and italic text, like this:

Is this HTML?

Any experienced HTML developers reading this over your shoulder will be furiously scratching themselves and calling me a fool for using these exact tags - you'll see why soon - but just ignore them for the moment.

Okay, now compare these three snippets:
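A sketch of three such snippets follows. The second paragraph's wording is invented for illustration, and the formatting tags are left out to keep the comparison easy to read:

```html
<!-- Snippet 1: one paragraph per line -->
<p>Is this HTML?</p>
<p>Yes, this is.</p>

<!-- Snippet 2: both paragraphs on a single line -->
<p>Is this HTML?</p><p>Yes, this is.</p>

<!-- Snippet 3: several blank lines between the paragraphs -->
<p>Is this HTML?</p>



<p>Yes, this is.</p>
```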

How do you think they'll each display?

The obvious answer is that the first will display the two paragraphs on separate lines, one after the other; the second will display the two paragraphs on one line; and the third will display them on separate lines with a huge gap between them.

This isn't true.

All three cases will display in the same way, like so.

Why? Because HTML doesn't care about carriage returns (new lines). It has a rule that says, by default, "paragraph blocks start on new lines". Even this will display in the same way:
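For instance, assuming the same two paragraphs (the second one is made up), new lines dropped into the middle of the text make no difference either:

```html
<p>Is
this
HTML?</p><p>Yes,
this is.</p>
```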

While we're at it, so will this:
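A sketch with runs of extra spaces, again using an invented second paragraph:

```html
<p>Is      this      HTML?</p>
<p>Yes,   this   is.</p>
```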

HTML does care about spaces, but only one at a time. If there are two or more in a row, they get condensed down to one.

Why? Well, it's really part of a bigger theme that explains your developer friend's itchiness:

Content, Not Presentation

Whenever a post on any Tuts+ site is tagged "Basix", our WordPress software automatically adds a tiny speech bubble with a "b" inside over the top of the image at the top of the post. Similarly, my Twitter client is configured so that any Tweets that have been hashtagged "#bieberfever" are displayed in giant bold red text so that I don't miss them.

But neither Basix nor #bieberfever implies anything about the presentation of the thing it is attached to; Basix says "this tutorial is written for beginners", and #bieberfever says "this Tweet is about Justin Bieber". They each imply something about the content. Their presentations change because of external rules that decide how certain types of content should be displayed.

HTML follows the same ideals, and that's why the b and i tags are looked down upon. They only exist because when HTML was invented (about 18 years ago), the creators hadn't decided on this "separate content from presentation" mindset, so browsers have (again, due to their lenience) continued to display bold and italic tags ever since.

Still, this doesn't mean that you can't use italic or bold text in your HTML files! No, you just have to use different tags, which identify something about the content rather than the presentation. See, when you add italics to a section of text, you're trying to emphasize the content - so instead of using an i tag, you should use an em tag. And when you make some text bold, you're trying to strengthen it on the page - so you should use a strong tag.

Your HTML should therefore look like this:
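Assuming the page holds two short paragraphs (the second is invented here), swapping i for em and b for strong would give something like:

```html
<p>Is <em>this</em> <strong>HTML</strong>?</p>
<p>Yes, <em>this</em> is.</p>
```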

By default, your browser will display that in exactly the same way as it did when you used old b and i tags. You may well ask why we'd bother changing it, then - the answer lies in the phrase "by default".

Individual web pages - even individual paragraphs - can tell the browser to display em and strong tags differently: suppose you decide that a dotted underline is a better way of strengthening the text, and small caps are a better way of showing emphasis. You can reconfigure em to do one and strong to do the other on your site, and the appearance of all the text will change, without you having to alter the textual content at all. But it wouldn't make any sense to have the b-for-bold tag display a dotted underline, would it?

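There are a few other presentational tags, like font and center, that have been dropped from the standard altogether (said to be "deprecated" or "obsolete"); b and i themselves survive in HTML5, though with narrower meanings than plain "bold" and "italic". You can find out more about any particular tag on W3Schools.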

I'll explain a bit more about changing the appearance of web pages later on. I can't explain everything, as it's a huge subject, with a Tuts+ site all to itself.

Required Tags

Okay, so we've got a basic HTML page that displays correctly, and contains no deprecated tags. Great! But there are some tags that every HTML page requires, officially. Let's add them one by one.
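With all of them in place, the page might end up looking like the sketch below; each tag is explained in turn afterwards. The title text is the one the tutorial quotes, while the second paragraph's wording is an assumption.

```html
<html>
<head>
<title>This is an HTML page</title>
</head>
<body>
<p>Is <em>this</em> <strong>HTML</strong>?</p>
<p>Yes, <em>this</em> is.</p>
</body>
</html>
```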

The html tag says, "hey, browser, this is an HTML document!" To which the browser would presumably respond, "yeah, thanks, I'd guessed that!" if it were a sarcastic human being rather than a piece of software. All of the document should go inside the html tag; to go back to the tree metaphor, html should always be the trunk.

The body contains the actual content of the page. "As opposed to what?" As opposed to...

...the head tag, which contains information about the page, like...

...the title. If you look at the title bar of your browser with your old HTML page loaded, it'll just say something like "page.html" (or maybe "Untitled", or the name of the browser). If you use the code above, though, it'll say "This is an HTML page". Note that "This is an HTML page" does not appear anywhere in the actual content of the page, though: this illustrates the difference between the head and the body.

That HTML is getting a little hard to read now, so you might prefer to use the Tab key to indent the lines:
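A sketch of the same kind of page with one level of indentation per level of nesting (the paragraph text is illustrative):

```html
<html>
    <head>
        <title>This is an HTML page</title>
    </head>
    <body>
        <p>Is <em>this</em> <strong>HTML</strong>?</p>
        <p>Yes, <em>this</em> is.</p>
    </body>
</html>
```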

Indenting lines like this also helps illustrate that tree-like structure. If you accidentally copied and pasted a tag at the wrong level, it'd be really obvious, like this:
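Here is a sketch of such a slip, with an invented second paragraph pasted after the body has already closed:

```html
<html>
    <head>
        <title>This is an HTML page</title>
    </head>
    <body>
        <p>Is <em>this</em> <strong>HTML</strong>?</p>
    </body>
        <p>Yes, <em>this</em> is.</p>
</html>
```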

That second paragraph is clearly out of place.


I've mentioned a few times that browsers are very forgiving when trying to display broken (or non-standard) HTML. This is a blessing and a curse.

It means that sloppy HTML will still be displayed, which is great for anybody that ever makes a mistake (i.e. the whole of humankind) - but different browsers will display non-standard HTML in different ways.

It means that the people that make the browsers can come up with all sorts of new tags and features that aren't part of standard HTML but that do display in their browser, which is great for driving the power of what can be done with HTML forward, beyond the official standards - but not all browsers will support the same set. (For instance, years and years ago, Netscape Navigator allowed you to create blinking text using the <blink> tag, while Internet Explorer allowed you to create text that scrolled across the screen using the <marquee> tag; neither tag was supported by the other browser, and both effects were very annoying.)

It means that browser developers broke the rules of (or flat out ignored) some areas of the official HTML standards, so that there was a difference between how a page should display, and how it actually looked.

In a nutshell, it means that, even today, different browsers display the same pages differently.

To attempt to get around this we have doctypes. For example, stick this at the top of your page (even before the html tag):

...and the browser will display the page in "quirks mode", meaning, "using the non-standard practices from the late 90s". Replace it with this:

...and it'll display it in "standards mode", meaning, "using the official HTML standards". Except, some older browsers that are still in use, like Internet Explorers 6 and 7, still don't fully follow all the standards. And different browsers still each have their own additional features.
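To make the two modes concrete, here are two well-known HTML 4.01 doctypes that trigger them. Whether these are the exact ones the tutorial originally showed is an assumption; only one of them would go at the top of a real page.

```html
<!-- Alternative 1: Transitional doctype with no URL, which puts browsers into quirks mode -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<!-- Alternative 2: Strict doctype with its URL, which selects standards mode -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
```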

Doctypes are a mess - there are lots and lots of them - but fortunately we don't really have to worry about them, here at Activetuts+, because we're focusing exclusively on HTML5. And HTML5 has only one doctype: <!DOCTYPE html>

Ahhh. Okay, sure, older browsers don't really know what to do with that, but they can't handle HTML5 anyway, so it doesn't matter to us. We've just got to hope that today's browsers stick to the HTML5 standards and don't get into that old mess again.

So now you should edit your HTML page to add a doctype:
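A sketch of the page with the HTML5 doctype on its first line (the body content is illustrative):

```html
<!DOCTYPE html>
<html>
    <head>
        <title>This is an HTML page</title>
    </head>
    <body>
        <p>Is <em>this</em> <strong>HTML</strong>?</p>
        <p>Yes, <em>this</em> is.</p>
    </body>
</html>
```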

It's a weird tag, because it doesn't actually enclose anything (there's no </!DOCTYPE html>), it's got a space in it, and it's partly in capitals. That's because it's not a tag at all, it just looks a bit like one because of the chevrons. Don't worry about it.


Okay, we've used this Language to Markup some Text, but what the heck makes that Text so "Hyper", the H of HTML?

Hypertext is defined as text, displayed on some sort of electronic device, with hyperlinks to other pieces of text. (The term was coined in the 60s, which explains why it sounds so corny.) And you know what hyperlinks are: bits of text that link to other pages when you click them, like this.

We create hyperlinks using the a tag - for anchor, because you use it to anchor a URL to some text. No, I don't know why they didn't use h or l.

But this alone isn't enough:
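That is, a bare a tag around a word, as in this sketch (the paragraph itself is a made-up example):

```html
<p>Is <a>this</a> a link?</p>
```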

I mean, how could it? There's no URL there.

We have to use an attribute of the a tag, like so:

We've added an href (hypertext reference) attribute to the a tag, which will allow it to hold a URL (though we haven't specified the URL yet). Note that it's still an a tag, not an a href tag, and that the closing tag is still </a>, not </a href>; this is why tag names are always a single word, with no spaces.

To set the href's value to a specific URL, we use an equals sign, and enclose the URL in quotes:
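A minimal sketch of the finished link. The address http://example.com/ is only a placeholder, so swap in the URL of the page you actually want to link to; the paragraph text is also invented.

```html
<p>Is <a href="http://example.com/">this</a> a link?</p>
```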

Try it out. Clicking the link will take you to the Activetuts+ homepage.

Attributes make HTML a lot more powerful. Putting a p tag around some text just says, "this is a paragraph" - it attaches one piece of information to the text - but with attributes we can say so much more; here we've said, "the word 'this' is an anchor, and it refers to the URL in its href".

Self-Closing Tags

This is all very plain, so let's add an image.

Except... hmm, how exactly do we mark up a piece of text to make it an image? You might guess at something like this:

...which is a good guess, but not correct.

In this case, it makes no sense to have the image's tag - which is img, by the way - tag any text. We write it (without any attributes, for now) like so:
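With no attributes and nothing to wrap, the tag is simply:

```html
<img />
```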

The slash at the end indicates that the tag is closing itself. It's a bit of a weird concept, and is where the whole idea of tags tagging text falls down a bit. If you prefer, you can call HTML tags elements instead.

Anyway, let's insert this into our page:

Like the a tag, the img won't display anything without any attributes. Let's add some.

We'll need an image file to use; this file has to be accessible by anyone reading your page, so while you could use the file path of a picture on your computer, you'd be the only one that could see it.

I'll use my Twitter avatar, since that's online, and hope it doesn't change before you read this tutorial. It's:, so, not exactly brief.

Rather than using href for the attribute that points to this URL, we have to use src (short for "source") - after all, it's not actually a hyperlink. So:
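A sketch of the tag with its source set; http://example.com/avatar.png is a placeholder standing in for the real image address:

```html
<img src="http://example.com/avatar.png" />
```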

If you load this page, you'll see my avatar at the bottom - unless I've changed it or Twitter has gone down, in which case you'll see a "broken image" symbol.

In case the image does break, we can add an alternative piece of text to be shown instead, using the alt attribute:
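For example, with a placeholder image address and an invented description, and the new attribute placed on its own line:

```html
<img src="http://example.com/avatar.png"
     alt="A picture of the author" />
```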

(Remember, HTML doesn't care about tabs, new lines, and spaces, so it doesn't matter that I've shoved the new attribute on a separate line.)

Try changing the src URL to one that won't work (like, http://blahblahblah/), and you'll see the alternative text (click here for an example). Well, you might, depending on your browser; the HTML5 standards suggest doing this, but don't insist on it. Chrome doesn't do it, much to my surprise.

Anyway, it's a good habit to get into, because it also allows blind people - who are reading your page via text-to-speech software - to hear the contents of the images.

Nested Tags

We've seen that tags can be nested, as in our case of some italic emphasized text within a paragraph, or paragraphs within a body, but I want to make it clear that you can make things more complex if you wish.

For example, you can mark up some emphasized text with an anchor, as long as you don't do something silly like closing the outer tag before the inner one.
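To make the contrast concrete, here is a sketch of a sensible nesting next to a crossed-over one. The URL is a placeholder, and the crossed-over version is a guess at the kind of mistake meant.

```html
<!-- Fine: the em-block sits entirely inside the a-block -->
<a href="http://example.com/"><em>this</em></a>

<!-- Silly: the a-block closes before the em-block inside it does -->
<a href="http://example.com/"><em>this</a></em>
```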

You can even tag an image with an anchor, as if it were a piece of text:
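A sketch of an image wrapped in an anchor, using placeholder addresses for both the link and the picture:

```html
<a href="http://example.com/">
    <img src="http://example.com/avatar.png"
         alt="A picture of the author" />
</a>
```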

See the results here.

There are a ton of other tags I could cover, like h1, h2, h3 (etc.) for headings; pre for inserting big blocks of source code or ASCII art (inside which, unusually, HTML does pay attention to extra spaces and new lines); table for inserting tabulated information; input for adding buttons and textboxes; iframe, which lets you insert a whole extra webpage inside the current webpage; and more - but this is enough about the content, because we need to talk about another aspect of HTML.

Presentation: CSS

HTML - the language - isn't concerned with presentation, it's true. But HTML pages need to be able to modify their presentation, or everything would be Times New Roman with purple links, like in the examples we've seen so far!

For this purpose we have another language, which we embed inside HTML pages, which is all about presentation: Cascading Style Sheets (CSS).

A Simple Example

Remember earlier I mentioned that we could make strong tags display with a dotted underline, and em tags display in small caps? We can do that using the CSS language. It looks like this, on its own:

Doesn't look at all like HTML, does it? You can probably read it well enough, though. We have the name of a tag, followed by a pair of curly brackets ("braces"), inside which are all the properties of the text that we wish to set. Each property then has a value, with a colon separating the property from its value, and a semi-colon at the end of each property-value pair. Together, this is called a style sheet. Simple enough.

We apply a style sheet to the page like so:
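A minimal sketch, assuming the style sheet goes in a style tag inside the head and that the two rules are small caps for em and a dotted bottom border for strong. The rules between the style tags follow the pattern of tag name, braces, then property: value; pairs. The page content is illustrative.

```html
<!DOCTYPE html>
<html>
    <head>
        <title>This is an HTML page</title>
        <style>
            em {
                font-variant: small-caps;
            }
            strong {
                border-bottom: 1px dotted;
            }
        </style>
    </head>
    <body>
        <p>Is <em>this</em> <strong>HTML</strong>?</p>
        <p>Yes, <em>this</em> is.</p>
    </body>
</html>
```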

Test the page. You'll see that the word "this" now uses small capitals (in both cases, because they're both emphasized), and "HTML" has a dotted underline.

However, the em tag still has an italics effect, and strong is still displaying words in bold. What gives?

This is what the "Cascading" in "Cascading Style Sheets" refers to: the browser has its own, default CSS styles (like "em" means "italics"), and these cascade down, so that the CSS styles that you apply are simply added on top, rather than replacing the existing ones. It's as if the browser has a style sheet that looks like this:
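Written out as a style block for comparison, those assumed defaults might be (real browsers' built-in style sheets are longer and vary):

```html
<style>
    /* Roughly what the browser is assumed to apply by default */
    em {
        font-style: italic;
    }
    strong {
        font-weight: bold;
    }
</style>
```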

(Yes, I know, it's counter-intuitive not to use the same property to set italics and bold styles.)

This then cascades down, so that it's as if the overall style sheet - including your variations - looks like this:

Note that your properties are added after the browser's own. Since the browser computes these properties one at a time, in order, we can cancel out the browser's styles by changing our set of styles like so:
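A sketch of such a set, assuming small caps for em and a dotted bottom border for strong, with one extra property in each rule to undo the browser's default:

```html
<style>
    em {
        font-variant: small-caps;
        font-style: normal;   /* cancels the default italics */
    }
    strong {
        border-bottom: 1px dotted;
        font-weight: normal;  /* cancels the default bold */
    }
</style>
```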

Then, the overall style sheet will look like this:
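Merging the assumed browser defaults with the page's own properties, in the order they are applied, gives roughly this:

```html
<style>
    em {
        font-style: italic;         /* browser default */
        font-variant: small-caps;   /* ours */
        font-style: normal;         /* ours, overrides the italic above */
    }
    strong {
        font-weight: bold;          /* browser default */
        border-bottom: 1px dotted;  /* ours */
        font-weight: normal;        /* ours, overrides the bold above */
    }
</style>
```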

When the browser comes across text that's been tagged strong, it'll say, okay, first, I need to make this bold. Then, I need to give it a 1 pixel thick, dotted border, at the bottom of the text. Then, I need to make the font weight normal. Wait, didn't I make this bold a second ago? Oh well, never mind.

Result: normal-weighted text, with a dotted border across the bottom. Let's incorporate this new set of styles into our page:

Load the page. Perfect!

Other Sources of CSS

Sticking the style sheet in a style tag within the head of a page is called embedding it, but there are other ways we can affect the overall style of the page.

For instance, we can put a style sheet into its own .css file (called an external style sheet), and tell the page to use it. To do that, we place a self-closing link tag within the head, whose rel attribute is stylesheet and whose href attribute is the URL of the .css file.

You should be able to figure out how to write that from the description; try it with the style sheet we use at Activetuts+.
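If you get stuck, the tag would look something like this sketch, where styles.css is a hypothetical file name standing in for the URL of whichever style sheet you want to load:

```html
<head>
    <title>This is an HTML page</title>
    <link rel="stylesheet" href="styles.css" />
</head>
```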

Click here to see the result. It's a mess, to be honest, but you can see that it's loaded the styles.

External style sheets are applied after the browser's defaults and, as long as the link tag comes before any style tag in the head, before embedded ones. This means you can create a single external style sheet to use across your entire web site, and then tweak it by embedding specific styles for each page, if you wish.

In fact, you can get even more specific, and apply a style to just one single element. This is called an inline style, and it's done by entering the CSS into the style attribute of a tag:
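A sketch of an inline style on one paragraph; the colour and the paragraph text are arbitrary choices for illustration:

```html
<p style="color: red;">Yes, <em>this</em> is.</p>
```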

Check out the result of the above change. Inline styles are applied after all the other styles.

Note that you don't have to enter the name of the element and use curly braces - you've already defined which tag you want to style. You can apply multiple styles within the same style attribute; just type them one after the other (no new line needed), and make sure there's a semi-colon between each of them.

What if you want to style just a few words within a paragraph? You could wrap them in an em tag, and style that tag like so:

...but then they'll be in italics too, right? So you cancel that out like so:

...but that's no good either, because you can't guarantee that the em tag doesn't have other styles attached to it as well - and since you don't know, you can't cancel them all out. Fortunately, there's a tag to help.

The <span> Tag

The span tag provides no visual changes by itself, which means it's perfect for our example above:
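For instance, to colour just two words of a paragraph (the colour and the words chosen are illustrative):

```html
<p>Is <span style="color: red;">this HTML</span>?</p>
```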

It's unlikely - not impossible, but unlikely - that someone will have applied certain styles to all span tags in an embedded or an external style sheet, so you should be safe using this.

Ah, but now look what we're doing. We're applying presentation rules right there in the content - exactly the sort of situation that the em and strong tags were invented to avoid! It's not technically invalid HTML, but it's still dodgy, and should be used sparingly or not at all.

However, sometimes you will want to say, "this section of text is different to the others, but none of the existing HTML tags describe that difference very well." Maybe you're writing a page about animals, and you decide that it would be useful to tag all the words that represent species - like "cat" and "dog" and so on - so that you can apply a style to all of them at once later.

HTML doesn't have a species tag, though. It's tempting to say, "well, on my website, I'm going to say that the strong tag is always used to represent species" - but this is a bad idea; it's not what strong is for.

Instead, you can use classes. Any tag can be given a class like so:
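A sketch using the animal example, where the class name species and the sentence are made up for illustration:

```html
<p>My <span class="species">cat</span> ignores my <span class="species">dog</span>.</p>
```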

Setting a class doesn't do anything on its own, but you can then define a style for each class in your style sheet:
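For example, assuming a class called species and an arbitrary choice of properties, an embedded style sheet could contain:

```html
<style>
    /* Applies to every element whose class attribute is "species" */
    .species {
        color: green;
        font-weight: bold;
    }
</style>
```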

Note that to define a style for a class, rather than a tag, you have to place a dot before its name in the CSS.

Take a look at the page now. It looks like something you might have found on Geocities about ten years ago, but at least it's showing some important HTML principles. As a rule of thumb, a class's styles win out over plain tag styles from external or embedded style sheets, and inline styles win out over both, so the whole cascading order looks roughly like this:

  • Browser defaults
  • External
  • Embedded
  • Classes
  • Inline

So, we've looked at content and presentation within HTML files. There's one important aspect left.

Code: JavaScript

HTML and CSS are not programming languages. They don't actually do anything. They just tell text how it is structured and how it should look.

JavaScript is a programming language. It does stuff.

One of the simplest examples of JavaScript code is alert('Hello'). When the browser runs this code, it'll make a little dialog box appear, with the word "Hello" inside.

Much like with CSS, we can embed JavaScript into a page, add it via an external file, or attach it to a single tag. To embed it, we put it inside a script tag in the head, rather than a style tag:
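A minimal sketch of a page with the alert embedded in its head (the body content is illustrative):

```html
<!DOCTYPE html>
<html>
    <head>
        <title>This is an HTML page</title>
        <script>
            // Runs as soon as the browser reaches this tag
            alert('Hello');
        </script>
    </head>
    <body>
        <p>Is <em>this</em> <strong>HTML</strong>?</p>
    </body>
</html>
```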

Check out the result.

Linking to external JavaScript is easy, too; you put your code in a .js file, and add it to your page like this:
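A sketch of the tag, where script.js is a hypothetical file name for your own code:

```html
<head>
    <title>This is an HTML page</title>
    <script src="script.js"></script>
</head>
```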

So, it's the same tag as we used to embed some code, except it's empty, and it has a src attribute. I'm not going to give an example of this, but feel free to try your own.

Inline JavaScript - i.e., attaching code to specific elements - works a little differently than inline CSS. A first guess at how we could do this might be:

...but, if you think about it, this doesn't make sense. When would the alert appear? When the page loaded? When that specific section of text loaded? When it was on the page? When it was clicked? You can't guess, and neither can the browser.

Instead, there are a whole bunch of attributes that you can stick JavaScript into; they're called event attributes, and they trigger the JavaScript inside them for different reasons.

For instance, one such attribute is onclick, which triggers the JavaScript inside it whenever the user clicks the contents of the tag with their mouse. It looks like this:
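A sketch of a paragraph with an onclick attribute; the paragraph text is invented, and the code inside the attribute is the same alert used earlier:

```html
<p onclick="alert('Hello');">Click this paragraph.</p>
```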

Load the page and click the last paragraph - yes, you can click it, even though it's not a hyperlink. Tada! You're half-way to programming a duet.

That's Not All

I'm not going to go into a lot of detail on JavaScript here, because that's what the bulk of our future HTML5 tutorials are going to focus on. However, you should know that JavaScript can do a lot more than make annoying dialog boxes appear.

Besides making the browser do stuff on its own, JavaScript can alter and create HTML (as well as CSS and JavaScript code itself). It's hard to overstate how powerful that makes it.

And as if that weren't enough: if you're using Chrome, you can play Angry Birds in your browser, written in JavaScript, HTML, CSS... and, okay, a touch of Flash for the sounds.

What Next?

If you're interested in learning more about the design and appearance of webpages, check out our sister site Webdesigntuts+. One of our other sister sites, Nettuts+, has already published a lot of articles on HTML, CSS, and JavaScript that you should dig into.

We'll be focusing on the latest version of HTML - HTML5 - and using it to create apps and games for modern browsers. If you want to keep up to date with what we're doing, then you can follow us on Twitter, like us on Facebook, subscribe to us through RSS, or sign up to our free email newsletter.

Here's one last tip for the curious: you can view the HTML of any web page on the Internet by right-clicking a blank area and selecting "View Source". I should warn you: some sites make the markup all messy so that you can't read (and copy) it; some sites' markup is just naturally messy. But try it out!
