The browser and user problem
One of the biggest problems with HTML is the browsers: they are simply too forgiving. Internet Explorer in particular takes anything that resembles HTML and tries to display it. This makes learning HTML seem very easy, but trust someone who has been around since Netscape 3 was a hot new browser: it isn't.
Marking up a document in HTML, styling it with CSS and making it do things with JavaScript is a hard task. There are standards to follow, but you also need to know how each browser and operating system renders them. You work in the unknown: you don't know which user agent will display your pages, and you don't know what your users can do. Can they see? Can they use a mouse?
Show some respect
This is why it is all the more annoying to create a perfect template, one that works to some degree in every possible environment and looks cool, only to see it butchered by the middleware scripters. They probably don't know how much work went into it, or they consider their own environment to be the whole world ("What do you mean it is messed up - works great on my X box in Y").
So, please oh please treat what you get from the Interaction Designer, HTML scripter, web designer - or whatever else you call the unfortunate person caught between design and backend - with some respect. Try to make your code spit out the same markup, with the added data from the database, XML feed or whatever you are asked to retrieve.
Talking is important: if we need to create conditional markup, let's ask for a template that has all eventualities in it, and put our logic around it. If we are not sure, let's ask again. Good interaction designers will gladly help rather than have the bugs we caused sent back to them for fixing.
What is an HTML document?
A good HTML document starts with a DOCTYPE. It defines what the document is and which flavour of HTML it uses. This is not only of interest to the developer; it also determines how the browser renders the page. A wrong doctype can mess up a design completely.
After the doctype, there is the HTML tag. Nothing much to say about that one, except that we can and sometimes must define our language and the direction of the text in there (as not all languages get displayed left-to-right).
After that, there's a HEAD element, which contains META tags, defining for example the encoding type, and links to other documents (like style sheets).
We might encounter a SCRIPT tag linking to a JavaScript file, or even some embedded CSS and JavaScript.
The HEAD tag is closed and the BODY begins. The BODY is where all text should be displayed, there and nowhere but there. After closing the BODY tag, we close the HTML tag, and voilà, we have ourselves an HTML document.
    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
    <html dir="ltr" lang="en">
    <head>
      <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
      <title></title>
      <style type="text/css">
      </style>
      <script type="text/javascript">
      </script>
    </head>
    <body>
    </body>
    </html>
HTML Syntax 101
If we are not sure what we need to do, or if we need to achieve a certain look and feel, the best option is to ask an Interaction Designer. If there is none at hand, let's remember the following:
- We are not living in 1997 any longer; browsers expect valid HTML.
- The world is not only Internet Explorer.
- Tags and attributes that define visuals should not be used.
- Let's keep the presentation in the CSS.
The syntax of HTML is easy, but not following these simple rules can break a layout:
1. All tags are lowercase
Let's not use <P></P> or <P></p>, but <p></p>. Also, let's not mix cases like <Table></TABLE>.
2. All attributes need to be lowercase and their values enclosed in quotation marks
I know, it is a lot easier to do:
    echo "<p class=main>foo</p>";
than
    echo "<p class=\"main\">foo</p>";
but that is how it is. A clean option is to use single quotes (') for our output commands:
    echo '<p class="main">foo</p>';
Attributes shall never be uppercase or mixed case. onMouseOver is not good, onmouseover is.
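As a minimal sketch of this rule, here is how a small helper could emit tags with lowercase names and quoted, escaped attribute values. It is written in JavaScript for illustration only; escapeHtml and openTag are made-up names, not part of any library, and the same idea carries over to PHP or any other backend language.

```javascript
// Escape the characters that have a special meaning in HTML text and attributes.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build an opening tag: tag and attribute names are lowercased,
// attribute values are escaped and wrapped in double quotes.
function openTag(name, attributes = {}) {
  const parts = [name.toLowerCase()];
  for (const [attr, value] of Object.entries(attributes)) {
    parts.push(attr.toLowerCase() + '="' + escapeHtml(value) + '"');
  }
  return "<" + parts.join(" ") + ">";
}

console.log(openTag("P", { CLASS: "main" }) + "foo</p>");
// prints: <p class="main">foo</p>
```

Routing every attribute through one helper like this means nobody has to remember the quoting rule for each single echo.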
3. Whitespace can be bad
By definition, whitespace does not affect HTML. In the real world it does, though.
It is not the HTML itself but the CSS and JavaScript working on it that can be very much affected by an extra blank line or line break. Therefore, let's not create any extra whitespace. Whitespace is there to make the markup readable by humans. As we generate HTML for a browser, and not for a developer to edit, we don't need any.
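One cautious way to do this is to strip indentation and blank lines from the generated markup while keeping a single line break between lines, so that text inside inline elements is not glued together. The sketch below is written in JavaScript for illustration, and trimMarkup is a hypothetical helper name. It assumes the output contains no pre or textarea elements, whose whitespace is significant.

```javascript
// Remove leading/trailing whitespace on every line and drop empty lines.
// Assumption: the markup has no <pre> or <textarea> content that must be preserved.
function trimMarkup(html) {
  return html
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line !== "")
    .join("\n");
}

const generated = "<ul>\n\n    <li>foo</li>\n\n  </ul>\n";
console.log(trimMarkup(generated));
```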
4. Links shall be links
Let's avoid creating dead links. javascript: is not a protocol, it is a pseudo-protocol.
    <a href="javascript:open('foo.html')">open document</a>
is a dead link when JavaScript is not available.
    <a href="foo.html" onclick="open(this.href);return false">open document</a>
also works when JavaScript is not available. The return false keeps the browser from also following the link in the main window once the script has opened it.
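If we want to check our generated output for such dead links, a rough scan is enough to catch the obvious cases. The following is only a sketch: findJavascriptLinks is a made-up name, and a regular expression is no substitute for a real HTML parser.

```javascript
// Return every opening <a> tag whose href uses the javascript: pseudo-protocol.
function findJavascriptLinks(html) {
  const pattern = /<a\b[^>]*\bhref\s*=\s*["']\s*javascript:[^>]*>/gi;
  return html.match(pattern) || [];
}

const dead = '<a href="javascript:open(\'foo.html\')">open document</a>';
const fine = '<a href="foo.html" onclick="open(this.href);return false">open document</a>';

console.log(findJavascriptLinks(dead).length); // prints: 1
console.log(findJavascriptLinks(fine).length); // prints: 0
```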
5. Let's keep up to date
Things that are outdated and impossible to override with CSS should not be used any longer:
| tag/attribute | replacement |
| bgcolor | Should be in the CSS; add a class instead. So instead of <td bgcolor="blue"> use <td class="highlight"> and in the CSS td.highlight{background-color:blue;}. |
| style | Avoid inline styles; add a class and do it in the CSS file instead. |
| <b></b> | Should be <strong></strong>. |
| <i></i> | Should be <em></em>. |
| <font></font> | Should also be defined in the CSS. |
| <br> and &nbsp; | Should be replaced with <p></p> and margin or padding settings in the CSS. A foo<br>bar is not styleable, whereas a <p>foo</p><p>bar</p> is. Non-breaking spaces are pure evil: they are text data used for presentation, something that should not happen at all. |
| <center></center> | Should be defined in the CSS, as text-align:center on the parent element and margin:auto on the element itself. |
| align | Should be defined in the CSS, as text-align. |
A quick and solid introduction on which elements can be used and how can be found at HTMLDog.
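To catch these leftovers before a page goes out, we could run a simple check over the generated markup. The sketch below is one possible approach, not a complete linter: the two lists only cover some of the entries named above, findDeprecatedMarkup is a hypothetical name, and the regular expressions will be fooled by unusual markup.

```javascript
// Tags and attributes we no longer want to see in generated output.
const DEPRECATED_TAGS = ["b", "i", "font", "center"];
const DEPRECATED_ATTRIBUTES = ["bgcolor", "align", "style"];

// Return a list of findings such as "<b>" or "td bgcolor".
function findDeprecatedMarkup(html) {
  const findings = [];
  const tagPattern = /<([a-z][a-z0-9]*)\b([^>]*)>/gi;
  let match;
  while ((match = tagPattern.exec(html)) !== null) {
    const tag = match[1].toLowerCase();
    if (DEPRECATED_TAGS.includes(tag)) {
      findings.push("<" + tag + ">");
    }
    for (const attr of DEPRECATED_ATTRIBUTES) {
      // The attribute name must follow whitespace and be followed by "=".
      const attrPattern = new RegExp("(^|\\s)" + attr + "\\s*=", "i");
      if (attrPattern.test(match[2])) {
        findings.push(tag + " " + attr);
      }
    }
  }
  return findings;
}

console.log(findDeprecatedMarkup('<td bgcolor="blue"><b>x</b></td>'));
console.log(findDeprecatedMarkup('<td class="highlight"><strong>x</strong></td>'));
```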
6. Comments
Commenting in HTML is done via the <!-- --> markers. Comments can span one or more lines and can also be adjacent to one another.
Comments should only be kept in the generated HTML if a CMS tool needs them or some other data is stored in them. Otherwise, let's delete them.
Comments are a great way to communicate between the Interaction Designer and us. Ask them to do something like the following:
    <!-- replace all %xxxx% with the real data -->
    <table summary="results year %year%">
      <tfoot>
        <tr>
          <th scope="row">Overall</th><td>%overall value%</td>
        </tr>
      </tfoot>
      <tbody>
        <tr>
          <th scope="col">Client</th>
          <th scope="col">% of satisfaction</th>
        </tr>
        <!-- Loop over all sellers and populate this data -->
        <tr>
          <td>%sellername%</td>
          <td>%percentage%</td>
        </tr>
        <!-- end Loop -->
      </tbody>
    </table>
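Filling such a template could look roughly like the sketch below, which repeats the row from the loop section once per seller. It is JavaScript for illustration only: fillTemplate, escapeHtml and the sample seller data are assumptions, not part of any real system. Note that the placeholder pattern insists on a letter right after the first %, so the literal text "% of satisfaction" is left alone.

```javascript
// Escape the characters that have a special meaning in HTML.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Replace every %name% placeholder with the escaped value from `data`.
// A missing value raises an error instead of silently leaving %name% in the page.
function fillTemplate(template, data) {
  return template.replace(/%([a-z][a-z ]*)%/g, (match, key) => {
    if (!(key in data)) {
      throw new Error("No data for placeholder " + match);
    }
    return escapeHtml(data[key]);
  });
}

// The row inside the "Loop over all sellers" comments of the template.
const rowTemplate = "<tr><td>%sellername%</td><td>%percentage%</td></tr>";

// Hypothetical data; in real life this comes from the database or feed.
const sellers = [
  { sellername: "Smith & Co", percentage: 87 },
  { sellername: "Miller", percentage: 92 },
];

const rows = sellers.map((seller) => fillTemplate(rowTemplate, seller)).join("\n");
console.log(rows);
```

The markup of the row stays exactly as delivered; only the placeholders change.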
The same works for conditions:
    <!-- If the user is logged in -->
    <p>Hello <strong>%name%</strong>.</p>
    <!-- else -->
    <a href="login.php">Please log in</a>
These are also perfect examples of comments that need to be deleted when we generate the final pages.
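Deleting these instruction comments can be automated. The following is a minimal sketch with a made-up helper name, stripComments. It removes every comment, so it assumes that none of them is needed by a CMS tool and that no script or style block hides its content inside comment markers.

```javascript
// Remove all HTML comments, including ones that span several lines.
function stripComments(html) {
  return html.replace(/<!--[\s\S]*?-->/g, "");
}

const template =
  "<!-- If the user is logged in -->" +
  "<p>Hello <strong>Jane</strong>.</p>" +
  "<!-- else -->";

console.log(stripComments(template));
// prints: <p>Hello <strong>Jane</strong>.</p>
```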
Sometimes you might encounter whole sections of markup commented out. These sections may become necessary in the future, but it is pretty pointless to generate them. Let's remove the comment markers and comment the section out in the backend code instead.
    <!-- <p>Please continue to <a href="cust.php">your personal page</a></p> -->
should become (as an example in PHP):
    #echo '<p>Please continue to <a href="cust.php">your personal page</a></p>';
If we add our own comments, let's follow the correct syntax and not overdo it.
    <!--------------------stephen did this--------------------->
is bad and might cause problems in browsers and validation, because adjacent hyphens (--) inside a comment can be taken as comment delimiters.
    <!-- stephen did this -->
is sufficient. We should also indent multi-line comments.
    <!--
        index.php created 01.04.04.
        This is a generated file, do not edit!
    -->
Our friend, the validator
That's all there is to say. If we stick to these rules, we make everyone involved in the project happy, and we are less likely to get bug reports about visual problems.
The best way to ensure that we are on the right track is to take the generated code (by using "view source", not "save as", as Internet Explorer has the nasty habit of changing the code when it saves it) and validate it. This can be done with the free W3C HTML validator.