Whenever a page or an entire website is published, there is the problem of producing valid XHTML code. Let’s face it: Open Text Websolutions is not (and RedDot CMS was not) the most eager CMS in the world to achieve this goal.
Some open source solutions (e.g. WordPress ;-) ) do a better job.
So what can we do? Since nearly everyone can now easily create a standards-compliant web page, Open Text and we partners face a big challenge: explaining why an expensive system is not as good as a free one.
So there are two goals for us:
Are you ready? Let’s go…
First of all, let’s have a look at where the problems come from:
Assuming that we developers are able to program tidy and standards compliant templates, there is only one source causing all of our problems: The text editor!
Uppercase tags, mso-styles and incorrect empty tags like <BR> instead of <br /> litter our code that was once so beautiful.
So the simplest way to avoid all of that would be to keep the editors away from anything where they could produce a single tag outside our control. In practice, this means denying them the text editor, at least the “not in ASCII mode” one.
But I have never seen a project like this…
Another good way to publish standards-compliant web pages is to use tidy. Simply activate it in your project variant settings and all problems are gone!
Well, almost all:
At this point, I did not investigate any further.
Version 9 introduced Telerik RadEditor as a new approach. This seems to be a good and well-integrated tool that produces really tidy code. Good for brand new projects. Changing a running project can be a bit tricky, though: the main problem is that you would have to open and close every single text element to see the effect.
And there are some issues, which have already been discussed in other posts here.
Have a look at the /cms/asp folder of your server. There you should find a file named HtmlConvertTable.txt.
It’s a simple plain text document containing tab-separated strings. Each match of the string on the left is replaced by the text on the right. It was used in ancient times (before Unicode) to convert strange characters like our German umlauts to their corresponding HTML entity (e.g. ä became &auml;).
The great advantage of this solution is that it only changes the content of elements, but not one single character of the template code!
In your project variant settings, you can choose between three options (section Conversion of RedDot content):
So let’s create a conversion table for XHTML code, save it to the /cms/asp folder of your server, name it e.g. HtmlConvertTableXhtml.txt and enter this file name into the text entry field.
In this file, list all uppercase tags on the left and the corresponding lowercase tags on the right, separated by a tab.
Here’s a little example (when you copy it, make sure that you get the tabs right):
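If you do not want to type all these rows by hand, a small script can generate them. The following is only a sketch: the tag list is my own assumption, the function names are made up for illustration, and the rule with a trailing space (so that `<P ` does not accidentally match `<PRE`) is a precaution I am adding, not something the CMS requires.

```python
# Tag names to cover. This list is an illustrative assumption; extend it as needed.
TAGS = ["p", "a", "strong", "em", "ul", "ol", "li"]


def build_rows(tags):
    """Return (left, right) pairs mapping uppercase tags to lowercase tags."""
    rows = []
    for tag in tags:
        upper = tag.upper()
        rows.append((f"<{upper}>", f"<{tag}>"))    # opening tag without attributes
        rows.append((f"<{upper} ", f"<{tag} "))    # opening tag followed by attributes
        rows.append((f"</{upper}>", f"</{tag}>"))  # closing tag
    return rows


def render_table(rows):
    """Join each pair with a tab, one pair per line."""
    return "".join(f"{left}\t{right}\n" for left, right in rows)


if __name__ == "__main__":
    # Print the table; redirect the output to e.g. HtmlConvertTableXhtml.txt.
    print(render_table(build_rows(TAGS)), end="")
```

Check the generated file in an editor that shows whitespace before copying it to the server, since the tabs and trailing spaces are easy to lose.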
<BR>	<br />
<IMG	<img
<P	<p
<P>	<p>
</P>	</p>
<A	<a
</A>	</a>
<STRONG>	<strong>
</STRONG>	</strong>
<EM>	<em>
</EM>	</em>
Very simple, but powerful. What do we see in this example?
You can list the ampersand, too, of course, if you do it as follows:
 & 	 &amp; 
Both the left and the right one must have a leading and a trailing space to ensure that only the ampersands standing alone will be converted and not those which are already part of an entity.
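To try out a table before putting it on the server, you can simulate it locally. I do not know how the CMS applies the table internally, so this sketch assumes plain, case-sensitive substring replacement, rule by rule from top to bottom. The function names and the sample content are made up for illustration.

```python
def parse_table(text):
    """Parse tab-separated rules; spaces are kept because they are significant."""
    rules = []
    for line in text.splitlines():
        if "\t" not in line:
            continue  # skip lines that are not a left/right pair
        left, right = line.split("\t", 1)
        rules.append((left, right))
    return rules


def apply_table(content, rules):
    """Assumed behaviour: replace every match of left with right, in order."""
    for left, right in rules:
        content = content.replace(left, right)
    return content


# A small table including the ampersand rule with its surrounding spaces.
TABLE = "<BR>\t<br />\n<P>\t<p>\n</P>\t</p>\n & \t &amp; \n"

sample = "<P>Tom & Jerry<BR>caf&eacute; &amp; bar</P>"
print(apply_table(sample, parse_table(TABLE)))
# prints: <p>Tom &amp; Jerry<br />caf&eacute; &amp; bar</p>
```

The stand-alone ampersand is converted, while `&eacute;` and the existing `&amp;` stay untouched because their ampersand is not followed by a space.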
Last but not least, you can convert deprecated tags into standards-compliant equivalents:
<NOBR>	<span class="nowrap">
</NOBR>	</span>
The behaviour of the “nowrap” class is then defined using CSS.
In my experience so far, this solution delivers the best results for projects using the built-in RedDot text editor (I’ve tried it with version 7.5), although it’s not possible to convert all tags. For example, you cannot convert empty tags that must have attributes (e.g. img) into their XHTML variant.
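The img limitation is easy to see if you assume, as I do here without knowing the CMS internals, that the table does nothing more than literal string replacement. The helper name and the sample markup are hypothetical.

```python
def convert(html, rules):
    """Assumed behaviour: literal, case-sensitive replacement, rule by rule."""
    for left, right in rules:
        html = html.replace(left, right)
    return html


rules = [("<IMG ", "<img "), ("<BR>", "<br />")]
html = '<IMG src="a.gif" alt="A"><BR>'

print(convert(html, rules))
# prints: <img src="a.gif" alt="A"><br />
```

The tag name is lowercased, but the closing `>` of the img tag comes after attributes that differ from image to image, so no fixed string on the left side can match it and turn it into ` />`.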
That’s it. Now I’d like to hear about your experience with this.