The primary markup language used on the web is HyperText Markup Language (HTML). The first release of the markup language lacked many of the features taken for granted today. Many years of development were required for HTML to reach its current state. In fact, almost four years passed between the first attempts at a markup language and HTML 2.0. Since 1995, HTML has continued to change to reflect continued use and changing demands. It takes a substantial commitment of time and resources to develop a markup language.
Before HTML could be developed, software was required to test its capabilities. This software was the precursor to the first web browser. A markup language cannot be tested unless the software that reads it exists first. This is one requirement for developing a markup language. Another requirement is to support users. The users drive the software and the software drives the markup language.
This article describes how General Reuse Markup Language, or GRML, developed into its current format. There are examples provided showing the differences between GRML 1.0 and GRML 2.0. Each markup language has its attributes described and its use on the web is discussed.
Background
Before continuing, read the article, Introducing GRML. It is an overview of existing file formats and markup languages, and explains why GRML was created.
It is helpful to have an interest in markup languages, potential alternative approaches, and web browser technology. This discussion is not centered around HTML. Rather, a complementary approach for using file and web browsers is covered here.
The process of creating GRML was not direct. It began with a desire to create a front-end to extract content from web pages. The idea was to submit a web page request and retrieve the content in a format usable by a variety of applications. HTML displays content in one way, so it is not used by a variety of applications. Since the target web pages used HTML, the retrieved content needed to be available using another format.
The only way to extract content from an HTML web page is to use an adapter. Adapters read data in one format and write it in another. This was the perfect solution, except for one thing. Every HTML web page describes its content differently. There is no way to extract author information, article text, or product descriptions without creating an adapter for each web page. There had to be a better way.
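To make the adapter idea concrete, here is a minimal sketch in Python. It is not the adapter used by the original front-end. The class name TitleAdapter, the function adapt, and the sample page are all hypothetical. The adapter assumes a page that keeps its title in an h1 tag and its text in p tags, which shows why a page with a different layout would need an adapter of its own.

```python
from html.parser import HTMLParser


class TitleAdapter(HTMLParser):
    """Hypothetical adapter: reads HTML, writes plain lines of text.

    It only understands pages that keep the title in <h1> and the
    article text in <p>. Any other layout needs a different adapter.
    """

    def __init__(self):
        super().__init__()
        self.lines = []        # output format: one line of text per element
        self._current = None   # tag being captured right now, if any

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "p"):
            self._current = tag
            self.lines.append("")

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        # Text inside nested tags such as <b> is kept as part of the line.
        if self._current is not None:
            self.lines[-1] += data


def adapt(html_text):
    """Convert one HTML page into a list of non-empty text lines."""
    adapter = TitleAdapter()
    adapter.feed(html_text)
    adapter.close()
    return [line.strip() for line in adapter.lines if line.strip()]


page = "<html><body><h1>Widget News</h1><p>Widgets are <b>new</b>.</p></body></html>"
print(adapt(page))
```

A page that places the same content in div tags produces no output from this adapter, which is the limitation described above.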
While trying to find a practical way to extract content from web pages, a front-end was being developed to display the content. A single adapter was developed to format HTML from a single web page into an informal format used by the front-end. This informal format was the initial step toward creating a markup language.
From June 2002 until August 2002, the front-end used an adapter to convert HTML web pages to text for display. There was no format, other than reading single lines of text from the adapter. As development continued, more adapters were added, until six were available. Web page requests sent from the front-end had to use one of these six adapters. There was no feature for users to directly enter a web page request.
The first attempt
As the front-end was developed, a form was needed for sending requests using input controls. This required a formal approach for handling requests to and responses from a web page. Using arbitrary lines of text was inefficient. This was the beginning of Personal Markup Language.
The new markup language had form support and provided a structure for formatting web page content. However, the form was limited. At first, the front-end created a form from the first web page request. There was no way to display another form. To allow the markup language to create a form for each web page request, the front-end was updated.
Upgrading the format
With form support, the front-end now sent web page requests from input controls and created a form from web page responses, when necessary. The only feature missing was a way to organize web page content into groups, and display each group of content separately in the front-end. This required a new markup language. It was the beginning of the Simple Markup Language.
When the front-end displays content from a web page, it is called a dimension. Splitting content into different groups creates a dimension for each group. The front-end needed to display different dimensions of content for forecasting, logistics, and data analysis. Once the front-end added this, the markup language supported multidimensional views.
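The grouping step can be sketched in a few lines of Python. This is only an illustration of the idea, not the front-end's actual code. The function name split_into_dimensions and the sample items are hypothetical, and the group labels are borrowed from the uses named above.

```python
def split_into_dimensions(items):
    """Group (group_name, content) pairs into one dimension per group.

    Returns a dict that maps each group name to its list of content,
    in the order the groups first appear.
    """
    dimensions = {}
    for group, content in items:
        dimensions.setdefault(group, []).append(content)
    return dimensions


# Hypothetical content taken from a single web page.
items = [
    ("forecasting", "demand estimate"),
    ("logistics", "shipping schedule"),
    ("forecasting", "sales projection"),
]

dimensions = split_into_dimensions(items)
for name, contents in dimensions.items():
    print(name, contents)
```

Each key of the returned dict stands for one dimension that a front-end could display separately.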
As the markup language was being developed, there was one constant. The front-end did not allow the user to directly enter a web page request. A user had to choose from the six web page requests used by the front-end, or submit a request using the form input controls of a web page.
Once direct web page requests were added, it was possible to "browse" web pages. The front-end became a web browser. Using a web browser required the markup language to be completely redesigned. This new markup language was the first version of GRML.
Completed in January 2003, GRML supported form input controls, columns, and results. It had multidimensional support and used the concept of "web applications". Each "web application" represented an activity a user performs on the web. The first GRML web browser had "web applications" for using a search engine, getting news headlines, viewing auction listings, and doing a job search.
"Web applications" were a holdover from the days of the front-end, when neither directly submitting a web page request nor opening a file was supported. While the web browser allowed web page requests, a request could only be sent from a "web application" or a form.
The reason for "web applications" is to use content from HTML web pages in GRML web browsers. Since HTML web pages are abundant and GRML is new, it is advantageous to have the ability to adapt HTML to GRML.
An example of "web applications" in GRML 1.0 is below.
GRML was designed to be used by many different browsers. It was not possible to test this capability while only one GRML web browser existed. As other browsers were created and the markup language developed, GRML moved to version 1.1 in the first four months of 2003.
The next major upgrade to GRML occurred when resolving the problem of "web applications."
One limitation of the "web application" approach was the need for a separate adapter for each HTML web page. Since there are billions of HTML web pages, it was impractical to create a "web application" for each one. Another problem was keeping the "web application" updated if a web page changed. If supporting a multitude of web pages was difficult, keeping them updated was practically impossible. GRML needed modification.
In March 2004, everything related to "web applications" was removed from GRML. This allowed the markup language to focus on form input controls, columns, and results. With the "web applications" removed, it was now possible to read any HTML web page using more generic and consistent web adapters.
GRML 1.2 was the last of the 1.x releases of GRML. During the next six months of use, it set the stage for another change in the syntax of the markup language.
The initial versions of GRML worked well on the web and the local filesystem. They allowed the development of many different web browsers that use the form and column/result approach. Other than removing "web applications", the syntax for GRML did not change much from version 1.0 to version 1.2. Issues of speed, control, and reliability were not considered. However, this changed with GRML 2.0.
This version of GRML was designed to create small file sizes, handle file and web page content using fewer browser resources, and allow more options for arranging file and web page content. The old syntax was completely abandoned in favor of smaller tags and more specific tag keywords. The sample GRML from version 1.2 looks as follows in 2.0.
Using the GRML 2.0 syntax, tags drop to a fraction of their size in version 1.2. In addition, there are no problems with handling very large text strings (greater than 1024 characters). In version 1.2, such content was sometimes ignored because of its size. This often disrupted the display of all remaining content in the file or web page. This problem was solved in version 2.0, because each result item specifies a column.
Version 2.0 makes it possible to organize columns and results in ways that are not possible with version 1.2. Results are ordered according to the column display order. This is set by listing the top column first, and all subsequent columns in order, until the bottom column is listed last. If there are five columns and the third should be displayed first, place it at the top of the column order.
A result item only displays if the column it specifies appears in the column order. To display only one column of results, list only that column in the column order. Alternatively, list any number of columns, and only the results for those columns are displayed. This was not possible with previous versions of GRML.
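The ordering and filtering rules can be illustrated with a short Python sketch. It does not use GRML syntax and is not taken from any GRML browser. The function visible_results, the column names, and the sample results are hypothetical. It assumes that each result is a pair of a column name and its text, and that results sharing a column keep their original order.

```python
def visible_results(column_order, results):
    """Arrange results by column display order and drop unlisted columns.

    column_order: list of column names, top column first.
    results: list of (column, text) pairs; each result names its column.
    """
    position = {name: index for index, name in enumerate(column_order)}
    # A result only displays if its column appears in the column order.
    shown = [item for item in results if item[0] in position]
    # sorted() is stable, so results in the same column keep their order.
    return sorted(shown, key=lambda item: position[item[0]])


# Hypothetical results; every item specifies its own column.
results = [
    ("Title", "First article"),
    ("Author", "A. Writer"),
    ("Date", "2004-10-01"),
    ("Title", "Second article"),
]

# Display Date above Title and hide Author.
print(visible_results(["Date", "Title"], results))

# Display a single column of results.
print(visible_results(["Author"], results))
```

Changing the column order list is enough to rearrange or hide results. The results themselves are not edited.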
GRML has moved through many versions since its first release in January 2003. It has moved from a "web application" markup language to a web page markup language. With version 2.0, it has the smallest, fastest, and most flexible syntax of any version released.
With its support for form input controls, columns, and results, GRML is able to support many web browsers by organizing its content for use regardless of how the content is displayed.