Welcome to gnomedia codeworks!

This is a blog, a collection of articles, some software projects, some miscellaneous scripts, a kitchen sink... I hope you'll find something useful or interesting.

Using SAX Parsers

December 2nd, 2003

This article introduces the subject of parsing XML files, using the Expat and Xerces parsers as examples. In the process we will examine the two event interfaces for XML parsers, SAX1 and SAX2. I will assume that you’ve read the two previous articles in the series (Introducing XML by David Nash and History of Unicode by me) and that you have a good understanding of C++. The article won’t cover the design of XML documents; the samples we use will of necessity be simple, designed to demonstrate the basic facilities of the XML parsers. We will create a simple program that parses an XML file and counts the characters and tags in it, showing how the program differs between Expat and Xerces.


Introduction to XML and C++

November 22nd, 2003

Over the last few years a growing number of applications and services have adopted a type of text mark-up known as XML. The structure of XML, and the timing of its introduction, made it a perfect match for Java, which was then a new and fast-growing language. However, its use in C++ has lagged behind somewhat, and this series of articles aims to redress the balance a little.


A Short History of Character Sets

November 22nd, 2003

In this article I will provide some background on character sets and character encodings. The focus is on what is needed to work with XML parsers, as a preliminary to further articles in the series. For this reason some areas (glyphs and representation, for example) are not covered.
