NewsML Toolkit - The NewsML library from Reuters & WAVO
Written by David Megginson

Last updated: 1 December 2000
Version: 0.1 alpha
Comments to:

The NewsML Toolkit library and the NewsML Explorer demo application are copyright (c) 2000 by Reuters PLC and WAVO Corporation, Inc., and are released under the terms of version 2.1 of the GNU Lesser General Public License (LGPL).

1. Overview

NewsML is a new, open electronic-news specification developed by the International Press Telecommunications Council (IPTC) and supported by major news vendors and amalgamators. Based on the Extensible Markup Language (XML), NewsML allows news providers to bundle compound news objects in different media (such as text, video, photographs and graphics) into a single package for electronic distribution.

News customers can process NewsML packages with low-level, generic XML tools and libraries like the Simple API for XML (SAX), the Document Object Model (DOM), and Extensible Stylesheet Language Transformations (XSLT), but the large feature set of the NewsML format can make the work difficult, especially if an XML specialist is not available. The Java-based NewsML Toolkit, jointly developed by the Reuters Group PLC in the U.K. and WAVO Corporation, Inc. in the U.S., provides a simple interface that lets you perform the most important NewsML processing tasks without any knowledge of XML or the intricacies of NewsML markup.
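
To give a sense of what the low-level route involves, here is a minimal sketch that counts the NewsItem elements in a package using SAX. It uses only the standard JAXP and SAX classes shipped with Java, not the NewsML Toolkit itself; the class name SaxNewsItemCounter and the tiny sample document are assumptions made up for illustration, and a real NewsML package carries far more markup.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper (not part of the NewsML Toolkit): counts <NewsItem>
// elements by reacting to SAX start-element events.
class SaxNewsItemCounter extends DefaultHandler {

    // Deliberately tiny, simplified stand-in for a NewsML package.
    static final String SAMPLE =
        "<NewsML><NewsItem/><NewsItem/><NewsItem/></NewsML>";

    private int count = 0;

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        // The default parser is not namespace-aware, so qName holds
        // the plain element name.
        if ("NewsItem".equals(qName)) {
            count++;
        }
    }

    // Parses the XML text and returns the number of NewsItem elements seen.
    static int countNewsItems(String xml) throws Exception {
        SaxNewsItemCounter handler = new SaxNewsItemCounter();
        SAXParserFactory.newInstance()
            .newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return handler.count;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("NewsItem elements: " + countNewsItems(SAMPLE));
    }
}
```

Even this simple task requires an event handler and some knowledge of the element names; extracting dates, permissions, or embargo status the same way means tracking much more parser state by hand.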

Java developers with no prior XML knowledge can use the NewsML Toolkit to extract many kinds of information from a multimedia NewsML package, including news lines, permissions, dates, whether a story is embargoed, and where to find the individual news objects, all using regular Java object methods. The first release of the library also includes a simple demonstration application, the NewsML Explorer, for browsing NewsML packages interactively.

For advanced users who need access to information not provided directly by the first alpha release of the library (such as full metadata support or incremental updates), the NewsML Toolkit allows direct access to the full original markup through a DOM interface whenever needed.

2. Features and Benefits

The NewsML Toolkit is implemented in Java and should run on any platform with a Java 2-compliant virtual machine, including (but not limited to) Unix, Linux, Windows NT, Windows 2000, Windows 95/98, and MacOS. To date, the library has been tested only under Linux and Windows.

The NewsML Toolkit and the NewsML Explorer application are both Open Source: freely redistributable, with source code included. The library's license allows it to be incorporated into commercial software packages royalty-free, as long as any modifications or improvements to the library itself are released back to the public. A shared, vendor-friendly open-source library makes it possible for NewsML developers to concentrate on innovation rather than writing basic NewsML processing code over and over again and losing weeks or months tracking down the resulting bugs.

The NewsML Toolkit works with the industry-standard DOM interface for XML processing, and will work with any conformant Java-based DOM library: if you have already assembled an XML toolkit that you're happy with, you do not have to throw it away. While the initial NewsML Toolkit release concentrates on presenting the most important information as simply as possible, the full XML markup is always available through the DOM whenever needed.
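
As a rough illustration of what working with the markup through the DOM looks like, the sketch below reads headline text from a package using only the standard org.w3c.dom and JAXP classes. It does not use or depict the NewsML Toolkit's own classes; the class name DomHeadlineSketch, the helper method, and the cut-down sample document are assumptions for illustration only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper (not part of the NewsML Toolkit): walks a DOM tree
// and collects the text of every <HeadLine> element.
class DomHeadlineSketch {

    // Simplified sample; a real NewsML package contains much more markup.
    static final String SAMPLE =
        "<NewsML>"
      + "<NewsItem><NewsComponent><NewsLines>"
      + "<HeadLine>First story</HeadLine>"
      + "</NewsLines></NewsComponent></NewsItem>"
      + "<NewsItem><NewsComponent><NewsLines>"
      + "<HeadLine>Second story</HeadLine>"
      + "</NewsLines></NewsComponent></NewsItem>"
      + "</NewsML>";

    // Parses the XML text into a DOM Document and returns the headlines
    // in document order.
    static List<String> headlines(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        NodeList nodes = doc.getElementsByTagName("HeadLine");
        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i).getTextContent().trim());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        for (String headline : headlines(SAMPLE)) {
            System.out.println(headline);
        }
    }
}
```

Because the code depends only on the DOM interfaces, the parser behind DocumentBuilderFactory can be swapped for any conformant implementation without changing it.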

The NewsML Toolkit saves developers time and money by allowing non-XML-specialists to develop NewsML-based applications quickly and easily.

3. Library Structure

The NewsML Toolkit contains many classes to represent the different kinds of information that can be present in a NewsML package, but most NewsML work is based on five key classes:

This class represents the top-level NewsML package, containing one or more NewsItem objects. The top-lev