Avoiding identity crises
By Julian Perkin
Two waves of change - the need for governments to share information to protect the public and the trend in policy-making towards open government and greater transparency - underpin the need for global standards that enable information to be correctly and uniquely identified, wherever it is held.
National governments, the European Union and the Organisation for Economic Co-operation and Development (OECD) are investing in two related technical innovations - "Handles" and "Digital Object Identifiers" - which are rising to prominence as the key technologies to achieve this (see the first of this two-part feature, in the November 17 FT-IT).
The Handle System uses persistent identifiers, known as "handles", which reference digital information on the internet more robustly than normal web links. DOIs are a standard method, based on this system, for identifying published digital content. A DOI identifies the content itself, whereas a web link references only its location, and it is a change of location that causes the broken links we have all become used to on the worldwide web.
Handle-based identifiers can also be used, through a system of access via trusted intermediaries, to ensure that everyone gets the same, definitive version of a document, ending the confusion caused by different versions of the same material popping up at various locations on the web.
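The distinction between identifying content and referencing a location can be illustrated with a minimal Python sketch. It is not the Handle System's actual protocol or API: the `HandleRegistry` class, the handle `example/report-2005` and the URLs are hypothetical, chosen only to show that a location can change while the identifier stays fixed.

```python
class HandleRegistry:
    """Toy in-memory stand-in for a resolver (hypothetical, not the real Handle API)."""

    def __init__(self):
        self._locations = {}  # handle -> current URL

    def register(self, handle, url):
        # Bind a persistent identifier to the content's current location.
        self._locations[handle] = url

    def move(self, handle, new_url):
        # The content moved: only the registry entry changes, not the handle.
        if handle not in self._locations:
            raise KeyError(f"unknown handle: {handle}")
        self._locations[handle] = new_url

    def resolve(self, handle):
        # Look up where the content lives right now.
        return self._locations[handle]


registry = HandleRegistry()
registry.register("example/report-2005", "http://old.example.org/report.pdf")
registry.move("example/report-2005", "http://new.example.org/archive/report.pdf")
current_location = registry.resolve("example/report-2005")
```

Anyone who stored the handle rather than the old URL still reaches the document after the move, which is the property that an ordinary web link lacks.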
Robin Wilson, director of digital identifier and metadata services at the UK's TSO (The Stationery Office), which provides publishing services to the UK government, parliament and assemblies, likens the difference between traditional web links and Handles to that between house bricks and industrial strength bricks. "On the face of it they do the same job," he says, "but the industrial strength bricks are more durable, more flexible, better for rendering, water-proof etc. They may be harder work for the builders, but they provide much more value for the user."
Mr Wilson believes the key to enabling the information systems of government departments to inter-operate is to be able to identify each element of information in a way that is unique, universally recognisable, based on a global standard and guaranteed not to change over time. Such a system can then be relied upon always to resolve to a real document and a definitive version.
"Government departments have to get out of their silo mentalities," says Mr Wilson. "Whenever you need to have systems interoperating, you face the problem of federated object identification."
Mr Wilson says the Handle System can be used directly by government departments for information sharing, helping them meet demands such as those driven by freedom of information and data protection legislation. "Government licences of the Handle Architecture have been configured for the sharing of grey internal information between departments and inter-governmentally," he explains. ("Grey" means information that may be made public though it was not created for publication.)
TSO was the UK's first DOI registration agency and allocates persistent digital identifiers for the official publications marketplace within the UK. It has recently announced that it will distribute these identifiers free of charge. This should encourage all government departments to adopt and use persistent digital identifiers, thus helping the government fulfil its open-government commitments. It is also a sound business strategy for TSO, which has been a private company since 1996: with widespread adoption, it could make money on associated services, such as embedding DOIs within picture archives.
OPOCE, the Luxembourg-based Office for Official Publications of the European Communities, also intends to adopt Digital Object Identifiers and is following TSO's model, having become a DOI registration agency in August this year. OPOCE manages 60 publishing services with 1,700 titles. An official journal containing every regulation and directive issued is published twice a day in all 20 of the European Union's languages.
"It is a huge amount of material that needs to be managed and this requires electronic identification, especially since EU information is increasingly accessible 'digital first'," says Serge Brack, head of author services at OPOCE. "DOI is a very suitable solution because it is highly flexible, it exists and has been operational for some time, it is becoming recognised throughout the world and it addresses many of our needs - notably questions of copyright, accessibility and traceability."
Mr Brack adds that the system is flexible and adaptable because it can accommodate any kind of numbering system, including ISBN, ISSN and the indexing schemes of existing information systems. It can also accommodate the transfer of ownership from one organisation to another - for example, when a new department takes over responsibility for publishing reports in a subject area.
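As a rough illustration of how an existing numbering scheme can be carried inside a DOI, the Python sketch below joins a registrant prefix and a suffix with a slash, which is the general shape of a DOI name. The prefix `10.99999`, the ISBN-style suffix and the department names are made up for the example, and the `owner` field is a simplified stand-in for the metadata a registration agency would hold.

```python
from dataclasses import dataclass


@dataclass
class Record:
    doi: str    # fixed once assigned
    owner: str  # may change when responsibility is transferred


def make_doi(prefix, existing_id):
    # A DOI name has the form "<prefix>/<suffix>"; the suffix can reuse
    # an identifier the publisher already has, such as an ISBN.
    if not prefix.startswith("10."):
        raise ValueError("DOI prefixes begin with '10.'")
    return f"{prefix}/{existing_id}"


def transfer(record, new_owner):
    # Ownership changes; the identifier itself is left untouched.
    return Record(doi=record.doi, owner=new_owner)


# Hypothetical prefix, suffix and owners, for illustration only.
record = Record(make_doi("10.99999", "ISBN-0-00-000000-0"), "Department A")
moved = transfer(record, "Department B")
```

The point of the design is that a transfer touches only the record's metadata, so every reference to the identifier made before the transfer remains valid afterwards.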
Another service that uses DOIs is CrossRef (www.crossref.org), operated by the Publishers International Linking Association, through which academic journals can link their citations to articles in other publications. The DOIs ensure the links remain valid over time.
In the commercial sector, New York-based Content Directions (CDI) is spearheading innovation in DOIs, providing publishers, information providers and e-commerce businesses with a simple means to turn their web links into persistent DOI links that can present users with convenient menus of references to related material rather than a simple one-to-one web link.
These menus are built dynamically so that changes such as an additional menu item - for example, for a new edition of a publication, or a new product specification sold online - will be automatically and instantly reflected in all websites, personal bookmarks and search results that reference the web page through the DOI link.
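The effect described here, where one stored link picks up new menu items without being edited, can be sketched in Python. This is a toy model rather than CDI's product or the DOI system's real resolution mechanism; the `MenuResolver` class, the DOI `10.99999/market-report` and the URLs are hypothetical.

```python
class MenuResolver:
    """Toy resolver mapping one identifier to a list of (label, URL) items."""

    def __init__(self):
        self._menus = {}

    def add_item(self, doi, label, url):
        # Append a menu entry; existing links to the DOI need no changes.
        self._menus.setdefault(doi, []).append((label, url))

    def menu(self, doi):
        # Built at lookup time, so callers always see the current entries.
        return list(self._menus.get(doi, []))


resolver = MenuResolver()
doi = "10.99999/market-report"  # hypothetical identifier
bookmark = doi                  # what a website or user has stored

resolver.add_item(doi, "First edition", "http://example.org/report/1")
before = resolver.menu(bookmark)

resolver.add_item(doi, "Second edition", "http://example.org/report/2")
after = resolver.menu(bookmark)  # same bookmark, longer menu
```

Because the menu is assembled when the link is followed rather than when it is published, the bookmark never needs updating.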
CDI's clients include major publishers such as Harvard Business Online, McGraw Hill and Bowker, but its installations exploit the DOI system's inherent strengths and typically use the client's existing metadata, so they require no bespoke development and only modest set-up, mainly of menu structures. The technology is therefore also accessible to smaller organisations such as London-based Snapshots International, which publishes market research reports. Debra Curtis, chief executive, says: "We have no IT development team and minimal IT budget." Yet DOIs enabled the company to present all its related publications to users from a single link, and the company finds its publications ranking high in the relevant results returned by search engines.
The intricate multiple linking of DOIs tends automatically to give sites a more prominent position in search engine results - a highly valuable feature.
The OECD, based in Paris and representing 30 of the world's most advanced economies, has become TSO's first DOI customer. OECD publishes thousands of charts and tables in its books and reports. Toby Green, head of marketing at OECD Publishing, says the amount of information in a chart is constrained by the physical size of the page, and customers often want a greater range, finer granularity or a different selection of data, typically for different countries. In future, DOI links will be shown at the bottom of each chart, enabling users to click through to a bigger or more up-to-date version of the table.
OECD Publishing is developing a service called "Stat Link" that will allow users to see the spreadsheet source of the chart so that further analysis can be done using different parameters, for example, applying different assumptions of growth rates. This concept could one day extend to allowing users to access the original databases that house the statistics, so that further information can be drawn from the database in addition to the figures held in the spreadsheet.
A key strength is that DOIs will facilitate a degree of convergence between printed reports and online data. The DOI code will also be printed below the tables and charts in hard copy reports and books. This code can be typed into a web browser to access the same services - latest figures, more exhaustive statistics, access to source data etc. "In this way, the printed page itself becomes interactive", says Mr Green.
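Turning a printed DOI into something a browser can open amounts to appending it to a resolver address. The short Python sketch below assumes the public proxy at `https://doi.org/`, which the article does not name, and uses a made-up DOI; the `doi_to_url` helper is hypothetical.

```python
from urllib.parse import quote

RESOLVER = "https://doi.org/"  # assumed public DOI proxy


def doi_to_url(printed_doi):
    # Tolerate a leading "doi:" label and stray whitespace from retyping.
    doi = printed_doi.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI: {printed_doi!r}")
    # Percent-encode characters that are unsafe in a URL path, keeping "/".
    return RESOLVER + quote(doi, safe="/")


url = doi_to_url("doi:10.99999/table 3.1")
```

The same short code therefore works whether it is clicked online or copied by hand from a printed page.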
The Stat Link service gives a taste of how DOIs will affect us all. DOIs will tackle a fundamental weakness of the web by identifying actual content rather than locations that cannot be wholly relied upon.