RSS Quest: In the Middle… 2001-2004

Internet | Language/Structure

Today, a typical publisher demonstrates its RSS savvy by the number of flavors it supports, including RSS 1.0, RSS 2.0, and Atom, and not by the depth or richness of the metadata that comes with the feed. None offers support for categorization, threading, or discussions through RSS. Blog posts represented in RSS feeds have remained really simple. And that was Winer’s vision. Most of the personal publishing that reached millions on the web derived from his style of weblogging (the pure journaling or diary-writing style developed independently later, according to chronicler Rebecca Blood).

Major news publishers (O’Reilly Media, NYTimes, MSNBC, BBC) have brought non-blog content into their feeds as well. Whatever richness exists in NewsML format on one side of the wall is dropped, since feed readers haven’t demanded anything. The developers of ICE did start to think that they might be able to address this through their standard. From September 2001 to December 2003, ICE 2.0 was developed, and it introduced “Basic ICE” for public feed syndication. The technical walkthrough explains quite clearly: “ICE 2.0 was designed to extend RSS into a business environment.” Granted, that line was in a PDF of a PowerPoint slide; nobody ever thought to put that message in a medium of viral marketing, a blog.

I checked in with Laird Popkin to find out what had happened to any integration efforts. He told me, “Back when we wrote the ICE2 specification, the folks driving each of the RSS variants don’t seem too interested in any discussions that involved cooperating with each other or with the ICE working group, as they each had ideas that they wanted to pursue that would be slowed down with more participants, which I can appreciate as an engineer, but which I think is unfortunate from the perspective of their users.”

When ICE 2.0 and NewsML 1.2 came together in fall 2003, their sponsoring groups called a News Standards Summit in Philadelphia in conjunction with the XML 2003 conference. Representatives of various standards convened: ICE, NewsML, PRISM, and XMP, along with RSS/Atom, represented by Sam Ruby of IBM and Ben Hammersley. Ruby wrote afterwards that he was impressed, but to Popkin’s point, I haven’t found any other follow-up.

Fortunately, some work did get underway over 2004 to integrate RSS with one of the standards, PRISM (which today stands for Publishing Requirements for Industry Standard Metadata). What NewsML is to newspapers, PRISM is to magazines and journals. (Both PRISM and ICE are coordinated by the International Digital Enterprise Alliance; NewsML is a standard of the International Press Telecommunications Council.) The first PRISM specification had been developed in 2001, and had more of a focus, according to its specification, on “describing content and how it may be used.” In August 2004, Nature Publishing Group, the London-based scientific journal publisher, adopted vocabulary from the PRISM standard. Other publishers like Ingenta would soon follow suit, as the NPG developers revealed in an article in D-Lib Magazine (a journal about digital libraries). In addition, the academic-geared social bookmarking site CiteULike has announced support for PRISM metadata.

Still, one would think that the NewsML standard would have some attraction to the new media world, since blog posts tend to resemble newswire stories more than magazine articles in speed and length. Yet there’s comparatively little written about it on the web, and one can look through many books on RSS or RDF before finding a reference. The Online Journalism Review at USC has mentioned NewsML once: editor Robert Niles, in a call last November for new standards and tools for distributed online reporting, dismissed “the overkill of NewsML” in a parenthetical comment. I wrote to him asking for a clarification of that comment. He wrote me back: “NewsML died on the vine in becoming an open standard powering news publishing and delivery *outside* the print newsroom — RSS won that battle. (Though NewsML was in many ways a superior alternative, perhaps the Betamax/VHS analogy applies.)”

If NewsML is a superior alternative, what exactly does it do? I found the answer by entering NewsML and Lexis into Google; I imagined that having news encoded with NewsML was like having the Lexis-Nexis search engine at one’s fingertips. The first item that came up was a graduate term paper at UCLA. The author was Carol Perusso, who, I then learned, had been the editor of and is now on the faculty of the Journalism Department at Cal State-Long Beach. The paper, NewsML: Is it Meeting Its Goals?, included this brief explanation of what it could do:

For example, the Reuters story on June 10 about Iraqi insurgents holding seven Turks hostage had the following NewsML coding supplied by Reuters. This metadata told that the story was about Iraq and Turkey. That metadata, along with information about the urgency of the story and the source (Reuters), allowed the Los Angeles Times to confidently route the story automatically to the World News category of its website without a human editor reviewing it.

What a tremendous use! It’s so powerful that it’s as if there were a conspiracy to keep it under wraps.
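
To make the routing idea concrete, here is a minimal sketch in Python of the kind of rule Perusso describes. It is not Reuters’ or the Los Angeles Times’ actual code, and it does not parse real NewsML: the `StoryMetadata` class, its field names, the trusted-source list, the home-country setting, and the urgency threshold are all assumptions made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical stand-in for the metadata a wire service attaches to a story.
# Field names are illustrative only; they are not NewsML element names.
@dataclass
class StoryMetadata:
    source: str
    urgency: int            # assumed scale: lower number means more urgent
    locations: tuple = ()   # countries the story is about

# Assumptions for this sketch: which sources may bypass a human editor,
# which country counts as "home", and what urgency counts as breaking.
TRUSTED_SOURCES = {"Reuters"}
HOME_COUNTRY = "United States"
BREAKING_THRESHOLD = 2

def route(meta):
    """Pick a site category from metadata alone, or ask for human review."""
    if meta.source not in TRUSTED_SOURCES:
        return "Needs Review"
    if meta.locations and HOME_COUNTRY not in meta.locations:
        return "World News"
    return "Needs Review"

def is_breaking(meta):
    """Flag stories urgent enough for prominent placement."""
    return meta.urgency <= BREAKING_THRESHOLD

story = StoryMetadata(source="Reuters", urgency=3, locations=("Iraq", "Turkey"))
print(route(story))        # World News
print(is_breaking(story))  # False
```

The point of the sketch is that none of the decisions read the story text. With a plain RSS item, which carries little more than a title, link, and description, a rule like this would have nothing to work with.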

Five years ago on the syndication mailing list, Winer told Popkin that he didn’t think Reuters would always be necessary, because “the Internet” would supply a reporter. There may well now be millions of citizen-reporters around the world. The challenge is having one come to you with the urgency and trust you desire.

That’s for the next part, tomorrow.