In an ideal publication, every online article has a permanent URL in a consistent format, and every article allows threaded comments. In reality, few major newspapers follow these simple rules. This partly explains the popularity of weblog and CMS platforms like Drupal (used here), which support both. (Incidentally, Clark Hoyt announced yesterday that the Times will be supporting comments on every article.)
Here's an idea: online publications should document their "News Experience" in an XML document. It would explain how URLs are formed (supporting multiple formats), which groups of articles allow comments, and details such as how articles are chunked into pages and how often the home page refreshes. Publications could share these documents so that industry researchers can compare configurations across multiple publications.
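To make the idea a bit more concrete, here is a minimal Python sketch of what such a document might look like and how a researcher's script could read it. Everything in it is an assumption for illustration: the element and attribute names (`newsExperience`, `urlFormat`, `group`, `chunking`, `homePage`), the "Example Daily" publication, and the `parse_news_experience` helper are all hypothetical and not part of any published format.

```python
import xml.etree.ElementTree as ET

# Hypothetical "News Experience" document. Every element and attribute
# name below is an illustrative assumption, not an existing standard.
SAMPLE = """\
<newsExperience publication="Example Daily">
  <urlFormats>
    <urlFormat section="news">/{year}/{month}/{day}/{section}/{slug}.html</urlFormat>
    <urlFormat section="blogs">/blogs/{slug}</urlFormat>
  </urlFormats>
  <comments>
    <group section="blogs" allowed="true" threaded="true"/>
    <group section="news" allowed="false"/>
  </comments>
  <chunking wordsPerPage="800"/>
  <homePage refreshSeconds="300"/>
</newsExperience>
"""


def parse_news_experience(xml_text):
    """Read the hypothetical document into a plain dictionary."""
    root = ET.fromstring(xml_text)
    # Map each section to its URL template.
    url_formats = {
        el.get("section"): el.text.strip() for el in root.iter("urlFormat")
    }
    # Map each section to whether comments are allowed there.
    comments_allowed = {
        el.get("section"): el.get("allowed") == "true"
        for el in root.iter("group")
    }
    return {
        "publication": root.get("publication"),
        "url_formats": url_formats,
        "comments_allowed": comments_allowed,
        "words_per_page": int(root.find("chunking").get("wordsPerPage")),
        "home_refresh_seconds": int(root.find("homePage").get("refreshSeconds")),
    }


config = parse_news_experience(SAMPLE)

# Build an article URL from the declared template for the "news" section.
article_url = config["url_formats"]["news"].format(
    year="2024", month="01", day="15", section="news", slug="example-story"
)
print(article_url)
print(config["comments_allowed"])
```

With documents like this from several publications, comparing them would come down to loading each one and lining up the resulting dictionaries side by side.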
I sketched out a simple XML Schema that can serve as a starting point for how this would work: