Jabber/XMPP pubsub

Most people who know about Jabber/XMPP think of it as an instant messaging platform. Of course, that is the primary use for Jabber at present, but that may not always be the case.

The Jabber/XMPP network forms an XML-based overlay network. Each message or packet of information carried by this overlay network is an XML stanza. You can think of Jabber servers as XML routers and the clients as end nodes. In fact, the instant messaging portions of the XMPP standards are defined in a separate RFC from the core XML streaming technology (RFC 3920 for the core, RFC 3921 for instant messaging).

One example of a non-IM use of Jabber is defined in JEP-0072: SOAP over XMPP. This document specifies how SOAP, which is normally used with HTTP to form web services, can be carried on top of Jabber/XMPP.

Another interesting non-IM use of Jabber comes from JEP-0060: Publish-Subscribe (aka pubsub). Pubsub is basically an event notification system that runs on top of Jabber/XMPP. In pubsub, a user publishes some XML data to a node on a Jabber server which supports JEP-0060. Other users are then able to “subscribe” to this node. Whenever the node changes, a notification is sent to all subscribed users.
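To make this concrete, here is a rough sketch of what the two halves of that exchange look like on the wire, based on JEP-0060. The addresses, node name, and payload element are made up for illustration; only the `pubsub` and `pubsub#event` namespaces come from the spec.

```xml
<!-- A publisher pushes an item to a node on the pubsub service -->
<iq type="set" from="publisher@example.org/desk" to="pubsub.example.org" id="pub1">
  <pubsub xmlns="http://jabber.org/protocol/pubsub">
    <publish node="library/book-returns">
      <item id="current">
        <status xmlns="urn:example:library">returned</status>
      </item>
    </publish>
  </pubsub>
</iq>

<!-- The service then sends each subscriber a notification message -->
<message from="pubsub.example.org" to="subscriber@example.org">
  <event xmlns="http://jabber.org/protocol/pubsub#event">
    <items node="library/book-returns">
      <item id="current">
        <status xmlns="urn:example:library">returned</status>
      </item>
    </items>
  </event>
</message>
```

Note that the payload can be any namespaced XML, which is what makes pubsub useful well beyond IM.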

There are lots of interesting things that could be done with pubsub. Off hand, here are a few examples:

  • You want to check out a book from the local library. Unfortunately, someone else already has the book. To find out when the book has been returned, you subscribe to the node that represents that book on the library’s pubsub server. Once the book is returned, you will know instantly.
  • You plan on purchasing a large, expensive TV in the near future. Rather than manually looking at the websites for several major retailers every few days, you subscribe to the pubsub node at each retailer for the particular TV model you are interested in. If any of the retailers have a sale, you find out instantly.
  • If, like many people, you use an RSS reader to keep up with new posts on your favourite blogs, you know that RSS readers periodically poll all the feeds on your list. Often there are no new posts, and this polling is a waste of resources. Instead, a pubsub-enabled blog could notify interested readers of a new post. Not only do you find out about the new post sooner, network resources are saved.

In all of the above examples, subscribing to the particular pubsub node could be as simple as clicking on a link (JEP-0147: XMPP URI Scheme Query Components).
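For example, JEP-0060 registers a `pubsub` query type for xmpp: URIs, so a library could (hypothetically — the service address and node name below are invented) publish a link like this, and a link-aware Jabber client would subscribe with one click:

```
xmpp:pubsub.library.example?pubsub;action=subscribe;node=book-returns
```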

Also interesting is JEP-0163: Personal Eventing Protocol, which defines a subset of the full pubsub (JEP-0060) specification that can be used for simpler instant messaging related tasks, such as providing current geographic location information (JEP-0080: User Geolocation) or providing contacts with information about the music you are currently listening to (JEP-0118: User Tune).
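As a sketch of how that might look, here is a User Tune publish in the style of JEP-0118 (the sender address and song details are invented; the `tune` node and namespace are the ones the spec defines):

```xml
<iq type="set" from="romeo@montague.example/home" id="tune1">
  <pubsub xmlns="http://jabber.org/protocol/pubsub">
    <publish node="http://jabber.org/protocol/tune">
      <item>
        <tune xmlns="http://jabber.org/protocol/tune">
          <artist>Gerald Finzi</artist>
          <title>Introit</title>
        </tune>
      </item>
    </publish>
  </pubsub>
</iq>
```

Under JEP-0163 the user’s own server acts as the pubsub service, so contacts who have advertised interest receive the notification automatically.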

It will be interesting to see how pubsub will be integrated into other network applications such as RSS readers and Jabber IM clients. It seems likely that pubsub notifications will be handled either by a Jabber client separate from the one that is used for IM or at least the Jabber IM client will have to distinguish these events from normal IM traffic.

For a nice overview of pubsub (with pretty pictures) see Jive Software: All About Pubsub.

The Vatican’s astronomer

Quirks and Quarks is the CBC’s weekly science and technology radio show. It is also available as a Podcast.

This past week’s episode contains an interview with the Vatican’s astronomer. He has a very interesting take on the intersection of science and religion. Definitely worth listening to.

Many people think that science and religion don’t mix. But Brother Guy Consolmagno couldn’t disagree more. He’s a Jesuit, and also an accomplished astronomer – in fact, he works for the Vatican Observatory. And for Brother Guy, science and religion aren’t in conflict in the least. He sees them as two compatible and complementary ways to seek the truth about the universe. This Easter weekend, Brother Guy tells us how he views the cosmos – both literally and spiritually.

Linux and proprietary (graphics) drivers

From New Linux look fuels old debate:

For Nvidia, intellectual property is a secondary issue. “It’s so hard to write a graphics driver that open-sourcing it would not help,” said Andrew Fear, Nvidia’s software product manager. In addition, customers aren’t asking for open-source drivers, he said.

The open-source community already maintains many drivers. Even if Nvidia’s drivers are somehow better at present, I bet Nvidia would be very surprised how quickly the community would improve them. “It’s so hard to write a graphics driver that open-sourcing it would not help” sounds like something people would have said about building a high-quality operating system like Linux 10 years ago.

Second, as an Nvidia customer, I am asking for open-source drivers. I am sick of the driver dance that closed drivers force me to go through. I want my graphics driver to be packaged and updated as necessary by my distribution, just like the rest of my system. I want an open-source driver so that the Xorg developers can modify the driver to take advantage of new features and architectural changes. As the speed of development on Xorg increases (which appears to be the case in recent history), proprietary drivers are going to have more difficulty keeping pace.

The next graphics card I buy will have good open-source drivers, even if it is slower than the alternative with proprietary drivers. From the article linked above, it looks like it may use an Intel graphics chip.

Note: If you don’t understand why the Linux kernel developers dislike the idea of closed-source drivers so much you should read Linux in a binary world… a doomsday scenario by Arjan van de Ven (also linked to in the quoted article).

Business as Morality

Doc Searls: Business as Morality reprints an email written by Doc Searls discussing business morality. As with most of Doc’s writing, it is worth reading. However, I would like to draw a little attention to one of the comments posted in response. It starts with the text “Wake the dragon”. This comment discusses the effects of the enormous cost reductions that the Internet has brought to content creation and distribution. The main idea is that the cost of creating and distributing content has been reduced to the point where content is being created without a profit motivation. This leads to a situation where for-profit companies must compete with entities that do not need to make money.

The main difference in the scenario above [media consolidation] and the current one that exist in the internet business sector is that the old scenario of market domination, and consolidation has been super imposed as a belief model in an space that it will not fit.

They [newspapers regarding on-line classified ads] also viewed the internet in an old world economic framework that postulates that business are only created and survive when revenue can be generated that makes the endeavor profitable.

Blogs, search engines and WordPress

One problem with the blog format is that the same content can show up on several URLs. This content layout is nice for humans. In the case of my blog, the same post content can show up on the main page, a category URL and an archive URL.

Unfortunately, what is convenient for humans is not so good for search engines. There are two aspects of the standard blog format that cause search engine problems. The first is the dynamic nature of some blog URLs. Consider the main page of an active blog. Most only show about ten posts; older posts are removed as newer ones are created. Often this results in a particular URL not containing the content the search engine thinks it does. Personally, I find this incredibly annoying, since I often have to search the site with a local search engine after Google has directed me to the main page of a blog. The second problem of the blog format with respect to search engines is that some URLs, like a category URL, contain many posts which are not directly related to a particular search. This results in having to search the page with the browser’s find function after the search engine gets you there.

Both of these problems have been annoying me for some time now. So today I did a little digging. Fortunately there is a solution, the robots meta tag. This tag specifies, on a page by page basis, whether or not the content on the current page should be indexed by the search engine and if links on the current page should be followed.

The solution then, is simple. URLs which contain multiple posts should be marked “noindex,follow” while individual posts should be marked “index,follow”. This should result in the content of each post only being in the search engine database once. I also found a post called A critical SEO Tip for WordPress which describes a way to accomplish this in WordPress. The slightly modified version of this solution which I have added to my WordPress theme’s header.php is below. Unless there are downsides to this approach that I don’t know of, I think every theme author should add something like this to their theme.

<?php
if (is_single() || is_page() || is_author()) {
    echo "<meta name=\"robots\" content=\"index,follow\"/>\n";
} else {
    echo "<meta name=\"robots\" content=\"noindex,follow\"/>\n";
}
?>

Isn’t it semantic?

Isn’t it semantic?: An interview with Sir Tim Berners-Lee. There are some interesting comments on the semantic web in this interview.

In physics, to take the behaviour of gases as an example, you visualize them as billiard balls, model the rules they follow and then transpose that to a larger scale to account for the effects of temperature and pressure – so physicists analyze systems. Web scientists, however, can create the systems.

So we could say we want the Web to reflect a vision of the world where everything is done democratically, where we have an informed electorate and accountable officials. To do that we get computers to talk with each other in such a way as to promote that ideal.