Search Engine on CBC

This probably isn’t news to many people by now, but CBC’s Search Engine will not be returning in the fall. What a loss. To me, Search Engine is a great example of what a radio show and podcast can be. The show had strong audience participation and felt more like a blog than a traditional radio show. More importantly, Search Engine covered digital issues such as copyright reform in a way that is greatly needed right now.

I really hope that CBC will reconsider this cancellation. Public broadcasters need to bring in young people and new listeners, and a new, experimental format like Search Engine is a great way to accomplish this. The huge amount of interest in this spring’s copyright reform bill shows that many Canadians are becoming aware of the topics Search Engine covered. Now is not the time to give up on this show.

Fortunately, it looks like Search Engine’s sister show, Spark, will continue.

End-to-end in standards and software

Two things, both related to Microsoft, though only by coincidence.

The first

Apparently IE8 will allow the HTML author to specify the name and version number of the browser that the page was designed for. For example, the author can add a meta tag that essentially says “IE6”; IE8 will see this tag and switch to rendering the page the way IE6 did. Apparently this came about because IE7 became more standards compliant, thereby ‘breaking’ many pages, especially those on intranets that require the use of IE. The new browser version tag will allow Microsoft to update the browser engine without breaking old pages.

As a result, Microsoft will be forced to maintain the old, broken HTML rendering engine (or at least its behavior) for a very long time. This will consume development resources that could otherwise go into improving IE, and it will increase the browser’s size, its complexity and, undoubtedly, its number of bugs.

As for the pages broken by newer, more standards compliant browsers, what is their value? Any information, on a corporate intranet or otherwise, that has value will be updated to retain that value. If no one bothers to update a page, it was probably nearly worthless anyway. Besides, most HTML pages now in use are generated by a templating system of some kind; it’s not as if each and every page would have to be edited by hand.
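If this ships the way it has been described, the opt-in would presumably look something like the snippet below. The exact header name and values here are my assumption based on early IE8 coverage, not a confirmed specification:

```html
<!-- Hypothetical compatibility opt-in: asks IE8 to render the page
     with the older IE7-era engine instead of its newest, most
     standards compliant mode. -->
<meta http-equiv="X-UA-Compatible" content="IE=EmulateIE7" />
```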

The second

The Linux kernel development process is notorious for improving (breaking) the kernel’s internal driver APIs. This means that a driver written for version 2.6.x might not even compile against 2.6.x+1, let alone be binary compatible. This, of course, causes all kinds of trouble for companies unwilling to open source their drivers. However, the advantages of this process are huge. It is completely normal for an author to learn a lot about how a particular problem can be solved during the development process. By allowing the internal APIs to change, the Linux kernel development model lets authors apply this newfound knowledge rather than be slowed down by past mistakes. As I already mentioned, this causes problems for binary-only kernel drivers, but if the product has value, the manufacturer will update the driver to work with the new kernel release. If it doesn’t, the driver won’t get updated and the kernel doesn’t have to carry around the baggage of supporting the old, inferior design. How does this relate to Microsoft? From Greg Kroah-Hartman:

Now Windows has also rewritten their USB stack at least 3 times, with Vista, it might be 4 times, I haven’t taken a look at it yet. But each time they did a rework, and added new functions and fixed up older ones, they had to keep the old api functions around, as they have taken the stance that they can not break backward compatibility due to their stable API viewpoint. They also don’t have access to the code in all of the different drivers, so they can’t fix them up. So now the Windows core has all 3 sets of API functions in it, as they can’t delete things. That means they maintain the old functions, and have to keep them in memory all the time, and it takes up engineering time to handle all of this extra complexity. That’s their business decision to do this, and that’s fine, but with Linux, we didn’t make that decision, and it helps us remain a lot smaller, more stable, and more secure.
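As a concrete illustration of the kind of internal API change involved, the USB submission function changed between the 2.4 and 2.6 kernels: `usb_submit_urb()` grew a memory-allocation flags argument, so every driver calling it had to be updated when ported forward. The sketch below is simplified and from memory, not an exact copy of the kernel headers:

```
/* 2.4-era internal API (simplified): */
int usb_submit_urb(struct urb *urb);

/* 2.6-era replacement (simplified): callers must now say whether the
 * submission may sleep while allocating memory (GFP_KERNEL) or must
 * not (GFP_ATOMIC). Out-of-tree drivers written against the old
 * signature no longer compile until they are updated to pass the
 * extra argument. */
int usb_submit_urb(struct urb *urb, gfp_t mem_flags);
```

In-tree drivers were simply fixed in the same patch that changed the API, which is exactly the trade-off described above.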

So what was the point?

I don’t know quite what to make of these two little stories, but the latter has been bothering me for some time. Where does the responsibility for dealing with change belong? The Internet has taught us that we should push as much work as possible to the ends of the network; the alternative is rapidly growing complexity and inflexibility in the core. It seems to me that this applies to both of the situations outlined here as well.

Linus on distributed version control and Git

Recently Linus Torvalds gave a presentation at Google about distributed version control. It is a great introduction to distributed version control if you have wondered what the big deal is. Unsurprisingly, the presentation also spends a considerable amount of time talking about Git and picking on CVS and Subversion.

The Future of Computing

The Future of Computing: From mainframes to microblades, farewell to GHz CPUs provides a nice overview of trends in CPU and system design. I have a couple of comments to add.

When in late 1950s computers became fast enough to relieve some of the coding burden from the shoulders of programmers high level languages were developed such as Ada, Algol, Fortran and C. While sacrificing code efficiency big time these high level languages allowed us to write code faster and thus extract more productivity gains from computers.

As time passed we kept sacrificing software performance in favor of developer productivity gains first by adopting object-oriented languages and more recently settling with garbage-collected memory, runtime interpreted languages and ‘managed’ execution. It is these “developer productivity” gains that kept the pressure on hardware developers to come up with faster and faster performing processors. So one may say that part of the reason why we ended up with gigahertz-fast CPUs was “dumb” (lazy, uneducated, expensive — pick your favorite epithet) developers.

Although true in some sense, the term developer productivity is a bit of a misnomer here. High(er) level tools and design methodologies do not just save developer time; they make modern software possible. I seriously doubt that creating a web browser, or any of the other huge pieces of software we use every day, in assembly language is a tractable problem. Even if the problem could be brute forced, the resulting software would likely have a far higher defect rate than current software.

In the long term it makes little sense to burden CPU with DVD playback or SSL encryption. These and similar tasks should and with time will be handled completely by dedicated hardware that is going to be far more efficient (power and performance-wise) than CPU.

This completely ignores one of the most important aspects of fast general purpose CPUs: flexibility. For instance, a computer that relies on a dedicated MPEG decoder for video playback becomes useless when content is provided in another format. Continuing with this example, innovation in the area of video codecs would also become very difficult.

Despite the nitpicks, there is a lot of good information in the article.

Extra, Extra – Read All About It: Nearly All Binary Searches and Mergesorts are Broken

If you follow many software or computer science related blogs you may have already seen the article linked below. I’m going to link to it again anyway because everyone who is involved in software should read it.

Extra, Extra – Read All About It: Nearly All Binary Searches and Mergesorts are Broken

The general lesson that I take away from this bug is humility: It is hard to write even the smallest piece of code correctly, and our whole world runs on big, complex pieces of code.

Software as speech

Well, my sense of software is that it’s something that is both speech and a device, depending on how you define it. When you talk about software as speech, many good things tend to flow from that. When you use software as a device you can get into great benefits and also fairly scary issues.

— Don Marti

The above was taken from the November 2005 issue of Linux Journal, in an article titled “Dialogue with Don”. This article is definitely worth reading if you have access to it or can wait for it to become freely available.

Software analogy

Inside Risks is the last-page column in Communications of the ACM. The Inside Risks column in the September 2005 issue, written by Barbara Simons and Jim Horning, discusses how hard it is to get non-technical people to understand why writing bug-free, and more importantly secure, software is so hard. The article offers a nice analogy, with the following caveat: “Analogy is a poor tool for reasoning, but a good analogy can be very effective in developing intuition.”

One possibly useful analogy is the U.S. Tax Code. Americans have some sense of its complexity and of the large number of people employed in its interpretation. Tax loopholes are analogous to hidden malicious code or Trojan horses in software.

The tax code resembles software in other ways as well:

  • It is intended to be precise and to interface with messy realities of the real world.
  • It has been developed in multiple iterations, responding to changing circumstances and requirements.
  • The people who wrote the original version are no longer around.
  • No one understands it in its entirety.
  • It can be difficult to infer intent simply by reading a section.
  • There are people who actively seek to subvert it.

Of course, there are also major differences between the tax code and software. The tax code is relatively “small”: although it runs to several thousand printed pages, Windows XP has 40 million lines of source code.

Alan Kay quote

If you look at software today, through the lens of the history of engineering, it’s certainly engineering of a sort – but it’s the kind of engineering that people without the concept of the arch did. Most software today is very much like an Egyptian pyramid with millions of bricks piled on top of each other, with no structural integrity, but just done by brute force and thousands of slaves.

— Alan Kay, ACM Queue, Vol 2. No. 9

LQL#

Work has begun on the long-promised Mono (C#) bindings for LQL. The little C# program below displays traffic statistics for all of the queueing disciplines that are supported by the C LQL library.

using System;
using LQL;

class MainClass {
    public static void Main(string[] args) {
        // LQL is built on GObject, so the GLib/Gtk type system must be
        // initialized before any LQL objects are created.
        Gtk.Application.Init();

        // Open a connection to the kernel's traffic control layer.
        LQL.Con con = new LQL.Con();

        // Walk every network interface on the system...
        GLib.List ifList = con.ListInterfaces();
        foreach (LQL.Interface netInf in ifList) {
            // ...and every queueing discipline attached to it.
            GLib.List qdiscList = con.ListQdiscs(netInf);
            foreach (LQL.QDisc qdisc in qdiscList) {
                qdisc.UpdateStats(con);   // refresh counters from the kernel
                qdisc.PrintStats();       // dump them to stdout
            }
        }
    }
}

Very exciting stuff!

LQL# is not nearly polished enough for a public release yet, but I am quite happy with how the work is progressing.