I finally got around to watching A New Way to Look at Networking yesterday. This is a talk given by Van Jacobson at Google in 2006 (yes, it has been on my todo list for a long time). It is definitely worth watching if you are interested in networking.
A couple of quick comments (These are not particularly deep or anything. This is mostly for my own reference later.):
- He says that the current Internet was designed for conversations between end nodes but we’re using it for information dissemination.
- Me: This distinction relies on the data being disseminated to each user being identical. However, in the vast majority of cases even data that is identical on the surface, such as web site content, is actually unique for each visitor. Any site with advertisements or customizable features is a good example. As a result we are still using the Internet for conversations in most situations.
- He outlines the development of networking:
- The phone network was about connecting wires. Conversations were implicit.
- The Internet added metadata (the source and destination) to the data which allowed for a much more resilient network to be created. The Internet is about conversations between end nodes.
- He wants to add another layer where content is addressable rather than the source or destination.
- He argues for making implicit information explicit so the network can make more intelligent decisions.
- This is what IP did by adding the source and destination to data.
- His idea of identifying the data rather than the source or destination is very interesting. A consequence of this model is that data must be immutable, identifiable and carry built-in metadata such as the version and the date. It strikes me how closely the internal operation of the Git version control system matches these requirements.
If you are interested in how Git works internally take a look at Git for Computer Scientists. This document explains how Git stores data in a DAG and even has pretty pictures.
Interesting programming language article.
The Semicolon Wars from American Scientist.
A catalog maintained by Bill Kinnersley of the University of Kansas lists about 2,500 programming languages. Another survey, compiled by Diarmuid Piggott, puts the total even higher, at more than 8,500. And keep in mind that whereas human languages have had millennia to evolve and diversify, all the computer languages have sprung up in just 50 years. Even by the more-conservative standards of the Kinnersley count, that means we’ve been inventing one language a week, on average, ever since Fortran.
If you follow many software or computer science blogs you may have already seen the article linked below. I’m going to link to it again anyway because everyone who is involved in software should read it.
Extra, Extra – Read All About It: Nearly All Binary Searches and Mergesorts are Broken
The general lesson that I take away from this bug is humility: It is hard to write even the smallest piece of code correctly, and our whole world runs on big, complex pieces of code.
The following article offers a nice introduction to some design techniques that may be used to create more reliable operating systems.
Nevertheless, it is interesting to note that microkernels, long discarded as unacceptable because of their lower performance compared with monolithic kernels, might be making a comeback due to their potentially higher reliability, which many people now regard as more important than performance. The wheel of reincarnation has turned.
Can We Make Operating Systems Reliable and Secure? by Andrew S. Tanenbaum
Computer science growing into a basic science
The talks demonstrated that research in computer science is moving beyond the study of a set of “computing machinery.” Just as mathematics and physics have matured into fundamental sciences, computer science, too, is graduating into one.
Last Tuesday I attended CASCON 2005. CASCON is hosted by IBM’s Centers for Advanced Studies. I have been to many technology conferences in the past such as Internet World but this was the first academic conference I have attended. As such, I don’t have anything to compare CASCON against. The conference itself seemed to be organized well. The atmosphere was very relaxed.
The keynote speech for the day was by Rob Clyde from Symantec Corp. His speech was entertaining and had lots of good statistics on the current state of computer security. Throughout the whole speech one thought kept circling in my mind: the security industry is far more worried about managing the security problems that plague computer networks than solving them. This makes sense, since it is hard to sell solutions to problems that no longer exist. The moral of this story is that computer science as a discipline shouldn’t be looking to the mainstream computer security industry for solutions to basic security problems.
A key part of CASCON is the technology showcase. Interested faculty and students are given small booths where they can present their current research to anyone interested. The closest analogy may be an elementary school science fair for adults. This was a great way to get some idea of what other people are currently researching, and it also provided me with many ideas for my own thesis topic.
Perhaps the most memorable part of my CASCON experience came after the conference was over for the day. During dinner I lucked into sitting beside Dr. Morven Gentleman. A short while into the meal I discovered that, among several other distinguished positions, Morven had worked at Bell Labs during the late sixties. If you know anything about the history of computing you probably know that both Unix and C were developed at Bell Labs during this time. Hearing first-hand anecdotes about the formative years of Unix and C was absolutely fabulous. The rest of the dinner consisted of me peppering Morven with questions about the history of computing, which he seemed happy to answer. Hopefully I wasn’t too annoying.
For an interesting opinion on software patents check this article out. I really like his conception of the line between what is patentable and what is not.
Linux Journal has an article in the January 2005 issue that introduces a doubly linked list that is designed for memory efficiency.
Typically elements in doubly linked list implementations consist of a pointer to the data, a pointer to the next node and a pointer to the previous node in the list.
The more memory efficient implementation described in the article stores a single offset instead of the next and previous pointers.
The pointer difference is calculated by taking the exclusive or (XOR) of the memory locations of the previous and the next nodes. As in most linked list implementations, a NULL pointer indicates a non-existent node; this is used at the beginning and end of the list. For a three-node list A, B, C, the pointer differences would be calculated as follows:
A[Pointer difference] = NULL XOR B
B[Pointer difference] = A XOR C
C[Pointer difference] = B XOR NULL
One nice property of XOR is that it is its own inverse: XORing with the same value twice cancels out, so given any two of the three values below you can recover the third. For example:
A XOR B = C
C XOR B = A
A XOR C = B
The memory efficient linked list uses this property of XOR for traversals. The trick is that any traversal operation requires both the address of the current node and the address of either the preceding or following node.
Using the example list above, calculating the address of the B node from A looks like:
B = NULL XOR A[Pointer difference]
What is really interesting is that traversing the list operates exactly the same in both directions. As shown below, calculating the address of node A or C from B simply depends on which direction the traversal is going.
A = C XOR B[Pointer difference]
C = A XOR B[Pointer difference]
The original article presents some time and space complexity results. I won’t bother repeating them here.