Today’s bash lesson

Today I noticed that a bash script I created a couple of weeks ago to calculate some log file statistics was no longer working properly.

The culprit was the following line:

REGEX=`printf "^%s %i %.2i:%.2i\n" ${MONTH} ${DAY} ${HOUR} ${TMP_MIN}`

This regular expression was designed to match lines that start with a certain date and time, like the following:

May 28 12:02

The regular expression was no longer matching the log file entries I wanted it to. The problem was the day value. When I wrote and tested the script, the day had two digits. Now, at the beginning of a new month, the days are single digits. The log file I was analyzing pads the day field with a leading space so that it is always two characters wide. Thus, the regular expression no longer matched the lines because of the unexpected extra space. Figuring out what was broken and fixing the regular expression didn’t take long: changing %i to %2i makes printf pad the day to two characters as well.

REGEX=`printf "^%s %2i %.2i:%.2i\n" ${MONTH} ${DAY} ${HOUR} ${TMP_MIN}`

The script now worked properly but I noticed something weird. I had added a debugging statement below the REGEX definition.

echo ${REGEX}

This debugging statement was printing the regular expression with one space between the month and day, not the expected two. Yet, the script appeared to work perfectly. What was going on?

In trying to figure this out I went as far as creating a quick C program so that I could see exactly what was being passed in as the arguments.

Of course, it turned out to be something that should have been obvious. The echo command prints its arguments with a single space between them. Because the variable was not quoted, bash split its value on whitespace and passed each piece to echo as a separate argument, so the double space never reached echo. The regular expression itself was fine; only the debugging output was misleading. The ‘fix’ was to enclose the variable in double quotes so that it would be passed as a single argument.

echo "${REGEX}"

Bash programming rule #x: Always put double quotes around variables.
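The difference is easy to reproduce in isolation. The following is a minimal sketch, not the original script: the values assigned to MONTH, DAY, HOUR and TMP_MIN are made-up samples, and UNQUOTED and QUOTED are names introduced only for this illustration.

```shell
# Hypothetical sample values; the original script computed these elsewhere.
MONTH="Jun"
DAY=3
HOUR=9
TMP_MIN=5

# %2i pads the day with a space to two characters; %.2i zero-pads to two digits.
REGEX=$(printf "^%s %2i %.2i:%.2i" "${MONTH}" "${DAY}" "${HOUR}" "${TMP_MIN}")

# Unquoted: the shell splits the value on whitespace, so echo receives
# separate arguments and joins them with single spaces.
UNQUOTED=$(echo ${REGEX})

# Quoted: echo receives one argument and the double space survives.
QUOTED=$(echo "${REGEX}")

printf 'unquoted: [%s]\n' "${UNQUOTED}"
printf 'quoted:   [%s]\n' "${QUOTED}"
```

The brackets in the output make the number of spaces visible: the unquoted version shows one space between the month and the day, the quoted version shows two.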

Vim tips for DOS text files

DOS (Windows) uses CR-LF to mark the end of lines in text files. Unix just uses LF. Wikipedia has a long article on these differences if you are interested.

Viewed in older versions of vim, DOS text files had a ^M at the end of every line. This made identification of text files that had been uploaded via binary mode FTP very easy. It seems recent versions of vim auto-detect the text file type and no longer show the ^M by default.

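Vim can be told not to try the DOS text file type with the ‘:set fileformats=unix’ command. The option only affects files read after it is set, so put it in your vimrc or reload the current file with ‘:e ++ff=unix’. With this option set, DOS text files will have the familiar ^M at the end of each line.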

The text file type can be changed to Unix for the current buffer (file being edited) by ‘:set fileformat=unix’. Opening a DOS text file, setting the type to be Unix and then saving the file will convert it to a Unix text file.
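For files that never get opened in vim, roughly the same check and conversion can be sketched in the shell. This is only an illustration under assumptions: the scratch files and the variable names (TMP_FILE, UNIX_FILE, CR_BEFORE, CR_AFTER) are invented for the example, and tr removes every carriage return in the file, not just those at the end of a line.

```shell
# Hypothetical scratch file containing two DOS-style (CR-LF) lines.
TMP_FILE=$(mktemp)
printf 'first line\r\nsecond line\r\n' > "${TMP_FILE}"

# Count carriage returns; a non-zero count suggests DOS line endings.
CR_BEFORE=$(tr -cd '\r' < "${TMP_FILE}" | wc -c | tr -d ' ')

# Strip every CR, which leaves Unix-style LF-only line endings.
UNIX_FILE=$(mktemp)
tr -d '\r' < "${TMP_FILE}" > "${UNIX_FILE}"
CR_AFTER=$(tr -cd '\r' < "${UNIX_FILE}" | wc -c | tr -d ' ')

echo "CRs before: ${CR_BEFORE}, after: ${CR_AFTER}"

# Clean up the scratch files.
rm -f "${TMP_FILE}" "${UNIX_FILE}"
```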

IPv6 address usage

I stumbled upon a fabulous article that discusses address utilization in IPv4 and IPv6. Two aspects of this article struck me most: wrapping your head around really big numbers is hard, and it is sad to see that some of the same mistakes made during the early days of IPv4 address allocation may be repeated.

Just how big is IPv6?

Hardware tips from a kernel hacker

Dave Jones, a Red Hat kernel guy, has put together a list of hardware tips. This could be useful for anyone who has to deal with potentially flaky hardware.

http://www.livejournal.com/users/kernelslacker/16552.html

Memory efficient doubly linked list

Linux Journal has an article in the January 2005 issue that introduces a doubly linked list that is designed for memory efficiency.

Typically elements in doubly linked list implementations consist of a pointer to the data, a pointer to the next node and a pointer to the previous node in the list.

Picture of a typical linked list

The more memory efficient implementation described in the article stores a single value, the pointer difference, instead of the next and previous pointers.

Diagram of the memory efficient linked list

The pointer difference is calculated by taking the exclusive or (XOR) of the memory addresses of the previous and next nodes. Like most linked list implementations, a NULL pointer indicates a non-existent node. This is used at the beginning and end of the list. For the diagram above, the pointer differences would be calculated as follows:

A[Pointer difference] = NULL XOR B
B[Pointer difference] = A XOR C
C[Pointer difference] = B XOR NULL

One nice property of XOR is that it is its own inverse: if A XOR B = C, then XORing C with either operand gives back the other. For example:
A XOR B = C
C XOR B = A
A XOR C = B

The memory efficient linked list uses this property of XOR for traversals. The trick is that any traversal operation requires both the address of the current node and the address of either the preceding or the following node.

Using the example figure above, calculating the address of the B node from A looks like:
B = NULL XOR A[Pointer difference]

What is really interesting is that traversing the list works exactly the same way in both directions. As shown below, whether you calculate the address of node A or of node C from B simply depends on which direction the traversal is going.

A = C XOR B[Pointer difference]
C = A XOR B[Pointer difference]
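To convince myself the arithmetic works, here is a small simulation in shell. It is only a sketch of the idea, not the article's implementation: the node "addresses" are made-up integers rather than real pointers, and link_of, name_of and traverse are helper names invented for this example.

```shell
# Made-up integer "addresses" standing in for real pointers; NULL is 0.
NULL=0
A=4096
B=8256
C=12448

# Each node stores only the XOR of its neighbours' addresses.
A_LINK=$(( NULL ^ B ))
B_LINK=$(( A ^ C ))
C_LINK=$(( B ^ NULL ))

# Look up the stored pointer difference for a node address.
link_of() {
  case "$1" in
    "$A") echo "$A_LINK" ;;
    "$B") echo "$B_LINK" ;;
    "$C") echo "$C_LINK" ;;
  esac
}

# Map a node address back to its label for printing.
name_of() {
  case "$1" in
    "$A") echo "A" ;;
    "$B") echo "B" ;;
    "$C") echo "C" ;;
  esac
}

# Walk the list from node $2, having arrived from node $1
# (NULL when starting at either end of the list).
traverse() {
  prev=$1
  cur=$2
  path=""
  while [ "$cur" -ne "$NULL" ]; do
    path="${path}${path:+ }$(name_of "$cur")"
    # The neighbour we did not come from is prev XOR the stored link.
    next=$(( prev ^ $(link_of "$cur") ))
    prev=$cur
    cur=$next
  done
  echo "$path"
}

echo "forward:  $(traverse "$NULL" "$A")"
echo "backward: $(traverse "$NULL" "$C")"
```

The same traverse function handles both directions; the only thing that changes is which end of the list it starts from.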

The original article presents some time and space complexity results. I won’t bother repeating them here.

Red Hat Magazine

Red Hat puts out a monthly publication called Red Hat Magazine. It is usually worth looking at. Often they discuss new technologies that are entering RHEL or Fedora.

This month some of the articles include video presentations. However, the video is only available in RealVideo or QuickTime. These codecs do not ship with Fedora; I’m not sure if they ship with RHEL.

Bad Red Hat!

Sin City

Last Friday Karen and I went to see Sin City. Since then I have made several attempts to write about it without success.

I’m still a little stunned by this movie. It’s scary and disturbing, yet funny at times. The bad guys are sometimes just lowly hit-men out to make a buck; others include cannibals and child molesters. Make no mistake, this movie hits all the evil person buttons. The good guys are still bad guys but do bad things for mostly good reasons.

Sin City is violent, but the violence forms a big part of the world, so it doesn’t seem out of place.

About the only conclusive thing I can say about this movie is that it is the most unique movie I have ever seen. For that reason alone, you should go see it too.

Update: I forgot to add one of my favourite quotes from the movie.

“I love hitmen. No matter what you do to ’em, you never feel bad.”
— Marv (Mickey Rourke)

HTTP Panties

Only ThinkGeek could come up with something as funny as HTTP Panties.

It’s been a long time..

Wow, it’s been a long time since I have posted. So here is a grab bag of what has been going on.

The last semester of my undergraduate degree is almost over. Less than two weeks left.

I handed in my undergraduate thesis last week; the presentation is this Tuesday. Fortunately, I was able to get the main work of the project done a couple of weeks ago, which left me with lots of time to write the final report. This is good because I am a slow writer. At some point I’ll probably post a PDF version of the report here. If you are considering doing an undergraduate thesis, I certainly recommend it. It is really great to be able to work on something personally interesting and get credit for it. Of course, that is the catch: if you’re not interested in the topic, it is probably very hard to force yourself to do the required work.

After the thesis presentation on Tuesday the only course work I have left is two assignments: one for distributed systems and one for cryptography.

Also on the university front, I received my acceptance to the MSc program a few weeks ago.

Starting my MSc in September will be exciting but for now I’m just looking forward to four months with no homework. I’m pretty much at the burnout point.

One goal I have for the summer (besides being outside a lot) is to catch up on my reading. I tend to get quite far behind during the school year. In order to catch up I have to read: 6 issues of Communications of the ACM, 5 issues of Linux Journal, 2 issues of IEEE/ACM Transactions on Networking, 5 issues of Queue and a few other miscellaneous magazines. Plus all of the unread books that have been accumulating.

And of course, some blogging!

Blog spam

The blog spam on my site has gotten so bad that I decided to use WordPress 1.5’s new word blacklist feature. It seems that 90% of the spam I get is for online casinos. These comments always contain either “texas holdem” or “poker”, so I have blacklisted “holdem” and “poker”. This is a pretty big stick that I don’t want to use, but constantly deleting spam in the moderation queue is getting quite annoying.

Dan Siemon

Things that interest me
