Today I noticed that a bash script I created a couple of weeks ago to calculate some log file statistics was no longer working properly.
The culprit was the following line:
REGEX=`printf "^%s %i %.2i:%.2i\n" ${MONTH} ${DAY} ${HOUR} ${TMP_MIN}`
This regular expression was designed to match lines that start with a certain date and time, like the following:
May 28 12:02
The regular expression was no longer matching the log file entries I wanted it to. The problem was the day value. When I wrote and tested the script, the day had two digits. Now, at the beginning of a new month, the days have a single digit. The log file I was analyzing pads the day field with a space so that it is always two characters wide. Thus, the regular expression no longer matched the lines because of the unexpected extra space. Figuring out what was broken and fixing the regular expression didn’t take long.
REGEX=`printf "^%s %2i %.2i:%.2i\n" ${MONTH} ${DAY} ${HOUR} ${TMP_MIN}`
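To make the difference between the two format strings visible, here is a minimal sketch. The values assigned to MONTH, DAY, HOUR and TMP_MIN are made up for illustration (the real script computes them elsewhere), and OLD_REGEX and NEW_REGEX are hypothetical names used only here. The brackets in the output make the padding easy to see.

```shell
#!/usr/bin/env bash
# Hypothetical sample values; the original script derives these elsewhere.
MONTH="Jun"
DAY=3
HOUR=9
TMP_MIN=5

# Original format: %i prints the day with no padding.
OLD_REGEX=$(printf "^%s %i %.2i:%.2i" "${MONTH}" "${DAY}" "${HOUR}" "${TMP_MIN}")

# Fixed format: %2i right-aligns the day in a field two characters wide,
# padding with a space on the left when the day has a single digit.
NEW_REGEX=$(printf "^%s %2i %.2i:%.2i" "${MONTH}" "${DAY}" "${HOUR}" "${TMP_MIN}")

# Quote the variables so the padding survives, and bracket them for clarity.
printf '[%s]\n' "${OLD_REGEX}" "${NEW_REGEX}"
# prints:
# [^Jun 3 09:05]
# [^Jun  3 09:05]
```

For a two-digit day both formats produce the same string, which is why the original version passed testing later in the month.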
The script now worked properly, but I noticed something weird. I had added a debugging statement below the REGEX definition.
echo ${REGEX}
This debugging statement was printing the regular expression with one space between the month and day, not the expected two. Yet, the script appeared to work perfectly. What was going on?
In trying to figure this out I went as far as creating a quick C program so that I could see exactly what was being passed in as the arguments.
Of course, it turned out to be something that should have been obvious. Because the variable was not quoted, bash split its value into separate words at each run of whitespace before echo ever ran, so echo received each part of the string as an individual argument. The echo command prints its arguments with a single space between them, which is why the second space disappeared. The ‘fix’ was to enclose the bash variable in double quotes so that it would be passed as a single argument.
echo "${REGEX}"
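The effect can be reproduced without a C program. The sketch below assumes a sample REGEX value and uses a hypothetical helper function, count_args, that simply reports how many arguments it was given.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the number of arguments it received.
count_args() {
    echo "$#"
}

# Sample value with two spaces between the month and the day.
REGEX="^Jun  3 09:05"

# Unquoted: bash splits the value on whitespace into three words.
UNQUOTED_COUNT=$(count_args ${REGEX})
# Quoted: the value is passed as one argument, spaces intact.
QUOTED_COUNT=$(count_args "${REGEX}")

# echo joins its arguments with single spaces, so the padding is lost
# in the unquoted case and preserved in the quoted case.
UNQUOTED_OUT=$(echo ${REGEX})
QUOTED_OUT=$(echo "${REGEX}")

printf 'unquoted: %s argument(s) [%s]\n' "${UNQUOTED_COUNT}" "${UNQUOTED_OUT}"
printf 'quoted:   %s argument(s) [%s]\n' "${QUOTED_COUNT}" "${QUOTED_OUT}"
# prints:
# unquoted: 3 argument(s) [^Jun 3 09:05]
# quoted:   1 argument(s) [^Jun  3 09:05]
```

Note that the variable itself always held two spaces; only the unquoted expansion in the debugging statement dropped one.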
Bash programming rule #x: Always put double quotes around variables.