What can you do with a while loop? One common technique is to read the output of previous commands.
Let’s say you’re using the Subversion revision control system, which is executable as svn.
(This example is very similar to what you would do for cvs as well.)
When you check the status of a directory subtree to see what files have been changed, you might see something like this:
    $ svn status bcb
    M      bcb/amin.c
    ?      bcb/dmin.c
    ?      bcb/mdiv.tmp
    A      bcb/optrn.c
    M      bcb/optson.c
    ?      bcb/prtbout.4161
    ?      bcb/rideaslist.odt
    ?      bcb/x.maxc
    $
The lines that begin with question marks are files about which Subversion has not been told; in this case they’re scratch files and temporary copies of files.
The lines that begin with an A are newly added files, and those that begin with M have been modified since the last changes were committed.
To clean up this directory it would be nice to get rid of all the scratch files, which are those files named in lines that begin with a question mark.
Try:
    svn status mysrc | grep '^?' | cut -c8- | \
      while read FN; do echo "$FN"; rm -rf "$FN"; done
or:
    svn status mysrc | \
    while read TAG FN
    do
        if [[ $TAG == \? ]]
        then
            echo $FN
            rm -rf "$FN"
        fi
    done
Both scripts will do the same thing—remove files that svn reports with a question mark.
The first approach uses several subprograms to do its work (not a big deal in these days of gigahertz processors), and would fit on a single line in a typical terminal window.
It uses grep to select only the lines that begin (signified by the ^) with a question mark.
The expression '^?' is put in single quotes so that bash passes it to grep unchanged; unquoted, the ? could be treated by bash as a filename wildcard.
It then uses cut to take only the characters beginning in column eight (through the end of the line).
That leaves just the filenames for the while loop to read.
The read will return a nonzero value when there is no more input, so at that point the loop will end.
Until then, each read assigns the line of text that it reads to the variable $FN, and that is the filename that we remove.
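A minimal sketch of how read ends the loop, assuming two made-up filenames supplied by a here-document instead of svn. The counter variable count is hypothetical, and the input is redirected rather than piped only so that count is still visible after the loop (a loop on the receiving end of a pipe typically runs in a subshell).

```shell
# Count the lines that read hands to the loop body.
count=0
while read FN; do
    count=$((count + 1))
done <<'EOF'
bcb/dmin.c
bcb/mdiv.tmp
EOF

# With nothing left to read, read reports failure (a nonzero status).
read FN </dev/null
status=$?

echo "lines read: $count"
echo "status of read at end of input: $status"
```

The loop body runs once per input line, and the while stops as soon as read returns a nonzero status.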
We use the -rf options in case the unknown file is actually a directory of files, and to remove even read-only files.
If you don’t want/need to be so drastic in what you remove, leave those options off.
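Before deleting anything, it can help to rehearse the pipeline. The sketch below assumes a hypothetical function, fake_svn_status, that stands in for svn status mysrc so that it runs without Subversion, and it replaces the rm with an echo. The sample filenames are invented.

```shell
# Hypothetical stand-in for `svn status mysrc`; the filename starts in column 8.
fake_svn_status() {
    printf '%s\n' \
        'M      mysrc/amin.c' \
        '?      mysrc/dmin.c' \
        '?      mysrc/my notes.tmp' \
        'A      mysrc/optrn.c'
}

# Same grep and cut stages as the first script, but only reporting.
fake_svn_status | grep '^?' | cut -c8- | \
    while read FN; do echo "would remove: $FN"; done
```

Only the two lines that begin with a question mark get through grep, and cut strips the status columns, so the loop sees just the filenames. Once the list looks right, the echo can be swapped back for the rm.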
The second script can be described as more shell-like, since it doesn’t need grep to do its searching (it uses the if statement) and it doesn’t need cut to do its parsing (it uses the read statement).
We’ve also formatted it more like you would format a script in a file.
If you were typing this at a command prompt, you could collapse the indentation, but for our use here the readability is much more important than saving a few keystrokes.
The read in this second script is reading into two variables, not just one. That is how we get bash to parse the line into two pieces—the leading character and the filename.
The read statement parses its input into words, like words on a shell command line.
The first word on the input line is assigned to the first word in the list of variables on the read statement, the second word to the second variable, and so on.
The last variable in the list gets the entire remainder of the line, even if it’s more than a single word.
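A small illustration of that word splitting, using a single made-up status line with spaces in the filename. The variable names FIRST and REST are chosen only for this sketch.

```shell
# One line of input, two variables: FIRST takes the first word,
# REST takes everything that remains, embedded spaces included.
read FIRST REST <<'EOF'
?      bcb/my scratch file.txt
EOF

echo "FIRST=[$FIRST]"
echo "REST=[$REST]"
```

The brackets in the output make it easy to see that FIRST holds only the question mark, while REST holds the whole filename with its embedded spaces intact.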
In our example, $TAG gets the first word, which is the status character (an M, A, or ?); the whitespace that follows it marks the end of that word and the beginning of the next.
The variable $FN gets the remainder of the line as the filename, which is significant here in case the filenames have embedded spaces.
(We wouldn’t want just the first word of the filename.) The script removes the file, and the loop continues.
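To try the second approach safely, here is a sketch that runs the same kind of loop against a scratch directory. It assumes mktemp -d is available, uses printf as a hypothetical stand-in for the svn status output, and swaps the [[ ... ]] test for the portable [ ... ] form purely so that the sketch also runs under plain sh. All filenames are invented.

```shell
# Scratch directory so nothing real is touched.
WORK=$(mktemp -d)
touch "$WORK/amin.c" "$WORK/dmin.c" "$WORK/old notes.tmp"

# Hypothetical stand-in for `svn status` output, fed to the loop.
printf '%s\n' \
    "M      $WORK/amin.c" \
    "?      $WORK/dmin.c" \
    "?      $WORK/old notes.tmp" |
while read TAG FN
do
    # Only lines tagged with a question mark name files to remove.
    if [ "$TAG" = '?' ]
    then
        echo "removing: $FN"
        rm -rf "$FN"
    fi
done

# Show what is left in the scratch directory.
ls "$WORK"
```

Only amin.c should remain, since it is the one file whose line does not start with a question mark; the file with a space in its name is removed whole because $FN receives the entire remainder of the line. The scratch directory itself can be deleted afterward with rm -rf "$WORK".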