Command-line for Editors, Part 2: Breaking down Part 1

In Part 1, I showed you simple folder creation and sorting commands, as a demonstration of how command-line scripting can make your life easier. In this installment, I’ll break down in nitty-gritty detail how the command-line syntax works, and what all those commands are really doing.

The first important thing to note is that these command-line scripts are written using Bash, which was the default command-line “shell” on macOS when this article was first written (see the 2019 update below). For the most part, you don’t really need to worry about learning the nuances between all the different types of command-line shells. Bash also happens to be the default shell on most Linux distributions, so if you have to learn only one Unix shell, Bash is the one worth spending your time on.

2019 UPDATE: With the release of macOS Catalina, Apple has changed the default shell to zsh. While Bash (albeit an older version of it) will hopefully still be included in the standard install of macOS for the foreseeable future, it’s highly recommended that you learn to write new scripts using zsh, in case Apple deprecates and removes the Bash interpreter from a future version of macOS, as they did with Python 2. Thankfully, zsh is very similar to Bash, so the learning curve for switching is not huge. I highly recommend reading Armin Briegel’s fantastic Moving to zsh series, which is very helpful in highlighting the differences between Bash and zsh.

Using “For loops” to create a series of folders

The first command in Part 1 created a series of alphabetically named blank folders when typed into a new Terminal prompt:

for i in {A..Z}; do mkdir "$i"; done

That may look like random noise, but let’s break this simple command down so it makes more sense. First, take note of the two semicolons in the command. These semicolons are there to separate the different parts of the command from each other, kind of like how you’d separate sentences in a paragraph from one another using periods.

for i in {A..Z}; do mkdir "$i"; done

In this example, the semicolons split the line into three parts. The first one, for i in {A..Z}, opens what is called a “for loop”.

For loops allow you to take a list of items, then perform the same programmatic operation on each individual item in that list. It’s very important to understand how loops work, as they are foundational to almost every programming language in existence, and are the primary method for automating and batch processing tasks in those languages.

for i in {A..Z}; do mkdir "$i"; done

For loops work on lists of items. In our example, we want to create folders named after each letter of the alphabet, which can be conceptually represented as a list of letters in a document, one letter on each line.

You could do something like this:

for i in A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

But this is very unwieldy. Thankfully, most shell scripting languages have handy-dandy shortcuts that make ordered lists like this easier to both type and read.

In Bash, the {A..Z} is a shorthand expression that represents a range of characters or numbers that the For loop will iterate over, just like any other list. So if you wanted to iterate over a list of numbers between 10 and 100, you could change the expression to {10..100}. You can also do fun things like {100..1}, which creates a list of numbers between 1-100, except in backwards order.
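One low-risk way to see what a brace expansion produces is to hand it to echo instead of a for loop. This is only a sketch for experimenting at a Bash prompt; the variable names (letters, ascending, descending) are made up for illustration.

```shell
#!/usr/bin/env bash
# Brace expansion happens before the command runs, so echo simply
# receives the same list that a for loop would iterate over.
letters=$(echo {A..E})       # A B C D E
ascending=$(echo {10..15})   # 10 11 12 13 14 15
descending=$(echo {5..1})    # 5 4 3 2 1

echo "$letters"
echo "$ascending"
echo "$descending"
```

Swapping echo back out for a for loop gives you the same items, one per iteration.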

for i in {A..Z}; do mkdir "$i"; done

The “for i in” part can be a bit confusing, especially if you look at a lot of example code that can be found online. The “for” begins the For loop, and the “i” serves as a placeholder—more properly called a variable in computer programming—for the currently looped item in the list.

In natural language, this part of the command is saying “Loop sequentially through every item in a list of alphabetic letters, and then on each iteration of the loop, put the name of the letter into a variable named 'i' (overwriting the existing contents of that variable).”

What can be initially confusing for some programming newbies is that ‘i’ variable. The pro tip here is that it doesn’t actually need to be called ‘i’. This is just a programming convention that many experienced coders use when iterating over For loops. For more complex scripts, I prefer to use more descriptive variables (e.g. “for letter in {A..Z}”) so it’s easier to decipher my code when debugging it, or if I ever have to modify it many years later. However, for short scripts like this one, using ‘i’ is simpler, and saves you a lot of keystrokes.
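As a small illustration of that naming choice, here is the same loop shape with a descriptive variable. It is a sketch rather than part of the original command: it uses echo instead of mkdir so nothing gets created, and the names letter and visited are placeholders chosen for the example.

```shell
#!/usr/bin/env bash
# Same loop shape as the folder command, but with a descriptive
# variable name. echo stands in for mkdir, so no folders are made.
visited=""
for letter in {A..C}; do
  echo "Current letter: $letter"
  visited="$visited$letter"   # collect the letters the loop has seen
done
echo "Visited: $visited"
```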

for i in {A..Z}; do mkdir "$i"; done

The second part of the command (which comes after the first semicolon) is the command that will be executed upon each iteration of the For loop.

The “do” keyword marks the start of the loop body: everything between “do” and “done” is run as a command once for each item in the list. In this case, we want to run the mkdir command, which is a command that makes a directory (aka “folder”) with the directory name specified in the ‘$i’ variable.

So when we run this command, the ‘i’ variable will first be assigned the letter “A”. The for loop then continues to the next command, which creates a directory named “A” in the same directory that the command was run from.

The “done” keyword marks the end of the loop body. When Bash reaches it, the for loop starts its next iteration, which reads the next item in the list, then overwrites the existing contents of the ‘i’ variable with the letter “B”. The loop continues to iterate until it runs out of list values, at which point the loop exits.
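Semicolons and line breaks are interchangeable here, so the same loop can be spread over several lines, which is how it would usually appear in a saved script. The sketch below assumes you want to try it without cluttering a real folder, so it runs inside a throwaway directory created with mktemp; workdir and count are illustrative names.

```shell
#!/usr/bin/env bash
# Work in a temporary directory so the experiment leaves no mess behind.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# The one-line command from above, written one part per line.
for i in {A..Z}
do
  mkdir "$i"
done

# Count what was created (tr strips the padding some versions of wc add).
count=$(ls | wc -l | tr -d ' ')
echo "Created $count folders in $workdir"
```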

Notice that the ‘i’ variable in the mkdir command now has a dollar sign in front of it. The dollar sign tells the script to read the value that’s contained in ‘i’, instead of assigning a value to it, like we did in the initial for loop command.

A very important note is that the ‘$i’ variable is enclosed in double quotes, in case the contents of the variable contain spaces, which would cause the command to either fail or run with unexpected results. It’s always a best practice to enclose variables in double quotes when you’re recalling them in a string-based context, even if you aren’t expecting spaces in the values.
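To see why the quotes matter, the following sketch runs mkdir twice with a name that contains spaces, once quoted and once unquoted. It works inside a temporary directory; the variable names (name, quoted_count, unquoted_count) and the “unquoted” subfolder are invented for the demonstration.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
cd "$workdir" || exit 1

name="Abadi MT Std"

# Quoted: mkdir receives one argument and makes one folder.
mkdir "$name"
quoted_count=$(ls | wc -l | tr -d ' ')

# Unquoted: Bash splits the value on spaces, so mkdir receives
# three arguments and makes three folders: Abadi, MT and Std.
mkdir unquoted && cd unquoted || exit 1
mkdir $name
unquoted_count=$(ls | wc -l | tr -d ' ')

echo "quoted: $quoted_count folder, unquoted: $unquoted_count folders"
```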

Sorting files into alphabetic folders

The second command I demonstrated sorted a bunch of font folders into those “A-Z” folders, based on the first character of the font folder’s name:

shopt -s nocasematch; for dir in *; do cp -r "$dir" "/Volumes/path/to/alphabetized/folders/${dir:0:1}"; done

Let’s break down what each command does!

shopt -s nocasematch

shopt is a Bash command (short for “shell options”) that allows you to modify some of the default behaviors of a Bash shell.

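Because Unix is an inherently case-sensitive operating system (unlike Windows), we want our script to be able to sort a folder named “foobar” into the “F” folder, without having to worry about the command failing because there’s no pre-existing lowercase “f” folder. So we use the “-s” flag to tell the shopt command to set the nocasematch option, which makes Bash’s pattern matching case-insensitive. One caveat: according to the Bash manual, nocasematch applies to pattern matching in places like case statements and [[ ]] tests, not to the file paths that get handed to cp. On a Mac, “foobar” still lands in the “F” folder because the default macOS file system is case-insensitive; on a case-sensitive volume, you would need to uppercase the first letter yourself.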
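The effect of nocasematch is easiest to see in a [[ ]] pattern test, which is one of the places the Bash manual says the option applies. This is a minimal sketch, separate from the sorting command itself; before and after are hypothetical variable names.

```shell
#!/usr/bin/env bash
# Start from the default behavior: pattern matching is case-sensitive.
shopt -u nocasematch
if [[ "foobar" == F* ]]; then before="match"; else before="no match"; fi

# -s sets the option, so "F*" now also matches a lowercase "f".
shopt -s nocasematch
if [[ "foobar" == F* ]]; then after="match"; else after="no match"; fi

# -u unsets it again so later commands are not affected.
shopt -u nocasematch
echo "default: $before, with nocasematch: $after"
```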

for dir in * 

This works similarly to the for loop in our first folder creation command. But in this case, the “*” is a wildcard character that expands to a list of every file or folder in the current folder that you ran the command from (hidden items, whose names start with a dot, are skipped by default). Each item gets placed in a variable named “dir” (for “directory”) upon each iteration of the loop.
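Here is a rough sketch of what the wildcard hands to the loop, using a temporary directory with a couple of made-up items in it. It also shows one possible variation, “*/”, which matches only folders; that variation is an addition for illustration and is not used in the original command.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir "Abadi MT Std" "Futura"
touch notes.txt .hidden

# "*" matches files and folders alike, but skips the dot-file.
all=""
for item in *; do all="$all[$item]"; done

# "*/" matches folders only; ${dir%/} trims the trailing slash
# that the pattern leaves on each name.
dirs_only=""
for dir in */; do dirs_only="$dirs_only[${dir%/}]"; done

echo "everything:   $all"
echo "folders only: $dirs_only"
```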

do cp -r "$dir" "/Volumes/path/to/alphabetized/folders/${dir:0:1}"

This command uses the Unix “cp” command to copy the folder in the “$dir” variable to another location (the /Volumes/path/to/alphabetized/folders/ directory).

The “-r” flag tells the cp command to copy the contents of the folder recursively (in other words, copy everything inside the $dir folder, and not just the folder name itself).
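A quick way to convince yourself that the recursive copy brings nested content along is to try it on a tiny folder tree. The folder and file names below (Futura, Bold, Futura-Bold.otf, dest) are placeholders, and everything happens inside a temporary directory.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a small font folder with a subfolder and an (empty) file in it.
mkdir -p "Futura/Bold" dest
touch "Futura/Bold/Futura-Bold.otf"

# Copy the folder, and everything inside it, into dest.
cp -r "Futura" "dest/"

# List the copy to confirm the nested file came along.
ls -R dest
```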

The “${dir:0:1}” section is an example of a really cool feature of Bash called parameter expansion. Notice that the dir variable is inside of curly braces, and that the dollar sign, which normally denotes a variable reference, now appears on the outside of those curly braces. Those curly braces tell the Bash interpreter that you want to do stuff with the variable, and not just simply read the value currently stored in the variable, as-is. One way of “doing stuff” with the variable is via parameter expansion.

Parameter expansion allows you to manipulate or extract portions of the value stored in a variable, and then output the result without actually having to store the result in another intermediate variable name.

In this case, we want to sort the font’s folder into a destination folder that matches the first character of the font folder’s name, which is stored in the $dir variable. We can do this by extracting just the first letter from the font’s folder name, then telling the cp command to copy the font folder into a folder matching that same letter inside the /Volumes/path/to/alphabetized/folders/ directory.

To extract a substring from a variable, you use the following parameter expansion syntax: ${variable_name:offset:length}. So if the current value of $dir is “Abadi MT Std”, we can extract just the “A” in “Abadi” by using an offset of 0 (Bash indexes character counts starting at zero), and a length of 1.
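The offset and length are easier to get a feel for with a few variations on the same value. This is just a sketch for experimenting; first, word and rest are illustrative names, and leaving off the length (as in the last line) means “everything from the offset onward”.

```shell
#!/usr/bin/env bash
dir="Abadi MT Std"

first="${dir:0:1}"   # offset 0, length 1  -> A
word="${dir:0:5}"    # offset 0, length 5  -> Abadi
rest="${dir:6}"      # offset 6, no length -> MT Std

echo "$first"
echo "$word"
echo "$rest"
```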

The coolest thing about this is you can use the result of a parameter expansion just like any other variable. So “/Volumes/path/to/alphabetized/folders/${dir:0:1}” would resolve to “/Volumes/path/to/alphabetized/folders/A” before it even gets fed to the cp command: on each pass through the for loop, Bash performs the parameter expansion first and only then runs the command.
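If you want to check what a path will resolve to before copying anything, one option is to put echo in front of the command as a dry run. The sketch below assumes the placeholder destination path from this article; resolved is a made-up variable name.

```shell
#!/usr/bin/env bash
dir="Abadi MT Std"

# Bash expands ${dir:0:1} while building the string, so the
# variable already holds the finished path.
resolved="/Volumes/path/to/alphabetized/folders/${dir:0:1}"

# echo prints the command that would run, without running it.
echo cp -r "$dir" "$resolved"
```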

So with just two simple command lines, we can accomplish in seconds what might otherwise take half an hour of clicking around with the mouse and keyboard, and with a lot less chance of making mistakes!
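Before pointing these commands at real fonts, it can be worth rehearsing both steps on throwaway data. The following is a sketch under a few assumptions: src and dest are temporary directories standing in for the font folder and the /Volumes/path/to/alphabetized/folders/ destination, the three font names are invented, and the tr step is an addition (not in the original command) that uppercases the first letter so the sort also works on a case-sensitive file system.

```shell
#!/usr/bin/env bash
src=$(mktemp -d)    # stands in for the folder full of fonts
dest=$(mktemp -d)   # stands in for the alphabetized destination

# Some sample font folders, including one with a lowercase name.
mkdir "$src/Abadi MT Std" "$src/futura" "$src/Helvetica"

# Step 1: create the A-Z folders in the destination.
cd "$dest" || exit 1
for i in {A..Z}; do mkdir "$i"; done

# Step 2: copy each font folder into the folder for its first letter.
cd "$src" || exit 1
for dir in *; do
  # Uppercase the first character so "futura" lands in "F".
  letter=$(printf '%s' "${dir:0:1}" | tr '[:lower:]' '[:upper:]')
  cp -r "$dir" "$dest/$letter"
done

# Show the destination folders that received a font folder.
ls "$dest/A" "$dest/F" "$dest/H"
```

Once the output looks right, the two temporary paths can be swapped for your real folders.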

Additional references