C The Command Line

The command line is an interface to a computer—a way for you (the human) to communicate with the machine. But unlike common graphical interfaces that use windows, icons, menus, and pointers, the command line is text-based: you type commands instead of clicking on icons. The command line lets you do everything you’d normally do by clicking with a mouse, but by typing in a manner similar to programming!

An example of the command line in action (from Wikipedia).

The command line is not as friendly or intuitive as a graphical interface: it’s much harder to learn and figure out. However, it has the advantage of being both more powerful and more efficient in the hands of expert users. (It’s faster to type than to move a mouse, and you can do lots of “clicks” with a single command.) The command line is also used when working on remote servers or other computers that for some reason do not have a graphical interface enabled. Thus, the command line is an essential tool for all professional developers, particularly when working with large amounts of data or files.

This chapter will give you a brief introduction to basic tasks using the command line: enough to get you comfortable navigating the interface and able to interpret commands.

For an updated and expanded version of this content, see Chapter 2 of Programming Skills for Data Science. The text is available online through Safari Online, free for students with a UW NetID.

C.1 Accessing the Command Line

In order to use the command line, you will need to open a command shell (a.k.a. a command prompt). This is a program that provides the interface to type commands into. You should have installed a command shell (hereafter “the terminal”) as part of setting up your machine.

Once you open up the shell (Terminal or Git Bash), you should see something like this (red notes are added):

A newly opened command line.

This is the textual equivalent of having opened up Finder or File Explorer and having it show you the user’s “Home” folder. The text shown lets you know:

  • What machine you’re currently interfacing with (you can use the command line to control different computers across a network or the internet).
  • What directory (folder) you are currently looking at (~ is a shorthand for the “home directory”).
  • What user you are logged in as.

After that you’ll see the prompt (typically denoted as the $ symbol), which is where you will type in your commands.
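As a rough illustration, the same pieces of information shown in the prompt can also be printed by commands. This sketch assumes a Bash-like shell (such as Terminal or Git Bash); whoami, uname, and pwd are standard commands that this chapter does not otherwise cover, and the output will be different on every machine.

```shell
whoami      # prints the user you are logged in as
uname -n    # prints the name of the machine you are interfacing with
echo ~      # prints the full path that the ~ shorthand stands for
pwd         # prints the directory (folder) you are currently looking at
```

In a newly opened shell, the last two commands typically print the same path, since the shell starts in the home directory.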

C.3 File Commands

Once you’re comfortable navigating folders in the command line, you can start to use it to do all the same things you would do with Finder or File Explorer, simply by using the correct command. Here is a short list of commands to get you started using the command prompt, though there are many more:

Command    Behavior
mkdir      make a directory
rm         remove a file or folder
cp         copy a file from one location to another
open       opens a file or folder (Mac only)
start      opens a file or folder (Windows only)
cat        concatenate (combine) file contents and display the results
history    show previous commands executed

Warning: The command line makes it dangerously easy to permanently delete multiple files or folders and will not ask you to confirm that you want to delete them (or move them to the “recycling bin”). Be very careful when using the terminal to manage your files, as it is very powerful.

Be aware that many of these commands won’t print anything when you run them. This often means that they worked; they just did so quietly. If it doesn’t work, you’ll know because you’ll see a message telling you so (and why, if you read the message). So just because you didn’t get any output doesn’t mean you did something wrong—you can use another command (such as ls) to confirm that the files or folders changed the way you wanted!
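The following is a minimal sketch of a few of these commands working quietly, with ls used to confirm each change. It assumes a Bash-like shell; the folder and file names (notes, a.txt, b.txt) are made up for illustration, and the first line moves into a throwaway folder created by mktemp (a command not covered in this chapter) so that nothing important is touched. The > used to create the first file is a redirect, described under Directing Output.

```shell
# Work in a throwaway folder so no real files are affected
cd "$(mktemp -d)"

mkdir notes                        # make a directory; prints nothing on success
echo "first note" > notes/a.txt    # create a small file to work with
cp notes/a.txt notes/b.txt         # copy it; prints nothing on success
ls notes                           # confirm: both a.txt and b.txt are listed
cat notes/a.txt notes/b.txt        # display the contents of both files
rm notes/a.txt                     # delete one file; no confirmation is asked
ls notes                           # confirm: only b.txt remains
```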

C.3.1 Learning New Commands

How can you figure out what kind of arguments these commands take? You can look it up! This information is available online, but many command shells (though not Git Bash, unfortunately) also include their own manual you can use to look up commands!

man mkdir

This will show the manual for the mkdir program/command.

Because manuals are often long, they are opened up in a command line viewer called less. You can “scroll” up and down by using the arrow keys. Hit the q key to quit and return to the command-prompt.

The mkdir man page.

If you look under “Synopsis” you can see a summary of all the different arguments this command understands. A few notes about reading this syntax:

  • Recall that anything in brackets [] is optional. Arguments that are not in brackets (e.g., directory_name) are required.

  • “Options” (or “flags”) for command line programs are often marked with a leading dash - to make them distinct from file or folder names. Options may change the way a command line program behaves—like how you might set “easy” or “hard” mode in a game. You can either write out each option individually, or combine them: mkdir -p -v and mkdir -pv are equivalent.

    • Some options may require an additional argument beyond just indicating a particular operation style. In this case, you can see that the -m option requires you to specify an additional mode parameter; see the details below for what this looks like.
  • Underlined arguments are ones you choose: you don’t actually type the word directory_name, but instead your own directory name! Contrast this with the options: if you want to use the -p option, you need to type -p exactly.

Command line manuals (“man pages”) are often very difficult to read and understand: start by looking at just the required arguments (which are usually straightforward), and then search for and use a particular option if you’re looking to change a command’s behavior.
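As a small sketch of the option syntax described above, the commands below use the -p, -v, and -m options from the mkdir synopsis. The folder names are hypothetical, the work happens in a throwaway folder created by mktemp (not covered in this chapter), and the exact text printed by -v varies between systems.

```shell
# Work in a throwaway folder so no real files are affected
cd "$(mktemp -d)"

# -p creates any missing parent folders; -v prints each folder as it is made
mkdir -p -v projects/reports/drafts

# The same two options, combined behind a single dash
mkdir -pv projects/data/raw

# -m needs an extra argument: the permission mode for the new folder
# (700 means only the owner can read, write, or enter it)
mkdir -m 700 private

ls -l    # the first column shows each folder's permissions
```

Note that projects, reports, and the other names are the parts you choose, while -p, -v, and -m must be typed exactly.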

For practice, try to read the man page for rm and figure out how to delete a folder and not just a single file. Note that you’ll want to be careful, as this is a good way to break things.

C.3.2 Wildcards

One last note about working with files. Since you’ll often work with multiple files, command shells offer some shortcuts for referring to files with similar names. In particular, you can use an asterisk * as a wildcard when naming files. This symbol acts like a “wild” or “blank” tile in Scrabble: it can be “replaced” by any character (or any sequence of characters, including none) when determining what file(s) you’re talking about.

  • *.txt refers to all files that have .txt at the end. cat *.txt would output the contents of every .txt file in the folder.

  • hello* refers to all files whose names start with hello.

  • hello*.txt refers to all files that start with hello and end with .txt, no matter how many characters are in the middle (including none!).

  • *.* refers to all files that have an extension.
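To see these patterns in action, here is a minimal sketch that creates a few sample files and then lists them with each wildcard. The file names are invented for illustration, and the throwaway folder comes from mktemp (not covered in this chapter). The order in which matching names are listed can vary between systems.

```shell
# Work in a throwaway folder containing a few sample files
cd "$(mktemp -d)"
echo "one" > hello.txt
echo "two" > hello-world.txt
echo "three" > goodbye.txt
echo "four" > hello.md
mkdir archive    # a name with no extension

ls *.txt         # matches hello.txt, hello-world.txt, and goodbye.txt
ls hello*        # matches hello.txt, hello-world.txt, and hello.md
ls hello*.txt    # matches hello.txt and hello-world.txt
ls *.*           # matches the four files, but not archive
cat *.txt        # outputs the contents of every .txt file
```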

C.4 Dealing With Errors

Note that the syntax of these commands (how you write them out) is very important. Computers aren’t good at figuring out what you meant if you aren’t really specific; forgetting a space may result in an entirely different action.

Try another command: echo lets you “echo” (print out) some text. Try echoing "Hello World" (which is the traditional first computer program):

echo "Hello world"

What happens if you forget the closing quote? You keep hitting “enter” but you just get that > over and over again! What’s going on?

  • Because you didn’t “close” the quote, the shell thinks you are still typing the message you want to echo! When you hit “enter” it adds a line break instead of ending the command, and the > marks that you’re still going. If you finally close the quote, you’ll see your multi-line message printed!
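For comparison, here is a sketch of what a deliberately multi-line message looks like once the quote is finally closed. When typed interactively, the shell would show the > marker at the start of the second line; that marker is not something you type.

```shell
# The message continues across the line break until the closing quote
echo "Hello
world"
```

This prints the two words on separate lines, because the line break is part of the quoted text.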

IMPORTANT TIP: If you ever get stuck in the command line, hit ctrl-c (the control and c keys together). This almost always means “cancel”, and will “stop” whatever program or command is currently running in the shell so that you can try again. Just remember: “ctrl-c to flee”.

(If that doesn’t work, try hitting the esc key, or typing exit, q, or quit. Between them, these will get you out of most command line programs.)

Throughout this book, we’ll discuss a variety of approaches to handling errors in computer programs. While it’s tempting to disregard dense error messages, many programs do provide error messages that explain what went wrong. If you enter an unrecognized command, the terminal will inform you of your mistake:

lx
> -bash: lx: command not found

However, forgetting arguments yields different results. In some cases, there will be a default behavior (see what happens if you enter cd without any arguments). If more information is required to run a command, your terminal will provide you with a brief summary of the command’s usage:

mkdir
> usage: mkdir [-pv] [-m mode] directory ...

Take the time to read the error message and think about what the problem might be before you try again.

C.5 Directing Output

So far all these commands have either modified the file system or printed some output to the terminal. But you can specify that you want the output to go somewhere else (e.g., to save it to a file for later). These are called redirects. Redirect commands are usually single punctuation marks, because they are meant to be as quick to type as possible (at the cost of being hard to read!).

  • > says “take output of command and put it in this file”. For example echo "Hello World" > hello.txt will put the outputted text “Hello World” into a file called hello.txt. Note that this will replace any previous content in the file, or create the file if it doesn’t exist. This is a great way to save the output of your command line work!

  • >> says “take output of command and append it to the end of this file”. This will keep you from overwriting previous content.

  • < says “take input from this file”. This is a much less common redirect.

  • | says “take the output of this command and send it to the next command”. For example, cat hello.txt | less would take the contents of the hello.txt file (as printed by cat) and send them to the less program, which gives that arrow-based “scrolling” interface that man pages use.

Redirects are a more “advanced” usage of the command line, but now you know what those random symbols mean if you see them!
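The four redirects can be tried together in a short sketch like the one below. It assumes a Bash-like shell; wc -l (count lines) and sort are standard commands that this chapter does not otherwise cover, and the throwaway folder comes from mktemp.

```shell
# Work in a throwaway folder so no real files are affected
cd "$(mktemp -d)"

echo "Hello World" > hello.txt       # create (or overwrite) hello.txt
echo "Goodbye World" >> hello.txt    # append a second line to the file
cat hello.txt                        # shows both lines

wc -l < hello.txt                    # take input from the file and count its lines
cat hello.txt | sort                 # send cat's output to sort, which orders the lines

echo "Starting over" > hello.txt     # a single > replaces the previous content
cat hello.txt                        # shows only "Starting over"
```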

C.6 Shell Scripts

Shell commands are a way to tell the computer what to do—in effect, program it!

But often the instructions you want a computer to perform are more complex than a single command (even with redirects), or are something you want to be able to save and repeat later (beyond just looking it up in the history). It is useful if you can write down all the instructions in a single place, and then order the computer to execute all of those instructions at once. This list of instructions is called a script, and a script made up of shell commands is called a shell script. Executing or “running” a script will cause each instruction (line of code) to be run in order, one after the other, just as if you had typed them in one by one. Writing scripts allows you to save, share, and re-use your work: by saving instructions in a file, you can easily check, change, and re-execute the list of instructions (assuming you haven’t caused any irreversible side effects).

Bash shell scripts (also known as “Bash scripts”) are generally written in files that end in the .sh extension (for shell). Once you’ve written a shell script, you can execute it on the command line simply by typing the file name as the command:

./my-script.sh

(The ./ indicates that we want to execute the file in the current folder, instead of looking for a program elsewhere on the computer as normally happens with commands).

  • Important: On OS X or Linux, you will need to give the file permission to be executed in order to run it. Use the chmod (change mode) command to add (+) execution permissions to the file:

    chmod +x my-script.sh

For portability (e.g., to work well across computers), the first line of any Bash shell script should be a shebang that indicates which shell should be used:

#!/usr/bin/env bash

After that, you can begin listing the commands your script will run, one after another (each on a new line):

#!/usr/bin/env bash

# anything after a '#' is a comment, and will not be executed
echo "Hello world"
echo "This is a second command"
  • You can include blank lines between your commands for readability, and add comments (notes to yourself that are not executed by the computer) by starting a line with #.

Shell scripts are an easy way to make “macros” or shortcuts for commands you want to run, and are common ways of sharing computer instructions with others.
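Putting the steps together, here is a minimal end-to-end sketch: it builds the example script, makes it executable, and runs it. Normally you would write my-script.sh in a text editor; using echo with redirects here is only a stand-in so the whole example can be typed at the prompt. The sketch assumes Bash is installed and works in a throwaway folder created by mktemp (not covered in this chapter).

```shell
# Work in a throwaway folder so no real files are affected
cd "$(mktemp -d)"

# Write the script one line at a time (a stand-in for using a text editor)
echo '#!/usr/bin/env bash' > my-script.sh
echo 'echo "Hello world"' >> my-script.sh
echo 'echo "This is a second command"' >> my-script.sh

cat my-script.sh        # check what the script contains
chmod +x my-script.sh   # give the file permission to be executed
./my-script.sh          # run it: each line executes in order
```

The single quotes around each line keep the shell from interpreting the inner double quotes before the text is written to the file.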