Recitation 1c: Manipulating files and directories with the command line

mv: moving and renaming files

Since we’re currently doing a command line tutorial, let’s go into that directory and see what is there.

cd command_line_tutorial
ls

We see that we have a directory called sequences, as well as a FASTA file named some sequence.fasta. This file name has an annoying space in it. We would like to rename it to something without a space, say some_sequence.fasta. To do this, we use the mv command, short for “move.” We enter mv, followed by the name of the file we want to rename, and then its new name.

mv some sequence.fasta some_sequence.fasta

Uh-oh! That gave us some strange output, talking about the usage of mv. This is because the space in the file name some sequence.fasta was interpreted as a separator between arguments, so mv received three arguments instead of two. To specify that the space is part of the file name, we need to use an escape character. The escape character for macOS or Linux is \. With Windows, you can use a caret ^ as an escape character, or you can enclose the file name containing the space in single quotes. A space that follows the escape character is not treated as an argument separator. To change the name, do the following.

  • macOS or Linux: mv some\ sequence.fasta some_sequence.fasta

  • Windows: mv 'some sequence.fasta' some_sequence.fasta
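To see the difference for yourself without touching the tutorial files, here is a minimal Bash sketch. It assumes a throwaway directory made with mktemp, and the empty file it creates is only a stand-in for the real FASTA file. It renames the file once with a backslash escape and once with quotes.

```shell
# Work in a throwaway directory so the tutorial files are untouched.
demo=$(mktemp -d)
cd "$demo"

# Create an empty placeholder file whose name contains a space.
touch "some sequence.fasta"

# Option 1: escape the space with a backslash.
mv some\ sequence.fasta some_sequence.fasta

# Put the space back so we can try the second option.
mv some_sequence.fasta 'some sequence.fasta'

# Option 2: enclose the whole name in quotes.
mv 'some sequence.fasta' some_sequence.fasta

ls
```

Either way, ls should show only some_sequence.fasta at the end.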

However, if these files were under version control, you should precede the mv command with git. That way, Git will keep track of the naming changes you made. So, you would do this:

  • macOS or Linux: git mv some\ sequence.fasta some_sequence.fasta

  • Windows: git mv 'some sequence.fasta' some_sequence.fasta

Now, we probably want this file in the sequences directory. We can also move files into directories (without changing their file names) using the mv command.

mv some_sequence.fasta sequences/

The trailing slash is not necessary, but I always include it out of habit to remind myself that I am moving a file to a directory. (Again, if these files were under version control, you would precede the above command with git.)
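The trailing slash habit can also act as a small safety net. The sketch below again uses a scratch directory from mktemp and an empty placeholder file, both of which are my assumptions and not part of the tutorial materials. On the GNU and BSD versions of mv that I am aware of, the move is refused if the target directory does not exist, whereas without the slash mv would quietly rename the file to sequences.

```shell
# Scratch directory and placeholder file, for illustration only.
demo=$(mktemp -d)
cd "$demo"
touch some_sequence.fasta

# The directory does not exist yet, so with the trailing slash mv fails
# instead of renaming the file to "sequences".
if ! mv some_sequence.fasta sequences/ 2>/dev/null; then
    echo "sequences/ does not exist yet, nothing was moved"
fi

# Create the directory, then move the file into it.
mkdir sequences
mv some_sequence.fasta sequences/
ls sequences
```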

Now let’s go into the sequences directory and see what we have.

cd sequences
ls

We see that some_sequence.fasta is there, along with other FASTA files.

Word to the wise: NO SPACES

Look at what is in the directory using ls.

ls

The directory command_line_tutorial/ has some files that will help us through this lesson. Note that there are no spaces in the directory name. In general, you should avoid spaces in directory and file names, even though your operating system often puts them there. Trust me on this, they can make things a total mess, especially on the command line, since a space also separates a command from its arguments and the arguments from each other. Really. NO SPACES.
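If you want to see exactly how Bash splits a name with a space, here is a small sketch. The function count_args is a hypothetical helper I made up for this illustration. It is not a standard command and simply reports how many arguments it received.

```shell
# count_args is a made-up helper: it prints the number of arguments it got.
count_args() {
    echo "$#"
}

count_args some sequence.fasta      # prints 2: the space splits the name in two
count_args some\ sequence.fasta     # prints 1: the escaped space is part of the name
count_args 'some sequence.fasta'    # prints 1: the quotes keep the name together
count_args some_sequence.fasta      # prints 1: no space, no problem
```

No files need to exist for this to work, since the shell splits the arguments before any command looks at them.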

Exploring file content

We would like to see what is in the sequence files. Bash offers various ways to display the content of files. We’ll look at the genome of the dengue virus in the file dengue.fasta. There are lots of ways to do it. We’ll start with less. It got its name because it is more feature-rich than more, which was used to look at files before less came to be. (“less is more,” get it?) It lets you use the up and down arrow keys to move up or down by line. It also allows scrolling by touchpad or mouse. Since it doesn’t require the whole file to be read before displaying the top content, it’s ideal for larger files. It also supports searching, initiated by “/” followed by the query; shift+g goes to the end of the file; g goes to the beginning; and you can jump to a specific line by typing the line number followed by g.

  • macOS or Linux: less dengue.fasta

  • Windows: more dengue.fasta

To exit less or more, hit Q.

We’ll now look at several other ways to look at files. Just substitute them for less in the above command.

cat

cat prints the entire file to the standard output (terminal). This is especially useful if the files are very small. Windows users, use !type instead of cat.

tail

tail is like head, which prints the first lines of a file, but it prints the last lines instead. Windows users, note alternative command.
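Here is a short sketch comparing cat, head, and tail on macOS or Linux. The file tiny.fasta and its contents are made up for this illustration, as I do not want to assume anything about what is in dengue.fasta. The -n flag tells head and tail how many lines to print.

```shell
# Scratch directory so nothing in the tutorial is touched.
demo=$(mktemp -d)
cd "$demo"

# Build a small placeholder file: a FASTA-style header plus made-up sequence lines.
printf '>placeholder_sequence\n' > tiny.fasta
printf 'ACGT\nTTGA\nCCAT\nGGAC\nATAT\n' >> tiny.fasta

cat tiny.fasta         # prints all six lines
head -n 2 tiny.fasta   # prints the header line and ACGT
tail -n 2 tiny.fasta   # prints GGAC and ATAT
```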

Copying files and directories: cp

If you want to retain a copy of the folder/file in the original folder you can use the copy command cp. It works straightforwardly with files. Applied to directories it requires a flag: cp -r, meaning “recursive.” A flag typically begins with a hyphen (-) and gives the command some extra directions on how you want to do things. In this case, we are telling cp to work recursively.

Let’s have a look at the cp command in action.

cp dengue.fasta copy_of_dengue.fasta

Maybe we want a copy of the entire sequences directory. To do that, we will cd one directory up to the command_line_tutorial directory.

cd ../

We went up one directory using ../. This is an example of a relative path. The current directory is “./”, “../../” is two directories up, “../../../” is three directories up, and so on. This is very useful when navigating directory structures. Now let’s try copying an entire directory with the -r flag.

cp -r sequences copy_of_sequences

We can also rename directories with the mv command. Let’s rename copy_of_sequences to sequences_copy. This is silly, but illustrates how things work.

mv copy_of_sequences sequences_copy
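The whole copy-and-rename sequence can be rehearsed safely in a scratch directory. In this sketch, the directory from mktemp and the empty dengue.fasta are placeholders I am assuming for illustration. It also shows what happens if you forget the -r flag: on the versions of cp I am aware of, the directory is skipped and cp reports an error.

```shell
# Scratch directory with a placeholder sequences directory inside it.
demo=$(mktemp -d)
cd "$demo"
mkdir sequences
touch sequences/dengue.fasta

# Copy a single file, then step back up using a relative path.
cd sequences
cp dengue.fasta copy_of_dengue.fasta
cd ../

# Without -r, cp does not copy a directory.
if ! cp sequences copy_of_sequences 2>/dev/null; then
    echo "cp needs -r to copy a directory"
fi

# With -r, the directory and everything in it is copied.
cp -r sequences copy_of_sequences

# mv renames directories the same way it renames files.
mv copy_of_sequences sequences_copy
ls sequences_copy
```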

Removing files and directories with rm

Yes, some of the things we just did are silly. We have no need for having a copy of a given sequence or a copy of the whole sequences directory. We can clean things up by deleting them. First, let’s get rid of our copy of the dengue sequence. Let’s cd into the sequences directory and make sure it’s there.

cd sequences
ls

Now let’s remove the file and verify it is gone.

rm copy_of_dengue.fasta
ls

And poof, it’s gone! And I mean gone. It is pretty much irrecoverable. Warning: rm is a wrecking ball. It will destroy any files you have that do not have restrictive permissions. This is so important, I will say it again.

rm is unforgiving.

Therefore, I always like to use the -i flag, which means that rm will ask me if I’m sure before deletion.

rm -i some_sequence.fasta

You will get a prompt. Answer “n” if you do not want to delete it.
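If you would like to practice with the prompt before trusting it with real data, here is a sketch that answers it automatically by piping the reply into rm. The scratch directory and the file names keep.fasta and discard.fasta are hypothetical, chosen only for this illustration.

```shell
# Scratch directory with two empty placeholder files.
demo=$(mktemp -d)
cd "$demo"
touch keep.fasta discard.fasta

# Answer "n" to the prompt: the file survives.
printf 'n\n' | rm -i keep.fasta

# Answer "y" to the prompt: the file is deleted.
printf 'y\n' | rm -i discard.fasta

ls
```

In normal use you would type the answer yourself. The pipe is only there so that the example runs without waiting for input.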

Now, let’s use rm to remove an entire directory. To do this, we need to use the -r flag.

rm -r sequences_copy
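As with cp, the -r flag is what lets rm work on a directory. The following sketch uses a scratch directory and a placeholder file, both of which are my assumptions for illustration, to show that plain rm refuses a directory while rm -r removes it along with its contents.

```shell
# Scratch directory containing a placeholder directory to delete.
demo=$(mktemp -d)
cd "$demo"
mkdir sequences_copy
touch sequences_copy/dengue.fasta

# Plain rm refuses to delete a directory.
if ! rm sequences_copy 2>/dev/null; then
    echo "rm needs -r to remove a directory"
fi

# rm -r deletes the directory and everything inside it.
rm -r sequences_copy
ls
```

Because rm -r takes everything inside the directory with it, this is where the -i habit is most worth keeping.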

Computing environment

%load_ext watermark
%watermark -v -p jupyterlab
CPython 3.7.4
IPython 7.8.0

jupyterlab 1.1.4

Copyright note: In addition to the copyright shown below, this recitation was developed based on materials from Axel Müller.