Manipulating files and directories with the command line¶
mv: moving and renaming files¶
Since we’re currently doing a command line tutorial, let’s go into that directory and see what is there.
cd command_line_recitation
ls
We see that we have a directory called sequences, as well as a FASTA file named some sequence.fasta. This file name has an annoying space in it. We would like to rename it to something without a space, say some_sequence.fasta. To do this, we use the mv command, short for “move.” We enter mv, followed by the name of the file we want to rename, and then its new name.
mv some sequence.fasta some_sequence.fasta
Uh-oh! That gave us some strange output, talking about the usage of mv. This is because the space in the file name some sequence.fasta was interpreted as a gap between arguments of the mv command. To specify that the space is part of the file name, we need to use an escape character. The escape character for macOS or Linux is \. With Windows, the escape character depends on the shell (a backtick ` in PowerShell, a caret ^ in the Command Prompt), so it is usually simplest to enclose the file name with a space in single quotes. A space following the escape character is not treated as an argument separator. To change the name, do the following.
macOS or Linux:
mv some\ sequence.fasta some_sequence.fasta
Windows:
mv 'some sequence.fasta' some_sequence.fasta
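Here is a small sketch of both ways of handling the space, run in a throwaway directory so nothing real gets touched. The file another sequence.fasta is a hypothetical second file I made up for illustration, and I am assuming a POSIX-style shell (macOS, Linux, or Git Bash).

```shell
# Work in a temporary directory so no real files are affected.
start=$PWD
workdir=$(mktemp -d)
cd "$workdir"

# Create two empty files whose names contain a space (names are illustrative).
touch "some sequence.fasta" "another sequence.fasta"

# Option 1: escape the space with a backslash.
mv some\ sequence.fasta some_sequence.fasta

# Option 2: enclose the whole file name in single quotes.
mv 'another sequence.fasta' another_sequence.fasta

# Both files now have names without spaces.
listing=$(ls)
echo "$listing"

# Clean up the temporary directory.
cd "$start"
rm -r "$workdir"
```

Quoting tends to be easier to read when a name contains several spaces, since a backslash is needed before every single one.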
However, if these files are under version control, you should precede the mv command with git. That way, Git will keep track of the renaming. So, you would do this:
macOS or Linux:
git mv some\ sequence.fasta some_sequence.fasta
Windows:
git mv 'some sequence.fasta' some_sequence.fasta
Now, we probably want this file in the sequences
directory. We can also move files into directories (without changing their file names) using the mv
command.
mv some_sequence.fasta sequences/
The trailing slash is not necessary, but I always include it out of habit to remind myself that I am moving a file to a directory. (Again, if these files were under version control, you would precede the above command with git
.)
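To see why the trailing slash is a nice habit, here is a sketch in a temporary directory. The names other.fasta and not_a_directory are hypothetical. The point is that mv decides between "move into" and "rename" based on whether the target already exists as a directory.

```shell
# Work in a temporary directory so no real files are affected.
start=$PWD
workdir=$(mktemp -d)
cd "$workdir"

mkdir sequences
touch some_sequence.fasta other.fasta

# The target is an existing directory, so the file is moved into it
# and keeps its name.
mv some_sequence.fasta sequences/
moved=$(ls sequences)
echo "in sequences/: $moved"

# The target does not exist, so mv renames the file instead.
# We end up with a regular file called not_a_directory.
mv other.fasta not_a_directory
if [ -f not_a_directory ]; then
    kind="regular file"
else
    kind="something else"
fi
echo "not_a_directory is a $kind"

# Clean up the temporary directory.
cd "$start"
rm -r "$workdir"
```

With a trailing slash on a target that does not exist, the mv implementations I know of report an error instead of silently renaming, which is usually what you want.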
Now let’s go into the sequences
directory and see what we have.
cd sequences
ls
We see that some_sequence.fasta
is there, along with other FASTA files.
Word to the wise: NO SPACES¶
Look at what is in the directory using ls
.
ls
The directory command_line_recitation/ has some files that will help us through this lesson. Note that there are no spaces in the directory name. In general, you should avoid spaces in directory and file names, even though your operating system often has them in there. Trust me on this, they can make things a total mess, especially on the command line, where a space also separates a command from its arguments and the arguments from each other. Really. NO SPACES.
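If you want to convince yourself, here is a minimal sketch. count_args is a hypothetical helper function I define just for this demonstration; it only reports how many arguments the shell handed to it.

```shell
# count_args is a made-up helper: it prints the number of arguments it received.
count_args() {
    echo "$#"
}

# Unquoted, the space splits the name into two separate arguments.
unquoted=$(count_args some sequence.fasta)

# Quoted, the name arrives as a single argument.
quoted=$(count_args "some sequence.fasta")

# With an underscore, no quoting is needed at all.
underscored=$(count_args some_sequence.fasta)

echo "unquoted: $unquoted, quoted: $quoted, underscored: $underscored"
# prints "unquoted: 2, quoted: 1, underscored: 1"
```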
Exploring file content¶
We would like to see what is in the sequence files. Bash offers various ways to display the content of files. We’ll look at the genome of the dengue virus in the file dengue.fasta. We’ll start with less. It got its name because it is more feature-rich than more, which was used to look at files before less came to be. (“less is more,” get it?) You can use the up and down arrow keys to move up or down by line, and you can also scroll with a touchpad or mouse. Since less does not need to read the whole file before displaying the top of it, it is ideal for larger files.
macOS or Linux:
less dengue.fasta
Windows:
more dengue.fasta
less also supports searching: type “/” followed by the query, e.g., /AAAA. To jump to a given line, type the line number followed by g, e.g., 40g. To show line numbers, type -N. Other useful commands are shift+g, which goes to the end of the file, and g, which goes to the beginning. To exit less or more, hit q.
We’ll now look at several other ways to look at files. Just substitute them for less
in the above command.
cat¶
cat
prints the entire file to the standard output (terminal). This is especially useful if the files are very small. Windows users, use !type
instead of cat
.
head¶
head just prints the top lines of the file (the first 10 by default) to the standard output. The number of lines can be changed:
head -5
This will print the first 5 lines to the standard output. Windows users, note alternative command: gc myfile.txt -head 5
.
tail¶
Like head
, but for the last lines of the file. Windows users, note alternative command: gc myfile.txt -tail 5
.
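Here is a quick sketch of head and tail side by side. Since I do not want to assume anything about the contents of dengue.fasta, it uses a hypothetical 20-line file called lines.txt that the script creates itself. I write the line count as -n 5 rather than -5; both forms are widely accepted, but -n is the one specified by POSIX.

```shell
# Work in a temporary directory so no real files are affected.
start=$PWD
workdir=$(mktemp -d)
cd "$workdir"

# Build a 20-line demo file: "line 1" through "line 20".
i=1
while [ "$i" -le 20 ]; do
    echo "line $i"
    i=$((i + 1))
done > lines.txt

# First three lines of the file.
first=$(head -n 3 lines.txt)
echo "$first"

# Last three lines of the file.
last=$(tail -n 3 lines.txt)
echo "$last"

# With no flag, head prints 10 lines (tr strips padding some wc versions add).
default_count=$(head lines.txt | wc -l | tr -d ' ')
echo "head prints $default_count lines by default"

# Clean up the temporary directory.
cd "$start"
rm -r "$workdir"
```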
Copying files and directories: cp¶
If you want to retain a copy of the folder/file in the original folder you can use the copy command cp
. It works straightforwardly with files. Applied to directories it requires a flag: cp -r
, meaning “recursive.” A flag typically begins with a hyphen (-
) and gives the command some extra directions on how you want to do things. In this case, we are telling cp
to work recursively.
Let’s have a look at the cp
command in action.
cp dengue.fasta copy_of_dengue.fasta
Maybe we want a copy of the entire sequences
directory. To do that, we will cd
one directory up to the command_line_recitation
directory.
cd ../
We went up one directory using ../. This is an example of a relative path. The current directory is “./”, “../” is one directory up, “../../” is two directories up, “../../../” is three directories up, and so on. This is very useful when navigating directory structures. Now let’s try copying an entire directory with the -r flag.
cp -r sequences copy_of_sequences
We can also rename directories with the mv
command. Let’s rename copy_of_sequences
to sequences_copy
. This is silly, but illustrates how things work.
mv copy_of_sequences sequences_copy
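The whole copy-and-rename sequence can be sketched end to end in a temporary directory. The dengue.fasta created here is a one-line placeholder, not the real genome file, and the sketch also shows what happens if you forget the -r flag.

```shell
# Work in a temporary directory so no real files are affected.
start=$PWD
workdir=$(mktemp -d)
cd "$workdir"

# Set up a sequences directory with a placeholder FASTA file.
mkdir sequences
echo ">placeholder" > sequences/dengue.fasta

# Copy a single file; no flag is needed.
cd sequences
cp dengue.fasta copy_of_dengue.fasta
cd ../

# Without -r, cp refuses to copy a directory and exits with an error.
if cp sequences copy_of_sequences 2>/dev/null; then
    plain_cp="copied"
else
    plain_cp="refused"
fi
echo "cp without -r: $plain_cp"

# With -r, the directory and everything inside it is copied.
cp -r sequences copy_of_sequences

# Rename the copied directory with mv.
mv copy_of_sequences sequences_copy

top=$(ls)
inner=$(ls sequences_copy)
echo "$top"
echo "$inner"

# Clean up the temporary directory.
cd "$start"
rm -r "$workdir"
```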
Removing files and directories with rm¶
Yes, some of the things we just did are silly. We have no need for having a copy of a given sequence or a copy of the whole sequences directory. We can clean things up by deleting them. First, let’s get rid of our copy of the dengue sequence. Let’s cd
into the sequences directory and make sure it’s there.
cd sequences
ls
Now let’s remove the file and verify it is gone.
rm copy_of_dengue.fasta
ls
And poof, it’s gone! And I mean gone. It is pretty much irrecoverable. Warning: rm is a wrecking ball. It will destroy any files you have that do not have restrictive permissions. This is so important, I will say it again.
rm is unforgiving.
Therefore, I always like to use the -i
flag, which means that rm
will ask me if I’m sure before deletion.
rm -i some_sequence.fasta
You will get a prompt. Answer “n
” if you do not want to delete it.
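Here is a sketch of how the prompt behaves. Normally you type the answer yourself; in this demonstration I pipe the answer in with echo so that the script runs without stopping. The file names keep_me.fasta and delete_me.fasta are hypothetical.

```shell
# Work in a temporary directory so no real files are affected.
start=$PWD
workdir=$(mktemp -d)
cd "$workdir"

touch keep_me.fasta delete_me.fasta

# Answering "n" at the prompt leaves the file alone.
echo n | rm -i keep_me.fasta

# Answering "y" at the prompt deletes the file.
echo y | rm -i delete_me.fasta

# Only the file we said "n" to is still there.
remaining=$(ls)
echo "$remaining"

# Clean up the temporary directory.
cd "$start"
rm -r "$workdir"
```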
Now, go up one directory with cd ../, and let’s use rm to remove an entire directory. To do this, we need to use the -r flag.
rm -r sequences_copy
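As a final sketch, this is what happens with and without the -r flag, again in a temporary directory. The contents of sequences_copy are placeholders I create for the demonstration.

```shell
# Work in a temporary directory so no real files are affected.
start=$PWD
workdir=$(mktemp -d)
cd "$workdir"

# Build a directory with a file and a nested subdirectory in it.
mkdir -p sequences_copy/nested
touch sequences_copy/a.fasta sequences_copy/nested/b.fasta

# Without -r, rm refuses to remove a directory and exits with an error.
if rm sequences_copy 2>/dev/null; then
    plain_rm="removed"
else
    plain_rm="refused"
fi
echo "rm without -r: $plain_rm"

# With -r, the directory and everything below it is deleted.
rm -r sequences_copy
if [ -e sequences_copy ]; then
    after="present"
else
    after="gone"
fi
echo "after rm -r: $after"

# Clean up the temporary directory.
cd "$start"
rm -r "$workdir"
```

If you want the safety net from before, the two flags can be combined as rm -ri, which asks before each deletion.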
Copyright note: In addition to the copyright shown below, this recitation was developed based on materials from Axel Müller.