Linux from a shell

Motivation

I have friends and family who all have remote access to Linux computers but have never used them. I wrote this article for them.

You work with Windows or Mac. You don't know much (or anything) about Linux. But you can't ignore it because sometimes you come across it in the form of a shell account on a remote Linux box. And then you would like to be able to use it. You are not interested in system administration. Programming doesn't mean anything to you. But you would like to be able to perform some basic tasks. If you agree with most of the statements above, this article is for you. This article does not cover graphical user interfaces as those are usually not available on servers. It solely covers shell access (text).

What is a shell at all?

A shell is a program that allows you to interact with a computer's files and processes (running programs). A shell lets you run programs. It executes them and displays their output to you. There are many small programs that help you do basic tasks. So actually you are not learning how to use a shell, but how to use many small programs. The shell itself also has some built-in commands which can be used to write so-called shell scripts. Those are also programs. But you said you did not want to know about programming. So at the moment just forget about shell scripts.

No windows?

Why do those programs not open windows, you may ask. All the programs we will use in this article work in text mode. They provide a command-line interface (CLI) to the user. That means all user interaction works through the keyboard and output in text form. That has several advantages over a window based user interface:

Bandwidth: Using a graphical user interface over the network is slow because all information is transferred as images. Images take up a lot of space, and networks are comparatively slow. Text data is much, much smaller and can thus be transferred over a network far faster.
Server resource: Graphical user interfaces use a lot of server resources (memory, processor time) for every connected user and for every running program. Linux is a multi-user system. That means multiple users can be logged in on one computer at the same time. On a big system there can be hundreds of users. If every user used a graphical user interface the system could handle far fewer users.
Monitors: Servers often are rack mounted systems that have no monitor connected (head-less). So for the system administrator there is no need for a graphical user interface and so it is often not installed at all.
Scripting: One can easily call CLI programs in shell scripts (small programs). They are the basic building blocks out of which one can build powerful things with very little "glue code". Some of the command-line interfaces of these programs have been specifically designed to ease interoperability with other CLI programs. Scripting is an extremely important technology that is widely used in Linux/Unix for almost everything. Were these programs equipped with a graphical user interface only, it would be impossible to control such a program from a script.

Note: It really is possible to control a Linux machine with a graphical user interface over the network. But it is not common practice.

Connecting to the remote machine

Today remote Linux machines are usually accessed with SSH (Secure Shell). Despite its name SSH is not a shell itself but a protocol that lets you run a shell and execute commands on the remote machine and transfer files to and from the remote machine. SSH encrypts all data as it is sent over the network. It is safe to use even if your computer is connected to the network over WLAN. So you need an SSH client on your local computer to be able to connect to the remote computer.

SSH clients

SSH clients are freely available on the web. The choice is yours. You can use a "graphical" one like PuTTY. But you can just as well use a command-line based one, like OpenSSH. If you use Mac OS X, you can open a Terminal and use OpenSSH directly from there. You don't need a graphical SSH client at all.

Cygwin

If you use Windows you may try Cygwin. It offers a powerful Linux shell (and much more) under Windows. It also includes OpenSSH. So you can start with a shell right from your local computer. And most things you learn in this article you can even try out with Cygwin alone - without the need for a remote Linux machine!

Connecting with OpenSSH

From now on you will see commands that you have to type in at the shell displayed like this:

echo "Hello world"

Press the Enter key at the end of a command so it is executed by the shell. To connect as the user john to the remote computer very.remote.com with SSH type:

ssh john@very.remote.com

This looks like an email address. But it is not. When the connection is established you will be prompted for your password. If you enter it correctly you will be logged in on the remote machine.

The $ is the prompt

The shell will display something like this to you: john@very ~ $ or -bash-2.05b$
and wait for input. This looks a bit strange to the average user. It is called the prompt. It is just a piece of information that is printed by your shell. It's actually possible to completely customize your prompt (by setting the PS1 variable) but I won't go into this much detail here. From now on I will just write a single dollar $ as the user prompt in the boxes. Of course you must not type that.

What's my shell?

As mentioned before there are many different shell programs. The most popular one on Linux is probably bash (the Bourne Again Shell). There is also the less popular tcsh. On a Sun/Solaris system the Korn Shell (ksh) is very popular. bash and ksh conform to a common standard (the POSIX shell), so they share a common set of features and a common language; tcsh belongs to the C shell family and has its own syntax. Each of them also has its own extensions that make it powerful for certain things. To find out which shell is started when you log in, type echo $SHELL. If this command says bash, that's okay, because I am going to explain everything here for bash. If the output suggests something other than bash, then please try to run bash. Note that $SHELL keeps showing your login shell even after that; echo $0 shows the shell you are actually typing into.

$ echo $SHELL
/bin/tcsh
$ bash
$ echo $0
bash

You can permanently change the default shell that is started when you log in with the chsh command. Make sure you know the absolute path to bash beforehand.

$ which bash
/bin/bash
$ chsh
Changing the login shell for john
Enter the new value, or press ENTER for the default
        Login Shell [/bin/tcsh]: /bin/bash

Changing your password

The first thing you will probably want to do is change the password to something you can more easily remember.

$ passwd

This invokes the passwd program. It will ask you to enter your current password and then ask you to choose a new one and enter it twice. For security reasons the passwords will not be displayed on the screen as you type them.

Please note that most systems require you to choose a secure password. That means it should be long enough and contain a mix of upper and lower case letters, digits and perhaps special characters like an exclamation mark or a period. Generally passwords that are likely to appear in a dictionary are inherently unsafe: fox is a bad password, but h9B!S-v2K is a good one.

Who is there?

Check out who else is currently logged in:

$ who
john  pts/0        Jan  9 21:42
jack  pts/1        Jan  9 21:57

You can see that jack is logged in as well, since 21:57.

Where am I?

The remote machine has a file system that consists of files in a hierarchical directory structure. Much like folders and documents under Windows. The shell has a "current directory" it is in at the moment. You can find out what the current directory is:

$ pwd
/home/john

Listing files

As you are now working on a remote machine you are probably interested in some files that reside there. The ls (list) command will list the files in the current directory (folder):

$ ls

If the directory is empty there won't be much output of course. Otherwise you get a list of file names. The ls program accepts many options that influence its output. I can only list the most important ones here. Options are single letters that are placed after a space and a minus character after the program name. ls -Ahl means: execute the ls program with the options A, h and l. They have the following meaning and effects:

A: list all files, including hidden ones. Files that start with a dot are normally hidden. (Only the two special entries . and .. are still left out.)
l: show details. Shows the permissions, file size, modification date etc. along with the file.
h: human sizes. Shows the file size in better readable form (KB, MB etc) instead of bytes.
L: show link targets instead of links

You can also filter the list of files:

$ ls -l *.jpg
-rw-------  1 john users      836 Jul 11 18:24 cat.jpg
-rw-r-----  1 john users  1238182 Mar 24 17:12 dog.jpg

This lists only the files whose names end in .jpg. Please note that (unlike on Windows) under Linux filenames are case sensitive. That means dog and Dog are different files.
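
To see what the wildcard actually does, here is a small sketch you could try. It works in a throwaway directory, and the file names are made up for the demonstration. It assumes bash with its default settings.

```shell
# Work in a throwaway directory so nothing real is touched.
demo=$(mktemp -d)
cd "$demo"

# Create a few empty sample files (hypothetical names).
touch cat.jpg dog.jpg Bird.JPG notes.txt

# The shell replaces *.jpg with the matching names before ls even starts.
jpg_list=$(ls *.jpg)
echo "$jpg_list"

# echo shows the expansion directly. Bird.JPG is missing: case matters.
echo *.jpg
```

The point to take away is that the filtering is done by the shell, not by ls, so the same pattern works with every other program too.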

Large directories

The output of ls may scroll by too fast and may be too large for your window buffer, so you lose output. To view the output page by page you must pass it on to a pager program like more or less. See below for how to use these.

$ ls | more

Note the "pipe" character. Where it sits depends on your keyboard: it's AltGr-1 or AltGr-7 on some European layouts and Shift-\ on a US keyboard. There are two characters that look very similar. The pipe should be this whole straight uninterrupted line: |
and not this character: ¦

You can scroll forward page by page with the space key or exit with the q key.

Reading ls output

Look at this sample output of ls -l:

drwxr-xr-x  2 john users       80 Jan 31  2004 projects
-rw-------  1 john users      836 Jul 11 18:24 todo.txt
-rw-r-----  1 john audio  1238182 Mar 24 17:12 news.mp3
-rwxr-----  1 john users      873 Mar 22 12:46 backup
lrwxrwxrwx  1 john users       22 Jan 31  2004 www -> projects/www.odi.ch

Let's ignore the left half of the output at the moment, as it only deals with permissions. The rightmost column is the file name, preceded by the file's date and time and its size in bytes. Note the intelligent date display: the time is only displayed for recently modified files (GNU ls draws the line at about six months). Older files show the year instead of the time. projects is a directory, which is noticeable by the d at the beginning of the line. www on the other hand is not an actual file but a (symbolic) link to the file (or directory) projects/www.odi.ch. Links show an l at the beginning of the line. Links are a feature you rarely meet on Windows. On Linux you can access a link as if it was the file itself. You won't notice a difference.

Navigating the filesystem

You have already learned what your current directory is and how to list its contents. Directory names are always separated by the forward slash character / (like in the www link above). The top of the directory hierarchy is called the root directory. Several directory names separated by slashes are called a path. A path can either be absolute or relative. An absolute path starts with a slash and tells us how to find a directory starting from the root directory. A relative path does not start with a slash and tells us how to find a directory starting from the current directory.

Now how can you change into a different directory? You use the cd (change directory) program and give it the path to the new directory as a parameter. Here is an example session:

john@very ~ $ cd /
john@very / $ cd usr
john@very usr $ cd bin
john@very bin $ pwd
/usr/bin
john@very bin $ cd ..
john@very usr $ pwd
/usr
john@very usr $ cd ../tmp
john@very tmp $ cd
john@very ~ $ pwd
/home/john

Note...

how the prompt changes while navigating through the directory hierarchy.
how .. is used to change into the next higher directory.
where the last cd command without a parameter takes you
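
If you would like to practise absolute and relative paths without wandering around the real system, a sketch like the following may help. The directory names are invented, and everything happens inside a temporary directory.

```shell
# Build a small practice tree in a throwaway directory.
base=$(mktemp -d)
mkdir "$base/projects" "$base/projects/www" "$base/tmp"

cd "$base/projects/www"   # absolute path: it starts with a slash
cd ..                     # relative: one level up, now in projects
cd ../tmp                 # relative: up again, then down into tmp

here=$(pwd)
echo "$here"              # ends in /tmp, below the practice directory
```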

No drive letters

On Windows you are used to having drive letters (C: is usually your hard drive, D: may be the CD-ROM). This concept is different in Unix/Linux. There is only one directory tree. So Linux does not need any drive letters. File systems from different media are combined together in a single directory structure. This is called mounting. We say for instance, the CD-ROM file system is mounted under /media/cdrom. And the main hard drive is mounted under / (root).

What is where

Unlike Windows, Linux has a well ordered file system where everything has its place. There are still some minor differences between distributions however. And BSD systems like Mac OS X have their own philosophy of how to organise files. Here is a list of some directories and what you find there.

Path	Contents
/bin	Standard programs
/boot	Files needed at startup
/dev	Devices (don't go there)
/etc	Most configuration
/home	All home directories (you can only access yours)
/lib and /usr/lib	Libraries shared by programs
lost+found	Files recovered after filesystem errors (never happens really)
/mnt and /media	Mounted media like cdroms, floppy disks, USB drives
/opt	Large third party software like Java, Oracle, etc.
/proc	Information about processes (don't go there)
/root	root's home directory (you don't have access)
/sbin and /usr/sbin	System administration programs. Some may be useful to you even.
/sys	Information about devices (don't go there)
/tmp and /var/tmp and /usr/tmp	Space for temporary files for everybody
/usr/bin	Programs (installed by package manager)
/usr/doc /usr/share/doc /usr/man /usr/share/man /usr/info /usr/share/info	Documentation
/usr/local	Custom installed software (not by package manager)
/usr/src	Source code
/var	Runtime data used by services
/var/cache	Caches
/var/db	Database files
/var/lib	Where services store data and persistent state
/var/log	Service logfiles
/var/mail	Local Email
/var/spool	Print and mail queues
/var/run	Info about running services

Use TAB to autocomplete

When navigating through directories you have to type many directory names. The autocomplete feature of the bash shell comes in handy here! It is enough to type the first few letters of a directory or command and then press the TAB key. The shell will instantly fill in the missing characters for you! This works only if there is only one possibility to complete the name. If there is more than one possibility the shell will beep (might not always work) and do nothing. Pressing TAB again in this case will list all possibilities.

Creating new files

touch creates new (empty) files for you.

$ touch 'new file'
$ ls
new file

Note how we used single quotes to allow a space in the file name.
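
Single quotes are not the only way to protect a space. The following sketch (file names made up, run in a temporary directory) shows the common variants next to what happens when you forget the quoting.

```shell
demo=$(mktemp -d)
cd "$demo"

touch 'new file'        # single quotes, as in the example above
touch "second file"     # double quotes work too
touch third\ file       # a backslash protects just the next character
touch fourth file       # no quoting: two separate files, fourth and file

ls
```

The last touch creates two files, because the shell splits the unquoted words at the space before touch ever sees them.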

Deleting files

If you want to remove the file you have just created then use the rm (remove) program.

$ rm 'new file'
$ ls

Be careful when deleting files. There is no way back: there is no recycle bin, and deleted files are lost forever! On top of that rm is extremely powerful. Using it with asterisk wildcards can wipe out several files at once: rm *.tmp. Its r option will even wipe out entire hierarchies of directories: rm -r crap (DON'T TRY THIS NOW). So be extra careful when deleting files, especially when you are root. You have been warned.
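
One habit that may save you some day is to look at what a wildcard matches before handing it to rm. A minimal sketch, using made-up file names in a temporary directory:

```shell
demo=$(mktemp -d)
cd "$demo"
touch a.tmp b.tmp keep.txt

# Preview: ls with the same pattern shows exactly what rm would receive.
ls *.tmp

# Only after checking that list, run the real thing.
rm *.tmp

remaining=$(ls)
echo "$remaining"       # only keep.txt is left
```

rm also has an i option (rm -i *.tmp) that asks for confirmation before every single file, which is another way to stay on the safe side.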

Creating new directories

mkdir creates new (empty) directories.

$ mkdir Documents
$ ls
Documents
$ cd Documents
$ ls

Deleting directories

rmdir does the opposite of mkdir. It deletes a directory. This only works if the directory is empty. Use ls -la to check. If you need to delete non-empty directory hierarchies use rm with the r option. See above.
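
The difference between the two can be tried out safely in a temporary directory. This is only a sketch with invented names:

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir old
mkdir old/sub
touch old/sub/file.txt

# rmdir refuses, because the directory is not empty.
if rmdir old 2>/dev/null; then
    rmdir_result="removed"
else
    rmdir_result="refused"
fi
echo "rmdir: $rmdir_result"

# rm -r removes the directory together with everything inside it.
rm -r old
leftover=$(ls -A)
echo "left over: ${leftover:-nothing}"
```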

Home sweet ~

You have noticed that cd takes you to a special directory (/home/john in the example) if you don't tell it where to go. This is called your home directory. It has the short name ~ (tilde). That also explains the tilde in your prompt! Your home directory is the place where you can put your own files. You have all permissions on this directory. You may create new subdirectories as you like. This is also the place where programs that you run will store their settings. For example, when you run a browser it will store your bookmarks in a hidden subdirectory of your home directory. This is cool, because your data is all in one place and not scattered across the whole system. So you know what you must back up.

Searching for files

The easiest way to find a file with a specific name on Linux is to use locate. It expects a partial filename as a parameter and will print out the results. Please note that file names on Linux are case sensitive. That is: Two files mybook and myBook are different files and may happily live together inside the same directory (Windows does not allow that normally). So if you are unsure about the proper case you should use the i option to make the search case insensitive:

$ locate -i mybook
/home/john/books/myBook.txt
/home/jack/stuff/important/MyBooklet.doc

locate is very fast because it uses a pre-built index of the file system. This also means that new files are not available through locate until this index is regenerated. This usually happens once a day or once a week. Also locate does not let you restrict the results much further. You will get too many results most of the time. The most interesting options of locate are:

i: case insensitive (normally the match is case sensitive)
n 10: only show 10 results
r: interpret the search pattern as a regular expression

A more powerful search program is find. It has many complicated options. And you can find files in the most obscure ways. find does not use an index and is therefore slower than locate but finds files immediately. A typical find call looks like this:

$ find /home/jack -iname \*book\* -type f

The above call looks for files whose names contain "book" (case insensitive) anywhere below /home/jack. Only files (type f) are considered, whereas directories (type d) are ignored. Note that the asterisk needs to be escaped with a backslash, because it would otherwise be interpreted by the shell. Another example: the following command looks for empty directories anywhere below the current working directory.

$ find . -type d -empty

find can also pass every file it finds on to another program. A convenient way to remove all CVS directories from a whole directory tree is for example:

$ find . -type d -name CVS | xargs rm -r
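
The xargs pipeline above gets confused by directory names that contain spaces. As a hedged alternative, find can call rm itself through its exec option. The sketch below uses an invented directory tree in a temporary directory, so it is safe to try.

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir -p "my project/CVS" src/CVS src/lib
touch "my project/CVS/Entries" src/CVS/Entries src/lib/code.c

# -prune stops find from descending into a CVS directory it is about to
# delete. -exec ... {} + hands the found names to rm directly, so a space
# in a name does not split it into two arguments.
find . -type d -name CVS -prune -exec rm -r {} +

left=$(find . -name CVS)
echo "CVS directories left: ${left:-none}"
```

Everything else in the tree, such as src/lib/code.c, stays untouched.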

Permissions

We have now mentioned permissions a few times. It's time for you to know what they are. As Linux is a multiuser system it is important to restrict access between these users. Users are not only human users. But also services can have their own user. For example a web server may run as the http user. Surely you don't want the webserver to be able to access your email. This is what permissions control. Now, look at this sample output of ls -l:

drwxr-xr-x  2 john users       80 Jan 31  2004 projects
-rw-------  1 john users      836 Jul 11 18:24 todo.txt
-rw-r-----  1 john audio  1238182 Mar 24 17:12 news.mp3
-r-xr-----  1 john users      873 Mar 22 12:46 backup
lrwxrwxrwx  1 john users       22 Jan 31  2004 www -> projects/www.odi.ch

Just look at the left half of the output now. Every file (and directory) of a Linux file system belongs to a user. That user is the owner of the file. You can see all files in this directory belong to the user john. Also every file is assigned to a group. You can see all files but one belong to the group users, and one file belongs to the group audio.

The file mode is noted for every file in the first column, which for instance reads: -rwxr-----. There are three different permissions: read, write and execute, which are represented by the characters r, w and x. Each right can be given to the file's owner, group and other users. People without a system account are not called users. They don't have access to files.

The todo.txt file for example is readable and writable by its owner john. Nobody else is allowed to access it. The file news.mp3 can also be read by users belonging to the group audio. The backup file can be executed by john: it is a program. He can also read this file but can not modify it. Of course john could grant himself the right to write backup. Careful: That a file is not writable doesn't mean it can't be deleted!

Because projects is a directory the permissions have slightly different meanings. Read access to a directory means the user is allowed to list the files inside the directory. Write access means the user can create and delete files in the directory. So again: whether a file can be deleted is decided by the permissions of the directory it lives in, not by those of the file. If you want to protect a file from being deleted, make sure the user has no write access to the directory. If a user can "execute" a directory it means she is permitted to enter it (with cd).
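
That a read-only file can still be deleted surprises many people, so here is a small sketch of it. Names are made up and everything happens in a temporary directory. One assumption to keep in mind: root ignores these permissions, so as root the second attempt would succeed too.

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir box
touch box/precious.txt
chmod 444 box/precious.txt   # read-only for everyone

# The directory is still writable, so the file can be removed.
# -f suppresses rm's question about the write-protected file.
rm -f box/precious.txt
if [ -e box/precious.txt ]; then
    after_writable="still there"
else
    after_writable="deleted"
fi

# Now protect the directory itself and try again.
touch box/precious.txt
chmod 555 box                # no write permission on the directory
rm -f box/precious.txt 2>/dev/null || true   # expected to fail (unless root)
if [ -e box/precious.txt ]; then
    after_protected="still there"
else
    after_protected="deleted"
fi

echo "writable directory:  $after_writable"
echo "protected directory: $after_protected"

chmod 755 box                # restore, so the demo directory can be cleaned up
```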

Last but not least, links always show all permissions (you can't change that) because the permissions that count are taken from the target file. The owner and group shown are also those of the link rather than the target file. You can use ls -lL to display the details of the link target instead of those of the link.

Which groups do I belong to?

We have met user groups now. A user can be in any number of groups. Every user has a default group. Groups are managed by the system administrator (root). But you can see in which groups you are by using the groups program.

$ groups
users cron audio cdrom dialout video games cdrw scanner

You normally don't need to care about group membership. You normally leave new files assigned to your default group. There is no need to change it.

Modifying permissions

First of all, only the owner of a file (and root) may change its permissions; being in the same group as the file does not count. There are three programs to modify the ownership and permissions of a file: chown (change owner), chgrp (change group) and chmod (change mode). Note: Ordinary users can not use chown, because they are not allowed to change the owner of files for security reasons. The first two are easily explained by example:

$ ls -l
-rw-------  1 john users      836 Jul 11 18:24 todo.txt
$ chgrp audio todo.txt
$ ls -l
-rw-------  1 john audio      836 Jul 11 18:24 todo.txt
$ chown jack todo.txt    #normal users can not do that!
$ ls -l
-rw-------  1 jack audio      836 Jul 11 18:24 todo.txt

chmod is a bit more complicated. A numeric mode consists of three digits: 640 for example. The first is the permissions for the user, the second is the permissions for the group and the third for others. Each digit is the sum of r=4, w=2 and x=1. So a 6 corresponds to rw-, 5 corresponds to r-x, 7 corresponds to rwx, to name the most important combinations. This is known as the octal representation of file modes. The following table lists some frequently used file modes:

Octal	Symbolic	Use
644	-rw-r--r--	World readable files
640	-rw-r-----	Group readable files
600	-rw-------	Private files
755	-rwxr-xr-x	World readable directories, world executable programs
750	-rwxr-x---	Group readable directories, group executable programs
700	-rwx------	Private directories, private programs

The chmod program is used to change the file mode like so:

$ ls -l
-rw-------  1 john users      836 Jul 11 18:24 todo.txt
$ chmod 644 todo.txt
$ ls -l
-rw-r--r--  1 john users      836 Jul 11 18:24 todo.txt

Of course all of those programs accept a wildcard filename to bulk change many files. There is also a second variant of using chmod which is sometimes desirable to use:

$ ls -l
-rw-------  1 john users      836 Jul 11 18:24 todo.txt
$ chmod go+rw todo.txt
$ ls -l
-rw-rw-rw-  1 john users      836 Jul 11 18:24 todo.txt
$ chmod u-r+x todo.txt
$ ls -l
--wxrw-rw-  1 john users      836 Jul 11 18:24 todo.txt

Its power is to modify the existing mode by selectively adding or removing individual permissions. The first example can be read aloud as: change mode (for) group (and) others add read (and) write (of the file) todo.txt. The second example as: change mode (for) user remove read (and) add execute (of the file) todo.txt. So the first parameter of chmod is a combination of:

The list of subjects: user, group, others or all.
The operation: + or -
The list of permissions: r, w and x
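
To convince yourself that the numeric and the symbolic form lead to the same result, you could try a sketch like this one. The file name and the little mode_of helper are made up for the demonstration; cut is only used to trim the ls output down to the ten mode characters.

```shell
demo=$(mktemp -d)
cd "$demo"
touch report.txt

# Helper (hypothetical): print only the ten mode characters of ls -l.
mode_of() { ls -l "$1" | cut -c1-10; }

chmod 640 report.txt          # 6 = r+w (4+2), 4 = r, 0 = nothing
numeric=$(mode_of report.txt)

chmod 600 report.txt          # start from a private file ...
chmod g+r report.txt          # ... and add read for the group only
symbolic=$(mode_of report.txt)

echo "$numeric"               # -rw-r-----
echo "$symbolic"              # -rw-r-----
```

The numeric form sets the whole mode in one go, while the symbolic form only touches the bits you name and leaves the rest as it was.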

Default permissions

When you create a new file or directory it will get some default permissions. You will be the owner. Your default group will be used as the group. And the mode will normally be world readable or private depending on your umask. The following values make sense for a umask:

umask	file mode	directory mode
0022	-rw-r--r--	drwxr-xr-x
0077	-rw-------	drwx------

How the umask works exactly is difficult to explain if you don't know about boolean algebra. But it's enough to keep those two values in mind.

$ umask
0077
$ touch a
$ ls -l
-rw-------  1 john users        0 Jul 11 18:24 a
$ umask 0022
$ touch b
$ ls -l
-rw-------  1 john users        0 Jul 11 18:24 a
-rw-r--r--  1 john users        0 Jul 11 18:24 b
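
For the curious, the rule behind the table can be sketched with a bit of shell arithmetic. The sketch assumes the usual starting points of 666 for new files and 777 for new directories; every permission that is set in the umask is then switched off. Feel free to skip this if it looks too much like programming.

```shell
# New files start from 666 (rw-rw-rw-) and new directories from 777
# (rwxrwxrwx). The & and ~ operators switch off every bit of the umask.
for mask in 0022 0077; do
    file_mode=$(printf '%03o' $(( 0666 & ~$mask )))
    dir_mode=$(printf '%03o' $(( 0777 & ~$mask )))
    echo "umask $mask -> files $file_mode, directories $dir_mode"
done
```

This prints 644 and 755 for the umask 0022, and 600 and 700 for 0077, which matches the two rows of the table.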

Creating links

We have met symbolic links a couple of times now. You may have thought, wow how handy, I would like to make a few of them in my home directory as shortcuts to point to my favourite places! That's a good idea. Look how simple it is to create a symbolic link:

$ ln -s /some/long/path/I/often/use shortcut
$ ln -s /a/file/which/I/often/edit favourite
$ ls -l
lrwxrwxrwx  1 john users  26 Dec 23 12:22 favourite -> /a/file/which/I/often/edit
lrwxrwxrwx  1 john users  27 Dec 23 12:22 shortcut -> /some/long/path/I/often/use

Just always remember to include the s option. Otherwise you create a hard link to files. Hard links can not be distinguished from normal files. They are normal files. They are just listed in the file system more than once. But they are the very same file. ls -li may help you identify identical files. Here is an example using hard links. It may be a bit confusing:

$ ls -l
-rw-------  1 john users      836 Jul 11 18:24 text
$ ln text sametext
$ ls -li
23873 -rw-------  2 john users      836 Jul 11 18:24 sametext
23873 -rw-------  2 john users      836 Jul 11 18:24 text
$ rm text
$ ls -l
-rw-------  1 john users      836 Jul 11 18:24 sametext

Had we used a symbolic link for sametext we would have lost the original file and ended up with a broken link.
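
Here is the same experiment with a symbolic link, as a sketch you could run in a temporary directory. The bracket tests are an assumption of mine to make the result visible: -L asks whether a name is a link, -e whether it leads to something that exists.

```shell
demo=$(mktemp -d)
cd "$demo"
touch text
ln -s text sametext      # a symbolic link this time

rm text                  # remove the original

# The link itself is still there, but it points to nothing.
if [ -L sametext ] && [ ! -e sametext ]; then
    link_state="broken"
else
    link_state="ok"
fi
echo "sametext is $link_state"
```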

Copying files

It's easy to copy files from one place to another with the cp (copy) program. In Windows that's called copy-paste:

$ cp document /safe/place/to/be/
$ ls /safe/place/to/be
document
$ ls
document

or just copy them to the same place under a different name:

$ cp document a-copy
$ ls
a-copy    document

Moving (and renaming) files

It's just as easy to move a file (instead of copying it) with the mv (move) program. The same program is used to give a file a new name:

$ ls
new-document
$ mv new-document old-document
$ ls
old-document
$ mv old-document archive/
$ ls archive
old-document
$ ls

Running scripts

Sometimes you must run programs that are in user directories. For example the startup.sh script in a Tomcat bin directory. When you type a plain command name, the shell only looks for the program in a fixed list of directories; the current directory is normally left out of that list for security reasons. All programs we have used so far are normally in /bin or /usr/bin. The following command will show you the directories that the shell searches for programs. (This list is stored in the PATH variable of your shell.)

$ echo $PATH
/usr/local/bin:/usr/bin:/bin:/opt/bin:/usr/i686-pc-linux-gnu/gcc-bin/3.3.6:
/opt/sun-jdk-1.4.2.10/bin:/opt/sun-jdk-1.4.2.10/jre/bin:
/opt/sun-jdk-1.4.2.10/jre/javaws:/usr/qt/3/bin:/usr/kde/3.4/bin:/usr/games/bin

Let's see where the ls program actually is:

$ which ls
/usr/bin/ls

When you enter ls the shell actually scans all the PATH directories for a program called ls. It executes the first one it encounters. If none is found the shell displays an error message. So if you want to run a script or program which is not in your PATH you have to tell the shell where to look. Imagine there is a script /opt/tomcat/bin/startup.sh. To run it you can do this:

$ cd /opt/tomcat/bin
$ ls -l
-rwxr-xr-x  1 tomcat root  1027 Jan 12 10:29 shutdown.sh
-rwxr-xr-x  1 tomcat root  1027 Jan 12 10:29 startup.sh
$ ./startup.sh

The dot always means the current directory. So the last line actually means execute the startup.sh program that resides in the current directory.
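
If you have no Tomcat at hand, the same thing can be tried with a tiny script of your own. This is only a sketch: hello.sh is a made-up name, and the script is written with printf and a > redirection because editors are only covered later. It assumes the temporary directory allows programs to be executed.

```shell
demo=$(mktemp -d)
cd "$demo"

# Create a tiny script (hello.sh is a hypothetical name).
printf '#!/bin/sh\necho "hello from the script"\n' > hello.sh
chmod u+x hello.sh           # without the x permission it can not be run

# The bare name is looked up in PATH only, and this directory is not in it.
if command -v hello.sh >/dev/null 2>&1; then
    lookup="found in PATH"
else
    lookup="not found in PATH"
fi
echo "hello.sh: $lookup"

# Telling the shell where the script is works.
output=$(./hello.sh)
echo "$output"
```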

Manual pages

This is important. It's easy to forget options to programs you don't often use. Furthermore some programs have different options on some systems. But on the vast majority of systems there is a built-in help system! It is called the manual or man pages. All man pages are structured in the same way. After seeing a few you will easily find what you are looking for. Try man mkdir. It shows you the manual for the mkdir command. Browse through the page with the space and Enter key or Page-Down and Page-Up. Type q to go back to the shell. You can also search by typing a forward slash / followed by a search term and Enter. Pressing n repeatedly takes you through the highlighted search results. See also "Viewing text files" below.

Other languages

Don't think Linux is English only! Depending on the installation there may be many languages available. Which language is displayed can be controlled by setting the LANG environment variable. Valid values of this variable can be obtained with the command: locale -a. The following would select German language on my system.

$ export LANG=de_DE

There are other variables that also influence how dates and system messages are displayed. A list of all those variables and their current values can be obtained with:

$ locale
LANG=en_US
LC_CTYPE="en_US"
LC_NUMERIC="en_US"
LC_TIME="en_US"
LC_COLLATE="en_US"
LC_MONETARY="en_US"
LC_MESSAGES="en_US"
LC_PAPER="en_US"
LC_NAME="en_US"
LC_ADDRESS="en_US"
LC_TELEPHONE="en_US"
LC_MEASUREMENT="en_US"
LC_IDENTIFICATION="en_US"
LC_ALL=

When in doubt just set LC_ALL and LANG and forget about the rest.

Viewing text files

In Linux there are many text files. So a common task is to view text files. Viewing means reading only. That excludes modifying. The simplest way to view a text file is the program cat:

$ cat /etc/resolv.conf
nameserver 10.0.123.1
domain localdomain.net

cat just writes the whole file out to the screen. While that's fine for very short files it is inconvenient for larger files because they will scroll by way too quickly. A more powerful way to view a file is the more program. It is an interactive program. That means if the file is larger than your screen it will accept keyboard commands so you can navigate back and forth within the file. The following keys are recognized:

Enter: scroll down one line
space: scroll down one page
b: scroll back one page
/: search
n: search next
h: help screen
q: quit

more is available on all systems but is quite an old program. Today it has mostly been replaced by the somewhat more powerful less command. less always behaves interactively, while more is only interactive if necessary. less supports the same keyboard commands as more, but additionally the cursor keys and page up/down work. It also has some more features. See the built-in help screen (press h).

less or more are also used when viewing manual pages. Which program is used for viewing is defined in the PAGER environment variable. You can check the current setting with echo $PAGER and set it to less like this:

$ export PAGER=less

Environment variables

This chapter is specific to bash. It is also a little advanced and you may decide to skip it. But sooner or later you will read it. If you are using another shell things may work differently. We have met environment variables a couple of times now. I'll quickly explain what this is. When you start a program the program is given an "environment". An environment consists of a number of variables (name/value pairs). The program can read these variables and change its behaviour according to their value. For instance, it can show text in the language that is set in the LANG variable. Which variables a program expects is normally documented in the program's man page.

Now, the shell is also a running program. So it also has an environment. You can list the environment variables of your shell with the env command (the set command shows even more: it also lists variables that the shell keeps to itself). You will be surprised how many variables there are already. When you start a new program all these environment variables are inherited by the program. The program gets an actual copy of the environment which it can modify without touching your environment. This means you can give every program that you start a completely different environment if you like.

So how do you modify the environment? You have done that already. You have used the export command to set the language. So to set the environment variable X of your shell to abc you do export X=abc. Sometimes you will see that people do it like this:

$ X=abc
$ export X

This has the same effect. You can also remove the variable X from the environment again with unset X. To set certain environment variables automatically when you log in you can put the necessary commands into one of the files .bash_profile, .bash_login or .bashrc in your home directory. Upon login bash executes .bash_profile or, if that does not exist, .bash_login. .bashrc is executed by interactive shells that are not login shells, and on many systems .bash_profile is set up to run it as well. Not all of these files may be present; you can just create them. See the next chapters for how to edit these files.
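
The two-step form shows that setting a variable and exporting it are really two different things. Here is a small sketch you can try. X is just a placeholder name, and bash -c '...' is again only used as a simple way to start another program that prints what it sees.

```shell
# Start clean in case X happens to exist already
unset X

X=abc                                 # X exists in this shell only
bash -c 'echo "child sees X=[$X]"'    # prints: child sees X=[]

export X                              # now X is part of the environment
bash -c 'echo "child sees X=[$X]"'    # prints: child sees X=[abc]

unset X                               # remove it again
bash -c 'echo "child sees X=[$X]"'    # prints: child sees X=[]
```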

Editing text files

On the Linux console you will often need to edit text files. Thus you need a text editor. Beginners will often use nano or pico, mostly because they come with sort of a menu. I have tried those editors. But actually I could never figure out how to do the most basic stuff like mark, copy/paste and search. So I gave them up quickly because I already knew how to use vim. You may try out nano if you like. It should be self-explanatory. But don't blame me if you don't get along with it. Sooner or later you will learn how to use vi or vim anyway. So why not just right now?

vi is a powerful editor

vi is probably the most powerful command-line editor. But it is difficult to learn, and hard to remember. The first difficulty is: There are two different versions. vi is the traditional (old) Unix editor. vim is its successor and much more advanced. vi lacks many of the most important features of vim. On most Linux systems when you start vi you will actually get vim. This is the magic behind it:

$ ls -l /usr/bin/vi
lrwxrwxrwx  1 root root 12 Oct 24 09:56 /usr/bin/vi -> /usr/bin/vim

A simple link. Okay, let's start it with some text file as the parameter: vi sample.txt.

You should see the first page of the text file now. vi is in so-called "open" mode now (vim's own documentation calls it normal mode). That means you cannot actually modify text (yet). You can merely move around in the text using the arrow keys, PgUp, PgDn or the keys h,j,k,l. Almost every key on the keyboard will do something special in open mode. So be careful! Here are some basic commands:

j,k: down, up one line
6j: down 6 lines
h,l: left, right one character
0: beginning of line
$: end of line
gg: top of the file
385gg: go to line 385
G: end of file
w: right one word
7w: right 7 words
x, Del: cut character
4x: cut 4 characters
dd: cut line
2dd: cut 2 lines
10dd: cut 10 lines
dw: cut word
d$: cut to end of line
d0: cut to beginning of line
dG: cut to end of file
u: undo
2u: undo two commands
J: join two lines
yl: copy one character
yy: copy one line
3yy: copy 3 lines
p: paste after current position
7p: paste 7 times after current position
P: paste before current position
v: mark visually for copy/cut (end with x or y)
n: find next (after search)
N: find previous
==: re-indent current line
=G: re-indent to end of file (useful after copy/paste in a window when text contains tabs)

The following commands leave open mode and enter editing mode. In editing mode you can type text with the keyboard. To return to open mode press ESC.

i: insert at current position
a: append/insert after current position
o: insert line below
O: insert line above
s: replace character
3s: replace 3 characters
r: replace (and return to open mode)
5r: replace 5 characters with another character

In editing mode the following helpful keyboard commands are available:

Ctrl-n, Ctrl-p: complete word
Ctrl-x Ctrl-l: complete line
Ctrl-w: delete the word before the cursor

From open mode you can also go into command mode by typing a colon :. The cursor will jump to the last line of the screen and wait for you to type the command. After the command you are back in open mode. There are many commands. The most important ones are:

h: help (exit with :q)
w: save
w otherfile: save to otherfile
q: quit
x: save and quit
/needle: find "needle" (n: next); also works without typing the colon first
/needle\c: find "needle" case-insensitively
nohl: remove highlighting after search
set paste: turns off some annoying features before a paste operation
set nopaste: turns these features back on
s/cat/dog: replace the first "cat" with "dog" (current line)
%s/cat/dog: replace the first "cat" on every line with "dog" (whole file)
s/cat/dog/g: replace all "cat" with "dog" (current line)
s/cat/dog/gc: replace all "cat" with "dog" and confirm each one
colorscheme elflord: colors suitable for black terminal
colorscheme <Tab>: browse through color schemes
syntax enable: enable syntax highlighting
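
The s commands above differ in whether they stop after the first match on a line or replace every match (the g at the end). You can try the difference outside the editor. The sketch below uses sed, a separate program that is not covered in this article but happens to use the same s/old/new/ notation. Like :%s it looks at every line. The sample text is made up.

```shell
# Without g: only the first "cat" on each line is replaced
printf 'cat cat cat\nblack cat, white cat\n' | sed 's/cat/dog/'

# With g: every "cat" on each line is replaced
printf 'cat cat cat\nblack cat, white cat\n' | sed 's/cat/dog/g'
```

The first command prints "dog cat cat" and "black dog, white cat", the second prints "dog dog dog" and "black dog, white dog".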

Zipping files with zip

You are probably familiar with ZIP files from Windows. Or StuffIt files on a Mac. These are compressed archives that can contain many files. Those archives can also contain a whole directory structure. Archives are handy when you want to move around many files between different computers. ZIP files can also be used on Linux. However ZIP files are less common on Linux because they do not reliably carry Unix permissions; archives created on Windows, for instance, contain none. And you have seen that permissions are an important feature of Linux file systems. So if you unpack such a ZIP archive on Linux, you will have to figure out manually what is an executable program and set the execute permission manually! In the next section I will introduce you to tar, the more popular archiver for Linux.

Nevertheless you may work with ZIP files on Linux using the programs zip and unzip. When you start zip or unzip without parameters it will print a help screen. Here is how to use zip to store some photos in the pictures directory in a file called holidays.zip:

$ zip -r holidays.zip pictures
  adding: pictures/ (stored 0%)
  adding: pictures/dsc00461.jpg (deflated 1%)
  adding: pictures/dsc00462.jpg (deflated 1%)
  adding: pictures/dsc00463.jpg (deflated 1%)

This is how the same ZIP file you have just created is unzipped:

$ unzip holidays.zip
Archive:  holidays.zip
   creating: pictures/
  inflating: pictures/dsc00461.jpg
  inflating: pictures/dsc00462.jpg
  inflating: pictures/dsc00463.jpg

Compressing files with gzip

gzip is GNU Zip, which is something different from zip. The two are not compatible! gzip compresses a single file and renames it by appending .gz to its name. gunzip is the counterpart that uncompresses a file again. Here is an example:

$ ls -l
-rw-r--r--  1 john users 616 Jan  3 03:25 letter.txt
$ gzip letter.txt
$ ls -l
-rw-r--r--  1 john users 383 Jan  3 03:25 letter.txt.gz
$ gunzip letter.txt.gz
$ ls -l
-rw-r--r--  1 john users 616 Jan  3 03:25 letter.txt

See how letter.txt is compressed from 616 bytes down to 383 bytes and is then uncompressed back to its full size. Note that neither gzip nor bzip2 can pack more than one file into a single compressed file; if you give them several files, each one is compressed separately. This is because of the Unix philosophy of programming. On Unix we have many small programs that can do one special thing very well. Those programs can easily be combined in a vast variety of ways to achieve a more complex task. So on Linux the task of compressing data is separate from the task of putting files into an archive. See Tarballs below if you need to compress more than one file.
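
Here is a small sketch of what happens when you hand gzip more than one file. The file names are made up, and everything happens in a throw-away directory created with mktemp -d so that none of your own files are touched. The -c option of gunzip, which is not mentioned above, writes the uncompressed text to the screen instead of unpacking the file.

```shell
# Work in a fresh temporary directory
cd "$(mktemp -d)"

# Create two small sample files
printf 'Dear Jane\n' > letter1.txt
printf 'Dear Joe\n'  > letter2.txt

# Two files in, two separate .gz files out (no archive is created)
gzip letter1.txt letter2.txt
ls

# Look into a compressed file without unpacking it
gunzip -c letter1.txt.gz
```

After this the directory contains letter1.txt.gz and letter2.txt.gz, and the original files are gone.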

Compressing files with bzip2

bzip2 works similarly to gzip but uses a different compression algorithm. It's slower but can create smaller files than gzip, especially when used on large binary files. bzip2 appends .bz2 to the file name. bunzip2 is the counterpart that uncompresses a file again. Here is an example:

$ ls -lh
-rw-r--r--  1 john users 64M Jan  3 03:25 fatfile
$ bzip2 fatfile
$ ls -lh
-rw-r--r--  1 john users 56M Jan  3 03:25 fatfile.bz2
$ bunzip2 fatfile.bz2
$ ls -lh
-rw-r--r--  1 john users 64M Jan  3 03:25 fatfile

See how fatfile is compressed from 64MB down to 56MB and is then uncompressed back to its full size.

Tarballs

The more natural archive format for Linux is the one created by the program tar. Such archives are also called tarballs. Tar has many options of which you will only need a few. To tar your holiday pictures from above you would do:

$ tar cf holidays.tar pictures

The above command did not apply any compression to the archive (see comment in the section about gzip). As we have archived JPG images trying to compress them even more would have been futile anyway. What we have is a plain tar file now. Tar has no compression built-in. But GNU tar supports gzip and bzip2 by just specifying an option. If you want to use gzip compression you can do:

$ tar czf website.tgz /var/www/www.odi.ch

The above command uses gzip compression - note the z option. Note that we indicate the gzip compression by using the suffix .tgz. The suffix .tar.gz is also common. You can also use bzip2 compression:

$ tar cjf website.tar.bz2 /var/www/www.odi.ch

Note the j option that indicates bzip2 compression. Also note that we indicate the compression used in the file name with the suffix .tar.bz2. The suffix .tbz is less common. Of course we can do archiving and compressing in two steps:

$ tar cf holidays.tar pictures
$ bzip2 holidays.tar

This would leave us with a file called holidays.tar.bz2. Extracting a tar archive works similarly. Instead of the c (create) option the x (extract) option is used. The right compression option (z or j) should be present if the file is compressed (recent versions of GNU tar detect the compression by themselves when extracting, but older ones do not):

$ tar xjf website.tar.bz2

If you just want to know what files are inside an archive but you don't want to actually extract the archive then use the t (tell) option instead of x:

$ tar tjf website.tar.bz2

Here is the list of options again:

f: work on a file (must always be present!)
c: create a new archive
x: extract an archive
t: list the contents of an archive
z: use gzip compression
j: use bzip2 compression
p: (together with x) preserve permissions
v: print the file names
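
To see these options working together, here is a complete round trip as a sketch. The directory and file names are invented and the "photos" are just tiny text files. Everything happens in a throw-away directory. The -C option, which is not in the list above, tells tar to unpack into another directory.

```shell
# Work in a fresh temporary directory
cd "$(mktemp -d)"

# Create a directory with two pretend photos
mkdir pictures
printf 'not really a photo\n' > pictures/dsc00461.jpg
printf 'another one\n'        > pictures/dsc00462.jpg

# c: create, z: gzip compression, f: the archive file name follows
tar czf holidays.tgz pictures

# t: only list what is inside, nothing is unpacked
tar tzf holidays.tgz

# x: extract, p: preserve permissions, -C: unpack into the directory restored
mkdir restored
tar xzpf holidays.tgz -C restored
ls -l restored/pictures
```

Unpacking into a separate directory first is a safe habit, because you see what an archive contains before it mixes with your own files.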

Transferring files to a remote computer

If you can connect to a remote computer with SSH, you can also transfer files to and from it. This is done with the scp program. Assume you want to copy a local file called New Document.doc to the remote computer called very.remote.com. The file should be copied to the directory documents/drafts inside john's home directory. This is done with the following command:

$ scp 'New Document.doc' john@very.remote.com:documents/drafts/

Of course you can copy the file in the other direction, too:

$ scp 'john@very.remote.com:documents/drafts/New Document.doc' .

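Note the dot at the end of this command. A single dot always means the current directory. Another example: $ scp music/elvis.tgz john@very.remote.com: would copy the file to john's home directory. Never forget the colon at the end, or scp will just make a local copy of the file and give it the name john@very.remote.com! If a remote file name contains spaces, older versions of scp may need additional quoting because the remote shell interprets the path.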

Instead of scp you may also use sftp, which is a program similar to ftp.

KDE's Konqueror has built-in support for SSH. Simply type in the URL bar: sftp://very.remote.com/

Redirecting output to files

Your shell can save output of programs to files directly. Just try

$ ls -l > dir.txt

The above writes the output of ls -l to the file dir.txt instead of to the screen. Careful, the file dir.txt is overwritten if it exists! Of course you can also just append to a file instead of overwriting it:

$ ls -l >> dir.txt

If you want both the output to screen and to a file use the tee program like so:

$ ls -l | tee dir.txt

The above command makes use of a pipe (that funny | character, produced by Alt Gr-1 or Alt Gr-7 on some keyboard layouts) which we will discuss later on.
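
If you want to see the difference between overwriting and appending for yourself, try this sketch. The file names notes.txt and copy.txt are made up, echo is used instead of ls so that the result is easy to check, and everything happens in a throw-away directory.

```shell
# Work in a fresh temporary directory
cd "$(mktemp -d)"

echo "first"  > notes.txt     # creates notes.txt
echo "second" > notes.txt     # overwrites it: "first" is gone
echo "third" >> notes.txt     # appends: the file now has two lines
cat notes.txt

# tee shows the text on the screen and also writes it to copy.txt
cat notes.txt | tee copy.txt
```

Both cat notes.txt and the tee command print the two lines "second" and "third", and copy.txt ends up with the same content as notes.txt.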