# Chapter 1

This chapter will introduce you to the shell environment from both a user's and a programmer's perspective. The Portable Operating System Interface (POSIX) standards specify how programs should implement their interface so that they are portable across Unix-like platforms. Bash follows these standards and augments them with additional features.

A **shell** is a user-interface to an operating system's services, and this guide is about the **command-line interface** for the shell. We run a **terminal** program in order to use this interface. In turn, the terminal invokes a shell program, which is typically `bash`. So, launch a terminal for your platform, and let's get started!

Hopefully, your terminal now eagerly awaits your command, and it is running a shell program, which is probably `bash`. You can verify this by

```bash
ps -p $$
```

After you typed the above command and pressed `enter`, the command should print something similar to the output below.

```
  PID TTY          TIME CMD
25719 pts/0    00:00:00 bash
```

Your output will be slightly different, but the important part is that the `CMD` column shows `bash`. My output indicates that I am using an instance of `bash` with the **process id** of 25719 (yours will very likely be different). Every process on Linux is assigned a process ID so that they may be uniquely identified by commands such as `kill`, which can terminate a launched process.

Note that most commands introduced hitherto should be similarly typed into the terminal following the **prompt string** (a short piece of text that typically ends with the `$` character). If the command-line prompt string is ever missing, it means that your current command is incomplete. Unless you know your command spans multiple lines, your command may contain a syntax error. In this case, you should press `C-c` (hold `Ctrl` key and press `c`) to interrupt the current command and return to the command-line prompt so that you may start over from the last successfully executed command.

Commands are interpreted by `bash` and executed sequentially. Bash commands follow the format

```
program arg1 arg2 arg3 ...
```

Our command begins with the name of the **program** (`ps`), followed by the **arguments** (`-p $$`). (An argument is a value passed to a program; it can pretty much be anything and *does not* need to be persuasive.) The arguments are split by **whitespace** characters (i.e. space, tab, and new-line) and sent as separate values to the program. The program will receive `-p` as the first argument and `$$` as the second argument. Typically, Linux programs are written so that arguments begining with the `-` character are interpreted as **parameter** names, which may consume one or more additional arguments. In our example, `$$` is the argument to the `-p` parameter; that is, the program will set the `-p` parameter to the value of `$$`.

In general, words that begin with `$` are **variable** names, and Bash will **expand** variable names into their current values. The `$$` variable is expanded into 25719 in the above example and the `ps` program will actually receive `-p` and `25719` as command-line arguments (not `-p` and `$$`).

The above commands should work if you are using a shell that follows the POSIX standards. Since this guide is strictly about Bash, if the `ps -p $$` command outputted a shell other than `bash`, then invoke `bash` and enter the Bash environment by simplying entering

```bash
bash
```

This command starts a new instance of bash and enters its environment. When you wish to return to your original shell environment, type

```bash
exit
```

## Bash script

Commands can be typed into `bash`, but they can also be stored in a text file known as a **script**, which can be run upon request. By convention, shell scripts have the extension `sh`.

Open a file in the current directory using your text editor (e.g. `nano`, `vim`, `emacs`), and enter the following text

```bash
echo "Hello, $1."
```

Save the file to disk. Then, run the script using `bash` by

```bash
bash hello.sh Brian Fox
```

In the above command, `bash` took `hello.sh`, `Brian`, and `Fox` as arguments and runs the `hello.sh` script with arguments `Brian` and `Fox`. The `hello.sh` script contains a line of command that runs the `echo` program (which simply outputs a given text) with the argument `"Hello, $1."`. This argument is converted by Bash into `Hello, Brian.`, and it is passed to `echo` for output to the terminal.

The `$1` variable refers to the *first* argument. Accordingly, `$2` refers to the second, `$3` refers to the third, and so on. Since `$2` is not used, the second argument (`Fox`) does not appear in the output. In order to send `Brian Fox` as one argument, we must surround the text using quotes:

```bash
bash hello.sh "Brian Fox"
```

The quotes preserve the space characters and protect the argument `Brian Fox` from being split into two arguments.

The current script will work in any POSIX-compliant shell, but we can be (and should be) more specific in how the script is to be run by including a **shebang line** (which begins with `#!` in the script:

```bash
#!/bin/bash

echo "Hello, $1."
```

Additionally, we need to make the script *executable* by enabling the `x` flag for the file using the `chmod` (change file modes) program:

```bash
chmod +x hello.sh
```

Then, we can execute (run) the script by

```bash
./hello.sh "Brian Fox"
```

The shebang line in the script instructs the shell to run the script using the executable at `/bin/bash`, the standard location for `bash` on most Linux platforms. Effectively, the script is executed by the following command

```bash
/bin/bash hello.sh "Brian Fox"
```

In the first command `./hello.sh`, the `.` is necessary so that the shell knows that the `hello.sh` script is found in the current directory (`.`). The `/bin/bash` in the second command explicitly specifies the path to the `bash` executable (in the event that multiple instances of `bash` are installed on the machine. In our casual use of the script earlier, we only specified `bash` instead of the full path, because executables are installed in standard locations on Linux platforms, and the shell searches standard locations for the first instance of bash.

To see where your default `bash` is stored, you can use the `which` program:

```bash
which bash
```

Of course, `hello.sh` can still be executed by invoking `bash hello.sh` instead of `./hello.sh` regardless of whether the shebang line is present or not. Nonetheless, the shebang line also serves as documentation for how the script is meant to be run.

## Parameters

When programs are called, the user may supply arguments to parameters in order to control the behaviour of the program. (**Parameters** are the variables that can be set in a program; **arguments** are values that a user supplies to the program.) What parameters and how these parameters are used by the program ultimately depends on the author of the program. Fortunately, Linux programs tend to specify parameters in a similar pattern.

```
program [--opt_arg0] [--opt_arg1 X] [--opt_arg2 Y Z] <pos_arg1> <pos_arg2>
```

This template command is known as a **usage** statement for the program.

In the above usage statements, `pos_arg1` and `pos_arg2` are **positional parameters**, which are generally mandatory and passed arguments are assigned in order based on position. These parameters are often enclosed in angular brackets (inequality signs) `<>`. So, the command

```
program A B
```

will set the parameter `pos_arg1` to value `A` and the parameter `pos_arg1` to value `B`. Notice that the parameter names are not specified in the above command; rather, the program understands that the first specified value is for the first positional parameter and the second specified value is for the second positional parameter.

In contrast, **optional parameters** are *not* mandatory, and they are usually prefixed by one or two dashes `-` and possibly enclosed by square brackets `[]`. If the user wishes to specify a value for an optional parameter, the user must refer to the parameter by name. For example,

```
program --opt_arg1 C A B
```

will set `opt_arg1` to `C`, and the positional parameters `pos_arg1` and `pos_arg2` to the same values as before. Some programs will require that optional parameters be specified before position parameters. That is,

```
program A B --opt_arg1 C
```

may not be parsed correctly. Some programs also support the use of `--` to indicate the end of optional parameters, as in

```
program --opt_arg1 C -- A B
```

These parameter descriptions are simply guidelines that many Linux programs follow. Users should also consult the help message or the manual entry for specific instructions for the program.

It would be helpful to users if you follow these guidelines when you design the command-line interface to your program. `bash` does not automatically parse command line arguments: it simply splits into words as delimited by whitespace, as discussed in the `ps -p $$` example earlier. For more complete command line parsing, we can use the [`getopts`](http://wiki.bash-hackers.org/howto/getopts_tutorial) command that is built into Bash. Alternatively, we can use newer programs such as [`docoptp`](https://github.com/djhshih/docoptp), or parse the arguments [using Bash](http://mywiki.wooledge.org/BashFAQ/035).

## Job Control

Inside the Bash environment, commands are executed to run various programs to achieve a task. Each running program is also called a *job*, or more formally, a *process*. A running job can be terminated by pressing the `Ctrl` key together with `c` (refered to as `C-c`).

The `C-z` combination *suspends* a job and sends it into the background (but does not terminate it) so that the user can continue to interact with the Bash command-line and do additional work in parallel to the background job. To resume the job previously sent to the background while keeping it in the background, run `bg`. To resume a background job and bring it to the foreground (current command-line prompt), run `fg`.

If a command is executed with `&` at the end, the job is launched in the background. In other words, the `&` character has a similar effect as pressing `C-z` on a job and resuming it in the background with `bg`.

```bash
bash -c 'sleep 1; echo "Woke up."' &
```

Here, a new instance of `bash` is launched with the `-c` argument, which takes a command to execute: `sleep 1; echo "Woke up."`. This command sleeps for 1 second, and prints `Woke up` to the terminal. After running this command, the user is immediately returned to the command-line prompt and allowed to issue more commands. After 1 second, the background running job will complete and print `Woke up.` to the same terminal. If the `&` were omitted, the user would have to wait for the command to complete before running additional commands.

In contrast, the `C-d` combination terminates input to the program, for programs that expect additional user inputs, such as `cat` (which stands for **concatenation**). For example, `cat` does not terminate after the user calls it, because `cat` waits for additional input and will continue to wait unless it receives the `C-d` signal. Type

```bash
cat > rose.txt
What's in a name? that which we call a rose
By any other name would smell as sweet.
```

and press `C-d`. Thereafter, `cat > rose.txt` outputs your text input to a file named `rose.txt`. If you were to press `C-c` to terminate the program, the output file may be incomplete, as the `cat` process will be interrupted and terminated immediately.

## Navigating directories

By default when you launch a terminal, you enter a shell environment (e.g. Bash) and start in the home directory of the user (typically `/home/brian` for a user named `brian` on a Linux platform. To navigate to other directories, we use the `cd` (change directory) command:

```bash
cd dir/subdir
```

The argument following `cd` specifies the path of the new directory `dir/subdir`, which must already exist. Directories inside directories are separated by the slash character `/` on Linux platforms. `cd` recognizes special paths `.`, and `..`, and `-`. The `.` path refers to the current directory, `..` refers to the parent directory, and `-` refers to the previous directory. To print the current directory, we use `pwd` (print working directory). Without any argument `cd` returns the user to the home directory.

To print a list of directories and files within a directory, we use `ls`, and `ls -l` for a more verbose output that indicates permission, ownership, time stamp, file size, and other detals.

The table below lists additional commands

| Command  | Use                                             |
| -------- | ----------------------------------------------- |
| `cd`     | Change working directory.                       |
| `pwd`    | Print working directory.                        |
| `ls`     | List directory contents.                        |
| `mkdir`  | Make directory.                                 |
| `rmdir`  | Remove directory only if it is empty.           |
| `touch`  | Touch a file, creating it if it does not exist. |
| `rm`     | Remove files or directories.                    |
| `mktemp` | Create a temporary file.                        |
| `mkfifo` | Create a first-in-first-out file (named pipe).  |
| `ln -s`  | Create a symbolic link to a file or directory.  |
| `mv`     | Move files or directories.                      |
| `cp`     | Copy files or directories.                      |

## Variables

In order to adapt to changing circumstances, Bash scripts use variables to hold information that is provided by the user and is subject to change. Variables are used to store values temporarily, and values they store can be changed subsequently. The `$1` variable is an example of a **special variable**, which is not defined explicitly by the programmer. Aside from `$1`, `$2`, `$3`, ..., we have seen the `$$` special variable, which holds the process id of the current running program (e.g. `bash`). Below are a few more useful special variables:

| Variable      | Value                                                        |
| ------------- | ------------------------------------------------------------ |
| `$$`          | ID of current process                                        |
| `$!`          | ID of the most recently submitted background process         |
| `$0`          | Name of current process                                      |
| `$1` `$2` ... | First, second, ... arguments                                 |
| `$#`          | Total number (`#`) of arguments                              |
| `$@`          | All arguments as an array (starting with `$1`)               |
| `$?`          | Exit status of the most recently executed foreground program |

A number of **shell variables** are also pre-defined by your platform or previously in the current Bash session. You can print all currently set (defined) shell variables by calling `set` without any arguments:

```bash
set
```

In addition, several **environment variables** are also pre-defined. Environment variables are always inherited by the new children processes spawned by the current shell program, whereas **shell variables** are not inherited by children processes. To see a list of currently defined environment variables, call `env` without arguments:

```bash
env
```

Among the environment variables, we will point out `$PATH` and `$LD_LIBRARY_PATH`, because they are particularly important for Bash to find executables and shared libraries. `$PATH` contains a list of paths that point to possible storage locations for executables, separated by the colon character `:`. On most Linux platforms, `/usr/bin` should be one of these possible locations. Bash searches for programs sequentially in the order the paths appear in the `$PATH`. The `$LD_LIBRARY_PATH` environment variable contains locations where shared libraries are stored. Furthermore, the `$HOME` environmental variable points to the home directory of the current user.

In Bash, you can define your own shell variables by using the following syntax:

```bash
author_name="Brian Fox"
```

Here, the `author_name` variable is defined and holds the value `Brian Fox`. Variable names cannot contain spaces or special characters. Generally, you should only use letters, numbers, and underscore for variable names. Importantly, no space surrounding `=` is allowed. Inserting a space after `author_name` will cause Bash to interpret `author_name` as a program name, since it is an isolated word at the beginning of a command.

We access the value of a variable by using the `$` prefix in front of the variable name. By doing so, **variable expansion** occurs: the variable reference is expanded to the value contained in the variable.

```
echo $author_name
```

If we omit the `$` prefix, we are simply refering to the text `author_name` and not the value inside the `$author_name` variable.

```
echo author_name
```

If we were to run another instance of bash (by calling `bash`), we would not be able to access shell variables from the current bash session.

```
bash -c 'echo $author_name'
```

In this command, `bash -c` starts a new instance of `bash`, executes the specified command, and returns to the original shell environment following the execution of the command. The specified command `echo $author_name` is executed within the new instance of `bash`, and it will print nothing because `$author_name` is not defined for this `bash` instance.

In order for new `bash` instances (**subprocesses**) to have access to a shell variable, the latter must be **exported** into the list of environment variables:

```bash
export author_name
```

This command exports the shell variable with the name `author_name` into the environment, so subprocesses, such as another instance of Bash, can access this variable. (Notice that we did not use the `$` prefix, because we are literally specifying the name of the variable: no variable expansion should occur.)

Now the command

```bash
bash -c 'echo $author_name'
```

should print `Brian Fox`.

To recap, `$author_name` is expanded to the value of the `$author_name` variable (`Brian Fox`) if it is defined in the current Bash session. The `$author_name` variable is defined in the current Bash session, but not in a new Bash session, until the `$author_name` has been exported as an environment variable.

The astute reader will notice that we used the single quote `'` instead of the double quote `"` to pass `echo $author_name` to a new `bash` instance. This choice was intentional, as we wanted to pass the text `echo $author_name` literally, without variable expansion, to a new `bash` instance in order to illustrate the need to export variables.

Suppose we use double quotes in the argument to `bash -c`,

```bash
animal=Penguin
bash -c "echo $animal"
```

Then, `Penguin` is printed despite that the `$animal` is not defined in the new instance of `bash`. We observe this behaviour, because double quotes `"` permit variable expansion by the current `bash` instance while single quotes `'` do not. Therefore, the `"echo $animal"` argument is expanded to `echo Penguin`, which is then passed to the new instance of `bash`, which receives `-c` and `echo Penguin` as arguments. The new instance of `bash` no longer needs to know about `$animal`, because it receives the substituted value.

The `name=value` syntax defines a shell variable in the current `bash` session, but it can also be used to define variables in a new environemnt if the syntax is directly followed by a command:

```bash
food=Bamboo bash -c 'echo $food'
```

If `food=Bamboo` and `bash -c` were on separate lines, the `$food` variable would be defined for the current `bash` instance and not for the new `bash` instance, as discussed earlier. Since `food=Bamboo` and `bash -c` are part of the same command, however, the `$food` variable is actually directly defined in the environment of the new `bash` instance (and not in the current bash).

Therefore, the new `bash` instance receives `echo $food` as arguments (without variable expansion), and it would expand the variable `$food` into the text value `Bamboo`, because it received the environment `food=Bamboo`. Subsequently, the new `bash` instance would print `Bamboo` and exit. In contrast, the current `bash` instance will not have access to `$food`. That is, the command

```bash
echo $food
```

now prints nothing to the terminal, because the `$food` variable is no longer defined. Suppose we now use double quotes instead of single quotes to permit variable expansion *before* invoking `bash -c`, we similarly receive no output, because the `$food` is not being defined in the current `bash` instance, where variable expansion now takes place:

```bash
food=Bamboo bash -c "echo $food"
```

In summary, the syntax for executing a command is

```bash
var1=a var2=b var3=c ... program arg1 arg2 arg3 ...
```

## Bash initialization files

When you first enter the shell environment in a terminal, you are working within an **interactive**, **login** shell. If you invoke `bash` at the command-line prompt without arguments, you are creating a **subprocess**: you are starting another instance of `bash`. This time, you are using an interactive, non-login `bash` instance, unless you invoke `bash --login`. You may exit this subprocess by invoking `exit`. If you run `bash -c` and supply an argument containing a command, you are starting a non-interactive, non-login `bash` instance, which will exit when the command completes. You can also invoke `bash script.sh`, which runs `script.sh` using a `bash` subprocess. Similarly, `./script.sh` also runs `script.sh` in a subprocess as specified by the shebang line. To emphasize, no shell variables are inherited by subprocesses, but the environment variables are inherited by subprocesses, as illustrated with the earlier examples. Shell variables become environment variables when they are exported, and they are in turn inherited by subprocesses.

So what about an independent instance of `bash` that is invoked in another terminal? A `bash` instance in one terminal is not a subprocess of the `bash` instance in another terminal; they are completely independent, and they cannot send variables to each other. In order for both `bash` instances to have access to a common set of variables, you may define environment variables in initialization script files.

Most commonly, you would define variables in the `~/.bashrc` file (where `~` is a short-hand for `$HOME`). The `~/.bashrc` script is run whenever a non-interactive `bash` instance is started (e.g. `bash -c`, `bash script.sh`, `./script.sh`). If `bash` is started interactively (e.g. by using `bash --login` or `bash -l`, the `~/.bash_profile` initialization script will be run.

Bash also runs the global `/etc/profile` initialization script in addition to the personal `~/.bash_profile`. Furthermore, some platforms may store additional global initializations script `/etc/profile.d/` that are run by the `/etc/profile` script. These script files are only run in an interactive, login `bash` instance (e.g. first invoked `bash` inside a terminal and `bash --login`). The idea is that these profile scripts are sourced once during login, and each `bash` subprocess sources `~/.bashrc`.

On many systems, the `~/.bash_profile` actually sources `~/.bashrc` so that commands in `~/.bashrc` will eventually be executed. (You should verify that your `~/.bash_profile` contains a line with `. ~/.bashrc`, `source ~/.bashrc`, or something equivalent. *Above all, it is important to `export` all variables intended for global use in the `~/.bashrc` initialization script.*

Suppose you wish to define your own initialization script that defines non-exported variables for certain tasks. You can create some script file named `initialize.sh` stored at `path` and run it within the current `bash` instance by:

```
. path/initialize.sh
```

The space after `.` is critical, as this `.` does not refer to the current directory; it instructs `bash` to run the script in the *current* instance instead of a subprocess. If you run the script by `./path/initialize.sh` or `bash path/initialize.sh`, you would define the variables in a subprocess, and these variables will not be accessible to the current `bash` instance. The `export` command will not help, since only subprocesses inherit environment variables: an environment variable in the subprocess will not be accessible to the parent `bash` instance. Executing `initialize.sh` with `.` is equivalent to copying over the content of `initialize.sh` and running them in the current `bash` instance. This command is also known as `source`, so the below command does exactly the same:

```
source path/initialize.sh
```

In fact, you may notice that the standard initialize script files (e.g. `/etc/profile`) often source other script files using `.` or `source`.

## Summary

So what's with all the fuss about who inherits what and who has access to what? This chapter illustrated an important concept in modern programming: **compartmentalization**. Information stored in variables are compartmentalized so that one `bash` instance does not necessarily have access to all the information; an effective programmer will control the scope of each variable appropriately so that each process has access to the right information at the right time. Indeed, if all variables are shared among all processes, and each process can modify each variable, then it would be increasingly difficult to predict variable content and program behaviour as the size and complexity of the program grows. Therefore, it is important to understand how information flows from external environments to the current program (via command-line arguments), from parent `bash` instances to child instances (via environment variables), as well as how some information can be propagated among independent `bash` instances (via modifications to the initialization scripts). The next chapter will elaborate more on how information can flow from one program to another independent program.


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://djhshih.gitbook.io/quick-bash/ch01.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
