For this project I built a simple bash-like shell capable of running and managing programs in Linux. The key features of this shell include:
- [[#Creating Processes|Running a program in the foreground or background]]
- [[#Viewing and Managing Jobs|Viewing and managing existing jobs with built-in commands]]
- [[#Piping Between Processes|Piping data between two programs]]
The goal of this project was to gain a better understanding of processes and how they work, as well as how to use system calls to utilize lower-level tools offered by operating systems.
<iframe src="https://drive.google.com/file/d/1NtMXOtBdHi2XP5onc6n9ApiNLX2906wk/preview" width="640" height="480" allow="autoplay"></iframe>
## Creating Processes
There are a few steps before you can begin creating processes:
- Write signal handlers for `SIGCHLD`, `SIGINT`, and `SIGTSTP`
- Create a function that blocks the shell until the foreground process exits, stops, or terminates
- Create a signal mask
Creating signal handlers for `SIGINT` (which interrupts the process, resulting in its termination) and `SIGTSTP` (stops the process, effectively pausing it) is simple. All you need to do is forward the signals to the foreground process:
```c
if (foreground_job != NULL) {
pid_t foreground_pid = foreground_job->pid;
foreground_job = NULL;
kill(-foreground_pid, <SIGINT/SIGTSTP>);
}
```
`SIGCHLD` (which is received when a child process changes state) is a little bit more complicated because you need to consider what the child process' new state is. If it was stopped, the shell should update its internal bookkeeping to reflect this new state so it can be resumed by the user later. If the process was terminated, the shell can safely discard any reference to it. Additionally, distinguishing between different types of state changes enables the printing of useful messages that inform the user of the event.
There is one other consideration to be made when writing a `SIGCHLD` handler: concurrency. The scheduler makes no guarantees about the order in which processes run, so it's likely that some will change state while the `SIGCHLD` handler is already running. Signals aren't queued, so the handler won't be run for these processes. Not only does this result in the shell retaining outdated information on processes, but it prevents un-reaped zombie processes from being cleaned up. To address this, every time the handler is run you need to process all children that have changed state but haven't been handled.
Now that we've covered everything the `SIGCHLD` handler needs to do, we can begin writing it. At the core of this handler is the `waitpid(pid_t pid, int *wstatus, int options)` function, which (by default) blocks the current process until the process with specific process ID changes state. This isn't the behavior we want. We want it to return for any process that is a child of the shell rather than one with a specific PID, and we also don't want it to block the shell. To get the desired behavior, we pass $-1$ as the PID to instruct the function to return for any child process. Additionally, we pass `WNOHANG | WUNTRACED` as the option parameter, which instructs it not to block the shell and to return $-1$ if no child processes have changed state. By placing this call in a loop we can successfully process all children that have changed state:
```c
pid_t chld_pid;
int status;
while ((chld_pid = waitpid(-1, &status, WNOHANG | WUNTRACED)) > 0) {
// Process changes
}
```
A reference to `status` is passed to `waitpid` which updates its value to reflect the state of the child process. We can use some built-in macros to extract this information and learn about the child's state:
```c
pid_t chld_pid;
int status;
while ((chld_pid = waitpid(-1, &status, WNOHANG | WUNTRACED)) > 0) {
if (WIFEXITED(status)) {
// Child exited normally
}
if (WIFSIGNALED(status) && WTERMSIG(status) == SIGINT) {
// Child was terminated
}
if (WIFSTOPPED(status) && WSTOPSIG(status) == SIGTSTP) {
// Child was stopped
}
}
```
Now that the signal handlers are complete, we need a way of blocking the shell until the foreground process changes state. We can do this by creating a new function that uses the same `waitpid` function (this time utilizing the `WUNTRACED` option) and blocks until the process exits, terminates, or stops:
```c
void waitfg(pid_t pid, sigset_t reset_mask) {
sigsuspend(&reset_mask);
int status, res;
// Wait for foreground_job process to exit or stop
while ((res = waitpid(pid, &status, WUNTRACED)) > 0 && !WIFEXITED(status) && !WIFSTOPPED(status) && !WIFSIGNALED(status)) {
sigsuspend(&reset_mask);
}
if (res <= 0) {
return;
}
if (WIFEXITED(status)) {
// Process exited normally
}
// Check the new status of the job
if (WIFSTOPPED(status)) {
// Process was stopped
}
if (WIFSIGNALED(status)) {
// Process was terminated
}
}
```
As previously discussed, a child process can finish running at any point, even before the shell is ready for it. This could lead to the signal handler modifying the shell's bookkeeping before it's been updated with information about the new job. To prevent the `SIGCHLD` handler from being called prematurely, the shell needs to block `SIGCHLD` signals before creating a new process. This means that by the time `waitfg` is called, the shell won't be able to receive `SIGCHLD` signals. Without these signals, there is no way of knowing when the job is finished. `sigsuspend` offers an elegant solution to this problem by unblocking signals, then re-blocks them after any invoked signal handler finishes running. To specify which signals should be blocked, you use a `sigset_t`:
```c
sigset_t chld_mask, prev_mask;
// Create mask
sigemptyset(&chld_mask);
// Add SIGCHLD to mask
sigaddset(&chld_mask, SIGCHLD);
// Block
sigprocmask(SIG_BLOCK, &chld_mask, &prev_mask)
```
At this point we have created all the required signal handlers, a way of waiting for the foreground process to change state, and protection against race conditions; we can finally start creating the child processes. To create a child process, use the `fork` function. `fork` returns twice: once in the parent process and once in the child. The parent is returned the PID of the new process and the child is returned $0$. This provides an easy way of distinguishing between which process you're currently in:
```c
pid_t child_pid;
if ((child_pid = fork()) == 0) {
// In child process
} else {
// In parent process
}
```
There are a few things to do in the child process. First, its process group ID needs to be changed so that it is no longer in the same group as the shell. This is required in order for the shell to forward signals to the child process and any children that it may have. This is done by calling `setpgid(pid_t pid, pid_t gid)` with `pid` and `gid` set to $0$, instructing the function to set the process' GID to be the same as its PID. Next, `SIGCHLD` signals need to be unblocked because the shell's state was copied to the new process, including its signal mask. This is done again with the `sigprocmask` function. Finally, we can use `execve(const *char pathname, char *const argv[], char *const wenvp[])` to run a program. The user-provided command line will be parsed and passed as the path name and arguments parameters. All of this comes together to look like this:
```c
setpgid(0, 0);
sigprocmask(SIG_UNBLOCK, &chld_mask, NULL);
execve(argv1[0], argv1, NULL);
```
In the parent process we need to complete shell-related bookkeeping, update the child's GID (updating the GID in both the parent and child prevents data discrepancies caused by concurrency issues), then unblock `SIGCHLD` signals:
```c
// Bookkeeping...
// Again, 0 means 'same as PID'
setpgid(child_pid, 0);
sigprocmask(SIG_UNBLOCK, &chld_mask, NULL);
```
This is everything you need to begin running background processes. To run something in the foreground, everything is the same except for a call to `waitfg`:
```c
// Bookkeeping...
// Again, 0 means 'same as PID'
setpgid(child_pid, 0);
sigprocmask(SIG_UNBLOCK, &chld_mask, NULL);
waitfg(child_pid, prev_mask);
```
Now we can create new processes!
## Viewing and Managing Jobs
Viewing and managing jobs is substantially more simple than creating processes. In my shell, I set up the command line parser to perform special actions for three built-in commands:
- `jobs`
- `fg`
- `bg`
The `jobs` command is simple: it lists off all current jobs and their status. In the context of a shell, a job is a group of related processes. Any time you enter a command, the processes that are created as a result belong to a single job. This is the simplest of the built-in commands because all you need to do is have a way of tracking all the jobs that are created. I tracked jobs using a list of structs, with several helper functions for accessing and modifying them:
```c
struct job_t {
pid_t pid;
int jid;
int state;
char cmdline[MAXLINE];
};
struct job_t jobs[MAXJOBS];
struct job_t *getjobpid(struct job_t *jobs, pid_t pid){...}
void changestatepid(struct job_t *jobs, pid_t pid, int state){...}
// And several other helpers
```
To keep this array updated, we need to create a new entry every time the user creates a new process, and update/remove entries whenever we receive an update via the `SIGCHLD` signal handler. Now, when the shell receives the `jobs` command, it simply needs to print every job it has record of.
Importantly, every job is assigned a unique job id that is printed next to its entry when you run `jobs`. You can use this value alongside `fg` and `bg` to move a process to the foreground or background. This is done using the `kill(pid_t pid, int sig)` function which sends a signal to the process with a specific PID. Since we already have an array of all the current jobs, we can find the entry with a specific job ID and get that job's associated process' ID. We can then use `kill` to send the `SIGCONT` signal to the process, which instructs it to resume. This process is identical for both `fg` and `bg` except for additional calls to `waitfg` and `sigemptyset` in `fg`:
```c
void do_bg(int jid) {
struct job_t *job;
// If job can't be found or is already running in the GB, return
if ((job = getjobjid(jobs, jid)) == NULL || job->state == BG)
return;
// Change state to BG and send SIGCONT signal
changestatepid(jobs, job->pid, BG);
kill(job->pid, SIGCONT);
}
void do_fg(int jid) {
struct job_t *job;
if ((job = getjobjid(jobs, jid)) == NULL)
return;
// Change state to FG, then send SIGCONT signal
changestatepid(jobs, job->pid, FG);
sigset_t mask, prev;
sigemptyset(&mask);
sigaddset(&mask, SIGCHLD);
sigprocmask(SIG_BLOCK, &mask, &prev);
kill(job->pid, SIGCONT);
waitfg(job->pid, prev);
sigprocmask(SIG_UNBLOCK, &mask, NULL);
}
```
Now we can print information about all jobs being managed by the shell and move them between the foreground and background!
## Piping Between Processes
Piping between jobs can be accomplished using the `pipe(int pipefd[2])` function, which creates a Linux pipe (temporary data channel) and stores the read and write file descriptors in an array such that `fd[0]` will contain the read file descriptor and `fd[1]` will contain the write file descriptor. If you create a pipe in the shell process and then fork into two children, both of the child processes will have access to the file descriptor array.
By redirecting the standard output of the first process and the standard input of the second process to the pipe, data will pass from one to the other. To do this, we use `dup2(int oldfd, int newfd)` to make a copy of `oldfd` and stores it at `newfd` such that `oldfd` and `newfd` can then be used interchangeably. In our case, this means calling `dup2(pipe_fd[1],1)` in the first process to make its standard output to write to the pipe. Similarly, in the second process we call `dup2(pipe_fd[0],0)` to redirect standard input to the pipe.
The only other crucial change is for both processes to close the pipe file descriptors that they aren't using. This needs to be done because the second process will continue trying to read from the pipe until it receives an end of file value, which won't happen until both processes close their write file descriptors. The operating system sends the EOF once all write file descriptors have been closed, because at that point is is impossible for any program to write to the pipe. Therefore, if you don't close the write file descriptors you create a deadlock, but if they are closed correctly the data will be passed and the programs will terminate as expected.
In the shell process, there is only one modification that needs to be made when running piped jobs: wait for two processes instead of one. If you were to only call `wait` (a simplified version of `waitpid` that blocks the current process and waits for any child process to change state) once, your shell process may resume after the first process completes, which is not the expected behavior. Fortunately, a second `wait` call is all that is needed to resolve this problem.
Everything discussed in previous sections regarding un-piped processes still applies here. Putting everything together results in the following code:
```c
// Block SIGCHLD signals
sigset_t chld_mask, prev_mask;
sigemptyset(&chld_mask);
sigaddset(&chld_mask, SIGCHLD);
sigprocmask(SIG_BLOCK, &chld_mask, &prev_mask);
if ((child_pid = fork()) == 0) {
setpgid(0, 0);
int pipe_fd[2];
// Create the pipe
pipe(pipe_fd);
// Child A
if ((child_pid = fork()) == 0) {
// Redirect stdout to pipe
dup2(pipe_fd[1], 1);
// Close to prevent deadlock
close(pipe_fd[0]);
close(pipe_fd[1]);
sigprocmask(SIG_UNBLOCK, &chld_mask, NULL);
execve(argv1[0], argv1, NULL);
}
// Child B
if ((child_pid = fork()) == 0) {
// Redirect stdin to pipe
dup2(pipe_fd[0], 0);
close(pipe_fd[0]);
close(pipe_fd[1]);
sigprocmask(SIG_UNBLOCK, &chld_mask, NULL);
execve(argv2[0], argv2, NULL);
}
// Shell process
// Shell also needs to close pipe FD's to prevent a deadlock
close(pipe_fd[0]);
close(pipe_fd[1]);
sigprocmask(SIG_UNBLOCK, &chld_mask, NULL);
int status;
// Wait for both processes
wait(&status);
wait(&status);
exit(status);
```
And that's it! Through utilization of useful functions like `fork`, `waitpid`, `pipe`, and many more, we have a fully functional shell that can effectively create and manage jobs.