Chapter 6: Processes

In this chapter, we look at the structure of a process, paying particular attention to the layout and contents of a process’s virtual memory. We also examine some of the attributes of a process. In later chapters, we examine further process attributes (for example, process credentials in Chapter 9, and process priorities and scheduling in Chapter 35). In Chapters 24, 25, 26, and 27, we look at how processes are created, how they terminate, and how they can be made to execute new programs.
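Two of the attributes covered in this chapter, the process IDs and the environment list, can be inspected with a few lines of C. The following is only a sketch: the helper names count_environment() and describe_process() are made up for this post and are not taken from the book.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

extern char **environ;          /* NULL-terminated list of "NAME=value" strings */

/* Hypothetical helper: count the entries in the environment list. */
int count_environment(void)
{
    int n = 0;
    for (char **ep = environ; ep != NULL && *ep != NULL; ep++)
        n++;
    return n;
}

/* Hypothetical helper: print the process ID, parent process ID, and
   number of environment entries to the given stream. */
void describe_process(FILE *out)
{
    fprintf(out, "PID=%ld PPID=%ld environment entries=%d\n",
            (long) getpid(), (long) getppid(), count_environment());
}
```

Calling describe_process(stdout) from main() prints one line; the values naturally differ from run to run.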

6 Processes
6.1 Processes and Programs
6.2 Process ID and Parent Process ID
6.3 Memory Layout of a Process
6.4 Virtual Memory Management
6.5 The Stack and Stack Frames
6.6 Command-line Arguments (argc, argv)
6.7 Environment List
6.8 Performing a Nonlocal Goto: setjmp() and longjmp()
6.9 Summary
6.10 Exercises

Chapters 24 to 28 are in copyedit

Chapters 20 and 21 came back from copyedit. Chapters 24 to 28 have gone to copyedit.


Thanks, LWN

The kind folks at LWN.net picked up on this blog. Given the relatively small amount of publicity I've made so far for the blog, their article easily generated the best day of traffic so far. Thanks, LWN!

Chapter 5: File I/O: Further Details

In this chapter, we extend the discussion of file I/O that we started in the previous chapter.

In continuing the discussion of the open() system call, we explain the concept of atomicity--the notion that the actions performed by a system call are executed as a single uninterruptible step. This is a necessary requirement for the correct operation of many system calls.
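A common illustration of atomicity is exclusive file creation. The sketch below assumes a helper name, create_exclusively(), invented for this post; the flags themselves are the standard ones.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create `path` only if it does not already exist.
   Returns a file descriptor on success, or -1 with errno set
   (EEXIST if the file was already there). */
int create_exclusively(const char *path)
{
    /* O_CREAT | O_EXCL asks the kernel to perform "check that the file
       does not exist" and "create it" as one step, so no other process
       can create the file between the check and the creation. */
    return open(path, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR);
}
```

If two processes call this helper with the same pathname at the same time, at most one of them gets a descriptor back; the other sees EEXIST. Doing the existence check as a separate call before open() would leave a window for a race.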

We introduce another file-related system call, the multipurpose fcntl(), and show one of its uses: fetching and setting open file status flags.
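As a rough sketch of that use of fcntl(), the following fetches the status flags with F_GETFL and modifies them with F_SETFL. The helper names enable_append() and is_writable() are assumptions made for this example.

```c
#include <fcntl.h>

/* Hypothetical helper: turn on O_APPEND for an already open descriptor.
   Returns 0 on success, -1 on error (errno is set by fcntl()). */
int enable_append(int fd)
{
    int flags = fcntl(fd, F_GETFL);         /* fetch current status flags */
    if (flags == -1)
        return -1;
    flags |= O_APPEND;                      /* keep the others, add one */
    return fcntl(fd, F_SETFL, flags) == -1 ? -1 : 0;
}

/* Hypothetical helper: report whether the descriptor was opened for
   writing. Returns 1 or 0, or -1 on error. */
int is_writable(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    /* The access mode is not a single bit, so mask it out with
       O_ACCMODE and compare. */
    int mode = flags & O_ACCMODE;
    return mode == O_WRONLY || mode == O_RDWR;
}
```

This is handy when the descriptor was opened by someone else (for example, one of the standard descriptors inherited from the shell), so the flags cannot simply be passed to open().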

Next, we look at the kernel data structures used to represent file descriptors and open files. Understanding the relationship between these structures clarifies some of the subtleties of file I/O discussed in subsequent chapters. Building on this model, we then explain how to duplicate file descriptors.
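One visible consequence of this model is that a duplicated descriptor shares its file offset with the original. The function below is a small experiment written for this post (the name offset_seen_by_duplicate() is hypothetical), not code from the book.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical experiment: write through `fd`, then ask a duplicate of
   `fd` where the file offset is. Returns that offset, or -1 on error. */
off_t offset_seen_by_duplicate(int fd, const char *data, size_t len)
{
    int copy = dup(fd);                 /* refers to the same open file */
    if (copy == -1)
        return -1;

    if (write(fd, data, len) != (ssize_t) len) {
        close(copy);
        return -1;
    }

    /* The offset lives in the shared open file, not in the descriptor,
       so the write made through `fd` is visible through `copy`. */
    off_t pos = lseek(copy, 0, SEEK_CUR);
    close(copy);
    return pos;
}
```

On a freshly created file, writing five bytes through the original descriptor makes the duplicate report an offset of 5.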

We then consider some system calls that provide extended read and write functionality. These system calls allow us to perform I/O at a specific location in a file without changing the current file offset, and to transfer data to and from multiple buffers in a program.
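A minimal sketch of both ideas, using writev() to gather two buffers into one write and pread() to read at a given position. The wrapper names write_record() and read_at() are invented for this example; fcntl.h is included only for the open() flags a caller will need.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: write a header and a body with a single gathered
   write. Returns the number of bytes written, or -1 on error. */
ssize_t write_record(int fd, const char *header, const char *body)
{
    struct iovec iov[2];

    iov[0].iov_base = (void *) header;  /* writev() only reads the buffers */
    iov[0].iov_len = strlen(header);
    iov[1].iov_base = (void *) body;
    iov[1].iov_len = strlen(body);

    return writev(fd, iov, 2);
}

/* Hypothetical helper: read up to `len` bytes starting at `offset`,
   leaving the descriptor's current file offset unchanged. */
ssize_t read_at(int fd, void *buf, size_t len, off_t offset)
{
    return pread(fd, buf, len, offset);
}
```

Because pread() does not move the file offset, it is convenient when several threads share one descriptor: no thread can disturb the position another one is relying on.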

We briefly introduce the concept of nonblocking I/O, and describe some extensions provided to support I/O on very large files.

Since temporary files are used by many system programs, we’ll also look at some library functions that allow us to create and use temporary files with randomly generated unique names.
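One such function is mkstemp(). The sketch below wraps it in a helper, open_scratch_file(), whose name and /tmp location are assumptions made for this post.

```c
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: create a scratch file that has no name once this
   function returns. Returns an open descriptor, or -1 on error. */
int open_scratch_file(void)
{
    /* mkstemp() overwrites the trailing "XXXXXX" with a unique string,
       so the template must be a modifiable array, not a string literal. */
    char tmpl[] = "/tmp/scratchXXXXXX";

    int fd = mkstemp(tmpl);
    if (fd == -1)
        return -1;

    /* Removing the name right away means the file disappears
       automatically when the descriptor is closed. */
    unlink(tmpl);
    return fd;
}
```

The returned descriptor is open for reading and writing, so the caller can write data, seek back to the start, and read it again.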

5 File I/O: Further Details
5.1 Atomicity and Race Conditions
5.2 File Control Operations: fcntl()
5.3 Open File Status Flags
5.4 Relationship Between File Descriptors and Open Files
5.5 Duplicating File Descriptors
5.6 File I/O at a Specified Offset: pread() and pwrite()
5.7 Scatter-gather I/O: readv() and writev()
5.8 Truncating a File: truncate() and ftruncate()
5.9 Nonblocking I/O
5.10 I/O on Large Files
5.11 The /dev/fd Directory
5.12 Creating Temporary Files
5.13 Summary
5.14 Exercises


Chapter 4: File I/O: The Universal I/O Model

With this chapter, we start to look in earnest at the system call API. We start with files, since they are central to the Unix philosophy. The focus of this chapter is the system calls used for performing file input and output.

We begin by introducing the concept of a file descriptor, and then look at the system calls that constitute the so-called universal I/O model. These are the system calls that open and close a file, and read and write data. In addition, we look at the system call that is used to seek to a random location in a file.

We focus on I/O on disk files. However, much of the material covered here is relevant for later chapters, since the same system calls are used for performing I/O on all types of files, such as pipes and terminals.
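To give a feel for the model, here is a sketch of a copy loop that uses nothing but read() and write(). The function name copy_fd() is made up for this post, and the loop deliberately ignores refinements such as restarting after a signal interrupts a call.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: copy everything from in_fd to out_fd.
   Returns the total number of bytes copied, or -1 on error. */
long copy_fd(int in_fd, int out_fd)
{
    char buf[4096];
    long total = 0;
    ssize_t n;

    while ((n = read(in_fd, buf, sizeof buf)) > 0) {
        ssize_t done = 0;
        while (done < n) {              /* write() may transfer less than asked */
            ssize_t w = write(out_fd, buf + done, (size_t) (n - done));
            if (w == -1)
                return -1;
            done += w;
        }
        total += n;
    }

    return n == -1 ? -1 : total;        /* n == 0 means end of file */
}
```

The same function works unchanged whether the descriptors refer to disk files, pipes, or terminals, which is the point of the universal I/O model.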

We continue the discussion of file I/O with further details in Chapter 5. One other aspect of file I/O, buffering, is complex enough to deserve its own chapter. In Chapter 13, we cover I/O buffering in the kernel and in the stdio library.

4 File I/O: The Universal I/O Model
4.1 Overview
4.2 Universality of I/O
4.3 Opening a File: open()
4.4 Reading from a File: read()
4.5 Writing to a File: write()
4.6 Closing a File: close()
4.7 Changing the Current File Offset: lseek()
4.8 Operations Outside the Universal I/O Model: ioctl()
4.9 Summary
4.10 Exercises


Chapter 3: System Programming Concepts

This chapter covers various topics that are prerequisites for system programming. We begin by introducing system calls and detailing the steps that occur during their execution. We then consider library functions and how they differ from system calls, and couple this with a description of the (GNU) C library.

Whenever we make a system call or call a library function, we should always check the return status of the call in order to determine if it was successful. We describe how to perform such checks, and present a set of functions that are used in most of the example programs in this book to diagnose errors from system calls and library functions.
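The general pattern looks something like the sketch below. Note that open_or_report() is a stand-in written for this post; it is not one of the error-handling functions used in the book's example programs.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: open `path` for reading. On failure, print a
   diagnostic to stderr and return -1 with errno preserved. */
int open_or_report(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1) {
        int saved = errno;      /* later calls may overwrite errno */
        fprintf(stderr, "open %s: %s\n", path, strerror(saved));
        errno = saved;          /* let the caller inspect the cause */
    }
    return fd;
}
```

The two things to notice are that the return value is tested first (errno is meaningful only after a call has reported failure), and that errno is saved before calling anything else that might change it.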

We conclude by looking at various issues related to portable programming, specifically the use of feature test macros and the standard system data types defined by SUSv3.

3 System Programming Concepts
3.1 System Calls
3.2 Library Functions
3.3 The Standard C Library; The GNU C Library (glibc)
3.4 Handling Errors from System Calls and Library Functions
3.5 Notes on the Example Programs in This Book
        3.5.1 Command-line Options and Arguments
        3.5.2 Common Functions and Header Files
3.6 Portability Issues
        3.6.1 Feature Test Macros
        3.6.2 System Data Types
        3.6.3 Miscellaneous Portability Issues
3.7 Summary
3.8 Exercises


Chapter 2: Fundamental Concepts

This chapter introduces a range of concepts related to Linux system programming. It is intended for readers who have worked with other operating systems, or who have only limited experience with Linux or another Unix implementation.

2 Fundamental Concepts
2.1 The Core Operating System: The Kernel
2.2 The Shell
2.3 Users and Groups
2.4 Single Directory Hierarchy, Directories, Links, and Files
2.5 File I/O Model
2.6 Programs
2.7 Processes
2.8 Memory Mappings
2.9 Static and Shared Libraries
2.10 Interprocess Communication and Synchronization
2.11 Signals
2.12 Threads
2.13 Process Groups and Shell Job Control
2.14 Sessions, Controlling Terminals, and Controlling Processes
2.15 Pseudoterminals
2.16 Date and Time
2.17 Client-server Architecture
2.18 Realtime
2.19 The /proc File System
2.20 Summary


LCA 2010 Proposals

I've submitted a talk and some tutorial proposals for LCA 2010, which takes place 18-23 January in Wellington, New Zealand. The tutorials are all related to subjects in my book. I'll find out in September whether any of the proposals were accepted. Either way, I'm hoping to be at LCA 2010.


Chapter 23 is in copyedit

One more chapter sent off to the publisher, to make sure my copyeditor does not run out of things to do.

Chapters 20, 21, and 22 are in copyedit

Chapters 17, 18, and 19 came back from copyedit. The copyeditor has chapters 20, 21, and 22 now.


Chapter 1: History and Standards

Chapter 1 is in two main parts. The first part presents short histories of Unix, the C programming language, the GNU project, and the Linux kernel. The second part looks at the various standards that have evolved as C and Unix grew more popular.

Here's the chapter ToC:

1 History and Standards
1.1 A Brief History of Unix and C
1.2 A Brief History of Linux
        1.2.1 The GNU Project
        1.2.2 The Linux Kernel
1.3 Standardization
        1.3.1 The C Programming Language
        1.3.2 The First POSIX Standards
        1.3.3 X/Open Company and The Open Group
        1.3.4 SUSv3 and POSIX.1-2001
        1.3.5 SUSv4 and POSIX.1-2008
        1.3.6 Unix Standards Timeline
        1.3.7 Implementation Standards
        1.3.8 Linux, Standards, and the Linux Standard Base
1.4 Summary


64 chapters

The book is not small...

Currently, there are 64 chapters (a nice round number—0100 or 0x40—for computer scientists), 116 diagrams, 85 tables, and around 250 program listings spread over roughly 1500 pages (the precise number of pages will depend on how things fall out in typesetting).

Structurally, the book is divided into 8 parts (chapter numbers in parentheses):
  1. Background and concepts (1 to 3).
  2. Fundamental features of the Linux API (4 to 12).
  3. More advanced features of the Linux API (13 to 23).
  4. Processes, programs, and threads (24 to 33).
  5. Advanced process and program topics (34 to 42).
  6. Interprocess communication and synchronization (43 to 55, though it's shoehorning things a little to include chapters 49 and 50 here...).
  7. Sockets and network programming (56 to 61).
  8. Advanced I/O topics (62 to 64).
Here's the list of chapters (some titles might yet change):
1. History and Standards
2. Fundamental Concepts
3. System Programming Concepts
4. File I/O: The Universal I/O Model
5. File I/O: Further Details
6. Processes
7. Memory Allocation
8. Users and Groups
9. Process Credentials
10. Times and Dates
11. System Limits and Options
12. Retrieving System and Process Information
13. File I/O Buffering
14. File Systems
15. File Attributes
16. Extended Attributes
17. Access Control Lists
18. Directories and Links
19. Monitoring File Events with inotify
20. Signals: Fundamental Concepts
21. Signals: Signal Handlers
22. Signals: Advanced Features
23. Timers and Sleeping
24. Process Creation
25. Process Termination
26. Monitoring Child Processes
27. Program Execution
28. Process Creation and Program Execution in More Detail
29. Threads: Introduction
30. Threads: Thread Synchronization
31. Threads: Thread Safety and Per-thread Storage
32. Threads: Thread Cancellation
33. Threads: Further Details
34. Process Groups, Sessions, and Job Control
35. Process Priorities and Scheduling
36. Process Resources
37. Daemons
38. Writing Secure Privileged Programs
39. Capabilities
40. Login Accounting
41. Fundamentals of Shared Libraries
42. Advanced Features of Shared Libraries
43. Interprocess Communication Overview
44. Pipes and FIFOs
45. Introduction to System V IPC
46. System V Message Queues
47. System V Semaphores
48. System V Shared Memory
49. Memory Mappings
50. Virtual Memory Operations
51. Introduction to POSIX IPC
52. POSIX Message Queues
53. POSIX Semaphores
54. POSIX Shared Memory
55. File Locking
56. Sockets: Introduction
57. Sockets: Unix Domain
58. Sockets: Fundamentals of TCP/IP Networks
59. Sockets: Internet Domains
60. Sockets: Server Design
61. Sockets: Advanced Topics
62. Terminals
63. Alternative I/O Models
64. Pseudoterminals

What's the book about?

The obvious question is, what's the book about? Below are extracts from the current draft of the preface (which could well change in the next few months).

This book describes the Linux API (application programming interface)—the system calls, library functions, and other low-level interfaces provided by Linux, a widely used free implementation of the long-established Unix operating system. These interfaces are used, directly or indirectly, by every program that runs on Linux. Programs that explicitly use these interfaces are commonly called system programs, and include applications such as shells, editors, windowing systems, terminal emulators, file managers, compilers, database management systems, virtual machines, network servers, and much other software that is employed on a daily basis on Linux systems. The book is written to be used both as an introductory guide for readers new to the topic of system programming, and as a comprehensive reference for experienced system programmers.

We describe most Linux system calls, along with many related C library functions (as implemented in the GNU C library, glibc). These interfaces allow programs to perform tasks such as: low-level file I/O; creation of new processes and execution of programs; retrieval and modification of process attributes; communication and synchronization between processes and threads on the same host (computer); and communication between processes residing on different hosts connected via a TCP/IP network. We usually illustrate the use of each interface with a small program.

Although this book focuses on Linux, it gives careful attention to standards and portability issues, and clearly indicates those parts of the discussion that refer to Linux-specific details. As such, the book also provides a comprehensive description of the Unix/POSIX API and thus can be used by programmers writing applications targeted at other Unix systems or intended to be portable across multiple systems.

Intended audience and background required

This book is primarily aimed at the following readers:
  • programmers and software designers building applications for Linux, other Unix systems, or other POSIX-conformant systems;
  • programmers porting applications between Linux and other Unix implementations or between Linux and other operating systems;
  • teachers and advanced students teaching or learning Linux or Unix system programming; and
  • anyone (e.g., system managers and "power users") wishing to gain a greater understanding of the Linux/Unix API and of how various pieces of system software are implemented.
We assume that the reader has some prior programming experience, although not necessarily on a Linux or Unix system. No previous system programming experience is required. It is assumed that the reader has a reading knowledge of C and knows how to use the shell and common Linux or Unix commands. For readers coming to this book from other operating systems, a programmer-oriented review of fundamental concepts of Linux/Unix systems is provided in Chapter 2.

Standards and portability

Although this book describes the features of a specific Unix implementation, we take special care to consider portability issues. Frequent reference is made to relevant standards, in particular, the combined POSIX.1-2001 and Single UNIX Specification version 3 (SUSv3) standard; we also note changes in the recent revision of that standard, the combined POSIX.1-2008 and SUSv4. For features that are not standardized, we try to indicate the range of differences on other Unix implementations. We also highlight those major features of Linux that are implementation-specific, as well as pointing out various minor differences between the implementation of system calls and library functions on Linux versus other Unix implementations. (Conversely, where we do not indicate a feature as being Linux-specific, the reader can normally assume that it is a standard feature that appears on most or all Unix implementations.)

Linux and Unix

This book could have been purely about Unix system programming: most features found on other Unix implementations are also present on Linux, and vice versa. However, while emphasizing that writing portable applications is an important goal, I have chosen to also cover a range of Linux-specific extensions to the standard Unix API. The rationale for this approach is as follows:
  • The use of nonstandard extensions is sometimes essential, either for performance reasons, or to access functionality that is unavailable in the standard Unix API. (All Unix implementations provide nonstandard extensions for these reasons.)
  • I chose to focus on Linux because it is the implementation that I find to be the most important and interesting for a combination of reasons including its openness, licensing model, development model, and steadily increasing market share. (Ultimately, these reasons led me to become an active participant in Linux development.)
So, while this book is designed to be useful for programmers on all Unix implementations, it also provides full coverage of programming features specific to Linux. These features include: epoll, a mechanism for obtaining notification of file I/O events; inotify, a mechanism for monitoring changes in files and directories; capabilities; extended attributes; the clone() system call; the /proc file system; the timerfd API; and Linux-specific details of the signals, threads, shared libraries, and sockets implementations.

Usage and organization

This book is written to be used in two ways:
  • As a tutorial introduction to the Linux/Unix API. The book can (if you have a lot of time) be read linearly: later chapters build on material presented in earlier chapters, with forward references minimized as far as possible.
  • As a comprehensive reference to the Linux/Unix API, for use by experienced Linux and Unix programmers. An extensive index and heavy use of cross-referencing allow topics to be read in random order.
I’ll say more about the logical structure and contents of the book in my next post.


At last... contract signed!

Well, already about three months ago, in fact. I'm signed up with the rather impressive No Starch Press, and all going well, my book will land in the shops in the first half of 2010.

Most of the hard work has already been done (the content is generally finished, though there'll probably still be updates, depending on what happens in the Linux kernel-userspace API in the next 6 months). And by now we're well into production, with me and NSP exchanging a lot of mails each week as we copyedit chapters. (I didn't think this phase would be so much work, but then I could say that about every part of working on the book so far...)

Me, contract in hand, with Bill Pollock, founder of No Starch Press, outside the No Starch Press offices on a cold, sunny San Francisco day.