Minix Kernel Internals

If I recall correctly, I last taught this course in 2008 at a deemed university in Andhra Pradesh, India. I think I taught it for two or three batches/years. The class was Ist M.Tech. (Comp. Sc.) and the class size was typically around 14 to 18 students.

Essentially this was a course designed to give students a feel of kernel internals design and code.

One interesting discussion I had with other faculty was on whether we should have a Minix Kernel Internals course or a Linux Kernel Internals course. The important points related to my view of that discussion are available here.

Prerequisites for the Course

C programming including debugging skills, user-level knowledge of Unix (Unix commands like ls, chmod, mkdir, kill etc.) and theoretical knowledge of modern operating systems like Unix/Linux. Note that students would typically learn the theory of modern operating systems in a separate theory course taken either prior to this course or in parallel with it.

Course Book

“Operating Systems – Design and Implementation”, Second edition by Tanenbaum and Woodhull, http://minix1.woodhull.com/osdi2/.

A later edition, the 3rd edition, is available: http://www.flipkart.com/operating-systems-design-implementation-3rd/p/itmdytczcxvugj3h. But I do not know whether it is appropriate to use that book for this course material. [BTW, instructor resources (which would be useful for students too, I guess) are available on request here: http://www.pearsonhighered.com/educator/academic/product/0,3110,0131429388,00.html#resources. This course site of a US university, http://www.cise.ufl.edu/~nemo/cop4600/, seems to be using the 3rd edition book as the main course book. The site has slides which can be downloaded.]

Note: While the course book has some theory of Operating Systems as well, this course focuses only on design and implementation of Minix. A separate theory course on Operating Systems is done by students either prior to this course or in parallel to this course.

Minix website: http://www.minix3.org/

Teaching with Minix Howto (has links to Minix course websites): http://minix1.woodhull.com/teaching/

Chapter Wise Study Notes

Chapter 1 – Introduction

If time permits the whole chapter should be read. Sections 1.2.5 – History of Minix, 1.3 – Operating System Concepts, 1.4 – System Calls are important sections which MUST be read (they are a reading assignment for this course).

Given below are some notes of the important parts of this chapter for this course.

History of MINIX

  • Tanenbaum writes MINIX from scratch using the same system interface as UNIX but with an implementation completely different from AT&T’s UNIX, thereby avoiding licensing issues and making MINIX suitable for study in educational institutions (like universities). MINIX stands for mini-UNIX.
  • Students can dissect a real operating system – MINIX – like biology students dissect frogs.
  • MINIX code is meant to be studied and be readable unlike UNIX where the objective of the design was efficiency.
  • This book’s MINIX version is based on POSIX standard.
  • MINIX is written in ‘C’ programming language.
  • MINIX initial implementation was for the IBM PC; subsequently ported to other machines/architectures like Macintosh and SPARC
  • Running MINIX similar to running UNIX; commands like ls, cat, grep and make as well as the shell are available.
  • Finnish student, Linus Torvalds, wrote a MINIX clone called LINUX to be a “feature-heavy production system” as against an educational tool that MINIX is.

Operating System Concepts

  • System calls are the interface between OS and user programs

Processes

  • A program in execution is a process
  • A process has an address space – memory locations which the program can read from and write to
  • Address space is divided into code – executable program, data and stack; (stack is also data but of a different type)
  • Each process has a set of registers including program counter and stack pointer needed to run the program (using the underlying processor and processor instructions).
  • In a time-sharing system OS suspends one process, saves its process state, and starts/re-starts another process; the suspended process is re-started after some time at which time first its process state must be restored; process table entry stores the process state when the process is suspended
  • Command interpreter or shell has to execute programs; as part of program execution and control of that execution it needs a system call to create a process which will execute the program; on the program finishing a system call is needed for the process to terminate
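The last point above, a shell creating a process to run a program and then collecting its termination, can be sketched with the POSIX calls fork, execvp and waitpid. This is only a minimal illustration and not code from the course book; the function name run_command is my own.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the program named by argv[0] with the NULL-terminated argument
 * vector argv. Returns its exit status, or -1 if it could not be
 * started, could not be waited for, or did not exit normally. */
int run_command(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();               /* system call: create a process */
    if (pid < 0)
        return -1;

    if (pid == 0) {                   /* child: becomes the program */
        execvp(argv[0], argv);
        _exit(127);                   /* reached only if exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0) /* parent: wait for termination */
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A shell would call something like this once per command line, after splitting the line into words.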

Files

  • Broad category of system calls that deal with the file system
  • system calls are available to create and remove files; to open, read, write and close files
  • hierarchical directory structure: system calls available to create and remove directories; put an existing file in a directory, remove an existing file from a directory; a directory itself can be a file entry in another directory thereby providing for a hierarchical directory structure
  • 9-bit file protection code giving rwx permissions for owner, group and others
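As a small illustration of the 9-bit protection code, the sketch below turns the low nine bits of a mode value into the familiar rwxrwxrwx form printed by ls -l. The function name mode_to_string is hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Fill out[0..9] with the "rwxrwxrwx" form of the low 9 bits of mode.
 * Bit 8 is owner-read and bit 0 is others-execute. */
void mode_to_string(unsigned mode, char out[10])
{
    static const char letters[] = "rwx";

    assert(out != NULL);
    memset(out, '-', 9);              /* start with no permissions */
    for (int i = 0; i < 9; i++) {
        unsigned bit = 1u << (8 - i);
        if (mode & bit)
            out[i] = letters[i % 3];
    }
    out[9] = '\0';
}
```

For example, the octal mode 0754 gives rwx for the owner, r-x for the group and r-- for others.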

Shell

  • Command interpreter is called shell; it is the primary interface between user on terminal giving commands and the operating system
  • On user login a shell is started up for the user; typical prompt is $; user types in command at this command prompt
  • input and output file redirection via < and >

Memory Layout of Process

  • A process has three segments in its address (memory) space – text (or code), data and stack; data segment grows upward and stack segment grows downward (see fig. 1-11)

Monolithic Operating System with User Mode and Kernel Mode

  • Most CPUs have (at least) two modes – kernel mode where all instructions are allowed and user mode where some instructions are prohibited
  • Special trap instruction in the CPU instruction set known as kernel call or supervisor call switches the CPU from user mode to kernel mode
  • Parameters of the kernel call/supervisor call instruction are used to identify which system call should be invoked

OS components

  • Four components of Minix OS – process management, I/O device management, memory management and file management; course book chapters are also on these lines.

Lab work: Installing VMWare on Linux and then installing Minix on VMWare. (For some years/batches the Bochs emulator was used instead of VMWare.) One great advantage of running Minix on VMWare for this course was that a Minix kernel crash (typically due to students' modified kernel code) only crashed the virtual machine. Bringing up a new Minix virtual machine (with a stable kernel) on VMWare was quick and straightforward. This contrasts very favourably with the time and effort lost when Minix crashes on a physical machine and has to be brought up again (with a stable kernel) on that machine.

This link provides info. on installing Minix on VMWare and other virtual machine/emulator software, and has some other related support info. too: http://minix1.woodhull.com/hints.html#emul-virt.

Chapter 2 – Processes

Section 2.5 – Overview of Processes in Minix and section 2.6 Implementation of Processes in Minix (pages 93 to 147) are very vital sections of the book. These are the first, and perhaps the most important, sections that delve into Minix kernel source code. These sections have to be read and understood in conjunction with the actual source code they refer to.

[The Minix source code listing is provided in the book itself as Appendix A – The Minix Source Code from page 523 to page 903 (380 pages) consisting of 27646 lines of (line numbered) code. Appendix B is an Index to Files (pages 907 and 908) which gives the filenames and starting line number in listing for files in include directory, kernel, memory manager and file system. Appendix C is an Index to Symbols (Pages 911 to 923) giving symbol name and listing line number. These symbols include function names and macros (#define).]

[I discussed these sections (2.5 and 2.6) in class and also walked through appropriate code where required. These were intense sessions and required high level of commitment from students if they wanted to really benefit from the course.]

These sections involve some assembler code. Some, if not all, of the relevant pages/sub-sections are listed below:
a) stackframe_s structure: Page 115 Lines 4537 to 4558 (Lines refer to the source code listing included as Appendix A)
b) mpx386.s Page 124
c) Section 2.6.7: Interrupt handling in MINIX, Pages 128 to 137
d) Section 2.6.10: Hardware-Dependent Kernel Support Pages 142 to 145
e) Section 2.6.11: Utilities and the Kernel Library, Pages 145 to 147

The assembler code (if I recall correctly) is Intel 386 code. So there must be some understanding of Intel 386 architecture and instructions. The students would have taken a computer architecture course in the past which would have covered Intel 386 architecture and instructions to some extent, at least.

The wikipedia page, http://en.wikipedia.org/wiki/Intel_80386, seems to give a decent overview of Intel 386.

An Intel 386 programmer’s reference manual is available here: http://intel80386.com/. From this reference these two chapters are important for understanding the assembler code (if I recall correctly) in Minix source code referenced by the course book:

Lab. Assignment: Change process scheduling algorithm – I am afraid I don’t recall the details now – could be changing scheduling from round-robin to lottery scheduling (random).
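Since I do not recall the exact assignment, the following is only a sketch of the core idea of lottery scheduling and not Minix scheduler code. Each ready process holds some tickets, a winning ticket number is drawn at random, and the process holding that ticket runs next. The function name lottery_pick and the array-of-tickets representation are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* tickets[i] is the number of tickets held by ready process i.
 * winner is a ticket number in the range [0, total tickets).
 * Returns the index of the process holding that ticket, or -1 if
 * winner is not below the ticket total. */
int lottery_pick(const unsigned *tickets, size_t nproc, unsigned winner)
{
    assert(tickets != NULL || nproc == 0);
    for (size_t i = 0; i < nproc; i++) {
        if (winner < tickets[i])
            return (int)i;            /* ticket falls in this process's range */
        winner -= tickets[i];         /* skip past this process's tickets */
    }
    return -1;
}
```

The caller would draw winner from a random number generator, for example a random value reduced modulo the ticket total.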

Assessment: Ideally assessment for this very intense chapter would involve quizzes/viva-voce to examine a student’s understanding of the design and implementation details covered by this chapter. However, I have to be honest and state that very few students of the batch had the time, inclination and/or resolve to get that deep into Minix design & code. Most students chose to focus on doing the lab. assignments where one could do some fiddling with part of the code without having got a full understanding of the design and code covered in this chapter. I could not blame the students much as the time available for this lab. course in terms of both study time and computer lab. time was limited (competition from other courses and other activities that students had to get involved with). Having said this, I must also say that the objective of exposing most students to the complexity of kernel internals of Minix got achieved with intense coverage in class of some of the design & code covered by this chapter. So, if in future, say for a IInd M.Tech. (CS) project, a bright student chose to do an OS component design & implementation project (like a filesystem implementation/port to Minix), the background exposure would have been provided by this course for the student to take up the challenge.

Chapter 3 – Input/Output

In this course we do not focus much on this chapter. However the following sections must be read as they may be useful (are useful, if I recall correctly) for better understanding of later chapters:

  • 3.4.1 Interrupt Handlers in Minix
  • 3.8.3 Overview of the Clock Driver in Minix
  • 3.10 The System Task in Minix

Chapter 4 – Memory Management

This whole chapter is an important part of this course and so the whole chapter should be read. In particular the following sections must be studied in-depth:

  • 4.6.3 Segmentation with Paging: The Intel Pentium
  • 4.7 Overview of Memory Management in Minix (including subsections)
  • 4.8 Implementation of Memory Management in Minix (including subsections)

The following topics were discussed in depth in class – fork system call implementation; Process scheduling, process table, process memory map and Translation. Viva-Voce/Quiz was conducted to test understanding of these topics by the students.

Lab. Assignment 1: Modify memory allocation algorithm from first fit to best fit.
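To make the difference between the two policies concrete, here is a small sketch. It is not the Minix memory manager code: it uses a plain array of a hypothetical struct hole instead of the memory manager's own data structures, and the function names are mine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical free-memory hole: start address and length. */
struct hole {
    unsigned base;
    unsigned len;
};

/* First fit: index of the first hole that is large enough, or -1. */
int first_fit(const struct hole *holes, size_t n, unsigned want)
{
    assert(holes != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (holes[i].len >= want)
            return (int)i;
    return -1;
}

/* Best fit: index of the smallest hole that is large enough, or -1. */
int best_fit(const struct hole *holes, size_t n, unsigned want)
{
    int best = -1;

    assert(holes != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (holes[i].len < want)
            continue;                 /* too small, cannot be used */
        if (best < 0 || holes[i].len < holes[best].len)
            best = (int)i;
    }
    return best;
}
```

First fit stops at the first usable hole, while best fit always scans every hole, which is the main cost of the change.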

Lab. Assignment 2: Add a new system call in MINIX (in MM). The system call should take in, say, two parameters, and have a return value. The system call’s service may be a very trivial one as the objective is to understand how to add a new system call to Minix (and then test it, of course).

Lab. Assignment 3: Add a new system call in MM that gets the memory map for the requested process.

 Chapter 5 – Filesystems

Due to lack of time this chapter was not covered. However, if students have the time and inclination this chapter should be read and understood. [A couple of students who underwent this course in Ist. M.Tech. (CS) took up implementation/port in Minix of a filesystem supported on Linux but not in Minix (2.0). That project involved understanding this chapter and related code very well.]


Linux Kernel Customization – Mini Course

This mini course was taught by me only once in 2008, if I recall correctly, in a deemed university in Andhra Pradesh, India. As it was not a full-fledged course I have chosen to provide details of the mini-course, and suggestions to make it a full-fledged course, as a report.

Towards the end of the Minix Kernel Internals Lab course for Ist M.Tech. (CS), for a period of 6 days this mini-course/sub-course was conducted. The following topics were covered:

  1. Building the Linux Kernel. Students were taken through the steps of building the Fedora 7 Linux kernel. Topics explained included: Need for customized kernels (embedded devices, performance optimization, usage of new kernel features which are available only as kernel patches, bug fixes), boot process, grub, linux kernel executable files, statically linked kernel and dynamically loaded modules, related commands like uname, lsmod, /proc pseudo-filesystem, lspci etc., rpms needed for building kernel, configuring the kernel, installing the newly built kernel.

Students built two to three versions of the Fedora 7 kernel: A standard kernel similar to the Fedora distribution kernel and one or two smaller kernels. Some groups of students were able to reduce the kernel dynamic modules from 441 MB to less than 100 MB. They also were able to marginally reduce the statically linked kernel. As time was short we stopped at students getting a reasonable idea of what kernel customization involves.

  2. Very Simple Hacking of Linux Kernel. Kernel code was modified to include some printk statements. Students rebuilt this kernel and observed the output of their printk statements.
  3. Writing simple Linux Kernel modules. Students wrote one or two small modules and inserted them into the running kernel using insmod. A small kernel module which handles interrupts was also studied and demonstrated.
  4. Kernel patches. Quick introduction to Linux kernel patches, applying a patch and also creating a patch; importance of patches when contributing to Linux kernel community. We did not have time to do hands-on assignments for these topics (kernel patches).

Note: Evaluation of students was done by studying their assignment reports and by a Viva voce.

Linux Kernel Customization slides

Remarks

While this sub-course of 6 days (2 timetable hours per day) did give students a foot-hold in the rather intimidating area of Linux Kernel customization and Linux Kernel Programming, the period was too short to achieve any substantial goals.

I now feel that the minimum goal for such a sub-course would be the following:

• Build a minimal Linux kernel for an embedded device. Perhaps we could test that kernel using a simulator for that embedded device.

• Write a small but functional Linux device driver.

If these two goals are achieved then students will have gained something substantial. It may be of direct help for a possible device driver IInd M.Tech project. Further, students can mention it in their biodata and also talk about it during job interviews.

However, for such a sub-course the minimal period would be 4 weeks (assuming 8 timetable hours per week).

If required the course can also be made a full course by making the tasks mentioned above more complex like writing a full fledged device driver, and adding suitable topics from the resource links given below.

—- end report —

Making the Linux Kernel For Fedora 7 – Instructions and Log

Resource Links

(IBM Developer Works) Hacking the Linux 2.6 kernel, Part 1: Getting ready – http://www.cagdastopcu.com/wp-content/uploads/2010/01/l-kernelhack1-pdf.pdf

(IBM Developer Works) Hacking the Linux 2.6 kernel, Part 2: Making your first hack – http://marcelotoledo.com/wp-content/uploads/2008/04/l-kernelhack2-a4.pdf

Personal Fedora 7 Installation Guide – http://www.mjmwired.net/resources/mjm-fedora-f7.html

http://kernelnewbies.org/

http://kernelnewbies.org/KernelBuild

http://www.kernel.org

The Linux Boot Process – http://www.linuxhomenetworking.com/wiki/index.php/Quick_HOWTO_:_Ch07_:_The_Linux_Boot_Process

Modifying the Kernel to Improve Performance – http://www.linuxhomenetworking.com/wiki/index.php/Quick_HOWTO_:_Ch33_:_Modifying_the_Kernel_to_Improve_Performance

http://www.linuxfromscratch.org/

Linux Kernel Programming US university course link – http://www.cs.utexas.edu/users/ygz/378-03S/

UNP – http

Given below is a reference to slides from a US university (links obtained via Google search and so presumed to be freely accessible to anybody on the net) which are the key teaching slides for this topic:

www.cs.rpi.edu/courses/fall02/netprog/notes/web/web.ppt – till slide 39, you may ignore the later slides.

[The slides that I used when I taught the course could not be located in US university sites easily via Google Search. But I found them at the following two links:

http://www.learningace.com/doc/1990348/748ff2abb5f5a9ee6c4d5c822a3d34aa/http

http://www.powershow.com/view/beb4-NjVjY/HTTP_Hypertext_Transfer_Protocol_powerpoint_ppt_presentation]

Additional references:

HTTP Made Really Easy: http://www.jmarshall.com/easy/http/

RFC1945 – HTTP/1.0: http://www.faqs.org/rfcs/rfc1945.html

RFC2616 – HTTP/1.1: http://www.faqs.org/rfcs/rfc2616.html

RFC 2396 – Uniform Resource Identifiers (URI): Generic Syntax: http://www.faqs.org/rfcs/rfc2396.html

RFC822 – ARPA Internet Text Messages Format: http://www.faqs.org/rfcs/rfc822.html

RFC 1521 – MIME (Multipurpose Internet Mail Extensions) Part One: http://www.faqs.org/rfcs/rfc1521.html

RFC 1522 – MIME (Multipurpose Internet Mail Extensions) Part Two: http://www.faqs.org/rfcs/rfc1522.html

Tutorial – HTTP Proxy – http://www.ragestorm.net/tutorial?id=15

RFC 2617 – HTTP Authentication: Basic and Digest Access Authentication: http://www.faqs.org/rfcs/rfc2617.html

Base 64 Resources

Online encoder/decoder: http://www.paulschou.com/tools/xlate/

Source code for encode/decode: http://www.adp-gmbh.ch/cpp/common/base64.html


Assignments

 Assignment 1

Write a browser “brow” implementing HTTP GET request:       brow –d GET ipaddress /………

The server response should be displayed on stdout. There should be a debug option –d to display (on/off) the reply messages.
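As a starting hint for this assignment, the request that the browser sends is just a few lines of ASCII text. The sketch below builds a minimal HTTP/1.0 GET request into a buffer. The function name build_get_request is hypothetical, and including a Host header is my own choice (HTTP/1.0 does not require it).

```c
#include <assert.h>
#include <stdio.h>

/* Write an HTTP/1.0 GET request for path on host into buf.
 * Returns the request length, or -1 if buf is too small. */
int build_get_request(char *buf, size_t size,
                      const char *host, const char *path)
{
    assert(buf != NULL && host != NULL && path != NULL);

    /* Request line, one header, then an empty line ends the request. */
    int n = snprintf(buf, size,
                     "GET %s HTTP/1.0\r\nHost: %s\r\n\r\n", path, host);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

The returned length is what would be passed to write() on the connected socket; the reply is then read until the server closes the connection.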

UNP – Threads

Given below is a reference to slides from a US university (links obtained via Google search and so presumed to be freely accessible to anybody on the net) which are the key teaching slides for this topic:

http://www.cse.unr.edu/~mgunes/cpe401/cpe401sp09/Lecture18.ppt

[The slides that I used when I taught the course could not be located in US university sites easily via Google Search. But I found them at the following two links:

http://www.learningace.com/doc/1205385/e48a70888d6eecc154d027da7ecc4279/threads

http://www.docstoc.com/docs/111876661/Threads-Programming]

An interesting additional reference is: http://www.cs.fsu.edu/~baker/realtime/restricted/notes/pthreads.html

Show/explain the following code:

nonblock/strclifork.c     threads/strclithread.c

threads/tcpserv01.c      threads/tcpserv02.c

Mutexes

Multi-threaded program which increments a global variable incorrectly (‘No Synchronization’)

threads/example01.c

Corrected version of above program using a mutex to protect the shared variable

threads/example02.c
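For quick reference, the idea of the corrected program can be sketched as below. This is my own minimal version and not the book's example02.c; the names count_with_threads, incr and MAX_THREADS are assumptions.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_THREADS 16

static long counter;                  /* shared by all threads */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* Thread body: increment the shared counter *arg times under the mutex. */
static void *incr(void *arg)
{
    int loops = *(int *)arg;

    for (int i = 0; i < loops; i++) {
        pthread_mutex_lock(&counter_lock);
        counter++;                    /* read-modify-write is now protected */
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Start nthreads threads, wait for them, and return the final count,
 * or -1 if not all threads could be created. */
long count_with_threads(int nthreads, int loops)
{
    pthread_t tid[MAX_THREADS];
    int started = 0;

    assert(nthreads > 0 && nthreads <= MAX_THREADS && loops >= 0);
    counter = 0;
    for (; started < nthreads; started++)
        if (pthread_create(&tid[started], NULL, incr, &loops) != 0)
            break;
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return started == nthreads ? counter : -1;
}
```

Removing the lock and unlock calls gives the 'No Synchronization' behaviour, where the final count can be lower than nthreads times loops.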

Condition Variables

‘Web client and simultaneous connections’ using condition variables to specify which thread to wait for

threads/web03.c

Client/Server Design Alternatives

TCP Prethreaded Server, per Thread accept()

server/pthread07.h              server/serv07.c            server/pthread07.c

TCP Prethreaded Server, Main Thread accept()

server/pthread08.h             server/serv08.c           server/pthread08.c

Assignments

Assignment 1

Modify the Saisays server and client (stresser) programs of Assignment 4 – IntroSockets by making them ‘multi-threaded’ instead of spawning multiple processes for client connections.

Repeat the stress test against this server and perform a comparison of these results with the results obtained in the previous assignment.

Write a report giving your observations and conclusions regarding the assignment.

Assignment 2

Understand and try out the ‘mutexes’ and ‘condition variables’ examples in Chapter 26 (Threads) of the course book.

Understand and try out the different examples of client/server design alternatives examined in Chapter 30 of the book.

Assignment 3

Consider a fictitious Simple File Transfer Protocol (SFTP) which is as follows:

  • It is a line oriented ASCII protocol i.e. requests and replies both are in ASCII (not binary) and terminated by \n
  • It allows clients to access files in just one directory (called as SFTP data directory) managed by the server. It can list the files in the directory and also get a file. To simplify the protocol we will limit the files in the directory to be only text files (and no subdirectories).
  • Requests consist of just request keywords and optional parameters. They are as follows:
    • List\n
    • Get filename\n
    • Quit\n
  • Responses consist of a mandatory header and optional body.
  • Response Header always has a status line followed by a content length line as follows:
    • Status nnn\n
    • Content-Length  nnnnnn\n
  • Status code values are as follows:
    • 200      – Means OK
    • 401      – Invalid Command
    • 402      – File not found
  • All responses will have status. If the request has been handled successfully then the Status response of 200 is returned otherwise an appropriate Status error response is returned.
  • Content-Length line gives the length of the contents that follow after the header. Examples are as follows:
    • Content-Length 0\n    – For the case when there is no data that follows. Typically for error status responses and if for List there are 0 (no) files in the SFTP data directory.
    • Content-Length 246\n   – For the case when 246 bytes of data follow the header. Typically for List if the string containing the filenames including the return characters is 246 bytes and for Get if the text file whose data follows the header has a file size of 246 bytes.
  • List command lists the files in the SFTP data directory. Each filename is on a separate line ended by a \n. If there are no files in the directory then no data is returned to client (but response header is returned).
  • Get command gets the file data that it requests. E.g. Get abc.txt\n. If the file does not exist a suitable error status is returned to the client.
  • Quit command causes the server to return status of OK and Content-Length 0, after which the server closes the connection.
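The protocol described above can be handled with two small helpers: one that classifies a request line and one that formats the response header. The sketch below is only one possible approach. All names in it are hypothetical, it assumes a single space between Content-Length and the number (as in the examples above), and rejecting filenames containing '/' is my own addition to keep clients inside the SFTP data directory.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum sftp_cmd { SFTP_LIST, SFTP_GET, SFTP_QUIT, SFTP_INVALID };

/* Classify one request line, which must end in '\n'. The newline is
 * overwritten with '\0'. For Get, *arg points at the filename inside
 * line; otherwise *arg is NULL. */
enum sftp_cmd sftp_parse_request(char *line, char **arg)
{
    assert(line != NULL && arg != NULL);
    *arg = NULL;

    size_t len = strlen(line);
    if (len == 0 || line[len - 1] != '\n')
        return SFTP_INVALID;          /* not a complete line */
    line[len - 1] = '\0';

    if (strcmp(line, "List") == 0)
        return SFTP_LIST;
    if (strcmp(line, "Quit") == 0)
        return SFTP_QUIT;
    if (strncmp(line, "Get ", 4) == 0 && line[4] != '\0'
            && strchr(line + 4, '/') == NULL) {
        *arg = line + 4;
        return SFTP_GET;
    }
    return SFTP_INVALID;              /* would be answered with Status 401 */
}

/* Format the two-line response header. Returns its length, or -1 if
 * buf is too small. */
int sftp_format_header(char *buf, size_t size, int status,
                       unsigned long content_length)
{
    assert(buf != NULL);
    int n = snprintf(buf, size, "Status %d\nContent-Length %lu\n",
                     status, content_length);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

A server thread would read one line, call sftp_parse_request, then send the header followed by the body (if any).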

Write an SFTP server program which implements the application level protocol (SFTP) Simple File Transfer Protocol. (Implement only the ‘GET’ command).

The implementation should be multi-threaded with the server using prethreading to create a pool of available threads when it starts.

Implement the server program first with the main thread calling accept() and then with each thread in the pool calling accept().

Compare the performance of the two approaches using statistics like the response time of the server.

Write a report giving your observations and conclusions regarding the assignments.

Reading Assignments

Chapter 26 Threads

Chapter 30 Sections 30.11 and 30.12

Additional Reading

Implementing a read-write mutex:

http://doc.qt.nokia.com/qq/qq11-mutex.html

http://stackoverflow.com/questions/1350994/is-it-safe-to-read-an-integer-variable-thats-being-concurrently-modified-without

http://stackoverflow.com/questions/1087771/do-i-need-to-syncronize-thread-access-to-an-int

Need mutex for multiple read:
http://www.codeguru.com/forum/archive/index.php/t-495831.html

Do I need to lock a mutex for reading a variable:
http://www.openqnx.com/PNphpBB2-viewtopic-t11405-.html

Per-thread data example: http://www.cs.fsu.edu/~baker/realtime/restricted/examples/threads/perthread.c

POSIX Threads programming tutorial: Lawrence Livermore National Laboratory
https://computing.llnl.gov/tutorials/pthreads/

Introduction to Parallel Computing: Lawrence Livermore National Laboratory
(Has a section on Parallel Programming Models where it notes that Threads Model and Message Passing Model are two of the many models for Parallel Programming)
https://computing.llnl.gov/tutorials/parallel_comp/

UNP – Sockets – Miscellaneous

Line Oriented IO Issues

read functions which buffer data may not work properly with select

readline() function reads a buffer of data and from that buffer returns a line of input back to caller. Additional read data that is available will be kept in readline() function’s buffer and returned at subsequent call. Further readline() will block till at least one line of data is read.

Show/explain the following code:

test/readline1.c, lib/readline.c, lib/readn.c

Such a readline function is very useful for network programs dealing with line oriented protocols like HTTP, FTP etc. However there are many issues with such buffered IO functions:

  • If select is used to wait for input from multiple sources (say socket and stdin), select may block even though readline() has a line of data (read during earlier invocation of readline where read returned multiple lines in one call). Select uses the stdin fd and is unaware of the internal buffer of readline().
  • The second version of readline() above allows for such issues to be fixed by the programmer first checking readline’s exposed buffer for data before calling select. But this increases complexity of programming.
  • If readn and readline function calls are mixed unexpected behaviour may occur.
  • Even if the network programs expect data to be exchanged only in lines, due to bugs or malicious attempts some data may be sent which is not line terminated. Using a function like readline will make it difficult for the network program to detect such data and flag it as an error.
  • For the same reasons mentioned above stdio functions like fgets should not be used in socket programs.

Readn function suffers from some of the issues that readline has. So ideally one should avoid using readn type of functions as well.

So what should one do if lines (or a fixed amount of data) have to be read (and written)?

  • Always think in terms of buffers of data being read and written over sockets.
  • If a line is expected, read data into a buffer and check the buffer to see whether it contains a line
  • If a fixed amt of data is expected typically one has to continue reading till at least expected amt of data has arrived. But this should be done at a top level instead of in a readn kind of function. This way the partially read data is always available in a buffer for the top level code to do any error checking or similar kind of task, instead of the partially read data being hidden away in a readn() routine’s private buffer.
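The buffer-centric approach suggested above can be sketched as follows. The top level code appends whatever read() or recv() returned to a buffer it owns, and then asks whether a complete line is present. The struct and function names here are my own, and the 4096 byte buffer size is an arbitrary assumption.

```c
#include <assert.h>
#include <string.h>

struct linebuf {
    char   data[4096];
    size_t used;                      /* bytes currently held in data */
};

/* Append n bytes that were just read from a socket.
 * Returns 0, or -1 if they do not fit (e.g. an over-long line). */
int linebuf_append(struct linebuf *lb, const char *src, size_t n)
{
    assert(lb != NULL && src != NULL);
    if (n > sizeof lb->data - lb->used)
        return -1;
    memcpy(lb->data + lb->used, src, n);
    lb->used += n;
    return 0;
}

/* If a complete line is buffered, copy it (without the '\n') into out
 * as a C string, remove it from the buffer and return 1.
 * Returns 0 if there is no complete line yet, -1 if out is too small. */
int linebuf_take_line(struct linebuf *lb, char *out, size_t outsize)
{
    assert(lb != NULL && out != NULL);

    char *nl = memchr(lb->data, '\n', lb->used);
    if (nl == NULL)
        return 0;

    size_t len = (size_t)(nl - lb->data);
    if (len + 1 > outsize)
        return -1;
    memcpy(out, lb->data, len);
    out[len] = '\0';

    lb->used -= len + 1;              /* drop the line and its newline */
    memmove(lb->data, nl + 1, lb->used);
    return 1;
}
```

Because the buffer is visible to the caller, the program can check it for a pending line before calling select, and can flag unterminated or over-long data as an error.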

Reading Assignments

Section 3.9 (readn, writen, and readline functions)

Section 6.4 to Section 6.7

———————————————————————-
Socket Options

  • Various attributes are used to determine the behavior of sockets.
  • Setting options tells the OS/Protocol Stack the behavior we want.
  • Support is provided for generic options (apply to all sockets) and protocol specific options.

getsockopt() gets the current value of a socket option.

setsockopt() is used to set the value of a socket option.

int getsockopt(int sockfd, int level, int optname,
               void *optval, socklen_t *optlen);

level specifies whether the option is a general option or a protocol specific option (what level of code should interpret the option).
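A minimal sketch of using both calls with a generic (SOL_SOCKET level) option is given below. It sets SO_REUSEADDR and reads the value back. The function name enable_reuseaddr is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/socket.h>

/* Turn on SO_REUSEADDR for sockfd and read the option back.
 * Returns 1 if the option is now enabled, 0 if not, -1 on error. */
int enable_reuseaddr(int sockfd)
{
    int on = 1;

    /* level SOL_SOCKET: the option is handled by the generic socket code */
    if (setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on) < 0)
        return -1;

    int value = 0;
    socklen_t len = sizeof value;     /* in: buffer size, out: bytes stored */
    if (getsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR, &value, &len) < 0)
        return -1;
    assert(len == sizeof value);
    return value != 0;
}
```

A server would typically call this on its listening socket before bind().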

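SO_RCVTIMEO         Option to be set so that socket receiving functions time out after a specified time

But older versions of the man page for socket(7) say that it cannot be set by the user on Linux; current versions document it as settable, with a struct timeval as the option value.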

recv and send functions are also available instead of read and write for sockets.

Reading Assignment

Chapter 7         Socket Options

——————————————————————————————————-

Name Address Conversions

DNS provides Name and address conversion.

/etc/hosts file can be used to create some name to address mapping entries so that name address conversion facility in limited form is made available without DNS.

gethostbyname() function returns the IP address for a name. A reentrant version is also available.

names/hostent.c

gethostbyaddr() function takes ip address and returns hostname

/etc/services file maps service names to ports

getservbyname()  gives port number for a service name, protocol name pair

e.g. sptr = getservbyname("ftp", "tcp");

names/daytimetcpcli1.c

gethostbyname, gethostbyaddr support only IPv4. getaddrinfo() supports both IPv4 and IPv6 and handles both name-to-address and service-to-port translation.
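A short sketch of getaddrinfo() doing both translations in one call is given below. It resolves a host and a service and returns the port of the first TCP result. The function name resolve_tcp_port is my own and the example is not from the book's source code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Resolve host and service (names or numeric strings) for TCP.
 * Returns the port of the first result in host byte order, or -1. */
int resolve_tcp_port(const char *host, const char *service)
{
    struct addrinfo hints, *res;

    assert(service != NULL);
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;      /* accept IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;

    if (getaddrinfo(host, service, &hints, &res) != 0)
        return -1;

    int port = -1;
    if (res->ai_family == AF_INET)
        port = ntohs(((struct sockaddr_in *)res->ai_addr)->sin_port);
    else if (res->ai_family == AF_INET6)
        port = ntohs(((struct sockaddr_in6 *)res->ai_addr)->sin6_port);

    freeaddrinfo(res);                /* the result list is heap allocated */
    return port;
}
```

In a real client one would loop over the whole result list, trying socket() and connect() on each entry until one succeeds.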

Reading Assignments

Chapter 11 Name and Address Conversions

——————————————————————————————————

Daemons and inetd

Given below is a reference to slides from a US university (links obtained via Google search and so presumed to be freely accessible to anybody on the net) which are the key teaching slides for this topic:

http://www.cs.rpi.edu/academics/courses/spring98/netprog/lectures/ppthtml/inetd/inetd.ppt

Show/explain the following code:

lib/daemon_init.c  : Linux has daemon function which does the same.

Inetd run server

lib/daemon_inetd.c          inetd/daytimetcpsrv3.c

Reading Assignments

Chapter 13 Daemon Process and the inetd SuperServer

———————————————————————————————

Assignments

Assignment 1

Modify the Saisays server and client programs that you have written, so that both of them use name-to-address and service-to-port conversions. The server program should take the service name as an argument and the client should take the IP address/hostname of the server, and the service name as arguments.

Assignment 2

Modify the Saisays server program so that it runs as a daemon process. It should also report appropriate messages to the syslog daemon (the central logging facility of UNIX) using the syslog() call with appropriate priority levels for different types of messages.

The daemon should write to the log all significant events which include messages on the daemon starting up, on the server being brought up, on acceptance of a connection from a client, on the occurrence of an error and anything else you consider important.

Assignment 3

Improve the Saisays server program by making it possible for it to be run as a Superserver based server (i.e. based on xinetd or inetd). It should write appropriate messages to syslog.

UNP – Robust Sockets

Typical Server and Client

Show/explain the working of TCP Echo server and client (version 1) from the book source code:

  • tcpcliserv/tcpcli01.c
  • tcpcliserv/tcpserv01.c
  • lib/str_echo.c
  • lib/str_cli.c

Robustness and Responsiveness  Issues with above server and client

Zombies

The server listener should clean up terminated worker server processes using a SIGCHLD handler that calls waitpid.

Show/explain the following code:

tcpcliserv/tcpcli04.c : Multiple connects to server. On exit results in zombies which are not effectively handled by wait in SIGCHLD handler (needs waitpid in a loop).

tcpcliserv/tcpserv04.c      tcpcliserv/sigchldwaitpid.c

(Flawed SIGCHLD handler) tcpcliserv/sigchldwait.c
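The reason for the loop is that several children may terminate before the handler runs, and those SIGCHLD signals are not queued. The sketch below is my own illustration and not the book's code: the handler calls waitpid with WNOHANG until no terminated child is left. The names spawn_and_reap and reaped are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t reaped;  /* children collected so far */

/* One SIGCHLD may stand for several exited children, so loop. */
static void sig_chld(int signo)
{
    int saved = errno;                /* waitpid may change errno */

    (void)signo;
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped++;
    errno = saved;
}

/* Fork n children that exit at once; return how many were reaped
 * by the handler, or -1 on error. */
int spawn_and_reap(int n)
{
    struct sigaction sa = { 0 };
    sigset_t block, old;

    assert(n > 0);
    reaped = 0;
    sa.sa_handler = sig_chld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGCHLD, &sa, NULL) < 0)
        return -1;

    /* Block SIGCHLD while testing the counter to avoid a lost wakeup. */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &old);

    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == 0)
            _exit(0);                 /* child terminates immediately */
        if (pid < 0) {
            n = i;                    /* wait only for those started */
            break;
        }
    }
    while (reaped < n)
        sigsuspend(&old);             /* atomically unblock and wait */
    sigprocmask(SIG_SETMASK, &old, NULL);
    return reaped;
}
```

With a handler that calls wait() only once, some of the n children could remain as zombies.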

Server Process Termination

If the server is terminated prematurely, tcpcli04.c will learn of it only when it next makes a socket call. Typically it will be blocked waiting for user input at fgets(). A more responsive client would detect that the server has terminated even while waiting for user input and immediately inform the user of the situation. The solution is to use the select() function to block on both stdin and the socket.

Show/explain the following code:

select/strcliselect01.c
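The core of the idea can be sketched as below. This is a simplified illustration, not the book's str_cli: the enum and the function name wait_for_activity are hypothetical, and the MSG_PEEK check is just one way to tell "server sent data" apart from "server closed the connection".

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>   /* STDIN_FILENO for the caller */

enum wake_reason {
    WAKE_ERROR = -1,
    WAKE_USER_INPUT,     /* user_fd has data (or EOF) to read */
    WAKE_SERVER_DATA,    /* socket has data to read */
    WAKE_SERVER_CLOSED   /* server end has closed the connection */
};

/* Block until either the user's input descriptor or the socket becomes
 * readable, and report which one woke us up. The socket is checked first
 * so that a server termination is noticed even if input is also pending. */
enum wake_reason wait_for_activity(int user_fd, int sockfd)
{
    fd_set rset;
    int maxfd = user_fd > sockfd ? user_fd : sockfd;

    FD_ZERO(&rset);
    FD_SET(user_fd, &rset);
    FD_SET(sockfd, &rset);

    if (select(maxfd + 1, &rset, NULL, NULL, NULL) < 0)
        return WAKE_ERROR;   /* includes EINTR; a real client would retry */

    if (FD_ISSET(sockfd, &rset)) {
        char c;
        ssize_t n = recv(sockfd, &c, 1, MSG_PEEK);   /* look, do not consume */

        if (n == 0)
            return WAKE_SERVER_CLOSED;
        if (n < 0)
            return WAKE_ERROR;
        return WAKE_SERVER_DATA;
    }
    return WAKE_USER_INPUT;
}
```

A client loop would call wait_for_activity(STDIN_FILENO, sockfd) and, on WAKE_SERVER_CLOSED, tell the user at once that the server has terminated instead of waiting for the next line of input.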

Shutdown function

The shutdown function allows us to close only the read direction or only the write direction of a socket. This is useful when we want to initiate the close from one end but do not want to lose data that the other end has already sent and that is still in transit. Using close() would simply throw away the data in transit. Shutting down the write half of the socket and then reading until end-of-file ensures proper closure of the connection without losing any data that was in transit when the shutdown was initiated.

int shutdown(int sockfd, int howto);

howto: SHUT_RD – Only read half of connection is closed

SHUT_WR – Only write half of connection is closed

SHUT_RDWR – Both read and write halves of connection are closed.

The following version of the str_cli function uses shutdown() instead of close(). It also operates on buffers instead of using line-centric functions (e.g. Readline). Show/explain the code.

select/strcliselect02.c
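The half-close sequence described above (shut down the write half, then read until end-of-file) can be sketched as follows. The function name half_close_and_drain is hypothetical, and the sketch leaves out the select() logic that the book's version also contains.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Tell the peer we have nothing more to send, then keep reading until the
 * peer closes its side, so that data still in transit is not lost.
 * Up to cap bytes of the received data are stored in buf; anything beyond
 * that is read and discarded. The socket is closed before returning.
 * Returns the number of bytes stored, or -1 on error. */
ssize_t half_close_and_drain(int sockfd, char *buf, size_t cap)
{
    char chunk[512];
    size_t stored = 0;
    ssize_t n;

    if (shutdown(sockfd, SHUT_WR) < 0)   /* sends FIN; reading still works */
        return -1;

    while ((n = read(sockfd, chunk, sizeof chunk)) != 0) {
        size_t room, take;

        if (n < 0) {
            if (errno == EINTR)
                continue;                /* interrupted by a signal: retry */
            return -1;
        }
        room = cap - stored;
        take = (size_t)n < room ? (size_t)n : room;
        memcpy(buf + stored, chunk, take);
        stored += take;
    }

    close(sockfd);                       /* both directions are now done */
    return (ssize_t)stored;
}
```

The client would call this once it sees end-of-file on its input, in place of an immediate close() on the socket.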

Reading Assignments

Chapter 4 Elementary TCP Sockets

Chapter 5 TCP Client/Server Example

Section 6.3 to Section 6.8 of Chapter 6 I/O Multiplexing: The select and poll Functions.

Assignments

Assignment 1

You should try out all the examples mentioned above and write down your observations.

Assignment 2

Implement what you have learnt about robustness and responsiveness to improve Assignment 4 of Introduction to Sockets. Write down your observations of the same.

UNP – Introduction to Sockets

Given below are references to slides from a US university (links obtained via Google search and so presumed to be freely accessible to anybody on the net) which are the key teaching slides for this topic:

1) Basic Socket API 1: http://www.cs.rpi.edu/academics/courses/spring98/netprog/lectures/ppthtml/sockets/sockets.ppt

2) Basic Socket API 2: http://www.cs.rpi.edu/academics/courses/spring98/netprog/lectures/ppthtml/tcp_sockets/tcp_sockets.ppt

The references to chapters and examples below are of the course book (and not the above slides).

Show/explain the working of:

  • intro/daytimetcpcli.c
  • intro/daytimetcpsrv.c

Reading Assignments

Chapter 1 Introduction

Chapter 2 Sockets Introduction

Assignments

Assignment 1

Study and try out the daytime tcp client and server example.

Assignment 2

Write Sayings client and server programs with the following behavior:

  • As in the daytime server, whenever the server accepts a connection from a client, it sends a saying to the client and then closes the connection.
  • The client just reads the saying from the server and then displays it.
  • Make just one source file each for the client code and the server code. Avoid using the book's library functions such as readn and writen. The assignment folder containing the source files should be named Saisays.
  • The client program should take the IP address and port of the server program as a single argument of the form ipaddr:port, e.g. “./saicli 192.0.2.5:4000”. This will enable us to test the client program with any Saisays server. The server program should take the same argument. In both cases the argument refers to the server’s IP address and port. This will allow us to copy the server program to any machine and run it there without program modification and recompilation.
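The ipaddr:port argument described above could be parsed along the following lines. This is a minimal sketch for IPv4 only; parse_endpoint is a hypothetical name, and the choice to reject port 0 is an assumption of this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>

/* Parse "a.b.c.d:port" into an IPv4 socket address.
 * Returns 0 on success and -1 if the argument is malformed;
 * *out is written only on success. */
int parse_endpoint(const char *arg, struct sockaddr_in *out)
{
    char host[INET_ADDRSTRLEN];
    struct sockaddr_in sa;
    const char *colon = strchr(arg, ':');
    size_t hostlen;
    char *end;
    long port;

    if (colon == NULL)
        return -1;
    hostlen = (size_t)(colon - arg);
    if (hostlen == 0 || hostlen >= sizeof host)
        return -1;
    memcpy(host, arg, hostlen);
    host[hostlen] = '\0';

    errno = 0;
    port = strtol(colon + 1, &end, 10);
    if (errno != 0 || end == colon + 1 || *end != '\0' ||
        port < 1 || port > 65535)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sin_family = AF_INET;
    sa.sin_port = htons((uint16_t)port);      /* network byte order */
    if (inet_pton(AF_INET, host, &sa.sin_addr) != 1)
        return -1;

    *out = sa;
    return 0;
}
```

The client would pass the resulting structure to connect() and the server would pass it to bind(), so both programs can share the same parsing code.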

Assignment 3

Write a program called stresser which will subject the server to a stress test by flooding it with requests from multiple clients. If required, simulate the server doing time-consuming tasks by adding a sleep of a few milliseconds.

Assignment 4

Then increase the ‘availability’ of the server by creating a separate process for each connection accepted by it. This new process handles the communication with the client.

Also, subject the modified server program to a stress test by connecting to it from multiple clients, initially from the same machine and later on from many different client machines.
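The process-per-connection structure asked for in Assignment 4 can be sketched as follows. All names here (make_listener, send_saying, accept_and_fork) and the saying text are placeholders invented for this illustration. Error handling is kept to a minimum.

```c
#define _POSIX_C_SOURCE 200809L
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

static const char SAYING[] = "A stitch in time saves nine.\n"; /* placeholder */

/* Create a TCP socket listening on the given port on all interfaces.
 * Port 0 lets the kernel choose a free port. Returns the fd or -1. */
int make_listener(uint16_t port)
{
    struct sockaddr_in addr;
    int on = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on);
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 16) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Work done for one client: send the saying. */
void send_saying(int connfd)
{
    ssize_t n = write(connfd, SAYING, sizeof SAYING - 1);
    (void)n;   /* a real server would log a short or failed write */
}

/* Accept one connection and hand it to a new child process, so that the
 * parent can go straight back to accepting the next client.
 * Returns the child's pid, or -1 if accept() or fork() failed. */
pid_t accept_and_fork(int listenfd, void (*serve)(int connfd))
{
    pid_t pid;
    int connfd = accept(listenfd, NULL, NULL);

    if (connfd < 0)
        return -1;
    pid = fork();
    if (pid == 0) {          /* child: serve this client only */
        close(listenfd);
        serve(connfd);
        close(connfd);
        _exit(0);
    }
    close(connfd);           /* parent: the child holds its own copy */
    return pid;
}
```

The server's main loop would call accept_and_fork(listenfd, send_saying) repeatedly. The terminated children still have to be reaped, which is the zombie problem taken up in the Robust Sockets topic.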

Write a report giving your observations and conclusions regarding the assignments. For submission of assignments put solutions for each assignment in a separate folder named appropriately (e.g. assign1).

UNP – Introduction to Networking

Theory courses on networking would have covered the important concepts of networking, OSI model and TCP/IP protocol. However, if that has not been done so far or your knowledge of them is fuzzy then here are some references to slides from US universities (links obtained via Google search and so presumed to be freely accessible to anybody on the net) with suggestions on how to study them as well as the particular topics to be studied (within those slides).

1) http://www.cs.cmu.edu/afs/cs/academic/class/15441-f01/www/lectures/lecture01.ppt
Suggested speed: Quick run through

Suggested (slide) topics to study/read/refresh: Internet, Protocol, Network Edge, Network Core, Circuit Switching, Packet Switching

2) http://www.cs.cmu.edu/afs/cs/academic/class/15441-f01/www/lectures/lecture02.ppt
Suggested speed: Quick run through

Suggested (slide) topics to study/read/refresh: Layering

3) http://www.cs.cmu.edu/afs/cs/academic/class/15441-f01/www/lectures/lecture03.ppt
Suggested speed: Medium

Suggested (slide) topics to study/read/refresh: Applications and Application Layer Protocols, Client-Server Paradigm, UDP, TCP, Port Numbers, Names and Addresses. You may exclude the Socket API detailed slides as that is covered later on in this course.

4) https://cs.nmt.edu/~liu/CSE389/Lect_01.ppt

Suggested speed: Quick run through

Suggested (slide) topics to study/read/refresh: Headers, Router, Byte Ordering, Network Byte Order, MultiPlexing, Modes of Service, Error Control, Flow Control, End-to-End v/s Hop-to-Hop, Buffering, Addresses, Broadcasts

5) https://cs.nmt.edu/~liu/CSE389/Lect_02.ppt

Suggested speed: Quick run through

Suggested (slide) topics to study/read/refresh: Ethernet, CSMA/CD, IP datagrams, IP Addresses, Class A, B.., IP Services, UDP, TCP, Modes of Service, Connection Oriented, TCP vs. UDP. You may exclude the TCP segment detailed slides, the three-way handshake slides, the IP datagram example detailed slides and the HTTP slides.

6) http://www.cs.utep.edu/cheon/cs3331/notes/network.ppt

Suggested speed: Quick run through

Suggested (slide) topics to study/read/refresh: Socket Programming, Server vs. Client Sockets, Server Sockets, Client Sockets, Echo Server, Echo Client. You may exclude Multi Echo Server and the whole set of RMI slides (last part of the slides).

Unix Network (socket) Programming including pthread Programming

This course may have been last taught by me in 2008 in a deemed university in Andhra Pradesh, India. I may have taught this course to/for three or four batches/years. The class size was typically around 14 to 18 students.

Last updated on 23rd March 2014

This is a Lab course I taught for I. M.Tech. (Comp. Sc.).  It teaches network client and server programming using sockets and threads.

On completion of the course, most students have a platform from which they can take off to areas like writing an Internet browser client (like Firefox) or even an HTTP server (like Apache). Of course, we get time only to cover the fundamentals and so the student will have to get deeper into these topics on his/her own. But the platform would have been laid. As a by-product the student would have learned pthread programming which enables him/her to do any multi threaded programming tasks (using pthread or other thread libraries). I have been given to understand that multi-threaded programming is useful for multi-core programming and so I have ensured that adequate coverage is done for this topic.

Prerequisites for the Course

C Programming including debugging skills, user-level knowledge of Unix (Unix commands like ls, chmod, mkdir, kill etc.) and knowledge of the file I/O and signal Unix system calls.

Course Book
Unix Network Programming Volume 1, 3rd edition: The Sockets Networking API by W. Richard Stevens, Bill Fenner and Andrew M. Rudoff (http://www.informit.com/store/unix-network-programming-volume-1-the-sockets-networking-9780131411555). Here’s the book support site: http://www.unpbook.com/.

Ideal Coverage (subject to time limitations)

  1. Quick revision of TCP/IP fundamentals. Almost all of the topics of TCP/IP fundamentals are covered in a Computer Networks theory course done at P.G. level. So we just did a quick revision. Topics revised: Internet, protocols, client-server model, TCP, connection-oriented service, UDP, connectionless service, packet switching, sequencing, error control, flow control, full duplex, half duplex, protocol stack, protocol headers, IP address, IPv4, IPv6, port numbers, well-known ports, hostnames, DNS, byte ordering, network byte order, Ethernet, MAC address.
  2. Introduction to sockets:
    1. Socket APIs: sockaddr structure, socket(), htonx, ntohx functions, bind(), inet_aton(), inet_ntoa(), inet_pton(), connect(), listen(), accept(), read(), write(), close()
    2. Simple daytime socket client and server program.
    3. Assignment: Writing a Sayings client and server
  3. Robust sockets:
    1. TCP echo client and server. Server uses worker processes for each connection
    2. Avoiding zombies by handling SIGCHLD signal
    3. Using select() to wait on multiple file descriptors for I/O. How select enables writing of more responsive client and server programs.
    4. Shutdown function enabling closure of only Read or Write halves of the connection. How shutdown() enables graceful closure of connection between client and server.
    5. Assignments: Bringing in all above mentioned features, step-by-step, into Sayings client and server.
  4. Posix threads (pthreads):
    1. Basic thread functions: pthread_create(), pthread_exit(), pthread_join(), pthread_detach()
    2. Thread synchronization using thread mutexes and condition variables: pthread_mutex_lock(), pthread_mutex_unlock(), pthread_cond_wait(), pthread_cond_signal(); Dangers of multi-threaded programs and how to avoid them.
    3. Assignments: Sayings server implemented as a worker pool using multithreading – two versions, last version uses condition variables for higher efficiency.
  5. HTTP:
    1. Quick introduction to HTTP protocol
    2. Optional HTTP Assignment: Program which makes an HTTP browser request and displays/stores response from server.

Index of Links to Chapter Wise Course Pages

Please note that, if I recall correctly, I used slides from university course sites, usually US ones (links usually obtained via Google search and so presumed to be freely accessible to anybody on the net), for all of the topics – why reinvent the wheel? However, the topic links below clearly specify the sections covered/to be read as reading assignments, as well as the assignments, most, if not all, of which were created by me specially for this course. So I think they give a good framework for students to learn Network (socket) programming and pthread programming in a semester in college/university environments. [In some years, due to lack of time, the pthread programming and HTTP parts of the course were moved to a separate course done after the Network programming course. Having two separate courses gave the students time to attempt most assignments of the course(s), thereby strengthening the knowledge they gained of the topics covered.]

Further, please note that, as a first step, I have focused on putting up the course content used by me to teach this course in the deemed university in Andhra Pradesh, India, suitably modified, on this blog. As part of the minor modifications for putting it up on the publicly accessible blog, some errors may have crept in. I have not checked all the modifications for accuracy (due to other demands on my time). If this course (on this blog) does get utilized by students then the errors in the modified part will come to light and will get fixed by me (or others). I think that is a better way of investing my (and others) time for fixing any errors in this course (on this blog).

Introduction to Networking

Introduction to sockets

Robust and Responsive socket programs

Sockets – Miscellaneous

Threads

http

Former Student Feedback

A former student who had been taught these courses by me, wrote me on 22nd March 2014:

These courses (Advanced Unix Programming and Unix Network Programming) went a long way in helping me land my job at Alcatel-Lucent. I had a one-on-one interview with my hiring manager that was entirely on Unix. After joining the company I learned that this person(manager) was a big time ‘Unix fan’. It was very satisfying to have done well in that interview. On the job, we completely relied on Solaris Unix based servers and the concepts of processes and threads gained from these course(s), went a long way in helping me grasp the software.
Thank you Ravi Sir.

APUE – Chapter 15 – Interprocess Communication

You may find the reference given below to slides from a US university (links obtained via Google search and so presumed to be freely accessible to anybody on the net) worth viewing: http://www.cs.stevens.edu/~jschauma/810/lecture07.pdf (titled Interprocess Communication).

Reading Assignments

Entire Chapter

Report Submission

Run example programs 15.1 to 15.4 (Summary of UNIX System IPC, Two ways to view a half-duplex pipe, Half-duplex pipe after a fork, Pipe from parent to child – not programs), 15.5 (Send data from parent to child over a pipe), 15.6 (Copy file to pager program), 15.7 (Routines to let a parent and child synchronize), 15.11 (Copy file to pager program using popen), 15.12 (The popen and pclose functions) and verify that the programs work as expected. Write a report (C15report.txt) giving your observations. For each example typically limit the observation to two or three lines.

Also add observations relating to the assignments below in the report. The observations for these can be longer (though they do not have to be).

Assignment Submissions

  • 15A) Write a program which uses the pipe system call. Possible programs are: a) Extending the shell program written previously to handle piped commands b) Using pipe as a means of communicating tasks and results between mpcalc main program and worker processes.
  • 15B) Use fifo for implementing communication between mpcalc and worker processes.
  • 15C) Use Message Queues for implementing communication between mpcalc and worker processes.
  • 15D) Use Shared Memory (and Semaphores) for implementing communication between mpcalc and worker processes.
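As an illustration of option (b) of 15A, passing tasks and results between a main program and a worker over pipes, here is a minimal sketch. It is not the mpcalc program from the course. The function sum_in_worker and the "task" of adding integers are hypothetical stand-ins.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Send count integers to a worker process over one pipe and read their
 * sum back over a second pipe. Returns 0 on success, -1 on error. */
int sum_in_worker(const int *nums, size_t count, long *result)
{
    int task_pipe[2], result_pipe[2];
    size_t bytes = count * sizeof *nums;
    int status = 0;
    pid_t pid;

    if (pipe(task_pipe) < 0)
        return -1;
    if (pipe(result_pipe) < 0) {
        close(task_pipe[0]);
        close(task_pipe[1]);
        return -1;
    }

    pid = fork();
    if (pid < 0) {
        close(task_pipe[0]);
        close(task_pipe[1]);
        close(result_pipe[0]);
        close(result_pipe[1]);
        return -1;
    }

    if (pid == 0) {                      /* worker process */
        long sum = 0;
        int v;

        close(task_pipe[1]);             /* worker only reads tasks */
        close(result_pipe[0]);           /* worker only writes results */
        while (read(task_pipe[0], &v, sizeof v) == (ssize_t)sizeof v)
            sum += v;
        if (write(result_pipe[1], &sum, sizeof sum) != (ssize_t)sizeof sum)
            _exit(1);
        _exit(0);
    }

    /* main program */
    close(task_pipe[0]);
    close(result_pipe[1]);
    if (bytes > 0 && write(task_pipe[1], nums, bytes) != (ssize_t)bytes)
        status = -1;
    close(task_pipe[1]);                 /* EOF tells the worker: no more tasks */
    if (read(result_pipe[0], result, sizeof *result) != (ssize_t)sizeof *result)
        status = -1;
    close(result_pipe[0]);
    waitpid(pid, NULL, 0);               /* reap the worker */
    return status;
}
```

Closing the unused pipe ends in each process matters: the worker sees end-of-file on the task pipe only when every write end of that pipe has been closed.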