Thursday, 17 October 2013

Process Management Fundamentals

Let us start with an analogy between cooking & process management.

#You will have a recipe which contains a list of ingredients and a step-by-step procedure to cook the dish.
You may need to do some pre-processing of the ingredients, like cleaning, soaking, cutting etc.
Your recipe and ingredients can be anywhere in your kitchen till you start cooking.

$We have many .c & .h files as source code, which reside in secondary memory (the hard disk). Compiling them with a tool like gcc produces *.o object files, and linking those produces a binary file in ELF (Executable and Linkable Format). These files still reside in secondary memory.

#You will start loading your ingredients into a vessel on the stove, where the actual process of cooking happens.

$On ./program (execution), the loader loads your ELF file into main memory (RAM), where execution begins.

So a program in execution is called a process. A process has its own address space (its memory space in RAM) and other resources.

How does multiprocessing work?

Although it feels as if multiple processes run simultaneously, on a single processor core only one process holds the processor at any instant. Switching rapidly between processes gives us the illusion of things happening simultaneously. The scheduler decides which process gets the next processor time slice. It maintains several queues and moves processes in and out of them for execution according to a policy such as round robin, so that every process gets a fair share of processor time.

Parent and child process

In Linux, the first process created after boot is "init". A process can create a child process by forking (the fork() system call). The child process gets a copy of the parent process's memory address space. We will understand this through a program.

#include <stdio.h>
#include <sys/types.h>  /* pid_t */
#include <unistd.h>     /* fork(), getpid(), getppid(), sleep() */

int main(void)
{
    pid_t pid, ppid, cpid;

    ppid = getppid(); /* pid of our parent; here it is the bash shell */
    cpid = getpid();  /* print cpid to know the current process id; this will be the parent of the forked child */
    pid = fork();
    /* Creation of the child process: 2 copies of the same process are now in memory.
       Whatever comes after fork() is executed twice. In one instance (the child) pid is 0,
       in the other (the parent) pid is the child's pid. */
    if (pid < 0) {
        perror("fork"); /* fork failed, no child was created */
        return 1;
    }
    if (pid == 0) {
        printf("child\n");
        printf("process getpid: %d\n", (int)getpid());
        printf("process getppid: %d\n", (int)getppid());
        sleep(20);
        printf("child\n");
        printf("process getpid: %d\n", (int)getpid());
        printf("process getppid: %d\n", (int)getppid());
    } else {
        sleep(10);
        printf("parent\n");
        printf("process getpid: %d\n", (int)getpid());
        printf("process getppid: %d\n", (int)getppid());
    }
    return 0;
}

Compilation:

gcc process1.c -o process1

Execution: ./process1 & (& means background process)

Explanation:

ppid = getppid(); // This gets the parent process id. Here the parent process is the bash shell from which we execute ./process1

cpid = getpid(); gives the pid of the current process (process1)

pid=fork( ); forks a child process, creating a copy of the parent process's address space.

Whatever comes after fork( ) will be executed by both parent and child.

Example:
pid= fork( );
printf(" test process\n");

You will get 2 prints (one belongs to the parent and the other to the child just born by forking).

But in our program this does not happen. How?

fork( ) gives us 2 different pid values: the parent sees the pid of the child process, and the child sees 0.

if(pid ==0)
{
statements
}

It means that the child process is running.

if (pid>0)
{
}

It means that the parent process is running. Does this sound a little confusing?

fork( ) is called once, by the parent, but it returns twice: once in the parent's address space and once in the newly created child's address space. The child does not call fork( ) itself; it simply starts running from the point where fork( ) returns.

As a result there are 2 pid values:
1. the pid of the newly formed child process, in the parent's address space.
2. pid with value '0', in the child's address space.

The order of the if and else branches in the source does not decide which process runs first; that is up to the scheduler. In our program the child prints first because the parent's branch starts with sleep(10), while the child's branch starts printing right away.

The print statements inside the child branch get printed, but how far? Till the sleep() call.

The sleep call yields the processor to the scheduler, which is free to give processor time to other processes during that 20 second sleep. In our case the parent's 10 second sleep ends first, so the parent is the next one to print.

Output:
child
process getpid: 2800
process getppid: 2799
parent
process getpid: 2799
process getppid: 2753
child
process getpid: 2800
process getppid: 1

Output of ps -ef | grep process1 (the ps command displays the process table):
Run this command repeatedly in another shell window while the program is running.

[root@localhost ~]# ps -ef | grep process1
root      2799  2753  0 16:29 pts/0    00:00:00 ./process1
root      2800  2799  0 16:29 pts/0    00:00:00 ./process1
root      2804  2777  0 16:29 pts/1    00:00:00 grep process1
[root@localhost ~]# ps -ef | grep process1
root      2800     1  0 16:29 pts/0    00:00:00 ./process1
root      2806  2777  0 16:29 pts/1    00:00:00 grep process1
[root@localhost ~]# ps -ef | grep process1
root      2809  2777  0 16:29 pts/1    00:00:00 grep process1



You can observe 2 process1 entries (parent, child).

The second column is the process id (PID), the third column is the parent process id (PPID).
2799 is the id of the parent process.
2753 is the id of the parent's parent, which is nothing but our shell.
2800 is the id of the forked child process.

The child process prints first (the parent begins with a 10 second sleep), so in the output of the program we can see

child
process getpid: 2800
process getppid: 2799

Then the child goes for a 20 second sleep. The parent has been sleeping since just after the fork; its 10 second sleep ends well before the child's, so the scheduler runs the parent next:

parent
process getpid: 2799
process getppid: 2753

Then the parent terminates and the child becomes an orphan. Need proof?
Check ps -ef

root      2800     1  0 16:29 pts/0    00:00:00 ./process1

The parent process id has changed from 2799 to 1. PID 1 is init, which adopts the orphaned child because the parent terminated before the child.

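What happens if the child terminates while the parent is still running but has not collected the child's exit status? You can give it a try simply by changing if(pid==0) to if(pid>0) in the code and changing the printf statements (parent and child) accordingly, so that the child finishes after 10 seconds while the parent is still in its 20 second sleep.

You will see <defunct> next to the child in ps -ef. The child has terminated, but its entry stays in the process table as a zombie until the parent collects its exit status or the parent itself terminates.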

Synchronization between parent and child:

The order of execution between parent and child is decided by the scheduler. Synchronization between them is achieved with a wait( ) call in the parent process, which blocks until a child process terminates and only then proceeds with execution.

Just add a wait( ) call in the parent branch (pid > 0), together with #include <sys/wait.h>, to achieve synchronization, then observe the program output and ps -ef.