Wednesday 27 November 2013

Serial Console Mode Debugging with Linux

If you want to redirect the kernel boot messages to a serial console to debug a Linux boot problem, follow these steps:

1. Connect a null modem cable from the serial port (COM1/COM2) of the Linux test machine to the serial port of your laptop/desktop.

2. On a laptop running Windows, install PuTTY or Tera Term and set the baud rate and the other serial settings (9600 baud in my case).

3. On the Linux machine, edit the grub.conf file:
    vi /boot/grub/grub.conf

4. Add the following lines after "hiddenmenu":
    serial --unit=0 --speed=9600
    terminal --timeout=10 console serial

5. Comment out splashimage (the serial console has no graphics support):
    #splashimage=(hd0,0)/grub/splash.xpm.gz

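6. Add "console=tty0 console=ttyS0,9600" to the command line of the kernel you want to debug. The kernel then logs to both consoles, and the last console= entry (ttyS0 here) becomes /dev/console.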

   Example: kernel /vmlinuz-2.6.32-42.0.10.ELsmp ro root=LABEL=/ console=tty0 console=ttyS0,9600
    
7. Edit the file /etc/inittab and
    append the line "1:23:respawn:/sbin/agetty ttyS0 9600 vt100" at the end, so that a login prompt is also started on the serial port.

8. Reboot.

9. The GRUB "Press any key to continue" message appears in Tera Term on the Windows machine, followed by all the kernel log messages, which you can save for analysis.

Note: The above procedure was tested on CentOS 6.3; the steps may vary for older kernels.

Thursday 17 October 2013

Process Management Fundamentals

Let us start with an analogy between cooking and process management. Lines marked # describe the kitchen and lines marked $ describe the computer.

#You have a recipe which contains a list of ingredients and the step-by-step procedure to cook the dish.
You may need to pre-process the ingredients: cleaning, soaking, cutting, etc.
Your recipe and ingredients can be anywhere in your kitchen until you start cooking.

$We have many .c and .h files as source code, residing in secondary memory (the hard disk). Compiling them with a tool like gcc produces *.o object files, and linking those produces a binary file in ELF (Executable and Linkable Format). These files still reside in secondary memory.

#You start loading your ingredients into a vessel on the stove, where the actual cooking happens.

$On ./program (execution), the loader loads your ELF binary into main memory (RAM), where execution begins.

So a program in execution is called a process. A process has its own address space (memory space in RAM) and other resources.

How does multiprocessing work?

Though it feels as if multiple processes run simultaneously, on a single processor only one process gets the processor at a time. Switching rapidly between processes gives the illusion of things happening simultaneously. Each process receives its processor time slice from the scheduler. The scheduler manages several queues, and processes are moved in and out of these queues for execution based on a policy such as round robin. The scheduler ensures that every process gets a fair share of processor time.

Parent and child process

In Linux, the first process created after boot is "init". A process can create a child process by forking (the fork() system call). The child process gets a copy of the parent process's memory address space. We will understand this through a program.

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    pid_t pid, ppid, cpid;

    ppid = getppid(); /* parent process pid, here it is the bash shell */
    cpid = getpid();  /* print cpid to know the current process id; this process will be the parent of the forked child */
    pid = fork();
    /* creation of the child process: 2 copies of the same process are now in memory.
       Whatever comes after fork() is executed twice: in one instance (the child) pid is 0,
       in the other (the parent) pid is the child's pid */
    if (pid == 0) {
        printf("child\n");
        printf("process getpid: %d\n", getpid());
        printf("process getppid: %d\n", getppid());
        sleep(20);
        printf("child\n");
        printf("process getpid: %d\n", getpid());
        printf("process getppid: %d\n", getppid());
    } else {
        sleep(10);
        printf("parent\n");
        printf("process getpid: %d\n", getpid());
        printf("process getppid: %d\n", getppid());
    }
    return 0;
}

Compilation:

gcc process1.c -o process1

Execution: ./process1 & (& means background process)

Explanation:

ppid = getppid(); // This gets the parent process id; here the parent is the bash shell from which we execute ./process1

cpid = getpid(); gives the pid of the current process (process1)

pid = fork(); forks a child process, creating a copy of the parent process's address space.

Whatever comes after fork() is executed by both the parent and the child.

Example:
pid= fork( );
printf(" test process\n");

You will get 2 prints (one belongs to the parent and the other to the child just born by forking).

But in our program this does not happen. How?

fork() gives us 2 pid values: the parent sees the pid of the child process, and the child sees 0.

if(pid ==0)
{
statements
}

It means that the child process is running.

if (pid>0)
{
}

It means the parent process is running. Does this sound a little confusing?

fork() is called once, in the parent process's address space, but it returns twice: once in the parent and once in the newly created child. The child does not call fork() itself; it starts executing from the point where fork() returns.

As a result there are 2 pid values:
1. the pid of the newly formed child process, in the parent's address space.
2. the value '0', in the child's address space.

Checking pid==0 first in the program does not decide which process runs first; that is up to the scheduler. The child's output appears first because the parent sleeps for 10 seconds before printing anything.

The child's print statements get printed, but how far? Until the sleep() call.

The sleep() call yields the processor to the scheduler, which gives the processor to other processes during the 20 second sleep. In our case the parent wakes up after its 10 second sleep, while the child is still sleeping, and prints next.

Output:
child
process getpid: 2800
process getppid: 2799
parent
process getpid: 2799
process getppid: 2753
child
process getpid: 2800
process getppid: 1

Output of ps -ef | grep process1 (the ps command displays the process table).
Run this command repeatedly in another shell window while the program is running:

[root@localhost ~]# ps -ef | grep process1
root      2799  2753  0 16:29 pts/0    00:00:00 ./process1
root      2800  2799  0 16:29 pts/0    00:00:00 ./process1
root      2804  2777  0 16:29 pts/1    00:00:00 grep process1
[root@localhost ~]# ps -ef | grep process1
root      2800     1  0 16:29 pts/0    00:00:00 ./process1
root      2806  2777  0 16:29 pts/1    00:00:00 grep process1
[root@localhost ~]# ps -ef | grep process1
root      2809  2777  0 16:29 pts/1    00:00:00 grep process1



You can observe 2 process1 entries (parent and child).

The second column is the process id and the third column is the parent process id.
2799 is the parent process id.
2753 is the id of the parent's parent, which is nothing but our shell.
2800 is the id of the forked child process.

The child prints first because the parent starts with a 10 second sleep, so in the output of the program we see

child
process getpid: 2800
process getppid: 2799

Then the child goes into a 20 second sleep. The parent sleeps for only 10 seconds, so it wakes up and gets the processor before the child does.

parent
process getpid: 2799
process getppid: 2753

Then the parent terminates and the child becomes an orphan. Need proof?
Check ps -ef:

root      2800     1  0 16:29 pts/0    00:00:00 ./process1

The parent process id changed from 2799 to 1. 1 is the id of init, which adopts the orphaned child because the parent terminated before the child.

What happens if the child terminates while the parent is still running but not waiting for it? You can try it simply by changing if(pid==0) to if(pid>0) in the code and swapping the printf statements (parent and child) accordingly, so that the child sleeps for 10 seconds and the parent for 20.

You will see a <defunct> entry in ps -ef: the child has terminated, but it stays in the process table as a zombie until the parent collects its exit status.

Synchronization between parent and child:

The order of execution of the parent and child processes is decided by the scheduler, but the parent can synchronize with the child through the wait() call, which blocks the parent until the child terminates and then lets it proceed.

Just add a wait() call in the parent process (pid > 0) to achieve synchronization, then observe the program output and ps -ef.




Sunday 29 September 2013

VirtualBox first boot hang issue resolution

VirtualBox is a virtualization software package that lets the user run one or more guest operating systems (such as Linux or Windows) under a single host operating system (such as Windows or Linux) by emulating the hardware peripherals.

If you have a laptop with Windows as the host OS, you can load a Linux guest OS with VirtualBox and conduct your experiments in Linux.

Download the VirtualBox binary and install it.

You can follow the basics of installation from the VirtualBox wiki site.

Create a virtual machine; you need to specify/select some parameters for the virtual machine to be created.

In my case I created a VM with RHEL 6.3 Linux as the guest OS.

The hard disk is emulated as a VDI (Virtual Disk Image), which is configured when the virtual machine is created (say 8 GB to 16 GB).

Some part of the host RAM needs to be dedicated to the virtual machine (say 1024 MB).

Allocate 64 MB for graphics RAM.

Set up a shared folder.

After creating the VDI, you need to install the guest OS. Either specify the location of the ISO image of the installation media, or insert the installation DVD in the DVD drive; the CD/DVD drive will be identified by VirtualBox.

Now start installing your guest OS with all the necessary packages such as make, gcc, glibc and the kernel headers.

After installing your guest OS, reboot.

Issue 1: The virtual machine hangs at this reboot.

Solution: Remove the CD/DVD drive from the boot order in the settings.

Issue 2: After the reboot you go through the first-boot setup (welcome screen, time, user login details), and it may finally hang on the kdump screen.

Solution:
While booting the guest OS, press e in the GRUB boot loader menu to edit the kernel command line.
Append "single" at the end of the kernel command line.
Now the guest OS boots in single user mode.
To disable firstboot:
#chkconfig firstboot off
Type Ctrl + D to continue booting into graphics mode (run level 5).

Issue 3: The VirtualBox Guest Additions package must be installed in the guest for shared folders to work.

Solution: In the storage section of the settings, attach C:\Program Files\Oracle\VirtualBox\VBoxGuestAdditions.iso to the virtual CD/DVD drive.

VirtualBox storage setting



Start your experiments in the guest OS, and use the shared folder to share files/folders between the host and the guest OS.





Wednesday 17 July 2013

POWER MANAGEMENT - INTRODUCTION

Power management is an extensive topic; some basic elements need to be understood before we go deeper.

If we click on shutdown in Linux, 4 options are provided:

1. Shutdown
2. Standby
3. Suspend
4. Hibernate

Standby mode: The LCD and display backlight are turned off and the CPU clock speed is reduced. The power saving is small, but so is the latency. Here latency refers to the time taken to resume.

Suspend mode: Suspend to RAM. The CPU is in a sleep state, which means power is turned off for most of the devices and for parts of the CPU, but the DRAM puts itself in self-refresh mode to preserve the machine state. The CPU is woken from the sleep state by a preprogrammed event.

Hibernate: Suspend to disk. It is like powering down the system while saving its state to disk: the active pages in RAM are moved to disk (persistent storage), and power to the RAM is also turned off. Hibernate needs a swap partition or swap file large enough to hold the RAM contents. The latency is higher.


Wednesday 5 June 2013

KERNEL THREAD SYNCHRONIZATION WITH SEMAPHORE

// Kernel thread synchronization with semaphore
// Theory:
// Two threads can be synchronized by semaphores; blocked threads are pushed
// into the semaphore's wait queue
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/kthread.h>
#include <linux/sched.h>
#include <linux/semaphore.h>
#include <linux/delay.h>

MODULE_LICENSE("Dual BSD/GPL");


struct kthr_data
{
    const char *name; //kthread name
    struct semaphore *sem1;
    struct semaphore *sem2;
};

static struct kthr_data dking, dqueen;
static struct semaphore kingsem, queensem;
static struct task_struct *tking, *tqueen;

/* our case:
   down on dking sem1 (kingsem) decrements it (1->0), unlocked to locked state, and the thread tking runs //king works
   up on dking sem2 (queensem) increments it (0->1) to unlock it if its wait list is empty, otherwise it wakes the waiting tqueen
   down on dqueen sem1 (queensem) decrements it (1->0), unlocked to locked state, and the thread tqueen runs //queen works
   up on dqueen sem2 (kingsem) increments it (0->1) to unlock it, or wakes the waiting tking

*/
int kthread_function(void *data)
{
    struct kthr_data *pdata = (struct kthr_data*)data;
    while(1)
    {
        //down operation: if the count is greater than 0, decrement it and continue;
        //if the count is 0, the down operation on the semaphore
        //1. blocks the calling kernel thread
        //2. inserts its task structure into the wait queue of the semaphore
        //3. schedules another task
        down_interruptible(pdata->sem1); //interruptible sleep in sem->wait_list while sem1 is zero
        printk("%s\n", pdata->name);
        mdelay(499);
        msleep(1);
        up(pdata->sem2);
        //up operation: if sem->wait_list is empty, increment sem->count;
        //otherwise remove the first waiter from the semaphore queue and make that task runnable
        if(kthread_should_stop())
            break;
    }
    return 0;
}

struct task_struct *ts;

static int __init kthr_init(void)
{
    printk("kthread init called\n");
    //semaphore is an object consists of a counter and a queue of waiting tasks

    sema_init(&kingsem, 1); //unlocked state
    sema_init(&queensem, 0); //locked state
    dking.name = "king";
    dqueen.name = "queen";
    dking.sem1 = &kingsem;
    dking.sem2 = &queensem;
    dqueen.sem1 = &queensem;
    dqueen.sem2 = &kingsem;

    tking = kthread_run(kthread_function, &dking, "king");
    tqueen = kthread_run(kthread_function, &dqueen, "queen");
    return 0;
}

void __exit kthr_exit(void)
{
    printk("tking_stop called\n");
    kthread_stop(tking);
    printk("tqueen_stop called\n");
    kthread_stop(tqueen);
}

module_init(kthr_init);
module_exit(kthr_exit);
//output:
//king queen king queen ....
//Why do we synchronize? To find out, comment out the down and up lines:
//the print statements will no longer appear in an ordered, alternating sequence
 

LINUX KERNEL THREAD SYNCHRONIZATION WITH WAITQUEUE


// Kernel thread synchronization with wait queue
/*
This code post is useful to learn how kernel threads can be synchronized with wait queues

Theory:

Thread synchronization is based on events: each of the 2 threads uses a completion event to unblock the other thread

A blocked thread waiting for an event waits in a wait queue. When it receives the event, it becomes eligible to be run by the scheduler

A task in the blocked state expects some condition to become true (non-zero)

Wait queues in Linux are defined by a wait queue head, which is a list_head node (doubly linked list node) linked to wait_queue_t nodes, each of which holds a pointer to the waiting task and to its wake-up function

wait_event_interruptible puts the task into the wait queue

wake_up_interruptible wakes the task on the wakeup event


Code Explanation:

Initialization creates 2 kernel thread instances (tone, tzero) with the same thread function (kthread_function)
The kernel threads tone and tzero should alternately display their names, one and zero
A thread parameter structure (kthr_data) is associated with each thread instance
Thread parameters are
1. name
2. wait queue to synchronize the two threads
3. condition variable used with the wait queue
4. pointer to the other thread's parameters

What does kthread_function do?

1. conditionally block on the wait queue using wait_event_interruptible until the condition becomes true
2. output the name
3. unblock the other thread using wake_up_interruptible

tone is the first thread, with its condition variable initialized to true (1 - non-zero)
tzero is the thread with its condition variable initialized to false (0 - zero)

*/

 
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/kthread.h>
#include <linux/sched.h>
#include <linux/delay.h>

MODULE_LICENSE("Dual BSD/GPL");

struct kthr_data {
    const char *name;
    //wq<--wqh-->wq
    //wait_queue_head_t is the list head; the other nodes are wait_queue_t
    //nodes, each of which holds the waiting task and its wake-up function
    wait_queue_head_t thrdwq; // kernel thread waits on this queue
    int condition;
    struct kthr_data *datalink;
};

static struct kthr_data done, dzero;


int kthread_function(void* data)
{
    struct kthr_data *pdata = (struct kthr_data*)data; // this data depends on the kernel thread running "one"/"zero"
    while(1)
    {
        printk("wait(%s)", pdata->name);
        wait_event_interruptible(pdata->thrdwq, pdata->condition);
        pdata->condition = 0;
        printk("%s\n",pdata->name);
        mdelay(500);
        msleep(1);
        pdata->datalink->condition = 1;
        printk("wakeup (%s)\n", pdata->name);
        wake_up_interruptible(&pdata->datalink->thrdwq); //wake up the task waiting in the queue
        if(kthread_should_stop())
            break;
    }
    return 0;
}



struct task_struct *tone, *tzero;

int __init kthread_init(void)
{
    printk("\nkthread init");
    init_waitqueue_head(&done.thrdwq);
    init_waitqueue_head(&dzero.thrdwq);
    done.condition = 1;
    dzero.condition = 0;
    done.name = "one";
    dzero.name = "zero";
    done.datalink = &dzero;
    dzero.datalink = &done;

    tone = kthread_run(kthread_function, &done, "one");
    tzero = kthread_run(kthread_function, &dzero, "zero");
    return 0;
}

void __exit kthread_exit(void)
{
    printk("\nkernel exit thread");
    kthread_stop(tone);
    kthread_stop(tzero);
}

module_init(kthread_init);

module_exit(kthread_exit);

//output: one zero one zero ...
 



 

Tuesday 21 May 2013

MESSAGE SIGNALED INTERRUPT


I hope you have read my previous article "Interrupt journey from hardware to software".

Due to increasing pressure on chipset and processor packages to reduce pin count, the need for interrupt pins is expected to diminish over time. Devices, due to pin constraints, may implement messages to increase performance.

PCI Express endpoints use INTx emulation (in-band messages) instead of IRQ pin assertion. INTx emulation requires interrupt sharing among the devices connected to the same node (PCI bridge), while an MSI is unique (non-shared) and does not require BIOS configuration support. As a result, PCI Express technology requires MSI support for better interrupt performance.


INTERRUPT DELIVERY MECHANISM


Legacy PCI Interrupt Delivery

   

This mechanism supports devices that must use PCI-compatible interrupt signaling (i.e., INTA#, INTB#, INTC#, and INTD#) defined for the PCI bus. Legacy functions use one of the interrupt lines to signal an interrupt. An INTx# signal is asserted to request interrupt service and deasserted when the interrupt service routine accesses a device-specific register, thereby indicating that the interrupt is being serviced.


Native PCI Express Interrupt Delivery 


Native PCI Express devices use Message Signaled Interrupts. A Message Signaled Interrupt is not a PCI Express message; it is a simple memory write transaction. This write is distinguished from a normal memory write by its target address, which is reserved for MSI delivery.

           PCI Express and Legacy Interrupt Delivery




Message Signaled Interrupts (MSIs) are delivered to the Root Complex via memory write transactions. The MSI Capability register provides all the information that the device requires to signal MSIs. This register is set up by configuration software (PCI bus driver) and includes the following information:


  • Target memory address
  • Data Value to be written to the specified address location
  • The number of messages that can be encoded into the data



         MSI Capability Register



MSI Configuration Process


The following list specifies the steps taken by software (PCI bus driver) to configure MSI interrupts for a PCI Express device.

1. At startup time, the configuration software scans the PCI bus(es) (referred to as bus enumeration) and discovers devices (i.e., it performs configuration reads for valid Vendor IDs). On discovering a PCI express function, the configuration software reads the Capabilities List Pointer to obtain the location of the first Capability register within the chain of registers.

2. The software then searches the capability register sets until it discovers the MSI Capability register set (Capability ID of 05h).

3. Software assigns a dword-aligned memory address to the device's Message Address register. This is the destination address of the memory write used when delivering an interrupt request.

4. Software checks the Multiple Message Capable field in the device's Message Control register to determine how many event-specific messages the device would like assigned to it.

5. The software then allocates a number of messages equal to or less than what the device requested. At a minimum, one message will be allocated to the device.

6. The software writes the base message data pattern into the device's Message Data register.

7. Finally, the software sets the MSI Enable bit in the device's Message Control register, thereby enabling it to generate interrupts using MSI memory writes.






Memory Write Transaction (MSI):


When the device must generate an interrupt request, it writes the Message Data register contents to the memory address specified in its Message Address register. The header fields of the memory write packet need to be filled in accordingly.



MSI-X is an extension to MSI which supports additional vectors per function.

Reference: PCI Express System Architecture