Presentation on

Multi-Threading & POSIX Thread APIs

By DIPAK K. SINGH
dipak123@gmail.com / dipak123@yahoo.com

 
Version 1.0 dated 18 January 2013.

Contents

A. Concepts of MultiThreading

  1. Life Cycle of a thread
  2. Resource sharing by threads
  3. Stack Memory
  4. Synchronisation
  5. Operating System aware threads

B. POSIX Thread

  1. POSIX Thread APIs
  2. Thread start and end
  3. Thread Termination
  4. Thread Cancellation
  5. Process Termination
  6. Thread Specific Data
  7. Thread Mutex
  8. Compilation & Linking
  9. Debugging using GNU gdb

Section A

Multi-Threading

Why Multi-Threading ?

Sequential applications, in which one instruction executes after another, are natural and easy to understand. Then why go for multi-threading? A few reasons are:

  1. Increase Application's Responsiveness

    Example: A GUI application keeps interacting with the user while data is being fetched from a backend, such as a web page being loaded in a browser.

  2. Better structuring of application

    Example: A file server in which individual requests are served by separate threads.

  3. Take advantage of multiple CPUs and Cores

    Example: Image rendering software can use multiple CPUs/cores on different parts of an image at the same time to get the result quickly.

Life Cycle of Thread

The life cycle of a thread is very similar to that of a sequential (single-threaded) application. The fundamental states of a thread are the same as those of a single-threaded application.

State        Description
Created/New  First state of a thread. The thread has just been created. It will start running as per scheduling policy.
Running      Thread is running.
Ready        Ready to run when a microprocessor becomes available.
Waiting      Thread is waiting for a resource such as data from a file.
Suspend      Thread has been suspended, say by a debugger.
Exit         Last state of a thread. Thread exits after performing its job or is terminated prematurely.

Resource Sharing

All threads of an application see the same memory image. Therefore, a context switch between threads of the same application is very fast. All other resources of the application, such as open files and sockets, are also visible to all threads.

Although all threads see the entire memory of the application, some data is kept local to each thread, by convention, to enable an independent stream of execution. Such data includes the thread id, stack, signal mask etc.

Stack Memory

The memory image of an application is divided into four segments: Text, Data, Heap and Stack.

In a multi-threaded application, a separate stack is created for each thread.

Stack Memory - Why separate stack?

A frame for each function call is pushed onto the stack and popped when the function returns. If multiple threads pushed frames onto the same stack, frames of different threads would interleave, and popping a frame would depend on what the other threads had pushed. Therefore, each thread gets its own stack.

Synchronisation

Synchronisation enables multiple threads to coordinate their activities. Synchronisation is required for

   1. Data access and
   2. Thread synchronisation


Synchronised Data Access

Each mechanism is listed with its description and whether pthread supports it.

1. Mutex lock: Mutually exclusive lock. pthread: Yes
2. Read-write lock: Allows multiple concurrent reads or one exclusive write. pthread: Yes
3. Condition variable: A thread goes to sleep if the condition is false. It is awakened by another thread when the condition becomes true. pthread: Yes
4. Synchronised object: Only one thread can access the object at a time. Usually implemented by the language. pthread: No

Thread Synchronisation

Each mechanism is listed with its description and whether pthread supports it.

1. Join: A thread waits for another thread to terminate, then moves ahead. It is similar to the UNIX wait() function, which is used by a parent process to wait for a child process to terminate. pthread: Yes
2. Barrier: All threads wait for the other threads at the barrier point. Only when all threads have arrived at the barrier point do they all move ahead. A barrier is a point defined in the source code. pthread: Yes


The pthread part of this presentation contains examples of mutex (synchronised access to data using the pthread_mutex APIs) and join (thread joining using pthread_join()).

Operating System Aware Threads

Programmers create threads in an application. Threads can be implemented in a library such that the operating system is unaware of them. In such a situation, the application cannot use multiple cores/CPUs.


The trend nowadays is to map the programmer's threads to kernel level threads. The operating system allocates one or more kernel threads to an application depending on the active threads created by the programmer.

Kernel level threads are scheduled on CPUs/cores to run the application. Parallel processing is possible in this model.

Operating System Aware Threads - Mapping

Mapping between the programmer's threads and kernel level threads can take place in 1x1, MxN or any other manner. 1x1 mapping is the most common, as shown in the first and second processes from the left in the diagram.

Section B

POSIX Thread APIs

POSIX Thread

POSIX Thread, commonly known as pthread, is a very popular C library which provides thread support to programming languages C and C++.

C++ programmers prefer boost::thread which is mainly a C++ wrapper over pthreads.

C++11 supports threads as part of the standard library (std::thread). C++11 threads are gaining popularity but, being very new (released in 2011), their installed base is still much smaller than that of pthread. Boost and C++11 are out of the scope of this presentation.

pthread refers to the thread section of the POSIX (Portable Operating System Interface) specification by The Open Group (http://opengroup.org). It has been accepted by IEEE as the 1003.1 2013 specification. The specification does not include an implementation. Platform vendors and third parties are free to implement it.

pthread is a C library.


This part of the presentation covers only the commonly used pthread APIs and important concepts. You will be able to write a multi-threaded application by the end of this presentation, but refer to additional study material to gain expertise in pthread.

Thread Start and End

A thread is created by pthread_create(). A function is passed to pthread_create() to act as the start routine (starting point) of the thread.


int pthread_create(pthread_t *thread,
	const pthread_attr_t *attr,
	void *(*start_routine)(void*),
	void *arg);
					

A thread ends when it returns from start_routine, calls pthread_exit() to terminate itself, or is cancelled by another thread.

A thread can wait for another thread to end by calling pthread_join().

Thread Termination

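A thread terminates in any one of three ways. The exit status of the thread depends on how it terminated.

1. pthread_exit() is called. Exit status: argument to pthread_exit()
2. The thread returns from its start routine. Exit status: return value
3. The thread is cancelled. Exit status: PTHREAD_CANCELED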

When a process starts, a thread is automatically created for main(). This thread is called the Main Thread. The main thread can also be terminated by pthread_exit() or cancelled. However, a return from main() terminates the process, not just the main thread.

The exit status of a thread can be captured by pthread_join(pthread_t thread, void **exit_status_ptr).

It is possible to register cleanup functions for a thread. The registered functions are called as part of thread termination.

Process Termination

A multi-threaded process terminates in any one of four ways. The exit status of the process depends on how it terminated.

1. Return from main(). Exit status: return value from main()
2. exit() called from any thread. Exit status: argument to exit()
3. All threads, including main, terminate. Exit status: always zero
4. Killed by the user or the operating system. Examples are kill -9 pid, segv etc. This is a case of abnormal termination. Exit status: not applicable

Point 3 is the case added by pthread. The other three cases are independent of pthread and apply to both sequential and multi-threaded applications in exactly the same manner.

Thread Cancellation

A thread can be cancelled by another thread of the same process. The recipient thread is cancelled immediately or at a Cancellation Point, as per setup.


int pthread_setcanceltype(int type, int *oldtype);

Values of type are:
     PTHREAD_CANCEL_DEFERRED ( cancel at cancellation point) and
     PTHREAD_CANCEL_ASYNCHRONOUS ( cancel immediately).

Cancellation can be disabled too.


int pthread_setcancelstate(int state, int *oldstate);

Values of state are:
     PTHREAD_CANCEL_DISABLE and
     PTHREAD_CANCEL_ENABLE .
In the PTHREAD_CANCEL_DISABLE state, a cancellation request that is received remains pending until cancelability is enabled.

Thread Cancellation Points

Cancellation Points are points in code where cancellation can take place. They are:

  1. Some standard library calls, not all. An example is fopen()
  2. Some pthread library calls, not all. An example is pthread_join()
  3. Explicit call of pthread_testcancel()

Cancellation points ensure that a thread does not get terminated abruptly. When a thread is cancelled, all regular cleanups are performed as if the thread had terminated normally.

Thread Specific Data

Thread specific data (TSD) is global data that is private to each thread. A value changed in the TSD by one thread does not affect the TSD of other threads.

How to use?

A variable of the pthread data type pthread_key_t is created and initialised once with pthread_key_create(). It is a regular variable which must be visible to all threads. The TSD is associated with the pthread_key_t variable. Look at the example below.


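#include <pthread.h>

pthread_key_t thSpecificGlobal;

// Call once, before any thread uses the key.
// createKey is a helper name added for illustration.
void createKey(void) {
	pthread_key_create(&thSpecificGlobal, NULL);
}

void setValue(char *str) {
	// Each thread associates its value
	pthread_setspecific(thSpecificGlobal, str);
}

char* getValue(void) {
	// Value saved by this thread is returned.
	return (char*)pthread_getspecific(thSpecificGlobal);
}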

Usually the pthread_key_t variable is global, but that is not mandatory.

pthread mutex

pthread mutex can be used for synchronised access of shared data. Only one thread operates on protected data at a time.

Initialisation of mutex


#include <pthread.h>

pthread_mutex_t mutex_obj;
int main() {
	pthread_mutex_init(&mutex_obj, NULL); // One time init, NULL means default attributes
	... // use mutex
	pthread_mutex_destroy(&mutex_obj); // Destroy at end
}
					

Use of mutex for synchronised access


int update_record() {
	pthread_mutex_lock(&mutex_obj);     // Request mutex
	/* Only one thread runs at a time
	 * in this part of the code */
	pthread_mutex_unlock(&mutex_obj); // Release mutex
	return 0;
}
					

Compilation & Linking

Compiler: Since pthread is a library, any C/C++ compiler will work.

Library: The pthread library comes preinstalled on almost all UNIX systems, Mac OS and other operating systems. If it is missing, you will have to find and install it.

Compilation:


cc simple_example.c -lpthread
					

Note that the pthread library must be named on the command line: -lpthread .
The above step has been verified on Linux 2.6, Solaris 8, AIX 5.3 and Mac OS 10.9.4 .

Example Code

simple_example.c


/*
 * Simple Thread Example
 * It just creates two threads, prints messages, then exits.
 *
 * APIs used are:
 * 1. pthread_create
 * 2. pthread_attr_init
 * 3. pthread_join
 */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <pthread.h>

#define RETVAL_CHECK(retval,func_name) if(retval) {printf("Function '%s' returned error value %d\n", func_name, retval);}

/*
 * Start routine for the thread.
 * This is thread's starting point. Like main() for process.
 */
void *thread_start_routine( void *ptr )
{
    char *threadName = (char*)ptr;

    printf("%-15s: Called\n", threadName);
    sleep(10);
    printf("%-15s: Completed\n", threadName);

    return 0;
}

/*
 * Main Function
 */
int main()
{
    pthread_t thread1, thread2;
    pthread_attr_t tattr;
    int  retval;

    /* Create Thread with default behavior - null attributes object used */
    retval = pthread_create(&thread1, NULL, thread_start_routine, (void*) "First Thread");
    RETVAL_CHECK(retval,"pthread_create");

    /* Initialized by default thread attributes */
    retval = pthread_attr_init(&tattr);
    RETVAL_CHECK(retval,"pthread_attr_init");
    
    /* Create Thread with default behavior -  attributes object with default setup */
    retval = pthread_create(&thread2, &tattr, thread_start_routine, (void*) "Second Thread");
    RETVAL_CHECK(retval,"pthread_create");


    pthread_join( thread1, NULL);
    sleep(2);
    pthread_join( thread2, NULL); 

    return 0;
}
					

Another example max_number_in_array.c

Linux ps

Kernel level threads created for the multi-threaded process can be seen in Linux. Specify the option '-L' to the UNIX command 'ps' .


$ ps -Lf -u dipak
UID    PID PPID  LWP  C NLWP STIME TTY     CMD
dipak  190  189  190  0    1 Apr20 pts/5   /bin/bash
dipak  757  615  757  0    3 13:36 pts/7   simple_example
dipak  757  615  758  0    3 13:36 pts/7   simple_example
dipak  757  615  759  0    3 13:36 pts/7   simple_example
dipak  760  477  760  1    1 13:36 pts/4   ps -fL -u
$
					

For process 757, three kernel level threads were created. Their ids are 757, 758 and 759, shown in the LWP column.

LWP stands for light weight process, which can be considered equivalent to a kernel level thread in the context of multi-threaded applications.


LWP details are available at /proc/<LWP Id>/ .

Debugging using GNU gdb

The debugger must be thread aware. GNU gdb is thread aware. gdb attaches to the process as usual but focuses on one thread at a time.

Display all threads of the process


(gdb) info threads
   2 process 350 thread 0x20f 0x00001c81 ...
 * 1 process 350 local thread 0xf03 main () at simple_example.c:84
 (gdb)
					

The thread in focus is marked by the sign * . To change focus to another thread, call


(gdb) thread 2
[Switching to thread 2 ( process 350 thread 0x20f 0x00001c81)]
0x00001c81 in thread_start_routine ...  simple_example.c:179
(gdb) 
					

Operations such as viewing code, step and next are performed on the thread in focus.

gdb stops and starts the process, not a thread. Even in the case of next, the process starts and then stops when the next line is reached by the thread in focus. Process start and stop means that all threads are started and stopped.


The thread id (1, 2) printed by gdb is a unique id given to each thread by gdb for tracking threads easily.

References

  1. The Open Group Base Specifications Issue 7 / IEEE Std 1003.1, 2013 Edition at pubs.opengroup.org . Login required.

  2. Multithreaded Programming Guide. Primarily for Solaris but useful on other platforms too. It covers both pthread and native Solaris threads. docs.oracle.com .

  3. PThreads Primer - A Guide to Multithreaded Programming. PDF book by Bil Lewis & Daniel J. Berg. Published long back but still quite useful. POSIXMultithreadProgrammingPrimer.pdf

  4. Tutorial on Java - Multithreading www.tutorialspoint.com .

  5. C++11 Concurrency, Part 1 www.youtube.com/watch?v=80ifzK3b8QQ ( 1st of series of 9 videos. 20-30 minutes each video ).

End of the Presentation
on
Multi-Threading


For any feedback or clarification, contact dipak123@gmail.com or dipak123@yahoo.com .