(The original version is in Japanese and is still under translation. This manual is based on version 0.2, though the library's current version is 0.3, which adds SPMD and concurrency support.)
This is a library for writing parallel programs on UNIX OSes. With this library, you can write parallel programs more easily than by using libraries like pthread directly. As a practical example, bzip2 has been parallelized using this library.
Today, speeding up a single processor is becoming difficult, and using multiple processors is a popular way to achieve high performance. This is true not only for engineering workstations, but also for PCs and embedded processors.

However, it is not easy to write programs for parallel machines. Usually, programmers have to use libraries like pthread and manage locks and/or semaphores. This is not easy, and it tends to cause bugs that are quite difficult to debug, because the behavior of the program changes on every run.
This library offers a more intuitive and easier way to write parallel programs. Its core feature is the single-assignment synchronization variable Sync<T>.

For example, if "a" is declared as "Sync<int> a", you can apply operations like "a.read()" and "a.write(1)". The operation "a.write(1)" sets the contents of "a" to 1, and "a.read()" gets the contents of "a". You can do inter-process communication using this functionality.
Here, the operation "a.read()" stops (blocks) until the operation "a.write(1)" is executed. This is called dataflow synchronization. The write operation can be applied only once to the same variable (more exactly, write operations after the first cannot change the contents).
SyncList<T> is a list of Sync<T>, and SyncQueue<T> is a SyncList<T> whose length is limited.

WorkPool<T1,T2> supports multiple processes extracting "work" from a work pool.

In addition, this library supports SPMD-style programming, as well as functionality for concurrent programming such as timeouts, interrupts, and a list merger.
I will explain the usage using the samples.cc file in the samples directory. Here is the main function of samples.cc, with comments.
int main()
{
  pards_init();         // ...(main:1) Library initialization
  Sync<int> a, b, c;    // ...(main:2) Decl. of sync vars
  SPAWN(add(1,b,c));    // ...(main:3) fork add, wait for b, executed 3rd
  SPAWN(add(1,a,b));    // ...(main:4) fork add, wait for a, executed 2nd
  a.write(3);           // ...(main:5) executed 1st
  int v = c.read();     // ...(main:6) wait for add(1,b,c) of (main:3)
  printf("value = %d\n",v);
  pards_finalize();     // ...(main:7) Finalization of the library
}
In order to use the library, you need to call pards_init() (main:1). In addition, to finalize the library, you need to call pards_finalize() (main:7).

Synchronization variables are declared like Sync<int> a, b, c (main:2). In this case, these variables can contain values of type "int".

The function add(1,b,c) is SPAWNed at (main:3). This means that the function add(1,b,c) is forked as a process. (SPAWN is implemented as a macro.)
Here, the function add is defined as follows:
void add(int i, Sync<int> a, Sync<int> b)
{
  int val;
  val = i + a.read();  // ...(add:1) a.read() waits for a to be written
  b.write(val);        // ...(add:2) b.write writes the value
}
This function adds the 1st argument and the 2nd argument, and returns the result through the 3rd argument. The type of the 1st argument is plain "int", and that of the 2nd and 3rd arguments is Sync<int>.

"a.read()" in (add:1) blocks until the value of a is written. Once it is written, execution resumes and the value is obtained. The value is then added to the 1st argument and stored in the variable val.

In (add:2), val is written to b. This makes the processes waiting for b's value resume their execution.
Back to main. The 2nd argument of the add forked at (main:3) has not been written by any process yet. Therefore, this function blocks for a while.

Likewise, since the 2nd argument of the add forked at (main:4) has not been written at this point, this add also blocks.

Then, 3 is written to a at (main:5). This makes the add forked at (main:4) resume. After execution, 4 is written to b.

After the value is written to b, the add forked at (main:3) also resumes. After execution, 5 is written to c.

The value of c is read at (main:6). This also blocks until a value is written to c. Therefore, it waits for the add forked at (main:3) to finish.

As you can see, inter-process communication and synchronization between the processes forked by SPAWN can be realized using Sync<int> variables.
Here, writes after the first to the same variable cannot change the value. This kind of variable is called a single-assignment variable.

You can implement algorithms like first-come-first-served using this property.
As mentioned above, SPAWN is implemented as a macro that calls fork() internally.
Sync<T> uses System V IPC to realize inter-process communication; shared memory is allocated in pards_init().

In addition, System V IPC semaphores are used to realize blocking and resuming of processes and mutual exclusion for shared memory access.
A Sync<T> variable only stores a pointer into shared memory and the IDs of semaphores. Therefore, it can be passed by value to functions (as with the arguments of add in samples.cc). Of course, you can also pass these variables as pointers or references.

The important thing is that even if you modify a global variable in a SPAWNed function, it does not affect other processes, because we use fork instead of pthread. Changing global variables from threads is a typical source of bugs that are hard to fix, but this library does not cause such bugs. SPAWNed functions can still read global/local variables that were set before SPAWN, because fork() logically copies the whole memory space (to be exact, a page is copied only when it is written to).
If we use many synchronization variables or the program runs for a long time, we need to release resources (shared memory and semaphores). Since shared memory and semaphores are shared between multiple processes, it is dangerous to release them in the destructor; even if the process that writes a value to the synchronization variable no longer needs the resources, the process that reads the value still does. Therefore, in this library you basically release resources explicitly.

As for variables allocated on the stack, you need to call free(). Of course, free() should be called only when no other process is referring to the resources. Typically, there is one writer process and one reader process, and free() can be called just after the read has finished. An example of free() is in fib.cc in the samples directory.

If you allocate a synchronization variable using new, not only the value inside the variable but also the memory for Sync<T> itself is stored in the shared memory area. In this case, you can simply use delete to release both the shared resources and the memory for the synchronization variable.

The reason for this design is that I wanted to make it similar to that of SyncList<T>, which I will explain next.
SyncList<T> is used in the generator-consumer pattern. In this pattern, one process creates a list of values (the generator), and another process consumes these values (the consumer). By using different processes for generating and consuming, pipelined parallel processing becomes possible.
I will explain this using listsample.cc in the samples directory.
int main()
{
  pards_init();
  SyncList<int> *a;       // ...(main:1) declaration of the first cell of the list
  a = new SyncList<int>;  // ...(main:2) allocation of the list cell
  SPAWN(generator(a));    // ...(main:3) fork generator process
  SPAWN(consumer(a));     // ...(main:4) fork consumer process
  pards_finalize();
}
First, the first "cell" of the list is declared and allocated at (main:1) and (main:2). Then, the generator process and the consumer process are forked at (main:3) and (main:4). The first cell of the list is passed to both the generator process and the consumer process.
Then, let's see the definition of the generator process.
void generator(SyncList<int> *a)
{
  int i;
  SyncList<int> *current, *nxt;
  current = a;                 // ...(gen:1) assign the argument to current
  for(i = 0; i < 10; i++){
    current->write(i);         // ...(gen:2) write a value to the current list cell
    printf("writer:value = %d\n",i);
    nxt = new SyncList<int>;   // ...(gen:3) allocate a new list cell
    current->writecdr(nxt);    // ...(gen:4) set the allocated cell as cdr of the current cell
    current = nxt;             // ...(gen:5) make the allocated cell the current cell
    sleep(1);                  // ...(gen:6) "wait" to show the behavior
  }
  current->write(i);
  printf("writer:value = %d\n",i);
  current->writecdr(0);        // ...(gen:7) terminate the list with 0
}
The generator process creates a list and inserts values into it. As with Sync<T>, a value can be set in a list cell using write() (gen:2).
The next cell of the list is created using new at (gen:3). Then the cell is connected to the previous cell using writecdr() at (gen:4).
Here, a new cell must be created using "new"; do not connect a cell allocated on the stack. The consumer process cannot read the memory if the cell is on the stack; a cell allocated with new is stored in shared memory, so the consumer process can read it.

Because "new" of SyncList<T> has to allocate shared memory, I made "new" of Sync<T> allocate shared memory as well.

The list is created by iterating the above steps in the for loop. In order to show the behavior, a 1-second wait is inserted at the end of the loop (gen:6). The list is terminated with 0 (gen:7).
Then, let's see the definition of the consumer process.
void consumer(SyncList<int> *a)
{
  SyncList<int> *current, *prev;
  current = a;
  while(1){
    printf("reader:value = %d\n", current->read()); // ...(cons:1) read the value of the cell and print it
    prev = current;               // ...(cons:2) save the current cell
    current = current->readcdr(); // ...(cons:3) extract the cdr of the current cell and make it the current cell
    delete prev;                  // ...(cons:4) delete the used cell
    if(current == 0) break;       // ...(cons:5) check for termination
  }
}
The value of the cell is extracted and shown at (cons:1). This read blocks until the value is written, just like Sync<T>.

The current cell is saved at (cons:2). The cdr of the current cell is extracted and made the current cell at (cons:3). Like read(), readcdr() blocks until the cdr is written.

After the cdr is read, the previous cell is no longer needed, so it is deleted at (cons:4). Here, "delete" releases the memory in the shared memory area and releases the semaphores.
Lastly, termination is checked at (cons:5).
The output of this program should be like this:
writer:value = 0
reader:value = 0
writer:value = 1
reader:value = 1
writer:value = 2
reader:value = 2
...
The consumer process waits for the writes of the generator process. Therefore, the above output appears one pair of lines per second.
Since the list creation and consumption described above are typical patterns, I prepared abbreviated notations that reduce the amount of code.
Firstly, the operation "create a new list cell, and connect it to the current list cell" is described as follows:
nxt = new SyncList<int>;
current->writecdr(nxt);
current = nxt;
In order to describe this concisely, there is a create() member function that creates a new SyncList<T> cell, connects it to the target object, and returns the newly created cell. Using this member function, the above example can be written as follows:
current = current->create();
Now, the temporary variable nxt is no longer needed.
Next, the operation "extract the cdr of the current cell, make it the current cell, and delete the previous cell" is written as follows:
prev = current;
current = current->readcdr();
delete prev;
In order to describe this concisely, there is a release() member function that extracts the cdr, deletes the cell, and returns the cdr. Using this, the above example can be written as follows:
current = current->release();
Using these abbreviated notations, you can write programs concisely. An example that uses them is in listsample2.cc.

In the previous example, a "wait" was inserted on the generator's side. Using SyncList is fine if the generator's execution is the bottleneck. However, if the consumer is slower than the generator, the system might run short of resources, because the consumer releases them too slowly.

To avoid this problem, we need to block the generator's execution until the consumer's release is done. SyncQueue<T> provides this functionality.

SyncQueue<T> is almost the same as SyncList<T>, but its constructor takes the length of the "Queue" as an argument.
a = new SyncQueue<int>(2);
If a SyncQueue is declared like this, the system limits the number of cells connected as "cdr" to the queue to 2. If more cons cells are connected, the operation blocks. When cells connected to the queue are released by "delete" or "release", the count increases again, and any blocked operation resumes. This means that at most 3 cells can exist at a time.

Only the first cell requires the number in the constructor. Subsequent cells can be allocated in the same way as with SyncList<T>.

In addition, unlike SyncList<T>, a cell cannot be set as the cdr of multiple cells; the system detects this and reports an error.

An example of SyncQueue<T> is in queuesample.cc. It is listsample2.cc modified so that SyncList is replaced by SyncQueue<T> and the wait is on the consumer's side. The output of this program should look like this:
writer:value = 0
writer:value = 1
writer:value = 2
reader:value = 0
writer:value = 3
reader:value = 1
writer:value = 4
...
First, because the generator's side does not wait, 0, 1, and 2 are shown. Then, after the consumer shows 0 and releases the cell, the generator resumes and shows 3, and so on.
If you use only SyncList and SyncQueue, there may be cases where you need to call SPAWN very frequently. Process invocation is not that expensive on recent OSes, but you still might want to reduce the number of process invocations.

Therefore, I prepared a class supporting the pattern in which worker processes are invoked first (for example, one per processor) and then fetch their work from a "work pool".
I will explain how to use the class using workpoolsample.cc in the "samples" directory.
int main()
{
  pards_init();
  SyncQueue<int> *work = new SyncQueue<int>(2);    // ...(main:1) work queue
  SyncQueue<int> *output = new SyncQueue<int>(2);  // ...(main:2) output queue
  WorkPool<int,int> *pool = new WorkPool<int,int>(work,output); // ...(main:3) definition of WorkPool
  SPAWN(generator(work));  // ...(main:4) creation of work
  SPAWN(worker(1, pool));  // ...(main:5) fork worker1
  SPAWN(worker(2, pool));  // ...(main:6) fork worker2
  while(1){
    printf("%d...\n",output->read());            // ...(main:7) show output queue
    if((output = output->release()) == 0) break; // ...(main:8) get the next cell & check termination
  }
  pards_finalize();
}
First, the SyncQueue "work", which holds the work items, and the SyncQueue "output", which holds the outputs, are defined at (main:1) and (main:2). You can use SyncList instead of SyncQueue.

Next, "pool", whose type is WorkPool, is defined at (main:3). T1 and T2 of WorkPool<T1,T2> are the element type of the work SyncQueue (int) and the element type of the output SyncQueue (int). The constructor takes the work queue and the output queue as arguments.

Here, the SyncQueue (SyncList) for work and the SyncQueue (SyncList) for output are treated as a pair; when you get a work cell from the pool, you also get the corresponding output cell.
This enables us to get the output in order, even if the work is processed out of order by different processes.
At (main:5) and (main:6), worker processes are SPAWNed with the work pool as an argument. The other argument is the id of the worker.
(main:7), (main:8) show the result.
Next, let's see the definition of the worker.
void worker(int id, WorkPool<int,int> *workpool)
{
  while(1){
    WorkItem<int,int> item = workpool->getwork(); // ...(worker:1) get the work
    if(item.output == 0) break;  // ...(worker:2) check termination
    else{
      item.output->write(item.work*2);         // ...(worker:3) double the value and write it to the output
      printf("(%d by [%d])\n",item.work*2,id); // ...(worker:4) print the worker id
    }
  }
}
First, a work item is obtained from the WorkPool at (worker:1). The type of the work item is WorkItem. T1 and T2 of WorkItem<T1,T2> are the same as those of WorkPool<T1,T2>.

You can get a WorkItem by calling the getwork() member function. A WorkItem contains the work and the output cell for that work.
Here, inside getwork(), the release of used cells and the creation of new cells are handled automatically. Therefore, users of WorkPool don't have to worry about releasing or creating cells.
The variable item, whose type is WorkItem, has a member "output" that holds the output cell. You can check whether the work pool is terminated by checking whether it is 0 (worker:2).

It also has a member "work", whose type is T1 (in this case, int). At (worker:3) this value is doubled and written to the output cell.

At (worker:4), the value written to the output cell and the id of the worker that did the job are printed, so you can see that the work items are processed by multiple workers.
The output of this program should be like this:
(0 by [1])
(2 by [1])
0...
2...
(4 by [1])
(6 by [2])
4...
6...
(8 by [1])
(10 by [2])
8...
10...
The integers 0, 1, 2, ... are doubled and shown like "0... 2... 4...". In addition, which worker did each job is shown like (2 by [1]). The worker ids may change from run to run.
In the above example, the WorkPool variable was not deleted. It is difficult to decide when it is safe to delete it; if it is deleted just after all the output has been extracted, there may still be workers that have not reached the termination check, and they would cause errors by using the released resources.

To avoid this, you can pass the number of workers as a reference count to the constructor (as the last argument). Each worker then calls the release() member function after its termination check. This decrements the reference count and tells the system that the worker has terminated.

When delete is called on the WorkPool variable, this counter is checked; delete blocks until all the workers have terminated.
This example is written in workpoolsample.cc as comments.
If the counter is not specified in the constructor, you can delete the variable without calling release(). However, this is dangerous in general. You can also call release() even if the counter was not specified in the constructor.

If the WorkPool variable is allocated on the stack, you need to call free() to release the resources (same as with Sync). The mechanism above also applies in this case.

When parallelizing numerical programs, loops are the most common target. This can be done with SPAWN, but it requires writing a separate function, which is a burden on the programmer.
Therefore, the library supports a programming style called "SPMD".
SPMD stands for "Single Program, Multiple Data": there is only one program, but different processors work on different data.

MPI is a famous parallel library for SPMD. MPI works (mainly) on distributed-memory computers, where communication is done by message passing. All processors execute the same program, but by obtaining its own processor number, each processor can do different things.

SPMD support in PARDS is similar to MPI, but in PARDS communication is done through shared memory.

I will explain how to use this with spmdsample.cc in the samples directory. This program simply adds up the data in the array "a".
#define NP 3
...
int *a = (int*)pards_shmalloc(100*NP*sizeof(int)); // (1) input data
int *out = (int*)pards_shmalloc(NP*sizeof(int));   // (2) output data
for(int i = 0; i < 100 * NP; i++)                  // (3) initialize a
  a[i] = 1;
PBInfo* pbi = pards_begin_parallel(NP);            // (4) begin parallel block
int pno = pbi->getpno();                           // (5) get the process number
int *my_part = a + 100 * pno;                      // (6) calculate where to work
int sum = 0;
for(int i = 0; i < 100; i++)                       // (7) work on my part
  sum += my_part[i];
out[pno] = sum;                                    // (8) store the result in shared memory
pards_barrier(pbi);                                // (9) wait for all processes
int totalsum = 0;
if(pno == 0){                                      // (10) process #0 calculates the sum
  for(int i = 0; i < NP; i++)
    totalsum += out[i];
}
pards_end_parallel(pbi);                           // (11) end parallel block
...
First, (1) and (2) allocate arrays for the input data and the output data in shared memory, which is shared by all the processes.

Here, the pards_shmalloc function is used to explicitly allocate shared memory; pards_shmfree releases it.

At (3), the input array a is initialized with 1s.

SPMD processing starts at (4). From pards_begin_parallel to pards_end_parallel at (11), the processes (as many as the argument of pards_begin_parallel) do the same work.

pards_begin_parallel returns a PBInfo, which holds the information of the parallel block.

(5) returns the process number of the executing process, so that each process can do different work; PBInfo is used for this. Process numbers start at 0 and end at the number of processes minus 1.
(6) calculates the position to work on from the process number.

(7) computes the sum of the part of the array the process should work on. Then (8) stores the result in the shared memory allocated at (2).

(9) performs a "barrier": all processes wait until every process has reached this statement. After it, each process knows that the others have finished their work.

After the barrier at (9), (10) computes the total sum over all processes, which is done by process number 0.

pards_end_parallel at (11) ends the SPMD block. After this statement, the number of processes returns to 1.

SPMD functionality is effective when there are loops that manipulate large arrays and dominate the execution time. See also the matrix multiplication example in the samples directory.
You can use SPAWN inside a parallel block, but note that the SPAWN will be executed by all processes unless it is guarded by a branch on the process number.

You can also nest parallel blocks (pards_begin_parallel), but a nested block is likewise executed by all processes, which is hard to control. Especially if you use barriers, be sure that the correct set of processes takes part.
So far, we have discussed "parallel processing", whose purpose is to speed up programs by utilizing multiple processors. However, sometimes we need "concurrent processing", whose purpose is to express concurrency explicitly.

We need concurrent processing, for example, to express multiple I/O: a GUI application may wait for both input from the user and data from the network.

When writing such a program, we want "block and wait for input from the user" and "block and wait for input from the network" to be different processes (or threads).

(Of course, you can use something like "select", which waits on multiple inputs, but there are cases where programs using separate processes/threads are easier to understand.)
PARDS provides the functionality needed for such programs: a list merger, timeouts for blocking operations, and interrupts. They are explained in detail below.
Consider a GUI application that waits for both input from the user and data from the network. Waiting for these inputs may be done by different processes, and when input arrives from either one, another process acts according to that input.

To realize this, merger functionality for SyncList / SyncQueue is provided. It can be used for waiting on multiple inputs, like "select".

You can use this functionality through the class "Merger", explained below. The example is in mergersample.cc in the samples directory.
SyncQueue<Item> *first = new SyncQueue<Item>(2);  // (main:1) create the first cell
Merger<Item> *merger = new Merger<Item>(first,2); // (main:2) create merger with the first cell
SPAWN(generator(1,merger));  // (main:3) call generator1 with merger
SPAWN(generator(2,merger));  // (main:4) call generator2 with merger
SPAWN(consumer(first));      // (main:5) call consumer with the first cell
First, (main:1) creates the first cell. SyncQueue is used in this case, but you can use SyncList. "Item" is defined as follows:
class Item{
public:
  int val;
  int id;
};
The value of the item is in "val", and "id" holds generator's id.
Then (main:2) creates "merger", an instance of the Merger class. The first cell is passed as an argument of the constructor. The second argument is the number of processes that share the merger; this is used to release the resources the merger uses (as with WorkPool).

(main:3) and (main:4) call the generators with the created merger. The generators use the merger to add list cells to the SyncQueue list that starts at "first". These additions by generator1 and generator2 happen in non-deterministic order.

(main:5) calls "consumer" with the first cell. The consumer traverses the list created by generator1 and generator2.
Then let's see the definition of the generator.
void generator(int id, Merger<Item> *m)
{
  ...
  for(int i = 0; i < 3; i++){
    Item it;       // (generator:1) create Item
    it.val = i;    // (generator:2) set val of Item
    it.id = id;    // (generator:3) set id of Item
    m->put(it);    // (generator:4) put Item to the merger
  }
  m->release();    // (generator:5) release the merger
}
(generator:1) to (generator:3) create an Item and set its values.

Then, (generator:4) adds the created Item to the list by calling the merger's "put". This "put" creates a list cell and sets it as the cdr.

Other processes (in this case, the other generator) may call "put" on the same merger. In that case, the cells are added to the list in the order of the calls.

After creating the list, (generator:5) calls release to release the merger. After "release" has been called by all the processes (as many as set in the constructor of the merger), the merger's resources are released and the list is terminated with 0.
If you want to increase the number of processes that share the merger, you can use increase_referer().
Lastly, let's see the consumer.
void consumer(SyncQueue<Item> *l)
{
  SyncQueue<Item> *current = l;
  while(1){
    Item it = current->read();
    printf("val = %d, id = %d\n",it.val, it.id);
    current = current->release();
    if(current == 0) break;
  }
}
There is nothing special here: the consumer just traverses the list until termination and shows the values. The result will look something like this:
val = 0, id = 2
val = 0, id = 1
val = 1, id = 2
val = 1, id = 1
val = 2, id = 1
val = 2, id = 2
You can see that the generators' ids appear in arbitrary order.
In this example, I used the merger directly in the generators. But you may want to merge lists created by existing processes, if you already have such process definitions. In that case, you can create a merging process for each list. You can find this example in mergersample2.cc.

There are cases where you want a blocking operation like read() of Sync<T> to give up after some amount of time. For example, when waiting for data from a socket, you may want to show an error to the user if the operation does not complete in time.
To support such cases, there are timeout operations: timedread, timedreadcdr, timedrelease, timedwritecdr, timedcreate, and timedput.

These are the timeout versions of read, readcdr, release, writecdr, create, and put. The "struct timeval*" argument gives the time until the operation times out (if 0 is given, the operation does not time out).

The pards_status* argument is a pointer to an enum whose value is one of SUCCESS, TIMEOUT, or INTERRUPT. The programmer allocates the variable and passes its pointer to the timed operations, which set it: SUCCESS means the operation succeeded without timing out, TIMEOUT means the operation timed out, and INTERRUPT means there was an interrupt and the operation did not succeed. Interruption is explained later.
Note that the timeout versions of writecdr and create do not exist for SyncList, because these operations never block on SyncList. They can block on SyncQueue, so the timeout versions exist there.

timedput can be used only if a SyncQueue is given to the constructor of Merger (with SyncList, put does not block).
Samples of them are in timedreadsample.cc, timedlistsample.cc, timedqueuesample.cc, and timedputsample.cc in the samples directory.
You might also want to cancel operations outright, not only by timeout. For example, while waiting for data from a socket, the user may want to cancel the operation even within the timeout period.
To support such a case, the library has "interrupt" functionality.
Interrupt is functionality to send a signal to SPAWNed processes. Currently, SIGUSR1 is used as the signal.

In addition, you cannot get the process id from SPAWN. Therefore, a new macro ISPAWN(pid, func) is provided. The first argument is the variable to hold the PID (you can pass the variable directly, not a pointer, because ISPAWN is a macro). The type of the PID is pards_pid. Currently this is a typedef of pid_t, but a distinct type name is used for future extension.

You can send an interrupt to a process spawned by ISPAWN using pards_send_interrupt(pards_pid).

You can check whether an interrupt was sent by calling pards_is_interrupted(); it returns 1 if there was an interrupt, and 0 otherwise. pards_clear_interruption() clears the status.
If an interrupt arrives during a blocking operation, the behavior depends on the operation.

Normal operations like read, readcdr, release, writecdr, and create do not stop; it is guaranteed that they complete.

Timeout operations like timedread, timedreadcdr, timedrelease, timedwritecdr, and timedcreate stop, returning INTERRUPT as the pards_status.

In addition, operations that stop on interruption but have no timeout are also provided.

These operations do not time out, but they stop on interrupt. You can tell whether there was an interrupt or the operation completed by checking the pards_status: SUCCESS means the operation completed successfully, and INTERRUPT means it was stopped by an interruption.

If an interrupt arrives during a blocking system call, the system call stops. In that case, the return value of the system call usually becomes -1 and errno is set to EINTR (this depends on the system call).
The samples are intrreadsample.cc, intrlistsample.cc, intrqueuesample.cc and intrputsample.cc in the samples directory.