August 26, 2023

Parallel Programming in SystemVerilog: A Deep Dive into Fork and Join

  • In SystemVerilog, a thread (or process) is a segment of code that executes independently of other threads.
  • During verification, we often need to run several tasks at the same time. SystemVerilog provides the “fork…join” construct for this: every statement placed inside it is launched as its own thread, and those threads run concurrently.
  • To group several statements into a single thread, enclose them in a “begin…end” block. Without this block, each statement inside the fork is a separate process.
  • SYNTAX:

fork

// Thread 1
// Thread 2
// Thread 3

join

  • There are three ways to use the fork…join construct in SystemVerilog:
  1. fork join: the parent waits until all processes inside the fork…join block have completed before continuing.
  2. fork join_any: the parent waits until at least one process inside the fork…join_any block has completed before continuing.
  3. fork join_none: the parent does not wait for any process inside the fork…join_none block to complete, or even to start; it continues immediately.
  • The sections below walk through each of the fork join variants in turn.
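Before going through the variants one by one, the difference between them can be sketched side by side. This is only a minimal illustration: the module name tb_fork_variants_sketch is made up for this sketch, and it assumes each initial block starts at time 0 with two threads that take 10 and 20 time units.

```systemverilog
// Hypothetical sketch (not from the original examples): the same two
// threads are forked three times, once per join variant.
module tb_fork_variants_sketch;

  initial begin
    fork
      #10;
      #20;
    join       // resumes when BOTH threads are done
    $display("join      resumed at time = %0t", $time);  // time 20
  end

  initial begin
    fork
      #10;
      #20;
    join_any   // resumes when the FIRST thread is done
    $display("join_any  resumed at time = %0t", $time);  // time 10
  end

  initial begin
    fork
      #10;
      #20;
    join_none  // resumes immediately, threads run in the background
    $display("join_none resumed at time = %0t", $time);  // time 0
  end

endmodule
```

Because the three messages are printed at different simulation times, they appear in the order join_none, join_any, join.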

1. fork join:

Here, all processes begin simultaneously, and the join statement will wait for all processes to complete.

Below is a simple example of how to implement a fork join in code:

fork

//Thread A;
//Thread B;
//Thread C;

join

//Thread D;

  • Here, three threads, A, B, and C, are initiated using fork join. The join statement ensures that Thread D will only execute after all three threads inside the fork join block have completed their tasks.
  • This process can be visualized in the following diagram, which illustrates how these threads are executed over time:
  • When statements are enclosed within a begin…end block inside a fork join, the block as a whole forms a single thread: it starts in parallel with the other threads, but the statements inside it execute sequentially. Statements placed directly inside the fork join, without a begin…end block, each run as their own parallel thread.

Consider the below example:

module tb_fork_join;

  initial begin
    fork
      begin // Thread A
        $display("Thread A has started at time = %0t", $time);
        #10;
        $display("Thread A has completed at time = %0t", $time);
      end

      begin // Thread B
        $display("Thread B has started at time = %0t", $time);
        #20;
        $display("Thread B has completed at time = %0t", $time);
      end

      begin // Thread C
        $display("Thread C has started at time = %0t", $time);
        #30;
        $display("Thread C has completed at time = %0t", $time);
      end
    join

    // Runs only after Threads A, B, and C have all finished
    $display("Fork join has completed at time = %0t", $time);
  end
endmodule

OUTPUT:

Thread A has started at time = 0
Thread B has started at time = 0
Thread C has started at time = 0
Thread A has completed at time = 10
Thread B has completed at time = 20
Thread C has completed at time = 30
Fork join has completed at time = 30

  • Now let us see how the above code works:
  1. In this example, we have created a module called tb_fork_join and added three threads to it: Thread A, Thread B, and Thread C.
  2. Each thread begins simultaneously at time 0ns. Here, each thread is enclosed within a begin…end block. In the next example, we’ll explore how the code behaves when threads are placed outside the begin…end block.
  3. All three threads start at the same time, but they run for different durations: Thread A completes at 10ns, Thread B at 20ns, and Thread C at 30ns.
  4. Consequently, the output first shows all three threads starting at 0ns, followed by Thread A completing at 10ns, then Thread B at 20ns, and finally Thread C at 30ns.
  5. Once all threads inside the fork join block have completed their tasks, the code after the join block will execute. Therefore, the final output indicates that the ‘Fork join has completed’ at time = 30ns.
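In a real testbench the threads are usually task calls rather than inline blocks. The sketch below repeats the example above with a helper task; the task name run_for and its arguments are hypothetical and only serve the illustration. The task is declared automatic so that each concurrent call gets its own copy of the arguments.

```systemverilog
// Hypothetical sketch: the three threads from the example above,
// expressed as three concurrent calls to one helper task.
module tb_fork_tasks_sketch;

  // 'automatic' gives every call its own 'name' and 'delay' storage,
  // which matters because the calls overlap in time.
  task automatic run_for(string name, int delay);
    $display("%s has started at time = %0t", name, $time);
    #(delay);
    $display("%s has completed at time = %0t", name, $time);
  endtask

  initial begin
    fork
      run_for("Thread A", 10);
      run_for("Thread B", 20);
      run_for("Thread C", 30);
    join
    // Reached at time 30, once the slowest call has returned
    $display("Fork join has completed at time = %0t", $time);
  end

endmodule
```

Each task call is a single statement inside the fork, so it forms its own thread without needing a begin…end block.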
  • Let’s explore another example to better understand how statements inside a fork join behave with and without a begin…end block.

module tb_fork_join;
  initial begin

    #1 $display("This statement is outside the fork join and starts at 1ns delay which is at time = %0t", $time);

    fork

      // Thread 1: a single statement
      #6 $display("This statement is inside the fork join and starts at 6ns delay from the initial delay which is at time 1ns+6ns= %0t", $time);

      // Thread 2: the whole begin...end block, executed sequentially
      begin
        #3 $display("This statement is inside the begin end block and start at 3ns delay from the initial delay which is at time 1ns +3ns =%0t", $time);

        #5 $display("This statement is the 2nd statement inside the begin end block and start at 5ns delay from the previous delay which is at time 1ns +3ns +5ns=%0t", $time);
      end

      // Thread 3: a single statement
      #10 $display("This statement is inside the fork join and start at 10ns delay from the initial delay which is at time 1ns + 10ns=%0t", $time);
    join

    $display("This statement is outside the fork join and will get executed after all statements in fork join are executed which is at time=%0t", $time);

  end
endmodule

OUTPUT:

This statement is outside the fork join and starts at 1ns delay which is at time = 1
This statement is inside the begin end block and start at 3ns delay from the initial delay which is at time 1ns +3ns =4
This statement is inside the fork join and starts at 6ns delay from the initial delay which is at time 1ns+6ns= 7
This statement is the 2nd statement inside the begin end block and start at 5ns delay from the previous delay which is at time 1ns +3ns +5ns=9
This statement is inside the fork join and start at 10ns delay from the initial delay which is at time 1ns + 10ns=11
This statement is outside the fork join and will get executed after all statements in fork join are executed which is at time=11

  • In this example, we have a fork join construct that contains both statements within and outside a begin…end block. Let’s break down how these statements will execute:
  1. Execution begins at 1ns because the first $display, which sits before the fork join block, has a #1 delay.
  2. Within the fork join block, there are two standalone $display statements and one begin…end block. Each of the three is treated as a separate process, and all three start at 1ns.
  3. Each process counts its delay from the moment the fork starts (1ns), so the standalone statements print at 7ns and 11ns. The statements within the begin…end block execute sequentially, so they print at 4ns and then at 9ns.
  4. After all the processes inside the fork join block have finished (at 11ns), the statement after the join executes.

The key takeaway is that statements outside a begin…end block within a fork join are executed in parallel, while statements within a begin…end block are executed sequentially.
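The effect of grouping can also be seen by running the same two delayed statements twice, once directly inside the fork and once wrapped in a begin…end block. This is a small sketch written for this comparison; the module name and messages are hypothetical.

```systemverilog
// Hypothetical sketch: identical delays, different grouping.
module tb_fork_grouping_sketch;
  initial begin

    // Two separate threads: the delays overlap.
    fork
      #3 $display("parallel: first statement at time = %0t", $time);   // 3
      #5 $display("parallel: second statement at time = %0t", $time);  // 5
    join
    $display("parallel fork finished at time = %0t", $time);           // 5

    // One thread: the delays add up (starting from time 5).
    fork
      begin
        #3 $display("sequential: first statement at time = %0t", $time);   // 8
        #5 $display("sequential: second statement at time = %0t", $time);  // 13
      end
    join
    $display("sequential fork finished at time = %0t", $time);             // 13

  end
endmodule
```

The first fork takes 5 time units (the longer of the two delays), while the second takes 8 (the sum of both).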

  • Nested fork join

Nested fork join constructs can be used to manage multiple levels of parallelism in SystemVerilog. Here are a few examples of how nested fork join works:

Example 1:

module tb_fork_join;
  initial begin

    $display("Thread - 1 : Fork join will start at: %0t", $time);

    fork
      // Inner fork: one thread of the outer fork
      fork
        #30 $display("Thread - 2 : This statement is inside the nested fork and will execute at: %0t", $time);
      join

      // Second thread of the outer fork
      #10 $display("Thread - 3 : This statement is outside the nested fork and will execute at: %0t", $time);
    join

    $display("Thread - 4 : This statement is outside the fork join block and will execute at last at time: %0t", $time);
  end
endmodule

OUTPUT:

Thread - 1 : Fork join will start at: 0
Thread - 3 : This statement is outside the nested fork and will execute at: 10
Thread - 2 : This statement is inside the nested fork and will execute at: 30
Thread - 4 : This statement is outside the fork join block and will execute at last at time: 30

Example 2:

module tb_fork_join;
  initial begin

    $display("Thread - 1 : Fork join will start at time= %0t", $time);

    fork
      // Inner fork: one thread of the outer fork, with two threads of its own
      fork
        #30 $display("Thread - 2 : This statement is inside the nested fork and will execute at time 0+30= %0t", $time);

        begin
          #5 $display("Thread - 3 : This statement is inside the nested fork and will execute at time 0+5= %0t", $time);
          #10 $display("Thread - 4 : This statement is inside the nested fork and will execute at time 0+5+10= %0t", $time);
        end
      join

      // Second thread of the outer fork
      #10 $display("Thread - 5 : This statement is outside the nested fork and will execute at time 0+10= %0t", $time);

      // Third thread of the outer fork
      begin
        #10 $display("Thread - 6 : This statement is inside the begin end block of the outer fork and will execute at time 0+10= %0t", $time);
        #20 $display("Thread - 7 : This statement is inside the begin end block of the outer fork and will execute at time 0+10+20= %0t", $time);
      end
    join

    $display("Thread - 8 : This statement is outside the fork join block and will execute at last at time= %0t", $time);

  end
endmodule

OUTPUT:

Thread - 1 : Fork join will start at time= 0
Thread - 3 : This statement is inside the nested fork and will execute at time 0+5= 5
Thread - 6 : This statement is inside the begin end block of the outer fork and will execute at time 0+10= 10
Thread - 5 : This statement is outside the nested fork and will execute at time 0+10= 10
Thread - 4 : This statement is inside the nested fork and will execute at time 0+5+10= 15
Thread - 2 : This statement is inside the nested fork and will execute at time 0+30= 30
Thread - 7 : This statement is inside the begin end block of the outer fork and will execute at time 0+10+20= 30
Thread - 8 : This statement is outside the fork join block and will execute at last at time= 30

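These examples demonstrate how you can use nested fork join constructs to manage parallelism at different levels within your SystemVerilog code. Note that in Example 2 two pairs of statements print in the same time step (at 10ns and at 30ns); SystemVerilog does not define the order of prints within one time step, so that order may differ between simulators.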

2. fork join_any

  • Here, the parent process waits until at least one process inside the fork…join_any block has completed before continuing. The remaining processes keep running in the background.
  • Below is a simple example of how to implement a fork join_any in code:

fork

//Thread A;
//Thread B;
//Thread C;

join_any

//Thread D;

  • Here, three threads, A, B, and C, are initiated using fork join_any. The join_any statement ensures that Thread D executes as soon as any one of the three threads inside the fork join_any block has completed its task.
  • This process can be visualized in the following diagram, which illustrates how these threads are executed over time:
  • Consider the example below to understand how fork join_any works:

module tb_fork_join_any;
  initial begin

    #1 $display("This statement is outside the fork join_any and starts at 1ns delay which is at time = %0t", $time);

    fork

      // Thread 1: completes at 7ns, which releases join_any
      #6 $display("This statement is inside the fork join_any and starts at 6ns delay from the initial delay which is at time 1ns+6ns= %0t", $time);

      // Thread 2: completes only after its second statement, at 9ns
      begin
        #3 $display("This statement is inside the begin end block and start at 3ns delay from the initial delay which is at time 1ns +3ns =%0t", $time);

        #5 $display("This statement is the 2nd statement inside the begin end block and start at 5ns delay from the previous delay which is at time 1ns +3ns +5ns=%0t", $time);
      end

      // Thread 3: completes at 11ns
      #10 $display("This statement is inside the fork join_any and start at 10ns delay from the initial delay which is at time 1ns + 10ns=%0t", $time);
    join_any

    $display("This statement is outside the fork join_any and will get executed after at least one statement in fork join_any is executed which is at time=%0t", $time);

  end
endmodule

OUTPUT:

This statement is outside the fork join_any and starts at 1ns delay which is at time = 1
This statement is inside the begin end block and start at 3ns delay from the initial delay which is at time 1ns +3ns =4
This statement is inside the fork join_any and starts at 6ns delay from the initial delay which is at time 1ns+6ns= 7
This statement is outside the fork join_any and will get executed after at least one statement in fork join_any is executed which is at time=7
This statement is the 2nd statement inside the begin end block and start at 5ns delay from the previous delay which is at time 1ns +3ns +5ns=9
This statement is inside the fork join_any and start at 10ns delay from the initial delay which is at time 1ns + 10ns=11

Here, the output shows the first statement executing at 1ns, followed by the first statement inside the begin…end block at 4ns. That print does not complete the begin…end thread, which still has a second statement to run. The first thread to complete is the standalone statement with the #6 delay, at 7ns, so the statement after join_any executes at 7ns. The remaining threads keep running and print at 9ns and 11ns. This is how fork join_any works.
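A common use of join_any is a timeout: one thread waits for some activity, a second thread acts as a watchdog, and whichever finishes first releases the parent. The sketch below is an illustration under assumptions, not part of the original examples. The block names and the delays of 50 and 20 are made up, and the #50 delay merely stands in for real DUT activity. It uses the standard disable fork statement to stop the thread that is still running.

```systemverilog
// Hypothetical sketch: join_any used as a timeout, followed by
// 'disable fork' to kill the thread that lost the race.
module tb_join_any_timeout_sketch;
  initial begin
    fork
      begin : wait_for_response   // placeholder for real DUT activity
        #50;
        $display("Response arrived at time = %0t", $time);
      end
      begin : watchdog
        #20;
        $display("Watchdog expired at time = %0t", $time);   // time 20
      end
    join_any

    // The watchdog finished first; stop the still-running response thread.
    disable fork;
    $display("Parent continues at time = %0t", $time);        // time 20
  end
endmodule
```

In this sketch the "Response arrived" message never prints, because that thread is terminated at time 20. Keep in mind that disable fork terminates every child process spawned by the calling process, not only those of the most recent fork.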

  • Below is an example of how nested fork join_any works:

module tb_fork_join_any;
  initial begin

    $display("Thread - 1 : Fork join_any will start at: %0t", $time);

    fork
      // Inner fork: this thread of the outer fork completes at 30ns
      fork
        #30 $display("Thread - 2 : This statement is inside the nested fork and will execute at: %0t", $time);
      join_any

      // Second thread of the outer fork: completes at 10ns and releases the outer join_any
      #10 $display("Thread - 3 : This statement is outside the nested fork and will execute at: %0t", $time);
    join_any

    $display("Thread - 4 : This statement is outside the fork join_any block and will execute at last at time: %0t", $time);
  end
endmodule

OUTPUT:

Thread - 1 : Fork join_any will start at: 0
Thread - 3 : This statement is outside the nested fork and will execute at: 10
Thread - 4 : This statement is outside the fork join_any block and will execute at last at time: 10
Thread - 2 : This statement is inside the nested fork and will execute at: 30

3. fork join_none

  • Here, the statement after fork join_none executes immediately; the parent does not wait for any process inside the fork join_none block to start or finish.
  • Below is a simple example of how to implement a fork join_none in code:

fork

//Thread A;
//Thread B;
//Thread C;

join_none

//Thread D;

  • Here, three threads, A, B, and C, are initiated using fork join_none. Thread D executes immediately after the join_none statement, without waiting for any of the three threads inside the fork join_none block to complete their tasks.
  • This process can be visualized in the following diagram, which illustrates how these threads are executed over time:
  • Consider the example below to understand how fork join_none works:

module tb_fork_join_none;
  initial begin

    #1 $display("This statement is outside the fork join_none and starts at 1ns delay which is at time = %0t", $time);

    fork

      // Thread 1: a single statement
      #6 $display("This statement is inside the fork join_none and starts at 6ns delay from the initial delay which is at time 1ns+6ns= %0t", $time);

      // Thread 2: the whole begin...end block, executed sequentially
      begin
        #3 $display("This statement is inside the begin end block and start at 3ns delay from the initial delay which is at time 1ns +3ns =%0t", $time);

        #5 $display("This statement is the 2nd statement inside the begin end block and start at 5ns delay from the previous delay which is at time 1ns +3ns +5ns=%0t", $time);
      end

      // Thread 3: a single statement
      #10 $display("This statement is inside the fork join_none and start at 10ns delay from the initial delay which is at time 1ns + 10ns=%0t", $time);
    join_none

    // Executes right away; the parent does not wait for the threads above
    $display("This statement is outside the fork join_none and will immediately get executed at time=%0t", $time);

  end
endmodule

OUTPUT:

This statement is outside the fork join_none and starts at 1ns delay which is at time = 1
This statement is outside the fork join_none and will immediately get executed at time=1
This statement is inside the begin end block and start at 3ns delay from the initial delay which is at time 1ns +3ns =4
This statement is inside the fork join_none and starts at 6ns delay from the initial delay which is at time 1ns+6ns= 7
This statement is the 2nd statement inside the begin end block and start at 5ns delay from the previous delay which is at time 1ns +3ns +5ns=9
This statement is inside the fork join_none and start at 10ns delay from the initial delay which is at time 1ns + 10ns=11

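  • Here, you can see that the statement after the fork join_none block executes immediately, at 1ns, before any statement inside the block. The threads inside the block start only once the parent process blocks or finishes, and their delays are still counted from the fork point (1ns).

This is how fork join_none works.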
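Since join_none never waits, the parent sometimes needs a way to catch up with its background threads later. SystemVerilog provides the wait fork statement for this. The following is a minimal sketch with made-up delays and messages, showing the parent doing some work of its own and then waiting for everything it spawned.

```systemverilog
// Hypothetical sketch: background threads started with join_none,
// collected later with 'wait fork'.
module tb_join_none_wait_fork_sketch;
  initial begin
    fork
      #10 $display("Background thread 1 done at time = %0t", $time);  // 10
      #25 $display("Background thread 2 done at time = %0t", $time);  // 25
    join_none

    $display("Parent continues at time = %0t", $time);                // 0
    #5 $display("Parent did other work until time = %0t", $time);     // 5

    // Block until every child process of this initial block has finished.
    wait fork;
    $display("All background threads done at time = %0t", $time);     // 25
  end
endmodule
```

The parent's two messages print first (at times 0 and 5), then the background threads report at 10 and 25, and the final message follows at 25.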

  • Below is an example of how nested fork join_none works:

module tb_fork_join_none;
  initial begin

    $display("Thread - 1 : Fork join_none will start at: %0t", $time);

    fork
      // Inner fork: spawns Thread 2 in the background
      fork
        #30 $display("Thread - 2 : This statement is inside the nested fork and will execute at: %0t", $time);
      join_none

      // Second thread of the outer fork
      #10 $display("Thread - 3 : This statement is outside the nested fork and will execute at: %0t", $time);
    join_none

    // Executes right away; neither fork is waited for
    $display("Thread - 4 : This statement is outside the fork join_none block and will immediately get executed at time: %0t", $time);
  end
endmodule

OUTPUT:

Thread - 1 : Fork join_none will start at: 0
Thread - 4 : This statement is outside the fork join_none block and will immediately get executed at time: 0
Thread - 3 : This statement is outside the nested fork and will execute at: 10
Thread - 2 : This statement is inside the nested fork and will execute at: 30
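fork join_none is also the usual way to spawn threads from a loop. One thing to watch for is that the spawned threads do not start until the parent blocks, by which time the loop variable has already reached its final value. A common remedy, sketched below with hypothetical names (id, "Worker"), is to copy the loop variable into an automatic variable declared inside the fork, so that each thread gets its own private copy.

```systemverilog
// Hypothetical sketch: spawning one background thread per loop iteration.
module tb_join_none_loop_sketch;
  initial begin
    for (int i = 0; i < 3; i++) begin
      fork
        // Initialized when the fork is entered, once per iteration,
        // so each spawned thread sees its own value of 'id'.
        automatic int id = i;
        begin
          #(10 * (id + 1));
          $display("Worker %0d finished at time = %0t", id, $time);
        end
      join_none
    end

    // Wait for all three workers before finishing.
    wait fork;
    $display("All workers finished at time = %0t", $time);
  end
endmodule
```

With these delays, workers 0, 1, and 2 finish at times 10, 20, and 30, and the final message prints at time 30.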

In summary, SystemVerilog’s fork and join constructs offer powerful tools for managing concurrency and parallelism in your verification and design processes. Whether you’re waiting for every thread with join, resuming as soon as one thread finishes with join_any, launching background threads with join_none, or nesting forks, mastering these constructs is essential for efficient and effective SystemVerilog programming.

Like, Share and Follow me if you like my content.
Thank You.

2 comments:

  1. Thanks for creating a very informative blog .
    There is a minor error in the output of Ex-2 of fork_join_any

    Output should be:

    Thread — 1 : Fork join_any will start at: 0
    Thread — 3 : This statement is outside the nested fork and will execute at: 10
    Thread — 4 : This statement is outside the fork join_any block and will execute at last at time: 10 [ error here : In the above blog here the time is 30 but correct should be 10 ]
    Thread — 2 : This statement is inside the nested fork and will execute at: 30

    1. Thank you for pointing that out! I've corrected the error in the output of Ex-2 of fork_join_any.
      Thank you for taking the time to read my blog and for your detailed feedback! Your attention to detail is greatly appreciated. Your feedback motivates me to continuously improve and provide accurate information. Please keep reading my blog and sharing your thoughts; it's invaluable to me!

