Sunday, March 29, 2009

Recursive and Iterative designs in Verilog

Disclaimer: Note that the examples below for the unary_and have not been compiled and are just there for demonstration (probably have a few minor bugs). The paramatrizeable priority decoder example on the other hand is complete, and is in production use.

The Verilog language allows designers to write recursive modules using a generate block. For example:
module unary_and
#(parameter WIDTH = 32)
(input [WIDTH-1:0] in_and,
output out_and)

generate
if(WIDTH == 1) begin
assign out_and = in_and;
end
else if(WIDTH == 2) begin
assign out_and = in_and[0] & in_and[1];
end
else begin
unary_and #(.WIDTH (WIDTH/2))
unary_and_low
(.in_and (in_and[WIDTH/2-1:0]),
.out_and (out_and_low));

unary_and #(.WIDTH (WIDTH - WIDTH/2))
unary_and_high
(.in_and (in_and[WIDTH-1:WIDTH/2]),
.out_and (out_and_high));

assign out_and = out_and_low & out_and_high;
end
endgenerate
endmodule
Unfortunately, not all synthesis tools can handle such designs. One way to get around this problem is by using using two nested loops to build each level of the tree of the recursive modules. The outer loop iterates over levels, and the inner loop creates the nodes at each level. For simplicity, we change the parameter to be the number of levels to create. Here's an example:
module unary_and
#(parameter NUM_LEVELS = 5,
parameter WIDTH = 2**NUM_LEVELS)
(input [WIDTH-1:0] in_and,
output out_and)

generate
genvar i,j;
// start at the lowest level in the tree
for (i=0; i < NUM_LEVELS; i=i+1) begin:gen_levels
// do nodes at each level
for (j=0; j < 2**(NUM_LEVELS-i-1); j=j+1) begin:gen_nodes
wire in_low;
wire in_high;
wire out;

// at the lowest level, use the module's inputs
if (i==0) begin
assign in_low = in_and[j*2];
assign in_high = in_and[j*2 + 1];
end
// otherwise, use the outputs of the lower level
else begin
assign in_low = gen_levels[i-1].gen_nodes[j*2];
assign in_high = gen_levels[i-1].gen_nodes[j*2 + 1];
end

assign out = in_low & in_high;

end
end
endgenerate

assign out_and = gen_levels[NUM_LEVELS-1].gen_nodes[0].out;

endmodule

A more useful example is the design of an efficient parametrizeable priority decoder. I have looked everywhere and couldn't find a good design that worked for various input widths. I've found one that used a recursive structure, but of course that didn't synthesize. The example below shows a complete priority decoder. There are two implementation styles. If STYLE is set to 0 (default), then the implementation builds a multiplexer tree that multiplexes the value to be output. The other style constructs one bit of the output at each level. Below I give two modules, the priority_encoder itself and a tester to test that it works.
/***************************************************               
* $Id$
*
* Module: priority_encoder.v
* Project: NF2.1
* Author: Jad Naous
* Description: parametrizable priority encoder.
*
* Highest priority by default given to rightmost bit.
*
* Usually just specify the OUTPUT_WIDTH. INPUT_WIDTH
* is optional. Defaults to 2**OUTPUT_WIDTH.
*
***************************************************/
module priority_encoder
#(parameter OUTPUT_WIDTH = 8,
parameter RIGHT_TO_LEFT_PRIORITY = 1)

(input [0:(2**OUTPUT_WIDTH)-1] unencoded_input,
output [OUTPUT_WIDTH-1:0] encoded_output,
output valid);

localparam INPUT_WIDTH = 2**OUTPUT_WIDTH;
localparam INPUT_VAL_WIDTH = INPUT_WIDTH*OUTPUT_WIDTH;
localparam STYLE = 0;

generate
genvar i,j;
if(STYLE==0) begin
for(i=0; i<OUTPUT_WIDTH; i=i+1)
begin:gen_levels
for(j=0; j<INPUT_WIDTH/(2**(i+1)); j=j+1)
begin:gen_nodes
wire [OUTPUT_WIDTH-1:0] value;
wire valid;

wire [OUTPUT_WIDTH-1:0] left_val;
wire left_vld;
wire [OUTPUT_WIDTH-1:0] right_val;
wire right_vld;

if(i==0) begin
assign left_val = j*2;
assign left_vld = unencoded_input[j*2];
assign right_val = j*2 + 1;
assign right_vld = unencoded_input[j*2+1];
end
else begin
assign left_val
= gen_levels[i-1].gen_nodes[j*2].value;
assign left_vld
= gen_levels[i-1].gen_nodes[j*2].valid;
assign right_val
= gen_levels[i-1].gen_nodes[j*2+1].value;
assign right_vld
= gen_levels[i-1].gen_nodes[j*2+1].valid;
end // else: !if(i==0)

assign value = (RIGHT_TO_LEFT_PRIORITY ?
(right_vld ? right_val : left_val) :
(left_vld ? left_val : right_val));
assign valid = right_vld | left_vld;
end // block: gen_nodes
end // block: gen_levels

assign encoded_output
= gen_levels[OUTPUT_WIDTH-1].gen_nodes[0].value;
assign valid
= gen_levels[OUTPUT_WIDTH-1].gen_nodes[0].valid;
end // if (STYLE==0)

else begin
for(i=0; i<OUTPUT_WIDTH; i=i+1)
begin:gen_levels
for(j=0; j<INPUT_WIDTH/(2**(i+1)); j=j+1)
begin:gen_nodes
wire [i:0] value;
wire valid;

if(i==0) begin
assign value = RIGHT_TO_LEFT_PRIORITY
? unencoded_input[j*2+1]
: unencoded_input[j*2];
assign valid = unencoded_input[j*2+1]
| unencoded_input[j*2];
end
else begin
wire [i-1:0] left_val;
wire left_vld;
wire [i-1:0] right_val;
wire right_vld;
assign left_val
= gen_levels[i-1].gen_nodes[j*2].value;
assign left_vld
= gen_levels[i-1].gen_nodes[j*2].valid;
assign right_val
= gen_levels[i-1].gen_nodes[j*2+1].value;
assign right_vld
= gen_levels[i-1].gen_nodes[j*2+1].valid;
assign value
= (RIGHT_TO_LEFT_PRIORITY ?
(right_vld ? {1'b1, right_val}
: {1'b0, left_val}) :
(left_vld ? {1'b0, left_val}
: {1'b1, right_val}));
assign valid = right_vld | left_vld;
end // else: !if(i==0)
end // block: gen_nodes
end // block: gen_levels

assign encoded_output
= gen_levels[OUTPUT_WIDTH-1].gen_nodes[0].value;
assign valid
= gen_levels[OUTPUT_WIDTH-1].gen_nodes[0].valid;
end // else: !if(STYLE==0)
endgenerate
endmodule

// synthesise translate_off
module pri_encode_test ();
reg [0:7] unencoded_input = 8'h0;
wire [2:0] encoded_output;
wire valid;

priority_encoder
#(.OUTPUT_WIDTH(3))
priority_encoder
(.valid (valid),
.encoded_output (encoded_output),
.unencoded_input (unencoded_input));

initial begin
unencoded_input[7] = 1'b1;
unencoded_input[2] = 1'b1;
unencoded_input[3] = 1'b1;

#2 if(encoded_output != 7) $display("%t ERROR", $time);

#2 unencoded_input = 0;
unencoded_input[6] = 1'b1;
unencoded_input[3] = 1'b1;
unencoded_input[0] = 1'b1;

#2 if(encoded_output != 6) $display("%t ERROR", $time);

#2 unencoded_input = 0;
unencoded_input[0] = 1'b1;

#2 if(encoded_output != 0) $display("%t ERROR", $time);

#2 unencoded_input = 0;
unencoded_input[7] = 1'b1;

#2 if(encoded_output != 7) $display("%t ERROR", $time);

#2 unencoded_input = 255;

#2 if(encoded_output != 7) $display("%t ERROR", $time);

#2 $display("%t Test ended.", $time);
end // initial begin
endmodule // pri_encode_test
// synthesise translate_on


Note that in synthesis, and surprisingly, style 0 turns out to be more efficient.

No comments:

Post a Comment