While playing around with generic sum functions for different types, I came across some strange results.
use std.env.all ;
use std.textio.all ;
library ieee ;
use ieee.numeric_std.all ;
use ieee.numeric_bit.all ;
entity test is
end entity ;

architecture arch of test is

    function generic_sum generic (
        type T ;
        init : T ;
        function "+"(l: T ; r : integer) return T is <>
    ) return T is
        variable rv : init'subtype := init ;
    begin
        for idx in 1 to 10000000 loop
            rv := rv + 1 ;
        end loop ;
        return rv ;
    end function ;

    function sum_fixed return integer is
        variable rv : integer := 0 ;
    begin
        for idx in 1 to 1000000 loop
            rv := rv + 1 ;
        end loop ;
        return rv ;
    end function ;

    function sum_fixed return real is
        variable rv : real := 0.0 ;
    begin
        for idx in 1 to 1000000 loop
            rv := rv + real(1) ;
        end loop ;
        return rv ;
    end function ;

    function "+"(l : real ; r : integer) return real is
    begin
        return l + real(r) ;
    end function ;

    function sum is new generic_sum generic map (T => ieee.numeric_std.unsigned, init => ieee.numeric_std.to_unsigned(0,32)) ;
    function sum is new generic_sum generic map (T => ieee.numeric_bit.unsigned, init => ieee.numeric_bit.to_unsigned(0,32)) ;
    function sum is new generic_sum generic map (T => integer, init => 0) ;
    function sum is new generic_sum generic map (T => real, init => 0.0) ;

    function sum_fixed is new generic_sum generic map (T => ieee.numeric_std.unsigned(31 downto 0), init => ieee.numeric_std.to_unsigned(0,32)) ;
    function sum_fixed is new generic_sum generic map (T => ieee.numeric_bit.unsigned(31 downto 0), init => ieee.numeric_bit.to_unsigned(0,32)) ;

    function sum_range is new generic_sum generic map (T => integer range integer'low to integer'high, init => 0) ;
    function sum_range is new generic_sum generic map (T => real range real'low to real'high, init => 0.0) ;

    procedure print(x : in string) is
        variable l : line ;
    begin
        write(l, x) ;
        writeline(output, l) ;
    end procedure ;

begin

    tb : process
        variable tick, tock : time_record ;
        variable std_sum : ieee.numeric_std.unsigned(31 downto 0) ;
        variable bit_sum : ieee.numeric_bit.unsigned(31 downto 0) ;
        variable int_sum : integer ;
        variable real_sum : real ;
    begin
        print("Fixed size generic numeric_std sum") ;
        tick := localtime ;
        std_sum := sum_fixed ;
        tock := localtime ;
        print(" std_sum : " & to_string(std_sum)) ;
        print(" Total time: " & to_string(tock - tick)) ;

        print("Generic numeric_std sum") ;
        tick := localtime ;
        std_sum := sum ;
        tock := localtime ;
        print(" std_sum : " & to_string(std_sum)) ;
        print(" Total time: " & to_string(tock - tick)) ;

        print("Fixed size generic numeric_bit sum") ;
        tick := localtime ;
        bit_sum := sum_fixed ;
        tock := localtime ;
        print(" bit_sum : " & to_string(bit_sum)) ;
        print(" Total time: " & to_string(tock - tick)) ;

        print("Generic numeric_bit sum") ;
        tick := localtime ;
        bit_sum := sum ;
        tock := localtime ;
        print(" bit_sum : " & to_string(bit_sum)) ;
        print(" Total time: " & to_string(tock - tick)) ;

        print("Generic integer sum") ;
        tick := localtime ;
        int_sum := sum ;
        tock := localtime ;
        print(" int_sum : " & to_string(int_sum)) ;
        print(" Total time: " & to_string(tock - tick)) ;

        print("Generic ranged integer sum") ;
        tick := localtime ;
        int_sum := sum_range ;
        tock := localtime ;
        print(" int_sum : " & to_string(int_sum)) ;
        print(" Total time: " & to_string(tock - tick)) ;

        print("Integer sum") ;
        tick := localtime ;
        int_sum := sum_fixed ;
        tock := localtime ;
        print(" int_sum : " & to_string(int_sum)) ;
        print(" Total time: " & to_string(tock - tick)) ;

        print("Generic real sum") ;
        tick := localtime ;
        real_sum := sum ;
        tock := localtime ;
        print(" real_sum : " & to_string(real_sum)) ;
        print(" Total time: " & to_string(tock - tick)) ;

        print("Generic real range sum") ;
        tick := localtime ;
        real_sum := sum_range ;
        tock := localtime ;
        print(" real_sum : " & to_string(real_sum)) ;
        print(" Total time: " & to_string(tock - tick)) ;

        print("Real sum") ;
        tick := localtime ;
        real_sum := sum_fixed ;
        tock := localtime ;
        print(" real_sum : " & to_string(real_sum)) ;
        print(" Total time: " & to_string(tock - tick)) ;

        std.env.stop ;
    end process ;

end architecture ;
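Each sum is just an iterative loop incrementing by 1 each time. As written, the loop in generic_sum runs 10000000 times while the loops in the two sum_fixed functions run 1000000 times, so the end result should be 10000000 for each generic sum and 1000000 for each fixed sum.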
I get the following output:
$ nvc --std=2019 -a test.vhdl -e test -r
Fixed size generic numeric_std sum
std_sum : 00000000100110001001011010000000
Total time: 0.370859
Generic numeric_std sum
std_sum : 00000000100110001001011010000000
Total time: 0.516243
Fixed size generic numeric_bit sum
bit_sum :
Total time: 0.37364
Generic numeric_bit sum
bit_sum :
Total time: 0.52075
Generic integer sum
int_sum : 140523886873728
Total time: 0.51588
Generic ranged integer sum
int_sum : 10000000
Total time: 0.139467
Integer sum
int_sum : 1000000
Total time: 0.013876
Generic real sum
real_sum : 6.94280262067005e-310
Total time: 0.521378
Generic real range sum
real_sum : 4.94065645841247e-317
Total time: 0.139211
Real sum
real_sum : 1000000
Total time: 0.011074
** Note: 0ms+0: STOP called
Procedure STOP [] at lib/std.19/env-body.vhd:38
Process :test:tb at test.vhdl:148
Sometimes the numeric_bit sum looks like this instead:
Fixed size generic numeric_bit sum
bit_sum : WWWWWWWW
Total time: 0.373893
Generic numeric_bit sum
bit_sum : WWWWWWWW
Total time: 0.520902
Obvious issues:
- The generic real and integer results are just wrong
- The real result is wrong for all of the generic versions, but the integer result fixes itself when ranged
- The generic numeric_bit results are wrong, but the fixed-size and unconstrained versions produce the same wrong output
- The performance for each of the different methods is pretty wild
The performance aspect really surprised me. I would have imagined that, once elaborated, the generic unsized/unranged, the generic sized/ranged, and the fixed functions would all perform basically the same. If the fixed functions can gain extra performance during analysis because they use fixed types with known intrinsics, then I would at least expect the unsized/unranged versions to match the sized/ranged ones, since they define the same thing.