Blockram timing info?

varkenvarken · December 26, 2019, 8:58pm

Does anyone have a pointer to timing info for the block RAMs on the UP5K? I think I searched through all the Lattice technical notes, but it is not there, or I might have overlooked it.
The reason I ask is that on the UP5K I seem to need 2 wait cycles (@ 12MHz) between writing the address register and retrieving the data to get reliable results, vs. just 1 (or even a half if I read on the negative edge) on the HX1K.
I might be doing something completely stupid of course, so that’s why a pointer to relevant documentation would be appreciated.

daveshah · December 26, 2019, 11:43pm

I don’t think Lattice make these numbers public. The clock to out time is about 1ns, but this isn’t usually something you would worry about directly. The UP5K fabric in general is much slower than the iCE40 HX. However, the nextpnr timing report will tell you whether or not your design meets timing at 12MHz.

varkenvarken · December 27, 2019, 9:15am

I agree that the general timings are not so useful; even for the UP5K they are in the hundreds of MHz, but that is gate-to-gate. The iCE family handbook gives some timings for multiplexers etc., but not the essential information on how much time it takes between providing the read address to a block RAM and the data becoming available. Why wouldn’t they publish that?
Anyway, nextpnr reports my clock to be on the critical path, but with plenty of room to spare (26MHz, while the clock is 12MHz).
So I guess what I am really looking for is not so much timing info as some guidance/best practice re. block RAMs on a UP5K suitable for newbies like me, perhaps in the form of a small example that has been proven to work.
I mean, I can get it to work, but those extra wait cycles don’t ‘feel’ right, although at this point on my learning curve I am obviously in no position to judge.

daveshah · December 27, 2019, 9:39am

"read address to a block ram and the data becoming available"

There is no such number. The address must arrive ~100ps before the read clock edge and the data will be valid ~1ns after the read edge, but it's a synchronous port, so there must always be a clock edge.
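To make the "synchronous port" point concrete, here is a minimal sketch of a RAM with a clocked read, written in the style that synthesis tools such as Yosys can usually map onto an iCE40 block RAM. The module name, port names, and the 512x8 size are assumptions for illustration, not taken from the design discussed in this thread.

```verilog
// Minimal sketch of a synchronous-read RAM (hypothetical names and sizes).
module sync_ram #(
    parameter ADDR_WIDTH = 9,
    parameter DATA_WIDTH = 8
) (
    input  wire                  clk,
    input  wire                  we,
    input  wire [ADDR_WIDTH-1:0] waddr,
    input  wire [DATA_WIDTH-1:0] wdata,
    input  wire [ADDR_WIDTH-1:0] raddr,
    output reg  [DATA_WIDTH-1:0] rdata
);
    reg [DATA_WIDTH-1:0] mem [0:(1<<ADDR_WIDTH)-1];

    always @(posedge clk) begin
        if (we)
            mem[waddr] <= wdata;
        // raddr is sampled on this edge; rdata changes shortly after it
        // and can be consumed on the next rising edge.
        rdata <= mem[raddr];
    end
endmodule
```

The read has no purely combinational path from `raddr` to `rdata`: without a clock edge after the address is stable, the output does not update.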

If you post your code I can have a more detailed look.

varkenvarken · December 27, 2019, 9:51am

thanks Dave

Actually I think the diagram on page 4 of the Memory Usage Guide for iCE40 Devices does make sense: I should be able to read data every other cycle, provided that the address is valid on the rising edge of the clock. However, I realize I am actually calculating the address on the rising edge, so it is probably only fully settled on the next clock.

The code is quite big, but I think the snippet below captures it all (it is part of a larger state machine). If my thinking is right, would it make sense to operate the block RAM on the negative edge?

...
    wire [3:0] cmd = instruction[15:12]; // main opcode
    wire [3:0] R2  = instruction[11: 8]; // destination register
    wire [3:0] R1  = instruction[ 7: 4]; // source register 1
    wire [3:0] R0  = instruction[ 3: 0]; // source register 0
    wire writable_destination = R2 > 1;
    wire [31:0] sumr1r0 = r[R1] + r[R0];
    wire [addr_width-1:0] sumr1r0_addr = sumr1r0[addr_width-1:0];

...

    CMD_LOADB:  begin
                    mem_raddr <= sumr1r0_addr;  // <------- this might cost some time ...
                    state <= LOAD1w;
                end
    LOAD1w  :   state <= LOAD1;                 // <------- w.o. this mem_data_out is often invalid
    LOAD1   :   begin
                    if(writable_destination) r[R2][7:0] <= mem_data_out;
                    state <= FETCH;
                end

daveshah · December 28, 2019, 8:25am

In general it rarely makes sense to work on more than one clock edge, as it makes debugging harder and reduces the maximum frequency the design can work at.
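As an illustration of staying on a single clock edge, the sketch below counts the edges involved when the read address goes through a register first, as `mem_raddr` does in the snippet above. It is a simplified stand-in under stated assumptions: the module name, `start`, `addr_calc`, `result`, `done`, and the state names are hypothetical, and the RAM write port and initialization are omitted for brevity.

```verilog
// Sketch only: a byte load done entirely on the rising edge.
// All names are hypothetical, not taken from the original design.
module load_demo (
    input  wire       clk,
    input  wire       start,      // pulse to begin a load
    input  wire [8:0] addr_calc,  // combinational address, e.g. a register sum
    output reg  [7:0] result,
    output reg        done
);
    localparam IDLE = 2'd0, WAIT = 2'd1, LOAD = 2'd2;
    reg [1:0] state = IDLE;

    reg [7:0] mem [0:511];        // write port and init omitted in this sketch
    reg [8:0] mem_raddr;
    reg [7:0] mem_data_out;

    // Synchronous read port: samples mem_raddr on every rising edge.
    always @(posedge clk)
        mem_data_out <= mem[mem_raddr];

    always @(posedge clk) begin
        done <= 1'b0;
        case (state)
            IDLE: if (start) begin
                      mem_raddr <= addr_calc; // edge 0: address register loaded
                      state     <= WAIT;
                  end
            WAIT:     state     <= LOAD;      // edge 1: RAM samples mem_raddr
            LOAD: begin                       // edge 2: mem_data_out is valid
                      result <= mem_data_out;
                      done   <= 1'b1;
                      state  <= IDLE;
                  end
            default:  state <= IDLE;
        endcase
    end
endmodule
```

With a registered address and a clocked read port, one wait state between loading the address and using the data follows from the edge count alone. If that wait state is unwanted, one option is to drive the RAM read address directly from the combinational value instead of from `mem_raddr`, at the cost of a longer path into the RAM address input; the nextpnr timing report would show whether that still meets 12MHz.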