Skip to content

Latest commit

 

History

History
72 lines (56 loc) · 4.61 KB

0137-2023-08-30.md

File metadata and controls

72 lines (56 loc) · 4.61 KB

30 Aug 2023

Previous journal: Next journal:
0136-2023-08-29.md 0138-2023-08-31.md

raybox-zero: Merging height_scaler

Accomplishments

  • Moved height_scaler rcp into wall_tracer and made it use shared_reciprocal.
  • Ran harden_test and it got so close to fitting. See below
  • Implemented optional `define TRACE_STATE_DEBUG which visually represents the tracing state of each line on-screen.
  • Fixed a minor issue with map ROM size parameters.

Stats with height_scaler absorbed into wall_tracer

tt04-rbz@shared_rcp 3ea9e0e: shared_rcp branch: raybox-zero@shared_rcp submodule

4x2:1 8x2:1 4x2:2 8x2:2 4x2:3 8x2:3 4x2:4 8x2:4
cells_pre_abc 14,998 14,998 14,998 14,998 15,123 15,123 14,998 14,998
TotalCells 12,879 37,794 12,879 37,868 0 0 11,451 37,291
suggested_mhz 24.39 25.00 47.62 47.87 47.62 36.43 47.62 44.46
logic_cells 10,559 11,072 10,559 11,137 9,294 9,531 9,131 9,674
utilisation_% 72.57 36.01 72.57 36.01 61.04 30.29 60.26 29.90
wire_length_um 0 333,654 0 335,534 0 555,834 0 242,588

Summary:

  • 4x2:4 is our target, and it nearly fits: Got to get it under 60% utilisation.
  • 25MHz will probably be fine in 4x2, but we're also closer to 50MHz (44MHz so far in 8x2:4).

With all debug modules baked in (map overlay, debug overlay, trace debug)

3ea9e0e: shared_rcp branch: raybox-zero@shared_rcp submodule

4x2:1 8x2:1 4x2:2 8x2:2 4x2:3 8x2:3 4x2:4 8x2:4
cells_pre_abc 15,672 15,672 15,672 15,672 15,659 15,659 15,672 15,672
TotalCells 13,248 38,199 13,248 38,147 0 0 11,865 37,623
suggested_mhz 24.39 25.00 47.62 50.00 47.62 37.10 47.62 43.57
logic_cells 10,928 11,482 10,928 11,581 9,532 9,864 9,545 10,075
utilisation_% 75.46 37.45 75.46 37.45 62.95 31.24 62.96 31.25
wire_length_um 0 367,960 0 370,801 0 627,200 0 261,605

Next steps

  • Try hardening with debug signals controlling debug overlays, to see if they notably increase area.
  • Cut back states. There are too many.
  • Cut back facing/vplane vectors; SPI doesn't need to supply higher order bits.
  • See if we can reduce multipliers.
  • Are there some combo chains which could actually now be registered (for faster timing)?
  • Try a bigger map ROM. Currently it's only 16x16. Mind you, that's small enough to store in local registers.
  • Add in tex and see how much it blows things out.
  • Compare cell count when using rcp_in as a register instead of mux -- it might actually be worse by turning into both a register AND a mux in synth.
  • Combine TraceStep and TraceTest if using combo map ROM. No good if using external ROM.
  • External ROM ideas:
    • Try to plan a minimal external ROM interface that doesn't suffer bad SPI speed issues.
    • Consider designing an extra TT04 tile that just provides a ROM interface of the kind we need (col/row inc/dec).
    • Remember: Really only need to read 1 bit, or 2 for tri-colour walls... Abort reading at that point (no need to get all 8 bits).
    • Address setup is still a problem though if we're reading the map ROM very randomly.
  • Is there a way to avoid trackDistY-stepDistY by holding a previous value? Even a reg (to store a copy) might be less than a sub.
  • Move the shared_reciprocal OUT of wall_tracer
  • Try to get a sense of how big each component/module/register/expression is... harden them all separately or black-box them?
  • SPI access to registers for controlling colours, esp. background.

Notes

  • Include an option for wall_tracer to signal if it has not finished, or otherwise a way to abort it? Handling that might mean repeating the last trace value for the current line, allowing optionally up to 1 extra line of tracing time.
  • Experiment with vdist input to rcp... if we cap it THERE to the bit range we expect, does it already help optimisation?
  • Check if there are any registers we can recycle for other purposes, e.g. when the current state no longer needs them. vdist comes to mind; it could be rcp_in (and in fact actually IS a shared reciprocal input at one point).