Previous journal: | Next journal: |
---|---|
0136-2023-08-29.md | 0138-2023-08-31.md |
- Moved height_scaler rcp into wall_tracer and made it use shared_reciprocal.
- Ran harden_test and it got so close to fitting. See below.
- Implemented optional `` `define TRACE_STATE_DEBUG `` which visually represents the tracing state of each line on-screen.
- Fixed a minor issue with map ROM size parameters.
tt04-rbz@shared_rcp 3ea9e0e: shared_rcp branch: raybox-zero@shared_rcp submodule
| | 4x2:1 | 8x2:1 | 4x2:2 | 8x2:2 | 4x2:3 | 8x2:3 | 4x2:4 | 8x2:4 |
---|---|---|---|---|---|---|---|---|
cells_pre_abc | 14,998 | 14,998 | 14,998 | 14,998 | 15,123 | 15,123 | 14,998 | 14,998 |
TotalCells | 12,879 | 37,794 | 12,879 | 37,868 | 0 | 0 | 11,451 | 37,291 |
suggested_mhz | 24.39 | 25.00 | 47.62 | 47.87 | 47.62 | 36.43 | 47.62 | 44.46 |
logic_cells | 10,559 | 11,072 | 10,559 | 11,137 | 9,294 | 9,531 | 9,131 | 9,674 |
utilisation_% | 72.57 | 36.01 | 72.57 | 36.01 | 61.04 | 30.29 | 60.26 | 29.90 |
wire_length_um | 0 | 333,654 | 0 | 335,534 | 0 | 555,834 | 0 | 242,588 |
Summary:
- 4x2:4 is our target, and it nearly fits at 60.26% utilisation: got to get it under 60%.
- 25MHz will probably be fine in 4x2, but we're also getting closer to 50MHz (44.46MHz so far in 8x2:4).
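
As a rough feel for how far off the 60% target is, here is a minimal sketch. It assumes utilisation scales linearly with logic cell count (i.e. the average area per cell stays about the same after trimming), which is only an approximation. The function name `cells_to_cut` is made up for this note and is not part of the harden_test tooling.

```python
import math


def cells_to_cut(logic_cells: int, utilisation_pct: float, target_pct: float = 60.0) -> int:
    """Estimate how many logic cells must go to land strictly under target_pct.

    ASSUMPTION: utilisation is proportional to logic cell count, so the
    average area per cell is unchanged by whatever gets removed.
    """
    cells_per_pct = logic_cells / utilisation_pct
    # Largest whole cell count that is still strictly below the target.
    max_cells = math.ceil(target_pct * cells_per_pct) - 1
    # Already under target: nothing to cut.
    return max(0, logic_cells - max_cells)


# 4x2:4 column of the table above: 9,131 logic cells at 60.26% utilisation.
print(cells_to_cut(9131, 60.26))
```

Under that assumption, the 4x2:4 run above is only about 40 logic cells over the line, so even one of the smaller reductions might be enough.
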
3ea9e0e: shared_rcp branch: raybox-zero@shared_rcp submodule
| | 4x2:1 | 8x2:1 | 4x2:2 | 8x2:2 | 4x2:3 | 8x2:3 | 4x2:4 | 8x2:4 |
---|---|---|---|---|---|---|---|---|
cells_pre_abc | 15,672 | 15,672 | 15,672 | 15,672 | 15,659 | 15,659 | 15,672 | 15,672 |
TotalCells | 13,248 | 38,199 | 13,248 | 38,147 | 0 | 0 | 11,865 | 37,623 |
suggested_mhz | 24.39 | 25.00 | 47.62 | 50.00 | 47.62 | 37.10 | 47.62 | 43.57 |
logic_cells | 10,928 | 11,482 | 10,928 | 11,581 | 9,532 | 9,864 | 9,545 | 10,075 |
utilisation_% | 75.46 | 37.45 | 75.46 | 37.45 | 62.95 | 31.24 | 62.96 | 31.25 |
wire_length_um | 0 | 367,960 | 0 | 370,801 | 0 | 627,200 | 0 | 261,605 |
- Try hardening with debug signals controlling debug overlays, to see if they notably increase area.
- Cut back states. There are too many.
- Cut back facing/vplane vectors; SPI doesn't need to supply higher order bits.
- See if we can reduce multipliers.
- Are there some combo chains which could actually now be registered (for faster timing)?
- Try a bigger map ROM. Currently it's only 16x16. Mind you, that's small enough to store in local registers.
- Add in `tex` and see how much it blows things out.
- Compare cell count when using rcp_in as a register instead of mux -- it might actually be worse by turning into both a register AND a mux in synth.
- Combine TraceStep and TraceTest if using combo map ROM. No good if using external ROM.
- External ROM ideas:
  - Try to plan a minimal external ROM interface that doesn't suffer bad SPI speed issues.
  - Consider designing an extra TT04 tile that just provides a ROM interface of the kind we need (col/row inc/dec).
  - Remember: Really only need to read 1 bit, or 2 for tri-colour walls... Abort reading at that point (no need to get all 8 bits).
  - Address setup is still a problem though if we're reading the map ROM very randomly.
- Is there a way to avoid `trackDistY-stepDistY` by holding a previous value? Even a reg (to store a copy) might be less than a sub.
- Move the shared_reciprocal OUT of wall_tracer
- Try to get a sense of how big each component/module/register/expression is... harden them all separately or black-box them?
- SPI access to registers for controlling colours, esp. background.
- Include an option for wall_tracer to signal if it has not finished, or otherwise a way to abort it? Handling that might mean repeating the last trace value for the current line, allowing optionally up to 1 extra line of tracing time.
- Experiment with vdist input to rcp... if we cap it THERE to the bit range we expect, does it already help optimisation?
- Check if there are any registers we can recycle for other purposes, e.g. when the current state no longer needs them. vdist comes to mind; it could be rcp_in (and in fact actually IS a shared reciprocal input at one point).
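
On the `trackDistY-stepDistY` question above, a quick software model suggests the held-copy idea is at least arithmetically sound. This is a sketch only: it assumes the tracer is a conventional DDA that advances whichever of the X/Y track distances is smaller, and that the subtraction is only there to undo the last step. Apart from the trackDist/stepDist naming, everything here (`trace`, `held`, `steps_until_hit`) is hypothetical and not taken from wall_tracer.

```python
def trace(track_x: int, track_y: int, step_x: int, step_y: int, steps_until_hit: int):
    """Model a DDA trace and return (distance by subtraction, distance by held copy).

    Integers stand in for the fixed-point values. 'steps_until_hit' replaces
    the real map test: the wall is simply "hit" after that many steps.
    """
    assert steps_until_hit >= 1, "the held copy is only valid after one step"
    held = 0   # hypothetical extra register: track value before the last add
    side = 0   # 0 = last step was along X, 1 = last step was along Y
    for _ in range(steps_until_hit):
        if track_x < track_y:
            held = track_x          # copy BEFORE adding the step
            track_x += step_x
            side = 0
        else:
            held = track_y
            track_y += step_y
            side = 1
    # The subtractor approach: undo the last step on the side that hit.
    by_sub = (track_y - step_y) if side else (track_x - step_x)
    return by_sub, held


# Both methods should agree for any starting values and step sizes.
print(trace(3, 5, 4, 6, 5))
```

In this model a single shared register is enough, because only the axis that stepped last matters. Whether one register plus its input mux really is smaller than the subtractor is something only a harden run can answer.
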