Previous journal: | Next journal: |
---|---|
0113-2023-07-12.md | 0115-2023-07-14.md |
Several of these are borrowed from older "Next steps" in other journal entries:
- Include an option for 8-bit writes into the vector registers (to be compatible with old CPUs, but also for more speed).
- On reset:
- Ensure
ready_buffer
has initial vector data that will be loaded into vector registers. - Make sure
spi_done
is not asserted.
- Ensure
- Implement an externally accessible
- Build R2R VGA DAC.
- IRQs for various things, esp. frame start.
- Plan some extra debugging signals.
- Make better use of Verilog modules, e.g. to make the SPI part generic.
- Make it possible to disable some modules to simplilfy the design (e.g. no sprites, no textures).
- Try converting some modules into Tiny Tapeout targets.
- Consider an overall reduced design for Tiny Tapeout: 160x120 res, 16x16 map, lower fixed-point resolution...?
- Try RP2040 SPI hardware instead of bit-banging.
- Check the SPI logic to make sure it's not possible for a write to be lost (or otherwise cause a conflict)
if it occurs exactly at the key moments (like buffer → ready transfer).
- Try setting up a testbench for this?
- Write a testbench; cocotb, Verilog, or both.
- See if we can correct textures so they don't mirror on certain wall sides.
- Try compacting vectors; they don't need to be so big.
- Implement resource/instance sharing, e.g. for reciprocal and some multiplication?
Prove that it actually simplifies the design.
- Need to get solid metrics from OpenLane/Yosys before making changes!
- Check utilisation also in Quartus; beware multiplier blocks!
- Need to also check timing, inc. both FPGA reports and OpenSTA.
- Doors.
- Configurable early frame end (to devote more lines to tracing time).
- Are there any magnitude comparators we could reduce by replacing with equality checks and tracking by registers?
- Implement external memory access, first for map data, then texture data?
- If Raybox 640x480 is too optimistic and we can't get a good route that runs fast enough, we could do 400x300 by using the above 800x600@56 mode, and then we only need an 18MHz clock (which means timing for external RAM is even more generous: ~55ns). Other option is maybe 320x200, which could work with a 12.5MHz clock, and would produce probably a reasonably compact layout...?
- If we use a 50MHz core, but just clock out 25MHz or even 12.5MHz to VGA, will this help the design set up external memory timing in a more structured way?
- Work on multiple sprites.
- Learn Static Timing Analysis, and setting timing constraints.
- Quasi-floating-point for trace distance data.
- Greater colour depth than RGB222.
- Consider running some parts of the design at 50MHz.
- Design for different VGA resolutions/clocks.
- Consider pipelines, e.g. is there a way to pipeline the reciprocal so it can get to a result without having to cascade its combo logic too far?
- Find out how to define actual RAM blocks in Verilog
- Learn OpenRAM and DFFRAM
- Optimising to get the design to fit in Caravel without compromising:
- Sprite positions probably only need ~6 bits of sub-unit precision. In fact, they only need about Q6.6 (unsigned)?
- Player position could be reduced to Q6.12 (unsigned)?
- Facing and vplane vectors could be reduced to Q2.12 (signed). This allows vectors in the range [-2.0,2.0)
- Trace buffer could be converted to OpenRAM.
- Fewer IOs
- Implement shared resources (i.e. for reciprocals and multiplication)
- Plan/implement external ROMs
- Option to change height scaling for different FOVs. What is the Wolf3D FOV, etc??
- Momentum
- Wall collisions
- Map overlay
- Reset and other debugging controls sent to design on other GPIOs