
Comments (9)

yrabbit avatar yrabbit commented on July 17, 2024

A bitwise comparison of image cells containing DFF and LUT showed no differences except for the routing bits.

from apicula.

pepijndevos avatar pepijndevos commented on July 17, 2024

Wait so it gets an extra on bit every time it passes by the faulty DFF? What would be interesting is to extract the timing paths from the clock to each DFF CLK and EN input.


yrabbit avatar yrabbit commented on July 17, 2024

And this is where the nextpnr GUI would come in handy, right? :)


yrabbit avatar yrabbit commented on July 17, 2024

A thorough study of the random attosoc failures on the GW1NR-9 chip.
Modern versions of Yosys and ABC were used, together with specially adapted versions of apicula and nextpnr as of December 2019.
How recent the Yosys version is has little effect, because there was no gowin backend in 2019 and the pure generic one was used, whose description was part of apicula.
To get rid of bugs found after that date, the following things have been backported from the new apicula version:

  • distributed VCC and GND so as not to take up routing with huge networks;
  • synthetic DFF coding using attributes instead of fuzzer results;
  • modifying LUT's INIT to ignore unused inputs.

The attosoc source was compiled with yosys exactly once and further experiments were performed with the resulting JSON file to rule out any differences in the input files for nextpnr.
A special attribute (BEL) was added to all LUTs and DFFs, which fixed them in certain cells of the chip, so nextpnr always put primitives in the same place between experiments. Only the routing changed.
By running nextpnr with different values of the seed parameter, test images were generated, which were then loaded into a board equipped with an indicator for visual confirmation of performance and an external clock with adjustable frequency.
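
The placement pinning described above could be sketched roughly as follows. This assumes the usual Yosys JSON layout (`modules`, then `cells`, then `attributes`) and relies on nextpnr honoring a `BEL` cell attribute; the cell names, cell types and bel names in the example are hypothetical, not taken from the real attosoc netlist.

```python
import json

def pin_cells(design, placement):
    """Add a BEL attribute to every cell listed in `placement`.

    `design` is a dict loaded from a Yosys JSON netlist.
    `placement` maps cell name -> bel name (hypothetical names here).
    Returns the number of cells that were pinned.
    """
    pinned = 0
    for module in design.get("modules", {}).values():
        for name, cell in module.get("cells", {}).items():
            if name in placement:
                # Create the attributes dict if the cell has none yet.
                cell.setdefault("attributes", {})["BEL"] = placement[name]
                pinned += 1
    return pinned

# Tiny stand-in for the real netlist (assumption, for illustration only).
design = {"modules": {"top": {"cells": {
    "lut_a": {"type": "LUT4", "attributes": {}},
    "dff_a": {"type": "DFF"},
}}}}
placement = {"lut_a": "R25C24_LUT0", "dff_a": "R25C24_DFF0"}

pinned_count = pin_cells(design, placement)
print(pinned_count)
print(json.dumps(design, indent=2))
```

With every LUT and DFF pinned like this, the placer has no freedom left, so any difference between two runs comes from routing alone.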

bug-gw-good.mp4

This is a good image that honestly counts prime numbers. Images with an error (bad) do not light any LEDs.

The images were generated with successively increasing values of the seed parameter from 1 to 500. On average, about one image in 100 turned out to be bad.
Whether an image works does not depend on the frequency of the external oscillator.
All further tests were performed at a frequency of 20 MHz.
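
A seed sweep like the one described could be scripted along these lines. This is only a sketch: the binary name `nextpnr-generic` and the file names are assumptions, and a real run of the 2019 generic flow needs further arguments (chip description, packing step) that are left out here. Only the `--json`, `--write` and `--seed` options are used.

```python
def sweep_commands(netlist, seeds, nextpnr="nextpnr-generic"):
    """Build one nextpnr command line per seed (nothing is executed here)."""
    cmds = []
    for seed in seeds:
        out = f"pnr-{seed:03d}.json"  # hypothetical output naming scheme
        cmds.append([nextpnr, "--json", netlist,
                     "--write", out, "--seed", str(seed)])
    return cmds

cmds = sweep_commands("attosoc.json", range(1, 501))
print(len(cmds))
print(" ".join(cmds[0]))
```

Each list can be handed to `subprocess.run(cmd, check=True)`; keeping command construction separate from execution makes it easy to rerun a single seed that produced a bad image.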

nextpnr generates a JSON file, which gowin_pack then converts into an image. This file, let's call it pnr-json, contains the final routing of all networks, with the connections of each wire.

The pnr-json of the "bad" experiment was saved as bad-pnr-json, and nextpnr was modified so that it could take the routing of one network from that bad-pnr-json, and calculate the rest in the usual way.

The method used was to exclude all wires involved in the routing of the selected fixed network from being available for routing. So whatever nextpnr does to other networks could not conflict with that fixed network, simply because nextpnr is not aware of the existence of wires involved in routing the fixed network.
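
The idea of hiding the fixed network's wires from the router can be illustrated outside nextpnr. The sketch below is not the actual nextpnr modification; it assumes the routing of a net is stored as a `wire;pip;strength;...` string, and the wire and pip names in the example are made up.

```python
def routed_wires(routing):
    """Return the set of wires used by a net.

    Assumes `routing` is a flat 'wire;pip;strength;...' string, so every
    third field starting at index 0 is a wire name.
    """
    fields = routing.split(";")
    return {w for w in fields[0::3] if w}

def available_wires(all_wires, fixed_routing):
    """Drop every wire that the fixed net already occupies."""
    reserved = routed_wires(fixed_routing)
    return [w for w in all_wires if w not in reserved]

# Hypothetical routing of the fixed net and a hypothetical wire list.
fixed = "R25C24_CLK0;R25C24_X02_CLK0;1;R25C24_X02;R25C24_W23_X02;1"
all_wires = ["R25C24_CLK0", "R25C24_X02", "R25C24_N242", "R25C25_W23"]

free = available_wires(all_wires, fixed)
print(free)
```

Because the reserved wires never reach the router, no separate conflict check against the fixed network is needed.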

Experiments have shown that with the "clk" network fixed, the image does not "heal" across 200 random routings of the other networks.
This network has about 4000 endpoints, so the next step was to divide it into subnets and find a small segment where a minimal change in routing turns a non-working image into a working one.
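
The thread does not say how the network was divided. One possible way to narrow down a single faulty segment is plain bisection, sketched here under strong assumptions: exactly one culprit exists, and a hypothetical `still_bad(subset)` check can build an image with only `subset` routed as in the bad image and report whether it still fails (on real hardware that check is manual).

```python
def find_culprit(sinks, still_bad):
    """Bisect `sinks` down to the one whose 'bad' routing breaks the image.

    Assumes exactly one culprit; `still_bad` is a hypothetical oracle.
    """
    candidates = list(sinks)
    while len(candidates) > 1:
        mid = len(candidates) // 2
        first_half = candidates[:mid]
        # Keep whichever half still reproduces the failure.
        candidates = first_half if still_bad(first_half) else candidates[mid:]
    return candidates[0]

# Simulated experiment: 4000 endpoints, one of them is the culprit.
sinks = [f"sink{i}" for i in range(4000)]
secret = "sink2718"
trials = []

def still_bad(subset):
    trials.append(len(subset))
    return secret in subset

found = find_culprit(sinks, still_bad)
print(found, len(trials))
```

In this idealized form about a dozen builds are enough for 4000 endpoints. In practice the endpoints share wire segments, so the subsets cannot be chosen as freely.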


yrabbit avatar yrabbit commented on July 17, 2024

Finding the problem area out of the thousands took a pretty decent amount of time and here's the picture:

[image: wires]
It all starts on pin F6, which is where the external oscillator signal is applied. The final target is the CLK0 input of the DFF flip-flop in cell R25C24.

"Bad" network:
R1C29_F6 -> R1C29_W83 -> R1C25_S83 -> R9C25_S83 -> R17C25_S80 -> R25C25_W23 -> R25C24_X02 -> R25C24_CLK0
"Good" network:
R1C29_F6 -> R1C29_W26 -> R1C27_W27 -> R1C25_S27 -> R3C25_S82 -> R11C25_S10 -> R12C25_S80 -> R20C25_S81 -> R28C25_W24 -> R27C24_N24 -> R25C24_CLK0
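
To see how little the two routes have in common, the two chains above can be compared directly. A small sketch that uses only the wire names listed in this comment:

```python
bad = ("R1C29_F6 -> R1C29_W83 -> R1C25_S83 -> R9C25_S83 -> R17C25_S80 -> "
       "R25C25_W23 -> R25C24_X02 -> R25C24_CLK0")
good = ("R1C29_F6 -> R1C29_W26 -> R1C27_W27 -> R1C25_S27 -> R3C25_S82 -> "
        "R11C25_S10 -> R12C25_S80 -> R20C25_S81 -> R28C25_W24 -> "
        "R27C24_N24 -> R25C24_CLK0")

def hops(path):
    """Split an 'a -> b -> c' chain into a list of wire names."""
    return [h.strip() for h in path.split("->")]

bad_hops, good_hops = hops(bad), hops(good)
shared = [h for h in bad_hops if h in good_hops]

print(len(bad_hops), len(good_hops))   # number of wires in each chain
print(shared)                          # wires present in both chains
print(bad_hops[-2], good_hops[-2])     # last wire before CLK0 in each chain
```

The chains share only their two ends, F6 and CLK0. The bad route is also the shorter one, with 8 wires against 11.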

If input CLK0 is connected to N242, the image works; if CLK0 is connected to X02, it does not.
And now to show the locality of the problem let's compare the whole input files for gowin_pack:

 cmp -x pnrbad.json pnrgood.json
003df255 58 4e
003df256 30 32
003df257 32 34
003df258 5f 32
003df259 43 5f
003df25a 4c 43
003df25b 4b 4c
003df25c 30 4b
003df25d 3b 30

or in ASCII:
bad:  X02_CLK0;
good: N242_CLK0

Since only these bytes differ, we can be sure that all LUTs and DFFs are in the same places, the LUT contents are the same, and all other networks are the same. In general, everything is the same except that input CLK0 is switched to another network.
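
The ASCII reading of the differing bytes can be reproduced from the cmp listing above with a few lines of Python. The first hex column belongs to pnrbad.json and the second to pnrgood.json, following the order of the arguments to cmp.

```python
cmp_output = """\
003df255 58 4e
003df256 30 32
003df257 32 34
003df258 5f 32
003df259 43 5f
003df25a 4c 43
003df25b 4b 4c
003df25c 30 4b
003df25d 3b 30
"""

bad_bytes, good_bytes = bytearray(), bytearray()
for line in cmp_output.splitlines():
    _offset, bad_hex, good_hex = line.split()
    bad_bytes.append(int(bad_hex, 16))
    good_bytes.append(int(good_hex, 16))

bad_text = bad_bytes.decode("ascii")
good_text = good_bytes.decode("ascii")
print(bad_text)   # X02_CLK0;
print(good_text)  # N242_CLK0
```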

Now make sure that the bits in the images are set correctly and each of these networks, good and bad, is really connected to F6 and CLK0 and the whole chain is correct.

To do this, unpack the images with gowin_unpack and trace the connections of the wires. It is a little chaotic, but anyone who has dealt with unpacked images will immediately see how I trace the whole chain from CLK0 to F6.

bad.mp4
good.mp4

pnrbad.json.gz
pnrgood.json.gz


yrabbit avatar yrabbit commented on July 17, 2024

And in conclusion, I would like to add that there is another network that makes the task even more interesting:
[image: 20]

This is a small auxiliary network that is connected to wire R25C25_W23 and ends on pin R29C23_A0, which is pin number 52 of the chip, used as an output. With it I can check what signal is on the R25C25_W23 wire, or more clearly:
[image: pin]
Measurements showed that at this point the 20 MHz signal is always present.

What are the conclusions? None: there is just a single unfortunate X02 wire, and connecting it turns the image into a non-working one.

And I have no idea what could be wrong.


yrabbit avatar yrabbit commented on July 17, 2024

It turned out to be a clock skew.

At @lofty's suggestion* I extended the clk network with another LUT; the network became longer and the circuit worked through the X02 wire.

[image: wires-lut-min]

* The advice was actually to lengthen the signal line rather than clk, in order to detect a hold time violation.
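
For reference, the hold check that clock skew can violate is usually written as below. This is the generic textbook form for a launching and a capturing flip-flop; the symbols are assumptions introduced here and do not come from the thread or from any Gowin timing data.

```latex
% t_{cq}:      clock-to-output delay of the launching flip-flop
% t_{d,\min}:  shortest data path delay between the two flip-flops
% t_{hold}:    hold time of the capturing flip-flop
% t_{skew}:    clock arrival at the capturing flip-flop minus arrival at the launching one
t_{cq} + t_{d,\min} \;\ge\; t_{hold} + t_{skew},
\qquad
t_{skew} = t_{clk,\mathrm{capture}} - t_{clk,\mathrm{launch}}
```

The clock period does not appear in this inequality, which fits the earlier observation that the failure does not depend on the oscillator frequency. Adding delay to the data path raises the left-hand side, while rerouting the clock changes the skew term on the right.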


yrabbit avatar yrabbit commented on July 17, 2024

oops. Closed by mistake. Let it hang until the new backend fixes it.


yrabbit avatar yrabbit commented on July 17, 2024

In Himbaechel-gowin, none of the examples require any of the -nodffe -noalu -nowidelut flags to work correctly. OK, pll-nanolcd for tangnano and tangnano1k does require -noalu, but only because of the lack of a proper ALU for calculations.

