rachelselinar / dreamplacefpga

An Open-Source Analytical Placer for Large Scale Heterogeneous FPGAs using Deep-Learning Toolkit

License: BSD 3-Clause "New" or "Revised" License

CMake 1.61% TeX 0.02% Perl 1.84% Python 28.68% C++ 49.47% Cuda 17.61% C 0.62% Cap'n Proto 0.06% Tcl 0.01% Dockerfile 0.06%

dreamplacefpga's Issues

Runtime Error: cudaErrorInvalidDevice

Hi, I get a runtime error when executing "python dreamplacefpga/Placer.py test/FPGA-example1.json", whether or not the GPU is enabled.

I built the environment with Docker.

Do you have any suggestions?

Thanks for your help.

Always one instance being placed to "1 0 15"

When I run global placement and legalization for the gnl designs and the FPGA-example designs, the final solution always has one instance placed at "1 0 15", which is very far from the other instances placed in the center of the layout.

I dumped the placement solutions at different stages, and the problem appears to arise during "ripUP_Greedy_slotAssign".
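
To narrow down which instance ends up at "1 0 15", a small script can scan a dumped solution for that location. This is only a sketch: it assumes each placement line has the form "name x y z" (whitespace separated), and find_instances_at is a hypothetical helper, not part of DREAMPlaceFPGA.

```python
from typing import Iterable, List, Tuple


def find_instances_at(lines: Iterable[str], location: Tuple[str, str, str]) -> List[str]:
    """Return names of instances whose (x, y, z) fields equal `location`.

    Assumption: each placement line looks like "name x y z [extra fields]".
    Fields are compared as strings; lines with fewer than four fields are skipped.
    """
    matches = []
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue
        name, x, y, z = fields[:4]
        if (x, y, z) == location:
            matches.append(name)
    return matches


if __name__ == "__main__":
    # Made-up sample lines, only to show the call.
    sample = [
        "inst_0 40 62 3",
        "inst_1 1 0 15",
        "inst_2 41 60 0",
    ]
    print(find_instances_at(sample, ("1", "0", "15")))
```

Running the same check on the solution dumped after each stage would show at which stage the instance first lands there.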

The "make" errors with different pytorch versions

Hi Rachel,
I ran into errors when running the "make" command in the build folder. The log is shown as follows.

I suspect the errors come from the CUDA and PyTorch versions, which I installed using the official command, shown as follows. I am using Python 3.6 on Ubuntu 18.04.

Could you please check and provide the docker version in the future? Hope you can reply.
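
When reporting a build problem like this, it helps to include the exact versions involved. The sketch below collects them; collect_versions is a hypothetical helper name, and it only reports PyTorch details when PyTorch can be imported.

```python
import platform
import sys


def collect_versions() -> dict:
    """Gather interpreter, OS, and (if installed) PyTorch/CUDA versions."""
    info = {
        "python": sys.version.split()[0],
        "os": platform.platform(),
    }
    try:
        import torch  # optional: only reported when PyTorch is installed
    except ImportError:
        info["torch"] = "not installed"
        return info
    info["torch"] = torch.__version__
    # CUDA version PyTorch was built with (None for CPU-only builds).
    info["torch_cuda"] = torch.version.cuda
    info["cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in collect_versions().items():
        print(f"{key}: {value}")
```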

Segmentation Fault on GNL Designs

Could you please merge the lut0_support branch into main? It fixes a segmentation fault for some designs; see Xilinx/RapidWright#853.

Placement Runtime in DREAMPlaceFPGA

I ran the ISPD'2016 FPGA01 benchmark on a Linux server with an Intel(R) Xeon(R) W-2123 CPU @ 3.60GHz (8 cores). Why did I get a GP runtime of 17.94 seconds and an L+D runtime of 52.141 seconds?

IndexError in place_io.py

Not sure if I am doing something wrong, but I am getting an error when trying to place a design:

$ python dreamplacefpga/Placer.py test/FPGA12_vu440.json
[INFO   ] DREAMPlaceFPGA - Parameters[1] = [{'scl_file': '', 'instance_file': '', 'pin_file': '', 'net_file': '', 'routing_file': '', 'util_file': '', 'pickle_file': '', 'load_pickle': 0, 'aux_input': 'benchmarks/IF2bookshelf/FPGA12_vu440/design.aux', 'gpu': 0, 'num_bins_x': 512, 'num_bins_y': 512, 'global_place_stages': [{'num_bins_x': 512, 'num_bins_y': 512, 'iteration': 2000, 'learning_rate': 0.01, 'wirelength': 'weighted_average', 'optimizer': 'nesterov'}], 'target_density': 1.0, 'density_weight': 8e-05, 'random_seed': 1000, 'result_dir': 'results', 'scale_factor': 1.0, 'ignore_net_degree': 3000, 'gp_noise_ratio': 0.025, 'enable_fillers': 1, 'global_place_flag': 1, 'legalize_flag': 1, 'stop_overflow': 0.1, 'dtype': 'float32', 'detailed_place_engine': '', 'detailed_place_command': '-nolegal -nodetail', 'plot_flag': 0, 'RePlAce_ref_hpwl': 350000, 'RePlAce_LOWER_PCOF': 0.95, 'RePlAce_UPPER_PCOF': 1.05, 'gamma': 5.0, 'random_center_init_flag': 1, 'sort_nets_by_degree': 0, 'num_threads': 1, 'dump_global_place_solution_flag': 0, 'dump_legalize_solution_flag': 0, 'routability_opt_flag': 1, 'route_num_bins_x': 512, 'route_num_bins_y': 512, 'node_area_adjust_overflow': 0.15, 'max_num_area_adjust': 3, 'adjust_resource_area_flag': 1, 'adjust_route_area_flag': 1, 'adjust_pin_area_flag': 1, 'area_adjust_stop_ratio': 0.01, 'route_area_adjust_stop_ratio': 0.01, 'pin_area_adjust_stop_ratio': 0.05, 'unit_horizontal_capacity': 209, 'unit_vertical_capacity': 239, 'unit_pin_capacity': 50, 'max_route_opt_adjust_rate': 2.0, 'route_opt_adjust_exponent': 2.0, 'pin_stretch_ratio': 1.414213562, 'max_pin_opt_adjust_rate': 1.5, 'ffPinWeight': 3.0, 'deterministic_flag': 0, 'enable_if': 0, 'detailed_place_flag': 0}]
[INFO   ] reading benchmarks/IF2bookshelf/FPGA12_vu440/design.aux
Parsing File benchmarks/IF2bookshelf/FPGA12_vu440/design.lib
Parsing File benchmarks/IF2bookshelf/FPGA12_vu440/design.scl
Parsing File benchmarks/IF2bookshelf/FPGA12_vu440/design.nodes
Parsing File benchmarks/IF2bookshelf/FPGA12_vu440/design.pl
Parsing File benchmarks/IF2bookshelf/FPGA12_vu440/design.nets
Traceback (most recent call last):
  File "dreamplacefpga/Placer.py", line 120, in <module>
    placeFPGA(params)
  File "dreamplacefpga/Placer.py", line 38, in placeFPGA
    placedb(params) #Call function
  File "dreamplacefpga/PlaceDB.py", line 371, in __call__
    self.read(params)
  File "dreamplacefpga/PlaceDB.py", line 221, in read
    self.rawdb = place_io.PlaceIOFunction.read(params)
  File "dreamplacefpga/ops/place_io/place_io.py", line 20, in read
    return place_io_cpp.forward(args)
IndexError: _Map_base::at

[example.zip](https://github.com/rachelselinar/DREAMPlaceFPGA/files/11761836/example.zip)

To reproduce:

wget https://github.com/rachelselinar/DREAMPlaceFPGA/files/11761852/example.zip
unzip example.zip
python dreamplacefpga/Placer.py test/FPGA12_vu440.json

NonLinearPlaceFPGA.dump() not defined

The issue is in file dreamplacefpga/NonLinearPlace.py, line 712.
The function self.dump() is not defined.

To reproduce the error:

Add the following option in test/FPGA-example1.json:

"dump_global_place_solution_flag" : 1

Then run the following command:

python dreamplacefpga/Placer.py test/FPGA-example1.json

Error message:

Traceback (most recent call last):
  File "dreamplacefpga/Placer.py", line 120, in <module>
    placeFPGA(params)
  File "dreamplacefpga/Placer.py", line 44, in placeFPGA
    metrics = placer(params, placedb)
  File "/home/grads/h/hailiang/DREAMPlaceFPGA/dreamplacefpga/NonLinearPlace.py", line 712, in __call__
    self.dump(params, placedb, self.pos[0].cpu(), "%s.lg.pklz" %(params.design_name()))
  File "/home/grads/h/hailiang/anaconda3/envs/dreamplacefpga/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1270, in __getattr__
    type(self).__name__, name))
AttributeError: 'NonLinearPlaceFPGA' object has no attribute 'dump'

Possible method to fix this issue:
Add the dump() and load() functions to dreamplacefpga/BasicPlace.py, as in DREAMPlace. (I think dump and load work in pairs.)
Links to example: dump(), load().
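
For illustration, a dump/load pair could look roughly like the sketch below. This is not the DREAMPlace implementation: it assumes the ".pklz" suffix in the traceback means a gzip-compressed pickle, and it uses standalone functions, whereas the call in the traceback expects a method taking (params, placedb, pos, filename).

```python
import gzip
import os
import pickle
import tempfile


def dump(obj, filename):
    """Write `obj` to `filename` as a gzip-compressed pickle (assumed .pklz layout)."""
    with gzip.open(filename, "wb") as f:
        pickle.dump(obj, f)


def load(filename):
    """Read back an object previously written by dump()."""
    with gzip.open(filename, "rb") as f:
        return pickle.load(f)


if __name__ == "__main__":
    # Round-trip a small stand-in for a placement solution.
    solution = {"design": "FPGA-example1", "pos": [1.5, 2.5, 3.0]}
    path = os.path.join(tempfile.mkdtemp(), "example.lg.pklz")
    dump(solution, path)
    print(load(path) == solution)
```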

Error running ISPD2016 FPGA12 benchmark

I attempted to run the ISPD2016 FPGA12 benchmark with the following configuration:

{
"aux_input" : "benchmarks/ispd2016/FPGA12/design.aux",
"gpu" : 1,
"num_threads" : 1,
"num_bins_x" : 512,
"num_bins_y" : 512,
"global_place_stages" : [
    {"num_bins_x" : 512, "num_bins_y" : 512, "iteration" : 2000, "learning_rate" : 0.01, "wirelength" : "weighted_average", "optimizer" : "nesterov"}
],
"target_density" : 1.0,
"density_weight" : 8e-5,
"random_seed" : 1000,
"scale_factor" : 1.0,
"global_place_flag" : 1,
"legalize_flag" : 1,
"detailed_place_flag" : 0,
"dtype" : "float32",
"deterministic_flag" : 0,
"result_dir": "results/ispd2016/FPGA12"
}

and I encountered the following error:

I am seeking assistance finding a working script for running the ISPD2016 benchmark. Could you please provide a script or guidance on how to configure the benchmark to run it successfully and properly? I would appreciate it if you could provide the scripts for running all the ISPD2016 benchmarks from FPGA01 to FPGA12.

Thank you. 😊
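
As a starting point for running FPGA01 through FPGA12, the configuration above can be turned into one JSON file per benchmark. The sketch below only generates configs; it does not address the error itself. The output directory name and the write_configs/make_config helpers are hypothetical, and whether these settings suit every benchmark is not confirmed here.

```python
import copy
import json
from pathlib import Path

# Template copied from the configuration posted in this issue.
TEMPLATE = {
    "aux_input": "",
    "gpu": 1,
    "num_threads": 1,
    "num_bins_x": 512,
    "num_bins_y": 512,
    "global_place_stages": [
        {"num_bins_x": 512, "num_bins_y": 512, "iteration": 2000,
         "learning_rate": 0.01, "wirelength": "weighted_average",
         "optimizer": "nesterov"}
    ],
    "target_density": 1.0,
    "density_weight": 8e-5,
    "random_seed": 1000,
    "scale_factor": 1.0,
    "global_place_flag": 1,
    "legalize_flag": 1,
    "detailed_place_flag": 0,
    "dtype": "float32",
    "deterministic_flag": 0,
    "result_dir": "",
}


def make_config(index: int) -> dict:
    """Return a config for benchmark FPGA<index>, e.g. index=1 -> FPGA01."""
    name = f"FPGA{index:02d}"
    config = copy.deepcopy(TEMPLATE)
    config["aux_input"] = f"benchmarks/ispd2016/{name}/design.aux"
    config["result_dir"] = f"results/ispd2016/{name}"
    return config


def write_configs(out_dir, first: int = 1, last: int = 12) -> list:
    """Write one JSON config per benchmark and return the file paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for index in range(first, last + 1):
        path = out / f"FPGA{index:02d}.json"
        path.write_text(json.dumps(make_config(index), indent=4))
        paths.append(path)
    return paths


if __name__ == "__main__":
    # "test/ispd2016_generated" is a made-up directory name.
    for path in write_configs("test/ispd2016_generated"):
        print(f"python dreamplacefpga/Placer.py {path}")
```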

Handle LUT6_2 in IFWriter

Refer to pull request #7 for more details. Warning messages have been added to IFWriter.

Interchange to Bookshelf converter doesn't properly handle bussed nets

When trying to convert a much larger design, I encountered this error:

Traceback (most recent call last):
  File "IFsupport/IF2bookshelf.py", line 586, in <module>
    if_parser = IF2bookshelf(bookshelf_dir, args.netlist)
  File "IFsupport/IF2bookshelf.py", line 38, in __init__
    self.netlist_obj = LogicalNetlist(schema_dir, netlist_file)
  File "IFsupport/IF2bookshelf.py", line 500, in __init__
    port_idx = port_bus2idx[port_name].index(port_inst.busIdx.idx)
ValueError: 0 is not in list

I have attached a trivial design example that triggers the bug.
example.zip

To reproduce:

wget https://github.com/rachelselinar/DREAMPlaceFPGA/files/11761406/example.zip
unzip example.zip
python IFsupport/IF2bookshelf.py --netlist example.netlist

Here is what the example looks like in Vivado:

CC: @zhilix
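
The failure mode in the traceback above can be reproduced in isolation: list.index() raises ValueError when the requested bus index is not among the recorded ones. The sketch below wraps the lookup to produce a more descriptive error. The actual contents of port_bus2idx for the attached design are not known here, so the sample values and the bus_position helper are hypothetical.

```python
def bus_position(bus_indices, bus_idx, port_name="<port>"):
    """Return the position of `bus_idx` in `bus_indices`.

    Raises ValueError with the port name and the known indices when the
    index is missing, instead of the bare "0 is not in list".
    """
    try:
        return bus_indices.index(bus_idx)
    except ValueError:
        raise ValueError(
            f"bus index {bus_idx} not found for port {port_name!r}; "
            f"known indices: {bus_indices}"
        ) from None


if __name__ == "__main__":
    print(bus_position([3, 2, 1, 0], 0, "data"))
    try:
        # A bus whose recorded indices do not include 0 triggers the error.
        bus_position([3, 2, 1], 0, "data")
    except ValueError as err:
        print(err)
```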

"RuntimeError: radix_sort: failed on 1st step: cudaErrorInvalidDevice: invalid device ordinal" for GPUs with compute capability 8.6 and higher

Running DREAMPlaceFPGA on GPUs with compute capability 8.6 and higher produces a CUDA runtime error during LUT/FF legalization. Pasting from an FPGA-example1 run:

Preclusters: 829 (819 + 10) Initialization completed in 0.074 seconds 

Traceback (most recent call last): 
  File "dreamplacefpga/Placer.py", line 120, in <module> 
    placeFPGA(params) 
  File "dreamplacefpga/Placer.py", line 44, in placeFPGA 
    metrics = placer(params, placedb) 
  File "/DREAMPlaceFPGA/dreamplacefpga/NonLinearPlace.py", line 793, in __call__ 
    self.op_collections.lut_ff_legalization_op.runDLIter(self.pos[0], model.precondWL[:placedb.num_physical_nodes], sortedNodeMap, sortedNodeIdx, sortedNetMap, sortedNetIdx, sortedPinMap, activeStatus, illegalStatus, dlIter) 
  File "/DREAMPlaceFPGA/dreamplacefpga/ops/lut_ff_legalization/lut_ff_legalization.py", line 323, in runDLIter 
    lut_ff_legalization_cuda.runDLIter(pos, self.pin_offset_x, self.pin_offset_y, self.net_bbox, self.net_pinIdArrayX,  
RuntimeError: radix_sort: failed on 1st step: cudaErrorInvalidDevice: invalid device ordinal

Cause:
Similar to issue.
The error is due to the use of thrust libraries in lut_ff_legalization_cuda_kernel.cu. The runtime error was observed on GPU machines with compute capability 8.6 and 8.9.
The CUDA runtime error does not occur on GPU machines with compute capability 8.0 or lower.

Current workaround:
In lut_ff_legalization_cuda_kernel.cu, use the single kernel for runDLIter instead of the split-kernel approach with rearranging:
1. Comment out the thrust libraries
2. Comment out the split kernels for DL
3. Uncomment the single DL kernel
With a single kernel for direct legalization, the sites are not rearranged in descending order of the number of new site candidates to be explored, which incurs a minimal runtime increase.

Opening this issue to track and provide a fix.
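
To check whether a machine falls in the affected range, the compute capability of each visible GPU can be queried through PyTorch. A minimal sketch follows, assuming the threshold of 8.6 from the title of this issue; is_affected is a hypothetical helper, and the device query only runs when PyTorch and a CUDA device are present.

```python
# Lowest compute capability on which the error was reported in this issue.
AFFECTED_MIN = (8, 6)


def is_affected(capability) -> bool:
    """Return True if a (major, minor) compute capability is 8.6 or higher."""
    return tuple(capability) >= AFFECTED_MIN


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None

    if torch is None or not torch.cuda.is_available():
        print("No CUDA device visible to PyTorch")
    else:
        for device in range(torch.cuda.device_count()):
            capability = torch.cuda.get_device_capability(device)
            name = torch.cuda.get_device_name(device)
            status = "affected" if is_affected(capability) else "not affected"
            print(f"{name}: compute capability {capability} ({status})")
```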

Error when running program with "detailed_place_flag" : 1

Hi there! I'm facing an issue while running the program with detailed_place_flag set to 1. This flag is supposed to enable detailed placement, but it seems to cause an error. Could you help me understand what might be causing this issue?

I'm currently running the program with the following configuration on the ISPD 2016 FPGA01 benchmark:

{
"aux_input" : "benchmarks/ispd2016/FPGA01/design.aux",
"gpu" : 1,
"num_threads" : 1,
"num_bins_x" : 512,
"num_bins_y" : 512,
"global_place_stages" : [
    {"num_bins_x" : 512, "num_bins_y" : 512, "iteration" : 2000, "learning_rate" : 0.01, "wirelength" : "weighted_average", "optimizer" : "nesterov"}
],
"target_density" : 1.0,
"density_weight" : 8e-5,
"random_seed" : 1000,
"scale_factor" : 1.0,
"global_place_flag" : 1,
"legalize_flag" : 1,
"routability_opt_flag" : 1,
"detailed_place_flag" : 1,
"dtype" : "float32",
"deterministic_flag" : 0,
"result_dir": "results/ispd2016/FPGA01"
}

I've included the error message below:

Any guidance or insight you could provide would be greatly appreciated. Thank you for your time!

Move LUT occupying entire BLE to 6LUT location instead of 5LUT

In US+ designs, placing a LUT that is alone in its BLE at the 5LUT location instead of the 6LUT location causes errors in Vivado during routing.
Currently, only six-input LUTs are placed in the 6LUT location, and other LUTs are not checked. The legalizer should be updated to handle this inherently, instead of relying on the fix in IFWriter based on PR #13.

Pasting relevant information below from @eddieh-xlnx's email:

“A6” sitewire of a site containing just a LUT5 (no LUT6) does not have the “A6” sitewire set to VCC. Here is a screenshot of the Vivado device view:

E6 sitewire is not set to VCC, thus it appears as an antenna after “route_design”.

There are two possible solutions:

1. Set the [A-H]6 sitewire to VCC for all cases when [A-H]5LUT is occupied but [A-H]6LUT is not.
2. There is no advantage in placing a LUT[54321] into a [A-H]5LUT BEL when the corresponding [A-H]6LUT BEL is unoccupied. You might as well place it in the latter as it’s a little more flexible and a little faster to exit the site too.
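
The second option can be sketched as a pass over the legalized BLE assignments. The data model below (a dict keyed by site and BLE letter, with "5LUT" and "6LUT" slots) and the promote_lone_luts name are hypothetical and do not reflect the legalizer's actual data structures.

```python
from typing import Dict, Optional, Tuple

# (site name, BLE letter "A".."H") -> {"5LUT": cell or None, "6LUT": cell or None}
BleMap = Dict[Tuple[str, str], Dict[str, Optional[str]]]


def promote_lone_luts(bles: BleMap) -> int:
    """Move a LUT from the 5LUT BEL to the 6LUT BEL when the 6LUT BEL is empty.

    Modifies `bles` in place and returns the number of LUTs moved.
    """
    moved = 0
    for slots in bles.values():
        if slots.get("5LUT") is not None and slots.get("6LUT") is None:
            slots["6LUT"] = slots["5LUT"]
            slots["5LUT"] = None
            moved += 1
    return moved


if __name__ == "__main__":
    # Made-up assignments: only the first BLE has a lone LUT in its 5LUT BEL.
    bles = {
        ("SLICE_X0Y0", "A"): {"5LUT": "lut_a", "6LUT": None},
        ("SLICE_X0Y0", "B"): {"5LUT": "lut_b", "6LUT": "lut_c"},
        ("SLICE_X0Y0", "C"): {"5LUT": None, "6LUT": "lut_d"},
    }
    print(promote_lone_luts(bles), bles[("SLICE_X0Y0", "A")])
```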