
Comments (7)

nhanvtran commented on May 13, 2024

Kevin P says it should work passing a C variable to a preprocessor directive, so that makes things easier!
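
A sketch of one way this could work (not necessarily what Kevin P had in mind): the standard C/C++ _Pragma operator expands macros inside a directive string, so a compile-time constant can be fed into an HLS pragma. UNROLL_FACTOR and accumulate below are hypothetical names, for illustration only.

// Sketch only: feed a macro-defined constant into an HLS directive via _Pragma
#define PRAGMA_HELPER(x) _Pragma(#x)
#define DO_PRAGMA(x) PRAGMA_HELPER(x)
#define UNROLL_FACTOR 4

void accumulate(const int in[64], int *sum) {
    int acc = 0;
    for (int i = 0; i < 64; i++) {
        DO_PRAGMA(HLS UNROLL factor=UNROLL_FACTOR)  // expands to: #pragma HLS UNROLL factor=4
        acc += in[i];
    }
    *sum = acc;
}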


ejk43 commented on May 13, 2024

Kevin P says it should work passing a C variable to a preprocessor directive, so that makes things easier

Neat! I have not tried this before, but happy to hear it's possible.

As a related thought: I've recently been using a few digital signal processing operations shipped with HLS that might provide a helpful model: 1) CORDIC and 2) FFT. Both the CORDIC and FFT libraries use templated functions to control their implementation details, including bit widths, fixed point vs. floating point, HDL architecture, and even which type of operation they run.

Here's a code snippet for an example CORDIC operation. It makes fairly heavy use of templating, but I think it's instructive to see what we could theoretically do. For example, the CORDIC runs a "Translate" operation, uses scaled radians (range -1 to +1 instead of -pi to +pi), chooses the number of iterations automatically, and uses BRAM to scale the magnitude result, to call out a few of the capabilities.

// Find magnitude/angle via a CORDIC "translate" (cartesian -> polar) operation.
// cmplx_width, cordic_width, in[], and ii are defined elsewhere in the
// surrounding function.
typename translate_inputs<cmplx_width, hls::CORDIC_FORMAT_SIG_FRAC>::in cordicdata;
typename translate_outputs<cordic_width, hls::CORDIC_FORMAT_SIG_FRAC>::out outputdata;
cordicdata.cartesian = in[ii];
hls::cordic_base<hls::CORDIC_F_TRANSLATE,        // operation: translate (mag/phase)
                 hls::CORDIC_TRUE,
                 hls::CORDIC_FORMAT_SIG_FRAC,    // signed fractional data format
                 hls::CORDIC_FORMAT_SCA,         // scaled radians (-1 to +1) phase format
                 cmplx_width, cordic_width,      // input/output bit widths
                 hls::CORDIC_ITER_AUTO,          // iteration count chosen automatically
                 hls::CORDIC_PREC_AUTO,
                 hls::CORDIC_ROUND_TRUNCATE,
                 hls::CORDIC_SCALE_BRAM>         // BRAM-based magnitude scaling
                (cordicdata, outputdata);

Here's an example code snippet for running an FFT...

// Compile-time FFT configuration: a parameters struct derived from
// hls::ip_fft::params_t is passed to the core as a template argument.
// The FFT_* width macros are defined elsewhere.
struct static_config : hls::ip_fft::params_t {
  // Default parameters: fixed point config!
  static const unsigned ordering_opt = hls::ip_fft::natural_order;
  static const unsigned max_nfft = FFT_NFFT_MAX;
  static const unsigned config_width = FFT_CONFIG_WIDTH;
  static const unsigned phase_factor_width = FFT_PHASE_FACTOR_WIDTH;
  static const unsigned input_width = FFT_INPUT_WIDTH;
  static const unsigned output_width = FFT_OUTPUT_WIDTH;
};
typedef hls::ip_fft::config_t<static_config> static_config_t;
typedef hls::ip_fft::status_t<static_config> static_status_t;

// Runtime configuration and status ports
static_config_t fft_config;
static_status_t fft_status;
fft_config.setDir(1);      // forward transform
fft_config.setSch(0x2AB);  // scaling schedule

hls::fft<static_config>(iq_in, iq_out, &fft_status, &fft_config);
bool ovflo = fft_status.getOvflo();  // check the status port for overflow

There's a LOT that goes into configuring and using the HLS FFT, but the important takeaway in my opinion is that we can create a struct which gets passed into a template function. I suspect we might be able to use this architecture as inspiration for an improved interface to the neural network library that can plug into Keras with more flexibility.
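
To make that concrete, here's a rough sketch of what such an interface could look like; layer1_config, dense_layer, and all of the field names are hypothetical, not an existing nnet API:

// Hypothetical sketch: a per-layer configuration struct passed as a template
// argument, in the spirit of hls::ip_fft::params_t above.
struct layer1_config {
    static const unsigned n_in  = 64;
    static const unsigned n_out = 32;
};

template <typename data_T, typename res_T, typename CONFIG_T>
void dense_layer(const data_T in[CONFIG_T::n_in],
                 const data_T weights[CONFIG_T::n_in * CONFIG_T::n_out],
                 const data_T biases[CONFIG_T::n_out],
                 res_T out[CONFIG_T::n_out]) {
    for (unsigned j = 0; j < CONFIG_T::n_out; j++) {
        res_T acc = biases[j];
        for (unsigned i = 0; i < CONFIG_T::n_in; i++) {
            acc += in[i] * weights[i * CONFIG_T::n_out + j];
        }
        out[j] = acc;
    }
}

// A Keras-generated wrapper could then simply instantiate:
//   dense_layer<input_t, result_t, layer1_config>(in, w1, b1, out);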

Thoughts?


nhanvtran commented on May 13, 2024

@ejk43 I like the idea of having a templated configuration for each layer. Just to make sure I understand: we could even configure without preprocessor directives, and then combine both functionalities pretty cleanly, right?
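
For example (a hypothetical sketch, reusing the layer1_config idea from above), the template configuration could either hard-code its constants or pull them from preprocessor defines, so the two mechanisms compose:

// Hypothetical sketch: filling the template configuration from #defines
#define N_LAYER_IN  64
#define N_LAYER_OUT 32

struct layer1_config {
    static const unsigned n_in  = N_LAYER_IN;   // from the preprocessor
    static const unsigned n_out = N_LAYER_OUT;  // from the preprocessor
};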

also good tips on the CORDIC!


benjaminkreis commented on May 13, 2024

Good progress in the recently merged #7. We should do something similar for the activation functions.


ejk43 commented on May 13, 2024

Another thought here -- would it make sense to supply the configuration struct with a target "initiation interval"? The "II" is the real driver of the potential resource reuse, so it would be great to be able to supply the target II to the nnet library, which then calculates the ideal partitioning and unroll factors to hit that target. For example, II = 1 would require full unrolling, II = 2 would want to unroll by half the total number of operations, and so on.
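
A rough sketch of the arithmetic involved (hypothetical names; the constants are just for illustration):

// Hypothetical sketch: deriving an unroll factor from a target II.
// A fully connected layer with n_in inputs and n_out outputs performs
// n_in * n_out multiplies, so spreading them over target_ii cycles needs
// roughly (n_in * n_out) / target_ii parallel multipliers.
const unsigned n_in = 16, n_out = 8;
const unsigned target_ii = 2;
const unsigned total_mult    = n_in * n_out;            // 128 multiplies
const unsigned unroll_factor = total_mult / target_ii;  // 64 -> unroll/partition by 64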

I've found the PIPELINE pragma with a specified II is basically unreliable unless the partitioning and unroll directives are also set correctly. This suggests to me that the nnet library would want to do the required calculations to get the unroll/partition factors correct.

(also, a pretty neat feature could be to chain multiple layers with different network sizes... I wonder if we could intelligently adjust unroll factors for downstream/upstream layers based on the input initiation interval target, to get more resource-reuse downstream if possible. This may be a bit of a stretch goal, though)


nhanvtran commented on May 13, 2024

That sounds good. We also found that the pipelining pragma did weird things without unroll/partition set correctly -- that's why it was actually commented out in the latest PR. The other problem was that it was in the top function, so putting it in each layer should give us finer control.
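
Roughly what that could look like (a hypothetical sketch; compute_layer and the chosen II are illustrative only):

// Hypothetical sketch: PIPELINE applied inside the layer function rather than
// at the top level, so each layer can request its own II.
template <typename data_T, typename res_T, typename CONFIG_T>
void compute_layer(data_T in[CONFIG_T::n_in], res_T out[CONFIG_T::n_out]) {
    #pragma HLS PIPELINE II=2
    // ... multiply-accumulate body, with matching UNROLL/ARRAY_PARTITION ...
}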


nhanvtran commented on May 13, 2024

Closing this issue. The technical basics are solved, and the finer details are now being discussed in other issues!

