
java-examples's People

Contributors

aulorbe


java-examples's Issues

Break HuggingFaceHandler up

Some of these tests are a bit tough for me to follow because I'm not familiar with all of the code being tested; going forward, let's try to add code and tests at the same time. This will help me be a better reviewer, and make sure we're not adding lots of code without tests.

Overall this seems like a great first step. Good work.

A couple things that stood out:

  • Since HuggingFaceManager makes a network call to download the dataset from HF, those tests are what I would normally consider an integration test rather than a unit test. By integration test, I mean that it depends on some kind of production dependency outside of our control that can lead to test flakes if HF ever has downtime. In this case, it's not a huge deal since they probably park these datasets on some kind of CDN with high availability, but it's just a good thing to keep in mind when writing tests. When I'm setting up CI stuff, I like to run integration tests separately from unit tests.
  • I would consider breaking up your HuggingFaceManager into two classes:
    1. HuggingFaceDatasetFetcher, which does one thing only: given a dataset URL, it downloads the data from HF. It could be reused in the future to fetch multiple different datasets instead of hardcoding the URL.
    2. ClimateFeverDataset, which hardcodes the API URL for that dataset, uses HuggingFaceDatasetFetcher to fetch it, and then handles the parsing logic that is specific to that dataset (e.g. claims and all that). If you break it down in this way, you can use mocks to stub the network call to HF and test the parsing with fixture data.
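
To make the suggested split concrete, here is a rough sketch of how the two classes could fit together. Everything beyond the two class names is an assumption for illustration: the `DatasetFetcher` interface, the `claims()` and `parseClaims` methods, the placeholder URL, and the toy "one claim per line" format are all hypothetical and not taken from the existing code. The real parsing would follow whatever the HF API actually returns.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.List;

// Hypothetical seam: anything that can turn a dataset URL into a raw response body.
interface DatasetFetcher {
    String fetch(String datasetUrl) throws IOException, InterruptedException;
}

// Does one thing only: downloads the body found at the given URL.
final class HuggingFaceDatasetFetcher implements DatasetFetcher {
    private final HttpClient client = HttpClient.newHttpClient();

    @Override
    public String fetch(String datasetUrl) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(URI.create(datasetUrl)).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected status " + response.statusCode());
        }
        return response.body();
    }
}

// Owns the dataset-specific URL and parsing; the network call is injected.
final class ClimateFeverDataset {
    // Placeholder only, NOT the real endpoint.
    static final String DATASET_URL = "https://example.invalid/climate-fever";

    private final DatasetFetcher fetcher;

    ClimateFeverDataset(DatasetFetcher fetcher) {
        this.fetcher = fetcher;
    }

    List<String> claims() throws IOException, InterruptedException {
        return parseClaims(fetcher.fetch(DATASET_URL));
    }

    // Assumed toy format: one claim per line, blank lines ignored.
    static List<String> parseClaims(String body) {
        List<String> claims = new ArrayList<>();
        for (String line : body.split("\n")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                claims.add(trimmed);
            }
        }
        return claims;
    }
}

// Usage: a lambda stands in for the network call, so no request is made.
class FetcherSketchDemo {
    public static void main(String[] args) throws Exception {
        DatasetFetcher stub = url -> "Claim one\n\n  Claim two  \n";
        System.out.println(new ClimateFeverDataset(stub).claims());
    }
}
```

With this shape, a unit test can pass a stub like the one in `FetcherSketchDemo` and exercise only the parsing, while a separate integration test constructs the real `HuggingFaceDatasetFetcher`.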

If you want to take on this suggested refactor, it's probably best to do it in a follow-up PR.

Originally posted by @jhamon in #4 (review)

Refactor nested `stream` in `embedMany` method to enhance readability

// Create an empty list to hold the lists of Float values
List<List<Float>> floatEmbeddings = new ArrayList<>();

// Iterate over each Embedding object
for (Embedding embedding : embeddings) {
    // Create a new list to hold the Float values for this embedding
    List<Float> floatList = new ArrayList<>();

    // Convert each Double value to Float and add it to the list
    for (Double value : embedding.getEmbedding()) {
        floatList.add(value.floatValue());
    }

    // Add the list of Float values to the outer list
    floatEmbeddings.add(floatList);
}

return floatEmbeddings;
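
For comparison, here is a self-contained sketch that puts a nested-stream version next to a loop version with the inner conversion pulled into a helper. The `Embedding` class below is only a stand-in exposing `getEmbedding()`, and the nested-stream method is a guess at the shape the issue title refers to, not a copy of the current `embedMany`; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Stand-in for the library's Embedding type; only getEmbedding() is assumed.
final class Embedding {
    private final List<Double> values;

    Embedding(List<Double> values) {
        this.values = values;
    }

    List<Double> getEmbedding() {
        return values;
    }
}

final class EmbeddingConversions {
    // Nested-stream form (assumed shape of the code the issue wants refactored).
    static List<List<Float>> toFloatsWithStreams(List<Embedding> embeddings) {
        return embeddings.stream()
                .map(e -> e.getEmbedding().stream()
                        .map(Double::floatValue)
                        .collect(Collectors.toList()))
                .collect(Collectors.toList());
    }

    // Loop form: the outer loop reads top to bottom, the inner step has a name.
    static List<List<Float>> toFloatsWithLoops(List<Embedding> embeddings) {
        List<List<Float>> floatEmbeddings = new ArrayList<>();
        for (Embedding embedding : embeddings) {
            floatEmbeddings.add(toFloatList(embedding.getEmbedding()));
        }
        return floatEmbeddings;
    }

    // Converts one vector of Double values to Float values.
    static List<Float> toFloatList(List<Double> values) {
        List<Float> floats = new ArrayList<>(values.size());
        for (Double value : values) {
            floats.add(value.floatValue());
        }
        return floats;
    }
}
```

Both methods return the same nested lists for the same input, so the choice between them comes down to readability.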
