pinecone-io / java-examples
License: Apache License 2.0
Pointers shouldn't really cross class boundaries. They should stay in the `Semantic` class.
Originally posted by @aulorbe in #3 (comment)
Some of these tests are a bit tough for me to follow because I'm not familiar with all of the code being tested; going forward, let's try to add code and tests at the same time. This will help me be a better reviewer, and make sure we're not adding lots of code without tests.
Overall this seems like a great first step. Good work.
A couple things that stood out:
- `HuggingFaceManager` makes a network call to download the dataset from HF, so those tests are what I would normally consider integration tests rather than unit tests. By integration test, I mean a test that depends on some kind of production dependency outside of our control, which can lead to test flakes if HF ever has downtime. In this case, it's not a huge deal since they probably park these datasets on some kind of CDN with high availability, but it's a good thing to keep in mind when writing tests. When I'm setting up CI stuff, I like to run integration tests separately from unit tests.
- Consider splitting `HuggingFaceManager` into two classes:
  - `HuggingFaceDatasetFetcher`, which does 1 thing only: given a dataset url, it downloads stuff from HF. It could be reused in the future to fetch multiple different datasets instead of hardcoding the url.
  - `ClimateFeverDataset`, which hardcodes the api url for that dataset, uses `HuggingFaceDatasetFetcher` to fetch it, and then handles the parsing logic that is specific to that dataset (e.g. `claims` and all that).

  If you break it down in this way, you can use mocks to stub the network call to HF and test the parsing with fixture data.

If you want to take on this suggested refactor, it's probably best to do it in a follow-up PR.
Originally posted by @jhamon in #4 (review)
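The split suggested in the review above might look roughly like the following. This is a minimal sketch under stated assumptions, not the repository's code: only the class names `HuggingFaceDatasetFetcher` and `ClimateFeverDataset` come from the review, while the `fetch` and `claims` method signatures, the placeholder URL, the fixture shape, and the regex-based parsing (a stand-in for a real JSON parser) are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FetcherSplitSketch {

    /** Does one thing only: given a dataset URL, return the raw response body. */
    interface HuggingFaceDatasetFetcher {
        String fetch(String datasetUrl);
    }

    /** Knows the dataset's URL and its parsing rules, but not how to download. */
    static class ClimateFeverDataset {
        // Placeholder URL (assumption); the real endpoint is not given in the review.
        static final String DATASET_URL = "https://example.invalid/climate-fever";

        // Simplified stand-in for JSON parsing: pulls out every "claim" string value.
        private static final Pattern CLAIM =
                Pattern.compile("\"claim\"\\s*:\\s*\"([^\"]*)\"");

        private final HuggingFaceDatasetFetcher fetcher;

        ClimateFeverDataset(HuggingFaceDatasetFetcher fetcher) {
            this.fetcher = fetcher;
        }

        List<String> claims() {
            String body = fetcher.fetch(DATASET_URL);
            List<String> claims = new ArrayList<>();
            Matcher matcher = CLAIM.matcher(body);
            while (matcher.find()) {
                claims.add(matcher.group(1));
            }
            return claims;
        }
    }

    public static void main(String[] args) {
        // Fixture data replaces the network call, so this runs offline.
        String fixture =
                "{\"rows\":[{\"claim\":\"Sea levels are rising\"},{\"claim\":\"Ice is melting\"}]}";
        HuggingFaceDatasetFetcher stub = url -> fixture;

        System.out.println(new ClimateFeverDataset(stub).claims());
    }
}
```

Because the fetcher is injected, the parsing test never touches the network; a real fetcher implementation would then be covered by a separate integration test.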
From the meeting with Silas: think about what data type you want it to be *in Pinecone*, based on the use case and how you want to use it in reality.
Originally posted by @aulorbe in #3 (comment)
// Create an empty list to hold the lists of Float values
List<List<Float>> floatEmbeddings = new ArrayList<>();
// Iterate over each Embedding object
for (Embedding embedding : embeddings) {
    // Create a new list to hold the Float values for this embedding
    List<Float> floatList = new ArrayList<>();
    // Convert each Double value to Float and add it to the list
    for (Double value : embedding.getEmbedding()) {
        floatList.add(value.floatValue());
    }
    // Add the list of Float values to the outer list
    floatEmbeddings.add(floatList);
}
return floatEmbeddings;
Solve after speaking with Silas: Make 2 constructors -- one with default model, one where users can pass a model.
Originally posted by @aulorbe in #3 (comment)
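The two-constructor suggestion above could be sketched as follows. The class name, the `DEFAULT_MODEL` value, and the `getModel` accessor are hypothetical placeholders, since the thread does not name the class or the default model.

```java
import java.util.Objects;

public class EmbeddingClientSketch {
    // Placeholder default (assumption); the actual default model is not stated in the thread.
    static final String DEFAULT_MODEL = "default-model";

    private final String model;

    /** Uses the default model by delegating to the other constructor. */
    public EmbeddingClientSketch() {
        this(DEFAULT_MODEL);
    }

    /** Lets callers pass their own model. */
    public EmbeddingClientSketch(String model) {
        this.model = Objects.requireNonNull(model, "model");
    }

    public String getModel() {
        return model;
    }

    public static void main(String[] args) {
        System.out.println(new EmbeddingClientSketch().getModel());
        System.out.println(new EmbeddingClientSketch("custom-model").getModel());
    }
}
```

Delegating with `this(...)` keeps the initialization logic in one place, so the two constructors cannot drift apart.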
Ideally, we should use the same connection, and at the end of your program you can call index.close(), which will close the connection. So you don't need try-with-resources.
Originally posted by @rohanshah18 in #3 (comment)
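To illustrate the connection-reuse pattern described above, here is a small sketch. `FakeIndex` is a hypothetical stand-in written for this example and is not the Pinecone client API; only the idea of calling `index.close()` once at the end comes from the comment.

```java
public class SharedConnectionSketch {

    /** Hypothetical stand-in for an index connection; not the Pinecone client API. */
    static class FakeIndex implements AutoCloseable {
        static int opened = 0; // counts how many connections were created

        private boolean closed = false;

        FakeIndex() {
            opened++;
        }

        void upsert(String id) {
            if (closed) {
                throw new IllegalStateException("connection already closed");
            }
            System.out.println("upserted " + id);
        }

        boolean isClosed() {
            return closed;
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    public static void main(String[] args) {
        FakeIndex index = new FakeIndex(); // open the connection once

        // Reuse the same connection for every operation.
        index.upsert("a");
        index.upsert("b");

        index.close(); // close once, at the end of the program
        System.out.println("connections opened: " + FakeIndex.opened);
    }
}
```

Wrapping each operation in its own try-with-resources block would instead open and close a connection per call, which is the overhead the comment is steering away from.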
Fancy "stream-within-a-stream" :)
My brain is having trouble parsing it, can you provide some examples of what the source object looks like and what it needs to turn into?
Originally posted by @ssmith-pc in #3 (comment)
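The code under review is not shown in this thread, so the following is only an illustration of the kind of before/after example being asked for. It assumes, hypothetically, that the nested stream converts a list of `Double` lists into a list of `Float` lists, like the loop-based conversion earlier in this thread; the class and method names are made up for the sketch.

```java
import java.util.List;
import java.util.stream.Collectors;

public class NestedStreamSketch {

    /** Outer stream walks the embeddings; inner stream converts each value. */
    static List<List<Float>> toFloats(List<List<Double>> source) {
        return source.stream()
                .map(inner -> inner.stream()
                        .map(Double::floatValue)
                        .collect(Collectors.toList()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Source object: a list of embeddings, each a list of Double values.
        List<List<Double>> source = List.of(List.of(0.5, 1.25), List.of(2.0));

        // Target object: the same shape, with every element converted to Float.
        List<List<Float>> target = toFloats(source);

        System.out.println("source: " + source);
        System.out.println("target: " + target);
        System.out.println("element type: " + target.get(0).get(0).getClass().getSimpleName());
    }
}
```

Pulling the inner stream into a named helper method is one way to make a stream-within-a-stream easier to read in review.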