
ml5-next-gen's People

Contributors

b2xx, ch3926, gohai, orpheask, shiffman, ziyuan-linn

ml5-next-gen's Issues

NPM vs Yarn

Hi everyone,

I am trying to install some packages for MoveNet with npm and got an ERESOLVE unable to resolve dependency tree error.

It seems that the TensorFlow packages pull in conflicting versions of their dependencies, which npm refuses to resolve.

The TensorFlow installation guides for newer models use Yarn instead of npm when installing the packages. I tried installing MoveNet packages with Yarn instead of npm and was successful. I think the task of managing dependencies will be easier with Yarn, and I'm wondering whether we should switch our package manager.

camelCase pose models?

I'm doing final edits of my Nature of Code book and I noticed that I had referenced ml5.handpose(). I initially thought this was a typo and it should be ml5.handPose(), but looking at the existing models and examples, I see it is handpose()! I don't have a strong feeling one way or another but since I'm finishing the book it might be nice to make a decision:

  • handpose vs. handPose
  • bodypose vs. bodyPose
  • facemesh vs. faceMesh

Thoughts @sproutleaf @ziyuan-linn @MOQN @gohai? I do see we have bodySegmentation and imageClassifier so perhaps the camelCase is more consistent?

Sorry for the tags, but my deadline is Friday lol! (I mean, it can be wrong in the book, it won't be the end of the world.)

API for continuous vs. one-time body/hand/face detection

I'm opening this thread to discuss aligning the API across all keypoint detection models (currently planned: hand, body, face). This relates to #21.

The hand model implemented by @ziyuan-linn uses an event callback: handpose.on("hand", gotHands). I like this pattern and it would make sense to adopt it for body pose and face keypoint tracking. However, should we also provide a function for a "one-time" hand detection in an image? Here's one idea:

Continuous

// continuous if a video element is passed into handpose() and on() is used? 
// Or should video be passed into on()?
let handpose = ml5.handpose(video, modelReady);
handpose.on("hand", gotHands);
function modelReady() {
  console.log("model ready");
}
function gotHands(results) {
  console.log(results);
}

One-time

// No media is passed in when loading the model
let handpose = ml5.handpose(modelReady);
function modelReady() {
  console.log("model ready");
  // Individual image passed in?
  handpose.detect(img, gotHands);
}
function gotHands(results) {
  console.log(results);
}

This brings up two other key questions:

Support async/await?

Should we support async and await or keep things simple and more aligned with p5.js, callbacks only? The following syntax is so lovely, but not at all how p5.js does things! cc @Qianqianye

async function runHandPose() {
  let handpose = await ml5.handpose();
  let results = await handpose.detect(img);
  console.log(results);
}

Error first callbacks?

The p5 style for callbacks is to have a single argument containing the results.

function gotHands(results) {
  console.log(results);
}

ml5.js has had "error first" callbacks for most of the models in prior versions:

function gotHands(error, results) {
  console.log(results);
}

My preference for beginners is the p5.js style (and include an optional second error argument for debugging). However, I believe ml5.js adopted the error first callback previously because (a) it is a JS convention and (b) it was required for us to be able to support both callbacks and async/await. @joeyklee tagging you in here, is my memory of this correct?

I might lean towards keeping things simple: (a) not support async/await (radical!) and therefore (b) not require error-first callbacks. But I'm not at all sure here; that async/await syntax is quite nice!
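For context on (b): one pattern that supports both styles without error-first callbacks is to always return a promise and treat the callback as optional. A minimal sketch, assuming a hypothetical internal runModel() inference call (not ml5's actual internals):

// detect() returns a promise but also accepts an optional results-first callback
function detect(img, callback) {
  const resultsPromise = runModel(img); // hypothetical async inference call
  if (callback) {
    resultsPromise
      .then((results) => callback(results))
      .catch((error) => callback(undefined, error)); // error as optional second argument
  }
  return resultsPromise; // so `await handpose.detect(img)` also works
}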

@MOQN, would love your thoughts as someone who has taught these models quite a bit.

Output API for hand

Hi everyone, I'm opening this thread to discuss the model prediction output for hand detection, though I think a lot of what follows also applies to other landmark detection models.

Keypoints

The original tf.js output for hand detection looks like this:

[
  {
    score: 0.8,
    handedness: "Right",
    keypoints: [
      {x: 105, y: 107, name: "wrist"},
      {x: 108, y: 160, name: "pinky_finger_tip"},
      ...
    ],
    keypoints3D: [
      {x: 0.00388, y: -0.0205, z: 0.0217, name: "wrist"},
      {x: -0.025138, y: -0.0255, z: -0.0051, name: "pinky_finger_tip"},
      ...
    ]
  },
  {
    score: 0.9,
    handedness: "Left",
    ...
  }
]

One idea is to expose each keypoint by name so they can be more intuitively accessed, for example:

[
  {
    score: 0.8,
    handedness: "Right",
    wrist: { //<-----------------------add
      x: 105,
      y: 107,
      "3dx": 0.00388,
      "3dy": -0.0205,
      "3dz": 0.0217
    },
    pinky_finger_tip: { //<-----------------------add
      x: 108,
      y: 160,
      "3dx": -0.025138,
      "3dy": -0.0255,
      "3dz": -0.0051
    },
    keypoints: [
      {x: 105, y: 107, name: "wrist"},
      {x: 108, y: 160, name: "pinky_finger_tip"},
      ...
    ],
    keypoints3D: [
      {x: 0.00388, y: -0.0205, z: 0.0217, name: "wrist"},
      {x: -0.025138, y: -0.0255, z: -0.0051, name: "pinky_finger_tip"},
      ...
    ]
  },
  {
    score: 0.9,
    handedness: "Left",
    ...
  }
]
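If we went with this shape, a sketch could access a landmark directly by name. Hypothetical usage, assuming the structure above:

function gotHands(results) {
  if (results.length > 0) {
    const wrist = results[0].wrist;
    circle(wrist.x, wrist.y, 10); // p5.js: draw a marker at the wrist
  }
}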

@yining1023 suggested grouping landmarks of each finger together with intuitive names like wrist, thumb, etc...

[
    {
        "handedness": "Left",
        "wrist": { //<-----------------------add
            "x": 57.2648811340332,
            "y": 489.2754936218262,
            "z": 0.00608062744140625,
            "confidence": 0.89
        },
        "thumb": [ //<-----------------------add
            {
                "x": 57.2648811340332,
                "y": 489.2754936218262,
                "z": 0.00608062744140625,
                "confidence": 0.89,
            },
            {...},
            {...},
            {...},
            {...},
        ],
        "indexFinger":[], //<-----------------------add
        "middleFinger":[], //<-----------------------add
        "ringFinger":[], //<-----------------------add
        "pinky":[], //<-----------------------add
        "keypoints": [
            {
                "x": 57.2648811340332,
                "y": 489.2754936218262,
                "name": "wrist"
            },
        ],
        "keypoints3D": [
            {
                "x": -0.03214273601770401,
                "y": 0.08357296139001846,
                "z": 0.00608062744140625,
                "name": "wrist"
            }
        ],
        "score": 0.9638671875,
    },
    {
       "handedness": "Right",
       ...
    }
]

Handedness

I think this feature could potentially be very useful for users. However, the handedness is the opposite of the actual hand (left hand labeled as right). I found that when flipHorizontal is set to true, the handedness would be labeled correctly. We could potentially flip the handedness value within ml5 when flipHorizontal is false.
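A minimal sketch of that internal flip, assuming the tf.js output shape above (correctHandedness is a hypothetical helper name):

// Swap handedness labels when the video is not mirrored, so "Left"
// actually refers to the viewer's left side.
function correctHandedness(hands, flipHorizontal) {
  if (flipHorizontal) return hands; // tf.js labels are already correct here
  return hands.map((hand) => ({
    ...hand,
    handedness: hand.handedness === "Left" ? "Right" : "Left",
  }));
}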

Keypoint Diagram

Tf.js has a diagram outlining the index and name of each keypoint.

I personally find this kind of diagram very helpful when trying to find a landmark point quickly. I think there are similar diagrams for other tf.js landmark detection models. @MOQN Do you think we could display or link these diagrams on the new website?

I'm happy to hear any suggestions or ideas!

Supporting p5.js preload

@shiffman Hi everyone, I am starting this discussion because I just realized that initialization functions like ml5.handpose() need to be hooked into p5's preload() function.

@gohai mentioned here that the old p5PreloadHelper does not seem to be working, and he instead called _incrementPreload() and _decrementPreload() directly. I am not really familiar with p5's implementation, so I am not sure how preload() works under the hood. Would it make sense for us to update the p5PreloadHelper so we could easily hook into the preload function?
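For reference, here is a minimal sketch of what such a helper might do, assuming p5 in global mode, where _incrementPreload() and _decrementPreload() live on window (they are p5 internals, which is exactly why a maintained p5PreloadHelper would be nice):

// Wrap a model loader so p5's preload() waits for it.
function preloadable(loadModel) {
  return function (...args) {
    if (typeof window._incrementPreload === "function") {
      window._incrementPreload(); // tell p5 one more resource is pending
    }
    const model = loadModel(...args); // assumed to expose a `ready` promise
    model.ready.then(() => {
      if (typeof window._decrementPreload === "function") {
        window._decrementPreload(); // let setup() proceed
      }
    });
    return model;
  };
}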

I need MoveNet in ml5

Dear ml5 community,
I need MoveNet in my projects, so when will MoveNet be supported in ml5?

Feature Request: Tensorflow_Converter in ml5js

Any chance ml5.js can include a JavaScript version of the TensorFlow converter, specifically for converting TensorFlow.js files to TensorFlow Lite files? I already have a JavaScript file that might convert TensorFlow Lite files into C header files ready for loading onto a microcontroller: https://hpssjellis.github.io/webMLserial/public/convert/xxd03.html

@shiffman has spoken with me on the issue of webSerial to microcontroller to machine learning back to microcontroller at: #9

Here is a link to the tensorflow forum about this topic

My vanilla Javascript webSerial research page is here

BodyPix seems to be making http request each frame

I have not looked into this extensively, but when bodyPix is running continuous detection, it makes a network request for each frame and adds an image to the sources. The original BodyPix also continuously makes requests, but it does not add images to the sources. I am not sure if this is the intended behavior.

Feature Request: ml5 with Arduino Microcontrollers using webSerial

I am working on a proof of concept that uses TensorFlow.js with microcontroller sensors to build machine learning models that can connect to microcontroller actuators after classification, using webSerial and hopefully a polyfill for Android. I feel that ml5.js and p5.js could probably do the same thing, but more professionally than my skills allow.

Is anyone on the ml5.js team familiar with Arduino MBED-style microcontrollers such as the Nano 33 BLE Sense, the Portenta H7, or the new Seeed Studio XIAO SAMD or XIAO ESP32S3?

Here is a YouTube video of taking a Hugging Face model and having an Arduino servo motor respond to the classifications. I am working on uploading sensor data to TensorFlow.js, but testing needs an Arduino, hence the question about whether anyone on your team is familiar with Arduinos.

I am not that interested in the extra steps of converting the TensorFlow.js model to TensorFlow Lite and then using xxd -i to convert it into a C header file for compiling to the Arduino.

Topology mismatch with 0.20.0-alpha releases

Not wanting to open duplicate issues, but I have narrowed this issue down to the last three alpha releases of ml5.

It is not present with ml5@latest/dist/ml5.min.js

However, it seems that when saving and loading model files with the neural network function in the alpha releases, TensorFlow expects a different topology structure.

I have tried this with the next-gen handpose and also a simple mouse X,Y regression. Both cause this error as soon as the version is 0.20.0-alpha.1, 2, or 3.

"VERSION" global variable conflict with p5

Both p5.js and @tensorflow-models/face-landmark-detection use VERSION as a global variable. This causes p5 to throw a warning that VERSION is a p5-reserved variable. To not cause confusion among the users, the current library uses patch-package to remove the VERSION global variable from the @tensorflow-models/face-landmark-detection file.

This fix prevents p5 from outputting a warning; however, it also modifies the dependency code. We should consider this fix a temporary hack and eventually contact p5 or TensorFlow about this issue.

Support loading offline models

Add the ability to load models offline, so that users can still run their projects with spotty Internet or from places in the world where access to online models is restricted (e.g., China).

See prior discussion in ml5js/ml5-library#1254.

Notably, now that we're using TF version ^4.2.0, we should be able to specify where the model is loaded from. To quote Joey's reply in the issue mentioned above:

If we do end up updating our tf versions to some of the more recent versions, then it looks like in the latest face-landmarks-detection lib we can specify where our model files should be loaded from -- https://github.com/tensorflow/tfjs-models/tree/master/face-landmarks-detection/src/mediapipe#create-a-detector -- which, in this case, would be somewhere on a local server.
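Following that linked README, here is a sketch of pointing the detector at locally hosted model files (the runtime and local path are assumptions, not settled ml5 API):

import * as faceLandmarksDetection from "@tensorflow-models/face-landmarks-detection";

// Inside an ES module (for top-level await) or an async function:
const model = faceLandmarksDetection.SupportedModels.MediaPipeFaceMesh;
const detector = await faceLandmarksDetection.createDetector(model, {
  runtime: "mediapipe",
  solutionPath: "/models/face_mesh", // hypothetical path served by our own server
});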

Discussion: ImageClassifier

I'm opening this thread as a place for us to discuss ongoing work on the ImageClassifier model. Some of the tasks are:

  • Decide where the user should specify the number of classes to return (the topk variable). Should it be when they create the model? When they call classify/classifyStart? Or both? In the new version of the detector models we eliminated the ability to pass options to the detect function. Following that example, we would support ml5.imageClassifier('doodlenet', { topk: 5 }) but not myClassifier.classify(image, 5) (see the usage sketch after this list). Right now passing the option to ml5.imageClassifier() only works for MobileNet.
  • Separate handling which only applies to one model type ml5js/ml5-library#1362
  • Figure out where we want to handle conversion to Tensor, related to ml5js/ml5-library#1396
  • Clean up preprocessing code using the utilities created in ml5js/ml5-library#1398
  • See if there are parts of the legacy code that don't make sense. For example, we look for a property ml5Specs in the model.json file for custom models. That property would only be set if the user trained their model using the ml5 FeatureExtractor, which we haven't ported over here. But users may have trained models on the old version, so I guess we should keep it?
  • Look into what a model from TeachableMachine actually looks like and see if we are handling it properly. Is the image size always 224? Is our range of normalized values correct? Can we infer these values from the model in case the user has a custom trained model which doesn't match our expectations?
  • Write unit tests for all model types which make sure that the model gives the correct prediction. There's a lot in https://github.com/ml5js/ml5-library/blob/b4960e9f8d5673830b6e149ad9da0e3548ec636a/src/ImageClassifier/index.test.js but even that is incomplete.
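Usage sketch for the topk question in the first bullet (hypothetical until the API is settled):

// topk is specified once, when the model is created:
let classifier = ml5.imageClassifier("doodlenet", { topk: 5 }, modelLoaded);

function modelLoaded() {
  classifier.classify(img, gotResults); // no per-call topk, matching the detector models
}

function gotResults(results) {
  console.log(results); // up to 5 labels with confidence scores
}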

ml5 errors when loading two different models simultaneously

Hi all,

Thank you @Tinazz333 for discovering and letting me know about the issue below.

When loading handpose and facemesh (and probably other models as well) simultaneously, a runtime error will occur:

RuntimeError: Aborted(Module.arguments has been replaced with plain arguments_ (the initial value can be provided on Module, but after startup the value is only looked for on a local variable of that name))

The following code in p5 will trigger this error:

let handpose;
let facemesh;
function preload() {
  handpose = ml5.handpose();
  facemesh = ml5.facemesh();
}

A trick that seems to fix the problem is forcing the models to load one by one like this:

let handpose;
let facemesh;
function preload() {
  handpose = ml5.handpose(() => {
    facemesh = ml5.facemesh();
  });
}

However, this still causes the browser to throw a bunch of warnings:

The kernel '...' for backend 'webgpu' is already registered

Idea: universal CSS file for examples

I think it would make a huge difference if we could have a style.css file which we load into all of our examples. We would define styles for HTML elements like h1, canvas, p, etc. (not CSS class names).

We would need to add a <link> tag referencing the stylesheet to all examples, alongside our <script> tags. It could be done with a one-time manual copy & paste or dynamically with the copy-webpack-plugin, which we already use for transforming script URLs. We could then easily re-theme by changing that one style.css file.
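If we go the copy-webpack-plugin route, the config addition could be as small as this (a sketch; the paths are assumptions about the repo layout):

// webpack.config.js
const CopyPlugin = require("copy-webpack-plugin");

module.exports = {
  // ...existing config...
  plugins: [
    new CopyPlugin({
      // Copy the shared stylesheet into the build output (paths assumed).
      patterns: [{ from: "examples/style.css", to: "style.css" }],
    }),
  ],
};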

Here is a quick proof-of-concept. I styled the canvas using the same style as the search box on the examples home page. I centered the contents horizontally, which makes a big difference on large screens.

After 👍

Before 👎

CSS 👩‍💻

body {
    font-family: Arial, Helvetica, sans-serif;
    font-size: 16px;
    display: flex;
    flex-direction: column;
    align-items: center;
}

canvas {
    border: 2px solid black;
    padding: 1rem;
    box-shadow: 1rem 1rem #a15ffb;
    margin: 1rem 2rem 2rem 2rem;
}

h1 {
    text-align: center;
    border-bottom: 2px dotted #a15ffb;
    width: fit-content;
    padding-bottom: 8px;
}

p {
    max-width: 600px;
}

Header for Examples

This is maybe a silly comment, but I sometimes find that when looking at the examples, the copyright notice at the top is a little off-putting / intimidating. Being clear and transparent about software licenses is important; however, maybe we should try a friendlier heading?

Also, the ml5 project has its own license, not MIT! What if we create a new uniform heading something like:

// 👋 Hello! This is an ml5.js example made and shared with ❤️.
// Learn more about the ml5.js project: https://ml5js.org/
// ml5.js license and code of conduct: https://github.com/ml5js/ml5-next-gen/blob/main/LICENSE.md

// This example demonstrates _______________
// Any other notes?

Originally posted by @shiffman in #66 (comment)

Mirror Webcam Video

Hi all!

I'm opening this issue to discuss the possibility of implementing a feature in ml5 that flips the underlying source of a video horizontally, mainly as a convenient way to mirror live webcam footage. This was originally discussed here in this thread.

It would be nice to have a function that flips the webcam video with one line of code and does not interfere with other things on the canvas, as shown in the code below:

let video, flippedVideo;
function setup() {
  video = createCapture(VIDEO);
  flippedVideo = ml5.flipVideo(video);
}

As far as implementing a function like ml5.flipVideo(video), I am imagining these steps (a rough sketch follows the list):

  1. Get the original video element from the parameter (either a p5 video or an HTML video).
  2. Create an HTML canvas and repeatedly draw the video image, flipped horizontally.
  3. Convert the HTML canvas into a media stream using captureStream().
  4. Return the media stream as a video or p5 video depending on the user input type.
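Here is a rough sketch of steps 1-3 for the plain-HTML-video case (it assumes the video metadata has already loaded; the p5 wrapping from step 4 is omitted):

function flipVideo(videoEl) {
  // Step 2: repeatedly draw the video onto a canvas, flipped horizontally.
  const canvas = document.createElement("canvas");
  canvas.width = videoEl.videoWidth;
  canvas.height = videoEl.videoHeight;
  const ctx = canvas.getContext("2d");

  function draw() {
    ctx.save();
    ctx.scale(-1, 1); // mirror around the vertical axis
    ctx.drawImage(videoEl, -canvas.width, 0, canvas.width, canvas.height);
    ctx.restore();
    requestAnimationFrame(draw);
  }
  draw();

  // Step 3: turn the canvas into a MediaStream and play it in a new video element.
  const flipped = document.createElement("video");
  flipped.srcObject = canvas.captureStream();
  flipped.play();
  return flipped;
}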

I am not 100% sure whether this would work, so any suggestions or alternative solutions are welcome!
