ml5js / ml5-next-gen
Repo for next generation of ml5.js: friendly machine learning for the web! 🤖
Home Page: https://ml5js.org/
License: Other
Not wanting to open duplicate issues, but I have narrowed this issue down to the last 3 alpha releases of ml5.
It is not present with ml5@latest/dist/ml5.min.js.
However, it seems that when saving and loading files with the neural net function in the alpha releases, the TensorFlow topology expects a different structure?
I have tried this with the next-gen handpose and also a simple mouse X,Y regression. Both cause this error as soon as the version is 0.20.0 alpha 1, 2, or 3.
I am working on a proof of concept that uses TensorFlow.js with microcontroller sensors to build machine learning models that can connect to microcontroller actuators after classification, using Web Serial and hopefully a polyfill for Android. I feel that ml5.js and p5.js could probably do the same thing, but more professionally than my skills allow.
Is anyone on the ml5.js team familiar with Arduino MBED-style microcontrollers such as the Nano 33 BLE Sense, the Portenta H7, or the new Seeed Studio XIAO SAMD or XIAO ESP32S3?
Here is a YouTube video of taking a Hugging Face model and having an Arduino servo motor respond to the classifications. I am working on uploading sensor data to TensorFlow.js, but testing needs an Arduino, hence the question about whether anyone on your team is familiar with Arduinos.
I am not that interested in the extra steps of converting the TensorFlow.js model to TensorFlow Lite and then using xxd -i to convert it into a C header file for compiling to the Arduino.
I'm opening this thread to discuss aligning the API across all keypoint detection models (currently planned: hand, body, face). This relates to #21.
The hand model implemented by @ziyuan-linn uses an event callback with handpose.on("hand", gotHands);
I like this pattern and it would make sense to adopt it for body pose and face keypoint tracking. However, should we also provide a function for a "one-time" hand detection in an image? Here's one idea:
// continuous if a video element is passed into handpose() and on() is used?
// Or should video be passed into on()?
let handpose = ml5.handpose(video, modelReady);
handpose.on("hand", gotHands);
function modelReady() {
console.log("model ready");
}
function gotHands(results) {
console.log(results);
}
// No media is passed in when loading the model
let handpose = ml5.handpose(modelReady);
function modelReady() {
console.log("model ready");
// Individual image passed in?
handpose.detect(img, gotHands);
}
function gotHands(results) {
console.log(results);
}
This brings up two other key questions:
Should we support async and await, or keep things simple and more aligned with p5.js callbacks only? The following syntax is so lovely, but not at all how p5.js does things! cc @Qianqianye
async function runHandPose() {
let handpose = await ml5.handpose();
let results = await handpose.detect(img);
console.log(results);
}
The p5 style for callbacks is to have one object containing the results.
function gotHands(results) {
console.log(results);
}
ml5.js has had "error first" callbacks for most of the models in prior versions:
function gotHands(error, results) {
console.log(results);
}
My preference for beginners is the p5.js style (and include an optional second error argument for debugging). However, I believe ml5.js adopted the error-first callback previously because (a) it is a JS convention and (b) it was required for us to be able to support both callbacks and async/await. @joeyklee tagging you in here, is my memory of this correct?
I might lean towards keeping things simple: (a) not support async/await (radical!) and therefore (b) not require error-first callbacks. But I'm not at all sure here, that async/await syntax is quite nice!
@MOQN, would love your thoughts as someone who has taught these models quite a bit.
Here is a sketch with old ML5 facemesh making a mask
https://editor.p5js.org/dano/sketches/2ULaz2E0v
Here is a sketch with new ML5 facemesh making a mask
https://editor.p5js.org/dano/sketches/-3gKqxIZD
It appears that all the points for the "faceOval" property come in but they are not in a consecutive order around the face.
As ever thanks for this amazing library
In preparing an ml5.js alpha build, we noticed the file size is quite large.
It would be worth taking the time to research why this is and whether there are dependencies we can remove / scale down. I recall this used to be an issue due to the inclusion of magenta, but I don't think we include it anymore?
The values look like this:
{
"yMin": 0.1048629879951477,
"xMin": 0.1899825632572174,
"yMax": 0.9370519518852234,
"xMax": 0.7449000477790833,
"width": 0.5549174845218658,
"height": 0.8321889638900757
}
I get the right pixel values by multiplying by the p5 width and height:
rect(boundingBox.xMin * width, boundingBox.yMin * height, boundingBox.width * width, boundingBox.height * height);
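Since the box values are normalized to [0, 1], the scaling can live in one small helper. A sketch (boxToPixels is my name for illustration, not an ml5 API):

```javascript
// Convert a normalized bounding box (as in the JSON above) to pixel
// coordinates for a given canvas size.
function boxToPixels(box, width, height) {
  return {
    x: box.xMin * width,
    y: box.yMin * height,
    w: box.width * width,
    h: box.height * height,
  };
}

const box = { yMin: 0.1, xMin: 0.2, yMax: 0.9, xMax: 0.7, width: 0.5, height: 0.8 };
console.log(boxToPixels(box, 640, 480)); // { x: 128, y: 48, w: 320, h: 384 }
```

In a p5 sketch the result can be passed straight to `rect(r.x, r.y, r.w, r.h)`.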
Hi everyone! I realize there's a new version of the library being developed, so this bug doesn't actually have to be addressed, but I thought the lessons learned about its cause might be useful for future development.
So I was running PoseNet in a sketch being used in an art installation, intended to be running for days at a time. After a few hours on, it would mostly still run fast, but would drop a bunch of frames once in a while (at an accelerating pace, after being left on for e.g. 12 hours.) This shows up in a performance profile as a 1.2s(!!) garbage collection:
I compared some heap profiles over time to see what got added, and noticed that some of the added memory includes a promise, referenced by a promise, referenced by a promise, referenced by a promise...:
I narrowed this down to the usage of promises in multiPose:
https://github.com/ml5js/ml5-library/blob/f80f95aa6b31191ab7e79ff466e8506f5e48a172/src/PoseNet/index.js#L180-L182
So, this code has the leak, using the automatic detection on the next frame, going through the promises above:
class MyDetector {
constructor(constraints) {
this.capture = createCapture(constraints)
this.poseNet = ml5.poseNet(this.capture, {
modelUrl: 'lib/model-stride16.json',
graphModelURL: 'lib/group1-shard1of1.bin',
})
this.poseNet.on('pose', (results) => {
this.processNextPose(results)
})
}
processNextPose(results) {}
}
Meanwhile, this code does not seem to leak, by not initializing PoseNet with a video and instead manually calling multiPose:
class MyDetector {
constructor(constraints) {
this.capture = createCapture(constraints)
this.ready = false; // Manually store when we're ready to process a frame
this.poseNet = ml5.poseNet(undefined, { // Pass undefined here
modelUrl: 'lib/model-stride16.json',
graphModelURL: 'lib/group1-shard1of1.bin',
}, () => {
this.ready = true
})
this.poseNet.on('pose', (results) => {
this.processNextPose(results)
})
}
// Call this every frame in draw():
update() {
if (this.ready) {
this.ready = false;
this.poseNet.multiPose(this.capture)
requestAnimationFrame(() => {
this.ready = true
})
}
}
processNextPose(results) {}
}
So it seems like the problem was in fact the buildup of contexts in the promises. It tries to return the promise from multiPose, effectively doing an asynchronous infinite loop on itself, and I guess the browser keeps around the whole chain of promises because it's being returned instead of just awaited.
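As an aside, the retained-chain behavior can be reproduced outside ml5 in miniature. A contrived sketch (chainForever and tickForever are illustrative names, not ml5 code):

```javascript
// chainForever returns the next iteration's promise, so every promise in the
// sequence stays referenced by its predecessor -- the whole chain is retained
// until the last one settles (the leak pattern in miniature).
function chainForever(n) {
  if (n === 0) return Promise.resolve("done");
  return Promise.resolve().then(() => chainForever(n - 1));
}

// tickForever schedules the next iteration WITHOUT returning its promise, so
// each settled promise becomes unreachable and can be garbage collected.
function tickForever(n, done) {
  if (n === 0) {
    done("done");
    return;
  }
  Promise.resolve().then(() => tickForever(n - 1, done));
}

chainForever(3).then((result) => console.log("chained:", result));
tickForever(3, (result) => console.log("fire-and-forget:", result));
```

With a terminating count both finish fine; the difference only matters when the loop runs indefinitely, as in a detection loop that runs for hours.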
Anyway, I just wanted to flag this in case the same pattern is being used elsewhere!
@shiffman Hi everyone, I am starting this discussion because I just realized that initialization functions like ml5.handpose() need to be hooked into p5's preload() function.
@gohai mentioned here that the old p5PreloadHelper does not seem to be working, and instead called _incrementPreload() and _decrementPreload(). I am not really familiar with p5's implementation, so I am not sure how preload() works under the hood. Would it make sense for us to update the p5PreloadHelper so we could easily hook into the preload function?
Any chance ml5.js can include a JavaScript version of Tensorflow_Converter, specifically converting TensorFlow.js files to TensorFlow Lite files? I already have a JavaScript file that might convert TensorFlow Lite files into C header files ready for loading onto a microcontroller: https://hpssjellis.github.io/webMLserial/public/convert/xxd03.html
@shiffman has spoken with me on the issue of Web Serial to microcontroller to machine learning back to microcontroller at: #9
Here is a link to the TensorFlow forum about this topic.
My vanilla JavaScript Web Serial research page is here.
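For reference, the xxd -i style step mentioned above is simple to sketch in JavaScript (toCHeader is a hypothetical helper, not part of ml5 or the linked page):

```javascript
// Turn a byte buffer into xxd -i style C header text, the format
// microcontroller toolchains expect for compiled-in model data.
function toCHeader(name, bytes) {
  const body = [...bytes]
    .map((b) => "0x" + b.toString(16).padStart(2, "0"))
    .join(", ");
  return (
    `unsigned char ${name}[] = {${body}};\n` +
    `unsigned int ${name}_len = ${bytes.length};\n`
  );
}

console.log(toCHeader("model_tflite", Uint8Array.from([1, 255, 16])));
// unsigned char model_tflite[] = {0x01, 0xff, 0x10};
// unsigned int model_tflite_len = 3;
```

The hard part remains the TensorFlow.js-to-TensorFlow Lite conversion itself, which this sketch does not address.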
I'm opening this thread as a place for us to discuss ongoing work on the ImageClassifier model. Some of the tasks are:
- Decide where users should specify options (e.g. the topk variable). Should it be when they create the model? When they call classify/classifyStart? Or both? In the new version of the detector models we eliminated the ability to pass options to the detect function. Following that example we would support ml5.imageClassifier('doodlenet', { topk: 5 }) but not myClassifier.classify(image, 5). Right now passing the option to ml5.imageClassifier() only works for mobilenet.
- Decide what to do with the ml5Specs property in the model.json file for custom models. That property would only be set if the user trained their model using the ml5 FeatureExtractor, which we haven't ported over here. But users may have trained models on the old version, so I guess we should keep it?
- Is the input size 224? Is our range of normalized values correct? Can we infer these values from the model in case the user has a custom-trained model which doesn't match our expectations?
Hi all,
Thank you @Tinazz333 for discovering and letting me know about the issue below.
When loading handpose and facemesh (and probably other models as well) simultaneously, a runtime error will occur:
RuntimeError: Aborted(Module.arguments has been replaced with plain arguments_ (the initial value can be provided on Module, but after startup the value is only looked for on a local variable of that name))
The following code in p5 will trigger this error:
let handpose;
let facemesh;
function preload() {
handpose = ml5.handpose();
facemesh = ml5.facemesh();
}
A trick that seems to fix the problem is forcing the models to load one by one like this:
let handpose;
let facemesh;
function preload() {
handpose = ml5.handpose(() => {
facemesh = ml5.facemesh();
});
}
However, this still causes the browser to throw a bunch of warnings:
The kernel '...' for backend 'webgpu' is already registered
Previous issues where folks couldn't get their projects running due to missing `crossOrigin="anonymous"`:
I just upgraded a few sketches which use ml5 handpose and facemesh from 0.12.2 to next-gen alpha4, due to the considerable freezes on model load. The new preload is much more bearable. However, there are many differences in the API of these models besides the loading, part of which may be due to upstream changes. To support the multitude of legacy code, and allow it to transition to next-gen and benefit from smoother loading, I suggest implementing APIs compatible with the legacy library, in parallel with the new APIs. If this approach is acceptable, I can work on a PR for handpose and facemesh.
I think it would make a huge difference if we could have a style.css file which we load into all of our examples. We would define styles for HTML elements like h1, canvas, p, etc. (not CSS class names).
We would need to add a <style> tag to all examples, alongside our <script> tags. It could be done with a one-time manual copy & paste or dynamically with the copy-webpack-plugin, which we use for transforming script URLs. We can easily re-theme by changing that one style.css file.
Here is a quick proof-of-concept. I styled the canvas using the same style as the search box on the examples home page. I centered the contents horizontally which makes a big difference on large screens.
body {
font-family: Arial, Helvetica, sans-serif;
font-size: 16px;
display: flex;
flex-direction: column;
align-items: center;
}
canvas {
border: 2px solid black;
padding: 1rem;
box-shadow: 1rem 1rem #a15ffb;
margin: 1rem 2rem 2rem 2rem;
}
h1 {
text-align: center;
border-bottom: 2px dotted #a15ffb;
width: fit-content;
padding-bottom: 8px;
}
p {
max-width: 600px;
}
I was testing the new p5 mirror function released in 1.9.1 along with handPose.
When mirroring is enabled with the following code:
video = createCapture(VIDEO, { flipped: true });
this happens:
The webcam video is mirrored correctly but the handPose keypoints are still not mirrored.
ml5 runs inferences on the underlying HTML video element when given a p5 video element, which is probably never mirrored. Mirroring an HTML video element is not very straightforward, and I don't think it would make sense for p5 to do it.
Luckily, the tfjs models all have a flipHorizontal property. We could simply set it to true:
handPose = ml5.handPose({ flipHorizontal: true });
and everything will work correctly:
We now have an intuitive two-line solution to the mirroring problem! Perhaps we can add some mirrored examples? We can probably also mention this somewhere in the docs or on the website.
It works fine if you drag as intended, but there are no instructions telling the user what to do.
If you click on the canvas (without ever having dragged), you will get a crash triggered by this line:
The end variable is set by the mouseDragged function. So if there is no drag, then end is undefined and the p5.Vector.sub() function will crash.
Uncaught TypeError: Cannot read properties of undefined (reading 'copy')
at Function.sub (p5.js:92156:31)
at mouseReleased (sketch.js:79:23)
at _main.default._onmouseup (p5.js:78236:38)
ml5-next-gen/examples/NeuralNetwork-mouse-gesture/sketch.js
Lines 74 to 84 in 13ba73c
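One possible fix is a guard in mouseReleased before calling p5.Vector.sub. A minimal sketch outside p5 (sub stands in for p5.Vector.sub, and start/end mirror the example sketch's globals):

```javascript
// sub stands in for p5.Vector.sub on plain {x, y} objects.
function sub(a, b) {
  return { x: a.x - b.x, y: a.y - b.y };
}

let start = { x: 10, y: 20 };
let end; // only ever set by mouseDragged, so undefined on a bare click

function mouseReleased() {
  if (end === undefined) return null; // guard: click without a drag
  return sub(end, start);
}

console.log(mouseReleased()); // null -- no crash on a bare click
end = { x: 40, y: 60 };
console.log(mouseReleased()); // { x: 30, y: 40 }
```

In the actual sketch the guard could also trigger an on-screen instruction telling the user to drag, which would address both problems at once.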
This is the section as I have it in the Nature of Code book:
I tend to prefer to reference a specific version for use with the examples in case things change, but since we aren't sure what the version # will be for the upcoming release, maybe putting latest in is best? @ziyuan-linn any reason you anticipate this changing once we publish to npm?
Shall I leave as is or any thoughts?
This is maybe a silly comment, but I sometimes find that when looking at the examples that the copyright notice at the top is a little off-putting / intimidating. Being clear and transparent about software licenses is important, however, maybe we should try a friendlier heading?
Also, the ml5 project has its own license, not MIT! What if we create a new uniform heading something like:
// 👋 Hello! This is an ml5.js example made and shared with ❤️.
// Learn more about the ml5.js project: https://ml5js.org/
// ml5.js license and code of conduct: https://github.com/ml5js/ml5-next-gen/blob/main/LICENSE.md
// This example demonstrates _______________
// Any other notes?
Originally posted by @shiffman in #66 (comment)
Dear ml5 community,
I need MoveNet in my projects, so when will MoveNet be supported in ml5?
I'm doing final edits of my Nature of Code book and I noticed that I had referenced ml5.handpose(). I initially thought this was a typo and it should be ml5.handPose(), but looking at the existing models and examples, I see it is handpose()! I don't have a strong feeling one way or another, but since I'm finishing the book it might be nice to make a decision:
- handpose vs. handPose
- bodypose vs. bodyPose
- facemesh vs. faceMesh
Thoughts @sproutleaf @ziyuan-linn @MOQN @gohai? I do see we have bodySegmentation and imageClassifier, so perhaps camelCase is more consistent?
Sorry for the tags, but my deadline is Friday lol! (I mean, it can be wrong in the book, it won't be the end of the world.)
Our old library has been suffering from issues of outdated dependencies, leading to install and compilation errors.
Related issues from the old repo:
The ml5.neuralNetwork() examples require either:
ml5.setBackend("webgl");
or
ml5.setBackend("cpu");
Without one of the above, the examples break with an error related to webgpu, see #34. This came up again while reviewing #105. We'll leave it for now, but it would be great to remove the requirement from the examples and have them all work with webgl as the default.
Some classes at ITP/IMA are going to be using ml5.js in the next couple weeks. Shall we do another alpha release and update the temporary docs and web editor collection? @ziyuan-linn we can do the release together like last time?
@MOQN and team, what's your best guess as to the website timeline?
Hi all, I noticed in the faceMesh parts example, there is a missing keypoint (291 on the mesh_map). See this sketch with keypoint 291 drawn in red. The same issue is present in the original tfjs demo so I'm not sure what we should do. Would it make sense to manually add this keypoint in ml5 for the time being (until it's fixed in tfjs)?
Add the ability to load offline models, so that users can still run their projects with spotty Internet or from places of the world where access to online models is restricted (e.g., China).
See prior discussion in ml5js/ml5-library#1254.
Notably, now that we're using TF version ^4.2.0, we should be able to specify where the model is loaded from. To quote Joey's reply in the issue mentioned above:
If we do end up updating our tf versions to some of the more recent versions, then it looks like in the latest face-landmarks-detection lib we can specify where our model files should be loaded from -- https://github.com/tensorflow/tfjs-models/tree/master/face-landmarks-detection/src/mediapipe#create-a-detector -- which, in this case, would be somewhere on a local server.
I've been working a lot on refactoring the NeuralNetwork code so I wanted to have a place to ask questions as they come up.
First up:
classifyMultiple and predictMultiple are a bit odd because they are actually identical to the classify and predict functions. They call the same classifyInternal/predictInternal function with the same _input argument.
We determine whether it's single or multiple by looking at the input. This could be potentially buggy. For example, if you pass an array of inputs with length 1 to the classifyMultiple method, you get back the same results as you would from a single classification, not an array of results (array of arrays) with length 1. That's because we're determining whether to return a nested array by looking at the length, with no knowledge of which method was used.
We have a few options here:
- Have a single classify method which can handle single or multiple classification. (This is how the TFJS model.predict() works, but I don't like this because it is confusing to explain in the docs.)
- classify always returns a single classification and classifyMultiple always returns an array. We would validate the inputs and log warnings or throw errors if the user provides the wrong type of input.
- Have a classify method which can only accept a single input.
I was looking at migrating one of the examples that I wrote for the old poseNet model (ml5 #1386) and I realized that the new bodyPose model does not return the skeleton data -- telling us which pairs of keypoints should be connected. Should we add this in somehow? Do we get this data from the TensorFlow model?
We were previously getting these pairs from a getAdjacentKeyPoints function on the TensorFlow posenet model.
https://github.com/ml5js/ml5-library/blob/f80f95aa6b31191ab7e79ff466e8506f5e48a172/src/PoseNet/index.js#L120
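Until the model exposes this again, one stopgap is to hard-code the pairs. A sketch (getSkeleton is a hypothetical helper; the connections list is derived from the old skeleton JSON below, using the new snake_case keypoint names):

```javascript
// Keypoint pairs matching the old PoseNet skeleton output.
const connections = [
  ["left_hip", "left_shoulder"],
  ["left_elbow", "left_shoulder"],
  ["left_elbow", "left_wrist"],
  ["left_hip", "left_knee"],
  ["left_knee", "left_ankle"],
  ["right_hip", "right_shoulder"],
  ["right_elbow", "right_shoulder"],
  ["right_elbow", "right_wrist"],
  ["right_hip", "right_knee"],
  ["right_knee", "right_ankle"],
  ["left_shoulder", "right_shoulder"],
  ["left_hip", "right_hip"],
];

// Build an old-style skeleton (array of keypoint pairs) from a new-style
// pose object, which exposes each keypoint by name (see JSON below).
function getSkeleton(pose) {
  return connections.map(([a, b]) => [pose[a], pose[b]]);
}

// Example with a stub pose containing every named keypoint:
const pose = {};
for (const [a, b] of connections) {
  pose[a] = pose[a] || { x: 0, y: 0, score: 1 };
  pose[b] = pose[b] || { x: 0, y: 0, score: 1 };
}
console.log(getSkeleton(pose).length); // 12 pairs, as in the old JSON
```

A drawing loop could then iterate the pairs and draw a line between each pair's x/y positions.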
JSON from the old ml5 version:
{
"pose": {
"score": 0.9598300737493178,
"keypoints": [
{
"score": 0.9982106685638428,
"part": "nose",
"position": {
"x": 234.7436927765724,
"y": 120.99073052313543
}
},
{
"score": 0.9967994689941406,
"part": "leftEye",
"position": {
"x": 244.4021276332989,
"y": 116.2922117478189
}
},
{
"score": 0.9952504634857178,
"part": "rightEye",
"position": {
"x": 225.1750576746603,
"y": 111.51711423109478
}
},
{
"score": 0.8623329997062683,
"part": "leftEar",
"position": {
"x": 260.36198021558477,
"y": 119.75604613189104
}
},
{
"score": 0.8424369692802429,
"part": "rightEar",
"position": {
"x": 208.37124743628596,
"y": 112.57503536239209
}
},
{
"score": 0.9977927207946777,
"part": "leftShoulder",
"position": {
"x": 268.2742455033477,
"y": 173.4225490288048
}
},
{
"score": 0.9977340698242188,
"part": "rightShoulder",
"position": {
"x": 185.7138446258664,
"y": 156.34151538893408
}
},
{
"score": 0.9940621256828308,
"part": "leftElbow",
"position": {
"x": 306.7087895137327,
"y": 228.746257499962
}
},
{
"score": 0.9967411160469055,
"part": "rightElbow",
"position": {
"x": 119.44014965606573,
"y": 195.0827477896724
}
},
{
"score": 0.9498299956321716,
"part": "leftWrist",
"position": {
"x": 352.618514064685,
"y": 194.567012682963
}
},
{
"score": 0.969875693321228,
"part": "rightWrist",
"position": {
"x": 123.3985939768038,
"y": 135.49304417413498
}
},
{
"score": 0.9958400130271912,
"part": "leftHip",
"position": {
"x": 247.96982743675147,
"y": 293.0668878221326
}
},
{
"score": 0.9959107041358948,
"part": "rightHip",
"position": {
"x": 178.35620396239287,
"y": 287.9360321282413
}
},
{
"score": 0.984212338924408,
"part": "leftKnee",
"position": {
"x": 247.36217650179736,
"y": 406.25373513782074
}
},
{
"score": 0.9769543409347534,
"part": "rightKnee",
"position": {
"x": 190.22527761496463,
"y": 411.3035043204341
}
},
{
"score": 0.8522323369979858,
"part": "leftAnkle",
"position": {
"x": 234.5086841509036,
"y": 518.1386816121261
}
},
{
"score": 0.9108952283859253,
"part": "rightAnkle",
"position": {
"x": 145.71858087680684,
"y": 455.45372270517316
}
}
],
"nose": {
"x": 234.7436927765724,
"y": 120.99073052313543,
"confidence": 0.9982106685638428
},
"leftEye": {
"x": 244.4021276332989,
"y": 116.2922117478189,
"confidence": 0.9967994689941406
},
"rightEye": {
"x": 225.1750576746603,
"y": 111.51711423109478,
"confidence": 0.9952504634857178
},
"leftEar": {
"x": 260.36198021558477,
"y": 119.75604613189104,
"confidence": 0.8623329997062683
},
"rightEar": {
"x": 208.37124743628596,
"y": 112.57503536239209,
"confidence": 0.8424369692802429
},
"leftShoulder": {
"x": 268.2742455033477,
"y": 173.4225490288048,
"confidence": 0.9977927207946777
},
"rightShoulder": {
"x": 185.7138446258664,
"y": 156.34151538893408,
"confidence": 0.9977340698242188
},
"leftElbow": {
"x": 306.7087895137327,
"y": 228.746257499962,
"confidence": 0.9940621256828308
},
"rightElbow": {
"x": 119.44014965606573,
"y": 195.0827477896724,
"confidence": 0.9967411160469055
},
"leftWrist": {
"x": 352.618514064685,
"y": 194.567012682963,
"confidence": 0.9498299956321716
},
"rightWrist": {
"x": 123.3985939768038,
"y": 135.49304417413498,
"confidence": 0.969875693321228
},
"leftHip": {
"x": 247.96982743675147,
"y": 293.0668878221326,
"confidence": 0.9958400130271912
},
"rightHip": {
"x": 178.35620396239287,
"y": 287.9360321282413,
"confidence": 0.9959107041358948
},
"leftKnee": {
"x": 247.36217650179736,
"y": 406.25373513782074,
"confidence": 0.984212338924408
},
"rightKnee": {
"x": 190.22527761496463,
"y": 411.3035043204341,
"confidence": 0.9769543409347534
},
"leftAnkle": {
"x": 234.5086841509036,
"y": 518.1386816121261,
"confidence": 0.8522323369979858
},
"rightAnkle": {
"x": 145.71858087680684,
"y": 455.45372270517316,
"confidence": 0.9108952283859253
}
},
"skeleton": [
[
{
"score": 0.9958400130271912,
"part": "leftHip",
"position": {
"x": 247.96982743675147,
"y": 293.0668878221326
}
},
{
"score": 0.9977927207946777,
"part": "leftShoulder",
"position": {
"x": 268.2742455033477,
"y": 173.4225490288048
}
}
],
[
{
"score": 0.9940621256828308,
"part": "leftElbow",
"position": {
"x": 306.7087895137327,
"y": 228.746257499962
}
},
{
"score": 0.9977927207946777,
"part": "leftShoulder",
"position": {
"x": 268.2742455033477,
"y": 173.4225490288048
}
}
],
[
{
"score": 0.9940621256828308,
"part": "leftElbow",
"position": {
"x": 306.7087895137327,
"y": 228.746257499962
}
},
{
"score": 0.9498299956321716,
"part": "leftWrist",
"position": {
"x": 352.618514064685,
"y": 194.567012682963
}
}
],
[
{
"score": 0.9958400130271912,
"part": "leftHip",
"position": {
"x": 247.96982743675147,
"y": 293.0668878221326
}
},
{
"score": 0.984212338924408,
"part": "leftKnee",
"position": {
"x": 247.36217650179736,
"y": 406.25373513782074
}
}
],
[
{
"score": 0.984212338924408,
"part": "leftKnee",
"position": {
"x": 247.36217650179736,
"y": 406.25373513782074
}
},
{
"score": 0.8522323369979858,
"part": "leftAnkle",
"position": {
"x": 234.5086841509036,
"y": 518.1386816121261
}
}
],
[
{
"score": 0.9959107041358948,
"part": "rightHip",
"position": {
"x": 178.35620396239287,
"y": 287.9360321282413
}
},
{
"score": 0.9977340698242188,
"part": "rightShoulder",
"position": {
"x": 185.7138446258664,
"y": 156.34151538893408
}
}
],
[
{
"score": 0.9967411160469055,
"part": "rightElbow",
"position": {
"x": 119.44014965606573,
"y": 195.0827477896724
}
},
{
"score": 0.9977340698242188,
"part": "rightShoulder",
"position": {
"x": 185.7138446258664,
"y": 156.34151538893408
}
}
],
[
{
"score": 0.9967411160469055,
"part": "rightElbow",
"position": {
"x": 119.44014965606573,
"y": 195.0827477896724
}
},
{
"score": 0.969875693321228,
"part": "rightWrist",
"position": {
"x": 123.3985939768038,
"y": 135.49304417413498
}
}
],
[
{
"score": 0.9959107041358948,
"part": "rightHip",
"position": {
"x": 178.35620396239287,
"y": 287.9360321282413
}
},
{
"score": 0.9769543409347534,
"part": "rightKnee",
"position": {
"x": 190.22527761496463,
"y": 411.3035043204341
}
}
],
[
{
"score": 0.9769543409347534,
"part": "rightKnee",
"position": {
"x": 190.22527761496463,
"y": 411.3035043204341
}
},
{
"score": 0.9108952283859253,
"part": "rightAnkle",
"position": {
"x": 145.71858087680684,
"y": 455.45372270517316
}
}
],
[
{
"score": 0.9977927207946777,
"part": "leftShoulder",
"position": {
"x": 268.2742455033477,
"y": 173.4225490288048
}
},
{
"score": 0.9977340698242188,
"part": "rightShoulder",
"position": {
"x": 185.7138446258664,
"y": 156.34151538893408
}
}
],
[
{
"score": 0.9958400130271912,
"part": "leftHip",
"position": {
"x": 247.96982743675147,
"y": 293.0668878221326
}
},
{
"score": 0.9959107041358948,
"part": "rightHip",
"position": {
"x": 178.35620396239287,
"y": 287.9360321282413
}
}
]
]
}
JSON from the new ml5 version:
{
"keypoints": [
{
"y": 116.29785878956318,
"x": 235.0086233296345,
"score": 0.7886537313461304,
"name": "nose"
},
{
"y": 109.22195221483707,
"x": 247.31669480038673,
"score": 0.8181501030921936,
"name": "left_eye"
},
{
"y": 104.87190327048302,
"x": 227.84886195487582,
"score": 0.7873461246490479,
"name": "right_eye"
},
{
"y": 121.65976269543171,
"x": 258.62494331045247,
"score": 0.8321802616119385,
"name": "left_ear"
},
{
"y": 107.55718423426151,
"x": 212.17497975555892,
"score": 0.8760896325111389,
"name": "right_ear"
},
{
"y": 168.72356361150742,
"x": 268.7925813615936,
"score": 0.8404656052589417,
"name": "left_shoulder"
},
{
"y": 152.61473408341408,
"x": 178.79237256099265,
"score": 0.8981107473373413,
"name": "right_shoulder"
},
{
"y": 226.47244331240654,
"x": 299.3288135528564,
"score": 0.7384112477302551,
"name": "left_elbow"
},
{
"y": 194.91725680232048,
"x": 117.27373925680965,
"score": 0.9358187913894653,
"name": "right_elbow"
},
{
"y": 202.76897567510605,
"x": 348.5035130903893,
"score": 0.8098330497741699,
"name": "left_wrist"
},
{
"y": 140.22100810706615,
"x": 123.04792117826716,
"score": 0.8564737439155579,
"name": "right_wrist"
},
{
"y": 299.423077493906,
"x": 245.70130230225234,
"score": 0.9197857975959778,
"name": "left_hip"
},
{
"y": 299.77109426259995,
"x": 183.46146969451115,
"score": 0.9340135455131531,
"name": "right_hip"
},
{
"y": 411.4370810389519,
"x": 238.26926617278264,
"score": 0.8583171963691711,
"name": "left_knee"
},
{
"y": 416.5619515180588,
"x": 182.75880041810655,
"score": 0.8068801164627075,
"name": "right_knee"
},
{
"y": 517.5497948527336,
"x": 244.53618868110104,
"score": 0.8860065937042236,
"name": "left_ankle"
},
{
"y": 449.2491074204445,
"x": 139.96030905812054,
"score": 0.15222936868667603,
"name": "right_ankle"
}
],
"box": {
"yMin": 0.1048629879951477,
"xMin": 0.1899825632572174,
"yMax": 0.9370519518852234,
"xMax": 0.7449000477790833,
"width": 0.5549174845218658,
"height": 0.8321889638900757
},
"score": 0.7778568267822266,
"id": 1,
"nose": {
"x": 235.0086233296345,
"y": 116.29785878956318,
"score": 0.7886537313461304
},
"left_eye": {
"x": 247.31669480038673,
"y": 109.22195221483707,
"score": 0.8181501030921936
},
"right_eye": {
"x": 227.84886195487582,
"y": 104.87190327048302,
"score": 0.7873461246490479
},
"left_ear": {
"x": 258.62494331045247,
"y": 121.65976269543171,
"score": 0.8321802616119385
},
"right_ear": {
"x": 212.17497975555892,
"y": 107.55718423426151,
"score": 0.8760896325111389
},
"left_shoulder": {
"x": 268.7925813615936,
"y": 168.72356361150742,
"score": 0.8404656052589417
},
"right_shoulder": {
"x": 178.79237256099265,
"y": 152.61473408341408,
"score": 0.8981107473373413
},
"left_elbow": {
"x": 299.3288135528564,
"y": 226.47244331240654,
"score": 0.7384112477302551
},
"right_elbow": {
"x": 117.27373925680965,
"y": 194.91725680232048,
"score": 0.9358187913894653
},
"left_wrist": {
"x": 348.5035130903893,
"y": 202.76897567510605,
"score": 0.8098330497741699
},
"right_wrist": {
"x": 123.04792117826716,
"y": 140.22100810706615,
"score": 0.8564737439155579
},
"left_hip": {
"x": 245.70130230225234,
"y": 299.423077493906,
"score": 0.9197857975959778
},
"right_hip": {
"x": 183.46146969451115,
"y": 299.77109426259995,
"score": 0.9340135455131531
},
"left_knee": {
"x": 238.26926617278264,
"y": 411.4370810389519,
"score": 0.8583171963691711
},
"right_knee": {
"x": 182.75880041810655,
"y": 416.5619515180588,
"score": 0.8068801164627075
},
"left_ankle": {
"x": 244.53618868110104,
"y": 517.5497948527336,
"score": 0.8860065937042236
},
"right_ankle": {
"x": 139.96030905812054,
"y": 449.2491074204445,
"score": 0.15222936868667603
}
}
I'm filing an issue to continue a discussion with @golanlevin on twitter (see this thread).
It will be worth A/B testing against the MediaPipe implementations. E.g., I'm getting 30fps using their body segmentation,
https://mediapipe-studio.webapps.google.com/studio/demo/image_segmenter
but only 5fps with ML5's new version,
https://editor.p5js.org/ml5/sketches/KNsdeNhrp
@ziyuan-linn could #61 be related?
Just keeping a note that we should be consistent about using confidence vs. score across all models.
It seems like in a previous discussion we were leaning toward using confidence. Just wondering if anyone has any additional thoughts about it.
Both p5.js and @tensorflow-models/face-landmark-detection use VERSION as a global variable. This causes p5 to throw a warning that VERSION is a p5-reserved variable. To not cause confusion among users, the current library uses patch-package to remove the VERSION global variable from the @tensorflow-models/face-landmark-detection file.
This fix prevents p5 from outputting a warning; however, it also modifies the dependency code. We should consider this fix a temporary hack and eventually contact p5 or TensorFlow about this issue.
Picking up on #127, we can now add the flipped property to a selection of examples that work with video input. This is a great starter issue for a new contributor!
- maskType / modelType line in BOTH code snippets
Hi everyone,
I am trying to install some packages for MoveNet with npm and got ERESOLVE unable to resolve dependency tree
error.
It seems like TensorFlow is using dependencies with conflicting versions and npm does not support it.
The TensorFlow installation guides for newer models use Yarn instead of npm when installing the packages. I tried installing MoveNet packages with Yarn instead of npm and was successful. I think the task of managing dependencies will be easier with Yarn, and I'm wondering whether we should switch our package manager.
I have not looked into this extensively, but when bodyPix is running continuous detection, it makes a network request for each frame and adds an image to the sources. The original bodyPix also continuously makes requests, but it will not add images to the sources. I am not sure if this is the intended behavior. See the screenshot below.
I'm looking at the code for preparing inputs for the Neural Network. We always convert non-numeric values to number arrays (strings through one-hot encoding and images to pixel arrays).
I'm confused about when we handle normalization of numbers to the range [0, 1] based on the `min` and `max`. It appears that this is an optional step which only happens if the user has called the `normalizeData()` function.
What I'm seeing in the code is that calling `normalizeData()` sets the `this.neuralNetworkData.meta.isNormalized` flag to `true`. All calls to the normalize functions are guarded inside `if (meta.isNormalized) {` blocks. So if `normalizeData()` has not been called by the user at the top level, then no numbers will be normalized.
@shiffman is that correct? Is that how it's supposed to work? What are the situations where someone might choose to normalize the data or not? This is definitely something that we should explain in the docs.
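For concreteness, min-max scaling and its inverse can be sketched like this (hypothetical helper names, not the actual ml5 internals):

```javascript
// Scale a value into [0, 1] given the observed min and max of its column.
function normalizeValue(value, min, max) {
  // Guard against a zero range to avoid dividing by zero.
  if (max === min) return 0;
  return (value - min) / (max - min);
}

// Map a normalized prediction back to the original units.
function unnormalizeValue(value, min, max) {
  return value * (max - min) + min;
}
```

The inverse step matters because a model trained on normalized data will also predict in [0, 1], so its outputs need to be un-normalized before being shown to the user.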
I am trying to run the `yarn` command to install the dependencies but got this error.
error /Users/ziyuanlin/Documents/VS Code/ml5-next-gen/node_modules/canvas: Command failed.
Exit code: 1
Command: node-pre-gyp install --fallback-to-build --update-binary
Arguments:
Directory: /Users/ziyuanlin/Documents/VS Code/ml5-next-gen/node_modules/canvas
Output:
node-pre-gyp info it worked if it ends with ok
node-pre-gyp info using node-pre-gyp@1.0.11
node-pre-gyp info using node@18.15.0 | darwin | arm64
node-pre-gyp http GET https://github.com/Automattic/node-canvas/releases/download/v2.11.2/canvas-v2.11.2-node-v108-darwin-unknown-arm64.tar.gz
node-pre-gyp ERR! install response status 404 Not Found on https://github.com/Automattic/node-canvas/releases/download/v2.11.2/canvas-v2.11.2-node-v108-darwin-unknown-arm64.tar.gz
node-pre-gyp WARN Pre-built binaries not installable for canvas@2.11.2 and node@18.15.0 (node-v108 ABI, unknown) (falling back to source compile with node-gyp)
node-pre-gyp WARN Hit error response status 404 Not Found on https://github.com/Automattic/node-canvas/releases/download/v2.11.2/canvas-v2.11.2-node-v108-darwin-unknown-arm64.tar.gz
gyp info it worked if it ends with ok
gyp info using node-gyp@9.3.1
gyp info using node@18.15.0 | darwin | arm64
gyp info ok
gyp info it worked if it ends with ok
gyp info using node-gyp@9.3.1
gyp info using node@18.15.0 | darwin | arm64
gyp info find Python using Python version 3.9.6 found at "/Library/Developer/CommandLineTools/usr/bin/python3"
gyp info spawn /Library/Developer/CommandLineTools/usr/bin/python3
gyp info spawn args [
gyp info spawn args '/Users/ziyuanlin/.nvm/versions/node/v18.15.0/lib/node_modules/npm/node_modules/node-gyp/gyp/gyp_main.py',
gyp info spawn args 'binding.gyp',
gyp info spawn args '-f',
gyp info spawn args 'make',
gyp info spawn args '-I',
gyp info spawn args '/Users/ziyuanlin/Documents/VS Code/ml5-next-gen/node_modules/canvas/build/config.gypi',
gyp info spawn args '-I',
gyp info spawn args '/Users/ziyuanlin/.nvm/versions/node/v18.15.0/lib/node_modules/npm/node_modules/node-gyp/addon.gypi',
gyp info spawn args '-I',
gyp info spawn args '/Users/ziyuanlin/Library/Caches/node-gyp/18.15.0/include/node/common.gypi',
gyp info spawn args '-Dlibrary=shared_library',
gyp info spawn args '-Dvisibility=default',
gyp info spawn args '-Dnode_root_dir=/Users/ziyuanlin/Library/Caches/node-gyp/18.15.0',
gyp info spawn args '-Dnode_gyp_dir=/Users/ziyuanlin/.nvm/versions/node/v18.15.0/lib/node_modules/npm/node_modules/node-gyp',
gyp info spawn args '-Dnode_lib_file=/Users/ziyuanlin/Library/Caches/node-gyp/18.15.0/<(target_arch)/node.lib',
gyp info spawn args '-Dmodule_root_dir=/Users/ziyuanlin/Documents/VS Code/ml5-next-gen/node_modules/canvas',
gyp info spawn args '-Dnode_engine=v8',
gyp info spawn args '--depth=.',
gyp info spawn args '--no-parallel',
gyp info spawn args '--generator-output',
gyp info spawn args 'build',
gyp info spawn args '-Goutput_dir=.'
gyp info spawn args ]
/bin/sh: pkg-config: command not found
gyp: Call to 'pkg-config pixman-1 --libs' returned exit status 127 while in binding.gyp. while trying to load binding.gyp
gyp ERR! configure error
gyp ERR! stack Error: `gyp` failed with exit code: 1
gyp ERR! stack at ChildProcess.onCpExit (/Users/ziyuanlin/.nvm/versions/node/v18.15.0/lib/node_modules/npm/node_modules/node-gyp/lib/configure.js:325:16)
gyp ERR! stack at ChildProcess.emit (node:events:513:28)
gyp ERR! stack at ChildProcess._handle.onexit (node:internal/child_process:291:12)
gyp ERR! System Darwin 23.3.0
gyp ERR! command "/Users/ziyuanlin/.nvm/versions/node/v18.15.0/bin/node" "/Users/ziyuanlin/.nvm/versions/node/v18.15.0/lib/node_modules/npm/node_modules/node-gyp/bin/node-gyp.js" "configure" "--fallback-to-build" "--update-binary" "--module=/Users/ziyuanlin/Documents/VS Code/ml5-next-gen/node_modules/canvas/build/Release/canvas.node" "--module_name=canvas" "--module_path=/Users/ziyuanlin/Documents/VS Code/ml5-next-gen/node_modules/canvas/build/Release" "--napi_version=8" "--node_abi_napi=napi" "--napi_build_version=0" "--node_napi_label=node-v108"
gyp ERR! cwd /Users/ziyuanlin/Documents/VS Code/ml5-next-gen/node_modules/canvas
gyp ERR! node -v v18.15.0
gyp ERR! node-gyp -v v9.3.1
gyp ERR! not ok
node-pre-gyp ERR! build error
node-pre-gyp ERR! stack Error: Failed to execute '/Users/ziyuanlin/.nvm/versions/node/v18.15.0/bin/node /Users/ziyuanlin/.nvm/versions/node/v18.15.0/lib/node_modules/npm/node_modules/node-gyp/bin/node-gyp.js configure --fallback-to-build --update-binary --module=/Users/ziyuanlin/Documents/VS Code/ml5-next-gen/node_modules/canvas/build/Release/canvas.node --module_name=canvas --module_path=/Users/ziyuanlin/Documents/VS Code/ml5-next-gen/node_modules/canvas/build/Release --napi_version=8 --node_abi_napi=napi --napi_build_version=0 --node_napi_label=node-v108' (1)
node-pre-gyp ERR! stack at ChildProcess.<anonymous> (/Users/ziyuanlin/Documents/VS Code/ml5-next-gen/node_modules/canvas/node_modules/@mapbox/node-pre-gyp/lib/util/compile.js:89:23)
node-pre-gyp ERR! stack at ChildProcess.emit (node:events:513:28)
node-pre-gyp ERR! stack at maybeClose (node:internal/child_process:1091:16)
node-pre-gyp ERR! stack at ChildProcess._handle.onexit (node:internal/child_process:302:5)
node-pre-gyp ERR! System Darwin 23.3.0
node-pre-gyp ERR! command "/Users/ziyuanlin/.nvm/versions/node/v18.15.0/bin/node" "/Users/ziyuanlin/Documents/VS Code/ml5-next-gen/node_modules/canvas/node_modules/.bin/node-pre-gyp" "install" "--fallback-to-build" "--update-binary"
node-pre-gyp ERR! cwd /Users/ziyuanlin/Documents/VS Code/ml5-next-gen/node_modules/canvas
node-pre-gyp ERR! node -v v18.15.0
node-pre-gyp ERR! node-pre-gyp -v v1.0.11
node-pre-gyp ERR! not ok
Failed to execute '/Users/ziyuanlin/.nvm/versions/node/v18.15.0/bin/node /Users/ziyuanlin/.nvm/versions/node/v18.15.0/lib/node_modules/npm/node_modules/node-gyp/bin/node-gyp.js configure --fallback-to-build --update-binary --module=/Users/ziyuanlin/Documents/VS Code/ml5-next-gen/node_modules/canvas/build/Rel
This error only appears on my M1 MacBook with macOS 14.3, not on my Windows machine. The error disappeared when I reverted the changes from #82.
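The root cause in the log is `pkg-config: command not found` during the node-canvas source build: no prebuilt binary exists for Node 18 on arm64, so node-gyp falls back to compiling from source and needs the native graphics libraries. Assuming Homebrew is available, a likely fix is:

```shell
# Install pkg-config plus the native libraries node-canvas compiles against.
brew install pkg-config cairo pango libpng jpeg giflib librsvg pixman
# Then retry the install so the source compile can find pixman via pkg-config.
yarn
```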
Hi everyone, I'm opening this thread to discuss the model prediction output for hand detection, though I think a lot of this can also be applied to other landmark detection models.
The original tf.js output for hand detection looks like this:
[
{
score: 0.8,
handedness: "Right",
keypoints: [
{x: 105, y: 107, name: "wrist"},
{x: 108, y: 160, name: "pinky_finger_tip"},
...
],
keypoints3D: [
{x: 0.00388, y: -0.0205, z: 0.0217, name: "wrist"},
{x: -0.025138, y: -0.0255, z: -0.0051, name: "pinky_finger_tip"},
...
]
},
{
score: 0.9,
handedness: "Left",
...
}
]
One idea is to expose each keypoint by name so they can be more intuitively accessed, for example:
[
{
score: 0.8,
handedness: "Right",
wrist: { //<-----------------------add
x: 105,
y: 107,
3dx: 0.00388,
3dy: -0.0205,
3dz: 0.0217
},
pinky_finger_tip: { //<-----------------------add
x: 108,
y: 160,
3dx: -0.025138,
3dy: -0.0255,
3dz: -0.0051
},
keypoints: [
{x: 105, y: 107, name: "wrist"},
{x: 108, y: 160, name: "pinky_finger_tip"},
...
],
keypoints3D: [
{x: 0.00388, y: -0.0205, z: 0.0217, name: "wrist"},
{x: -0.025138, y: -0.0255, z: -0.0051, name: "pinky_finger_tip"},
...
]
},
{
score: 0.9,
handedness: "Left",
...
}
]
@yining1023 suggested grouping landmarks of each finger together with intuitive names like wrist, thumb, etc...
[
{
"handedness": "Left",
"wrist": { //<-----------------------add
"x": 57.2648811340332,
"y": 489.2754936218262,
"z": 0.00608062744140625,
"confidence": 0.89
},
"thumb": [ //<-----------------------add
{
"x": 57.2648811340332,
"y": 489.2754936218262,
"z": 0.00608062744140625,
"confidence": 0.89,
},
{...},
{...},
{...},
{...},
],
"indexFinger":[], //<-----------------------add
"middleFinger":[], //<-----------------------add
"ringFinger":[], //<-----------------------add
"pinky":[], //<-----------------------add
"keypoints": [
{
"x": 57.2648811340332,
"y": 489.2754936218262,
"name": "wrist"
},
],
"keypoints3D": [
{
"x": -0.03214273601770401,
"y": 0.08357296139001846,
"z": 0.00608062744140625,
"name": "wrist"
}
],
"score": 0.9638671875,
},
{
"handedness": "Right",
...
}
]
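A sketch of deriving the grouped format proposed above from the raw tf.js hand output. This is illustrative, not the shipped ml5 code; the keypoint names assume MediaPipe Hands' 21-point naming scheme (`wrist`, then four points per finger), which should be verified against the model's actual output.

```javascript
// Assumed MediaPipe keypoint names, grouped by finger.
const FINGERS = {
  thumb: ["thumb_cmc", "thumb_mcp", "thumb_ip", "thumb_tip"],
  indexFinger: ["index_finger_mcp", "index_finger_pip", "index_finger_dip", "index_finger_tip"],
  middleFinger: ["middle_finger_mcp", "middle_finger_pip", "middle_finger_dip", "middle_finger_tip"],
  ringFinger: ["ring_finger_mcp", "ring_finger_pip", "ring_finger_dip", "ring_finger_tip"],
  pinky: ["pinky_finger_mcp", "pinky_finger_pip", "pinky_finger_dip", "pinky_finger_tip"],
};

function groupHand(hand) {
  // Index the flat keypoints array by name for quick lookup.
  const byName = {};
  for (const k of hand.keypoints) byName[k.name] = k;

  // Keep the original fields (score, handedness, keypoints, keypoints3D)
  // and add the named groups alongside them.
  const grouped = { ...hand };
  grouped.wrist = { ...byName.wrist };
  for (const [finger, names] of Object.entries(FINGERS)) {
    // Missing keypoints become empty objects rather than throwing.
    grouped[finger] = names.map((n) => ({ ...byName[n] }));
  }
  return grouped;
}
```

One design note: keeping the original `keypoints` array alongside the named groups means users who want to iterate over all 21 points (e.g. to draw a skeleton) still can, while beginners can reach for `hand.thumb[3]` directly.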
I think this feature could potentially be very useful for users. However, the handedness is the opposite of the actual hand (a left hand is labeled as right). I found that when `flipHorizontal` is set to `true`, the handedness is labeled correctly. We could potentially flip the handedness value within ml5 when `flipHorizontal` is `false`.
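The proposed correction is a one-line swap; a hypothetical sketch (not existing ml5 code):

```javascript
// Swap the handedness label when the frame is not mirrored, so the label
// matches the viewer's actual hand.
function correctHandedness(hand, flipHorizontal) {
  if (flipHorizontal) return hand; // label is already correct
  return {
    ...hand,
    handedness: hand.handedness === "Left" ? "Right" : "Left",
  };
}
```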
Tf.js has a diagram outlining the index and name of each keypoint.
I personally find this kind of diagram very helpful when trying to find a landmark point quickly. I think there are similar diagrams for other tf.js landmark detection models. @MOQN Do you think we could display or link these diagrams on the new website?
I'm happy to hear any suggestions or ideas!
Given a normal context like the first example in the examples folder, when the canvas or video width and height are not (640, 480), I found the poses are misaligned with the video stream. I wonder why. Also, how can we align poses to the video stream when the width and height of the video are changed? Thank you!
function preload() {
// Load the bodyPose model
bodyPose = ml5.bodyPose("BlazePose");
}
function setup() {
createCanvas(640, 480);
// createCanvas(1280, 960);
// Create the video and hide it
video = createCapture(VIDEO);
video.size(width, height);
video.hide();
// Start detecting poses in the webcam video
bodyPose.detectStart(video, gotPoses);
}
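The misalignment usually happens because the model reports keypoints in the video's native resolution, while the canvas is a different size. One possible workaround (not an ml5 API, just a drawing-time fix) is to rescale each keypoint before drawing:

```javascript
// Map a keypoint from the video's native resolution to the canvas size.
function scaleKeypoint(keypoint, videoW, videoH, canvasW, canvasH) {
  return {
    ...keypoint,
    x: (keypoint.x / videoW) * canvasW,
    y: (keypoint.y / videoH) * canvasH,
  };
}
```

In a p5 `draw()` loop this would be applied to every keypoint, e.g. `scaleKeypoint(kp, video.width, video.height, width, height)`, so the skeleton lines up with the stretched video regardless of canvas size.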
Hi all!
I'm opening this issue to discuss the possibility of implementing a feature in ml5 that flips the underlying source of a video horizontally, mainly as a convenient way to mirror live webcam footage. This was originally discussed in this thread.
It would be nice to have a function that flips the webcam video with one line of code and does not interfere with other things on the canvas, as shown in the code below:
let video, flippedVideo;
function setup() {
video = createCapture(VIDEO);
flippedVideo = ml5.flipVideo(video);
}
As far as implementing a function like `ml5.flipVideo(video)`, I am imagining these steps:
- `captureStream()`

I am not 100% sure whether this would work, so any suggestions or alternative solutions are welcome!
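One way the `captureStream()` idea could be fleshed out, as an untested browser-only sketch (assumed approach, not a committed design): draw each frame mirrored onto a hidden canvas, then hand back that canvas's stream as a new video element.

```javascript
// Hypothetical implementation sketch for ml5.flipVideo(video).
function flipVideo(videoEl) {
  const canvas = document.createElement("canvas");
  canvas.width = videoEl.videoWidth;
  canvas.height = videoEl.videoHeight;
  const ctx = canvas.getContext("2d");

  function drawFrame() {
    // Mirror horizontally: flip the x axis, then shift back into view.
    ctx.save();
    ctx.scale(-1, 1);
    ctx.drawImage(videoEl, -canvas.width, 0, canvas.width, canvas.height);
    ctx.restore();
    requestAnimationFrame(drawFrame);
  }
  drawFrame();

  // Wrap the canvas's stream in a video element so it behaves like
  // the original createCapture() result.
  const flipped = document.createElement("video");
  flipped.srcObject = canvas.captureStream();
  flipped.play();
  return flipped;
}
```

Open questions with this approach include the extra per-frame draw cost and whether `videoWidth`/`videoHeight` are available before the source video's metadata has loaded.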
The README needs to be updated with any new links related to website and examples.