[Japanese / [WIP] English]

hand-gesture-recognition-using-onnx

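This repository builds on Kazuhito00/hand-gesture-recognition-using-mediapipe (@Kazuhito00) and replaces every part of the MediaPipe implementation with ONNX. It is a sample program that recognizes hand signs and finger gestures with a simple MLP. So far it has only been confirmed to run; nothing beyond that has been verified.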

output_.mp4

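This repository contains the following:

Sample program
Hand detection model (modified ONNX)
Palm landmark detection model (modified ONNX)
Hand sign recognition model (modified ONNX)
Finger gesture recognition model (modified ONNX)
Training data and training notebook for hand sign recognition
Training data and training notebook for finger gesture recognition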

Requirements

onnxruntime 1.12.0 or onnxruntime-gpu 1.12.0
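opencv-contrib-python 4.6.0.66 or later
TensorFlow 2.10.0 (only needed to regenerate the ONNX files after training)
PyTorch 1.12.0 (only needed to regenerate the ONNX files after training)
tf2onnx 1.12.0 or later (only needed to regenerate the ONNX files after training)
simple-onnx-processing-tools 1.0.54 or later (only needed to regenerate the ONNX files after training)
scikit-learn 0.23.2 or later (only needed to display the confusion matrix during training)
matplotlib 3.3.2 or later (only needed to display the confusion matrix during training)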
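
To see which of the inference-time packages above are installed, a small check along these lines may help. It is only a sketch: the package names are taken from the list above, the helper name `installed_versions` is hypothetical, and no version comparison is attempted.

```python
from importlib import metadata

# Packages needed for inference only, as listed above. Either onnxruntime
# or onnxruntime-gpu is sufficient, so both are looked up.
INFERENCE_PACKAGES = ["onnxruntime", "onnxruntime-gpu", "opencv-contrib-python"]


def installed_versions(packages):
    """Return {package name: installed version, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(INFERENCE_PACKAGES).items():
    print(f"{name}: {version or 'not installed'}")
```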

Demo

To run the demo with a webcam:

python app.py

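The following options can be specified when running the demo.

--device
    Camera device number (default: 0)

--width
    Width of the camera capture (default: 640)

--height
    Height of the camera capture (default: 480)

--min_detection_confidence
    Detection confidence threshold (default: 0.6)

--disable_image_flip
    Disable horizontal flipping of the input video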
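
The options above could be parsed with `argparse` roughly as follows. This is a sketch reconstructed from the option list, not the actual contents of app.py; the function name `get_args` and the choice of `int`/`float` types are assumptions.

```python
import argparse


def get_args(argv=None):
    """Parse the demo options listed above (illustrative reconstruction)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--device", type=int, default=0,
                        help="camera device number")
    parser.add_argument("--width", type=int, default=640,
                        help="width of the camera capture")
    parser.add_argument("--height", type=int, default=480,
                        help="height of the camera capture")
    parser.add_argument("--min_detection_confidence", type=float, default=0.6,
                        help="detection confidence threshold")
    # A flag without a value: present means flipping is turned off.
    parser.add_argument("--disable_image_flip", action="store_true",
                        help="disable horizontal flipping of the input video")
    return parser.parse_args(argv)


# Equivalent to: python app.py --device 1 --width 1280 --height 720
args = get_args(["--device", "1", "--width", "1280", "--height", "720"])
print(args.device, args.width, args.height, args.min_detection_confidence)
```

Passing `argv` explicitly keeps the parser easy to try out without touching the real command line.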

Directory

.
│  app.py
│  keypoint_classification.ipynb
│  point_history_classification.ipynb
│  requirements.txt
│  README.md
│
├─model
│  ├─keypoint_classifier
│  │  │  tflite_to_onnx.sh
│  │  │  make_argmax.py
│  │  │  keypoint.csv
│  │  │  keypoint_classifier.hdf5
│  │  │  keypoint_classifier.py
│  │  │  keypoint_classifier.tflite
│  │  │  keypoint_classifier.onnx
│  │  └─ keypoint_classifier_label.csv
│  │
│  └─point_history_classifier
│      │  tflite_to_onnx.sh
│      │  make_argmax.py
│      │  point_history.csv
│      │  point_history_classifier.hdf5
│      │  point_history_classifier.py
│      │  point_history_classifier.tflite
│      │  point_history_classifier.onnx
│      └─ point_history_classifier_label.csv
│
└─utils
    │  cvfpscalc.py
    └─ utils.py

app.py

Sample program for inference.
It can also collect training data for hand sign recognition (keypoints) and
training data for finger gesture recognition (index finger coordinate history).

keypoint_classification.ipynb

Model training script for hand sign recognition.

point_history_classification.ipynb

Model training script for finger gesture recognition.

model/keypoint_classifier

Directory for files related to hand sign recognition.
It contains the following files.

TFLite-to-ONNX conversion script (tflite_to_onnx.sh)
ONNX component generation program (make_argmax.py)
Training data (keypoint.csv)
Trained model (keypoint_classifier.tflite)
Trained model (keypoint_classifier.onnx)
Label data (keypoint_classifier_label.csv)
Inference class (keypoint_classifier.py)
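
As an illustration of how the label data relates to the class IDs predicted by the model, the sketch below maps a class ID to a label. It assumes the label file holds one label per row, in class ID order; the helper names and the example labels are hypothetical and not taken from the repository.

```python
import csv


def parse_labels(lines):
    """Read one label per row; row N is assumed to name class ID N."""
    return [row[0] for row in csv.reader(lines) if row]


def label_for(class_id, labels):
    """Return the label for a class ID, or 'unknown' if it is out of range."""
    if 0 <= class_id < len(labels):
        return labels[class_id]
    return "unknown"


# Hypothetical file contents; with a real file one might use:
#   with open("model/keypoint_classifier/keypoint_classifier_label.csv",
#             encoding="utf-8-sig", newline="") as f:
#       labels = parse_labels(f)
labels = parse_labels(["Open\n", "Close\n", "Pointer\n"])
print(label_for(2, labels))
```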

model/point_history_classifier

Directory for files related to finger gesture recognition.
It contains the following files.

TFLite-to-ONNX conversion script (tflite_to_onnx.sh)
ONNX component generation program (make_argmax.py)
Training data (point_history.csv)
Trained model (point_history_classifier.tflite)
Trained model (point_history_classifier.onnx)
Label data (point_history_classifier_label.csv)
Inference class (point_history_classifier.py)

utils/cvfpscalc.py

Module for FPS measurement.

utils/utils.py

Functions for image processing.

Training

For both hand sign recognition and finger gesture recognition,
you can add or change training data and retrain the models.

Hand sign recognition training

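1. Collecting training data

Press "k" to enter the keypoint logging mode (displayed as "MODE:Logging Key Point").

Press "0" to "9" to append keypoints to "model/keypoint_classifier/keypoint.csv" in the following format.
Column 1: the pressed number (used as the class ID); column 2: trackid; column 3 onward: keypoint coordinates

The keypoint coordinates are saved after the following preprocessing has been applied up to step ④.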
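
The figure that lists the preprocessing steps is not reproduced here. The sketch below assumes the four steps are those used in the upstream MediaPipe-based repository: (1) take the keypoint coordinates, (2) convert them to coordinates relative to the first keypoint, (3) flatten them into a one-dimensional list, and (4) normalize by the largest absolute value. Treat the step order and the function name as assumptions rather than a description of this repository's code.

```python
import copy
import itertools


def pre_process_landmark(landmark_list):
    """Turn [[x, y], ...] pixel keypoints into a normalized flat list."""
    # Step 1 (assumed): work on a copy so the caller's data is untouched.
    points = copy.deepcopy(landmark_list)

    # Step 2 (assumed): make every coordinate relative to the first keypoint.
    base_x, base_y = points[0]
    for point in points:
        point[0] -= base_x
        point[1] -= base_y

    # Step 3 (assumed): flatten [[x0, y0], [x1, y1], ...] to [x0, y0, x1, ...].
    flat = list(itertools.chain.from_iterable(points))

    # Step 4 (assumed): scale so the largest absolute value becomes 1.
    # "or 1" avoids dividing by zero when all keypoints coincide.
    max_value = max(map(abs, flat)) or 1
    return [value / max_value for value in flat]


keypoints = [[100, 200], [110, 180], [90, 220]]
print(pre_process_landmark(keypoints))
```

Making the coordinates relative and scale-free means the classifier sees the shape of the hand rather than its position or size in the frame.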

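In the initial state, three kinds of training data are included: open hand (class ID: 0), closed fist (class ID: 1), and pointing (class ID: 2).
If necessary, add classes 3 and above, or delete existing data from the csv, to prepare your training data.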
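
The row layout described above (class ID, trackid, then the coordinates) can be sketched as a small logging helper. The function name `log_sample` is hypothetical, and the real app.py may write rows differently; the example writes to an in-memory buffer so it can be run without touching the training data.

```python
import csv
import io


def log_sample(stream, class_id, track_id, features):
    """Append one training row: class ID, trackid, then the coordinates."""
    if not 0 <= class_id <= 9:
        raise ValueError("class_id must be a single digit (0-9)")
    csv.writer(stream).writerow([class_id, track_id, *features])


# In-memory demo. With the real file, open it in append mode instead:
#   with open("model/keypoint_classifier/keypoint.csv", "a", newline="") as f:
#       log_sample(f, 2, 0, features)
buffer = io.StringIO()
log_sample(buffer, 2, 0, [0.0, 0.0, 0.5, -1.0])
print(buffer.getvalue(), end="")
```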

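2. Model training

Open "keypoint_classification.ipynb" in Jupyter Notebook and run the cells from top to bottom.
To change the number of classes in the training data, change the value of "NUM_CLASSES = 3" and
update the labels in "model/keypoint_classifier/keypoint_classifier_label.csv" accordingly.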
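
Before retraining, it can be useful to confirm that the class IDs in the training data, the NUM_CLASSES value, and the label file agree. The check below is a sketch under two assumptions: the class ID is the first CSV column (as described above), and the classifier expects class IDs in the range 0 to NUM_CLASSES - 1. The helper names and sample data are hypothetical.

```python
import csv
from collections import Counter


def count_samples_per_class(lines):
    """Count rows per class ID; the class ID is assumed to be column 1."""
    return Counter(int(row[0]) for row in csv.reader(lines) if row)


def find_problems(counts, num_classes, labels):
    """List mismatches between the data, NUM_CLASSES, and the labels."""
    problems = []
    out_of_range = sorted(c for c in counts if not 0 <= c < num_classes)
    if out_of_range:
        problems.append(f"class IDs outside 0..{num_classes - 1}: {out_of_range}")
    if len(labels) != num_classes:
        problems.append(f"{len(labels)} labels for {num_classes} classes")
    return problems


# Hypothetical in-memory rows standing in for keypoint.csv.
sample_rows = ["0,0,0.0,0.0", "1,0,0.0,0.0", "3,0,0.0,0.0"]
counts = count_samples_per_class(sample_rows)
print(find_problems(counts, num_classes=3, labels=["Open", "Close", "Pointer"]))
```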

X. Model structure

The model prepared in "keypoint_classification.ipynb" is outlined below.

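Finger gesture recognition training

1. Collecting training data

Press "h" to enter the fingertip coordinate history logging mode (displayed as "MODE:Logging Point History").

Press "0" to "9" to append the coordinate history to "model/point_history_classifier/point_history.csv" in the following format.
Column 1: the pressed number (used as the class ID); column 2: trackid; column 3 onward: coordinate history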
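
The key handling described in both training sections ("k" and "h" to switch logging modes, "0" to "9" to pick a class ID) might look like the sketch below. The mode numbers 1 and 2 and the function name `select_mode` are assumptions; `key` is an integer key code such as the one returned by `cv2.waitKey`.

```python
def select_mode(key, mode):
    """Map a key code to (class ID or -1, logging mode).

    Modes are hypothetical: 1 = keypoint logging, 2 = point history logging.
    """
    number = -1
    if ord("0") <= key <= ord("9"):
        number = key - ord("0")  # digit keys select the class ID
    if key == ord("k"):
        mode = 1
    if key == ord("h"):
        mode = 2
    return number, mode


mode = 0
for key in (ord("h"), ord("3")):
    number, mode = select_mode(key, mode)
    print(number, mode)
```

A row would then be logged only when a digit was pressed while one of the logging modes is active.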

The coordinate history is saved after the following preprocessing has been applied up to step ④.
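
The figure listing these steps is not reproduced here. As a rough sketch, assuming the preprocessing matches the upstream MediaPipe-based repository, the history of fingertip positions is made relative to its first point, divided by the image width and height, and flattened into a one-dimensional list. The function name, the step order, and the history length of 16 are assumptions.

```python
import copy
import itertools
from collections import deque


def pre_process_point_history(image_width, image_height, point_history):
    """Turn a history of [x, y] pixel points into a normalized flat list."""
    points = copy.deepcopy(list(point_history))
    if not points:
        return []

    # Assumed: coordinates become offsets from the oldest point in the
    # history, scaled by the image size so they are resolution independent.
    base_x, base_y = points[0]
    for point in points:
        point[0] = (point[0] - base_x) / image_width
        point[1] = (point[1] - base_y) / image_height

    # Assumed: flatten [[x0, y0], [x1, y1], ...] to [x0, y0, x1, y1, ...].
    return list(itertools.chain.from_iterable(points))


# A bounded deque keeps only the most recent points (length is hypothetical).
history = deque(maxlen=16)
for point in ([320, 240], [480, 120]):
    history.append(point)
print(pre_process_point_history(640, 480, history))
```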

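In the initial state, four kinds of training data are included: stationary (class ID: 0), clockwise (class ID: 1), counterclockwise (class ID: 2), and moving (class ID: 4).
If necessary, add classes 5 and above, or delete existing data from the csv, to prepare your training data.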

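2. Model training

Open "point_history_classification.ipynb" in Jupyter Notebook and run the cells from top to bottom.
To change the number of classes in the training data, change the value of "NUM_CLASSES = 4" and
update the labels in "model/point_history_classifier/point_history_classifier_label.csv" accordingly.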

X. Model structure

The model prepared in "point_history_classification.ipynb" is outlined below.
The model using "LSTM" is shown below.
To use it, change "use_lstm = False" to "True".

Application example

None

Reference

Acknowledgments

hand-gesture-recognition-using-mediapipe: 高橋かずひと (https://twitter.com/KzhtTkhs)
- Apache-2.0 License
depthai_hand_tracker: geaxgx (https://github.com/geaxgx/depthai_hand_tracker)
- MIT License

Author

hand-gesture-recognition-using-onnx: Katsuya Hyodo
- Apache-2.0 License
