
1996scarlet / openvtuber

855 stars · 18 watchers · 93 forks · 65.71 MB

The virtual idol (アイドル) sharing project: an application of monocular-RGB-camera eye and facial-landmark detection algorithms to real-time 3D facial capture and model driving.

License: GNU General Public License v3.0

JavaScript 2.76% HTML 17.98% Python 79.25%
vtuber head-pose-estimation face-detection face-alignment tflite

openvtuber's People

Contributors

dependabot[bot]


openvtuber's Issues

Commercial license

Is it possible to license the code commercially? The code is under the GPL, which strictly prohibits sublicensing.

Silly question: how to run this on Windows

Hello, I know this is a bit annoying of me, but I haven't understood how to install the prerequisites and run it.
I don't know whether I have to do it from the Windows console or from PowerShell, or what I have to install and run.
Sorry, it's my first time using a program that doesn't come as an installer/exe :c

How to hook up to laptop's webcam?

Awesome work, but I have a few questions:

1. How do I hook it up to the laptop's webcam? The readme says "python3 vtuber_link_start.py ". I am able to test it on an .mp4 video, but how do I make the program read live video from the webcam? (See the sketch below.)

2. I can only get ~14 fps with my laptop's i5-1035G7 and iGPU on Windows 10. Is it possible to get a better result?
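
For reference, OpenCV treats an integer device index as a camera and a string as a video-file path, so a client that opens its source with cv2.VideoCapture can usually be pointed at the webcam. A minimal sketch (whether vtuber_link_start.py accepts a device index this way is an assumption):

    import sys
    import cv2

    # "0" selects the default webcam; any other argument is treated as a file path.
    source = sys.argv[1] if len(sys.argv) > 1 else "0"
    capture = cv2.VideoCapture(int(source) if source.isdigit() else source)

    while capture.isOpened():
        ok, frame = capture.read()
        if not ok:
            break
        cv2.imshow("webcam", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break

    capture.release()
    cv2.destroyAllWindows()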

Program terminated with uncaught exception of type NSException

Here is the console log.

➜  PythonClient git:(master) ✗ python3 vtuber_link_start.py
/Users/T**/VTube/OpenVtuber/PythonClient/TFLiteFaceDetector.py:13: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
  self._min_boxes = np.array([[10, 16, 24], [32, 48],
2021-01-10 11:06:34.944433: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-01-10 11:06:34.944758: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-01-10 11:06:35.108 Python[11977:4273942] *** Terminating app due to uncaught exception 'NSInternalInconsistencyException', reason: 'NSWindow drag regions should only be invalidated on the Main Thread!'
*** First throw call stack:
(
	0   CoreFoundation                      0x00007fff2d67eb57 __exceptionPreprocess + 250
	1   libobjc.A.dylib                     0x00007fff665115bf objc_exception_throw + 48
	2   CoreFoundation                      0x00007fff2d6a734c -[NSException raise] + 9
	3   AppKit                              0x00007fff2a8a15ec -[NSWindow(NSWindow_Theme) _postWindowNeedsToResetDragMarginsUnlessPostingDisabled] + 310
	4   AppKit                              0x00007fff2a889052 -[NSWindow _initContent:styleMask:backing:defer:contentView:] + 1416
	5   AppKit                              0x00007fff2a888ac3 -[NSWindow initWithContentRect:styleMask:backing:defer:] + 42
	6   AppKit                              0x00007fff2abc1a28 -[NSWindow initWithContentRect:styleMask:backing:defer:screen:] + 52
	7   cv2.cpython-38-darwin.so            0x0000000110d95ed5 cvNamedWindow + 677
	8   cv2.cpython-38-darwin.so            0x0000000110d9579c cvShowImage + 188
	9   cv2.cpython-38-darwin.so            0x0000000110d93c66 _ZN2cv6imshowERKNSt3__112basic_stringIcNS0_11char_traitsIcEENS0_9allocatorIcEEEERKNS_11_InputArrayE + 230
	10  cv2.cpython-38-darwin.so            0x000000010fff539e _ZL18pyopencv_cv_imshowP7_objectS0_S0_ + 302
	11  Python                              0x000000010f9f60f6 cfunction_call_varargs + 171
	12  Python                              0x000000010f9f5be5 _PyObject_MakeTpCall + 274
	13  Python                              0x000000010fa96572 call_function + 804
	14  Python                              0x000000010fa92f91 _PyEval_EvalFrameDefault + 30081
	15  Python                              0x000000010f9f642a function_code_fastcall + 106
	16  Python                              0x000000010f9f5e71 PyVectorcall_Call + 108
	17  Python                              0x000000010fa93328 _PyEval_EvalFrameDefault + 31000
	18  Python                              0x000000010f9f642a function_code_fastcall + 106
	19  Python                              0x000000010fa963a8 call_function + 346
	20  Python                              0x000000010fa92f75 _PyEval_EvalFrameDefault + 30053
	21  Python                              0x000000010f9f642a function_code_fastcall + 106
	22  Python                              0x000000010fa963a8 call_function + 346
	23  Python                              0x000000010fa92f75 _PyEval_EvalFrameDefault + 30053
	24  Python                              0x000000010f9f642a function_code_fastcall + 106
	25  Python                              0x000000010f9f8554 method_vectorcall + 256
	26  Python                              0x000000010f9f5e71 PyVectorcall_Call + 108
	27  Python                              0x000000010fb0c4b2 t_bootstrap + 74
	28  Python                              0x000000010facdd74 pythread_wrapper + 25
	29  libsystem_pthread.dylib             0x00007fff678be109 _pthread_start + 148
	30  libsystem_pthread.dylib             0x00007fff678b9b8b thread_start + 15
)
libc++abi.dylib: terminating with uncaught exception of type NSException
[1]    11977 abort      python3 vtuber_link_start.py

OS ENV:

OS: macOS Catalina 10.15.7 19H15 x86_64
Kernel: 19.6.0
Shell: zsh 5.7.1
CPU: Intel i7-9750H (12) @ 2.60GHz
GPU: Intel UHD Graphics 630, AMD Radeon Pro 5300M
Memory: 16728MiB / 32768MiB
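
The log shows cv2.imshow being invoked from a worker thread (the t_bootstrap / pthread frames), while macOS requires AppKit windows to be created on the main thread. A common workaround, sketched below under the assumption that capture and inference run in a background thread, is to hand frames to the main thread over a queue and keep all HighGUI calls there:

    import queue
    import threading

    import cv2

    frame_q = queue.Queue(maxsize=2)

    def worker(capture):
        # Background thread: only grabs frames (inference would go here too).
        while capture.isOpened():
            ok, frame = capture.read()
            if not ok:
                break
            frame_q.put(frame)

    capture = cv2.VideoCapture(0)
    threading.Thread(target=worker, args=(capture,), daemon=True).start()

    # Main thread: the only place that touches cv2.imshow/waitKey on macOS.
    while True:
        frame = frame_q.get()
        cv2.imshow("preview", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break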

Only head movements

Will parts below the head be supported, such as shoulder and torso movement?

And more complex motions, for example waving?

Runtime error

Segmentation fault: 11

Stack trace:
  [bt] (0) 1   libmxnet.so                         0x000000011e2c42b0 mxnet::Storage::Get() + 4880
  [bt] (1) 2   libsystem_platform.dylib            0x00007fff72b2b5fd _sigtramp + 29
  [bt] (2) 3   libsystem_platform.dylib            0x00007fff72b29074 _platform_strcpy + 84
  [bt] (3) 4   CoreFoundation                      0x00007fff3899e7fe __104-[CFPrefsSearchListSource synchronouslySendDaemonMessage:andAgentMessage:andDirectMessage:replyHandler:]_block_invoke.119 + 83
  [bt] (4) 5   CoreFoundation                      0x00007fff3899e79a CFPREFERENCES_IS_WAITING_FOR_SYSTEM_AND_USER_CFPREFSDS + 74
  [bt] (5) 6   CoreFoundation                      0x00007fff3899e634 -[CFPrefsSearchListSource synchronouslySendDaemonMessage:andAgentMessage:andDirectMessage:replyHandler:] + 172
  [bt] (6) 7   CoreFoundation                      0x00007fff3899d10d -[CFPrefsSearchListSource alreadylocked_generationCountFromListOfSources:count:] + 215
  [bt] (7) 8   CoreFoundation                      0x00007fff3899ce4b -[CFPrefsSearchListSource alreadylocked_getDictionary:] + 360
  [bt] (8) 9   CoreFoundation                      0x00007fff3899ca74 -[CFPrefsSearchListSource alreadylocked_copyValueForKey:] + 152

np.int gives errors

When running, I got a lot of complaints about np.int being deprecated (Python 3.11, NumPy 1.26.0).
This happens mainly in SolvePnPHeadPoseEstimation.py and vtuber_link_start.py.
I changed np.int to int as the error message suggested, and now I can at least test facial expressions, even though the head-pose estimator still doesn't seem to work for me.
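
For reference, NumPy 1.24 removed the long-deprecated np.int alias; the builtin int (or an explicit width such as np.int64) is a drop-in replacement. A minimal illustration with placeholder data:

    import numpy as np

    landmarks = np.random.rand(68, 2) * 100  # placeholder array

    # Before (fails on NumPy >= 1.24): landmarks.astype(np.int)
    pts = np.round(landmarks).astype(int)          # builtin int
    pts64 = np.round(landmarks).astype(np.int64)   # or an explicit width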

Can this be tested offline?

Hello, and thanks for sharing this.
I ran into a problem at this step:
cd NodeServer && yarn # install node modules
Does this mean it can't be tested offline (with the network disconnected)?

Detected facial keypoints ordering / enumeration

I am currently testing multiple head-pose detectors, and most of them follow this ordering convention for the predicted 68 points: https://images.app.goo.gl/sTU8BnXvMquRfPY69

For rendering I use this function:


    def render_68pts_lines(self, frame, landmarks, color_keypoints, color_lines):
        # (start, end, closed) index ranges of the standard 68-point layout
        segments = [
            (0, 16, False),   # jaw contour
            (17, 21, False),  # left eyebrow
            (22, 26, False),  # right eyebrow
            (27, 30, False),  # nose bridge
            (31, 35, False),  # nose bottom
            (36, 41, True),   # left eye
            (42, 47, True),   # right eye
            (48, 59, True),   # outer lips
            (60, 67, True),   # inner lips
        ]
        for pts68 in landmarks:
            for start, end, closed in segments:
                for i in range(start, end):
                    cv2.line(frame, tuple(pts68[i]), tuple(pts68[i + 1]),
                             color_lines, 2, lineType=cv2.LINE_AA)
                if closed:
                    cv2.line(frame, tuple(pts68[end]), tuple(pts68[start]),
                             color_lines, 2, lineType=cv2.LINE_AA)
                for i in range(start, end + 1):
                    cv2.circle(frame, tuple(pts68[i]), 1, color_keypoints, 1, cv2.LINE_AA)

However, this model's ordering seems to be different, and the render does not work properly. Is it possible to sort the 68 points into a standard ordering? (See the reindexing sketch below.)

Also, the reason I need this ordering is to compute the RMSE between multiple detectors.
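
One way to compare detectors that disagree on ordering is to reindex each model's output into the standard iBUG 68-point layout before computing the RMSE. A sketch, assuming the point correspondence is known (MODEL_TO_68 below is a hypothetical placeholder, not this model's actual ordering):

    import numpy as np

    # Hypothetical map: MODEL_TO_68[i] = index of standard point i in the
    # model's output. The identity map is a placeholder; the real
    # correspondence has to be read off the model's documentation.
    MODEL_TO_68 = np.arange(68)

    def to_standard_68(model_pts):
        """Reindex (N, 2) model landmarks into the standard 68-point order."""
        return np.asarray(model_pts)[MODEL_TO_68]

    def rmse(pts_a, pts_b):
        # Both inputs already reindexed into the standard order.
        return float(np.sqrt(np.mean((pts_a - pts_b) ** 2)))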

Are there plans to support other VTubers?

As the title says: the repo is named OpenVtuber, but the README currently only shows a Kizuna AI demo.

Will other VTubers be added later, such as 月の美兎, 猫宮ひなた, or 湊あくあ? Is the main obstacle finding sources or channels for open 3D models?

Well, none of the above is really the point. I just want to praise the developer: using a set of SOTA methods to build something like this is genius. This is the first time I've seen RetinaFace used for this.

Inaccurate arrow of gaze estimation

Hi,

I did some qualitative assessment on my own data, testing face alignment + head pose + gaze. I merged some code from this repository with additional code from Laser Eye, which you referred to, in order to render the gaze arrow.

It works well; however, it seems to me that the arrow is inverted relative to the iris localization.

Code used for rendering:

Constants used (with the imports the snippets below rely on):

    from numpy import arctan, cos, pi, sin
    from numpy.linalg import norm
    import numpy as np
    import cv2

    YAW_THD = 45
    SIN_LEFT_THETA = 2 * sin(pi / 4)
    SIN_UP_THETA = sin(pi / 6)

Video processing

    # fd/fa/hp/gs are the OpenVtuber face-detection, face-alignment, head-pose
    # and iris/gaze modules; capture, sink, json_result and frame_id are set
    # up earlier in the script.
    while capture.isOpened():
        ret, frame = capture.read()
        if not ret:
            break

        bboxes, _ = fd.inference(frame)

        detections = []
        for landmarks in fa.get_landmarks(frame, bboxes):
            euler_angle = hp.get_head_pose(landmarks)
            pitch, yaw, roll = euler_angle[:, 0]

            eye_markers = np.take(landmarks, fa.eye_bound, axis=0)
            eye_centers = np.average(eye_markers, axis=1)
            eye_lengths = (landmarks[[39, 93]] - landmarks[[35, 89]])[:, 0]

            iris_left = gs.get_mesh(frame, eye_lengths[0], eye_centers[0])
            pupil_left, _ = gs.draw_pupil(iris_left, frame, thickness=1)

            iris_right = gs.get_mesh(frame, eye_lengths[1], eye_centers[1])
            pupil_right, _ = gs.draw_pupil(iris_right, frame, thickness=1)

            pupils = np.array([pupil_left, pupil_right])

            poi = landmarks[[35, 89]], landmarks[[39, 93]], pupils, eye_centers
            theta, pha, delta = calculate_3d_gaze(frame, poi)

            if yaw > 30:
                end_mean = delta[0]
            elif yaw < -30:
                end_mean = delta[1]
            else:
                end_mean = np.average(delta, axis=0)

            if end_mean[0] < 0:
                zeta = arctan(end_mean[1] / end_mean[0]) + pi
            else:
                zeta = arctan(end_mean[1] / (end_mean[0] + 1e-7))

            if roll < 0:
                roll += 180
            else:
                roll -= 180

            real_angle = zeta + roll * pi / 180

            R = norm(end_mean)
            offset = R * cos(real_angle), R * sin(real_angle)

            landmarks[[38, 92]] = landmarks[[34, 88]] = eye_centers

            draw_sticker(frame, offset, pupils, landmarks)
            detections.append({
                'landmarks': [lm.tolist() for lm in landmarks],
                'offset': offset,
                'pupils': [p.tolist() for p in pupils],
                'iris_left': [il.tolist() for il in iris_left],
                'iris_right': [ir.tolist() for ir in iris_right]
            })

        json_result[frame_id] = detections
        sink.write(frame)
        frame_id += 1

Draw arrows + landmarks

    def draw_sticker(src, offset, pupils, landmarks,
                     blink_thd=0.22,
                     arrow_color=(0, 125, 255), copy=False):
        if copy:
            src = src.copy()

        # Eye-openness ratios, used as a blink gate below.
        left_eye_height = landmarks[33, 1] - landmarks[40, 1]
        left_eye_width = landmarks[39, 0] - landmarks[35, 0]

        right_eye_height = landmarks[87, 1] - landmarks[94, 1]
        right_eye_width = landmarks[93, 0] - landmarks[89, 0]

        for mark in landmarks.reshape(-1, 2).astype(int):
            cv2.circle(src, tuple(mark), radius=1,
                       color=(0, 0, 255), thickness=-1)

        if left_eye_height / left_eye_width > blink_thd:
            cv2.arrowedLine(src, tuple(pupils[0].astype(int)),
                            tuple((offset + pupils[0]).astype(int)), arrow_color, 2)

        if right_eye_height / right_eye_width > blink_thd:
            cv2.arrowedLine(src, tuple(pupils[1].astype(int)),
                            tuple((offset + pupils[1]).astype(int)), arrow_color, 2)

        return src
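
If the arrow really is mirrored, a quick sanity check is to compare the rendered offset against the pupil's displacement from the eye centre; a cosine near -1 would confirm the inversion. A diagnostic sketch (the premise that gaze should roughly follow pupil displacement is my assumption):

    import numpy as np

    def gaze_consistency(pupils, eye_centers, offset):
        """Cosine between the mean pupil displacement and the rendered
        offset; near +1 means they agree, near -1 suggests inversion."""
        disp = np.mean(np.asarray(pupils) - np.asarray(eye_centers), axis=0)
        off = np.asarray(offset, dtype=float)
        denom = np.linalg.norm(disp) * np.linalg.norm(off) + 1e-7
        return float(np.dot(disp, off) / denom)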

socketio.exceptions.ConnectionError: Connection refused by the server

INFO: Initialized TensorFlow Lite runtime.
Traceback (most recent call last):
  File "vtuber_link_start.py", line 38, in <module>
    sio.connect("http://192.168.14.131:6789")
  File "/home/charmve/.local/lib/python3.6/site-packages/socketio/client.py", line 282, in connect
    six.raise_from(exceptions.ConnectionError(exc.args[0]), None)
  File "<string>", line 3, in raise_from
socketio.exceptions.ConnectionError: Connection refused by the server

File "vtuber_link_start.py", line 38, in <module>
sio.connect("http://192.168.14.131:6789")

# ======================================================

import socketio

sio = socketio.Client()

@sio.on('connect', namespace='/kizuna')
def on_connect():
    sio.emit('result_data', 0, namespace='/kizuna')

# sio.connect("http://127.0.0.1:6789")
sio.connect("http://192.168.14.131:6789")
sio.wait()

# ======================================================

The original address is http://127.0.0.1:6789; 192.168.14.131 is my IP.

How can I fix this error?
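
A URL without the http:// scheme will not parse correctly, and "Connection refused" means nothing accepted the TCP connection at that host:port, so the NodeServer has to be up and listening first. A minimal sketch, assuming client and server run on the same machine:

    import socketio

    sio = socketio.Client()

    # Start the Node server first (cd NodeServer && yarn, then its start
    # script -- the exact command is an assumption), and keep the scheme
    # in the URL:
    sio.connect("http://127.0.0.1:6789")
    sio.wait()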

About the gaze prediction

Great work!

I see that gaze is predicted from the 2D keypoints via the calculate_3d_gaze function. Could you give an introduction to the underlying principle, or a link to a derivation?

Thanks!
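
Reading the client code quoted in the gaze-arrow issue above, the heart of the computation appears to be a planar angle taken from the pupil-displacement vector and compensated by head roll (my reading of the posted snippet, not an authoritative derivation):

    from numpy import arctan, cos, pi, sin
    from numpy.linalg import norm

    def gaze_offset_2d(end_mean, roll_deg):
        """Angle of the displacement vector end_mean, shifted by head roll,
        mapped back to a 2D arrow offset (mirrors the posted client code)."""
        if end_mean[0] < 0:
            zeta = arctan(end_mean[1] / end_mean[0]) + pi
        else:
            zeta = arctan(end_mean[1] / (end_mean[0] + 1e-7))
        roll = roll_deg + 180 if roll_deg < 0 else roll_deg - 180
        real_angle = zeta + roll * pi / 180
        R = norm(end_mean)
        return R * cos(real_angle), R * sin(real_angle)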

socketio.exceptions.ConnectionError: OPEN packet not returned by server

Running the Python client gives me this error. The server is running. Any hint on how to fix this?

Traceback (most recent call last):
  File ".\vtuber_link_start.py", line 46, in <module>
    sio.connect("http://127.0.0.1:6789")
  File "C:\Users\cipol\Projects\Avatar\Engagement research\OpenVtuber\venv\lib\site-packages\socketio\client.py", line 282, in connect
    six.raise_from(exceptions.ConnectionError(exc.args[0]), None)
  File "<string>", line 3, in raise_from
socketio.exceptions.ConnectionError: OPEN packet not returned by server
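
"OPEN packet not returned by server" usually indicates an Engine.IO protocol mismatch rather than a dead server: python-socketio 4.x speaks Engine.IO v3 (socket.io 2.x servers), while 5.x speaks v4 (socket.io 3.x/4.x servers). A small check of the installed client versions (the mismatch diagnosis is an assumption about this particular setup):

    from importlib.metadata import version

    # If the NodeServer pins socket.io 2.x, the Python side needs the 4.x
    # client line: pip install "python-socketio<5" "python-engineio<4"
    print("python-socketio:", version("python-socketio"))
    print("python-engineio:", version("python-engineio"))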
