gaomingqi / track-anything

Track-Anything is a flexible and interactive tool for video object tracking and segmentation, based on Segment Anything, XMem, and E2FGVI.

License: MIT License

Python 99.22% HTML 0.78%

segment-anything video-object-segmentation interactive-tracking track-anything video-object-tracking inpaint-anything

track-anything's Introduction

Track-Anything is a flexible and interactive tool for video object tracking and segmentation. It is built on Segment Anything and lets users specify anything to track and segment using clicks alone. During tracking, users can change the objects they want to track or correct the region of interest if there is any ambiguity. These characteristics make Track-Anything suitable for:

Video object tracking and segmentation with shot changes.
Visualized development and data annotation for video object tracking and segmentation.
Object-centric downstream video tasks, such as video inpainting and editing.

🚀 Updates

2023/05/02: We uploaded tutorials in steps 🗺️. Check HERE for more details.
2023/04/29: We improved inpainting by decoupling GPU memory usage and video length. Now Track-Anything can inpaint videos with any length! 😺 Check HERE for our GPU memory requirements.
2023/04/25: We are delighted to introduce Caption-Anything ✍️, an inventive project from our lab that combines the capabilities of Segment Anything, Visual Captioning, and ChatGPT.
2023/04/20: We deployed DEMO on Hugging Face 🤗!
2023/04/14: We made Track-Anything public!

🗺️ Video Tutorials (Track-Anything Tutorials in Steps)

huggingface_demo_operation.mp4

🕹️ Example - Multiple Object Tracking and Segmentation (with XMem)

qingming.mp4

🕹️ Example - Video Object Tracking and Segmentation with Shot Changes (with XMem)

curry_good_night_low.mp4

🕹️ Example - Video Inpainting (with E2FGVI)

inpainting.mp4

💻 Get Started

Linux & Windows

# Clone the repository:
git clone https://github.com/gaomingqi/Track-Anything.git
cd Track-Anything

# Install dependencies: 
pip install -r requirements.txt

# Run the Track-Anything gradio demo.
python app.py --device cuda:0
# python app.py --device cuda:0 --sam_model_type vit_b # for lower memory usage

📖 Citation

If you find this work useful for your research or applications, please cite using this BibTeX:

@misc{yang2023track,
      title={Track Anything: Segment Anything Meets Videos}, 
      author={Jinyu Yang and Mingqi Gao and Zhe Li and Shang Gao and Fangjing Wang and Feng Zheng},
      year={2023},
      eprint={2304.11968},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

👏 Acknowledgements

The project is based on Segment Anything, XMem, and E2FGVI. Thanks to the authors for their efforts.

track-anything's Issues

app.py: error: unrecognized arguments:

First of all, this is an outstanding project.

Unfortunately, I could not get it running here, because this error prevents the server from starting.

python app.py --device cuda:0 -sam_model_type vit_h --port 12212
usage: app.py [-h] [--device DEVICE] [--sam_model_type SAM_MODEL_TYPE]
[--port PORT] [--debug] [--mask_save MASK_SAVE]

app.py: error: unrecognized arguments: -sam_model_type vit_h

Any tips on what to do before I try to fix it myself?

cheers
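
The usage message above lists only double-dash options, and the failing command passes -sam_model_type with a single dash, which argparse does not match to --sam_model_type. A minimal sketch that reproduces this with a stand-in parser: the option names are taken from the usage message, while build_parser and parses_cleanly are hypothetical helpers and not the project's actual code.

```python
import argparse


def build_parser():
    """Stand-in parser mirroring the options shown in the usage message."""
    parser = argparse.ArgumentParser(prog="app.py")
    parser.add_argument("--device", type=str)
    parser.add_argument("--sam_model_type", type=str)
    parser.add_argument("--port", type=int)
    parser.add_argument("--debug", action="store_true")
    parser.add_argument("--mask_save")
    return parser


def parses_cleanly(argv):
    """Return True if argparse accepts argv, False if it exits with an error."""
    try:
        build_parser().parse_args(argv)
        return True
    except SystemExit:
        return False


# Double dash: accepted.
ok = parses_cleanly(["--device", "cuda:0", "--sam_model_type", "vit_h", "--port", "12212"])
# Single dash: rejected as unrecognized arguments.
bad = parses_cleanly(["--device", "cuda:0", "-sam_model_type", "vit_h", "--port", "12212"])
print(ok, bad)
```

Under that assumption, rerunning the command with --sam_model_type (two dashes) should get past argument parsing.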

Run time performance

Hi,

Thank you for such an amazing and useful project! What is the best per-frame running time you have achieved when tracking an object in a video? I ask because I am thinking of using this tool in a real-time AR application, but I am not sure whether it is fast enough. Thank you again for your time and effort!

Tutorial install Track-Anything in Windows

https://youtu.be/MQJ4LMLXm30

point_mask[point[1], point[0]] = 1 raise IndexError: index 648 is out of bounds for axis 1 with size 640
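
An index of 648 on an axis of size 640 means the x coordinate of the click fell outside the mask. A minimal sketch, assuming the coordinate is simply out of range and that clamping it to the image bounds is acceptable; clamp_point is a hypothetical helper, not a function from the repository, and a nested list stands in for point_mask. If the click was made on a resized preview, rescaling the coordinates would be the more appropriate fix.

```python
def clamp_point(point, width, height):
    """Clamp an (x, y) click to valid indices for a height x width mask."""
    x, y = point
    x = min(max(int(x), 0), width - 1)
    y = min(max(int(y), 0), height - 1)
    return x, y


# Stand-in for point_mask: a height x width grid of zeros.
width, height = 640, 480
point_mask = [[0] * width for _ in range(height)]

# x = 648 would be out of bounds for a width of 640; clamp before indexing.
x, y = clamp_point((648, 100), width, height)
point_mask[y][x] = 1
```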

What is the required GPU memory for running this project?

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 2.17 GiB (GPU 0; 14.84 GiB total capacity; 11.81 GiB already allocated; 1.67 GiB free; 12.22 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
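
The error message itself suggests setting max_split_size_mb through PYTORCH_CUDA_ALLOC_CONF. A minimal sketch of doing that from Python before torch is imported; the value 128 is an arbitrary illustration rather than a recommendation from the project, and alloc_conf is a hypothetical helper.

```python
import os


def alloc_conf(max_split_size_mb):
    """Build the allocator setting string named in the error message."""
    return f"max_split_size_mb:{max_split_size_mb}"


# Set this before the first CUDA allocation; the simplest place is before
# importing torch at the top of the script.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = alloc_conf(128)
```

This only reduces fragmentation. If the model genuinely needs more memory than the GPU has, the lighter vit_b option shown in the Get Started commands is the other lever available here.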

Still getting errors even with 16GB GPU memory

Experts, why can't I get this to run?

(venv) D:\Track-Anything-master>python app.py
Initializing BaseSegmenter to cuda:0
Traceback (most recent call last):
  File "D:\Track-Anything-master\app.py", line 383, in <module>
    model = TrackingAnything(SAM_checkpoint, xmem_checkpoint, e2fgvi_checkpoint,args)
  File "D:\Track-Anything-master\track_anything.py", line 18, in __init__
    self.samcontroler = SamControler(self.sam_checkpoint, args.sam_model_type, args.device)
  File "D:\Track-Anything-master\tools\interact_tools.py", line 37, in __init__
    self.sam_controler = BaseSegmenter(SAM_checkpoint, model_type, device)
  File "D:\Track-Anything-master\tools\base_segmenter.py", line 25, in __init__
    self.model = sam_model_registry[model_type](checkpoint=SAM_checkpoint)
  File "C:\ana\lib\site-packages\segment_anything\build_sam.py", line 15, in build_sam_vit_h
    return _build_sam(
  File "C:\ana\lib\site-packages\segment_anything\build_sam.py", line 105, in _build_sam
    state_dict = torch.load(f)
  File "C:\ana\lib\site-packages\torch\serialization.py", line 777, in load
    with _open_zipfile_reader(opened_file) as opened_zipfile:
  File "C:\ana\lib\site-packages\torch\serialization.py", line 282, in __init__
    super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory
Is the torch version wrong?
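
The message "failed finding central directory" means the zip archive that torch.load opened has no readable end-of-archive record, which is what an interrupted checkpoint download looks like. A minimal sketch, assuming the checkpoint is meant to be a zip-format file: zipfile.is_zipfile from the standard library looks for that record, so it returns False for a truncated file. looks_complete is a hypothetical helper, and the demonstration uses a throwaway archive rather than a real checkpoint.

```python
import os
import tempfile
import zipfile


def looks_complete(path):
    """True if the file ends with a readable zip end-of-archive record."""
    return zipfile.is_zipfile(path)


with tempfile.TemporaryDirectory() as tmp:
    # A small, intact zip archive standing in for a checkpoint file.
    good = os.path.join(tmp, "good.pth")
    with zipfile.ZipFile(good, "w") as zf:
        zf.writestr("data.pkl", b"x" * 1024)

    # Simulate an interrupted download by keeping only the first half.
    bad = os.path.join(tmp, "truncated.pth")
    with open(good, "rb") as src, open(bad, "wb") as dst:
        dst.write(src.read()[: os.path.getsize(good) // 2])

    good_ok = looks_complete(good)
    bad_ok = looks_complete(bad)

print(good_ok, bad_ok)
```

If the check fails on the downloaded checkpoint, deleting the file and downloading it again is the usual remedy. A checkpoint saved in a legacy non-zip format would also fail this check, so treat the result as a hint rather than proof.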

masks displayed from multiple frames

Hi,

Nice work!

Using the demo app, whenever one clicks "Add mask", switches to a new frame, and then adds another mask, all masks from all time steps are displayed. Perhaps it would be better to display only the masks associated with a particular frame?
If multiple masks of the same object are selected over a longer video sequence, they are interpreted as separate tracked objects and not merged. How can one select multiple occurrences of the same track in the UI?

Also, I often get the following error

in generate_video_from_frames
    frames = torch.from_numpy(np.asarray(frames))
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (200,) + inhomogeneous part.

Int conversion error (not sure why this would be happening)

The app loads the videos correctly and allows me to select masks for tracking in video. However, when running the tracking feature, I keep getting this error:

File "app.py", line 351, in generate_video_from_frames
torchvision.io.write_video(output_path, frames, fps=fps, video_codec="libx264")
File "C:\Users\AppData\Roaming\Python\Python38\site-packages\torchvision\io\video.py", line 70, in write_video
stream = container.add_stream(video_codec, rate=fps)
File "av\container\output.pyx", line 101, in av.container.output.OutputContainer.add_stream
File "av\utils.pyx", line 57, in av.utils.to_avrational
OverflowError: Python int too large to convert to C long

Errors in Hugging Face interface

Cool tool. We are testing it for our experiments and running into issues; see the attached image. Three "error" blocks appear after I upload the video, and none of the controls work (the buttons don't do anything).

Is the code runnable on Raspberry Pi?

First of all, you guys did an amazing job 👏💯💯💯👏!!

I've always wanted to track objects in videos. I have found other projects that do a pretty good job of tracking, but they cannot run on Arm-based edge devices such as the RPi.

So I am wondering: can TAM, used for tracking only, run on a Raspberry Pi 4 (mine has 4 GB of memory)?

Looking forward to your reply.

Cheers,
Winston

X-mem annotation question.

In the Steph Curry example, how many manual annotations had to be made before processing? I am attempting to process a similar video, but cannot keep a person consistent across camera cuts.

Background removal

Hello,

Your segmentation and tracking are impressive! I was wondering if there is a way to remove the background once a 'subject' is being tracked?

Thank you.

Tracking Error

Hello,
I cloned the repo on my local machine but I get an ERROR on the GUI whenever I click TRACK

[ERROR:[email protected]] global D:\a\opencv-python\opencv-python\opencv\modules\videoio\src\cap.cpp (166) cv::VideoCapture::open VIDEOIO(CV_IMAGES): raised OpenCV exception:

OpenCV(4.6.0) D:\a\opencv-python\opencv-python\opencv\modules\videoio\src\cap_images.cpp:293: error: (-215:Assertion failed) !_filename.empty() in function 'cv::CvCapture_Images::open'

Traceback (most recent call last):
  File "C:\Users\Francesco\anaconda3\envs\stable_diffusion\lib\site-packages\gradio\routes.py", line 395, in run_predict
    output = await app.get_blocks().process_api(
  File "C:\Users\Francesco\anaconda3\envs\stable_diffusion\lib\site-packages\gradio\blocks.py", line 1193, in process_api
    result = await self.call_function(
  File "C:\Users\Francesco\anaconda3\envs\stable_diffusion\lib\site-packages\gradio\blocks.py", line 916, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "C:\Users\Francesco\anaconda3\envs\stable_diffusion\lib\site-packages\anyio\to_thread.py", line 28, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(func, *args, cancellable=cancellable,
  File "C:\Users\Francesco\anaconda3\envs\stable_diffusion\lib\site-packages\anyio\_backends\_asyncio.py", line 818, in run_sync_in_worker_thread
    return await future
  File "C:\Users\Francesco\anaconda3\envs\stable_diffusion\lib\site-packages\anyio\_backends\_asyncio.py", line 754, in run
    result = context.run(func, *args)
  File "D:\Track-Anything\app.py", line 88, in get_frames_from_video
    image_size = (frames[0].shape[0],frames[0].shape[1])
IndexError: list index out of range
Traceback (most recent call last):
  File "C:\Users\Francesco\anaconda3\envs\stable_diffusion\lib\site-packages\gradio\routes.py", line 395, in run_predict
    output = await app.get_blocks().process_api(
  File "C:\Users\Francesco\anaconda3\envs\stable_diffusion\lib\site-packages\gradio\blocks.py", line 1193, in process_api
    result = await self.call_function(
  File "C:\Users\Francesco\anaconda3\envs\stable_diffusion\lib\site-packages\gradio\blocks.py", line 916, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "C:\Users\Francesco\anaconda3\envs\stable_diffusion\lib\site-packages\anyio\to_thread.py", line 28, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(func, *args, cancellable=cancellable,
  File "C:\Users\Francesco\anaconda3\envs\stable_diffusion\lib\site-packages\anyio\_backends\_asyncio.py", line 818, in run_sync_in_worker_thread
    return await future
  File "C:\Users\Francesco\anaconda3\envs\stable_diffusion\lib\site-packages\anyio\_backends\_asyncio.py", line 754, in run
    result = context.run(func, *args)
  File "D:\Track-Anything\app.py", line 221, in vos_tracking_video
    masks, logits, painted_images = model.generator(images=following_frames, template_mask=template_mask)
  File "D:\Track-Anything\track_anything.py", line 44, in generator
    mask, logit, painted_image = self.xmem.track(images[i], template_mask)
  File "C:\Users\Francesco\anaconda3\envs\stable_diffusion\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "D:\Track-Anything\tracker\base_tracker.py", line 82, in track
    probs, _ = self.tracker.step(frame_tensor, mask, labels) # logits 2 (bg fg) H W
  File "D:\Track-Anything/tracker\inference\inference_core.py", line 102, in step
    value, hidden = self.network.encode_value(image, f16, self.memory.get_hidden(),
  File "D:\Track-Anything/tracker\model\network.py", line 75, in encode_value
    others = torch.cat([
RuntimeError: torch.cat(): expected a non-empty list of Tensors

This error occurs when I click on "Tracking"

Traceback (most recent call last):
  File "H:\Deepfacelab\Deepface\Track-Anything\venv\Lib\site-packages\gradio\routes.py", line 395, in run_predict
    output = await app.get_blocks().process_api(
  File "H:\Deepfacelab\Deepface\Track-Anything\venv\Lib\site-packages\gradio\blocks.py", line 1193, in process_api
    result = await self.call_function(
  File "H:\Deepfacelab\Deepface\Track-Anything\venv\Lib\site-packages\gradio\blocks.py", line 916, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "H:\Deepfacelab\Deepface\Track-Anything\venv\Lib\site-packages\anyio\to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "H:\Deepfacelab\Deepface\Track-Anything\venv\Lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "H:\Deepfacelab\Deepface\Track-Anything\venv\Lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "H:\Deepfacelab\Deepface\Track-Anything\app.py", line 149, in vos_tracking_video
    masks, logits, painted_images = model.generator(images=following_frames, template_mask=template_mask)
  File "H:\Deepfacelab\Deepface\Track-Anything\track_anything.py", line 46, in generator
    mask, logit, painted_image = self.xmem.track(images[i], template_mask)
  File "H:\Deepfacelab\Deepface\Track-Anything\venv\Lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "H:\Deepfacelab\Deepface\Track-Anything\tracker\base_tracker.py", line 81, in track
    probs, _ = self.tracker.step(frame_tensor, mask, labels)   # logits 2 (bg fg) H W
  File "H:\Deepfacelab\Deepface\Track-Anything/tracker\inference\inference_core.py", line 113, in step
    return unpad(pred_prob_with_bg, self.pad), None
  File "H:\Deepfacelab\Deepface\Track-Anything/tracker\util\tensor_util.py", line 35, in unpad
    if len(img.shape) == 4:
AttributeError: 'NoneType' object has no attribute 'shape'

More errors, how can I solve them?

I upgraded gradio because I had a different error and the upgrade was supposed to fix it. Now my own videos fail to load about half the time when I click "get video info", although the sample videos at the bottom still load. However, I now get this error:

AttributeError: module 'gradio' has no attribute 'SelectData'

$ python app.py --device cuda:0
'mim' is not recognized as an internal or external command,
operable program or batch file.
Traceback (most recent call last):
  File "D:\remover vdieo\Track-Anything\app.py", line 159, in <module>
    def sam_refine(video_state, point_prompt, click_state, interactive_state, evt:gr.SelectData):
                                                                                  ^^^^^^^^^^^^^
AttributeError: module 'gradio' has no attribute 'SelectData'
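
An AttributeError on gr.SelectData suggests that the installed gradio release is not the one app.py was written against. A minimal sketch for reporting the installed versions so they can be compared with requirements.txt; installed_version is a hypothetical helper, and the package list is only an illustration.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


for name in ("gradio", "gradio_client", "huggingface_hub"):
    print(name, installed_version(name))
```

Including this output in a bug report makes version mismatches much easier to spot.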

Does not run although all requirements are fulfilled

The console gives me these messages:

Initializing BaseSegmenter to cuda:0
Hyperparameters read from the model weights: C^k=64, C^v=512, C^h=64
Single object mode: False
Traceback (most recent call last):
  File "G:\TrackAnything\Track-Anything\app.py", line 384, in <module>
    model = TrackingAnything(SAM_checkpoint, xmem_checkpoint, e2fgvi_checkpoint,args)
  File "G:\TrackAnything\Track-Anything\track_anything.py", line 20, in __init__
    self.baseinpainter = BaseInpainter(self.e2fgvi_checkpoint, args.device)
  File "G:\TrackAnything\Track-Anything\inpainter\base_inpainter.py", line 20, in __init__
    net = importlib.import_module('inpainter.model.e2fgvi_hq')
  File "C:\Users\Tanvir\AppData\Local\Programs\Python\Python310\lib\importlib\__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "G:\TrackAnything\Track-Anything\inpainter\model\e2fgvi_hq.py", line 9, in <module>
    from inpainter.model.modules.feat_prop import BidirectionalPropagation, SecondOrderDeformableAlignment
  File "G:\TrackAnything\Track-Anything\inpainter\model\modules\feat_prop.py", line 7, in <module>
    from mmcv.ops import ModulatedDeformConv2d, modulated_deform_conv2d
  File "C:\Users\Tanvir\AppData\Local\Programs\Python\Python310\lib\site-packages\mmcv\ops\__init__.py", line 2, in <module>
    from .active_rotated_filter import active_rotated_filter
  File "C:\Users\Tanvir\AppData\Local\Programs\Python\Python310\lib\site-packages\mmcv\ops\active_rotated_filter.py", line 10, in <module>
    ext_module = ext_loader.load_ext(
  File "C:\Users\Tanvir\AppData\Local\Programs\Python\Python310\lib\site-packages\mmcv\utils\ext_loader.py", line 13, in load_ext
    ext = importlib.import_module('mmcv.' + name)
  File "C:\Users\Tanvir\AppData\Local\Programs\Python\Python310\lib\importlib\__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ImportError: DLL load failed while importing _ext: The specified procedure could not be found.

How did you segment the entire person?

When I click on a single point, SAM outputs this:

It generates many masks on the person.

Then, when I run the tracker, I get this error:

tracker\util\mask_mapper.py:47 in convert_mask      │
│                                                                                                  │
│   44 │   │                                                                                       │
│   45 │   │   new_labels = list(set(labels) - set(self.labels))                                   │
│   46 │   │   if not exhaustive:                                                                  │
│ ❱ 47 │   │   │   assert len(new_labels) == len(labels), 'Old labels found in non-exhaustive m    │
│   48 │   │                                                                                       │
│   49 │   │   # add new remappings                                                                │
│   50 │   │   for i, l in enumerate(new_labels):                                                  │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
AssertionError: Old labels found in non-exhaustive mode

Feature Request: Add pip package.

I really liked your work 💯 I want to add it to the Hugging Face platform. In this space there is the A10 Large GPU.

https://huggingface.co/spaces/ArtGAN/Segment-Anything-Video

Need 29 VRAM to test?

can this run on silicon M1/M2 ?

Multiple Object Tracking automation

Thanks for your work. I was wondering if there is any support for automatic multiple object tracking?

Can this run under win11? Is it possible to make compatibility adjustments for Win11 in the future?

I configured the environment and launched it successfully in PowerShell, but the web interface cannot be opened under Win11. Is it possible to make compatibility adjustments for Win11 in the future?

Getting an error called: "Torch not compiled with CUDA enabled"

app.py --device cuda:0

Initializing BaseSegmenter to cuda:0
Traceback (most recent call last):
  File "C:\Users\Happy\Desktop\Git\Track-Anything\app.py", line 383, in <module>
    model = TrackingAnything(SAM_checkpoint, xmem_checkpoint, e2fgvi_checkpoint,args)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Happy\Desktop\Git\Track-Anything\track_anything.py", line 18, in __init__
    self.samcontroler = SamControler(self.sam_checkpoint, args.sam_model_type, args.device)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Happy\Desktop\Git\Track-Anything\tools\interact_tools.py", line 37, in __init__
    self.sam_controler = BaseSegmenter(SAM_checkpoint, model_type, device)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Happy\Desktop\Git\Track-Anything\tools\base_segmenter.py", line 26, in __init__
    self.model.to(device=self.device)
  File "C:\Users\Happy\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\modules\module.py", line 1145, in to
    return self._apply(convert)
           ^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Happy\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\modules\module.py", line 797, in _apply
    module._apply(fn)
  File "C:\Users\Happy\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\modules\module.py", line 797, in _apply
    module._apply(fn)
  File "C:\Users\Happy\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\modules\module.py", line 797, in _apply
    module._apply(fn)
  File "C:\Users\Happy\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\modules\module.py", line 820, in _apply
    param_applied = fn(param)
                    ^^^^^^^^^
  File "C:\Users\Happy\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\modules\module.py", line 1143, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Happy\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\cuda\__init__.py", line 239, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled

I don't know how to fix this, can anyone help?
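
"Torch not compiled with CUDA enabled" means the installed PyTorch build has no CUDA support, so any cuda device string fails when the model is moved to it. A minimal sketch of checking a requested device before using it: resolve_device is a hypothetical helper, and whether the app runs acceptably with a cpu device is an assumption this page does not confirm. Installing a CUDA-enabled PyTorch build is the more usual remedy.

```python
def resolve_device(requested, cuda_available, device_count):
    """Return a device string that the current machine can actually use."""
    if not requested.startswith("cuda"):
        return requested
    if not cuda_available or device_count == 0:
        return "cpu"
    # "cuda" without an index means device 0.
    index = int(requested.split(":", 1)[1]) if ":" in requested else 0
    return requested if index < device_count else "cuda:0"


try:
    import torch

    device = resolve_device(
        "cuda:0", torch.cuda.is_available(), torch.cuda.device_count()
    )
except ImportError:
    device = "cpu"  # torch is not installed in this environment
print(device)
```

The same check also catches a device index that is larger than the number of visible GPUs.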

get video info error

I uploaded a video, but it won't show. Getting the video info fails with an error.

can this run in macos m1 pro(mps)

As the title asks: I don't have an Nvidia device, just MPS.

I

Use modern python tooling

Why is this using requirements.txt?

Please move to poetry.

ImportError: cannot import name 'SpaceStage' from 'huggingface_hub' (unknown location)

(base) root@185:~/Track-Anything# python app.py --device cuda:0 --sam_model_type vit_h --port 80
Traceback (most recent call last):
  File "/root/Track-Anything/app.py", line 1, in <module>
    import gradio as gr
  File "/root/anaconda3/lib/python3.10/site-packages/gradio/__init__.py", line 3, in <module>
    import gradio.components as components
  File "/root/anaconda3/lib/python3.10/site-packages/gradio/components.py", line 34, in <module>
    from gradio_client import utils as client_utils
  File "/root/anaconda3/lib/python3.10/site-packages/gradio_client/__init__.py", line 1, in <module>
    from gradio_client.client import Client
  File "/root/anaconda3/lib/python3.10/site-packages/gradio_client/client.py", line 21, in <module>
    from huggingface_hub import SpaceStage
ImportError: cannot import name 'SpaceStage' from 'huggingface_hub' (unknown location)

Does this require high performance GPU?

Hi, I am the one who emailed you yesterday about the errors in my browser. Today I ran "python app.py --device cuda:0 --sam_model_type vit_b" and it works! Was the problem due to my GPU? I use a 2070 with 8 GB.
Another question: how can I get the center point of the area I clicked?

ImportError: cannot import name 'SegAutoMaskPredictor' from partially initialized module 'metaseg' (most likely due to a circular import) (/root/anaconda3/lib/python3.10/site-packages/metaseg/__init__.py)

(base) root@185:~/Track-Anything# python app.py --device cuda:0 --sam_model_type vit_h --port 80
Traceback (most recent call last):
  File "/root/Track-Anything/app.py", line 2, in <module>
    from demo import automask_image_app, automask_video_app, sahi_autoseg_app
  File "/root/Track-Anything/demo.py", line 1, in <module>
    from metaseg import SegAutoMaskPredictor, SegManualMaskPredictor, SahiAutoSegmentation, sahi_sliced_predict
  File "/root/anaconda3/lib/python3.10/site-packages/metaseg/__init__.py", line 7, in <module>
    from metaseg.falai_demo import falai_automask_image, falai_manuelmask_image
  File "/root/anaconda3/lib/python3.10/site-packages/metaseg/falai_demo.py", line 5, in <module>
    from metaseg import SegAutoMaskPredictor, SegManualMaskPredictor
ImportError: cannot import name 'SegAutoMaskPredictor' from partially initialized module 'metaseg' (most likely due to a circular import) (/root/anaconda3/lib/python3.10/site-packages/metaseg/__init__.py)

how to make masks fully opaque , rather than semi transparent overlay

hey,
1. Huge thanks. I was testing a local Linux install and it works great, with amazing speed: a 3-second 720p clip gets masked in 3 seconds. What a time to be alive.

Q. Is it possible to make the mask fully opaque, i.e. a fully solid orange color?
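
A semi-transparent overlay is normally the result of alpha blending the mask color over the frame, so a fully solid mask corresponds to an alpha of 1.0. A minimal per-pixel sketch of that blending rule; blend_pixel is a hypothetical helper, and how the project's own painting code exposes its transparency setting is not shown on this page.

```python
def blend_pixel(pixel, color, alpha):
    """Blend an RGB mask color over an RGB pixel; alpha=1.0 is fully opaque."""
    return tuple(
        round(alpha * c + (1.0 - alpha) * p) for p, c in zip(pixel, color)
    )


orange = (255, 165, 0)
background = (20, 40, 60)

semi = blend_pixel(background, orange, 0.5)       # mix of both colors
solid = blend_pixel(background, orange, 1.0)      # mask color only
untouched = blend_pixel(background, orange, 0.0)  # original pixel only
print(semi, solid, untouched)
```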

export masks?

Can I export the tracked object as a mask? Is this built in? (I can't seem to find it.) Also, I am not a programmer!

Inpaint error - You are trying to inpaint without masks input

Running locally on Win11/4090 with Python 3.10.5 on pyenv.
Added a 10-second 720p video, selected 3 masks, clicked Tracking: success.
Clicked Inpaint and got an error like: Error! You are trying to inpaint without masks input. Please track the selected mask first, and then press inpaint. If VRAM exceeded, please use the resize ratio to scaling down the image size.
- Tried restarting from the beginning and setting the resize ratio to 0.5: same error.

Is this the correct order of operations? Is there a step I'm missing to send the same masks that just worked for tracking to inpaint?

Error during the installation

After following the instructions to install all the requirements, I still ran into the following problem:
Initializing BaseSegmenter to cuda:0
Traceback (most recent call last):
  File "E:\Sydney University\CVProject\Project\Track_Anything\Track-Anything-master\app.py", line 343, in <module>
    model = TrackingAnything(SAM_checkpoint, xmem_checkpoint, e2fgvi_checkpoint,args)
  File "E:\Sydney University\CVProject\Project\Track_Anything\Track-Anything-master\track_anything.py", line 13, in __init__
    self.samcontroler = SamControler(sam_checkpoint, args.sam_model_type, args.device)
  File "E:\Sydney University\CVProject\Project\Track_Anything\Track-Anything-master\tools\interact_tools.py", line 37, in __init__
    self.sam_controler = BaseSegmenter(SAM_checkpoint, model_type, device)
  File "E:\Sydney University\CVProject\Project\Track_Anything\Track-Anything-master\tools\base_segmenter.py", line 25, in __init__
    self.model = sam_model_registry[model_type](checkpoint=SAM_checkpoint)
  File "D:\Anaconda\envs\Track_Anything\lib\site-packages\segment_anything\build_sam.py", line 15, in build_sam_vit_h
    return _build_sam(
  File "D:\Anaconda\envs\Track_Anything\lib\site-packages\segment_anything\build_sam.py", line 105, in _build_sam
    state_dict = torch.load(f)
  File "D:\Anaconda\envs\Track_Anything\lib\site-packages\torch\serialization.py", line 705, in load
    with _open_zipfile_reader(opened_file) as opened_zipfile:
  File "D:\Anaconda\envs\Track_Anything\lib\site-packages\torch\serialization.py", line 242, in __init__
    super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory

Has anyone met the same problem and solved it?

AttributeError: 'Image' object has no attribute 'select'

I think I am the first person to report this issue.

template_frame.select(
    fn=sam_refine,
    inputs=[video_state, point_prompt, click_state, interactive_state],
    outputs=[template_frame, video_state, interactive_state, run_status]
)

Regarding this piece of code, I encountered the following problem:

AttributeError: 'Image' object has no attribute 'select'

I have tried updating the Gradio version, but it didn't work. After removing the template_frame.select function, the code runs successfully, but it obviously loses an important feature. Is there a good solution to this problem?
For reference, my project is using the AutoDL platform.

Improve object tracking for objects that temporarily disappear and reappear in the video

Hi all,

First of all, I would like to thank you again for your amazing work! I have successfully used it in my application, and it performs really well! However, I have noticed a performance issue that I would like to address to further improve the tracking capabilities of the model.

When tracking ONE object, if the object disappears (entirely) from the video for a few frames and then reappears, the model seems to struggle to identify it again, without using human interactions. I understand this is a common issue. However, I am wondering if there is anything I could modify in the code to enhance the model's ability to track objects that temporarily disappear and reappear in the video.

Any suggestions or guidance would be greatly appreciated. Thank you in advance!

Best regards,

Demo won't launch.

Holly@HOLLY MINGW64 ~/Documents/Track-Anything (master)
$ python app.py --device cuda:0
C:\Users\Holly\AppData\Local\Programs\Python\Python310\lib\site-packages\mmcv\__init__.py:20: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
  warnings.warn(
Initializing BaseSegmenter to cuda:0
Hyperparameters read from the model weights: C^k=64, C^v=512, C^h=64
Single object mode: False
Traceback (most recent call last):
  File "C:\Users\Holly\Documents\Track-Anything\app.py", line 372, in <module>
    model = TrackingAnything(SAM_checkpoint, xmem_checkpoint, e2fgvi_checkpoint,args)
  File "C:\Users\Holly\Documents\Track-Anything\track_anything.py", line 20, in __init__
    self.baseinpainter = BaseInpainter(self.e2fgvi_checkpoint, args.device)
  File "C:\Users\Holly\Documents\Track-Anything\inpainter\base_inpainter.py", line 20, in __init__
    net = importlib.import_module('inpainter.model.e2fgvi_hq')
  File "C:\Users\Holly\AppData\Local\Programs\Python\Python310\lib\importlib\__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "C:\Users\Holly\Documents\Track-Anything\inpainter\model\e2fgvi_hq.py", line 9, in <module>
    from inpainter.model.modules.feat_prop import BidirectionalPropagation, SecondOrderDeformableAlignment
  File "C:\Users\Holly\Documents\Track-Anything\inpainter\model\modules\feat_prop.py", line 7, in <module>
    from mmcv.ops import ModulatedDeformConv2d, modulated_deform_conv2d
  File "C:\Users\Holly\AppData\Local\Programs\Python\Python310\lib\site-packages\mmcv\ops\__init__.py", line 3, in <module>
    from .active_rotated_filter import active_rotated_filter
  File "C:\Users\Holly\AppData\Local\Programs\Python\Python310\lib\site-packages\mmcv\ops\active_rotated_filter.py", line 10, in <module>
    ext_module = ext_loader.load_ext(
  File "C:\Users\Holly\AppData\Local\Programs\Python\Python310\lib\site-packages\mmcv\utils\ext_loader.py", line 13, in load_ext
    ext = importlib.import_module('mmcv.' + name)
  File "C:\Users\Holly\AppData\Local\Programs\Python\Python310\lib\importlib\__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ImportError: DLL load failed while importing _ext: The specified procedure could not be found.

ERROR: Failed building wheel for mmcv-full

Installing the requirements ends with this:
active_rotated_filter.cpp
C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.35.32215\include\yvals.h(17): fatal error C1083: Cannot open include file: 'crtdbg.h': No such file or directory
error: command 'C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.35.32215\bin\HostX86\x64\cl.exe' failed with exit code 2
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for mmcv-full
Running setup.py clean for mmcv-full
Successfully built metaseg thinplate segment-anything
Failed to build pycocotools mmcv-full
ERROR: Could not build wheels for pycocotools, mmcv-full, which is required to install pyproject.toml-based projects

C:\Users\XXXXX\Documents\Github\track-anything>

About a runtime error

ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host.

RuntimeError: CUDA error: invalid device ordinal

I think it's an external problem, but does anyone know a solution?
My Nvidia/CUDA drivers are up to date.

E:\Segment\Track-Anything>app.py
download checkpoints ......
download successfully!
download checkpoints ......
download successfully!
Initializing BaseSegmenter to cuda:4
Traceback (most recent call last):
File "E:\Segment\Track-Anything\app.py", line 200, in <module>
model = TrackingAnything(SAM_checkpoint, xmem_checkpoint, args)
File "E:\Segment\Track-Anything\track_anything.py", line 14, in __init__
self.samcontroler = SamControler(sam_checkpoint, args.sam_model_type, args.device)
File "E:\Segment\Track-Anything\tools\interact_tools.py", line 37, in __init__
self.sam_controler = BaseSegmenter(SAM_checkpoint, model_type, device)
File "E:\Segment\Track-Anything\tools\base_segmenter.py", line 26, in __init__
self.model.to(device=self.device)
File "C:\Users\marco\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 990, in to
return self._apply(convert)
File "C:\Users\marco\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 642, in _apply
module._apply(fn)
File "C:\Users\marco\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 642, in _apply
module._apply(fn)
File "C:\Users\marco\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 642, in _apply
module._apply(fn)
File "C:\Users\marco\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 665, in _apply
param_applied = fn(param)
File "C:\Users\marco\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 988, in convert
return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
RuntimeError: CUDA error: invalid device ordinal
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
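
The log above shows the segmenter being initialized on `cuda:4`, and "invalid device ordinal" is what CUDA reports when the requested GPU index does not exist on the machine. A minimal sketch of validating the device string before moving a model to it follows. `resolve_device` is a hypothetical helper, not part of Track-Anything, and the fallback policy (first GPU, else CPU) is an assumption.

```python
def resolve_device(requested: str, cuda_device_count: int) -> str:
    """Return a device string that is valid for the given GPU count.

    `requested` is a string such as "cpu", "cuda", or "cuda:4".
    `cuda_device_count` would normally come from torch.cuda.device_count().
    """
    if not requested.startswith("cuda"):
        return requested  # e.g. "cpu": nothing to validate
    if cuda_device_count <= 0:
        return "cpu"  # no usable GPU at all
    # A bare "cuda" (no index) is treated as device 0.
    _, _, index_text = requested.partition(":")
    index = int(index_text) if index_text else 0
    if 0 <= index < cuda_device_count:
        return f"cuda:{index}"
    return "cuda:0"  # requested ordinal does not exist; fall back to the first GPU


# Usage (requires PyTorch):
#   import torch
#   device = resolve_device("cuda:4", torch.cuda.device_count())
```

On a single-GPU machine this maps `cuda:4` to `cuda:0`. Passing an existing device explicitly, in the same way `--device cpu` is passed elsewhere in these reports, may be the simpler fix.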

Any way to get this running on CPU?

Also, is there any way to get the alpha mask of the tracked object?

Where does "inpainter" come from?

Hey guys,

Does the inpainter (https://github.com/gaomingqi/Track-Anything/tree/master/inpainter) originate in this repo, or does it come from another (standalone) project?

thanks.

Google Colab demo?

The Gradio demo seems too slow. Are there any plans for a Google Colab demo?

Suggest specifying metaseg==0.6.1

It seems that the new metaseg==0.7 has some circular dependencies.

Packages & Python Versions

Hello, thanks for the library.
Could you please provide the Python version and a requirements.txt with pinned versions?

I'm getting errors installing the package as it is.

Even better if you could provide a Docker container.

Error

When I click on an area in the picture, an error is reported:

Traceback (most recent call last):
File "H:\Deepfacelab\Deepface\Track-Anything\venv\lib\site-packages\gradio\routes.py", line 395, in run_predict
output = await app.get_blocks().process_api(
File "H:\Deepfacelab\Deepface\Track-Anything\venv\lib\site-packages\gradio\blocks.py", line 1193, in process_api
result = await self.call_function(
File "H:\Deepfacelab\Deepface\Track-Anything\venv\lib\site-packages\gradio\blocks.py", line 916, in call_function
prediction = await anyio.to_thread.run_sync(
File "H:\Deepfacelab\Deepface\Track-Anything\venv\lib\site-packages\anyio\to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "H:\Deepfacelab\Deepface\Track-Anything\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "H:\Deepfacelab\Deepface\Track-Anything\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
result = context.run(func, *args)
File "H:\Deepfacelab\Deepface\Track-Anything\app.py", line 130, in sam_refine
image=video_state["origin_images"][video_state["select_frame_number"]],
TypeError: 'NoneType' object is not subscriptable
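
The traceback ends at `video_state["origin_images"][...]`, which suggests `origin_images` was still `None` when the click arrived, for example because no video had been loaded and extracted yet (an assumption; the report does not say). A minimal sketch of a guard that turns this into a readable error follows. `get_selected_frame` is a hypothetical helper, not a function in `app.py`.

```python
def get_selected_frame(video_state: dict):
    """Return the currently selected frame, or raise a readable error."""
    frames = video_state.get("origin_images")
    if frames is None:
        # Nothing has been loaded into the state yet.
        raise ValueError("No frames loaded: load a video before clicking on the image.")
    index = video_state.get("select_frame_number", 0)
    if not 0 <= index < len(frames):
        raise IndexError(f"Frame {index} is out of range for {len(frames)} frames.")
    return frames[index]
```

A click handler could call this first and show the message to the user instead of letting the `TypeError` propagate.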

Testing on video segmentation datasets

Can this project handle videos composed of image sequences? Does the segmentation map output by SAM contain category information?
The paper presents quantitative results for the model on DAVIS. How can I test on other video datasets, such as YouTube-VOS or YouTube-VIS, to obtain performance metrics on them?

Cannot plot images after installing Track-Anything

Hi, I have successfully installed Track-Anything, and it works mostly as expected.

A minor bug is that I cannot use either cv2.imshow() or plt.show() to plot any images; the code hangs forever.

I created a new virtual environment and tested the same code with only SAM installed, and it works fine. With Track-Anything installed, the same bug occurs.

Could you please look into this issue?

Many thanks!

Error when running on my Mac Studio (Apple M1 Max chip, 32 GB RAM, macOS 13.0 (22A380))

Run output: (supple) sepplu@seppdeMac-Studio Track-Anything % python3 app.py --device cpu
Defaulting to user installation because normal site-packages is not writeable
Looking in links: https://download.openmmlab.com/mmcv/dist/cpu/torch2.0.0/index.html
Requirement already satisfied: mmcv in /Users/sepplu/Library/Python/3.9/lib/python/site-packages (2.0.0)
Requirement already satisfied: addict in /Users/sepplu/Library/Python/3.9/lib/python/site-packages (from mmcv) (2.4.0)
Requirement already satisfied: mmengine>=0.2.0 in /Users/sepplu/Library/Python/3.9/lib/python/site-packages (from mmcv) (0.7.3)
Requirement already satisfied: numpy in /Users/sepplu/Library/Python/3.9/lib/python/site-packages (from mmcv) (1.24.3)
Requirement already satisfied: packaging in /Users/sepplu/Library/Python/3.9/lib/python/site-packages (from mmcv) (23.1)
Requirement already satisfied: Pillow in /Users/sepplu/Library/Python/3.9/lib/python/site-packages (from mmcv) (9.5.0)
Requirement already satisfied: pyyaml in /Users/sepplu/Library/Python/3.9/lib/python/site-packages (from mmcv) (6.0)
Requirement already satisfied: yapf in /Users/sepplu/Library/Python/3.9/lib/python/site-packages (from mmcv) (0.33.0)
Requirement already satisfied: matplotlib in /Users/sepplu/Library/Python/3.9/lib/python/site-packages (from mmengine>=0.2.0->mmcv) (3.7.1)
Requirement already satisfied: rich in /Users/sepplu/Library/Python/3.9/lib/python/site-packages (from mmengine>=0.2.0->mmcv) (13.3.5)
Requirement already satisfied: termcolor in /Users/sepplu/Library/Python/3.9/lib/python/site-packages (from mmengine>=0.2.0->mmcv) (2.3.0)
Requirement already satisfied: opencv-python>=3 in /Users/sepplu/Library/Python/3.9/lib/python/site-packages (from mmengine>=0.2.0->mmcv) (4.7.0.72)
Requirement already satisfied: tomli>=2.0.1 in /Users/sepplu/Library/Python/3.9/lib/python/site-packages (from yapf->mmcv) (2.0.1)
Requirement already satisfied: contourpy>=1.0.1 in /Users/sepplu/Library/Python/3.9/lib/python/site-packages (from matplotlib->mmengine>=0.2.0->mmcv) (1.0.7)
Requirement already satisfied: cycler>=0.10 in /Users/sepplu/Library/Python/3.9/lib/python/site-packages (from matplotlib->mmengine>=0.2.0->mmcv) (0.11.0)
Requirement already satisfied: fonttools>=4.22.0 in /Users/sepplu/Library/Python/3.9/lib/python/site-packages (from matplotlib->mmengine>=0.2.0->mmcv) (4.39.3)
Requirement already satisfied: kiwisolver>=1.0.1 in /Users/sepplu/Library/Python/3.9/lib/python/site-packages (from matplotlib->mmengine>=0.2.0->mmcv) (1.4.4)
Requirement already satisfied: pyparsing>=2.3.1 in /Users/sepplu/Library/Python/3.9/lib/python/site-packages (from matplotlib->mmengine>=0.2.0->mmcv) (3.0.9)
Requirement already satisfied: python-dateutil>=2.7 in /Users/sepplu/Library/Python/3.9/lib/python/site-packages (from matplotlib->mmengine>=0.2.0->mmcv) (2.8.2)
Requirement already satisfied: importlib-resources>=3.2.0 in /Users/sepplu/Library/Python/3.9/lib/python/site-packages (from matplotlib->mmengine>=0.2.0->mmcv) (5.12.0)
Requirement already satisfied: markdown-it-py<3.0.0,>=2.2.0 in /Users/sepplu/Library/Python/3.9/lib/python/site-packages (from rich->mmengine>=0.2.0->mmcv) (2.2.0)
Requirement already satisfied: pygments<3.0.0,>=2.13.0 in /Users/sepplu/Library/Python/3.9/lib/python/site-packages (from rich->mmengine>=0.2.0->mmcv) (2.15.1)
Requirement already satisfied: zipp>=3.1.0 in /Users/sepplu/Library/Python/3.9/lib/python/site-packages (from importlib-resources>=3.2.0->matplotlib->mmengine>=0.2.0->mmcv) (3.15.0)
Requirement already satisfied: mdurl~=0.1 in /Users/sepplu/Library/Python/3.9/lib/python/site-packages (from markdown-it-py<3.0.0,>=2.2.0->rich->mmengine>=0.2.0->mmcv) (0.1.2)
Requirement already satisfied: six>=1.5 in /Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/site-packages (from python-dateutil>=2.7->matplotlib->mmengine>=0.2.0->mmcv) (1.15.0)
Initializing BaseSegmenter to cpu
Traceback (most recent call last):
File "/Users/sepplu/Track-Anything/app.py", line 383, in <module>
model = TrackingAnything(SAM_checkpoint, xmem_checkpoint, e2fgvi_checkpoint,args)
File "/Users/sepplu/Track-Anything/track_anything.py", line 19, in __init__
self.xmem = BaseTracker(self.xmem_checkpoint, device=args.device)
File "/Users/sepplu/Track-Anything/tracker/base_tracker.py", line 32, in __init__
network = XMem(config, xmem_checkpoint).to(device).eval()
File "/Users/sepplu/Track-Anything/tracker/model/network.py", line 24, in __init__
model_weights = self.init_hyperparameters(config, model_path, map_location)
File "/Users/sepplu/Track-Anything/tracker/model/network.py", line 145, in init_hyperparameters
model_weights = torch.load(model_path, map_location=map_location)
File "/Users/sepplu/miniconda3/envs/supple/lib/python3.10/site-packages/torch/serialization.py", line 809, in load
return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
File "/Users/sepplu/miniconda3/envs/supple/lib/python3.10/site-packages/torch/serialization.py", line 1172, in _load
result = unpickler.load()
File "/Users/sepplu/miniconda3/envs/supple/lib/python3.10/site-packages/torch/serialization.py", line 1142, in persistent_load
typed_storage = load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location))
File "/Users/sepplu/miniconda3/envs/supple/lib/python3.10/site-packages/torch/serialization.py", line 1116, in load_tensor
wrap_storage=restore_location(storage, location),
File "/Users/sepplu/miniconda3/envs/supple/lib/python3.10/site-packages/torch/serialization.py", line 217, in default_restore_location
result = fn(storage, location)
File "/Users/sepplu/miniconda3/envs/supple/lib/python3.10/site-packages/torch/serialization.py", line 182, in _cuda_deserialize
device = validate_cuda_device(location)
File "/Users/sepplu/miniconda3/envs/supple/lib/python3.10/site-packages/torch/serialization.py", line 166, in validate_cuda_device
raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
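
The error message itself recommends passing `map_location=torch.device('cpu')` to `torch.load`. A minimal sketch of that remapping follows, assuming the checkpoint was saved from a CUDA device. `choose_map_location` and `load_checkpoint` are hypothetical helpers for illustration, not the repository's code.

```python
def choose_map_location(requested_device: str, cuda_available: bool) -> str:
    """Pick a map_location that exists on this machine."""
    if requested_device.startswith("cuda") and not cuda_available:
        return "cpu"  # CUDA was requested but is not usable here
    return requested_device


def load_checkpoint(path: str, requested_device: str = "cpu"):
    """Load a checkpoint, remapping CUDA-saved tensors when no GPU is present."""
    import torch  # imported lazily so the helper above works without PyTorch

    location = choose_map_location(requested_device, torch.cuda.is_available())
    return torch.load(path, map_location=torch.device(location))
```

In the traceback above, `map_location` is already threaded through to `torch.load`, so the value reaching that call was presumably `None` or a CUDA device. Passing `"cpu"` at that point should avoid the CUDA deserialization path.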