danbider / lightning-pose

Accelerated pose estimation and tracking using semi-supervised convolutional networks.

License: MIT License

Languages: Python 4.88%, Jupyter Notebook 95.12%

Topics: cnn, pose-estimation, pytorch-lightning

lightning-pose's Issues

Question on multi-agent pose

Hello,

Not an issue, so feel free to label as question!

Thanks for releasing lightning-pose. Is there a way to make it work with multi-agent videos, or is the added complexity of handling spatiotemporal constraints on different tracklets the difficult part here?

Query regarding use of Filtered_predictions in LightningPose model

Hi Team,

I would like to know what post-processing approach LightningPose uses. I also have the following questions:

Does LightningPose use a filtered_predictions option in its model as a post-processing step, as the DeepLabCut model does?
If so, is filtering always applied by default (it cannot be disabled), or can this setting be changed in the config file (and if so, where in the config file)?

Many thanks in advance,

run inference on new videos without needing access to original training dataset

Currently the Lightning Pose code requires construction of a dataset/data module in order to load model parameters and perform inference. This requires users to move training datasets around with the model checkpoint, which is not ideal. Consider removing this requirement.

how to understand unlabeled frames vs temporal context frames?

Hi, Lightning Pose team,
As the tutorial mentions, both the base model and the context model make use of unlabeled frames, but the context model also uses temporal context frames.
What is the difference and the relationship between unlabeled frames and temporal context frames? Are temporal context frames derived from unlabeled frames?

dali.base.train.sequence_length - number of unlabeled frames per batch in regression and heatmap models (i.e. “base” models that do not use temporal context frames)

dali.context.train.batch_size - number of unlabeled frames per batch in heatmap_mhcrnn model (i.e. “context” models that utilize temporal context frames); each frame in this batch will be accompanied by context frames, so the true batch size will actually be larger than this number
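
To illustrate the last point ("each frame in this batch will be accompanied by context frames"), here is a minimal sketch of how a run of frames can be expanded with temporal context. It is not the Lightning Pose data pipeline: the function name `context_windows` and the half-width of 2 frames are assumptions made purely for illustration.

```python
from typing import List


def context_windows(num_frames: int, half_width: int = 2) -> List[List[int]]:
    """Return, for every frame index, the indices of its context window.

    Hypothetical helper: indices that fall outside the video are clamped to
    the first or last frame, so every window has 2 * half_width + 1 entries.
    """
    windows = []
    for center in range(num_frames):
        window = [
            min(max(center + offset, 0), num_frames - 1)
            for offset in range(-half_width, half_width + 1)
        ]
        windows.append(window)
    return windows


windows = context_windows(num_frames=4, half_width=2)
# every one of the 4 frames now comes with 4 neighboring frames
frames_processed = sum(len(w) for w in windows)
```

Under these assumptions the network sees more frames than the nominal batch size, which is why the effective batch is larger than the configured number.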

large rotations in imgaug throw errors

Hi @themattinthehatt , I have 5 labeled keypoints per frame.
Thanks for the info that heatmaps are more accurate.
Also, I have noticed that when I use DLC image augmentation with the image rotation augmentation set above 10, the code throws the error below.

Error executing job with overrides: []
Traceback (most recent call last):
File "scripts/train_hydra.py", line 175, in train
trainer.fit(model=model, datamodule=data_module)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 696, in fit
self._call_and_handle_interrupt(
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 650, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 737, in _fit_impl
results = self._run(model, ckpt_path=self.ckpt_path)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1168, in _run
results = self._run_stage()
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1254, in _run_stage
return self._run_train()
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1285, in _run_train
self.fit_loop.run()
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/loops/loop.py", line 200, in run
self.advance(*args, **kwargs)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/loops/fit_loop.py", line 270, in advance
self._outputs = self.epoch_loop.run(self._data_fetcher)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/loops/loop.py", line 200, in run
self.advance(*args, **kwargs)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 203, in advance
batch_output = self.batch_loop.run(kwargs)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/loops/loop.py", line 200, in run
self.advance(*args, **kwargs)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/loops/batch/training_batch_loop.py", line 87, in advance
outputs = self.optimizer_loop.run(optimizers, kwargs)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/loops/loop.py", line 200, in run
self.advance(*args, **kwargs)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 201, in advance
result = self._run_optimization(kwargs, self._optimizers[self.optim_progress.optimizer_position])
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 248, in _run_optimization
self._optimizer_step(optimizer, opt_idx, kwargs.get("batch_idx", 0), closure)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 358, in _optimizer_step
self.trainer._call_lightning_module_hook(
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1552, in _call_lightning_module_hook
output = fn(*args, **kwargs)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/core/module.py", line 1673, in optimizer_step
optimizer.step(closure=optimizer_closure)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/core/optimizer.py", line 168, in step
step_output = self._strategy.optimizer_step(self._optimizer, self._optimizer_idx, closure, **kwargs)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/strategies/strategy.py", line 216, in optimizer_step
return self.precision_plugin.optimizer_step(model, optimizer, opt_idx, closure, **kwargs)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/plugins/precision/precision_plugin.py", line 153, in optimizer_step
return optimizer.step(closure=closure, **kwargs)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/torch/optim/lr_scheduler.py", line 65, in wrapper
return wrapped(*args, **kwargs)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/torch/optim/optimizer.py", line 113, in wrapper
return func(*args, **kwargs)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/torch/optim/adam.py", line 118, in step
loss = closure()
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/plugins/precision/precision_plugin.py", line 138, in _wrap_closure
closure_result = closure()
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 146, in __call__
self._result = self.closure(*args, **kwargs)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 132, in closure
step_output = self._step_fn()
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 407, in _training_step
training_step_output = self.trainer._call_strategy_hook("training_step", *kwargs.values())
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1706, in _call_strategy_hook
output = fn(*args, **kwargs)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/strategies/strategy.py", line 358, in training_step
return self.model.training_step(*args, **kwargs)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/typeguard/__init__.py", line 1033, in wrapper
retval = func(*args, **kwargs)
File "/home/walthamadmin/notebooks/projects/lightning-pose/lightning_pose/models/base.py", line 347, in training_step
loss = self.evaluate_labeled(train_batch, "train")
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/typeguard/__init__.py", line 1033, in wrapper
retval = func(*args, **kwargs)
File "/home/walthamadmin/notebooks/projects/lightning-pose/lightning_pose/models/base.py", line 321, in evaluate_labeled
data_dict = self.get_loss_inputs_labeled(batch_dict=batch_dict)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/typeguard/__init__.py", line 1033, in wrapper
retval = func(*args, **kwargs)
File "/home/walthamadmin/notebooks/projects/lightning-pose/lightning_pose/models/heatmap_tracker.py", line 233, in get_loss_inputs_labeled
predicted_keypoints, confidence = self.run_subpixelmaxima(predicted_heatmaps)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/typeguard/__init__.py", line 1033, in wrapper
retval = func(*args, **kwargs)
File "/home/walthamadmin/notebooks/projects/lightning-pose/lightning_pose/models/heatmap_tracker.py", line 143, in run_subpixelmaxima
confidences = evaluate_heatmaps_at_location(heatmaps=softmaxes, locs=preds)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/typeguard/__init__.py", line 1033, in wrapper
retval = func(*args, **kwargs)
File "/home/walthamadmin/notebooks/projects/lightning-pose/lightning_pose/data/utils.py", line 333, in evaluate_heatmaps_at_location
heatmaps_padded[i, j, k_offset, l_offset].squeeze(-1).squeeze(-1)
IndexError: index -9223372036854775808 is out of bounds for dimension 2 with size 388

Kindly check whether there is a bug and help correct it.

Originally posted by @prateekdhawalia in #56 (comment)
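
A note on the final line of the traceback: the index -9223372036854775808 equals -2**63, the smallest 64-bit signed integer. One common way to end up with this value is converting a NaN or infinite coordinate to an integer index. Whether that is what happens here is an assumption; the traceback alone does not confirm it. A minimal sketch of a guard, with the hypothetical helper name `safe_index`:

```python
import math
from typing import Optional

INT64_MIN = -2**63  # -9223372036854775808, the value in the IndexError above


def safe_index(coord: float, size: int) -> Optional[int]:
    """Convert a coordinate to a valid index, or None if it cannot be used.

    Hypothetical helper: rejects NaN/inf and anything outside [0, size).
    """
    if not math.isfinite(coord):
        return None
    idx = int(round(coord))
    if idx < 0 or idx >= size:
        return None
    return idx
```

A check of this kind, placed before the heatmap is indexed, would turn the crash into a case that can be skipped or logged.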

any manual refining and re-training function in lightning pose?

Hi, Lightning Pose team,
Is there a way to manually refine predictions and re-train in Lightning Pose if I am unsatisfied with the predictions?

Create a bi-directional converter between Lightning Pose and Label Studio.

Please review the Label Studio setup instructions.

We need a method to convert annotation data between the Lightning Pose format and the Label Studio format.

Lightning Pose annotation uses:

ToyMouseRunningData PNGs
the PNGs are annotated in a CSV file

Label Studio can import pre-annotated data
JSON-MIN version

[
  {
    "img": "/data/upload/1/18928a62-img1.png",
    "id": 1,
    "kp-1": [
      {
        "x": 96.71717171717172,
        "y": 7.389162561576355,
        "width": 0.5050505050505051,
        "keypointlabels": [
          "Nose"
        ],
        "original_width": 396,
        "original_height": 406
      },
      {
        "x": 92.17171717171718,
        "y": 5.41871921182266,
        "width": 0.5050505050505051,
        "keypointlabels": [
          "Face"
        ],
        "original_width": 396,
        "original_height": 406
      }
    ],
    "annotator": 1,
    "annotation_id": 1,
    "created_at": "2022-07-06T12:49:47.101659Z",
    "updated_at": "2022-07-06T12:49:47.101700Z",
    "lead_time": 5.507
  }
]

TODO:

Convert toy_datasets/toymouseRunningData/CollectedData_.csv to the Label Studio pre-annotated format for import.

@kathleenislee https://github.com/robert-s-lee/lightning-pose/blob/label-studio/tests/utils/csv_to_label_studio.py has sample code that reads the CSV to help get started. The script still needs to export in JSON-MIN format.

Convert Label Studio pre-annotated JSON to the toy_datasets/toymouseRunningData/CollectedData_.csv format.
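
As a starting point for both TODO items, here is a minimal sketch of the coordinate conversion only. It assumes that `x` and `y` in the JSON-MIN sample are percentages of `original_width` and `original_height`; the sample values are consistent with that (96.717...% of 396 is pixel 383, and 7.389...% of 406 is pixel 30). The function names are hypothetical, the keys `img` and `kp-1` are copied from the sample and will depend on the labeling config, and reading or writing the CSV itself is left out.

```python
from typing import Dict, Tuple

Keypoints = Dict[str, Tuple[float, float]]  # label -> (x, y) in pixels


def to_label_studio(img: str, task_id: int, keypoints: Keypoints,
                    width: int, height: int) -> dict:
    """Build one JSON-MIN style entry; x and y become percentages of image size."""
    return {
        "img": img,
        "id": task_id,
        "kp-1": [
            {
                "x": 100.0 * x / width,
                "y": 100.0 * y / height,
                "keypointlabels": [label],
                "original_width": width,
                "original_height": height,
            }
            for label, (x, y) in keypoints.items()
        ],
    }


def from_label_studio(entry: dict) -> Keypoints:
    """Invert to_label_studio: recover pixel coordinates from percentages."""
    return {
        kp["keypointlabels"][0]: (
            kp["x"] * kp["original_width"] / 100.0,
            kp["y"] * kp["original_height"] / 100.0,
        )
        for kp in entry["kp-1"]
    }


entry = to_label_studio("img1.png", 1, {"Nose": (383.0, 30.0)}, 396, 406)
pixels = from_label_studio(entry)
```

Fields such as the marker `width`, `annotator`, and the timestamps from the sample are omitted here and would need to be added if the import requires them.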

Compatible with Windows?

Hi, do you have plans to provide a Windows-compatible installation option? The installation instructions specify Linux compatibility only and I have now run out of credits for further use of the cloud version. Thanks in advance.

Inference speed

Hi Team,

Cool paper.

This is a question not an issue: I'm considering trying out lightning pose on my data but was hoping you could provide some info on inference speed before I try it out. I didn't see anything in the paper. About how many frames/second are you able to transcribe after training? Do you have any benchmarks I missed?

Thanks,

How many labeled frames should we have?

Hi, Lightning Pose team,
In your preprint, Figure 4 ("Unlabeled frames improve pose estimation (raw network predictions)"), panels 4C and 4D show that with 75 labeled frames the semi-supervised context model performs best. With 631 labeled frames, however, the different models (DLC, baseline, semi-supervised, semi-supervised context) seem to perform very similarly, and all do better than with 75 labeled frames.
So how many frames should I extract and label at the very beginning? As few as tens, or as many as hundreds?

Running Lightning Pose ends with "Killed"

Hi Team,

When I try to run Lightning Pose with animal data on the NeSI platform, it ends with "Killed".

NeSI platform (https://support.nesi.org.nz/hc/en-gb)

Why is this happening? Please help.

Thanks in advance :)

pytest errors for semi-supervised models

Performed a fresh install of lightning-pose, received the following error for all semi-supervised model tests:

    def get_loss_inputs_unlabeled(self, batch_dict: UnlabeledBatchDict) -> Dict:
        """Return predicted heatmaps and their softmaxes (estimated keypoints)."""
        predicted_keypoints = self.forward(batch_dict["frames"])
        # undo augmentation if needed
>       if batch_dict["transforms"].shape[-1] == 3:
E       IndexError: tuple index out of range

lightning_pose/models/regression_tracker.py:198: IndexError
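
`IndexError: tuple index out of range` on `.shape[-1]` means the shape is an empty tuple, i.e. the value stored under `"transforms"` has zero dimensions. A minimal sketch of a guard follows, using a plain tuple to stand in for the tensor shape. The helper name `has_affine_transform` is hypothetical, and this is not the project's actual fix.

```python
from typing import Tuple


def has_affine_transform(shape: Tuple[int, ...]) -> bool:
    """True if the last dimension is 3, as the failing check expects.

    Hypothetical helper: an empty shape (a 0-dim placeholder) returns False
    instead of raising IndexError.
    """
    return len(shape) > 0 and shape[-1] == 3
```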

Pose-app crashed at the last moment of training.

Hi, every time I run training via the Pose-app GUI, it crashes at the last moment before finishing, as shown in the attached figure. The trained model is kept and can be used to predict new videos. However, I do not know how the crash affects the trained model. Do you have any suggestions? Thank you!

Set different temporal loss parameters for each body part

Hi lightning-pose team,

I have multiple body parts labeled. Some of them have a very limited range of movement (about 1/5 of the video width), while others move over a much larger range (nearly the full width of the video). I tried using the same epsilon for all of the body parts, but it does not seem to work well. Can I set different temporal loss parameters for each body part?

Or do you have any suggestions on how I can adjust the parameters?

Another alternative I am considering is to train two models for the body parts. But it would definitely be easier for me to label once, train once, and infer once.

Thanks! Appreciate your reply!

Best,
Nora
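
For reference, the config dump in the "Running training on toy dataset fails" issue on this page shows `temporal` with an `epsilon` list of 17 values alongside `num_keypoints: 17`, which appears to be one tolerance per keypoint. The sketch below shows the idea of a per-keypoint tolerance in plain Python. It is not the Lightning Pose loss: the function name `temporal_penalty` and the exact penalty form (jump minus epsilon, floored at zero) are assumptions for illustration.

```python
import math
from typing import List, Sequence, Tuple

Frame = Sequence[Tuple[float, float]]  # one (x, y) per keypoint


def temporal_penalty(frames: Sequence[Frame], epsilon: Sequence[float]) -> List[float]:
    """Sum, per keypoint, of frame-to-frame jumps that exceed that keypoint's epsilon."""
    penalties = [0.0] * len(epsilon)
    for prev, curr in zip(frames, frames[1:]):
        for k, ((x0, y0), (x1, y1)) in enumerate(zip(prev, curr)):
            jump = math.hypot(x1 - x0, y1 - y0)
            penalties[k] += max(0.0, jump - epsilon[k])
    return penalties


# keypoint 0 moves 5 px between frames, keypoint 1 moves 30 px
frames = [[(10.0, 10.0), (0.0, 0.0)], [(13.0, 14.0), (30.0, 0.0)]]
shared = temporal_penalty(frames, epsilon=[2.0, 2.0])    # one epsilon for all keypoints
per_kp = temporal_penalty(frames, epsilon=[2.0, 40.0])   # larger tolerance for keypoint 1
```

With a shared epsilon the fast-moving keypoint is penalized heavily; with its own, larger epsilon it is not, while the slow keypoint is treated the same in both cases.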

Move backbone model initialization to backbones.py

Description:

Currently, the initialization of all backbone models, such as ResNet 50, 101, ViT, and EffNet, is hardcoded in the models/base.py file. This approach lacks modularity and can lead to code duplication. To improve the code structure and maintainability, it is proposed to refactor the code by moving the backbone model initialization logic to a dedicated folder called 'backbones'. Each backbone model will have its own file, for example, resnets.py. Additionally, a build_backbone function will be created to handle the initialization of the backbones based on the provided configuration.

Motivation:

The current implementation suffers from several drawbacks. Firstly, having all backbone model initializations hardcoded in a single file makes it challenging to locate and modify specific backbone configurations. Secondly, it leads to code duplication if multiple files require the same backbone model. This lack of modularity can hinder the scalability and maintainability of the codebase.

Proposed Solution:

Create a new folder called 'backbones' in the project directory.
Inside the 'backbones' folder, create a separate Python file for each backbone architecture model, such as resnets.py, vits.py, etc.
Move the corresponding backbone model initialization code from the base.py file to their respective backbone files in the 'backbones' folder.
Define a function called build_backbone in the base.py file, which takes the configuration as input and returns the initialized backbone model.
In the build_backbone function, based on the provided configuration, dynamically import the appropriate backbone file from the 'backbones' folder and initialize the corresponding backbone model.
Update the relevant parts of the codebase to call the build_backbone function instead of the hardcoded backbone initializations.
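
The steps above could be sketched roughly as follows. This is an illustration, not code from the repository: it swaps the dynamic import described above for an explicit registry, the config key `"backbone"` is hypothetical, and the builder returns a placeholder dict where a real one would construct the network.

```python
from typing import Callable, Dict

# backbone name -> builder function
_BACKBONES: Dict[str, Callable[..., object]] = {}


def register_backbone(name: str):
    """Decorator each backbone file (e.g. resnets.py) could use to register a builder."""
    def decorator(fn):
        _BACKBONES[name] = fn
        return fn
    return decorator


@register_backbone("resnet50")
def _build_resnet50(pretrained: bool = True):
    # placeholder: a real builder would construct and return the network
    return {"arch": "resnet50", "pretrained": pretrained}


def build_backbone(cfg: dict):
    """Look up cfg["backbone"] and call the matching builder."""
    name = cfg["backbone"]
    if name not in _BACKBONES:
        raise ValueError(f"unknown backbone {name!r}; choose from {sorted(_BACKBONES)}")
    return _BACKBONES[name](pretrained=cfg.get("pretrained", True))


backbone = build_backbone({"backbone": "resnet50", "pretrained": False})
```

A registry keeps the lookup explicit and gives a clear error for unknown names; a dynamic import would avoid having to import every backbone file up front.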

Benefits:

Improved modularity: By separating each backbone model into its own file, it becomes easier to locate and modify specific backbone configurations without affecting unrelated code.
Code reuse: The separation of backbone models into individual files eliminates code duplication, enabling multiple files to utilize the same backbone model initialization.
Maintainability: The proposed folder structure and build_backbone function enhance the overall maintainability of the codebase, making it easier to add, modify, or remove backbone models in the future.
Readability: The new structure improves code readability by isolating backbone model logic into dedicated files, enhancing code comprehension, and reducing cognitive load.

Please let me know if you need any further clarification or have any questions regarding the proposed refactor.

multiple csv files

Hi, Lightning Pose team,
Lightning Pose is a great tool, and it is very important to my project.
I used DeepLabCut before, and now I have multiple videos with multiple CSV files.
How should I proceed with Lightning Pose?
Should I merge all the CSV files into one, or should I train one model per CSV file?
I look forward to your suggestions.

How to use multiple GPUs with lightning-pose?

Hi, Lightning Pose team,
I am very interested in this super cool tool.
How can I train on multiple GPUs with Lightning Pose?
I can't find multi-GPU information in the tutorial.

Running training on toy dataset fails

Hello,
I tried running training on the toy dataset using the default Hydra script, and it fails when the loss is set to pca_singleview/pca_multiview, with the following stack trace.
Kindly help in resolving this.

scripts/train_hydra.py:22: UserWarning:
The version_base parameter is not specified.
Please specify a compatability version level, or None.
Will assume defaults for version 1.1
@hydra.main(config_path="configs", config_name="config")
/anaconda/envs/lightning-pose/lib/python3.8/site-packages/hydra/_internal/hydra.py:119: UserWarning: Future Hydra versions will no longer change working directory at job runtime by default.
See https://hydra.cc/docs/next/upgrades/1.1_to_1.2/changes_to_job_working_dir/ for more information.
ret = run_job(
Our Hydra config file:

training parameters

train_batch_size: 16
val_batch_size: 16
test_batch_size: 16
train_prob: 0.8
val_prob: 0.1
train_frames: 1
num_gpus: 0
num_workers: 4
early_stop_patience: 3
unfreezing_epoch: 25
dropout_rate: 0.1
min_epochs: 100
max_epochs: 500
log_every_n_steps: 1
check_val_every_n_epoch: 10
gpu_id: 0
unlabeled_sequence_length: 16
rng_seed_data_pt: 42
rng_seed_data_dali: 43
rng_seed_model_pt: 44
limit_train_batches: 10
multiple_trainloader_mode: max_size_cycle
profiler: simple
accumulate_grad_batches: 2
lr_scheduler: multisteplr
lr_scheduler_params: {'multisteplr': {'milestones': [100, 200, 300], 'gamma': 0.5}}

losses parameters

pca_multiview: {'log_weight': 7.0, 'components_to_keep': 3, 'empirical_epsilon_percentile': 1.0, 'empirical_epsilon_multiplier': 1.0, 'epsilon': None, 'error_metric': 'reprojection_error'}
pca_singleview: {'log_weight': 7.25, 'components_to_keep': 0.99, 'empirical_epsilon_percentile': 1.0, 'empirical_epsilon_multiplier': 1.0, 'epsilon': None, 'error_metric': 'reprojection_error'}
temporal: {'log_weight': 7.5, 'epsilon': [12.9, 11.3, 10.5, 12.0, 5.0, 7.3, 0.7, 61.8, 11.2, 9.9, 9.7, 10.1, 4.8, 4.9, 1.0, 19.2, 6.8]}
unimodal_mse: {'log_weight': 6.5, 'prob_threshold': 0.0}
unimodal_kl: {'log_weight': 6.5, 'prob_threshold': 0.0}

data parameters

image_orig_dims: {'width': 396, 'height': 406}
image_resize_dims: {'width': 256, 'height': 256}
data_dir: toy_datasets/toymouseRunningData
video_dir: unlabeled_videos
csv_file: CollectedData_.csv
header_rows: [1, 2]
downsample_factor: 2
num_keypoints: 17
mirrored_column_matches: [[0, 1, 2, 3, 4, 5, 6], [8, 9, 10, 11, 12, 13, 14]]
columns_for_singleview_pca: [0, 1, 2, 3, 4, 5, 6, 8, 9, 10, 11, 12, 13, 14]

model parameters

losses_to_use: ['pca_singleview']
learn_weights: False
resnet_version: 50
model_type: heatmap
heatmap_loss_type: mse
model_name: my_base_toy_model

callbacks parameters

anneal_weight: {'attr_name': 'total_unsupervised_importance', 'init_val': 0.0, 'increase_factor': 0.01, 'final_val': 1.0, 'freeze_until_epoch': 0}

/anaconda/envs/lightning-pose/lib/python3.8/site-packages/torch/functional.py:478: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:2895.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
Number of labeled images in the full dataset (train+val+test): 90
Size of -- train set: 72, val set: 9, test set: 9
Warning: the argument {farg[0]} shadows a Pipeline constructor argument of the same name.
[/opt/dali/dali/operators/reader/loader/video_loader.h:178] file_list_include_preceding_frame is set to False (or not set at all). In future releases, the default behavior would be changed to True.
[/opt/dali/dali/operators/reader/nvdecoder/nvdecoder.cc:80] Warning: Decoding on a default stream. Performance may be affected.
Results of running PCA (pca_singleview) on keypoints:
Kept 13/28 components, and found:
Explained variance ratio: [0.315 0.242 0.209 0.073 0.048 0.034 0.021 0.015 0.01 0.007 0.007 0.005
0.004 0.003 0.002 0.001 0.001 0.001 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. ]
Variance explained by 13 components: 0.991
/home/walthamadmin/notebooks/projects/lightning-pose/lightning_pose/losses/losses.py:326: UserWarning: Using empirical epsilon=0.194 * multiplier=1.000 -> total=0.194 for pca_singleview loss
warnings.warn(
/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/core/lightning.py:22: LightningDeprecationWarning: pytorch_lightning.core.lightning.LightningModule has been deprecated in v1.7 and will be removed in v1.9. Use the equivalent class from the pytorch_lightning.core.module.LightningModule class instead.
rank_zero_deprecation(

Initializing a SemiSupervisedHeatmapTracker instance.
/anaconda/envs/lightning-pose/lib/python3.8/site-packages/torchvision/models/_utils.py:135: UserWarning: Using 'weights' as positional parameter(s) is deprecated since 0.13 and will be removed in 0.15. Please use keyword parameter(s) instead.
warnings.warn(
/anaconda/envs/lightning-pose/lib/python3.8/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or None for 'weights' are deprecated since 0.13 and will be removed in 0.15. The current behavior is equivalent to passing weights=ResNet50_Weights.IMAGENET1K_V1. You can also use weights=ResNet50_Weights.DEFAULT to get the most up-to-date weights.
warnings.warn(msg)
/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:446: LightningDeprecationWarning: Setting Trainer(gpus=[0]) is deprecated in v1.7 and will be removed in v2.0. Please use Trainer(accelerator='gpu', devices=[0]) instead.
rank_zero_deprecation(
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/trainer/configuration_validator.py:285: LightningDeprecationWarning: The Callback.on_epoch_start hook was deprecated in v1.6 and will be removed in v1.8. Please use Callback.on_<train/validation/test>_epoch_start instead.
rank_zero_deprecation(
Missing logger folder: tb_logs/my_base_toy_model
Number of labeled images in the full dataset (train+val+test): 90
Size of -- train set: 72, val set: 9, test set: 9
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

| Name | Type | Params

0 | backbone | Sequential | 23.5 M
1 | loss_factory | LossFactory | 0
2 | upsampling_layers | Sequential | 81.0 K
3 | rmse_loss | RegressionRMSELoss | 0
4 | loss_factory_unsup | LossFactory | 0

134 K Trainable params
23.5 M Non-trainable params
23.6 M Total params
94.356 Total estimated model params size (MB)
/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:219: PossibleUserWarning: The dataloader, train_dataloader, does not have many workers which may be a bottleneck. Consider increasing the value of the num_workers argument (try 6 which is the number of cpus on this machine) in the DataLoader init to improve performance.
rank_zero_warn(
Epoch 0: 0%| | 0/10 [00:00<?, ?it/s]/home/walthamadmin/notebooks/projects/lightning-pose/lightning_pose/data/dali.py:103: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
return torch.tensor(
Error executing job with overrides: []
Traceback (most recent call last):
File "scripts/train_hydra.py", line 110, in train
trainer.fit(model=model, datamodule=data_module)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 696, in fit
self._call_and_handle_interrupt(
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 650, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 737, in _fit_impl
results = self._run(model, ckpt_path=self.ckpt_path)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1168, in _run
results = self._run_stage()
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1254, in _run_stage
return self._run_train()
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1285, in _run_train
self.fit_loop.run()
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/loops/loop.py", line 200, in run
self.advance(*args, **kwargs)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/loops/fit_loop.py", line 270, in advance
self._outputs = self.epoch_loop.run(self._data_fetcher)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/loops/loop.py", line 200, in run
self.advance(*args, **kwargs)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 203, in advance
batch_output = self.batch_loop.run(kwargs)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/loops/loop.py", line 200, in run
self.advance(*args, **kwargs)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/loops/batch/training_batch_loop.py", line 87, in advance
outputs = self.optimizer_loop.run(optimizers, kwargs)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/loops/loop.py", line 200, in run
self.advance(*args, **kwargs)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 201, in advance
result = self._run_optimization(kwargs, self._optimizers[self.optim_progress.optimizer_position])
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 240, in _run_optimization
closure()
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 146, in __call__
self._result = self.closure(*args, **kwargs)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 141, in closure
self._backward_fn(step_output.closure_loss)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 304, in backward_fn
self.trainer._call_strategy_hook("backward", loss, optimizer, opt_idx)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1706, in _call_strategy_hook
output = fn(*args, **kwargs)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/strategies/strategy.py", line 191, in backward
self.precision_plugin.backward(self.lightning_module, closure_loss, optimizer, optimizer_idx, *args, **kwargs)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/plugins/precision/precision_plugin.py", line 80, in backward
model.backward(closure_loss, optimizer, optimizer_idx, *args, **kwargs)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/pytorch_lightning/core/module.py", line 1418, in backward
loss.backward(*args, **kwargs)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/torch/_tensor.py", line 396, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
File "/anaconda/envs/lightning-pose/lib/python3.8/site-packages/torch/autograd/__init__.py", line 173, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [16, 14]], which is output 0 of LinalgVectorNormBackward0, is at version 1; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
Epoch 0: 0%|

Feature: add support for multiple (non-fused) camera views

The current package only supports multiview setups that have fused views across cameras into a single frame. This does not scale well past 2-4 views.

Enabling learn_weights in model_params.yaml throws errors.

Enabling learn_weights in model_params.yaml throws errors
learn_weights: True

Variable naming issue in factory.py

TypeError: entry_points() got an unexpected keyword argument 'group'

Hello, I am trying to install lightning-pose as outlined here: https://lightning-pose.readthedocs.io/en/latest/source/installation.html

I've created a conda environment in Python 3.8, installed lightning_pose from git, ran python -c "import lightning_pose" successfully, installed the dependencies with a success message, but when I try pytest I get the following error:

    sys.exit(console_main())
  File "/home/ubuntu/.local/lib/python3.8/site-packages/_pytest/config/__init__.py", line 198, in console_main
    code = main()
  File "/home/ubuntu/.local/lib/python3.8/site-packages/_pytest/config/__init__.py", line 156, in main
    config = _prepareconfig(args, plugins)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/_pytest/config/__init__.py", line 338, in _prepareconfig
    config = pluginmanager.hook.pytest_cmdline_parse(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/pluggy/_hooks.py", line 501, in __call__
    return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/pluggy/_manager.py", line 119, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/pluggy/_callers.py", line 138, in _multicall
    raise exception.with_traceback(exception.__traceback__)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/pluggy/_callers.py", line 121, in _multicall
    teardown.throw(exception)  # type: ignore[union-attr]
  File "/home/ubuntu/.local/lib/python3.8/site-packages/_pytest/helpconfig.py", line 105, in pytest_cmdline_parse
    config = yield
  File "/home/ubuntu/.local/lib/python3.8/site-packages/pluggy/_callers.py", line 102, in _multicall
    res = hook_impl.function(*args)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/_pytest/config/__init__.py", line 1096, in pytest_cmdline_parse
    self.parse(args)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/_pytest/config/__init__.py", line 1449, in parse
    self._preparse(args, addopts=addopts)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/_pytest/config/__init__.py", line 1326, in _preparse
    self.pluginmanager.load_setuptools_entrypoints("pytest11")
  File "/home/ubuntu/.local/lib/python3.8/site-packages/pluggy/_manager.py", line 414, in load_setuptools_entrypoints
    plugin = ep.load()
  File "/usr/lib/python3.8/importlib/metadata.py", line 77, in load
    module = import_module(match.group('module'))
  File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 961, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "/home/ubuntu/.local/lib/python3.8/site-packages/_pytest/assertion/rewrite.py", line 178, in exec_module
    exec(co, module.__dict__)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torchtyping/__init__.py", line 11, in <module>
    from .typechecker import patch_typeguard
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "/home/ubuntu/.local/lib/python3.8/site-packages/_pytest/assertion/rewrite.py", line 178, in exec_module
    exec(co, module.__dict__)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torchtyping/typechecker.py", line 4, in <module>
    import typeguard
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "/home/ubuntu/.local/lib/python3.8/site-packages/_pytest/assertion/rewrite.py", line 178, in exec_module
    exec(co, module.__dict__)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/typeguard/__init__.py", line 48, in <module>
    load_plugins()
  File "/home/ubuntu/.local/lib/python3.8/site-packages/typeguard/_checkers.py", line 874, in load_plugins
    for ep in entry_points(group="typeguard.checker_lookup"):
TypeError: entry_points() got an unexpected keyword argument 'group'

Based on related issues 1, 2, 3, 4, it seems like there's some incompatibility with the version of importlib-metadata. I've tried manually installing several different versions and still get the same error. Can you recommend a particular version that is compatible?
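
The `group=` keyword is the crux of the error above. In the standard library, `importlib.metadata.entry_points()` on Python 3.8 and 3.9 takes no arguments and returns a dict keyed by group name; selection by group arrived in Python 3.10 and in newer releases of the `importlib_metadata` backport. A minimal, version-tolerant sketch of the lookup follows. `iter_entry_points` is a hypothetical helper name used for illustration, not something typeguard or lightning-pose provides.

```python
from importlib import metadata


def iter_entry_points(group):
    """Return the entry points registered under `group` as a list.

    Hypothetical helper, for illustration only.
    """
    eps = metadata.entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: the result supports selection by group.
        return list(eps.select(group=group))
    # Python 3.8 / 3.9: the result is a plain dict {group: entry points}.
    return list(eps.get(group, ()))


if __name__ == "__main__":
    # List the pytest plugins visible to this interpreter.
    for ep in iter_entry_points("pytest11"):
        print(ep.name, "->", ep.value)
```

Running this with the same interpreter that launches pytest also shows which site-packages directory the plugins are actually loaded from.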

After following all the directions on the lightning pose installation page (I'm on Ubuntu 20.04 so I skipped installing fiftyone-db-ubuntu2204), I get the following from conda list:

#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       2_gnu    conda-forge
attrs                     23.2.0                   pypi_0    pypi
blinker                   1.7.0                    pypi_0    pypi
bzip2                     1.0.8                hd590300_5    conda-forge
ca-certificates           2024.2.2             hbcca054_0    conda-forge
certifi                   2024.2.2                 pypi_0    pypi
charset-normalizer        3.3.2                    pypi_0    pypi
click                     8.1.7                    pypi_0    pypi
idna                      3.6                      pypi_0    pypi
importlib-metadata        7.1.0                    pypi_0    pypi
jsonschema                4.21.1                   pypi_0    pypi
jsonschema-specifications 2023.12.1                pypi_0    pypi
ld_impl_linux-64          2.40                 h41732ed_0    conda-forge
libffi                    3.4.2                h7f98852_5    conda-forge
libgcc-ng                 13.2.0               h807b86a_5    conda-forge
libgomp                   13.2.0               h807b86a_5    conda-forge
libnsl                    2.0.1                hd590300_0    conda-forge
libsqlite                 3.45.2               h2797004_0    conda-forge
libuuid                   2.38.1               h0b41bf4_0    conda-forge
libxcrypt                 4.4.36               hd590300_1    conda-forge
libzlib                   1.2.13               hd590300_5    conda-forge
lightning-pose            1.1.0                     dev_0    <develop>
markdown                  3.6                      pypi_0    pypi
ncurses                   6.4.20240210         h59595ed_0    conda-forge
nvidia-dali-cuda110       1.34.0                   pypi_0    pypi
oauthlib                  3.2.2                    pypi_0    pypi
openssl                   3.2.1                hd590300_1    conda-forge
pip                       24.0               pyhd8ed1ab_0    conda-forge
pkgutil-resolve-name      1.3.10                   pypi_0    pypi
pyasn1                    0.5.1                    pypi_0    pypi
pyasn1-modules            0.3.0                    pypi_0    pypi
python                    3.8.19          hd12c33a_0_cpython    conda-forge
pyyaml                    6.0.1                    pypi_0    pypi
readline                  8.2                  h8228510_1    conda-forge
referencing               0.34.0                   pypi_0    pypi
requests                  2.31.0                   pypi_0    pypi
rpds-py                   0.18.0                   pypi_0    pypi
setuptools                69.2.0             pyhd8ed1ab_0    conda-forge
six                       1.16.0                   pypi_0    pypi
tk                        8.6.13          noxft_h4845f30_101    conda-forge
urllib3                   1.26.18                  pypi_0    pypi
wheel                     0.42.0             pyhd8ed1ab_0    conda-forge
xz                        5.2.6                h166bdaf_0    conda-forge
zipp                      3.18.1                   pypi_0    pypi

Thanks for any suggestions you have.

dependency on fixed image size

Hi there, amazing work. I am wondering how vital the dependency on fixed image sizes is for training your setup. Reading the code, it seems you currently require a fixed image size, but DALI's transform module should be able to scale dynamically.

Best

Jan

scripts/predict_new_vids.py should save loss metrics as csv

Hello,

Thank you so much for this amazing module/library. It's really well written and it has very informative documentation.

(This issue should really be a pull request, but I'm quite new to GitHub and I don't have much time right now. Sorry!)

State

I successfully used train_hydra.py to train the model on my dataset.

Problem

I would now like to use predict_new_vids.py to do inference on some new videos coming from an additional dataset. I would like to use the pca_singleview_error and the temporal_norm to select some frames from these videos to manually relabel. The problem is that predict_new_vids.py, as written, outputs only the likelihood, which is not very informative.

Solution

I modified this predict_new_vids.py so that it outputs the estimated keypoints, the pca_singleview_error, and the temporal_norm in 3 separate csv files as train_hydra.py does during its "predict" phase.

from lightning_pose.utils.scripts import (
    compute_metrics,
    export_predictions_and_labeled_video,
    get_data_module,
    get_dataset,
    get_imgaug_transform,
)

...

@typechecked
class VideoPredPathHandler:

    # ...

    def build_pred_file_basename(self, extra_str="") -> str:
        # return "%s_%s%s%s.csv" % (
        #     self.video_basename,
        #     self.model_cfg.model.model_type,
        #     self.loss_str,
        #     extra_str,
        # )
        return f"{self.video_basename}.csv"

...

@hydra.main(config_path="configs", config_name="config_mirror-mouse-example")
def predict_videos_in_dir(cfg: DictConfig):

    # ...

    for _, hydra_relative_path in enumerate(cfg.eval.hydra_paths):

        # ...

        for video_file in video_files:

            # ...

            print(f"\n\n{prediction_csv_file = }\n\n")

            export_predictions_and_labeled_video(
                video_file=video_file,
                cfg=cfg,
                ckpt_file=ckpt_file,
                prediction_csv_file=prediction_csv_file,
                labeled_mp4_file=labeled_mp4_file,
                trainer=trainer,
                model=model,
                data_module=data_module,
                save_heatmaps=cfg.eval.get(
                    "predict_vids_after_training_save_heatmaps", False
                ),
            )
            # compute and save various metrics
            try:
                compute_metrics(
                    cfg=cfg,
                    preds_file=prediction_csv_file,
                    data_module=data_module,
                )
            except Exception as e:
                print(f"Error predicting on video {video_file}:\n{e}")
                continue

if __name__ == "__main__":
    predict_videos_in_dir()
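
Once per-frame metric files like these exist, picking frames to relabel could be sketched as below. The CSV layout (one row per frame, first column the frame index, one column per keypoint) and the helper name `worst_frames` are assumptions for illustration; the files written by compute_metrics may be laid out differently.

```python
import csv
import io


def worst_frames(csv_text, n):
    """Return the n frame indices with the largest per-frame maximum metric.

    Assumed layout: a header row, then one row per frame where the first
    column is the frame index and the remaining columns hold one metric
    value per keypoint.
    """
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row
    scored = []
    for row in reader:
        frame = int(row[0])
        values = [float(v) for v in row[1:] if v != ""]
        if values:
            scored.append((max(values), frame))
    scored.sort(reverse=True)  # largest error first
    return [frame for _, frame in scored[:n]]


# Made-up example data, not real model output.
example = """frame,snout,paw
0,1.0,2.0
1,9.5,0.5
2,3.0,4.0
"""
print(worst_frames(example, 2))  # frames 1 and 2 have the largest errors
```

Ranking by the per-frame maximum flags a frame as soon as one keypoint is badly off; ranking by the mean instead would favor frames where the whole pose is poor.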

Trying to add custom-trained YOLOv8n-pose model

First of all, thank you for such well-written code and documentation. Everything was easy to read and understand, and the code was organized very nicely. As a person with no computer science degree, I very much appreciate this, especially since you don't often see it from people in this field.

What I have tried to do is what the title says: I have a custom-trained YOLOv8n pose model, which is in .pt format. I wanted to add this model as 1) supervised tracking model, and 2) possibly extend this into a semi-supervised tracker.

However, even after following your detailed instructions on adding a new model, I have failed to do so.
What I have done is the following. It is in the order of the document page.

Added YOLOtracker.py, which defines two new tracker classes, YOLOtracker and SemisupervisedYOLOtracker:
- YOLOtracker(RegressionTracker)
- SemisupervisedYOLOtracker(SemiSupervisedTrackerMixin, YOLOtracker)
- The only differences are:
  - backbone=YOLO('~~Somepath~~/ym-pretrained.pt') (yes, I have imported YOLO from ultralytics here)
  - num_keypoints=11
Added ‘YOLOtracker’ to ALLOWED_MODELS in models/__init__.py
Created a new config with model_type: “YOLOtracker”
Changed line 85 of utils/scripts.py to:
if cfg.model.model_type == "regression" or cfg.model.model_type == "YOLOtracker":

added

elif cfg.model.model_type == "YOLOtracker":
    model = YOLOtracker(
        num_keypoints=cfg.data.num_keypoints,
        # loss_factory=loss_factories["supervised"],
        backbone=cfg.model.backbone,
        # torch_seed=cfg.training.rng_seed_model_pt,
        # lr_scheduler=lr_scheduler,
        # lr_scheduler_params=lr_scheduler_params,
        # image_size=image_h,  # only used by ViT
    )

added in get_model_class

elif map_type == "YOLOtracker":
    from lightning_pose.models import YOLOtracker as Model

I also went ahead and added "YOLOtracker": RegressionMSELoss to losses.losses, a step that was missing from the documentation.

Then I went to the unit tests and created:

def test_supervised_YOLO(
    cfg, base_data_module, video_dataloader, trainer, remove_logs
):
    """Test the initialization and training of a supervised YOLO model."""
    # cfg = '/home/tarislada/Behavitproject/lightning-pose/scripts/configs/config_custom.yaml'
    cfg_tmp = copy.deepcopy(cfg)
    cfg_tmp.model.model_type = "YOLOtracker"
    cfg_tmp.model.losses_to_use = []
    run_model_test(
        cfg=cfg_tmp,
        data_module=base_data_module,
        video_dataloader=video_dataloader,
        trainer=trainer,
        remove_logs_fn=remove_logs,
    )

Which failed with the message
FAILED tests/models/test_custom_trackers.py::test_supervised_YOLO - RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
Because I did see the "Initializing a YOLOtracker instance." message and the usual YOLO run command line messages, I assume that the implementation regarding the lightning-pose document was successful.
Any idea on how to get this working?

error in prediction

Hi Lightning Pose team,
Following your tutorial, I can run training with 'train_hydra.py' without errors:

python train_hydra.py --config-path=/root/autodl-tmp/DLC_LP --config-name=config_LP.yaml

and get a new directory: outputs/2024-04-07/11-48-45/

Now I want to run 'predict_new_vids.py'

python predict_new_vids.py --config-path=/root/autodl-tmp/DLC_LP --config-name=config_LP.yaml

but it gives the following error:

[2024-04-07 23:33:49,971][HYDRA] /root/miniconda3/lib/python3.8/site-packages/hydra/_internal/hydra.py:119: UserWarning: Future Hydra versions will no longer change working directory at job runtime by default.
See https://hydra.cc/docs/1.2/upgrades/1.1_to_1.2/changes_to_job_working_dir/ for more information.
  ret = run_job(

GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
Error executing job with overrides: []
Traceback (most recent call last):
  File "predict_new_vids.py", line 116, in predict_videos_in_dir
    absolute_cfg_path = return_absolute_path(hydra_relative_path, n_dirs_back=2)
  File "/root/miniconda3/lib/python3.8/site-packages/lightning_pose/utils/io.py", line 153, in return_absolute_path
    raise IOError("%s is not a valid path" % abs_path)
OSError: /root/autodl-tmp/DLC_LP/outputs/outputs/2024-04-07/11-48-45/ is not a valid path

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

And here is my config file:

data:
  image_orig_dims:
    height: 2160
    width: 2160
  image_resize_dims:
    height: 512
    width: 512
  data_dir: /root/autodl-tmp/DLC_LP
  video_dir: /root/autodl-tmp/DLC_LP/videos
  csv_file: CollectedData.csv
  downsample_factor: 2
  num_keypoints: 6
  keypoint_names:
  - snout
  - forepaw_L
  - forefaw_R
  - hindpaw_L
  - hindpaw_R
  - base
  mirrored_column_matches: null
  columns_for_singleview_pca: null
training:
  imgaug: dlc
  train_batch_size: 8
  val_batch_size: 32
  test_batch_size: 32
  train_prob: 0.95
  val_prob: 0.05
  train_frames: 1
  num_gpus: 1
  num_workers: 4
  early_stop_patience: 3
  unfreezing_epoch: 20
  min_epochs: 5
  max_epochs: 10
  log_every_n_steps: 10
  check_val_every_n_epoch: 5
  gpu_id: 0
  rng_seed_data_pt: 0
  rng_seed_model_pt: 0
  lr_scheduler: multisteplr
  lr_scheduler_params:
    multisteplr:
      milestones:
      - 150
      - 200
      - 250
      gamma: 0.5
model:
  losses_to_use:
  - pca_singleview
  - temporal
  backbone: resnet50_animal_ap10k
  model_type: heatmap_mhcrnn
  heatmap_loss_type: mse
  model_name: DLC_LP
dali:
  general:
    seed: 123456
  base:
    train:
      sequence_length: 32
    predict:
      sequence_length: 96
  context:
    train:
      batch_size: 16
    predict:
      sequence_length: 96
losses:
  pca_multiview:
    log_weight: 5.0
    components_to_keep: 3
    epsilon: null
  pca_singleview:
    log_weight: 5.0
    components_to_keep: 0.99
    epsilon: null
  temporal:
    log_weight: 5.0
    epsilon: 20.0
    prob_threshold: 0.05
eval:
  hydra_paths: ["outputs/2024-04-07/11-48-45/"] 
  predict_vids_after_training: true
  save_vids_after_training: false
  fiftyone:
    dataset_name: test
    model_display_names:
    - test_model
    launch_app_from_script: false
    remote: true
    address: 127.0.0.1
    port: 5151
  test_videos_directory: /root/autodl-tmp/DLC_LP/videos
  saved_vid_preds_dir: null
  confidence_thresh_for_vid: 0.9
  video_file_to_plot: null
  pred_csv_files_to_plot:
  - ' '
callbacks:
  anneal_weight:
    attr_name: total_unsupervised_importance
    init_val: 0.0
    increase_factor: 0.01
    final_val: 1.0
    freeze_until_epoch: 0
hydra:
  run:
    dir: outputs/${now:%Y-%m-%d}/${now:%H-%M-%S}
  sweep:
    dir: multirun/${now:%Y-%m-%d}/${now:%H-%M-%S}
    subdir: ${hydra.job.num}

so any suggestion? Thank you.
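
The doubled `outputs/outputs` in the error suggests that the relative entry in eval.hydra_paths is being joined onto a base directory that already ends in `outputs`. The sketch below only illustrates that path arithmetic; `resolve` is a hypothetical stand-in, not the actual return_absolute_path logic, and the base directory is an assumption. It shows why an absolute path in hydra_paths may sidestep the problem: joining onto an absolute path discards the base.

```python
import posixpath


def resolve(base_dir, hydra_path):
    """Hypothetical stand-in: join hydra_path onto base_dir and normalize."""
    return posixpath.normpath(posixpath.join(base_dir, hydra_path))


base = "/root/autodl-tmp/DLC_LP/outputs"  # assumed base directory

# A relative entry is appended to the base, duplicating "outputs".
relative = resolve(base, "outputs/2024-04-07/11-48-45/")

# An absolute entry replaces the base entirely.
absolute = resolve(base, "/root/autodl-tmp/DLC_LP/outputs/2024-04-07/11-48-45/")

print(relative)
print(absolute)
```

Under these assumptions, either dropping the leading `outputs/` from the relative entry or giving the full absolute path would produce a directory that exists.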

Colab demo No module named 'segment_anything' / Toy config missing

Hi, Thanks a lot for the development of this amazing package.
I was trying to run the demo notebook in Colab and got the error below when running !pytest.

Also, I noticed that the config_toy-dataset.yaml file is missing from the directory where it should be according to the notebook.

Thanks a lot for your help,
Anto

============================= test session starts ==============================
platform linux -- Python 3.10.6, pytest-7.3.1, pluggy-1.2.0
rootdir: /content/lightning-pose
plugins: torchtyping-0.1.4, hydra-core-1.3.2, typeguard-3.0.2, anyio-3.7.1
collected 56 items / 1 error                                                   

==================================== ERRORS ====================================
__________________ ERROR collecting tests/models/test_base.py __________________
ImportError while importing test module '/content/lightning-pose/tests/models/test_base.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/lib/python3.10/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/models/test_base.py:3: in <module>
    import segment_anything
E   ModuleNotFoundError: No module named 'segment_anything'
=========================== short test summary info ============================
ERROR tests/models/test_base.py
!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!
=============================== 1 error in 1.79s ===============================

issue with lightning pose installation

Lightning Pose prompted us to install an update, which went through with no issues.

We used LP for a while, then Firefox prompted for an update, which we installed.
Now when calling "lightning run app app.py" nothing happens and we get: "Please call fabric run model instead".
When running this we now get:

$ fabric run model /home/lightning-pose/Pose-app/app.py
Traceback (most recent call last):
  File "/home/lightning-pose/Pose-app/app.py", line 336, in <module>
    app = LightningApp(LitPoseApp())
  File "/home/lightning-pose/Pose-app/app.py", line 59, in __init__
    default_config_dict = yaml.safe_load(open(os.path.join(config_dir, "config_default.yaml")))
FileNotFoundError: [Errno 2] No such file or directory: 'lightning-pose/scripts/configs/config_default.yaml'
[2024-04-23 13:04:26,639] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 6409) of binary: /home/lightning-pose/anaconda3/envs/lai/bin/python
Traceback (most recent call last):
  File "/home/lightning-pose/anaconda3/envs/lai/bin/fabric", line 8, in <module>
    sys.exit(_main())
  File "/home/lightning-pose/anaconda3/envs/lai/lib/python3.8/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/lightning-pose/anaconda3/envs/lai/lib/python3.8/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/lightning-pose/anaconda3/envs/lai/lib/python3.8/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/lightning-pose/anaconda3/envs/lai/lib/python3.8/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/lightning-pose/anaconda3/envs/lai/lib/python3.8/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/lightning-pose/anaconda3/envs/lai/lib/python3.8/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/lightning-pose/anaconda3/envs/lai/lib/python3.8/site-packages/lightning/fabric/cli.py", line 159, in _run_model
    main(args=Namespace(**kwargs), script_args=script_args)
  File "/home/lightning-pose/anaconda3/envs/lai/lib/python3.8/site-packages/lightning/fabric/cli.py", line 217, in main
    _torchrun_launch(args, script_args or [])
  File "/home/lightning-pose/anaconda3/envs/lai/lib/python3.8/site-packages/lightning/fabric/cli.py", line 212, in _torchrun_launch
    torchrun.main(torchrun_args)
  File "/home/lightning-pose/anaconda3/envs/lai/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
    return f(*args, **kwargs)
  File "/home/lightning-pose/anaconda3/envs/lai/lib/python3.8/site-packages/torch/distributed/run.py", line 806, in main
    run(args)
  File "/home/lightning-pose/anaconda3/envs/lai/lib/python3.8/site-packages/torch/distributed/run.py", line 797, in run
    elastic_launch(
  File "/home/lightning-pose/anaconda3/envs/lai/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/lightning-pose/anaconda3/envs/lai/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 264, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

/home/lightning-pose/Pose-app/app.py FAILED

Failures:
<NO_OTHER_FAILURES>

Root Cause (first observed failure):
[0]:
time : 2024-04-23_13:04:26
host : lightningpose-901045
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 6409)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

bit unsure how to fix our current system now...

do we need to convert our system/code to fabric? https://lightning.ai/docs/fabric/stable/fundamentals/convert.html

bit unsure!

log files - where are they (path) for analysis later?

any pointers or more recent documentation would be gratefully received....

A short question about the online version: how is the GPU used? I created an account yesterday and used it for a few minutes, then left it open, and today I needed to pay. How should I save GPU hours?
Anyway, I think the online version is problematic in terms of GPU hours allowance, so I would like to install it locally. Could you please help with the above situation?

Have a good day!

running lightning-pose on CalMS21 dataset

hello.

I'm trying to run lightning-pose on CalMS21 to test how it performs. I wrote a new config .yaml file for this dataset, but the predictions on the video aren't working. The new CalMS21 config file is mostly the same as the crim13 config file, except that I added a video to leverage the unsupervised losses (pca_singleview & temporal) and tested up to 1000 training epochs. The predicted keypoints stay at the top left corner of the video. Since a crim13 config file is provided by default, I wonder how the model performed on the crim13 dataset, given that these two datasets share a lot of features.

Below is the frame result of the predicted video, hoping it will help. Thank you in advance.

Best regards

How to get started?

Hello Dan,
I am very eager to try this tool. I have successfully installed lightning-pose and label-studio. I am somewhat experienced with DeepLabCut.

I have a video I would like to extract frames from to label and train a model. I am stuck at not knowing how to extract frames. When I try to import my video into label-studio, I get an error. I have spent a good deal of time reading through the examples here on your Github, and it seems to me that most of the demos assume you already have labeled data. Are there instructions somewhere for how to implement the full workflow (fig. 6 in your paper) starting from a new video?

Stuck on Model Building Cell

Hello!

I'm using lightning pose for my project, but it got stuck on the cell for model building and won't run through. I downloaded Lightning Pose using the conda from source installation method. I've tried running the command from the terminal and also breaking it down step by step in Jupyter (like the one used in demo on Colab).

Here's what I got when I ran this cell in Jupyter:

Thank you in advance!

Happy holidays!
Alan

Starts training from specific checkpoint

Hello and happy Friday,

I couldn't find these topics in the documentation so some clarification would be appreciated.

When adding new videos to an existing project,

- Is it possible to label novel videos using the current model or do we still need to label frames?
- Is it possible to continue training from an existing checkpoint?

Thank you very much for your work!

errors while testing lightning-pose

Hello!
Excited to try lightning-pose, but cannot get through pytest. I am mostly interested in the features that deepgraphpose had, to avoid paw switches and try to follow fast moving things better, at least right now.

I am trying to install on Code Ocean, so there might be platform-specific issues. Having a Docker image would probably help; I wonder if you have one handy.

pytest didn't want to start first:

patch_typeguard() in /data/utils throws an error on the first try, saying there is no _CallMemo in typeguard.
This fixed the issue:

try:
    patch_typeguard()  # use before @typechecked
except Exception:  # avoid a bare except, which would also swallow KeyboardInterrupt
    print('patch_typeguard() failed on the first try, retrying')
    patch_typeguard()

But there is probably a better way to do this. :D

then I needed to change how the CombinedLoader is imported from pytorch_lightning to the following:

from pytorch_lightning.utilities.combined_loader import CombinedLoader

Then pytest started, but threw a bunch of errors.
I wonder whether I am using the wrong version, before I go in and try to fix the errors.

Please see the output of pytest below.

Best,
Marton

=============================== short test summary info ===============================
FAILED tests/test_metrics.py::test_pixel_error - UnboundLocalError: local variable 'pixel_error' referenced before assignment
FAILED tests/test_metrics.py::test_pca_singleview_reprojection_error - TypeError: cannot create weak reference to 'property' object
FAILED tests/test_metrics.py::test_pca_multiview_reprojection_error - TypeError: cannot create weak reference to 'property' object
FAILED tests/data/test_dali.py::test_video_pipe - RuntimeError: Critical error when building pipeline:
FAILED tests/data/test_dali.py::test_PrepareDALI - RuntimeError: Critical error when building pipeline:
FAILED tests/data/test_datasets.py::test_base_dataset - TypeError: cannot create weak reference to 'property' object
FAILED tests/data/test_datasets.py::test_base_dataset_context - TypeError: cannot create weak reference to 'property' object
FAILED tests/data/test_utils.py::test_generate_heatmaps_weird_shape - TypeError: cannot create weak reference to 'property' object
FAILED tests/losses/test_helpers.py::test_empirical_epsilon - TypeError: isinstance() arg 2 must be a type or tuple of types
FAILED tests/losses/test_helpers.py::test_convert_dict_values_to_tensor - TypeError: isinstance() arg 2 must be a type or tuple of types
FAILED tests/losses/test_losses.py::test_heatmap_mse_loss - TypeError: cannot create weak reference to 'property' object
FAILED tests/losses/test_losses.py::test_heatmap_kl_loss - TypeError: cannot create weak reference to 'property' object
FAILED tests/losses/test_losses.py::test_heatmap_js_loss - TypeError: cannot create weak reference to 'property' object
FAILED tests/losses/test_losses.py::test_pca_singleview_loss - TypeError: cannot create weak reference to 'property' object
FAILED tests/losses/test_losses.py::test_pca_multiview_loss - TypeError: cannot create weak reference to 'property' object
FAILED tests/losses/test_losses.py::test_temporal_loss - TypeError: cannot create weak reference to 'property' object
FAILED tests/losses/test_losses.py::test_unimodal_mse_loss - TypeError: cannot create weak reference to 'property' object
FAILED tests/losses/test_losses.py::test_unimodal_kl_loss - TypeError: cannot create weak reference to 'property' object
FAILED tests/losses/test_losses.py::test_unimodal_js_loss - TypeError: cannot create weak reference to 'property' object
FAILED tests/losses/test_losses.py::test_regression_mse_loss - TypeError: cannot create weak reference to 'property' object
FAILED tests/losses/test_losses.py::test_regression_rmse_loss - TypeError: cannot create weak reference to 'property' object
FAILED tests/models/test_base.py::test_backbone - OSError: [Errno 28] No space left on device
FAILED tests/models/test_base.py::test_representation_shapes_truncated_resnet - OSError: [Errno 28] No space left on device
FAILED tests/models/test_base.py::test_representation_shapes_full_resnet - OSError: [Errno 28] No space left on device
ERROR tests/data/test_datamodules.py::test_heatmap_datamodule - TypeError: cannot create weak reference to 'property' object
ERROR tests/data/test_datamodules.py::test_base_data_module_combined - TypeError: cannot create weak reference to 'property' object
ERROR tests/data/test_datamodules.py::test_heatmap_data_module_combined - TypeError: cannot create weak reference to 'property' object
ERROR tests/data/test_datasets.py::test_heatmap_dataset - TypeError: cannot create weak reference to 'property' object
ERROR tests/data/test_datasets.py::test_heatmap_dataset_context - TypeError: cannot create weak reference to 'property' object
ERROR tests/data/test_datasets.py::test_equal_return_sizes - TypeError: cannot create weak reference to 'property' object
ERROR tests/data/test_utils.py::test_data_extractor - TypeError: cannot create weak reference to 'property' object
ERROR tests/data/test_utils.py::test_generate_heatmaps - TypeError: cannot create weak reference to 'property' object
ERROR tests/models/test_heatmap_tracker.py::test_supervised_heatmap - TypeError: cannot create weak reference to 'property' object
ERROR tests/models/test_heatmap_tracker.py::test_supervised_heatmap_context - TypeError: cannot create weak reference to 'property' object
ERROR tests/models/test_heatmap_tracker.py::test_semisupervised_heatmap_temporal - TypeError: cannot create weak reference to 'property' object
ERROR tests/models/test_heatmap_tracker.py::test_semisupervised_heatmap_pcasingleview_context - TypeError: cannot create weak reference to 'property' object
ERROR tests/models/test_heatmap_tracker_mhcrnn.py::test_supervised_heatmap_mhcrnn - TypeError: cannot create weak reference to 'property' object
ERROR tests/models/test_heatmap_tracker_mhcrnn.py::test_semisupervised_heatmap_mhcrnn_pcasingleview - TypeError: cannot create weak reference to 'property' object
ERROR tests/models/test_regression_tracker.py::test_supervised_regression - TypeError: cannot create weak reference to 'property' object
ERROR tests/models/test_regression_tracker.py::test_supervised_regression_context - TypeError: cannot create weak reference to 'property' object
ERROR tests/models/test_regression_tracker.py::test_semisupervised_regression_temporal - TypeError: cannot create weak reference to 'property' object
ERROR tests/models/test_regression_tracker.py::test_semisupervised_regression_pcasingleview_context - TypeError: cannot create weak reference to 'property' object
ERROR tests/utils/test_pca.py::test_train_loader_iter - TypeError: cannot create weak reference to 'property' object
ERROR tests/utils/test_pca.py::test_pca_keypoint_class - TypeError: cannot create weak reference to 'property' object
ERROR tests/utils/test_pca.py::test_singleview_format_and_loss - TypeError: cannot create weak reference to 'property' object
========== 24 failed, 13 passed, 10 warnings, 21 errors in 94.60s (0:01:34) ===========

support for avi formats

Hi lightning-pose team.

Our lab collects video in AVI format, and it seems that lightning-pose only supports the MP4 format (correct me if I am wrong!). Could you add support for AVI, as DeepLabCut does? The online video format conversion tools do not seem to be free. AVI support would make our work much easier.

Thank you so much!

Best,
Nora
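
As a local alternative to online converters, the sketch below re-encodes an AVI file to MP4 by calling the ffmpeg command-line tool from Python. It assumes ffmpeg is installed and on PATH; the helper names `build_ffmpeg_cmd` and `convert_to_mp4` are hypothetical and are not part of lightning-pose.

```python
import shutil
import subprocess
from pathlib import Path


def build_ffmpeg_cmd(src: str, dst: str) -> list:
    """Build an ffmpeg command that re-encodes a video as H.264 in an MP4 container."""
    return [
        "ffmpeg",
        "-y",                    # overwrite the output file if it already exists
        "-i", src,               # input video, e.g. an .avi file
        "-c:v", "libx264",       # encode the video stream with H.264
        "-pix_fmt", "yuv420p",   # pixel format that most players and decoders accept
        dst,
    ]


def convert_to_mp4(src: str) -> str:
    """Convert `src` to an .mp4 file in the same directory and return its path."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    dst = str(Path(src).with_suffix(".mp4"))
    subprocess.run(build_ffmpeg_cmd(src, dst), check=True)
    return dst
```

Calling `convert_to_mp4("session1.avi")` (a made-up file name) would write `session1.mp4` next to the original file.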

GPU memory requirements

Hello Lightning Pose,
I have a 12 GB RTX 3060 GPU, and I received a CUDA out-of-memory error when trying to run the "semi-supervised" model:

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 64.00 MiB (GPU 0; 11.76 GiB total capacity; 9.16 GiB already allocated; 38.31 MiB free; 9.24 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

This was run using Pose-app/app.py. It was training a model based on a 180 MB video and 20 labeled frames with 4 keypoints. I could upgrade to a 24 GB GPU if I knew that this would fix the issue. Is there a recommended GPU size for running Lightning Pose?
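
The error message hints at `max_split_size_mb`, which only helps when reserved memory is much larger than allocated memory. The sketch below reads both numbers from the message to check whether that is the case here. The function name `parse_oom_message` is hypothetical, and the regular expressions assume the message format shown above.

```python
import re


def parse_oom_message(message: str) -> dict:
    """Extract the allocated and reserved sizes (in GiB) from a CUDA OOM message."""
    allocated = re.search(r"([\d.]+) GiB already allocated", message)
    reserved = re.search(r"([\d.]+) GiB reserved", message)
    if allocated is None or reserved is None:
        raise ValueError("message does not look like a CUDA out-of-memory error")
    allocated_gib = float(allocated.group(1))
    reserved_gib = float(reserved.group(1))
    return {
        "allocated_gib": allocated_gib,
        "reserved_gib": reserved_gib,
        # memory held by the caching allocator but not used by tensors
        "slack_gib": reserved_gib - allocated_gib,
    }


msg = (
    "Tried to allocate 64.00 MiB (GPU 0; 11.76 GiB total capacity; "
    "9.16 GiB already allocated; 38.31 MiB free; "
    "9.24 GiB reserved in total by PyTorch)"
)
info = parse_oom_message(msg)
print(round(info["slack_gib"], 2))  # 0.08
```

With only about 0.08 GiB of slack, fragmentation does not look like the main factor. Reducing the memory needed per batch, for example the number of unlabeled frames per batch, is probably a more promising first step than tuning the allocator, though this is an inference from the message rather than a tested fix.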

lit-pose crashes while using .avis

trainer.fit fails if the unlabeled data are in .avi format: it hangs without raising any error.

This is evidently handled in the io module with a couple of asserts and booleans, but before digging into the details I figured it would be worthwhile to ask why here.

fail to generate heatmaps for the video

Hi, thanks for releasing lightning-pose.

I got an error message when generating the heatmaps for the test video by calling the export_predictions_and_labeled_video() function from https://github.com/danbider/lightning-pose/blob/6135e5a50523d4a9d8f1ba986b76e28e3dcd0cf1/scripts/train_hydra.py. The call and the error message are shown below.

        export_predictions_and_labeled_video(
            video_file=video_file,
            cfg=cfg,
            ckpt_file=best_ckpt,
            prediction_csv_file=prediction_csv_file,
            labeled_mp4_file=labeled_mp4_file,
            trainer=trainer,
            model=model,
            data_module=data_module_pred,
            save_heatmaps=cfg.eval.get(
                "predict_vids_after_training_save_heatmaps", True
            ),
        )

Error message:
Traceback (most recent call last):
File "/root/capsule/scratch/lightning-pose/scripts/train_hydra.py", line 297, in train
export_predictions_and_labeled_video(
File "/lightning-pose/lightning_pose/utils/scripts.py", line 675, in export_predictions_and_labeled_video
preds_df = predict_single_video(
File "/lightning-pose/lightning_pose/utils/predictions.py", line 397, in predict_single_video
keypoints, confidences, heatmaps = _predict_frames(
File "/lightning-pose/lightning_pose/utils/predictions.py", line 437, in _predict_frames
def _predict_frames(
File "/opt/conda/lib/python3.8/site-packages/typeguard/_functions.py", line 113, in check_argument_types
check_type_internal(value, expected_type, memo=memo)
File "/opt/conda/lib/python3.8/site-packages/typeguard/_checkers.py", line 680, in check_type_internal
raise TypeCheckError(f"is not an instance of {qualified_name(origin_type)}")
typeguard.TypeCheckError: argument "model" (lightning_pose.models.heatmap_tracker.HeatmapTracker) is not an instance of pytorch_lightning.core.module.LightningModule

Thanks,
Di
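
The typeguard error says that the model is not an instance of `pytorch_lightning.core.module.LightningModule`. One possible cause, which is an assumption and not confirmed by the traceback, is that the model class inherits from a `LightningModule` defined in a different package (for example `lightning.pytorch`) or in a second installed copy. The hypothetical helper below lists the fully qualified base classes of any object so that the two names can be compared.

```python
def describe_bases(obj) -> list:
    """Return the fully qualified name of every class in the object's MRO."""
    return [f"{cls.__module__}.{cls.__qualname__}" for cls in type(obj).__mro__]


def has_base_named(obj, qualified_name: str) -> bool:
    """Check by name whether `obj` inherits from the class `qualified_name`."""
    return qualified_name in describe_bases(obj)
```

Running `describe_bases(model)` on the trained model and looking for the module path of the `LightningModule` entry would show which package the base class actually comes from.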

how to set up min_epochs and max_epochs

Hi, Lightning Pose team,
I tried Lightning Pose with the temporal model. The model seemed to converge at around 100 epochs, and the default setting in the example is around 100-300 epochs. Would a few hundred epochs of training be enough for the temporal model?

And how about the basic model? Still a few hundred epochs?

Usually when I use DeepLabCut, I have to go to 200k-500k iterations or even more. I am not sure what the relationship is between an epoch in LP and an iteration in DLC, but LP seems to converge or finish training much faster on my dataset.
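
To compare the two units, a rough conversion can help. The sketch below assumes that one iteration means one optimizer step on one batch of labeled frames, and that an epoch is one pass over all labeled training frames. The frame count and batch size are made-up numbers, not values from this issue.

```python
import math


def epochs_to_iterations(epochs: int, n_train_frames: int, batch_size: int) -> int:
    """Convert a number of epochs to the equivalent number of batch iterations."""
    # the last batch of an epoch may be smaller, hence the ceiling
    batches_per_epoch = math.ceil(n_train_frames / batch_size)
    return epochs * batches_per_epoch


# hypothetical setup: 100 labeled training frames, batch size 16, 300 epochs
print(epochs_to_iterations(epochs=300, n_train_frames=100, batch_size=16))  # 2100
```

Under these assumptions, a few hundred epochs on a small labeled set corresponds to a few thousand iterations, which is far fewer than 200k-500k. The two counts are therefore not directly comparable without knowing the dataset size and batch size.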

Error loading checkpoint for vit_b_sam backbone

When loading weights from a fine-tuned vit_b_sam backbone, if the fine-tuning frame size is not 1024x1024 the following error is raised:

RuntimeError: Error(s) in loading state_dict for HeatmapTracker:
	size mismatch for backbone.pos_embed: copying a param with shape torch.Size([1, 16, 16, 768]) from checkpoint, the shape in current model is torch.Size([1, 64, 64, 768]).

The problem:

1. During training, the regular vit_b_sam backbone is constructed, which assumes an image shape of 1024x1024.
2. If the image size that we are fine-tuning on is not 1024x1024, the position embedding is automatically updated during training and the new weights are stored (and eventually saved).
3. When loading the weights into a new model, the position embedding parameter assuming 1024x1024 is constructed, but the saved parameter assuming a different image size is loaded in (with the above error).

The solution:
Instead of loading the state dict directly into the model using Model.load_from_checkpoint, this step needs to be broken into several parts:

1. Initialize the model (which includes loading the SAM weights). This sets the position embedding parameter in a way that assumes 1024x1024 images.
2. Manually update the position embedding parameter to match the desired fine-tune image size.
3. Load the weights from the checkpoint.
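
The shapes in the error message can be used to decide whether step 2 is needed and what image size the checkpoint was fine-tuned on. The sketch below is plain Python and does not touch the model. It assumes a patch size of 16, which is consistent with a 64x64 grid for 1024x1024 images, and all function names are hypothetical.

```python
def pos_embed_grid(image_size: int, patch_size: int = 16) -> int:
    """Side length of the position-embedding grid for a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image size must be a multiple of the patch size")
    return image_size // patch_size


def implied_image_size(pos_embed_shape, patch_size: int = 16) -> int:
    """Infer the square image size from a (1, grid_h, grid_w, dim) embedding shape."""
    _, grid_h, grid_w, _ = pos_embed_shape
    if grid_h != grid_w:
        raise ValueError("this sketch only handles square grids")
    return grid_h * patch_size


def needs_pos_embed_resize(ckpt_shape, model_shape) -> bool:
    """True when the freshly built model and the checkpoint disagree on the grid."""
    return tuple(ckpt_shape) != tuple(model_shape)


ckpt_shape = (1, 16, 16, 768)    # shape stored in the checkpoint
model_shape = (1, 64, 64, 768)   # shape of the default 1024x1024 model
print(pos_embed_grid(1024))                             # 64
print(implied_image_size(ckpt_shape))                   # 256
print(needs_pos_embed_resize(ckpt_shape, model_shape))  # True
```

In this case the checkpoint corresponds to 256x256 fine-tuning frames, so the position embedding of the new model would need to be resized to a 16x16 grid before the state dict is loaded.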

shift of xy coordinates on the labeled_videos

Hi, thanks for releasing lightning-pose.

I found a bug when predicting on a folder of videos using the script lightning-pose/scripts/train_hydra.py.
The x and y coordinates of the keypoints in the labeled videos are shifted when the test videos have different dimensions.

To avoid the shift, I updated the original image dimensions for each test video with the following code before calling export_predictions_and_labeled_video():

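        from moviepy.editor import VideoFileClip  # moviepy 1.x import path

        # read the true frame size of this video and override the config values
        clip = VideoFileClip(video_file)
        cfg.data.image_orig_dims.width = clip.w
        cfg.data.image_orig_dims.height = clip.h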

Thanks,
Di
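
To illustrate why stale dimensions shift the keypoints, the sketch below assumes that predictions are made on resized frames and then scaled back to the original frame size taken from the config. That rescaling step is an assumption about the pipeline, and `scale_keypoint` is a hypothetical helper with made-up numbers.

```python
def scale_keypoint(x: float, y: float, resized_wh: tuple, orig_wh: tuple) -> tuple:
    """Map a keypoint from resized-frame coordinates to original-frame coordinates."""
    resized_w, resized_h = resized_wh
    orig_w, orig_h = orig_wh
    return (x * orig_w / resized_w, y * orig_h / resized_h)


# the same prediction at the center of a 256x256 resized frame
print(scale_keypoint(128, 128, (256, 256), (1280, 720)))  # (640.0, 360.0)
print(scale_keypoint(128, 128, (256, 256), (640, 480)))   # (320.0, 240.0)
```

If the config still holds 640x480 while the test video is 1280x720, the keypoint lands at (320, 240) instead of (640, 360), which would look like a shift in the labeled video.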

Issue: Runtime error database disk image is malformed

We have run into this issue twice now.

We are running ~/Pose-app/app.py to run lightning pose.

It seems like something gets messed up in sqlite. The problem is we haven't been able to clear the error. We only solved this the first time with a complete re-install.

If you understand how to reset this database to somehow get lightning-pose going again, that would be helpful.

We tried removing and re-installing ~/venv-label-studio, but that did not fix the problem.

Here is the graphical output from app.py:

Runtime error
database disk image is malformed

Traceback (most recent call last):
File "/home/plafave/venv-label-studio/lib/python3.8/site-packages/django/db/backends/utils.py", line 84, in _execute
return self.cursor.execute(sql, params)
File "/home/plafave/venv-label-studio/lib/python3.8/site-packages/django/db/backends/sqlite3/base.py", line 423, in execute
return Database.Cursor.execute(self, query, params)
sqlite3.DatabaseError: database disk image is malformed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/home/plafave/venv-label-studio/lib/python3.8/site-packages/rest_framework/views.py", line 506, in dispatch
response = handler(request, *args, **kwargs)
File "/home/plafave/venv-label-studio/lib/python3.8/site-packages/django/utils/decorators.py", line 43, in _wrapper
return bound_method(*args, **kwargs)
File "/home/plafave/venv-label-studio/lib/python3.8/site-packages/label_studio/projects/api.py", line 165, in get
return super(ProjectListAPI, self).get(request, *args, **kwargs)
File "/home/plafave/venv-label-studio/lib/python3.8/site-packages/rest_framework/generics.py", line 239, in get
return self.list(request, *args, **kwargs)
File "/home/plafave/venv-label-studio/lib/python3.8/site-packages/rest_framework/mixins.py", line 40, in list
page = self.paginate_queryset(queryset)
File "/home/plafave/venv-label-studio/lib/python3.8/site-packages/rest_framework/generics.py", line 171, in paginate_queryset
return self.paginator.paginate_queryset(queryset, self.request, view=self)
File "/home/plafave/venv-label-studio/lib/python3.8/site-packages/rest_framework/pagination.py", line 204, in paginate_queryset
self.page = paginator.page(page_number)
File "/home/plafave/venv-label-studio/lib/python3.8/site-packages/django/core/paginator.py", line 76, in page
number = self.validate_number(number)
File "/home/plafave/venv-label-studio/lib/python3.8/site-packages/django/core/paginator.py", line 54, in validate_number
if number > self.num_pages:
File "/home/plafave/venv-label-studio/lib/python3.8/site-packages/django/utils/functional.py", line 48, in __get__
res = instance.__dict__[self.name] = self.func(instance)
File "/home/plafave/venv-label-studio/lib/python3.8/site-packages/django/core/paginator.py", line 103, in num_pages
if self.count == 0 and not self.allow_empty_first_page:
File "/home/plafave/venv-label-studio/lib/python3.8/site-packages/django/utils/functional.py", line 48, in __get__
res = instance.__dict__[self.name] = self.func(instance)
File "/home/plafave/venv-label-studio/lib/python3.8/site-packages/django/core/paginator.py", line 97, in count
return c()
File "/home/plafave/venv-label-studio/lib/python3.8/site-packages/django/db/models/query.py", line 412, in count
return self.query.get_count(using=self.db)
File "/home/plafave/venv-label-studio/lib/python3.8/site-packages/django/db/models/sql/query.py", line 528, in get_count
number = obj.get_aggregation(using, ['__count'])['__count']
File "/home/plafave/venv-label-studio/lib/python3.8/site-packages/django/db/models/sql/query.py", line 513, in get_aggregation
result = compiler.execute_sql(SINGLE)
File "/home/plafave/venv-label-studio/lib/python3.8/site-packages/django/db/models/sql/compiler.py", line 1175, in execute_sql
cursor.execute(sql, params)
File "/home/plafave/venv-label-studio/lib/python3.8/site-packages/sentry_sdk/integrations/django/__init__.py", line 596, in execute
return real_execute(self, sql, params)
File "/home/plafave/venv-label-studio/lib/python3.8/site-packages/django/db/backends/utils.py", line 66, in execute
return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
File "/home/plafave/venv-label-studio/lib/python3.8/site-packages/django/db/backends/utils.py", line 75, in _execute_with_wrappers
return executor(sql, params, many, context)
File "/home/plafave/venv-label-studio/lib/python3.8/site-packages/django/db/backends/utils.py", line 84, in _execute
return self.cursor.execute(sql, params)
File "/home/plafave/venv-label-studio/lib/python3.8/site-packages/django/db/utils.py", line 90, in __exit__
raise dj_exc_value.with_traceback(traceback) from exc_value
File "/home/plafave/venv-label-studio/lib/python3.8/site-packages/django/db/backends/utils.py", line 84, in _execute
return self.cursor.execute(sql, params)
File "/home/plafave/venv-label-studio/lib/python3.8/site-packages/django/db/backends/sqlite3/base.py", line 423, in execute
return Database.Cursor.execute(self, query, params)
django.db.utils.DatabaseError: database disk image is malformed
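
Since the traceback ends in the Django sqlite3 backend, the damaged file is most likely the SQLite database used by Label Studio and not the virtual environment, which may explain why re-installing the venv did not help. The sketch below uses Python's standard `sqlite3` module to check a database file and to copy its readable contents into a fresh file. The location of the Label Studio database is not given in this issue, so the paths are left to the caller, and both function names are hypothetical.

```python
import sqlite3


def check_sqlite_integrity(db_path: str) -> list:
    """Run PRAGMA integrity_check and return its messages (["ok"] if healthy)."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute("PRAGMA integrity_check").fetchall()
    finally:
        conn.close()
    return [row[0] for row in rows]


def dump_to_new_db(src_path: str, dst_path: str) -> None:
    """Replay the SQL dump of `src_path` into a new database at `dst_path`."""
    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(dst_path)
    try:
        # iterdump() yields the SQL statements needed to rebuild the database
        dst.executescript("\n".join(src.iterdump()))
    finally:
        src.close()
        dst.close()
```

On a badly damaged file either call may itself raise `sqlite3.DatabaseError`. Working on a backup copy of the database file rather than the original is the safer option.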
