Comments (56)

fire commented on May 28, 2024

I was able to train for 1 step, and it outputs a garbage glb 🎉

MarcusLoppe commented on May 28, 2024

I want to mention: getting the indices in the right order, and making sure the faces fit in the box and aren't inside out, are problems too.

If you're interested in training the head, it's in the dataset. I can't get the autoencoder below 0.5 loss.

How many examples/steps of the same 3D mesh did you train it on? I trained for 10-20 epochs @ 2000 examples and got 0.19 loss.
I think you are training on too few examples; the model needs massive amounts of data. And if you do data augmentation you'll need even more data, maybe 30-40 epochs or more.

I was able to generate a pretty good 3D mesh; it's not as smooth, but a very good result for such a small amount of training data.
The transformer & encoder aren't good at generalizing with little training data, but that should resolve itself when training with much more data.

3D mesh:
https://file.io/6JIueypFnRyT

image

fire commented on May 28, 2024

image

https://wandb.ai/ernest-lee/meshgpt-pytorch/runs/9b8k9mfc/overview?workspace=user-ernest-lee

I have some bugs, but this is really promising.

I had to recode my face-index ascending-order regularization strategy.

The clipped ears are from the meshgpt output.

MarcusLoppe commented on May 28, 2024

@MarcusLoppe what sort of limits are you getting on your triangle count? I think mine is around 1349 triangles per mesh.

What happens when you go above it? Is it the VRAM, or does the transformer get stuck at a loss? If so, have you tried raising the dim to 768 or 1024?

Does anyone know what the meshgpt paper means by jitter below Hausdorff?

I don't think jitter is related to that; the paper mentions jitter but then switches topic to planar decimation, i.e. simplifying the training mesh while keeping it looking the same as before.

lucidrains commented on May 28, 2024

yup sounds good! just put all the functions into one file, say augment.py, and if you want to go the distance, have ways to compose / chain any number of augmentations
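
A minimal sketch of what such an augment.py could look like; the function names and compose helper here are illustrative, not part of meshgpt-pytorch:

import random
import numpy as np

# each augmentation maps a (V, 3) vertex array to a new (V, 3) array
def scale(vertices, lo = 0.8, hi = 1.0):
    return vertices * random.uniform(lo, hi)

def jitter_shift(vertices, amount = 0.05):
    return vertices + np.random.uniform(-amount, amount, size = (1, 3))

def compose(*augments):
    # chain any number of augmentations into a single callable
    def composed(vertices):
        for augment in augments:
            vertices = augment(vertices)
        return vertices
    return composed

augment = compose(scale, jitter_shift)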

lucidrains commented on May 28, 2024

@fire scale and rotation will go a long way

fire commented on May 28, 2024

image

Here's what my current augments do.

fire commented on May 28, 2024

vs original

image

Edited:

There's a bias near the center D:

fire commented on May 28, 2024

image

The bias is removed.

fire commented on May 28, 2024

I have to go for now.

https://github.com/lucidrains/meshgpt-pytorch/pull/6/files#diff-bb1e7e12bca15c4f2fd0faa464db85f6e8cb35c55454247f94c31bfc1483c3bbR100-R150

See def augment_mesh(self, base_mesh, augment_count, augment_idx):

Edited: removed seed

fire commented on May 28, 2024

@lucidrains Can you post something for me to extract the resulting mesh from the autoencoder?

fire commented on May 28, 2024

You mentioned the topic of overfitting as a first step.

I added the Blender monkey as a validation of mesh input through an autoencoder as an initial step.

I want to send another monkey to the autoencoder and get the same monkey out again. How do I do that?
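
For reference, a minimal sketch of that round trip; the tokenize / decode_from_codes_to_faces calls are my assumption about the MeshAutoencoder API and may differ in your version of the library:

# vertices: (1, V, 3) float tensor, faces: (1, F, 3) long tensor for the monkey
codes = autoencoder.tokenize(vertices = vertices, faces = faces)

# decode the codes back into per-face vertex coordinates, shape (1, F, 3, 3)
face_coords, face_mask = autoencoder.decode_from_codes_to_faces(codes)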

[deleted user] commented on May 28, 2024

You mentioned the topic of overfitting as a first step.

I added the Blender monkey as a validation of mesh input through an autoencoder as an initial step.

I want to send another monkey to the autoencoder and get the same monkey out again. How do I do that?

I have been using Marcus's provided notebook file to try that, and I am also getting bad obj results. I am going to try the latest @lucidrains changes tomorrow in this notebook; maybe you can give it a try or a look, or you might be ahead of what I am using. 😆 Thanks!
https://drive.google.com/file/d/1gpLjbnH1WUH6U50MJKrw-8BV6S_-3KH1/view?usp=sharing

fire commented on May 28, 2024

image

I am getting bad mesh results too, but it's trying. The selected mesh is the output; the background is the base mesh.

MarcusLoppe commented on May 28, 2024

Just for testing purposes, give it a go without the data augmentation.
I think there need to be some more improvements to the model, plus it will take a long time to train with data augmentation.
In the paper they used 28,000 shapes and trained the encoder on 2x A100 for 2 days and the transformer on 4x A100 for 5 days.
So it will need lots of training data and time.

When I have been successful, the encoder loss was 0.200-0.250 and the transformer loss was around 0.00007.
So if you can get the loss with data augmentation down to those levels it will probably work, but that will require lots of training.

image

Here are some details from the paper; they only use scaling and jitter-shift.
So remove translation & rotation and see if that helps.
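
A sketch of those two augmentations as I read the paper; the ranges are illustrative guesses, not the paper's exact values:

import numpy as np

def paper_augment(vertices):
    # scale each axis independently, keeping the mesh inside the unit box
    vertices = vertices * np.random.uniform(0.75, 1.0, size = (1, 3))

    # jitter-shift: a small random translation, clamped so nothing leaves [-1, 1]
    shift = np.random.uniform(-0.1, 0.1, size = (1, 3))
    lo, hi = vertices.min(axis = 0), vertices.max(axis = 0)
    shift = np.clip(shift, -1.0 - lo, 1.0 - hi)
    return vertices + shift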

fire commented on May 28, 2024

I am currently at:

loss: 1.255
loss: 1.500
loss: 1.786
loss: 1.596
loss: 1.941
loss: 1.583
loss: 1.895
loss: 1.904

So maybe I can dream about 0.200 - 0.250 loss.

MarcusLoppe commented on May 28, 2024

I am currently at:

loss: 1.255
loss: 1.500
loss: 1.786
loss: 1.596
loss: 1.941
loss: 1.583
loss: 1.895
loss: 1.904

So maybe I can dream about 0.200 - 0.250 loss.

How many steps is that at? I require about 2000 steps, since 200 x 10 epochs = 2000.
Also implement tqdm, since printing can slow things down quite a lot.

Try only doing scaling and see; it will probably go better.

You can give it a go with my forked version @ https://github.com/MarcusLoppe/meshgpt-pytorch/tree/main

The data MeshDataset expects is an array of entries like:

import torch
from torch.utils.data import Dataset, DataLoader
from tqdm import tqdm

from meshgpt_pytorch import (
    MeshAutoencoder,
    MeshTransformer,
    MeshTransformerTrainer,
    MeshAutoencoderTrainer
)

# each dataset entry looks like:
# obj_data = {"texts": "chair", "vertices": vertices, "faces": faces}

class MeshDataset(Dataset):
    def __init__(self, obj_data):
        self.obj_data = obj_data
        print(f"Got {len(obj_data)} data")

    def __len__(self):
        return len(self.obj_data)

    def __getitem__(self, idx):
        return self.obj_data[idx]

# autoencoder = MeshAutoencoder(...) and dataset = MeshDataset(...) defined beforehand
autoencoder_trainer = MeshAutoencoderTrainer(
    model = autoencoder, learning_rate = 1e-3, warmup_steps = 10,
    dataset = dataset, batch_size = 4, grad_accum_every = 1, num_train_steps = 1)

autoencoder_trainer.train(10, True)

# 1 face = 6 tokens, so the max sequence length is the max face count * 6
max_length = max(len(d["faces"]) for d in dataset if "faces" in d)
max_seq = max_length * 6
print(max_length)
print(max_seq)

transformer = MeshTransformer(
    autoencoder,
    dim = 16,
    max_seq_len = max_seq,
    # condition_on_text = True
)

trainer = MeshTransformerTrainer(
    model = transformer, warmup_steps = 10, dataset = dataset,
    learning_rate = 1e-2, batch_size = 2, grad_accum_every = 1, num_train_steps = 1)
trainer.train(10)

fire commented on May 28, 2024

These are my current settings, which is 200 steps. The outlined mesh is the output. You can see my code in the pull request.

import wandb

run = wandb.init(
    project="meshgpt-pytorch",
    
    config={
        "learning_rate": 1e-2,
        "architecture": "MeshGPT",
        "dataset": dataset_directory,
        "num_train_steps": 200,
        "warmup_steps": 1,
        "batch_size": 4,
        "grad_accum_every": 1,
        "checkpoint_every": 20,
        "device": str(device),
        "autoencoder": {
            "dim": 512,
            "encoder_depth": 6,
            "decoder_depth": 6,
            "num_discrete_coors": 128,
        },
        "dataset_size": dataset.__len__(),
    }
)

image

image

fire commented on May 28, 2024

You are right that I should ensure that we're within the unit box and do fewer augmentations, though.

MarcusLoppe commented on May 28, 2024

You are right that I should ensure that we're within the unit box and do fewer augmentations, though.

I think generating two objects is causing some issues; try using a single box.

I tried your s_bed_full.glb file and the result was pretty good, though it's not as smooth. Probably a better result with data augmentation. The right side is the generated one.

image
image

fire commented on May 28, 2024

https://imgsli.com/ is very good for image comparisons.

fire commented on May 28, 2024

Writing down an idea: it should be possible to go over the 10 million 3D item set, find a small set of items in a small set of classes similar to the paper, and label them manually (like via path name).

MarcusLoppe commented on May 28, 2024

Writing down an idea: it should be possible to go over the 10 million 3D item set, find a small set of items in a small set of classes similar to the paper, and label them manually (like via path name).

Training on 10 million might be overkill, and going over 28,000 shapes might cost a bit too much $$$.
ShapeNet has 50k 3D models with almost a paragraph of description text each.

Renting an A100 at $0.79 per hour:
Training the encoder on 2x A100 for 2 days: $75.84
Training the transformer on 4x A100 for 5 days: $379.20

However, the H100 promises good performance, but at like $2-3 an hour.
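
A quick arithmetic check of those estimates:

rate = 0.79  # $ per A100 per hour

encoder_cost = rate * 2 * (2 * 24)      # 2 GPUs for 2 days  -> $75.84
transformer_cost = rate * 4 * (5 * 24)  # 4 GPUs for 5 days  -> $379.20
print(encoder_cost, transformer_cost)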

https://imgsli.com/ is very good for image comparisons.

Seems pretty good, but probably not for 3D models.

fire commented on May 28, 2024

I can't use ShapeNet, but I'm sure we can find 10 classes of 100 models, like ShapeNet's, in that 10 million dataset.

MarcusLoppe commented on May 28, 2024

I can't use ShapeNet, but I'm sure we can find 10 classes of 100 models, like ShapeNet's, in that 10 million dataset.

I think it's fine; there are many free sources, though the trouble might be finding a dataset with descriptions.
But that is in the future; I think someone can get access to ShapeNet.
The bigger issue is the GPU bill, but Phil/lucidrains might be able to improve the models so much that the training time goes down dramatically.

After the model is trained, inference will be a big issue for users: if it's going to generate complex 3D models, it might not work on consumer hardware. But the recent performance boost is a good sign that performance and efficiency are on the right track.

https://github.com/timzhang642/3D-Machine-Learning#3d_models

fire commented on May 28, 2024

I want to mention: getting the indices in the right order, and making sure the faces fit in the box and aren't inside out, are problems too.

If you're interested in training the head, it's in the dataset (image). I can't get the autoencoder below 0.5 loss.

fire commented on May 28, 2024

I was using the wrong strategy. You were using many identical copies of the mesh plus some augmentations; I was doing the opposite.

MarcusLoppe commented on May 28, 2024

I was using the wrong strategy. You were using many identical copies of the mesh plus some augmentations; I was doing the opposite.

I might have worded that badly, but no, I'm using the same model without any augmentations.
But train for 10-20 epochs @ 2000 items per dataset and let me know.
Kaggle has some awesome free GPUs.

fire commented on May 28, 2024

Here's how I interpreted it:

  1. model * multiple
  2. model * multiple * augments

You were doing 2000 (same) * 1 * 1.

I was trying 1 * 2000 (augmented) * 1.

Thanks for telling me! I'm trying your suggestion.

MarcusLoppe commented on May 28, 2024

Here's how I interpreted it:

1. model * multiple

2. model * multiple * augments

You were doing 2000 (same) * 1 * 1.

I was trying 1 * 2000 (augmented) * 1.

Thanks for telling me! I'm trying your suggestion.

No problem. I posted this in another issue, but I think this might help you: according to the paper, they sort the vertices in z-y-x order, then sort the faces by their lowest vertex index; see the sketch below.
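
A minimal numpy sketch of that ordering, assuming vertices is a (V, 3) array with columns x, y, z and faces is an (F, 3) index array:

import numpy as np

def sort_mesh(vertices, faces):
    # sort vertices by z, then y, then x (np.lexsort treats its last key as primary)
    order = np.lexsort((vertices[:, 0], vertices[:, 1], vertices[:, 2]))
    vertices = vertices[order]

    # remap face indices to the new vertex order
    inverse = np.empty_like(order)
    inverse[order] = np.arange(len(order))
    faces = inverse[faces]

    # sort faces by their lowest vertex index
    faces = faces[np.argsort(faces.min(axis = 1))]
    return vertices, faces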

Also, I'm currently training on about 6 3D mesh chairs. Each chair has 1500 examples: 3 augmentation versions at 500 examples each, so each 3D mesh file has a total of 500 x 3 = 1500 examples.

The total is 12,000 examples.

To give you some idea of why you need to train for 2 days on two A100s, watch how slow the progress is (33 minutes in):


Epoch 1/20: 100%|██████████| 1125/1125 [03:29<00:00,  5.38it/s, loss=0.296]
Epoch 1 average loss: 0.7889469708336724
Epoch 2/20: 100%|██████████| 1125/1125 [03:23<00:00,  5.52it/s, loss=0.307]
Epoch 2 average loss: 0.29623086002137927
Epoch 3/20: 100%|██████████| 1125/1125 [03:23<00:00,  5.54it/s, loss=0.28] 
Epoch 3 average loss: 0.2731376721594069
Epoch 4/20: 100%|██████████| 1125/1125 [03:22<00:00,  5.54it/s, loss=0.248]
Epoch 4 average loss: 0.25995001827345954
Epoch 5/20: 100%|██████████| 1125/1125 [03:23<00:00,  5.54it/s, loss=0.239]
Epoch 5 average loss: 0.251056260228157
Epoch 6/20: 100%|██████████| 1125/1125 [03:23<00:00,  5.53it/s, loss=0.217]
Epoch 6 average loss: 0.24529405222998726
Epoch 7/20: 100%|██████████| 1125/1125 [03:23<00:00,  5.54it/s, loss=0.227]
Epoch 7 average loss: 0.24055371418264176
Epoch 8/20: 100%|██████████| 1125/1125 [03:22<00:00,  5.54it/s, loss=0.221]
Epoch 8 average loss: 0.23791699058479732
Epoch 9/20: 100%|██████████| 1125/1125 [03:23<00:00,  5.54it/s, loss=0.245]
Epoch 9 average loss: 0.23742892943488228
Epoch 10/20: 100%|██████████| 1125/1125 [03:23<00:00,  5.54it/s, loss=0.208]
Epoch 10 average loss: 0.23614923742082383
Epoch 11/20: 100%|██████████| 1125/1125 [03:23<00:00,  5.53it/s, loss=0.219]
Epoch 11 average loss: 0.23556399891111585

fire commented on May 28, 2024

#11 (comment) was the verification of z-y-x order and sorting the faces by their lowest vertex index. Note that I am using the convention that gives me that result, like Y-Z-X, but it follows their requirement of being sorted vertically.

fire commented on May 28, 2024

@MarcusLoppe on your branch, can you add a feature where the first quit saves and the second quit exits? Then we can restart from a checkpoint.

MarcusLoppe commented on May 28, 2024

#11 (comment) was the verification of z-y-x order and sorting the faces by their lowest vertex index. Note that I am using the convention that gives me that result, like Y-Z-X, but it follows their requirement of being sorted vertically.

Oh, great :)
I'm currently testing whether it helps to keep 50% of the 3D mesh examples full and step the face count on the rest, e.g. from 0 to max(faces). My idea is that when generating the 3D mesh, the embedder might freak out since it has never seen an input graph that isn't full. I'll let you know how it goes.

One other tip might be to normalize the size and set everything on the ground.
If I'm correct, the snippet below will scale the vertices so the largest absolute value is 1, then set everything on the ground.

I'm limiting the size since I'm currently training on a few different chairs, and some of the chairs were huge like a building while others were "normal" size.

import numpy as np

# vertices: (N, 3) array; scale so the largest absolute coordinate is 1
max_abs = np.max(np.abs(vertices))
vertices = vertices / max_abs

# drop the mesh so its lowest point sits on the ground plane (y = 0)
min_y = np.min(vertices[:, 1])
vertices[:, 1] -= min_y

@MarcusLoppe on your branch, can you add a feature where the first quit saves and the second quit exits? Then we can restart from a checkpoint.

I don't understand, can you clarify?

fire commented on May 28, 2024

image

This is my current result.

I'll retype the last message in a bit.

output.log See also https://wandb.ai/ernest-lee/meshgpt-pytorch/runs/2fkwahjc/overview

MarcusLoppe commented on May 28, 2024

image

This is my current result.

I'll retype the last message in a bit.

output.log See also https://wandb.ai/ernest-lee/meshgpt-pytorch/runs/2fkwahjc/overview

I see that the dataset size is 10. For training efficiency I just duplicate the one model 2000 times, since I think it trains faster when dealing with bigger loads.
Since you are using a 3090, you can probably raise the batch size to 8 or 16. The only reason I had the batch size at 1 or 4 was VRAM constraints, but the encoder & transformer are now much more memory efficient.

The learning rate seems a bit high; for the encoder I used 1e-3 (0.001) and for the transformer I used 1e-2 (0.01).
When the loss becomes quite low for the transformer, you can try a lower learning rate such as 1e-3.

fire commented on May 28, 2024

I see that the dataset size is 10. For training efficiency I just duplicate the one model 2000 times, since I think it trains faster when dealing with bigger loads.

Instead of duplicating the model, I multiply the epochs by n, but according to the graph the training flattens, so I stop early.

fire commented on May 28, 2024

image

I broke the counter-clockwise triangle order, but it's invisible in this shot.

MarcusLoppe commented on May 28, 2024

https://wandb.ai/ernest-lee/meshgpt-pytorch/runs/9b8k9mfc/overview?workspace=user-ernest-lee

I have some bugs, but this is really promising.

I had to recode my face-index ascending-order regularization strategy.

The clipped ears are from the meshgpt output.

That seems very good. I see that you increased num_discrete_coors to 256; did that help? Seems like that would smooth out the errors / give it a higher error margin, so even if it's wrong it looks smoother.

What kind of augmentation are you doing? Are you applying all the augmentations, including rotation?
I'm a bit unsure about the rotation one, since neither MeshGPT nor PolyGen mentions it, only scaling & jitter.

Is there any reason why you are adding 2 extra tokens as padding?

seq_len = dataset.get_max_face_count() * 3
seq_len = ((seq_len + 2) // 3) * 3

fire commented on May 28, 2024

Here are my current augmentations. It's in the git:

image

import random

import numpy as np
import torch
from scipy.spatial.transform import Rotation as R

def augment_mesh(self, base_mesh, augment_count, augment_idx):
    # Set the random seed for reproducibility
    random.seed(self.seed + augment_count * augment_idx + augment_idx)

    # Generate a random scale factor
    scale = random.uniform(0.8, 1)

    vertices = base_mesh[0]

    # Calculate the centroid of the object
    centroid = [
        sum(vertex[i] for vertex in vertices) / len(vertices) for i in range(3)
    ]

    # Translate the vertices so that the centroid is at the origin
    translated_vertices = [[v[i] - centroid[i] for i in range(3)] for v in vertices]

    # Scale the translated vertices
    scaled_vertices = [
        [v[i] * scale for i in range(3)] for v in translated_vertices
    ]

    # Generate a random rotation matrix
    rotation = R.from_euler("y", random.uniform(-180, 180), degrees=True)

    # Apply the transformations to each vertex of the object
    new_vertices = [
        (np.dot(rotation.as_matrix(), np.array(v))).tolist()
        for v in scaled_vertices
    ]

    # Translate the vertices back so that the centroid is at its original position
    final_vertices = [[v[i] + centroid[i] for i in range(3)] for v in new_vertices]

    # Normalize uniformly to fill [-1, 1]
    min_vals = np.min(final_vertices, axis=0)
    max_vals = np.max(final_vertices, axis=0)

    # Calculate the maximum absolute value among all vertices
    max_abs_val = max(np.max(np.abs(min_vals)), np.max(np.abs(max_vals)))

    # Calculate the scale factor as the reciprocal of the maximum absolute value
    scale_factor = 1 / max_abs_val if max_abs_val != 0 else 1

    # Apply the normalization
    final_vertices = [
        [(component - c) * scale_factor for component, c in zip(v, centroid)]
        for v in final_vertices
    ]

    return (
        torch.from_numpy(np.array(final_vertices, dtype=np.float32)),
        base_mesh[1],
    )

fire commented on May 28, 2024

Is there any reason why you are adding 2 extra tokens as padding?

The generated token sequence length needs to be a multiple of 3.

fire commented on May 28, 2024

I see that you increased the num_discrete_coors to 256

To be honest, I think this only affects the quantization loss from discretizing the mesh vertex positions.

I don't think it matters, but I haven't tested it.

fire commented on May 28, 2024

image

https://wandb.ai/ernest-lee/meshgpt-pytorch/runs/dn4mqfoj/overview?workspace=user-ernest-lee [Edited]

phone.zip

MarcusLoppe commented on May 28, 2024

Is there any reason why you are adding 2 extra tokens as padding?

The generated token sequence length needs to be a multiple of 3.

It should be 6 since 1 face = 6 tokens.
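
So with 3 vertices per face and 2 quantized codes per vertex (the setting used in this thread), the sequence length is already a multiple of 6 without extra padding; a sketch, where dataset_items is the list fed to MeshDataset above:

TOKENS_PER_FACE = 6  # 3 vertices x 2 codes each

max_faces = max(len(d["faces"]) for d in dataset_items)
max_seq_len = max_faces * TOKENS_PER_FACE  # already a multiple of 6, no rounding needed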

I see that you increased the num_discrete_coors to 256

To be honest, I think this only affects the quantization loss from discretizing the mesh vertex positions.

I don't think it matters, but I haven't tested it.

It should make it smoother: if it guesses the wrong class, the step between classes with 256 bins is half that with 128 (e.g. 0.10 vs 0.20), so the error will be less visible.
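
Illustrating that margin: the coordinates are discretized over a fixed range, so doubling num_discrete_coors halves the bin width and a wrong-by-one guess lands half as far away. Assuming a [-1, 1] range:

lo, hi = -1.0, 1.0
for bins in (128, 256):
    step = (hi - lo) / bins
    print(f"{bins} bins -> step {step:.4f}")  # 128 -> 0.0156, 256 -> 0.0078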

MarcusLoppe commented on May 28, 2024

image

https://wandb.ai/ernest-lee/meshgpt-pytorch/runs/dn4mqfoj/overview?workspace=user-ernest-lee [Edited]

phone.zip

Training a single mesh seems to be going pretty well / basically solved. Have you tried using the texts & multiple meshes?
Try with just 2-3 meshes and see how it goes; it's very slow to train the transformer with more than one mesh.

I'm guessing you resolved the issue with the mesh getting cut off? I just scale it to fit -0.95 to +0.95; there seem to be some issues when the mesh goes above 1.0.

Also, I was granted access to the ShapeNet v2 dataset on Hugging Face; you can probably get access as well.

fire commented on May 28, 2024

I was able to train the transformer to use 1172 faces.

mesh_transforms_humanoid_avatar.zip

image

I respect the MIT, Apache-2 and CC-BY licenses, and so have a reason not to use ShapeNet.

V-Sekai-fire@a416837

https://wandb.ai/ernest-lee/meshgpt-pytorch/runs/rp8nbw7w?workspace=user-ernest-lee Some logs.

Duration: 1h 13m 26s

upbeat-waterfall-437-618dbfb6d54f78d191f293a55a0c9e7a41147541.json

fire commented on May 28, 2024

Training a single mesh seems to be going pretty well / basically solved. Have you tried using the texts & multiple meshes?
Try with just 2-3 meshes and see how it goes; it's very slow to train the transformer with more than one mesh.

I want to do that after a break. Any suggestions? I was thinking of having one human in multiple poses, but different objects are doable too.

MarcusLoppe commented on May 28, 2024

I was able to train the transformer to use 1172 faces.

mesh_transforms_humanoid_avatar.zip

image

I respect the MIT, Apache-2 and CC-BY licenses, and so have a reason not to use ShapeNet.

V-Sekai-fire@a416837

https://wandb.ai/ernest-lee/meshgpt-pytorch/runs/rp8nbw7w?workspace=user-ernest-lee Some logs.

Duration: 1h 13m 26s

upbeat-waterfall-437-618dbfb6d54f78d191f293a55a0c9e7a41147541.json

I think it's fine to train while testing, since it's not for any commercial purpose but pure testing that won't be touched by anyone else.

One benefit of using ShapeNet is that they have nice labels, not just categories like "chair". Examples:
"name": "easy chair,lounge chair,overstuffed chair",
"name": "water faucet,water tap,tap,hydrant",
"name": "ladder-back,ladder-back chair",

I want to do that after a break. Any suggestions? I was thinking of having one human in multiple poses, but different objects are doable too.

Yes, use very low-face-count meshes, since conditioning on text makes the training much harder.

Using a dataset of 2 chairs with 5000 examples (2 meshes, 5 augmentations x 500),
I got the encoder to 0.2 loss after 2 epochs, but the transformer is at 0.001695 loss after 40 epochs, which has taken 2 hours.

fire commented on May 28, 2024

@MarcusLoppe I'm pretty sure you can use BLIP to caption photos of the mesh, so that's not a blocker. https://replicate.com/gfodor/instructblip

fire commented on May 28, 2024

Someone wanted me to try https://www.kenney.nl/assets/castle-kit. So I'll need to generate labels for them, but it should work.

image

MarcusLoppe commented on May 28, 2024

@MarcusLoppe I'm pretty sure you can use BLIP to caption photos of the mesh, so that's not a blocker. https://replicate.com/gfodor/instructblip

Well, the downside is that you'd use Blender to take a screenshot with a default camera, and since models vary in orientation/vertical axis you might take a snapshot of the back/bottom of the object.
Why complicate it? :)

Someone wanted me to try https://www.kenney.nl/assets/castle-kit. So I'll need to generate labels for them, but it should work.

Try walking before running :) I've been trying to tell you that you need a massive amount of data and training time to actually create a good enough model for that.
Currently you have been overfitting a model on a very small sample of data. The harder part is creating a general model that can generate general items.

I've been successful at overfitting it using text + 1 single model for around 40 epochs at 2000 examples per epoch.
If I use two models that are the same type of object, e.g. chair, it fails massively.

If you want to give it a go, use only the models with fewer than 500-600 faces, then create 10-20 augmentations per model and duplicate each variation 200 times.
If you train on 40 objects, that's 10 x 200 x 40 = 80,000 examples per dataset.

Then train on this for a day or two and then try to generate using the texts.

In the PolyGen and MeshGPT papers they stress that they didn't have enough training data and used only 28,000 mesh models.
They needed to augment those with, let's say, 20 augmentations, which means they trained on 560,000 mesh models.
Since they only did autocomplete, that makes generation much easier than using texts.

In the paper they used 28,000 3D models. Say they generated 10 augmentations per model and then used 10 duplicates, since it's more effective to train with a big batch size of 64; with a small number of models per dataset it won't train effectively and you'll waste GPU parallelism.
This means 10 x 10 = 100 examples per model, and 100 x 28,000 = 2,800,000.

I want to stress this:
Overfitting a model = super easy.
Training a model to be general enough for many different models = hard.

fire commented on May 28, 2024

From @MarcusLoppe

But since it seems like you are not using the texts, you can try feeding the transformer a prompt of 10-30 connected faces of a model and see what happens (like in the paper); it should act as an autocomplete.

fire commented on May 28, 2024

@MarcusLoppe what sort of limits are you getting on your triangle count? I think mine is around 1349 triangles per mesh.

MarcusLoppe commented on May 28, 2024

@MarcusLoppe what sort of limits are you getting on your triangle count? I think mine is around 1349 triangles per mesh.

I haven't bothered with such large meshes due to hardware constraints.
What limits are you talking about? If you are running out of VRAM while training, lower the batch size.

fire commented on May 28, 2024

image

Does anyone know what the meshgpt paper means by jitter below Hausdorff?

fire commented on May 28, 2024

I had an avatar https://booth.pm/en/items/4861008 that I wanted to use, so I was trying to optimize it. (mirror: https://github.com/V-Sekai-fire/SK_faavrs_breadbread)

It was around 15,023 triangles, and I don't think it's reasonable for people to pay for 48 GB GPUs.

MarcusLoppe commented on May 28, 2024

I had an avatar https://booth.pm/en/items/4861008 that I wanted to use, so I was trying to optimize it. (mirror: https://github.com/V-Sekai-fire/SK_faavrs_breadbread)

It was around 15,023 triangles, and I don't think it's reasonable for people to pay for 48 GB GPUs.

@fire
I'm guessing you never made it to the transformer stage and it crashes during encoder training?

Did you load the models without training them and try to generate a model, to see what the inference VRAM requirement was?

@lucidrains
Since most meshes contain a lot of triangles, and each time the autoencoder embeds the mesh data the whole mesh is embedded at once (not sure if the ResNet does the same thing), this creates a lot of VRAM usage; currently it seems like it's zero-shotting the entire mesh.
That doesn't seem very smart or efficient. Maybe a better idea is to provide faces but include extra connected faces as padding so it can understand the overall shape? Or maybe figure out a good way of summarizing the rest of the faces in a memory-efficient way.
