
Comments (4)

2b commented on August 9, 2024

Thank you for the detailed response!

From the name of your pool I can guess that you're probably using ZFS encryption?

Correct. The ZFS pool and the encrypted dataset were created with these commands:

zpool create -f -o ashift=12 -o autoreplace=on rpool mirror sdc sdd

zfs create -o encryption=aes-256-gcm -o keyformat=passphrase rpool/encrypted

So I think this issue is a manifestation of a race condition, when the provider is trying to get information of a ZFS volume after VM creation, but the volume is not fully provisioned yet and does not have a size.

I don't think so: applying the Terraform configuration again throws the same error, even though the volume is already provisioned.

Could you share some details about your ZFS setup, so we can validate these assumptions?
Is the error always reproducible in your environment?

Yes, the error is always reproducible with each version of the provider >= 0.43.1.

The command pvesh get /nodes/pve-srv2/storage/ENC-STORAGE/content/vm-102-disk-0

gives the same error: volume_size_info on 'ENC-STORAGE:vm-102-disk-0' failed

I've tested the same pvesh get command on another PVE cluster with a non-encrypted ZFS dataset, and it works as expected. The PVE version on that cluster is different, though (8.0.3 vs. 7.2-5 on the one with encrypted datasets).

Unfortunately, both are production clusters, and I can't create any new pools, but I've traced the pvesh Perl script and found that it calls a function from the file /usr/share/perl5/PVE/Storage/ZFSPoolPlugin.pm just before throwing the error. So I replaced that file with the one from the 8.0.3 cluster, and pvesh no longer returns an error.

Finally, I found a commit related to this issue, so this has nothing to do with your provider. Sorry for wasting your time, and thank you for such a feature-rich Terraform provider!

from terraform-provider-proxmox.

bpg commented on August 9, 2024

Good investigation, thank you @2b

The only related change in 0.43.1 is #862, and that might've introduced a regression.
Will take a look.

bpg commented on August 9, 2024

Okay, I wasn't able to reproduce it on a simple mirror zpool, so there could be something specific to your ZFS configuration at play.

root@pve:~# zpool list -v
NAME                                       SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
tank                                      63.5G  2.47G  61.0G        -         -     0%     3%  1.00x    ONLINE  -
  mirror-0                                63.5G  2.47G  61.0G        -         -     0%  3.89%      -    ONLINE
    scsi-0QEMU_QEMU_HARDDISK_drive-scsi1  64.0G      -      -        -         -      -      -      -    ONLINE
    scsi-0QEMU_QEMU_HARDDISK_drive-scsi2  64.0G      -      -        -         -      -      -      -    ONLINE
root@pve:~# zpool status
  pool: tank
 state: ONLINE
config:

	NAME                                      STATE     READ WRITE CKSUM
	tank                                      ONLINE       0     0     0
	  mirror-0                                ONLINE       0     0     0
	    scsi-0QEMU_QEMU_HARDDISK_drive-scsi1  ONLINE       0     0     0
	    scsi-0QEMU_QEMU_HARDDISK_drive-scsi2  ONLINE       0     0     0

errors: No known data errors
root@pve:~# zfs list -t volume
NAME                 USED  AVAIL  REFER  MOUNTPOINT
tank/vm-101-disk-0  20.3G  59.1G  2.47G  -
root@pve:~# pvesh get /nodes/pve/storage/tank/content/vm-101-disk-0
┌────────┬──────────────────────────────┐
│ key    │ value                        │
╞════════╪══════════════════════════════╡
│ format │ raw                          │
├────────┼──────────────────────────────┤
│ path   │ /dev/zvol/tank/vm-101-disk-0 │
├────────┼──────────────────────────────┤
│ size   │ 20.00 GiB                    │
├────────┼──────────────────────────────┤
│ used   │ 2.47 GiB                     │
└────────┴──────────────────────────────┘

The error is returned from the PVE API request to get the VM volume details. In your environment, it is equivalent to this CLI command:

pvesh get /nodes/<node name>/storage/ENC-STORAGE/content/vm-102-disk-0

The previous version of the provider was using a slightly different call to get a list of all volumes from the storage in one shot:

pvesh get /nodes/<node name>/storage/ENC-STORAGE/content
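
To tell whether only the per-volume call fails while the list call still works, the two commands above can be run side by side. The following is a minimal sketch, not part of the provider: the function names check_endpoint and compare_volume_endpoints are hypothetical, and the node, storage, and volume names are placeholders to replace with your own. It only relies on pvesh get and its --output-format option.

```shell
#!/bin/sh
# Sketch: compare the per-volume and list storage endpoints.
# Function names are illustrative; they are not part of PVE or the provider.

# Print "ok <path>" or "failed <path>" depending on the exit status of pvesh.
check_endpoint() {
    if pvesh get "$1" --output-format json >/dev/null 2>&1; then
        printf '%s %s\n' "ok" "$1"
    else
        printf '%s %s\n' "failed" "$1"
    fi
}

# $1 = node name, $2 = storage name, $3 = volume name (all placeholders).
compare_volume_endpoints() {
    base="/nodes/$1/storage/$2/content"
    check_endpoint "$base/$3"   # per-volume call (details of one volume)
    check_endpoint "$base"      # list call (all volumes in one shot)
}
```

For example, compare_volume_endpoints pve-srv2 ENC-STORAGE vm-102-disk-0 prints one status line per endpoint. If the first line reports "failed" and the second "ok", the problem is limited to the per-volume request.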

From the name of your pool I can guess that you're probably using ZFS encryption? This may slow down ZFS management operations, like volume provisioning. So I think this issue is a manifestation of a race condition, when the provider is trying to get information of a ZFS volume after VM creation, but the volume is not fully provisioned yet and does not have a size.

@2b Could you share some details about your ZFS setup, so we can validate these assumptions?
Is the error always reproducible in your environment?
Would it be possible to add another ZFS storage from a new pool (with default attributes) on your server to see if that makes any difference?

bpg commented on August 9, 2024

I'm glad you were able to find a solution!
