Comments (4)
Thanks for pointing this out! Because this is still a draft technical report, we have many details to update in the next few days.
First, in this work, all results for "prompt tuning", "P-tuning", "P-tuning v2", and "Multitask P-tuning v2" are obtained by freezing the transformer parameters and tuning only the continuous prompts. The ratios of task-specific parameters (e.g., 0.1%) are derived by comparing the number of continuous-prompt parameters with the number of pre-trained transformer parameters. Only the "fine-tuning" results are obtained by tuning the transformer parameters (without using continuous prompts).
Second, the parameters actually tuned are the prefix token embeddings in the input of every transformer layer. In vanilla P-tuning (v1), we tuned continuous prompts only for the first transformer layer; in P-tuning v2, the embeddings at certain positions (e.g., at the beginning of the sequence) in the input embedding sequence of every transformer layer are tuned. We are not using an MLP for reparameterization in this work, just pure embeddings.
from p-tuning-v2.
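A minimal PyTorch sketch of the per-layer scheme described above (class and variable names are illustrative, not taken from the actual repo; freezing the backbone would happen separately):

```python
import torch
import torch.nn as nn

class PrefixEncoder(nn.Module):
    """Trainable prefix embeddings for every transformer layer
    (pure embeddings, no MLP reparameterization)."""
    def __init__(self, num_layers: int, prefix_len: int, hidden_size: int):
        super().__init__()
        # One independent embedding table per layer: the prefix states fed
        # into layer i are NOT computed from layer i-1's output.
        self.prefix = nn.Parameter(
            torch.randn(num_layers, prefix_len, hidden_size) * 0.02
        )

    def forward(self, layer_idx: int, batch_size: int) -> torch.Tensor:
        # Returns (batch, prefix_len, hidden), to be prepended to the
        # input of transformer layer `layer_idx`.
        return self.prefix[layer_idx].unsqueeze(0).expand(batch_size, -1, -1)

# Example at roughly BERT-large scale (sizes are assumptions):
enc = PrefixEncoder(num_layers=24, prefix_len=50, hidden_size=1024)
```

During training, only `enc.prefix` would receive gradients; every backbone parameter would have `requires_grad=False`.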
So that means for prefix tokens, the intermediate layer input is not obtained from the previous layer; it's just some learned latent embeddings? And in the case of a 24-layer model and a 50-token prefix, you tune 50 * 24 = 1200 embeddings instead of 50, right?
Yes, your understanding is correct.
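To make the parameter bookkeeping above concrete, a small sketch (the hidden size and backbone parameter count are illustrative assumptions, roughly BERT-large scale):

```python
# Parameter bookkeeping for per-layer prefix tuning (illustrative numbers).
num_layers = 24        # transformer layers (e.g., BERT-large)
prefix_len = 50        # prefix tokens per layer
hidden_size = 1024     # embedding dimension (assumed)

# One independent embedding per (layer, prefix position): these are not
# propagated from the previous layer, just learned latent vectors.
tuned_embeddings = num_layers * prefix_len
tuned_params = tuned_embeddings * hidden_size

backbone_params = 335_000_000  # ~BERT-large parameter count (approximate)
ratio = tuned_params / backbone_params

print(tuned_embeddings)   # -> 1200
print(f"{ratio:.2%}")     # -> 0.37% (well under 1% of the backbone)
```

The exact ratio depends on the prefix length and backbone size; shorter prefixes bring it down toward the ~0.1% range quoted in the report.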
Thank you! That helped a lot.
Related Issues (20)
- Where is the code that implements P-tuning v2's learnable parameters added at every layer? (NLP beginner, thanks!)
- ReCoRD dataset
- Why do the fine-tuning results fail to converge?
- Can anyone provide datasets for the downstream task of keyword extraction?
- Question: does MPT-2 train an independent linear classifier for each task?
- What is the difference from prefix tuning?
- Error when using the fine-tuned model
- AttributeError: 'ExponentialTrainer' object has no attribute 'deepspeed'
- How to load a local model
- Fine-tuning
- Is it possible to use p-tuning v2 during inference without causing any impact on the backbone model's performance?
- The F1 score reproduced with P-tuning v2 on the CoNLL-2003 NER task is one point lower than reported in the paper, with hyperparameters set as in the shell scripts. Has anyone run into a similar problem?
- The model still loses many of its original capabilities
- About prompt depth
- Run scripts for GLUE tasks
- Are the dropout layers in the model still active during training?
- Is P-tuning v2 suitable for classification tasks?
- How to use the modified P-tuning v2 with chatglm3-6b?
- Questions about the GLUE, NER, SRL, QA, and SuperGLUE task sets
- What are the main contributions of p tuning?