Comments (3)
You can just provide an all-zero input vector for the GO annotations - this tells the model that you are not providing GO annotations and that it has to infer them on its own. In fact, that is also what we did when fine-tuning and testing the model, so it's not an issue. ProteinBERT treats GO annotations as a bonus; they're not really necessary.
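As a minimal sketch of what "all-zero input vector" means here (the exact encoding API and annotation-vocabulary size may differ by ProteinBERT version; the numbers below are illustrative):

```python
import numpy as np

n_seqs = 4            # number of protein sequences in the batch (example value)
n_annotations = 8943  # size of the GO-annotation vocabulary (illustrative)

# All-zero annotation matrix: signals that no GO annotations are provided,
# so the model must infer them on its own.
go_annotations = np.zeros((n_seqs, n_annotations), dtype=np.float32)

print(go_annotations.shape)  # (4, 8943)
print(go_annotations.any())  # False: no annotation bit is set
```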
from protein_bert.
Thank you for your explanation. I have another question about my own protein-sequence task: I want to measure the effect of pre-training on the results. How do I directly train an end-to-end model without pre-training?
If you want a non-pretrained model, then instead of using load_pretrained_model you can do something along these lines:

from proteinbert import PretrainingModelGenerator, FinetuningModelGenerator, InputEncoder
from proteinbert import conv_and_global_attention_model

# As long as you don't plan to provide annotations as input, it doesn't really matter what number you choose here.
n_annotations = 1
output_spec = ...

# A pretraining model generator with randomly initialized (uninitialized) weights,
# wrapped by a fine-tuning generator for your task.
uninitialized_pretraining_model_generator = PretrainingModelGenerator(conv_and_global_attention_model.create_model, n_annotations)
model_generator = FinetuningModelGenerator(uninitialized_pretraining_model_generator, output_spec)
input_encoder = InputEncoder(n_annotations)
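To complete the picture, here is a hedged sketch of how the generator above might then be trained end-to-end, following the style of ProteinBERT's demo notebook. The `OutputType`/`OutputSpec` constructors and the `finetune` helper are assumed from that notebook; `train_seqs`, `train_labels`, `valid_seqs`, and `valid_labels` are placeholders for your own dataset.

```python
# Sketch only: assumes the proteinbert package's finetune() helper as used in
# the project's demo notebook. train_seqs / train_labels / valid_seqs /
# valid_labels are placeholders for your own data.
from proteinbert import OutputType, OutputSpec, finetune

# Example: a binary, per-sequence classification task
output_spec = OutputSpec(OutputType(False, 'binary'), [0, 1])

finetune(
    model_generator, input_encoder, output_spec,
    train_seqs, train_labels, valid_seqs, valid_labels,
    seq_len=512, batch_size=32, max_epochs_per_stage=10, lr=1e-4,
    # No pretrained weights exist here, so there is nothing to freeze:
    # train all layers from the first epoch.
    begin_with_frozen_pretrained_layers=False,
)
```

With a randomly initialized model there is no benefit to the usual frozen-then-unfrozen training schedule, which is why the sketch disables it.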