
Comments (14)

ThilinaRajapakse commented on May 19, 2024

Yes, that would fix all multiprocessing-related issues, at the expense of not using multiprocessing at all. I think you can get away with setting it to 2 on Colab: 2 should speed things up a bit, while 1 guarantees that you won't hit multiprocessing-related errors.
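
For reference, a defensive way to choose the value (a minimal sketch; the safe_process_count helper is hypothetical, not part of the repo):

from multiprocessing import cpu_count

# Prefer 2 workers, but never request more than the available cores
# and never fewer than 1.
def safe_process_count(preferred=2):
    return max(1, min(preferred, cpu_count()))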


ThilinaRajapakse commented on May 19, 2024

I thought I fixed this, sorry.

In this function definition, change the default value of process_count to 1 for Google Colab (Colab has 1 vCPU, if I remember correctly).

Edit: Couldn't reproduce the error when I ran the notebook. The Colab notebook specifies a process count of 2 when calling convert_examples_to_features()


Magpi007 commented on May 19, 2024

So this change is only for Colab? If I run it on my local laptop, can I use the version that is in the repo?


ThilinaRajapakse commented on May 19, 2024

Yes, the local version will work fine. By default, process_count is set to the number of available CPU cores minus 2. A modern computer will certainly have more than 2 cores, so that's fine. But on Colab the core count is 1, which makes process_count equal to -1, and that throws the error because at least 1 process is needed.

Edit: The cpu_count on Colab is two, and the notebook is configured to use 2 as the process_count.
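
One way to make the default robust on low-core machines (a sketch, not the code currently in the repo) is to clamp it before it reaches Pool():

from multiprocessing import cpu_count

# With cpu_count() == 1 or 2 this yields 1 instead of -1 or 0, so Pool()
# always receives a valid worker count.
DEFAULT_PROCESS_COUNT = max(1, cpu_count() - 2)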


ThilinaRajapakse commented on May 19, 2024

I ran the Colab notebook again and it works without issues. The change above is unnecessary.


Magpi007 commented on May 19, 2024

Hmm, what could it be then? I just ran it again and got the same error. These are the resources that I have allocated:

(screenshot: allocated Colab resources)


ThilinaRajapakse commented on May 19, 2024

Try setting process_count to 1 in the call to convert_examples_to_features() inside the load_and_cache_examples() function.

features = convert_examples_to_features(examples, label_list, args['max_seq_length'], tokenizer, output_mode,
            cls_token_at_end=bool(args['model_type'] in ['xlnet']),            # xlnet has a cls token at the end
            cls_token=tokenizer.cls_token,
            sep_token=tokenizer.sep_token,
            cls_token_segment_id=2 if args['model_type'] in ['xlnet'] else 0,
            pad_on_left=bool(args['model_type'] in ['xlnet']),                 # pad on the left for xlnet
            pad_token_segment_id=4 if args['model_type'] in ['xlnet'] else 0,
            process_count=1)


Magpi007 commented on May 19, 2024

I got this error when I changed it:

(screenshot: the error after the change)

Anyway, let me review the code; I have been away from it for the last few days, so I want to check that I have followed all the steps correctly.


ThilinaRajapakse commented on May 19, 2024

Are you using a local copy (local to your Google Drive, that is)? I think this bug was in the original notebook but was fixed later. The function (with the imports it relies on) should look like this:

from multiprocessing import Pool, cpu_count

from tqdm import tqdm


def convert_examples_to_features(examples, label_list, max_seq_length,
                                 tokenizer, output_mode,
                                 cls_token_at_end=False, pad_on_left=False,
                                 cls_token='[CLS]', sep_token='[SEP]', pad_token=0,
                                 sequence_a_segment_id=0, sequence_b_segment_id=1,
                                 cls_token_segment_id=1, pad_token_segment_id=0,
                                 mask_padding_with_zero=True,
                                 process_count=cpu_count() - 2):
    """ Loads a data file into a list of `InputBatch`s.
        `cls_token_at_end` defines the location of the CLS token:
            - False (default, BERT/XLM pattern): [CLS] + A + [SEP] + B + [SEP]
            - True (XLNet/GPT pattern): A + [SEP] + B + [SEP] + [CLS]
        `cls_token_segment_id` defines the segment id associated with the CLS token (0 for BERT, 2 for XLNet)
    """

    label_map = {label: i for i, label in enumerate(label_list)}

    # Bundle each example with the shared conversion arguments so a single
    # tuple can be handed to the worker function.
    examples = [(example, label_map, max_seq_length, tokenizer, output_mode, cls_token_at_end, cls_token, sep_token, cls_token_segment_id, pad_on_left, pad_token_segment_id) for example in examples]

    # Convert the examples in parallel; tqdm reports progress as results arrive.
    # convert_example_to_feature is defined elsewhere in utils.py.
    with Pool(process_count) as p:
        features = list(tqdm(p.imap(convert_example_to_feature, examples, chunksize=100), total=len(examples)))

    return features


Magpi007 commented on May 19, 2024

Yeah, maybe that's it. I know it's better to fork your repo so I get the updates/fixes instantly, but I like to understand the code first by recreating it in my own notebook. I am using Colab linked to Google Drive. I will check that and let you know. Thanks.


ThilinaRajapakse commented on May 19, 2024

Understandable! Let me know how it goes.


Magpi007 commented on May 19, 2024

With a fresh head it's easier to see things clearly. There were two things that I changed that made it work:

  • When making the dataframes BERT-friendly, I didn't pass the parameter columns=train_df_bert.columns to the to_csv function (see the sketch below).
  • In the load_and_cache_examples function, I was not including the Colab add-in parameter undersample_scale_factor=0.1.

Maybe the first point was the one causing the error? Anyway, sorry for my slip; I will keep iterating and let you know if I spot any suspicious bugs.
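
For reference, a minimal sketch of the first point (the toy dataframe stands in for the notebook's train_df_bert, and the column layout follows the tutorial's BERT format as I remember it):

import pandas as pd

# Toy stand-in for the notebook's train_df_bert dataframe.
train_df_bert = pd.DataFrame({
    'id': [0, 1],
    'label': [1, 0],
    'alpha': ['a', 'a'],
    'text': ['first example', 'second example'],
})

# Passing columns= explicitly keeps the column order stable in the TSV;
# this is the argument I had left out.
train_df_bert.to_csv('train.tsv', sep='\t', index=False, header=False,
                     columns=train_df_bert.columns)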


ThilinaRajapakse commented on May 19, 2024

Weird. Neither of those things should be throwing a "number of processes" error as far as I can tell. That error comes from the multiprocessing used for converting examples to features. Oh well, we don't need to worry about it if it's working!


Magpi007 commented on May 19, 2024

I was facing this problem again in another iteration, and I changed this:

process_count = cpu_count() - 2

for this

process_count = 1

in the convert_examples_to_features function of the utils.py file, and it worked. I am working on Colab. Does that make sense to you?
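
A quick check seems to confirm it (minimal sketch):

from multiprocessing import cpu_count

# If this prints 1 or 2, then cpu_count() - 2 is 0 or negative, and
# multiprocessing.Pool rejects both.
print(cpu_count(), cpu_count() - 2)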
