Welcome to my profile! I'm a Ph.D. student at University of Texas Southwestern Medical Center.
- 📫 How to reach me: [email protected]
- Homepage: https://www.yunxiangli.top/
- Google Scholar: https://scholar.google.com/citations?user=evbcKz8AAAAJ
License: Apache License 2.0
The pinned dependency https://github.com/zphang/transformers.git@68d640f7c368bcaaaecfc678f11908ebbd3d6176 in requirements.txt cannot be reached.
I found that a peft module is imported, but it does not look like the peft package from Hugging Face (or I could not find it in any specific package version).
Can you give me some hints on how to import this peft, and its exact version?
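For reference, a minimal way to check which peft is actually importable and whether it carries the fork-specific classes (a sketch assuming a standard pip-installed package; the repo may depend on a patched fork):

import importlib.metadata as md
import peft

print(peft.__file__)                      # filesystem path the module loads from
print(md.version("peft"))                 # installed distribution version
print(hasattr(peft, "BottleneckConfig"))  # False on stock Hugging Face peft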
The README mentions: "We uploaded a larger training data, InstructorDoctor-200k."
Where can we find this dataset file in this repo? Many thanks.
Has anyone encountered this problem?
Unable to load weights from pytorch checkpoint file for './pretrained/pytorch_model-00002-of-00003.bin' at
'./pretrained/pytorch_model-00002-of-00003.bin'. If you tried to load a PyTorch model from a TF 2.0 checkpoint,
please set from_tf=True.
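One quick sanity check (a generic sketch, not the repo's tooling): shards cloned without git-lfs are small text pointer files that begin with "version", which is exactly the case this transformers error path detects.

import os

path = "./pretrained/pytorch_model-00002-of-00003.bin"
print(os.path.getsize(path))  # a git-lfs pointer stub is only ~130 bytes
with open(path, "rb") as f:
    print(f.read(7))          # b'version' means an un-pulled git-lfs pointer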
Hello, I have filled out the form several times, but I do not receive the weight files. Is there something missing here? (I have checked my spam folder.) My email is [email protected]; could you please send me the pre-trained weights? Thanks a lot.
Hi!
Can we train the model from scratch? Do you have plans to release the training code as well without loading pre-trained weights?
Thanx
I'd like to see whether Chinese data works for LLaMA fine-tuning.
Thanks for such a great job. In the paper, there are not many details about how you evaluate model performance. Can we get the evaluation dataset, evaluation rules, and detailed model performance results? Thank you!
Thanks for your work!
Could a check be added when generating output, so that if the confidence is low, or no similar question was seen in the training set, the model declines to answer the question?
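A hedged sketch of one such gate (not an existing ChatDoctor feature): decline to answer when the mean token log-probability of the generated reply falls below a threshold. The threshold value is an arbitrary placeholder.

def generate_with_refusal(model, tokenizer, prompt, threshold=-2.5):
    inputs = tokenizer(prompt, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=128,
                         return_dict_in_generate=True, output_scores=True)
    # per-token log-probabilities of the generated continuation
    logprobs = model.compute_transition_scores(
        out.sequences, out.scores, normalize_logits=True)
    if logprobs.mean().item() < threshold:  # low confidence -> refuse
        return "I am not confident enough to answer this question."
    new_tokens = out.sequences[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)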
As the title says: LLMs generally work very well for English but not as well for Chinese.
So how is the performance on Chinese?
Running python chat.py throws this error:
AttributeError: module transformers has no attribute LLaMATokenizer
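Early LLaMA forks of transformers spelled the classes LLaMATokenizer/LLaMAForCausalLM, while upstream settled on LlamaTokenizer/LlamaForCausalLM, so the attribute depends on which build is installed. A tolerant import, assuming one of the two spellings exists:

import transformers

Tok = (getattr(transformers, "LLaMATokenizer", None)
       or getattr(transformers, "LlamaTokenizer"))
LM = (getattr(transformers, "LLaMAForCausalLM", None)
      or getattr(transformers, "LlamaForCausalLM"))
tokenizer = Tok.from_pretrained("./pretrained/")
model = LM.from_pretrained("./pretrained/")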
It seems there is no code for "utilize conversation demonstrations synthesized via ChatGPT to finetune the LLaMA model".
In the code, I see that you use HealthCareMagic-200k.json, not the "5k generated conversations between patients and physicians from ChatGPT [GenMedGPT-5k]".
How do you utilize the conversation demonstrations synthesized via ChatGPT? Can you show us the code for this?
Hello, the link you provided, https://huggingface.co/spaces/kenton-li/ChatDoctor, does not work. Can you provide a new one?
Error:
(base) hemang@hemang-HP-Pavilion-g6-Notebook-PC:~/Documents/GitHub/ChatDoctor$ python3.11 chat.py
2023-03-30 16:16:25.135057: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-03-30 16:16:26.061195: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Loading ./pretrained/...
/home/hemang/.local/lib/python3.11/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
warnings.warn("Can't initialize NVML")
gpu_count 0
Loading checkpoint shards: 0%| | 0/3 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/hemang/.local/lib/python3.11/site-packages/transformers/modeling_utils.py", line 415, in load_state_dict
    return torch.load(checkpoint_file, map_location="cpu")
  File "/home/hemang/.local/lib/python3.11/site-packages/torch/serialization.py", line 791, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "/home/hemang/.local/lib/python3.11/site-packages/torch/serialization.py", line 271, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "/home/hemang/.local/lib/python3.11/site-packages/torch/serialization.py", line 252, in __init__
    super().__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: './pretrained/pytorch_model-00001-of-00003.bin'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/hemang/Documents/GitHub/ChatDoctor/chat.py", line 43, in <module>
    load_model("./pretrained/")
  File "/home/hemang/Documents/GitHub/ChatDoctor/chat.py", line 28, in load_model
    model = transformers.LLaMAForCausalLM.from_pretrained(
  File "/home/hemang/.local/lib/python3.11/site-packages/transformers/modeling_utils.py", line 2630, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/home/hemang/.local/lib/python3.11/site-packages/transformers/modeling_utils.py", line 2939, in _load_pretrained_model
    state_dict = load_state_dict(shard_file)
  File "/home/hemang/.local/lib/python3.11/site-packages/transformers/modeling_utils.py", line 418, in load_state_dict
    with open(checkpoint_file) as f:
FileNotFoundError: [Errno 2] No such file or directory: './pretrained/pytorch_model-00001-of-00003.bin'
Hi,
Thanks for sharing this interesting repository. I work in HealthTech. The https://huggingface.co/spaces/kenton-li/ChatDoctor link in the readme is not working; I get a 404 error from the Hugging Face website.
Thanks in advance.
Joel
Hi, thank you for this model!
I am trying to build this app and getting this error message:
File "/home/ChatDoctor-main/env_doct/lib/python3.11/site-packages/transformers/modeling_utils.py", line 2939, in _load_pretrained_model
state_dict = load_state_dict(shard_file)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ChatDoctor-main/env_doct/lib/python3.11/site-packages/transformers/modeling_utils.py", line 418, in load_state_dict
with open(checkpoint_file) as f:
^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: './pretrained/pytorch_model-00001-of-00003.bin'
I have already filled out this form:
https://forms.office.com/Pages/ResponsePage.aspx?id=lYZBnaxxMUy1ssGWyOw8ij06Cb8qnDJKvu2bVpV1-ANUMDIzWlU0QTUxN0YySFROQk9HMVU0N0xJNC4u
Can you please share the file 'pytorch_model-00001-of-00003.bin'?
Thanks in advance.
Thank you for your interesting work.
In the chatdoctor5k.json
and chatdoctor200k.json
I see that the instructions start with "If you are a doctor".
I am curious why the instructions do not start with "You are a doctor".
Is this a common way to perform the alpaca instruction fine-tuning?
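For context, Alpaca-style fine-tuning drops the instruction text into a fixed scaffold, so "If you are a doctor, ..." frames the task conditionally rather than acting as a system persona. The template below is reproduced from the Stanford Alpaca repo; the example input is invented:

PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)
prompt = PROMPT_TEMPLATE.format(
    instruction="If you are a doctor, please answer the medical questions "
                "based on the patient's description.",
    input="I have had a dull headache for three days...",
)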
Does it support Chinese questions?
Hi, I ran into a problem saying the ranks have different models. Details follow.
./train_lora.sh
WARNING:torch.distributed.run:
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please run
python -m bitsandbytes
and submit this information together with your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
bin /root/anaconda3/envs/chat-doctor/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cuda118.so
CUDA SETUP: CUDA runtime path found: /root/anaconda3/envs/chat-doctor/lib/libcudart.so.11.0
CUDA SETUP: Highest compute capability among GPUs detected: 8.0
CUDA SETUP: Detected CUDA version 118
CUDA SETUP: Loading binary /root/anaconda3/envs/chat-doctor/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cuda118.so...
bin /root/anaconda3/envs/chat-doctor/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cuda118.so
Finetuning model with params:
base_model: /disk2/data/xk/retr-llm/files/model/llama-7b/
data_path: /disk2/data/xk/retr-llm/files/datasets/mental_health_chatbot_dataset.json
output_dir: ./lora-chatDoctor_bs192_Mbs24_ep3_len512_lr3e-5_fromAlpacaLora
batch_size: 192
micro_batch_size: 24
num_epochs: 3
learning_rate: 3e-05
cutoff_len: 256
val_set_size: 120
use_gradient_checkpointing: False
lora_r: 8
lora_alpha: 16
lora_dropout: 0.05
lora_target_modules: None
bottleneck_size: 256
non_linearity: tanh
adapter_dropout: 0.0
use_parallel_adapter: False
use_adapterp: False
train_on_inputs: True
scaling: 1.0
adapter_name: lora
target_modules: None
group_by_length: False
wandb_project:
wandb_run_name:
wandb_watch:
wandb_log_model:
resume_from_checkpoint: None
Loading checkpoint shards: 100%|##########| 33/33 [00:12<00:00, 2.58it/s]
trainable params: 4194304 || all params: 6742609920 || trainable%: 0.06220594176090199
Map: 100%|##########| 52/52 [00:00<00:00, 687.22 examples/s]
Map: 100%|##########| 120/120 [00:00<00:00, 765.56 examples/s]
[E ProcessGroupNCCL.cpp:828] [Rank 0] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=1, OpType=ALLGATHER, Timeout(ms)=1800000) ran for 1807082 milliseconds before timing out.
Traceback (most recent call last):
File "train_lora.py", line 353, in <module>
fire.Fire(train)
File "/root/anaconda3/envs/chat-doctor/lib/python3.8/site-packages/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/root/anaconda3/envs/chat-doctor/lib/python3.8/site-packages/fire/core.py", line 475, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/root/anaconda3/envs/chat-doctor/lib/python3.8/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "train_lora.py", line 299, in train
trainer.train(resume_from_checkpoint=resume_from_checkpoint)
File "/root/anaconda3/envs/chat-doctor/lib/python3.8/site-packages/transformers/trainer.py", line 1662, in train
return inner_training_loop(
File "/root/anaconda3/envs/chat-doctor/lib/python3.8/site-packages/transformers/trainer.py", line 1749, in _inner_training_loop
model = self._wrap_model(self.model_wrapped)
File "/root/anaconda3/envs/chat-doctor/lib/python3.8/site-packages/transformers/trainer.py", line 1569, in _wrap_model
model = nn.parallel.DistributedDataParallel(
File "/root/anaconda3/envs/chat-doctor/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 674, in __init__
_verify_param_shape_across_processes(self.process_group, parameters)
File "/root/anaconda3/envs/chat-doctor/lib/python3.8/site-packages/torch/distributed/utils.py", line 118, in _verify_param_shape_across_processes
return dist._verify_params_across_processes(process_group, tensors, logger)
RuntimeError: DDP expects same model across all ranks, but Rank 0 has 128 params, while rank 1 has inconsistent 0 params.
[E ProcessGroupNCCL.cpp:455] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data.
[E ProcessGroupNCCL.cpp:460] To avoid data inconsistency, we are taking the entire process down.
terminate called after throwing an instance of 'std::runtime_error'
what(): [Rank 0] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=1, OpType=ALLGATHER, Timeout(ms)=1800000) ran for 1807082 milliseconds before timing out.
[E ProcessGroupNCCL.cpp:828] [Rank 3] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=1, OpType=ALLGATHER, Timeout(ms)=1800000) ran for 1807414 milliseconds before timing out.
Traceback (most recent call last):
File "train_lora.py", line 353, in <module>
fire.Fire(train)
File "/root/anaconda3/envs/chat-doctor/lib/python3.8/site-packages/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/root/anaconda3/envs/chat-doctor/lib/python3.8/site-packages/fire/core.py", line 475, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/root/anaconda3/envs/chat-doctor/lib/python3.8/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "train_lora.py", line 299, in train
trainer.train(resume_from_checkpoint=resume_from_checkpoint)
File "/root/anaconda3/envs/chat-doctor/lib/python3.8/site-packages/transformers/trainer.py", line 1662, in train
return inner_training_loop(
File "/root/anaconda3/envs/chat-doctor/lib/python3.8/site-packages/transformers/trainer.py", line 1749, in _inner_training_loop
model = self._wrap_model(self.model_wrapped)
File "/root/anaconda3/envs/chat-doctor/lib/python3.8/site-packages/transformers/trainer.py", line 1569, in _wrap_model
model = nn.parallel.DistributedDataParallel(
File "/root/anaconda3/envs/chat-doctor/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 674, in __init__
_verify_param_shape_across_processes(self.process_group, parameters)
File "/root/anaconda3/envs/chat-doctor/lib/python3.8/site-packages/torch/distributed/utils.py", line 118, in _verify_param_shape_across_processes
return dist._verify_params_across_processes(process_group, tensors, logger)
RuntimeError: DDP expects same model across all ranks, but Rank 3 has 128 params, while rank 0 has inconsistent 0 params.
[E ProcessGroupNCCL.cpp:455] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data.
[E ProcessGroupNCCL.cpp:460] To avoid data inconsistency, we are taking the entire process down.
terminate called after throwing an instance of 'std::runtime_error'
what(): [Rank 3] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=1, OpType=ALLGATHER, Timeout(ms)=1800000) ran for 1807414 milliseconds before timing out.
[E ProcessGroupNCCL.cpp:828] [Rank 6] Watchdog caught collective operation timeout: WorkNCCL(SeqNum=1, OpType=ALLGATHER, Timeout(ms)=1800000) ran for 1807716 milliseconds before timing out.
My environment:
GPU: 8 x A100 80GB
PyTorch version: 2.0.1
How can I solve this bug? Thanks!
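One hedged debugging step (not from the repo): log each rank's trainable-parameter count just before DDP wraps the model, since the error says one rank sees 128 LoRA tensors while another sees 0, which usually means the PEFT wrapping or a checkpoint resume ran differently across processes.

import os

def report_trainable(model):
    # count trainable tensors on this rank; call right before trainer.train()
    n = sum(1 for p in model.parameters() if p.requires_grad)
    print(f"rank {os.environ.get('RANK', '0')}: {n} trainable tensors",
          flush=True)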
Hello,
Is there a specific license for the associated datasets?
Hi @Kent0n-Li, in the paper you mentioned that you used MedlinePlus as the database to create format_dataset.csv, but I found that if I type a disease name from format_dataset.csv (e.g., panic disorder) directly into the MedlinePlus system, there are multiple results and the symptoms may be incomplete. I am wondering how you created this file. Is there a script, or did you select entries manually? Can you give an example of how you used MedlinePlus to derive Symptom, reason, TestsAndProcedures, and commonMedications in format_dataset.csv (e.g., for panic disorder)?
The first step in building a physician-patient conversation dataset is to collect the disease database that serves as the gold standard. Therefore, we collected and organized a database of diseases, which contains about 700 diseases with their relative symptoms, medical tests, and recommended medications. To train high-quality conversation models on an academic budget, we input each message from the disease database separately as a prompt into the ChatGPT API to automatically generate instruction data. It is worth noting that our prompts to the ChatGPT API contain the gold standard of diseases and symptoms, and drugs, so our fine-tuned ChatDoctor is not only able to achieve ChatGPT's conversational fluency but also higher diagnostic accuracy compared to ChatGPT. We finally collected 5K doctor-patient conversation instructions and named it InstructorDoctor-5K.
I'm confused by this process. Can anyone explain it more precisely?
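A hedged sketch of the loop the paragraph describes: each disease record becomes a prompt to the ChatGPT API, which writes a synthetic patient-physician conversation. The column names follow the issue above, the disease-name column and prompt wording are illustrative assumptions, and this is not the authors' actual script (uses the openai v0.x API current at the time):

import csv
import json
import openai

records = []
with open("format_dataset.csv") as f:
    for row in csv.DictReader(f):
        prompt = (f"A patient presents with {row['Symptom']}. The gold-standard "
                  f"diagnosis is {row['disease']}, "  # 'disease' column name assumed
                  f"confirmed by {row['TestsAndProcedures']} and treated with "
                  f"{row['commonMedications']}. Write a realistic "
                  "patient-physician conversation consistent with these facts.")
        resp = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}])
        records.append(resp["choices"][0]["message"]["content"])

with open("generated_conversations.json", "w") as out:
    json.dump(records, out)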
Got this error while running the chat.py file.
It gives a 404 error. @saharmor @Kent0n-Li
I haven't seen where you load format_data.csv, so I am wondering: what is the purpose of this file?
Thanks for your sharing, your attempt is very interesting and valuable.
However, I have some questions about the training process.
I notice that ChatDoctor is first trained on the 52K instruction-following data provided by Stanford Alpaca, and then fine-tuned on your specific data.
Why not fine-tune the model on a mixture of the two parts of the data?
What is the insight behind this fine-tuning process?
Have you ever tried training with the two pieces of data mixed together?
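For concreteness, the "mixture" alternative being asked about would look something like this (file names are assumptions; both files are Alpaca-format JSON lists):

import json
import random

with open("alpaca_data.json") as a, open("HealthCareMagic-100k.json") as b:
    mixed = json.load(a) + json.load(b)
random.shuffle(mixed)  # interleave general and medical instructions
with open("mixed_train.json", "w") as out:
    json.dump(mixed, out, indent=2)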
Hello, if I want to follow your code to fine-tune based on LLaMA, what file should I prepare for --model_name_or_path <your_path_to_hf_converted_llama_ckpt_and_tokenizer>? Thank you very much.
I cannot load the model because Colab Free does not have enough RAM. Does anyone have a solution for this?
Thanks
Doesn't LLaMA already have basic conversational capabilities? Why fine-tune it on Alpaca data to get basic conversational skills?
I trained a ChatDoctor using LoRA following your instructions, and evaluated it on iCliniq-10k without providing external documents (e.g., Wikipedia or a medical database). The problem is that the model generates duplicated content, as in the example below:
Patient: 'Hi doctor,My wife missed her periods and we doubt she is pregnant. Our first kid is just 1 year old now and it was a Cesarean. We are not in a position to have a second child now. Her LMP was 22nd of last month. I read that Mifepristone can induce abortion. Is it a safe drug? Do we have to consult doctor for taking this drug? What are all the possible side effects? Is that drug 100 percent reliable? Is there any other options? Whether this drug is available without doctor's prescription? Thank you.'
Predictions: 1. Mifepristone is a safe drug. 2. Mifepristone is a safe drug. 3. Mifepristone is a safe drug. 4. Mifepristone is a safe drug. 5. Mifepristone is a safe drug. 6. Mifepristone is a safe drug. 7. Mifepristone is a safe drug. 8. Mifepristone is a safe drug. 9. Mifepristone is a safe drug. 10. Mifepristone is a safe drug. 11. Mifepristone is a safe drug. 12. Mifepristone is a safe drug. 13. Mifepristone is a safe drug. 14. Mifepristone is a safe drug. 15. Mifepristone is a safe drug. 16. Mifepristone is a safe drug. 17. Mifepristone is a safe drug. 18. Mifepristone is a safe drug. 19. Mifepristone is a safe drug.
ChatDoctor: 1. No, mifepristone is not available without doctors prescription. 2. It is a safe medication, but as with any medications there may be some side effects which include nausea, vomiting, headache, diarrhea etc. These symptoms usually subside within 48 hours after ingestion.3. Yes, it is 100% reliable if taken under supervision by a trained gynecologist.4. There are many other options for termination of early pregnancy like medical methods (like MTP kit) or surgical methods (like suction evacuation). But these need to be done under proper guidance of a qualified Gynecologist.'
In my generated texts, this phenomenon is pretty common. Did you encounter this problem before? How did you solve it? Thanks!
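For anyone hitting the same degeneration: standard transformers decoding options can suppress verbatim repetition (the values below are illustrative, not the authors' settings):

def generate_reply(model, input_ids):
    return model.generate(
        input_ids,
        max_new_tokens=256,
        do_sample=True,
        top_p=0.9,
        temperature=0.7,
        repetition_penalty=1.2,   # down-weight already-generated tokens
        no_repeat_ngram_size=3,   # forbid repeating any 3-gram verbatim
    )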
When training LoRA, why is the train_on_inputs parameter set to True?
Hi, I noticed that a lot of "Chat Doctor" appears in the outputs of HealthCareMagic-100k.json. For example:
""Hi thanks for contacting Chat Doctor ... Your brother have both hepatitis b and c positive...."
"Hi and welcome to Chat Doctor."
"Hi and welcome to Chat Doctor. Thank you for your query. I am Chat Doctor.."
I wonder if that is caused by some post-processing? Is there any data without these words?
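A hedged cleaning pass, assuming Alpaca-style records with an "output" field, if you need the data without those phrases:

import json
import re

with open("HealthCareMagic-100k.json") as f:
    data = json.load(f)
pattern = re.compile(r"\bChat\s*Doctor\b", re.IGNORECASE)
for ex in data:
    ex["output"] = pattern.sub("doctor", ex["output"])
with open("HealthCareMagic-100k.clean.json", "w") as out:
    json.dump(data, out, indent=2)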
Hello,
I am curious to know which prompt you used to generate the dataset. I couldn't find it in utils.py.
Also, you might want to remove your OpenAI API key from utils.py.
It seems that in your paper the training dataset is 'InstructorDoctor-205k', but in this repo, judging from the training command, the dataset is 'HealthCareMagic-100k.json'.
In the paper, training was fine-tuning on InstructorDoctor-205k (seemingly a single step?), but this repo says: "Our model was firstly be fine-tuned by Stanford Alpaca's data to have some basic conversational capabilities." Does this mean the repo contains an updated method?
Training time also differs: 18 hours in the paper versus 30 minutes in the repo.
Can you help provide some clarification?
Thanks!
Hello, I am a college student reading your paper. My server GPU has only 48 GB; does that mean I don't have enough GPU memory to run inference?
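For scale: a 7B LLaMA holds roughly 14 GB of weights in fp16, so 48 GB is ample for inference. A hedged loading sketch using the upstream transformers spellings:

import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("./pretrained/")
model = LlamaForCausalLM.from_pretrained(
    "./pretrained/",
    torch_dtype=torch.float16,  # halves memory vs fp32
    device_map="auto")          # requires accelerate; spreads across devices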
I got an error when running train.py:
wandb: ERROR api_key not configured (no-tty). call wandb.login(key=[your_api_key])
See title. What's the dataset? Did you run any evaluation steps?
Traceback (most recent call last):
  File "/content/ChatDoctor/train_lora.py", line 17, in <module>
    from peft import (  # noqa: E402
ImportError: cannot import name 'BottleneckConfig' from 'peft' (/usr/local/lib/python3.10/dist-packages/peft/__init__.py)
(the same traceback is printed by each of the launched processes)
Staying at 0%| for about 20 mins now. Is this okay?
Can't work
As you can see, I'm in a Conda PowerShell as admin. I have installed PyTorch 2.0 with the matching torchvision for acceleration, along with the required transformers and tokenizer packages. The models load fine from the pretrained folder. Additionally, I have installed CUDA Toolkit 11.7 with drivers, and my GTX 1060 GPU (6 GB VRAM) is listed as available for computing. However, when attempting to activate CUDA, it shows as 0 or False. I am in the correct Conda environment and CUDA is installed and activated, but the issue persists. I noticed in the chat.py file that the model tokenizer shows 8-bit floats as disabled, which makes me wonder if that is related. Also, the tokenizer's LLaMA class name may be spelled incorrectly; I found a GitHub thread on it: treadon/llama-7b-example#1 (comment). There may be a typo in your chat.py as well. I have been working on this issue for 3 days and would greatly appreciate any help. Thank you.
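A generic diagnostic for torch.cuda.is_available() returning False (not specific to this repo): confirm that the installed torch wheel actually ships CUDA support.

import torch

print(torch.__version__)          # a "+cpu" suffix means a CPU-only build
print(torch.version.cuda)         # None on CPU-only builds
print(torch.cuda.is_available())
print(torch.cuda.device_count())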
I got the following error when I run chat_wiki.py:
raise HFValidationError(
huggingface_hub.utils._validators.HFValidationError: Repo id must use alphanumeric chars or '-', '_', '.', '--' and '..' are forbidden, '-' and '.' cannot start or end the name, max length is 96: 'chatDoctor100k/'.
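This validation error usually means the string was not recognized as an existing local directory, so from_pretrained fell through to treating it as a Hub repo id. A hedged check (the path is illustrative):

import os
from transformers import AutoTokenizer

path = "./chatDoctor100k"
print(os.path.isdir(path))  # must be True to be loaded as a local directory
tokenizer = AutoTokenizer.from_pretrained(path)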
I am a medical doctor and I don't have a server with large GPUs. Could you bring the demo back online on Hugging Face?
Thank you so much!
An error occurs when I run the "pip install -r requirements.txt" command to configure the environment:
ERROR: git+https://github.com/zphang/transformers.git (from -r requirements.txt (line 6)) does not appear to be a Python project: neither 'setup.py' nor 'pyproject.toml' found.
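A hedged workaround (not the authors' pinned fork): install an upstream transformers release that already includes the Llama classes, and use the upstream class spellings.

# pip install "transformers==4.28.1" sentencepiece   # assumed-compatible version
from transformers import LlamaForCausalLM, LlamaTokenizer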