kamalkraj / mingpt-tf
A minimal TF2 re-implementation of the OpenAI GPT training
License: MIT License
I am getting multiple errors when running the play_math.py file:
2023-06-30 20:04:36.575689: W tensorflow/core/framework/op_kernel.cc:1807] OP_REQUIRES failed at cast_op.cc:121 : UNIMPLEMENTED: Cast string to float is not supported
2023-06-30 20:04:36.575889: W tensorflow/core/framework/op_kernel.cc:1807] OP_REQUIRES failed at cast_op.cc:121 : UNIMPLEMENTED: Cast string to float is not supported
Traceback (most recent call last):
File "play_math.py", line 96, in <module>
trainer.train()
File "/home/iccn/Desktop/minGPT-TF/mingpt/trainer.py", line 153, in train
loss = train_step(inputs)
File "/home/iccn/miniconda3/envs/tf_gpu/lib/python3.8/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/home/iccn/miniconda3/envs/tf_gpu/lib/python3.8/site-packages/tensorflow/python/eager/execute.py", line 52, in quick_execute
tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.UnimplementedError: Graph execution error:
Detected at node 'Cast' defined at (most recent call last):
File "play_math.py", line 96, in <module>
trainer.train()
File "/home/iccn/Desktop/minGPT-TF/mingpt/trainer.py", line 153, in train
loss = train_step(inputs)
File "/home/iccn/Desktop/minGPT-TF/mingpt/trainer.py", line 115, in train_step
per_example_losses = self.strategy.run(step_fn, args=(dist_inputs,))
File "/home/iccn/Desktop/minGPT-TF/mingpt/trainer.py", line 112, in step_fn
self.optimizer.apply_gradients(list(zip(grads, self.model.trainable_variables)))
File "/home/iccn/Desktop/minGPT-TF/mingpt/optimization.py", line 71, in apply_gradients
zip(grads, tvars),
File "/home/iccn/miniconda3/envs/tf_gpu/lib/python3.8/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 1140, in apply_gradients
return super().apply_gradients(grads_and_vars, name=name)
File "/home/iccn/miniconda3/envs/tf_gpu/lib/python3.8/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 632, in apply_gradients
self._apply_weight_decay(trainable_variables)
File "/home/iccn/miniconda3/envs/tf_gpu/lib/python3.8/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 1159, in _apply_weight_decay
tf.__internal__.distribute.interim.maybe_merge_call(
File "/home/iccn/miniconda3/envs/tf_gpu/lib/python3.8/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 1155, in distributed_apply_weight_decay
distribution.extended.update(
File "/home/iccn/miniconda3/envs/tf_gpu/lib/python3.8/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 1151, in weight_decay_fn
wd = tf.cast(self.weight_decay, variable.dtype)
Node: 'Cast'
2 root error(s) found.
(0) UNIMPLEMENTED: Cast string to float is not supported
[[{{node Cast}}]]
(1) CANCELLED: Function was cancelled before it was started
0 successful operations.
0 derived errors ignored. [Op:__inference_train_step_33982]
What can I do to resolve them?
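The traceback shows `tf.cast(self.weight_decay, variable.dtype)` failing with "Cast string to float", which suggests that the optimizer's `weight_decay` attribute holds a string instead of a number. One plausible cause, offered as an assumption because `mingpt/optimization.py` is not shown here, is that the optimizer subclass passes its `name` to the parent constructor positionally, while newer Keras optimizers take `weight_decay` in that position. The sketch below reproduces the mechanism in plain Python. All three classes are hypothetical stand-ins, not the real Keras or minGPT-TF classes.

```python
# Hypothetical stand-ins that only illustrate the argument mix-up.
# They are NOT the real Keras or minGPT-TF classes.


class NewStyleAdam:
    """Mimics a parent optimizer whose 6th positional parameter is weight_decay."""

    def __init__(self, learning_rate=0.001, beta_1=0.9, beta_2=0.999,
                 epsilon=1e-7, amsgrad=False, weight_decay=None, name="Adam"):
        self.weight_decay = weight_decay
        self.name = name


class BrokenAdamWeightDecay(NewStyleAdam):
    """Assumed shape of the bug: `name` is forwarded positionally."""

    def __init__(self, learning_rate=0.001, beta_1=0.9, beta_2=0.999,
                 epsilon=1e-7, amsgrad=False, weight_decay_rate=0.0,
                 name="AdamWeightDecay"):
        # The 6th positional argument lands in the parent's `weight_decay`.
        super().__init__(learning_rate, beta_1, beta_2, epsilon, amsgrad, name)
        self.weight_decay_rate = weight_decay_rate


class FixedAdamWeightDecay(NewStyleAdam):
    """Same subclass, but every argument is forwarded by keyword."""

    def __init__(self, learning_rate=0.001, beta_1=0.9, beta_2=0.999,
                 epsilon=1e-7, amsgrad=False, weight_decay_rate=0.0,
                 name="AdamWeightDecay"):
        super().__init__(learning_rate=learning_rate, beta_1=beta_1,
                         beta_2=beta_2, epsilon=epsilon, amsgrad=amsgrad,
                         name=name)
        self.weight_decay_rate = weight_decay_rate


broken = BrokenAdamWeightDecay()
fixed = FixedAdamWeightDecay()
print(repr(broken.weight_decay))  # the optimizer name, i.e. a string
print(repr(fixed.weight_decay))   # None, so no string is ever cast
```

If this is what happens in your copy of `optimization.py`, forwarding the arguments by keyword should avoid the error. Another option, again assuming the subclass derives from `tf.keras.optimizers.Adam`, is to derive it from `tf.keras.optimizers.legacy.Adam` (available from TF 2.11), or to install a TensorFlow version older than 2.11.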
Hello, is there any way to save checkpoints to a specific file? Also, how could I save the model for future use from other TF code?
I'm still getting started with TensorFlow and Python, so please bear with me!
Edit: the current setup doesn't save checkpoints. Will it save only after each epoch?
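A minimal sketch of one way to do both with standard TensorFlow APIs, independent of how this repository's trainer is written: `tf.train.Checkpoint` with `tf.train.CheckpointManager` writes checkpoints to a directory you choose, and `tf.saved_model.save` exports a model that other TF code can load. `TinyModel`, the temporary directories, and the per-epoch save are assumptions made for illustration. The real GPT model and training loop would take their place.

```python
import os
import tempfile

import tensorflow as tf


class TinyModel(tf.Module):
    """Hypothetical stand-in for the real model: computes w * x."""

    def __init__(self):
        super().__init__()
        self.w = tf.Variable(1.0)

    @tf.function(input_signature=[tf.TensorSpec([None], tf.float32)])
    def __call__(self, x):
        return self.w * x


model = TinyModel()
base_dir = tempfile.mkdtemp()              # replace with your own path
ckpt_dir = os.path.join(base_dir, "ckpts")
export_dir = os.path.join(base_dir, "exported_model")

# Checkpoints are written under ckpt_dir; only the 3 most recent are kept.
ckpt = tf.train.Checkpoint(model=model)
manager = tf.train.CheckpointManager(ckpt, directory=ckpt_dir, max_to_keep=3)

for epoch in range(2):
    model.w.assign_add(1.0)                # stands in for one epoch of training
    manager.save(checkpoint_number=epoch)  # save once per epoch

# Restore the latest checkpoint, e.g. after a restart.
model.w.assign(0.0)
ckpt.restore(manager.latest_checkpoint)

# Export for use from other TF code, then load it back.
tf.saved_model.save(model, export_dir)
reloaded = tf.saved_model.load(export_dir)
result = reloaded(tf.constant([2.0]))
```

How often to save is a design choice: calling `manager.save` at the end of each epoch, as above, is the simplest, while saving every N steps protects long epochs at the cost of more disk writes.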