shuyangli94 / recsysmodels
TensorFlow implementations of recommender systems models for implicit feedback & sequential actions
License: GNU General Public License v3.0
Which TensorFlow version did you use?
I am following the sample_pipeline notebook, and I believe the procedure is unchanged up to the Train/Test/Validation split. I added a prior column to the validation and test sets (u-p-i-j format). I think I am doing something wrong in the sampling step, but I am not sure how to sample properly. Any help is appreciated.
Here's my code:
'''
Run TransRec
'''
import gc
import os
from datetime import datetime

import tensorflow as tf
from recsys_models.models.transrec import TransRec
from recsys_models.pipeline import train_model

# n_users, n_items, train_df, validation_df, test_df and train_items_by_user
# come from the earlier Train/Test/Validation split cells of the notebook.
max_epochs = 200
n_iterations = 1000
batch_size = 512
stopping_threshold = 1e-5

start = datetime.now()

# Evaluation matrices in u-p-i-j order (user, prior item, positive item, negative item)
validation_data = validation_df[['u', 'prior', 'i', 'j']].values
test_data = test_df[['u', 'prior', 'i', 'j']].values
print('{} - Generated u-p-i-j matrices for validation and testing'.format(
    datetime.now() - start
))

tf.reset_default_graph()
model = TransRec(n_users, n_items, k=5, lambda_emb=1e-4, lambda_bias=1e-4,
                 opt_type=tf.contrib.opt.LazyAdamOptimizer, opt_args={'learning_rate': 0.007})

print('\n=== BEGIN Optimization for {} ==='.format(model.model_id))
print(' {} Max epochs, with early stoppage at {} Validation AUC change'.format(max_epochs, stopping_threshold))
print(' {} Iterations per epoch with {}-sized batches'.format(n_iterations, batch_size))

session = tf.Session()
session.run(tf.global_variables_initializer())

# Sampling arguments are unchanged from the cell as posted
model, train_auc, validation_auc, test_auc = train_model(
    session, model, train_df, validation_data, test_data,
    n_iterations=n_iterations, batch_size=batch_size,
    min_epochs=10, max_epochs=max_epochs,
    stopping_threshold=stopping_threshold,
    sample_columns=['u', 'i'], column_order=['u', 'i', 'j'],
    n_items=n_items, items_by_user=train_items_by_user
)

# Save the trained model
suffix = '_transrec'
full_model_id = '{}{}'.format(model.model_id, suffix)
model_folder = os.path.join('tf_models', full_model_id)
if not os.path.exists(model_folder):
    os.makedirs(model_folder)
model.save(session, 'tf_models', suffix=suffix)
print('{} - Saved model to {}'.format(
    datetime.now() - start, model_folder
))

session.close()
gc.collect()
print()
Attached is the Full Traceback:
Hi,
I am trying to implement the BPR model on my dataset. My dataset has the following structure:
userID - int
itemID - int
rating - int
My dataset has no timestamps. How would I apply BPR to this dataset?
I followed the sample_pipeline notebook. I didn't run the get_interaction(filename) function, because my dataset is already in the proper format apart from the missing timestamps.
When I ran the Train/Test/Validation split from the sample_pipeline notebook, it raised a KeyError on 'date', so I commented out line 217 in recsys_models/data/__init__.py:
df = df.sort_values(['date'])
That works around the 'date' KeyError for now, and the Train/Test/Validation split part of the sample_pipeline notebook then ran successfully.
Next I jumped to the BPR model part of the notebook and ran the cell, which throws the error below. I suspect the cause is the date line I commented out in __init__.py, which may mean I don't get the proper embeddings.
My question: is there any way to use this repository to build a BPR model on my dataset, which has no timestamps? If so, could you please tell me how? Or is there any way to solve the following error?
InvalidArgumentError: indices[1430575] = 84369 is not in [0, 78793)
[[{{node embedding_lookup}}]]
Any help is appreciated.
Thanks
This is the Full Traceback:
=== BEGIN Optimization for bpr-mf_5k_0.0001l2_0.0001l2bias ===
200 Max epochs, with early stoppage at 1e-05 Validation AUC change
1000 Iterations per epoch with 512-sized batches
InvalidArgumentError Traceback (most recent call last)
C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in _do_call(self, fn, *args)
1333 try:
-> 1334 return fn(*args)
1335 except errors.OpError as e:
C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in _run_fn(feed_dict, fetch_list, target_list, options, run_metadata)
1318 return self._call_tf_sessionrun(
-> 1319 options, feed_dict, fetch_list, target_list, run_metadata)
1320
C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in _call_tf_sessionrun(self, options, feed_dict, fetch_list, target_list, run_metadata)
1406 self._session, options, feed_dict, fetch_list, target_list,
-> 1407 run_metadata)
1408
InvalidArgumentError: indices[1430575] = 84369 is not in [0, 78793)
[[{{node embedding_lookup}}]]
During handling of the above exception, another exception occurred:
InvalidArgumentError Traceback (most recent call last)
in
39 stopping_threshold=stopping_threshold,
40 sample_columns=['u', 'i'], column_order=['u', 'i', 'j'],
---> 41 n_items=n_items, items_by_user=train_items_by_user
42 )
43
~\Desktop\RecSysModels\RecSysModels-master\recsys_models\pipeline.py in train_model(session, model, train_df, validation_mat, test_mat, n_iterations, batch_size, min_epochs, max_epochs, stopping_threshold, **sampling_kwargs)
41
42 # Get initial validation AUC
---> 43 prior_auc = model.evaluate_auc(session, validation_mat)
44 test_auc = model.evaluate_auc(session, test_mat)
45 print('{} - Prior: {:.5f} Validation AUC, {:.5f} Testing AUC'.format(
~\Desktop\RecSysModels\RecSysModels-master\recsys_models\models\__init__.py in evaluate_auc(self, session, input_data)
221 '''
222 # Get predictions
--> 223 ranking_predictions = self._session_run(session, input_data, self.p_uij)[0]
224
225 # This is the magic - it's cheaper to plug it into a pandas DF and then
~\Desktop\RecSysModels\RecSysModels-master\recsys_models\models\bpr.py in _session_run(self, session, input_batch, *args)
227 self.u: input_batch[:, 0],
228 self.i: input_batch[:, 1],
--> 229 self.j: input_batch[:, 2]
230 }
231 )
C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in run(self, fetches, feed_dict, options, run_metadata)
927 try:
928 result = self._run(None, fetches, feed_dict, options_ptr,
--> 929 run_metadata_ptr)
930 if run_metadata:
931 proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)
C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
1150 if final_fetches or final_targets or (handle and feed_dict_tensor):
1151 results = self._do_run(handle, final_targets, final_fetches,
-> 1152 feed_dict_tensor, options, run_metadata)
1153 else:
1154 results = []
C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
1326 if handle is None:
1327 return self._do_call(_run_fn, feeds, fetches, targets, options,
-> 1328 run_metadata)
1329 else:
1330 return self._do_call(_prun_fn, handle, feeds, fetches)
C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in _do_call(self, fn, *args)
1346 pass
1347 message = error_interpolation.interpolate(message, self._graph)
-> 1348 raise type(e)(node_def, op, message)
1349
1350 def _extend_graph(self):
InvalidArgumentError: indices[1430575] = 84369 is not in [0, 78793)
[[node embedding_lookup (defined at C:\Users\Karthik\Desktop\RecSysModels\RecSysModels-master\recsys_models\models\bpr.py:128) ]]
Caused by op 'embedding_lookup', defined at:
File "C:\ProgramData\Anaconda3\lib\runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "C:\ProgramData\Anaconda3\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel_launcher.py", line 16, in <module>
app.launch_new_instance()
File "C:\ProgramData\Anaconda3\lib\site-packages\traitlets\config\application.py", line 658, in launch_instance
app.start()
File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\kernelapp.py", line 505, in start
self.io_loop.start()
File "C:\ProgramData\Anaconda3\lib\site-packages\tornado\platform\asyncio.py", line 148, in start
self.asyncio_loop.run_forever()
File "C:\ProgramData\Anaconda3\lib\asyncio\base_events.py", line 539, in run_forever
self._run_once()
File "C:\ProgramData\Anaconda3\lib\asyncio\base_events.py", line 1775, in _run_once
handle._run()
File "C:\ProgramData\Anaconda3\lib\asyncio\events.py", line 88, in _run
self._context.run(self._callback, *self._args)
File "C:\ProgramData\Anaconda3\lib\site-packages\tornado\ioloop.py", line 690, in <lambda>
lambda f: self._run_callback(functools.partial(callback, future))
File "C:\ProgramData\Anaconda3\lib\site-packages\tornado\ioloop.py", line 743, in _run_callback
ret = callback()
File "C:\ProgramData\Anaconda3\lib\site-packages\tornado\gen.py", line 781, in inner
self.run()
File "C:\ProgramData\Anaconda3\lib\site-packages\tornado\gen.py", line 742, in run
yielded = self.gen.send(value)
File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\kernelbase.py", line 357, in process_one
yield gen.maybe_future(dispatch(*args))
File "C:\ProgramData\Anaconda3\lib\site-packages\tornado\gen.py", line 209, in wrapper
yielded = next(result)
File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\kernelbase.py", line 267, in dispatch_shell
yield gen.maybe_future(handler(stream, idents, msg))
File "C:\ProgramData\Anaconda3\lib\site-packages\tornado\gen.py", line 209, in wrapper
yielded = next(result)
File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\kernelbase.py", line 534, in execute_request
user_expressions, allow_stdin,
File "C:\ProgramData\Anaconda3\lib\site-packages\tornado\gen.py", line 209, in wrapper
yielded = next(result)
File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\ipkernel.py", line 294, in do_execute
res = shell.run_cell(code, store_history=store_history, silent=silent)
File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\zmqshell.py", line 536, in run_cell
return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2848, in run_cell
raw_cell, store_history, silent, shell_futures)
File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2874, in _run_cell
return runner(coro)
File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\async_helpers.py", line 67, in _pseudo_sync_runner
coro.send(None)
File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3049, in run_cell_async
interactivity=interactivity, compiler=compiler, result=result)
File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3214, in run_ast_nodes
if (yield from self.run_code(code, result)):
File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3296, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "", line 25, in
opt_type=tf.contrib.opt.LazyAdamOptimizer, opt_args={'learning_rate': 0.007})
File "C:\Users\Karthik\Desktop\RecSysModels\RecSysModels-master\recsys_models\models\bpr.py", line 128, in __init__
self.u_embeddings = tf.nn.embedding_lookup(self.U_mf, self.u)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\ops\embedding_ops.py", line 316, in embedding_lookup
transform_fn=None)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\ops\embedding_ops.py", line 133, in _embedding_lookup_and_transform
result = _clip(array_ops.gather(params[0], ids, name=name),
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\util\dispatch.py", line 180, in wrapper
return target(*args, **kwargs)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\ops\array_ops.py", line 3273, in gather
return gen_array_ops.gather_v2(params, indices, axis, name=name)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\ops\gen_array_ops.py", line 3748, in gather_v2
"GatherV2", params=params, indices=indices, axis=axis, name=name)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
op_def=op_def)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 3300, in create_op
op_def=op_def)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 1801, in __init__
self._traceback = tf_stack.extract_stack()
InvalidArgumentError (see above for traceback): indices[1430575] = 84369 is not in [0, 78793)
[[node embedding_lookup (defined at C:\Users\Karthik\Desktop\RecSysModels\RecSysModels-master\recsys_models\models\bpr.py:128) ]]
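A note on the error itself, offered as a guess rather than a diagnosis: the failing op is the user embedding lookup at bpr.py:128, and the message says index 84369 falls outside a table with 78793 rows. One common cause of that pattern is raw user or item IDs that are not contiguous, so that the largest ID exceeds the number of distinct IDs used to size the embedding table. The traceback does not show how n_users was computed, so this is an assumption. If it applies, remapping the raw IDs to contiguous indices before building the model may help. A minimal sketch in plain Python follows; build_index and the toy interactions are hypothetical and not part of recsys_models.

```python
def build_index(raw_ids):
    """Map arbitrary raw IDs to contiguous indices 0..n-1, in first-seen order."""
    index = {}
    for raw in raw_ids:
        if raw not in index:
            index[raw] = len(index)
    return index


# Toy (userID, itemID) interactions with sparse, non-contiguous raw IDs (made up)
interactions = [(84369, 10), (12, 99), (84369, 99)]

user_index = build_index(u for u, _ in interactions)
item_index = build_index(i for _, i in interactions)

# Replace raw IDs by their contiguous indices
remapped = [(user_index[u], item_index[i]) for u, i in interactions]

# Embedding tables sized with these counts cover every remapped index
n_users = len(user_index)
n_items = len(item_index)
```

With IDs remapped this way, every user index is below n_users and every item index is below n_items, which is the condition the embedding lookup checks. Keep the two dictionaries so that model outputs can be translated back to the original IDs.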
Hi,
I'm able to train the model on a custom dataset, but I'm not able to get suitable recommendations for a user.
I tried writing some custom code, but it didn't work.
How can this be achieved?
Thanks.
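One way this could be approached for a matrix-factorization model such as BPR-MF, sketched under assumptions: the preference score for a user and an item is commonly an item bias plus the dot product of the user and item embeddings, and recommendations are the highest-scoring items the user has not already interacted with. The sketch below assumes the trained embeddings and biases have already been pulled out of the TensorFlow session into plain Python lists; how to do that with this repository is not shown in the thread, and recommend_top_n and all values below are hypothetical.

```python
import heapq


def recommend_top_n(user_vec, item_vecs, item_bias, seen_items, n=10):
    """Rank unseen items for one user by bias + dot(user_vec, item_vec)."""
    scored = []
    for item_idx, (vec, bias) in enumerate(zip(item_vecs, item_bias)):
        if item_idx in seen_items:
            continue  # skip items the user already interacted with
        score = bias + sum(u * v for u, v in zip(user_vec, vec))
        scored.append((score, item_idx))
    # Return the indices of the n highest-scoring items, best first
    return [item_idx for _, item_idx in heapq.nlargest(n, scored)]


# Toy embeddings with k=2 (made-up values, not trained output)
user_vec = [1.0, 0.0]
item_vecs = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5], [1.0, 0.0]]
item_bias = [0.0, 0.0, 0.1, 0.0]

top_items = recommend_top_n(user_vec, item_vecs, item_bias, seen_items={3}, n=2)
print(top_items)
```

The returned values are item indices in the model's own index space, so they need to be mapped back to the original item IDs. A sequential model such as TransRec scores items relative to the user's previous item, so this scoring function would not carry over unchanged.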