shuyangli94 / recsysmodels

TensorFlow implementations of recommender systems models for implicit feedback & sequential actions

License: GNU General Public License v3.0

Languages: Python 70.94%, Jupyter Notebook 29.06%


recsysmodels's Issues

AttributeError: module 'tensorflow' has no attribute 'contrib'

Which TensorFlow version did you use?
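For context, `tf.contrib` was removed in TensorFlow 2.0, so this error usually means a 2.x release is installed while the code expects 1.x. The snippet below is a minimal sketch of a version check; the helper name `expects_tf_contrib` is made up for illustration and is not part of TensorFlow or this repository.

```python
def expects_tf_contrib(version: str) -> bool:
    """Return True if this TensorFlow version string belongs to the 1.x line.

    tf.contrib shipped with 1.x releases and was removed in 2.0.
    """
    major = int(version.split('.')[0])
    return major == 1


# Example version strings; in a real session pass tf.__version__ instead
for v in ('1.13.1', '1.15.0', '2.4.0'):
    print(v, expects_tf_contrib(v))
```

If the check returns False for the installed version, installing a 1.x release in a separate environment is one way to run code that depends on `tf.contrib.opt.LazyAdamOptimizer`.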

[HELP] How do I run TransRec Model?

I am following the sample_pipeline notebook, and I believe the procedure is the same up to the Train/Test/Validation split. I added a prior column to the validation and test sets (u-p-i-j format). I think I am doing something wrong in the sampling step, and I am not sure how to sample properly. Any help is appreciated.

Here's my code:

'''
Run TransRec
'''
import gc
import os
from datetime import datetime

import tensorflow as tf
from recsys_models.models.transrec import TransRec
from recsys_models.pipeline import train_model

# n_users, n_items, train_df, validation_df, test_df and train_items_by_user
# are assumed to be defined in earlier notebook cells.

# Set training parameters
max_epochs = 200
n_iterations = 1000
batch_size = 512
stopping_threshold = 1e-5

# Get the validation and testing matrices
start = datetime.now()
validation_data = validation_df[['u', 'prior', 'i', 'j']].values
test_data = test_df[['u', 'prior', 'i', 'j']].values
print('{} - Generated u-p-i-j matrices for validation and testing'.format(
    datetime.now() - start
))

# Initialize the graph
tf.reset_default_graph()
model = TransRec(n_users, n_items, k=5, lambda_emb=1e-4, lambda_bias=1e-4,
                 opt_type=tf.contrib.opt.LazyAdamOptimizer, opt_args={'learning_rate': 0.007})
print('\n=== BEGIN Optimization for {} ==='.format(model.model_id))
print(' {} Max epochs, with early stoppage at {} Validation AUC change'.format(max_epochs, stopping_threshold))
print(' {} Iterations per epoch with {}-sized batches'.format(n_iterations, batch_size))

# Open session and initialize graph weights
session = tf.Session()
session.run(tf.global_variables_initializer())

# Train the model!
model, train_auc, validation_auc, test_auc = train_model(
    session, model, train_df, validation_data, test_data,
    n_iterations=n_iterations, batch_size=batch_size,
    min_epochs=10, max_epochs=max_epochs,
    stopping_threshold=stopping_threshold,
    sample_columns=['u', 'i'], column_order=['u', 'i', 'j'],
    n_items=n_items, items_by_user=train_items_by_user
)

# Save model
suffix = '_transrec'
full_model_id = '{}{}'.format(model.model_id, suffix)
model_folder = os.path.join('tf_models', full_model_id)
if not os.path.exists(model_folder):
    os.makedirs(model_folder)
model.save(session, 'tf_models', suffix=suffix)
print('{} - Saved model to {}'.format(
    datetime.now() - start, model_folder
))

# Cleanup
session.close()
gc.collect()
print()

Attached is the Full Traceback:

[HELP] How to run BPR model on my own dataset

Hi,

I am trying to implement the BPR model on my dataset. My dataset has the following structure:
userID - int
itemID - int
rating - int
My dataset has no timestamps. How would I apply BPR to this dataset?

I followed the sample_pipeline notebook. I didn't run the get_interaction(filename) function, because my dataset is already in the expected format apart from the missing timestamps.

When I ran the Train/Test/Validation split part of the sample_pipeline notebook, it threw a 'date' key error, so I commented out this line (217) in recsys_models/data/__init__.py:

df = df.sort_values(['date'])

That temporarily solved the 'date' key error, and I was able to run the Train/Test/Validation split part of the sample_pipeline notebook.

Then I moved on to the BPR model part of the notebook and ran the cell, which throws the error below. I suspect it happens because I commented out the date line in __init__.py, and as a result I don't get the proper embeddings.

My question is: is there any way to use this repository to build a BPR model on my dataset, which has no timestamps? If so, could you please tell me how? Or is there any way to solve the following error?

InvalidArgumentError: indices[1430575] = 84369 is not in [0, 78793)
[[{{node embedding_lookup}}]]

Any help is appreciated.

Thanks
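On the missing timestamps: the commented-out line sorts interactions by date before the split. One common alternative when no dates exist is to hold out randomly chosen interactions per user. The sketch below illustrates that idea in plain Python; it is not the repository's own split function, and the function name and the (user, item) pair format are assumptions made for this example.

```python
import random
from collections import defaultdict


def split_without_timestamps(interactions, seed=0):
    """Hold out one validation and one test item per user, chosen at random.

    interactions: iterable of (user, item) pairs.
    Users with fewer than 3 interactions stay entirely in the training set.
    """
    rng = random.Random(seed)
    by_user = defaultdict(list)
    for user, item in interactions:
        by_user[user].append(item)

    train, validation, test = [], [], []
    for user, items in by_user.items():
        items = items[:]      # copy before shuffling
        rng.shuffle(items)    # random order stands in for the missing date order
        if len(items) >= 3:
            test.append((user, items.pop()))
            validation.append((user, items.pop()))
        train.extend((user, item) for item in items)
    return train, validation, test


pairs = [(0, 10), (0, 11), (0, 12), (0, 13), (1, 10), (1, 14)]
train, validation, test = split_without_timestamps(pairs)
print(len(train), len(validation), len(test))  # 4 1 1
```

A random hold-out loses the "predict the next action" meaning of a time-ordered split, which matters less for BPR than for sequential models such as TransRec.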

This is the Full Traceback:

=== BEGIN Optimization for bpr-mf_5k_0.0001l2_0.0001l2bias ===
200 Max epochs, with early stoppage at 1e-05 Validation AUC change
1000 Iterations per epoch with 512-sized batches

InvalidArgumentError Traceback (most recent call last)
C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in _do_call(self, fn, *args)
1333 try:
-> 1334 return fn(*args)
1335 except errors.OpError as e:

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in _run_fn(feed_dict, fetch_list, target_list, options, run_metadata)
1318 return self._call_tf_sessionrun(
-> 1319 options, feed_dict, fetch_list, target_list, run_metadata)
1320

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in _call_tf_sessionrun(self, options, feed_dict, fetch_list, target_list, run_metadata)
1406 self._session, options, feed_dict, fetch_list, target_list,
-> 1407 run_metadata)
1408

InvalidArgumentError: indices[1430575] = 84369 is not in [0, 78793)
[[{{node embedding_lookup}}]]

During handling of the above exception, another exception occurred:

InvalidArgumentError Traceback (most recent call last)
in
39 stopping_threshold=stopping_threshold,
40 sample_columns=['u', 'i'], column_order=['u', 'i', 'j'],
---> 41 n_items=n_items, items_by_user=train_items_by_user
42 )
43

~\Desktop\RecSysModels\RecSysModels-master\recsys_models\pipeline.py in train_model(session, model, train_df, validation_mat, test_mat, n_iterations, batch_size, min_epochs, max_epochs, stopping_threshold, **sampling_kwargs)
41
42 # Get initial validation AUC
---> 43 prior_auc = model.evaluate_auc(session, validation_mat)
44 test_auc = model.evaluate_auc(session, test_mat)
45 print('{} - Prior: {:.5f} Validation AUC, {:.5f} Testing AUC'.format(

~\Desktop\RecSysModels\RecSysModels-master\recsys_models\models\__init__.py in evaluate_auc(self, session, input_data)
221 '''
222 # Get predictions
--> 223 ranking_predictions = self._session_run(session, input_data, self.p_uij)[0]
224
225 # This is the magic - it's cheaper to plug it into a pandas DF and then

~\Desktop\RecSysModels\RecSysModels-master\recsys_models\models\bpr.py in _session_run(self, session, input_batch, *args)
227 self.u: input_batch[:, 0],
228 self.i: input_batch[:, 1],
--> 229 self.j: input_batch[:, 2]
230 }
231 )

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in run(self, fetches, feed_dict, options, run_metadata)
927 try:
928 result = self._run(None, fetches, feed_dict, options_ptr,
--> 929 run_metadata_ptr)
930 if run_metadata:
931 proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
1150 if final_fetches or final_targets or (handle and feed_dict_tensor):
1151 results = self._do_run(handle, final_targets, final_fetches,
-> 1152 feed_dict_tensor, options, run_metadata)
1153 else:
1154 results = []

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
1326 if handle is None:
1327 return self._do_call(_run_fn, feeds, fetches, targets, options,
-> 1328 run_metadata)
1329 else:
1330 return self._do_call(_prun_fn, handle, feeds, fetches)

C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in _do_call(self, fn, *args)
1346 pass
1347 message = error_interpolation.interpolate(message, self._graph)
-> 1348 raise type(e)(node_def, op, message)
1349
1350 def _extend_graph(self):

InvalidArgumentError: indices[1430575] = 84369 is not in [0, 78793)
[[node embedding_lookup (defined at C:\Users\Karthik\Desktop\RecSysModels\RecSysModels-master\recsys_models\models\bpr.py:128) ]]

Caused by op 'embedding_lookup', defined at:
File "C:\ProgramData\Anaconda3\lib\runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "C:\ProgramData\Anaconda3\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel_launcher.py", line 16, in <module>
app.launch_new_instance()
File "C:\ProgramData\Anaconda3\lib\site-packages\traitlets\config\application.py", line 658, in launch_instance
app.start()
File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\kernelapp.py", line 505, in start
self.io_loop.start()
File "C:\ProgramData\Anaconda3\lib\site-packages\tornado\platform\asyncio.py", line 148, in start
self.asyncio_loop.run_forever()
File "C:\ProgramData\Anaconda3\lib\asyncio\base_events.py", line 539, in run_forever
self._run_once()
File "C:\ProgramData\Anaconda3\lib\asyncio\base_events.py", line 1775, in _run_once
handle._run()
File "C:\ProgramData\Anaconda3\lib\asyncio\events.py", line 88, in _run
self._context.run(self._callback, *self._args)
File "C:\ProgramData\Anaconda3\lib\site-packages\tornado\ioloop.py", line 690, in <lambda>
lambda f: self._run_callback(functools.partial(callback, future))
File "C:\ProgramData\Anaconda3\lib\site-packages\tornado\ioloop.py", line 743, in _run_callback
ret = callback()
File "C:\ProgramData\Anaconda3\lib\site-packages\tornado\gen.py", line 781, in inner
self.run()
File "C:\ProgramData\Anaconda3\lib\site-packages\tornado\gen.py", line 742, in run
yielded = self.gen.send(value)
File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\kernelbase.py", line 357, in process_one
yield gen.maybe_future(dispatch(*args))
File "C:\ProgramData\Anaconda3\lib\site-packages\tornado\gen.py", line 209, in wrapper
yielded = next(result)
File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\kernelbase.py", line 267, in dispatch_shell
yield gen.maybe_future(handler(stream, idents, msg))
File "C:\ProgramData\Anaconda3\lib\site-packages\tornado\gen.py", line 209, in wrapper
yielded = next(result)
File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\kernelbase.py", line 534, in execute_request
user_expressions, allow_stdin,
File "C:\ProgramData\Anaconda3\lib\site-packages\tornado\gen.py", line 209, in wrapper
yielded = next(result)
File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\ipkernel.py", line 294, in do_execute
res = shell.run_cell(code, store_history=store_history, silent=silent)
File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\zmqshell.py", line 536, in run_cell
return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2848, in run_cell
raw_cell, store_history, silent, shell_futures)
File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2874, in _run_cell
return runner(coro)
File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\async_helpers.py", line 67, in _pseudo_sync_runner
coro.send(None)
File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3049, in run_cell_async
interactivity=interactivity, compiler=compiler, result=result)
File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3214, in run_ast_nodes
if (yield from self.run_code(code, result)):
File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3296, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "", line 25, in
opt_type=tf.contrib.opt.LazyAdamOptimizer, opt_args={'learning_rate': 0.007})
File "C:\Users\Karthik\Desktop\RecSysModels\RecSysModels-master\recsys_models\models\bpr.py", line 128, in __init__
self.u_embeddings = tf.nn.embedding_lookup(self.U_mf, self.u)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\ops\embedding_ops.py", line 316, in embedding_lookup
transform_fn=None)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\ops\embedding_ops.py", line 133, in _embedding_lookup_and_transform
result = _clip(array_ops.gather(params[0], ids, name=name),
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\util\dispatch.py", line 180, in wrapper
return target(*args, **kwargs)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\ops\array_ops.py", line 3273, in gather
return gen_array_ops.gather_v2(params, indices, axis, name=name)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\ops\gen_array_ops.py", line 3748, in gather_v2
"GatherV2", params=params, indices=indices, axis=axis, name=name)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
op_def=op_def)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 3300, in create_op
op_def=op_def)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 1801, in __init__
self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): indices[1430575] = 84369 is not in [0, 78793)
[[node embedding_lookup (defined at C:\Users\Karthik\Desktop\RecSysModels\RecSysModels-master\recsys_models\models\bpr.py:128) ]]
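Reading the traceback: the failing op is the `embedding_lookup` defined at bpr.py line 128, which looks up `self.u` in `self.U_mf`, so the index 84369 is a user index that exceeds the 78793 rows of the user embedding table. One plausible cause (an assumption, since the dataset is not shown) is that the raw userID values are not contiguous integers in [0, n_users). The sketch below shows one way to remap raw IDs to contiguous indices; the helper name `build_index` and the sample IDs are hypothetical.

```python
def build_index(raw_ids):
    """Map arbitrary raw IDs to contiguous indices 0..n-1 in first-seen order."""
    mapping = {}
    for raw in raw_ids:
        if raw not in mapping:
            mapping[raw] = len(mapping)
    return mapping


# Hypothetical raw user IDs with gaps, similar to the 84369 in the error
raw_users = [84369, 12, 84369, 500, 12]
user_index = build_index(raw_users)
u = [user_index[r] for r in raw_users]
n_users = len(user_index)

print(u)        # [0, 1, 0, 2, 1]
print(n_users)  # 3
assert max(u) < n_users  # every index now fits a table with n_users rows
```

The same mapping has to be built once over all interactions and applied to the training, validation, and test sets alike, and item IDs need the same treatment against `n_items`.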

How do we get recommendations after training the model?

Hi,

I'm able to train the model on a custom dataset, but I'm not able to get suitable recommendations for a user.
I tried writing some custom code, but it didn't work.
How can I generate recommendations from the trained model?

Thanks.
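For a matrix-factorization model such as BPR-MF, recommendations are typically produced by scoring every item for the user and keeping the top k items the user has not already interacted with. The sketch below assumes the learned user vector, item vectors, and item biases have already been pulled out of the trained model as plain lists; how to extract them from this repository's model classes is not shown in this thread, so the parameter names here are hypothetical.

```python
import heapq


def recommend(user_vec, item_vecs, item_bias, seen, k=3):
    """Return the k highest-scoring unseen item indices for one user.

    Score uses the usual matrix-factorization form: dot(user, item) + item bias.
    """
    scores = []
    for idx, (vec, bias) in enumerate(zip(item_vecs, item_bias)):
        if idx in seen:
            continue  # skip items the user already interacted with
        score = sum(a * b for a, b in zip(user_vec, vec)) + bias
        scores.append((score, idx))
    return [idx for _, idx in heapq.nlargest(k, scores)]


# Toy values standing in for learned parameters (hypothetical)
user_vec = [1.0, 0.0]
item_vecs = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5], [1.0, 0.0]]
item_bias = [0.0, 0.0, 0.1, 0.0]
print(recommend(user_vec, item_vecs, item_bias, seen={3}, k=2))  # [0, 2]
```

With real embeddings, the same scoring is usually done as a single matrix-vector product in NumPy, and the returned indices must be mapped back to the original item IDs if they were re-indexed before training.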
