sh0416 / bpr
Bayesian Personalized Ranking using PyTorch
License: GNU General Public License v3.0
For now, I build the scan with only 1024 parallel threads (due to the threadblock size limit) and add the block sums sequentially.
It is reasonably fast, but a Blelloch scan could give a further improvement when a VariableShapeList packs more than 1,000,000 elements.
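For illustration, here is a minimal PyTorch sketch of that strategy (the function name is mine, and `cumsum` stands in for the per-threadblock parallel scan; this is not the repository's CUDA kernel): each 1024-element block is scanned independently, then the block totals are folded in sequentially.

```python
import torch

def blockwise_inclusive_scan(x, block=1024):
    # Pad to a multiple of the block size, scan each block independently
    # (a stand-in for the per-threadblock parallel scan), then add the
    # running block totals back, i.e. the sequential block-sum step.
    n = x.numel()
    pad = (-n) % block
    chunks = torch.cat([x, x.new_zeros(pad)]).view(-1, block)
    scanned = chunks.cumsum(dim=1)          # independent per-block scan
    sums = scanned[:, -1]                   # total of each block
    offsets = torch.cat([sums.new_zeros(1), sums.cumsum(0)[:-1]])
    return (scanned + offsets[:, None]).view(-1)[:n]
```

A Blelloch (work-efficient) scan would replace the per-block `cumsum` and the sequential offset pass with a recursive up-sweep/down-sweep, which is where the gain for very long lists would come from.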
Hi, thanks for sharing your code.
Based on your code, I tried to implement my model:
```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class BPR_Item(nn.Module):
    def __init__(self, user_size, item_size, dim, weight_decay):
        super().__init__()
        self.W = nn.Parameter(torch.empty(user_size, dim))
        self.H = nn.Parameter(torch.empty(item_size, dim))
        self.B = nn.Parameter(torch.rand(item_size))
        nn.init.xavier_normal_(self.W.data)
        nn.init.xavier_normal_(self.H.data)
        self.weight_decay = weight_decay

    def forward(self, u, i, j):
        """Return loss value.

        Args:
            u (torch.LongTensor): tensor containing user indices. [batch_size,]
            i (torch.LongTensor): tensor containing indices of items preferred by the user. [batch_size,]
            j (torch.LongTensor): tensor containing indices of items not preferred by the user. [batch_size,]

        Returns:
            torch.FloatTensor
        """
        u1 = self.W[u, :]
        i1 = self.H[i, :]
        j1 = self.H[j, :]
        bi = self.B[i]
        bj = self.B[j]
        x_ui = torch.mul(u1, i1).sum(dim=1) + bi
        x_uj = torch.mul(u1, j1).sum(dim=1) + bj
        x_uij = x_ui - x_uj
        log_prob = F.logsigmoid(x_uij).sum()
        regularization = self.weight_decay * (u1.norm(dim=1).pow(2).sum()
                                              + i1.norm(dim=1).pow(2).sum()
                                              + j1.norm(dim=1).pow(2).sum()
                                              + bi.pow(2) + bj.pow(2))
        return -log_prob + regularization

    def recommend(self, u, i, j):
        u1 = self.W[u, :]
        i1 = self.H[i, :]
        j1 = self.H[j, :]
        bi = self.B[i]
        bj = self.B[j]
        x_ui = torch.mul(u1, i1).sum(dim=1) + bi
        x_uj = torch.mul(u1, j1).sum(dim=1) + bj
        return x_ui, x_uj
```
The only difference is that I added `self.B`, and during training it raises an error:

```
loss.backward()
RuntimeError: grad can be implicitly created only for scalar outputs
```

It's strange, since the loss calculation is done by `torch.sum()` in `BPR_Item`. Could you please give some advice on this issue? (My torch version is 1.8.1.)
Thank you very much!
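A note on the likely cause, for readers hitting the same error: `loss.backward()` without arguments requires a scalar, but in the snippet above the bias penalty terms `bi.pow(2)` and `bj.pow(2)` are never reduced, so `regularization` (and therefore the returned loss) has shape `[batch_size]`. A minimal fix, assuming the intent is an L2 penalty on the sampled biases:

```python
# bi.pow(2) and bj.pow(2) have shape [batch_size]; without .sum() the
# regularization term, and hence the returned loss, is a vector, which
# is exactly what triggers the non-scalar backward() error.
regularization = self.weight_decay * (u1.norm(dim=1).pow(2).sum()
                                      + i1.norm(dim=1).pow(2).sum()
                                      + j1.norm(dim=1).pow(2).sum()
                                      + bi.pow(2).sum()
                                      + bj.pow(2).sum())
```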
Hi there,
Thank you for sharing your code. I am confused about the smooth_loss and the loss you mention in train.py. Could you give me more details about the difference between these two losses? It seems that smooth_loss doesn't participate in backpropagation:

```python
optimizer.zero_grad()
loss = model(u, i, j)
loss.backward()
optimizer.step()
writer.add_scalar('train/loss', loss, idx)
smooth_loss = smooth_loss*0.99 + loss*0.01
```

Thank you very much!
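For what it's worth, the last line has the form of an exponential moving average: `smooth_loss` tracks a smoothed copy of the per-batch loss, while only `loss` is backpropagated. A slight variant that makes the logging-only role explicit (an assumption about the intent, not the author's statement):

```python
# Exponential moving average of the loss, kept only for monitoring;
# loss.item() detaches the value from the autograd graph, so the EMA
# can never influence gradients.
smooth_loss = smooth_loss * 0.99 + loss.item() * 0.01
writer.add_scalar('train/smooth_loss', smooth_loss, idx)
```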
In the MovieLens (ML) dataset, each (user, item) pair is visited at most once. However, in other datasets, an item might be referenced multiple times by one user. In that case, how should one handle an item i that appears in both the training set and the test set? Do I need to remove item i from the test set?
If item i is removed, the evaluation metric represents the model's predictive power on items it has not seen yet; however, it does not capture repetitive patterns within a user. If item i is not removed from the test set, the evaluation metric could become vulnerable to overfitting; however, it could capture the repetitive pattern.
The current implementation excludes observed items, i.e., items from the training set, during both the data-splitting and evaluation processes. Adding an option for this consideration would be a good way to handle the issue.
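For illustration, excluding observed items at evaluation time usually amounts to masking their scores before taking the top-k. A minimal sketch (the function name and signature are mine, not the repository's API):

```python
import torch

def topk_unobserved(scores, observed_mask, k):
    # scores:        [num_users, num_items] predicted preference scores.
    # observed_mask: boolean tensor of the same shape, True where the
    #                (user, item) pair appears in the training set; those
    #                entries are pushed to -inf so they can never rank.
    masked = scores.masked_fill(observed_mask, float('-inf'))
    return masked.topk(k, dim=1).indices
```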
Hi,
In the `__len__` function of the PyTorch dataset you return `10*len(self.pair)`. Could you explain why?
When I debug the code, `len(self.pair)` equals 802,406, which I understand to be the number of links in the training set (as the total number of links for ml-1M is 1M). However, multiplying it by a factor of 10 gives a dataset of length 8,024,060, which I don't understand.
Thank you in advance!
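One common reason for this pattern (an assumption about the intent, not a confirmed answer): inflating `__len__` lets each positive (user, item) pair be visited several times per epoch, each time paired with a freshly drawn negative item, which effectively yields 10 negative samples per observed link. A sketch of such a dataset (class and field names are illustrative):

```python
import random
from torch.utils.data import Dataset

class TripletDataset(Dataset):
    def __init__(self, pair, item_size):
        self.pair = pair              # list of (user, positive_item) links
        self.item_size = item_size

    def __len__(self):
        # Each positive pair is revisited 10 times per epoch.
        return 10 * len(self.pair)

    def __getitem__(self, idx):
        u, i = self.pair[idx % len(self.pair)]  # wrap around the pairs
        j = random.randrange(self.item_size)    # fresh uniform negative
        return u, i, j
```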
Think about it
The current implementation prepares a batch, i.e., a set of triplets, on the CPU.
However, BPR uses a small hidden dimension size, e.g., 4 or 8, which results in low GPU utilization.
Therefore, there should be idle computation power available on the GPU.
Preparing a batch requires sampling from the tuples and drawing a negative item for each tuple, which can be implemented easily; see the sketch below.
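A minimal sketch of what GPU-side batch preparation could look like (the names and signature are mine; collision checks of negatives against true positives are omitted for brevity):

```python
import torch

def sample_triplets(user_pos, item_size, batch_size, device='cuda'):
    # user_pos: LongTensor [num_pairs, 2] of (user, positive_item) pairs,
    # already resident on the GPU, so sampling never touches the CPU.
    idx = torch.randint(user_pos.size(0), (batch_size,), device=device)
    u = user_pos[idx, 0]                                         # users
    i = user_pos[idx, 1]                                         # positives
    j = torch.randint(item_size, (batch_size,), device=device)   # negatives
    return u, i, j
```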
Line 21 in 899582d
In this line, maybe you should write `self.batch_size` instead of `args.batch_size`?