
fuxictr's Introduction

RecZoo

RecZoo: A curated model zoo for recommendation tasks

Matching

No. | Model | Publication
1 | UltraGCN | Kelong Mao, Jieming Zhu, Xi Xiao, Biao Lu, Zhaowei Wang, Xiuqiang He. UltraGCN: Ultra Simplification of Graph Convolutional Networks for Recommendation, in CIKM 2021.
2 | SimpleX | Kelong Mao, Jieming Zhu, Jinpeng Wang, Quanyu Dai, Zhenhua Dong, Xi Xiao, Xiuqiang He. SimpleX: A Simple and Strong Baseline for Collaborative Filtering, in CIKM 2021.

Ranking

No. | Model | Publication
1 | FinalMLP | Kelong Mao, Jieming Zhu, Liangcai Su, Guohao Cai, Yuru Li, Zhenhua Dong. FinalMLP: An Enhanced Two-Stream MLP Model for CTR Prediction, in AAAI 2023.
2 | FinalNet | Jieming Zhu, Qinglin Jia, Guohao Cai, Quanyu Dai, Jingjie Li, Zhenhua Dong, Ruiming Tang, Rui Zhang. FINAL: Factorized Interaction Layer for CTR Prediction, in SIGIR 2023.

Reranking

Pretraining

No. | Model | Publication
1 | UNBERT | Qi Zhang, Jingjie Li, Qinglin Jia, Chuyuan Wang, Jieming Zhu, Zhaowei Wang, Xiuqiang He. UNBERT: User-News Matching BERT for News Recommendation, in IJCAI 2021.

fuxictr's People

Contributors

liangcaisu, lsjsj92, lu-minous, rsj123, sdilbaz, xpai, zhujiem


fuxictr's Issues

Implementation of the model DIN

I am not sure whether the implementation of DIN here is consistent with the officially released code of DIN/DIEN.
Given query_field (e.g., query Goods ID, query Cate ID) and history_field (e.g., Goods ID, Cate ID), the original implementation first concatenates the embeddings of history_field (i.e., [Goods ID embedding, Cate ID embedding]) and then applies the so-called "local activation unit" (LAU) with the concatenated query_field (i.e., [query Goods ID embedding, query Cate ID embedding]). In contrast, the implementation here seems to apply the LAU per field first and then do the concatenation.
Specifically, the code in DIN.py:

# 1. perform the LAU for each history_field
for idx, (din_query_field, din_history_field) \
        in enumerate(zip(self.din_query_field, self.din_history_field)):
    item_emb = feature_emb_dict[din_query_field]
    history_sequence_emb = feature_emb_dict[din_history_field]
    pooled_history_emb = self.attention_layers[idx](item_emb, history_sequence_emb)
    feature_emb_dict[din_history_field] = pooled_history_emb
# 2. do the concatenation here
feature_emb = self.embedding_layer.dict2tensor(feature_emb_dict)

If my understanding is correct, I wonder whether this affects the final performance compared with the original implementation.
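For reference, here is a small self-contained sketch (illustrative only, not FuxiCTR's or the official DIN code) of the original ordering: concatenate the per-field embeddings first, then apply a single LAU on the concatenated query/history embeddings. The simple dot-product attention below is a stand-in for the LAU scoring MLP.

import torch

B, L, D = 4, 10, 8                                                          # batch, history length, per-field dim
query = torch.cat([torch.randn(B, D), torch.randn(B, D)], dim=-1)           # [query Goods ID emb, query Cate ID emb]
history = torch.cat([torch.randn(B, L, D), torch.randn(B, L, D)], dim=-1)   # [Goods ID emb, Cate ID emb]
scores = (history * query.unsqueeze(1)).sum(-1)                             # stand-in for the LAU scoring network
pooled = (scores.softmax(-1).unsqueeze(-1) * history).sum(1)                # pooling on the concatenated embeddings, [B, 2D]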

Upgrade to Polars 1.2

When upgrading to Polars 1.2, some of the functions used in the code are deprecated. For example, in criteo.py:

# before (deprecated in recent Polars releases)
return pl.col(col_name).apply(_convert_to_bucket).cast(pl.Int32)

# after
return pl.col(col_name).map_elements(_convert_to_bucket, return_dtype=pl.Int32)

The "apply" expression method is no longer available.

Question about training for 100 epochs

Hello, when trying the DeepFM and other models you provide, I found that with epochs set to 100 they achieve good results on the Criteo dataset, whereas recommendation models usually converge within one epoch. How can running so many epochs still give good results? Sorry to bother you!

Many hyperlinks are no longer valid and return 404. Please update them.

MultiHeadAttention bug when num_heads is greater than 1

The head-splitting code is buggy: it reshapes without transposing. The buggy code affects AutoInt, DESTINE, and InterHAt.

query = query.view(batch_size * self.num_heads, -1, self.attention_dim)
key = key.view(batch_size * self.num_heads, -1, self.attention_dim)
value = value.view(batch_size * self.num_heads, -1, self.attention_dim)

The buggy code comes from this reference implementation: https://zhuanlan.zhihu.com/p/47812375

The correct code should be:

query = query.view(batch_size, seq_len, self.num_heads, self.attention_dim).transpose(1, 2)
key = key.view(batch_size, seq_len, self.num_heads, self.attention_dim).transpose(1, 2)
value = value.view(batch_size, seq_len, self.num_heads, self.attention_dim).transpose(1, 2)
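A quick standalone shape check (my own sketch, not the repository's code) showing why the transpose matters when num_heads > 1:

import torch

batch_size, seq_len, num_heads, attention_dim = 2, 5, 4, 8
x = torch.randn(batch_size, seq_len, num_heads * attention_dim)

# buggy split: the shape looks plausible, but sequence positions and heads get interleaved
buggy = x.view(batch_size * num_heads, -1, attention_dim)
print(buggy.shape)   # torch.Size([8, 5, 8])

# correct split: heads become an explicit dimension, the sequence stays intact
fixed = x.view(batch_size, seq_len, num_heads, attention_dim).transpose(1, 2)
print(fixed.shape)   # torch.Size([2, 4, 5, 8]) == (batch, heads, seq_len, head_dim)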

Maybe a bug found in feature_preprocess.py

I am testing the latest commit of this repo with the Criteo dataset. There is a self-defined preprocessor "convert_to_bucket". This preprocessor needs a column name as its argument. I got an error in this piece of code:


1.     def preprocess(self, ddf):
2.         logging.info("Preprocess feature columns...")
3.         all_cols = self.label_cols + self.feature_cols[::-1]
4.         print(all_cols)
5.         for col in all_cols:
6.             name = col["name"]
7.             #if name in ddf.columns:
8.             if name in  ddf.collect_schema().names():
9.                 print(name)
10.                 fill_na = "" if col["dtype"] in ["str", str] else 0
11.                 fill_na = col.get("fill_na", fill_na)
12.                 ddf = ddf.with_columns(pl.col(name).fill_null(fill_na))
13.             if col.get("preprocess"):
14.                 preprocess_args = re.split(r"\(|\)", col["preprocess"])
15.                 preprocess_fn = getattr(self, preprocess_args[0])
16.                 print(preprocess_args)
17.                 ddf = ddf.with_columns(
18.                     #preprocess_fn(*preprocess_args[1:-1])
19.                     preprocess_fn(name, *preprocess_args[1:-1])
20.                     .alias(name)
21.                     .cast(self.dtype_dict[name])
22.                 )
23.         active_cols = [col["name"] for col in all_cols if col.get("active") != False]
24.         ddf = ddf.select(active_cols)
25.         return ddf

The error happens at line 18: if no arguments are provided, ddf.with_columns(preprocess_fn().alias(name).cast(...)) throws an exception. At least the column name should be given to this function. I have tested my change (line 19) and it seems correct. So does this mean we should add the column name to the function arguments?

Training on the original Criteo dataset

Hey, I have a question about training DeepFM on the original Criteo dataset. Is this possible with the code provided in the repository? The dataset shipped with the demo (train_sample.csv, test_sample.csv, etc.) has 19 columns. Are these columns from the Criteo dataset? How can I use the original Criteo dataset, given that its data is numerical and the categorical columns are hashed? And how do I handle the missing labels in the test set?

Where is the paper for FINAL model?

Where is the paper for the FINAL model (FINAL: Factorized Interaction Layer for CTR Prediction)? I can't find it online. Based on the source code, the FINAL model uses multiplicative feature interactions, but I want to read the paper to gain more insight into the model. I would appreciate it if you could provide it.

Sequence feature in demo "DeepFM_with_sequence_feature.py".

Should the field "sequence" share embeddings with the field "adgroup_id"? I found that the method "encoder.fit()" assigns an encoder such as a tokenizer to each field. Since the given tiny dataset records the user's historical behavior (an ad sequence), my understanding is that an id appearing in the field "sequence" may also appear in the field "adgroup_id". As a result, it seems that the field "sequence" should share the same encoder (i.e., tokenizer) with the field "adgroup_id", but the demo "DeepFM_with_sequence_feature.py" gives these two fields separate encoders.
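If the intent is to share the vocabulary and embedding table, the feature_cols config appears to support a share_embedding option (an assumption on my part based on the sequence-feature example configs; please verify against your FuxiCTR version). A rough sketch of what that would look like:

# Sketch of a feature_cols definition; 'share_embedding' is assumed to let the
# sequence field reuse the tokenizer/embedding of 'adgroup_id' (verify this key).
feature_cols = [
    {"name": "adgroup_id", "active": True, "dtype": "str", "type": "categorical"},
    {"name": "sequence", "active": True, "dtype": "str", "type": "sequence",
     "splitter": "^", "max_len": 128,
     "share_embedding": "adgroup_id"},   # assumed option, see lead-in above
]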

Normalizer and NaN values.

For StandardScaler, it looks like NaN values are supported; see class Normalizer:

null_index = np.isnan(X)

However, during preprocessing, _fill_na() fills na_value for non-string columns.
So:

  • for dtype=str, the X values will be strings
  • for dtype=float/int, the missing X values will be na_value

In the first case, np.isnan throws an error because the elements of X are strings.
In the second case, there is no point in normalizing the numbers if na_value has already been filled in.

Is this behavior expected or not?
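A minimal numpy check (my own sketch) illustrating both cases:

import numpy as np

# dtype=str case: after _fill_na fills "", np.isnan fails on a string array
X_str = np.array(["", "1.5", "2.0"])
try:
    np.isnan(X_str)
except TypeError as e:
    print(e)   # ufunc 'isnan' not supported for the input types ...

# dtype=float/int case: na_value (e.g. 0) has already replaced the NaNs, so
# null_index is all False and the filled values are normalized as if they were real data
X_num = np.array([0.0, 1.5, 2.0])
print(np.isnan(X_num))   # [False False False]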

How to save model for tf serving?

I want to save a model such as DCN_tf for serving. If I add model.save("path/to/model") at the end of run_expid.py, the following error occurs:

cannot be saved either because the input shape is not available or because the forward pass of the model is not defined. To define a forward pass, please override Model.call(). To specify an input shape, either call build(input_shape) directly, or call the model on actual data using Model(), Model.fit(), or Model.predict(). If you have a custom training step, please make sure to invoke the forward pass in train step through Model.__call__, i.e. model(inputs), as opposed to model.call().

I added model(input) before model.save(), as in the following code, and it works. However, when I curl the service, I must also pass 'clk', which is already specified as the label.

for i in train_gen:
    model(i)
    break
model.save("./model/SavedModels/1")

Thank you!

feature embedding bug?

Hi,

I think I found a bug in the feature_embedding.py file at line 94:
self.embedding_layers[feature] = tf.keras.layers.Dense(feat_emb_dim, user_bias=False)

According to the TensorFlow docs (https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dense), there is no argument user_bias; the argument is called use_bias. I also get the following error message when I run DeepFM with numeric features: TypeError: ('Keyword argument not understood:', 'user_bias')

expid repeat

When I run "python run_expid.py --config /..../ --expid 191 --gpu 0", I get two result logs: ..._tune_191_ahdfy68.log and ..._tune_051_os191usd.log. The second log's file name also contains the string '191'.

Problem with BN of MLP_Block in Pytorch version

Hi, I found that MLP_Block does not work properly with batch_norm=True when the input tensor has three or more dimensions.

Here is a toy example.

a = torch.randn(16, 128, 64)
mlp_block = MLP_Block(input_dim=64, hidden_units=[1024,1024,1024],batch_norm=True)
mlp_block(a)

The output is:

RuntimeError: running_mean should contain 128 elements not 1024
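One workaround sketch (reusing the toy example above; this is standard PyTorch reshaping, not the repository's fix) is to flatten the leading dimensions so BatchNorm1d sees a 2D (N, C) input with C == input_dim:

# Workaround sketch: flatten leading dims before the MLP, then restore the shape.
a = torch.randn(16, 128, 64)
b, f, d = a.shape
out = mlp_block(a.reshape(b * f, d))   # BN now normalizes over the 64 input features
out = out.reshape(b, f, -1)            # back to (16, 128, 1024)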

Adding new model leads to AttributeError: module 'model_zoo' has no attribute 'DCNv2PositiveWeight'

I have downloaded the newest version of the FuxiCTR library and decided to integrate the work from https://github.com/SkylerLinn/Understanding-the-Ranking-Loss, specifically their DCNv2PositiveWeight and DCNv2ListCE.
I have created a folder under model_zoo like this for DCNv2PositiveWeight:

model_zoo
    DCNv2PositiveWeight
         config
              dataset_config
              model_config
         src
              __init__.py
              DCNv2PositiveWeight.py
         fuxictr_version.py
         run_expid.py
         README.md 

In the __init__.py under model_zoo I have added:
from .DCNv2PositiveWeight.src import DCNv2PositiveWeight

In the __init__.py under DCNv2PositiveWeight/src I have added:
from .DCNv2PositiveWeight import DCNv2PositiveWeight

And finally, the code for DCNv2PositiveWeight is the same as in the Understanding-the-Ranking-Loss repo:
import torch
from torch import nn
from fuxictr.pytorch.models import BaseModel
from fuxictr.pytorch.layers import FeatureEmbedding, MLP_Block, CrossNetV2, CrossNetMix
from fuxictr.metrics import evaluate_metrics
from sklearn.metrics import roc_auc_score, log_loss
import numpy as np

class DCNv2PositiveWeight(BaseModel):
    def __init__(self, 
                 feature_map, 
                 model_id="DCNv2PositiveWeight", 
                 gpu=-1,
                 model_structure="parallel",
                 use_low_rank_mixture=False,
                 low_rank=32,
                 num_experts=4,
                 learning_rate=1e-3, 
                 embedding_dim=10, 
                 stacked_dnn_hidden_units=[], 
                 parallel_dnn_hidden_units=[],
                 dnn_activations="ReLU",
                 num_cross_layers=3,
                 net_dropout=0, 
                 batch_norm=False, 
                 embedding_regularizer=None,
                 net_regularizer=None, 
                 pos_weight=1.,
                 **kwargs):
        super(DCNv2PositiveWeight, self).__init__(feature_map, 
                                    model_id=model_id, 
                                    gpu=gpu, 
                                    embedding_regularizer=embedding_regularizer, 
                                    net_regularizer=net_regularizer,
                                    **kwargs)
        self.embedding_layer = FeatureEmbedding(feature_map, embedding_dim)
        self.pos_weight = pos_weight
        
        input_dim = feature_map.sum_emb_out_dim()
        if use_low_rank_mixture:
            self.crossnet = CrossNetMix(input_dim, num_cross_layers, low_rank=low_rank, num_experts=num_experts)
        else:
            self.crossnet = CrossNetV2(input_dim, num_cross_layers)
        self.model_structure = model_structure
        assert self.model_structure in ["crossnet_only", "stacked", "parallel", "stacked_parallel"], \
               "model_structure={} not supported!".format(self.model_structure)
        if self.model_structure in ["stacked", "stacked_parallel"]:
            self.stacked_dnn = MLP_Block(input_dim=input_dim,
                                         output_dim=None, # output hidden layer
                                         hidden_units=stacked_dnn_hidden_units,
                                         hidden_activations=dnn_activations,
                                         output_activation=None, 
                                         dropout_rates=net_dropout,
                                         batch_norm=batch_norm)
            final_dim = stacked_dnn_hidden_units[-1]
        if self.model_structure in ["parallel", "stacked_parallel"]:
            self.parallel_dnn = MLP_Block(input_dim=input_dim,
                                          output_dim=None, # output hidden layer
                                          hidden_units=parallel_dnn_hidden_units,
                                          hidden_activations=dnn_activations,
                                          output_activation=None, 
                                          dropout_rates=net_dropout, 
                                          batch_norm=batch_norm)
            final_dim = input_dim + parallel_dnn_hidden_units[-1]
        if self.model_structure == "stacked_parallel":
            final_dim = stacked_dnn_hidden_units[-1] + parallel_dnn_hidden_units[-1]
        if self.model_structure == "crossnet_only": # only CrossNet
            final_dim = input_dim
        self.fc = nn.Linear(final_dim, 1)
        self.compile(kwargs["optimizer"], kwargs["loss"], learning_rate)
        self.reset_parameters()
        self.model_to_device()

    def forward(self, inputs):
        X = self.get_inputs(inputs)
        feature_emb = self.embedding_layer(X, flatten_emb=True)
        cross_out = self.crossnet(feature_emb)
        if self.model_structure == "crossnet_only":
            final_out = cross_out
        elif self.model_structure == "stacked":
            final_out = self.stacked_dnn(cross_out)
        elif self.model_structure == "parallel":
            dnn_out = self.parallel_dnn(feature_emb)
            final_out = torch.cat([cross_out, dnn_out], dim=-1)
        elif self.model_structure == "stacked_parallel":
            final_out = torch.cat([self.stacked_dnn(cross_out), self.parallel_dnn(feature_emb)], dim=-1)
        y_pred = self.fc(final_out)
        y_pred = self.output_activation(y_pred)
        return_dict = {"y_pred": y_pred}
        return return_dict

    def compute_loss(self, return_dict, y_true):
        weight = torch.where(y_true==1., self.pos_weight, 1)
        loss = self.loss_fn(return_dict["y_pred"], y_true, reduction='mean',weight=weight)
        loss += self.regularization_loss()
        return loss
    
    def evaluate_metrics(self, y_true, y_pred, metrics, group_id=None):
        print(metrics)
        ret_dict = dict()
        if 'wAUC' in metrics:
            sample_weight=np.where(y_true==1., self.pos_weight, 1.)
            ret_dict.update({'wAUC':roc_auc_score(y_true, y_pred, sample_weight=sample_weight, average='samples')})
        if 'wlogloss' in metrics:
            sample_weight=np.where(y_true==1., self.pos_weight, 1.)
            ret_dict.update({'wlogloss':log_loss(y_true=y_true, y_pred=y_pred,sample_weight=sample_weight)})
        tmp = [_ for _ in metrics if _ not in ['wAUC','wlogloss']]
        ret_dict.update(evaluate_metrics(y_true, y_pred, tmp, group_id))
        return ret_dict

I call a script that uses autotuner.enumerate_params and autotuner.grid_search. autotuner.grid_search calls the run_expid.py inside the experiment folder, which I modified to just print:

print('model_zoo options: ', dir(model_zoo))

However, the output doesn't include DCNv2PositiveWeight; instead, this is what I get:

...
Traceback (most recent call last):
  File "/Users/dsaranovic/Code/FieldEndToEndLoss/experiment/run_expid.py", line 68, in <module>
    model_class = getattr(model_zoo, params['model'])
AttributeError: module 'model_zoo' has no attribute 'DCNv2PositiveWeight'
model_zoo options:  ['AFM', 'AFN', 'AOANet', 'AutoInt', 'BST', 'CCPM', 'DCN', 'DCNv2', 'DESTINE', 'DIEN', 'DIN', 'DLRM', 'DMIN', 'DMR', 'DNN', 'DSSM', 'DeepCrossing', 'DeepFM', 'DeepIM', 'EDCN', 'ETA', 'FFM', 'FFMv2', 'FGCNN', 'FLEN', 'FM', 'FiBiNET', 'FiGNN', 'FinalMLP', 'FinalNet', 'FmFM', 'FwFM', 'HFM', 'HOFM', 'InterHAt', 'LR', 'LorentzFM', 'MMoE', 'MaskNet', 'NFM', 'ONN', 'ONNv2', 'PEPNet', 'PNN', 'PPNet', 'SAM', 'SDIM', 'SharedBottom', 'WideDeep', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', 'multitask', 'xDeepFM']
....

How can we add additional models to model_zoo so that the module sees them and we can run them on full data with the autotuner (enumerate_params, grid_search) and run_expid.py?

I feel like I am missing something obvious?
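One thing worth checking (an assumption on my part: run_expid.py may be importing a different model_zoo package, for example an installed or differently-pathed copy, rather than the folder you edited):

# Diagnostic sketch: confirm which model_zoo package run_expid.py actually imports.
import model_zoo
print(model_zoo.__file__)                         # should point at the model_zoo/ folder you edited
print(hasattr(model_zoo, "DCNv2PositiveWeight"))  # True once the right package is on sys.path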

HFM may not be available due to compatibility issues

HFM adopts HolographicInteractionLayer in fuxictr/pytorch/layers/interaction.py.

However, HolographicInteractionLayer may not be available in PyTorch 1.10 because torch.rfft/torch.irfft have changed.

Here is a solution for reference:

import torch

try:
    from torch import irfft
    from torch import rfft
except ImportError:
    from torch.fft import irfft2
    from torch.fft import rfft2

    def rfft(x, d):
        t = rfft2(x, dim=(-d))
        return torch.stack((t.real, t.imag), -1)

    def irfft(x, d, signal_sizes):
        return irfft2(torch.complex(x[:, :, 0], x[:, :, 1]), s=signal_sizes, dim=(-d))

When reproducing TransAct, the validation data all become NaN after the embedding layer

The problem is located in this code:

Line 157        X = self.get_inputs(inputs)
Line 158        feature_emb_dict = self.embedding_layer(X)

There is no problem at all during training, but during evaluation the validation data all become NaN after passing through this embedding_layer. I checked and all the weights of the embedding_layer are NaN. Where could the problem be? (The validation data is loaded correctly; X is normal for the validation set.)

2024-07-21 15:36:35,644 P7079 INFO Evaluation @epoch 1 - batch 1:
Warning: NaN value detected in the weights of embedding_layers.userid.weight in the embedding layer
Warning: NaN value detected in the weights of embedding_layers.adgroup_id.weight in the embedding layer
Warning: NaN value detected in the weights of embedding_layers.pid.weight in the embedding layer
Warning: NaN value detected in the weights of embedding_layers.cate_id.weight in the embedding layer
Warning: NaN value detected in the weights of embedding_layers.campaign_id.weight in the embedding layer
Warning: NaN value detected in the weights of embedding_layers.customer.weight in the embedding layer
Warning: NaN value detected in the weights of embedding_layers.brand.weight in the embedding layer
Warning: NaN value detected in the weights of embedding_layers.cms_segid.weight in the embedding layer
Warning: NaN value detected in the weights of embedding_layers.cms_group_id.weight in the embedding layer
Warning: NaN value detected in the weights of embedding_layers.final_gender_code.weight in the embedding layer
Warning: NaN value detected in the weights of embedding_layers.age_level.weight in the embedding layer
Warning: NaN value detected in the weights of embedding_layers.pvalue_level.weight in the embedding layer
Warning: NaN value detected in the weights of embedding_layers.shopping_level.weight in the embedding layer
Warning: NaN value detected in the weights of embedding_layers.occupation.weight in the embedding layer
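For what it's worth, here is a generic PyTorch diagnostic (my own sketch, assuming the NaNs first appear during training, e.g. from an exploding loss or a too-high learning rate) that can be called right after each optimizer step to find where the NaNs start:

import torch

def check_nan(model: torch.nn.Module, loss: torch.Tensor, step: int) -> None:
    """Report the first training step at which the loss or any parameter becomes NaN."""
    if torch.isnan(loss).any():
        print(f"step {step}: NaN loss")
    for name, param in model.named_parameters():
        if torch.isnan(param).any():
            print(f"step {step}: NaN in parameter {name}")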

About a simple new model I wrote

I designed a very simple model (similar to MaskNet) that runs very fast. Since I have graduated and no longer have GPU resources, it has hardly been tuned, yet its accuracy is second only to FinalMLP. Would you be interested in taking it over?

ACC metric can't work

y_pred is a one-dimensional tensor, so it has no axis=1.
In base_model.py, y_pred has already been flattened to one dimension.

[Suggestion] Update the logic of preprocessing for efficiency

I suggest updating the preprocessing logic for efficiency.

In many cases, a user's behavior sequence is the same for all of that user's training samples. Likewise, the features of a user or an item are often the same across all training samples.

However, the current version of FuxiCTR receives the training dataset as a single DataFrame, so these features (e.g., a user's behavior sequence, or the features of a user or an item) are stored redundantly in that DataFrame, which consumes too much memory (especially for large-scale datasets). Also, fit/transform in feature_preprocessor is performed on these redundant behavior sequences and features, which takes too long (especially for large-scale datasets).

So, to operate more efficiently, I hope these redundancies can be removed. To this end, I suggest changing the preprocessing logic to also receive a user_df and an item_df for each dataset and to fit/transform only the unique features (i.e., those in user_df and item_df), as sketched below.
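A rough sketch of the suggested flow (user_df, item_df, interaction_df, and the join keys are hypothetical names, not existing FuxiCTR APIs): fit/transform would run on the small per-user and per-item tables, which are then joined onto the interaction log only when a training frame is needed.

import pandas as pd

def build_training_frame(interaction_df: pd.DataFrame,
                         user_df: pd.DataFrame,
                         item_df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical sketch: user/item features live in their own deduplicated
    tables (one row per user/item) and are only joined at batch-building time."""
    return (interaction_df
            .merge(user_df, on="user_id", how="left")
            .merge(item_df, on="item_id", how="left"))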

Asking for help: unable to use torch.parallel

I recently implemented my own model on top of FuxiCTR. Since GPU memory is tight, I want to use the torch.parallel module to train on two GPUs, but no matter how I configure it, the model always runs on a single GPU. How can I solve this?
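For reference, a generic PyTorch sketch of data parallelism (my own addition, not a FuxiCTR feature; it assumes the model is a plain nn.Module):

import torch
import torch.nn as nn

# Generic sketch: DataParallel splits each batch across the visible GPUs.
model = nn.Linear(10, 1)                     # stand-in for the FuxiCTR model
if torch.cuda.device_count() >= 2:
    model = nn.DataParallel(model, device_ids=[0, 1])
model = model.to("cuda:0" if torch.cuda.is_available() else "cpu")

Note that nn.DataParallel replicates the full model on every GPU, so it mainly splits the batch; if the goal is to relieve per-GPU memory, a smaller batch size or gradient accumulation may help more.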

Call for model implementations

With streaming data loading, evaluation is not performed at the end of each epoch.

len(data_generator) != the actual number of batches in npz_block_dataloader

2024-04-14 11:01:33,212 P42965 INFO Evaluation @epoch 2 - batch 1060:
2024-04-14 16:15:11,489 P42965 INFO Evaluation @epoch 3 - batch 2120:
2024-04-14 21:28:40,049 P42965 INFO Evaluation @epoch 4 - batch 3180:
2024-04-15 02:43:36,228 P42965 INFO Evaluation @epoch 5 - batch 4240:

torch.jit.trace not working

torch.jit.trace(fibinet, batch_data)
File "/anaconda3/envs/fuxictr3.8/lib/python3.8/site-packages/torch/jit/_trace.py", line 794, in trace
return trace_module(
File "/anaconda3/envs/fuxictr3.8/lib/python3.8/site-packages/torch/jit/_trace.py", line 1056, in trace_module
module._c._create_method_from_trace(
File "/anaconda3/envs/fuxictr3.8/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/anaconda3/envs/fuxictr3.8/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1488, in _slow_forward
result = self.forward(*input, **kwargs)
TypeError: forward() takes 2 positional arguments but 3 were given

Emb_LayerNorm bug in MaskNet

The paper uses concat(LN(e1), LN(e2), ..., LN(ef)), but the code uses nn.LayerNorm([feature_map.num_fields, embedding_dim]). This makes normalization happen over the last two dimensions.
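A small sketch (my own, standard PyTorch) of the per-field normalization the paper describes:

import torch
import torch.nn as nn

# With a (batch, num_fields, embedding_dim) tensor, nn.LayerNorm(embedding_dim)
# normalizes each field embedding independently over the last dimension only,
# which matches concat(LN(e_1), ..., LN(e_f)); nn.LayerNorm([num_fields, embedding_dim])
# instead normalizes jointly over the last two dimensions.
emb = torch.randn(8, 10, 16)     # (B, F, D)
per_field_ln = nn.LayerNorm(16)
out = per_field_ln(emb)          # equivalent to concatenating LN(e_i) field by field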

FNN model problem

We can't find a function called self.reduce_learning_rate() in the base model. Has it been renamed to self.lr_decay()?

Question about the generated data format

After running example1_build_dataset_to_npz.py, the generated train/valid/test files have no .npz suffix, so loading them afterwards fails with errors such as train.npz not existing.
After forcibly adding the .npz suffix, loading raises ValueError: Cannot load file containing pickled data when allow_pickle=False.
Thanks for your help!

sequence feature

When I try to use sequence features, the program throws an array out-of-bounds error. Could you please give an example of using sequence features?

Failed to run FINAL

Traceback (most recent call last):
File "run_expid.py", line 69, in
model.fit(train_gen, validation_data=valid_gen, **params)
File "/root/miniconda3/lib/python3.8/site-packages/fuxictr/pytorch/models/ctr_model.py", line 154, in fit
self.train_epoch(data_generator)
File "/root/miniconda3/lib/python3.8/site-packages/fuxictr/pytorch/models/ctr_model.py", line 210, in train_epoch
loss = self.train_step(batch_data)
File "/root/miniconda3/lib/python3.8/site-packages/fuxictr/pytorch/models/ctr_model.py", line 193, in train_step
loss = self.get_total_loss(batch_data)
File "/root/miniconda3/lib/python3.8/site-packages/fuxictr/pytorch/models/ctr_model.py", line 90, in get_total_loss
total_loss = self.add_loss(inputs) + self.add_regularization()
File "/root/FuxiCTR/model_zoo/FINAL/model/FINAL.py", line 107, in add_loss
return_dict = self.forward(inputs)
File "/root/FuxiCTR/model_zoo/FINAL/model/FINAL.py", line 87, in forward
y1 = self.forward1(feature_emb)
File "/root/FuxiCTR/model_zoo/FINAL/model/FINAL.py", line 96, in forward1
X = self.field_gate(X)
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/root/FuxiCTR/model_zoo/FINAL/model/FINAL.py", line 132, in forward
if gate_residual == "concat":
NameError: name 'gate_residual' is not defined

in file FINAL.py, line 130:

    def forward(self, feature_emb):
        gates = self.linear(feature_emb.transpose(1, 2)).transpose(1, 2)
        if gate_residual == "concat":
            out = torch.cat([feature_emb, feature_emb * gates], dim=1) # b x 2f x d
        else:
            out = feature_emb + feature_emb * gates
        return out

The line if gate_residual == "concat": should be if self.gate_residual == "concat":
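For clarity, the corrected field-gate forward (the same code as above, with only the attribute access fixed):

    def forward(self, feature_emb):
        gates = self.linear(feature_emb.transpose(1, 2)).transpose(1, 2)
        if self.gate_residual == "concat":
            out = torch.cat([feature_emb, feature_emb * gates], dim=1)  # b x 2f x d
        else:
            out = feature_emb + feature_emb * gates
        return out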
