
fuxictr's Introduction

RecZoo

RecZoo: A curated model zoo for recommendation tasks

Matching

No. | Model | Publication
1 | UltraGCN | Kelong Mao, Jieming Zhu, Xi Xiao, Biao Lu, Zhaowei Wang, Xiuqiang He. UltraGCN: Ultra Simplification of Graph Convolutional Networks for Recommendation, in CIKM 2021.
2 | SimpleX | Kelong Mao, Jieming Zhu, Jinpeng Wang, Quanyu Dai, Zhenhua Dong, Xi Xiao, Xiuqiang He. SimpleX: A Simple and Strong Baseline for Collaborative Filtering, in CIKM 2021.

Ranking

No. | Model | Publication
1 | FinalMLP | Kelong Mao, Jieming Zhu, Liangcai Su, Guohao Cai, Yuru Li, Zhenhua Dong. FinalMLP: An Enhanced Two-Stream MLP Model for CTR Prediction, in AAAI 2023.
2 | FinalNet | Jieming Zhu, Qinglin Jia, Guohao Cai, Quanyu Dai, Jingjie Li, Zhenhua Dong, Ruiming Tang, Rui Zhang. FINAL: Factorized Interaction Layer for CTR Prediction, in SIGIR 2023.

Reranking

Pretraining

No. | Model | Publication
1 | UNBERT | Qi Zhang, Jingjie Li, Qinglin Jia, Chuyuan Wang, Jieming Zhu, Zhaowei Wang, Xiuqiang He. UNBERT: User-News Matching BERT for News Recommendation, in IJCAI 2021.

fuxictr's People

Contributors

liangcaisu, lsjsj92, lu-minous, rsj123, sdilbaz, xpai, zhujiem


fuxictr's Issues

Implementation of the model DIN

I am not sure whether the implementation of DIN here is consistent with the officially released code of DIN/DIEN.
Given query_field (e.g., query Goods ID, query Cate ID) and history_field (e.g., Goods ID, Cate ID), the original implementation first concatenates the embeddings of history_field (i.e., [Goods ID embedding, Cate ID embedding]) and then applies the so-called "local activation unit" (LAU) with the concatenated query_field (i.e., [query Goods ID embedding, query Cate ID embedding]). In contrast, the implementation here seems to apply the LAU per field first and then do the concatenation.
Specifically, the code in DIN.py:

# 1. perform the LAU for each history_field
for idx, (din_query_field, din_history_field) \
        in enumerate(zip(self.din_query_field, self.din_history_field)):
    item_emb = feature_emb_dict[din_query_field]
    history_sequence_emb = feature_emb_dict[din_history_field]
    pooled_history_emb = self.attention_layers[idx](item_emb, history_sequence_emb)
    feature_emb_dict[din_history_field] = pooled_history_emb
# 2. do the concatenation here
feature_emb = self.embedding_layer.dict2tensor(feature_emb_dict)

If my understanding is correct, I wonder whether this affects the final performance compared with the original implementation.
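For reference, here is a small self-contained sketch (illustrative only, not FuxiCTR's or the official DIN code) of the original ordering: concatenate the per-field embeddings first, then apply a single LAU on the concatenated query/history embeddings. The simple dot-product attention below is a stand-in for the LAU scoring MLP.

import torch

B, L, D = 4, 10, 8                                                          # batch, history length, per-field dim
query = torch.cat([torch.randn(B, D), torch.randn(B, D)], dim=-1)           # [query Goods ID emb, query Cate ID emb]
history = torch.cat([torch.randn(B, L, D), torch.randn(B, L, D)], dim=-1)   # [Goods ID emb, Cate ID emb]
scores = (history * query.unsqueeze(1)).sum(-1)                             # stand-in for the LAU scoring network
pooled = (scores.softmax(-1).unsqueeze(-1) * history).sum(1)                # pooling on the concatenated embeddings, [B, 2D]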

Upgrade to Polars 1.2

When upgrading to Polars 1.2, some of the functions used in the code are deprecated. For example, in criteo.py:

# before (deprecated in recent Polars releases)
return pl.col(col_name).apply(_convert_to_bucket).cast(pl.Int32)

# after
return pl.col(col_name).map_elements(_convert_to_bucket, return_dtype=pl.Int32)

The "apply" expression method is no longer available.

Question about training for 100 epochs

Hello, when trying the DeepFM and other models you provide, I found that with epochs set to 100 they achieve good results on the Criteo dataset, whereas recommendation models usually converge within one epoch. How can running so many epochs still give good results? Sorry to bother you!

Many hyperlinks are no longer valid and return 404. Please update them.

MultiHeadAttention bug when num_heads is greater than 1

The head-splitting code is buggy: it reshapes without transposing. The buggy code affects AutoInt, DESTINE, and InterHAt.

query = query.view(batch_size * self.num_heads, -1, self.attention_dim)
key = key.view(batch_size * self.num_heads, -1, self.attention_dim)
value = value.view(batch_size * self.num_heads, -1, self.attention_dim)

The buggy code comes from this reference implementation: https://zhuanlan.zhihu.com/p/47812375

The correct code should be:

query = query.view(batch_size, seq_len, self.num_heads, self.attention_dim).transpose(1, 2)
key = key.view(batch_size, seq_len, self.num_heads, self.attention_dim).transpose(1, 2)
value = value.view(batch_size, seq_len, self.num_heads, self.attention_dim).transpose(1, 2)
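A quick standalone shape check (my own sketch, not the repository's code) showing why the transpose matters when num_heads > 1:

import torch

batch_size, seq_len, num_heads, attention_dim = 2, 5, 4, 8
x = torch.randn(batch_size, seq_len, num_heads * attention_dim)

# buggy split: the shape looks plausible, but sequence positions and heads get interleaved
buggy = x.view(batch_size * num_heads, -1, attention_dim)
print(buggy.shape)   # torch.Size([8, 5, 8])

# correct split: heads become an explicit dimension, the sequence stays intact
fixed = x.view(batch_size, seq_len, num_heads, attention_dim).transpose(1, 2)
print(fixed.shape)   # torch.Size([2, 4, 5, 8]) == (batch, heads, seq_len, head_dim)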

Maybe a bug found in feature_preprocess.py

I am testing the latest commit of this repo with the Criteo dataset. There is a self-defined preprocessor "convert_to_bucket". This preprocessor needs a column name as its argument. I got an error in this piece of code:


1.     def preprocess(self, ddf):
2.         logging.info("Preprocess feature columns...")
3.         all_cols = self.label_cols + self.feature_cols[::-1]
4.         print(all_cols)
5.         for col in all_cols:
6.             name = col["name"]
7.             #if name in ddf.columns:
8.             if name in  ddf.collect_schema().names():
9.                 print(name)
10.                 fill_na = "" if col["dtype"] in ["str", str] else 0
11.                 fill_na = col.get("fill_na", fill_na)
12.                 ddf = ddf.with_columns(pl.col(name).fill_null(fill_na))
13.             if col.get("preprocess"):
14.                 preprocess_args = re.split(r"\(|\)", col["preprocess"])
15.                 preprocess_fn = getattr(self, preprocess_args[0])
16.                 print(preprocess_args)
17.                 ddf = ddf.with_columns(
18.                     #preprocess_fn(*preprocess_args[1:-1])
19.                     preprocess_fn(name, *preprocess_args[1:-1])
20.                     .alias(name)
21.                     .cast(self.dtype_dict[name])
22.                 )
23.         active_cols = [col["name"] for col in all_cols if col.get("active") != False]
24.         ddf = ddf.select(active_cols)
25.         return ddf

The error happens at line 18: if no arguments are provided, ddf.with_columns(preprocess_fn().alias(name).cast(...)) throws an exception. At least the column name should be given to this function. I have tested my change (line 19) and it seems correct. So does this mean we should add the column name to the function arguments?

Training on the original Criteo dataset

Hey, I have a question about training DeepFM on the original Criteo dataset. Is this possible with the code provided in the repository? The dataset shipped with the demo (train_sample.csv, test_sample.csv, etc.) has 19 columns. Are these columns from the Criteo dataset? How can I use the original Criteo dataset, given that its data is numerical and the categorical columns are hashed? And how do I handle the missing labels in the test set?

Where is the paper for FINAL model?

Where is the paper for the FINAL model (FINAL: Factorized Interaction Layer for CTR Prediction)? I can't find it online. Based on the source code, the FINAL model uses multiplicative feature interactions, but I want to read the paper to gain more insight into the model. I would appreciate it if you could provide it.

Sequence feature in demo "DeepFM_with_sequence_feature.py".

Should the field "sequence" share embeddings with the field "adgroup_id"? I found that the method "encoder.fit()" assigns an encoder such as a tokenizer to each field. Since the given tiny dataset records the user's historical behavior (an ad sequence), my understanding is that an id appearing in the field "sequence" may also appear in the field "adgroup_id". As a result, it seems that the field "sequence" should share the same encoder (i.e., tokenizer) with the field "adgroup_id", but the demo "DeepFM_with_sequence_feature.py" gives these two fields separate encoders.
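If the intent is to share the vocabulary and embedding table, the feature_cols config appears to support a share_embedding option (an assumption on my part based on the sequence-feature example configs; please verify against your FuxiCTR version). A rough sketch of what that would look like:

# Sketch of a feature_cols definition; 'share_embedding' is assumed to let the
# sequence field reuse the tokenizer/embedding of 'adgroup_id' (verify this key).
feature_cols = [
    {"name": "adgroup_id", "active": True, "dtype": "str", "type": "categorical"},
    {"name": "sequence", "active": True, "dtype": "str", "type": "sequence",
     "splitter": "^", "max_len": 128,
     "share_embedding": "adgroup_id"},   # assumed option, see lead-in above
]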

Normalizer and NaN values.

For StandardScaler, it looks like NaN values are supported; see class Normalizer:

null_index = np.isnan(X)

However, during preprocessing, _fill_na() fills na_value for non-string columns.
So:

  • for dtype=str, the X values will be strings
  • for dtype=float/int, the missing X values will be na_value

In the first case, np.isnan throws an error because the elements of X are strings.
In the second case, there is no point in normalizing the numbers if na_value has already been filled in.

Is this behavior expected or not?
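A minimal numpy check (my own sketch) illustrating both cases:

import numpy as np

# dtype=str case: after _fill_na fills "", np.isnan fails on a string array
X_str = np.array(["", "1.5", "2.0"])
try:
    np.isnan(X_str)
except TypeError as e:
    print(e)   # ufunc 'isnan' not supported for the input types ...

# dtype=float/int case: na_value (e.g. 0) has already replaced the NaNs, so
# null_index is all False and the filled values are normalized as if they were real data
X_num = np.array([0.0, 1.5, 2.0])
print(np.isnan(X_num))   # [False False False]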

How to save model for tf serving?

I want to save a model such as DCN_tf for serving. If I add model.save("path/to/model") at the end of run_expid.py, the following error occurs:

cannot be saved either because the input shape is not available or because the forward pass of the model is not defined. To define a forward pass, please override Model.call(). To specify an input shape, either call build(input_shape) directly, or call the model on actual data using Model(), Model.fit(), or Model.predict(). If you have a custom training step, please make sure to invoke the forward pass in train step through Model.__call__, i.e. model(inputs), as opposed to model.call().

I added model(input) before model.save(), as in the following code, and it works. However, when I curl the service, I must also pass 'clk', which is already specified as the label.

for i in train_gen:
    model(i)
    break
model.save("./model/SavedModels/1")

Thank you!

feature embedding bug?

Hi,

I think I found a bug in the feature_embedding.py file at line 94:
self.embedding_layers[feature] = tf.keras.layers.Dense(feat_emb_dim, user_bias=False)

According to the TensorFlow docs (https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dense), there is no argument user_bias; the argument is called use_bias. I also get the following error message when I run DeepFM with numeric features: TypeError: ('Keyword argument not understood:', 'user_bias')

expid repeat

When I run "python run_expid.py --config /..../ --expid 191 --gpu 0", I get two result logs: ..._tune_191_ahdfy68.log and ..._tune_051_os191usd.log. The second log's file name also contains the string '191'.

Problem with BN of MLP_Block in Pytorch version

Hi, I found that MLP_Block does not work properly with batch_norm=True when the input tensor has three or more dimensions.

Here is a toy example.

a = torch.randn(16, 128, 64)
mlp_block = MLP_Block(input_dim=64, hidden_units=[1024,1024,1024],batch_norm=True)
mlp_block(a)

The output is:

RuntimeError: running_mean should contain 128 elements not 1024
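One workaround sketch (reusing the toy example above; this is standard PyTorch reshaping, not the repository's fix) is to flatten the leading dimensions so BatchNorm1d sees a 2D (N, C) input with C == input_dim:

# Workaround sketch: flatten leading dims before the MLP, then restore the shape.
a = torch.randn(16, 128, 64)
b, f, d = a.shape
out = mlp_block(a.reshape(b * f, d))   # BN now normalizes over the 64 input features
out = out.reshape(b, f, -1)            # back to (16, 128, 1024)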

Adding new model leads to AttributeError: module 'model_zoo' has no attribute 'DCNv2PositiveWeight'

I have downloaded the newest version of the FuxiCTR library and decided to integrate the work from https://github.com/SkylerLinn/Understanding-the-Ranking-Loss, specifically their DCNv2PositiveWeight and DCNv2ListCE.
I have created a folder under model_zoo like this for DCNv2PositiveWeight:

model_zoo
    DCNv2PositiveWeight
         config
              dataset_config
              model_config
         src
              __init__.py
              DCNv2PositiveWeight.py
         fuxictr_version.py
         run_expid.py
         README.md 

In the __init__.py under model_zoo I have added:
from .DCNv2PositiveWeight.src import DCNv2PositiveWeight

In the __init__.py under DCNv2PositiveWeight/src I have added:
from .DCNv2PositiveWeight import DCNv2PositiveWeight

And finally, the code for DCNv2PositiveWeight is the same as in the Understanding-the-Ranking-Loss repo:
import torch
from torch import nn
from fuxictr.pytorch.models import BaseModel
from fuxictr.pytorch.layers import FeatureEmbedding, MLP_Block, CrossNetV2, CrossNetMix
from fuxictr.metrics import evaluate_metrics
from sklearn.metrics import roc_auc_score, log_loss
import numpy as np

class DCNv2PositiveWeight(BaseModel):
    def __init__(self, 
                 feature_map, 
                 model_id="DCNv2PositiveWeight", 
                 gpu=-1,
                 model_structure="parallel",
                 use_low_rank_mixture=False,
                 low_rank=32,
                 num_experts=4,
                 learning_rate=1e-3, 
                 embedding_dim=10, 
                 stacked_dnn_hidden_units=[], 
                 parallel_dnn_hidden_units=[],
                 dnn_activations="ReLU",
                 num_cross_layers=3,
                 net_dropout=0, 
                 batch_norm=False, 
                 embedding_regularizer=None,
                 net_regularizer=None, 
                 pos_weight=1.,
                 **kwargs):
        super(DCNv2PositiveWeight, self).__init__(feature_map, 
                                    model_id=model_id, 
                                    gpu=gpu, 
                                    embedding_regularizer=embedding_regularizer, 
                                    net_regularizer=net_regularizer,
                                    **kwargs)
        self.embedding_layer = FeatureEmbedding(feature_map, embedding_dim)
        self.pos_weight = pos_weight
        
        input_dim = feature_map.sum_emb_out_dim()
        if use_low_rank_mixture:
            self.crossnet = CrossNetMix(input_dim, num_cross_layers, low_rank=low_rank, num_experts=num_experts)
        else:
            self.crossnet = CrossNetV2(input_dim, num_cross_layers)
        self.model_structure = model_structure
        assert self.model_structure in ["crossnet_only", "stacked", "parallel", "stacked_parallel"], \
               "model_structure={} not supported!".format(self.model_structure)
        if self.model_structure in ["stacked", "stacked_parallel"]:
            self.stacked_dnn = MLP_Block(input_dim=input_dim,
                                         output_dim=None, # output hidden layer
                                         hidden_units=stacked_dnn_hidden_units,
                                         hidden_activations=dnn_activations,
                                         output_activation=None, 
                                         dropout_rates=net_dropout,
                                         batch_norm=batch_norm)
            final_dim = stacked_dnn_hidden_units[-1]
        if self.model_structure in ["parallel", "stacked_parallel"]:
            self.parallel_dnn = MLP_Block(input_dim=input_dim,
                                          output_dim=None, # output hidden layer
                                          hidden_units=parallel_dnn_hidden_units,
                                          hidden_activations=dnn_activations,
                                          output_activation=None, 
                                          dropout_rates=net_dropout, 
                                          batch_norm=batch_norm)
            final_dim = input_dim + parallel_dnn_hidden_units[-1]
        if self.model_structure == "stacked_parallel":
            final_dim = stacked_dnn_hidden_units[-1] + parallel_dnn_hidden_units[-1]
        if self.model_structure == "crossnet_only": # only CrossNet
            final_dim = input_dim
        self.fc = nn.Linear(final_dim, 1)
        self.compile(kwargs["optimizer"], kwargs["loss"], learning_rate)
        self.reset_parameters()
        self.model_to_device()

    def forward(self, inputs):
        X = self.get_inputs(inputs)
        feature_emb = self.embedding_layer(X, flatten_emb=True)
        cross_out = self.crossnet(feature_emb)
        if self.model_structure == "crossnet_only":
            final_out = cross_out
        elif self.model_structure == "stacked":
            final_out = self.stacked_dnn(cross_out)
        elif self.model_structure == "parallel":
            dnn_out = self.parallel_dnn(feature_emb)
            final_out = torch.cat([cross_out, dnn_out], dim=-1)
        elif self.model_structure == "stacked_parallel":
            final_out = torch.cat([self.stacked_dnn(cross_out), self.parallel_dnn(feature_emb)], dim=-1)
        y_pred = self.fc(final_out)
        y_pred = self.output_activation(y_pred)
        return_dict = {"y_pred": y_pred}
        return return_dict

    def compute_loss(self, return_dict, y_true):
        weight = torch.where(y_true==1., self.pos_weight, 1)
        loss = self.loss_fn(return_dict["y_pred"], y_true, reduction='mean',weight=weight)
        loss += self.regularization_loss()
        return loss
    
    def evaluate_metrics(self, y_true, y_pred, metrics, group_id=None):
        print(metrics)
        ret_dict = dict()
        if 'wAUC' in metrics:
            sample_weight=np.where(y_true==1., self.pos_weight, 1.)
            ret_dict.update({'wAUC':roc_auc_score(y_true, y_pred, sample_weight=sample_weight, average='samples')})
        if 'wlogloss' in metrics:
            sample_weight=np.where(y_true==1., self.pos_weight, 1.)
            ret_dict.update({'wlogloss':log_loss(y_true=y_true, y_pred=y_pred,sample_weight=sample_weight)})
        tmp = [_ for _ in metrics if _ not in ['wAUC','wlogloss']]
        ret_dict.update(evaluate_metrics(y_true, y_pred, tmp, group_id))
        return ret_dict

I call a script that uses autotuner.enumerate_params and autotuner.grid_search. autotuner.grid_search calls the run_expid.py inside the experiment folder, which I modified to just print:

print('model_zoo options: ', dir(model_zoo))

However, the output doesn't include DCNv2PositiveWeight; instead, this is what I get:

...
Traceback (most recent call last):
  File "/Users/dsaranovic/Code/FieldEndToEndLoss/experiment/run_expid.py", line 68, in <module>
    model_class = getattr(model_zoo, params['model'])
AttributeError: module 'model_zoo' has no attribute 'DCNv2PositiveWeight'
model_zoo options:  ['AFM', 'AFN', 'AOANet', 'AutoInt', 'BST', 'CCPM', 'DCN', 'DCNv2', 'DESTINE', 'DIEN', 'DIN', 'DLRM', 'DMIN', 'DMR', 'DNN', 'DSSM', 'DeepCrossing', 'DeepFM', 'DeepIM', 'EDCN', 'ETA', 'FFM', 'FFMv2', 'FGCNN', 'FLEN', 'FM', 'FiBiNET', 'FiGNN', 'FinalMLP', 'FinalNet', 'FmFM', 'FwFM', 'HFM', 'HOFM', 'InterHAt', 'LR', 'LorentzFM', 'MMoE', 'MaskNet', 'NFM', 'ONN', 'ONNv2', 'PEPNet', 'PNN', 'PPNet', 'SAM', 'SDIM', 'SharedBottom', 'WideDeep', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', 'multitask', 'xDeepFM']
....

How can we add additional models to model_zoo so that the module sees them and we can run them on full data with the autotuner (enumerate_params, grid_search) and run_expid.py?

I feel like I am missing something obvious?
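One thing worth checking (an assumption on my part: run_expid.py may be importing a different model_zoo package, for example an installed or differently-pathed copy, rather than the folder you edited):

# Diagnostic sketch: confirm which model_zoo package run_expid.py actually imports.
import model_zoo
print(model_zoo.__file__)                         # should point at the model_zoo/ folder you edited
print(hasattr(model_zoo, "DCNv2PositiveWeight"))  # True once the right package is on sys.path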

HFM may not be available due to compatibility issues

HFM adopts HolographicInteractionLayer in fuxictr/pytorch/layers/interaction.py.

However, HolographicInteractionLayer may not be available in PyTorch 1.10 because torch.rfft/torch.irfft have changed.

Here is a solution for reference:

import torch

try:
    from torch import irfft
    from torch import rfft
except ImportError:
    from torch.fft import irfft2
    from torch.fft import rfft2

    def rfft(x, d):
        t = rfft2(x, dim=(-d))
        return torch.stack((t.real, t.imag), -1)

    def irfft(x, d, signal_sizes):
        return irfft2(torch.complex(x[:, :, 0], x[:, :, 1]), s=signal_sizes, dim=(-d))

When reproducing TransAct, the validation data all become NaN after the embedding layer

The problem is located in this code:

Line 157        X = self.get_inputs(inputs)
Line 158        feature_emb_dict = self.embedding_layer(X)

There is no problem at all during training, but during evaluation the validation data all become NaN after passing through this embedding_layer. I checked and all the weights of the embedding_layer are NaN. Where could the problem be? (The validation data is loaded correctly; X is normal for the validation set.)

2024-07-21 15:36:35,644 P7079 INFO Evaluation @epoch 1 - batch 1:
Warning: NaN value detected in the weights of embedding_layers.userid.weight in the embedding layer
Warning: NaN value detected in the weights of embedding_layers.adgroup_id.weight in the embedding layer
Warning: NaN value detected in the weights of embedding_layers.pid.weight in the embedding layer
Warning: NaN value detected in the weights of embedding_layers.cate_id.weight in the embedding layer
Warning: NaN value detected in the weights of embedding_layers.campaign_id.weight in the embedding layer
Warning: NaN value detected in the weights of embedding_layers.customer.weight in the embedding layer
Warning: NaN value detected in the weights of embedding_layers.brand.weight in the embedding layer
Warning: NaN value detected in the weights of embedding_layers.cms_segid.weight in the embedding layer
Warning: NaN value detected in the weights of embedding_layers.cms_group_id.weight in the embedding layer
Warning: NaN value detected in the weights of embedding_layers.final_gender_code.weight in the embedding layer
Warning: NaN value detected in the weights of embedding_layers.age_level.weight in the embedding layer
Warning: NaN value detected in the weights of embedding_layers.pvalue_level.weight in the embedding layer
Warning: NaN value detected in the weights of embedding_layers.shopping_level.weight in the embedding layer
Warning: NaN value detected in the weights of embedding_layers.occupation.weight in the embedding layer
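For what it's worth, here is a generic PyTorch diagnostic (my own sketch, assuming the NaNs first appear during training, e.g. from an exploding loss or a too-high learning rate) that can be called right after each optimizer step to find where the NaNs start:

import torch

def check_nan(model: torch.nn.Module, loss: torch.Tensor, step: int) -> None:
    """Report the first training step at which the loss or any parameter becomes NaN."""
    if torch.isnan(loss).any():
        print(f"step {step}: NaN loss")
    for name, param in model.named_parameters():
        if torch.isnan(param).any():
            print(f"step {step}: NaN in parameter {name}")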

About a simple new model I wrote

I designed a very simple model (similar to MaskNet) that runs very fast. Since I have graduated and no longer have GPU resources, it has hardly been tuned, yet its accuracy is second only to FinalMLP. Would you be interested in taking it over?

ACC metric can't work

y_pred is a one-dimensional tensor, so it has no axis=1.
In base_model.py, y_pred has already been flattened to one dimension.

[Suggestion] Update the logic of preprocessing for efficiency

I suggest updating the preprocessing logic for efficiency.

In many cases, a user's behavior sequence is the same for all of that user's training samples. Likewise, the features of a user or an item are often the same across all training samples.

However, the current version of FuxiCTR receives the training dataset as a single DataFrame, so these features (e.g., a user's behavior sequence, or the features of a user or an item) are stored redundantly in that DataFrame, which consumes too much memory (especially for large-scale datasets). Also, fit/transform in feature_preprocessor is performed on these redundant behavior sequences and features, which takes too long (especially for large-scale datasets).

So, to operate more efficiently, I hope these redundancies can be removed. To this end, I suggest changing the preprocessing logic to also receive a user_df and an item_df for each dataset and to fit/transform only the unique features (i.e., those in user_df and item_df), as sketched below.
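A rough sketch of the suggested flow (user_df, item_df, interaction_df, and the join keys are hypothetical names, not existing FuxiCTR APIs): fit/transform would run on the small per-user and per-item tables, which are then joined onto the interaction log only when a training frame is needed.

import pandas as pd

def build_training_frame(interaction_df: pd.DataFrame,
                         user_df: pd.DataFrame,
                         item_df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical sketch: user/item features live in their own deduplicated
    tables (one row per user/item) and are only joined at batch-building time."""
    return (interaction_df
            .merge(user_df, on="user_id", how="left")
            .merge(item_df, on="item_id", how="left"))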

Asking for help: unable to use torch.parallel

I recently implemented my own model on top of FuxiCTR. Since GPU memory is tight, I want to use the torch.parallel module to train on two GPUs, but no matter how I configure it, the model always runs on a single GPU. How can I solve this?
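For reference, a generic PyTorch sketch of data parallelism (my own addition, not a FuxiCTR feature; it assumes the model is a plain nn.Module):

import torch
import torch.nn as nn

# Generic sketch: DataParallel splits each batch across the visible GPUs.
model = nn.Linear(10, 1)                     # stand-in for the FuxiCTR model
if torch.cuda.device_count() >= 2:
    model = nn.DataParallel(model, device_ids=[0, 1])
model = model.to("cuda:0" if torch.cuda.is_available() else "cpu")

Note that nn.DataParallel replicates the full model on every GPU, so it mainly splits the batch; if the goal is to relieve per-GPU memory, a smaller batch size or gradient accumulation may help more.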

Call for model implementations

With streaming data loading, evaluation is not performed at the end of each epoch.

len(data_generator) != the actual number of batches in npz_block_dataloader

2024-04-14 11:01:33,212 P42965 INFO Evaluation @epoch 2 - batch 1060:
2024-04-14 16:15:11,489 P42965 INFO Evaluation @epoch 3 - batch 2120:
2024-04-14 21:28:40,049 P42965 INFO Evaluation @epoch 4 - batch 3180:
2024-04-15 02:43:36,228 P42965 INFO Evaluation @epoch 5 - batch 4240:

torch.jit.trace not working

torch.jit.trace(fibinet, batch_data)
File "/anaconda3/envs/fuxictr3.8/lib/python3.8/site-packages/torch/jit/_trace.py", line 794, in trace
return trace_module(
File "/anaconda3/envs/fuxictr3.8/lib/python3.8/site-packages/torch/jit/_trace.py", line 1056, in trace_module
module._c._create_method_from_trace(
File "/anaconda3/envs/fuxictr3.8/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/anaconda3/envs/fuxictr3.8/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1488, in _slow_forward
result = self.forward(*input, **kwargs)
TypeError: forward() takes 2 positional arguments but 3 were given

Emb_LayerNorm bug in MaskNet

The paper uses concat(LN(e1), LN(e2), ..., LN(ef)), but the code uses nn.LayerNorm([feature_map.num_fields, embedding_dim]). This makes normalization happen over the last two dimensions.
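A small sketch (my own, standard PyTorch) of the per-field normalization the paper describes:

import torch
import torch.nn as nn

# With a (batch, num_fields, embedding_dim) tensor, nn.LayerNorm(embedding_dim)
# normalizes each field embedding independently over the last dimension only,
# which matches concat(LN(e_1), ..., LN(e_f)); nn.LayerNorm([num_fields, embedding_dim])
# instead normalizes jointly over the last two dimensions.
emb = torch.randn(8, 10, 16)     # (B, F, D)
per_field_ln = nn.LayerNorm(16)
out = per_field_ln(emb)          # equivalent to concatenating LN(e_i) field by field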

FNN model problem

We can't find a function called self.reduce_learning_rate() in the base model. Has it been renamed to self.lr_decay()?

Question about the generated data format

After running example1_build_dataset_to_npz.py, the generated train/valid/test files have no .npz suffix, so loading them afterwards fails with errors such as train.npz not existing.
After forcibly adding the .npz suffix, loading raises ValueError: Cannot load file containing pickled data when allow_pickle=False.
Thanks for your help!

sequence feature

When I try to use sequence features, the program throws an array out-of-bounds error. Could you please give an example of using sequence features?

Failed to run FINAL

Traceback (most recent call last):
File "run_expid.py", line 69, in
model.fit(train_gen, validation_data=valid_gen, **params)
File "/root/miniconda3/lib/python3.8/site-packages/fuxictr/pytorch/models/ctr_model.py", line 154, in fit
self.train_epoch(data_generator)
File "/root/miniconda3/lib/python3.8/site-packages/fuxictr/pytorch/models/ctr_model.py", line 210, in train_epoch
loss = self.train_step(batch_data)
File "/root/miniconda3/lib/python3.8/site-packages/fuxictr/pytorch/models/ctr_model.py", line 193, in train_step
loss = self.get_total_loss(batch_data)
File "/root/miniconda3/lib/python3.8/site-packages/fuxictr/pytorch/models/ctr_model.py", line 90, in get_total_loss
total_loss = self.add_loss(inputs) + self.add_regularization()
File "/root/FuxiCTR/model_zoo/FINAL/model/FINAL.py", line 107, in add_loss
return_dict = self.forward(inputs)
File "/root/FuxiCTR/model_zoo/FINAL/model/FINAL.py", line 87, in forward
y1 = self.forward1(feature_emb)
File "/root/FuxiCTR/model_zoo/FINAL/model/FINAL.py", line 96, in forward1
X = self.field_gate(X)
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/root/FuxiCTR/model_zoo/FINAL/model/FINAL.py", line 132, in forward
if gate_residual == "concat":
NameError: name 'gate_residual' is not defined

in file FINAL.py, line 130:

    def forward(self, feature_emb):
        gates = self.linear(feature_emb.transpose(1, 2)).transpose(1, 2)
        if gate_residual == "concat":
            out = torch.cat([feature_emb, feature_emb * gates], dim=1) # b x 2f x d
        else:
            out = feature_emb + feature_emb * gates
        return out

The line if gate_residual == "concat": should be if self.gate_residual == "concat":
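For clarity, the corrected field-gate forward (the same code as above, with only the attribute access fixed):

    def forward(self, feature_emb):
        gates = self.linear(feature_emb.transpose(1, 2)).transpose(1, 2)
        if self.gate_residual == "concat":
            out = torch.cat([feature_emb, feature_emb * gates], dim=1)  # b x 2f x d
        else:
            out = feature_emb + feature_emb * gates
        return out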
