Comments (7)
@githubbayes Thanks for your interest!
In the current version (v0.2.0) of deepctr, only DIN supports multi-value feature input. The other models do not support it yet, to keep usage simple; support will be added gradually in later releases.
Here is an example of handling multi-value feature input with the DIN model:
import numpy as np
from deepctr.models import DIN

def get_xy_fd():
    # Original features: name -> vocabulary size
    feature_dim_dict = {"sparse": {'user_age': 4, 'user_gender': 2,
                                   'item_id': 4, 'item_gender': 2}, "dense": []}
    # Features that also have a behavior history
    behavior_feature_list = ["item_id", "item_gender"]
    # Single-value features
    user_age = np.array([1, 2, 3])
    user_gender = np.array([0, 1, 0])
    item_id = np.array([0, 1, 2])
    item_gender = np.array([0, 1, 0])
    # Multi-value features, padded to hist_len_max
    hist_item_id = np.array([[0, 1, 2, 3], [0, 1, 2, 3], [0, 1, 2, 0]])
    hist_item_gender = np.array([[0, 1, 0, 1], [0, 1, 1, 1], [0, 0, 1, 0]])
    # Actual history length of each sample
    hist_length = np.array([4, 4, 3])
    feature_dict = {'user_age': user_age, 'user_gender': user_gender,
                    'item_id': item_id, 'item_gender': item_gender,
                    'hist_item_id': hist_item_id, 'hist_item_gender': hist_item_gender}
    # Mind the concatenation order: single-value features, then multi-value
    # features, then the multi-value feature lengths.
    # In DIN every behavior feature is derived from item_id, so all histories
    # share the same length and a single length vector is enough.
    x = ([feature_dict[feat] for feat in feature_dim_dict["sparse"]]
         + [feature_dict['hist_' + feat] for feat in behavior_feature_list]
         + [hist_length])
    y = np.array([1, 0, 1])
    return x, y, feature_dim_dict, behavior_feature_list

x, y, feature_dim_dict, behavior_feature_list = get_xy_fd()
model = DIN(feature_dim_dict, behavior_feature_list, hist_len_max=4)
model.compile('adam', 'binary_crossentropy',
              metrics=['binary_crossentropy'])
history = model.fit(x, y, verbose=1, validation_split=0.5)
For the DIN model's parameters, see the relevant section of the documentation.
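The example above hard-codes histories that are already padded to hist_len_max. As a rough sketch of how such arrays might be produced from raw, variable-length histories, the snippet below uses only NumPy. The helper name pad_histories, the padding value 0, and the choice to keep the first max_len items when truncating are assumptions made for illustration; they are not part of deepctr.

```python
import numpy as np

def pad_histories(histories, max_len, pad_value=0):
    """Pad or truncate variable-length id lists to max_len.

    Hypothetical helper (not a deepctr function). Returns the padded
    2-D array and a vector with the number of real items kept per row.
    """
    padded = np.full((len(histories), max_len), pad_value, dtype=np.int64)
    lengths = np.zeros(len(histories), dtype=np.int64)
    for row, hist in enumerate(histories):
        kept = hist[:max_len]  # assumption: keep the first max_len items
        padded[row, :len(kept)] = kept
        lengths[row] = len(kept)
    return padded, lengths

# Raw histories matching the example: the third user has only three items.
raw_hist_item_id = [[0, 1, 2, 3], [0, 1, 2, 3], [0, 1, 2]]
hist_item_id, hist_length = pad_histories(raw_hist_item_id, max_len=4)
print(hist_item_id)  # rows padded with 0 up to length 4
print(hist_length)   # real lengths: 4, 4, 3
```

Because the padding value 0 is also a valid item id in this toy data, the length vector is what tells the model which positions are real.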
from deepctr.
Thanks shenweichen
@githubbayes Thanks for your interest!
The latest version adds support for multivalent categorical feature input.
- For AFM, AutoInt, DCN, DeepFM, FNN, NFM, PNN and xDeepFM, see https://deepctr-doc.readthedocs.io/en/latest/Examples.html#multi-value-input-movielens
- For DIN, see https://github.com/shenweichen/DeepCTR/blob/master/examples/run_din.py
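@shenweichen Hi, is it true that sequence_feature can only take a single VarLenFeat object? I ran into a problem in practice: one field holds the user's favorite ad ids (ad1|ad2|ad3) and another holds the user's favorite product ids (product1|product2|product3). The two embeddings clearly have to be kept separate, so I built the list like this:
sequence_features = {'activity_features': activity_dict, 'product_features': product_dict}
sequence_feat_list = [VarLenFeat(feat, len(value) + 1, len(load_data[feat][0]), 'mean')
                      for feat, value in sequence_features.items()]
But with this input, once dense_input and sparse_input are added, it exceeds what the DeepFM model accepts.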
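For readers with similar data, here is a minimal sketch of how two pipe-separated fields like the ones described above might be turned into padded integer arrays, each with its own vocabulary. It uses only NumPy. The function encode_multi_value, the sample values, and the convention of reserving id 0 for padding are illustrative assumptions, not deepctr APIs.

```python
import numpy as np

def encode_multi_value(column, sep="|"):
    """Encode separated strings as padded integer ids.

    Hypothetical helper (not a deepctr function). Id 0 is reserved for
    padding, so real tokens are numbered from 1 in order of appearance.
    """
    token_lists = [value.split(sep) if value else [] for value in column]
    vocab = {}
    for tokens in token_lists:
        for token in tokens:
            vocab.setdefault(token, len(vocab) + 1)
    max_len = max((len(tokens) for tokens in token_lists), default=0)
    encoded = np.zeros((len(column), max_len), dtype=np.int64)
    for row, tokens in enumerate(token_lists):
        encoded[row, :len(tokens)] = [vocab[token] for token in tokens]
    return encoded, vocab, max_len

# Made-up sample data in the format described in the question.
favorite_ads = ["ad1|ad2|ad3", "ad2", "ad3|ad1"]
favorite_products = ["product1|product2", "product3|product1|product2", "product2"]

# Each field gets its own vocabulary, so the embeddings stay separate.
ad_ids, ad_vocab, ad_max_len = encode_multi_value(favorite_ads)
product_ids, product_vocab, product_max_len = encode_multi_value(favorite_products)

print(ad_ids)
print(len(ad_vocab) + 1, ad_max_len)  # vocabulary size incl. padding, max length
```

Here len(ad_vocab) + 1 and ad_max_len play the same roles as len(value) + 1 and len(load_data[feat][0]) in the VarLenFeat call quoted in the question.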
Problem solved: I had left out dense_list when defining the model. After adding it, everything works.
4. Define Model, compile and train
model = DeepFM({"sparse": sparse_feat_list,
                "dense": dense_feat_list,
                "sequence": sequence_feat_list}, final_activation='linear')
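Hello, I have multi-value features and would like to use NFFM, but NFFM does not support multivalent input yet. How should I handle this? Do you have any suggestions? Thank you very much! NFFM seems quite powerful.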
@zksar Please refer to this example: https://deepctr-doc.readthedocs.io/en/latest/Examples.html#multi-value-input-movielens