Comments (7)

shenweichen commented on May 24, 2024

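@githubbayes Thank you for your interest!
In the current version (v0.2.0) of deepctr, only DIN supports multi-value feature input. The other models do not support it yet, to keep usage simple; support will be added gradually in later releases.
Below is an example of using the DIN model with multi-value feature input: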

import numpy as np
from deepctr.models import DIN


def get_xy_fd():
    # Original features: feature name -> number of distinct values
    feature_dim_dict = {"sparse": {'user_age': 4, 'user_gender': 2,
                                   'item_id': 4, 'item_gender': 2}, "dense": []}
    # Features that have a behavior history
    behavior_feature_list = ["item_id", "item_gender"]

    # Single-value features
    user_age = np.array([1, 2, 3])
    user_gender = np.array([0, 1, 0])
    item_id = np.array([0, 1, 2])
    item_gender = np.array([0, 1, 0])

    # Multi-value features
    hist_item_id = np.array([[0, 1, 2, 3], [0, 1, 2, 3], [0, 1, 2, 0]])
    hist_item_gender = np.array([[0, 1, 0, 1], [0, 1, 1, 1], [0, 0, 1, 0]])
    hist_length = np.array([4, 4, 3])  # history sequence length of each sample

    feature_dict = {'user_age': user_age, 'user_gender': user_gender,
                    'item_id': item_id, 'item_gender': item_gender,
                    'hist_item_id': hist_item_id, 'hist_item_gender': hist_item_gender}

    # Note the concatenation order: single-value features, then multi-value
    # features, then the multi-value feature lengths.
    # In DIN all history features share the same sequence length, because they
    # are all derived from item_id, so a single length vector is enough.
    x = ([feature_dict[feat] for feat in feature_dim_dict["sparse"]]
         + [feature_dict['hist_' + feat] for feat in behavior_feature_list]
         + [hist_length])
    y = np.array([1, 0, 1])  # labels as an array rather than a plain list

    return x, y, feature_dim_dict, behavior_feature_list


x, y, feature_dim_dict, behavior_feature_list = get_xy_fd()
model = DIN(feature_dim_dict, behavior_feature_list, hist_len_max=4)
model.compile('adam', 'binary_crossentropy',
              metrics=['binary_crossentropy'])
history = model.fit(x, y, verbose=1, validation_split=0.5)

For the details of the DIN model's parameters, see the relevant section of the documentation.
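The example above hard-codes `hist_item_id` as an already padded array together with `hist_length`. In practice, histories usually arrive as lists of different lengths. The following is a minimal NumPy-only sketch of how such arrays could be produced; `pad_histories` and `raw_hist_item_id` are hypothetical names used for illustration and are not part of deepctr.

```python
import numpy as np


def pad_histories(histories, max_len, pad_value=0):
    """Right-pad variable-length histories to max_len.

    Returns (padded, lengths); histories longer than max_len are truncated.
    """
    padded = np.full((len(histories), max_len), pad_value, dtype=np.int64)
    lengths = np.zeros(len(histories), dtype=np.int64)
    for row, seq in enumerate(histories):
        seq = list(seq)[:max_len]
        padded[row, :len(seq)] = seq
        lengths[row] = len(seq)
    return padded, lengths


# Hypothetical raw data: the third user has only three history items.
raw_hist_item_id = [[0, 1, 2, 3], [0, 1, 2, 3], [0, 1, 2]]
hist_item_id, hist_length = pad_histories(raw_hist_item_id, max_len=4)
```

With this input the result matches the arrays in the example: the third row is padded with a trailing 0 and its length is 3. Because the padding value 0 is also a valid `item_id` in that example, the length vector is presumably what lets the model tell padding apart from a real item.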

from deepctr.

githubbayes commented on May 24, 2024

Thanks shenweichen


shenweichen commented on May 24, 2024

@githubbayes Thank you for your interest!
The latest version has added support for multivalent categorical feature input.


thulorry commented on May 24, 2024

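@shenweichen Hello, a question: can sequence_feature only take a single VarLenFeat object? I ran into a problem in practice. One field holds the user's favorite ad IDs (ad1|ad2|ad3) and another holds the user's favorite product IDs (product1|product2|product3), and the two embeddings definitely need to be kept separate.
sequence_features = {'activity_features': activity_dict, 'product_features': product_dict}
sequence_feat_list = [VarLenFeat(feat, len(value) + 1, len(load_data[feat][0]), 'mean')
                      for feat, value in sequence_features.items()]
But after passing the input this way, together with dense_input and sparse_input, it exceeded the limits of the DeepFM model.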
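For readers with the same kind of data, here is a rough sketch of how two pipe-delimited fields could each be turned into a padded integer matrix with its own vocabulary, so that each field can later get its own embedding. The helper `encode_multivalue` and the sample columns are hypothetical. Reserving index 0 for padding is an assumption, chosen to be consistent with the `len(value) + 1` dimension in the snippet above.

```python
import numpy as np


def encode_multivalue(column, sep="|", max_len=None):
    """Encode delimited strings as padded integer ids.

    Returns (padded, vocab). Index 0 is reserved for padding (an assumption),
    so real tokens start at 1 in order of first appearance.
    """
    vocab = {}
    encoded = []
    for cell in column:
        ids = []
        for token in (t for t in cell.split(sep) if t):
            if token not in vocab:
                vocab[token] = len(vocab) + 1
            ids.append(vocab[token])
        encoded.append(ids)
    if max_len is None:
        max_len = max((len(ids) for ids in encoded), default=0)
    padded = np.zeros((len(encoded), max_len), dtype=np.int64)
    for row, ids in enumerate(encoded):
        ids = ids[:max_len]  # truncate rows longer than max_len
        padded[row, :len(ids)] = ids
    return padded, vocab


# Hypothetical sample columns; each field gets an independent vocabulary.
ads = ["ad1|ad2|ad3", "ad2", "ad3|ad1"]
products = ["product1|product2", "product3", "product2|product3|product1"]
ad_seq, ad_vocab = encode_multivalue(ads)
product_seq, product_vocab = encode_multivalue(products)
```

Each resulting matrix would then be described by its own variable-length feature definition, with a vocabulary size of `len(vocab) + 1` and a maximum length of `padded.shape[1]`.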


thulorry commented on May 24, 2024

Problem solved. The cause was that I had not passed the dense list when constructing the model; after adding it, everything works.

4. Define Model, compile and train

model = DeepFM({"sparse": sparse_feat_list,
                "dense": dense_feat_list,
                "sequence": sequence_feat_list}, final_activation='linear')
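As background on the `'mean'` argument passed to `VarLenFeat` earlier in this thread: conceptually, a mean combiner averages the embeddings of the real (non-padding) positions of each sequence. The NumPy sketch below illustrates that idea only. It is not DeepCTR's implementation, and `masked_mean` is a hypothetical helper.

```python
import numpy as np


def masked_mean(embeddings, lengths):
    """Average embeddings over the first `lengths[i]` positions of each row.

    embeddings: float array of shape (batch, max_len, dim)
    lengths:    int array of shape (batch,)
    """
    max_len = embeddings.shape[1]
    # mask[i, j] is 1.0 where position j is a real element of sequence i
    mask = (np.arange(max_len)[None, :] < lengths[:, None]).astype(embeddings.dtype)
    summed = (embeddings * mask[:, :, None]).sum(axis=1)
    counts = np.maximum(lengths, 1)[:, None]  # guard against empty sequences
    return summed / counts


emb = np.array([[[1.0, 1.0], [3.0, 3.0], [100.0, 100.0]],
                [[2.0, 4.0], [0.0, 0.0], [0.0, 0.0]]])
lengths = np.array([2, 1])
pooled = masked_mean(emb, lengths)
```

In the first row the third position is padding, so its large values do not affect the average.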


zksar commented on May 24, 2024

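Hello, I have multi-value features and would like to use NFFM, but NFFM does not support multivalent input yet. How should I proceed? Do you have any suggestions? Thank you very much! NFFM seems quite powerful.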


shenweichen commented on May 24, 2024

@zksar Please refer to this example: https://deepctr-doc.readthedocs.io/en/latest/Examples.html#multi-value-input-movielens

