Add C-EASE and ADD-EASE #696

Open
wants to merge 5 commits into
base: 0.2.x

@deklanw deklanw commented Jan 21, 2021

Both models are from "Closed-Form Models for Collaborative Filtering with Side-Information" by Olivier Jeunen et al.: https://dl.acm.org/doi/abs/10.1145/3383313.3418480

These are variations of EASE which incorporate item token and token_seq features. They give small accuracy improvements over EASE (indeed, EASE is the special case of ADD-EASE with item_feat_proportion = 0.0). The paper claims large improvements for cold-start items, but unfortunately there is currently no way to check that in RecBole (see #671).

The model in the paper allows feature-specific importance weighting, but the authors didn't test that variant (probably because there is no obvious general way to assign importance). So, for C-EASE, I just implemented a single parameter controlling the importance of item features (as tested in the paper).
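
For readers following along, here is a minimal sketch of the blending described above, not the code in this PR. It assumes dense NumPy arrays and the standard EASE closed form; the function names and `inter_reg_weight` are hypothetical, while `item_reg_weight` and `item_feat_proportion` are the names that appear in this thread. The tag matrix is assumed to be laid out as tags x items.

```python
import numpy as np


def ease_weights(X, reg_weight):
    """Closed-form EASE item-item weights for a (rows x items) matrix X."""
    X = np.asarray(X, dtype=np.float64)
    G = X.T @ X                                # item-item Gram matrix
    G[np.diag_indices_from(G)] += reg_weight   # L2 regularisation on the diagonal
    P = np.linalg.inv(G)
    B = P / (-np.diag(P))                      # B[i, j] = -P[i, j] / P[j, j]
    np.fill_diagonal(B, 0.0)                   # an item may not predict itself
    return B


def add_ease_similarity(inter_matrix, tag_item_matrix,
                        inter_reg_weight, item_reg_weight,
                        item_feat_proportion):
    """Blend an interaction-based and a feature-based EASE model (assumed layout)."""
    inter_S = ease_weights(inter_matrix, inter_reg_weight)    # users x items
    item_S = ease_weights(tag_item_matrix, item_reg_weight)   # tags x items
    return (1 - item_feat_proportion) * inter_S + item_feat_proportion * item_S
```

With `item_feat_proportion = 0.0` the feature term vanishes and the result is plain EASE on the interactions. Both inputs must have the same number of item columns for the blend to be defined.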

@yarncraft

@deklanw, I ran into the following issue:

File "/home/yarncraft/.local/lib/python3.8/site-packages/recbole/model/general_recommender/addease.py", line 104, in __init__
    self.item_similarity = (1-item_feat_proportion) * inter_S + item_feat_proportion * item_S
ValueError: operands could not be broadcast together with shapes (2149,2149) (2786,2786) 

I checked this manually, and it seems your code does not account for the .item file containing more items than are actually used in the .inter file. I think this could be fixed by first filtering the item feature matrix down to the items that actually appear in the interactions?

Anyways, thanks for the nice work in the other PRs! 😄
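
The filtering idea suggested above could look roughly like the following sketch. It is not RecBole code: `align_item_features`, `feature_item_ids` and `inter_item_ids` are hypothetical names, and it assumes the feature matrix is laid out as tags x items with one column per entry of `feature_item_ids`.

```python
import numpy as np


def align_item_features(tag_item_matrix, feature_item_ids, inter_item_ids):
    """Keep only the columns for items that appear in the interactions.

    Columns are returned in the order of inter_item_ids, so the result has
    the same item axis as the interaction matrix. Raises KeyError if an
    interaction item has no entry in feature_item_ids.
    """
    column_of = {item: j for j, item in enumerate(feature_item_ids)}
    columns = [column_of[item] for item in inter_item_ids]
    return np.asarray(tag_item_matrix)[:, columns]


# Toy data: 5 items have features, but only 3 of them occur in interactions.
tags = np.arange(10).reshape(2, 5)
aligned = align_item_features(tags, ["a", "b", "c", "d", "e"], ["c", "a", "e"])
print(aligned.shape)  # (2, 3)
```

Selecting by item ID rather than by position avoids relying on the two files listing items in the same order.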

item_reg_weight = config['item_reg_weight']
selected_features = config['selected_features']

tag_item_matrix = encode_categorical_item_features(

To solve the above issue, you can clip the tag_item_matrix by num_items:
tag_item_matrix = tag_item_matrix[:, :self.num_items]

Or just filter the items with a config parameter:
item_inter_num_interval: [1,inf)

3 participants