wyze-rule-recommendation

This repository showcases the second-place solution for the HuggingFace challenge hosted by Wyze Labs. For detailed information about the challenge, please refer to the original link.

The challenge focuses on creating a recommendation system for smart home automation, specifically targeting the suggestion of rules to users. Each user possesses a diverse collection of devices spanning 16 different types, including Cameras, Motion Sensors, Thermostats, and more. Users can establish rules, each identified by a trigger device, a trigger state, an action device, and a corresponding action. The task at hand involves proposing new rules for users.

The dataset provided by Wyze Labs consists of:

  • A training set containing actual rules defined by users.
  • Two test set splits (public and private), each containing actual rules with a leave-one-out design to assess the implemented system.

Each split includes a list of rules and a list of devices, along with their associations with users.

The devised solution formulates the problem as a link prediction task and leverages a Graph Neural Network (GNN) implemented in PyTorch-Geometric. The model is trained using positive and negative sampling from the training set provided by Wyze Labs. Significantly, this solution builds upon the foundational approach outlined in FedRule. However, it enhances the overall performance by incorporating batched training and making distinct architectural choices. Further insights into these improvements can be found in the Approach section.

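The evaluation metric for the competition is the mean reciprocal rank (MRR), defined as

$$\mathrm{MRR} = \frac{1}{N} \sum_{i=1}^{N} \frac{1}{\mathrm{rank}_i}$$

where N is the number of test cases and rank_i is the position of the held-out rule in the ranked recommendations for test case i.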
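
As a minimal sketch of how MRR can be computed, the function below averages the reciprocal of the 1-based position of each held-out rule in its recommendation list. The function name, the string rule identifiers, and the convention that a missing rule contributes 0 are assumptions made for illustration; the competition's own scoring code may differ.

```python
def mean_reciprocal_rank(ranked_lists, targets):
    """Average of 1/rank of each target in its ranked list.

    Assumption: a target that does not appear in the list contributes 0.
    """
    total = 0.0
    for ranked, target in zip(ranked_lists, targets):
        if target in ranked:
            total += 1.0 / (ranked.index(target) + 1)  # ranks are 1-based
    return total / len(targets)


# Three hypothetical test cases; rules are plain strings for brevity.
recommendations = [["r1", "r2", "r3"], ["r4", "r5"], ["r6", "r7"]]
held_out = ["r1", "r5", "r9"]
mrr = mean_reciprocal_rank(recommendations, held_out)
```

Here the held-out rules are found at ranks 1 and 2 and not at all, so the reciprocal ranks being averaged are 1, 1/2, and 0.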

In the competition, the presented approach attained an MRR close to 0.45, whereas the baseline only reached approximately 0.25.

How to install

Install the dependencies in a virtual environment:

pip install -r requirements.txt

In order to reproduce the training and testing experiments, you should download the dataset. Both training and testing scripts do this for you, but you need to:

  • Accept the Terms and Conditions for this dataset here by creating an HF account;
  • Log in to your HF account using huggingface-cli, following this guide.

Training

To train a new model, execute the train.py script. You can customize the training configuration by modifying the settings in the cfg/training.yaml file.

The model used in the competition is provided in models/best_model.ckpt.

Test

To test a model, execute the test.py script:

python test.py --mode=private --model_path="./models/best_model.ckpt"

Approach

This section discusses the details of the approach used.

Problem modelling

The problem has been framed as a link prediction task, wherein a graph is constructed for each user. The graph structure is defined as follows:

  • Each device serves as a node, with each node characterized by a node feature representing the device model (e.g., Camera, Cloud) using one-hot encoding.
  • Every rule is translated into a directed edge connecting the trigger device to the action device. The edges may vary in type based on the trigger state and action state, with the dataset containing 45 trigger states and 47 actions.
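
The graph structure above can be sketched in plain Python as follows. This is only an illustration, not the repository's code: the `Rule` dataclass, the `build_user_graph` helper, the device identifiers, and the four-entry `DEVICE_TYPES` list (the dataset has 16 types) are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical subset of device types; the real dataset has 16.
DEVICE_TYPES = ["Camera", "MotionSensor", "Thermostat", "Cloud"]


@dataclass(frozen=True)
class Rule:
    trigger_device: str
    trigger_state: int  # index among the trigger states
    action_device: str
    action: int         # index among the actions


def one_hot(device_type):
    """One-hot node feature for a device type."""
    return [1 if t == device_type else 0 for t in DEVICE_TYPES]


def build_user_graph(devices, rules):
    """devices: dict device_id -> device type; rules: list of Rule."""
    node_ids = sorted(devices)
    index = {d: i for i, d in enumerate(node_ids)}
    node_features = [one_hot(devices[d]) for d in node_ids]
    # Directed edge: trigger device -> action device.
    edges = [(index[r.trigger_device], index[r.action_device]) for r in rules]
    # Edge type: (trigger state, action) pair.
    edge_types = [(r.trigger_state, r.action) for r in rules]
    return node_features, edges, edge_types


devices = {"cam1": "Camera", "ms1": "MotionSensor", "th1": "Thermostat"}
rules = [Rule("ms1", 3, "cam1", 7), Rule("cam1", 1, "th1", 2)]
node_features, edges, edge_types = build_user_graph(devices, rules)
```

In a PyTorch-Geometric setting, the edge list would typically be stored as an `edge_index` tensor and the edge types as edge attributes.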

Model Architecture

For each graph, the first component of the model works as follows:

  • Embedding layers are utilized for both trigger state and action for each edge.
  • Node types are one-hot encoded.
  • Edge features are aggregated per node and concatenated to node features.
  • Two SAGEConv layers are applied to the concatenated features.

This component acts as a node embedding, providing an embedding vector for each node. At this stage, a set of links is considered for prediction. The prediction process involves:

  • Concatenating the embedding vector for nodes in the edge.
  • Applying a classification head with a sigmoid activation function to predict the edge probability.
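
The two prediction steps above can be illustrated with a dependency-free sketch. The actual model is implemented in PyTorch-Geometric; here the classification head is assumed, for illustration only, to be a single linear layer with one output per edge category, and the names `edge_probabilities`, `weight`, and `bias` are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def edge_probabilities(h_src, h_dst, weight, bias):
    """Score one candidate edge from the embeddings of its two nodes.

    weight has one row per edge category and
    len(h_src) + len(h_dst) columns (assumed single linear layer).
    """
    z = h_src + h_dst  # list concatenation of the two embeddings
    return [
        sigmoid(sum(w * v for w, v in zip(row, z)) + b)
        for row, b in zip(weight, bias)
    ]


rng = random.Random(0)
emb_dim, num_categories = 4, 3
h_trigger = [rng.uniform(-1, 1) for _ in range(emb_dim)]
h_action = [rng.uniform(-1, 1) for _ in range(emb_dim)]
weight = [[rng.uniform(-1, 1) for _ in range(2 * emb_dim)]
          for _ in range(num_categories)]
bias = [0.0] * num_categories
probs = edge_probabilities(h_trigger, h_action, weight, bias)
```

Because the two embeddings are concatenated rather than summed or multiplied, the score for (trigger, action) generally differs from the score for (action, trigger), which suits directed edges.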

At inference time, for a graph with n_nodes nodes, the output is a tensor of shape (n_nodes * n_nodes, num_edges_categories) containing the score for each possible edge. The scores are sorted and the top 50 are selected.
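
A minimal sketch of this selection step, using nested lists in place of a tensor. The row layout (row i corresponds to the node pair (i // n_nodes, i % n_nodes)) and the helper name `top_k_edges` are assumptions for illustration; the repository may order candidate edges differently.

```python
import heapq


def top_k_edges(scores, n_nodes, k=50):
    """Return the k best (score, src, dst, category) tuples.

    scores: n_nodes * n_nodes rows, one score per edge category.
    Assumption: row i maps to the pair (i // n_nodes, i % n_nodes).
    """
    candidates = (
        (score, row // n_nodes, row % n_nodes, category)
        for row, row_scores in enumerate(scores)
        for category, score in enumerate(row_scores)
    )
    return heapq.nlargest(k, candidates, key=lambda c: c[0])


# Toy example: 2 nodes, 2 edge categories.
scores = [
    [0.1, 0.9],   # 0 -> 0
    [0.8, 0.2],   # 0 -> 1
    [0.3, 0.7],   # 1 -> 0
    [0.05, 0.4],  # 1 -> 1
]
best = top_k_edges(scores, n_nodes=2, k=3)
```

`heapq.nlargest` returns the selected candidates in descending score order, so the result can be used directly as a ranked recommendation list.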

Training approach

The entire network is trained using positive and negative sampling from the dataset, similarly to the baseline approach in FedRule. However, the code has been re-implemented in PyTorch-Geometric. In detail:

  • Positive sampling: example pairs are generated with a leave-one-out approach;
  • Negative sampling: given the graph structure, an unseen edge is generated.
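
The two sampling strategies can be sketched as below. This is a simplified illustration, not the repository's implementation: edges are represented as hypothetical (src, dst, edge_type) triples with the edge type collapsed into a single integer, and negatives are drawn by uniform rejection sampling, which is an assumption.

```python
import random


def leave_one_out(edges):
    """Yield (context_edges, held_out_edge) pairs, one per edge."""
    for i, held_out in enumerate(edges):
        yield edges[:i] + edges[i + 1:], held_out


def sample_negative(edges, n_nodes, num_edge_types, rng):
    """Draw a (src, dst, edge_type) triple not among the user's rules."""
    existing = set(edges)
    if len(existing) >= n_nodes * n_nodes * num_edge_types:
        raise ValueError("no unseen edge exists for this graph")
    while True:
        candidate = (
            rng.randrange(n_nodes),
            rng.randrange(n_nodes),
            rng.randrange(num_edge_types),
        )
        if candidate not in existing:
            return candidate


rng = random.Random(0)
edges = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]
positives = list(leave_one_out(edges))
negative = sample_negative(edges, n_nodes=3, num_edge_types=4, rng=rng)
```

Each positive example pairs a graph with one rule removed with the removed rule as the prediction target, which mirrors the leave-one-out design of the test splits.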

The loss function, as in the FedRule paper, is a binary cross-entropy over positive and negative examples.
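
As a rough sketch of such a loss, the function below treats positive examples as label 1 and negative examples as label 0 and averages the binary cross-entropy over both sets. The equal weighting of the two sets, the `eps` clamp, and the function name are assumptions for illustration; in PyTorch the same quantity would normally come from a built-in loss.

```python
import math


def bce_loss(pos_probs, neg_probs, eps=1e-12):
    """Mean binary cross-entropy over positive (label 1) and negative (label 0) edges."""
    terms = [-math.log(max(p, eps)) for p in pos_probs]
    terms += [-math.log(max(1.0 - p, eps)) for p in neg_probs]
    return sum(terms) / len(terms)


# Predicted probabilities for two positive and two negative edges.
loss = bce_loss([0.9, 0.8], [0.1, 0.3])
```

The loss decreases as positive edges receive probabilities near 1 and negative edges receive probabilities near 0.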

Contributors

conti748