
Comments (3)

cwalker4 commented on July 1, 2024

bb3ae0d takes a first pass at this. It seems to work as expected and is reasonably efficient (~3 seconds per tree using our most recent params), but needs further testing.

from youtube-recommendations.

cwalker4 commented on July 1, 2024

(The markdown document mixing_analysis was the initial motivation for this issue, and gives some more colour on why we might want to reconstruct the full BFS tree)


cwalker4 commented on July 1, 2024

48866ab adds scripts that deal with the above. Will likely need miscellaneous fixes, but closing this issue since the central problem has been solved.

The logic of the script is as follows: suppose we are at depth n. Some subset of the videos at depth n were already visited at some depth n' < n, so as described above we do not follow those edges in our crawl. Consequently, if such a video is listed in the recommendation column at depth n, it does not appear in the video_id column at depth n + 1. Continuing with the example above:

a
|
|--a
|
|--b
   |--c
   |--d


video_id|recommendation|depth
a|a|0
a|b|0
b|c|1
b|d|1

The list of recommended videos at depth 0 is [a,b], but the (unique) videos we have in the video_id column for depth 1 are [b]. Taking the difference between these lists, we see that we need to add a record for a. So our script would:

  • scan over the depths in an outer loop
  • at depth n, take the list difference between the recommendation column for depth n - 1 and the video_id column for depth n (recommended but not crawled): these are the videos we truncated.
  • iterate over the truncated videos, looking up their recommendations at whatever depth they were first encountered, and appending the appropriate records to the table at depth n.
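
As a rough illustration of the three steps above, here is a minimal Python sketch. It is not the script from 48866ab: the function name `fill_truncated` and the representation of the table as (video_id, recommendation, depth) tuples are assumptions made for this example, and it ignores the extra handling needed at the sampling depth.

```python
from collections import defaultdict


def fill_truncated(rows):
    """Add records for videos that were recommended but not re-crawled.

    `rows` is a list of (video_id, recommendation, depth) tuples.
    Hypothetical helper for illustration; not the project's actual script.
    """
    rows = list(rows)

    # Remember each video's recommendations at the depth it was first crawled.
    first_depth = {}
    recs = defaultdict(list)
    for vid, rec, depth in sorted(rows, key=lambda r: r[2]):
        first_depth.setdefault(vid, depth)
        if depth == first_depth[vid]:
            recs[vid].append(rec)

    max_depth = max(depth for _, _, depth in rows)

    # Outer loop over depths.
    for n in range(1, max_depth + 1):
        recommended = {rec for _, rec, d in rows if d == n - 1}
        crawled = {vid for vid, _, d in rows if d == n}
        # Recommended at depth n - 1 but missing at depth n: truncated videos.
        for vid in sorted(recommended - crawled):
            # Copy the recommendations from where the video was first seen.
            rows.extend((vid, rec, n) for rec in recs.get(vid, []))
    return rows


example = [("a", "a", 0), ("a", "b", 0), ("b", "c", 1), ("b", "d", 1)]
filled = fill_truncated(example)
```

On the example table, this appends two depth-1 records for a (recommending a and b) and leaves the original four rows unchanged.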

Things become more complicated when we reach our sampling depth; this was the biggest headache in addressing this issue. Can give more colour on this if people are interested (/ if there's anybody else reading this).

