Comments (4)

sukun1045 commented on June 19, 2024

Hi, thanks for your interest.

Although the input dimension is only 60, the raw representation is quite noisy, so we use an RNN-based network to extract a better representation of the whole sequence.

Since we use the final hidden state of the bidirectional GRU encoder for the kNN classification at the end, what matters is the dimension of that hidden state, which is 2048 (a hyperparameter chosen by experiment). That is quite large, and an algorithm like kNN may suffer from the 'curse of dimensionality'. It is also uncommon to use a 2048-dimensional vector directly for classification.

Previous 'unsupervised representation learning' works that use RNN-based models typically add a fully connected (FC) layer as a classifier on the final hidden state, reducing the dimension to the number of classes, and then fine-tune the model in a supervised setting to show that useful representations have been learned. That is not completely unsupervised, though: a single FC layer can change a lot, and the classification results may depend mostly on that one layer (we actually tried this).

In this work, we want to avoid any supervision for classification and use kNN to test the learned representation. You can think of the auto-encoder as a dimensionality reduction step that compresses the final 2048-dimensional representation into a lower-dimensional, compact vector for evaluation. Training this auto-encoder is simple and converges quickly, and it helps improve accuracy.
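
For readers following along, the kNN evaluation described above might be sketched roughly as follows. This is a minimal illustration only, not the repository's code: the Euclidean distance, the function names, and the toy 2-D features standing in for the compressed encoder states are all assumptions.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_feats, train_labels, query, k=1):
    """Label a query vector by majority vote among its k nearest training vectors."""
    order = sorted(
        range(len(train_feats)),
        key=lambda i: euclidean(train_feats[i], query),
    )
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def knn_accuracy(train_feats, train_labels, test_feats, test_labels, k=1):
    """Fraction of test vectors whose kNN prediction matches the true label."""
    hits = sum(
        knn_predict(train_feats, train_labels, f, k) == y
        for f, y in zip(test_feats, test_labels)
    )
    return hits / len(test_feats)


# Hypothetical toy features; real inputs would be the compressed encoder states.
train_feats = [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9]]
train_labels = ["wave", "wave", "jump", "jump"]
test_feats = [[0.1, 0.1], [5.1, 5.0]]
test_labels = ["wave", "jump"]

print(knn_accuracy(train_feats, train_labels, test_feats, test_labels, k=1))
```

Labels are used only to score the frozen features here; nothing in the encoder or auto-encoder is trained on them.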

from predict-cluster.

lucasgnz commented on June 19, 2024

Thank you very much for your quick reply, this helps a lot.

So up to the final 254 vector representation, there is no supervision. But kNN is a supervised algorithm, and I can't rely on it to evaluate the representations because my data has no labels.

Do you think the K-means algorithm could give a good unsupervised clustering?
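
To make the question concrete, here is a minimal sketch of K-means (Lloyd's algorithm) run on feature vectors. It is only an illustration under assumptions: the function names, the random initialisation, and the toy 2-D points standing in for learned representations are hypothetical, and nothing here says how well it would work on real features.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    # Initialise centroids with k distinct points drawn from the data.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # An empty cluster keeps its previous centroid.
            new_centroids.append(mean(members) if members else centroids[c])
        if new_centroids == centroids:
            break  # Converged: centroids no longer change.
        centroids = new_centroids
    return labels, centroids


# Hypothetical toy features forming two well-separated groups.
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, centroids = kmeans(points, k=2)
print(labels)
```

The cluster ids themselves are arbitrary; only which points share an id is meaningful.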

I am trying to train your model to do unsupervised action clustering from untrimmed skeleton sequences, which means I don't have separate sequences per action, only one long sequence of body keypoints.

So I am feeding your model short sequences that I sample randomly from the raw sequence. My goal is then to cluster these short sequences into different actions/movements, without supervision.
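
The sampling step I describe could look roughly like this. It is a sketch under assumptions: `sample_windows`, the window length, and the dummy frames are hypothetical placeholders rather than my actual pipeline.

```python
import random


def sample_windows(sequence, window_len, n_windows, seed=0):
    """Draw random fixed-length windows from one long sequence of frames.

    Returns (start_index, window) pairs so that a cluster assignment can
    later be mapped back onto the original timeline.
    """
    if window_len > len(sequence):
        raise ValueError("window_len exceeds sequence length")
    rng = random.Random(seed)
    max_start = len(sequence) - window_len
    starts = [rng.randint(0, max_start) for _ in range(n_windows)]
    return [(s, sequence[s:s + window_len]) for s in starts]


# Dummy data: 100 frames of 3 values each (real frames would be body keypoints).
frames = [[float(t)] * 3 for t in range(100)]
windows = sample_windows(frames, window_len=20, n_windows=5)
for start, window in windows:
    print(start, len(window))
```

Windows drawn this way can overlap and can straddle the boundary between two actions.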

Do you think this has a chance of succeeding? Or should I work on action segmentation first?

Again, thanks a lot for your time.

sukun1045 commented on June 19, 2024

Oh, I see. It looks like you are actually working on unsupervised temporal segmentation and action recognition? I am not sure this work would help, since it assumes a complete sequence that can be represented by a single hidden vector. However, I did try something similar to your current problem before. Maybe you can check out this paper and see if it helps: "Clustering and Recognition of Spatiotemporal Features through Interpretable Embedding of Sequence to Sequence Recurrent Neural Networks".

lucasgnz commented on June 19, 2024

Yes, I think this is exactly what I am looking for. Thank you so much!
