
Comments (7)

bryanyzhu avatar bryanyzhu commented on May 24, 2024

You can use cv2.VideoCapture to read the video (whether it is an offline file or a camera stream), grab the frames, and then run the prediction on each frame. Something like this:

import cv2

cap = cv2.VideoCapture(VIDEO_NAME)  # or cv2.VideoCapture(0) to read from a camera
net = get_model(MODEL_NAME)
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:  # end of stream or read failure, stop before using the frame
        break
    inp = preprocess(frame)  # resize/normalize the frame into a model input
    pred = net(inp)

cap.release()


shijubushiju avatar shijubushiju commented on May 24, 2024

@bryanyzhu Thank you very much. Let me have a try


shijubushiju avatar shijubushiju commented on May 24, 2024

@bryanyzhu Hello, I have another question:
Lines 85 and 86 of the flow_vgg16.py file look like this:
rgb_weight_mean = torch.mean(rgb_weight, dim=1)
flow_weight = rgb_weight_mean.repeat(1,in_channels,1,1)
However, lines 179 and 182 of the file flow_resnet.py look like this:
rgb_weight_mean = torch.mean(rgb_weight, dim=1)
flow_weight = rgb_weight_mean.unsqueeze(1).repeat(1,in_channels,1,1)
How should I make sense of the difference?


bryanyzhu avatar bryanyzhu commented on May 24, 2024

For the current PyTorch version, I think the second one is more rigorous. But both of them should work, because many operators support automatic broadcasting, so users usually don't need to worry about dimension mismatches.
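
For concreteness, here is a minimal shape-level sketch of what the flow_resnet.py version computes (the tensor sizes are illustrative, not taken from the repo):

import torch

out_channels, k = 64, 3
in_channels = 20  # e.g. a stack of 10 optical-flow frames x 2 (x/y) channels

rgb_weight = torch.randn(out_channels, 3, k, k)   # pretrained RGB conv1 weight
rgb_weight_mean = torch.mean(rgb_weight, dim=1)   # (64, 3, 3): torch.mean reduces the channel dim away
flow_weight = rgb_weight_mean.unsqueeze(1).repeat(1, in_channels, 1, 1)
print(flow_weight.shape)                          # torch.Size([64, 20, 3, 3])

The unsqueeze(1) restores the reduced channel dimension before repeating it in_channels times, which is the shape a flow-input conv layer expects.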


shijubushiju avatar shijubushiju commented on May 24, 2024

@bryanyzhu OK, I got an error when I ran the first one, and it ran perfectly with the second modification. Thank you for your reply. I will consult you if I have any more questions.


shijubushiju avatar shijubushiju commented on May 24, 2024

@bryanyzhu
In VideoSpatialPrediction.py:

def VideoSpatialPrediction(
    vid_name,
    net,
    num_categories,
    start_frame=0,
    num_frames=0,
    num_samples=25
):

Is num_samples the number of test videos here?

My other question is:
The model was tested on recorded video. What should I do if I want to test it online?


bryanyzhu avatar bryanyzhu commented on May 24, 2024

num_samples means the number of frames sampled from one video. This is the standard evaluation setting used in earlier work: we take 25 frames per video and do 10-crop per frame. So for each video we actually perform 250 forward passes and average the predictions to get the final result.
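
As a rough sketch of that protocol (torchvision's TenCrop stands in for the 10-crop step; frame sampling and normalization are omitted for brevity):

import torch
from torchvision import transforms
from torchvision.transforms.functional import to_tensor

ten_crop = transforms.TenCrop(224)  # 4 corners + center, each plus its horizontal flip

def predict_video(frames, net):
    # frames: 25 PIL images sampled uniformly from one video
    frame_preds = []
    for frame in frames:
        crops = torch.stack([to_tensor(c) for c in ten_crop(frame)])  # (10, 3, 224, 224)
        frame_preds.append(net(crops).mean(dim=0))                    # average over the 10 crops
    return torch.stack(frame_preds).mean(dim=0)                       # average over the 25 frames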

For online videos, what people usually do (or the simplest way) is to wait for a few frames, run the prediction, and average the results. Then repeat the same thing in a sliding-window fashion.
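
For example, here is a minimal sketch of that sliding-window idea on a camera stream (net and preprocess are the same placeholders as in the snippet above; the window size of 25 is just an illustration):

import collections
import cv2
import torch

cap = cv2.VideoCapture(0)               # 0 = default camera
window = collections.deque(maxlen=25)   # keep only the most recent 25 preprocessed frames

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    window.append(preprocess(frame))
    if len(window) == window.maxlen:
        with torch.no_grad():
            preds = torch.stack([net(f) for f in window])  # one forward pass per buffered frame
        print(preds.mean(dim=0).argmax())                  # averaged prediction for this window

cap.release()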

