
Supervised learning and deep learning models are effective at recognizing and classifying objects and actions in videos. However, these methods currently summarize an entire video clip with a single label, which does not always capture the full content. Understanding a multi-step process such as pitching a baseball requires more than one label.

Fine-grained, frame-by-frame labeling, however, is time-consuming. To tackle this problem, researchers from Google AI and DeepMind have introduced a self-supervised learning method called Temporal Cycle-Consistency Learning (TCC), which leverages temporal alignment between videos to break continuous actions down into their steps and develop “a semantic understanding of each video frame.”
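
To make the cycle-consistency idea more concrete, here is a minimal sketch in plain Python. It assumes each frame has already been turned into an embedding vector (the toy one-dimensional values below are made up for illustration), and the function names `nearest_neighbor` and `cycle_consistent_frames` are hypothetical, not taken from the researchers' code. A frame in video A counts as cycle-consistent if its nearest neighbor in video B points back to that same frame in A. This only checks the hard nearest-neighbor version of the idea; it is not the training procedure itself.

```python
import math

def nearest_neighbor(query, candidates):
    """Return the index of the candidate embedding closest to `query` (Euclidean)."""
    return min(range(len(candidates)),
               key=lambda idx: math.dist(query, candidates[idx]))

def cycle_consistent_frames(emb_a, emb_b):
    """For each frame in video A, check whether going A -> B -> A returns to it."""
    results = []
    for i, frame in enumerate(emb_a):
        j = nearest_neighbor(frame, emb_b)      # step 1: closest frame in video B
        k = nearest_neighbor(emb_b[j], emb_a)   # step 2: closest frame back in video A
        results.append(k == i)                  # consistent if the cycle closes
    return results

# Toy per-frame embeddings (assumed values, one dimension for readability).
emb_a = [[0.0], [1.0], [2.0], [3.0]]
emb_b = [[0.1], [2.1], [3.1]]

mask = cycle_consistent_frames(emb_a, emb_b)
print(mask)  # [True, False, True, True]
```

In this toy case the second frame of video A has no good counterpart in video B, so its cycle ends on a different frame and it is flagged as inconsistent. Frames that do cycle back give a frame-to-frame alignment between the two videos without any manual labels.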

Read more here:
