vimarsana.com

Scaling audio-visual learning without labels

Researchers from MIT, the MIT-IBM Watson AI Lab, IBM Research, and elsewhere have developed a new technique for analyzing unlabeled audio and visual data that could improve the performance of machine-learning models used in applications like speech recognition and object detection. The work combines, for the first time, two self-supervised learning architectures, contrastive learning and masked data modeling, in an effort to scale machine-learning tasks like event classification in sing...

Related Keywords

Frankfurt, Brandenburg, Germany, Texas, United States, Yuan Gong, Alexander H. Liu, Leonid Karlinsky, Hilde Kuehne, Andrew Rouditchenko, Jim Glass, David Harwath, MIT, IBM, MIT-IBM, Watson AI Lab, Artificial Intelligence Laboratory, Computer Science, Goethe University Frankfurt, University of Texas at Austin, International Conference on Learning Representations, YouTube, Machine Learning, Self-Supervised Learning, Contrastive Learning, Masked Data Modeling, Unlabeled Data, Audio, Visual, Speech Recognition, Object Detection, Event Classification, Press Release
