World's largest open-source multimodal dataset delivers 17x training efficiency, unlocking enterprise AI that connects documents, audio and video


📰 Full article

AI models are only as good as the data they're trained on. That data generally needs to be labeled, curated and organized before models can learn from it effectively.

One of the big missing links in the AI ecosystem has been the availability of a large, high-quality open-source multimodal dataset. That changes today with the debut of the EMM-1 dataset, which comprises 1 billion data pairs and 100M data groups across 5 modalities: text, image, video, audio and 3D point clouds. Multimodal datasets combine different types of data that AI systems can process together. This mirrors how humans perceive the world using multiple senses simultaneously. These datasets enable AI systems to make richer inferences by understanding relationships across data types, rather than processing each modality in isolation.

EMM-1 is developed by data labeling platform vendor Encord. The company's platform enables teams to curate, label and manage training data at scale using both automated and human-in-the-loop workflows. Alongside the new model, Encord developed the EBind training methodology, which prioritizes data quality over raw computational scale. The approach enabled a compact 1.8 billion parameter model to match the performance of models up to 17 times larger while slashing training time from days to hours on a single GPU rather than GPU clusters.

"The big trick for us was to really focus on the data and to make the data very, very high quality," Encord Co-Founder and CEO Eric Landau told VentureBeat in an exclusive interview. "We were able to get to the same level of performance as models 20 times larger, not because we were super clever on the architecture, but because we trained it with really good data overall."

The data quality advantage

Encord's dataset is 100 times larger than the next comparable multimodal dataset, according to Landau. It operates at petabyte scale with terabytes of raw data and over 1 million human annotations.

But scale alone doesn't explain the perfor

🌐 Original source

Original article: World's largest open-source multimodal dataset delivers 17x training efficiency, unlocking enterprise AI that connects documents, audio and video

Source: venturebeat.com



