DefinedCrowd Launches DefinedData, an Online Marketplace of AI Datasets Available for On-Demand Purchase

DefinedCrowd, the leading data provider for Artificial Intelligence, today announced the launch of DefinedData, a new offering that helps customers bring their AI initiatives to market faster by acquiring pre-collected, annotated, and validated AI training data from an online catalog.

This product launch follows the recent closing of a US $50.5M Series B funding round and the addition of new investor Balderton Capital. This funding round enables DefinedCrowd to continue its launch of new and innovative data solutions for the AI industry.

“Machine learning teams building AI models have always faced one particularly pressing problem, and that is continuous access to highly accurate data. When technology-focused companies want to launch their AI initiatives into the market quickly, they simply don’t have the time to collect and validate the data required to do so,” said Daniela Braga, founder and CEO of DefinedCrowd.

According to Braga, DefinedData aims to solve this problem by providing time-strapped customers with high-quality, pre-collected datasets that have already been annotated and validated by a global crowd of over 300,000 contributors. Creating datasets of this quality typically takes a machine learning (ML) team three to six months; DefinedData makes them available for purchase on demand.

Customers can browse pre-collected AI datasets across multiple languages, domains, and recording types, then either request samples or request to purchase. They can choose between a one-time purchase and an annual subscription that provides access to all new datasets. By May 2021, the catalog is expected to grow to include over 25,000 hours of speech and natural language data.

“As the appetite for high-quality data continues to grow, the market for training data will become increasingly modularised. Training data repositories and marketplaces will be a key feature of the value chain, allowing teams to both monetise existing data sets as well as source new data time and cost-effectively. We are incredibly excited to be joining Daniela and her team on their journey as they pave the way in this space,” said Laura Connell, Principal at Balderton Capital.

DefinedData will maintain the commitment to quality for which DefinedCrowd has become known. To ensure the highest levels of accuracy and authenticity, multiple key performance indicators (KPIs) will be used, including Word Error Rate, gender distribution, age distribution, ambient noise levels, nativeness (accuracy of native speakers), and domain accuracy.

“Whether you’re building a prototype or minimum viable product, testing internal models or benchmarking third-party cognitive services, our continually updated library of datasets will help you quickly achieve your AI goals,” concluded Braga.