
About Us

Better data, stronger AI


Magic Data

Magic Data provides high-quality machine learning training datasets to enterprises and academic institutions engaged in artificial intelligence R&D and applied research in automatic speech recognition (ASR), speech synthesis (TTS), natural language processing (NLP), and computer vision (CV).

Magic Data is dedicated to building conversational and read-speech training datasets for ML and has accumulated over 200,000 hours of data for ASR models. It serves top AI companies and Fortune 500 companies around the world, including Microsoft, Nvidia, Qualcomm, Nuance, Cerence, Alibaba Group, Baidu, and Tencent, with datasets in dozens of languages covering HMI, customer service, virtual assistants, machine translation, and many other AI scenarios.

Magic Data is ISO/IEC 27001 & ISO/IEC 27701:2019 accredited and GDPR compliant.

Magic Data Leadership

Dr. ZHANG Qingqing

Founder & CEO

· Former Associate Researcher at IOA, CAS

· Postdoctoral researcher at LIMSI-CNRS

· Fortune “The Most Powerful Women 2021”

· CYZone “Top Female Founder 2021”

· CAS Outstanding Scientific and Technological Achievement Award

· Member of Committee of Acoustics/Automobile/Female Worker/Standardization of CCF



Partner, Sales VP



Data Scientist

Kenneth PANG

CFO & CLO


Embrace limitless opportunity

Awards & Recognition


Press Room


Baseline & Training Datasets Are Open Now | ISCSLP 2022 Conversational Short-phrase Speaker Diarization Challenge (CSSD)

Since its launch on July 4, 2022, the ISCSLP 2022 Conversational Short-phrase Speaker Diarization Challenge has received more than 40 registrations. On July 24, the committee released the baseline and training datasets to all participants.

How to Ensure AI Data Security?

At the recent 2022 World Artificial Intelligence Conference (WAIC), the Data Element Circulation Technology Frontier Exploration Forum was one of the major themed forums. Under the theme "Open Symbiosis, Integration of Data and Reality", the forum focused on the economic and strategic value of data as a key production factor driving economic and social innovation and development, as well as the security threats and privacy challenges that come with it.

Brain-Computer Interface - The Next Big Thing of Intelligent Robots?

Recently, Dogecoin co-founder Billy Markus tweeted, "Would you be friends if you could upload your brain to the cloud and talk to a virtual version of yourself?" Elon Musk replied, "I've already done it".

The New Celebrity: Virtual Human

Since the pandemic began in 2020, the biggest new celebrities have been not human stars but "virtual humans". They range from the Japanese fashion icon IMMA, China's AYAYI, and the virtual singer Getong to the anime-style virtual idol group A-SOUL, CCTV's virtual presenter Xiao C, and the live-action virtual versions of Teresa Teng and Gong Jun. They come from fashion, singing and dancing, the anime world, journalism, acting, and other fields, and many have attracted large fan followings.

Bone Voiceprint Recognition—How Bone Conduction Headphones Work

With the development of artificial intelligence, many people are no longer strangers to voiceprint recognition. Voiceprint recognition converts sound signals into electrical signals, which a computer then uses to identify the speaker. Different tasks and applications use different voiceprint recognition technologies. For example, identification may be required to narrow the scope of a criminal investigation, while verification may be required for banking transactions.

AI for Good--Instead of Discussing the Consciousness of AI, Let it Be Our Assistant

In June 2022, Google engineer Blake Lemoine claimed that Google's large language model LaMDA had human-like "self-awareness". In Lemoine's view, "human" consciousness may also emerge even in GPT-3, OpenAI's large neural language model. According to quotes from interviews with Lemoine, LaMDA can access services such as YouTube, Google Search, Google Maps, and Google Books, which means that it can continue to accumulate "knowledge" through Google's services and so become smarter; that is, it can keep learning and evolving in imitation of the human brain. As soon as the news came out, discussions about whether LaMDA had a "personality" spread across social platforms at home and abroad.

Get Started?

Contact Us

Talk to Magic Data