Synthetic Data: A New Era of Artificial Intelligence

This week in artificial intelligence, the focus was on synthetic data. OpenAI introduced Canvas, a new way of interacting with ChatGPT that lets users create and edit text and code within a single workspace, either directly or by asking ChatGPT to make the changes. The feature is powered by a fine-tuned version of the GPT-4o model, which was trained on synthetic data to handle the new user interactions. ChatGPT product lead Nick Turley noted that synthetic data helped the model produce high-quality inline comments and edits, which significantly simplified the process.

OpenAI is not the only company relying on synthetic data. Meta, while developing Movie Gen, a tool for creating and editing video clips, also used synthetic captions generated by its Llama 3 models. Annotators were brought in to improve the quality of the captions, but the bulk of the work was automated, which sped up the process. OpenAI CEO Sam Altman believes that AI will eventually be able to produce synthetic data good enough for effective training, which would reduce the costs of data annotation and licensing.

However, the "synthetic data first" approach carries risks. Models used to generate such data may hallucinate and carry biases, which affects the quality of the data they produce. Without careful selection and filtering, synthetic data can degrade the quality and functionality of the models trained on it. It is therefore essential to vet synthetic data as thoroughly as traditional data.

Despite the challenges developers face, synthetic data may become the only viable solution as real-world data becomes increasingly expensive and difficult to obtain. We hope that companies operating in this space will proceed with caution, considering the possible consequences.

Author: Anna
 
