How AI tools could be used to reimagine live audio

As a longtime public media producer, storytelling technologist and now AI strategist, it’s only natural for me to imagine how AI tools can help the audio craft I honed for many years. 

Given my interest in AI, I revisited my notes from when I was in charge of NPR’s Weekend Edition Sunday and compared them to the show’s rundown to understand our process. The early morning of Aug. 17, 2008, was a whirlwind at NPR as the team raced to get Weekend Edition Sunday on the air. 

The director for the day was stretched thin as he juggled directing and mixing a reporter’s piece, a task normally handled by another producer. Production Assistant (PA) 1 was deep into finalizing a package on race and politics when a problem with tracks for a story by another reporter forced her to retrack and remix on the fly. 

PA 2 was similarly swamped, working to finalize multiple segments, while a producer managed the DACS and stepped in to assist with other pieces. Meanwhile, our editorial assistant faced a critical moment when the tracks for a third reporter’s story went missing, causing panic in the newsroom.

As the deadline loomed, another producer’s computer stalled, forcing her to switch to mine, while the second producer frantically trimmed a reporter piece. The pressure peaked as the team searched for a reporter’s tracks, finally locating them just in time. The team’s quick thinking and collaboration pulled them through, demonstrating the relentless effort required to deliver a live show on time.

This experience underscores how AI, whether through off-the-shelf products or proprietary tools, could significantly reduce pressure in live production environments by automating routine tasks such as track retrieval and mixing, which are often time-consuming and prone to last-minute issues. 

Piloting and experimenting with AI tools can show you where they streamline workflows, prevent crises and free your team to focus on the creative and strategic aspects of production. The payoff could be smoother broadcasts and more innovative content, improving both productivity and program quality.

| Time | Event | Human Role |
|---|---|---|
| Overnight | Five pieces arriving overnight | Managing and organizing incoming pieces |
| Early Morning | Host piece finished Sunday morning | Finishing and filing host piece |
| 6:30 AM | One reporter piece mixed | Mixing piece, staff overloaded |
| 6:45 AM | Second reporter piece filed | Managing file tracking and ensuring accuracy |
| 7:00 AM | Problem with 2nd reporter tracks – re-tracking | Re-tracking due to file error |
| 7:15 AM | Third reporter piece edited and filed | Editing and filing piece |
| 7:30 AM | DACS finished and sent | Managing DACS and sending |
| 7:45 AM | Fourth reporter re-tracks to fix error | Correcting tracking errors and ensuring accuracy |
| 7:55 AM | Move to mix on Davar’s computer due to technical issues | Troubleshooting technical issues with mix |
| 8:15 AM | Reporter tracks found after technical issue | Locating and organizing tracks |

This intense morning highlights the importance of meticulously documenting processes. By carefully tracking each step of the workflow, hosts, producers, managers and audio engineers can collaborate to identify areas where generative AI tools could streamline routine tasks.

AI has the potential to optimize many repetitive tasks, such as scheduling, file management and audio editing, reducing the risk of technical glitches and freeing up energy for crafting compelling stories.

With these processes streamlined, producers can spend more of their time on the creative and strategic aspects of their work. For public media managers, this means a more efficient workflow and a better allocation of team resources.

| Time | Event | Human role | AI efficiency | Suggestions for producers and managers |
|---|---|---|---|---|
| Overnight | Five pieces arriving overnight | Managing and organizing incoming pieces | Automatically manage and organize incoming pieces with human oversight for quality control | Identify and document routine tasks in detail to understand where AI can be beneficial |
| Early Morning | Host piece finished Sunday morning | Finishing and filing host piece | | Collaborate with technical teams to integrate AI in areas where it can handle repetitive tasks |
| 6:30 AM | One reporter piece mixed | Mixing piece, staff overloaded | Assist in mixing, preventing staff overload while ensuring human editorial oversight | Ensure that AI tools are used to augment the team’s work, not replace critical human elements |
| 6:45 AM | Second reporter piece filed | Managing file tracking and ensuring accuracy | Manage file tracking and error detection, allowing humans to focus on storytelling and narrative | Continuously monitor AI performance and adjust its role to align with editorial standards |
| 7:00 AM | Problem with 2nd reporter tracks – re-tracking | Re-tracking due to file error | Identify and correct file errors with voice AI before air, with humans approving the final output | Permissions from reporters to use their voice AI for last-minute retracks / Reducing workload while maintaining quality |
| 7:15 AM | Third reporter piece edited and filed | Editing and filing piece | Assist in editing and automatically file pieces, with humans making the final creative decisions | Foster a culture of innovation where AI is seen as a partner in the creative process |
| 7:30 AM | DACS finished and sent | Managing DACS and sending | Automate DACS management and sending in real time, while humans finalize and ensure accuracy | Regularly assess the impact of AI tools on production efficiency and content quality |
| 7:45 AM | Fourth reporter re-tracks to fix error | Correcting tracking errors and ensuring accuracy | Automatically correct tracking errors, with humans verifying fixes for accuracy | Incorporate feedback from producers and editors to refine AI’s role in the workflow |
| 7:55 AM | Move to mix on Davar’s computer due to technical issues | Troubleshooting technical issues with mix | Prevent technical issues by managing resources, with humans troubleshooting if needed | Stay informed on emerging AI technologies to continuously improve the production process |
| 8:15 AM | Reporter tracks found after technical issue | Locating and organizing tracks | Automatically find and organize tracks, allowing humans to focus on content and narrative structure | Develop guidelines to ensure AI is used ethically and effectively in content creation |

Integrating AI successfully requires a shared commitment to using it as a tool that enhances, rather than replaces, the human touch in content creation. As I’ve discussed, integrating AI ethically is imperative. We must ensure that it enhances storytelling, upholds journalistic integrity and respects cultural diversity.

Public media must lead by example, creating proprietary AI tools that reflect community values. The controversy surrounding OpenAI’s “Sky” voice, which Scarlett Johansson said sounded strikingly similar to her own and was released without her consent, underscored the need for transparency and ethical considerations in AI.

Here is a summary of how generative AI could be used for show production:

  1. Automated Transcription and Editing:
    • Tools like Otter.ai and Descript provide instant transcriptions and suggest edits, speeding up production. They reduce manual effort, allowing producers more time for creativity.
  2. Audio Quality Optimization:
    • AI tools like Podcastle.io can automatically adjust sound levels and detect issues. They prevent disruption by ensuring high audio quality from the start.
  3. Track and File Management:
    • A proprietary AI could organize and label tracks, ensuring that files are easily found and correctly managed. It could prevent delays like the one caused by missing tracks.
  4. System Monitoring:
    • A proprietary AI could monitor editing stations and suggest alternatives if hardware fails, reducing the risk of technical issues.
  5. Voice AI Tools for Text-to-Speech:
    • We could work with correspondents and hosts to create AI voices from their recordings, with their permission and proper compensation. These voices could be used for retracks on tight deadlines, maintaining quality even under time constraints.
    • Funders’ messages and commercials could also be produced with voice AI tools. For example, NBC, with Al Michaels’ permission, used AI to recreate his voice for daily Summer Olympic Games recaps for subscribers.
  6. AI Scheduling Assistants:
    • Tools like Google Calendar’s AI features could automate scheduling and reminders for interviews and pre-tapes, ensuring timely preparation and coordination.
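
For teams that want to pilot the track and file management idea (item 3) before investing in a proprietary tool, a minimal sketch might look like the following. It is a plain rule-based check rather than an AI model, and everything in it is an assumption made for illustration: the `RundownItem` structure, the `find_missing_tracks` helper and the convention that each piece's tracks are saved as `<slug>.wav` in a single folder. A real newsroom system would need to be adapted to its own rundown and storage tools.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class RundownItem:
    """One piece in the show rundown (hypothetical structure)."""
    slug: str       # short story identifier, e.g. "race-politics"
    reporter: str   # who voiced the piece
    air_time: str   # zero-padded 24-hour "HH:MM", so plain string sorting works


def find_missing_tracks(rundown, tracks_dir, extension=".wav"):
    """Return rundown items with no matching audio file, earliest air time first.

    Assumes (for illustration only) that tracks are saved as
    <slug><extension> directly inside tracks_dir.
    """
    # Collect the file names (without extension) that have actually been filed.
    filed = {path.stem for path in Path(tracks_dir).glob(f"*{extension}")}
    # Anything on the rundown without a matching file is flagged.
    missing = [item for item in rundown if item.slug not in filed]
    # Surface the pieces closest to air first.
    return sorted(missing, key=lambda item: item.air_time)
```

Run every few minutes against the morning's rundown, a check like this would surface a missing set of tracks well before air, leaving the decision about what to do next with the producer.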

While the potential of AI is exciting, I approach it with a healthy dose of skepticism. All of this is new, and while the possibilities are promising, the time to pilot and test these tools is now. 

It’s essential that we not only imagine how AI can support our work but also rigorously evaluate its impact to ensure that it truly enhances our processes without compromising the quality that audiences expect. This is an evolving landscape, and our role is to lead with caution, innovation and a steadfast commitment to the craft we’ve spent years developing.

| Time | Event | Human Role |
|---|---|---|
| Overnight | Three pieces arriving overnight, AI-assisted | Supervising AI file management, ensuring quality |
| Early Morning | Host piece finalized with AI editing tools | Finalizing content, focusing on narrative flow |
| 6:30 AM | Reporter story mixed with AI support | Overseeing AI-supported mixing, making creative decisions |
| 6:45 AM | Second piece auto-filed by AI | Reviewing AI filing, ensuring accuracy |
| 7:00 AM | AI identifies minor error in reporter tracks, automatically corrects w/voiceAI | Approving AI corrections, making final content decisions |
| 7:15 AM | Another reporter piece edited and filed with AI assistance | Finalizing edits, concentrating on storytelling |
| 7:30 AM | DACS automatically generated and sent | Reviewing AI-generated DACS, confirming all elements |
| 7:45 AM | AI rechecks piece, final adjustments made by the producer | Final editorial supervision, making creative adjustments |
| 7:55 AM | AI resolves minor technical issue with reporter’s mix | Troubleshooting with AI, focusing on narrative integrity |
| 8:15 AM | Reporter tracks auto-organized by AI, ready for final review | Final review and approval, planning for future projects |

To set the stage for the discussion of AI and audio production, it’s important to recognize that AI has made significant strides in recent years, particularly in generating realistic audio and music. These advancements allow AI to produce everything from lifelike voiceovers to complex musical compositions, transforming industries like entertainment and advertising. However, these new capabilities also introduce challenges that must be addressed to ensure that the technology serves all communities equitably.

One of the most pressing issues is the lack of diverse training data for AI models, which has significant implications for public media stations across the country. Stations represent diverse communities, yet the audio training data used by AI systems often does not represent this diversity. This means that when you use an audio generation or music generation platform, you may notice a lack of cultural variety in the outputs, which can result in content that fails to accurately reflect the richness of global cultures.

Moreover, there is often a troubling lack of transparency regarding the sources of training data. Was it pulled from platforms like YouTube? Was it scraped from across the internet without regard for cultural context or ethical considerations? The opacity surrounding where and how this data is collected raises concerns about the authenticity and ethical integrity of AI-generated content. This gap in diversity and transparency is not just a technical oversight; it carries significant cultural implications, as it risks perpetuating stereotypes, erasing important cultural nuances and ultimately diminishing the quality of content produced for diverse audiences.

Recognizing this, while leading my startup TulipAI, I initiated CulturaFX, a research collaboration with Florida Gulf Coast University software engineering students. This project was designed to directly address the issue of cultural authenticity in AI-generated audio by ethically sourcing and curating diverse datasets. For example, in the case of mariachi music, we proved that the AI model was not trained on recordings that truly reflect the cultural depth and variation within this genre.

This is a great opportunity for public media. By fostering collaboration with cultural experts, public media can come together to create an open-source platform to preserve the authenticity and richness of these sounds, enhancing the quality and relevance of AI-generated audio in public media and beyond.

The risk of cultural insensitivity or inaccuracy in AI-generated content is real, and it’s our responsibility to ensure that these tools are used thoughtfully and responsibly. As we continue to integrate AI tools like GPT-4o into our workflows, we must remain vigilant about these ethical concerns. GPT-4o’s enhanced multimodal capabilities offer exciting possibilities for multilingual communication and cultural preservation, but they also underscore the need for human oversight. 

The evolution of our work with AI is not just about efficiency; it’s about enhancing our ability to tell stories that are true to the diverse cultures and communities we serve. Whether you’re directly involved in AI projects or just starting to explore these tools, it’s essential to engage with these ethical challenges. Don’t just be enamored by these tools or afraid of them — get involved and do the hard work of making them better, more ethical and more transparent.

To cultivate a more thoughtful and responsible integration of AI in creative processes, we must engage in pilot projects, foster meaningful debate and ask critical questions. This approach will ensure that our technological advancements enhance inclusivity and cultural sensitivity in public media.

Davar Ardalan is an AI and Media Specialist known for her leadership in Cultural AI initiatives. With experience at National Geographic, NPR News, The White House Presidential Innovation Fellowship Program, IVOW AI and TulipAI, she has championed principles like fairness and cultural preservation in AI. Ardalan has also developed AI training and educational workshops. As the leader of TulipAI, she focuses on AI and cultural heritage preservation. She is offering an upcoming course about AI and audio that will cover how to leverage AI for content creation, historical reenactments and more while mastering techniques to enhance sound quality and produce multilingual, ethical and culturally rich audio. 

This content was crafted with the assistance of artificial intelligence, which contributed to structuring the narrative, ensuring grammatical accuracy, summarizing key points and enhancing the readability and coherence of the material. 
