
    Mono to Stereo: How AI Is Breathing New Life into Music | by Max Hilsdorf | Dec, 2024

    By Team_AIBS News, December 25, 2024 (5 min read)


    Now that we have talked about how relevant mono-to-stereo technology is, you might be wondering how it works under the hood. It turns out there are different approaches to tackling this problem with AI. In the following, I want to showcase four different methods, ranging from traditional signal processing to generative AI. This is not a complete list of methods, but rather an inspiration for how this task has been solved over the last 20 years.

    Traditional Signal Processing: Sound Source Formation

    Before machine learning became as popular as it is today, the field of Music Information Retrieval (MIR) was dominated by smart, hand-crafted algorithms. It is no wonder that such approaches also exist for mono-to-stereo upmixing.

    The fundamental idea behind a paper from 2007 (Lagrange, Martins, Tzanetakis, [1]) is simple:

    If we can find the different sound sources of a recording and extract them from the signal, we can mix them back together for a realistic stereo experience.

    This sounds simple, but how can we tell what the sound sources in the signal are? How can we define them so clearly that an algorithm can extract them from the signal? These questions are difficult to answer, and the paper uses a variety of advanced methods to do so. In essence, this is the algorithm they came up with:

    1. Break the recording into short snippets and identify the peak frequencies (dominant notes) in each snippet
    2. Identify which peaks belong together (a sound source) using a clustering algorithm
    3. Decide where each sound source should be placed in the stereo mix (manual step)
    4. For each sound source, extract its assigned frequencies from the signal
    5. Mix all extracted sources together to form the final stereo mix.

    Example of the user interface built for the study. The user goes through all the extracted sources and manually places them in the stereo mix, before resynthesizing the whole signal. Image taken from [1].

    Although quite complex in the details, the intuition is clear: find sources, extract them, mix them back together.
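    To make the first step of that algorithm more concrete, here is a minimal sketch of peak picking on short snippets. It is not the method from [1]: the function name `frame_peaks`, the frame size, the Hann window, and the toy two-tone input are all assumptions made for illustration, and a naive DFT is used only to keep the example dependency-free (a real system would use an FFT library). The clustering and resynthesis steps are not covered.

```python
import cmath
import math


def frame_peaks(signal, sample_rate, frame_size=256, n_peaks=2):
    """Return, per frame, the n_peaks strongest local spectral maxima in Hz."""
    peaks_per_frame = []
    for start in range(0, len(signal) - frame_size + 1, frame_size):
        frame = signal[start:start + frame_size]
        # Hann window to reduce spectral leakage before the DFT.
        windowed = [
            x * 0.5 * (1 - math.cos(2 * math.pi * n / (frame_size - 1)))
            for n, x in enumerate(frame)
        ]
        # Naive DFT magnitudes for bins 0..frame_size/2 (O(N^2), illustration only).
        mags = []
        for k in range(frame_size // 2 + 1):
            acc = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(windowed)
            )
            mags.append(abs(acc))
        # Local maxima: bins that are larger than both of their neighbours.
        local_max = [
            k for k in range(1, len(mags) - 1)
            if mags[k] > mags[k - 1] and mags[k] > mags[k + 1]
        ]
        strongest = sorted(local_max, key=lambda k: mags[k], reverse=True)[:n_peaks]
        peaks_per_frame.append(sorted(k * sample_rate / frame_size for k in strongest))
    return peaks_per_frame


# Toy input (assumption): two sinusoids standing in for two "sources".
SAMPLE_RATE = 8000
tone = [
    math.sin(2 * math.pi * 500 * n / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 1500 * n / SAMPLE_RATE)
    for n in range(512)
]
peaks = frame_peaks(tone, SAMPLE_RATE)
print(peaks)
```

    For this toy signal, each of the two frames yields peaks at 500 Hz and 1500 Hz. On real music the peaks are far less clean, which is why the paper needs a clustering step to decide which peaks belong to the same source.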

    A Quick Workaround: Source Separation / Stem Splitting

    A lot has happened since Lagrange’s 2007 paper. Since Deezer released their stem splitting tool Spleeter in 2019, AI-based source separation systems have become remarkably useful. Leading players such as Lalal.ai or Audioshake make a quick workaround possible:

    1. Separate a mono recording into its individual instrument stems using a free or commercial stem splitter
    2. Load the stems into a Digital Audio Workstation (DAW) and mix them together to your liking

    This technique was used in a research paper in 2011 (see [2]), but it has become much more viable since then thanks to recent improvements in stem separation tools.
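    The mixing step of this workaround can be sketched in code as well. The example below assumes the stems have already been separated and loaded as equal-length lists of samples, and it uses constant-power panning, one common panning law, as a stand-in for what a producer would do by hand in a DAW. The names `pan_stem` and `mix_stems`, the stem values, and the pan positions are all made up for illustration.

```python
import math


def pan_stem(samples, pan):
    """Constant-power pan: pan=-1 is hard left, 0 is centre, +1 is hard right."""
    angle = (pan + 1) * math.pi / 4  # maps [-1, 1] to [0, pi/2]
    left_gain, right_gain = math.cos(angle), math.sin(angle)
    return [s * left_gain for s in samples], [s * right_gain for s in samples]


def mix_stems(stems, pans):
    """Sum panned stems into one (left, right) pair of equal-length lists."""
    length = len(next(iter(stems.values())))
    left, right = [0.0] * length, [0.0] * length
    for name, samples in stems.items():
        stem_left, stem_right = pan_stem(samples, pans[name])
        for i in range(length):
            left[i] += stem_left[i]
            right[i] += stem_right[i]
    return left, right


# Hypothetical stems: tiny made-up sample lists standing in for separated audio.
stems = {
    "vocals": [0.5, -0.5, 0.25],
    "guitar": [0.2, 0.1, -0.3],
    "bass": [0.4, 0.4, 0.4],
}
pans = {"vocals": 0.0, "guitar": -0.6, "bass": 0.0}  # guitar placed to the left
left, right = mix_stems(stems, pans)
```

    Constant-power panning keeps the combined energy of a stem the same wherever it is placed, so moving the guitar to the left does not make it sound quieter overall.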

    The downside of source separation approaches is that they produce noticeable sound artifacts, because source separation itself is still not without flaws. Additionally, these approaches still require manual mixing by humans, making them only semi-automatic.

    To fully automate mono-to-stereo upmixing, machine learning is required. By learning from real stereo mixes, an ML system can adopt the mixing style of real human producers.

    Machine Learning with Parametric Stereo

    Photograph by Zarak Khan on Unsplash

    One very creative and efficient way of using machine learning for mono-to-stereo upmixing was presented at ISMIR 2023 by Serrà and colleagues [3]. This work is based on a music compression technique called parametric stereo. Stereo mixes consist of two audio channels, which makes them hard to use in low-bandwidth settings such as music streaming, radio broadcasting, or telephone connections.

    Parametric stereo is a technique to create stereo sound from a single mono signal by focusing on the important spatial cues our brain uses to determine where sounds are coming from. These cues are:

    1. How loud a sound is in the left ear vs. the right ear (Interchannel Intensity Difference, IID)
    2. How in sync it is between left and right in terms of time or phase (Interchannel Time or Phase Difference)
    3. How similar or different the signals are in each ear (Interchannel Correlation, IC)

    Using these parameters, a stereo-like experience can be created from nothing more than a mono signal.
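    As a rough illustration of two of these cues, the sketch below computes an intensity difference and a correlation for a pair of channels. It is a simplification under stated assumptions: the values are computed once over the whole signal, whereas parametric stereo codecs compute them per frequency band and per time frame, and the time or phase difference is left out. The function names and the toy signal are made up for illustration.

```python
import math


def level_difference_db(left, right, eps=1e-12):
    """Interchannel intensity difference in dB (positive means left is louder)."""
    energy_left = sum(x * x for x in left)
    energy_right = sum(x * x for x in right)
    return 10 * math.log10((energy_left + eps) / (energy_right + eps))


def correlation(left, right, eps=1e-12):
    """Normalised zero-lag cross-correlation: 1 means identical shape, 0 unrelated."""
    dot = sum(l * r for l, r in zip(left, right))
    norm = math.sqrt(sum(l * l for l in left) * sum(r * r for r in right))
    return dot / (norm + eps)


# Toy stereo pair (assumption): the right channel is the left one at half amplitude.
left = [math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
right = [0.5 * x for x in left]

iid = level_difference_db(left, right)
ic = correlation(left, right)
print(f"IID = {iid:.2f} dB, IC = {ic:.2f}")
```

    Halving the amplitude of one channel gives an intensity difference of about 6 dB, while the correlation stays at 1 because both channels still carry the same waveform.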

    This is the approach the researchers took to develop their mono-to-stereo upmixing model:

    1. Collect a large dataset of stereo music tracks
    2. Convert the stereo tracks to parametric stereo (mono + spatial parameters)
    3. Train a neural network to predict the spatial parameters given a mono recording
    4. To turn a new mono signal into stereo, use the trained model to infer spatial parameters from the mono signal and combine the two into a parametric stereo experience
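    The last step can be sketched as follows. This is not the model from [3]: `predict_iid_db` is a hypothetical stand-in that returns a fixed value where the trained network would go, only the intensity difference is applied (no phase or correlation parameters), and the frame size is arbitrary. The sketch also assumes the mono signal is the average of left and right, so that averaging the output channels gives back the input.

```python
def predict_iid_db(mono_frame):
    """Hypothetical stand-in for the trained network: returns a fixed IID in dB."""
    return 6.0


def upmix_frame(mono_frame, iid_db):
    """Level-only upmix of one frame so that (left + right) / 2 equals the mono input."""
    ratio = 10 ** (iid_db / 20)  # amplitude ratio left / right
    right_gain = 2 / (1 + ratio)
    left_gain = ratio * right_gain
    return [left_gain * x for x in mono_frame], [right_gain * x for x in mono_frame]


def upmix(mono, frame_size=4):
    """Predict parameters frame by frame and apply them to the mono signal."""
    left, right = [], []
    for start in range(0, len(mono), frame_size):
        frame = mono[start:start + frame_size]
        frame_left, frame_right = upmix_frame(frame, predict_iid_db(frame))
        left.extend(frame_left)
        right.extend(frame_right)
    return left, right


# Made-up mono samples for illustration.
mono = [0.1, -0.2, 0.3, 0.4, -0.5]
left, right = upmix(mono)
```

    The appeal of this design is that the network only has to predict a handful of parameters per frame instead of a full second audio channel, which is what makes the approach efficient.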

    Currently, no code or listening demos seem to be available for this paper. The authors themselves confess that “there is still a gap between professional stereo mixes and the proposed approaches” (p. 6). Still, the paper outlines a creative and efficient way to accomplish fully automated mono-to-stereo upmixing using machine learning.

    Generative AI: Transformer-based Synthesis

    Stereo generation in Meta’s text-to-music model MusicGen. Image taken from another article by the author.

    Now we get to the seemingly most straightforward way to generate stereo from mono: training a generative model to take a mono input and synthesize both stereo output channels directly. Although conceptually simple, this is by far the most challenging approach from a technical standpoint. One second of high-resolution audio has 44.1k data points. Generating a three-minute song with stereo channels therefore means generating over 15 million data points.
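    The arithmetic behind that figure can be checked in a few lines, assuming a 44.1 kHz sample rate, a song length of exactly three minutes, and two channels:

```python
SAMPLE_RATE_HZ = 44_100
DURATION_S = 3 * 60
CHANNELS = 2

samples_per_channel = SAMPLE_RATE_HZ * DURATION_S
total_samples = samples_per_channel * CHANNELS

# prints "7,938,000 samples per channel, 15,876,000 in total"
print(f"{samples_per_channel:,} samples per channel, {total_samples:,} in total")
```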

    With today’s technologies such as convolutional neural networks, transformers, and neural audio codecs, the complexity of the task is starting to become manageable. Some papers have chosen to generate stereo signals through direct neural synthesis (see [4], [5], [6]). However, only [5] trains a model that can solve mono-to-stereo generation out of the box. My intuition is that there is room for a paper that builds a dedicated model for the “simple” task of mono-to-stereo generation and focuses 100% on solving this objective. Anyone here looking for a PhD topic?


