Are We Watching More Ads Than Content? Analyzing YouTube Sponsor Data

By Team_AIBS News · April 4, 2025 · 22 Mins Read
I'm certainly not the only one who feels that YouTube sponsor segments have become longer and more frequent lately. Sometimes, I watch videos that seem to be trying to sell me something every couple of seconds.

Whether or not that perception is accurate, it sure is annoying to be bombarded by ads.

In this blog post, I'll explore these sponsor segments, using data from a popular browser extension called SponsorBlock, to figure out whether the perceived increase in ads actually happened, and to quantify how many ads I'm watching.

I'll walk you through my analysis, providing code snippets in SQL, DuckDB, and pandas. All of the code is available on my GitHub, and since the dataset is open, I'll also show you how to download it, so you can follow along and play with the data yourself.

These are the questions I'll be trying to answer in this analysis:

• Have sponsor segments increased over the years?
• Which channels have the highest percentage of sponsor time per video?
• What's the density of sponsor segments throughout a video?

To get to those answers, we'll have to cover a lot of ground. Let's get started!

SponsorBlock is an extension that lets you skip ad segments in videos, similar to how you skip Netflix intros. It's incredibly accurate: I don't remember seeing a single incorrect segment since I started using it around a month ago, and I watch a lot of smaller non-English creators.

You might be asking yourself how the extension knows which parts of the video are sponsors, and, believe it or not, the answer is crowdsourcing!

Users submit the timestamps of ad segments, and other users vote on whether they're accurate. For the average user, who isn't contributing at all, the only thing you have to do is press Enter to skip the ad.

Okay, now that you know what SponsorBlock is, let's talk about the data.

Cleaning the Data

If you want to follow along, you can download a copy of the data using this SponsorBlock Mirror (it will take quite a few minutes to download all of it). The database schema can be seen here, although most of it won't be useful for this project.

As one might expect, their database schema is designed for the extension to work properly, and not for some guy to basically leech off a huge community effort to find out what percentage of ads his favorite creator runs. For that, some work is needed to clean and model the data.

The only two tables that matter for this analysis are:

• sponsorTimes.csv: The most important table, containing the startTime and endTime of every crowdsourced sponsor segment. The CSV is around 5GB.
• videoInfo.csv: Contains the video title, publication date, and channel ID associated with each video.

Before we get into it, these are all the libraries I ended up using. I'll explain the less obvious ones as we go.

    pandas
    duckdb
    requests
    requests-cache
    python-dotenv
    seaborn
    matplotlib
    numpy
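
If you want to grab all of them in one go, a single pip install should do it (assuming a recent Python environment):

pip install pandas duckdb requests requests-cache python-dotenv seaborn matplotlib numpy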

The first step, then, is to load the data. Surprisingly, this was already a bit tricky, as I was getting a lot of errors parsing some rows of the CSV. These were the settings I found to work for the majority of the rows:

import duckdb
import os

# Connect to an in-memory DuckDB instance
con = duckdb.connect(database=':memory:')

sponsor_times = con.read_csv(
    "sb-mirror/sponsorTimes.csv",
    header=True,
    columns={
        "videoID": "VARCHAR",
        "startTime": "DOUBLE",
        "endTime": "DOUBLE",
        "votes": "INTEGER",
        "locked": "INTEGER",
        "incorrectVotes": "INTEGER",
        "UUID": "VARCHAR",
        "userID": "VARCHAR",
        "timeSubmitted": "DOUBLE",
        "views": "INTEGER",
        "category": "VARCHAR",
        "actionType": "VARCHAR",
        "service": "VARCHAR",
        "videoDuration": "DOUBLE",
        "hidden": "INTEGER",
        "reputation": "DOUBLE",
        "shadowHidden": "INTEGER",
        "hashedVideoID": "VARCHAR",
        "userAgent": "VARCHAR",
        "description": "VARCHAR",
    },
    ignore_errors=True,
    quotechar="",
)

video_info = con.read_csv(
    "sb-mirror/videoInfo.csv",
    header=True,
    columns={
        "videoID": "VARCHAR",
        "channelID": "VARCHAR",
        "title": "VARCHAR",
        "published": "DOUBLE",
    },
    ignore_errors=True,
    quotechar=None,
)

# Ignore warnings
import warnings
warnings.filterwarnings('ignore')

Here's what a sample of the data looks like:

con.sql("SELECT videoID, startTime, endTime, votes, locked, category FROM sponsor_times LIMIT 5")

con.sql("SELECT * FROM video_info LIMIT 5")
Sample of sponsorTimes.csv
Sample of videoInfo.csv

Understanding the data in the sponsorTimes table is ridiculously important; otherwise, the cleaning process won't make any sense.

Each row represents a user-submitted timestamp for a sponsored segment. Since multiple users can submit segments for the same video, the dataset contains duplicate and potentially incorrect entries, which will need to be dealt with during cleaning.

To find incorrect segments, I'll use the votes and locked columns, as the latter marks segments that have been confirmed to be correct.

Another important column is the category. There are a bunch of categories, like Intro, Outro, Filler, etc. For this analysis, I'll only work with Sponsor and Self-Promo.

I started by applying some filters:

CREATE TABLE filtered AS
SELECT
    *
FROM sponsor_times
WHERE category IN ('sponsor', 'selfpromo') AND (votes > 0 OR locked = 1)
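
If you're curious how much this filter cuts, a quick row-count comparison (a minimal check, using the two tables defined above) looks like this:

con.sql("""
    SELECT
        (SELECT COUNT(*) FROM sponsor_times) AS total_rows,
        (SELECT COUNT(*) FROM filtered)      AS kept_rows
""")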

Filtering for locked segments, or segments with more than 0 votes, was a big decision. It reduced the dataset by a huge percentage, but doing so made the data very reliable. For example, before doing this, the entire Top 50 of channels with the highest percentage of ads was just spam: random channels that ran 99.9% ads.

With that done, the next step is to get a dataset where each sponsor segment shows up only once. For example, a video with one sponsor segment at the beginning and another at the end should have exactly two rows of data.

That is very much not the case so far, since for a single video we can have multiple user-submitted entries for each segment. To fix this, I'll use window functions to identify whether two or more rows represent the same segment.

The first window function compares the startTime of one row with the endTime of the previous one. If those values don't overlap, they're entries for separate segments; otherwise, they're repeated entries for the same segment.

CREATE TABLE new_segments AS
SELECT
    -- Coalesce to TRUE to deal with the first row of each window,
    -- since the values are NULL, but it should count as a new segment.
    COALESCE(startTime > LAG(endTime)
      OVER (PARTITION BY videoID ORDER BY startTime), true)
      AS new_ad_segment,
    *
FROM filtered
Window function example for a single video.

The new_ad_segment column is TRUE every time a row represents a new segment of a video. The first two rows, since their timestamps overlap, are properly marked as the same segment.

Next up, the second window function will label each ad segment by number:

CREATE TABLE ad_segments AS
SELECT
    SUM(new_ad_segment)
      OVER (PARTITION BY videoID ORDER BY startTime)
      AS ad_segment,
    *
FROM new_segments
Example of labels for ad segments in a single video.

Finally, now that each segment is properly numbered, it's easy to keep the entry that's either locked or has the highest number of votes.

CREATE TABLE unique_segments AS
SELECT DISTINCT ON (videoID, ad_segment)
    *
FROM ad_segments
ORDER BY videoID, ad_segment, locked DESC, votes DESC
Example of what the final dataset looks like for a single video.

That's it! Now this table has one row per unique ad segment, and I can start exploring the data.

If these queries feel complicated and you need a refresher on window functions, check out this blog post, which will teach you everything you need to know about them! The last example covered there is almost exactly the process I used here.
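
If you'd rather see the whole deduplication pipeline run end to end, here's a self-contained toy version (hypothetical data on a fresh in-memory connection, with the three queries above squashed into one; the boolean flag is cast to INT explicitly, just to be safe):

import duckdb

toy = duckdb.connect()

# Three submissions for one video: the first two overlap (same ad), the third is a separate ad.
toy.sql("""
    CREATE TABLE filtered AS
    SELECT * FROM (VALUES
        ('vid1', 10.0,  40.0, 5, 0),
        ('vid1', 12.0,  41.0, 2, 1),
        ('vid1', 300.0, 360.0, 1, 0)
    ) AS t(videoID, startTime, endTime, votes, locked)
""")

toy.sql("""
    SELECT DISTINCT ON (videoID, ad_segment)
        *
    FROM (
        SELECT
            SUM(new_ad_segment::INT) OVER (PARTITION BY videoID ORDER BY startTime) AS ad_segment,
            *
        FROM (
            SELECT
                COALESCE(startTime > LAG(endTime)
                    OVER (PARTITION BY videoID ORDER BY startTime), true) AS new_ad_segment,
                *
            FROM filtered
        )
    )
    ORDER BY videoID, ad_segment, locked DESC, votes DESC
""").show()
# Two rows come out: the overlapping pair collapses into one segment
# (the locked entry wins), plus the standalone ad at 300s.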

Exploring and Enriching the Data

Finally, the dataset is good enough to start exploring. The first thing I did was get a sense of the size of the data:

• 36.0k Unique Channels
• 552.6k Unique Videos
• 673.8k Unique Sponsor Segments, for an average of 1.22 segments per video

As mentioned earlier, filtering for segments that were either locked or had at least 1 upvote reduced the dataset massively, by around 80%. But that's the price I had to pay to get data I could work with.

To check that there was nothing immediately wrong with the data, I gathered the channels with the highest number of videos:

CREATE TABLE top_5_channels AS
SELECT
    channelID,
    count(DISTINCT unique_segments.videoID) AS video_count
FROM
    unique_segments
    LEFT JOIN video_info ON unique_segments.videoID = video_info.videoID
WHERE
    channelID IS NOT NULL
    -- Some channel IDs are blank
    AND channelID != '""'
GROUP BY
    channelID
ORDER BY
    video_count DESC
LIMIT 5

The number of videos per channel seems realistic… But this is terrible to work with. I don't want to go to my browser and look up channel IDs every time I want to know the name of a channel.

To fix this, I created a small script with functions to fetch these values from the YouTube API in Python. I'm using the requests_cache library to make sure I won't repeat API calls and deplete the API limits.

import requests
import requests_cache
from dotenv import load_dotenv
import os

load_dotenv()
API_KEY = os.getenv("YT_API_KEY")

# Cache responses indefinitely
requests_cache.install_cache("youtube_cache", expire_after=None)

def get_channel_name(channel_id: str) -> str:
    url = (
        f"https://www.googleapis.com/youtube/v3/channels"
        f"?part=snippet&id={channel_id}&key={API_KEY}"
    )
    response = requests.get(url)
    data = response.json()

    try:
        return data.get("items", [])[0].get("snippet", {}).get("title", "")
    except (IndexError, AttributeError):
        return ""

Besides this, I also created very similar functions to get the country and thumbnail of each channel, which will be useful later. If you're interested in the code, check the GitHub repo, or see the sketch just below.
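
For reference, here's roughly what the country lookup looks like (a sketch assuming the same caching setup as above; snippet.country is only present when the channel has declared one):

def get_channel_country(channel_id: str) -> str:
    url = (
        f"https://www.googleapis.com/youtube/v3/channels"
        f"?part=snippet&id={channel_id}&key={API_KEY}"
    )
    response = requests.get(url)
    data = response.json()

    try:
        # Not every channel declares a country, so default to an empty string.
        return data.get("items", [])[0].get("snippet", {}).get("country", "")
    except (IndexError, AttributeError):
        return ""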

In my DuckDB code, I can now register these Python functions and call them inside SQL! I just have to be very careful to only use them on aggregated and filtered data; otherwise, I can say bye-bye to my API quota.

# This is the script created above
from youtube_api import get_channel_name

# Try registering the function; ignore it if it already exists
try:
    con.create_function('get_channel_name', get_channel_name, [str], str)
except Exception as e:
    print(f"Skipping function registration (probably already exists): {e}")

# Get the channel names
channel_names = con.sql("""
    select
        channelID,
        get_channel_name(channelID) as channel_name,
        video_count
    from top_5_channels
""")

Much better! I looked up two channels that I'm familiar with on YouTube for a quick sanity check. Linus Tech Tips has a total of 7.2k videos uploaded, with 2.3k present in this dataset. Gamers Nexus has 3k videos, with 700 in the dataset. Looks good enough for me!

The last thing to do, before moving on to actually answering the questions I set out to answer, is to get an idea of the typical duration of the videos.

This matches my expectations, for the most part. I'm still a bit surprised by the number of 20–40-minute videos, since for many years the “meta” was to make 10-minute videos to maximize YouTube's own ads.

Also, I thought the video-duration buckets used in the previous graph were quite representative of how I think about video lengths, so I'll stick with them for the next sections.

For reference, this is the pandas code used to create these buckets.

import pandas as pd

video_lengths = con.sql("""
  SELECT DISTINCT ON (videoID)
      videoID,
      videoDuration
  FROM
      unique_segments
  WHERE
      videoID IS NOT NULL
      AND videoDuration > 0
"""
).df()

# Define custom bins, in minutes
bins = [0, 3, 7, 12, 20, 40, 90, 180, 600, 9999999]
labels = ["0-3", "3-7", "7-12", "12-20", "20-40", "40-90", "90-180", "180-600", "600+"]

# Assign each video to a bucket (transform duration to minutes)
video_lengths["duration_bucket"] = pd.cut(video_lengths["videoDuration"] / 60, bins=bins, labels=labels, right=False)

Have Sponsor Segments Increased Over the Years?

The big question. This will prove whether or not I'm being paranoid about everyone trying to sell me something all the time. I'll start, though, by answering a simpler question: what percentage of a video's runtime is sponsored, for different video durations?

My expectation is that shorter videos have a higher share of their runtime taken up by sponsors compared to longer videos. Let's check if that's actually the case.

    CREATE TABLE video_total_ads AS
    SELECT
        videoID,
        MAX(videoDuration) AS videoDuration,
        SUM(endTime - startTime) AS total_ad_duration,
        SUM(endTime - startTime) / 60 AS ad_minutes,
        SUM(endTime - startTime) / MAX(videoDuration) AS ad_percentage,
        MAX(videoDuration) / 60 AS video_duration_minutes
    FROM
        unique_segments
    WHERE
        videoDuration > 0
        AND videoDuration < 5400
        AND videoID IS NOT NULL
    GROUP BY
        videoID

To keep the visualization simple, I'm applying similar buckets, but only up to 90 minutes.

# Define duration buckets (in minutes, up to 90 min)
bins = [0, 3, 7, 12, 20, 30, 40, 60, 90]
labels = ["0-3", "3-7", "7-12", "12-20", "20-30", "30-40", "40-60", "60-90"]

video_total_ads = con.table("video_total_ads").df()

# Apply the buckets again
video_total_ads["duration_bucket"] = pd.cut(video_total_ads["videoDuration"] / 60, bins=bins, labels=labels, right=False)

# Group by bucket and sum ad times and total durations
bucket_data = video_total_ads.groupby("duration_bucket")[["ad_minutes", "videoDuration"]].sum()

# Convert to percentage of total video time
bucket_data["ad_percentage"] = (bucket_data["ad_minutes"] / (bucket_data["videoDuration"] / 60)) * 100
bucket_data["video_percentage"] = 100 - bucket_data["ad_percentage"]

As expected, if you're watching shorter-form content on YouTube, then around 10% of it is sponsored! Videos of 12–20 minutes have 6.5% sponsor time, while 20–30-minute videos have only 4.8%.

To move on to the year-by-year analysis, I need to join the sponsor times with the videoInfo table.

    CREATE TABLE video_total_ads_joined AS
    SELECT
        *
    FROM
        video_total_ads
    LEFT JOIN video_info ON video_total_ads.videoID = video_info.videoID

Next, let's check how many videos we have per year:

SELECT
    *,
    to_timestamp(NULLIF (published, 0)) AS published_date,
    extract(year FROM to_timestamp(NULLIF (published, 0))) AS published_year
FROM
    video_total_ads_joined

Not good, not good at all. I'm not exactly sure why, but there are a lot of videos that didn't have the timestamp recorded. It seems that only in 2021 and 2022 were videos reliably stored with their published date.

I do have some ideas on how to enrich this dataset with other public data, but it's a very time-consuming process, so I'll leave that for a future blog post. I don't intend to settle for an answer based on limited data, but for now, I have to make do with what I have.

I chose to keep the analysis between 2018 and 2023, given that those years had more data points.

# Limiting the years, as for these I have a decent amount of data.
start_year = 2018
end_year = 2023

plot_df = (
    con.table("video_total_ads_joined").df()
    # published_year and duration_bucket are derived as in the earlier steps
    # (the full prep code is in the GitHub repo)
    .query(f"published_year >= {start_year} and published_year <= {end_year}")
    .groupby(["published_year", "duration_bucket"], as_index=False)
    [["ad_minutes", "video_duration_minutes"]]
    .sum()
)

# Calculate ad_percentage & content_percentage
plot_df["ad_percentage"] = (
    plot_df["ad_minutes"] / plot_df["video_duration_minutes"] * 100
)
plot_df["content_percentage"] = 100 - plot_df["ad_percentage"]

There's a steep increase in ad percentage, especially from 2020 to 2021, but afterwards it plateaus, especially for longer videos. This makes a lot of sense, since during those years online advertising grew considerably as people spent more and more time at home.

For shorter videos, there does seem to be an increase from 2022 to 2023. But as the data is limited, and I don't have data for 2024, I can't get a conclusive answer here.

Next up, let's move on to questions that don't depend on the publishing date; this way I can work with a larger portion of the dataset.

Which Channels Have the Highest Percentage of Sponsor Time per Video?

This is a fun one for me, as I wonder whether the channels I actively watch are the ones that run the most ads.

Continuing from the table created previously, I can easily group the ad and video amounts by channel:

CREATE TABLE ad_percentage_per_channel AS
SELECT
    channelID,
    sum(ad_minutes) AS channel_total_ad_minutes,
    sum(videoDuration) / 60 AS channel_total_video_minutes,
    sum(ad_minutes) / (sum(videoDuration) / 60) * 100 AS channel_ad_percentage
FROM
    video_total_ads_joined
GROUP BY
    channelID

I decided to filter for channels that had at least 30 hours of video in the data, as a way of eliminating outliers.

    SELECT
        channelID,
        channel_total_video_minutes,
        channel_total_ad_minutes,
        channel_ad_percentage
    FROM
        ad_percentage_per_channel
    WHERE
    -- At least 30 hours of video
        channel_total_video_minutes > 1800
        AND channelID IS NOT NULL
    ORDER BY
        channel_ad_percentage DESC
    LIMIT 50

As quickly mentioned earlier, I also created functions to get the country and thumbnail of each channel. This allowed me to create this visualization.

I'm not sure whether this surprised me or not. Some of the channels on this list I watch very frequently, especially Gaveta (#31), a Brazilian YouTuber who covers movies and film editing.

I also know that both he and Corridor Crew (#32) do a lot of self-sponsorship, promoting their own content and products, so maybe that's also the case for other channels!

In any case, the data looks good, and the percentages seem to match my manual checks and personal experience.

I'd love to know whether channels you watch are present on this list, and whether that surprises you or not!

If you want to see the Top 150 creators, subscribe to my free newsletter, as I'll be publishing the full list, as well as more details about this analysis, there!

What's the Density of Sponsor Segments Throughout a Video?

Have you ever thought about which point of a video ads work best at? People probably just skip sponsor segments placed at the beginning, and simply close the video when they hit one placed at the end.

From personal experience, I feel that I'm more likely to watch an ad if it plays around the middle of a video, but I don't think that's what creators do most often.

My goal, then, is to create a heatmap that shows the density of ads across a video's runtime. Doing this was surprisingly non-obvious, and the solution I found was so clever it kind of blew my mind. Let me show you.

This is the data needed for this analysis. One row per ad, with the timestamps where each segment starts and ends:

The first step is to normalize the intervals. E.g., I don't care that an ad started at 63s; what I want to know is whether it started at 1% of the video's runtime or at 50% of it.

    CREATE TABLE ad_intervals AS
    SELECT
        videoID,
        startTime,
        endTime,
        videoDuration,
        startTime / videoDuration AS start_fraction,
        endTime / videoDuration AS end_fraction
    FROM
        unique_segments
    WHERE
    -- Just to make sure we don't have bad data
        videoID IS NOT NULL
        AND startTime >= 0
        AND endTime <= videoDuration
        AND startTime < endTime
    -- Less than 40h
        AND videoDuration < 144000

Great, now all intervals are comparable, but the problem is far from solved.

I want you to think: how would you solve this? What if I asked you, “At 10% of the runtime, across all videos, how many ads are running?”

I don't believe this is an obvious problem to solve. My first instinct was to create a bunch of buckets and then, for each row, ask: “Is there an ad running at 1% of the runtime? What about at 2%? And so on…”

This seemed like a terrible idea, though. I wouldn't be able to do it in SQL, and the code to solve it would be incredibly messy. In the end, the solution I found was remarkably simple to implement, using the Sweep Line Algorithm, an algorithm often used in programming interviews and puzzles.

I'll show you how I solved it, but don't worry if you don't understand what's happening; I'll share other resources for learning more about it afterwards.

The first thing to do is to transform each interval (startTime, endTime) into two events: one that counts as +1 when the ad starts, and another that counts as -1 when the ad finishes. Afterwards, just order the dataset by the start time.

CREATE TABLE ad_events AS
WITH unioned as (
  -- This is the most important step.
  SELECT
      videoID,
      start_fraction as fraction,
      1 as delta
  FROM ad_intervals
  UNION ALL
  SELECT
      videoID,
      end_fraction as fraction,
      -1 as delta
  FROM ad_intervals
), ordered AS (
  SELECT
      videoID,
      fraction,
      delta
  FROM unioned
  ORDER BY fraction, delta
)
SELECT * FROM ordered

Now it's already much easier to see the path forward! All I have to do is run a cumulative sum over the delta column, and then, at any point in the dataset, I know how many ads are running!

For example, if between 0s and 10s three ads started but two of them also finished, I would have deltas summing to +3 and then -2, which means only one ad is currently running!
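
If the trick still feels abstract, here's a tiny pure-Python version of the same sweep, on made-up intervals:

# Toy normalized ad intervals: (start_fraction, end_fraction)
intervals = [(0.0, 0.1), (0.05, 0.15), (0.9, 1.0)]

# Each interval becomes a +1 event at its start and a -1 event at its end;
# sorting tuples orders by fraction first, then delta (so -1 comes before +1 on ties).
events = sorted([(s, +1) for s, _ in intervals] + [(e, -1) for _, e in intervals])

running = 0
for fraction, delta in events:
    running += delta
    print(f"at {fraction:.2f} of the runtime: {running} ad(s) running")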

Going forward, and to simplify the data a bit, I first round the fractions to 4 decimal places and aggregate them. This isn't strictly necessary, but having too many rows was a problem when plotting the data. Finally, I divide the number of running ads by the total number of videos, to express it as a percentage.

CREATE TABLE ad_counter AS
WITH rounded_and_grouped AS (
  SELECT
      ROUND(fraction, 4) as fraction,
      SUM(delta) as delta
  FROM ad_events
  GROUP BY ROUND(fraction, 4)
  ORDER BY fraction
), running_sum AS (
  SELECT
      fraction,
      SUM(delta) OVER (ORDER BY fraction) as ad_counter
  FROM rounded_and_grouped
), density AS (
  SELECT
      fraction,
      ad_counter,
      ad_counter / (SELECT COUNT(DISTINCT videoID) FROM ad_intervals) as density
  FROM running_sum
)
SELECT * FROM density
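
The numbers quoted below can be read straight off the first row of this table (a quick check, using the ad_counter table just created):

con.sql("SELECT * FROM ad_counter ORDER BY fraction LIMIT 1")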

With this data, not only do I know that at the very start of videos (fraction 0.0) there are 69,987 videos running ads, I also know that this represents 17% of all videos in the dataset.

Now I can finally plot it as a heatmap:

As expected, the bumps at the extremities show that it's much more common for channels to run ads at the beginning and end of a video. It's also interesting that there's a plateau around the middle of the video, and then a drop, as the second half of a video is generally more ad-free.

What I found funny is that it's apparently common for some videos to start right away with an ad. I couldn't picture this, so I manually checked 10 videos, and it's actually true… I'm not sure how representative this is, but most of the ones I opened were gaming-related and in Russian, and they started immediately with ads!

Before we move on to the conclusions, what did you think of the solution to this problem? I was surprised at how simple the Sweep Line trick made it. If you want to learn more about it, I recently published a blog post covering some SQL patterns, and the last one is exactly this problem, just repackaged in the context of counting concurrent meetings.

    Conclusion

I really enjoyed doing this analysis, since the data feels very personal to me, especially because I've been hooked on YouTube lately. I also feel that the answers I found were quite satisfying, at least for the most part. To finish off, let's do one last recap!

Have Sponsor Segments Increased Over the Years?

There was a clear increase from 2020 to 2021. This was an effect that happened across all digital media, and it shows clearly in this data. For more recent years, I can't say whether there was an increase or not, as I don't have enough data to be confident.

Which Channels Have the Highest Percentage of Sponsor Time per Video?

I got to build a very convincing list of the Top 50 channels that run the highest amount of ads. And I discovered that some of my favorite creators are the ones spending the most time trying to sell me something!

What's the Density of Sponsor Segments Throughout a Video?

As expected, most people run ads at the beginning and end of their videos. Besides that, a lot of creators run ads around the middle of the video, making the second half slightly more ad-free.

Also, there are YouTubers who start a video immediately with ads, which I think is a crazy strategy.

Other Learnings and Next Steps

I liked how clearly the data showed the percentage of ads across different video sizes. Now I know that I'm probably spending 5–6% of my YouTube time watching ads if I don't skip them, since I mostly watch videos that are 10–20 minutes long.

I'm still not entirely happy with the year-by-year analysis, though. I've already looked into other data and downloaded more than 100 GB of YouTube metadata datasets. I'm confident I can use it, together with the YouTube API, to fill some gaps and get a more convincing answer to my question.

    Visualization Code

You might have noticed that I didn't show snippets for plotting the charts in this post. That was on purpose, to keep the post readable, since matplotlib code takes up a lot of space.

You can find all of the code in my GitHub repo; that way you can copy my charts if you want to.


That's it for this one! I really hope you enjoyed reading this blog post and learned something new!

If you're curious about interesting topics that didn't make it into this post, or enjoy reading about data, subscribe to my free newsletter on Substack. I publish whenever I have something genuinely interesting to share.

Want to connect directly or have questions? Reach out anytime at mtrentz.com.

All images and animations by the author unless stated otherwise.


