
    Measuring Cross-Product Adoption Using dbt_set_similarity | by Matthew Senick | Dec, 2024

    By Team_AIBS News | December 28, 2024 | 5 Mins Read


    Enhancing cross-product insights inside dbt workflows

    Towards Data Science

    For multi-product companies, one critical metric is often what is called "cross-product adoption" (i.e., understanding how users engage with multiple offerings in a given product portfolio).

    One measure suggested for calculating cross-product or cross-feature usage in the popular book Hacking Growth [1] is the Jaccard Index. Traditionally used to measure the similarity between two sets, the Jaccard Index can also serve as a powerful tool for assessing product adoption patterns. By quantifying the overlap in users between products, it helps you identify cross-product synergies and growth opportunities.

    The dbt package dbt_set_similarity is designed to simplify the calculation of set similarity metrics directly within an analytics workflow. It provides a method for calculating Jaccard indices within SQL transformation workloads.

    To import this package into your dbt project, add the following to the packages.yml file. We will also need dbt_utils for this article's example. Then run the dbt deps command within your project to install the packages.

    packages:
      - package: Matts52/dbt_set_similarity
        version: 0.1.1
      - package: dbt-labs/dbt_utils
        version: 1.3.0

    The Jaccard Index, also known as the Jaccard Similarity Coefficient, is a metric used to measure the similarity between two sets. It is defined as the size of the intersection of the sets divided by the size of their union.

    Mathematically, it may be expressed as:

    The Jaccard Index represents the "Intersection" over the "Union" of two sets (image by author)

    Where:

    • A and B are two sets (e.g., users of product A and users of product B)
    • The numerator represents the number of elements in both sets
    • The denominator represents the total number of distinct elements across both sets
    (image by author)
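
In symbols, the definition above reads as follows. The second form uses inclusion-exclusion to rewrite the size of the union, which is the form used when only the adoption counts are known:

```latex
J(A, B) = \frac{|A \cap B|}{|A \cup B|} = \frac{|A \cap B|}{|A| + |B| - |A \cap B|}
```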

    The Jaccard Index is particularly useful in the context of cross-product adoption because:

    • It focuses on the overlap between two sets, making it ideal for understanding shared user bases
    • It accounts for differences in the total size of the sets, ensuring that results are proportional and not skewed by outliers

    For example:

    • If 100 users adopt Product A and 50 adopt Product B, with 25 users adopting both, the Jaccard Index is 25 / (100 + 50 - 25) = 25 / 125 = 0.2, indicating a 20% overlap between the two user bases.
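
The arithmetic in this example can be checked with a few lines of Python. This is only an illustrative sketch: the helper name jaccard_from_counts is made up for this article, and returning 0.0 for two empty sets is an assumed convention, not something the package is known to do.

```python
def jaccard_from_counts(size_a: int, size_b: int, size_both: int) -> float:
    """Jaccard index from set sizes: |A and B| / (|A| + |B| - |A and B|)."""
    # Inclusion-exclusion gives the number of distinct users across both products.
    union_size = size_a + size_b - size_both
    if union_size == 0:
        return 0.0  # assumed convention for two empty sets
    return size_both / union_size


# 100 users on Product A, 50 on Product B, 25 on both.
overlap = jaccard_from_counts(100, 50, 25)
print(f"{overlap:.2f}")  # 0.20
```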

    The example dataset we will be using comes from a fictional SaaS company which offers storage space as a product for consumers. This company provides two distinct storage products: document storage (doc_storage) and photo storage (photo_storage). Each of these columns is either true, indicating the product has been adopted, or false, indicating the product has not been adopted.

    Additionally, the demographics (user_category) that this company serves are either tech fanatics or owners.

    For the sake of this example, we will read this csv file in as a "seed" model named seed_example within the dbt project.

    Now, let's say we want to calculate the Jaccard index (cross-adoption) between our document storage and photo storage products. First, we need to create an array (list) of the users who have the document storage product, alongside an array of the users who have the photo storage product. Then, in the final select, we apply the jaccard_coef function from the dbt_set_similarity package to compute the Jaccard coefficient between the two arrays of user ids.

    with product_users as (
        -- collect the ids of the users who adopted each product into arrays
        select
            array_agg(user_id) filter (where doc_storage = true)
                as doc_storage_users,
            array_agg(user_id) filter (where photo_storage = true)
                as photo_storage_users
        from {{ ref('seed_example') }}
    )

    -- compute the Jaccard coefficient between the two user id arrays
    select
        doc_storage_users,
        photo_storage_users,
        {{
            dbt_set_similarity.jaccard_coef(
                'doc_storage_users',
                'photo_storage_users'
            )
        }} as cross_product_jaccard_coef
    from product_users

    Output from the above dbt model (image by author)

    The result shows that just over half (60%) of the users who have adopted either product have adopted both. We can graphically verify our result by placing the user id sets into a Venn diagram, where we see that three users have adopted both products, among five total users: 3/5 = 0.6.

    What the collection of user ids and product adoption would look like, verifying our result (image by author)
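
The same check can be sketched outside dbt with Python sets. The user ids below are hypothetical, chosen only so that three of five users hold both products as in the result above; they are not the contents of the seed file, and the jaccard helper is an illustration rather than the package's implementation.

```python
def jaccard(a: set, b: set) -> float:
    """Size of the intersection divided by the size of the union."""
    union = a | b
    if not union:
        return 0.0  # assumed convention for two empty sets
    return len(a & b) / len(union)


# Hypothetical user ids: users 1, 2 and 3 hold both products.
doc_storage_users = {1, 2, 3, 4}
photo_storage_users = {1, 2, 3, 5}

print(jaccard(doc_storage_users, photo_storage_users))  # 0.6
```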

    Using the dbt_set_similarity package, creating segmented Jaccard indices for our different user categories is fairly natural. We will follow the same pattern as before; however, we will group our aggregations by the user category that a user belongs to.

    with product_users as (
        -- build the per-product user id arrays once per user category
        select
            user_category,
            array_agg(user_id) filter (where doc_storage = true)
                as doc_storage_users,
            array_agg(user_id) filter (where photo_storage = true)
                as photo_storage_users
        from {{ ref('seed_example') }}
        group by user_category
    )

    -- one Jaccard coefficient per user category
    select
        user_category,
        doc_storage_users,
        photo_storage_users,
        {{
            dbt_set_similarity.jaccard_coef(
                'doc_storage_users',
                'photo_storage_users'
            )
        }} as cross_product_jaccard_coef
    from product_users

    Output from the above dbt model (image by author)

    We can see from the data that cross-product adoption, as measured by the Jaccard index, is higher among owners. As shown in the output, all owners who have adopted one of the products have adopted both. Meanwhile, only one-third of the tech fanatics who have adopted one product have adopted both. Thus, in our very small dataset, cross-product adoption is higher among owners than among tech fanatics.

    We can graphically verify the output by again creating Venn diagrams:

    Venn diagrams split by the two segments (image by author)
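
As a rough cross-check of the segmented result, the grouping logic can be mimicked in plain Python. Everything below is an assumption made for illustration: the five rows and the category labels are hypothetical values chosen to be consistent with the figures reported above (1.0 for owners, one-third for tech fanatics), and segmented_jaccard is not part of dbt_set_similarity.

```python
from collections import defaultdict

# Hypothetical rows: (user_id, user_category, doc_storage, photo_storage)
rows = [
    (1, "owner", True, True),
    (2, "owner", True, True),
    (3, "tech fanatic", True, True),
    (4, "tech fanatic", True, False),
    (5, "tech fanatic", False, True),
]


def jaccard(a: set, b: set) -> float:
    """Size of the intersection divided by the size of the union."""
    union = a | b
    if not union:
        return 0.0  # assumed convention for two empty sets
    return len(a & b) / len(union)


def segmented_jaccard(rows):
    """Group user ids by category (like GROUP BY), then score each group."""
    doc_users = defaultdict(set)
    photo_users = defaultdict(set)
    for user_id, category, has_doc, has_photo in rows:
        if has_doc:
            doc_users[category].add(user_id)
        if has_photo:
            photo_users[category].add(user_id)
    categories = set(doc_users) | set(photo_users)
    return {c: jaccard(doc_users[c], photo_users[c]) for c in categories}


result = segmented_jaccard(rows)
for category in sorted(result):
    print(category, round(result[category], 3))
```

With these rows, owners score 1.0 because both owners hold both products, while tech fanatics score 1/3 because only one of the three adopters holds both.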

    dbt_set_similarity provides a simple and efficient way to calculate cross-product adoption metrics such as the Jaccard Index directly within a dbt workflow. By applying this method, multi-product companies can gain valuable insights into user behavior and adoption patterns across their product portfolio. In our example, we demonstrated the calculation of overall cross-product adoption as well as segmented adoption for distinct user categories.

    Using the package for cross-product adoption is only one straightforward application. In reality, there are many other potential applications of this technique, for example:

    • Feature usage analysis
    • Marketing campaign impact analysis
    • Support analysis

    Additionally, this style of analysis is certainly not limited to SaaS, but can apply to virtually any industry. Happy Jaccard-ing!

    References

    [1] Sean Ellis and Morgan Brown, Hacking Growth (2017)

    Resources

    dbt package hub


