
    Decision Trees: How They Split the Data | by Jim Canary | Jan, 2025



Now let’s walk through a step-by-step explanation, along with Python code to train, visualize, and interpret a simple decision tree.

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.datasets import make_classification
    from sklearn.tree import DecisionTreeClassifier, plot_tree
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
    import seaborn as sns

    # For reproducibility
    np.random.seed(42)

We’ll create a synthetic dataset for binary classification.

# Create a toy dataset with 2 features and a binary label
X, y = make_classification(
    n_samples=200,
    n_features=2,
    n_informative=2,
    n_redundant=0,
    n_clusters_per_class=1,
    random_state=42
)

plt.figure(figsize=(6,4))
sns.scatterplot(x=X[:,0], y=X[:,1], hue=y, palette='coolwarm', edgecolor='k')
plt.title("Synthetic Data for Decision Tree Demo")
plt.show()

1. X: Has two features (X[:, 0] and X[:, 1]).
    2. y: A binary label (0 or 1).

We split the data into training (80%) and test (20%) sets to evaluate generalization.

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Initialize the Decision Tree
dt_model = DecisionTreeClassifier(
    criterion='gini',  # or 'entropy'
    max_depth=3,       # limit the tree depth
    random_state=42
)

# Fit the model on the training data
dt_model.fit(X_train, y_train)

# Predict on the test data
y_pred = dt_model.predict(X_test)

1. criterion: We use gini here, but you can switch to entropy; a quick sketch comparing the two measures follows below.
2. max_depth: Prevents the tree from growing too deep (a form of pre-pruning).
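
To make the criterion choice concrete, here is a minimal sketch (not from the original post) of how Gini impurity and entropy score a node’s class counts; both are smallest for a pure node and largest for a 50/50 split.

def gini_impurity(counts):
    # 1 - sum(p^2) over class proportions
    p = np.asarray(counts) / np.sum(counts)
    return 1.0 - np.sum(p ** 2)

def entropy_impurity(counts):
    # -sum(p * log2(p)), skipping empty classes to avoid log(0)
    p = np.asarray(counts) / np.sum(counts)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

print(gini_impurity([10, 0]), entropy_impurity([10, 0]))  # pure node: both ~0
print(gini_impurity([5, 5]), entropy_impurity([5, 5]))    # 50/50 node: 0.5 and 1.0

Back to the trained model, let’s evaluate it on the held-out test set.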
    # Accuracy
    accuracy = accuracy_score(y_test, y_pred)
    print(f"Accuracy: {accuracy:.2f}")

    # Confusion Matrix
    cm = confusion_matrix(y_test, y_pred)
    print("Confusion Matrix:n", cm)

    # Classification Report
    print("Classification Report:")
    print(classification_report(y_test, y_pred))

>>> Accuracy: 0.82

>>> Confusion Matrix:
[[16  7]
 [ 0 17]]

>>> Classification Report:
              precision    recall  f1-score   support

           0       1.00      0.70      0.82        23
           1       0.71      1.00      0.83        17

    accuracy                           0.82        40
   macro avg       0.85      0.85      0.82        40
weighted avg       0.88      0.82      0.82        40

Scikit-learn provides a handy plot_tree function. Larger trees can be visually cluttered, but we set a max depth of 3 to keep the plot manageable.

plt.figure(figsize=(12, 8))
plot_tree(
    dt_model,
    filled=True,
    feature_names=["Feature_1", "Feature_2"],
    class_names=["Class 0", "Class 1"]
)
plt.title("Decision Tree Visualization")
plt.show()
• Rectangles represent nodes, showing how the data splits.
• Conditions (like Feature_1 <= 0.05) define the branches.
• Samples shows how many data points fall into each node.
• Values shows how many data points belong to each class.
• Gini or Entropy reflects how pure (or impure) the node is; a plain-text view of the same splits is sketched just after this list.
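
If a rendered plot is inconvenient, scikit-learn’s export_text can print the same splits as indented plain text. A minimal sketch, reusing dt_model and the feature names from above:

from sklearn.tree import export_text

# Dump the fitted tree's decision rules as indented text
print(export_text(dt_model, feature_names=["Feature_1", "Feature_2"]))

A few closing takeaways: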
1. Overfitting: Without constraints (max_depth, min_samples_split, etc.), trees tend to grow very large, memorizing the training data; the sketch after this list illustrates the effect.
2. Ensembles: Popular methods like Random Forest or Gradient Boosted Trees build multiple trees to get more robust, accurate predictions.
3. Interpretability: Decision trees are often praised for how easy they are to interpret compared to black-box models like deep neural networks.
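
As a rough illustration of points 1 and 2, here is a sketch (not from the original post) comparing an unconstrained tree against a random forest on the same train/test split; the exact scores will depend on the data.

from sklearn.ensemble import RandomForestClassifier

# An unconstrained tree can fit the training set almost perfectly...
deep_tree = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)
print("Deep tree     train/test:",
      deep_tree.score(X_train, y_train), deep_tree.score(X_test, y_test))

# ...while averaging many randomized trees usually generalizes better
rf = RandomForestClassifier(n_estimators=100, random_state=42).fit(X_train, y_train)
print("Random forest train/test:",
      rf.score(X_train, y_train), rf.score(X_test, y_test))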


