
    Create Simple Transformers, Step by Step Process and Explanation | by Vishnuam | Feb, 2025

    By Team_AIBS News · February 27, 2025 · 3 Mins Read


    I’ll walk you through creating a next-word prediction model using a Transformer.

    import tensorflow as tf
    from tensorflow.keras.layers import Embedding, LSTM, Dense
    from tensorflow.keras.preprocessing.text import Tokenizer
    from tensorflow.keras.preprocessing.sequence import pad_sequences
    import numpy as np

    Explanation:

    • tensorflow → Deep learning library.
    • Embedding → Converts words into numerical vectors.
    • LSTM → Long Short-Term Memory, used to remember sequences (imported here, though the model below uses a Transformer block instead).
    • Dense → Fully connected layer for the output.
    • Tokenizer → Converts text into tokens (numbers).
    • pad_sequences → Ensures all sequences have the same length (see the short sketch after this list).
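
    As a quick illustration of the padding step (my addition, not from the original article, assuming only the imports above), pad_sequences left-pads shorter sequences with zeros so every row ends up the same length:

    demo = pad_sequences([[1, 2], [1, 2, 3]], maxlen=3, padding='pre')
    print(demo)
    # [[0 1 2]
    #  [1 2 3]]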
    text = """The quick brown fox jumps over the lazy dog The quick brown fox is very fast"""
    tokenizer = Tokenizer()
    tokenizer.fit_on_texts([text])
    total_words = len(tokenizer.word_index) + 1  # Adding 1 for padding

    Explanation:

    • We use a small sample text.
    • The Tokenizer assigns a unique number to each word.
    • word_index stores the word-to-number mappings.
    • total_words stores the vocabulary size.
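
    To see the mapping concretely (a small check I have added, not part of the original article), print the tokenizer's vocabulary:

    print(tokenizer.word_index)  # e.g. {'the': 1, 'quick': 2, 'brown': 3, 'fox': 4, ...}
    print(total_words)           # vocabulary size plus 1 for padding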
    input_sequences = []
    for line in text.split("."):  # Splitting sentences (for real datasets, use the full text)
        token_list = tokenizer.texts_to_sequences([line])[0]  # Convert words to numbers
        for i in range(1, len(token_list)):
            n_gram_sequence = token_list[:i+1]  # Creating n-gram sequences
            input_sequences.append(n_gram_sequence)

    # Padding sequences to make them the same length
    max_seq_length = max(len(seq) for seq in input_sequences)
    input_sequences = pad_sequences(input_sequences, maxlen=max_seq_length, padding='pre')
    X, y = input_sequences[:, :-1], input_sequences[:, -1]  # Splitting into inputs and labels
    y = tf.keras.utils.to_categorical(y, num_classes=total_words)  # Convert labels to one-hot vectors

    Explanation:

    • We convert the text into n-gram sequences (progressively longer prefixes).
    • Example: "The quick brown" → a short list of token ids such as [1, 2, 3] (see the decoding sketch after this list).
    • Sequences are padded to ensure equal lengths.
    • X (the features) contains the words before the last one.
    • y (the label) is the next word to predict.
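
    To make the n-gram sequences concrete (my addition, assuming only the variables defined above), you can decode a few of the padded rows back into words:

    index_word = {index: word for word, index in tokenizer.word_index.items()}
    for seq in input_sequences[:3]:
        print([index_word[i] for i in seq if i != 0])
    # Each row is a progressively longer prefix, e.g. ['the', 'quick'], ['the', 'quick', 'brown'], ...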
    from tensorflow.keras.layers import MultiHeadAttention, LayerNormalization, Dropout

    class TransformerBlock(tf.keras.layers.Layer):
        def __init__(self, embed_dim, num_heads, ff_dim, rate=0.1):
            super(TransformerBlock, self).__init__()
            self.att = MultiHeadAttention(num_heads=num_heads, key_dim=embed_dim)
            self.ffn = tf.keras.Sequential([
                Dense(ff_dim, activation="relu"),
                Dense(embed_dim),
            ])
            self.layernorm1 = LayerNormalization(epsilon=1e-6)
            self.layernorm2 = LayerNormalization(epsilon=1e-6)
            self.dropout1 = Dropout(rate)
            self.dropout2 = Dropout(rate)

        def call(self, inputs, training=False):
            attn_output = self.att(inputs, inputs)                # self-attention over the input sequence
            attn_output = self.dropout1(attn_output, training=training)
            out1 = self.layernorm1(inputs + attn_output)          # residual connection + layer norm
            ffn_output = self.ffn(out1)
            ffn_output = self.dropout2(ffn_output, training=training)
            return self.layernorm2(out1 + ffn_output)             # second residual connection + layer norm

    Explanation:

    • This is the Transformer encoder block.
    • MultiHeadAttention lets the model focus on different words in the sequence.
    • LayerNormalization keeps training stable.
    • Dropout prevents overfitting.
    • The attention output is added back to the input (a residual connection) so the model learns relationships between words.
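
    As a quick sanity check (my addition, not from the article), you can run the block on a random batch and confirm it keeps the (batch, sequence, embedding) shape:

    block = TransformerBlock(embed_dim=64, num_heads=2, ff_dim=128)
    dummy = tf.random.uniform((1, 5, 64))      # batch of 1, sequence length 5, embedding size 64
    print(block(dummy, training=False).shape)  # expected: (1, 5, 64)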
    embed_dim = 64   # Word embedding size
    num_heads = 2    # Number of attention heads
    ff_dim = 128     # Hidden layer size

    inputs = tf.keras.layers.Input(shape=(max_seq_length-1,))
    embedding_layer = Embedding(total_words, embed_dim, input_length=max_seq_length-1)(inputs)
    transformer_block = TransformerBlock(embed_dim, num_heads, ff_dim)(embedding_layer)
    flatten = tf.keras.layers.Flatten()(transformer_block)
    output = Dense(total_words, activation="softmax")(flatten)

    model = tf.keras.Model(inputs=inputs, outputs=output)
    model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
    model.summary()

    Explanation:

    • Embedding converts words into dense vectors.
    • TransformerBlock processes the embedded sequences.
    • Flatten converts the multi-dimensional output into a single vector.
    • The Dense output layer predicts the next word.
    • softmax gives probability scores over all words in the vocabulary (see the shape check below).
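
    A small shape check (not in the original post): pushing a dummy batch through the model should give one probability per word in the vocabulary.

    dummy_probs = model.predict(np.zeros((1, max_seq_length - 1)))
    print(dummy_probs.shape)  # expected: (1, total_words), one softmax probability per word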
    model.fit(X, y, epochs=50, verbose=1)

    Explanation:

    • We train the model using the categorical_crossentropy loss.
    • epochs=50 runs the training loop 50 times over the data (see the optional convergence check below).
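
    If you capture the History object that fit returns, you can confirm that the loss is actually dropping (an optional variant of the call above, my addition):

    history = model.fit(X, y, epochs=50, verbose=0)  # same training call, just silenced
    print("final loss:", round(history.history["loss"][-1], 4))
    print("final accuracy:", round(history.history["accuracy"][-1], 4))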
    def predict_next_word(seed_text, tokenizer, max_seq_length, model):
        token_list = tokenizer.texts_to_sequences([seed_text])[0]
        token_list = pad_sequences([token_list], maxlen=max_seq_length-1, padding='pre')
        predicted_probs = model.predict(token_list)
        predicted_word_index = np.argmax(predicted_probs)

        for word, index in tokenizer.word_index.items():
            if index == predicted_word_index:
                return word
        return ""

    # Example usage:
    seed_text = "The quick brown"
    next_word = predict_next_word(seed_text, tokenizer, max_seq_length, model)
    print(f"Predicted next word: {next_word}")

    Explanation:

    • Converts the input seed text into tokens.
    • Pads it to match the training sequence length.
    • Predicts the word with the highest probability.
    • Converts the predicted index back into a word (see the generation sketch after this list).
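
    Building on the helper above, here is a small sketch (my addition, using only the functions already defined) that generates several words in a row by feeding each prediction back into the seed text:

    def generate_text(seed_text, n_words, tokenizer, max_seq_length, model):
        # Repeatedly predict the next word and append it to the running text.
        for _ in range(n_words):
            next_word = predict_next_word(seed_text, tokenizer, max_seq_length, model)
            if not next_word:  # stop if the index could not be mapped back to a word
                break
            seed_text += " " + next_word
        return seed_text

    print(generate_text("The quick brown", 3, tokenizer, max_seq_length, model))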

    If the training was effective, running:

    print(predict_next_word("The quick brown", tokenizer, max_seq_length, model))

    might output:

    Predicted next word: fox

    To recap the whole process:

    1. Preprocess the text → Tokenization, n-gram sequences, padding.
    2. Build the Transformer model → Embedding, attention, dense layers.
    3. Train the model → It learns to predict the next word.
    4. Generate predictions → Use the trained weights to generate text.


