NLP (and especially LLMs) is packed with jargon; it can feel like learning a whole new dialect of AI 😄
Here's a fun and memorable way to tackle it:
💡 Technique: The "3-Level Memory Hook"
We'll use Associations, Analogies, and Acronyms to memorize complex terms.
📦 1. Group Terms by Theme (Contextual Chunking)
Instead of memorizing random terms, group them:
🧠 Language Modeling & Tokenization
• Token: smallest unit (like a word or sub-word)
• Vocabulary: list of all possible tokens
• Corpus: large collection of text
• N-gram: sequence of N tokens
• Perplexity: how "surprised" the model is by the actual next word
🧠 Analogy:
Corpus = the novel, Tokens = the words in it, Vocabulary = the dictionary, N-gram = a phrase, Perplexity = the reader's confusion
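If it helps to anchor these terms, here's a tiny Python sketch: the corpus is a single toy sentence and the probabilities are made up purely to show what each term points at.

```python
# Toy illustration of token, vocabulary, n-gram, and perplexity (not a real LM).
import math

corpus = "the cat sat on the mat"        # Corpus: our (tiny) collection of text
tokens = corpus.split()                   # Tokens: here, simple whitespace words
vocabulary = sorted(set(tokens))          # Vocabulary: all distinct tokens
bigrams = list(zip(tokens, tokens[1:]))   # N-grams with N = 2

# Pretend a model assigned these probabilities to each actual next token;
# the numbers are invented just to demonstrate the perplexity formula.
predicted_probs = [0.5, 0.1, 0.2, 0.4, 0.5]
avg_log_prob = sum(math.log(p) for p in predicted_probs) / len(predicted_probs)
perplexity = math.exp(-avg_log_prob)      # low = confident, high = "surprised"

print(vocabulary)    # ['cat', 'mat', 'on', 'sat', 'the']
print(bigrams[:2])   # [('the', 'cat'), ('cat', 'sat')]
print(round(perplexity, 2))
```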
🧠 Word Embeddings
• Word2Vec: Predict neighbors (CBOW & Skip-gram)
• GloVe: Counts + matrix factorization
• fastText: Sub-word embeddings
• Cosine similarity: how close two words are in meaning
💡 Mnemonic:
WGF → "Words Get Friendly" (Word2Vec, GloVe, fastText): they all build vector friendships.
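A quick way to feel cosine similarity is to compute it on a few made-up vectors; real Word2Vec/GloVe/fastText embeddings work the same way, just with far more dimensions.

```python
# Cosine similarity on toy 3-d "embeddings" (values are invented for illustration).
import numpy as np

embeddings = {
    "king":  np.array([0.8, 0.6, 0.1]),
    "queen": np.array([0.7, 0.7, 0.1]),
    "apple": np.array([0.1, 0.2, 0.9]),
}

def cosine_similarity(a, b):
    # 1.0 = pointing the same way (similar meaning), near 0 = unrelated
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(embeddings["king"], embeddings["queen"]))  # high
print(cosine_similarity(embeddings["king"], embeddings["apple"]))  # low
```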
🧠 Model Architectures
• RNN, LSTM, GRU: handle sequences
• Transformer: no recurrence, just attention
• Self-Attention: each word attends to the others in the same sentence
• Multi-Head Attention: look from several "viewpoints"
💡 Analogy:
LSTM = a note-taker with memory,
Transformer = a group chat where everyone reads everyone else's messages
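The "group chat" sticks even better once you've seen it in code. Here's a bare-bones single-head self-attention sketch with random toy weights: no training, just the mechanic of every token mixing in information from every other token.

```python
# Minimal single-head self-attention in NumPy (toy sizes, random weights).
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8                      # 4 tokens, 8-dim embeddings
x = rng.normal(size=(seq_len, d_model))      # pretend token embeddings

W_q = rng.normal(size=(d_model, d_model))
W_k = rng.normal(size=(d_model, d_model))
W_v = rng.normal(size=(d_model, d_model))

Q, K, V = x @ W_q, x @ W_k, x @ W_v
scores = Q @ K.T / np.sqrt(d_model)          # how much each token "looks at" the others

# Row-wise softmax (numerically stable) turns scores into attention weights.
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)

output = weights @ V                         # each row is a weighted mix of all tokens
print(weights.shape, output.shape)           # (4, 4) (4, 8)
```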
🧠 Decoding & Generation
• Greedy decoding: pick the most probable next word
• Beam search: keep the top-k sequences
• Top-k sampling: sample from the k most likely tokens
• Top-p (nucleus) sampling: sample from the top probability mass
🧠 Acronym Trick: GBK-P
Greedy, Beam, Top-k, Top-p
Think "Getting Better K-Pop" as you explore decoding 😄
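The sampling strategies are easy to demo on a single made-up next-token distribution (beam search is left out here because it needs a loop over whole candidate sequences):

```python
# Greedy, top-k, and top-p selection over one invented next-token distribution.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "cat", "dog", "sat", "ran"]
probs = np.array([0.40, 0.25, 0.20, 0.10, 0.05])   # made-up model output

# Greedy: always take the single most probable token.
greedy = vocab[int(np.argmax(probs))]

# Top-k (k=3): keep the 3 most likely tokens, renormalize, then sample.
k = 3
top_k_idx = np.argsort(probs)[-k:]
top_k_probs = probs[top_k_idx] / probs[top_k_idx].sum()
top_k = vocab[rng.choice(top_k_idx, p=top_k_probs)]

# Top-p (p=0.8): keep the smallest set of tokens whose probability mass
# reaches 0.8, renormalize, then sample from that "nucleus".
order = np.argsort(probs)[::-1]
cutoff = np.searchsorted(np.cumsum(probs[order]), 0.8) + 1
nucleus = order[:cutoff]
top_p_probs = probs[nucleus] / probs[nucleus].sum()
top_p = vocab[rng.choice(nucleus, p=top_p_probs)]

print(greedy, top_k, top_p)
```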
🧠 Fine-tuning & Training
• ELMo: Deep contextual embeddings (uses LSTMs)
• BERT: Bi-directional encoding
• GPT: Decoder-only, autoregressive
• LoRA: Low-Rank Adaptation (small trainable adapter matrices)
• Distillation: a small student learns from a big teacher
💡 Analogy:
ELMo = the nice guy who learns context deeply,
BERT = reads in both directions,
GPT = writes stories word by word
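LoRA itself is easiest to remember by its shapes: freeze the big weight matrix, learn two skinny ones. Here's a toy NumPy sketch of the idea (sizes and rank are arbitrary, and nothing is actually trained):

```python
# The LoRA idea: instead of updating a big frozen weight W, learn a low-rank update B @ A.
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, rank = 512, 512, 8

W = rng.normal(size=(d_out, d_in))            # frozen pretrained weight
A = rng.normal(size=(rank, d_in)) * 0.01      # small trainable matrix
B = np.zeros((d_out, rank))                   # starts at zero, trainable

x = rng.normal(size=(d_in,))
y = W @ x + B @ (A @ x)                       # original path + low-rank adapter path

full_params = W.size                          # 262,144 values to fine-tune normally
lora_params = A.size + B.size                 # only 8,192 with rank 8
print(full_params, lora_params)
```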
🧠 Ethics & Alignment
• Bias: systemic prejudice in outputs
• Toxicity: harmful/offensive content
• RLHF: Reinforcement Learning from Human Feedback
💡 Mnemonic: "Be Real"
Bias, Ethics, RLHF → Be Real about AI safety
🧠 2. Use Flashcards (Anki / Google Docs / Sticky Notes)
Front:
What's the difference between BERT and GPT?
Back:
BERT = encoder, bidirectional, masked LM
GPT = decoder, left-to-right, autoregressive LM
Do 10–15 flashcards a day — spaced repetition is gold.
🔁 3. Use & Explain the Terms Often
• Talk to yourself using them.
• Teach a friend.
• Explain with diagrams (great for Transformers or token flow).
• Make up funny stories with them (e.g. "GPT is an eager writer who guesses every next word, while BERT is a deep thinker who fills in the blanks.")