When I first started working with deep learning, I thought that adding more layers would automatically make a network better. But I quickly realized that deep networks often suffer from two big problems: vanishing gradients and exploding gradients. These problems make it hard to train deep networks effectively.
Then I discovered something surprising: the way we initialize weights plays a huge role in whether a network trains successfully. Specifically, the variance of the weights determines whether signals grow, shrink, or stay stable as they pass through the network.
But why does variance matter? And how do initialization methods like Xavier and He use simple statistics to prevent these problems? Let's dive in and find out!
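To make that claim concrete, here is the standard variance argument for a single linear unit, written as a quick sketch (assuming zero-mean weights and inputs that are independent of each other; the full derivation comes later in the article):

$$
y = \sum_{i=1}^{n} w_i x_i
\quad\Longrightarrow\quad
\mathrm{Var}(y) = n \,\mathrm{Var}(w)\,\mathrm{Var}(x)
$$

So each layer multiplies the signal's variance by roughly $n\,\mathrm{Var}(w)$: if that factor is greater than 1 the signal grows layer after layer, if it is less than 1 the signal shrinks, and it stays stable only when $n\,\mathrm{Var}(w) \approx 1$.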
Don't have a Medium account? Use this link: https://medium.com/@r.siddhesh96/why-deep-networks-explode-or-vanish-and-how-simple-statistics-fix-them-deriving-xavier-and-he-f2b64b89e3b8?sk=682d031a65c00dad50e6399f5599bf1c
To investigate what's happening, let's start with a simple experiment in PyTorch. Imagine you're building a neural network with 100 layers, each with 512 neurons. You…
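Here is a minimal sketch of that experiment. Since the sentence above is cut off, the details are assumptions on my part: plain linear layers with no biases or nonlinearity, a random batch of 256 inputs, and three weight scales to compare (too small, too large, and 1/sqrt(fan_in)). The point is simply to watch what happens to the size of the activations after 100 layers.

```python
import math
import torch
import torch.nn as nn

torch.manual_seed(0)

num_layers, width = 100, 512  # the depth and width from the setup above

# Compare three weight standard deviations: too small, too large,
# and the "balanced" scale 1/sqrt(fan_in).
for std in (0.001, 0.1, 1.0 / math.sqrt(width)):
    x = torch.randn(256, width)  # a batch of random inputs
    for _ in range(num_layers):
        layer = nn.Linear(width, width, bias=False)
        nn.init.normal_(layer.weight, mean=0.0, std=std)
        x = layer(x)
    print(f"weight std {std:.4f} -> activation std after "
          f"{num_layers} layers: {x.std().item():.3e}")
```

With the small scale the activations collapse toward zero, with the large scale they blow up, and only the 1/sqrt(fan_in) scale keeps them in a reasonable range, which is exactly the behavior the variance argument predicts.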