So why would it be a problem if everything is tracked, right? Well, it worked pretty well for our traditional software systems, until we started using it for AI.
AI development produces trained models as large binary files, multi-gigabyte image datasets, checkpoint files from model training, and NumPy objects (.npy/.npz). Git was never designed to represent changes to these files efficiently. Even adding one layer or retraining the model produces an entirely new binary, so every version of the model or NumPy object gets tracked as a near-complete copy, and the repository size balloons with each commit!
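Here is a minimal sketch of why (file names are hypothetical): even a one-parameter change writes out a brand-new binary, and Git cannot delta-compress opaque binaries the way it does source code.

```python
import numpy as np

# ~8 MB of float64 "model weights"
weights = np.random.rand(1000, 1000)
np.save("model_v1.npy", weights)

# "Retrain": a tiny one-parameter tweak...
weights[0, 0] += 0.01
# ...still written out as a full ~8 MB binary file
np.save("model_v2.npy", weights)

# Git stores text changes as compact deltas, but these opaque
# binaries don't delta well, so committing both versions keeps
# roughly two full copies in the repository's history.
```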
This bloats the .git directory, and everything slows down and gets laggy. You can end up waiting hours to make a pull request or fetch new changes. Merging and rebasing binary files is another huge problem that usually needs manual intervention. And a teammate can accidentally commit huge binary files to a shared branch; once that gets merged into your dev branch, the whole repository drags that weight along for everyone!
There are solutions to this, though. In a production system you can use Git LFS for storing large files, push artifacts to cloud storage (S3, Azure Blob, GCS), or reach for open-source ML-versioning tools that work alongside Git, like DVC, Comet, or MLflow. Or, as the simplest fix, put all these files in .gitignore (see the sketches below).
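The simplest fix is a .gitignore entry so the bulky artifacts never enter Git history at all (the patterns here are placeholder assumptions, not from the original):

```
# .gitignore — keep bulky ML artifacts out of Git entirely
*.npy
*.npz
checkpoints/
data/raw/
```

For the tooling routes, a rough sketch of typical usage (the tracked patterns, remote name, and S3 bucket URL are placeholders): Git LFS keeps lightweight pointers in Git while storing the binaries separately, and DVC versions data alongside Git with the actual files pushed to a remote of your choice.

```
# Git LFS: store binaries outside normal Git objects, pointers in the repo
git lfs install
git lfs track "*.npy"
git lfs track "*.ckpt"
git add .gitattributes

# DVC: version large data alongside Git (bucket URL is hypothetical)
dvc init
dvc add data/images
git add data/images.dvc .gitignore
dvc remote add -d storage s3://my-bucket/dvcstore
dvc push
```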