Overfitting is commonly seen as taboo in machine learning, because it can sabotage an ML model's prediction performance by fitting the training data too closely. But can you imagine putting this overfitting to good use in ML?
Useful or intentional overfitting is a purposeful ML model training pattern in which the usual regularization and generalization mechanisms are omitted while training the model. There are also no train and test sample splits during training when we are doing this intentional overfitting.
But where can we apply this intentional overfitting, and why?
In some physical and dynamic systems, we can model behavior using theoretically and practically proven mathematical formulations. These formulations may contain Ordinary or Partial Differential Equations (ODEs or PDEs), and most of them do not have closed-form solutions. So in order to solve them on computers we have to apply classical numerical methods, which take a lot of time and resources. Even if we kept a lookup table of exact inputs and outputs, a table that is too big would take an awful lot of time to return the solution.
So we can use an ML model to approximately model the system here. We can use a lookup table (which covers the entire input and output space) derived from the accurate model as our training data to train the ML model. The ML model can then generate fast approximate predictions that stay close to the true model.
In order to do this, the following conditions should be satisfied by the physical system:
- The whole input spectrum should be covered by the training data, and it should also be finite (e.g., 0.5-meter increments of distance from 1 km to 500 km), so that no unseen data can ever be fed to the model.
- There should be no overlapping training samples (the same input appearing with different outputs).
- The physical system should not be a chaotic system, where slight changes in the initial conditions result in drastic changes in behavior.
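As a concrete illustration of these conditions, here is a minimal sketch using a hypothetical system of my own choosing (an ideal projectile's range as a function of launch angle, not an example from the pattern itself). It builds a lookup table that covers the whole finite input spectrum with no overlapping samples:

```python
import math

G = 9.81    # gravitational acceleration (m/s^2)
V0 = 50.0   # fixed launch speed of this hypothetical projectile (m/s)

def projectile_range(theta_deg: float) -> float:
    """The 'accurate model': closed-form range of an ideal projectile."""
    return V0 ** 2 * math.sin(2.0 * math.radians(theta_deg)) / G

# Cover the WHOLE finite input spectrum: launch angles 0..90 deg in 0.5 deg steps.
angles = [0.5 * i for i in range(181)]  # 0.0, 0.5, ..., 90.0
table = {theta: projectile_range(theta) for theta in angles}

# Sanity checks for the conditions above: full coverage, no overlapping samples.
assert len(table) == len(angles)        # every input appears exactly once
print(len(table), "training samples")   # -> 181 training samples
```

Any input the model will ever be asked about is one of these 181 grid points, so by construction there is no unseen data.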
So we can deliberately overfit our model to the system here, because no unseen data is ever used for prediction. The model only needs to follow the patterns in the training data, without any other constraints. In machine learning, overlapping samples force the model to make probabilistic predictions on the training data, but since we have no overlapping samples here we can omit that and make deterministic predictions instead. ML models usually interpolate by weighting the closest samples in the training data, which makes this approach more suitable for non-chaotic systems. Physical systems also don't have a lot of noise in their data and always produce deterministic results, which makes intentionally overfit models a good option.
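To sketch what intentional overfitting looks like in practice, the following toy example (my own illustration, with `y = sin(2*pi*x)` as a stand-in for a physical response) trains a small one-hidden-layer network on the full input grid, with no regularization, no train/test split, and no early stopping:

```python
import numpy as np

rng = np.random.default_rng(0)

# The full finite input grid of a stand-in "physical" response y = sin(2*pi*x).
x = np.linspace(0.0, 1.0, 101).reshape(-1, 1)
y = np.sin(2.0 * np.pi * x)

# One hidden layer of tanh units; NO regularization, NO train/test split.
W1 = rng.normal(0.0, 1.0, (1, 32)); b1 = np.zeros(32)
W2 = rng.normal(0.0, 0.1, (32, 1)); b2 = np.zeros(1)

lr = 0.1
for _ in range(20000):                  # train until the grid is memorized
    h = np.tanh(x @ W1 + b1)            # forward pass
    err = (h @ W2 + b2) - y             # prediction error on the full grid
    # backward pass (plain full-batch gradient descent)
    dW2 = h.T @ err / len(x); db2 = err.mean(axis=0)
    dh = (err @ W2.T) * (1.0 - h ** 2)
    dW1 = x.T @ dh / len(x); db1 = dh.mean(axis=0)
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

mse = float((((np.tanh(x @ W1 + b1) @ W2 + b2) - y) ** 2).mean())
print(f"training MSE after intentional overfitting: {mse:.4f}")
```

Driving the training error as close to zero as possible is exactly the point: the training grid *is* the entire input space, so memorizing it is the goal rather than a failure mode.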
But how can we confidently state that ML models can identify the patterns in complex physical systems? This is backed by the Universal Approximation Theorem in deep learning, which states that any continuous function can be approximated to arbitrary precision by an artificial neural network with at least one hidden layer and a nonlinear (squashing) activation function. This means that no matter what function is given, a simple NN can approximate the function, and under mild conditions its derivatives as well.
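In its classic form (due to Cybenko and to Hornik et al.), the theorem can be stated roughly as: for any continuous function \(f\) on a compact set \(K\) and any tolerance \(\varepsilon > 0\), there exists a single-hidden-layer network that stays within \(\varepsilon\) of \(f\) everywhere on \(K\):

```latex
\exists\, N,\; v_i, b_i \in \mathbb{R},\; w_i \in \mathbb{R}^d :\quad
\left|\, f(x) - \sum_{i=1}^{N} v_i \,\sigma\!\left(w_i^{\top} x + b_i\right) \right| < \varepsilon
\quad \text{for all } x \in K
```

Here \(\sigma\) is the nonlinear squashing activation and \(N\) is the (possibly large) number of hidden units; the theorem guarantees existence, not that training will find such weights.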
But it is also important to state that intentional overfitting is applicable only in very limited scenarios and is not a silver-bullet solution. (Even if the numerical solution is complex, if the system only has a small input-output lookup table, then searching the lookup table directly is the most appropriate solution.)
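For instance, with a small table (toy values below, purely illustrative) a direct dictionary lookup is exact and O(1), so training a surrogate model would only add approximation error and cost:

```python
# A tiny hypothetical input-output table: with so few entries, a direct
# lookup is exact and effectively instantaneous, so an ML surrogate
# would only approximate answers the table already gives us for free.
table = {round(0.5 * i, 1): 2.0 * (0.5 * i) for i in range(10)}

def predict(x: float) -> float:
    """Exact O(1) lookup; raises KeyError for inputs outside the finite grid."""
    return table[x]

print(predict(1.5))  # -> 3.0, exact answer straight from the table
```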
It's clear that intentional overfitting flips the usual script on ML practice, turning what's normally seen as a flaw into a strategic advantage for modeling certain physical systems. By purposefully overfitting, we can make ML models mimic complex, deterministic behaviors without the usual concerns about generalization or noise.
But let's be clear: this approach only works under specific conditions. The input range must be finite and fully covered, the system must be non-chaotic, and there should be no overlapping samples. With the backing of the Universal Approximation Theorem, we know neural networks can capture these precise patterns, making this a practical choice when traditional numerical methods or lookup tables get too cumbersome.
So, while intentional overfitting isn't a one-size-fits-all answer, it's a clever and efficient alternative in the right scenarios, where speed and approximation matter the most.
References
Machine Learning Design Patterns (Book): Valliappa Lakshmanan, Sara Robinson & Michael Munn