Information pollution and recursive training
AI models that are trained recursively on their own outputs can experience what is known as model collapse. But why does this happen? 🤔
To understand this, let's draw an analogy to complexity and information physics, particularly the second law of thermodynamics. This law states that in a closed system, disorder, or entropy, naturally increases over time unless energy is supplied from outside to maintain or increase the system's complexity.
In the context of AI models, information serves as the 'energy' that sustains the model's complexity. Information also has a 'temperature': the noisier it is, the less accurately it represents the source data. An AI model that doesn't continuously receive new, accurate information is like a closed system with no energy infusion: over time its quality degrades, a process analogous to entropic degradation. The rate of that degradation is set by the 'temperature' (noise) of the information and by the model's capacity to manage complexity (its 'heat capacity').
When an AI model is trained recursively, it often reuses its own output as new input. This is akin to a closed system recycling its energy. Without new, external information, the recycled data introduces more noise, raising the system's 'temperature.' Even well-designed models can only handle this increased noise for so long before their performance starts to degrade, or they 'go off the rails.' 🚂
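To make the recycling loop concrete, here is a deliberately simplified toy sketch (my own construction, not any real training pipeline): each "generation" fits a Gaussian to samples drawn from the previous generation's fit, with an optional `noise_temp` knob standing in for the information 'temperature'.

```python
# Toy model of recursive training: the source distribution is N(0, 1),
# so a faithful model keeps mu near 0 and spread near 1.0.
import numpy as np

rng = np.random.default_rng(0)

def recycle(generations, noise_temp=0.0, n=200):
    """Recursively refit on self-generated data; return the final (mu, sigma)."""
    mu, sigma = 0.0, 1.0  # start as a perfect model of the source N(0, 1)
    for _ in range(generations):
        own_output = rng.normal(mu, sigma, n)    # model's recycled data
        noise = rng.normal(0.0, noise_temp, n)   # noise from the 'temperature'
        recycled = own_output + noise
        mu, sigma = recycled.mean(), recycled.std()
    return mu, sigma

for temp in (0.0, 0.5, 1.0):
    mu, sigma = recycle(30, noise_temp=temp)
    print(f"noise_temp={temp:.1f}: mu={mu:+.2f}, spread={sigma:.2f} (source: 1.0)")
# Even at temp=0 the fit slowly drifts from the source; the hotter the
# recycled information, the faster the fitted distribution departs from it.
```

The numbers vary run to run, but the pattern holds: with nothing external to correct it, estimation noise compounds generation after generation.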
The concept of model 'sanity' is critical here: it describes the model's ability to accurately understand and represent both internal and external realities. When a model is trained recursively without new data, it becomes 'insane' in the sense that its representations of reality grow increasingly distorted. Eventually this leads to model collapse, where the AI can no longer generate sensible results. 😵
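One way to operationalize 'sanity' is to measure how far the model's output distribution has drifted from held-out real data before the distortion becomes collapse. In this sketch, `sanity_score`, the Wasserstein metric, and the 0.25 threshold are all my illustrative choices, not a standard:

```python
import numpy as np
from scipy.stats import wasserstein_distance

def sanity_score(model_samples, reference_samples):
    """Distance between model output and reality; 0.0 means a perfect match."""
    return wasserstein_distance(model_samples, reference_samples)

rng = np.random.default_rng(1)
reference = rng.normal(0.0, 1.0, 1_000)     # held-out real data
healthy   = rng.normal(0.05, 0.95, 1_000)   # mild, tolerable drift
insane    = rng.normal(0.8, 0.3, 1_000)     # badly distorted model

for name, samples in (("healthy", healthy), ("insane", insane)):
    score = sanity_score(samples, reference)
    verdict = "OK" if score < 0.25 else "DRIFT ALERT"   # illustrative threshold
    print(f"{name:8s}: distance={score:.2f} [{verdict}]")
```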
The same physics analogy applies here: a closed system without external energy loses its complexity as it approaches thermal equilibrium, and a recursively trained AI model without new information degrades in the same way. If the model continuously receives new, accurate data, however, it can maintain its complexity, improve its understanding of reality, and enhance its utility. This ongoing influx of information is analogous to the external energy a physical system needs to sustain its complexity.
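Extending the toy loop above, we can blend each generation's self-output with fresh samples from the source. `FRESH_FRACTION` is an illustrative knob, not a recommendation; the point is only that a steady influx of real data anchors the model, like external energy sustaining a physical system:

```python
import numpy as np

rng = np.random.default_rng(3)
FRESH_FRACTION = 0.5   # share of genuinely new data in each training set

mu, sigma = 0.0, 1.0
for generation in range(30):
    n_fresh = int(200 * FRESH_FRACTION)
    own_output = rng.normal(mu, sigma, 200 - n_fresh)   # recycled model data
    fresh = rng.normal(0.0, 1.0, n_fresh)               # new external information
    training_set = np.concatenate([own_output, fresh])
    mu, sigma = training_set.mean(), training_set.std()

print(f"after 30 generations: mu={mu:+.2f}, spread={sigma:.2f} (source: 1.0)")
# The fresh stream keeps mu near 0 and the spread near 1: complexity sustained.
```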
Therefore, to prevent model collapse, AI models need a constant stream of new and accurate information, much like a physical system needs energy to keep entropy at bay. That steady influx is what keeps a model valuable, effective, and reliable. 🌟