Our novel framework casts stochastic optimization as a random dynamical system, showing that in the edge of stability (EoS) regime, training does not converge to a single point but explores an attractor set. Generalization is then a property of this attractor, not of a single point.👇
Relations beyond pairwise interactions are essential in networks. Yet their role in semantic graph generative models remains unclear, as discussed in our #ICML2024 paper. We address this by integrating higher-order information in a principled manner.👇 arxiv.org/abs/2402.08871
First, we observe that, in the absence of regularization, grokking is prevented by absorption errors in the Softmax. This problem, which we refer to as Softmax Collapse (SC), causes vanishing gradients and puts an end to learning, sometimes resulting in complete overfitting.👇
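A minimal sketch of how an absorption error can zero out a gradient, assuming NumPy and a toy three-class example. The function names and logit values are illustrative choices, not taken from the paper. In float32, the small exponentials are absorbed when added to 1, so the correct-class probability rounds to exactly 1 and its cross-entropy gradient component becomes exactly 0.

```python
import numpy as np

def softmax(z):
    # Standard max-shifted softmax; the dtype of z is preserved.
    e = np.exp(z - z.max())
    return e / e.sum()

def ce_grad(logits, label):
    # Gradient of cross-entropy w.r.t. the logits: softmax(z) - one_hot(label).
    g = softmax(logits).copy()
    g[label] -= 1
    return g

# Illustrative logits: the correct class (index 0) leads by a margin of 20.
logits = [20.0, 0.0, 0.0]

grad32 = ce_grad(np.array(logits, dtype=np.float32), label=0)
grad64 = ce_grad(np.array(logits, dtype=np.float64), label=0)

# float32: exp(-20) is far below the spacing of floats near 1, so
# 1 + 2*exp(-20) rounds to 1 and the correct-class gradient is exactly 0.
print("float32 correct-class gradient:", grad32[0])
# float64: the same sum is representable, so a small gradient survives.
print("float64 correct-class gradient:", grad64[0])
```

The same logits give a small but nonzero correct-class gradient in float64, which suggests the stall here comes from floating-point precision rather than from the loss itself.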
So what's a #HigherOrderRelationship? Essentially, it's a relation between relations. Graphs only consider node-to-node relations (via so-called edges). Yet relations between edges, or between edges & faces, encode interesting patterns, leading to a full hierarchy of relationships.
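To make the idea concrete, here is a small sketch in plain Python, assuming a toy complex made of two triangles that share an edge. The variable names and the example data are hypothetical and only illustrate the hierarchy of nodes, edges, and faces.

```python
from itertools import combinations

# Two triangular faces sharing the edge (1, 2); nodes are integers.
faces = [(0, 1, 2), (1, 2, 3)]

# Node-to-node relations: the edges, as in an ordinary graph.
edges = sorted({e for f in faces for e in combinations(f, 2)})

# Edge-to-face relations: which faces each edge lies on.
edge_to_faces = {e: [f for f in faces if set(e) <= set(f)] for e in edges}

# Edge-to-edge relations: pairs of edges that bound a common face.
edge_pairs = {
    pair
    for f in faces
    for pair in combinations(sorted(combinations(f, 2)), 2)
}

print(edges)
print(edge_to_faces[(1, 2)])
print(len(edge_pairs))
```

The shared edge (1, 2) is the only one incident to both faces, a pattern that a plain node-to-node adjacency list does not record explicitly.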