Tolga Birdal
geometric computing, computer vision, machine learning, philosophy, music @ICComputing, previously @StanfordAILab
Apr 22 8 tweets 4 min read
Modern deep networks are often trained at the #EdgeOfStability, a regime where dynamics are locally unstable, nearing chaos. Yet generalization improves, defying the wisdom of classical optimization. We now theoretically explain this central puzzle. 👇 arxiv.org/abs/2604.19740
Our novel framework casts stochastic optimization as a random dynamical system, showing that in the edge of stability (EoS) regime, training does not converge to a single point but explores an attractor set. Generalization is then a property of this attractor, not of a point. 👇
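For intuition on what "locally unstable" means here, a minimal sketch under a textbook assumption: plain gradient descent on a one-dimensional quadratic `f(x) = 0.5 * sharpness * x**2`, where the iterates shrink only while `sharpness < 2 / lr`. The function name `run_gd`, the step size, and the two sharpness values are illustrative choices, not taken from the paper, and the sketch does not reproduce its random-dynamical-system framework or its attractor analysis.

```python
def run_gd(sharpness, lr, x0=1.0, steps=100):
    """Gradient descent on f(x) = 0.5 * sharpness * x**2 (hypothetical toy problem)."""
    x = x0
    for _ in range(steps):
        grad = sharpness * x      # f'(x) for the quadratic
        x = x - lr * grad         # equivalent to x *= (1 - lr * sharpness)
    return x

lr = 0.1
threshold = 2.0 / lr              # classical stability limit for a quadratic

x_stable = run_gd(sharpness=19.0, lr=lr)    # just below the limit: oscillates, shrinks
x_unstable = run_gd(sharpness=21.0, lr=lr)  # just above the limit: oscillates, grows

print(f"stability threshold 2/lr = {threshold}")
print(f"|x| after 100 steps, sharpness 19: {abs(x_stable):.3e}")
print(f"|x| after 100 steps, sharpness 21: {abs(x_unstable):.3e}")
```

On a real network the loss is not quadratic, so crossing this limit need not cause divergence; the toy only shows where the classical convergence guarantee stops applying.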
Feb 9, 2025 7 tweets 3 min read
Our new topological, diffusion-bridge-based graph generation framework, HOG-Diff, leverages higher-order information as a guide to progressively generate plausible graphs through a coarse-to-fine curriculum. #TDL #TopologicalDeepLearning #GenAI #ML #AI 👇 arxiv.org/abs/2502.04308
Relations beyond pairwise interactions are essential in networks. Yet their role in semantic graph generative models remains unclear, as mentioned in our #ICML2024 paper. We address this by integrating higher-order information in a principled manner. 👇 arxiv.org/abs/2402.08871
Jan 9, 2025 12 tweets 4 min read
Our latest work explores sudden generalization in neural nets, aka #grokking. We identify & propose solutions to two key issues hindering grokking: (i) floating-point errors in Softmax and (ii) aligned gradients naively scaling logits post-overfitting. 👇 arxiv.org/abs/2501.04697
First, we observe that, in the absence of regularization, grokking is prevented by absorption errors in the Softmax. This problem, which we refer to as Softmax Collapse (SC), causes vanishing gradients and puts an end to learning, sometimes resulting in complete overfitting. 👇
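The absorption error mentioned above can be illustrated with a minimal sketch in pure Python (64-bit floats). It assumes a three-class toy example with made-up logits; the helper names `softmax` and `cross_entropy_grad` are hypothetical and this is not the paper's experimental setup. Once one logit dominates, the small exponentials are absorbed when added to 1.0, so the predicted probability of the target class rounds to exactly 1 and both the loss and the target-class gradient become exactly zero.

```python
import math

def softmax(logits):
    """Numerically standard softmax: subtract the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)                      # tiny terms can be absorbed here
    return [e / total for e in exps]

def cross_entropy_grad(logits, target):
    """Gradient of cross-entropy w.r.t. the logits: softmax(logits) - one_hot(target)."""
    probs = softmax(logits)
    return [p - (1.0 if i == target else 0.0) for i, p in enumerate(probs)]

moderate_logits = [5.0, 0.0, 0.0]    # illustrative values
large_logits = [50.0, 0.0, 0.0]      # same prediction, logits scaled by 10

p_moderate = softmax(moderate_logits)
p_large = softmax(large_logits)

loss_moderate = -math.log(p_moderate[0])
loss_large = -math.log(p_large[0])

grad_moderate = cross_entropy_grad(moderate_logits, target=0)
grad_large = cross_entropy_grad(large_logits, target=0)

print("moderate logits: loss", loss_moderate, "target grad", grad_moderate[0])
print("large logits:    loss", loss_large, "target grad", grad_large[0])
```

Lower-precision formats such as float32 or float16 have a coarser machine epsilon, so the same absorption would set in at smaller logit gaps.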
Apr 24, 2023 7 tweets 5 min read
Graphs & geometric deep learning are cool, but what's next? Our latest work leverages topology to model higher-order relationships beyond graphs, naturally covering domains like meshes, sets & their mixtures. Learn more: arxiv.org/abs/2206.00606 #TopologicalDeepLearning #TDL
So what's a #HigherOrderRelationship? Essentially, it's a relation between relations. Graphs only consider node-to-node relations (via so-called edges). Yet relations between edges, or between edges & faces, encode interesting patterns, leading to a full hierarchy of relationships.
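The idea of a "relation between relations" can be sketched on a tiny mesh. The example below assumes two triangles sharing one edge and uses hypothetical names (`faces`, `edge_to_faces`, `edge_adjacency`); it is an illustration of the hierarchy described above, not the data structure used in the paper. Edges are node-to-node relations, the edge-to-face incidence relates edges to faces, and two edges are considered related when they bound a common face.

```python
from itertools import combinations

# Two triangular faces that share the edge (1, 2); node ids are arbitrary.
faces = [(0, 1, 2), (1, 2, 3)]

# Edges: node-to-node relations, derived here from the face boundaries.
edges = sorted({tuple(sorted(pair)) for face in faces
                for pair in combinations(face, 2)})

# Edge-to-face incidence: which faces does each edge bound?
edge_to_faces = {
    edge: [face for face in faces if set(edge) <= set(face)]
    for edge in edges
}

# Edge-to-edge relation: two edges are related if they bound a common face.
edge_adjacency = {
    (a, b)
    for a, b in combinations(edges, 2)
    if any(set(a) <= set(face) and set(b) <= set(face) for face in faces)
}

print("edges:", edges)
print("faces bounded by edge (1, 2):", edge_to_faces[(1, 2)])
print("edge pairs sharing a face:", sorted(edge_adjacency))
```

Note that edges (0, 1) and (1, 3) share node 1 but no face, so they are unrelated under this particular choice; a plain graph has no way to express that distinction.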