Thread by Smerity, 5 tweets
A deep learning training tip that I realized I do but never learned from anyone: when tweaking your model to improve gradient flow or convergence speed, keep the exact same random seed (same hyperparameters and weight initializations) and only modify the model interactions. A minimal seeding sketch follows the list below.
- Your model runs will have the exact same perplexity spikes (they hit confusing data at the same time)
- You can compare timestamp / batch results in early training as a pseudo-estimate of convergence
- Improved gradient flow visibly helps the same init do better
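Here is a minimal sketch of that fixed-seed setup, assuming PyTorch; the seed value 1111 and the set_seed helper are illustrative, not from the thread. Pinning every source of randomness means two runs differ only in the model change under test, so loss curves line up batch for batch.

import random

import numpy as np
import torch

def set_seed(seed: int = 1111) -> None:
    """Pin Python, NumPy, and PyTorch RNGs so runs are directly comparable."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)  # seeds CPU and all CUDA devices
    # cuDNN's auto-tuner picks kernels non-deterministically; disable it so
    # the same initialization follows the same numerical path every run.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

set_seed(1111)
# ... build the model variant under test and train as usual; with the seed
# pinned, any difference in the loss curve comes from the model change.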
It's important to swap out the random seed occasionally once you think you've isolated progress (a quick seed-sweep sketch follows below), but minimizing noise during experimentation is OP. You're already dealing with millions of parameters and billions of calculations; you don't need any more confusion in the process.
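A hypothetical sketch of that seed-swap check, reusing set_seed from the sketch above; train_and_eval, build_baseline, and build_variant are assumed stand-ins for your own training loop and model constructors, not functions from the thread.

# Once a change looks like a win under the pinned seed, rerun both variants
# under a few fresh seeds to confirm the gain isn't seed luck.
for seed in (1111, 1234, 42):
    set_seed(seed)
    baseline_ppl = train_and_eval(build_baseline())  # hypothetical helpers
    variant_ppl = train_and_eval(build_variant())
    print(f"seed={seed}: baseline={baseline_ppl:.2f}  variant={variant_ppl:.2f}")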
Anyone else doing this? As noted, I never learned it explicitly; it's just a habit I got into. I tend to think most BigCos will be running hyperparameter sweeps / randomization given their environment, but maybe others are doing this on their own?
This may or may not be a soft pseudo-science variant of the Lottery Ticket Hypothesis / @hardmaru et al.'s Weight Agnostic Neural Networks. Either way, it has worked multiple times over multiple datasets for me and the results seem to generalize.