W&B Tables enables quick and powerful exploration of image, video, audio, tabular, molecule and NLP datasets.
@metaphdor used Tables to explore 100k rows of @Reddit comments from the Go Emotions dataset:
First, filtering for multiple column values.
Exploring the distribution of reddit comments by sub-reddit name:
Creating additional calculated columns; here we get the count of comments per sub-reddit. Looks like the "farcry" sub has the fewest comments:
We can find which sub had the highest fraction of "caring" comments:
And which sub had the highest ratio of gratitude to excitement (i.e. thankful but maybe kinda boring) - sorry r/legaladvice:
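If you'd rather poke at the data offline, here's a rough pandas equivalent of those three calculations. The column names (a "subreddit" column plus one 0/1 column per emotion such as "caring", "gratitude", "excitement") are an assumption about the raw Go Emotions schema:

```python
from datasets import load_dataset

# Assumed schema: the "raw" Go Emotions config with a "subreddit" column and
# one 0/1 column per emotion ("caring", "gratitude", "excitement", ...).
df = load_dataset("go_emotions", "raw", split="train").to_pandas()

# Comments per sub-reddit (the calculated count column)
counts = df.groupby("subreddit").size().sort_values()

# Fraction of "caring" comments per sub-reddit
caring_frac = df.groupby("subreddit")["caring"].mean().sort_values(ascending=False)

# Ratio of gratitude to excitement per sub-reddit (+1 to avoid dividing by zero)
sums = df.groupby("subreddit")[["gratitude", "excitement"]].sum()
grateful_but_boring = (sums["gratitude"] / (sums["excitement"] + 1)).sort_values(ascending=False)

print(counts.head(), caring_frac.head(), grateful_but_boring.head(), sep="\n\n")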
Documenting and sharing these findings with collaborators is a breeze with W&B Reports.
Your collaborators can also start their own exploration in Tables you've added to a Report, right in the Report UI itself, and those changes persist between visits.
Logging to W&B Tables is super easy. Here we downloaded the Go Emotions dataset from the @huggingface Datasets library and logged it as a pandas dataframe:
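A minimal sketch of that logging step (the project name, the exact dataset config, and the 100k sample size are assumptions to match the thread):

```python
import wandb
from datasets import load_dataset

# Download Go Emotions from the Hugging Face hub and convert it to a pandas dataframe,
# keeping 100k rows as in the thread.
df = load_dataset("go_emotions", "raw", split="train").to_pandas().sample(100_000, random_state=0)

# Log the dataframe as a W&B Table
run = wandb.init(project="go-emotions-tables")  # hypothetical project name
run.log({"go_emotions": wandb.Table(dataframe=df)})
run.finish()
```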
To log to W&B Tables and start your own exploration, you can run this colab:
This is just one example for NLP; Tables supports exploration of a wide variety of data types. Here @sbxrobotics used Tables to demonstrate how to evaluate image segmentation models:
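The details of the @sbxrobotics example aren't in this thread, but as a rough illustration of logging segmentation results to a Table with W&B's mask overlays, a sketch might look like this (the images, masks, class labels, and score column below are synthetic stand-ins):

```python
import numpy as np
import wandb

# Hypothetical class ids; real projects would use their own label map.
class_labels = {0: "background", 1: "person", 2: "car"}

run = wandb.init(project="segmentation-eval")  # hypothetical project name
table = wandb.Table(columns=["id", "image", "agreement"])

for i in range(4):
    img = np.random.randint(0, 255, (128, 128, 3), dtype=np.uint8)  # stand-in image
    pred = np.random.randint(0, 3, (128, 128))                      # stand-in predicted mask
    gt = np.random.randint(0, 3, (128, 128))                        # stand-in ground-truth mask
    masked = wandb.Image(img, masks={
        "prediction":   {"mask_data": pred, "class_labels": class_labels},
        "ground_truth": {"mask_data": gt,   "class_labels": class_labels},
    })
    agreement = float((pred == gt).mean())  # toy pixel agreement, not a real IoU
    table.add_data(i, masked, agreement)

run.log({"segmentation_eval": table})
run.finish()
```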
We're incredibly excited about Tables and will be continuously improving functionality and performance over the coming months. We'd love to know what you think: support@wandb.com
@l2k and @emilymbender dive into the problems with bigger and bigger language models, the difference between form and meaning, the limits of benchmarks, and the #BenderRule.
They discuss 4 of Emily's papers:
1/5
"On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? π¦" (Bender, Gebru et al. 2021)
Possible risks associated with bigger and bigger language models, and ways to mitigate those risks.
1. You can monitor how your models and hyperparameters are performing, including automatically tracking:
- Training and validation losses
- Precision, Recall, mAP@0.5, mAP@0.5:0.95
- Learning Rate over time
2. Automatically tracked GPU metrics like GPU type, GPU utilization, power, temperature, and CUDA memory usage, as well as system metrics like disk I/O, CPU utilization, and RAM usage.
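With an integration the training metrics above come for free, but if you're logging them by hand, a minimal sketch with wandb.log might look like this (all metric values below are placeholders). The system metrics in point 2 are collected automatically once wandb.init() is called, with no extra code:

```python
import wandb

run = wandb.init(project="training-metrics-demo")  # hypothetical project name

for epoch in range(3):
    train_loss, val_loss = 1.0 / (epoch + 1), 1.2 / (epoch + 1)  # placeholder losses
    run.log({
        "train/loss": train_loss,
        "val/loss": val_loss,
        "metrics/precision": 0.8,        # placeholder
        "metrics/recall": 0.7,           # placeholder
        "metrics/mAP_0.5": 0.65,         # placeholder
        "metrics/mAP_0.5:0.95": 0.45,    # placeholder
        "lr": 0.01 * (0.9 ** epoch),     # learning rate over time
        "epoch": epoch,
    })

run.finish()
```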