github.com/tusharsarkar3/…
Trained using a novel optimization technique, Boosted Gradient Descent for Tabular Data, which increases its interpretability and performance.
Jun 11, 2021 • 4 tweets • 2 min read
Reconstructing Implicit Knowledge with Language Models.
Generating statements that explicate implicit knowledge connecting sentences in text.
github.com/Heidelberg-NLP…
They use pre-trained language models, which they refine by fine-tuning them on specifically prepared corpora enriched with implicit information and by constraining them with relevant concepts and connecting commonsense knowledge paths.
Jun 10, 2021 • 4 tweets • 2 min read
Deepface is a lightweight face recognition and facial attribute analysis framework in Python
Don't forget to give the repository a star!
github.com/serengil/deepf…
It is a hybrid face recognition framework wrapping state-of-the-art models: VGG-Face, Google FaceNet, OpenFace, Facebook DeepFace, DeepID, ArcFace and Dlib
The library is mainly based on Keras & TensorFlow
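A minimal sketch of how a face comparison with the library might look. It assumes the deepface package is installed and that the model identifiers are spelled as listed below (for example "Facenet" for Google FaceNet); check both against the repository README. The helper name verify_faces and the image paths are hypothetical.

```python
# Model identifiers as the deepface library is assumed to spell them;
# verify these strings against the repository README before relying on them.
SUPPORTED_MODELS = (
    "VGG-Face", "Facenet", "OpenFace", "DeepFace", "DeepID", "ArcFace", "Dlib",
)


def verify_faces(img1_path, img2_path, model_name="VGG-Face"):
    """Ask deepface whether two images show the same person (hypothetical helper)."""
    if model_name not in SUPPORTED_MODELS:
        raise ValueError(f"Unknown model {model_name!r}; expected one of {SUPPORTED_MODELS}")
    # Imported lazily so this module loads even where deepface is not installed.
    from deepface import DeepFace
    # DeepFace.verify returns a dict that includes a boolean "verified" entry.
    return DeepFace.verify(img1_path=img1_path, img2_path=img2_path, model_name=model_name)
```

Calling verify_faces("img1.jpg", "img2.jpg", model_name="ArcFace")["verified"] would then report whether the two (hypothetical) images match.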
Jun 10, 2021 • 4 tweets • 2 min read
Quant UX is a research, usability and prototyping tool to quickly test your designs & get data-driven insights
Figma-Low-Code: use Figma designs directly in Vue.js applications.
The low-code approach drastically reduces the hand-off time between designers and developers, reduces front-end code and ensures that the Figma design stays the single source of truth.
github.com/coqui-ai/TTS
The model reaches state-of-the-art results for similarity with new speakers and speech quality with only 11 speakers in training.
SC-GlowTTS: An Efficient Zero-Shot Multi-Speaker Text-To-Speech Model
3 tools to find what's trending:
Find trending arXiv papers on arxiv-sanity.com; you can sort by category and save papers for later reading
May 22, 2021 • 9 tweets • 3 min read
Why are graphs the future of biomedical research and what is the value of NLP here?
A small case study about:
How to speed up drug discovery with knowledge graphs and discover potential cures for diseases
In this case text mining is used to contextualize knowledge about:
Image Cropping on Twitter: Fairness Metrics, their Limitations, and the Importance of Representation, Design, and Agency
github.com/twitter-resear…
In fall 2020, Twitter users raised concerns that the automated image cropping system on Twitter favored light-skinned over dark-skinned individuals, as well as concerns that the system favored cropping women's bodies instead of their heads.
Did you think bringing your machine learning model to production was the hard part?
What about model drift?
Now MLOps comes into play, but how does it work and what are good tools?
What is:
- Continuous integration (CI)
- Continuous deployment (CD)
- Continuous training (CT)
The full MLOps life cycle
- Data Engineering: Get and clean the data, on a recurring basis if necessary
- Model Engineering: Model training, evaluation, testing, and packaging
- Model Deployment: Integrating the trained model, model serving, and performance monitoring
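To make model drift and continuous training (CT) a bit more concrete, here is a small sketch of one possible retraining trigger: compare accuracy on recent production data with the accuracy measured at deployment, and flag the model once the drop exceeds a tolerance. The function names and the 0.05 tolerance are assumptions for illustration; real setups often monitor input distributions as well.

```python
from statistics import mean


def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and equally long")
    return mean(1.0 if p == y else 0.0 for p, y in zip(predictions, labels))


def needs_retraining(baseline_accuracy, recent_accuracy, tolerance=0.05):
    """Flag drift when recent accuracy falls more than `tolerance` below the baseline."""
    return (baseline_accuracy - recent_accuracy) > tolerance


# Hypothetical numbers: 0.90 accuracy at deployment, then a fresh labelled batch.
recent = accuracy([1, 0, 1, 1], [1, 0, 0, 1])
if needs_retraining(0.90, recent):
    print("Accuracy dropped; trigger the training pipeline")
```

In a pipeline, a check like this would run on a schedule and kick off the data engineering and model engineering steps again when it fires.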
May 16, 2021 • 4 tweets • 2 min read
Note taking apps are like muscle training - you have to do it every day.
How many times I have changed ...
From Evernote to OneNote to Google Keep to Notion and from Roam now to Obsidian
Why?
The big ones like OneNote, Google Keep and Evernote fail because the brain does not work like an index: thoughts are linked associatively. This is where the next generation of note-taking apps shows its strength.
Your open source project is ready for deployment? Documentation is still missing?
Good documentation and its presentation is an art!
A case study with 4 examples on awesome documentation
What makes good documentation?
- No prosaic texts! Choose a practical approach with code snippets
- Good structure and overview: a quick entry point first, then in-depth material
- Good search is everything
- Good code examples
May 13, 2021 • 11 tweets • 6 min read
Where to get data for your next machine learning project?
An overview of 8 amazing resources to accelerate your next project with data!
- Google Datasets
- Big Bad NLP Datasets
- Hugging Face Datasets
- Papers with Code Datasets
- Open Data on AWS
- Awesome Public Datasets
Hugging Face Datasets
Mainly for NLP, but the good news is that Hugging Face is expanding, and we can be sure that they will add datasets for visual machine learning soon!
How to get your dream job in Data Science if you are a career changer?
First you have to sneak around HR and their antiquated methods. This is only possible through contacts or unusual ways.
But what are good ways?
The middleman
Someone with a connection to the company, or someone who works there, who can hand over your application.
May 10, 2021 • 4 tweets • 2 min read
Does BERT Pretrained on Clinical Notes Reveal Sensitive Data?
Large Transformers pretrained over clinical notes from Electronic Health Records (EHR) have afforded substantial gains in performance on predictive clinical tasks.
The cost of training such models, and the necessity of data access to do so, coupled with their utility motivates parameter sharing, i.e., the release of pretrained models such as ClinicalBERT.
May 8, 2021 • 11 tweets • 2 min read
To build a chatbot you need data for your intent classification.
But what if you have too little training data?
Paraphrasing is one option for augmentation
But what is a good paraphrase?
Almost all conditioned text generation models are validated on 2 factors:
1. If the generated text conveys the same meaning as the original context (Adequacy)
2. If the text is fluent / grammatically correct English (Fluency)
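As a rough sketch, the two factors can be treated as filters over generated candidates. The scorers below are crude stand-ins, not real metrics: token overlap as an adequacy proxy and a capitalisation/punctuation check as a fluency proxy. In practice you would plug in model-based scorers; all names and thresholds here are assumptions.

```python
from typing import Callable, Iterable, List


def token_overlap(original: str, candidate: str) -> float:
    """Toy adequacy proxy: Jaccard overlap of lower-cased whitespace tokens."""
    a, b = set(original.lower().split()), set(candidate.lower().split())
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def toy_fluency(text: str) -> float:
    """Toy fluency proxy: 1.0 if the text starts upper-case and ends with . ? or !"""
    stripped = text.strip()
    if not stripped:
        return 0.0
    return 1.0 if stripped[0].isupper() and stripped[-1] in ".?!" else 0.0


def filter_paraphrases(
    original: str,
    candidates: Iterable[str],
    adequacy: Callable[[str, str], float] = token_overlap,
    fluency: Callable[[str], float] = toy_fluency,
    min_adequacy: float = 0.3,
    min_fluency: float = 0.5,
) -> List[str]:
    """Keep only candidates that pass both the adequacy and the fluency threshold."""
    return [
        c for c in candidates
        if adequacy(original, c) >= min_adequacy and fluency(c) >= min_fluency
    ]


kept = filter_paraphrases(
    "How do I reset my password?",
    ["How can I reset my password?", "reset password how", "What is the weather today?"],
)
```

Token overlap penalises good paraphrases that use different words, so it only illustrates where an adequacy score would plug in.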
May 7, 2021 • 8 tweets • 5 min read
How do you create a beautiful interface for your machine learning or data science project?
Handmade from scratch?
Any good tools?
Sure there are incredible tools:
Beautiful ML & DS interfaces
Gradio
Quickly create customizable UI components around your ML models: drag and drop your own images, paste your own text, or record your own voice, and see what the model outputs.
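A minimal sketch of what wrapping a model in Gradio might look like, assuming the gradio package is installed. The predict function is a hypothetical stand-in for a real model, and build_demo is an illustrative helper name.

```python
def predict(text: str) -> str:
    """Stand-in for a real model: reports the word count of the input."""
    return f"{len(text.split())} words"


def build_demo():
    """Wrap predict in a simple text-in, text-out Gradio interface."""
    # Imported lazily so the stand-in model can be used without gradio installed.
    import gradio as gr
    return gr.Interface(fn=predict, inputs="text", outputs="text")
```

Running build_demo().launch() should start a local web UI where you can paste text and see the output.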