(1/4) Taking a break from Twitter-break to post this PSA for #NLP twitter.
Thanks to recent work by @ryandcotterell's group, the Uniform Information Density hypothesis is back in vogue. Hailing from an info-theory background, I've heard variants of this idea & decided to dig up
(2/4) The search took me through the iconoclastic work of Ramon Ferrer-i-Cancho & finally landed me at "Konstanz im Kurzzeitgedächtnis - Konstanz im sprachlichen Informationsfluß?" published in 1980 by Gertraud Fenk-Oczlon. Upon writing to her, this is the reply I received ...
(3/4) Tagging NLP scholars I have recently come to know to amplify this should you deem it prudent and factually correct.
@emilymbender @hadyelsahar @alienelf @KreutzerJulia
It's the weekend!

Maybe you'll have time to read my latest posts.

On the menu:
> Logistic regression
> Confusion matrix
> Binary tree: Gini vs entropy
> Transformers and self-attention
> Convolutional networks

Happy reading!

[Logistic Regression]

See this algorithm in a different light and understand it fully through geometry

#datascience #machinelearning #ia

[Confusion Matrix]

Never again be confused (!) by the confusion matrix, thanks to this very simple trick to remember

#datascience #machinelearning #iA

1. Hello friends.

Today we're going to talk about neural networks, and in particular convolutional neural networks.

We'll focus mainly on the convolution filters that make up the parameters of a #CNN


#datascience #machinelearning #ia
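
To make the idea of a convolution filter concrete, here is a minimal sketch in plain Python. The image, the filter values, and the function name `conv2d_valid` are all made up for illustration; in a real CNN the nine filter weights would be learned during training rather than written by hand.

```python
# Minimal sketch: slide one 3x3 filter over a small grayscale image.
# The image and filter values are invented for illustration.

def conv2d_valid(image, kernel):
    """Cross-correlate image with kernel (no padding, stride 1), as CNN layers do."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Weighted sum of the image patch currently under the filter.
            total = sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            row.append(total)
        output.append(row)
    return output

# 4x4 image with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-written vertical-edge filter; these 9 numbers are the filter's parameters.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = conv2d_valid(image, vertical_edge)
n_params = len(vertical_edge) * len(vertical_edge[0])
print(feature_map, n_params)
```

The same nine weights are reused at every position of the image, which is why a convolution layer needs far fewer parameters than a fully connected one.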
2. This tweet is a chance to review the main principles behind such a neural network.

It's important to understand the inner workings behind all this.
3. To begin with, we can say that the "AI winter" ended thanks to the spectacular progress of the past decade, made possible by CNNs.

It's thanks to their performance that the world became interested in these technologies again
This is a long time coming, given both the long-standing partnership & Microsoft's interest in #AI & natural language.

The only real question is why $msft waited so long. This could have happened 18 months ago for 1/3 the price. Nuance has been around a long time

#NLP #NLG 1/6
Although Nuance is best known for its Dragon software, the initial value here comes from the healthcare relationship that has blossomed since 2019…
Over time, Microsoft will gain additional value by integrating Nuance with Teams, Dynamics, LinkedIn, and Power BI to create a variety of speech-based inputs & interfaces.
We live in an attention #economy & what #Tweetiatricians have known for decades (#Twitter started in 2006): importance of drawing attention to factual science-based information as compared to #misinformation & #disinformation, especially on #vaccines…
You have to be fast & prolific to play catch-up to this. I have 77K tweets and a moderate-sized band of 6K followers interested enough to tolerate my volume, but use of hashtags allows reach across the Twittersphere. Trending hashtags = better for riding the wave of that attention economy
It is unclear if we are now so siloed that tweets are ineffective with anti-vaxx. But not all anti-vaxx are QAnon
@PaulBarba_ held a first @joinClubhouse room with @MuazmaZahid in the Data and AI club. The topic was "How to Curate a Good Dataset for NLP?"

There were a lot of interesting questions asked and at the end of the call lots of interesting people asked for follow up notes...
Below is a thread of the room - topic, intro, why this topic, 4 main tips. I hope this adds value to people and we can do this with more calls in the future. The call itself conveyed a lot more value but I tried to highlight the important bits!
Flagging this for the folks that followed in the call @AiTechDoc @EgboDaniel1 @BabaKirito @gboye_baba @zerotousers @LahijaniAli @talks @yalda2009 @trojkast @JMontoro3 @sroussey content to follow in the thread below
CLIP + StyleGAN + #mylittlepony A thread 🧵starting with @ElvisPresley

"A pony that looks like Elvis Presley"
#AI #art #NLP #ML
CLIP + StyleGAN + #mylittlepony + @Beyonce
CLIP + StyleGAN + #mylittlepony + @billieeilish
The size of #NLP models has increased enormously, growing to millions, or even billions, of parameters, along with a significant increase in the financial cost and carbon emissions. ASAPP Reducing the High Cos...
The cost associated with training large models limits the #AIresearch community's ability to innovate, because a research project often needs a lot of experimentation.
Consider training a top-performing #LanguageModel on the Billion Word benchmark. A single experiment would take 384 GPU days (6 days*64 V100 #GPUs, or as much as $36,000 using AWS on-demand instances)
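
A quick sanity check on the numbers in that tweet, as a small sketch. The three input figures are the ones quoted above; the per-GPU-hour rate is only what those figures imply, not a price taken from an AWS price list.

```python
# Figures quoted in the tweet above.
days = 6
gpus = 64
quoted_cost_usd = 36_000

gpu_days = days * gpus                      # total GPU days for one experiment
gpu_hours = gpu_days * 24                   # same quantity in GPU hours
implied_rate = quoted_cost_usd / gpu_hours  # implied USD per GPU-hour (derived, not a listed price)

print(f"{gpu_days} GPU days = {gpu_hours} GPU hours")
print(f"Implied on-demand price: ${implied_rate:.2f} per GPU-hour")
```

Since a research project usually needs many such runs, the total scales linearly with the number of experiments.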
Introducing Wiki Topic Grapher! 👾🐍🔥

Leverage the power of Google #NLP to retrieve entity relationships from Wikipedia URLs or topics!

+ Get interactive graphs of connected entities
+ Export results w/ ent. types+salience to CSV!


Many cool #SEO use cases! 🔥

+ Research any topic then get entity associations that exist from that seed topic
+ Map out related entities with your product, service or brand
+ Find how well you've covered a specific topic on your website
+ Differentiate your pages!

About the stack, it's 100% #Python! 🐍🔥

+ @GCPcloud Natural Language API
+ PyWikibot
+ Networkx
+ PyVis
+ @Streamlit
+ Streamlit Components ->
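
For readers curious what "entities with types + salience exported to CSV" could look like, here is a rough standard-library sketch. It is not the tool's actual code: the entity names, type labels, and salience scores are invented placeholders rather than real API output, and the real app builds its graphs with Networkx and PyVis instead of a plain edge list.

```python
import csv
import io

# Hypothetical entities for the seed topic "Python"; names, type labels and
# salience scores are invented placeholders, not real API output.
seed = "Python"
entities = [
    {"name": "Guido van Rossum", "type": "PERSON", "salience": 0.42},
    {"name": "programming language", "type": "OTHER", "salience": 0.31},
    {"name": "CPython", "type": "OTHER", "salience": 0.12},
]

# Star-shaped graph: the seed topic linked to each entity, weighted by salience.
edges = [(seed, e["name"], e["salience"]) for e in entities]

def to_csv(rows):
    """Serialize entity rows (name, type, salience) to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["name", "type", "salience"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

csv_text = to_csv(entities)
print(csv_text)
```

The edge list is the kind of structure a graph library could take as input; keeping the CSV export separate means the same rows can feed both the graph and a spreadsheet.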

#DSFthegreatindoors presents

Workshop - Text Classification by Transfer learning with Deep Transformers - Depop @depop
A talk by Oduwa Edo-Osagie @odieED, Data Scientist, Depop

That’s how I started out communicating with my alt universe selves:

Imagine yourself going through a hallway, entering a conference room, à la @MrRobot’s in Season 4, with your alts meeting there.

Start w/ real apps, like reading 3rd page of a book. Learn to harness randomness
When I mentor people into “Better Living”, I always start w/ having 3-4 alts reading 66% to 75% of a book for you, and reporting back.

Later, you offload larger cognitive creative tasks and live “in The Zone” and your multidimensional Self achieves superhuman heights.
The comprehension you get is a sign of cross-dimensional communication, via your Higher Self.

Learn to integrate your Higher Self, cross-dimensional (multiprocess) workflows and have it all put together by your multitude of subconscious autoprocessors.

That’s how I’m a 10X coder.
Jensen Huang, @nvidia CEO just kicked off #GTC2020 with a keynote that covered several groundbreaking announcements and partnerships. The future of #AI and #Nvidia is incredibly exciting. Here's a thread of the (many) key takeaways. Let's get started!🧵👇
The main focus this time was on #AI & high performance computing in the data center & on the edge. At this #GTC2020, Nvidia is releasing 80 new and updated SDKs. CUDA, Nvidia's toolkit for GPU-powered applications, has been downloaded 20M times, 6M in 2020 alone.
#Omniverse, Nvidia's platform for simulation & collaboration, is now in open beta. Using Omniverse, teams can simultaneously work in Blender, Maya, Unreal etc. and view the results rendered in real time in a common interface.
Rapid whole genome (🧬) sequencing (rWGS) is one of the most exciting (and benevolent) collisions of #AI and #genomics I can think of.

rWGS can diagnose a critically ill child in minutes where previously it took years.…
A few years ago, Illumina ($ILMN) and Rady Children's Hospital (@RadyGenomics) collaborated to offer sequencing services for diagnosing critically-ill infants and toddlers.

Roughly 70% of rare diseases are genetic and they can take five years to diagnose.
As sequencing costs dropped and #AI got faster, this collaboration became Project Baby Bear: a pilot study for rWGS's diagnostic yield, clinical utility, and health economics in practice.

Several innovative companies joined Rady's in creating a rapid diagnostic pipeline.
This HIDDEN COMPANY 🤐 is helping improve their customer experience 🛍

It has historically grown by 30% QoQ 📈 and the stock is up 119% 🔥 since its April Low

It reports earnings Thurs EOD - Let’s see what this $5.3B can stomach 💪

Short & Quick Thread 👇
Medallia is a Customer Experience Management tool 👥 that IPOed 1 yr ago

In simple terms, it collects feedback 📃 and signals from customers, processes it with #AI, #ML and #NLP algos 📟

It then delivers insights 🔥 to boost sales, cut churn, remove critical pain points
Some more technical details 👓

collects 💻 and processes insights from and many others

🎞 Its product is able to analyze voice input, messaging platforms, and videos, conduct surveys, and use social listening to transform feedback into bucks 💰
For #Australia psychic reading client: Good question. "Why have so many people in Mind Body Spirit been taken in by QAnon conspiracy?" Higher realms warned covert hypnosis manipulation was rampant in society. I've been speaking out about #NLP for years now. #CovertSubmission
cont. for #Australia psychic reading client: You go back to the way neuro sciences were introduced into mainstream society after being tested on military and select patients. They targeted personal development seminars from the onset. Ask who used NLP on the unsuspecting public?
cont. for #Australia psychic reading client: To get public to be agreeable to policy, they had to find a way of covertly influencing them. This is where the mind manipulation tactics were introduced into job training, education, government sector, media, etc. #CovertSubmission
D1 of #50daysofudacity
I finished up to Lesson 2.19
My notes can be found here for quick reference…
D2 of #50daysofUdacity
I finished up to Lesson 2.25
Also completed the lab assignment for a linear regression model to predict the price of a taxi in New York City
My notes can be found here for quick reference…
D3 of #50daysofudacity
I finished Lesson 2
Also completed the lab assignment for a linear regression model to predict the price of a taxi in New York City
My notes can be found here for quick reference…
1/ #NLP stands for Natural Language Processing. During the #BERT algorithm update, Google introduced a machine learning component to better understand the content. They created a bunch of new metrics to analyze, then used NLP to understand what makes better content. BERT
2/ analyzes the context, entities, and sentiment of the page. The ultimate goal is to make the search engine have a more accurate understanding of the content and its context. Additionally, it recognizes the user's intent and the overall sentiment. By sentiment, we’re
3/ talking about overall emotion for a word, sentence, and page as a whole. The general rule is it would be more challenging to rank a page that has negative sentiment if the top-ranking pages all have a positive sentiment. This is most likely how #Google understands user
Three days participating in the #CVPR2020 conference. Excited about a lot of interesting subjects covered in computer vision: Adversarial Learning, Effective training and inference, representation learning...

Will do a write-up later.
#CVPR20 #computervision
Some preferred papers so far 👇
1. Dynamic Graph Message Passing Networks…
It addresses the problem of modelling long-range dependencies by treating the feature map as a graph of feature-vector nodes and dynamically sampling the neighborhood of a node from the feature graph.
2. Semantic Pyramid for Image Generation…
A generative image model that can leverage the feature space from different semantic levels learned by a pretrained classification network. Many generative applications to play with
It is a source of pride for the Comunidad de Desarrolladores de Argentina to be able to support initiatives like #ConnectDay together with these companies @plataforma5la, @distillerylatam, @revistasg and @clarikagroup 💪
Today is #ConnectDay! At CoDeAr we're happy to be supporting @wtmriodelaplata, @GDGCordobaARG, @gdgriodelaplata on this day of talks and of sharing knowledge as a community. You can join the live stream from here:
The first talk is starting, on #DataScience and #Economía, in the context of #transdisciplinas.
"Here the 14 drugs and the number of articles about the coronavirus that mentioned them:" - okay exact matching of words
"A perception metric can be any noun or adjective. The perception metrics are typically selected by the end user" - okay more exact matching of words
"For this project, we measured the above articles for the words “improvement”, “good”, “effective” and “unique.”" - okay.. more exact matching of words; specifically these four words
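
To show what "exact matching of words" amounts to, here is a small sketch. It is my own reconstruction of the general idea, not the project's code; the four words come from the quote above, while the function name and the sample sentence are invented.

```python
import re
from collections import Counter

# The four "perception metric" words named in the quote above.
PERCEPTION_WORDS = {"improvement", "good", "effective", "unique"}

def count_exact_matches(text):
    """Count tokens that exactly equal one of the perception words (case-insensitive)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t in PERCEPTION_WORDS)

# Invented sample text, not taken from any of the articles.
article = "The drug was not effective. Results were ineffective, and hardly good."
counts = count_exact_matches(article)
print(counts)
```

Note that "not effective" and "hardly good" still count as hits for "effective" and "good", because exact matching ignores negation and context entirely.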
With lockdown in effect, ADEMEC is sharing a useful resource every day, unearthed by the members of its board 🔎⚙️📚 Unroll this tweet to find readings, tools, and other finds available online! Also a chance to share your own? 🤓
[Digital humanities 💾] To start, we invite you to consult Pierre Mounier's [@piotrr70] book, “Les humanités numériques” (2018), which offers, through a critical history of DH, an analysis of the humanities and social sciences today. Available online:
By the same author, together with @marindacos and for a more institutional approach: "Humanités numériques: état des lieux et positionnement de la recherche française dans le contexte international", 2015, here…
To better know where we're putting our 🦶
This is a good opportunity for subject matter experts (virologists, epidemiologists, public health policy experts) to team up with #AI and #NLP researchers to figure out the best (not necessarily coolest/flashiest) ways to accelerate synthesis and research.
Partnership is crucial: too easy for computational researchers to miss crucial context, and super important to get a crisp sense of the most pressing information needs and pain points for scientists doing the research.
Here's a beginner's guide to 5 types of text annotation for #MachineLearning, including:

- Entity annotation
- Entity linking
- Text classification
- Sentiment analysis
- Linguistic annotation

Entity Annotation:

This is the act of locating, extracting, and tagging entities in text. Examples are tagging proper names, locating and labeling key phrases, and identifying parts of speech.
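
As a rough sketch of what the output of entity annotation can look like, the snippet below tags entities as character spans with labels. The tiny lookup table (`GAZETTEER`), the labels, and the sentence are hypothetical; real annotation is done by people or by a trained model, not by string lookup.

```python
# Hypothetical mini-gazetteer standing in for a human annotator or a model.
GAZETTEER = {
    "Ada Lovelace": "PERSON",
    "London": "LOCATION",
}

def annotate_entities(text):
    """Return (start, end, surface, label) for every gazetteer hit in text."""
    spans = []
    for surface, label in GAZETTEER.items():
        start = text.find(surface)
        while start != -1:
            spans.append((start, start + len(surface), surface, label))
            start = text.find(surface, start + 1)
    return sorted(spans)

sentence = "Ada Lovelace was born in London."
annotations = annotate_entities(sentence)
for start, end, surface, label in annotations:
    print(f"[{start}:{end}] {surface} -> {label}")
```

Storing character offsets rather than just the matched strings keeps each tag tied to its exact place in the text, which matters when the same name appears more than once.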
Entity linking:

is the process of connecting named entities to larger repositories of data about them (Wikipedia, for example). This is helpful for disambiguating named entities, and providing further information.
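
A toy sketch of the disambiguation step in entity linking, under loud assumptions: the `KNOWLEDGE_BASE` table and its keyword sets are invented, and picking the candidate with the most keyword overlap is a stand-in for the much richer models that real linkers use.

```python
# Hypothetical knowledge base: one mention, several candidate Wikipedia pages,
# each with a few invented context keywords.
KNOWLEDGE_BASE = {
    "Paris": [
        {"page": "Paris", "keywords": {"france", "seine", "capital"}},
        {"page": "Paris_Hilton", "keywords": {"hotel", "celebrity", "heiress"}},
    ],
}

def link_entity(mention, context):
    """Pick the candidate page whose keywords overlap most with the context words."""
    candidates = KNOWLEDGE_BASE.get(mention, [])
    if not candidates:
        return None  # mention is unknown to this toy knowledge base
    context_words = set(context.lower().split())
    best = max(candidates, key=lambda c: len(c["keywords"] & context_words))
    return "https://en.wikipedia.org/wiki/" + best["page"]

print(link_entity("Paris", "Paris is the capital of France"))
print(link_entity("Paris", "Paris the celebrity heiress"))
```

The same surface string resolves to different pages depending on the surrounding words, which is the disambiguation benefit described above.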
A series of tweets summarizing current major research on #FakeNews & #MisInformation in #India. Major institutes active are @iitbombay, @iiit_hyderabad, & @IIITDelhi. This list is NOT comprehensive (part 1/n)
@iitbombay @iiit_hyderabad @IIITDelhi . @iitbombay has two active labs in this area: CFILT & InfoLab (part 2/n)
@iitbombay @iiit_hyderabad @IIITDelhi InfoLab released a portal dubbed #KauwaKaate #FactChecker…. The checker accepts articles & images, & checks them against a predefined set of India focused fact checking sites (part 3/n)
