π™·πš’πš–πšŠ π™»πšŠπš”πš”πšŠπš›πšŠπš“πšž Profile picture
Professor @Harvard; PI @ai4life_harvard; Co-founder @trustworthy_ml; #AI #ML #Safety; Stanford PhD; MIT @techreview #35InnovatorsUnder35
Apr 12 β€’ 14 tweets β€’ 4 min read
As we increasingly rely on #LLMs for product recommendations and searches, can companies game these models to enhance the visibility of their products?

Our latest work provides answers to this question & demonstrates that LLMs can be manipulated to boost product visibility!

Joint work with @AounonK. More detailsπŸ‘‡ [1/N]https://arxiv.org/abs/2404.07981 @AounonK @harvard_data @Harvard @D3Harvard @trustworthy_ml LLMs have become ubiquitous, and we are all increasingly relying on them for searches, product information, and recommendations. Given this, we ask a critical question for the first time: Can LLMs be manipulated by companies to enhance the visibility of their products? [2/N]
May 2, 2023 β€’ 14 tweets β€’ 5 min read
Regulating #AI is important, but it can also be quite challenging in practice. Our #ICML2023 paper highlights the tensions between Right to Explanation & Right to be Forgotten, and proposes the first algorithmic framework to address these tensions arxiv.org/pdf/2302.04288… [1/N] Image @SatyaIsIntoLLMs @Jiaqi_Ma_ Multiple regulatory frameworks (e.g., GDPR, CCPA) were introduced in recent years to regulate AI. Several of these frameworks emphasized the importance of enforcing two key principles ("Right to Explanation" and "Right to be Forgotten") in order to effectively regulate AI [2/N]
Nov 26, 2022 β€’ 10 tweets β€’ 12 min read
Our group @ai4life_harvard is gearing up for showcasing our recent research and connecting with the #ML #TrustworthyML #XAI community at #NeurIPS2022. Here’s where you can find us at a glance. More details about our papers/talks/panels in the thread below πŸ‘‡ [1/N] Image @ai4life_harvard [Conference Paper] Which Explanation Should I Choose? A Function Approximation Perspective to Characterizing Post Hoc Explanations (joint work with #TessaHan and @Suuraj) -- arxiv.org/abs/2206.01254. More details in this thread [2/N]
Sep 18, 2022 β€’ 13 tweets β€’ 6 min read
One of the biggest criticisms of the field of post hoc #XAI is that each method "does its own thing", it is unclear how these methods relate to each other & which methods are effective under what conditions. Our #NeurIPS2022 paper provides (some) answers to these questions. [1/N] In our #NeurIPS2022 paper, we unify eight different state-of-the-art local post hoc explanation methods, and show that they are all performing local linear approximations of the underlying models, albeit with different loss functions and notions of local neighborhoods. [2/N]
Jun 2, 2022 β€’ 13 tweets β€’ 3 min read
Explainable ML is a rather nascent field with lots of exciting work and discourse happening around it. But, it is very important to separate actual findings and results from hype. Below is a thread with some tips for navigating discourse and scholarship in explainable ML [1/N] Overarching claims: We all have seen talks/tweets/discourse with snippets such as "explanations dont work" or "explanations are the answer to all these critical problems". When we hear such claims, they are often extrapolating results or findings from rather narrow studies. [2/N]
May 19, 2021 β€’ 8 tweets β€’ 3 min read
Excited to share our @AIESConf paper "Does Fair Ranking Improve Outcomes?: Understanding the Interplay of Human and Algorithmic Biases in Online Hiring". We investigate if fair ranking algorithms can mitigate gender biases in online hiring settings arxiv.org/pdf/2012.00423… [1/n] More specifically, we were trying to examine the interplay between humans and fair ranking algorithms in online hiring settings, and assess if fair ranking algorithms can negate the effect of (any) gender biases prevalent in humans & ensure that the hiring outcomes are fair [2/n]
May 7, 2021 β€’ 8 tweets β€’ 6 min read
If you have less than 3 hours to spare & want to learn (almost) everything about state-of-the-art explainable ML, this thread is for you! Below, I am sharing info about 4 of our recent tutorials on explainability presented at NeurIPS, AAAI, FAccT, and CHIL conferences. [1/n] NeurIPS 2020: Our longest tutorial (2 hours 46 mins) discusses various types of explanation methods, their limitations, evaluation frameworks, applications to domains such as decision making/nlp/vision, and open problems explainml-tutorial.github.io/neurips20 @sameer_ @julius_adebayo [2/n]
Apr 25, 2021 β€’ 12 tweets β€’ 3 min read
As I struggled to deal with the impact of COVID on my family members in India, I got delayed by a day for submitting my reviews for a conference & I got a message from a senior reviewer with the blurb below. My humble request to everyone - pls don't say this to anyone ever! [1/n] Image I dont typically share any of my personal experiences on social media. But, I strongly felt that I need to make an exception this time. I am so incredibly hurt, appalled, flabbergasted, and dumbfounded by that blurb. It shows how academia can lack basic empathy! [2/n]
Sep 26, 2020 β€’ 12 tweets β€’ 4 min read
Twitter might seem like a not-so-kind place especially if you are a young student who just had your paper rejected by #NeurIPS2020. You might be seeing all your peers/professors talking about their paper acceptances. Let me shed some light on the reality of the situation [1/N] Twitter (and generally social media) paints a biased view of a lot of situations including this one thechicagoschool.edu/insight/from-t…. Looking at your twitter feed, you might be feeling that everyone else seems to have gotten their papers accepted except for you. That is so not true! [2/N]
Jul 15, 2020 β€’ 5 tweets β€’ 8 min read
Excited to join the team of and contribute to @trustworthy_ml handle. We will be covering the latest developments and research in "Trustworthy ML" regularly. Follow us and don't forget to tag @trustworthy_ml if you want us to tweet about your work. One of the goals of our @trustworthy_ml handle is to provide visibility to the work of researchers who are new to the field. Please RT widely & follow @trustworthy_ml. Don't forget to tag us if you want us to tweet about your work! @black_in_ai @_LXAI @QueerinAI @icmlconf
Jul 15, 2020 β€’ 6 tweets β€’ 2 min read
Want to generate black box explanations that are more stable and are robust to distribution shifts? Our latest #ICML2020 paper provides a generic framework that can be used to generate robust local/global linear/rule-based explanations.
Paper: proceedings.icml.cc/static/paper_f…. Thread ↓ Image Many existing explanation techniques are highly sensitive even to small changes in data. This results in: 1) incorrect and unstable explanations, (ii) explanations of the same model may differ based on the dataset used to construct them.
Dec 31, 2019 β€’ 5 tweets β€’ 1 min read
A recap of my past decade:
1. Started doing research
2. Wrote a bunch of papers and collaborated with some awesome people
3. Got some external recognition for my work e.g., MIT Tech Review 35 Under 35
[1/n] 4. Relocated from India to Bay area
5. Relocated from Bay area to Boston
6. Started and finished my PhD
7. Survived major health situations
8. Accepted my first faculty job (will start on 1/1/2020 - yayy!)
9. Taught my first ever (full fledged) course
[2/n]
Dec 6, 2019 β€’ 4 tweets β€’ 4 min read
Two of our papers just got accepted for oral presentation at AAAI Conference on AI and Ethics (AIES):
1. Designing adversarial attacks on explanation techniques (arxiv.org/pdf/1911.02508…)
2. How misleading explanations can be used to game user trust? (arxiv.org/pdf/1911.06473…) Super grateful to amazing collaborators: @dylanslack20, Sophie Hilgard, @emilycjia, @sameer_, and Osbert Bastani
Nov 25, 2019 β€’ 5 tweets β€’ 1 min read
I will be recruiting PhD students both through Harvard Business School (TOM Unit) and Harvard CS.

Focus areas include but are not limited to:

Machine Learning,
Interpretability,
Fairness,
Causal Inference,
HCI If you are interested in working with me, please apply to Harvard Business School (TOM Unit) and/or Harvard CS and write my name in your statements/application.
Nov 24, 2019 β€’ 11 tweets β€’ 5 min read
It is that time of the year where many aspirants will be applying for grad school and tenure track positions. I just wanted to share some advice that I wish I had known when I was going through these things. [continued below] Applicants are often ridden with self-doubt during these phases. I know I was. If I knew better, I would have trusted myself and my potential much more. So, let me say this out loud - Believe in yourself and your potential as you go through these processes.
Nov 7, 2019 β€’ 4 tweets β€’ 3 min read
Want to know how adversaries can game explainability techniques? Our latest research - "How can we fool LIME and SHAP? Adversarial Attacks on Explanation Methods" has answers: arxiv.org/abs/1911.02508. Joint work with the awesome team: @dylanslack20, Sophie, Emily, @sameer_ @dylanslack20 @sameer_ We show severe vulnerabilities in post hoc explanation techniques such as LIME and SHAP. 1) We find these techniques rely heavily on data points generated by perturbations of input data. 2) Instances generated by perturbation are often out of distribution samples
Oct 14, 2019 β€’ 4 tweets β€’ 6 min read
Interested in working on fair/interpretable ML (#FATML) and reinforcement learning, and its applications to decision making in criminal justice, healthcare, and business?

@s_jagabathula (NYU) and I are looking for postdocs! More details and application: apply.interfolio.com/70045 @s_jagabathula We will be available to meet prospective applicants at INFORMS Annual Meeting 2019. So, if you are attending
@INFORMS2019 and are interested to know more about this opportunity, please fill out the form at bit.ly/2pZjyLI and we will reach out to you.