Ida Momennejad
Principal Researcher @MSFTResearch. I build & evaluate neuro/behavior inspired AI/models of learning, memory, planning. Cohost https://t.co/rgvLjhC3If
Sep 29, 2023 9 tweets 5 min read
Delighted to share our #neurips2023 paper w @grockious @hmd_palangi et al
Evaluating Cognitive Maps & Planning in LLMs with CogEval

We test planning in 8 LLMs.
Failures like hallucinating invalid paths/falling in loops don't support emergent planning.
1/n
arxiv.org/abs/2309.15129



Recently an influx of studies claim emergent cognitive abilities in LLMs & doomers warn of AI planning a takeover.
But can LLMs plan?!
Such claims often lack systematic evaluation involving multiple tasks, control conditions, iterations, stats, etc.
We make 2 contributions.
2/n
May 23, 2021 13 tweets 8 min read
Excited to share new work w @katjahofmann @smdvln @ralgeorgescu @JarekRzepecki Evelyn Zuniga & colleagues!

Navigation Turing Test (NTT): Learning to Evaluate Human-Like Navigation
arxiv.org/abs/2105.09637
Accepted at ICML

We propose a method to evaluate human-like navigation.
🧵1/n
Many algorithms pass benchmarks, like navigation from a given location to a goal location in 3D games.

But passing benchmarks doesn't guarantee human-like navigation behavior, or cognitively or neurally plausible human-like algorithms/representations. This matters whether... 2/n
Sep 26, 2019 7 tweets 7 min read
Thrilled to share new work w Stacey Sinclair & @profcikara! Computational Justice: Simulating Structural Bias and Interventions.
We ran agent-based simulations of structural bias, with params set from studies.
Then we simulated & compared different interventions. 1/n
biorxiv.org/content/biorxi…
We distinguish interpersonal bias (sexism) from structural bias, and allow social learning. We exclude gender differences in interpersonal bias to isolate the effect of structural bias. Unequal gender ratios => gender differences in # sexist comments received & an increase in p(sexism). 2/n