How to get URL link on X (Twitter) App
https://twitter.com/lennypruss/status/1909652555677737085
AI agents execute complex, multi-step retrieval workflows. They formulate detailed questions, analyze intermediate results, refine their approach, and synthesize information across disparate sources, a far cry from the traditional single-turn query=>response pattern.
https://twitter.com/jxnlco/status/1789364394704380391
1) Synthetic data can be used both for evaluation and to improve the retrieval function with fine-tuning. In this example, we trained a cross-encoder model using synthetic data. blog.vespa.ai/improving-text…
https://twitter.com/vboykis/status/1784215456552784278
Largely due to LLMs and synthetic data, you can get decent text retrieval performance. Then your largest problem is going to be infrastructure cost, crawling and dealing with spam which is also easier now. Crawling is the hardest problem where there are few good alternatives.
https://twitter.com/OpenAI/status/1486047258499948544
https://twitter.com/Nthakur20/status/1384555249688580096
2) Dense retrievers (aka vector search) outperform BM25 significantly in data-rich tasks, but generalization suffers when they are applied out of domain.