Discover and read the best of Twitter Threads about #SemanticSearch

Most recent (2)

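AI Prompt Engineering Series, Part 5: Building a Knowledge-Base-Driven AI Bot

1/12
Part 2 covered how to get ChatGPT to write long-form text, which approached the problem from the generation side.
But what if we already have a knowledge base on hand? How do we build a question-answering system similar to #chatpdf on top of it? That is today's topic 👇🧵

#AI #ChatGPT #提示工程 #promptengineering #SemanticSearch
2/12
This post leans toward system design and is aimed at product and engineering people working in AI.

If you haven't read the first four parts, you can catch up here:
3/12
Suppose our scenario is building an automated AI customer service system for a company, and we have the company's complete knowledge base.

We know ChatGPT has a context limit of 4096 tokens, so the whole knowledge base cannot be fed to it in one go. We therefore need a way to "prune" the data, and the pruning method is to link the user's "question" to the "passage contexts" in the knowledge base that are likely to be relevant.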
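The pruning step described above can be sketched roughly as follows. This is a minimal illustration, not the thread author's implementation: it assumes a bag-of-words cosine similarity as a stand-in for real semantic embeddings, and it approximates token counts by word counts. The names `select_passages`, `token_budget`, and the sample knowledge base are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; a crude stand-in for a real tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def select_passages(question, passages, token_budget):
    """Rank passages by similarity to the question, then keep the best
    ones that fit in the budget (word count approximates token count)."""
    q = Counter(tokenize(question))
    scored = [(cosine(q, Counter(tokenize(p))), p) for p in passages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    chosen, used = [], 0
    for score, passage in scored:
        cost = len(tokenize(passage))
        # Skip passages with no overlap or that would exceed the budget.
        if score > 0 and used + cost <= token_budget:
            chosen.append(passage)
            used += cost
    return chosen


# Hypothetical knowledge base for a customer service bot.
knowledge_base = [
    "Refunds are issued within 14 days of a return request.",
    "Our office is closed on public holidays.",
    "To request a refund, open the order page and choose Return.",
]
question = "How do I get a refund?"
context = select_passages(question, knowledge_base, token_budget=30)
prompt = (
    "Answer using only this context:\n"
    + "\n".join(context)
    + "\n\nQuestion: "
    + question
)
```

Only the selected passages are placed in the prompt, so the request stays within the context limit no matter how large the knowledge base grows. A production system would likely swap the word-overlap score for embedding-based similarity and use the model's own tokenizer to count tokens.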
Read 12 tweets
A scientific paper from Google gives some interesting insights into how Google today probably divides search queries into different thematic areas. Here is my summary of the paper in this thread.🧵 #seo #semanticsearch #google
The document "Improving semantic topic clustering for search queries with word co-occurrence and bipartite graph co-clustering" presents two methods that Google uses to contextually classify search queries. So-called lift scores play a central role in word co-occurrence clustering.
"Wi" in the formula stands for all terms that are closely related to the root of the word, such as misspellings, plural, singular or synonyms.
"a" can be any user interaction such as searching for a specific search term or visiting a specific page.
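The formula itself is not reproduced in the text of this thread, so the sketch below assumes the standard definition of lift: the probability of seeing a word given the action, divided by the overall probability of seeing the word, P(w | a) / P(w). The function name `lift_scores` and the toy session data are hypothetical, and the grouping of misspellings, plurals, and synonyms into a single "Wi" is left out for brevity.

```python
def lift_scores(sessions, action):
    """sessions: list of (set_of_words, set_of_actions) pairs, one per user.
    Returns {word: P(word | action) / P(word)} under the assumed definition."""
    total = len(sessions)
    with_action = [words for words, actions in sessions if action in actions]
    if not with_action:
        return {}
    vocabulary = set().union(*(words for words, _ in sessions))
    scores = {}
    for word in vocabulary:
        # Share of all users whose queries contain the word.
        p_word = sum(word in words for words, _ in sessions) / total
        # Share of users who took the action and whose queries contain the word.
        p_word_given_action = sum(word in words for words in with_action) / len(
            with_action
        )
        scores[word] = p_word_given_action / p_word
    return scores


# Toy data: each entry is one user's query words and the actions they took.
sessions = [
    ({"cheap", "flights"}, {"visit_travel_page"}),
    ({"flights", "hotel"}, {"visit_travel_page"}),
    ({"cheap", "shoes"}, set()),
    ({"shoes"}, set()),
]
scores = lift_scores(sessions, "visit_travel_page")
```

A score above 1 means the word shows up more often among users who took the action than among users in general, which is what would make it a candidate for that action's topic. In this toy data "flights" scores 2.0, "cheap" scores 1.0 (no association), and "shoes" scores 0.0.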
Read 14 tweets
