Thread by @Lin_Thesis on Thread Reader App

#TwitterAPI #爬虫

【How to Automatically Get Twitter Data with Python】

(Sharing my experience with scraping Twitter data automatically in Python)

This may be helpful for those who know a little Python but are not specialized programmers.

Twitter is the most commonly used social media platform among Web3 users. With plenty of Twitter data, we can do many interesting things, such as automatically tracking a KOL's new followings, or analyzing a project's early followers and how word of it spread. Here I'm going to share my first-hand experience of automating this with Python:
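To make the "track a KOL's new followings" idea concrete, here is a minimal sketch. It assumes you already have two snapshots of the account IDs a KOL follows, taken at different times (how you collect them is covered by the two ways in this thread). The function name `new_followings` and the sample IDs are made up for illustration.

```python
def new_followings(previous_ids, current_ids):
    """Return the IDs present in the current snapshot but not in the previous one."""
    # A set difference finds the newly followed accounts; sorting makes the
    # result deterministic regardless of the order of the snapshots.
    return sorted(set(current_ids) - set(previous_ids))


if __name__ == "__main__":
    yesterday = [101, 102, 103]      # hypothetical snapshot from an earlier run
    today = [102, 103, 104, 105]     # hypothetical snapshot from the latest run
    print(new_followings(yesterday, today))
```

Running the comparison on a schedule (say, once a day) and saving each snapshot to disk would be enough for a simple tracker.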

【Way 1: Twitter API】

Description: Twitter's official interface for getting data. You can apply for Twitter API access at developer.twitter.com/en/docs/twitte… and use the Tweepy library (tweepy.org) to call the API from Python.

(For users in mainland China: I suggest buying a temporary UK or US phone number on Taobao and registering a temporary Twitter account with it, because applications tied to mainland Chinese phone numbers are rarely approved.)

Pros: Simple, Fast, Stable

Cons: (Possibly fatal!) The request rate is quite limited (900 requests per 15 minutes), which makes almost any automated task that needs large amounts of data hard to carry out.
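To get a feel for what the rate limit means in practice, here is a rough sketch, not a definitive implementation. `estimate_wait_minutes` is a made-up helper that uses the 900 requests per 15 minutes figure quoted above; the real limit depends on the endpoint and on your access level. `fetch_following_ids` shows one way to page through a user's followings with Tweepy's `Client` and `Paginator`, assuming Tweepy v4, a valid bearer token, and an access level that includes this endpoint.

```python
import math

RATE_LIMIT = 900      # requests per window, the figure quoted in this thread
WINDOW_MINUTES = 15   # length of one rate-limit window


def estimate_wait_minutes(total_requests, limit=RATE_LIMIT, window=WINDOW_MINUTES):
    """Minimum minutes spent waiting on rate-limit resets for a job."""
    if total_requests <= 0:
        return 0
    windows_needed = math.ceil(total_requests / limit)
    # Requests in the first window can be sent right away, so waiting
    # only starts once that window's quota is used up.
    return (windows_needed - 1) * window


def fetch_following_ids(bearer_token, user_id):
    """Collect IDs of the accounts `user_id` follows (needs network and credentials)."""
    import tweepy  # imported lazily so the estimator works without Tweepy installed

    # wait_on_rate_limit=True makes Tweepy sleep until the window resets
    # instead of raising an error when the limit is hit.
    client = tweepy.Client(bearer_token=bearer_token, wait_on_rate_limit=True)
    ids = []
    for page in tweepy.Paginator(client.get_users_following, user_id, max_results=1000):
        for user in page.data or []:
            ids.append(user.id)
    return ids


if __name__ == "__main__":
    print(estimate_wait_minutes(100_000))
```

Under these assumptions, a job that needs 100,000 requests spends at least 1,665 minutes (roughly 28 hours) just waiting for the limit to reset.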

【Way 2: Twitter Scraper】

Description: Access the data by automatically controlling your browser to mimic a real person's actions, then extract the data from the page HTML. However, because Twitter keeps updating its anti-scraping measures, most public GitHub repositories for Twitter scrapers no longer work.

Scweet (https://github.com/Altimis/Scweet) is a repository I have personally tested and found usable, but it does not run out of the box and still needs quite a few adjustments to work on your computer.

Pros: No limits, Customizable

Cons: Complex, comparatively Slow, Unstable (depends on your network connection), Against the Twitter Terms
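The general shape of this approach can be sketched as below. This is not how Scweet works internally, only an illustration under my own assumptions: `fetch_page_html` drives Chrome through Selenium and returns the rendered page source, and `extract_handles` pulls profile handles out of that HTML using only the standard library. The link pattern (`/handle`), the `RESERVED` list, and the sample HTML are all hypothetical; Twitter's real markup changes often, which is exactly why scrapers break.

```python
import re
import time
from html.parser import HTMLParser

# Assumed pattern: a profile link is a bare "/handle" path (1-15 word characters).
HANDLE_RE = re.compile(r"^/([A-Za-z0-9_]{1,15})$")
# Hypothetical list of site paths that look like handles but are not profiles.
RESERVED = {"home", "explore", "notifications", "messages", "settings", "search"}


class HandleExtractor(HTMLParser):
    """Collect unique handles from <a href="/handle"> links, in page order."""

    def __init__(self):
        super().__init__()
        self.handles = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        match = HANDLE_RE.match(href)
        if match:
            handle = match.group(1)
            if handle not in RESERVED and handle not in self.handles:
                self.handles.append(handle)


def extract_handles(html):
    """Return the handles found in an HTML string."""
    parser = HandleExtractor()
    parser.feed(html)
    return parser.handles


def fetch_page_html(url, wait_seconds=5):
    """Open `url` in Chrome and return the rendered HTML (needs Selenium and Chrome)."""
    from selenium import webdriver  # imported lazily; only needed for live runs

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        time.sleep(wait_seconds)  # crude wait for the page's scripts to render
        return driver.page_source
    finally:
        driver.quit()


if __name__ == "__main__":
    sample = '<a href="/alice">A</a><a href="/alice/status/1">t</a><a href="/home">H</a>'
    print(extract_handles(sample))
```

Keeping the parsing step separate from the browser step means you can test the parser on saved HTML files without opening a browser each time.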

If you find this helpful, please like and retweet! I will share more relevant experience if there are enough retweets~
