Shubham Saboo

@Saboo_Shubham_

May 9, 2024 • 6 tweets • 2 min read

Document OCR in 90+ languages using Python (100% free and works offline):

For document OCR, we'll use Surya, an open-source toolkit.

Python toolkit that does:
• OCR in 90+ languages
• Line-level text detection in any language
• Layout analysis (table, image detection)
• Reading order detection
• Outperforms Tesseract, on par with Google Cloud Vision

1. Install the Python library

Run the following command from your terminal: 'pip install surya-ocr'. You'll need Python 3.9+ and PyTorch.
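Before installing, it can help to confirm the two prerequisites named above. The sketch below is illustrative and not part of Surya: the helper name `missing_prerequisites` is made up here, and it only checks the Python version and whether `torch` is importable. The package name on PyPI is `surya-ocr`.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 9)  # minimum version stated in the thread


def missing_prerequisites(python_version=None, required_modules=("torch",)):
    """Return a list of human-readable problems; an empty list means ready.

    Hypothetical helper for illustration, not part of the Surya package.
    """
    version = tuple(python_version or sys.version_info[:2])
    problems = []
    if version < MIN_PYTHON:
        problems.append(
            f"Python {version[0]}.{version[1]} found, but 3.9+ is required"
        )
    for name in required_modules:
        # find_spec reports whether a module is importable without importing it
        if importlib.util.find_spec(name) is None:
            problems.append(f"module '{name}' is not installed")
    return problems


if __name__ == "__main__":
    issues = missing_prerequisites()
    if issues:
        print("Fix these first:", *issues, sep="\n  - ")
    else:
        print("Ready. Install Surya with: pip install surya-ocr")
```

If the list comes back empty, the environment meets the stated requirements and the install command should work.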

2. Try out the OCR (text detection) boilerplate code
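The original snippet did not survive in this copy of the thread, so here is a minimal sketch of what it likely looked like. It assumes the import paths and the `run_ocr` call from the Surya README around the time of the thread (May 2024); newer releases have reorganized these modules, so check the README for your installed version. The helper `predictions_to_text` and the stand-in data at the bottom are illustrative only, and assume each result exposes `text_lines` whose items have a `text` attribute.

```python
from types import SimpleNamespace


def run_surya_ocr(image_path, langs=("en",)):
    """Run OCR on one image. ASSUMPTION: module paths match the Surya
    README from around May 2024 and may differ in other versions."""
    # Imports are local so this file still loads where surya-ocr is absent.
    from PIL import Image
    from surya.ocr import run_ocr
    from surya.model.detection import segformer
    from surya.model.recognition.model import load_model
    from surya.model.recognition.processor import load_processor

    image = Image.open(image_path)
    det_processor, det_model = segformer.load_processor(), segformer.load_model()
    rec_model, rec_processor = load_model(), load_processor()
    # run_ocr takes a list of images and one language list per image
    return run_ocr([image], [list(langs)], det_model, det_processor,
                   rec_model, rec_processor)


def predictions_to_text(predictions):
    """Join the recognized lines into one string per page (hypothetical helper)."""
    return ["\n".join(line.text for line in page.text_lines)
            for page in predictions]


# Stand-in data shaped like the assumed result, so the helper runs offline.
fake_page = SimpleNamespace(
    text_lines=[SimpleNamespace(text="Hello"), SimpleNamespace(text="World")]
)
print(predictions_to_text([fake_page])[0])
```

With Surya installed, `predictions_to_text(run_surya_ocr("page.png"))` would return the page text, where `page.png` is a placeholder path. The first call downloads the model weights; after that it runs without internet.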

3. Try out the built-in interactive demo with Streamlit.

Install Streamlit and run the following command in your terminal: 'surya_gui'
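If you would rather start the demo from a script, the sketch below looks for the `surya_gui` command on your PATH and launches it. The helper names `find_cli` and `launch_gui` are made up for illustration; the only thing taken from the thread is the command name itself.

```python
import shutil
import subprocess


def find_cli(name="surya_gui"):
    """Return the full path of a command on PATH, or None if it is missing."""
    return shutil.which(name)


def launch_gui(name="surya_gui"):
    """Start the demo and wait for it to exit; raise if the command is missing."""
    path = find_cli(name)
    if path is None:
        raise FileNotFoundError(
            f"'{name}' not found on PATH; install surya-ocr and streamlit first"
        )
    # Blocks until the Streamlit process is stopped (for example with Ctrl+C)
    return subprocess.run([path], check=False).returncode
```

Calling `launch_gui()` is equivalent to typing the command in a terminal, with a clearer error when the install step was skipped.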

🌟 Support the opensource project: github.com/VikParuchuri/s…

8000+ readers receive my AI newsletter every day and you are not one of them. Join @_unwind_ai to stay on top of the latest AI developments and tools: unwindai.substack.com
