A list of the most useful #Python libraries you can use for #SEO right now. 🐍

This updated thread covers the main libraries for #DataScience and #NLP that you should consider.

Use them in your workflow! 🧵
NumPy & Pandas: the foundations of data analysis. Just learn them.

Without these 2 libraries, you cannot do Data Science at all. Good knowledge of Pandas can get you quite far.
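To give an idea of how far Pandas alone can take you, here is a minimal sketch that aggregates query-level rows by page. The column names and numbers are made up for illustration; they only mimic the shape of a Search Console export.

```python
import pandas as pd

# Hypothetical query-level data (invented values, illustrative column names).
rows = [
    {"page": "/blog/a", "query": "python seo", "clicks": 30, "impressions": 400},
    {"page": "/blog/a", "query": "seo libraries", "clicks": 10, "impressions": 100},
    {"page": "/blog/b", "query": "pandas tutorial", "clicks": 5, "impressions": 500},
]
df = pd.DataFrame(rows)

# Sum clicks and impressions per page, then derive the click-through rate.
by_page = df.groupby("page", as_index=False)[["clicks", "impressions"]].sum()
by_page["ctr"] = by_page["clicks"] / by_page["impressions"]

# Put the pages with the most clicks first.
by_page = by_page.sort_values("clicks", ascending=False).reset_index(drop=True)
print(by_page)
```

The same groupby-then-sort pattern works for queries, countries, or devices once the data is in a DataFrame.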
Advertools: the best SEM library out there.

It’s very useful for crawling, log file analysis, analyzing SERPs and querying the Knowledge Graph.

The ideal Swiss Army knife you need in your arsenal.

advertools.readthedocs.io/en/master/
Ecommercetools: The ideal package for analyzing eCommerce data and getting access to some useful NLP functions.

It’s a rare jewel in your collection that is very handy for technical SEO and e-commerce as well.

pypi.org/project/ecomme…
Requests: Make HTTP requests via Python, essential for web scraping.

Sure, there are alternatives, but you should still learn it. It's very important and a lot of your initial work will require this library.

pypi.org/project/reques…
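A minimal sketch of what working with Requests tends to look like. The `fetch_html` helper, the User-Agent string, and the example.com URL are all hypothetical; the second part builds a request without sending it, just to show how query parameters get encoded.

```python
import requests


def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Download a page and return its HTML, raising on HTTP errors."""
    # Illustrative User-Agent; use one that identifies your own script.
    headers = {"User-Agent": "seo-script/0.1 (illustrative)"}
    response = requests.get(url, headers=headers, timeout=timeout)
    response.raise_for_status()  # turn 4xx/5xx responses into exceptions
    return response.text


# Prepare (but do not send) a GET request to inspect the final URL.
prepared = requests.Request(
    "GET", "https://example.com/search", params={"q": "seo", "page": 2}
).prepare()
print(prepared.url)
```

Setting a timeout and calling `raise_for_status()` is a sensible default for scraping scripts, since a hanging or failed request otherwise goes unnoticed.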
urllib: for working with URLs. It should be part of your arsenal.

Take some time to study all the options and possible use cases.

docs.python.org/3/library/urll…
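A short sketch of the most common use cases, using only `urllib.parse` from the standard library. The URLs are made up for illustration.

```python
from urllib.parse import parse_qs, urljoin, urlparse

url = "https://example.com/blog/post?utm_source=twitter&page=2#comments"

# Split the URL into its components.
parts = urlparse(url)
print(parts.netloc)
print(parts.path)
print(parts.fragment)

# Turn the query string into a dict of lists (a key can repeat in a URL).
params = parse_qs(parts.query)
print(params)

# Resolve a relative link against the page it was found on.
absolute = urljoin("https://example.com/blog/post", "../about")
print(absolute)
```

Resolving relative links with `urljoin` is especially handy when you extract `href` values during a crawl.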
BeautifulSoup: a library to extract data from HTML/XML files, used in combination with scraping libraries to convert data into Python objects.

One of the first ones you’ll probably learn in your Python journey.

crummy.com/software/Beaut…
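Here is a minimal sketch of extracting data from HTML with BeautifulSoup. The HTML is an inline, made-up page so the example runs without any network access; in practice the markup would come from a scraping library.

```python
from bs4 import BeautifulSoup

# Made-up page used only for illustration.
html = """
<html>
  <head><title>Example page</title></head>
  <body>
    <h1>Main heading</h1>
    <a href="/about">About</a>
    <a href="https://example.com/contact">Contact</a>
  </body>
</html>
"""

# "html.parser" is the parser bundled with Python, so no extra install is needed.
soup = BeautifulSoup(html, "html.parser")

title = soup.title.get_text(strip=True)
h1 = soup.find("h1").get_text(strip=True)

# Collect the href of every anchor that actually has one.
links = [a["href"] for a in soup.find_all("a", href=True)]

print(title, h1, links)
```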
Scrapy: the absolute peak of scraping.

It's hard to find anything better, even though the setup may be hard.

You can carry out almost any scraping task with this library.
Matplotlib/Seaborn/Plotly: you need some sort of visualization and these libs are here to help you.

You can start with Seaborn which is easier to use. DataViz is an important topic and you should value it.
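A minimal sketch of a first chart. It uses plain Matplotlib rather than Seaborn to keep the dependencies small, and the pages and click counts are invented. The `Agg` backend is selected only so the script runs without a display.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, no display required
import matplotlib.pyplot as plt

# Made-up data for illustration.
pages = ["/blog/a", "/blog/b", "/blog/c"]
clicks = [120, 80, 45]

fig, ax = plt.subplots(figsize=(6, 3))
ax.bar(pages, clicks)
ax.set_xlabel("Page")
ax.set_ylabel("Clicks")
ax.set_title("Clicks per page (made-up data)")
fig.tight_layout()

# Render to an in-memory PNG; fig.savefig("clicks.png") would write a file instead.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```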
NLTK/spaCy: work with human language to analyze text data and get insights into the nuances of our language.

This is necessary to get your hands dirty with text data.

The latter can be used to recognize entities and parts of speech.
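As a small, hedged sketch of getting your hands dirty with text, the snippet below counts the most frequent bigrams in a few made-up titles with NLTK. It uses a regex tokenizer on purpose, so no NLTK data download is needed; entity recognition with spaCy requires downloading a language model first and is not shown here.

```python
from nltk import FreqDist
from nltk.tokenize import RegexpTokenizer
from nltk.util import ngrams

# Made-up titles for illustration.
titles = [
    "Best Python libraries for SEO",
    "Python libraries for data analysis",
    "SEO tips for publishers",
]

# Split on runs of word characters; simple, but needs no downloaded resources.
tokenizer = RegexpTokenizer(r"\w+")

bigram_counts = FreqDist()
for title in titles:
    tokens = tokenizer.tokenize(title.lower())
    bigram_counts.update(ngrams(tokens, 2))

top_bigrams = bigram_counts.most_common(2)
print(top_bigrams)
```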
Querycat: few functions but good quality thanks to association rule mining and BERT.

It's one of my favorite libraries, but the installation may not be immediate.

It's useful for visualizing losses in impressions over time.

github.com/jroakes/queryc…
Sklearn: A staple for Machine Learning.

I don't think you really need it, but it's one of the first libs you will encounter.

scikit-learn.org/stable/
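If you do want a first taste of it, a minimal sketch is grouping keywords by shared vocabulary with TF-IDF and k-means. The keywords are made up, and choosing two clusters is an assumption that only works because this toy list has two obvious topics.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Made-up keywords covering two unrelated topics.
keywords = [
    "buy running shoes",
    "cheap running shoes",
    "running shoes sale",
    "python pandas tutorial",
    "learn python pandas",
    "pandas python guide",
]

# Represent each keyword as a TF-IDF vector over its words.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(keywords)

# n_clusters=2 is an assumption for this toy data; random_state keeps runs repeatable.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(X)

for keyword, label in zip(keywords, labels):
    print(label, keyword)
```

This only groups keywords that share words; it knows nothing about meaning, which is where embedding-based approaches come in.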
Transformers: Pretrained models to handle a wide range of tasks. Essential for NLP!

This library is crucial for the most advanced tasks and quite reliable too. I highly suggest you check my other thread:

sentence_transformers: Python framework for state-of-the-art sentence, text, and image embeddings.

Use it for keyword clustering and other text-related tasks. It's one of my most used libraries right now.

sbert.net
Trafilatura: download, parse and scrape web pages.

If you work with content, look no further.

Cleaning the HTML elements of a page is overrated, don't waste your life on it!

trafilatura.readthedocs.io/en/latest/
Streamlit/Dash: interactive web applications.

Useful for prototyping and communicating.

Streamlit is one of the SEO community's favorite solutions.
Typer: create apps that you can run from your command line.

Extremely powerful for personal uses and for running local scripts.

A game-changer for automating your workflow.

typer.tiangolo.com
networkx: the must-have graph theory library.

I recommend you learn it once you have mastered the basics.

Graph Theory is of great importance for analysts who want to level up their game.

More on this in future threads.

networkx.org
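To make this concrete, here is a minimal sketch that models internal links as a directed graph and computes a few basic metrics. The URLs and links are made up for illustration.

```python
import networkx as nx

# Hypothetical internal links as (source URL, target URL) pairs.
internal_links = [
    ("/", "/blog"),
    ("/", "/about"),
    ("/blog", "/blog/post-1"),
    ("/blog", "/blog/post-2"),
    ("/blog/post-1", "/blog/post-2"),
    ("/blog/post-2", "/"),
]

# A directed graph keeps the difference between linking and being linked to.
graph = nx.DiGraph()
graph.add_edges_from(internal_links)

in_degree = dict(graph.in_degree())    # internal links each URL receives
out_degree = dict(graph.out_degree())  # internal links each URL gives
betweenness = nx.betweenness_centrality(graph)  # URLs sitting on many shortest paths

most_linked = max(in_degree, key=in_degree.get)
print(most_linked, in_degree[most_linked])
```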
searchconsole: Use this library to import your data from the GSC API.

It's easy to set up and it's one of the most used libraries in my workflow.

github.com/joshcarty/goog…
BERTopic: one of my most used NLP libraries and for good reasons. I dedicated an entire thread to the topic:

scattertext: library for finding distinguishing terms in corpora and displaying them in an interactive HTML scatter plot.

A short example from the official docs: (github.com/JasonKessler/s…).
openpyxl: if you have to work with Excel data and create spreadsheets.

There are other libraries but I prefer to use this one. It's quite nice and it works well for most of the tasks.

openpyxl.readthedocs.io/en/stable/
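A minimal sketch of writing and reading a spreadsheet with openpyxl. The sheet name and rows are made up, and the workbook is saved to an in-memory buffer only to keep the example self-contained.

```python
import io

from openpyxl import Workbook, load_workbook

# Create a workbook and fill the first sheet row by row.
wb = Workbook()
ws = wb.active
ws.title = "Keywords"
ws.append(["keyword", "clicks"])
ws.append(["python seo", 30])
ws.append(["seo libraries", 10])

# Save to memory; wb.save("report.xlsx") would write a file to disk instead.
buffer = io.BytesIO()
wb.save(buffer)
buffer.seek(0)

# Read the data back as plain tuples of cell values.
loaded = load_workbook(buffer)
rows = list(loaded["Keywords"].iter_rows(values_only=True))
print(rows)
```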
Start with scraping and data analysis.

Then, you can move to NLP libraries and study topics like NER and Clustering.
Sticking to the mainstream libraries gives you access to "better" documentation.

Still, my suggestion is to try alternatives and always look for new opportunities across the web.

Be sure to always do your research: you could find the perfect library for your needs.
Follow me for threads, tips, and case studies (coming soon) about SEO, content, and Python/data.

If you liked this thread, consider liking and retweeting it!🧵
I offer short consultancies and full freelancing for publishers and B2C content.

bookk.me/marcogiordano
