How can you apply #Python for #SEO to get outstanding results?

I will show you some examples of what you can do without spending money.

This thread is all about Python and quick wins and insights 🧵
Disclaimer: I am not listing source code here, I will make separate threads later.

The goal of this thread is to give you the blueprint and the direction.

And remember, Analytics is different.

Scripting alone is not analysis, and whoever tells you the opposite... is lying.

I recommend the following framework:
Scraping - It's no secret that Python can be used for this purpose.

There are many types of scraping projects; the most important ones revolve around sitemaps.

You can get a lot of information from sitemaps alone because they usually store dates (the lastmod field).

And this means...
...that you can often analyze how much a website publishes and how it is structured.

Sitemaps can be found by looking at the robots.txt file, though there are exceptions.

Analyze them to get an idea of how much content you might need for a project.
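
A minimal sketch of the sitemap idea, using only the standard library. It assumes you have already downloaded the robots.txt body and the sitemap XML as strings; the function names are made up for illustration. Keep in mind that lastmod is a modification date, so treat the counts as an approximation of publishing activity.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemaps_from_robots(robots_txt: str) -> list[str]:
    """Return the sitemap URLs declared in a robots.txt body."""
    urls = []
    for line in robots_txt.splitlines():
        # Split on the first colon only, so the URL's own colons survive.
        key, _, value = line.partition(":")
        if key.strip().lower() == "sitemap" and value.strip():
            urls.append(value.strip())
    return urls


def monthly_publishing(sitemap_xml: str) -> Counter:
    """Count URLs per month (YYYY-MM) using each entry's <lastmod>."""
    root = ET.fromstring(sitemap_xml)
    months = Counter()
    for url in root.findall("sm:url", SITEMAP_NS):
        lastmod = url.findtext("sm:lastmod", default="", namespaces=SITEMAP_NS).strip()
        if len(lastmod) >= 7:  # W3C datetime values start with YYYY-MM
            months[lastmod[:7]] += 1
    return months
```

Entries without a lastmod are skipped, which is one of the exceptions you will meet in practice.
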
For large publishers or news websites you can scrape RSS feeds.

This is super underrated but you can do crazy stuff if you combine 1 month of data for 6 publishers.

This can be used to get fresh news from foreign sources too...

Sometimes the only limitation is the RSS format itself.
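
Here is a rough sketch of the RSS part, again with the standard library only. It assumes a plain RSS 2.0 feed already downloaded as a string (Atom feeds use different tags and are not handled), and parse_rss_items is a hypothetical helper name. Run it on each publisher's feed and concatenate the lists to build your dataset.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime


def parse_rss_items(rss_xml: str, source: str) -> list[dict]:
    """Extract title, link and publication date from an RSS 2.0 feed."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        pub_date = item.findtext("pubDate")
        try:
            published = parsedate_to_datetime(pub_date) if pub_date else None
        except (TypeError, ValueError):
            published = None  # not every feed uses a valid RFC 822 date
        items.append({
            "source": source,  # label so several publishers can be merged
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "published": published,
        })
    return items
```
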
Text analysis - Python is king when it comes to extracting insights from text data.

SEO is largely based on this type of data, lucky us.

There are many things you can do, the only limit is your imagination (and computational power).
Scrape ranking pages, extract their text and get the entities via Google NLP API.

What you learn:

- Most used entities, include them in your copy
- What a machine can recognize
- Terms you didn't even mention

You can use spaCy too, but Google's API is the official one.
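
A small sketch of what you can do once the entities are extracted. It assumes you already have a list of entity names per ranking page (from the Google NLP API, spaCy or anything else; that step is not shown), and both function names are hypothetical. The check against your copy is a naive substring match, good enough for a first pass.

```python
from collections import Counter


def entity_coverage(pages: dict[str, list[str]]) -> Counter:
    """Count in how many ranking pages each entity appears (case-insensitive)."""
    coverage = Counter()
    for entities in pages.values():
        # A set avoids counting the same entity twice for one page.
        coverage.update({entity.lower() for entity in entities})
    return coverage


def missing_entities(coverage: Counter, my_copy: str, min_pages: int = 2) -> list[str]:
    """Entities found on at least `min_pages` pages but absent from your text."""
    text = my_copy.lower()
    return sorted(
        entity for entity, n_pages in coverage.items()
        if n_pages >= min_pages and entity not in text
    )
```

`coverage.most_common(20)` gives you the most used entities; `missing_entities` gives you the terms you didn't even mention.
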
Scrape the content from a website and identify the topics.

LDA (Latent Dirichlet Allocation) is a technique that groups these pages into topics and helps you find what they are about.

BERTopic is another candidate for this task, so you have more than one option.
Understanding the main topics of a website (topic modeling) without visiting it allows you to work on other tasks.

This one is a computationally expensive task, not recommended for beginners.

Still, it's very powerful with the right mindset.
Text generation - The most (in)famous topic of the moment.

You can do a lot of crazy stuff in this case too and I am studying it.

While generating dozens of articles is not a game changer, it is if you do it 1 million times.

Disclaimer: I am against this strategy.
You can use the openai or transformers libraries for this task.

In some cases, it's better to use models suitable for that niche and test.

More on this in the future.

I want to thank my friend @theDrewDag, drop him a follow!
Text generation is super effective for meta descriptions, alt texts and ads.

While this is not my favorite use case, it's certainly effective.

Long-form content is almost impossible to publish without any editing.
Clustering - Yes, you can do that too.

You can cluster keywords, pages, pretty much whatever you want.

The process can be so demanding that you may want to use paid tools instead.

Still, you can use DIY scripts for a lot of projects.
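
To give an idea of a DIY script, here is a deliberately simple sketch: greedy keyword grouping by word overlap (Jaccard similarity). This is my own toy example, not how paid tools work; they usually rely on SERP overlap or embeddings. The 0.5 threshold and the function names are assumptions.

```python
def tokens(keyword: str) -> frozenset[str]:
    """Lowercased set of words in a keyword."""
    return frozenset(keyword.lower().split())


def jaccard(a: frozenset, b: frozenset) -> float:
    """Shared words divided by total distinct words."""
    return len(a & b) / len(a | b) if a | b else 0.0


def cluster_keywords(keywords: list[str], threshold: float = 0.5) -> list[list[str]]:
    """Greedy clustering: a keyword joins the first cluster whose seed it resembles."""
    clusters: list[tuple[frozenset, list[str]]] = []
    for keyword in keywords:
        keyword_tokens = tokens(keyword)
        for seed_tokens, members in clusters:
            if jaccard(keyword_tokens, seed_tokens) >= threshold:
                members.append(keyword)
                break
        else:
            # No cluster was similar enough: this keyword seeds a new one.
            clusters.append((keyword_tokens, [keyword]))
    return [members for _, members in clusters]
```

Each keyword is only compared against cluster seeds, so this stays cheap on long lists, but the result depends on the input order.
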
The official™️ god of Clustering is @LeeFootSEO, whose work you can find below:

patreon.com/leefootseo
Internal linking - Yes, you can use coding to check internal links on your website and plot them.

This is quite hard for a beginner and that's why I don't recommend it unless you have some practice.

It also gets more and more complex for larger websites.
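
A minimal sketch of the first step, assuming you already have the HTML of each page in a dict keyed by URL (crawling is not shown). It builds the internal link graph and counts inlinks with the standard library; plotting is left out. The names are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def internal_link_graph(pages: dict[str, str]) -> dict[str, set[str]]:
    """Map each page URL to the set of same-host URLs it links to."""
    graph = {}
    for url, html in pages.items():
        parser = LinkExtractor()
        parser.feed(html)
        host = urlparse(url).netloc
        # Resolve relative links and drop #fragments.
        targets = {urljoin(url, href).split("#")[0] for href in parser.hrefs}
        graph[url] = {t for t in targets if urlparse(t).netloc == host and t != url}
    return graph


def inlink_counts(graph: dict[str, set[str]]) -> dict[str, int]:
    """Count how many pages link to each URL; 0 flags orphan candidates."""
    counts = {url: 0 for url in graph}
    for targets in graph.values():
        for target in targets:
            counts[target] = counts.get(target, 0) + 1
    return counts
```

The sketch does not normalize trailing slashes, parameters or redirects, which is exactly where larger websites get messy.
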
SERP Analysis - You can check what's happening in the SERPs and track the ranking history.

Or you can scrape content once again and repeat some of the steps mentioned before.

You can get SERP data via APIs or by doing it manually, but the former approach is more consistent.
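
A tiny sketch of the history part, assuming you store one snapshot per date as a dict of URL to position for a given query (how you collect the snapshots is up to you). rank_changes is a hypothetical name.

```python
from typing import Optional


def rank_changes(before: dict[str, int], after: dict[str, int]) -> dict[str, Optional[int]]:
    """Positions gained (+) or lost (-) per URL between two snapshots.

    None means the URL is missing from one of the snapshots
    (it entered or left the tracked results).
    """
    changes = {}
    for url in sorted(set(before) | set(after)):
        if url in before and url in after:
            # Position 3 -> 1 is an improvement of +2.
            changes[url] = before[url] - after[url]
        else:
            changes[url] = None
    return changes
```
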
Quality reports - Check broken links, canonicals etc.

The most boring thing you can do, but useful if you cannot use tools and you need a fast and free alternative.

Do it once, repeat it forever.
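
A minimal sketch of a broken link report with the standard library. It assumes a list of URLs to check and uses HEAD requests, which some servers reject, so you may need to switch to GET. The function names and the User-Agent string are made up.

```python
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for a URL, or 0 if the request fails."""
    request = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": "quality-report-sketch"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # 4xx and 5xx responses land here
    except OSError:
        return 0  # DNS failure, refused connection, timeout


def broken_links(urls, get_status=fetch_status) -> dict[str, int]:
    """Return every URL whose status is outside the 2xx-3xx range."""
    report = {}
    for url in urls:
        status = get_status(url)
        if not 200 <= status < 400:
            report[url] = status
    return report
```

`get_status` is a parameter so you can plug in a cached or fake fetcher and schedule the same report again and again.
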
Data Analysis - While Python is not my first choice for analysis, it's still super powerful.

Sometimes Excel is just not enough and you may want to use Python.

SEO tools won't give you much of the info you need, or you have to pay crazy prices for it.
Checking Google Analytics, Search Console and even other data sources enriches your reports and your understanding.

We know that many SEOs use data dumps or the usual templates from tools.

Having your own style adds value and is good for clients.
You don't want to delegate this stuff to other people unless you're in an agency.

You should handle analysis because you have SEO knowledge.

This is not about coding, more about data understanding.
Web apps - Want to have that custom-ish look?

Build a web app with some Python libraries, it's easy and it can be used by your client too.

Still better than a PPT presentation.

You can build your own dashboards and scripts too...
Automation - Repeating what was said before: don't waste time on repetitive tasks.

You can automate a lot of things as long as you get value.

Python is the king of automation.

I will list some examples of what you can automate...
- 404 reports, quality reports
- RSS feed/sitemap scraping
- SERP data extraction
- GSC data extraction
- Scraping prices from a set of websites
- Any type of scraping
As a consequence, if you merge some of these processes you can get some nasty stuff.

I don't advocate black hat methods but Python can definitely do that as well.

Coding opens up a lot of doors.
There are many other variations and use cases so you can just test them out.

Find what is more suitable for your style and your needs.

Data Analysis and text stuff are my priorities.
Mainstream SEO tools will never give you a competitive advantage.

Everyone is using them and your client will probably have seen those tools for quite some time.

Having something custom forces you to use your brain and propose something new.
The aforementioned reason is already enough to justify learning something new.

You want to be unique and stand out.

If you present data dumps you cannot expect special treatment lol
Building your own stuff is still a strange topic in the SEO world.

A lot of people want to be spoon-fed and standardize everything (in a negative way).

I get extremely pissed off when people ask me stuff w/o giving me context.
So yeah, find what works for you and test.

Don't pay $1000 for a template or a course lol

Be sure that you have the correct mindset before you do something.

Building your tools helps you to understand what you really really need.
I am writing an ebook about Analytics for SEO, available from around March 8.

This is NOT scripting, it's all about insights, practice and management.
Follow me for threads, tips, and case studies (coming soon) about SEO, content, and Python/data.

If you liked this thread, consider liking and retweeting it!🧵
Do you want to know more?

Maybe you have a lot of data and you don't know how to use it.

You can book a call with me (if you have a B2C content website/publisher):

bookk.me/marcogiordano
