Squeezing the juice out of robots.txt.

A fully automated workflow that you've never seen before.

(thread)
1. This script scrapes the disallowed paths from the robots.txt files of a list of domains and saves them to a single file. It also removes any unwanted entries and sorts the file in a particular way.

Can you write it yourself? Here’s what the script should look like.
2. Create a directory called "massrobots" in the current working directory. This is where you'll save all the robots.txt files for later processing.
3. Read a file passed as an argument (referred to as "$1" in the script) which contains a list of domains/subdomains.
4. For each domain in the file, run another script "getrobo.py" passing the current domain as an argument.
5. The script "getrobo.py" makes a request to the robots.txt file for the passed domain and writes the response to a local file with the same name as the domain.
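As a rough illustration of step 5, here is a minimal sketch of what a "getrobo.py"-style helper could look like. It is not the author's script: the function names (`robots_url`, `fetch_robots`, `save_robots`), the use of HTTPS, and the 10-second timeout are all assumptions made for the example.

```python
import urllib.request
from pathlib import Path


def robots_url(domain: str) -> str:
    """Build the robots.txt URL for a bare domain or subdomain (HTTPS is an assumption)."""
    return f"https://{domain.strip()}/robots.txt"


def fetch_robots(domain: str, timeout: float = 10.0) -> str:
    """Download robots.txt and return its body as text, or '' if the request fails."""
    try:
        with urllib.request.urlopen(robots_url(domain), timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return ""


def save_robots(domain: str, body: str, directory: str = ".") -> Path:
    """Write the response to a local file named after the domain, as step 5 describes."""
    out = Path(directory) / domain.strip()
    out.write_text(body, encoding="utf-8")
    return out
```

A caller would loop over the domains read from the input file and run something like `save_robots(domain, fetch_robots(domain))` for each one.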
6. Filter the contents of the downloaded robots.txt file to only show the "Disallow" entries and remove the "Disallow" keyword.
7. Sort the filtered contents and write them to a file in the "massrobots" directory using the same name as the domain.
8. Remove the original downloaded robots.txt file.
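Steps 6 to 8 could be sketched roughly like this. The helper names (`disallowed_paths`, `process_download`) and the `src_dir` parameter are hypothetical, and stripping `#` comments and matching "Disallow" case-insensitively are assumptions that go slightly beyond what the thread spells out.

```python
from pathlib import Path


def disallowed_paths(robots_text: str) -> list[str]:
    """Return the sorted values of all Disallow lines, with the keyword removed."""
    paths = []
    for raw in robots_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # assumption: ignore trailing comments
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "disallow":
            paths.append(value.strip())
    return sorted(paths)


def process_download(domain: str, out_dir: str = "massrobots", src_dir: str = ".") -> Path:
    """Filter the downloaded file, write the result to out_dir/<domain>, delete the original."""
    src = Path(src_dir) / domain
    paths = disallowed_paths(src.read_text(encoding="utf-8", errors="replace"))
    dest_dir = Path(out_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / domain
    dest.write_text("".join(p + "\n" for p in paths), encoding="utf-8")
    src.unlink()  # step 8: remove the original downloaded robots.txt
    return dest
```

Empty `Disallow:` values are deliberately kept at this stage, since the thread leaves the cleanup of empty entries to a later step.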
9. For each file in the "massrobots" directory, append the domain name to each line of the file.
10. Concatenate all the files in the "massrobots" directory into a file called "waybackbots.txt". Remove the "massrobots" directory.
11. Remove any lines starting with ".", "/.", "?.", "", "/", or "?" from the "waybackbots.txt" file and sort the entries.
12. Finally, concatenate the original file passed as an argument with the results and save the combined output as "waybackbots.txt". This step is optional and does not affect the results.
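One possible reading of steps 9 to 11, sketched in Python. Two interpretations here are assumptions, not something the thread confirms: the domain is placed in front of each path (so entries look like `example.com/admin`), and the unwanted patterns are applied to the path part, dropping paths that are exactly "", "/" or "?" and paths that start with ".", "/." or "?.". All function names are hypothetical.

```python
import shutil
from pathlib import Path

# Assumed interpretation of step 11, applied to the path before the domain is added.
DROP_PREFIXES = (".", "/.", "?.")
DROP_EXACT = {"", "/", "?"}


def keep_path(path: str) -> bool:
    """Return True if the path is worth keeping under the assumed filter rules."""
    path = path.strip()
    if path in DROP_EXACT:
        return False
    return not path.startswith(DROP_PREFIXES)


def build_wayback_list(paths_by_domain: dict[str, list[str]]) -> list[str]:
    """Combine each kept path with its domain and return the sorted entries."""
    entries = []
    for domain, paths in paths_by_domain.items():
        for path in paths:
            if keep_path(path):
                entries.append(domain + path.strip())
    return sorted(entries)


def merge_directory(directory: str = "massrobots", output: str = "waybackbots.txt") -> list[str]:
    """Read every per-domain file, write the merged list, then remove the directory."""
    d = Path(directory)
    paths_by_domain = {
        f.name: f.read_text(encoding="utf-8").splitlines()
        for f in d.iterdir()
        if f.is_file()
    }
    entries = build_wayback_list(paths_by_domain)
    Path(output).write_text("".join(e + "\n" for e in entries), encoding="utf-8")
    shutil.rmtree(d)  # step 10: remove the "massrobots" directory
    return entries
```

Filtering before the domain is attached sidesteps the ambiguity in step 11, where a literal "starts with /" rule would otherwise discard every path.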
13. Comment below if you were able to build the script. If not, and if this tweet gets 836+ likes, I’ll share my version on my GitHub: github.com/cristivlad25
14. If you enjoyed this thread, there's much more to come! So, stay tuned.

Motivate me to continue posting by liking, retweeting, and following me @cristivlad25.

#pentesting #appsec #infosec #cybersecurity #hacking #bugbountytips

