AI Sapiens
Jul 11 6 tweets 3 min read
We have added new #nodejs modules for our #URL #Database, which contains millions of domains classified into categories by a #machinelearning model:
yarnpkg.com/package/urldat…
npmmirror.com/package/urldat…
npmtrends.com/urldatabase
The URL database was built by collecting the website texts of millions of domains, pre-processing them (lemmatization, punctuation removal, stop-word removal), and then sending them to a website classifier that was previously trained on millions of labelled texts across 21 main categories. We also have other, more detailed classifiers with 440+ categories.
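The pre-processing steps named above can be sketched roughly as follows. This is only an illustration, not the pipeline actually used for the database: the stop-word list is a tiny made-up sample, the function names `preprocess` and `crude_lemma` are hypothetical, and a crude plural-stripping rule stands in for real lemmatization (a production pipeline would more likely use a proper lemmatizer from an NLP library). The order of the steps here is also an assumption.

```python
import string

# Tiny illustrative stop-word list (an assumption); real lists are much larger.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "is", "are", "on", "with"}

# Translation table mapping every ASCII punctuation character to a space.
_PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def crude_lemma(token):
    """Toy stand-in for lemmatization: strip a plural 's' from longer words."""
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def preprocess(text):
    """Lowercase, remove punctuation, drop stop words, then normalize tokens."""
    cleaned = text.lower().translate(_PUNCT_TABLE)
    tokens = cleaned.split()
    return [crude_lemma(t) for t in tokens if t not in STOP_WORDS]


# Hypothetical snippet of website text.
print(preprocess("Latest reviews of laptops, phones and cameras!"))
# prints ['latest', 'review', 'laptop', 'phone', 'camera']
```

The resulting token list is the kind of cleaned input that would then be passed on to a text classifier.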

