The URL database was built by collecting the websites of millions of domains,
pre-processing their texts (lemmatization, punctuation removal, stop-word removal),
and then sending them to a website classifier
that had previously been trained on millions of labelled texts spanning 21 main categories. We also have other, more detailed classifiers with 440+ categories.
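
The flow described above (pre-process the text, then classify it) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the stop-word list, the `LEMMAS` lookup that stands in for a real lemmatizer, the category names, and the keyword-counting `classify` function that stands in for the trained classifier. It is not the actual implementation behind the database.

```python
import string

# Tiny illustrative stop-word list (assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "the", "and", "or", "of", "to", "in", "for", "is", "are", "on", "with"}

# Toy lookup table standing in for a real lemmatizer (assumption).
LEMMAS = {"shoes": "shoe", "running": "run", "prices": "price", "games": "game", "playing": "play"}

# Hypothetical categories and keywords standing in for a trained classifier.
CATEGORY_KEYWORDS = {
    "Shopping": {"shoe", "price", "buy", "cart"},
    "Games": {"game", "play", "level"},
}


def preprocess(text):
    """Lowercase, remove punctuation, drop stop words, and lemmatize."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    tokens = cleaned.split()
    return [LEMMAS.get(token, token) for token in tokens if token not in STOP_WORDS]


def classify(tokens):
    """Return the category with the most keyword hits, or None if nothing matches."""
    scores = {
        category: sum(token in keywords for token in tokens)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


tokens = preprocess("Buy running shoes: the best prices!")
category = classify(tokens)
```

Here `tokens` holds the cleaned, lemmatized words of the page text and `category` the best-matching label. A production system would presumably replace the lookup table with a proper lemmatizer and the keyword counts with a model trained on labelled texts.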