Gary 鯨理/경리 Illyes
House Elf and Chief of Sunshine and Happiness at Google. Resident bad guy and Lizzi's parrot. Onions are my own
Aug 11, 2020
The indexing system, Caffeine, does multiple things:
1. ingests fetchlogs,
2. renders and converts fetched data,
3. extracts links, meta and structured data,
4. extracts and computes some signals,
5. schedules new crawls,
6. and builds the index that is pushed to serving.
If something goes wrong with most of the things that it's supposed to do, that will show downstream in some way. If scheduling goes awry, crawling may slow down. If rendering goes wrong, we may misunderstand the pages. If index building goes bad, ranking & serving may be affected.
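The stages and failure modes listed above can be written down as a small lookup table. This is only an illustrative sketch: the names `Stage`, `DOWNSTREAM_SYMPTOM`, and `downstream_symptom` are made up for this example and say nothing about how Caffeine is actually built. Only the three failure modes named in the thread are filled in; the others are deliberately left unspecified.

```python
from enum import Enum


class Stage(Enum):
    """The six things the thread says the indexing system does (hypothetical names)."""
    INGEST_FETCHLOGS = 1
    RENDER_AND_CONVERT = 2
    EXTRACT_LINKS_META_STRUCTURED_DATA = 3
    EXTRACT_AND_COMPUTE_SIGNALS = 4
    SCHEDULE_CRAWLS = 5
    BUILD_INDEX = 6


# Only the failure modes explicitly mentioned in the thread are listed here.
DOWNSTREAM_SYMPTOM = {
    Stage.SCHEDULE_CRAWLS: "crawling may slow down",
    Stage.RENDER_AND_CONVERT: "pages may be misunderstood",
    Stage.BUILD_INDEX: "ranking and serving may be affected",
}


def downstream_symptom(stage: Stage) -> str:
    """Return the downstream symptom the thread names for a failing stage, if any."""
    return DOWNSTREAM_SYMPTOM.get(stage, "not specified in the thread")
```

For example, `downstream_symptom(Stage.SCHEDULE_CRAWLS)` returns the crawling symptom, while a stage the thread does not comment on falls back to the "not specified" string.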
Jun 17, 2020
The other week @jdevalk asked if you can use nofollow on HTML link tags. I didn't know if that would work, so I spent a disgusting amount of time running different systems' unit tests to figure it out. The following tweets explain why it took so long and what the answer is.
First, the answer: yes, you can use nofollow on <link... tags in the form of rel="alternate nofollow", and that will prevent Google from using the link from the href attribute. If you don't specify nofollow, the URL from href will be extracted as a weightless outlink.
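To make the rule concrete, here is a minimal sketch of a link extractor that treats `rel` on `<link>` tags the way the answer describes: an href is collected as an outlink unless `nofollow` appears among the `rel` tokens. It uses Python's standard `html.parser`; the class and function names (`LinkTagOutlinks`, `extract_link_tag_outlinks`) and the example.com URLs are invented for illustration, and this is not Google's code, just a toy model of the stated behavior.

```python
from html.parser import HTMLParser


class LinkTagOutlinks(HTMLParser):
    """Collect href values from <link> tags, skipping those marked nofollow."""

    def __init__(self):
        super().__init__()
        self.outlinks = []  # hrefs that would be extracted as outlinks
        self.skipped = []   # hrefs ignored because rel contains "nofollow"

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if not href:
            return
        # rel is a space-separated token list, e.g. "alternate nofollow".
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.skipped.append(href)
        else:
            self.outlinks.append(href)


def extract_link_tag_outlinks(html):
    """Return (outlinks, skipped) for the <link> tags found in an HTML string."""
    parser = LinkTagOutlinks()
    parser.feed(html)
    return parser.outlinks, parser.skipped


sample = (
    '<link rel="alternate" href="https://example.com/a">'
    '<link rel="alternate nofollow" href="https://example.com/b">'
)
outlinks, skipped = extract_link_tag_outlinks(sample)
```

With the sample above, the first href ends up in `outlinks` and the second, which carries `nofollow`, ends up in `skipped`.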