
How to make sure you aren't blocking AI crawlers in robots.txt for ecommerce

Posted January 29, 2025 by Will Critchlow

I think ecommerce and retail businesses should want their websites included in the training data for every machine learning data set. The decision is less complicated than in other industries because the AI is not going to ship physical products: the best user experience for someone asking a chatbot for product recommendations is to end up being directed to a retailer's website. You might as well give yours a better chance of being the one.

You have the option of letting these models learn about your product specifications, product details, pricing, availability, delivery options, reviews and ratings from your website, and you should take it.

Because many sites have historically had special-case rules allowing Googlebot to crawl, it's easy to have blocked newer crawlers by accident (and it's also common to have blocked the AI crawlers explicitly).
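To make those two patterns concrete, here is a minimal sketch that checks a robots.txt file against a handful of crawler names. It uses Python's standard-library `urllib.robotparser`, which is not the Google parser and can disagree with it in edge cases (it applies the first matching rule rather than the most specific one), so treat the result as a rough first check. The crawler tokens, the example URL, the two sample files and the `blocked_agents` helper are all assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# Illustrative crawler tokens (an assumption): confirm the exact user-agent
# names in each vendor's documentation before relying on them.
AGENTS = ["Googlebot", "GPTBot", "ClaudeBot"]

# Pattern 1: an AI crawler is blocked explicitly by name.
EXPLICIT_BLOCK = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

# Pattern 2: a blanket block with a special-case allowance for Googlebot,
# which blocks every newer crawler by accident.
BLANKET_BLOCK = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""


def blocked_agents(robots_txt, agents, url="https://shop.example/products/widget"):
    """Return the agents that this robots.txt text stops from fetching `url`.

    Hypothetical helper; the default URL is a placeholder product page.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


print(blocked_agents(EXPLICIT_BLOCK, AGENTS))  # ['GPTBot']
print(blocked_agents(BLANKET_BLOCK, AGENTS))   # ['GPTBot', 'ClaudeBot']
```

In the second file nobody wrote a rule about AI crawlers at all, which is why this kind of block is so easy to miss.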

I built a simple tool called the real robots.txt parser, available at realrobotstxt.com, that is powered by the open source version of the technology underpinning how Google interprets robots.txt files (those who know me will know I'm something of a robots.txt geek). In this short video, I outline the two main ways you might be blocking the new bots: explicitly, or implicitly by allowing only search crawlers past a blanket block.

 

Build out the rest of your AI search strategy

Being crawlable by the LLM crawlers is just one of my recommendations for how retailers should be ready for AI search. It's a hot topic right now and I know many people are getting urgent requests from their bosses or from leadership to put together their company's AI search plan.

If that's you, I encourage you to copy my homework! I put together a presentation entitled "What digital marketers should DO about AI right now", which is available in both video and slide form, totally free, with no registration needed.


Get future talks by email:

I originally delivered this talk as a webinar. Sign up to hear about future talks and similar content by email:
