# DeepDyve robots.txt
# Updated: 2025-12-15
# Sitemap architecture follows /sitemap-spec.md

# ==================================================
# Default Crawl Rules (All Bots)
# ==================================================
User-agent: *
Disallow: /cgi-bin/
Disallow: /openurl
Disallow: /search
Disallow: /browse-wr/
Disallow: /enterprise-free-trial
Disallow: /rental-link
Disallow: /timescited
# 5 seconds between requests; honored by some crawlers (e.g., Bingbot),
# ignored by Googlebot
Crawl-delay: 5

# ==================================================
# Sitemap Index Reference
# ==================================================
Sitemap: https://www.deepdyve.com/sitemaps/sitemap_index.xml

# ==================================================
# Googlebot-Specific Rules
# ==================================================
# Per RFC 9309, a crawler obeys only its most specific matching group,
# so the shared disallows from the "*" group must be repeated here.
User-agent: Googlebot
Disallow: /assets/images/doccover.png
Disallow: /cgi-bin/
Disallow: /openurl
Disallow: /search
Disallow: /browse-wr/
Disallow: /enterprise-free-trial
Disallow: /rental-link
Disallow: /timescited

# ==================================================
# LLM Crawler Permissions
# Per sitemap-spec.md section 8.1
# ==================================================

# OpenAI GPT Crawler
User-agent: GPTBot
Allow: /

# Google Extended (Gemini training; formerly Bard)
User-agent: Google-Extended
Allow: /

# Anthropic Claude Crawler
User-agent: Claude-Web
Allow: /

# ==================================================
# Additional LLM Crawlers (Optional)
# ==================================================

# Common Crawl (used by many AI models)
#User-agent: CCBot
#Allow: /

# Meta AI (Facebook/Instagram AI)
#User-agent: FacebookBot
#Allow: /

# Perplexity AI
#User-agent: PerplexityBot
#Allow: /
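
# Note: Anthropic's current primary crawler identifies as ClaudeBot;
# "Claude-Web" above is an older Anthropic token. If the intent of
# section 8.1 is to permit all Anthropic crawling, the group below
# (a sketch, left commented out to preserve current behavior) may
# also be wanted.
#User-agent: ClaudeBot
#Allow: /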
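
# Note: Meta also operates a dedicated AI-training crawler,
# Meta-ExternalAgent, separate from FacebookBot. The group below is
# an assumption that it should mirror the FacebookBot policy above,
# so it is likewise left commented out.
#User-agent: meta-externalagent
#Allow: /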