Managing Search Engines and AI Bots with robots.txt and llms.txt - GRAIsol Blog


Search engine crawlers and AI bots visit every site, including ours, so we published clear guidelines on what they can access.

robots.txt Highlights

  • Allowed bots to crawl all public pages and static assets like /images/ and /portfolio/.
  • Blocked sensitive areas such as /api/, /_next/, and /admin/.
  • Added a sitemap reference and a modest Crawl-delay to reduce server strain.
  • Included specific rules for bots like GPTBot and ChatGPT-User to limit API and admin access.
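Put together, a robots.txt along these lines might look like the following. This is a hypothetical sketch, not our exact file: the directory paths and bot names come from the points above, but the crawl-delay value and sitemap URL are placeholders.

```txt
# Default rules for all crawlers
User-agent: *
Allow: /images/
Allow: /portfolio/
Disallow: /api/
Disallow: /_next/
Disallow: /admin/
Crawl-delay: 10            # example value; tune to your server

# AI crawlers: keep them out of API and admin routes
User-agent: GPTBot
Disallow: /api/
Disallow: /admin/

User-agent: ChatGPT-User
Disallow: /api/
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml   # placeholder URL
```

Note that Crawl-delay is a non-standard directive: some crawlers such as Bingbot honor it, while Googlebot ignores it, so it should be treated as a polite request rather than an enforcement mechanism.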

llms.txt Guidelines

  • Outlined which site content AI models may quote, analyze, and index.
  • Prohibited training on client data, custom implementations, and pricing strategies.
  • Listed acceptable uses, such as answering questions about public services.
  • Provided contact emails for AI/LLM inquiries and permissions.
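There is no single enforced standard for llms.txt yet, so the file is best read as a published statement of policy. A minimal plain-text sketch covering the points above might look like this; the wording and the contact address are illustrative, not our actual file:

```txt
# llms.txt — policy for AI models and LLM crawlers (example sketch)

# Permitted
# - Quoting, analyzing, and indexing public site content
# - Answering questions about our public services

# Prohibited
# - Training on client data
# - Training on custom implementations
# - Training on pricing strategies

# Contact (placeholder address)
# AI/LLM inquiries and permissions: ai-permissions@example.com
```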

By publishing these files, we encourage good-faith indexing while protecting proprietary information and client privacy.
