...

Code Block
User-agent: *
Crawl-delay: 5
Disallow: /apps/
Disallow: /appsint/
Disallow: /aspnet_client/
Disallow: /bin/
Disallow: /bin-BKP/
Disallow: /certificates/   
Disallow: /cms/
Disallow: /crm/
Disallow: /Home/  
Disallow: /includes/
Disallow: /pdf/
Disallow: /policy/
Disallow: /product/
Disallow: /res/
Disallow: /reservations/
Disallow: /rss/
Disallow: /shared/
Disallow: /softripnext/ 
Disallow: /STNAttach/
Disallow: /STNView/
Disallow: /stw/
Disallow: /stsw/
Disallow: /temp/
Disallow: /test/
Disallow: /testing/
Disallow: /view_invoice/
Disallow: /view_voucher/
Disallow: /webctrl_client/
Disallow: /groups/*
Allow: /groups/$
Disallow: /Cms/
Allow: /cms/xmlsitemap
#Block Amazon crawler
User-agent: Amazonbot
Disallow: /
#Block dotbot
User-agent: dotbot 
Disallow: /
#Block Yandex
User-agent: Yandex 
Disallow: /
#Block all Semrush crawlers/bots
User-agent: SemrushBot
Disallow: /
User-agent: SplitSignalBot
Disallow: / 
User-agent: SiteAuditBot
Disallow: / 
User-agent: SemrushBot-BA
Disallow: / 
User-agent: SemrushBot-SI
Disallow: / 
User-agent: SemrushBot-SWA
Disallow: / 
User-agent: SemrushBot-CT
Disallow: / 
User-agent: SemrushBot-BM
Disallow: / 
#Block PetalBot
User-agent: PetalBot
Disallow: /  
# Block Claude (LLM Scraper)
User-agent: ClaudeBot
Crawl-delay: 100
Disallow: /
# Block Common Crawl (LLM Scraper)
User-agent: CCBot
Crawl-delay: 100
Disallow: /
# Block GPT bot (OpenAI Scraper)
User-agent: GPTBot
Crawl-delay: 100
Disallow: /
# Block OAI-SearchBot (OpenAI Search Bot)
User-agent: OAI-SearchBot
Crawl-delay: 100
Disallow: /
# Block Facebook/Meta
User-agent: facebookexternalhit
Crawl-delay: 100
Disallow: /
# Block Facebook/Meta
User-agent: meta-externalagent
Crawl-delay: 100
Disallow: /
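
The rules above can be sanity-checked offline. The following is a minimal sketch using Python's standard `urllib.robotparser`, assuming an abbreviated copy of the file with blank lines between groups. `ExampleBot` and the `/tours/` path are hypothetical names used only for illustration. This parser applies the first matching rule in file order and does not appear to interpret the `*` and `$` wildcards, so the `/groups/` and `/cms/xmlsitemap` rules are left out of the sketch; real crawlers may evaluate those rules differently.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated, illustrative copy of the robots.txt above (not the full file).
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 5
Disallow: /apps/
Disallow: /cms/

User-agent: GPTBot
Crawl-delay: 100
Disallow: /

User-agent: meta-externalagent
Crawl-delay: 100
Disallow: /
"""


def load_rules(text):
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# A fully blocked crawler should not be allowed to fetch anything.
print("GPTBot may fetch /:", rules.can_fetch("GPTBot", "/"))

# "ExampleBot" is a hypothetical crawler that falls under the "*" group.
print("ExampleBot may fetch /apps/x:", rules.can_fetch("ExampleBot", "/apps/x"))
print("ExampleBot may fetch /tours/:", rules.can_fetch("ExampleBot", "/tours/"))

# Crawl-delay values as this parser reads them.
print("GPTBot crawl delay:", rules.crawl_delay("GPTBot"))
print("ExampleBot crawl delay:", rules.crawl_delay("ExampleBot"))
```

A check like this only shows how one parser reads the file. robots.txt is advisory, so a crawler that ignores it is not stopped by these rules.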

Using IIS Request Filtering

...

Code Block
<configuration>
  [...]
  <system.webServer>
    [...]
    <security>
      <requestFiltering>
        <filteringRules>
          <filteringRule name="Block Bots and Crawlers" scanUrl="false" scanQueryString="false">
            <scanHeaders>
              <add requestHeader="User-Agent" />
            </scanHeaders>
            <denyStrings>
              <add string="facebookexternalhit" /> <!-- Block Facebook crawler -->
              <add string="meta-externalagent" /> <!-- Block Meta external agent -->
              <add string="GPTBot" /> <!-- Block OpenAI GPT crawler -->
              <add string="OAI-SearchBot" /> <!-- Block OpenAI search crawler -->
            </denyStrings>
          </filteringRule>
        </filteringRules>
      </requestFiltering>
    </security>
  </system.webServer>
</configuration>
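
The robots.txt above blocks more agents than the filtering rule denies, so it can help to compare the two lists. The following is a rough sketch, assuming a copy of the rule with the elided `[...]` sections removed so that it parses as XML. The helper name `user_agent_deny_strings` and the `ROBOTS_BLOCKED` list are illustrative and not part of IIS.

```python
import xml.etree.ElementTree as ET

# Copy of the filtering rule above, without the elided [...] sections.
WEB_CONFIG = """\
<configuration>
  <system.webServer>
    <security>
      <requestFiltering>
        <filteringRules>
          <filteringRule name="Block Bots and Crawlers" scanUrl="false" scanQueryString="false">
            <scanHeaders>
              <add requestHeader="User-Agent" />
            </scanHeaders>
            <denyStrings>
              <add string="facebookexternalhit" />
              <add string="meta-externalagent" />
              <add string="GPTBot" />
              <add string="OAI-SearchBot" />
            </denyStrings>
          </filteringRule>
        </filteringRules>
      </requestFiltering>
    </security>
  </system.webServer>
</configuration>
"""

# Agents that the robots.txt section lists under its LLM and Meta comments.
ROBOTS_BLOCKED = [
    "ClaudeBot", "CCBot", "GPTBot", "OAI-SearchBot",
    "facebookexternalhit", "meta-externalagent",
]


def user_agent_deny_strings(config_xml):
    """Collect deny strings from every filtering rule that scans the User-Agent header."""
    root = ET.fromstring(config_xml)
    denied = []
    for rule in root.iter("filteringRule"):
        headers = [h.get("requestHeader", "").lower()
                   for h in rule.findall("scanHeaders/add")]
        if "user-agent" in headers:
            denied.extend(a.get("string") for a in rule.findall("denyStrings/add"))
    return denied


denied = user_agent_deny_strings(WEB_CONFIG)
# Agents that robots.txt asks to stay away but that the IIS rule does not deny.
not_enforced = [agent for agent in ROBOTS_BLOCKED if agent not in denied]

print("Denied by IIS rule:", denied)
print("Only in robots.txt:", not_enforced)
```

With the configuration shown, `ClaudeBot` and `CCBot` are covered by robots.txt only. Whether they also need a deny string depends on whether those crawlers honor robots.txt in practice.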