Kandi PHP Web Crawler

Automated Scraping · Link Extraction · Bot Identity

Kandi is a lightweight PHP-based web crawler designed for automated exploration, link extraction, and structural analysis.
This page serves as the public identity and compliance reference for the Kandi crawler bot.

Why Preview Generation Matters

Modern web crawlers are no longer blind indexers. They are preview generators,
link mappers, and structural interpreters.
Kandi operates in this space by fetching HTML content, resolving links,
and building a meaningful snapshot of how a site presents itself to machines.

Preview generation allows developers, administrators, and analysts to see
exactly what a crawler sees, not what a browser renders after JavaScript,
personalization, or client-side modification.

User-Agent Strings

Kandi identifies itself with one of the following User-Agent strings,
depending on version:
Kandi/1.0.1 (compatible; +http://kandi.seaverns.com/bot.html)
Kandi/2.0.1 Beta (compatible; +http://kandi.seaverns.com/bot.html)
Kandi/2.0.2 (compatible; +http://kandi.seaverns.com/bot.html)
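
Site operators who want to recognize Kandi in their own request handling
can match on the version-independent Kandi/ prefix. The sketch below is
illustrative; the isKandiRequest helper is not part of Kandi itself:

  <?php
  // Returns true when a request's User-Agent identifies the Kandi crawler.
  // Matching the "Kandi/" prefix covers every published version string.
  function isKandiRequest(string $userAgent): bool
  {
      return strncmp($userAgent, 'Kandi/', 6) === 0;
  }

  $ua = $_SERVER['HTTP_USER_AGENT'] ?? '';
  if (isKandiRequest($ua)) {
      error_log('Kandi crawler visit: ' . $ua); // e.g. log or serve a bot view
  }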

Ethical Crawling and Intent

Kandi is designed for controlled, intentional crawling.
Crawl depth is limited by default, requests follow redirects responsibly,
and robots.txt directives are respected.
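
In practice, responsible redirect handling means capping redirect chains,
bounding request time, and identifying the client honestly. A sketch of how
such a fetch could be configured with PHP's cURL extension follows; the
numeric limits are illustrative defaults, not Kandi's published settings:

  <?php
  // Fetch a page politely: declared identity, bounded redirects, hard timeout.
  // The numeric limits here are illustrative, not Kandi's actual settings.
  function politeFetch(string $url): ?string
  {
      $ch = curl_init($url);
      curl_setopt_array($ch, [
          CURLOPT_RETURNTRANSFER => true,
          CURLOPT_FOLLOWLOCATION => true,
          CURLOPT_MAXREDIRS      => 5,   // refuse runaway redirect chains
          CURLOPT_TIMEOUT        => 10,  // never hang on a slow endpoint
          CURLOPT_USERAGENT      => 'Kandi/2.0.2 (compatible; +http://kandi.seaverns.com/bot.html)',
      ]);
      $body = curl_exec($ch);
      curl_close($ch);
      return $body === false ? null : $body;
  }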

The purpose of this bot is analysis, not exploitation.
It exists to help operators understand their own infrastructure
as clearly as automated systems do.

robots.txt Compliance

Kandi respects site crawling policies as defined by each domain’s robots.txt file.

robots.txt reference:
http://kandi.seaverns.com/robots.txt
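
The sketch below shows how a crawler can honor those directives before
fetching a path. It is deliberately simplified: only Disallow prefix rules
inside a matching User-agent group are evaluated, whereas a production
parser would also handle Allow rules, wildcards, and rule precedence:

  <?php
  // Simplified robots.txt check: returns true if $path is crawlable for $agent.
  function isAllowed(string $robotsTxt, string $agent, string $path): bool
  {
      $applies = false;
      foreach (preg_split('/\R/', $robotsTxt) as $line) {
          $line = trim(preg_replace('/#.*/', '', $line)); // strip comments
          if (stripos($line, 'User-agent:') === 0) {
              $name = trim(substr($line, 11));
              $applies = ($name === '*' || stripos($agent, $name) !== false);
          } elseif ($applies && stripos($line, 'Disallow:') === 0) {
              $rule = trim(substr($line, 9));
              if ($rule !== '' && strpos($path, $rule) === 0) {
                  return false;       // path falls under a Disallow prefix
              }
          }
      }
      return true;
  }

  $robots = @file_get_contents('http://kandi.seaverns.com/robots.txt');
  if ($robots !== false) {
      var_dump(isAllowed($robots, 'Kandi', '/private/page.html')); // hypothetical path
  }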

Bot Identity Endpoint

Machine-readable metadata describing the Kandi crawler is available at the following endpoint:

JSON identity:
http://kandi.seaverns.com/bot.json
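
Automated systems can retrieve and decode the identity document in a few
lines. The sketch below assumes nothing about the schema beyond it being
valid JSON:

  <?php
  // Retrieve Kandi's machine-readable identity and decode it as JSON.
  $raw = file_get_contents('http://kandi.seaverns.com/bot.json');
  if ($raw === false) {
      exit("identity endpoint unreachable\n");
  }
  $identity = json_decode($raw, true, 512, JSON_THROW_ON_ERROR);
  print_r($identity); // inspect whatever metadata the endpoint declares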

What Information Is Extracted

During a crawl session, Kandi retrieves raw HTML and inspects anchor elements
to extract resolvable URLs. These links are normalized against their base URL,
validated, deduplicated, and queued for controlled traversal.

This process produces a structural preview of a site, revealing:

  • Internal and external link topology
  • Navigation depth and expansion behavior
  • Redirect chains and dead endpoints
  • Pages reachable without JavaScript execution
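
The extraction and normalization steps described above can be approximated
with PHP's built-in DOMDocument. This is a sketch of the general technique,
not Kandi's internal implementation; in particular, only absolute,
scheme-relative, and root-relative links are resolved here:

  <?php
  // Extract, normalize, and deduplicate anchor hrefs from raw HTML.
  function extractLinks(string $html, string $baseUrl): array
  {
      $doc = new DOMDocument();
      @$doc->loadHTML($html);           // suppress warnings from messy markup
      $base  = parse_url($baseUrl);
      $links = [];
      foreach ($doc->getElementsByTagName('a') as $anchor) {
          $href = trim($anchor->getAttribute('href'));
          if ($href === '' || $href[0] === '#' || stripos($href, 'mailto:') === 0) {
              continue;                 // fragments and mail links are not crawlable
          }
          if (strpos($href, '//') === 0) {
              $href = $base['scheme'] . ':' . $href;                   // scheme-relative
          } elseif ($href[0] === '/') {
              $href = $base['scheme'] . '://' . $base['host'] . $href; // root-relative
          } elseif (!preg_match('#^https?://#i', $href)) {
              continue;                 // other relative forms omitted in this sketch
          }
          $links[$href] = true;         // keying by URL deduplicates
      }
      return array_keys($links);
  }

Each URL the sketch returns corresponds to an entry Kandi would then
validate and queue for controlled traversal.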

Why Developers Generate Crawler Previews

Preview crawls are commonly used during development, auditing, and diagnostics.
They answer questions that browser testing alone cannot.

Typical use cases include:

  • SEO audits and search visibility analysis
  • Detecting unintentional public endpoints
  • Mapping site structure before redesigns or migrations
  • Verifying robots.txt and crawl boundary behavior
  • Monitoring site expansion over time

About This Page

This page is not the crawler itself.
It exists as a public-facing identity, documentation reference,
and preview explanation for the Kandi web crawler.

Automated systems may use this endpoint to identify the crawler,
review its declared behavior, or retrieve machine-readable metadata via the associated JSON identity endpoint.

Technical Summary

Kandi crawls web pages, resolves URLs, extracts hyperlinks,
and records crawl metrics including total pages processed
and execution duration. It is intended for SEO audits,
structural analysis, and controlled data collection.
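
Put together, a depth-bounded crawl loop that records those two metrics
might look like the sketch below, which assumes the hypothetical politeFetch
and extractLinks helpers from the earlier examples on this page:

  <?php
  // Depth-bounded breadth-first crawl recording total pages processed and
  // execution duration. Assumes the politeFetch() and extractLinks()
  // sketches shown earlier on this page.
  function crawl(string $startUrl, int $maxDepth = 2): array
  {
      $start   = microtime(true);
      $queue   = [[$startUrl, 0]];      // [url, depth] pairs
      $visited = [];

      while ($queue) {
          [$url, $depth] = array_shift($queue);
          if (isset($visited[$url]) || $depth > $maxDepth) {
              continue;
          }
          $visited[$url] = true;
          $html = politeFetch($url);
          if ($html === null) {
              continue;                 // dead endpoint; skip but keep crawling
          }
          foreach (extractLinks($html, $url) as $link) {
              $queue[] = [$link, $depth + 1];
          }
      }

      return [
          'pages_processed' => count($visited),
          'duration_sec'    => round(microtime(true) - $start, 2),
      ];
  }

  print_r(crawl('http://kandi.seaverns.com/', 1));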

Download Kandi

Kandi - Version 1.0.2