# Contro1 Platform - AI-Ready Website Scanner

> Transform your website for AI agents with automated llms.txt, robots.txt, and manifest generation

## Overview

Contro1 is a comprehensive platform that scans websites and generates AI-ready navigation files. We help developers, SEO specialists, and AI engineers make their websites discoverable and navigable by AI agents, including GPT, Claude, Gemini, and custom automation tools.

## Core Features

### 1. Intelligent Website Scanner
- Automatically crawls and analyzes website structure
- Detects workflows, forms, authentication patterns
- Identifies public vs protected routes
- Discovers API endpoints and data flows

### 2. llms.txt Generation
- Creates human-readable AI documentation
- Describes site purpose, structure, and navigation
- Includes workflow instructions
- Optimized for LLM consumption

### 3. AI-Optimized robots.txt
- Generates bot-specific rules for AI agents
- Balances accessibility with privacy
- Includes AI navigation directives
- Supports major AI crawlers (GPTBot, ClaudeBot, Google-Extended, etc.)
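
As a rough illustration of bot-specific rules, the sketch below renders one robots.txt group per AI crawler. It is not Contro1's actual generator: the function name `render_robots_txt`, the default agent list, and the example paths are assumptions made for this example.

```python
# Hypothetical helper: renders one "User-agent" group per AI crawler token.
AI_USER_AGENTS = ["GPTBot", "ClaudeBot", "Google-Extended"]


def render_robots_txt(allow, disallow, user_agents=AI_USER_AGENTS):
    """Return robots.txt text with identical Allow/Disallow rules for each agent."""
    lines = []
    for agent in user_agents:
        lines.append(f"User-agent: {agent}")
        lines.extend(f"Allow: {path}" for path in allow)
        lines.extend(f"Disallow: {path}" for path in disallow)
        lines.append("")  # a blank line separates groups
    return "\n".join(lines)
```

Calling `render_robots_txt(["/"], ["/dashboard"])` would keep public pages open to the listed crawlers while excluding a protected route.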

### 4. Semantic Navigation Manifests
- Machine-readable site structure
- Puppeteer/Playwright configurations
- CSS selectors and interaction patterns
- Workflow automation guides

### 5. Browser Automation Support
- Export Puppeteer configs
- Playwright-compatible selectors
- Session management patterns
- OAuth workflow detection

## How to Use Contro1

### Quick Start (No Login Required)

1. Visit https://contro1.com/home
2. Enter your website URL
3. Click "Analyze Website"
4. Get an instant AI-readiness report
5. Sign up for full scan and file generation

### Full Scan (Requires Account)

1. Register at https://contro1.com/register
2. Access dashboard at https://contro1.com/dashboard
3. Start a new scan with your website URL
4. Watch live progress with AI agent insights
5. Download generated files:
   - llms.txt
   - robots.txt
   - AI manifest (JSON)
   - Navigation map
   - Puppeteer config

## API Access

### Public Endpoints

#### Quick Discovery
```
POST /api/public/quick-discovery
Content-Type: application/json

{
  "url": "https://example.com"
}
```

Returns an instant analysis of the website's AI-readiness.
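
A minimal Python sketch of this call, using only the standard library. It assumes the API is served from `https://contro1.com` and that the response body is JSON; neither the response shape nor the helper names (`build_quick_discovery_request`, `quick_discovery`) come from the platform itself.

```python
import json
import urllib.request

BASE_URL = "https://contro1.com"  # assumption: API shares the main site origin


def build_quick_discovery_request(target_url):
    """Build (but do not send) the POST request for quick discovery."""
    body = json.dumps({"url": target_url}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/public/quick-discovery",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def quick_discovery(target_url, timeout=30.0):
    """Send the request and decode the response, assuming it is JSON."""
    request = build_quick_discovery_request(target_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes the payload easy to inspect before any network traffic happens.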

#### Get Manifest by Domain
```
GET /api/scanner/public/manifest/{domain}
```

Retrieves the AI manifest for a previously scanned domain.
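
The sketch below builds the manifest URL from either a bare domain or a full URL. It assumes `{domain}` is the bare hostname (no scheme, port, or path) and that the API lives at `https://contro1.com`; `manifest_url_for` is a hypothetical helper name.

```python
from urllib.parse import quote, urlsplit

BASE_URL = "https://contro1.com"  # assumption: API shares the main site origin


def manifest_url_for(site):
    """Return the public manifest URL for a bare domain or a full URL."""
    # Prefix "//" so urlsplit treats a bare domain as a network location.
    parts = urlsplit(site if "://" in site else f"//{site}")
    host = parts.hostname or site
    return f"{BASE_URL}/api/scanner/public/manifest/{quote(host, safe='')}"
```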

#### Registry of Scanned Sites
```
GET /api/scanner/public/registry
```

Browse all public scans and manifests.

### Authenticated Endpoints

Requires a JWT sent as a Bearer token in the Authorization header.

#### Start Full Scan
```
POST /api/scanner/scan
Authorization: Bearer {token}
Content-Type: application/json

{
  "url": "https://example.com",
  "options": {
    "depth": "full",
    "includeWorkflows": true
  }
}
```
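
A hedged Python sketch of the same request. The accepted `depth` values are taken from the dashboard options (quick/medium/full) and are assumed to apply to the API as well; `build_scan_request` is a hypothetical helper, and the base URL is assumed to be `https://contro1.com`.

```python
import json
import urllib.request

BASE_URL = "https://contro1.com"  # assumption: API shares the main site origin
SCAN_DEPTHS = ("quick", "medium", "full")  # assumed to match the dashboard options


def build_scan_request(token, target_url, depth="full", include_workflows=True):
    """Build (but do not send) an authenticated full-scan request."""
    if depth not in SCAN_DEPTHS:
        raise ValueError(f"depth must be one of {SCAN_DEPTHS}")
    payload = {
        "url": target_url,
        "options": {"depth": depth, "includeWorkflows": include_workflows},
    }
    return urllib.request.Request(
        f"{BASE_URL}/api/scanner/scan",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The request can be sent with `urllib.request.urlopen(build_scan_request(token, url))`; the response format is not documented here, so it is left unparsed.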

#### Get Scan Results
```
GET /api/manifest/{scanId}
GET /api/llms-txt/{scanId}
GET /api/navigation-map/{scanId}
GET /api/robots-txt/{scanId}
```
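
The four result endpoints share one pattern, so a client might prepare them together. The sketch below assumes these endpoints need the same Bearer token as the scan request and that the base URL is `https://contro1.com`; the dictionary keys and `build_result_requests` are names invented for this example.

```python
import urllib.request
from urllib.parse import quote

BASE_URL = "https://contro1.com"  # assumption: API shares the main site origin

RESULT_ENDPOINTS = {
    "manifest": "/api/manifest/{scan_id}",
    "llms_txt": "/api/llms-txt/{scan_id}",
    "navigation_map": "/api/navigation-map/{scan_id}",
    "robots_txt": "/api/robots-txt/{scan_id}",
}


def build_result_requests(token, scan_id):
    """Return one unsent GET request per generated file, keyed by file kind."""
    safe_id = quote(scan_id, safe="")  # keep the ID from altering the path
    return {
        kind: urllib.request.Request(
            BASE_URL + path.format(scan_id=safe_id),
            headers={"Authorization": f"Bearer {token}"},
            method="GET",
        )
        for kind, path in RESULT_ENDPOINTS.items()
    }
```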

## AI Agent Guidelines

### Recommended Interaction Flow

1. **Discovery Phase**
   - Check /robots.txt for AI-Manifest directive
   - Read /llms.txt for human-readable overview
   - Fetch manifest from /api/manifest/{scanId}

2. **Understanding Phase**
   - Review site structure in manifest
   - Identify key workflows and actions
   - Note authentication requirements

3. **Execution Phase**
   - Use Puppeteer configs for automation
   - Follow workflow instructions
   - Respect rate limits and authentication
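
The first discovery step can be sketched as a small robots.txt parser. The exact syntax of the AI-Manifest directive is not specified here, so this example assumes a `Name: value` line similar to `Sitemap:`; `find_directive` is a hypothetical helper.

```python
def find_directive(robots_txt, name="AI-Manifest"):
    """Return the values of every `name: value` line in robots.txt text."""
    values = []
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == name.lower() and value.strip():
            values.append(value.strip())
    return values
```

An agent that finds no such directive can still fall back to reading /llms.txt directly.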

### What AI Agents Can Do

- ✅ Scan public websites to generate manifests
- ✅ Analyze website structure and workflows
- ✅ Generate optimized navigation files
- ✅ Export browser automation configs
- ✅ Detect OAuth and authentication patterns
- ❌ Cannot access private user dashboards
- ❌ Cannot modify existing scans
- ❌ Cannot access protected routes without auth

## Key Workflows

### Workflow 1: Scan a Website (Public Quick Scan)
```
1. Navigate to https://contro1.com/home
2. Enter website URL in input field (id: "url-input")
3. Click "Analyze Website" button
4. View real-time analysis results
5. Sign up to access full features
```
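
Steps 1 to 3 could be automated roughly as follows. The function expects a page object with `goto`, `fill`, and `click` methods, as provided by the Playwright for Python sync API. The `#url-input` selector comes from the step above; locating the button by its visible text is an assumption, and `run_quick_scan` is a hypothetical name.

```python
HOME_URL = "https://contro1.com/home"


def run_quick_scan(page, target_url):
    """Drive the public quick-scan form through a Playwright-style page object."""
    page.goto(HOME_URL)
    page.fill("#url-input", target_url)  # input id taken from step 2
    page.click("text=Analyze Website")   # assumption: button matched by its label


# Possible usage with Playwright installed:
#   from playwright.sync_api import sync_playwright
#   with sync_playwright() as p:
#       browser = p.chromium.launch()
#       run_quick_scan(browser.new_page(), "https://example.com")
#       browser.close()
```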

### Workflow 2: Full Website Scan (Authenticated)
```
1. Log in at https://contro1.com/dashboard
2. Click "New Scan" button
3. Enter target website URL
4. Configure scan options:
   - Scan depth (quick/medium/full)
   - Include workflows
   - Include API detection
5. Click "Start Scan"
6. Monitor live progress with AI insights
7. Download generated files when complete
```

### Workflow 3: View Existing Manifests
```
1. Visit https://contro1.com/api/scanner/public/registry
2. Browse available manifests
3. Click on domain to view manifest
4. Download llms.txt, robots.txt, configs
```

## Technical Details

### Technology Stack
- Frontend: React 18, React Router 7
- Backend: Node.js, Express
- AI Engine: Python, LangGraph
- Browser Automation: Puppeteer
- Database: MongoDB Atlas
- Hosting: Google Cloud Run

### File Formats

**AI Manifest (JSON)**
- Semantic structure description
- Workflow definitions
- CSS selectors and patterns
- Authentication metadata

**llms.txt (Plain Text)**
- Human-readable documentation
- Workflow instructions
- API endpoints
- Usage guidelines

**robots.txt (Plain Text)**
- Bot-specific rules
- AI navigation directives
- Sitemap references
- Crawl policies

**Navigation Map (JSON)**
- Page-level selectors
- Interactive elements
- Form configurations
- API endpoints

## Rate Limits

- **Public Quick Discovery**: 10 requests/minute per IP
- **Authenticated Scans**: 100 requests/minute per user
- **Manifest Access**: Unlimited reads
- **Large Scans**: May take 5-15 minutes depending on site size
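
To stay under these limits, a client can throttle itself. The sketch below is a generic sliding-window limiter, not part of the Contro1 API; the class name and the injectable `clock` and `sleep` parameters are choices made for this example.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls within any `period`-second window."""

    def __init__(self, max_calls, period=60.0, clock=time.monotonic):
        self._max_calls = max_calls
        self._period = period
        self._clock = clock
        self._calls = deque()  # timestamps of recent calls, oldest first

    def wait_time(self):
        """Return how many seconds to wait before the next call is allowed."""
        now = self._clock()
        while self._calls and now - self._calls[0] >= self._period:
            self._calls.popleft()  # forget calls that left the window
        if len(self._calls) < self._max_calls:
            return 0.0
        return self._period - (now - self._calls[0])

    def record(self):
        """Register that a call is being made now."""
        self._calls.append(self._clock())

    def acquire(self, sleep=time.sleep):
        """Block until a call is allowed, then record it."""
        delay = self.wait_time()
        while delay > 0:
            sleep(delay)
            delay = self.wait_time()
        self.record()
```

For the public quick-discovery limit, `SlidingWindowLimiter(10)` with `acquire()` called before each request would keep a single client at 10 requests per minute.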

## Support & Contact

- **Contact Form**: https://contro1.com/contact (preferred)
- **Email**: ariel@contro1.com
- **Website**: https://contro1.com
- **Documentation**: https://contro1.com/llms.txt
- **Security Issues**: https://contro1.com/contact (select "Security Issue")
- **API Status**: https://contro1.com/api/health

## Privacy & Security

- We respect robots.txt of scanned sites
- No personal data is stored from scanned sites
- Authentication tokens are encrypted
- GDPR compliant
- User data is private by default

## Example Use Cases

1. **SEO Optimization**: Generate AI-optimized robots.txt for better discoverability
2. **AI Agent Development**: Get structured manifests for training agents on your site
3. **Browser Automation**: Export ready-to-use Puppeteer configurations
4. **Documentation**: Auto-generate llms.txt for your developer docs
5. **Testing**: Create test suites based on detected workflows

## Generated by Contro1

This llms.txt was created as a demonstration of what Contro1 generates for websites.

**Scan your own website at https://contro1.com to get your custom llms.txt!**

---

Last updated: 2026-02-05
Version: 1.0.0
Format: llms.txt (https://llmstxt.org)