Skip to content

Defensive security tool that prevents Claude Code from making destructive modifications during test analysis

License

Notifications You must be signed in to change notification settings

Eliran79/test-mode-tool

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

4 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

Test Mode Tool for Claude Code

A defensive security system that prevents Claude Code from making destructive modifications during test analysis, ensuring test failures remain as valuable diagnostic information.

Version License: MIT Security

🎯 Problem Solved

The "Never Give Up" Problem: When Claude Code encounters test failures, it tends to:

  • ❌ Modify source code to make tests pass instead of reporting issues
  • ❌ Change test expectations to avoid failures
  • ❌ Add mocks to bypass real implementation challenges
  • ❌ Refactor working code to "fix" test problems

Our Solution: Test Mode Tool uses automation-first commands that create hard boundaries, channeling AI persistence into productive observation rather than destructive modification.

πŸ›‘οΈ What Test Mode Does

When Active:

  • βœ… Analyzes test failures without making code changes
  • βœ… Documents root causes with detailed investigation
  • βœ… Suggests specific fixes for human developers to implement
  • βœ… Preserves test intent - doesn't change what tests validate
  • βœ… Maintains project isolation - no cross-project interference

What's Blocked:

  • 🚫 File modifications: Edit, Write, MultiEdit tools blocked
  • 🚫 Dangerous commands: rm, mv, cp, output redirection blocked
  • 🚫 Test changes: No modification of test expectations
  • 🚫 Configuration changes: Package installs and config updates blocked

πŸš€ Quick Start

1. Install

# Set permissions chmod +x .claude/hooks/*.sh # Install test mode hooks  .claude/hooks/test_mode_setup.sh enable # Validate installation .claude/hooks/test_mode_setup.sh validate

2. Use Test Mode

# Activate test mode (fully automated via script) /test_mode:on # Run tests - Claude will analyze without modifying files npm test # or mvn test, pytest, etc. # Check what Claude found /test_mode:status # Clean up completely (removes all artifacts) /test_mode:clean # Or just deactivate (preserves logs) /test_mode:off

3. Deploy to Other Projects

# Install to another project safely ./install_test_mode.sh /path/to/other/project # Validate before installing ./install_test_mode.sh --dry-run /path/to/other/project

πŸ—οΈ Architecture

Three-Layer Protection System

  1. Custom Slash Commands - /test_mode:on|off|status|clean

    • Automation-First Design: All operations execute test_mode_setup.sh scripts
    • Zero Manual Steps: Commands perform atomic operations via tested automation
    • User-friendly interface with automatic project isolation
    • Comprehensive cleanup via dedicated clean command
  2. Security Hooks System - Comprehensive file modification blocking

    • Automated Setup: test_mode_setup.sh configures all hooks
    • PreToolUse hooks block destructive tools
    • PostToolUse hooks log violations and usage
    • Project context validation prevents interference
  3. Specialized Agents - Read-only test analysis

    • test-mode-observer: Analyzes tests without modifications
    • test-reporter: Generates comprehensive reports

How Commands Work

Each command executes automation scripts directly for consistent behavior:

  • /test_mode:on: Executes test_mode_setup.sh enable with project context
  • /test_mode:off: Executes test_mode_setup.sh disable for clean deactivation
  • /test_mode:status: Shows current project status from automation-managed state files
  • /test_mode:clean: Executes test_mode_setup.sh cleanup for complete artifact removal

This automation-first approach ensures:

  • βœ… Zero manual steps - commands execute tested scripts directly
  • βœ… Atomic operations with rollback capability and error handling
  • βœ… Security validation at every operation
  • βœ… Complete cleanup via dedicated clean command that removes all artifacts

Project Isolation

  • Independent State: Each project maintains separate test mode status
  • Path Validation: All operations validated against project context
  • Clean Separation: Zero cross-project interference
  • Dual Deployment: Project-level (team) and user-level (personal) options

πŸ“Š Expected Benefits

  • 90% Reduction in destructive AI workarounds during testing
  • 3x More Detailed test failure documentation
  • Zero Test Expectation modifications - preserves original intent
  • Complete Project Isolation - no cross-project interference
  • Improved Architectural Insights - test failures reveal design issues

πŸ”§ Installation Options

Project-Level (Team Deployment)

Configuration committed to source control for consistent team behavior:

# Install for entire team (automated setup) .claude/hooks/test_mode_setup.sh enable git add .claude/ git commit -m "Add Test Mode Tool"

User-Level (Personal Workflow)

Personal settings across all projects:

# Copy to user directory cp -r .claude/commands/test_mode ~/.claude/commands/ cp .claude/hooks/*.sh ~/.claude/hooks/ # Automated user-level setup ~/.claude/hooks/test_mode_setup.sh enable --user

Other Projects

Use the secure installation script with automation:

# Automated installation with validation ./install_test_mode.sh /path/to/target/project # Dry run to verify before installation ./install_test_mode.sh --dry-run /path/to/target/project

πŸ›‘οΈ Security Features

Multi-Layer Security Model

  • Input Validation: JSON size limits, syntax validation, anti-traversal
  • Project Isolation: Context validation, path matching, stale file cleanup
  • Atomic Operations: Backup creation, rollback capability, validation
  • Monitoring: Security event logging, usage analytics, retention policies

Threat Protection

  • βœ… File System Attacks: Directory traversal, symlink attacks blocked
  • βœ… Command Injection: Shell injection, path injection prevented
  • βœ… Logic Attacks: Race conditions, state manipulation defended
  • βœ… Cross-Project Interference: Complete isolation with validation

πŸ“š Documentation

πŸ” Usage Examples

Basic Workflow

# Activate test mode for current project (fully automated) /test_mode:on # Run failing tests npm test # Claude analyzes and documents issues without modifying code # Check findings with /test_mode:status # Review suggestions in TodoWrite output # Clean up completely when done (removes all artifacts) /test_mode:clean

Advanced Usage

# Scoped activation (backend only) - automated script handles configuration /test_mode:on --scope=backend # Time-limited session (auto-deactivate after 1 hour) /test_mode:on --duration=1h --strict # Multi-project isolation (each uses independent automation) cd ~/project-a && /test_mode:on # Active only in project-a cd ~/project-b && /test_mode:on # Independent state in project-b # Comprehensive cleanup (automated artifact removal) /test_mode:clean # Removes all test mode artifacts and state files

🀝 Contributing

We welcome contributions to improve the Test Mode Tool! Please:

  1. Security First: All contributions must maintain security standards
  2. Defensive Focus: This tool prevents destructive behavior, not enables it
  3. Project Isolation: Ensure no cross-project interference in changes
  4. Comprehensive Testing: Validate security and functionality

πŸ“„ License

MIT License - see LICENSE file for details.

πŸ†˜ Support

  • Installation Issues: See INSTALLATION.md
  • Troubleshooting: Check Hook System README
  • Security Concerns: Review security model in documentation
  • Feature Requests: Open an issue with defensive security focus

πŸ† Recognition

This tool addresses the fundamental "never give up" problem identified by the VibeTDD experiments in AI-assisted development, where AI persistence leads to destructive workarounds instead of proper issue analysis.

Inspiration & Discovery

This solution was inspired by the groundbreaking analysis from VibeTDD that discovered Claude Code's problematic "never give up" behavior during testing:

The VibeTDD experiments revealed that Claude Code consistently:

  • Modifies source code to make failing tests pass
  • Changes test expectations to avoid addressing real issues
  • Adds mocks and workarounds instead of documenting actual problems
  • Turns testing sessions into refactoring sessions

Our Solution

By creating hard boundaries through this Test Mode Tool, we channel AI capabilities into productive observation rather than harmful modification. The tool enforces the principle that test failures are valuable diagnostic information, not problems to hide.

Credit: Special thanks to the VibeTDD team for identifying this critical issue in AI-assisted development and inspiring defensive solutions like this Test Mode Tool.

About

Defensive security tool that prevents Claude Code from making destructive modifications during test analysis

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •  

Languages