This document provides comprehensive testing guidelines for XcodeBuildMCP plugins, ensuring consistent, robust, and maintainable test coverage across the entire codebase.
- Testing Philosophy
- Test Architecture
- Dependency Injection Strategy
- Three-Dimensional Testing
- Test Organization
- Test Patterns
- Performance Requirements
- Coverage Standards
- Common Patterns
- Manual Testing with Reloaderoo
- Troubleshooting
Allowed:
- `createMockExecutor()` / `createNoopExecutor()` for command execution (`xcrun`, `xcodebuild`, AXe, etc.)
- `createMockFileSystemExecutor()` / `createNoopFileSystemExecutor()` for file system interactions
- Vitest mocking (`vi.fn`, `vi.mock`, `vi.spyOn`, `.mockResolvedValue`, etc.) for internal modules and in-memory collaborators
- Prefer straightforward, readable test doubles over over-engineered mocks

Forbidden:
- Hitting real external systems in unit tests (real `xcodebuild`, `xcrun`, AXe, filesystem writes/reads outside the test harness)
- Bypassing dependency injection for external effects
Simple Rule: Use dependency-injection mock executors for external boundaries; use Vitest mocking only for internal behavior.
Why This Rule Exists:
- Reliability: External side effects stay deterministic and hermetic
- Clarity: Internal collaboration assertions remain concise and readable
- Architectural Enforcement: External boundaries are explicit in tool logic signatures
- Maintainability: Tests fail for behavior regressions, not incidental environment differences
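To make the boundary concrete, here is a minimal, self-contained sketch of the mock-executor idea. The real `createMockExecutor` lives in the project's command utilities; the types and behavior below are simplified assumptions based on how it is used throughout this guide.

```typescript
// Illustrative sketch only: the real factory lives in src/utils/command.
// CommandResult/CommandExecutor shapes here are assumptions.
type CommandResult = { success: boolean; output: string; error?: string };
type CommandExecutor = (command: string[]) => Promise<CommandResult>;

function createMockExecutor(result: CommandResult | Error): CommandExecutor {
  return async (_command: string[]) => {
    // Simulate spawn-level failures (e.g. ENOENT) by throwing.
    if (result instanceof Error) throw result;
    // Otherwise return a deterministic, hermetic result for the boundary.
    return result;
  };
}

// Tool logic only ever talks to the injected boundary, never a real process.
async function demoLogic(exec: CommandExecutor): Promise<string> {
  const result = await exec(['xcodebuild', '-scheme', 'MyApp', 'build']);
  return result.success ? 'Build succeeded' : `Build failed: ${result.error ?? ''}`;
}

const exec = createMockExecutor({ success: true, output: 'BUILD SUCCEEDED' });
```

Because the executor is the only path to the external system, the same `demoLogic` runs unchanged in production with a real executor injected instead.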
XcodeBuildMCP follows a dependency-injection testing philosophy for external boundaries:
- ✅ Test plugin interfaces (public API contracts)
- ✅ Test integration flows (plugin → utilities → external tools)
- ✅ Use dependency injection with createMockExecutor()/createMockFileSystemExecutor() for external dependencies
- ✅ Use Vitest mocking when needed for internal modules and collaborators
- Implementation Independence: Internal refactoring doesn't break tests
- Real Coverage: Tests verify actual user data flows
- Maintainability: No brittle vitest mocks that break on implementation changes
- True Integration: Catches integration bugs between layers
- Test Safety: Default executors throw errors in test environment
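The test-safety point can be sketched as follows; the environment checks and error message are assumptions for illustration, not the project's actual implementation.

```typescript
// Sketch of the "default executors throw in tests" safety net.
type CommandExecutor = (
  command: string[],
) => Promise<{ success: boolean; output: string }>;

function getDefaultCommandExecutor(): CommandExecutor {
  if (process.env.VITEST || process.env.NODE_ENV === 'test') {
    // Fail fast: a test reached production execution without injecting a mock.
    throw new Error(
      'Default CommandExecutor used in a test; inject createMockExecutor() instead.',
    );
  }
  return async (_command) => {
    // In production this would spawn the real process.
    return { success: true, output: '' };
  };
}
```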
To enforce external-boundary testing policy, the project includes a script that checks for architectural test-pattern violations.
# Run the script to check for violations
node scripts/check-code-patterns.js

This script is part of the standard development workflow and should be run before committing changes to ensure compliance with the testing standards.
Violations detected:
- Manual mock executors: `const mockExecutor = async (...) => { ... }`
- Manual filesystem mocks: `const mockFsDeps = { readFile: async () => ... }`
- Manual server mocks: `const mockServer = { ... }`
- External side-effect patterns that bypass injected executors/filesystem dependencies

Allowed patterns:
- Test data tracking: `commandCalls.push({ ... })` is just collecting test data, not mocking behavior
- Regular variables: `const testData = { ... }` and other non-mocking object assignments
- Test setup: regular const assignments that don't implement mock behavior
The script has been refined to minimize false positives while catching all legitimate violations of our core rule.
Test → Plugin Handler → utilities → [DEPENDENCY INJECTION] createMockExecutor()
- Plugin parameter validation
- Business logic execution
- Command generation
- Response formatting
- Error handling
- Integration between layers
- Command execution via `createMockExecutor()`
- File system operations via `createMockFileSystemExecutor()`
- Internal modules can use Vitest mocks where appropriate
All plugin handlers must support dependency injection:
export function tool_nameLogic(
args: Record<string, unknown>,
commandExecutor: CommandExecutor,
fileSystemExecutor?: FileSystemExecutor
): Promise<ToolResponse> {
// Use injected executors
const result = await executeCommand(['xcrun', 'simctl', 'list'], commandExecutor);
return createTextResponse(result.output);
}
export default {
name: 'tool_name',
description: 'Tool description',
schema: { /* zod schema */ },
async handler(args: Record<string, unknown>): Promise<ToolResponse> {
return tool_nameLogic(args, getDefaultCommandExecutor(), getDefaultFileSystemExecutor());
},
};

Important: The dependency injection pattern applies to ALL handlers, including:
- Tool handlers
- Resource handlers
- Any future handler types (prompts, etc.)
Always use default parameter values (e.g., `= getDefaultCommandExecutor()`) to ensure production code works without explicit executor injection, while tests can override with mock executors.
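A minimal sketch of the default-parameter pattern, with simplified types; the real executor types and `getDefaultCommandExecutor` come from the project's utilities.

```typescript
// Simplified shapes for illustration.
type CommandResult = { success: boolean; output: string };
type CommandExecutor = (command: string[]) => Promise<CommandResult>;

// Stand-in for the real default factory (which spawns processes in
// production and throws if reached from a test).
const getDefaultCommandExecutor = (): CommandExecutor => async () => ({
  success: true,
  output: 'real execution',
});

// Production callers omit the executor; tests override it explicitly.
async function toolLogic(
  _args: Record<string, unknown>,
  commandExecutor: CommandExecutor = getDefaultCommandExecutor(),
): Promise<string> {
  const result = await commandExecutor(['xcrun', 'simctl', 'list']);
  return result.output;
}
```

Because the default is a parameter initializer, it is evaluated per call only when no executor is passed, so tests never pay for (or accidentally trigger) production wiring.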
All tests must explicitly provide mock executors:
it('should handle successful command execution', async () => {
const mockExecutor = createMockExecutor({
success: true,
output: 'BUILD SUCCEEDED'
});
const result = await tool_nameLogic(
{ projectPath: '/test.xcodeproj', scheme: 'MyApp' },
mockExecutor
);
expect(result.content[0].text).toContain('Build succeeded');
});

Every plugin test suite must validate three critical dimensions:
Test parameter validation and schema compliance:
describe('Parameter Validation', () => {
it('should accept valid parameters', () => {
const schema = z.object(tool.schema);
expect(schema.safeParse({
projectPath: '/valid/path.xcodeproj',
scheme: 'ValidScheme'
}).success).toBe(true);
});
it('should reject invalid parameters', () => {
const schema = z.object(tool.schema);
expect(schema.safeParse({
projectPath: 123, // Wrong type
scheme: 'ValidScheme'
}).success).toBe(false);
});
it('should handle missing required parameters', async () => {
const mockExecutor = createMockExecutor({ success: true });
const result = await tool.handler({ scheme: 'MyApp' }, mockExecutor); // Missing projectPath
expect(result).toEqual({
content: [{
type: 'text',
text: "Required parameter 'projectPath' is missing. Please provide a value for this parameter."
}],
isError: true
});
});
});

Test command generation and execution:

describe('Command Generation', () => {
it('should execute correct command with minimal parameters', async () => {
const mockExecutor = createMockExecutor({
success: true,
output: 'BUILD SUCCEEDED'
});
const result = await tool.handler({
projectPath: '/test.xcodeproj',
scheme: 'MyApp'
}, mockExecutor);
// Verify through successful response - command was executed correctly
expect(result.content[0].text).toContain('Build succeeded');
});
it('should handle paths with spaces correctly', async () => {
const mockExecutor = createMockExecutor({
success: true,
output: 'BUILD SUCCEEDED'
});
const result = await tool.handler({
projectPath: '/Users/dev/My Project/app.xcodeproj',
scheme: 'MyApp'
}, mockExecutor);
// Verify successful execution (proper path handling)
expect(result.content[0].text).toContain('Build succeeded');
});
});

Test response formatting and error handling:
describe('Response Processing', () => {
it('should format successful response', async () => {
const mockExecutor = createMockExecutor({
success: true,
output: 'BUILD SUCCEEDED'
});
const result = await tool.handler({ projectPath: '/test', scheme: 'MyApp' }, mockExecutor);
expect(result).toEqual({
content: [{ type: 'text', text: '✅ Build succeeded for scheme MyApp' }]
});
});
it('should handle command failures', async () => {
const mockExecutor = createMockExecutor({
success: false,
output: 'Build failed with errors',
error: 'Compilation error'
});
const result = await tool.handler({ projectPath: '/test', scheme: 'MyApp' }, mockExecutor);
expect(result.isError).toBe(true);
expect(result.content[0].text).toContain('Build failed');
});
it('should handle executor errors', async () => {
const mockExecutor = createMockExecutor(new Error('spawn xcodebuild ENOENT'));
const result = await tool.handler({ projectPath: '/test', scheme: 'MyApp' }, mockExecutor);
expect(result).toEqual({
content: [{ type: 'text', text: 'Error during build: spawn xcodebuild ENOENT' }],
isError: true
});
});
});

src/plugins/[workflow-group]/
├── __tests__/
│ ├── index.test.ts # Workflow metadata tests (canonical groups only)
│ ├── re-exports.test.ts # Re-export validation (project/workspace groups only)
│ ├── tool1.test.ts # Individual tool tests
│ ├── tool2.test.ts
│ └── ...
├── tool1.ts
├── tool2.ts
├── index.ts # Workflow metadata
└── ...
Test individual plugin tools with full three-dimensional coverage.
Test workflow metadata for canonical groups:
describe('simulator-workspace workflow metadata', () => {
it('should have correct workflow name', () => {
expect(workflow.name).toBe('iOS Simulator Workspace Development');
});
it('should have correct description', () => {
expect(workflow.description).toBe(
'Complete iOS development workflow for .xcworkspace files including build, test, deploy, and debug capabilities',
);
});
});

Test re-export integrity for project/workspace groups:
describe('simulator-project re-exports', () => {
it('should re-export boot_sim from simulator-shared', () => {
expect(bootSim.name).toBe('boot_sim');
expect(typeof bootSim.handler).toBe('function');
});
});

import { vi, describe, it, expect, beforeEach } from 'vitest';
import { z } from 'zod';
// Use dependency-injection mocks for external boundaries.
// Vitest mocks are acceptable for internal collaborators when needed.
import tool from '../tool_name.ts';
import { createMockExecutor } from '../../utils/command.js';
describe('tool_name', () => {
describe('Export Field Validation (Literal)', () => {
it('should export correct name', () => {
expect(tool.name).toBe('tool_name');
});
it('should export correct description', () => {
expect(tool.description).toBe('Expected literal description');
});
it('should export handler function', () => {
expect(typeof tool.handler).toBe('function');
});
// Schema validation tests...
});
describe('Command Generation', () => {
it('should execute commands successfully', async () => {
const mockExecutor = createMockExecutor({
success: true,
output: 'Expected output'
});
const result = await tool.handler(validParams, mockExecutor);
expect(result.content[0].text).toContain('Expected result');
});
});
describe('Response Processing', () => {
// Output handling tests...
});
});

- Individual test: < 100ms
- Test file: < 5 seconds
- Full test suite: < 20 seconds
- No real system calls: Tests must use mocks
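One way to back these budgets with tooling is Vitest's timeout options; a sketch, assuming a standard `vitest.config.ts` (the exact values and file location are illustrative):

```typescript
// vitest.config.ts (sketch; adjust values to the project's budgets)
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    // Fail any single test that exceeds the per-test budget.
    testTimeout: 100,
    // Fail hooks that hang, which usually indicates a real external call.
    hookTimeout: 1000,
  },
});
```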
❌ Real command execution:
[INFO] Executing command: xcodebuild -showBuildSettings...
❌ Long timeouts (indicate real calls)
❌ File system operations (unless testing file utilities)
❌ Network requests (unless testing network utilities)
- Overall: 95%+
- Plugin handlers: 100%
- Command generation: 100%
- Error paths: 100%
# Check coverage for specific plugin group
npm run test:coverage -- plugins/simulator-workspace/
# Ensure all code paths are tested
npm run test:coverage -- --reporter=lcov

Every plugin test must cover:
- ✅ Valid parameter combinations
- ✅ Invalid parameter rejection
- ✅ Missing required parameters
- ✅ Successful command execution
- ✅ Command failure scenarios
- ✅ Executor error handling
- ✅ Output parsing edge cases
it('should use default configuration when not provided', async () => {
const mockExecutor = createMockExecutor({
success: true,
output: 'BUILD SUCCEEDED'
});
const result = await tool.handler({
projectPath: '/test.xcodeproj',
scheme: 'MyApp'
// configuration intentionally omitted
}, mockExecutor);
// Verify default behavior through successful response
expect(result.content[0].text).toContain('Build succeeded');
});

it('should extract app path from build settings', async () => {
const mockExecutor = createMockExecutor({
success: true,
output: `
CONFIGURATION_BUILD_DIR = /path/to/build
BUILT_PRODUCTS_DIR = /path/to/products
FULL_PRODUCT_NAME = MyApp.app
OTHER_SETTING = ignored_value
`
});
const result = await tool.handler({ projectPath: '/test', scheme: 'MyApp' }, mockExecutor);
expect(result.content[0].text).toContain('/path/to/products/MyApp.app');
});

it('should format validation errors correctly', async () => {
const mockExecutor = createMockExecutor({ success: true });
const result = await tool.handler({}, mockExecutor); // Missing required params
expect(result).toEqual({
content: [{
type: 'text',
text: "Required parameter 'projectPath' is missing. Please provide a value for this parameter."
}],
isError: true
});
});

- EVERY SINGLE TOOL - All 83+ tools must be tested individually, one by one
- NO REPRESENTATIVE SAMPLING - Testing similar tools does NOT validate other tools
- NO PATTERN RECOGNITION SHORTCUTS - Similar-looking tools may have different behaviors
- NO EFFICIENCY OPTIMIZATIONS - Thoroughness is more important than speed
- NO TIME CONSTRAINTS - This is a long-running task with no deadline pressure
- NEVER assume testing `build_sim_id_proj` validates `build_sim_name_proj`
- NEVER skip tools because they "look similar" to tested ones
- NEVER use representative sampling instead of complete coverage
- NEVER stop testing due to time concerns or perceived redundancy
- NEVER group tools together for batch testing
- NEVER make assumptions about untested tools based on tested patterns
- Individual Tool Testing: Each tool gets its own dedicated test execution
- Complete Documentation: Every tool result must be recorded, regardless of outcome
- Systematic Progress: Use TodoWrite to track every single tool as tested/untested
- Failure Documentation: Test tools that cannot work and mark them as failed/blocked
- No Assumptions: Treat each tool as potentially unique requiring individual validation
- Start Count: Record exact number of tools discovered (e.g., 83 tools)
- End Count: Verify same number of tools have been individually tested
- Missing Tools = Testing Failure: If any tools remain untested, the testing is incomplete
- TodoWrite Tracking: Every tool must appear in todo list and be marked completed
Black Box Testing means testing ONLY through external interfaces without any knowledge of internal implementation. For XcodeBuildMCP, this means testing exclusively through the Model Context Protocol (MCP) interface using Reloaderoo as the MCP client.
- ✅ ONLY ALLOWED: Reloaderoo Inspect Commands
npx reloaderoo@latest inspect call-tool "TOOL_NAME" --params 'JSON' -- node build/cli.js mcp
npx reloaderoo@latest inspect list-tools -- node build/cli.js mcp
npx reloaderoo@latest inspect read-resource "URI" -- node build/cli.js mcp
npx reloaderoo@latest inspect server-info -- node build/cli.js mcp
npx reloaderoo@latest inspect ping -- node build/cli.js mcp
- ❌ COMPLETELY FORBIDDEN ACTIONS:
- NEVER call `mcp__XcodeBuildMCP__tool_name()` functions directly
- NEVER use MCP server tools as if they were native functions
- NEVER access internal server functionality
- NEVER read source code to understand how tools work
- NEVER examine implementation files during testing
- NEVER diagnose internal server issues or registration problems
- NEVER suggest code fixes or implementation changes
- 🚨 CRITICAL VIOLATION EXAMPLES:
// ❌ FORBIDDEN - Direct MCP tool calls
await mcp__XcodeBuildMCP__list_devices();
await mcp__XcodeBuildMCP__build_sim_id_proj({ ... });

// ❌ FORBIDDEN - Using tools as native functions
const devices = await list_devices();
const result = await doctor();

// ✅ CORRECT - Only through Reloaderoo inspect
npx reloaderoo@latest inspect call-tool "list_devices" --params '{}' -- node build/cli.js mcp
npx reloaderoo@latest inspect call-tool "doctor" --params '{}' -- node build/cli.js mcp
- Higher Fidelity: Provides clear input/output visibility for each tool call
- Real-world Simulation: Tests exactly how MCP clients interact with the server
- Interface Validation: Ensures MCP protocol compliance and proper JSON formatting
- Black Box Enforcement: Prevents accidental access to internal implementation details
- Clean State: Each tool call runs with a fresh MCP server instance, preventing cross-contamination
Reloaderoo starts a fresh MCP server instance for each individual tool call and terminates it immediately after the response. This ensures:
- ✅ Clean Testing Environment: No state contamination between tool calls
- ✅ Isolated Testing: Each tool test is independent and repeatable
- ✅ Real-world Accuracy: Simulates how most MCP clients interact with servers
Some tools rely on in-memory state within the MCP server and will fail when tested via Reloaderoo inspect. These failures are expected and acceptable as false negatives:
- `swift_package_stop` - Requires in-memory process tracking from `swift_package_run`
- `stop_app_device` - Requires in-memory process tracking from `launch_app_device`
- `stop_app_sim` - Requires in-memory process tracking from `launch_app_sim`
- `stop_device_log_cap` - Requires in-memory session tracking from `start_device_log_cap`
- `stop_sim_log_cap` - Requires in-memory session tracking from `start_sim_log_cap`
- `stop_mac_app` - Requires in-memory process tracking from `launch_mac_app`
- Test the tool anyway - Execute the Reloaderoo inspect command
- Expect failure - Tool will likely fail due to missing state
- Mark as false negative - Document the failure as expected due to stateful limitations
- Continue testing - Do not attempt to fix or investigate the failure
- Report as finding - Note in testing report that stateful tools failed as expected
- ✅ Test ALL 83+ tools individually - No exceptions, every tool gets manual verification
- ✅ Follow dependency graphs - Test tools in correct order based on data dependencies
- ✅ Capture key outputs - Record UUIDs, paths, schemes needed by dependent tools
- ✅ Test real workflows - Complete end-to-end workflows from discovery to execution
- ✅ Use programmatic JSON parsing - Accurate tool/resource counting and discovery
- ✅ Document all observations - Record exactly what you see via testing
- ✅ Report discrepancies as findings - Note unexpected results without investigation
# Generate complete list of all tools
npx reloaderoo@latest inspect list-tools -- node build/cli.js mcp > /tmp/all_tools.json
TOTAL_TOOLS=$(jq '.tools | length' /tmp/all_tools.json)
echo "TOTAL TOOLS TO TEST: $TOTAL_TOOLS"
# Extract all tool names for systematic testing
jq -r '.tools[].name' /tmp/all_tools.json > /tmp/tool_names.txt

# Create individual todo items for each of the 83+ tools
# Example for first few tools:
# 1. [ ] Test tool: doctor
# 2. [ ] Test tool: list_devices
# 3. [ ] Test tool: list_sims
# ... (continue for ALL 83+ tools)

For EVERY tool in the list:
# Test each tool individually - NO BATCHING
npx reloaderoo@latest inspect call-tool "TOOL_NAME" --params 'APPROPRIATE_PARAMS' -- node build/cli.js mcp
# Mark tool as completed in TodoWrite IMMEDIATELY after testing
# Record result (success/failure/blocked) for each tool

# Verify all tools tested
COMPLETED_TOOLS=$(count completed todo items)
if [ $COMPLETED_TOOLS -ne $TOTAL_TOOLS ]; then
echo "ERROR: Testing incomplete. $COMPLETED_TOOLS/$TOTAL_TOOLS tested"
exit 1
fi

- Every tool name from the JSON list must be individually tested
- Every tool must have a TodoWrite entry that gets marked completed
- Tools that fail due to missing parameters should be tested anyway and marked as blocked
- Tools that require setup (like running processes) should be tested and documented as requiring dependencies
- NO ASSUMPTIONS: Test tools even if they seem redundant or similar to others
- ✅ Test only through Reloaderoo MCP interface - Simulates real-world MCP client usage
- ✅ Use task lists - Track progress with TodoWrite tool for every single tool
- ✅ Tick off each tool - Mark completed in task list after manual verification
- ✅ Manual oversight - Human verification of each tool's input and output
- ❌ Never examine source code - No reading implementation files during testing
- ❌ Never diagnose internal issues - No investigation of build processes or tool registration
- ❌ Never suggest implementation fixes - Report issues as findings, don't solve them
- ❌ Never use scripts for tool testing - Each tool must be manually executed and verified
- Symptom: "These tools look similar, I'll test one to validate the others"
- Correction: Every tool is unique and must be tested individually
- Enforcement: Count tools at start, verify same count tested at end
- Symptom: "I see the pattern, the rest will work the same way"
- Correction: Patterns may hide edge cases, bugs, or different implementations
- Enforcement: No assumptions allowed, test every tool regardless of apparent similarity
- Symptom: "This is taking too long, let me speed up by sampling"
- Correction: This is explicitly a long-running task with no time constraints
- Enforcement: Thoroughness is the ONLY priority, efficiency is irrelevant
- Symptom: "The architecture is solid, so all tools must work"
- Correction: Architecture validation does not guarantee individual tool functionality
- Enforcement: Test tools to discover actual issues, not to confirm assumptions
- Every tool is potentially broken until individually tested
- Every tool may have unique edge cases not covered by similar tools
- Every tool deserves individual attention regardless of apparent redundancy
- Testing completion means EVERY tool tested, not "enough tools to validate patterns"
- The goal is discovering problems, not confirming everything works
- Generated complete tool list (83+ tools)
- Created TodoWrite entry for every single tool
- Tested every tool individually via Reloaderoo inspect
- Marked every tool as completed in TodoWrite
- Verified tool count: tested_count == total_count
- Documented all results, including failures and blocked tools
- Created final report covering ALL tools, not just successful ones
CRITICAL: Tools must be tested in dependency order:
- Foundation Tools (provide data for other tools):
  - `doctor` - System info
  - `list_devices` - Device UUIDs
  - `list_sims` - Simulator UUIDs
  - `discover_projs` - Project/workspace paths
- Discovery Tools (provide metadata for build tools):
  - `list_schemes` - Scheme names
  - `show_build_settings` - Build settings
- Build Tools (create artifacts for install tools):
  - `build_*` tools - Create app bundles
  - `get_*_app_path_*` tools - Locate built app bundles
  - `get_*_bundle_id` tools - Extract bundle IDs
- Installation Tools (depend on built artifacts):
  - `install_app_*` tools - Install built apps
  - `launch_app_*` tools - Launch installed apps
- Testing Tools (depend on projects/schemes):
  - `test_*` tools - Run test suites
- UI Automation Tools (depend on running apps):
  - `snapshot_ui`, `screenshot`, `tap`, etc.
Must capture and document these values for dependent tools:
- Device UUIDs from `list_devices`
- Simulator UUIDs from `list_sims`
- Project/workspace paths from `discover_projs`
- Scheme names from `list_schems_*`
- App bundle paths from `get_*_app_path_*`
- Bundle IDs from `get_*_bundle_id`
- Build the server: `npm run build`
- Install jq: `brew install jq` (required for JSON parsing)
- System Requirements: macOS with Xcode installed; connected devices/simulators optional
# Generate complete tool list with accurate count
npx reloaderoo@latest inspect list-tools -- node build/cli.js mcp 2>/dev/null > /tmp/tools.json
# Get accurate tool count
TOOL_COUNT=$(jq '.tools | length' /tmp/tools.json)
echo "Official tool count: $TOOL_COUNT"
# Generate tool names list for testing checklist
jq -r '.tools[] | .name' /tmp/tools.json > /tmp/tool_names.txt
echo "Tool names saved to /tmp/tool_names.txt"

# Generate complete resource list
npx reloaderoo@latest inspect list-resources -- node build/cli.js mcp 2>/dev/null > /tmp/resources.json
# Get accurate resource count
RESOURCE_COUNT=$(jq '.resources | length' /tmp/resources.json)
echo "Official resource count: $RESOURCE_COUNT"
# Generate resource URIs for testing checklist
jq -r '.resources[] | .uri' /tmp/resources.json > /tmp/resource_uris.txt
echo "Resource URIs saved to /tmp/resource_uris.txt"

# Generate markdown checklist from actual tool list
echo "# Official Tool Testing Checklist" > /tmp/tool_testing_checklist.md
echo "" >> /tmp/tool_testing_checklist.md
echo "Total Tools: $TOOL_COUNT" >> /tmp/tool_testing_checklist.md
echo "" >> /tmp/tool_testing_checklist.md
# Add each tool as unchecked item
while IFS= read -r tool_name; do
echo "- [ ] $tool_name" >> /tmp/tool_testing_checklist.md
done < /tmp/tool_names.txt
echo "Tool testing checklist created at /tmp/tool_testing_checklist.md"

# Generate markdown checklist from actual resource list
echo "# Official Resource Testing Checklist" > /tmp/resource_testing_checklist.md
echo "" >> /tmp/resource_testing_checklist.md
echo "Total Resources: $RESOURCE_COUNT" >> /tmp/resource_testing_checklist.md
echo "" >> /tmp/resource_testing_checklist.md
# Add each resource as unchecked item
while IFS= read -r resource_uri; do
echo "- [ ] $resource_uri" >> /tmp/resource_testing_checklist.md
done < /tmp/resource_uris.txt
echo "Resource testing checklist created at /tmp/resource_testing_checklist.md"

# Get schema for specific tool to understand required parameters
TOOL_NAME="list_devices"
jq --arg tool "$TOOL_NAME" '.tools[] | select(.name == $tool) | .inputSchema' /tmp/tools.json
# Get tool description for usage guidance
jq --arg tool "$TOOL_NAME" '.tools[] | select(.name == $tool) | .description' /tmp/tools.json
# Generate parameter template for tool testing
jq --arg tool "$TOOL_NAME" '.tools[] | select(.name == $tool) | .inputSchema.properties // {}' /tmp/tools.json

# Create schema reference file for all tools
echo "# Tool Schema Reference" > /tmp/tool_schemas.md
echo "" >> /tmp/tool_schemas.md
while IFS= read -r tool_name; do
echo "## $tool_name" >> /tmp/tool_schemas.md
echo "" >> /tmp/tool_schemas.md
# Get description
description=$(jq -r --arg tool "$tool_name" '.tools[] | select(.name == $tool) | .description' /tmp/tools.json)
echo "**Description:** $description" >> /tmp/tool_schemas.md
echo "" >> /tmp/tool_schemas.md
# Get required parameters
required=$(jq -r --arg tool "$tool_name" '.tools[] | select(.name == $tool) | .inputSchema.required // [] | join(", ")' /tmp/tools.json)
if [ "$required" != "" ]; then
echo "**Required Parameters:** $required" >> /tmp/tool_schemas.md
else
echo "**Required Parameters:** None" >> /tmp/tool_schemas.md
fi
echo "" >> /tmp/tool_schemas.md
# Get all parameters
echo "**All Parameters:**" >> /tmp/tool_schemas.md
jq --arg tool "$tool_name" '.tools[] | select(.name == $tool) | .inputSchema.properties // {} | keys[]' /tmp/tools.json | while read param; do
echo "- $param" >> /tmp/tool_schemas.md
done
echo "" >> /tmp/tool_schemas.md
done < /tmp/tool_names.txt
echo "Tool schema reference created at /tmp/tool_schemas.md"

- Create TodoWrite Task List
- Add all 83 tools to task list before starting
- Mark each tool as "pending" initially
- Update status to "in_progress" when testing begins
- Mark "completed" only after manual verification
- Test Each Tool Individually
  - Execute ONLY via `npx reloaderoo@latest inspect call-tool "TOOL_NAME" --params 'JSON' -- node build/cli.js mcp`
  - Wait for complete response before proceeding to next tool
  - Read and verify each tool's output manually
  - Record key outputs (UUIDs, paths, schemes) for dependent tools
- Manual Verification Requirements
- ✅ Read each response - Manually verify tool output makes sense
- ✅ Check for errors - Identify any tool failures or unexpected responses
- ✅ Record UUIDs/paths - Save outputs needed for dependent tools
- ✅ Update task list - Mark each tool complete after verification
- ✅ Document issues - Record any problems found during testing
- FORBIDDEN SHORTCUTS:
- ❌ NO SCRIPTS - Scripts hide what's happening and prevent proper verification
- ❌ NO AUTOMATION - Every tool call must be manually executed and verified
- ❌ NO BATCHING - Cannot test multiple tools simultaneously
- ❌ NO MCP DIRECT CALLS - Only Reloaderoo inspect commands allowed
# Test server connectivity
npx reloaderoo@latest inspect ping -- node build/cli.js mcp
# Get server information
npx reloaderoo@latest inspect server-info -- node build/cli.js mcp
# Verify tool count manually
npx reloaderoo@latest inspect list-tools -- node build/cli.js mcp 2>/dev/null | jq '.tools | length'
# Verify resource count manually
npx reloaderoo@latest inspect list-resources -- node build/cli.js mcp 2>/dev/null | jq '.resources | length'

# Test each resource systematically
while IFS= read -r resource_uri; do
echo "Testing resource: $resource_uri"
npx reloaderoo@latest inspect read-resource "$resource_uri" -- node build/cli.js mcp 2>/dev/null
echo "---"
done < /tmp/resource_uris.txt

echo "=== FOUNDATION TOOL TESTING & DATA COLLECTION ==="
# 1. Test doctor (no dependencies)
echo "Testing doctor..."
npx reloaderoo@latest inspect call-tool "doctor" --params '{}' -- node build/cli.js mcp 2>/dev/null
# 2. Collect device data
echo "Collecting device UUIDs..."
npx reloaderoo@latest inspect call-tool "list_devices" --params '{}' -- node build/cli.js mcp 2>/dev/null > /tmp/devices_output.json
DEVICE_UUIDS=$(jq -r '.content[0].text' /tmp/devices_output.json | grep -E "UDID: [A-F0-9-]+" | sed 's/.*UDID: //' | head -2)
echo "Device UUIDs captured: $DEVICE_UUIDS"
# 3. Collect simulator data
echo "Collecting simulator UUIDs..."
npx reloaderoo@latest inspect call-tool "list_sims" --params '{}' -- node build/cli.js mcp 2>/dev/null > /tmp/sims_output.json
SIMULATOR_UUIDS=$(jq -r '.content[0].text' /tmp/sims_output.json | grep -E "\([A-F0-9-]+\)" | sed 's/.*(\([A-F0-9-]*\)).*/\1/' | head -3)
echo "Simulator UUIDs captured: $SIMULATOR_UUIDS"
# 4. Collect project data
echo "Collecting project paths..."
npx reloaderoo@latest inspect call-tool "discover_projs" --params '{"workspaceRoot": "/Volumes/Developer/XcodeBuildMCP"}' -- node build/cli.js mcp 2>/dev/null > /tmp/projects_output.json
PROJECT_PATHS=$(jq -r '.content[1].text' /tmp/projects_output.json | grep -E "\.xcodeproj$" | sed 's/.*- //' | head -3)
WORKSPACE_PATHS=$(jq -r '.content[2].text' /tmp/projects_output.json | grep -E "\.xcworkspace$" | sed 's/.*- //' | head -2)
echo "Project paths captured: $PROJECT_PATHS"
echo "Workspace paths captured: $WORKSPACE_PATHS"
# Save key data for dependent tools
echo "$DEVICE_UUIDS" > /tmp/device_uuids.txt
echo "$SIMULATOR_UUIDS" > /tmp/simulator_uuids.txt
echo "$PROJECT_PATHS" > /tmp/project_paths.txt
echo "$WORKSPACE_PATHS" > /tmp/workspace_paths.txt

echo "=== DISCOVERY TOOL TESTING & METADATA COLLECTION ==="
# Collect schemes for each project
while IFS= read -r project_path; do
if [ -n "$project_path" ]; then
echo "Getting schemes for: $project_path"
npx reloaderoo@latest inspect call-tool "list_schems_proj" --params "{\"projectPath\": \"$project_path\"}" -- node build/cli.js mcp 2>/dev/null > /tmp/schemes_$$.json
SCHEMES=$(jq -r '.content[1].text' /tmp/schemes_$$.json 2>/dev/null || echo "NoScheme")
echo "$project_path|$SCHEMES" >> /tmp/project_schemes.txt
echo "Schemes captured for $project_path: $SCHEMES"
fi
done < /tmp/project_paths.txt
# Collect schemes for each workspace
while IFS= read -r workspace_path; do
if [ -n "$workspace_path" ]; then
echo "Getting schemes for: $workspace_path"
npx reloaderoo@latest inspect call-tool "list_schemes" --params "{\"workspacePath\": \"$workspace_path\"}" -- node build/cli.js mcp 2>/dev/null > /tmp/ws_schemes_$$.json
SCHEMES=$(jq -r '.content[1].text' /tmp/ws_schemes_$$.json 2>/dev/null || echo "NoScheme")
echo "$workspace_path|$SCHEMES" >> /tmp/workspace_schemes.txt
echo "Schemes captured for $workspace_path: $SCHEMES"
fi
done < /tmp/workspace_paths.txt

- Create task list with TodoWrite tool for all 83 tools
- Test each tool individually with proper parameters
- Mark each tool complete in task list after manual verification
- Record results and observations for each tool
- NO SCRIPTS - Each command executed manually
```bash
# STEP 1: Test foundation tools (no parameters required)
# Execute each command individually, wait for response, verify manually
npx reloaderoo@latest inspect call-tool "doctor" --params '{}' -- node build/cli.js mcp
# [Wait for response, read output, mark tool complete in task list]

npx reloaderoo@latest inspect call-tool "list_devices" --params '{}' -- node build/cli.js mcp
# [Record device UUIDs from response for dependent tools]

npx reloaderoo@latest inspect call-tool "list_sims" --params '{}' -- node build/cli.js mcp
# [Record simulator UUIDs from response for dependent tools]

# STEP 2: Test project discovery (use discovered project paths)
npx reloaderoo@latest inspect call-tool "list_schems_proj" --params '{"projectPath": "/actual/path/from/discover_projs.xcodeproj"}' -- node build/cli.js mcp
# [Record scheme names from response for build tools]

# STEP 3: Test workspace tools (use discovered workspace paths)
npx reloaderoo@latest inspect call-tool "list_schemes" --params '{"workspacePath": "/actual/path/from/discover_projs.xcworkspace"}' -- node build/cli.js mcp
# [Record scheme names from response for build tools]

# STEP 4: Test simulator tools (use captured simulator UUIDs from step 1)
npx reloaderoo@latest inspect call-tool "boot_sim" --params '{"simulatorUuid": "ACTUAL_UUID_FROM_LIST_SIMS"}' -- node build/cli.js mcp
# [Verify simulator boots successfully]

# STEP 5: Test build tools (requires project + scheme + simulator from previous steps)
npx reloaderoo@latest inspect call-tool "build_sim_id_proj" --params '{"projectPath": "/actual/project.xcodeproj", "scheme": "ActualSchemeName", "simulatorId": "ACTUAL_SIMULATOR_UUID"}' -- node build/cli.js mcp
# [Verify build succeeds and record app bundle path]
```

- Executed individually - One command at a time, manually typed or pasted
- Verified manually - Read the complete response before continuing
- Tracked in task list - Mark tool complete only after verification
- Use real data - Replace placeholder values with actual captured data
- Wait for completion - Allow each command to finish before proceeding
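The "use real data" rule means each captured response must be turned into a concrete value before the next command runs. A minimal sketch of pulling a simulator UDID out of a saved `list_sims` response — the exact output format here is an assumption; the only thing relied on is that UDIDs are standard 8-4-4-4-12 hex strings:

```shell
# Extract the first UDID-looking token from a captured list_sims response
# so it can replace the ACTUAL_UUID_FROM_LIST_SIMS placeholder.
cat > /tmp/sims_response.txt << 'EOF'
iPhone 15 (AAAAAAAA-BBBB-CCCC-DDDD-EEEE00001111) (Shutdown)
EOF
SIM_UUID=$(grep -Eo '[0-9A-Fa-f]{8}(-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}' /tmp/sims_response.txt | head -n 1)
echo "$SIM_UUID"
```

This is a convenience for substituting real values into the next manual command, not a loop over tools — each `call-tool` invocation is still executed and verified one at a time.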
- Direct MCP Tool Usage Violation:

  ```typescript
  // ❌ IMMEDIATE TERMINATION - Using MCP tools directly
  await mcp__XcodeBuildMCP__list_devices();
  const result = await list_sims();
  ```
- Script-Based Testing Violation:

  ```bash
  # ❌ IMMEDIATE TERMINATION - Using scripts to test tools
  for tool in $(cat tool_list.txt); do
    npx reloaderoo inspect call-tool "$tool" --params '{}' -- node build/cli.js mcp
  done
  ```
- Batching/Automation Violation:

  ```bash
  # ❌ IMMEDIATE TERMINATION - Testing multiple tools simultaneously
  npx reloaderoo inspect call-tool "list_devices" &
  npx reloaderoo inspect call-tool "list_sims" &
  ```
- Source Code Examination Violation:

  ```typescript
  // ❌ IMMEDIATE TERMINATION - Reading implementation during testing
  const toolImplementation = await Read('/src/mcp/tools/device-shared/list_devices.ts');
  ```
- First Violation: Immediate correction and restart of testing process
- Documentation Update: Add explicit prohibition to prevent future violations
- Method Validation: Ensure all future testing uses only Reloaderoo inspect commands
- Progress Reset: Restart testing from foundation tools if direct MCP usage detected
```bash
# ✅ CORRECT - Step-by-step manual execution via Reloaderoo

# Tool 1: Test doctor
npx reloaderoo@latest inspect call-tool "doctor" --params '{}' -- node build/cli.js mcp
# [Read response, verify, mark complete in TodoWrite]

# Tool 2: Test list_devices
npx reloaderoo@latest inspect call-tool "list_devices" --params '{}' -- node build/cli.js mcp
# [Read response, capture UUIDs, mark complete in TodoWrite]

# Tool 3: Test list_sims
npx reloaderoo@latest inspect call-tool "list_sims" --params '{}' -- node build/cli.js mcp
# [Read response, capture UUIDs, mark complete in TodoWrite]

# Tool X: Test stateful tool (expected to fail)
npx reloaderoo@latest inspect call-tool "swift_package_stop" --params '{"pid": 12345}' -- node build/cli.js mcp
# [Tool fails as expected - no in-memory state available]
# [Mark as "false negative - stateful tool limitation" in TodoWrite]
# [Continue to next tool without investigation]

# Continue individually for all 83 tools...
```

```bash
# ✅ CORRECT Response to Expected Stateful Tool Failure
# Tool fails with "No process found" or similar state-related error
# Response: Mark tool as "tested - false negative (stateful)" in task list
# Do NOT attempt to diagnose, fix, or investigate the failure
# Continue immediately to next tool in sequence
```

```bash
# Test error handling systematically
echo "=== Error Testing ==="

# Test with invalid JSON parameters
echo "Testing invalid parameter types..."
npx reloaderoo@latest inspect call-tool list_schems_proj --params '{"projectPath": 123}' -- node build/cli.js mcp 2>/dev/null

# Test with non-existent paths
echo "Testing non-existent paths..."
npx reloaderoo@latest inspect call-tool list_schems_proj --params '{"projectPath": "/nonexistent/path.xcodeproj"}' -- node build/cli.js mcp 2>/dev/null

# Test with invalid UUIDs
echo "Testing invalid UUIDs..."
npx reloaderoo@latest inspect call-tool boot_sim --params '{"simulatorUuid": "invalid-uuid"}' -- node build/cli.js mcp 2>/dev/null
```

```bash
# Create comprehensive testing session report
cat > TESTING_SESSION_$(date +%Y-%m-%d).md << EOF
# Manual Testing Session - $(date +%Y-%m-%d)

## Environment
- macOS Version: $(sw_vers -productVersion)
- XcodeBuildMCP Version: $(jq -r '.version' package.json 2>/dev/null || echo "unknown")
- Testing Method: Reloaderoo @latest via npx

## Official Counts (Programmatically Verified)
- Total Tools: $TOOL_COUNT
- Total Resources: $RESOURCE_COUNT

## Test Results
[Document test results here]

## Issues Found
[Document any discrepancies or failures]

## Performance Notes
[Document response times and performance observations]
EOF

echo "Testing session template created: TESTING_SESSION_$(date +%Y-%m-%d).md"
```

```bash
# Essential testing commands
npx reloaderoo@latest inspect ping -- node build/cli.js mcp
npx reloaderoo@latest inspect server-info -- node build/cli.js mcp
npx reloaderoo@latest inspect list-tools -- node build/cli.js mcp | jq '.tools | length'
npx reloaderoo@latest inspect list-resources -- node build/cli.js mcp | jq '.resources | length'
npx reloaderoo@latest inspect call-tool TOOL_NAME --params '{}' -- node build/cli.js mcp
npx reloaderoo@latest inspect read-resource "xcodebuildmcp://RESOURCE" -- node build/cli.js mcp

# Schema extraction
jq --arg tool "TOOL_NAME" '.tools[] | select(.name == $tool) | .inputSchema' /tmp/tools.json
jq --arg tool "TOOL_NAME" '.tools[] | select(.name == $tool) | .description' /tmp/tools.json
```

This systematic approach ensures comprehensive, accurate testing using programmatic discovery and validation of all XcodeBuildMCP functionality.
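The schema-extraction filters can also be rehearsed without a running server by writing a small dump by hand. A minimal sketch — the dump below is fabricated for illustration, mirroring only the `.tools[]` structure the filters assume:

```shell
# Rehearse the jq filters against a canned list-tools dump.
cat > /tmp/tools.json << 'EOF'
{"tools":[
  {"name":"doctor","description":"Health check","inputSchema":{"type":"object"}},
  {"name":"list_sims","description":"List simulators","inputSchema":{"type":"object"}}
]}
EOF
jq '.tools | length' /tmp/tools.json
jq -r --arg tool "doctor" '.tools[] | select(.name == $tool) | .description' /tmp/tools.json
```

In a real session the dump comes from `inspect list-tools` redirected to `/tmp/tools.json`; only the filter expressions carry over.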
**Symptoms:** Test fails with an error about the real system executor being used.
**Cause:** Handler not receiving the mock executor parameter.
**Fix:** Ensure the test passes createMockExecutor() to the handler:
```typescript
// ❌ WRONG
const result = await tool.handler(params);

// ✅ CORRECT
const mockExecutor = createMockExecutor({ success: true });
const result = await tool.handler(params, mockExecutor);
```

**Symptoms:** Test fails when trying to access the file system.
**Cause:** Handler not receiving the mock file system executor.
**Fix:** Pass createMockFileSystemExecutor():
```typescript
const mockCmd = createMockExecutor({ success: true });
const mockFS = createMockFileSystemExecutor({ readFile: async () => 'content' });
const result = await tool.handler(params, mockCmd, mockFS);
```

**Symptoms:** TypeScript errors about handler parameters.
**Cause:** Handler doesn't support dependency injection.
**Fix:** Update the handler signature:
```typescript
async handler(args: Record<string, unknown>): Promise<ToolResponse> {
  return tool_nameLogic(args, getDefaultCommandExecutor(), getDefaultFileSystemExecutor());
}
```

```bash
# Run specific test file
npm test -- src/plugins/simulator-workspace/__tests__/tool_name.test.ts

# Run with verbose output
npm test -- --reporter=verbose

# Check for banned patterns
node scripts/check-code-patterns.js

# Verify dependency injection compliance
node scripts/audit-dependency-container.js

# Coverage for specific directory
npm run test:coverage -- src/plugins/simulator-workspace/
```

```bash
# Check for architectural pattern violations
node scripts/check-code-patterns.js

# Check dependency injection compliance
node scripts/audit-dependency-container.js

# Both scripts must pass before committing
```

- Dependency injection: Always use createMockExecutor() and createMockFileSystemExecutor()
- External boundaries via DI: mock command execution/filesystem with injected executors
- Three dimensions: Test input validation, command execution, and output processing
- Literal expectations: Use exact strings in assertions to catch regressions
- Performance: Ensure fast execution through proper mocking
- Coverage: Aim for 95%+ with focus on error paths
- Consistency: Follow standard patterns across all plugin tests
- Test safety: Default executors prevent accidental real system calls
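How an injected executor stays both safe and configurable is worth sketching. The following is a minimal, hypothetical factory in the spirit of createMockExecutor — the real helper's signature may differ, and the CommandExecutor type and default values here are assumptions for illustration:

```typescript
// Hypothetical sketch: merge caller overrides over safe defaults so a test
// only states the behavior it cares about, and nothing real is ever spawned.
type CommandResult = { success: boolean; output: string; error?: string };
type CommandExecutor = (cmd: string[]) => Promise<CommandResult>;

function makeMockExecutor(overrides: Partial<CommandResult> = {}): CommandExecutor {
  const result: CommandResult = { success: true, output: '', ...overrides };
  // Every invocation resolves with the canned result, regardless of the command.
  return async () => result;
}

// Usage: a tool handler receives the executor instead of calling the system.
async function demoHandler(exec: CommandExecutor): Promise<string> {
  const res = await exec(['xcodebuild', '-list']);
  return res.success ? res.output : `failed: ${res.error ?? 'unknown'}`;
}
```

Because the factory fills in defaults, a test asserting only the failure path can pass `{ success: false, error: 'boom' }` and nothing else, which keeps test doubles small and the assertions literal.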
This testing strategy ensures robust, maintainable tests that provide confidence in plugin functionality while remaining resilient to implementation changes and keeping external boundaries deterministic.