Commit 33dcba5

Update unit tests to remove supervisorctl references
- Remove ProcessMonitor tests as the class has been removed
- Update test expectations to not check for supervisorctl config sections
- Fix mock setup for process returncode
- All 77 unit tests passing
1 parent 7b89a6e commit 33dcba5

8 files changed: +58 -165 lines changed

PR_DESCRIPTION.md

Lines changed: 7 additions & 7 deletions
@@ -54,7 +54,7 @@ ENTRYPOINT ["standard-supervisor", "./sagemaker-entrypoint.sh"]
 standard-supervisor vllm serve model --host 0.0.0.0 --port 8080
 
 # With custom configuration
-PROCESS_MAX_START_RETRIES=5 SUPERVISOR_PROGRAM__LLM_ENGINE_STARTSECS=30 \
+PROCESS_MAX_START_RETRIES=5 SUPERVISOR_PROGRAM__APP_STARTSECS=30 \
 standard-supervisor python -m tensorrt_llm.hlapi.llm_api
 ```
 
@@ -67,8 +67,8 @@ RUN pip install model-hosting-container-standards
 
 # Configure your ML framework with supervisor settings
 ENV PROCESS_MAX_START_RETRIES=3
-ENV SUPERVISOR_PROGRAM__LLM_ENGINE_STARTSECS=30
-ENV SUPERVISOR_PROGRAM__LLM_ENGINE_STOPWAITSECS=60
+ENV SUPERVISOR_PROGRAM__APP_STARTSECS=30
+ENV SUPERVISOR_PROGRAM__APP_STOPWAITSECS=60
 ENV LOG_LEVEL=info
 
 # Use supervisor for process management
@@ -84,10 +84,10 @@ CMD ["vllm", "serve", "model", "--host", "0.0.0.0", "--port", "8080"]
 - `LOG_LEVEL=info` - Logging level (debug, info, warn, error, critical)
 
 **Advanced Supervisor Settings:**
-- `SUPERVISOR_PROGRAM__LLM_ENGINE_STARTSECS=30` - Time process must run to be considered "started"
-- `SUPERVISOR_PROGRAM__LLM_ENGINE_STOPWAITSECS=60` - Time to wait for graceful shutdown
-- `SUPERVISOR_PROGRAM__LLM_ENGINE_AUTORESTART=true` - Enable automatic restart on failure
-- `SUPERVISOR_PROGRAM__LLM_ENGINE_STARTRETRIES=3` - Startup retry attempts
+- `SUPERVISOR_PROGRAM__APP_STARTSECS=30` - Time process must run to be considered "started"
+- `SUPERVISOR_PROGRAM__APP_STOPWAITSECS=60` - Time to wait for graceful shutdown
+- `SUPERVISOR_PROGRAM__APP_AUTORESTART=true` - Enable automatic restart on failure
+- `SUPERVISOR_PROGRAM__APP_STARTRETRIES=3` - Startup retry attempts
 - `SUPERVISOR_CONFIG_PATH=/tmp/supervisord.conf` - Custom config file location
 
 **Custom Sections:**

python/model_hosting_container_standards/supervisor/README.md

Lines changed: 17 additions & 17 deletions
@@ -63,17 +63,17 @@ export LOG_LEVEL=info # Log level (default: info,
 Use the pattern `SUPERVISOR_{SECTION}_{KEY}=VALUE` for advanced supervisord customization:
 
 **Important**:
-- The default program name is `llm_engine`
+- The default program name is `app`
 - To target specific programs, use double underscores `__` to represent colons in section names
-- Program names in environment variables use the same format (e.g., `LLM_ENGINE` for `llm_engine`)
+- Program names in environment variables use the same format (e.g., `APP` for `app`)
 
 ```bash
-# Program section overrides (for default program "llm_engine")
-export SUPERVISOR_PROGRAM__LLM_ENGINE_STARTSECS=10 # Seconds to wait before considering started (default: 1)
-export SUPERVISOR_PROGRAM__LLM_ENGINE_STOPWAITSECS=30 # Seconds to wait for graceful shutdown (default: 10)
-export SUPERVISOR_PROGRAM__LLM_ENGINE_AUTORESTART=unexpected # Advanced restart control (true/false/unexpected)
+# Program section overrides (for default program "app")
+export SUPERVISOR_PROGRAM__APP_STARTSECS=10 # Seconds to wait before considering started (default: 1)
+export SUPERVISOR_PROGRAM__APP_STOPWAITSECS=30 # Seconds to wait for graceful shutdown (default: 10)
+export SUPERVISOR_PROGRAM__APP_AUTORESTART=unexpected # Advanced restart control (true/false/unexpected)
 
-# For program-specific overrides, use the program name (default: "llm_engine")
+# For program-specific overrides, use the program name (default: "app")
 # Or use application-level variables like PROCESS_MAX_START_RETRIES for simpler configuration
 
 # Supervisord daemon configuration
@@ -89,17 +89,17 @@ export SUPERVISOR_UNIX_HTTP_SERVER_FILE=/tmp/supervisor.sock # Socket file loca
 ```bash
 # High availability setup with more retries (recommended approach)
 export PROCESS_MAX_START_RETRIES=10
-export SUPERVISOR_PROGRAM__LLM_ENGINE_STARTSECS=30
-export SUPERVISOR_PROGRAM__LLM_ENGINE_STARTRETRIES=10
+export SUPERVISOR_PROGRAM__APP_STARTSECS=30
+export SUPERVISOR_PROGRAM__APP_STARTRETRIES=10
 
 # Debug mode with verbose logging
 export LOG_LEVEL=debug
 export SUPERVISOR_SUPERVISORD_LOGLEVEL=debug
 
 # Quick restart for development
-export SUPERVISOR_PROGRAM__LLM_ENGINE_STARTSECS=1
-export SUPERVISOR_PROGRAM__LLM_ENGINE_STOPWAITSECS=5
-export SUPERVISOR_PROGRAM__LLM_ENGINE_STARTRETRIES=1
+export SUPERVISOR_PROGRAM__APP_STARTSECS=1
+export SUPERVISOR_PROGRAM__APP_STOPWAITSECS=5
+export SUPERVISOR_PROGRAM__APP_STARTRETRIES=1
 
 # Disable auto-recovery for debugging
 export PROCESS_AUTO_RECOVERY=false
@@ -129,8 +129,8 @@ docker run \
 
 # Advanced: Direct supervisord configuration override
 docker run \
--e SUPERVISOR_PROGRAM__LLM_ENGINE_STARTSECS=30 \
--e SUPERVISOR_PROGRAM__LLM_ENGINE_STARTRETRIES=5 \
+-e SUPERVISOR_PROGRAM__APP_STARTSECS=30 \
+-e SUPERVISOR_PROGRAM__APP_STARTRETRIES=5 \
 -e SUPERVISOR_SUPERVISORD_LOGLEVEL=debug \
 my-image
 ```
@@ -169,8 +169,8 @@ RUN pip install model-hosting-container-standards
 # Configure supervisor behavior (recommended approach)
 ENV PROCESS_MAX_START_RETRIES=5
 ENV LOG_LEVEL=debug
-ENV SUPERVISOR_PROGRAM__LLM_ENGINE_STARTSECS=30
-ENV SUPERVISOR_PROGRAM__LLM_ENGINE_STARTRETRIES=5
+ENV SUPERVISOR_PROGRAM__APP_STARTSECS=30
+ENV SUPERVISOR_PROGRAM__APP_STARTRETRIES=5
 
 # Use standard-supervisor with custom configuration
 CMD ["standard-supervisor", "vllm", "serve", "model", "--host", "0.0.0.0", "--port", "8080"]
@@ -251,7 +251,7 @@ export PROCESS_MAX_START_RETRIES=1
 ```bash
 # Fix: Use recommended application-level variables first
 # Recommended: PROCESS_MAX_START_RETRIES=5
-# Advanced (specific program): SUPERVISOR_PROGRAM__LLM_ENGINE_STARTRETRIES=5
+# Advanced (specific program): SUPERVISOR_PROGRAM__APP_STARTRETRIES=5
 ```
 
 ## Framework-Specific Examples
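
Reviewer note on the README hunks above: the `SUPERVISOR_{SECTION}_{KEY}=VALUE` pattern, with `__` standing in for the colon in `program:app`, can be illustrated with a small sketch. This is not the package's parser, which the diff does not show. The names `PREFIX`, `RESERVED`, `parse_override`, and `collect_overrides` are hypothetical, and the sketch assumes the key is the last underscore-separated token, so it would misread supervisord keys that themselves contain underscores.

```python
from typing import Dict, Mapping, Tuple

# Hypothetical constants, assumed for illustration only.
PREFIX = "SUPERVISOR_"
RESERVED = {"SUPERVISOR_CONFIG_PATH"}  # treated as a plain setting, not a section override


def parse_override(name: str) -> Tuple[str, str]:
    """Split SUPERVISOR_{SECTION}_{KEY} into a lowercase (section, key) pair."""
    body = name[len(PREFIX):]
    # Assumption: the key is the final token after the last underscore.
    section, _, key = body.rpartition("_")
    if not section or not key:
        raise ValueError(f"not a section override: {name}")
    # A double underscore in the section stands for the colon in "program:app".
    return section.replace("__", ":").lower(), key.lower()


def collect_overrides(environ: Mapping[str, str]) -> Dict[str, Dict[str, str]]:
    """Group matching environment variables by supervisord section."""
    overrides: Dict[str, Dict[str, str]] = {}
    for name, value in environ.items():
        if not name.startswith(PREFIX) or name in RESERVED:
            continue
        try:
            section, key = parse_override(name)
        except ValueError:
            continue  # skip names that do not fit the pattern
        overrides.setdefault(section, {})[key] = value
    return overrides
```

Under these assumptions, `SUPERVISOR_PROGRAM__APP_STARTSECS=30` lands in section `program:app` under key `startsecs`, and `SUPERVISOR_SUPERVISORD_LOGLEVEL=debug` lands in `supervisord` under `loglevel`.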

python/model_hosting_container_standards/supervisor/generator.py

Lines changed: 5 additions & 5 deletions
@@ -23,9 +23,9 @@
 # - startretries=N: Maximum restart attempts before entering FATAL state
 #
 # FATAL state examples (supervisorctl status output):
-# llm_engine FATAL Exited too quickly (process log may have details)
-# llm_engine FATAL can't find command '/path/to/missing/binary'
-# llm_engine FATAL spawn error
+# app FATAL Exited too quickly (process log may have details)
+# app FATAL can't find command '/path/to/missing/binary'
+# app FATAL spawn error
 #
 # When a program enters FATAL state (too many restart failures), the entrypoint script
 # will detect this and exit with code 1 to signal container failure.
@@ -72,7 +72,7 @@ def get_base_config_template(
 def generate_supervisord_config(
     config: SupervisorConfig,
     launch_command: str,
-    program_name: str = "llm_engine",
+    program_name: str = "app",
 ) -> str:
     """Generate supervisord configuration content with validation and logging.
 
@@ -134,7 +134,7 @@ def write_supervisord_config(
     config_path: str,
     config: SupervisorConfig,
     launch_command: str,
-    program_name: str = "llm_engine",
+    program_name: str = "app",
 ) -> None:
     """Write supervisord configuration to file with comprehensive error handling.
 
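
Reviewer note on the FATAL-state comment in `generator.py`: the diff does not show how the entrypoint detects FATAL, so the following is only a sketch of one way to do it from status-style text shaped like the comment's examples. The functions `fatal_programs` and `exit_code_for` are hypothetical names, and the sketch assumes each line starts with the program name followed by its state.

```python
from typing import List


def fatal_programs(status_output: str) -> List[str]:
    """Return the names of programs whose second column is FATAL."""
    fatal: List[str] = []
    for line in status_output.splitlines():
        # Assumed layout: "<name> <STATE> <free-form description>"
        parts = line.split(None, 2)
        if len(parts) >= 2 and parts[1] == "FATAL":
            fatal.append(parts[0])
    return fatal


def exit_code_for(status_output: str) -> int:
    """Map status text to a container exit code: 1 if any program is FATAL, else 0."""
    return 1 if fatal_programs(status_output) else 0
```

With the renamed default program, a line such as `app FATAL spawn error` would make `exit_code_for` return 1 under these assumptions.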

python/model_hosting_container_standards/supervisor/scripts/generate_supervisor_config.py

Lines changed: 1 addition & 3 deletions
@@ -27,9 +27,7 @@ def main() -> int:
         "-o", "--output", required=True, help="Output path for config file"
     )
 
-    parser.add_argument(
-        "-p", "--program-name", default="llm_engine", help="Program name"
-    )
+    parser.add_argument("-p", "--program-name", default="app", help="Program name")
     parser.add_argument(
         "--log-level",
         choices=["ERROR", "INFO", "DEBUG"],

python/model_hosting_container_standards/supervisor/scripts/standard_supervisor.py

Lines changed: 1 addition & 1 deletion
@@ -161,7 +161,7 @@ def run(self) -> int:
             return 1
 
         config_path = config.config_path
-        program_name = "llm_engine"
+        program_name = "app"
 
         try:
             # Generate and start supervisor

python/tests/integration/test_supervisor_cli_integration.py

Lines changed: 21 additions & 21 deletions
@@ -60,9 +60,9 @@ def test_basic_cli_execution_and_config_generation(self, clean_env):
         """Test basic CLI execution with configuration generation and validation."""
         env = {
             "PROCESS_MAX_START_RETRIES": "2",
-            "SUPERVISOR_PROGRAM__LLM_ENGINE_STARTSECS": "2",
-            "SUPERVISOR_PROGRAM__LLM_ENGINE_STOPWAITSECS": "5",
-            "SUPERVISOR_PROGRAM__LLM_ENGINE_AUTORESTART": "true",
+            "SUPERVISOR_PROGRAM__APP_STARTSECS": "2",
+            "SUPERVISOR_PROGRAM__APP_STOPWAITSECS": "5",
+            "SUPERVISOR_PROGRAM__APP_AUTORESTART": "true",
             "LOG_LEVEL": "info",
         }
 
@@ -100,10 +100,10 @@ def test_basic_cli_execution_and_config_generation(self, clean_env):
 
         # Check main sections exist
         assert "supervisord" in config.sections()
-        assert "program:llm_engine" in config.sections()
+        assert "program:app" in config.sections()
 
         # Verify program configuration
-        program_section = config["program:llm_engine"]
+        program_section = config["program:app"]
         assert "python" in program_section["command"]
         assert program_section["startsecs"] == "2"
         assert program_section["stopwaitsecs"] == "5"
@@ -126,10 +126,10 @@ def test_ml_framework_configuration(self, clean_env):
         """Test supervisor configuration for ML framework scenarios."""
         env = {
             "PROCESS_MAX_START_RETRIES": "3",
-            "SUPERVISOR_PROGRAM__LLM_ENGINE_STARTSECS": "30",  # ML models need longer startup
-            "SUPERVISOR_PROGRAM__LLM_ENGINE_STOPWAITSECS": "60",  # Graceful shutdown time
-            "SUPERVISOR_PROGRAM__LLM_ENGINE_STARTRETRIES": "3",
-            "SUPERVISOR_PROGRAM__LLM_ENGINE_AUTORESTART": "true",
+            "SUPERVISOR_PROGRAM__APP_STARTSECS": "30",  # ML models need longer startup
+            "SUPERVISOR_PROGRAM__APP_STOPWAITSECS": "60",  # Graceful shutdown time
+            "SUPERVISOR_PROGRAM__APP_STARTRETRIES": "3",
+            "SUPERVISOR_PROGRAM__APP_AUTORESTART": "true",
             "LOG_LEVEL": "info",
         }
 
@@ -164,7 +164,7 @@ def test_ml_framework_configuration(self, clean_env):
         ), f"Config file not found at {config_path}"
 
         config = parse_supervisor_config(config_path)
-        program_section = config["program:llm_engine"]
+        program_section = config["program:app"]
 
         # ML frameworks need longer startup and shutdown times
         assert program_section["startsecs"] == "30"
@@ -191,8 +191,8 @@ def test_signal_handling(self, clean_env):
         """Test that supervisor handles signals correctly."""
         env = {
             "PROCESS_MAX_START_RETRIES": "1",
-            "SUPERVISOR_PROGRAM__LLM_ENGINE_STARTSECS": "1",
-            "SUPERVISOR_PROGRAM__LLM_ENGINE_STOPWAITSECS": "5",
+            "SUPERVISOR_PROGRAM__APP_STARTSECS": "1",
+            "SUPERVISOR_PROGRAM__APP_STOPWAITSECS": "5",
             "LOG_LEVEL": "info",
         }
 
@@ -240,9 +240,9 @@ def test_signal_handling(self, clean_env):
     def test_continuous_restart_behavior(self, clean_env):
         """Test that supervisor continuously restarts processes when autorestart=true."""
         env = {
-            "SUPERVISOR_PROGRAM__LLM_ENGINE_STARTSECS": "2",
-            "SUPERVISOR_PROGRAM__LLM_ENGINE_AUTORESTART": "true",
-            "SUPERVISOR_PROGRAM__LLM_ENGINE_STARTRETRIES": "10",
+            "SUPERVISOR_PROGRAM__APP_STARTSECS": "2",
+            "SUPERVISOR_PROGRAM__APP_AUTORESTART": "true",
+            "SUPERVISOR_PROGRAM__APP_STARTRETRIES": "10",
             "LOG_LEVEL": "info",
         }
 
@@ -313,7 +313,7 @@ def test_continuous_restart_behavior(self, clean_env):
 
         # Verify config
         config = parse_supervisor_config(config_path)
-        program_section = config["program:llm_engine"]
+        program_section = config["program:app"]
         assert program_section["autorestart"] == "true"
 
         print(
@@ -332,9 +332,9 @@ def test_continuous_restart_behavior(self, clean_env):
     def test_startup_retry_limit(self, clean_env):
         """Test that supervisor respects startretries limit."""
         env = {
-            "SUPERVISOR_PROGRAM__LLM_ENGINE_STARTSECS": "5",  # Process must run 5 seconds to be "started"
-            "SUPERVISOR_PROGRAM__LLM_ENGINE_STARTRETRIES": "3",  # Only 3 startup attempts
-            "SUPERVISOR_PROGRAM__LLM_ENGINE_AUTORESTART": "true",
+            "SUPERVISOR_PROGRAM__APP_STARTSECS": "5",  # Process must run 5 seconds to be "started"
+            "SUPERVISOR_PROGRAM__APP_STARTRETRIES": "3",  # Only 3 startup attempts
+            "SUPERVISOR_PROGRAM__APP_AUTORESTART": "true",
             "LOG_LEVEL": "info",
         }
 
@@ -386,7 +386,7 @@ def test_startup_retry_limit(self, clean_env):
         # Verify config was generated
         assert os.path.exists(config_path), "Config file should exist"
         config = parse_supervisor_config(config_path)
-        program_section = config["program:llm_engine"]
+        program_section = config["program:app"]
         assert program_section["startretries"] == "3"
         assert program_section["startsecs"] == "5"
 
@@ -406,7 +406,7 @@ def test_startup_retry_limit(self, clean_env):
         ), f"Expected {expected_attempts} startup attempts, got {attempt_count}"
 
         # Check supervisord log for FATAL state
-        log_path = "/tmp/supervisord-llm_engine.log"
+        log_path = "/tmp/supervisord-app.log"
         if os.path.exists(log_path):
             with open(log_path, "r") as f:
                 log_content = f.read()
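
Reviewer note on the integration tests: they read the generated file through `parse_supervisor_config`, whose implementation is not part of this diff. A minimal stand-in, assuming the generated file is plain INI that the standard-library `configparser` can read, might look like the sketch below. `SAMPLE` and `parse_config_text` are hypothetical, and the sample values simply mirror the first test's environment.

```python
import configparser

# Hypothetical sample of a generated config; the real template is not shown in the diff.
SAMPLE = """\
[supervisord]
nodaemon=true

[program:app]
command=python -m http.server 8080
startsecs=2
stopwaitsecs=5
autorestart=true
"""


def parse_config_text(text: str) -> configparser.ConfigParser:
    """Parse supervisord-style INI text without '%' interpolation."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser


config = parse_config_text(SAMPLE)
program_section = config["program:app"]
```

Disabling interpolation is a deliberate choice in this sketch: supervisord configs can contain `%(...)s` expansions that `configparser` would otherwise try to resolve itself.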

python/tests/supervisor/test_generator.py

Lines changed: 0 additions & 3 deletions
@@ -39,10 +39,7 @@ def test_basic_template_structure(self):
 
         # Check all required sections exist
         expected_sections = [
-            "unix_http_server",
-            "supervisorctl",
             "supervisord",
-            "rpcinterface:supervisor",
             "program:test_program",
         ]
 
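
Reviewer note on the trimmed `expected_sections` list: after this change the test only expects `supervisord` and the program section. The sketch below shows that reduced shape with a hypothetical `render_minimal_config` helper. It is not the template returned by `get_base_config_template`, whose body the diff does not include, and the `nodaemon=true` line is an assumption.

```python
import configparser


def render_minimal_config(launch_command: str, program_name: str = "app") -> str:
    """Hypothetical template with only [supervisord] and [program:<name>] sections."""
    return (
        "[supervisord]\n"
        "nodaemon=true\n"
        "\n"
        f"[program:{program_name}]\n"
        f"command={launch_command}\n"
    )


parser = configparser.ConfigParser(interpolation=None)
parser.read_string(render_minimal_config("python -m http.server", "test_program"))
sections = parser.sections()
```

A check in the style of the updated unit test would then compare `sections` against the two remaining entries and confirm that no `supervisorctl`, `unix_http_server`, or `rpcinterface:supervisor` section is present.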
