Commit aa5059c

Enhance chaos experiment documentation and configuration for multiple experiments. Updated environment variables, added SDK authentication details, and improved clarity on chaos execution processes across various chaos experiments including container-kill, disk-fill, node-cpu-hog, and others. Updated image references to use the latest version and standardized configuration parameters.
1 parent 6f6628b commit aa5059c

13 files changed: 3532 additions, 713 deletions

experiments/container-kill/README.md

Lines changed: 286 additions & 42 deletions
@@ -23,30 +23,58 @@ jobs:
2323
uses: litmuschaos/[email protected]
2424
env:
2525
KUBE_CONFIG_DATA: ${{ secrets.KUBE_CONFIG_DATA }}
26-
##If litmus is not installed
27-
INSTALL_LITMUS: true
28-
##Give application info under chaos
26+
27+
# Litmus SDK Authentication
28+
LITMUS_ENDPOINT: "https://chaos-center.example.com"
29+
LITMUS_USERNAME: "admin"
30+
LITMUS_PASSWORD: ${{ secrets.LITMUS_PASSWORD }}
31+
LITMUS_PROJECT_ID: "your-project-id"
32+
33+
# Infrastructure Setup
34+
INSTALL_INFRA: "true"
35+
INFRA_NAME: "container-kill-infra"
36+
INFRA_NAMESPACE: "litmus"
37+
INFRA_SCOPE: "namespace"
38+
39+
# Application Info
2940
APP_NS: default
3041
APP_LABEL: run=nginx
3142
APP_KIND: deployment
43+
44+
# Experiment Configuration
3245
EXPERIMENT_NAME: container-kill
33-
##Custom images can also be used
34-
EXPERIMENT_IMAGE: litmuschaos/go-runner
35-
EXPERIMENT_IMAGE_TAG: latest
46+
EXPERIMENT_IMAGE: litmuschaos.docker.scarf.sh/litmuschaos/go-runner
47+
EXPERIMENT_IMAGE_TAG: 3.16.0
3648
IMAGE_PULL_POLICY: Always
49+
50+
# Container Kill Specific Configuration
3751
TARGET_CONTAINER: nginx
3852
TOTAL_CHAOS_DURATION: 20
3953
CHAOS_INTERVAL: 10
40-
CONTAINER_RUNTIME: docker
41-
##Select true if you want to uninstall litmus after chaos
54+
CONTAINER_RUNTIME: containerd
55+
SOCKET_PATH: /run/containerd/containerd.sock
56+
SIGNAL: SIGKILL
57+
SEQUENCE: parallel
58+
PODS_AFFECTED_PERC: 100
59+
DEFAULT_HEALTH_CHECK: false
60+
61+
# Optional Probe Setup
62+
LITMUS_CREATE_PROBE: "true"
63+
LITMUS_PROBE_NAME: "http-status-check"
64+
LITMUS_PROBE_TYPE: "httpProbe"
65+
LITMUS_PROBE_MODE: "Continuous"
66+
LITMUS_PROBE_URL: "http://nginx-svc:80/"
67+
LITMUS_PROBE_RESPONSE_CODE: "200"
68+
69+
# Cleanup
4270
LITMUS_CLEANUP: true
4371
```
4472
45-
## Environment Variabels
73+
## Environment Variables
4674
47-
The application pod for container-kill will be identified with the app info variables.
75+
The following environment variables are used to configure the container-kill experiment.
4876
49-
**Supported Chaos Action Tunables**
77+
### SDK Authentication Variables (Required)
5078
5179
<table>
5280
<tr>
@@ -56,81 +84,297 @@ The application pod for container-kill will be identified with the app info vari
5684
<th> Default Value </th>
5785
</tr>
5886
<tr>
59-
<td> EXPERIMENT_NAME </td>
60-
<td> For Running container kill experiment keep it container-kill</td>
87+
<td> LITMUS_ENDPOINT </td>
88+
<td> URL of the Litmus Chaos Center </td>
6189
<td> Mandatory </td>
6290
<td> No default value </td>
6391
</tr>
6492
<tr>
65-
<td> TARGET_CONTAINER </td>
66-
<td> The name of container to be killed inside the pod </td>
93+
<td> LITMUS_USERNAME </td>
94+
<td> Username for Litmus authentication </td>
95+
<td> Mandatory </td>
96+
<td> No default value </td>
97+
</tr>
98+
<tr>
99+
<td> LITMUS_PASSWORD </td>
100+
<td> Password for Litmus authentication </td>
101+
<td> Mandatory </td>
102+
<td> No default value </td>
103+
</tr>
104+
<tr>
105+
<td> LITMUS_PROJECT_ID </td>
106+
<td> Project ID in Litmus </td>
107+
<td> Mandatory </td>
108+
<td> No default value </td>
109+
</tr>
110+
</table>
111+
112+
### Infrastructure Setup Variables
113+
114+
<table>
115+
<tr>
116+
<th> Variables </th>
117+
<th> Description </th>
118+
<th> Specify In Chaos Action </th>
119+
<th> Default Value </th>
120+
</tr>
121+
<tr>
122+
<td> INSTALL_INFRA </td>
123+
<td> Whether to install infrastructure </td>
67124
<td> Optional </td>
68-
<td> Default value is nginx</td>
125+
<td> true </td>
69126
</tr>
70127
<tr>
71-
<td> CHAOS_INTERVAL </td>
72-
<td>Time interval b/w two successive container kill (in seconds) </td>
128+
<td> USE_EXISTING_INFRA </td>
129+
<td> Whether to use existing infrastructure </td>
73130
<td> Optional </td>
74-
<td> Default value is 10s </td>
131+
<td> false </td>
75132
</tr>
76133
<tr>
77-
<td> TOTAL_CHAOS_DURATION </td>
78-
<td> The time duration for chaos injection (seconds) </td>
134+
<td> EXISTING_INFRA_ID </td>
135+
<td> ID of existing infrastructure </td>
136+
<td> Required if USE_EXISTING_INFRA=true </td>
137+
<td> No default value </td>
138+
</tr>
139+
<tr>
140+
<td> INFRA_NAME </td>
141+
<td> Name for the infrastructure </td>
142+
<td> Optional </td>
143+
<td> ci-infra-container-kill </td>
144+
</tr>
145+
<tr>
146+
<td> INFRA_NAMESPACE </td>
147+
<td> Kubernetes namespace for infrastructure </td>
148+
<td> Optional </td>
149+
<td> litmus </td>
150+
</tr>
151+
<tr>
152+
<td> INFRA_SCOPE </td>
153+
<td> Scope of infrastructure </td>
154+
<td> Optional </td>
155+
<td> namespace </td>
156+
</tr>
157+
<tr>
158+
<td> INFRA_SERVICE_ACCOUNT </td>
159+
<td> Service account for infrastructure </td>
160+
<td> Optional </td>
161+
<td> litmus </td>
162+
</tr>
163+
</table>
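
A workflow that runs often may prefer to reuse one chaos infrastructure rather than install a new one on every run. The fragment below is a minimal sketch of the `env` block for that case, built only from the variables in the table above. The secret name `LITMUS_INFRA_ID` is hypothetical, and setting `INSTALL_INFRA` to `"false"` alongside `USE_EXISTING_INFRA` is an assumption, since the table does not say how the two flags interact.

```yaml
env:
  # Assumption: installation is switched off when an existing
  # infrastructure is reused; the table does not state this explicitly.
  INSTALL_INFRA: "false"
  USE_EXISTING_INFRA: "true"
  # Hypothetical secret holding the ID of the infrastructure to reuse.
  # EXISTING_INFRA_ID is required once USE_EXISTING_INFRA is true.
  EXISTING_INFRA_ID: ${{ secrets.LITMUS_INFRA_ID }}
```

With these values, `INFRA_NAME`, `INFRA_NAMESPACE`, and `INFRA_SCOPE` are presumably not needed, because they describe an infrastructure that is about to be created.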
164+
165+
### Probe Configuration Variables
166+
167+
<table>
168+
<tr>
169+
<th> Variables </th>
170+
<th> Description </th>
171+
<th> Specify In Chaos Action </th>
172+
<th> Default Value </th>
173+
</tr>
174+
<tr>
175+
<td> LITMUS_CREATE_PROBE </td>
176+
<td> Whether to create a probe </td>
177+
<td> Optional </td>
178+
<td> false </td>
179+
</tr>
180+
<tr>
181+
<td> LITMUS_PROBE_NAME </td>
182+
<td> Name of the probe </td>
183+
<td> Optional </td>
184+
<td> http-probe </td>
185+
</tr>
186+
<tr>
187+
<td> LITMUS_PROBE_TYPE </td>
188+
<td> Type of probe </td>
189+
<td> Optional </td>
190+
<td> httpProbe </td>
191+
</tr>
192+
<tr>
193+
<td> LITMUS_PROBE_MODE </td>
194+
<td> Mode of the probe (SOT, EOT, Continuous) </td>
195+
<td> Optional </td>
196+
<td> SOT </td>
197+
</tr>
198+
<tr>
199+
<td> LITMUS_PROBE_URL </td>
200+
<td> URL for HTTP probe </td>
201+
<td> Required for HTTP probe </td>
202+
<td> No default value </td>
203+
</tr>
204+
<tr>
205+
<td> LITMUS_PROBE_TIMEOUT </td>
206+
<td> Timeout for probe </td>
207+
<td> Optional </td>
208+
<td> 30s </td>
209+
</tr>
210+
<tr>
211+
<td> LITMUS_PROBE_INTERVAL </td>
212+
<td> Interval for probe </td>
213+
<td> Optional </td>
214+
<td> 10s </td>
215+
</tr>
216+
<tr>
217+
<td> LITMUS_PROBE_ATTEMPTS </td>
218+
<td> Number of attempts for probe </td>
219+
<td> Optional </td>
220+
<td> 1 </td>
221+
</tr>
222+
<tr>
223+
<td> LITMUS_PROBE_RESPONSE_CODE </td>
224+
<td> Expected HTTP response code </td>
79225
<td> Optional </td>
80-
<td> Default value is 20s </td>
226+
<td> 200 </td>
227+
</tr>
228+
</table>
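
The example at the top of this README leaves the probe timing at its defaults. As a sketch of how the remaining variables in the table above might be combined, the fragment below tightens the timeout and interval and allows a few retries. The probe name is hypothetical, and the duration format (`"5s"`, `"2s"`) is assumed from the way the defaults are written in the table (`30s`, `10s`).

```yaml
env:
  LITMUS_CREATE_PROBE: "true"
  LITMUS_PROBE_NAME: "nginx-availability"    # hypothetical name
  LITMUS_PROBE_TYPE: "httpProbe"
  # Continuous mode checks the endpoint throughout the chaos window.
  LITMUS_PROBE_MODE: "Continuous"
  LITMUS_PROBE_URL: "http://nginx-svc:80/"
  LITMUS_PROBE_RESPONSE_CODE: "200"
  # Assumed format: a number followed by "s", as in the documented defaults.
  LITMUS_PROBE_TIMEOUT: "5s"
  LITMUS_PROBE_INTERVAL: "2s"
  LITMUS_PROBE_ATTEMPTS: "3"
```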
229+
230+
### Application Info Variables
231+
232+
<table>
233+
<tr>
234+
<th> Variables </th>
235+
<th> Description </th>
236+
<th> Specify In Chaos Action </th>
237+
<th> Default Value </th>
81238
</tr>
82239
<tr>
83240
<td> APP_NS </td>
84241
<td> Provide namespace of application under chaos </td>
85242
<td> Optional </td>
86-
<td> Default value is default</td>
243+
<td> default </td>
87244
</tr>
88245
<tr>
89-
<td> APP_LABEL </td>
90-
<td> Provide application label of application under chaos. </td>
246+
<td> APP_LABEL </td>
247+
<td> Provide application label of application under chaos </td>
91248
<td> Optional </td>
92-
<td> Default value is run=nginx </td>
249+
<td> run=nginx </td>
93250
</tr>
94251
<tr>
95252
<td> APP_KIND </td>
96-
<td> Provide the kind of application </td>
97-
<td> Optional </td>
98-
<td> Default value is deployment </td>
253+
<td> Provide the kind of application </td>
254+
<td> Optional </td>
255+
<td> deployment </td>
99256
</tr>
257+
</table>
258+
259+
### Container Kill Experiment Variables
260+
261+
<table>
100262
<tr>
101-
<td> INSTALL_LITMUS </td>
102-
<td> Keep it true to install litmus if litmus is not already installed.</td>
263+
<th> Variables </th>
264+
<th> Description </th>
265+
<th> Specify In Chaos Action </th>
266+
<th> Default Value </th>
267+
</tr>
268+
<tr>
269+
<td> EXPERIMENT_NAME </td>
270+
<td> Name of the experiment to run; set it to container-kill for this experiment </td>
271+
<td> Mandatory </td>
272+
<td> No default value </td>
273+
</tr>
274+
<tr>
275+
<td> TARGET_CONTAINER </td>
276+
<td> The name of container to be killed inside the pod </td>
103277
<td> Optional </td>
104-
<td> Default value is not set to true </td>
278+
<td> nginx </td>
105279
</tr>
106-
<tr>
107-
<td> LITMUS_CLEANUP </td>
108-
<td> Keep it true to uninstall litmus after chaos </td>
280+
<tr>
281+
<td> CHAOS_INTERVAL </td>
282+
<td> Time interval between two successive container kills (in seconds) </td>
109283
<td> Optional </td>
110-
<td> Default value is not set to true </td>
284+
<td> 10 </td>
285+
</tr>
286+
<tr>
287+
<td> TOTAL_CHAOS_DURATION </td>
288+
<td> The time duration for chaos injection (seconds) </td>
289+
<td> Optional </td>
290+
<td> 20 </td>
111291
</tr>
112292
<tr>
113293
<td> CONTAINER_RUNTIME </td>
114294
<td> Give the target container runtime </td>
115295
<td> Optional </td>
116-
<td> Default value is <code>'docker'</code> </td>
117-
</tr>
296+
<td> containerd </td>
297+
</tr>
298+
<tr>
299+
<td> SOCKET_PATH </td>
300+
<td> Socket path for the container runtime </td>
301+
<td> Optional </td>
302+
<td> /run/containerd/containerd.sock </td>
303+
</tr>
118304
<tr>
119305
<td> EXPERIMENT_IMAGE </td>
120306
<td> We can provide custom image for running chaos experiment </td>
121307
<td> Optional </td>
122-
<td> Default value is litmuschaos/go-runner </td>
308+
<td> litmuschaos.docker.scarf.sh/litmuschaos/go-runner </td>
123309
</tr>
124310
<tr>
125311
<td> EXPERIMENT_IMAGE_TAG </td>
126312
<td> We can set the image tag while using custom image for the chaos experiment </td>
127313
<td> Optional </td>
128-
<td> Default value is latest </td>
314+
<td> 3.16.0 </td>
129315
</tr>
130316
<tr>
131-
<td>IMAGE_PULL_POLICY </td>
317+
<td> IMAGE_PULL_POLICY </td>
132318
<td> We can set the image pull policy while using custom image for running chaos experiment </td>
133319
<td> Optional </td>
134-
<td> Default value is Always </td>
320+
<td> Always </td>
321+
</tr>
322+
<tr>
323+
<td> SIGNAL </td>
324+
<td> Signal to be sent to the container </td>
325+
<td> Optional </td>
326+
<td> SIGKILL </td>
327+
</tr>
328+
<tr>
329+
<td> SEQUENCE </td>
330+
<td> Sequence of chaos execution </td>
331+
<td> Optional </td>
332+
<td> parallel </td>
333+
</tr>
334+
<tr>
335+
<td> DEFAULT_HEALTH_CHECK </td>
336+
<td> Enable/disable default health checks </td>
337+
<td> Optional </td>
338+
<td> false </td>
339+
</tr>
340+
<tr>
341+
<td> RAMP_TIME </td>
342+
<td> Time to wait before and after chaos injection (in seconds) </td>
343+
<td> Optional </td>
344+
<td> Not set </td>
345+
</tr>
346+
<tr>
347+
<td> PODS_AFFECTED_PERC </td>
348+
<td> Percentage of pods affected by chaos </td>
349+
<td> Optional </td>
350+
<td> 0 (corresponds to a single pod) </td>
351+
</tr>
352+
<tr>
353+
<td> TARGET_PODS </td>
354+
<td> Comma-separated list of specific pods to target </td>
355+
<td> Optional </td>
356+
<td> Not set </td>
357+
</tr>
358+
<tr>
359+
<td> INSTALL_LITMUS </td>
360+
<td> Set to true to install Litmus if it is not already installed </td>
361+
<td> Optional </td>
362+
<td> Not set to true </td>
363+
</tr>
364+
<tr>
365+
<td> LITMUS_CLEANUP </td>
366+
<td> Set to true to uninstall Litmus after chaos </td>
367+
<td> Optional </td>
368+
<td> Not set to true </td>
135369
</tr>
136370
</table>
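
As a sketch of how the targeting and timing variables in the table above fit together, the fragment below restricts chaos to two named pods, handles them one after another, and adds a ramp period. The pod names are hypothetical, and `serial` as a value for `SEQUENCE` is an assumption; the table only documents `parallel`.

```yaml
env:
  EXPERIMENT_NAME: container-kill
  TARGET_CONTAINER: nginx
  # Hypothetical pod names; replace with pods that exist in APP_NS.
  TARGET_PODS: "nginx-7d9c5b-abcde,nginx-7d9c5b-fghij"
  # Assumption: "serial" is accepted as the alternative to "parallel".
  SEQUENCE: serial
  # Kills are spaced 15 seconds apart within a 60-second chaos window.
  TOTAL_CHAOS_DURATION: 60
  CHAOS_INTERVAL: 15
  # Wait 10 seconds before and after chaos injection.
  RAMP_TIME: 10
```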
371+
372+
## Experiment Execution Process
373+
374+
The experiment now runs through the Litmus SDK in six steps:
375+
1. **Authentication & Setup**: Connect to Litmus Chaos Center using SDK credentials
376+
2. **Infrastructure Provisioning**: Create or use existing chaos infrastructure
377+
3. **Experiment Configuration**: Configure experiment parameters and probe settings
378+
4. **Execution & Monitoring**: Run the experiment with unique ID and monitor progress
379+
5. **Result Verification**: Verify results through detailed phase checking
380+
6. **Optional Cleanup**: Remove chaos infrastructure if LITMUS_CLEANUP is set to true
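
To relate the steps above to the configuration, the fragment below is a minimal sketch of an `env` block that sets only the variables the tables mark as Mandatory, plus cluster access and cleanup, and leaves everything else at its documented default. The secret name `LITMUS_USERNAME` is hypothetical, and the step mapping in the comments is an interpretation of the list above, not something the action documents.

```yaml
env:
  # Cluster access, carried over from the example earlier in this README.
  KUBE_CONFIG_DATA: ${{ secrets.KUBE_CONFIG_DATA }}

  # Step 1: authentication (all four variables are marked Mandatory).
  LITMUS_ENDPOINT: "https://chaos-center.example.com"
  LITMUS_USERNAME: ${{ secrets.LITMUS_USERNAME }}   # hypothetical secret name
  LITMUS_PASSWORD: ${{ secrets.LITMUS_PASSWORD }}
  LITMUS_PROJECT_ID: "your-project-id"

  # Step 2: no infrastructure variables are set, so the documented
  # defaults apply (INSTALL_INFRA: true, INFRA_NAMESPACE: litmus).

  # Steps 3 to 5: only the experiment name is mandatory; target, duration,
  # and probe settings fall back to their defaults.
  EXPERIMENT_NAME: container-kill

  # Step 6: cleanup after the run.
  LITMUS_CLEANUP: "true"
```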
