AppLogic 2.1/2.2 Documentation
The latest production release is AppLogic 2.9.9
SLA: Controller - Test Plan
Preparation
Host platform
The tests described here for the SLA appliance are designed to run on an AppLogic grid.
3rd-party tools
http_load is a simple load generator and is included in the test archive in binary form.
http_load is released under GPL and can be obtained from
http://www.acme.com/software/http_load/. The version used in the test is
http://www.acme.com/software/http_load/http_load-29jun2005.tar.gz.
http_load is used to generate load on WEBx4 in the test harness.
Tests Summary
Load Tests
- appliance start - verify appliances start under increasing load, according to policy, until all are started
- appliance stop - verify appliances stop under decreasing load, according to policy, until just one is left running
GUI Tests
- authentication test - verify the GUI functions properly with both an empty username and an explicitly set username and password
- lock file test - verify a lock file prevents GUI manual appliance start/stop while the SLA daemon is doing start/stop and vice versa.
- pause & play test - verify pause prevents SLA from enforcing the policy and play resumes enforcement
- stop & start test - verify appliances can be manually stopped and started
- graph test - verify the graph of the current counter average against that provided by MON
- browser test - verify GUI functions in IE6x, IE7x and FireFox 2x
Failure Tests
- grid access failure - verify that failure to access the grid controller causes user notification in the GUI
- MON access failure - verify that failure to access the MON appliance causes user notification in the GUI
- daemon failure - verify the SLA daemon re-starts on failure (the daemon performs counter collection and policy enforcement)
- notification upon failure - verify a dashboard message is generated if:
- all appliances are started and the trailing average passes the start value
- the daemon fails to start or re-start
- an appliance fails to start or stop
- counters cannot be collected from at least one appliance for more than one minute
- on-start failure - verify SLA fails to start if:
- the property grid_ctl_ip is incorrect
- the grid.private.key file does not exist
- the grid.private.key file does not authenticate on the grid controller
- the appliance_group property does not resolve to two or more appliances within the application
Example Test
- example test - verify the steps in the example are sufficient and work.
Terminal Tests
- aux test - verify aux terminal passes through traffic not directed to the GUI
- log test - verify logs are written to the logs_base_dir of the cifs share if the log terminal is connected, or to the config volume if not connected
- mon test - verify SLA counters are monitored by MON
Running the Tests
Copy and uncompress the test application archive file sla-tst-app.tar.bz2. Import the test application. Update the grid.private.key file with a usable key.
This is a diagram of the test harness application:
Design
Structure
The test application comprises the following:
- an input gateway INSSL configured to forward tcp traffic on port 8080 through its aux terminal, providing access to the SLA GUI
- an input gateway IN used to test that traffic not directed to the GUI is passed through the SLA aux terminal
- an SLA appliance
- a LUX appliance LOAD with the http_load load generator and test script installed (used to put WEBx4 under load to test dynamic start/stop)
- a scaling web server WEBx4 whose individual web servers comprise the appliance_group
- a web server WEB used to test that traffic not directed to the SLA GUI is passed through the SLA aux terminal
- a network storage appliance NAS which contains content for all of the web servers. index.php on the root of this share contains <?php phpinfo(); ?>, which displays a web page showing the PHP configuration of the web server. This is the document which is fetched during the load tests.
- a monitor MON which exposes its data collection interface on its aux terminal
- a net gateway NET which allows SLA to contact the grid controller
Test Details
Load Tests
- appliance start - verify appliances start under increasing load, according to policy, until all are started
- Open the standard monitoring view for the application, the SLA GUI, and an ssh session to the LOAD appliance. Execute the script test.sh on the LOAD appliance; it is located in /usr/load/. Verify appliances start under increasing load, according to policy, until all are started. The test continues until the start value is passed with all appliances running.
- Change the policy several times, using different counters, start & stop values, and periods. Execute the script test.sh and verify appliances start and stop according to policy. Load can also be generated without test.sh, using the command:
/usr/load/http_load -rate 50 -seconds 300 /usr/load/url &
- appliance stop - verify appliances stop under decreasing load, according to policy, until just one is left running
- During each execution of test.sh in the appliance start test, as test.sh winds down, verify appliances stop under decreasing load, according to policy, until all but one are stopped.
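The rise and fall in load that the two load tests depend on can also be approximated by hand. The following is a minimal sketch, not the contents of test.sh: it assumes only the http_load binary and URL file paths given above, and the function name ramp_load, the rate list, and the step length are hypothetical choices made for illustration.

```shell
# Sketch only: step http_load through a list of request rates.
# HTTP_LOAD and URL_FILE default to the paths used in this test plan;
# ramp_load and its arguments are illustrative, not part of test.sh.
HTTP_LOAD=${HTTP_LOAD:-/usr/load/http_load}
URL_FILE=${URL_FILE:-/usr/load/url}

# ramp_load STEP_SECONDS RATE...
# Runs one http_load invocation per RATE, in the order given,
# each lasting STEP_SECONDS. Stops at the first failing invocation.
ramp_load() {
    step_seconds=$1
    shift
    for rate in "$@"; do
        echo "rate=$rate seconds=$step_seconds"
        "$HTTP_LOAD" -rate "$rate" -seconds "$step_seconds" "$URL_FILE" || return 1
    done
}
```

For example, ramp_load 300 10 25 50 25 10 would hold each rate for five minutes, rising to 50 fetches per second and then falling back. Whether these rates are enough to cross the start and stop values depends on the policy under test.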
GUI Tests
- authentication test - verify the GUI functions properly with both an empty username and an explicitly set username and password
- Set username and password explicitly. Either set these properties and re-start the application, or simply edit /etc/applogic.sh. Point a browser at the SLA GUI and verify authentication is performed.
- Set username and password to empty. Point a browser at the SLA GUI and verify authentication is not performed.
- Set username explicitly and password to empty. Point a browser at the SLA GUI and verify the empty password is used for authentication.
- lock file test - verify a lock file prevents GUI manual appliance start/stop while the SLA daemon is doing start/stop and vice versa.
- Place WEBx4 under load (use test.sh or manually invoke http_load). While SLA is automatically stopping or starting an appliance, verify the GUI prevents manual start and stop.
- Open a shell to main.sla.ctl. Manually create an empty lock file /mnt/config/3tera/lock. Place WEBx4 under load. Verify SLA does not automatically start an appliance in accordance with the policy. Delete the lock file.
- pause & play test - verify pause prevents SLA from enforcing the policy and play resumes enforcement
- Stop policy enforcement. Place WEBx4 under load. Verify SLA does not start appliances in accordance with the policy.
- Start policy enforcement. Verify SLA starts an appliance in accordance with the policy.
- stop & start test - verify appliances can be manually stopped and started
- Manually initiate an appliance start using the GUI. Verify it occurs.
- Manually initiate an appliance stop using the GUI. Verify it occurs.
- Manually initiate an appliance stop when only one appliance in the group is running. Verify it does not occur and a suitable message is displayed.
- Manually initiate an appliance start when all appliances in the group are running. Verify it does not occur and a suitable message is displayed.
- graph test - verify the graph of the current counter average against that provided by MON
- Examine side by side the SLA GUI graph, a MON graph of the SLA custom counters, and a MON graph of the counter in the policy for each of the web servers in WEBx4.
- With one appliance in the group running, generate changes in load on WEBx4. Verify the three graphs are nearly in sync. Note that SLA collects these counters from MON once every 5 seconds whereas MON is likely collecting the counters every second, so the results will not be identical. Nonetheless, they should closely match overall. Note also that the SLA GUI graph has a 24-hour horizontal axis where each pixel represents a 90-second average at its standard width. This graph will therefore not respond quickly to changes.
- Repeat the test with two appliances running. The SLA current counter average should closely match the average of the two values being graphed by MON.
- browser test - verify GUI functions in IE6x, IE7x and FireFox 2x
- Open the SLA GUI in each of IE6x, IE7x and FireFox 2x. Verify the rendering is acceptably uniform. Verify the GUI functions properly in each browser.
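For the second step of the lock file test, it is easy to forget to delete the lock file afterwards. The following is a minimal sketch of a helper that creates the lock, runs a command, and always removes the lock again. The function name with_lock is hypothetical, and the sketch assumes, as the test step implies, that an empty file at the lock path is all the SLA daemon checks for.

```shell
# Sketch only: hold the lock file for the duration of a manual check.
# LOCK_FILE defaults to the path named in the lock file test;
# with_lock is an illustrative helper, not part of the SLA appliance.
LOCK_FILE=${LOCK_FILE:-/mnt/config/3tera/lock}

# with_lock COMMAND [ARG...]
# Refuses to run if the lock file already exists. Otherwise creates an
# empty lock file, runs COMMAND, deletes the lock file even if COMMAND
# fails, and returns COMMAND's exit status.
with_lock() {
    if [ -e "$LOCK_FILE" ]; then
        echo "lock file already present: $LOCK_FILE" >&2
        return 1
    fi
    : > "$LOCK_FILE" || return 1
    "$@"
    status=$?
    rm -f "$LOCK_FILE"
    return "$status"
}
```

Run from a shell on main.sla.ctl, with_lock sleep 600 would hold the lock for ten minutes, leaving time to place WEBx4 under load and confirm that SLA does not start an appliance.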
Failure Tests
- grid access failure - verify that failure to access the grid controller causes user notification in the GUI
- With the application running, rename the grid.private.key file. Refresh the SLA GUI. Verify the GUI notifies the user of the missing file and prevents access to the policy.
- Edit the grid.private.key so that it no longer authenticates on the grid controller. Verify the GUI notifies the user of the unusable file and prevents access to the policy.
- MON access failure - verify that failure to access the MON appliance causes user notification in the GUI
- With the application running, stop the MON appliance. Refresh the SLA GUI. Verify that the GUI notifies the user that MON is not accessible and prevents access to the policy.
- daemon failure - verify the SLA daemon re-starts on failure (the daemon performs counter collection and policy enforcement)
- Open a shell to main.sla.ctl. Manually stop the SLA daemon, /usr/SLA/sla_daemon, by executing /usr/SLA/stopd. Use ps ax | grep sla_daemon to verify it is automatically re-started within five minutes by crond.
- notification upon failure - verify a dashboard message is generated if:
- all appliances are started and the trailing average passes the start value
- Generate load on WEBx4 using test.sh. Verify that a dashboard message is generated when all appliances are running and the trailing average passes the start value.
- the daemon fails to start or re-start
- Copy the test application. In the new application, branch the SLA appliance and change the /usr volume to rw. Start the new application. Rename /usr/SLA/sla_daemon. Stop the SLA daemon with /usr/SLA/stopd. Verify a dashboard message is generated within five minutes when crond fails to re-start the daemon. Stop and destroy the application.
- an appliance fails to start or stop
- To perform this test, SLA must be unable to start one of the web servers in WEBx4. Either of these methods will work:
- Starve the grid for memory resources by creating and starting another application which uses nearly all the available memory.
- Or, branch WEBx4 and cripple main.srv.srv2 by mounting its boot volume and renaming /appliance/vme. Unmount the volume and start the application.
- Generate load on WEBx4. Verify the daemon generates a dashboard message after failing to start main.srv.srv2.
- counters cannot be collected from at least one appliance for more than one minute
- Stop all of the web servers in WEBx4. Verify SLA generates a dashboard message indicating none of the appliances in the appliance_group is running.
- on-start failure - verify SLA fails to start if:
- the property grid_ctl_ip is incorrect
- Set the grid_ctl_ip to an incorrect value. Start the application. Verify it fails to start. Examine the log (log list n=100) and verify an appropriate message is generated.
- the grid.private.key file does not exist
- Mount the application volume sla_config. Rename the file grid.private.key. Unmount the volume and start the application. Verify it fails to start and an appropriate message is generated in the log.
- the grid.private.key file does not authenticate on the grid controller
- Mount the application volume sla_config. Copy an empty file to grid.private.key. Unmount the volume and start the application. Verify it fails to start and an appropriate message is generated in the log. Replace the file grid.private.key with a working private key file.
- the appliance_group property does not resolve to two or more appliances within the application
- Change the appliance_group property to an incorrect value. Start the application. Verify it fails to start and an appropriate message is generated in the log.
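The daemon failure test above asks for a check that sla_daemon reappears within five minutes. Rather than re-running ps by hand, the wait can be scripted. The following is a minimal sketch: the helper names wait_until and daemon_running are hypothetical, and the ten-second polling interval used in the example after the block is an arbitrary choice.

```shell
# Sketch only: poll a check until it passes or a time limit is reached.
# wait_until and daemon_running are illustrative helpers, not SLA tools.

# wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARG...]
# Re-runs COMMAND every INTERVAL_SECONDS until it succeeds or roughly
# TIMEOUT_SECONDS have elapsed. Returns 0 on success, 1 on timeout.
wait_until() {
    timeout=$1
    interval=$2
    shift 2
    elapsed=0
    while :; do
        "$@" && return 0
        [ "$elapsed" -ge "$timeout" ] && return 1
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
}

# daemon_running: succeeds when a process matching sla_daemon is listed.
# The [s] bracket keeps grep from matching its own command line.
daemon_running() {
    ps ax | grep -q '[s]la_daemon'
}
```

After executing /usr/SLA/stopd and confirming with ps that the daemon has gone, wait_until 300 10 daemon_running returns success as soon as crond has re-started it, or failure once five minutes have passed without it reappearing.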
Example Test
- example test - verify the steps in the example are sufficient and work.
- Follow the instructions in the example. Verify that following the example steps, one by one, results in a running LampX4 application where the SLA appliance is accessible through its GUI.
Terminal Tests
- aux test - verify aux terminal passes through traffic not directed to the GUI
- Point a browser at aux_in_ip of the application. Verify that the phpinfo() results are displayed.
- log test - verify logs are written to the logs_base_dir of the cifs share if the log terminal is connected, or to the config volume if not connected
- Ensure the log terminal of SLA is connected to the cifs terminal of the NAS appliance. Change the logs_base_dir property to test and re-start the application. Open a shell to main.sla.ctl.
- Execute mount. Verify //log/share on /mnt/log type cifs (rw,mand) is listed.
- cd /mnt/config/3tera and ll. Verify the link log -> /mnt/log//test is listed.
- cd /mnt/log/test and ll. Verify the file sla.log exists.
- Stop the application and remove the connection to the log terminal of SLA. Start the application. Open a shell to main.sla.ctl.
- Execute mount. Verify //log/share on /mnt/log type cifs (rw,mand) is not listed.
- cd /mnt/config/3tera and ll. Verify the directory log is listed.
- cd /mnt/config/3tera/log and ll. Verify the file sla.log exists.
- mon test - verify SLA counters are monitored by MON
- SLA provides three custom counters to MON through its mon terminal: Current Average, Trailing Average, and Running Appliances. With the application running, open the monitoring GUI for the MON appliance. Verify these counters are available by creating a graph which includes them.
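The manual checks in the log test (the mount listing, the log link or directory, and the presence of sla.log) can be collected into two small shell helpers. The following is a minimal sketch: the function names are hypothetical, and the mount line it matches is the one quoted in the log test above.

```shell
# Sketch only: helpers for the log test checks. Both function names are
# illustrative and are not part of the SLA appliance.

# has_cifs_log_mount: reads `mount` output on stdin and succeeds when
# the cifs log share line described in the log test is present.
has_cifs_log_mount() {
    grep -q '^//log/share on /mnt/log type cifs '
}

# check_log_location CONFIG_DIR
# Fails unless sla.log exists under CONFIG_DIR/log. Otherwise prints
# "cifs" when CONFIG_DIR/log is a symbolic link (log terminal connected)
# or "config" when it is a plain directory (log terminal not connected).
check_log_location() {
    log_path=$1/log
    [ -f "$log_path/sla.log" ] || return 1
    if [ -L "$log_path" ]; then
        echo cifs
    elif [ -d "$log_path" ]; then
        echo config
    else
        return 1
    fi
}
```

On main.sla.ctl, mount | has_cifs_log_mount should succeed only while the log terminal is connected, and check_log_location /mnt/config/3tera should print cifs in that case and config otherwise.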
-- StephenQ - 22 Nov 2007