LoadSim Users Guide

1 - Introduction

2 - Installing LoadSim

To install LoadSim, follow these steps (these instructions assume you have a JVM installed):

Download the tarball and unpack it.
Add a LOADSIM_HOME environment variable that points at the root directory where you have installed LoadSim.
Add loadsim to your system path by adding $LOADSIM_HOME/bin to your PATH environment variable.
Type loadsim --help for a usage blurb.

3 - Recording Browser Sessions

Before you can run a load test you need a list of links to simulate. You get this list by "recording" a browser session. Start by creating a new directory for your simulation scripts and results, then copy the contents of <LOADSIM_HOME>/examples/templates into it. The templates directory contains example configuration files that should help you get started using LoadSim.

If you do not have Java Web Start installed, follow these instructions to record your browser session:

Start LoadSim with the --record switch. That is, type: loadsim --record
An alternative to this is to launch LoadSim via Java Web Start.
This should bring up the Muffin window. Choose the "Edit/Filters..." menu item.
If this is the first time running in record mode you will need to add the LoadSim filter. You do this by first pushing the "New" button in the filters window.
Then, in the text field of the new window type org.openware.loadsim.muffin.LoadSim.
Select it from the list of filters and press the "Enable" button. Click the "Save" button and this configuration will be saved so that you can bypass the preceding steps in subsequent uses of LoadSim.
Select this filter in the enabled list and press the "Preferences" button.
This should open up the LoadSim filter. Select the file that you want to record to.
Open up your browser. Set it up to use a proxy server instead of going directly to the Internet. Use the following settings:

Host: localhost

Port: 51966 (this is Muffin's default port; if you change Muffin's port, change this value accordingly).
Go back to the LoadSim filter window and now press the "Record" button.
Go back to the browser and start using your web application/site.
When you are done, press the "Stop" button.
If you are done recording it is wise to go ahead and reset your browser to connect directly to the Internet.

4 - Running the Simulation

If you have a simple website (no page data, such as URL encoded session IDs) then you have a sequence that is ready to use with LoadSim.

Now, you will need to create a new simulation. You can use the files in the examples/templates directory as a starting point for your simulation definition. Copy all of the files in that directory to a new directory. The easiest thing is to copy the sequence file you made during the recording phase into this same directory.

Edit the timers.xml file to set up realistic timers for your application. If people are going to spend a good deal of time at each page, use longer average times.

Now, edit the simulation.xml file. Replace the file name for the <use-sequence> tag with the file that you've just recorded. The hostname is the name of the host to run the simulation on (not the website you are testing). For small numbers of virtual users one machine should suffice and the hostname will usually just be "//localhost:8000". If you want to distribute the simulation across multiple machines, each machine will need to have an instance of LoadSim running on it. If the default port of 8000 is okay all you need to do on the other machines is start LoadSim with no arguments. For each machine you want to use you will need to add a <use-sequence> tag.

To determine how many virtual users your machine can simulate, start with some number of virtual users for a particular simulation, say 50. Run the simulation and monitor the CPU utilization. Keep adding virtual users until the CPU utilization is consistently in the 50-100% range for most of the simulation. Why do this? The timing data that LoadSim collects is simply the difference in time between when the request was sent and when the first and last bytes are received. The simulation itself takes time to execute. So, for that difference to reflect the performance of your website as accurately as possible, you must ensure that the simulation does not stress the CPU so much that it interferes with the results. You only need to do this once per different type of machine (if two machines have the same number of CPUs, each with the same speed, and the same amount of memory, they should both be able to simulate the same number of users).

In addition to the above "calibration", you will need to do a "sanity check" to make sure that the results from all of the machines are roughly consistent with each other. For instance, if you've run a simulation on 10 machines, and machines 1 through 9 all have a mean time-to-last-byte (TTLB) time of right around 1000 ms, but machine 10 has a mean TTLB time of 6000 ms, you might conclude that the results from machine 10 are bogus. In that case, you could just throw out the results of machine 10 and calculate summary statistics on the results from machines 1 through 9.
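The sanity check described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not something that ships with LoadSim: the class and method names are hypothetical, the samples are assumed to have already been read from the results files with invalid samples discarded, and the cut-off (a host mean more than 3 times the median of all host means) is an arbitrary choice you would tune yourself.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of LoadSim.
final class TtlbSanityCheck {

    // Mean TTLB in ms per load-driving host. hosts[i] produced ttlbMillis[i].
    static Map<String, Double> meanByHost(String[] hosts, long[] ttlbMillis) {
        Map<String, long[]> totals = new LinkedHashMap<>(); // host -> {sum, count}
        for (int i = 0; i < hosts.length; i++) {
            long[] t = totals.computeIfAbsent(hosts[i], h -> new long[2]);
            t[0] += ttlbMillis[i];
            t[1]++;
        }
        Map<String, Double> means = new LinkedHashMap<>();
        for (Map.Entry<String, long[]> e : totals.entrySet()) {
            means.put(e.getKey(), (double) e.getValue()[0] / e.getValue()[1]);
        }
        return means;
    }

    // Hosts whose mean TTLB exceeds `factor` times the median of all host means.
    // The factor is an assumption of this sketch, not a LoadSim setting.
    static List<String> suspectHosts(Map<String, Double> means, double factor) {
        List<String> suspects = new ArrayList<>();
        if (means.isEmpty()) {
            return suspects;
        }
        List<Double> sorted = new ArrayList<>(means.values());
        Collections.sort(sorted);
        double median = sorted.get(sorted.size() / 2); // upper median for even counts
        for (Map.Entry<String, Double> e : means.entrySet()) {
            if (e.getValue() > factor * median) {
                suspects.add(e.getKey());
            }
        }
        return suspects;
    }

    public static void main(String[] args) {
        String[] hosts = {"m1", "m1", "m2", "m2", "m3", "m3"};
        long[] ttlb = {900, 1100, 1000, 1000, 5000, 7000};
        Map<String, Double> means = meanByHost(hosts, ttlb);
        System.out.println(means);                    // {m1=1000.0, m2=1000.0, m3=6000.0}
        System.out.println(suspectHosts(means, 3.0)); // [m3]
    }
}
```

Comparing against the median rather than the overall mean keeps one bad machine from dragging the reference value toward itself.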

Once you are done editing your simulation.xml file, you can start the console by typing loadsim --startconsole.

At the Tcl prompt ('%') type source runsim.tcl.

Type init simulation.xml.

Type start.

If you want to end the simulation prematurely type stop or exit. If you've run a simulation across multiple machines you can also type fetch. This will fetch all of the results from the remote machines and append them to the results file on your local machine.

You'll wind up with a log file and a comma-separated values (CSV) file with the following columns:

LABEL,HOSTNAME,ramping[ done],TTFB,TTLB,SIZE,CONTENT-TYPE,RESPONSE-CODE

The LABEL will be the name of the "file" for the requested URL. That is, everything after the last '/' and before the '?' (if there is a querystring). If you need to change this you should edit the recorded sequence file before running the simulation.
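The labeling rule can be expressed as a small Java function. This is a sketch of the rule as stated above, not LoadSim's actual code; the class name, method name, and example URL are made up, and how LoadSim treats edge cases (for example a URL that ends in '/') is not something this guide specifies.

```java
// Hypothetical illustration of the LABEL rule; not LoadSim source code.
final class LabelSketch {

    // Everything after the last '/' and before the '?' (if there is a querystring).
    static String labelFor(String url) {
        int q = url.indexOf('?');
        // Drop the querystring first so a '/' inside it cannot affect the result.
        String withoutQuery = (q >= 0) ? url.substring(0, q) : url;
        return withoutQuery.substring(withoutQuery.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        // prints login.php
        System.out.println(labelFor("http://www.somecompany.com/app/login.php?user=alincoln"));
    }
}
```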

The HOSTNAME is the name of the host that is driving the load, not the host name of the website being tested.

When you perform a load test with a large number of virtual users it can take some time to get them all up and running. The ramping[ done] column indicates whether the sample was taken while the simulation was still ramping up the virtual users (ramping) or after all of the users had been started (ramping done).

All times are in milliseconds, size is in bytes. If any of the time or size values are -1 then that sample is invalid. This happens when an error occurs while reading from or writing to the website being tested.

Finally, the RESPONSE-CODE is the HTTP response code. Normal is 200 and redirects are 302. For other codes see an HTTP reference guide.
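If you want to post-process the results file yourself, the column layout above maps onto a simple Java reader. Treat this as a minimal sketch under assumptions: the class is hypothetical, the sample lines are invented for illustration, and the reader assumes that no field contains an embedded comma.

```java
// Hypothetical reader for one line of the results file; not part of LoadSim.
final class ResultLine {
    final String label;
    final String hostname;
    final boolean rampingDone;
    final long ttfbMillis;
    final long ttlbMillis;
    final long sizeBytes;
    final String contentType;
    final int responseCode;

    ResultLine(String csvLine) {
        // Assumes no field contains an embedded comma.
        String[] f = csvLine.split(",", -1);
        if (f.length != 8) {
            throw new IllegalArgumentException("Expected 8 columns, got " + f.length);
        }
        label = f[0];
        hostname = f[1];
        rampingDone = f[2].trim().equals("ramping done");
        ttfbMillis = Long.parseLong(f[3].trim());
        ttlbMillis = Long.parseLong(f[4].trim());
        sizeBytes = Long.parseLong(f[5].trim());
        contentType = f[6];
        responseCode = Integer.parseInt(f[7].trim());
    }

    // A sample is invalid when any of the time or size values is -1.
    boolean isValid() {
        return ttfbMillis != -1 && ttlbMillis != -1 && sizeBytes != -1;
    }

    public static void main(String[] args) {
        // Both lines are invented examples, not real LoadSim output.
        ResultLine ok = new ResultLine("login.php,loadhost1,ramping done,120,450,2048,text/html,200");
        ResultLine bad = new ResultLine("login.php,loadhost1,ramping,-1,-1,-1,text/html,200");
        System.out.println(ok.isValid() + " " + bad.isValid()); // true false
    }
}
```

When computing summary statistics, you would typically skip lines where isValid() is false, and possibly those taken while the simulation was still ramping up.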

5 - Content Handlers

If you want to do something with the downloaded content you can use an existing content handler or write your own. Currently, LoadSim ships with content handlers to save the pages to a file, print the anchor tags to a file and print the headers to a file. This can provide a sanity check during simulation that you are indeed getting what you asked for.

To use a content handler during a simulation you will need to add lines like the following after your <default-timer> tag in the simulation.xml file:

    <content-handler content-type="text/html" 
    handler-class="org.openware.loadsim.output.SaveHtmlHandler"/>

You can add as many handlers as you like for each content type. Each one will be executed in the order you specify in the simulation definition file. They are executed after the current page is completely downloaded and the next link is ready to go. This allows your handler to override any querystring or form data that LoadSim set up based on how the simulation and links have been configured.

You can also add your own handlers. To do so you must implement the IContentHandler interface.

LoadSim passes the last page downloaded along with the timing results to the handler. The page is accessible as an HttpUnit WebResponse which provides a parsed page allowing a developer to gain access to all the tags in the page. This can be handy for verifying that the correct page is downloaded. In addition to the page and the timing results the next link in the simulation is also provided. With this a developer can set querystring or form data values based on information gleaned from the page (to set the value of a particular link datum that datum type must be page). It should be noted that if you access the WebResponse methods then the file must be parsed, which takes time away from the actual simulation. The performance hit may be acceptable during functional testing or during monitoring of a web application, but it may severely degrade the number of virtual users that can be simulated in a load test.

For a more concrete example I've included the source code for the handler class that dumps the links (anchor tags) to a file:

package org.openware.loadsim.output;

import com.meterware.httpunit.WebLink;
import com.meterware.httpunit.WebResponse;

import org.openware.loadsim.link.LinkException;

import java.io.PrintWriter;
import java.io.FileWriter;

public class DumpLinksHandler implements IContentHandler
{
    // Writes every anchor found in the last downloaded page to "<label>.links".
    public void handlePage(SampleResult sr) throws LinkException {
        PrintWriter writer = null;

        try {
            writer = new PrintWriter(new FileWriter(sr.getLabel() +
                                                    ".links"));
            // Asking for the links forces HttpUnit to parse the page.
            WebLink[] links = sr.getLastPage().getLinks();
            for (int i = 0; i < links.length; i++) {
                writer.println("link[" +
                               i + "] = '" +
                               links[i].getURLString() + "'");
            }
        } catch (Exception e) {
            throw new LinkException("Error writing file: " +
                                    e.toString());
        } finally {
            // Close the file even if parsing or writing fails.
            if (writer != null) {
                writer.close();
            }
        }
    }
}

Each set of content handlers is cloned for each virtual user, and that user's set is then used throughout the simulation. If, for instance, you need a handler to set a URL-encoded session ID that is the same for all links, you can parse the first page (after logging in), find the session ID, and save it for all future links.
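The "find it once, reuse it afterwards" idea can be sketched as follows. This is deliberately independent of the LoadSim and HttpUnit APIs so that it can run on its own: the class is hypothetical, the ";jsessionid=<token>" encoding is an assumption about the site under test, and wiring it into an IContentHandler (reading links from the page, updating the next link) is left out because those calls are not described in this guide.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical per-virtual-user state; not part of LoadSim.
final class SessionIdTracker {
    // Assumed encoding: ";jsessionid=<token>" embedded in the URL path.
    private static final Pattern SESSION = Pattern.compile(";jsessionid=([A-Za-z0-9]+)");

    private String sessionId; // null until the first id has been seen

    // Call with a link taken from the first page after logging in.
    void observe(String linkUrl) {
        if (sessionId != null) {
            return; // keep the first id for the rest of the simulation
        }
        Matcher m = SESSION.matcher(linkUrl);
        if (m.find()) {
            sessionId = m.group(1);
        }
    }

    String getSessionId() {
        return sessionId;
    }

    // Returns url carrying the saved id, placed before any querystring.
    String rewrite(String url) {
        if (sessionId == null) {
            return url;
        }
        String stripped = SESSION.matcher(url).replaceAll("");
        String suffix = ";jsessionid=" + sessionId;
        int q = stripped.indexOf('?');
        return (q >= 0)
                ? stripped.substring(0, q) + suffix + stripped.substring(q)
                : stripped + suffix;
    }

    public static void main(String[] args) {
        SessionIdTracker tracker = new SessionIdTracker();
        tracker.observe("http://www.somecompany.com/home.jsp;jsessionid=ABC123?x=1");
        // prints http://www.somecompany.com/cart.jsp;jsessionid=ABC123?item=2
        System.out.println(tracker.rewrite("http://www.somecompany.com/cart.jsp?item=2"));
    }
}
```

Because handlers are cloned per virtual user, an instance field like sessionId here would hold one user's session rather than being shared across users.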

6 - Advanced Simulations

While the mode that has been described above is fine for simple sites (and probably most load simulation), LoadSim supports more sophisticated modes of operation. The main way that LoadSim supports more sophisticated simulations is in how you can specify link data (link data is a general term that refers to both query string and form data).

Data Sets

The first method for specifying more realistic simulations is through the use of data sets. In a separate file that adheres to the datasets.dtd you can define a set of data values that you can then substitute into your form and/or query string data.

Take the following as an example dataset definition:

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE datasets SYSTEM "datasets.dtd">

    <datasets>
        <dataset id="usernames"
                    itertype="round">
            <item value="alincoln"/>
            <item value="tjefferson"/>
            <item value="gwashington"/>
            <item value="jadams"/>
        </dataset>
        <dataset id="passwords"
                    itertype="round">
            <item value="alincoln_pw"/>
            <item value="tjefferson_pw"/>
            <item value="gwashington_pw"/>
            <item value="jadams_pw"/>
        </dataset>
    </datasets>

The itertype refers to how the dataset is traversed. round means to traverse the dataset in a round-robin manner. You can also set it to random, which means the values are picked at random.
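To make the two traversal modes concrete, here is a minimal Java sketch. It is not LoadSim's implementation: the class name, the constructor, and the explicit random seed are assumptions made so the example is self-contained and repeatable.

```java
import java.util.Random;

// Hypothetical illustration of itertype="round" versus itertype="random".
final class DatasetIterator {
    private final String[] items;
    private final boolean random;
    private final Random rng;
    private int next = 0;

    DatasetIterator(String[] items, String itertype, long seed) {
        if (items.length == 0) {
            throw new IllegalArgumentException("dataset is empty");
        }
        if (!itertype.equals("round") && !itertype.equals("random")) {
            throw new IllegalArgumentException("unknown itertype: " + itertype);
        }
        this.items = items.clone();
        this.random = itertype.equals("random");
        this.rng = new Random(seed);
    }

    String nextValue() {
        if (random) {
            return items[rng.nextInt(items.length)]; // any item, repeats allowed
        }
        String value = items[next];
        next = (next + 1) % items.length;            // wrap around at the end
        return value;
    }

    public static void main(String[] args) {
        String[] users = {"alincoln", "tjefferson", "gwashington", "jadams"};
        DatasetIterator round = new DatasetIterator(users, "round", 0L);
        for (int i = 0; i < 5; i++) {
            // alincoln, tjefferson, gwashington, jadams, then alincoln again
            System.out.println(round.nextValue());
        }
    }
}
```

In this sketch, two round iterators over datasets of equal length advance in step with each other, which is what would keep a username lined up with its password; random traversal gives no such pairing.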

To use this in a simulation you must import the dataset file in the simulation.xml file.

          .
          .
          .
 
    <import type="datasets" filename="users.xml"/>

          .
          .
          .

To use the datasets in your link sequence file you can do something like the following:

          .
          .
          .
    
    <link id="login"
             tid="medium"
             host="http://www.somecompany.com"
             pathroot="/login.php">
        <formdata>
            <data>
                <name>UserName</name>
                <value><datasetref dsid="usernames"/></value>
            </data>
            <data>
                <name>Password</name>
                <value><datasetref dsid="passwords"/></value>
            </data>
        </formdata>
    </link>

          .
          .
          .

This way the login form for the somecompany website will sequence through the list of U.S. presidents and log them in using the usernames and passwords defined in the dataset.

Redirects

When a browser session is recorded any HTTP redirects are marked in the resulting sequence file. By default they are marked as "static". This tells LoadSim to just use the link that was recorded verbatim. You can change this value to "dynamic" which will cause LoadSim to use the link that was returned in the HTTP header (HTTP redirects tell the browser to go to another URL that is supplied in the HTTP header). The latter is handy if there is any per session data encoded in the URL that will change each time a user logs into the web based application being tested.
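The difference between the two settings boils down to which URL is requested next. The Java sketch below illustrates that decision only; it is not LoadSim code, the names are hypothetical, and falling back to the recorded link when a dynamic redirect has no Location header is an assumption of this sketch.

```java
// Hypothetical illustration of "static" versus "dynamic" redirects.
final class RedirectSketch {

    // mode:           "static" or "dynamic", as marked in the sequence file
    // recordedUrl:    the redirect target captured while recording
    // locationHeader: the Location header of the live response, or null if absent
    static String nextUrl(String mode, String recordedUrl, String locationHeader) {
        if ("static".equals(mode)) {
            return recordedUrl;            // replay the recorded link verbatim
        }
        if ("dynamic".equals(mode)) {
            // Assumed fallback: use the recorded link if no header was returned.
            return (locationHeader != null) ? locationHeader : recordedUrl;
        }
        throw new IllegalArgumentException("unknown redirect mode: " + mode);
    }

    public static void main(String[] args) {
        String recorded = "http://www.somecompany.com/home.php?sid=111";
        String live = "http://www.somecompany.com/home.php?sid=222";
        System.out.println(nextUrl("static", recorded, live));  // the sid=111 URL
        System.out.println(nextUrl("dynamic", recorded, live)); // the sid=222 URL
    }
}
```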

7 - Using SSL with LoadSim

LoadSim supports SSL via the JSSE 1.0.2 library from JavaSoft. Download the JSSE distribution, unzip it into a directory, and then copy the contents of the JSSE lib directory to the LoadSim lib directory. That is all you have to do to enable SSL support in LoadSim. Just start it up the same way you always do (you should see a message that says SSL has been enabled).

Note for JDK 1.4

The JSSE library is included with JDK 1.4, so there is no need to include the JSSE jar if you are using that version of the JDK.

SSL Implications for Load Testing

There are a couple of important considerations to take into account when using SSL. First, you cannot record a browser session in SSL mode. The reason is that LoadSim simply uses an HTTP proxy to record the session, and if all it took to break the security behind SSL were sticking a proxy server between a browser and a website, SSL wouldn't be that secure. So, you will need to record the session without SSL and then edit the resulting sequence file to replace all occurrences of http with https.
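The http-to-https edit can be done in any text editor; for larger sequence files, a small Java sketch like the one below could do it. This is an illustration rather than a LoadSim tool: the class name and command-line arguments are hypothetical, and the file is assumed to be UTF-8. It replaces "http://" rather than the bare word "http" so that anything already using https is left alone.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper; not part of the LoadSim distribution.
final class HttpsRewrite {

    // Rewrites every "http://" to "https://". Note that this touches every
    // such URL in the text, not only the host attributes of links.
    static String toHttps(String sequenceXml) {
        return sequenceXml.replace("http://", "https://");
    }

    // Usage (hypothetical): java HttpsRewrite recorded.xml recorded-ssl.xml
    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: HttpsRewrite <input-file> <output-file>");
            return;
        }
        String in = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        Files.write(Paths.get(args[1]), toHttps(in).getBytes(StandardCharsets.UTF_8));
    }
}
```

Writing to a separate output file keeps the original recording intact in case you also want to run the simulation without SSL.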

Another consideration is that you will not be able to simulate as many virtual users when using SSL as you can when you are not using SSL. This is due to the simple fact that SSL takes up CPU cycles encrypting/decrypting the HTTP stream, which are cycles that cannot be used by virtual users.

8 - Reporting the Results

During a simulation LoadSim produces raw output that contains timing data for each link accessed. To get the raw output into a form that is more useful for diagnostic purposes LoadSim comes with some basic report generation/publishing support.

To configure the published results you will need to create an XML file that will tell the publisher exactly what to publish. See the publish.dtd for details on the format of this file.

To use the publishing tool you can just type "loadsim_pub" at the command line prompt (this is assuming you have added the LoadSim bin directory to your path). This should give you the usage blurb. The command has the following form:

loadsim_pub input_file.csv xml_config_file [ --pubdir output_directory ]

When the command is invoked the output files will by default be put into a directory called html unless overridden by the --pubdir option. In both cases the directory should be relative to the directory in which the loadsim_pub command is executed.

Appendix 1 -- LoadSim Tcl Commands

simulation

NAME

simulation - Manipulate simulations

SYNOPSIS

simulation command ?options?

simulation new -config configuration_file

simulation simulation_ref start

simulation simulation_ref stop

simulation simulation_ref stat

DESCRIPTION

This command allows you to create and use a simulation.

simulation new -config configuration_file: Creates a new simulation. The -config option tells LoadSim where to get the configuration information for the simulation. You should use the simulation.xml from the examples/templates directory of the LoadSim distribution as a starting point for this file.
simulation simulation_ref start: Starts the simulation running.
simulation simulation_ref stop: Stops the simulation.
simulation simulation_ref stat: Displays the status of the simulation. If your simulation is simulating multiple link sequences on multiple machines then this will return a list of those and their statuses (either running or stopped at this point).

Appendix 2 -- LoadSim DTDs

Following is the set of DTDs that define LoadSim simulations: