SpellChecker Load Testing
Revision 07/17/02
Background
When a user accesses the SpellChecker feature on a web
site, several requests are issued to the server. These
requests go to both the engine and the web server.
Requests to the engine do the spell-checking work -
checking words, getting suggestions for the incorrect words,
etc. These requests begin by invoking the CGI script
residing on the server. This script, in turn, issues
requests to the application server that contains the system
logic and works with dictionaries. Each request uses
its own copy of the CGI script, but all these scripts work with
a single instance of the application server. Therefore,
such requests impose load on both the web server and the
application server.
Requests to the web server retrieve static elements, such
as images, scripts, style sheets, etc. Each time the
user presses an action button in the SpellChecker dialog,
several static items are loaded from the server.
Since the length of text and number of incorrect words vary
significantly among users, performance cannot be measured in
"sessions", or series of actions from the moment the user
started to check text to the moment (s)he submitted the
corrected text. Instead, "request" is the best unit of
measurement. A "request" is defined as the process that
occurs when the user presses an action button (Change, Ignore)
in the SpellChecker dialog window. In our testing process, each
request consisted of one engine request (specifically, "get all
suggestions for the word", as this is the most time-consuming
action) and four static requests (two .JS files, one .CSS file,
and one image).
Server Information
The system tested was installed on a server with the following
parameters:
- Dual AMD 1.53 GHz
- 512 MB RAM
- Windows 2000 Server
- Apache Web Server
Methodology
To emulate simultaneous user access, a multithreaded stress
tool was created. Between 20 and 30 threads were started on each
workstation participating in load testing, with each thread
issuing requests to the server. The words to check were drawn
randomly from a list of 4,000 correct and 4,000 incorrect
words. These words were the most popular correct and
incorrect words, based on analysis of the web system where
SpellChecker is used. The requests contained about 4%
incorrect words (this percentage is usual for message boards
and online systems). The number of words was enough to
over-run any caches present in the system.
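The thread-and-word-list setup described above can be sketched roughly as follows. This is an illustrative sketch only, not the actual stress tool: the names `pick_word`, `worker`, `run_stress`, and the `send` callable (which would perform the engine request) are hypothetical, and drawing each word independently with a 4% chance of coming from the incorrect list is an assumption about how the mix was produced.

```python
import random
import threading
import time

INCORRECT_RATE = 0.04  # from the text: about 4% of words are incorrect


def pick_word(correct, incorrect, rng):
    """Draw one word; roughly 4% of draws come from the incorrect list."""
    pool = incorrect if rng.random() < INCORRECT_RATE else correct
    return rng.choice(pool)


def worker(send, correct, incorrect, n_requests, log, lock, seed):
    """One stress thread: issue n_requests and log a timestamp for each."""
    rng = random.Random(seed)  # per-thread generator, seeded for repeatability
    for _ in range(n_requests):
        word = pick_word(correct, incorrect, rng)
        send(word)  # hypothetical callable that performs the engine request
        with lock:
            log.append((time.time(), word))


def run_stress(send, correct, incorrect, n_threads=20, n_requests=100):
    """Start n_threads workers and return the combined request log."""
    log, lock = [], threading.Lock()
    threads = [
        threading.Thread(
            target=worker,
            args=(send, correct, incorrect, n_requests, log, lock, i),
        )
        for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

In a real run, `send` would issue the HTTP request to the CGI script and fetch the static files; here it is left as a parameter so the sketch does not assume any particular URL or client library.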
Testing was performed from five workstations, with network
connectivity rates ranging from 512 Kbit/s DSL to 100 Mbit/s LAN.
Each stress tool logged its requests. After testing,
the logs from different workstations were combined and the
number of requests in each minute was calculated. To achieve
this, system clocks were synchronized on all participating
workstations.
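The log-combining step can be illustrated with a short sketch. It assumes, hypothetically, that each workstation's log reduces to a list of request timestamps in seconds on a shared (synchronized) clock; the function name `requests_per_minute` is made up for this example.

```python
from collections import Counter


def requests_per_minute(logs):
    """Merge per-workstation timestamp lists and count requests per minute.

    Returns a dict mapping the start of each minute (in seconds) to the
    number of requests logged in that minute across all workstations.
    """
    counts = Counter()
    for timestamps in logs:
        for ts in timestamps:
            minute_start = int(ts // 60) * 60  # truncate to the minute
            counts[minute_start] += 1
    return dict(sorted(counts.items()))
```

Bucketing by a shared clock is why the synchronization matters: if one workstation's clock were off by a minute, its requests would be counted in the wrong bucket.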
Results
Testing showed that the SpellChecker server, with the
hardware and software configuration described above, was able
to process 3500 - 4000 requests per minute. While the
number varied throughout testing, most data points fell within
this range. System processes on the server and workstations and
variations in Internet connectivity may explain this
variation.
3500 - 4000 requests per minute is equal to:
- 210 - 240 K requests per hour
- 5.0 - 5.8 M requests per day
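The hourly and daily figures follow from simple multiplication of the per-minute rate, assuming the rate is sustained around the clock. A minimal sketch (the function name `scale` is hypothetical):

```python
def scale(per_minute):
    """Convert a per-minute request rate to per-hour and per-day rates."""
    per_hour = per_minute * 60
    per_day = per_hour * 24
    return per_hour, per_day


low = scale(3500)   # (210000, 5040000)
high = scale(4000)  # (240000, 5760000)
```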
Testing demonstrated that the only resource critical to
overall performance is processor speed. Memory usage
remained stable during the testing at 80 - 90 MB in
total (for SpellChecker software, web and FTP servers,
operating system, etc.).
Processor load on the server oscillated between 20% and
90%, with 45% being the average.
Current configuration of SpellChecker.net
Our system configuration is dual AMD 1.53 GHz,
512 MB RAM, 40 GB HDD.