This short article describes a J2EE architectural pattern known as a Streaming Architecture. For a comprehensive practical example with full source code, as well as a performance comparison with other architectures, see Streaming Presidents.
Improve application server scalability by streaming response data through the application server, thereby never holding more than a small portion of a result set in memory at once.
Consider a J2EE-based system that includes reporting functionality that may, depending on the selection criteria, return several hundred rows of data in response to a single request.
Sun's J2EE Blueprints traditionally structure applications with (1) a web tier, within which a Front Controller servlet runs, and (2) an application tier, wherein session beans encapsulating core business logic run. The front controller invokes a handler that, in turn, remotely invokes the appropriate session bean to perform its business logic. The session bean returns its results as Transfer Objects. To complete handling of the request, the web-tier handler forwards these transfer objects to a JSP to be rendered.
However, when large datasets may be returned to the client, using this approach has serious scalability limitations. To see why, let us track the memory usage, as a function of the number of rows returned, for a single request. In particular, we are concerned with the asymptotic behavior of the function.
The memory used for request propagation is clearly independent of the number of rows returned, so consider this a constant-memory operation, O(1). After the session bean queries the datastore, the results are stored as a list of transfer objects, one transfer object per record; thus memory usage is O(n). The list of transfer objects must then be transferred to the web tier and reconstituted, where it will again occupy O(n) memory. Therefore, the total memory used on the application server to process n records is O(n).
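The O(n) behavior above can be sketched in plain Java. The `ReportRow` and `ReportBean` names are illustrative, not part of any Blueprints API; the point is that the session bean materializes the entire result set before returning it.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative transfer object: one instance per database record.
class ReportRow {
    final String value;
    ReportRow(String value) { this.value = value; }
}

// Sketch of the Blueprints approach: the bean builds the complete list
// of transfer objects before returning, so heap usage on the application
// server grows linearly (O(n)) with the number of rows in the result.
class ReportBean {
    List<ReportRow> fetchAll(int rowCount) {
        List<ReportRow> rows = new ArrayList<>();
        for (int i = 0; i < rowCount; i++) {
            rows.add(new ReportRow("row-" + i)); // every row held in memory at once
        }
        return rows;
    }
}
```

In the real pattern the rows would come from a JDBC query rather than a loop, but the memory profile is the same: the full list exists on the heap before a single byte reaches the client.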
Certain techniques can be applied to reduce memory strain in the above scenario. Most importantly, serialization between the web and application tier can be eliminated through the use of local interfaces, introduced in EJB 2.0. As a result, the list of transfer objects can be passed by reference from the application to the web tier, effectively halving the total memory used to process the request. However, despite this performance improvement, local interfaces do not change the asymptotic memory performance of Sun's endorsed architecture.
Thus, this approach creates a scalability bottleneck on the application server, especially under high concurrency and with large datasets. For example, a two-machine server cluster running IBM WebSphere 5.0, with serialization turned on and nearly 1GB of memory allocated to WebSphere on each machine, was observed to crash when three or four simultaneous requests for very large data sets (80-90 pages of data) were made during peak hours of usage.
The key to alleviating this bottleneck is to decouple memory usage on the application server from the number of results returned by a particular query, thereby reducing memory usage from O(n) to O(1). In the large majority of business reporting scenarios this is possible, and the approach for doing so is termed a Streaming Architecture.
From a high level, a streaming architecture works by simultaneously keeping two data "pipes" open while handling a request: one to the data source and one to the client. When a row is read from the data source, it is processed and then immediately written to the client. As a result, no more than a single row is kept in memory at once, and memory usage remains constant for the duration of the request.
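The core loop can be sketched as follows. This is a minimal illustration, not a complete handler: `StreamingReportWriter` is a made-up name, the `Iterator` stands in for a forward-only JDBC `ResultSet`, and the `Writer` stands in for the one obtained from `HttpServletResponse.getWriter()`.

```java
import java.io.IOException;
import java.io.Writer;
import java.util.Iterator;

// Streaming sketch: rows are pulled one at a time from the data source
// and written immediately to the client, so only the current row is
// ever held in memory -- O(1) regardless of result-set size.
class StreamingReportWriter {
    void stream(Iterator<String> rows, Writer out) throws IOException {
        while (rows.hasNext()) {
            String row = rows.next(); // read one row from the data source
            out.write(row);           // write it to the client right away
            out.write('\n');
        }
        out.flush(); // nothing accumulates on the server between rows
    }
}
```

In a real servlet the container buffers and flushes the response stream on its own schedule; the essential property is simply that the handler never builds a list of the full result set.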
One consequence of the streaming architecture is that it precludes the use of EJBs to handle a request. An EJB can neither stream results back to the web tier (which would allow the request handler to write results to the client one by one) nor accept the handler's connection to the client. Because of this fundamental constraint, a streaming architecture is often overlooked in J2EE implementations, even in situations where it could significantly improve scalability.
Applicability

Use a streaming architecture when:
Consequences

Implementing a Streaming Architecture has the following benefits: