This short article describes a J2EE architectural pattern known as a Streaming Architecture. For a comprehensive practical example with full source code, as well as a performance comparison with other architectures, see Streaming Presidents.
Improve application server scalability by streaming response data through the application server, thereby never holding more than a small portion of a result set in memory at once.
Consider a J2EE-based system that includes reporting functionality that may, depending on the selection criteria, return several hundred rows of data in response to a single request.
Sun's J2EE Blueprints traditionally structure applications with (1) a web tier, within which a Front Controller servlet runs, and (2) an application tier, wherein session beans encapsulating core business logic run. The front controller invokes a handler that, in turn, remotely invokes the appropriate session bean to perform its business logic. The session bean returns its results as Transfer Objects. To complete handling of the request, the web-tier handler forwards these transfer objects to a JSP to be rendered.
However, when large datasets may be returned to the client, using this approach has serious scalability limitations. To see why, let us track the memory usage, as a function of the number of rows returned, for a single request. In particular, we are concerned with the asymptotic behavior of the function.
The memory used for request propagation is clearly independent of the number of rows returned, so consider this a constant-memory operation, O(1). After the session bean queries the datastore, the results are stored as a list of transfer objects, one transfer object per record; thus memory usage is O(n). The list of transfer objects must then be transferred to the web tier and reconstituted, where it will again occupy O(n) memory. Therefore, the total memory used on the application server to process n records is O(n).
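The O(n) behavior above can be sketched in plain Java. The `ReportRow` and `ReportBean` names are illustrative, not part of any Blueprints API; the point is that the session bean materializes the entire result set before returning it.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative transfer object: one instance per database record.
class ReportRow {
    final String value;
    ReportRow(String value) { this.value = value; }
}

// Sketch of the Blueprints approach: the bean builds the complete list
// of transfer objects before returning, so heap usage on the application
// server grows linearly (O(n)) with the number of rows in the result.
class ReportBean {
    List<ReportRow> fetchAll(int rowCount) {
        List<ReportRow> rows = new ArrayList<>();
        for (int i = 0; i < rowCount; i++) {
            rows.add(new ReportRow("row-" + i)); // every row held in memory at once
        }
        return rows;
    }
}
```

In the real pattern the rows would come from a JDBC query rather than a loop, but the memory profile is the same: the full list exists on the heap before a single byte reaches the client.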
Certain techniques can be applied to reduce memory strain in the above scenario. Most importantly, serialization between the web and application tier can be eliminated through the use of local interfaces, introduced in EJB 2.0. As a result, the list of transfer objects can be passed by reference from the application to the web tier, effectively halving the total memory used to process the request. However, despite this performance improvement, local interfaces do not change the asymptotic memory performance of Sun's endorsed architecture.
Thus, this approach creates a scalability bottleneck on the application server, especially under high concurrency and with large datasets. For example, a two-machine server cluster running IBM WebSphere 5.0, with serialization turned on and nearly 1GB of memory allocated to WebSphere on each machine, was observed to crash when three or four simultaneous requests for very large data sets (80-90 pages of data) were made during peak hours of usage.
The key to alleviating this bottleneck is to decouple memory usage on the application server from the number of results returned by a particular query, thereby reducing memory usage from O(n) to O(1). In the large majority of business reporting scenarios this is possible, and the approach for doing so is termed a Streaming Architecture.
From a high level, a streaming architecture works by simultaneously keeping two data "pipes" open while handling a request: one to the data source and one to the client. When a row is read from the data source, it is processed and then immediately written to the client. As a result, no more than a single row is kept in memory at once, and memory usage remains constant for the duration of the request.
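The core loop can be sketched as follows. This is a minimal illustration, not a complete handler: `StreamingReportWriter` is a made-up name, the `Iterator` stands in for a forward-only JDBC `ResultSet`, and the `Writer` stands in for the one obtained from `HttpServletResponse.getWriter()`.

```java
import java.io.IOException;
import java.io.Writer;
import java.util.Iterator;

// Streaming sketch: rows are pulled one at a time from the data source
// and written immediately to the client, so only the current row is
// ever held in memory -- O(1) regardless of result-set size.
class StreamingReportWriter {
    void stream(Iterator<String> rows, Writer out) throws IOException {
        while (rows.hasNext()) {
            String row = rows.next(); // read one row from the data source
            out.write(row);           // write it to the client right away
            out.write('\n');
        }
        out.flush(); // nothing accumulates on the server between rows
    }
}
```

In a real servlet the container buffers and flushes the response stream on its own schedule; the essential property is simply that the handler never builds a list of the full result set.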
One consequence of the streaming architecture is that it precludes the use of EJBs to handle a request. An EJB can neither stream results back to the web tier (which would allow the request handler to write results to the client one by one) nor accept the handler's connection to the client. Because of this fundamental constraint, a streaming architecture is often overlooked in J2EE implementations, even in situations where it could significantly improve scalability.
Applicability

Use a streaming architecture when:
Consequences

Implementing a Streaming Architecture has the following benefits: