John Reichard

Managing Java Performance Across the Application Life Cycle

A lot depends on J2EE applications - your company's future, for instance

A lot depends on J2EE applications - your company's future, for instance. Ensuring the performance of complex enterprise applications built on the J2EE architecture can be difficult. To meet that challenge, managing these strategic assets needs to be looked at as a process that spans the J2EE application life cycle.

End-user experience is everything today. Poor application performance degrades end-user experience; lack of availability eliminates it altogether. External users can click away to the competition. The business suffers lost transactions and revenue, customer frustration, and a poor reputation for availability and online presence. Internal users are also affected, as employee productivity plummets.

Business reliance on live applications makes it imperative that software performance and availability problems are resolved more quickly than ever before. But why do performance problems happen in the first place? Why do applications that are tuned satisfactorily in development develop performance problems later in the application life cycle? Why is it so difficult for IT operations to solve J2EE application performance problems in production?

Developers, QA testers, and IT operations analysts have accommodated newer technologies like Java, albeit within the context of their traditional IT silos. Development now creates distributed applications with Java technology; QA tests these applications with Java testing tools; yet IT operations monitor these applications with traditional device-centric monitoring tools.

This puts IT operations at a disadvantage when dealing with J2EE applications, mainly due to the lack of visibility into the Java runtime environment. The Java virtual machine appears as a "black box" to traditional IT monitoring tools, which makes diagnosing J2EE application problems difficult. As a recent Gartner report aptly stated, there is a "continuing J2EE skill deficit among IT staff, which has posed a major challenge to the effectiveness of IT operations groups."

The communication barrier between IT silos becomes painfully obvious when a J2EE application performance problem arises in production. IT operations may be alerted to a performance problem through the help desk or an alert from a monitoring agent. Those responsible diligently check network and server status, CPU and memory utilization, and other components of the infrastructure to ensure that everything is normal. If the IT infrastructure is operating within acceptable service levels, the application itself or its runtime environment is blamed. This is often due to a lack of understanding of the J2EE application, or of how to diagnose a performance problem that has no infrastructure symptoms.

Eventually a triage meeting convenes in which stakeholders attempt to isolate the problem to a particular device or infrastructure component. Since no single IT staff member has the tools or ability to troubleshoot a multi-tiered J2EE application performance problem, the natural tendency is for each member to demonstrate that their own part of the infrastructure is performing properly. The servers are healthy, the network is fine, the database is okay, and therefore the problem must be with the application. Yet, developers and testers counter with evidence that the application has passed functional and load testing, and has been performing just fine in production until now.

Triage sessions like this occur in corporate IT departments around the globe on a regular basis. They are unproductive because each stakeholder views the problem through the lens of their own domain-specific tools and knowledge. Creating and sustaining application performance and availability in today's complex distributed computing environments calls for a life-cycle process to manage application performance, and the proper tools to solve problems quickly through collaboration across IT disciplines.

The Application Performance Life Cycle
In order to develop, deploy and maintain high-performing applications, organizations must integrate performance into their application life cycles. This requires enhancing software engineering and IT practices to include specific performance-related activities at each appropriate stage of the life cycle, and using the right tools to facilitate those activities.

Performance-related life-cycle activities should include:

  • Performance objectives and requirements definition
  • Architecture and design reviews for performance
  • Performance testing, tuning, and optimization in development
  • Test case and test suite timing analysis in QA
  • Preproduction load testing and performance baselining
  • Application service-level specification and monitoring
  • Production-level application performance management
In practice, the activities are more numerous and detailed, but the preceding list should provide a reasonable high-level understanding of the performance life-cycle approach.

During the planning phase, line-of-business stakeholders normally define application needs and objectives in terms of business processes and functions. Such objectives may sound something like this: "maximum response time of two seconds for a customer lookup transaction"; "must be able to support 1,200 internal users and up to 30,000 external users simultaneously"; or "must be able to process 85,000 point-of-sale transactions per minute from 2,100 retail locations."

Well-defined requirements are critical because they drive all subsequent phases of a project. Whether you are building a house, a road, or a J2EE application, requirements dictate the desired end result. With a house you might specify the number of floors, rooms, and windows while with a software application, well-defined performance requirements should specify throughput, response times, scalability, and so on.

During the architecture and design phase, technical performance requirements are further decomposed into elements of a design proposal. For example, if a three-tier architecture is proposed, the requirement for a two-second customer lookup transaction must be broken down into design criteria for the presentation layer, business logic, and database access.
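As an illustration, such a decomposition can be made explicit and checkable. The per-tier figures below (300/700/1000 ms) are an assumed split for the hypothetical two-second customer lookup, not a recommendation:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical decomposition of a 2-second end-to-end budget across the
// three tiers. The individual allowances are illustrative assumptions.
public class LatencyBudget {
    static final Map<String, Long> BUDGET_MS = new LinkedHashMap<>();
    static {
        BUDGET_MS.put("presentation", 300L);    // render and transmit the page
        BUDGET_MS.put("businessLogic", 700L);   // EJB / service-layer work
        BUDGET_MS.put("databaseAccess", 1000L); // query, fetch, map results
    }

    static long totalBudgetMs() {
        return BUDGET_MS.values().stream().mapToLong(Long::longValue).sum();
    }

    public static void main(String[] args) {
        // The sum of the tier budgets must not exceed the requirement.
        System.out.println("total = " + totalBudgetMs() + " ms (limit 2000 ms)");
    }
}
```

Keeping the budget in one place like this lets each tier's designers know exactly how much latency they may spend.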

While good architecture and design create a foundation for good application code, it is the work of developers to actually implement application features and functionality as specified in the product requirements and design specifications. The knowledge and expertise of the software development team is arguably one of the most critical determinants of an effective implementation. At this stage in the life cycle, bad things can happen to good applications.

Poor coding techniques can easily and transparently introduce performance-robbing side effects into the application code base. A poorly coded application can deliver 100 percent of required features along with hidden performance bottlenecks, memory utilization problems, and potentially fatal thread synchronization issues. Such problems may even slip through QA testing unnoticed, until load testing or live production use exposes their presence. By that time, the cost and business impact of finding and fixing problems in the application code is significantly higher than if they were corrected in development.
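A minimal sketch of how such a defect can hide behind correct functionality: the hypothetical handler below passes any functional test, yet a static collection grows on every request and is never evicted, a classic source of the memory utilization problems described above.

```java
import java.util.ArrayList;
import java.util.List;

// Functionally correct code with a hidden performance defect: every
// request appends to a static list that is never trimmed, so heap
// consumption grows without bound under sustained load.
public class RequestLog {
    private static final List<String> COMPLETED = new ArrayList<>();

    public static String handle(String request) {
        String result = request.toUpperCase();   // the visible "feature" works
        COMPLETED.add(request + "->" + result);  // hidden leak: unbounded growth
        return result;
    }

    public static int retained() {
        return COMPLETED.size();
    }
}
```

Functional tests see only the return value; only memory profiling under load exposes the retained entries.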

Performance tuning an application is a good development practice for ensuring that code executes quickly with no significant bottlenecks. Similarly, memory profiling an application in development is effective for ensuring correct and efficient use of memory resources. Thread synchronization analysis is a third development-specific task that can help optimize runtime performance and avoid potentially fatal thread deadlocks and race conditions.
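The thread-synchronization risk can be shown with a small sketch: an unsynchronized counter incremented from several threads loses updates because `count++` is a non-atomic read-modify-write, while the locked variant stays correct.

```java
// Illustrative race condition of the kind thread analysis can expose:
// the unsynchronized counter may lose updates under concurrent load,
// while the synchronized counter always reaches the expected total.
public class CounterRace {
    static int unsafeCount = 0;
    static int safeCount = 0;
    static final Object LOCK = new Object();

    static void runThreads(Runnable task, int threads, int iterations) {
        Thread[] pool = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            pool[i] = new Thread(() -> {
                for (int j = 0; j < iterations; j++) task.run();
            });
            pool[i].start();
        }
        try {
            for (Thread t : pool) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        runThreads(() -> unsafeCount++, 8, 100_000);  // non-atomic read-modify-write
        runThreads(() -> { synchronized (LOCK) { safeCount++; } }, 8, 100_000);
        System.out.println("unsafe=" + unsafeCount + " safe=" + safeCount);
    }
}
```

The synchronized counter is deterministic; the unsafe one is not, which is precisely why such defects slip past single-threaded functional tests.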

Platform and configuration differences between development, QA, preproduction, and deployment environments dictate the type of tuning effort that is appropriate at each stage. Without a reasonable set of guidelines, it is possible to spend too much time tuning and optimizing during the development cycle. In a three-tier J2EE application, all of the presentation-layer components can and should be tuned with precision by the time the code is feature-complete. At this stage it is appropriate to optimize any rich-client Java code, Java applets, JavaServer Pages (JSPs), presentation-layer servlets, and browser-hosted JavaScript. Applications that are exposed as Web services need additional optimization during development, particularly in the marshaling, unmarshaling, and transformation of XML.
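One common XML-handling optimization of this kind is to stop constructing a parser on every request. The sketch below, using the standard JAXP API, caches a `DocumentBuilder` per thread (the builder is not thread-safe) and calls `reset()` before reuse; the class and method names are illustrative.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Reuse the relatively expensive JAXP parser instead of rebuilding it on
// every request. DocumentBuilder is not thread-safe, so a ThreadLocal
// keeps the reuse safe under concurrent servlet traffic.
public class XmlParsing {
    private static final ThreadLocal<DocumentBuilder> BUILDER =
        ThreadLocal.withInitial(() -> {
            try {
                return DocumentBuilderFactory.newInstance().newDocumentBuilder();
            } catch (Exception e) {
                throw new IllegalStateException(e);
            }
        });

    public static String rootElement(String xml) {
        try {
            DocumentBuilder b = BUILDER.get();
            b.reset();  // clear any state left by a previous parse
            Document doc = b.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement().getTagName();
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }
}
```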

Quality Assurance
QA testers focus primarily on functional testing to ensure that application features and functions meet requirements. Whether manual or automated testing is performed, the outcome of each functional test is a pass/fail result indicating whether the code responded as the tester or test script expected.

In an automated test environment, each automated functional test is timed from start to finish. This wall clock timing approach is adequate for QA test assessment and for functional tests that are not on the performance critical path. Unfortunately, it is far too coarse and unreliable for measuring code performance on the critical path. The same test could be run several times, producing different wall-clock results for each run due to system loading, network traffic, and other environmental factors. Wall clock timing can easily generate false or misleading results if system loading or other environmental factors cause a performance-related test to execute slowly.

Further complicating this issue is the fact that a small 100-line Java program can execute thousands, or even tens of thousands, of lines of library and system code during a test. So when a functional test executes slowly based on wall clock timing, there is no effective means of identifying which class or method of application code is responsible for the problem. Coarse-level wall clock timing is simply inadequate for communicating meaningful information back to developers. Without specific information on which class, method, or line of code is responsible, troubleshooting and correcting the problem in development becomes far more difficult and time-consuming.
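The contrast can be sketched in a few lines: instead of one wall-clock figure for an entire test, each suspect operation is timed individually with `System.nanoTime`, so a slow run can be attributed to a named method. The helper and method names below are illustrative, not a particular product's API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Method-level timing sketch: each named operation is timed separately,
// so a slow test run points at a specific method rather than at the
// test as a whole.
public class MethodTimer {
    static final Map<String, Long> elapsedNanos = new LinkedHashMap<>();

    static <T> T time(String name, Supplier<T> body) {
        long start = System.nanoTime();
        try {
            return body.get();
        } finally {
            elapsedNanos.merge(name, System.nanoTime() - start, Long::sum);
        }
    }

    public static void main(String[] args) {
        String s = time("buildString", () -> "abc".repeat(1000));
        int len = time("measure", s::length);
        System.out.println("length=" + len);
        elapsedNanos.forEach((name, ns) ->
            System.out.printf("%s: %.3f ms%n", name, ns / 1_000_000.0));
    }
}
```

Per-method figures like these are still subject to system load, but unlike a single wall-clock total they at least localize where the time went.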

More Stories By John Reichard

John Reichard is a senior technical specialist for Compuware Corporation.
