With the rise of cloud computing and Web applications, monitoring and management complexity has crossed the line from the network deep into applications. Businesses that are dependent on the web (companies like Facebook, Twitter and Salesforce.com) are concerned with more than just the red light/green light mentality of the client/server days.
Monitoring has evolved from "Am I alive?" to "How well is everything running?" and "Is my performance maximized?" It follows that businesses need performance data from applications, not just infrastructure, to ensure proper delivery and function (and, down the line, good user experience).
1. Frequent Innovation and Rapid Change
Web application companies deal with change hourly. The more pieces that change and the faster the changes occur, the higher the likelihood of new problems being introduced into what is already a dynamic environment.
This can happen at even the largest shops, as witnessed recently when Google briefly flagged every site in its search results as potential malware.
The challenge becomes keeping track of all of the changes and knowing which change resulted in which improvement (or degradation) to applications. This data is crucial to ensuring application health, but keeping pace with changes and their varied impact is a complicated process that legacy monitoring tools like HP OpenView and IBM Tivoli were not designed to handle.
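One way to picture tying a change to its effect is to compare a performance metric just before and just after each change. The sketch below is only an illustration under assumed inputs: the `Change` record, the `impact_of_changes` function, the five-minute window, and the response-time samples are all hypothetical names and values, not part of any particular monitoring product.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Change:
    """A recorded change (deploy, config edit, ...). Hypothetical structure."""
    timestamp: int      # seconds since epoch
    description: str


def impact_of_changes(changes, samples, window=300):
    """Estimate how average response time moved around each change.

    samples is a list of (timestamp, response_time_ms) pairs.
    Returns (description, delta_ms) pairs, where a positive delta means
    the average response time rose after the change (a degradation).
    """
    report = []
    for change in changes:
        before = [v for t, v in samples
                  if change.timestamp - window <= t < change.timestamp]
        after = [v for t, v in samples
                 if change.timestamp <= t < change.timestamp + window]
        if not before or not after:
            continue  # not enough data on one side to compare
        report.append((change.description, mean(after) - mean(before)))
    return report


# Illustrative data: response time jumps from 100 ms to 150 ms after a deploy.
changes = [Change(1000, "deploy")]
samples = [(900, 100.0), (950, 100.0), (1000, 150.0), (1050, 150.0)]
report = impact_of_changes(changes, samples)
```

A before/after comparison like this only suggests a link between a change and a shift in performance; when several changes land inside the same window, their effects cannot be separated this simply.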
2. Specialized Technology
Web platforms that include LAMP, Java, and J2EE applications require specialized, cohesive metric collection to correlate application performance up and down the stack. Visibility into all the technologies that matter in Web application environments, from operating systems, Web servers, application servers and databases to virtualization, is critical.
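As a rough illustration of correlating performance up and down the stack, the sketch below ranks metrics from different layers by how closely each tracks application response time, using a plain Pearson correlation. The function names, the layer metric names (`db_query_ms`, `cpu_pct`), and the sample values are assumptions made for the example, and the series are assumed to be sampled at the same moments.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0
    return cov / (spread_x * spread_y)


def rank_layers(response_times, layer_metrics):
    """Order stack-layer metrics by strength of correlation with response time."""
    scores = {name: pearson(values, response_times)
              for name, values in layer_metrics.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Illustrative data: database query time rises with response time, CPU stays flat.
response_times = [100, 200, 300, 400]
layer_metrics = {
    "db_query_ms": [10, 20, 30, 40],
    "cpu_pct": [50, 50, 50, 50],
}
ranked = rank_layers(response_times, layer_metrics)
```

A high correlation points to where to look first; it does not by itself prove that a given layer caused the slowdown.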
3. Small Staff, Large Responsibility
The web ops people at any business wear many hats: monitoring 24/7, capacity planning, SLA compliance reporting, and delivering business metrics to the rest of the company, to name a few.
The aforementioned "rapid change" adds fuel to an already roaring (and hectic) fire. Shrinking budgets mean smaller web ops teams, and the fewer people there are to spread across those tasks, the harder monitoring becomes.
Finding a solution designed to fill that gap means the difference between applying a band-aid (restarting an already-dead server) and avoiding the cut in the first place (a diagnostic process to prevent or manage around a problem).
You can follow me on Twitter @daveofdoom