Keeping Performance Promises

Performance-monitoring tools are moving away from single-network environments toward multinetwork convergence. Designed to help carriers manage networks where ATM, Frame Relay, SONET, DSL, IP and other technologies coexist, these management tools offer better visibility into the networks and into customer services.

Typically, each network technology requires its own management system, explains Deepak Swamy, senior vice president of marketing and chief strategy officer at Trendium. “Carriers have OSS islands—one for Frame Relay, another for ATM, and another for IP. These separate systems don’t provide a clear, customer-centric, end-to-end view of the service. To provide this type of view, you must model two or three layers of the network rather than attempt a root cause analysis.”

Trendium and other performance-monitoring companies want to move beyond testing a single customer experience, or testing the network only to obtain average response times or quality of service. They also want to provide more performance information than is available in proprietary element management systems. Their goal is to increase visibility by creating performance-monitoring software that tests multiple layers across different technologies in a multivendor network.

These developers also want to tie such network information to the customer by mapping network performance to customer service. They are still promoting their tools as a means for carriers to be proactive, but now they are especially emphasizing the ability to identify quickly which customers are affected by a network outage.

“Service providers must know the network to enter the game, but they must know how the service is affecting the customers to stay in the game,” says Dave Gellerman, vice president of technology and corporate development at Spirent. “When an OC-12 outage occurs, for example, carriers don’t know if it’s a voice trunk between switches, ATM trunks or packet loss. It’s more important to know what the outage means in terms of services sold. Carriers need to be able to validate and tie together all the networks by looking at all the layers and touching many physical and logical points.”

As these developers take a different approach to performance monitoring, others are sticking with their tried-and-true methods. Some are focusing specifically on the transport layer to identify network problems. Equipment manufacturers, in the meantime, are trying to provide universal views across their devices to support monitoring in networks that combine IP with optical or ATM technology. Regardless of approach, all the developers are offering new features that seek to improve performance monitoring.

Addressing Optical Buildouts

The recent growth of optical technology in metro networks has led to new management tools focused solely on optical testing. (For more on quality of service and service assurance for optical networks, see “Catalyst Showcase Aiming For QoS Accounting.”) Companies see this segment as one that must support still-unproven technologies. Multidirectional DWDM and mesh networks pose more management problems than bidirectional SONET, and they require more sophisticated tools.

Lincolnshire, Ill.-based Clear recently refocused its fault and performance management products for optical networks with its Clearview suite. The suite marries static network information from the inventory database with dynamic information about optical switching paths. Clearview gathers the information about the different flows and provides a picture of the traffic’s path through each circuit.

Using this information, Clearview displays service configurations, identifies problems and affected customers, and consolidates performance details into reports that can be measured against SLAs. “Carriers want to see QoS at the ingress and egress points on these networks,” says Adan Pope, vice president of software development at Clear.

Akara is another company planning to enter the optical management space. It intends to introduce its service management platform to optical providers by the end of 2001. As a bridge between the enterprise customer and the service provider, the Akara platform provides performance monitoring for both.

“Optical networks have a fundamental need for demarcation,” says Solomon Wong, founder and vice president of corporate strategy at Akara. “Service providers must have performance monitoring and fault management that shows where the enterprise begins and where the service provider network ends. This is the point where we define the SLA, adjust latency and throughput, provide bandwidth on demand and guarantee QoS.”

Cisco is also upgrading its optical network management tools, combining IP and optical performance metrics. “Previously these networks were operated as discrete domains of management control,” explains David Kirsch, manager of business development at Cisco. “One group handled optical transport, and a separate group managed the IP network. If a fiber was cut, the transport team would respond by rerouting the services. They didn’t care what services were transported through the fiber or what customers were affected. On the IP side, though, the operators were getting calls from customers about service outages. The Cisco Information Center [CIC] allows the two teams to implicitly understand the problem at the transport layer and the services layer.”

Kirsch expects CIC to be particularly useful for carriers running virtual private networks. If a line is cut on the optical network, CIC will show which VPNs are connected, how they are affected and what types of customers are on those VPNs. It will also interact with the provisioning system to automatically reconfigure the VPN without human intervention.

However, Cisco’s management tools support only Cisco’s element management system (EMS). For carriers with multivendor devices, performance-monitoring companies are offering tools that integrate with different EMSs, as well as popular fault management systems such as Micromuse and HP OpenView.

“Almost every carrier has a heterogeneous environment. They have a wide array of elements, and an EMS from each hardware vendor,” says David Little, worldwide marketing manager at Ai Metrix. Each of these systems sends out its own alarm when an outage occurs on the network, forcing the operator to check every system for faults. Instead of this swivel-chair approach to network management, Little says, “carriers need an integrated network management tool that takes the faults from each EMS and gives a single alarm showing what services are out and which customers are affected. EMSs have no ability to map the outage to the customer.”
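The consolidation Little describes can be pictured with a small sketch. It is only an illustration, not any vendor's implementation: the element, service and customer names, the two lookup tables and the `consolidate_alarms` function are all hypothetical. It assumes the carrier already holds an inventory that maps network elements to services and services to customers.

```python
from collections import defaultdict

# Hypothetical inventory: which services ride on which network element.
SERVICES_BY_ELEMENT = {
    "oc12-chi-01": ["atm-trunk-7", "voice-trunk-3"],
    "fr-switch-22": ["fr-pvc-118"],
}

# Hypothetical mapping from each service to the customers who bought it.
CUSTOMERS_BY_SERVICE = {
    "atm-trunk-7": ["Acme Corp"],
    "voice-trunk-3": ["Acme Corp", "Globex"],
    "fr-pvc-118": ["Initech"],
}


def consolidate_alarms(alarms):
    """Fold per-EMS alarms into one summary of affected services and customers.

    `alarms` is an iterable of (ems_name, element_id) tuples. Duplicate alarms
    for the same element collapse into a single entry per service.
    """
    impacted = defaultdict(set)  # service -> set of affected customers
    for _ems_name, element_id in alarms:
        for service in SERVICES_BY_ELEMENT.get(element_id, []):
            impacted[service].update(CUSTOMERS_BY_SERVICE.get(service, []))
    return {service: sorted(customers) for service, customers in impacted.items()}


if __name__ == "__main__":
    raw_alarms = [
        ("vendor-a-ems", "oc12-chi-01"),
        ("vendor-b-ems", "fr-switch-22"),
        ("vendor-a-ems", "oc12-chi-01"),  # repeated alarm from the same element
    ]
    for service, customers in sorted(consolidate_alarms(raw_alarms).items()):
        print(f"{service}: {', '.join(customers)}")
```

The point of the design is that the operator reads one list keyed by service and customer, rather than one alarm screen per EMS.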

QoS Struggles Over IP

Because IP is connectionless, developers have spent considerable time designing performance-monitoring software for it. Many of the tools were designed to help carriers monitor and measure VPN performance, as well as manage tiered services.

“Everyone knows that best-effort performance on the Internet is not that good, and they assume the Internet is not capable of handling mission-critical services,” says Charlie Gallucci, Viewgate’s vice president of marketing.

Viewgate’s tools reside at the network operations center and on the customer premises. Its Inteligo software uses SNMP to poll network devices for specific traffic data, and then correlates the data into performance reports for the service provider and the customer. Although Viewgate was initially designed for IP, it also tracks Frame Relay and ATM traffic.

Cable and Wireless has been using Viewgate since October 1999 to provide performance reports to about 50 managed services customers. The Inteligo software polls ATM and Frame Relay switches, providing reports on network usage, discarded traffic and throughput. It lets Cable and Wireless customers know what types of services are delivered and allows for capacity planning. Cable and Wireless uses Inteligo for a real-time view of availability and for meeting SLAs.
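How polled counters become a usage report can be sketched roughly as follows. This is not Inteligo's method, which the article does not describe. It assumes each poll returns cumulative 32-bit counters for inbound octets and inbound discards (in the style of the standard interface counters `ifInOctets` and `ifInDiscards`), and that at most one counter wrap occurs between polls. The function names and dictionary keys are hypothetical.

```python
COUNTER32_MODULUS = 2**32  # a 32-bit SNMP counter wraps to zero at this value


def counter_delta(previous, current, modulus=COUNTER32_MODULUS):
    """Difference between two counter readings, allowing for a single wrap."""
    return (current - previous) % modulus


def interval_report(prev_sample, curr_sample, seconds, port_speed_bps):
    """Turn two polled samples into throughput, utilization and discard figures.

    Each sample is a dict with hypothetical keys "in_octets" and "in_discards".
    """
    octets = counter_delta(prev_sample["in_octets"], curr_sample["in_octets"])
    discards = counter_delta(prev_sample["in_discards"], curr_sample["in_discards"])
    throughput_bps = octets * 8 / seconds  # octets to bits, then per second
    return {
        "throughput_bps": throughput_bps,
        "utilization_pct": 100 * throughput_bps / port_speed_bps,
        "discards": discards,
    }


if __name__ == "__main__":
    earlier = {"in_octets": 1_000_000, "in_discards": 12}
    later = {"in_octets": 4_750_000, "in_discards": 15}
    # 300-second polling interval on a 1.544 Mbps port (illustrative values)
    print(interval_report(earlier, later, seconds=300, port_speed_bps=1_544_000))
```

Reports built this way describe each polling interval as an average, so short bursts inside an interval are smoothed out.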

Brix Networks also sells performance-monitoring tools for IP service providers that offer VPNs, VoIP, hosted applications and streaming media. Level 3 is using BrixWorx to monitor its VoIP network. Brix’s dedicated hardware appliances are distributed at certain access points of presence or nodes, or at the customer premises. These appliances actively test the traffic and verify performance in a deterministic, repeatable way, says Jamie Warter, vice president of marketing at Brix. “We can do simulated voice transactions, measure call performance and grade the infrastructure,” he says. “We can measure latency, echo and voice characteristics.”

Revamping the SLA

During the last few years, service providers have been promoting service level agreements that outline their network performance requirements. The contracts are supposed to make customers feel confident and give them a guarantee that they are receiving the service they paid for. But some industry watchers have gone so far as to describe SLA reports as a marketing tool with little substance.

SLAs have been ineffectual because capturing performance metrics and translating them into reliable, useful information is haphazard and difficult. Many of the newer performance-monitoring companies expect their tools to improve SLAs and make them more customer-specific.

“Today’s SLAs are a joke,” says Trendium’s Swamy. “They are about aggregate service, drops, peaks, and typically only cover one or two parameters. They are static and not driven by service performance.”

Viewgate’s Gallucci agrees that SLAs are simplistic, but claims that service providers are only seeking basic information, such as port availability and utilization for simple Internet access. “Providers want to report a sampling of typical end-to-end measurements, showing availability, how the customer is using the port and the typical delay from the point of presence.”
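The basic report Gallucci describes comes down to a few lines of arithmetic. The sketch below is one possible reading, not a description of any product. It assumes availability is the share of periodic probes that received a reply and that "typical delay" is the median of the successful probes. The `sla_summary` function and the 99.9 percent target in the usage example are hypothetical.

```python
from statistics import median


def sla_summary(probe_delays_ms, availability_target_pct):
    """Summarize periodic probes from the point of presence to the customer port.

    `probe_delays_ms` holds one entry per probe: the round-trip delay in
    milliseconds, or None when the probe went unanswered.
    """
    if not probe_delays_ms:
        raise ValueError("at least one probe is required")
    answered = [delay for delay in probe_delays_ms if delay is not None]
    availability_pct = 100 * len(answered) / len(probe_delays_ms)
    return {
        "availability_pct": availability_pct,
        "typical_delay_ms": median(answered) if answered else None,
        "meets_target": availability_pct >= availability_target_pct,
    }


if __name__ == "__main__":
    probes = [20, 22, None, 24]  # one unanswered probe out of four
    print(sla_summary(probes, availability_target_pct=99.9))
```

A report this simple illustrates the critics' complaint: it says whether the port answered and how long it took, but nothing about how the customer's applications performed.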

Opinions are mixed as to whether the reports will evolve and include more useful metrics. Spirent’s Gellerman expects that they will become more application-specific. “In the past it has been a challenge to associate the application behavior with certain metrics, so service providers chose to measure availability or throughput,” he says. “In the future, SLAs will be driven by what applications the customer is using.”

Redesigning SLAs to be more customer-centric is an ongoing quest. Quallaby’s Mike Combs, director of marketing, suggests that the agreements be written from the customer’s point of view. “They will be based on common sense,” he says. “Availability, throughput and latency are network terms. SLAs need to be geared toward how the customer views the performance he is receiving.”

Others expect SLAs to fall out of favor or have only limited usage. “Service assurance shouldn’t have anything to do with reports. Customers aren’t looking for a more explicit or detailed agreement; they want a higher quality of service delivered. Customers want the assurance that what they bought was what they received,” says Andrew Hurrell, director of product management and service management of Acterna’s software division. “Service assurance is not about fancier reports or thicker contracts. Service assurance is about the ability to consistently deliver specialized service.”

With these calls to strengthen SLAs and improve overall service, this new set of performance-monitoring companies wants to garner attention from upstarts and traditional providers. The developers are banking on carriers’ directives to cut costs and enrich current services, and the sector is seeing success in the venture capital community (see Table 1).

These successes show that even in slow economic times, keeping track of services and performance levels is high on the priority list for customers and service providers.