Billing World and OSS Today Magazine

Finding Common Ground to Resolve Outages

Susana Schwartz
12/26/2007

As convergent services grow more complex, the incidence of outages is expected to rise. In 2007, several high-profile outages proved difficult to resolve because the stakeholders involved in delivering services lacked consistent language and data.

Persistent problems with network outages and outage reporting drove AT&T, Alcatel-Lucent, Qwest, Sprint, T-Mobile, U.S. Cellular and Verizon to participate in the ATIS Network Reliability Steering Committee (NRSC). After 18 months of concentrated work, the NRSC has released its Outage Classification Standard (https://www.atis.org/docstore/default.aspx). The standard is designed to give carriers, vendors and the FCC a template for uniformity in outage data and reporting models. The hope is to resolve the inconsistencies that plague network analysis and reporting.

The project was driven initially by AT&T, whose principals realized consistency was paramount after the company’s divestiture: “We realized a ‘common view’ of processes and data among disparate pieces of the company would be necessary to ensure network reliability,” says Archie McCain, chairman of the NRSC and director of Six Sigma programs for AT&T.

AT&T and the other NRSC members recognized that outage analysts and coordinators grappled with different protocols and methodologies for notifications, updates and modifications to outages. The disparity arose because service providers, government agencies and suppliers each had their own methods and systems for classifying outages. As a result, most outages had to be re-classified or converted as they worked their way through carriers’ systems and on to supplier and/or government systems.

“We all needed outages to be classified just once, even in instances where changes and modifications had to be made to outage information,” says Jay Bennett, principal scientist at Telcordia, one of the lead editors of the standard’s documentation for enumerating possible outage classifications.

The ATIS standard was designed to classify outages within three categories: cause of outage failure (e.g., hardware, software, cable, wireless transmission or capacity); reason for outage (including a primary and secondary descriptive protocol); and responsibility for outage (such as acts of nature, reporting service provider, government or vendor).

The creation of these broader categories was meant to ensure that the classification system would be applicable to all network types, and to a wide range of statistical analysis and outage causes.

“The template helps answer questions around ‘what failed,’ ‘why’ and ‘who’ is responsible,” says McCain, noting there are lists and sub-lists under each of those questions.

“You can have primary reasons and then secondary reasons for an outage. A primary may be damage, design problems or engineering issues. Then within that, there can be more granular detail such as whether it was an accident, an intentional act, a documentation problem, or a supervision problem,” explains McCain.

Once the “what” and “why” are known, it’s easier to understand the “who,” such as whether it was the service provider, the utility company, or the individual that was responsible.

The ATIS standard provides “guidance” through documentation of certain outage examples and ensuing classifications for the “what,” “why” and “who.” (See table, “Examples of Application to Various Outage Scenarios.”)

The ATIS system distinguishes primary and secondary reasons for outages. For example, an invalid pointer added to an office retrofit tape indicates that a trunk group may experience a failure. The “what failed” question would yield an answer of “software,” under which a primary reason of “design” could be determined. That would point to the vendor as the source to which the carrier must go for resolution. The process would also lead to a sub-group, or “secondary,” issue of “accident” so that further rules about actions to take can be implemented.
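The retrofit-tape example can be sketched as a small record with one field per question. This is only an illustration: the type and field names are hypothetical, the category values are limited to the ones named in this article, and the standard's actual lists and sub-lists are not reproduced here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class WhatFailed(Enum):
    # Failure causes named in the article; the standard's full list may differ.
    HARDWARE = "hardware"
    SOFTWARE = "software"
    CABLE = "cable"
    WIRELESS_TRANSMISSION = "wireless transmission"
    CAPACITY = "capacity"


class Responsibility(Enum):
    # Responsible parties named in the article; the standard's full list may differ.
    ACT_OF_NATURE = "act of nature"
    REPORTING_SERVICE_PROVIDER = "reporting service provider"
    GOVERNMENT = "government"
    VENDOR = "vendor"


@dataclass(frozen=True)
class OutageClassification:
    """Hypothetical record answering 'what failed', 'why' and 'who'."""
    what_failed: WhatFailed
    primary_reason: str              # e.g. "design", "damage"
    secondary_reason: Optional[str]  # e.g. "accident"; None if not determined
    responsibility: Responsibility

    def summary(self) -> str:
        reason = self.primary_reason
        if self.secondary_reason:
            reason += f" / {self.secondary_reason}"
        return (
            f"what={self.what_failed.value}; "
            f"why={reason}; "
            f"who={self.responsibility.value}"
        )


# The invalid-pointer example from the text, classified once on all three axes.
retrofit_tape_outage = OutageClassification(
    what_failed=WhatFailed.SOFTWARE,
    primary_reason="design",
    secondary_reason="accident",
    responsibility=Responsibility.VENDOR,
)
print(retrofit_tape_outage.summary())
```

Keeping the three answers in one immutable record reflects the goal quoted earlier of classifying an outage just once, rather than converting it as it moves between carrier, supplier and government systems.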

THE ‘DIRTY WORK’

Having a standard template for mapping issues and resolutions is a first step, but the next challenge is writing the software to implement systems in databases. “To do this effectively, service providers may have to establish a ‘dual system’ in their databases so that they can classify the ‘old way’ while starting to classify the ‘new way,’” concedes McCain.

Running the traditional and new methods side by side for a period of time would let data build up so that service providers could continue operational reporting without disruption. “You don’t want to do a ‘clean sweep’ because you still need more than a few days’ worth of data to report on things like card failures or digital switches,” notes Bennett, who believes “weaning” off of older systems is a better approach to implementation.

He notes some companies may choose to use a phased approach. “Some may implement the software for re-classification of a year’s worth of records to start off, so they are mapping the old classification into the new,” says Bennett.
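A rough sketch of that kind of re-classification pass follows. Everything specific in it is an assumption made for illustration: the legacy codes, the mapping table, the responsibility assigned to each code, and the record layout are hypothetical and do not come from the standard or from any carrier's system. Each record keeps its old code and gains a new classification alongside it, in the spirit of the “dual system” described above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# New-style classification as (what failed, primary reason, responsibility).
NewClass = Tuple[str, str, str]

# Hypothetical legacy codes mapped to new-style classifications.
LEGACY_TO_NEW: Dict[str, NewClass] = {
    "SW-DSGN": ("software", "design", "vendor"),
    "CBL-CUT": ("cable", "damage", "reporting service provider"),
}


@dataclass
class OutageRecord:
    record_id: int
    legacy_code: str                      # the "old way" stays on the record
    new_class: Optional[NewClass] = None  # the "new way" is filled in alongside


def reclassify(records: List[OutageRecord]) -> List[OutageRecord]:
    """Fill in new_class where a mapping exists.

    Returns the records with no known mapping so they can be reviewed by hand.
    """
    unmapped = []
    for record in records:
        mapped = LEGACY_TO_NEW.get(record.legacy_code)
        if mapped is None:
            unmapped.append(record)
        else:
            record.new_class = mapped
    return unmapped


records = [
    OutageRecord(1, "SW-DSGN"),
    OutageRecord(2, "CBL-CUT"),
    OutageRecord(3, "PWR-UNK"),  # no mapping defined: left for manual review
]
needs_review = reclassify(records)
print([r.record_id for r in needs_review])
```

Because the legacy code is never overwritten, existing reports can keep running on the old field while the new field accumulates history.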

The approach to conversion will depend on the methodologies and systems used in their original approaches.

A standard to address those questions gives service providers a chance to prioritize worthwhile outage issues, as opposed to exhausting resources on those where improvements cannot be made. “Rather than throw resources at issues where further improvements are difficult to attain, the template will help identify places where there is room for improvement,” says McCain. “The outages caused by human error may require more attention to procedure than those which are more straightforward, like a cable cut.”

With the Outage Classification standard, he believes service providers can target areas where rules are lacking.

Alcatel-Lucent www.alcatel-lucent.com/wps/portal
ATIS www.atis.org
FCC www.fcc.gov
Qwest Communications Inc. www.qwest.com
U.S. Cellular www.uscc.com/uscellular/SilverStream/Pages/uscellular.html 
Verizon Communications Inc. www.verizon.com

