Traditional carriers and upstart competitors may seem to have very little in common when it comes to infrastructure or business practices, but one thing they are all experiencing is a tremendous increase in call volume. As residences order multiple phone lines and as businesses generate more and more calls, the sheer number of transmissions that need to be tracked and billed for is increasing exponentially.
Each of these phone connections generates a call detail record (CDR), which contains vital information about the transmission such as start and end times, duration of call, and originating and terminating numbers. CDRs for circuit-switched voice are nothing new, and all providers have their own method of storage. And while the volume of traditional CDRs is growing rapidly, many providers are also generating (or are poised to begin generating) CDRs for IP-based services, also referred to as IPDRs. A recent International Data Corp. (IDC) research report estimates that the worldwide IP telephony market will reach 2.68 billion minutes in 1999. The report further predicts that next year that number will triple, and by 2004 IP telephony minutes will reach 135 billion.
Currently, IP services are generally billed at a flat rate rather than by usage; however, many predict that this will change. IP services such as voice and video could be billed on a whole set of parameters other than simply duration and distance of the call. And the real-time nature of billing for IP services makes it that much more difficult not only to capture call records but also to make sure the support systems that need that information can readily get to it.
During the process of collecting data from carrier switches, moving it through support systems and ultimately storing it, a number of issues and concerns often come up that will only become more intense as call volume continues to increase.
These include problems with redundancy of data. Because a number of back-end support systems rely on CDR information, there is a tendency to create multiple copies of these records. Not only does this add to the data bottleneck, but it can also be confusing when it comes time to figure out which customer used what services.
Another major issue is scalability. Some larger telcos are handling millions of call records daily and need to develop storage solutions to accommodate this influx of data, whose volume in many cases is reaching the terabyte level.
Data Collection
While a growing number of traditional telephony CDRs has to be gathered and stored, the use of IP to deliver voice, video, and other services will only exacerbate the data management issue.
For providers of IP telephony services, the floodgates have already opened, and they face not only having to bill for these services but also working out how to capture and store these specialized detail records. “We’re approaching 50 million minutes a month, all of which are billed and recorded,” says David Greenblatt, COO of IP telephony service provider Net2Phone. In September 1999 IDC named Net2Phone the leader in IP telephony services, with 39.4 percent of the market.
Those 50 million minutes translate to about 7 million CDRs, estimating each call is about 7 minutes long, according to Greenblatt. With this heavy volume of records, Net2Phone had to develop its own system for retrieving CDRs and moving them through its back-end organization. “If you go to a legacy software vendor, they will sell you a solution that can support 250,000 users. That’s fine for a large enterprise, which involves a finite number of users, but when you’re dealing with the Internet, the number of users is technically infinite.” Greenblatt adds that this kind of complexity actually doubled the work necessary for designing the system Net2Phone uses for capturing and processing CDRs.
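The record count quoted above is a simple division of total minutes by average call length. A minimal sketch of that arithmetic, assuming one CDR per call (the function name is illustrative, not anything Net2Phone uses):

```python
# Figures quoted in the article; the 7-minute average is Greenblatt's estimate.
MINUTES_PER_MONTH = 50_000_000
AVG_CALL_MINUTES = 7


def estimate_cdr_count(total_minutes: float, avg_call_minutes: float) -> float:
    """Estimate the number of call records, assuming one CDR per call."""
    return total_minutes / avg_call_minutes


monthly_cdrs = estimate_cdr_count(MINUTES_PER_MONTH, AVG_CALL_MINUTES)
print(f"{monthly_cdrs / 1e6:.1f} million CDRs per month")
```

A shorter average call pushes the record count up quickly: at 3 minutes per call, the same traffic would produce more than 16 million records a month.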
For other telcos, the concern regarding IP services is not so much how to extract the CDRs from a switch or router but how to do it quickly and without losing information along the way.
In the case of regular, PSTN-based phone calls, CDRs might accumulate on a carrier’s switch before being moved in batches at predefined intervals to a database and then to the rating and billing system. Traditional switches usually have a lot of RAM, as well as local storage space. Therefore they can record all relevant metrics and then dump the data into storage. Once the switch reaches a predefined threshold, it either pushes the data out in a batch to the mediation or billing system, or holds onto the information until a mediation platform pulls it out.
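The push variant of this batch model can be sketched in a few lines. This is only an illustration under stated assumptions: the class name, the record fields and the threshold of three records are hypothetical, and a real switch would persist its buffer to local storage rather than hold it in memory.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class SwitchBuffer:
    """Accumulates CDRs and pushes them downstream in batches."""
    threshold: int                       # batch size that triggers a push
    push: Callable[[List[dict]], None]   # stand-in for the mediation or billing feed
    _records: List[dict] = field(default_factory=list)

    def record(self, cdr: dict) -> None:
        """Store one CDR and push a batch once the threshold is reached."""
        self._records.append(cdr)
        if len(self._records) >= self.threshold:
            self.flush()

    def flush(self) -> None:
        """Hand the accumulated records downstream and empty the buffer."""
        if self._records:
            batch, self._records = self._records, []
            self.push(batch)

    def pending(self) -> int:
        """Number of records still waiting on the switch."""
        return len(self._records)


received: List[List[dict]] = []          # what the downstream system has seen
buffer = SwitchBuffer(threshold=3, push=received.append)
for call_id in range(7):
    buffer.record({"call_id": call_id, "duration_s": 60 + call_id})

print(len(received), "batches pushed;", buffer.pending(), "record still on the switch")
```

In the pull variant, the mediation platform would call something like `flush()` on its own schedule instead of the switch triggering it.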
That batch method isn’t as effective in an Internet model, mostly because CDRs for IP services stay on a switch or router for only a few seconds and therefore need to be captured quickly. “On the Internet, everything is real time,” Greenblatt says. “You can’t have delayed batch processing or wait until the end of the day to get the information off the switch.”
Instead, service providers need to continually grab the raw data from a router and bring it into their system without dropping it during the process. “In most cases the data is pushed from the router when a session ends and goes to some kind of local data store,” says Vikash Varma, worldwide marketing and business development manager at Hewlett-Packard’s Internet Services Infrastructure Organization. “You don’t want to do too much processing, but rather simply get the information and make sure nothing gets lost.”
After raw IPDRs are captured by the local data store, they usually go immediately to the mediation platform. Typically the data is only rudimentary. It could contain detailed metrics such as duration, byte count, and the like. But since most customers will use dynamically allocated IP addresses, the raw CDR may not contain information about the user. To find out which user had access to a particular IP address and at what time, the mediation platform would have to communicate with a RADIUS or other database. It could then link up that information to the metrics about the session.
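The correlation step described above might look something like the following sketch. It assumes the RADIUS accounting data has already been exported as a list of address leases; the record layouts and field names are hypothetical and do not reflect any particular mediation product.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AddressLease:
    """One row of a hypothetical RADIUS accounting export."""
    user: str
    ip: str
    start: int  # epoch seconds, inclusive
    end: int    # epoch seconds, exclusive


@dataclass(frozen=True)
class RawIpdr:
    """Raw session metrics as pushed from a router; no user identity."""
    ip: str
    started_at: int
    duration_s: int
    byte_count: int


def find_user(leases: List[AddressLease], ip: str, at: int) -> Optional[str]:
    """Return the user who held the given IP address at the given time."""
    for lease in leases:
        if lease.ip == ip and lease.start <= at < lease.end:
            return lease.user
    return None


def to_billable(ipdr: RawIpdr, leases: List[AddressLease]) -> Optional[dict]:
    """Attach a user to the raw metrics, or return None if no lease matches."""
    user = find_user(leases, ipdr.ip, ipdr.started_at)
    if user is None:
        return None
    return {"user": user, "duration_s": ipdr.duration_s, "byte_count": ipdr.byte_count}


# The same dynamic address is handed to two users in succession.
leases = [
    AddressLease("alice", "10.0.0.5", 1000, 2000),
    AddressLease("bob", "10.0.0.5", 2000, 3000),
]
billable = to_billable(RawIpdr("10.0.0.5", 2100, 300, 48_000), leases)
unmatched = to_billable(RawIpdr("10.0.0.9", 2100, 60, 1_200), leases)
print(billable, unmatched)
```

Records that cannot be matched to a user are returned as `None` here; in practice a provider would more likely park them for later reconciliation than discard them.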
The end result of this processing is a billable data record that can be sent to the billing system.
The migration from batch processing to a more real-time model isn’t attractive only to IP-only service providers; telcos of all types are examining this much more closely. “Everyone who isn’t already implementing a real-time CDR processing system is at least looking into it,” says Brian Agnew, lead architect for Sun Microsystems’ OSS and BSS group.
Due to the inherent drawbacks of using a batch system for moving Internet CDRs, telcos that are offering usage-based services such as voice, video or fax over IP are having to look at a real-time billing model. (For more on real-time billing for IP services, see “Batch Systems for Internet Billing,” by Dr. Matthew Lucas and Dave Lubuda, Billing World, January 1999.)
A batch processing model does afford several advantages, since each step along the route is independent of the others. This includes the ability to reload CDRs to the rating server in the event of a connection failure, ensuring that CDRs don’t get dropped along the way and maintaining the integrity of the data collection process. Also CDR formats in use by different switch manufacturers can be equalized to a common format. However, the delay inherent to batch processing means several things can slip through the cracks, such as timely fraud detection, on-the-fly service provisioning and instant authorization of accounts and transactions.
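Equalizing switch formats, one of the batch-model advantages noted above, amounts to writing one small parser per vendor that emits the same common record. The two vendor layouts below are invented for illustration (a pipe-delimited line and a dictionary with ISO 8601 timestamps); real switch formats differ.

```python
from datetime import datetime
from typing import Dict


def parse_vendor_a(line: str) -> Dict[str, object]:
    """Hypothetical layout: origin|destination|start_epoch|duration_s"""
    origin, destination, start_epoch, duration_s = line.strip().split("|")
    return {
        "origin": origin,
        "destination": destination,
        "start": int(start_epoch),
        "duration_s": int(duration_s),
    }


def parse_vendor_b(record: Dict[str, str]) -> Dict[str, object]:
    """Hypothetical layout: calling/called numbers plus ISO 8601 begin and finish."""
    begin = datetime.fromisoformat(record["begin"])
    finish = datetime.fromisoformat(record["finish"])
    return {
        "origin": record["calling"],
        "destination": record["called"],
        "start": int(begin.timestamp()),
        "duration_s": int((finish - begin).total_seconds()),
    }


# The same 7-minute call as reported by two different (made-up) switch types.
from_a = parse_vendor_a("2125550100|3105550199|941457600|420\n")
from_b = parse_vendor_b({
    "calling": "2125550100",
    "called": "3105550199",
    "begin": "1999-11-01T12:00:00+00:00",
    "finish": "1999-11-01T12:07:00+00:00",
})
print(from_a == from_b)
```

Once every record has the same shape, the rating server needs only one input format, no matter how many switch vendors feed it.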
For these and other reasons, carriers are increasingly considering a real-time model. “The trend is definitely going toward real-time collection from switches,” says Yancy Oshita, director of communications industry marketing at data warehouse vendor NCR. But rather than only capturing CDRs in real time, Oshita adds that other information—such as customer events—can also be extracted from a switch in real time: “There is the need to capture CDRs, but carriers also want to know about metrics that might make a customer a good candidate for churn.” He adds that if a customer’s usage level drops off by 30 percent, for example, the ability to know about that immediately makes it that much easier to intervene through a customer service representative and possibly keep that particular user.
Agnew says that if a mediation device is able to analyze CDRs in real time, it can identify usage patterns through the use of algorithms. Then, depending on how the system has been designed, it can forward alerts to customer service or another part of the organization.
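A usage-drop check of the kind Oshita and Agnew describe can be very simple. The sketch below flags any customer whose usage fell by at least 30 percent from one period to the next; the 30 percent figure comes from Oshita's example, while the function names, customer IDs and minute counts are made up for illustration.

```python
from typing import Dict, List, Tuple


def usage_drop(previous_minutes: float, current_minutes: float) -> float:
    """Fractional drop in usage between two periods (0.0 if there is no baseline)."""
    if previous_minutes <= 0:
        return 0.0
    return (previous_minutes - current_minutes) / previous_minutes


def churn_alerts(usage: Dict[str, Tuple[float, float]], threshold: float = 0.30) -> List[str]:
    """Return the customers whose usage fell by at least the threshold."""
    return sorted(
        customer
        for customer, (previous, current) in usage.items()
        if usage_drop(previous, current) >= threshold
    )


# (previous period minutes, current period minutes) per customer
usage = {
    "cust-1": (400, 390),  # roughly flat
    "cust-2": (300, 200),  # down by a third
    "cust-3": (0, 50),     # new customer, no baseline
    "cust-4": (100, 60),   # down 40 percent
}
alerts = churn_alerts(usage)
print(alerts)
```

In a real-time deployment the mediation device would update these totals as each CDR arrives and forward the alert list to customer service, rather than comparing completed periods after the fact.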
In addition to the complications of simply collecting CDRs in real time, another difficulty involves the formatting of IP-based call records. While mediation devices are able to make sense of traditional CDRs regardless of whose switch they come from, IP detail records are another matter entirely. Currently, no standard exists to define a common way of representing IP call detail records. IPDR.Org has begun working on such a specification, but until that takes shape early next year, telcos have to write their own algorithms.
IPDRs generally contain information not normally found in traditional CDRs. This might include quality of service levels of a particular session or how many packets went back and forth. Telcos configure their mediation devices to determine the necessary information and how to send that information to the billing system.
Although IPDRs do include unique metrics, the actual collection of these records from switches and routers is basically the same process used for moving traditional CDRs into a carrier’s back-end system. (However, as mentioned earlier, since IP-based CDRs stay on a switch only for seconds, they need to be continuously extracted by the telco.)
Storage Space
After CDRs are collected, rated and billed, the archived data can be used to reconcile accounts and is indispensable to many of a carrier’s CRM strategies. Also, regulations require these records to be kept for several years.
The challenge of making this data useful for many purposes lies in setting up a system to not only store the CDRs, but to make them easily accessible in a format that means something. While they’re extracting and processing CDRs, telcos also have to think about managing them as efficiently as possible.
When CDRs leave a switch and enter the mediation and rating process, most carriers will want to do a backup of the information as it enters the system. It’s no secret that CDRs do get lost, and as a result carriers lose money. But while a real-time backup on incoming CDRs is a good way to cover any potential losses, often the backup process simply can’t keep up with the constant influx of data. This type of backup can’t be put off until 2 a.m. each day, due to the ongoing generation of CDRs and the need to immediately send them through the billing and rating system and then on to other parts of the carrier’s support organization.
So some carriers may be experiencing bottlenecks with their network-attached storage servers. These might be attached via a SCSI interface, which can deliver top speeds of 160MBytes/sec, but in some cases congestion might still occur. “Telcos can’t back up as fast as they would like because of the batch nature of the billing process,” says Kay Benaroch, manager of telecommunications field marketing programs at enterprise storage vendor EMC. “There just aren’t enough hours in the day to do backups.”
After being backed up and going through a billing and rating system, CDR information needs to be replicated and passed along to fraud detection systems and customer service. In fact, it’s not unusual for a provider to have at least a half-dozen copies of a CDR for transmission to various parts of the company. This redundancy can lead to data glut unless it’s properly managed.
One way to do that is through an enterprise storage solution. This could come in the form of servers running storage software that are attached to the multiple servers that collect and process CDRs. After CDRs are extracted from a switch or router and before they hit the mediation system, they could be temporarily stored within an enterprise storage server. The server could then create a mirror or backup of the data.
In another scenario, CDR data could be passed directly from a switch to the mediation system. The data received by the mediation device could be backed up and kept on a storage server. “I found this to be the overriding theme of telco customers using our equipment for billing and customer care—particularly in the areas of storing and replicating CDRs for other areas of the organization,” Benaroch says.
To decrease the likelihood of bottlenecks going from switch or mediation device to storage server, some telcos may consider designing a storage area network (SAN). A SAN is a subset of a company’s network that is devoted to storage. It might contain servers, tape libraries and RAID arrays.
An important feature of a SAN is the many-to-many connection it provides among all resources on the network. In addition, this type of connection means that in the event of a failure of a particular device, alternate paths can still allow access. The high-speed interconnect of choice in most SANs is Fibre Channel, which currently supports speeds of up to 1Gbit/sec.
Data Warehouse
Another method of archive management for CDRs is to use a data warehouse. Some large carriers are completing millions of phone calls each day, resulting in multiple terabytes of data that need to be stored.
After CDRs—whether representing PSTN or IP—make their way off a switch and through the telco’s billing system, the task of managing those records is highly complex. Many providers use relational databases to handle the task of storing CDRs, because they can provide access to large amounts of data with the intelligence to sort data according to the user’s preference. However, they sometimes fall short when applied to a very specific task such as managing thousands or even millions of CDRs. “The main goal [of databases] is not to reduce the amount of storage you need, so much as easy accessibility, reporting and analysis,” says Net2Phone’s Greenblatt. “But the more you compress and compact, the slower the retrieval, and relational databases are not known for their economy and efficiency of storage.”
Another drawback—for the largest service providers—is that these providers often have numerous relational databases to store call detail records. They may be using different databases to store different types of information.
The problem isn’t so much that different pieces of data are floating around a system, but that this information is usually managed by disparate databases without any cohesion. Most telcos and service providers are likely to have combinations of several databases, Unix servers and even mainframes holding onto their important data. More often than not, these systems can’t easily communicate with each other or share information. And in many cases, this lack of integration can lead to unnecessary redundancy.
For example, several copies of each CDR might be floating around so that each component within an OSS infrastructure can process the information in its own way. At the same time these different systems, such as fraud detection, reporting and customer care, have their own view of the customer that might be separate and distinct from each other. These aren’t things that come up at meetings—they usually exist as rules that are not implemented in the billing. And many times they are manual so they can’t easily be implemented in the billing system. “Right now the telcos have stovepipes of data that they can’t marry together,” adds EMC’s Benaroch. “They might have to go through 20 or 30 screens to see all the services a business or household has purchased.”
Telcos also need to streamline the process of loading data into the warehouse, so it will be easily accessible to anyone who needs it later. “About two-thirds of the warehousing process has to do specifically with extracting data from the OSS infrastructure—such as a billing system—cleansing, formatting and then loading it into the warehouse,” Oshita says. “The data in the warehouse is only as good as how it is handled prior to loading.”
One way to load CDRs, especially ones based on IP services, is directly from a mediation platform. Once IPDRs go through the billing system, they are generally in a format that’s optimized for billing, but not for other uses. Instead, after IP call detail information has been processed by the mediation platform to include not just the raw metrics but also the correlation to a user, some providers are pushing those records directly to a warehouse, says HP’s Varma.
“The data will be needed by different departments and is not strictly for billing or other systems. By sending information directly from mediation, you’re not touching the billing system and overloading it. Instead, you’re drawing information from as close as possible to the source, and you can reformat the data there instead of waiting until further downstream, when it will be more difficult,” Varma says.
IP telephony provider Net2Phone utilizes a homegrown data warehouse to store its CDRs. The system is based on Sun Solaris servers and Oracle database software, but the warehouse design was done in-house. “It took us many months to build the software to dynamically create the warehouse,” says Greenblatt. “It can’t just be raw copies of your data; it should be in a format and structure so you can quickly get a consolidated view.”
Besides taking the time to develop the warehouse, Greenblatt emphasizes the need for fault tolerance within the infrastructure. “You can’t add [fault tolerance] later; it has to be built on day one,” he says. “If there aren’t multiple systems tracking a transmission, or if any processor goes down, you’ll lose all the data up to that point.”
One of the points of creating a data warehouse is to keep all information under one roof, so to speak. This move away from a distributed environment should help reduce redundancy issues. If all CDRs go to a single destination after going through the billing system, the chance of having multiple copies in storage decreases.
Telcos do regularly generate multiple copies of CDRs for various OSS components, but when it comes time to load CDRs into the warehouse, care must be taken not to get bogged down with duplicates. Redundancy of CDRs is generally addressed during data transformation, which occurs when pieces of data from the OSS system are filtered through algorithms and programs before going to the warehouse. A good chunk of the transformation process actually works to eliminate redundancy.
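The redundancy-removal part of that transformation can be as simple as keeping the first copy of each call and discarding the rest. This sketch assumes a call is uniquely identified by its originating number, terminating number and start time; that key and the sample records are illustrative assumptions, not a rule taken from any carrier.

```python
from typing import Iterable, List


def deduplicate(cdrs: Iterable[dict]) -> List[dict]:
    """Keep the first copy of each call, preserving arrival order."""
    seen = set()
    unique = []
    for cdr in cdrs:
        # Assumed identity of a call: who called whom, and when it started.
        key = (cdr["origin"], cdr["destination"], cdr["start"])
        if key not in seen:
            seen.add(key)
            unique.append(cdr)
    return unique


# Copies of the same call arriving from different OSS components.
incoming = [
    {"origin": "2125550100", "destination": "3105550199", "start": 1000, "source": "billing"},
    {"origin": "2125550100", "destination": "3105550199", "start": 1000, "source": "fraud"},
    {"origin": "2125550100", "destination": "4155550123", "start": 1600, "source": "billing"},
    {"origin": "2125550100", "destination": "3105550199", "start": 1000, "source": "care"},
]
loaded = deduplicate(incoming)
print(len(incoming), "records in,", len(loaded), "loaded")
```

Which copy to keep is a design decision; here the first to arrive wins, but a warehouse load could just as well prefer the copy from the most authoritative system.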
By keeping everything in a single repository, carriers are going to run into scalability concerns. Also, the flurry of mergers and acquisitions in the carrier space means that all of a sudden, one provider may start having to deal with a marked increase in CDRs and other customer information. “As you acquire new business units and start to grow and extend, you still need to be able to access that information on a customer-by-customer basis,” Oshita says.
Most telcos will store 3 to 4 months’ worth of CDRs on site, with archived volumes stored on tape or optical disk at other locations. Even just a few months’ worth of CDRs can easily add up to multiple terabytes of data in a warehouse or an enterprise storage system.
And once telcos have created this enormous storehouse, they need to also consider how non-technical employees—such as those from marketing, sales or customer service—will access the data and put it to use in their own business units.