The IETF’s Session Initiation Protocol (SIP) is gaining momentum alongside increased interest in Internet telephony—which is expected to boom with VoIP and other services that integrate multimedia with email, buddy lists, instant messaging and online games.
By definition, SIP is a standard protocol for initiating interactive user sessions involving multimedia, such as video, voice, chat, gaming and virtual reality.
Unlike H.323 and other vertically integrated protocols, SIP is designed to expand to an infinite number of channels, using far fewer messages and less bandwidth to set up and tear down calls. As a result, carriers are expected to roll out the SIP infrastructure and trunking for sending data and voice packets across backbones.
What It Offers
What carriers are recognizing is SIP’s developer-friendly nature. As a protocol modeled after HTTP, SIP leverages approaches and techniques used by Web developers for years; hence, it is expected to spur the development of Internet, Web and email applications for wireless, wireline and PC-based services.
SIP uses URLs and email-style routing mechanisms and supports Multipurpose Internet Mail Extensions (MIME), so its messages can carry content such as images, MP3s and Java applets. Using these mechanisms, it can redirect callers to multiple phones in much the same way that users are redirected on Web pages.
“As a result, you will be able to make a call from your desk phone, and in real time switch the call to your cell phone or handheld device if you have to leave the desk and walk outside the building. You could carry on the call either via voice or IM, or video,” says Trefor Davies, head of next-generation network strategy and director of standards-based activities at Mitel.
How It Works
In SIP, addresses are URLs, such as sip:????@billingworld.com. To initiate a session, the caller, known as the user agent client (UAC), sends a request (an INVITE) addressed to the person to whom the caller wants to speak. (SIP also has been upgraded to carry phone numbers through a new telephone URL [tel:5551212, for example], which can be punched in through telephones possessing 12-key entry.) SIP then takes dynamic location information into account in order to locate users, whether mobile or sedentary on a home- or work-based PC, or an IP desk phone. Once the user is located, SIP delivers a description of the session to which the user is invited in a MIME body that describes the content, whether HTML, audio or video.
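As a rough illustration of the request described above, the following Python sketch assembles a minimal INVITE as text. The addresses (alice@example.com, bob@example.com), the branch and tag values and the helper name build_invite are hypothetical, and the header layout follows general SIP conventions rather than anything spelled out in this article.

```python
def build_invite(caller: str, callee: str, call_id: str, body: str = "") -> str:
    """Assemble a minimal SIP INVITE as text (illustrative sketch only)."""
    body_len = len(body.encode("utf-8"))
    lines = [
        f"INVITE sip:{callee} SIP/2.0",            # request line: method, target URL, version
        "Via: SIP/2.0/UDP client.example.com;branch=z9hG4bK-demo1",  # hypothetical host
        "Max-Forwards: 70",
        f"From: <sip:{caller}>;tag=demo-tag",       # hypothetical tag
        f"To: <sip:{callee}>",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        f"Contact: <sip:{caller}>",
        "Content-Type: application/sdp",            # the MIME type of the session description
        f"Content-Length: {body_len}",
    ]
    # Headers and body are separated by an empty line, as in HTTP.
    return "\r\n".join(lines) + "\r\n\r\n" + body


request = build_invite("alice@example.com", "bob@example.com", "demo-call-1")
```

The resemblance to an HTTP request (a request line, headers, a blank line, then a typed body) is what makes the format familiar to Web developers.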
A feature called “forking” enables a proxy to receive a single INVITE request and send out more than one INVITE request to different addresses, thus allowing a session initiation attempt to reach multiple locations in order to locate the desired user.
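Forking can be pictured with a short Python sketch. The location table, the addresses and the names Invite and fork are assumptions made for illustration; a real proxy would consult a location service and manage the resulting branches and their responses.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Invite:
    target: str   # address the INVITE is sent to
    call_id: str  # shared by every branch of the same attempt


# Hypothetical location table: one public address, several registered devices.
LOCATIONS: Dict[str, List[str]] = {
    "sip:bob@example.com": [
        "sip:bob@desk.example.com",
        "sip:bob@mobile.example.com",
    ],
}


def fork(invite: Invite, locations: Dict[str, List[str]]) -> List[Invite]:
    """Turn one incoming INVITE into one outgoing INVITE per known location."""
    contacts = locations.get(invite.target, [])
    return [Invite(target=contact, call_id=invite.call_id) for contact in contacts]


branches = fork(Invite("sip:bob@example.com", "demo-call-1"), LOCATIONS)
```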
Requests are sent to proxy servers rather than directly to the called party. Once the INVITE arrives, the called party sends a response, accepting or rejecting the invitation. That response is then passed back through the same set of proxies, in reverse order. Requests can be sent through any transport protocol, such as UDP, SCTP or TCP.
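The reverse-order return path can be sketched in Python as a simple stack. In SIP this record is kept in Via headers; the host names and the function names route_request and route_response below are hypothetical, and the model ignores everything except the order of hops.

```python
from typing import List


def route_request(caller: str, proxies: List[str]) -> List[str]:
    """Each hop puts its own name on top of the Via list before forwarding."""
    via = [caller]
    for proxy in proxies:
        via.insert(0, proxy)
    return via


def route_response(via: List[str]) -> List[str]:
    """The response goes to the topmost Via entry, which is then removed, and so on."""
    remaining = list(via)
    hops = []
    while remaining:
        hops.append(remaining.pop(0))
    return hops


via = route_request("client.example.com", ["proxy1.example.com", "proxy2.example.com"])
hops = route_response(via)  # proxy2 first, then proxy1, then the caller
```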
The most common protocol used to describe sessions is the Session Description Protocol (SDP). SIP can be used to modify the session as well: the originator can send the same type of request again (a re-INVITE) with a new session description, thus enabling the addition or removal of audio and video streams and changes of codec, all described via SDP.
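To make the idea of a changed session description concrete, the Python sketch below builds two SDP bodies: one for an audio-only call and one that adds a video stream, as a re-INVITE might carry. The host address, port numbers, user name and the helper build_sdp are assumptions chosen for illustration; payload types 0 and 31 are the standard static RTP numbers for PCMU audio and H.261 video.

```python
from typing import List, Tuple


def build_sdp(host: str, version: int, media: List[Tuple[str, int, int]]) -> str:
    """Build a minimal SDP body; media is a list of (kind, port, payload_type)."""
    lines = [
        "v=0",
        f"o=alice 1 {version} IN IP4 {host}",  # session version rises with each change
        "s=-",
        f"c=IN IP4 {host}",
        "t=0 0",
    ]
    for kind, port, payload_type in media:
        lines.append(f"m={kind} {port} RTP/AVP {payload_type}")  # one line per stream
    return "\r\n".join(lines) + "\r\n"


# Initial offer: audio only.
audio_only = build_sdp("192.0.2.10", 1, [("audio", 49170, 0)])
# Later offer for the same session: audio plus a new video stream.
with_video = build_sdp("192.0.2.10", 2, [("audio", 49170, 0), ("video", 51372, 31)])
```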
Finally, SIP can be used to terminate the session.
SIP was first delivered with voice in mind, but enhancements are rolling out for IM and buddy lists for text messages across networks. This will enable people to use buddy lists, IM, wireless and wireline, regardless of the network, whether AOL, Yahoo, Microsoft or another.
For example, MS Messenger users will be able to send messages to AOL IM users. “Soon it will decipher where buddies are located, their willingness or ability to talk via voice, messaging or video, and then the ability to automatically initiate a text, audio or video conference,” says Davies at Mitel.
“If a group of executives wants to have a conference call, they can have automatic call creation the moment each person respectively shows up in his office, lands at the airport, gets to her home,” says Jonathan Rosenberg, a chief technologist at Dynamicsoft, who co-authored SIP and is a chair of the IETF’s IP Telephony working group and a former co-chair of the SIP working group. The technology, he believes, will get to the point that networks will monitor where everyone is, and send IMs announcing the conference call’s inception and asking whether people can participate at that moment. Additionally, enhanced caller ID will emerge. “Today you see a name and phone number, but with SIP you will see a thumbnail picture of the person or an MP3 file,” thus opening up a world of richer content, says Rosenberg.
Presence information will also enable users to make themselves available or unavailable to certain callers on all or selected devices. With a SIP-enabled phone, you can register with SIP service providers, so that data about your signaling on that call is automatically carried on your phone. Users determine how they want information given out.
“If you are on someone’s XP buddy list, you can see if your buddy is available and on what device, whether PDA or cell phone or phone. The buddy can control what you see, so if he wants co-workers to contact him for internal conferencing, yet wants to block his mother-in-law, he can do so,” says Rosenberg. SIP will also enable presence information to be given to designated external parties.
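The kind of per-watcher control described here might be modeled as follows. This Python sketch is only a thought experiment: the device names, the addresses, the policy table and the function visible_presence are all hypothetical and do not correspond to any particular SIP presence specification.

```python
from typing import Dict, Set

# Hypothetical presence state, one entry per device.
PRESENCE: Dict[str, str] = {
    "desk-phone": "available",
    "cell-phone": "busy",
    "pda": "offline",
}

# Hypothetical policy: which devices each watcher is allowed to see.
POLICY: Dict[str, Set[str]] = {
    "coworker@example.com": {"desk-phone", "cell-phone"},
    "assistant@example.com": {"desk-phone", "cell-phone", "pda"},
}


def visible_presence(watcher: str,
                     presence: Dict[str, str],
                     policy: Dict[str, Set[str]]) -> Dict[str, str]:
    """Return only the device states the watcher may see; unknown watchers see nothing."""
    allowed = policy.get(watcher, set())
    return {device: state for device, state in presence.items() if device in allowed}


coworker_view = visible_presence("coworker@example.com", PRESENCE, POLICY)
blocked_view = visible_presence("mother-in-law@example.com", PRESENCE, POLICY)
```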
“You can enable your assistant to know what conference room you are in, but block competitors from knowing that you are in negotiations with a certain company,” says Davies, who notes that SIP makes it possible for manufacturers to develop multiple applications in the same environment. “In two days, we downloaded software and developed a program to enable SIP video conferencing with proprietary systems from multiple vendors,” says Davies, explaining how Mitel built a SIP interface to Nuance Communications’ voice recognition engines. “By doing that, we easily developed a button that glows on a regular phone if the other party wants to change to a video capability. You press the button, and it automatically sets up a video conference.”
The expectation is that an Internet phone call will use SIP protocols, as 3GPP and 3GPP2 have now selected SIP for call control and call establishment on packet networks. “On handsets, a phone call will be SIP-based in the next couple years,” says Rosenberg. He notes that SIP will be pervasive now that Microsoft has announced it will support SIP for chat and IM in its Messenger program, shipping with Microsoft Windows XP.
Despite its huge win with Microsoft and impending rollout with 3GPP, Rosenberg concedes there are still enhancements to be made, particularly in terms of compression technology. Slated to come out this summer under the name SigComp, the SIP compression enhancement is designed to expedite deployment of SIP in mobile networks. “We intend to deliver it to 3GPP this summer,” notes Rosenberg, who is co-creator of SigComp, along with John Peterson of NeuStar.
Until now, there have been no standards for compressing SIP signaling, but with bandwidth becoming a precious resource, SIP will have to be more conservative in bandwidth usage, as messages traditionally could be 800 to 900 bytes long. Consequently, the IETF has been working on signaling compression to reduce bandwidth. The result has been a universal decompressor virtual machine (UDVM), in essence a Java-style virtual machine dedicated to decompression. “Now, when a handset talks to a network, users can download the compression algorithms of their choice on the fly,” says Rosenberg. The result, he says, will be free choice, as network elements can read algorithms from any of the major manufacturers, so users don’t have to be locked into one or the other.
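Why text-based signaling lends itself to compression can be seen with a small Python experiment. Note that this is not SigComp: it uses the general-purpose zlib library as a stand-in, and the sample message, its addresses and the function names are hypothetical. SigComp differs in that the receiver runs decompression code supplied for the UDVM rather than a fixed built-in algorithm.

```python
import zlib

# Hypothetical SIP request; header names and hosts repeat, which helps compression.
SAMPLE_INVITE = (
    "INVITE sip:bob@example.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP client.example.com;branch=z9hG4bK-demo1\r\n"
    "Max-Forwards: 70\r\n"
    "From: <sip:alice@example.com>;tag=demo-tag\r\n"
    "To: <sip:bob@example.com>\r\n"
    "Call-ID: demo-call-1@client.example.com\r\n"
    "CSeq: 1 INVITE\r\n"
    "Contact: <sip:alice@client.example.com>\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
).encode("ascii")


def compress_message(message: bytes) -> bytes:
    """Compress a signaling message with zlib (stand-in for a SigComp compressor)."""
    return zlib.compress(message, 9)


def decompress_message(blob: bytes) -> bytes:
    """Recover the original message bytes."""
    return zlib.decompress(blob)


blob = compress_message(SAMPLE_INVITE)
saved = len(SAMPLE_INVITE) - len(blob)  # bytes saved on this one message
```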
Standards Watch: SIP: Finding a Common Language