The Session Initiation Protocol is, by far, the most common implementation for Wi-Fi-based phones. SIP runs over the User Datagram Protocol (UDP), and so can be run over any IP-based network. Transmission Control Protocol (TCP) is also an option, though it is not commonly used for plain SIP, given the shortness of a standard SIP message. SIP was created by the Internet Engineering Task Force (IETF), the group that standardizes basic protocols such as TCP, Hypertext Transfer Protocol (HTTP), and Transport Layer Security (TLS), among many others. The definition for SIP is in IETF RFC 3261.
SIP is loosely based on the concepts of another popular Internet protocol, HTTP, used by web browsers and servers to access web pages. This means that SIP is constructed around a request and response model, where one side sends a request for an action for a particular resource, and the other side replies with a response, complete with response code. Every SIP device has the ability to operate as a requester and as a responder, depending on which device is initiating the specific request/response exchange. Furthermore, every SIP message is in text, and so is theoretically human readable. (When you see some of the text that is used in a SIP message, you may beg to disagree!)
The goal of SIP is to provide a simpler method, compared to the prior H.323 protocol and others, for performing the basic tasks of call signaling. The introduction of SIP opened up the development of softphones, or applications that run on computers and devices not originally designed for telephone usage, to interact with calling services and act like real phones. Even Microsoft Messenger got into the act and used SIP for instant messaging and chat. The most interesting part of SIP is that a world of open-source or low-cost applications came into the industry, spurred on by its simpler, easier-to-use interface, free from significant intellectual property encumbrances. Now, major digital PBX vendors have gotten into the act, offering SIP services on their systems, eager to allow nontraditional devices onto networks created by their equipment.
1 SIP Architecture
The SIP name for a handset is a user agent. A user agent is an endpoint in the SIP communication, applying to both handsets and servers, and is capable of dialing out or receiving phone calls. User agents have IP addresses, and also have users. The users are identified by a SIP Uniform Resource Identifier (URI). These look like web URLs, and are based on the same concept, but apply to the domain of telephone calls, rather than web servers.
A URI for a typical caller might look like the following:
This looks like an email address, but it is preceded by the "sip:" marker (in the same way as web pages are preceded by the "http:" marker). The 5300 marks the phone number, and the @ sign and everything following it represents some notion of the system that the phone number lives on. More often than not, users can ignore the @ sign and the remainder of the string, and concentrate on the phone number before it, just as email users can on a corporate email network. The fact that the URI looks a lot like an email address lets you know that SIP can also use text "phone numbers," such as
which requires the phone user to be able to type in letters rather than numbers, but performs the same way.
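To make the structure of a SIP URI concrete, here is a minimal sketch in Python that splits a URI into the part before the @ sign and the part after it. It handles only the simple sip:user@host form discussed here (no ports, passwords, or URI parameters). The function name split_sip_uri is ours, and the sample URI is the extension used in the examples later in this section.

```python
def split_sip_uri(uri):
    """Split a simple 'sip:user@host' URI into (user, host).

    Illustrative only: ignores ports, passwords, and URI parameters.
    """
    scheme, sep, rest = uri.partition(":")
    if sep != ":" or scheme.lower() != "sip":
        raise ValueError("not a sip: URI: %r" % uri)
    user, at, host = rest.partition("@")
    if at != "@" or not user or not host:
        raise ValueError("expected user@host in %r" % uri)
    return user, host


print(split_sip_uri("sip:7010@corp.com"))
```

The user part may be digits or text, which is why the same function works for numeric extensions and for text "phone numbers" alike.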
SIP phones register their presence with a SIP registrar. Registrars perform one of the major functions of the PBX, which is keeping track of phones, the users, their capabilities, and locations. The registrar is how one phone knows that another phone exists. When a phone is first turned on, or it changes IP addresses or networks, it registers itself with the registrar. Until it does so, phone calls placed to that number will be rejected, or possibly sent to voicemail. After registration, however, a phone call to the number will be sent to the registered phone.
This raises the question of whom a phone sends requests for phone calls to. The registrar needs to get its registrations out into the network, so that a placed phone call can find its way to the right party. This is where the second concept, the SIP proxy, comes in. The SIP proxy's job is to take requests for phone calls, look up the location of the called party in some database (the one created by the registrar would be ideal, but is not required), and forward the call signals appropriately. In this sense, the SIP proxy is the switch, or PBX, for the signaling protocol. Registrars, in fact, are generally integrated into the SIP proxy, making for one device that performs the functions expected of a PBX, including endpoint (or extension) management, permissions-checking, logging, and so forth.
The SIP proxy is called a proxy, however, because it does not exist transparently in the process. Rather, its job is to act as a server for the calling party and a client for the called party, responding to the caller's requests by creating nearly identical ones of its own and sending them to the called party. This looks a lot like a web proxy, which is intentional. We will get to the mechanics of SIP signaling shortly.
SIP does not get involved with the actual carrying of voice. In fact, it is not voice-specific, and works just as well for video calls. We will look at how the SIP signaling protocol specifies the different bearer protocol used for the voice (or video) call. One other thing that SIP was not designed to do is phone conference management. SIP is fundamentally call-based, and so is great for phones setting up a call into a conference server. However, the conference server is expected to have some other intelligence built on top that lets it tie the calls together into a conference and knows which users manage the conference and which do not.
Figure 1 shows the architecture diagram for SIP, mapped to the standard PBX model.
SIP is based on the concept of a caller inviting the other caller to join the call. Once the invitation goes out to the proxy, who knows where the other party is located, the endpoints and the proxy exchange messages until the call is established. Each invitation, and its successful response, both carry information that is used by other, non-SIP parts of the phone, to establish the bearer channels of the call. Invites are not just for new calls. A phone is allowed to send a new invite to a party while it is in the middle of a call to that device. This would be done when the caller wants to renegotiate the bearer channel, or perhaps to tear it down, such as when a call is placed on a silent (no music) hold.
SIP is heavily oriented toward the notion of the proxy. The proxy, being the switch or PBX, can take care of complex routing decisions that phones should not be bothered with. One wrinkle to this is what two phones do once they find out about the other one's addresses. Some SIP proxies will allow the contact information (which IP address an extension is currently at) to pass through the call, from one party to the other. This allows the two endpoints to take over after the call is set up, and exchange messages exclusively with each other. In this description, however, we will focus on proxies that intentionally hide the addresses of one side from the other. Doing so ensures that the PBX is always a party to every call, making network design simpler and enabling the PBX to support a larger number of features than if the clients communicated peer-to-peer.
Media gateways appear, in SIP, just as ordinary endpoints. The difference lies in how the registrar and proxies treat them. The proxy will know to forward all phone numbers in the dialing plan that must go to the next network (such as outside calls) to the media gateway, as if the gateway had registered for that number. Incoming calls from the other network operate in the same way as outgoing calls do from a phone: the call is routed to the proxy. In this way, the same protocol can work for bundles of lines or general routes as easily as it can for simple devices.
SIP includes provisions to allow for user authentication, and for encryption of parts of the packets.
2 SIP Registration
As mentioned before, the SIP registrar knows about the existence of a phone by the process of registration. When the phone is turned on, or when it changes its network address, or when its old registration has expired and it needs to refresh it, the phone sets up a SIP request to the registrar. This means that the SIP phone must know which IP address the registrar is at, as that registrar becomes the constant point of contact for the network.
Because registration is so important, we will use the SIP registration process as our way of understanding the format of SIP messages. For the examples in the section on SIP, we will use the following:
Let's look at our first SIP message, then. SIP is sent in UDP packets to port 5060, and so the contents in Table 1 show the payload of the UDP packet, sent from Phone 1's IP address at 192.168.0.10, port 5060, to the registrar's IP address at 10.0.0.10, port 5060.
Table 1: SIP REGISTER request

    REGISTER sip:corp.com SIP/2.0
    Via: SIP/2.0/UDP 192.168.0.10:5060;branch=z9hG4bK1072017640
    From: "7010"<sip:7010@corp.com>;tag=915317945
    To: "7010"<sip:7010@corp.com>
    Call-ID: 1422523958@192.168.0.10
    CSeq: 1 REGISTER
    Contact: <sip:7010@192.168.0.10>;expires=3600
    Max-Forwards: 70
    Content-Length: 0
This is all text, with newlines given by a carriage return and linefeed, just as with HTTP. It is structured the same way, as well.
The first line begins with the action, in this case, to "REGISTER." The URI for the registration is "sip:corp.com", the domain served by the registrar at 10.0.0.10. Finally, the version is "SIP/2.0", meaning, understandably, SIP 2.0. This message is a request to register with the registrar.
The rest of the lines are presented as HTTP-style headers. That is, there is the text string naming the header, followed by a colon. The first header is the Via header, identifying the most recent sender of this message. Remember that all messages could potentially be proxied in the protocol, and the Via header allows the receiver to understand why the IP sender of the message is involved in sending it, especially if the From line doesn't match. In this case, the Via header just specifies the phone that sent the REGISTER message, as no one proxied it. The line can be broken down as follows. "SIP/2.0/UDP" states that the phone is using SIP 2.0 over UDP. "192.168.0.10:5060" is the IP address and UDP port of the phone. With this information, the recipient (the registrar) knows that the response has to go to 192.168.0.10:5060 using SIP 2.0 on UDP. The registrar has to use this, and not the IP and UDP sender (which is identical, of course), as this allows messages to be routed in stranger ways. Think of the Via as a "Reply-to" header from email. The last piece, the "branch" part, specifies a unique identifier for this request/response transaction. (The semicolon sets the branch and other pieces that might follow aside from what came before it, and the equal sign sets the value of the branch, until the end of the line or another semicolon.)
Because UDP has no real concept of a connection, this branch parameter is used to establish that concept.
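The breakdown of the Via line can be sketched as a small parser. This is an illustration only, assuming an IPv4 address and a single Via value; the function name parse_via and the dictionary layout are ours, not part of SIP.

```python
def parse_via(value):
    """Parse the value of a Via header into its parts.

    Example value: 'SIP/2.0/UDP 192.168.0.10:5060;branch=z9hG4bK1072017640'
    Assumes an IPv4 address or host name (no bracketed IPv6 literals).
    """
    sent_protocol, _, rest = value.strip().partition(" ")
    name, version, transport = sent_protocol.split("/")
    # The first item is host[:port]; everything after a semicolon is a parameter.
    sent_by, *raw_params = [p.strip() for p in rest.split(";")]
    host, _, port = sent_by.partition(":")
    params = {}
    for item in raw_params:
        key, _, val = item.partition("=")
        params[key] = val  # parameters without a value (e.g. 'rport') map to ''
    return {
        "transport": transport,
        "host": host,
        "port": int(port) if port else 5060,  # 5060 is the usual SIP port
        "params": params,
    }


via = parse_via("SIP/2.0/UDP 192.168.0.10:5060;branch=z9hG4bK1072017640")
print(via["host"], via["port"], via["params"]["branch"])
```

A responder would send its reply to the host and port recovered here, and copy the branch value back unchanged so the requester can pair the response with its request.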
The next line is the From line, which specifies the identity of the user agent making the transaction. This line looks like a From email header, for good reason. The quoted "7010" is the user-displayable phone number. Just as with email addresses, in which the account name may be "bob@corp.com" but the person's name would be "Bob Baker," a user might have a different name that the callers see than that of the SIP account he uses. The "<sip:7010@corp.com>" is the URI for the account, set aside in angled brackets. Finally, the "tag" serves the purpose of identifying the overall call sequence for this series of requests. Whereas the branch strictly identifies the request/response pair, the tag identifies the entire sequence of requests and responses that make up one action between callers.
The To line is similar to the From line. Here, there is no tag yet, because the "called party" is required to pick its own tag. (Because this is a REGISTER, that party is just the registrar, and there is no real call from a user's point of view.)
The Call-ID is unique for the particular call from that caller, and is given in a similar email-address-like format, with the IP address of the caller defining the part after the @ sign.
The CSeq field defines where we are in the back-and-forth of the particular action. The value of "1 REGISTER" tells us that this is message one of the handshake, and that it is a REGISTER message. This is useful for human debugging of call problems, as it tells you where you are in the process, even if the earlier parts of the process are missed.
All of this previous stuff is just mechanics. The important part of the REGISTER message comes now. The Contact field tells the registrar that this is a registration for "<sip:7010@192.168.0.10>", meaning that the phone number is at 192.168.0.10, and goes by the name "7010". It is actually possible for one user agent to have multiple phone numbers, and this registration is for the one and only one phone number here. The "expires" parameter states that the registration expires 3600 seconds, or one hour, from now.
The Max-Forwards header just states that any intervening proxy can proxy this message, for a total of 70 times, after which, the message is dropped. This protects the network from times when a proxy might be misconfigured to forward a message back along the path from where it came.
The Content-Length states that there is no SIP message body. Message bodies are used in INVITEs, which we will see later.
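Putting the headers together, the following sketch assembles a REGISTER request with the same fields as Table 1. It is an illustration under simplifying assumptions: the function name build_register and its parameters are ours, the branch, tag, and Call-ID values are passed in rather than generated, and no authentication is attempted.

```python
def build_register(user, domain, phone_ip, branch, tag, call_id, expires=3600):
    """Return the text of a REGISTER request like the one in Table 1."""
    uri = "sip:%s@%s" % (user, domain)
    lines = [
        "REGISTER sip:%s SIP/2.0" % domain,
        "Via: SIP/2.0/UDP %s:5060;branch=%s" % (phone_ip, branch),
        'From: "%s"<%s>;tag=%s' % (user, uri, tag),
        'To: "%s"<%s>' % (user, uri),
        "Call-ID: %s@%s" % (call_id, phone_ip),
        "CSeq: 1 REGISTER",
        "Contact: <sip:%s@%s>;expires=%d" % (user, phone_ip, expires),
        "Max-Forwards: 70",
        "Content-Length: 0",
    ]
    # Each line ends with CRLF; an empty line terminates the headers.
    return "\r\n".join(lines) + "\r\n\r\n"


message = build_register("7010", "corp.com", "192.168.0.10",
                         "z9hG4bK1072017640", "915317945", "1422523958")
print(message)
```

A phone would then encode this text and send it as the payload of a single UDP datagram to the registrar's address and port.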
Now that the registrar has received the request, it will send a response. The response lets the client know that the registration went well, or had an error. Table 2 has the response.
Table 2: SIP REGISTER response

    SIP/2.0 200 OK
    Via: SIP/2.0/UDP Þ
        192.168.0.10:5060;branch=z9hG4bK1072017640;received=192.168.0.10
    From: "7010"<sip:7010@corp.com>;tag=915317945
    To: "7010"<sip:7010@corp.com>;tag=as7374d984
    Call-ID: 1422523958@192.168.0.10
    CSeq: 1 REGISTER
    User-Agent: My PBX
    Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, SUBSCRIBE, NOTIFY
    Supported: replaces
    Expires: 3600
    Contact: <sip:7010@192.168.0.10>;expires=3600
    Date: Tue, 27 Jan 2008 00:25:14 GMT
    Content-Length: 0
The first line has the response. "SIP/2.0" is the version, but more importantly, "200 OK" means that the response was a success. Registrars can fail with different codes, such as
when the extension is not known by the registrar. That would happen when the phone really does not belong to this particular telephone network.
The Via, From, and To fields serve the same purpose as before. Note that the To field now has a tag. The From field still names the original caller, even though this is a response, and the To field still names the called party. The Via line, and the indented one following it, are all one line in the packet: the description will use Þ to mark that the following line continues the current one.
The CSeq field has not changed, as this is still the REGISTER request/response pair.
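One way to see how a phone pairs a response with its request is to compare the fields that stay fixed across the exchange. The sketch below is a simplification, not a full SIP parser: it keeps only the last value of any repeated header, ignores folded header lines, and the function names are ours.

```python
def parse_message(text):
    """Split a SIP message into (start_line, headers, body)."""
    head, _, body = text.partition("\r\n\r\n")
    start_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


def status_of(start_line):
    """Return (code, reason) from a status line such as 'SIP/2.0 200 OK'."""
    version, code, reason = start_line.split(" ", 2)
    return int(code), reason


def branch_of(headers):
    """Return the branch parameter of the Via header, or None."""
    for param in headers["via"].split(";")[1:]:
        key, _, value = param.partition("=")
        if key.strip() == "branch":
            return value.strip()
    return None


def answers(request_headers, response_headers):
    """True if the response carries the request's branch, CSeq, and Call-ID."""
    return (branch_of(request_headers) == branch_of(response_headers)
            and request_headers["cseq"] == response_headers["cseq"]
            and request_headers["call-id"] == response_headers["call-id"])
```

Given the request of Table 1 and the response of Table 2, answers() would return True: the branch, the CSeq, and the Call-ID are all unchanged, and only the To tag and the extra headers differ.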
The User-Agent is the vendor name of the registrar. In this case, it is "My PBX." Because you cannot go out and purchase "My PBX," you may have to settle for someone else's.
The Allow header states what types of actions a caller can request from the registrar. We will look at the more important ones—INVITE and BYE—shortly.
The Expires header states that the registrar agreed with the client, and is going to let the registration live for 3600 seconds. The registrar can shorten this amount if it so chooses, and the client will lose its registration if it does not refresh it before then. You will notice that the Expires information repeats as a separate field in the Contact header. There are, unfortunately, still multiple ways of encoding exactly the same information in SIP, and there will be a lot of redundancy. The problem, of course, with redundancy, is that no one really knows how a complex system will work if the redundant information actually changes.
Finally, the Date header gives the time at which the response was generated. Now, the registrar knows the phone exists, and the phone can make and receive calls. One good thing about registration is that the phone number and IP address bindings can be looked up in the PBX by the administrator. SIP-aware intervening networks, such as some Wi-Fi systems, also track this state and show which phones are wireless and where they are.
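The bookkeeping a registrar performs can be pictured as a table of bindings with expiry times. The following toy model is only a sketch of that idea: the class name, its methods, and the injectable clock are our own choices, and a real registrar would also handle authentication, multiple contacts per user, and persistent storage.

```python
import time


class Registrar:
    """Toy in-memory registrar: maps an extension to a contact address."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._bindings = {}  # extension -> (contact_ip, expiry_time)

    def register(self, extension, contact_ip, expires=3600):
        """Record or refresh a binding, as a REGISTER would."""
        self._bindings[extension] = (contact_ip, self._clock() + expires)

    def lookup(self, extension):
        """Return the contact IP, or None if unknown or expired."""
        entry = self._bindings.get(extension)
        if entry is None:
            return None
        contact_ip, expiry = entry
        if self._clock() >= expiry:
            del self._bindings[extension]  # registration lapsed
            return None
        return contact_ip


registrar = Registrar()
registrar.register("7010", "192.168.0.10", expires=3600)
print(registrar.lookup("7010"))
```

A proxy that shares this table can answer the question "where is extension 7010 right now?" for as long as the phone keeps refreshing its registration.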
3 Placing a SIP Call
Our user, 7010, wants to place a call to his already registered coworker, 7020. In SIP, this process belongs to a request-response series kicked off by an INVITE. The idea behind the invite is that the caller invites the called party into the call.
Phone calls always start off with the user dialing the called party. Once that is done, the call should ring until the called party answers, or until the system forwards the call to voicemail or delivers a busy signal. Because there are phases of the call setup process (ringing, then a forward to voicemail), we will see how SIP deals with multiple responses for the same request. Figure 2 shows the messages that are exchanged in setting up the call, in order from 1 to 9. The top and bottom halves of the diagram are of the same equipment, but are separated to make the arrows clearer. Each arrow represents a message being sent from one part to the other. The black lines are for messages that represent the main setup messages, representing the call being placed or answered. The dashed lines are for messages that inform the caller of the progress. The dotted lines represent messages that are necessary for the protocol to work, if a message gets lost, but do not carry meaningful information about the call itself.
Table 3 shows the outgoing request for a phone call. This is delivered to the proxy, as the calling phone has no idea where the called party is. The proxy will ask the registrar for details, and will then forward the message appropriately.
Table 3: SIP INVITE Request from Caller to Proxy

    INVITE sip:7020@corp.com SIP/2.0
    Via: SIP/2.0/UDP 192.168.0.10:5060;branch=z9hG4bK922648023
    From: "7010"<sip:7010@corp.com>;tag=1687298419
    To: <sip:7020@corp.com>
    Supported: replaces,100rel, timer
    Call-ID: 2114455679@192.168.0.10
    CSeq: 20 INVITE
    Session-Expires: 1800
    Contact: <sip:7010@192.168.0.10>
    Max-Forwards: 70
    Expires: 180
    Content-Type: application/sdp
    Content-Length: 217

    v=0
    o=7010 1352822030 1434897705 IN IP4 192.168.0.10
    s=A_conversation
    c=IN IP4 192.168.0.10
    t=0 0
    m=audio 9000 RTP/AVP 0 8 18
    a=rtpmap:0 PCMU/8000/1
    a=rtpmap:8 PCMA/8000/1
    a=rtpmap:18 G729/8000/1
    a=ptime:20
So Table 3 shows a substantially more involved message than we saw with registration. Let's walk through it.
The first line states that this is an "INVITE", asking for URI "sip:7020@corp.com", or the phone number 7020. So now we know who the call is for.
The To field repeats that the message is really destined for URI "sip:7020@corp.com".
The CSeq field is "20 INVITE". This INVITE started at 20. There may be multiple INVITEs coming for this call, depending on what happens as the call progresses. So 20 is a starting point.
The Session-Expires header means that the entire session to start a phone call, including future changes, shouldn't live for more than 1800 seconds, or a half-hour. This has no influence on the endpoints, but the proxy may be tracking the call, and this expiration lets the proxy know when the call can be flushed from the system, if, for some reason, one side or another disconnects from the network without informing the proxy.
The Expires header, on the other hand, states how long the calling phone is willing to wait until it gets a yes-or-no answer from the other party. After 180 seconds, or three minutes, the caller will give up and cancel the invitation.
The Content-Type field states that there is a body to this message, and it is in the Session Description Protocol (SDP) format. As mentioned earlier, SIP has no understanding of voice traffic itself. It is charged only with setting up calls. The SDP, on the other hand, describes how to set up the voice traffic. We'll talk more about SDP in the section on bearer protocols, but for now, it's enough to note that the rest of the packet is for the client asking for a Real-time Transport Protocol (RTP) session with voice.
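As a preview of what the non-SIP parts of the phone pull out of that body, here is a minimal sketch that reads the address, port, and codec list from an SDP description like the one in Table 3. It assumes a single c= line and a single m=audio line; the function name parse_sdp and the returned dictionary are ours.

```python
def parse_sdp(body):
    """Extract the media address, port, and codecs from a simple SDP body.

    Assumes one session-level c= line and one m=audio line.
    """
    info = {"address": None, "port": None, "payload_types": [], "codecs": {}}
    for line in body.splitlines():
        kind, _, value = line.strip().partition("=")
        if kind == "c":
            # c=<nettype> <addrtype> <address>
            info["address"] = value.split()[2]
        elif kind == "m":
            # m=<media> <port> <proto> <payload types...>
            media, port, proto, *formats = value.split()
            info["port"] = int(port)
            info["payload_types"] = [int(f) for f in formats]
        elif kind == "a" and value.startswith("rtpmap:"):
            # a=rtpmap:<payload type> <encoding>/<clock rate>[/<channels>]
            payload, encoding = value[len("rtpmap:"):].split(None, 1)
            info["codecs"][int(payload)] = encoding.split("/")[0]
    return info


offer = parse_sdp(
    "v=0\r\n"
    "c=IN IP4 192.168.0.10\r\n"
    "m=audio 9000 RTP/AVP 0 8 18\r\n"
    "a=rtpmap:0 PCMU/8000/1\r\n"
)
print(offer["address"], offer["port"], offer["payload_types"])
```

The address and port say where the caller wants to receive voice packets, and the payload types list the codecs it is prepared to use.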
Table 4 describes the response from the proxy back to the caller. The code is "100 Trying", meaning that the proxy has more work to do, and a better result will be provided shortly. In the meantime, the calling phone may show some state reflecting that the call is being attempted, but no ringing should be heard by the caller.
Table 4: SIP INVITE Trying Response from Proxy to Caller

    SIP/2.0 100 Trying
    Via: SIP/2.0/UDP Þ
        192.168.0.10:5060;branch=z9hG4bK922648023;received=192.168.0.10
    From: "7010"<sip:7010@corp.com>;tag=1687298419
    To: <sip:7020@corp.com>
    Call-ID: 2114455679@192.168.0.10
    CSeq: 20 INVITE
    User-Agent: My PBX
    Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, SUBSCRIBE, NOTIFY
    Supported: replaces
    Contact: <sip:7010@192.168.0.10>
    Content-Length: 0
The proxy will also proceed to forward the INVITE request to the called party.
Table 5 shows the INVITE as the proxy has forwarded it. This message, from the proxy to the called party, 7020, looks similar to the original INVITE. There is one major difference, however, besides the Via header. The PBX in use is configured to direct the bearer channel through itself, where it will bridge the two legs of the call, one leg from each phone to the PBX. Therefore, it has substituted its own session description, in SDP, requesting the voice connection to come back to its own IP address. Also note that the CSeq for this invite is 102, not 20. The proxy maintains its own sequence. Finally, note that the URI for the To header has now changed to have the part after the @ sign refer to the phone's IP.
Table 5: SIP INVITE Request from Proxy to Called Party

    INVITE sip:7020@192.168.0.20 SIP/2.0
    Via: SIP/2.0/UDP 10.0.0.10:5060;branch=z9hG4bK51108ed9;rport
    From: "7010"<sip:7010@corp.com>;tag=as13a69dc0
    To: <sip:7020@192.168.0.20>
    Contact: <sip:7010@10.0.0.10>
    Call-ID: 524f6a41059ad64d43dd4cce4b001b69@10.0.0.10
    CSeq: 102 INVITE
    User-Agent: My PBX
    Max-Forwards: 70
    Date: Tue, 27 Jan 2008 00:25:28 GMT
    Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, SUBSCRIBE, NOTIFY
    Supported: replaces
    Content-Type: application/sdp
    Content-Length: 281

    v=0
    o=root 10871 10871 IN IP4 10.0.0.10
    s=session
    c=IN IP4 10.0.0.10
    t=0 0
    m=audio 11690 RTP/AVP 0 3 8 101
    a=rtpmap:0 PCMU/8000
    a=rtpmap:3 GSM/8000
    a=rtpmap:8 PCMA/8000
    a=rtpmap:101 telephone-event/8000
    a=fmtp:101 0-16
    a=silenceSupp:off----
    a=ptime:20
    a=sendrecv
The called party's phone responds to the proxy with the next message, in Table 6.
Table 6: SIP Ringing Response from Called Party to Proxy

    SIP/2.0 180 Ringing
    Via: SIP/2.0/UDP 10.0.0.10:5060;branch=z9hG4bK51108ed9;rport=5060
    From: "7010"<sip:7010@corp.com>;tag=as13a69dc0
    To: <sip:7020@192.168.0.20>;tag=807333543
    Call-ID: 524f6a41059ad64d43dd4cce4b001b69@10.0.0.10
    CSeq: 102 INVITE
    Contact: <sip:7020@192.168.0.20>
    Content-Length: 0
The called party's phone will also start to ring. This may not happen if the phone is busy or not available. But, in this case, the phone is available, and is ringing.
Table 7 shows the ringing message sent from the proxy, back to the caller. The calling phone receives this message, and starts playing a ringback tone for the caller to listen to, as the other party decides whether to answer. Some amount of time passes, and then the called party answers.
Table 7: SIP Ringing Response from Proxy to Caller

    SIP/2.0 180 Ringing
    Via: SIP/2.0/UDP Þ
        192.168.0.10:5060;branch=z9hG4bK922648023;received=192.168.0.10
    From: "7010"<sip:7010@corp.com>;tag=1687298419
    To: <sip:7020@corp.com>;tag=as500634a6
    Call-ID: 2114455679@192.168.0.10
    CSeq: 20 INVITE
    User-Agent: My PBX
    Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, SUBSCRIBE, NOTIFY
    Supported: replaces
    Contact: <sip:7020@10.0.0.10>
    Content-Length: 0
Table 8 shows the "200 OK" message, reflecting that the call has been answered. In the OK message, we can see that the called party states where it wants its leg of the call to arrive, and how it would like to set up the bearer channel.
Table 8: SIP OK Response from Called Party to Proxy

    SIP/2.0 200 OK
    Via: SIP/2.0/UDP 10.0.0.10:5060;branch=z9hG4bK51108ed9;rport=5060
    From: "7010"<sip:7010@corp.com>;tag=as13a69dc0
    To: <sip:7020@192.168.0.20>;tag=807333543
    Supported: replaces
    Call-ID: 524f6a41059ad64d43dd4cce4b001b69@10.0.0.10
    CSeq: 102 INVITE
    Contact: <sip:7020@192.168.0.20>
    Content-Type: application/sdp
    Content-Length: 160

    v=0
    o=9002 1367913267 845466921 IN IP4 192.168.0.20
    s=A_conversation
    c=IN IP4 192.168.0.20
    t=0 0
    m=audio 9000 RTP/AVP 0
    a=rtpmap:0 PCMU/8000
    a=ptime:20
The proxy will receive this message and patch together the second leg of the call from it to the called party. To do so, the proxy, which has inserted itself between the two endpoints of the call, needs to send an acknowledgment (ACK), as shown in Table 9. This is needed because UDP is a lossy protocol, and there is a possibility that one of the messages didn't get through. If the original INVITE did not get through, the sender will continue to send duplicate INVITEs, every second or so, until the other side gets it. Now, with the OK having been sent, the called party needs to know that the caller knows and is in the call.
Table 9: SIP ACK Request from Proxy to Called Party

    ACK sip:7020@192.168.0.20 SIP/2.0
    Via: SIP/2.0/UDP 10.0.0.10:5060;branch=z9hG4bK632b9079;rport
    From: "7010"<sip:7010@corp.com>;tag=as13a69dc0
    To: <sip:7020@192.168.0.20>;tag=807333543
    Contact: <sip:7010@10.0.0.10>
    Call-ID: 524f6a41059ad64d43dd4cce4b001b69@10.0.0.10
    CSeq: 102 ACK
    User-Agent: My PBX
    Max-Forwards: 70
    Content-Length: 0
The proxy also needs to send the OK through to the caller.
Table 10 shows the message the caller gets. This lets the caller stop playing the ringback tone and connect the phone call from itself to the PBX. Now, both parties can hear each other.
Table 10: SIP OK Response from Proxy to Caller

    SIP/2.0 200 OK
    Via: SIP/2.0/UDP Þ
        192.168.0.10:5060;branch=z9hG4bK922648023;received=192.168.0.10
    From: "7010"<sip:7010@corp.com>;tag=1687298419
    To: <sip:7020@corp.com>;tag=as500634a6
    Call-ID: 2114455679@192.168.0.10
    CSeq: 20 INVITE
    User-Agent: My PBX
    Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, SUBSCRIBE, NOTIFY
    Supported: replaces
    Contact: <sip:7020@10.0.0.10>
    Content-Type: application/sdp
    Content-Length: 202

    v=0
    o=root 10871 10871 IN IP4 10.0.0.10
    s=session
    c=IN IP4 10.0.0.10
    t=0 0
    m=audio 12482 RTP/AVP 0 8
    a=rtpmap:0 PCMU/8000
    a=rtpmap:8 PCMA/8000
    a=silenceSupp:off---
    a=ptime:20
    a=sendrecv
There is nothing to be done after the ACK, as it marks the end of the call setup protocol.
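From the calling phone's point of view, the setup sequence comes down to reacting to each response to its INVITE. The sketch below summarizes that behavior as described in this section; the function name and the wording of the actions are ours, and a real phone has many more cases to handle.

```python
def caller_action(status_code):
    """Map a response to the caller's INVITE onto what the phone does next."""
    if status_code == 100:
        return "show 'calling'; no ringback yet"
    if status_code == 180:
        return "play ringback tone"
    if 200 <= status_code < 300:
        return "stop ringback, send ACK, start media"
    if status_code in (486, 600):
        return "send ACK, play busy tone"
    # Any other final response ends the attempt; other 1xx codes are progress.
    return "send ACK, report failure" if status_code >= 300 else "keep waiting"


for code in (100, 180, 200):
    print(code, "->", caller_action(code))
```

The three codes in the loop are the ones the caller sees, in order, during the successful call traced above.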
We can now ask how the proxy participated. The messages you have seen look as if they could have gone directly from party to party. And this is true. However, the SIP proxy and registrar work together to provide a way for clients to find out about other clients, and have their calls routed to voicemail, the outside world, or to have any other advanced call feature performed. For this to work, the SIP proxy and registrar must work together to form a PBX. Clients are prevented from communicating directly. You may have noticed that at no point do the messages from the caller to the proxy have the IP addresses of the called party, and vice versa. The Contact and Via headers are always kept consistent, and all of the messages are forced to flow through the proxy. Not all proxies act this way, but most proxies with full PBX functionality do.
Table 11: SIP ACK Request from Caller to Proxy

    ACK sip:7020@corp.com SIP/2.0
    Via: SIP/2.0/UDP 192.168.0.10:5060;branch=z9hG4bK1189369993
    From: "7010"<sip:7010@corp.com>;tag=1687298419
    To: <sip:7020@corp.com>;tag=as500634a6
    Call-ID: 2114455679@192.168.0.10
    CSeq: 20 ACK
    Max-Forwards: 70
    Content-Length: 0
4 A Rejected SIP Call
Sometimes the called party does not want to answer the call. This happens when the phone call comes in and the user is busy. The process looks very similar to a successful SIP call, except that, instead of the called party sending a 200 OK message, it sends a different one.
Figure 3 shows the call setup flow. The change is that the OK is now replaced with a "486 Busy Here" message, which means that the called party is busy at the endpoint they were called at. The proxy could have used this status to forward the call onto a voicemail system, by either sending a new INVITE to the voicemail system (if it has a separate user agent) and bridging those two legs together, or by sending a "302 Moved Temporarily" response to the client and expecting the client to try the voicemail.
Table 12 shows the rejection that the called party will send if the user declines to take the call. The proxy will continue with its ACK, as before.
Table 12: SIP Busy Here Rejection from Called Party to Proxy

    SIP/2.0 486 Busy Here
    Via: SIP/2.0/UDP 10.0.0.10:5060;branch=z9hG4bK51108ed9;rport=5060
    From: "7010"<sip:7010@corp.com>;tag=as13a69dc0
    To: <sip:7020@192.168.0.20>;tag=807333543
    Call-ID: 524f6a41059ad64d43dd4cce4b001b69@10.0.0.10
    CSeq: 102 INVITE
    Content-Length: 0
Next, the proxy will send to the caller a busy message (Table 13), stating that the called party is busy at every user agent that it is aware of, which, for this example, is the one at 192.168.0.20. Because the proxy knows that the called party has no other alternatives, it changed the message code from "486 Busy Here" to "600 Busy Everywhere".
Table 13: SIP Busy Everywhere Rejection from Proxy to Caller

    SIP/2.0 600 Busy Everywhere
    Via: SIP/2.0/UDP Þ
        192.168.0.10:5060;branch=z9hG4bK922648023;received=192.168.0.10
    From: "7010"<sip:7010@corp.com>;tag=1687298419
    To: <sip:7020@corp.com>;tag=as500634a6
    Call-ID: 2114455679@192.168.0.10
    CSeq: 20 INVITE
    User-Agent: My PBX
    Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, SUBSCRIBE, NOTIFY
    Supported: replaces
    Content-Length: 0
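The proxy's choices on receiving a "486 Busy Here" can be summarized as a small decision function. This is a sketch of one plausible policy, not the behavior of any particular PBX: the order in which alternatives are tried, the function name, and the voicemail URI in the usage line are all assumptions, and the "302 Moved Temporarily" alternative mentioned earlier is left out.

```python
def after_busy_here(untried_contacts, voicemail_uri=None):
    """Decide a proxy's next step when one user agent answers 486 Busy Here.

    untried_contacts: other registered contacts for the called user.
    voicemail_uri: URI of a voicemail user agent, if the PBX has one.
    Returns an (action, detail) pair.
    """
    if untried_contacts:
        return ("invite", untried_contacts[0])   # try the next handset
    if voicemail_uri is not None:
        return ("invite", voicemail_uri)         # bridge the caller to voicemail
    return ("respond", "600 Busy Everywhere")    # nothing left to try


print(after_busy_here([]))
print(after_busy_here([], voicemail_uri="sip:voicemail@corp.com"))
```

With no other contacts and no voicemail, the function reproduces the case traced in this example: the proxy answers the caller with "600 Busy Everywhere".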
5 Hanging Up
Once the SIP call is going, there are no response codes that can hang the call up. Rather, the party who hangs up first sends a new request to the proxy, to disconnect the call. This type of message, called a BYE request, tells the receiver that the call is now over.
Table 14 shows the BYE message, sent when the caller hangs up the phone call. The proxy server will respond to the BYE with a "200 OK" response, which is similar to the ones we have seen before, except without any SDP description of the bearer channel. Then, the proxy server will send a BYE to the called party, and wait for the OK from there. Unlike with the call setup, the OK message is enough to mark the end of the teardown.
Table 14: SIP BYE from Caller to Proxy

    BYE sip:7020@corp.com SIP/2.0
    Via: SIP/2.0/UDP 192.168.0.10:5060;branch=z9hG4bK1818649196
    From: "7010"<sip:7010@corp.com>;tag=1687298419
    To: <sip:7020@corp.com>;tag=as500634a6
    Call-ID: 2114455679@192.168.0.10
    CSeq: 21 BYE
    Max-Forwards: 70
    Content-Length: 0
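Comparing Table 14 with the earlier INVITE shows what ties the BYE to the call: the Call-ID and both tags are reused, the branch is new, and the CSeq number is one higher than the caller's last request. A minimal sketch of building such a BYE follows; the function name build_bye and its parameters are ours, and the caller is assumed to have stored the From and To lines from the established call.

```python
def build_bye(request_uri, via_host, branch, from_header, to_header,
              call_id, last_cseq):
    """Build a BYE that reuses the call's Call-ID and tags.

    last_cseq is the CSeq number of the caller's previous request
    (20 for the INVITE in this section); the BYE uses the next number.
    """
    lines = [
        "BYE %s SIP/2.0" % request_uri,
        "Via: SIP/2.0/UDP %s:5060;branch=%s" % (via_host, branch),
        "From: %s" % from_header,
        "To: %s" % to_header,
        "Call-ID: %s" % call_id,
        "CSeq: %d BYE" % (last_cseq + 1),
        "Max-Forwards: 70",
        "Content-Length: 0",
    ]
    return "\r\n".join(lines) + "\r\n\r\n"


bye = build_bye("sip:7020@corp.com", "192.168.0.10", "z9hG4bK1818649196",
                '"7010"<sip:7010@corp.com>;tag=1687298419',
                "<sip:7020@corp.com>;tag=as500634a6",
                "2114455679@192.168.0.10", 20)
print(bye)
```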
6 SIP Response Codes
The protocol has a number of response codes, some of which are more esoteric than others. We'll go through the main ones, one by one, to get a better sense of what can happen.
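Before going through the individual codes, it helps to know that the first digit of a code gives its class. The sketch below maps a code to its class. The first four class names follow the headings used in this section, while the 5xx and 6xx names come from RFC 3261 rather than from the text here; the function names are ours.

```python
def response_class(code):
    """Return the class of a SIP status code based on its first digit."""
    classes = {
        1: "in progress",
        2: "success",
        3: "redirection",
        4: "request failure",
        5: "server failure",   # class name from RFC 3261
        6: "global failure",   # class name from RFC 3261
    }
    if not 100 <= code <= 699:
        raise ValueError("not a SIP status code: %r" % code)
    return classes[code // 100]


def is_final(code):
    """Codes of 200 and above end the transaction; 1xx codes do not."""
    return code >= 200


print(response_class(180), is_final(180))
print(response_class(486), is_final(486))
```

This is why a caller can receive several in-progress responses to one INVITE, but only one final answer.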
In-Progress Codes
100 Trying: sent by any endpoint, when it needs to inform the requester that it is handling the request, but might not return quickly enough to prevent the sender from thinking the message was lost.
180 Ringing: sent by the called party while it is ringing the phone.
181 Call is Being Forwarded: might be sent by a proxy to the caller when the call is being forwarded to another destination, such as a second handset or voicemail. You may never see this one.
182 Queued: sent by a server running a phone bank, informing the caller that the call is on hold to be answered. It can also be sent at a chokepoint when lines are busy but one of the proxies does not want to give up on establishing the call just yet.
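Success Code
200 OK: sent when the request has succeeded. In response to an INVITE, it means the called party has answered; in response to a BYE, it confirms that the call is over.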
Redirection Codes
301 Moved Permanently: sent by a proxy (or possibly an endpoint) when a user has permanently changed his or her phone number to something else. The Contact header should have the forwarding address.
302 Moved Temporarily: sent when the call is being forwarded. The Contact header will have where the call needs to be forwarded to, and the caller must start a new call to that forwarding address.
Request Failure Codes
400 Bad Request: sent when the SIP message is not formatted properly. This is a good sign of interoperability issues between advanced features of different products.
401 Unauthorized: sent by a device when it requires SIP authorization.
403 Forbidden: sent for a number of reasons. Usually, this happens when a caller is not allowed to use a certain feature, perhaps because of access rights in the proxy. This also gets sent when unknown endpoints try to register and are not provisioned. Finally, this status can also be thrown in when devices, for whatever reason, do not wish to handle the request. Some phones or proxies send this when the other side is busy.
404 Not Found: sent by a proxy for a number that does not exist.
405 Method Not Allowed: sent by a proxy or an endpoint when they do not want to perform a SIP method for the called party. Devices which do not allow calls, such as registrars when the registrar is not a proxy, can send this, as can proxies that are not registrars and are being registered to. This is a good sign that the phone is misconfigured to use the wrong IP address for some PBX function.
407 Proxy Authentication Required: sent by the proxy or PBX when it requires authentication. The calling phone might respond automatically by authenticating.
408 Request Timeout: sent by an endpoint or proxy when the request just cannot be handled in time. This is often a sign of long delays or problems between specific SIP infrastructure resources, such as a proxy being unable to reach a separate proxy or registrar.
410 Gone: sent when the called party did once exist, but no longer does, there is no forwarding information, and the proxy does not know why the extension is gone. Similar to "604 Does Not Exist Anywhere," where the difference between the two will be described shortly.
480 Temporarily Unavailable: sent when the called party's registration has expired, or when the called party has not logged in or is not yet electronically ready to receive the call.
482 Loop Detected: sent by a proxy when it sees the same message twice. This is a sign that inter-proxy forwarding is not correctly configured. This concept is similar to an email forwarding loop.
483 Too Many Hops: sent by a proxy when the Max-Forwards header field hits zero. This too is usually a sign that inter-proxy forwarding is not configured correctly, and that there is a forwarding loop.
486 Busy Here: sent by a phone when it is already in a call or does not want to take the call because the user is busy. The proxy receiving the message can try the next registration for the extension, or can try forwarding the call off.
491 Request Pending: a rare message, which happens when INVITEs between the same two clients cross paths from opposite directions. If you see this message, it is usually because of a coincidence, rather than misconfiguration.
493 Undecipherable: the SIP security model cannot decrypt the message.
Server Failure Codes
500 Server Internal Error: sent when the SIP protocol is not being followed correctly by the requester, or when there is something wrong with the endpoint receiving the request, such as misconfiguration or an error.
501 Not Implemented: the recipient of the request does not implement the method.
502 Bad Gateway: one proxy did not like the response it got from another proxy, and wishes to report it to the caller. Usually a sign of misconfiguration or incompatibility between the gateways.
503 Service Unavailable: sent when the proxy or endpoint is not fully configured or is undergoing maintenance.
504 Server Time-out: sent by a proxy when it cannot access a non-SIP service, such as DNS, in a timely manner.
505 Version Not Supported: the requester is using a version of SIP too old or too new for the other end.
Global Failure Codes
600 Busy Everywhere: sent by a proxy when it has tried every endpoint for the extension, they are all busy, and there is no place to route the call. This should generate a busy tone or busy indication, for the caller.
603 Decline: the called party does not want to participate, and does not want to explain why.
604 Does Not Exist Anywhere: similar to "410 Gone," but definitive. The server knows authoritatively that the called party does not exist at any location, for example because the extension was deleted.
606 Not Acceptable: sent by the endpoint when it cannot handle the media stream being requested. This can happen when an IPv6 phone (with no IPv4 address) calls an IPv4 phone, or when a videophone calls a standard audio phone, for example.
This list is not exhaustive. Moreover, SIP is a very flexible protocol, subject to a number of different interpretations. Because of this, different phones and different proxies might send different messages, depending on how they interpret the standard and on how they expect the devices they interoperate with to react to each message.
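Since devices differ in which specific code they send, software that reacts to responses often looks at the first digit before it looks at the exact code. A minimal sketch of that grouping follows, using the category names from the headings above. The function names are made up for illustration.

```python
RESPONSE_CLASSES = {
    1: "in progress",
    2: "success",
    3: "redirection",
    4: "request failure",
    5: "server failure",
    6: "global failure",
}


def response_class(code: int) -> str:
    """Return the category of a SIP response code, based on its first digit."""
    if not 100 <= code <= 699:
        raise ValueError(f"not a SIP response code: {code}")
    return RESPONSE_CLASSES[code // 100]


def is_final(code: int) -> bool:
    """In-progress (1xx) codes are interim; every other class ends the request."""
    return code >= 200


for code in (180, 200, 302, 486, 503, 600):
    print(code, response_class(code), "final" if is_final(code) else "interim")
```

With this approach, an unfamiliar code such as a vendor's unusual 4xx response is still handled as a request failure rather than being dropped.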
7 Authentication
SIP authentication is based on the challenge-response protocol used in HTTP, and allows username and password authentication into the registrar, proxies, and services. With SIP, the concept for authentication is that the registrar or proxy will send one of the status messages that requires authentication (401 or 407), and the client or upstream server will respond.
The challenge arrives in a WWW-Authenticate header (Proxy-Authenticate in the case of a 407). The most common authentication scheme is digest. The response comes in an Authorization header (Proxy-Authorization in the case of a 407) in the retransmitted request. The digest method uses an MD5 hash over the username, password, realm, and resource that is being accessed, along with a nonce from the responder and a nonce from the requester.
Table 15 shows an example authentication challenge header. The realm is "corp.com". "qop" (quality of protection) lists the digest variants the server will accept, which determines what goes into the hashes. The nonce is, as expected, a random number from the server that proves that the client is alive and is not just a replay of an older, valid session. The opaque value is some server-specific information that the client is required to repeat, but has no cryptographic significance.
Table 15: Example SIP Authentication Challenge
WWW-Authenticate: Digest realm="corp.com", qop="auth, auth-int", nonce="235987cb9879241b1424d4bea417b22f71", opaque="aca9787a9e87cae532a685e094ce8394"
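A client has to pick the challenge apart before it can answer. The sketch below shows one way to do that for the header value in Table 15. The function name `parse_digest_challenge` is made up for illustration, and the regular expression is a simplification that does not handle escaped quotes inside a quoted value.

```python
import re

# name=value pairs, where the value is either quoted or a bare token
_PARAM = re.compile(r'(\w+)\s*=\s*(?:"([^"]*)"|([^\s,]+))')


def parse_digest_challenge(header_value: str) -> dict:
    """Parse the value of a WWW-Authenticate header that uses the digest scheme."""
    scheme, _, params = header_value.strip().partition(" ")
    if scheme.lower() != "digest":
        raise ValueError(f"unsupported authentication scheme: {scheme}")
    result = {}
    for match in _PARAM.finditer(params):
        name = match.group(1).lower()  # parameter names are case-insensitive
        quoted, bare = match.group(2), match.group(3)
        result[name] = quoted if quoted is not None else bare
    return result


challenge = parse_digest_challenge(
    'Digest realm="corp.com", qop="auth, auth-int", '
    'nonce="235987cb9879241b1424d4bea417b22f71", '
    'opaque="aca9787a9e87cae532a685e094ce8394"'
)
qop_options = [option.strip() for option in challenge["qop"].split(",")]
print(challenge["realm"], qop_options)
```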
The client will perform a series of MD5 hashes on this information, as well as the resource being requested, and will add the results into the Authorization header shown in Table 16.
Table 16: Example SIP Authentication Response
Authorization: Digest username="7010", realm="corp.com", nonce="235987cb9879241b1424d4bea417b22f71", uri="sip:7020@corp.com", qop=auth, nc=00000001, cnonce="3a565253", response="d0ea98700991bce0bd0ba0eab1becc01", opaque="aca9787a9e87cae532a685e094ce8394"
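The series of hashes can be sketched as follows for the "auth" quality of protection, following the HTTP digest scheme that SIP borrows. The function names are made up for illustration. The password ("secret") and the method (INVITE) in the usage example are assumptions, since neither appears in the tables, so the printed value is not expected to match the response in Table 16.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the MD5 hash of a string as 32 lowercase hex digits."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username: str, realm: str, password: str,
                    method: str, uri: str, nonce: str,
                    nc: str, cnonce: str, qop: str = "auth") -> str:
    """Compute the digest 'response' value for qop=auth."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")  # who is asking
    ha2 = md5_hex(f"{method}:{uri}")                 # what is being asked for
    # Both nonces and the request counter are mixed in to defeat replays.
    return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")


value = digest_response(
    username="7010",
    realm="corp.com",
    password="secret",        # assumption: not given in the text
    method="INVITE",          # assumption: the request being retried
    uri="sip:7020@corp.com",
    nonce="235987cb9879241b1424d4bea417b22f71",
    nc="00000001",
    cnonce="3a565253",
)
print(value)
```

Note that the password itself never crosses the network; the server repeats the same calculation with its stored copy and compares the results.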
Authentication, by itself, does not provide any privacy or encryption for the traffic or prevent modifications of the parameters to any message. But it does prevent accidental misuse of resources.
8 Secure SIP
Full hop-by-hop security is possible with SIP, using what is called SIPS. This extension uses TLS in essentially the same way as HTTPS does. URIs are modified from "sip:" to "sips:", just as "http:" is modified to "https:" with SSL (noting that SSL is just an older version of TLS).
Using SIPS requires that both endpoints communicate using TCP. Once the TCP connection is established, the server will use TLS to request that the client authenticate. The client will then use its TLS credentials to authenticate, including using no credentials as an option, akin to the typical use of HTTPS in which only the server needs to authenticate using TLS. Once the TLS handshake finishes, both the requester and responder have a master key, which they use to encrypt the session. Within the session, password-based authentication can be performed using the SIP digest authentication method. With the connection being encrypted with TLS, both parties are provided confidentiality and protection from forgeries or modification.
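The client side of such a connection can be sketched with a standard TLS library. The example below is a minimal sketch under stated assumptions: the function names are made up for illustration, port 5061 is the usual default for SIP over TLS, and the requirement of TLS 1.2 or newer is a choice made here, not something the text mandates. Only the server is authenticated, as in typical HTTPS use.

```python
import socket
import ssl


def to_sips(uri: str) -> str:
    """Rewrite a "sip:" URI as the corresponding "sips:" URI."""
    if uri.startswith("sips:"):
        return uri
    if uri.startswith("sip:"):
        return "sips:" + uri[len("sip:"):]
    raise ValueError(f"not a SIP URI: {uri}")


def make_client_context() -> ssl.SSLContext:
    """TLS settings for a client that verifies the server's certificate."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2  # assumption, see above
    return context


def open_sips_connection(host: str, port: int = 5061) -> ssl.SSLSocket:
    """Open a TCP connection, then run the TLS handshake over it."""
    raw = socket.create_connection((host, port), timeout=5)
    try:
        return make_client_context().wrap_socket(raw, server_hostname=host)
    except Exception:
        raw.close()  # do not leak the TCP socket if the handshake fails
        raise


print(to_sips("sip:7020@corp.com"))
```

Once `open_sips_connection` returns, SIP requests are written to the returned socket as ordinary text, and digest authentication proceeds inside the encrypted session.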