Internet multimedia is a very active field, and a lot of work is going into improving the quality of multimedia over the network. SIP (Session Initiation Protocol) is a signaling protocol used to create, manage and terminate sessions in an IP-based network. A session could be a simple two-way telephone call or a collaborative multimedia conference. This makes it possible to implement services like voice-enriched e-commerce, web page click-to-dial or instant messaging with buddy lists in an IP-based environment. Don’t worry if you don’t know about these services. You don’t need to know them before you learn about SIP.
Users may move from terminal to terminal, each with different capabilities, and change their willingness to communicate. To set up a communication session between two or more users, a signaling protocol is needed: SIP supports locating users, negotiating sessions (audio, video, instant messaging, etc.) and changing session state. So now you can go to a hotel and let your hotel phone ring… really? Are you forwarding all your calls? Let’s wait before jumping to conclusions.
I am not going to waste your time with a history lesson, so let’s discuss what SIP consists of and how it works before discussing how it can help you save some cost.
SIP is a signaling protocol and has nothing to do with the actual voice payload, which is carried by the Real-time Transport Protocol (RTP). So what do you need for a SIP network? A SIP registrar accepts registration requests from users and maintains each user’s whereabouts in a location server. A SIP proxy server relays call signaling; it acts as both client and server and operates in a transactional manner, i.e., it keeps no session state. A SIP redirect server redirects callers to other servers.
The good news is that all these components are logical and can reside on one server. I did not include one more component in the mix, the PSTN gateway… that can wait, since we still want to keep things simple.
Entities interacting in a SIP scenario are called User Agents (UA). A UA can operate as a client (UAC) or as a server (UAS), and a single UA can work as both.
The client is the notion normally associated with the end user. In this case it could be your IP phone or softphone, which generates a request when you try to call another person over the network and sends it to a server, generally a proxy server, the most common type of server in a SIP environment. The proxy server, acting on behalf of the client (as if standing proxy for it), forwards the request to another proxy server or to the recipient itself.
So the INVITE message starts a session, ACK is used to make the message exchange reliable and BYE terminates the session. Wait, there is more: CANCEL terminates a pending request or a search for a user, OPTIONS solicits information about a server’s capabilities, REGISTER binds a user to a current location and INFO is used for mid-session signaling. Do not worry about all of them at this stage; just remember INVITE, ACK and BYE.
The transaction starts with Bob making an INVITE request for Chris. But Bob doesn’t know the exact location of Chris in the IP network, so his phone passes the request to proxy server-1 (let’s call it Bob’s side proxy server). Server-1, on behalf of Bob, forwards the INVITE request for Chris to server-2 (Chris’s side proxy server). It also sends a TRYING response to Bob, informing him that it is trying to reach Chris. If a proxy didn’t know the location, it would forward the request to another proxy server. So an INVITE request may travel through several proxies before reaching the recipient.
You may be wondering how server-1 (Bob’s side) knows that it has to forward the request to server-2 (Chris’s side). Just hold on for a moment; we will discuss that while going through the SIP registration process.
The SIP phone, on receiving the INVITE request, starts ringing to tell Chris that a call request has come in. It sends a RINGING response back to server-2 (Chris’s side), which reaches Bob through server-1 (Bob’s side). So Bob gets feedback that Chris has received the INVITE request.
Chris at this point has a choice to accept or decline the call. Let’s assume that he decides to accept it. As soon as he accepts the call, the phone sends a 200 OK response to server-2 (Chris’s side). Retracing the route of the INVITE, it reaches Bob. Bob’s softphone sends an ACK message to confirm the setup of the call. This three-way handshake (INVITE + 200 OK + ACK) is used for reliable call setup. Note that the ACK message does not use the proxies to reach Chris, because by now Bob knows Chris’s exact location.
Once the connection has been set up, media flows between the two endpoints. Media flow is handled by protocols other than SIP, e.g. RTP (Real-time Transport Protocol).
When one party in the session decides to disconnect, it (Chris in this case) sends a BYE message to the other party. The other party sends a 200 OK message to confirm the termination of the session.
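The whole exchange between Bob and Chris can be written down as an ordered list of hops. This is only a sketch of the walkthrough above, not a SIP stack: the status codes 100 and 180 are the standard numbers for TRYING and RINGING, and sending BYE directly between the endpoints is an assumption made for this example.

```python
from typing import List, Tuple

# (sender, receiver, message) for each signaling hop in the example call.
Hop = Tuple[str, str, str]


def basic_call_flow() -> List[Hop]:
    """Return the signaling hops for Bob calling Chris through two proxies."""
    return [
        ("Bob", "server-1", "INVITE"),
        ("server-1", "server-2", "INVITE"),
        ("server-1", "Bob", "100 Trying"),
        ("server-2", "Chris", "INVITE"),
        ("Chris", "server-2", "180 Ringing"),
        ("server-2", "server-1", "180 Ringing"),
        ("server-1", "Bob", "180 Ringing"),
        ("Chris", "server-2", "200 OK"),
        ("server-2", "server-1", "200 OK"),
        ("server-1", "Bob", "200 OK"),
        ("Bob", "Chris", "ACK"),      # sent directly, bypassing the proxies
        ("Chris", "Bob", "BYE"),      # assumption: BYE also goes end to end
        ("Bob", "Chris", "200 OK"),   # confirms the termination
    ]


def handshake_complete(flow: List[Hop]) -> bool:
    """True if INVITE, then 200 OK, then ACK appear in that order."""
    messages = [message for _, _, message in flow]
    try:
        invite_at = messages.index("INVITE")
        ok_at = messages.index("200 OK", invite_at)
        messages.index("ACK", ok_at)
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    for sender, receiver, message in basic_call_flow():
        print(f"{sender:>9} -> {receiver:<9} {message}")
```

Cutting the list off before the ACK makes `handshake_complete` return False, which is the point of the three-way handshake: the call is not considered set up until the caller has confirmed the 200 OK.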
I also did not forget the question of how server-1 (Bob’s side) knew the location of Chris during call setup; the page about SIP Registration will help you.
While going through a typical SIP session you have already seen that the caller doesn’t initially know the address of the callee. The proxy servers do the job of finding out the exact location of the recipient. What actually happens is that every user registers its current location with a registrar server. The application sends a message called REGISTER, informing the server of its present location. The registrar stores this binding (between the user and its present address) in a location server, which other proxies use to locate the user.
User Chris (our old friend) uses the IP 126.96.36.199 as his current location and registers it with the server. This is what enables user mobility. Say there is a messaging application that you can log in to from different computers. As soon as you log in with your username, the application REGISTERs the username with the IP of that computer. The ‘Expires’ field gives the duration for which this registration is valid, so the user has to refresh the registration from time to time. That is how you get more mobility, and hey, now you can let your hotel phone ring too when somebody tries to reach you on your office number.
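To make the binding and its lifetime concrete, here is a toy registrar and location store in one class. It is a sketch under assumptions: the class, the method names and the example addresses are made up for illustration, and treating an expiry of 0 as "remove the binding" follows normal SIP behavior rather than anything stated above.

```python
import time
from typing import Dict, Optional, Tuple


class Registrar:
    """Toy registrar plus location store: user -> (contact address, expiry time)."""

    def __init__(self) -> None:
        self._bindings: Dict[str, Tuple[str, float]] = {}

    def register(self, user: str, contact: str, expires: int,
                 now: Optional[float] = None) -> None:
        """Store or refresh a binding that stays valid for `expires` seconds."""
        now = time.time() if now is None else now
        if expires == 0:
            # An expiry of zero removes the binding.
            self._bindings.pop(user, None)
        else:
            self._bindings[user] = (contact, now + expires)

    def lookup(self, user: str, now: Optional[float] = None) -> Optional[str]:
        """Return the current contact for `user`, or None if unknown or expired."""
        now = time.time() if now is None else now
        binding = self._bindings.get(user)
        if binding is None or binding[1] <= now:
            return None
        return binding[0]


if __name__ == "__main__":
    registrar = Registrar()
    # Hypothetical addresses: office phone first, hotel phone later.
    registrar.register("chris", "192.0.2.10", expires=3600, now=0)
    registrar.register("chris", "198.51.100.7", expires=3600, now=600)
    print(registrar.lookup("chris", now=700))    # the hotel address
    print(registrar.lookup("chris", now=5000))   # None, the binding has expired
```

A proxy that has to deliver an INVITE for Chris would call `lookup` and forward the request to whatever address comes back, which is why a registration that is not refreshed eventually makes the user unreachable.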
The other important thing is that the difference between a proxy server, a registrar and a location server is often only logical. Physically they may sit on the same machine.
Is that a bit long? Need a break? Go, get it! … We will have more discussion on saving some money using SIP… I know what you are thinking: SIP trunking… are you?
VoIP readiness testing: are you ready to start? … You forgot to implement QoS (Quality of Service). How can you test something without implementing your traffic priority? A lot of people ask the same question and make the same mistake of calling people in to conduct testing without even considering QoS. In most cases those tests do pass, so everybody is happy and the network administrator feels good that he does not have to read the book or call a friend to help him with QoS. The test passes and we get a jazzy report from an unprofessional person who seems to know what he is doing. I can tell you: do not count on this kind of approach. I have also noticed that many customers like to see what the report will look like without implementing QoS, so sit back, relax and think: what was the use of getting results on BE (Best Effort) traffic that worked so well at the time of testing? Also, if you are testing without QoS, how did you prove that the MPLS network put your VoIP traffic in the proper queue (COS 1)? That queue looks for a specific marking on the packet. If you did not mark the packets, or did not configure your switches and routers, the markings applied by the VoIP devices are not going to do anything automatically.
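As an illustration of what "marking the packet" means at the endpoint, the sketch below sets a DSCP value on a UDP socket. The code point EF (46) is an assumption here: it is commonly used for voice, but your provider’s COS 1 queue may match on something else, so check the actual class map. Whether the operating system honors `IP_TOS` also varies by platform, and the marking still does nothing unless the switches and routers are configured to trust and act on it.

```python
import socket

# Expedited Forwarding, a code point commonly used for voice (assumption).
DSCP_EF = 46


def dscp_to_tos(dscp: int) -> int:
    """Shift a 6-bit DSCP value into the upper bits of the IP TOS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be between 0 and 63")
    return dscp << 2


def open_marked_udp_socket(dscp: int = DSCP_EF) -> socket.socket:
    """Create a UDP socket whose outgoing IPv4 packets request the given DSCP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock


if __name__ == "__main__":
    print(hex(dscp_to_tos(DSCP_EF)))
```

Capturing the test traffic and checking the TOS byte at the far end is a quick way to prove that the marking survived the path, which is exactly the evidence a best-effort test cannot give you.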
So before you call your readiness testing team to come and place the endpoints and test points, make sure that QoS is configured (I will be putting up a blog topic on QoS implementation soon…). Also use the same ports and switches that will be allocated to the real VoIP systems, so that you connect each test endpoint to the port that will later be used for your VoIP phones (softphones are another story). Did you also know that your VoIP phone is actually a layer-2 switch? You can tag ports and have a PC connected to your VoIP phone, and your phone and PC can be in different VLANs.
I know I am making your life complicated, but the reality is that these are the things you need to know before you make a decision about convergence and select your new VoIP network and vendor.
Do you need a professional to come and run a simulation to get the Mean Opinion Score (MOS) of your network, or can you do it on your own? The answer is yes, you can, and you should if you are a large enterprise. Many tools are available, and as I like my blog to be totally vendor independent I do not want to sell anybody’s product… but I can tell you that out of all of those only one product really stands out, and it has been on the market for a long time. If you are choosing a company to conduct the testing, you can also have that company take care of the QoS implementation and provide you with a complete solution, so that at the end you do not have to go through finger pointing.
Many VoIP companies will not implement a solution until you actually test your network or sign a waiver form. They have learned that most issues in VoIP are due to infrastructure problems and that the maintenance cost is too high, and do not forget customer satisfaction.
As I am trying to pinpoint some key areas, I would like to tell you that even if you have a great MOS in the report, you still have to think about other things, as simple as music on hold (MOH). If you are using G.729 compression, it is aimed at voice communication and not at the variety found in musical sounds.
The average person talks about 40% of the time while on the telephone (of course there are people who take 80% of the talk time in a given conversation). So people thought about introducing a feature called silence suppression, which disables the sending of packets that are empty: much of the time one party is talking and the other is listening, and since the listener is not talking there is no need to send packets at that time. On the other hand, silence suppression may cause the beginnings of words to get clipped.
You need to choose the correct codec, packetization interval and silence suppression settings. To make it clear, the G.711 codec generates about 1280 bits of data in a 20 ms sample period. The G.729 codec generates only about 160 bits in the same period. This variation is pretty large: the G.711 codec literally generates eight times as much data to send the same audio information.
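The arithmetic behind those two numbers is just codec rate times sample period. A small sketch, assuming the nominal rates of 64 kbps for G.711 and 8 kbps for G.729 (the function names are made up for this example):

```python
def payload_bits(codec_kbps: int, sample_ms: int) -> int:
    """Bits of encoded audio produced in one sample period.

    kbit/s multiplied by ms gives bits, because the factors of 1000 cancel.
    """
    return codec_kbps * sample_ms


def packets_per_second(sample_ms: int) -> float:
    """How many packets per second one direction of a call generates."""
    return 1000 / sample_ms


if __name__ == "__main__":
    g711 = payload_bits(64, 20)   # 1280 bits
    g729 = payload_bits(8, 20)    # 160 bits
    print(g711, g729, g711 // g729, packets_per_second(20))
```

Changing the packetization interval changes both numbers at once: a longer interval means more bits per packet but fewer packets per second, so the per-packet header overhead is spread over more audio.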
I know you must be thinking: why do I have to go through this pain? I am so happy with my current phone system… oh, do not forget that the person who maintains it is retiring soon…
Baselining the network is the most important thing. Most people think they know the network they have been working on for years, but they may be totally wrong. Before making a decision to deploy a converged network it is important to understand your network and give it a final health check, and that needs to happen before you actually start calling anybody to do your convergence readiness testing (we will discuss that later).
Baselining is not estimating. When I say that you do not really know your network, I mean that you have to take a real cross section of your network and find out its actual state. You need to start looking at traffic reports; if you do not have traffic reporting software, maybe it is time to get some, and you can find a lot of it on the web.
Let me tell you one fact about voice: you have to do it right the first time, or it will leave a taste that your users will never forget, and the cost is too high. Imagine having bad food along with bad service.
Once the baselining is complete, you need to start designing your infrastructure for voice. If you already know your weak areas, start working on them first. Do not ignore simple things like upgrading your network (not so simple) to your vendor’s current version, so that support is available when you need it and you are not subjected to a ping-pong game during an actual issue over something that may have no relationship to the problem in hand; that is what the real world is all about. So here is what you will be doing: looking into your configuration, allocating ports for VoIP servers and telephone sets, and allocating different VLANs for voice and data to make troubleshooting easier. You may need to consider a major change in your Layer 2 switching network by introducing PoE switches.
For all the network managers: it may be time to allocate dedicated resources to this project and to cross-train your staff from the initial stage, so that they gain more troubleshooting knowledge for the future.
I cannot put enough emphasis on the fact that you need a clean network before you start putting VoIP applications on it. Start by looking at how your ports are configured on the switches (auto-negotiation and duplex settings; when you connect any device, make a habit of checking the switch after every connection to make sure that you have the correct settings).
As you are now discussing your VoIP deployment with your vendor, do not forget to ask about codecs and zoning, and which zone will use which codec, or whatever term they use. In a nutshell, you need to determine at this stage which codec you will be running on the WAN: is it G.729 (compressed), or do you have enough bandwidth and want to run G.711?
Once you know which codec you will be running, it is time to establish the call flow with respect to the design of your network, as the new VoIP design should also complement your business flow. Why is that important? Because you are going to decide how many simultaneous calls you will route over your network during peak hours, and what you will do with calls over that limit (are you going to give them busy treatment or route them over the PSTN cloud?).
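That decision boils down to a very small rule. The sketch below is hypothetical (the function name, the return labels and the limit of 13 calls are placeholders, not any vendor’s call admission feature), but it shows the two treatments for a call that arrives when the WAN is already at its limit:

```python
def route_call(active_calls: int, max_wan_calls: int, pstn_overflow: bool) -> str:
    """Decide what to do with a new call, given how many are already on the WAN.

    Returns "wan" while there is room, otherwise "pstn" if overflow to the
    PSTN is allowed, or "busy" if the caller gets busy treatment.
    """
    if active_calls < max_wan_calls:
        return "wan"
    return "pstn" if pstn_overflow else "busy"


if __name__ == "__main__":
    # Hypothetical limit of 13 simultaneous calls on the link.
    for active in (5, 13):
        print(active,
              route_call(active, 13, pstn_overflow=True),
              route_call(active, 13, pstn_overflow=False))
```

Which branch you want for the over-limit case is a business decision, not a technical one, which is why the call flow has to be agreed before the network is designed.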
For example, you have a T1 circuit between HQ and a branch, your internal traffic routes calls out to the PSTN from HQ, and there is no PSTN backup at the remote nodes. Before you even start looking at expected simultaneous calls you have to know what protocol you are running. It may be a point-to-point network on which your network admin has configured frame relay for fun’s sake, or you may have recently taken an MPLS service or be planning to. Please do ask what kind of link you have at the last mile (Metro Ethernet, Frame Relay, ATM… you will be surprised to learn what is actually packaged for you).
Why is all that important? Because you have to add the protocol overhead to your codec rate to determine how much space each call will take in your T1 pipe. (It does not matter whether you have a T1, a T3 or a Gig; you still need to know.)
Just for calculation’s sake, the G.729 8 kbps codec can use 13 kbps to 25 kbps with all the protocol overhead, and G.711 at 64 kbps can range from 80 kbps to 106 kbps. So this is not just marketing or the voice guy trying to tell you about codecs. Now you have an idea of how many simultaneous calls you are going to route (expected calls, projected calls, whatever you want to call them). That is the number you will now use, together with the codec, to calculate what your circuit can take… and you will be surprised that you cannot do more than 13 calls on a T1 circuit that was supposed to be 24 channels of 64 kbps.
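Here is that calculation as a sketch, using the per-call ranges quoted above and taking the T1 as 24 channels of 64 kbps, i.e. 1536 kbps. It assumes the whole pipe is available for voice, which it rarely is; once you leave headroom for signaling and data, the G.711 worst case lands in the neighborhood of the 13 calls mentioned above.

```python
T1_KBPS = 24 * 64  # 24 channels of 64 kbps = 1536 kbps


def calls_per_circuit(circuit_kbps: int, per_call_kbps: int) -> int:
    """Whole number of simultaneous calls that fit, with no headroom reserved."""
    if per_call_kbps <= 0:
        raise ValueError("per-call bandwidth must be positive")
    return circuit_kbps // per_call_kbps


if __name__ == "__main__":
    # Per-call bandwidth ranges with protocol overhead, as quoted in the text.
    for codec, low, high in (("G.729", 13, 25), ("G.711", 80, 106)):
        best = calls_per_circuit(T1_KBPS, low)
        worst = calls_per_circuit(T1_KBPS, high)
        print(f"{codec}: between {worst} and {best} calls on a T1")
```

The spread between best and worst case is the reason the last-mile question matters: the same codec on the same T1 can differ by several calls depending on which layer-2 overhead you are actually paying.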
But do wait, as you can find all that out by conducting readiness testing, and we will talk about that… stay tuned.
QoS, in the domain of computer networking, can be defined in many ways, but put simply it provides traffic management in the networking arena. To understand QoS you need some basic concepts: bandwidth and throughput, along with delay, jitter, packet loss and availability. Bandwidth is a reference to a data rate. If we say that a particular network connection has 10 megabits per second (Mbps) of bandwidth, we are saying that data can be passed across the connection at a rate of 10 million bits per second. Stated differently, 10 Mbps represents the rate of data flow. Throughput, on the other hand, is the actual amount of user data that can make use of the available bandwidth. For example, a wireless connection with a 54 Mbps data rate may have a throughput of only 22 Mbps.
Also, looking at the formula: Throughput = (Bandwidth - (Bandwidth x Overhead)) x Efficiency
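A quick worked example of the formula, assuming the grouping is "take the overhead off the bandwidth first, then apply the efficiency to what is left". The overhead and efficiency values below are made up to show the arithmetic; they are not measurements of any real wireless link.

```python
def throughput(bandwidth_mbps: float, overhead: float, efficiency: float) -> float:
    """Throughput = (Bandwidth - Bandwidth x Overhead) x Efficiency.

    `overhead` and `efficiency` are fractions between 0 and 1.
    """
    for name, value in (("overhead", overhead), ("efficiency", efficiency)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return (bandwidth_mbps - bandwidth_mbps * overhead) * efficiency


if __name__ == "__main__":
    # Hypothetical figures: half the data rate lost to overhead, 80% efficiency.
    print(round(throughput(54, 0.5, 0.8), 1))
```

With those illustrative values a 54 Mbps data rate comes out a little under 22 Mbps, the same order as the wireless example.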
Delay is a measurement of the time required to move data from one point to another across the network, and it needs to be measured end to end. So if you are calculating delay for a VoIP call, it has to be from the sending phone to the receiving phone; you will also hear this loosely called propagation delay. So when you are conducting any testing, the placement of your test endpoints is very important. Getting it wrong is normally the biggest mistake made by testing companies, and the results will differ when you actually run VoIP.
VoIP delay needs to be looked at as encoding delay, packetizing delay and network delay. Let’s talk about these three areas. When you refer to encoding delay you are talking about encoding voice or video with a codec (e.g. G.711 audio). To add to this, the G.729 codec uses compressed voice; it still sounds fine, but it has more encoding delay and less network delay. If you are now confused about whether G.729 should be used, let me make you comfortable: in reality the reduction in network delay is greater than the increase in encoding delay, so it does work.
Another important aspect is understanding jitter: the variation in the time it takes a VoIP packet to travel from one endpoint to another. This variation will always be present in networks. The issue with jitter is that calls drop when the variation increases and the jitter buffer at the receiving end cannot absorb it. It is also important to understand the capacity of the jitter buffer, which is normally set between 60 and 100 ms. (Check it if you are dropping a lot of calls.)
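One way to put a number on jitter is the running estimate that RTP receivers report, which smooths the change in transit time between consecutive packets with a gain of 1/16. That estimator comes from the RTP specification, not from this post, so treat the sketch below as an illustration; the timestamps are invented, and comparing the raw delay spread against the buffer depth is a simplification of what a real jitter buffer does.

```python
from typing import List, Tuple

# (send time, receive time) in milliseconds for each packet, in order.
Packet = Tuple[float, float]


def interarrival_jitter(packets: List[Packet]) -> float:
    """Smoothed jitter estimate: J += (|D| - J) / 16 for each packet pair."""
    jitter = 0.0
    for (prev_sent, prev_recv), (sent, recv) in zip(packets, packets[1:]):
        # Change in transit time between two consecutive packets.
        delta = (recv - prev_recv) - (sent - prev_sent)
        jitter += (abs(delta) - jitter) / 16
    return jitter


def delay_spread(packets: List[Packet]) -> float:
    """Difference between the slowest and the fastest transit time."""
    transits = [recv - sent for sent, recv in packets]
    return max(transits) - min(transits)


def buffer_can_absorb(packets: List[Packet], buffer_ms: float) -> bool:
    """Rough check: does the delay spread fit within the jitter buffer depth?"""
    return delay_spread(packets) <= buffer_ms


if __name__ == "__main__":
    # Invented trace: 20 ms packet spacing, third packet arrives 16 ms late.
    trace = [(0, 10), (20, 30), (40, 66)]
    print(interarrival_jitter(trace), delay_spread(trace),
          buffer_can_absorb(trace, buffer_ms=60))
```

Running this over a capture taken between the real endpoints tells you whether a 60 to 100 ms buffer has any chance of coping before users start complaining.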
Packet loss, or simply loss, is the term used to describe packets that are missing from a stream. Ignoring packet loss up to a certain threshold may be one measure you can take to address the issue, along with using codecs with a higher compression rate or, as a last resort, upgrading your network. (Upgrading is not a solution for some, and it is not always good practice; it is better to use the current infrastructure more effectively by implementing QoS, and again, that will not resolve all your problems.)
We like to have our networks available all the time, but we do have to face real-world challenges. From a VoIP perspective, availability has a major impact, so alternative paths and routing are required. You may be able to live without the Internet for a few hours (can you?), but losing voice can affect your business.
You will see more topics on QoS and VoIP readiness…