Autoscaling WebRTC apps is not at all easy. Many discussions about building large-scale WebRTC apps get stuck on how to scale, and there are no straightforward answers to this question yet. For this reason, we at CentEdge have developed CWLB, a general-purpose WebRTC load balancer with mediasoup as the media server at its core.
When a customer approaches us for help building a WebRTC app, the conversation goes something like this.
Customer: We want to integrate video conferencing capabilities in our existing web app.
Us: Sure. We can help you with that.
Customer: Our requirement is to have conferencing rooms of up to 15 people, with recording capabilities.
Us: Sure. We can help you with that as well.
Customer: We want the solution to be super scalable so that even a million rooms can be started at the same time. We have our own data centre and you can run Kubernetes clusters there. We hope this will be fine for scaling requirements.
Us: No. Kubernetes clusters may not be sufficient to scale WebRTC apps, given their stateful nature. Also, memory and CPU usage may not be the right indicators of server load in this case.
Customer: What do you mean by stateful nature? Why are memory and CPU usage not the right indicators?
Us:....
The conversation goes on as we help our customers understand the nature of WebRTC calls and the media server parameters that correctly indicate the current load. Towards the end, the question of how the scaling problem should be solved used to remain open for further discussion, as we did not have a ready-made answer.
After going through similar conversations several times, we decided to do something about it. On 1st June 2021 we started working on a general-purpose load balancer to autoscale WebRTC apps, and after a year of considerable effort we have successfully developed it. We call it CWLB, which stands for Centedge WebRTC Load Balancer. CWLB supports both horizontal and vertical autoscaling. Mediasoup is the media server used behind the load balancer, and CWLB currently supports AWS as the cloud provider for creating and deleting mediasoup media servers on demand.
Before discussing CWLB further, we will elaborate on some of the keywords mentioned in the paragraph above to give a better understanding of the context.
Why Autoscaling?
The first important question is: why does one need autoscaling? Because one needs more simultaneous video rooms than a single media server can handle. Let's look at an example.
A c5.2xlarge instance on AWS (8 vCPUs and 16 GB RAM) can handle either one large 50-person conference room or 10 small 5-person conference rooms. Once the server is at full load, it can't serve any new room creation requests until some of the rooms running on it are closed. One option is to keep multiple servers running simultaneously to handle more load, irrespective of whether new room requests are coming in or not. This is a huge waste of resources, as one has to pay the server bills while the servers sit idle most of the time.
There may also be cases where extra servers are required only at specific times rather than all the time. Here, one has to manually create new servers just before they are needed and set up a mechanism to route new room creation requests to them. Once the need is over, the servers have to be shut down manually again. This is still workable if the demand for video rooms is predictable, as one gets time to create new servers, but it is nearly impossible if room creation times are highly unpredictable. An example of a predictable load is a church prayer service that happens every day at the same time, or a scheduled board meeting that happens every week or every month on a specific date and time. These kinds of services give one ample time to create new servers to meet the prescheduled demand. An example of an unpredictable load is a teaching-learning app where any teacher can log in at any time to start a room. In this case, room creation requests are so random that one won't get any time to create new servers. Therefore, it is practically impossible to scale manually under an unpredictable load.
To solve these problems, a load balancer is placed in front of the media servers. Its job is to distribute the incoming load among the available servers based on a predefined algorithm. If no media server has capacity left, it creates new ones. If some media servers are idle, it deletes them so that valuable resources are saved. A load balancer is a must for unpredictable load scenarios. It is also good to have for predictable load scenarios, because it saves a lot of manual effort while minimizing the chance of manual errors.
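As a rough back-of-the-envelope sketch of the capacity limit described above, the number of servers needed for a given number of rooms can be estimated as follows. The function name and the simplification that capacity is a fixed "rooms per server" figure are assumptions made for illustration; real capacity depends on codecs, bitrates and room layouts.

```typescript
// Hypothetical helper: estimate how many media servers are needed if each
// server can host a fixed number of rooms of a given size.
function serversNeeded(rooms: number, roomsPerServer: number): number {
  if (roomsPerServer <= 0) {
    throw new RangeError("roomsPerServer must be positive");
  }
  // A partially filled server still has to be paid for, so round up.
  return Math.ceil(rooms / roomsPerServer);
}

// Using the figure quoted above (10 small 5-person rooms per server):
const forHundredSmallRooms = serversNeeded(100, 10); // 10 servers
const forOneMoreRoom = serversNeeded(101, 10);       // 11 servers
```

The jump from 10 to 11 servers for a single extra room is the kind of step that makes always-on provisioning wasteful when demand fluctuates.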
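The three duties of a load balancer described above (distribute, create, delete) can be sketched roughly as below. This is a minimal illustration and not CWLB's actual algorithm: the `MediaServer` shape, the room-count capacity model and the least-loaded selection rule are all assumptions.

```typescript
// Hypothetical view of a media server as seen by the load balancer.
interface MediaServer {
  id: string;
  activeRooms: number; // rooms currently running on this server
  maxRooms: number;    // assumed capacity, e.g. 10 small rooms
}

type Decision =
  | { kind: "route"; serverId: string } // send the new room to this server
  | { kind: "scale-up" };               // no capacity left, create a server

// Distribute: pick the least-loaded server that still has spare capacity.
// Create: if every server is full, ask for a new one instead.
function placeRoom(servers: MediaServer[]): Decision {
  const candidates = servers.filter((s) => s.activeRooms < s.maxRooms);
  if (candidates.length === 0) return { kind: "scale-up" };
  const target = candidates.reduce((a, b) =>
    a.activeRooms / a.maxRooms <= b.activeRooms / b.maxRooms ? a : b
  );
  return { kind: "route", serverId: target.id };
}

// Delete: list idle servers that can be removed, while always keeping
// at least `minServers` servers alive to absorb the next request.
function serversToDelete(servers: MediaServer[], minServers = 1): string[] {
  const idle = servers.filter((s) => s.activeRooms === 0);
  const removable = Math.max(0, servers.length - minServers);
  return idle.slice(0, removable).map((s) => s.id);
}
```

In practice the create and delete steps would call the cloud provider's API and take some time to complete, so a real implementation would also need to account for servers that are still starting up.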
Is Autoscaling mandatory?
No. It is not mandatory for all kinds of WebRTC applications. When an application has a finite and predictable load, autoscaling may not be needed.
Example:
Consider a small school with 100 students in 5 grades, which makes approximately 20 students in each grade. In this case, an 8 vCPU, 16 GB RAM server running 24x7 should be economical as well as sufficient to handle the peak load of all 5 grades running their classrooms simultaneously. For this kind of use case, adding a load balancer will add a lot of complexity and cost rather than reducing them.
Why mediasoup?
Because mediasoup is one of the most capable media servers available today, with high-performance metrics. It has many cutting-edge features, such as:
Simulcast & SVC
Congestion control
Multi-stream (ability to send multiple streams over a single peer connection)
Sender & Receiver side bandwidth estimation
A tiny Node.js module for easy integration with existing large Node.js applications
Super low-level APIs that provide fine-grained control over media stream flows
Features like ICE restarts and prioritization that provide application flexibility
We have used the majority of the capabilities provided by mediasoup in our load balancer, so that customers building their super scalable applications on top of it have enough flexibility.
Why AWS?
Because AWS is the leading cloud provider today, used by many enterprises and startups as well as by individuals for hobby projects. It also has best-in-class uptime and the trust of its users, along with elaborate, easy-to-follow documentation for developer adoption. In addition, the critical APIs our load balancer uses to scale media servers are stable and change infrequently. For all of these reasons, we chose AWS as the first cloud provider for CWLB. We plan to eventually support all leading cloud providers, including Microsoft Azure, Google Cloud, Oracle Cloud, DigitalOcean, OVHcloud, etc., once our AWS offering is complete and stable.
There are 4 possible strategies for autoscaling a WebRTC application:
Horizontal scaling
Vertical scaling
Hybrid scaling
Hybrid+ scaling
Horizontal Scaling
This is the suitable mode of scaling if your use case needs small meeting rooms of 2-5 users each, but many such rooms running simultaneously.
A good example is a video contact center where 100+ customer support agents take daily calls from customers. A call is primarily one-to-one between the agent and the customer, until the agent's supervisor and/or manager decide to join. In this case, there will be a maximum of 4 users in a conferencing room at any point in time, but there will be 100+ or even 500+ such rooms running at once.
Here, a horizontal load balancer can be used to move new load from the first media server to a second media server as soon as the load on the first server reaches its peak. The load balancer keeps track of real-time usage and releases resources whenever the load on the first server drops. This way, the load balancer can upscale or downscale media server resources based on the real-time load.
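The spill-over behaviour described above, where the second server only receives rooms once the first has reached its peak, resembles a fill-first placement policy. The sketch below is one possible reading of it. The `ServerLoad` shape and the use of participant count as the load measure are assumptions, not CWLB's real load indicators.

```typescript
// Hypothetical load snapshot for one media server.
interface ServerLoad {
  id: string;
  participants: number;    // current participants across all rooms
  maxParticipants: number; // assumed peak, e.g. 50 for the server above
}

// Fill-first placement: walk the servers in order and use the first one
// that can still fit the whole room. A return value of null means every
// server is at its peak and a new one has to be created.
function pickServerForRoom(
  servers: ServerLoad[],
  roomSize: number
): string | null {
  for (const s of servers) {
    if (s.participants + roomSize <= s.maxParticipants) return s.id;
  }
  return null;
}
```

Filling servers in order tends to leave the later servers empty when load drops, which makes them easy candidates for release.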
Vertical Scaling
This is the suitable mode of scaling if your use case needs larger meeting rooms of 20-60 users each, but only a small number of such rooms running simultaneously.
A good example is a school or educational institution where only 10 teachers conduct daily sessions for their respective classes. In this case, although each session has a relatively large number of students, a maximum of 10 such rooms, one per teacher, need to run at any point in time.
Here, a vertical load balancer can be used to move new load from the first core of the media server to the other available cores as soon as the load on the first core reaches its peak. Although a single media server may be sufficient for the whole school, distributing load effectively across all of its available cores is key to getting the desired output from it. The load balancer's job is to keep track of real-time usage and release resources whenever the load on an individual core of the media server drops.
The two load balancing strategies described here are the two basic forms of media server load balancing in WebRTC. The other two approaches are advanced use cases which need more advanced load balancing with fine-grained control. They are described in the second part of this blog series. The link to the 2nd part of the post is here.
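The per-core bookkeeping described above can be sketched as a small allocator that tracks load per core, places each new room on the first core that stays within its peak, and frees the capacity when the room closes. With mediasoup, a core would typically correspond to one worker process. The class name, the participant-based load measure and the per-core peak are all assumptions made for illustration.

```typescript
// Hypothetical per-core allocator for a single media server.
class CoreAllocator {
  private load: number[]; // participants currently assigned to each core
  private peakPerCore: number;
  private rooms = new Map<string, { core: number; size: number }>();

  constructor(coreCount: number, peakPerCore: number) {
    this.load = new Array<number>(coreCount).fill(0);
    this.peakPerCore = peakPerCore;
  }

  // Place a room on the first core that stays at or below its peak.
  // Returns the core index, or null when every core is at peak.
  openRoom(roomId: string, size: number): number | null {
    const core = this.load.findIndex((used) => used + size <= this.peakPerCore);
    if (core === -1) return null;
    this.load[core] += size;
    this.rooms.set(roomId, { core, size });
    return core;
  }

  // Release the capacity a room was using once it closes.
  closeRoom(roomId: string): void {
    const entry = this.rooms.get(roomId);
    if (!entry) return;
    this.load[entry.core] -= entry.size;
    this.rooms.delete(roomId);
  }

  // Snapshot of the current load on each core.
  usage(): number[] {
    return [...this.load];
  }
}
```

This sketch keeps a room on a single core for its whole lifetime. Rooms that outgrow one core need a more advanced strategy than the basic one shown here.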
CWLB
Introducing CWLB (Centedge WebRTC Load Balancer), a general-purpose WebRTC load balancer designed with mediasoup as the media server. It has been built from scratch for enterprises that, for their own reasons, don't want to use a video API vendor but do want dependable managed video infra with a dedicated support team, along with the possibility of customizing even the core media flows.
Features
Mediasoup as the media server
AWS/DigitalOcean as the cloud provider
Hybrid+ scaling
Highly flexible yet resource-efficient
An advanced load distribution algorithm with 85% efficiency (approx.)
Note: As of the CWLB v2 release, the efficiency of CWLB is approximately 85%. Our goal is to reach >90% efficiency by the v3 release of CWLB.
We now also have a production-grade, scalable in-house video conferencing solution named Meetnow, built on top of CWLB. It has been designed to truly unify your organization's external and internal communication in today's remote-first world. Some of its unique features are rooms of 2-100 users with one-to-one, conferencing and event modes; complete meeting and attendance analytics; and, last but not least, paying only for real usage without any monthly or yearly commitment until you are sure about switching to our Enterprise plan.
If you have a mediasoup-based open-source project like mediasoup demo or edumeet which currently works great but does not autoscale, then this is for you. If you have a BBB (BigBlueButton) or Jitsi implementation currently in production which does not autoscale, then this is for you. If you have any other open-source or custom-built video implementation in production which doesn't autoscale, then this is for you. Even if your current production video setup is working fine but you may need something like this in the near future, or you are just curious to know more about CWLB, feel free to drop us a note at hello@centedge.io / sp@centedge.io to find out how we can help you. If you wish to schedule a free 30-minute discussion of your use case with one of our senior/principal consultants, feel free to do so using this link.