Autoscaling WebRTC with mediasoup, CWLB 2.0 now ready
Updated: Mar 6
This is our first post of 2023, so we wish everybody a wonderful new year.
If you are here, you are most probably facing scaling issues with your WebRTC application, or you are exploring options for building a production grade WebRTC app in the future. In both cases, you are at the right place. This post is a continuation of the previous post we wrote on this topic a couple of months ago. The previous post described in detail when auto-scaling is necessary and when it is not. If you are not sure whether your solution needs WebRTC autoscaling, you should read the previous post here before reading further.
In the last post we discussed horizontal or vertical scaling as strategic options for scaling mediasoup media servers, depending on the use case. In this post, we are going to discuss a third way of auto-scaling and its use case. We are also going to discuss interesting new enhancements to CWLB.
The third WebRTC scaling strategy
The third approach combines vertical and horizontal scaling into one. It can be called a hybrid scaling approach. Here the vertical scaling approach is used first, to scale one room across all available cores of a mediasoup instance when needed. Once this mediasoup instance is fully occupied and the same room still needs more resources, horizontal scaling is used to extend the room to a different mediasoup instance located on a separate host. The new server then handles all further resource allocation requests for that room, again following the vertical scaling strategy, until the first server has free resources to spare. This hybrid approach is typically useful for very large rooms, such as large event rooms, where the load-balancer needs to cater to 100s / 1000s of concurrent users in one room in a completely just in time resource request mode.
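The hybrid placement order described above can be sketched in a few lines of Python. This is only an illustration under our own assumptions, not the actual CWLB code: the MediaServer class, the capacity_per_core limit and the allocate function are hypothetical names, and we assume one mediasoup worker per core with a fixed number of slots per worker.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class MediaServer:
    """Hypothetical model of one mediasoup host, one worker per core."""
    name: str
    cores: int
    capacity_per_core: int  # assumed slot limit of a single worker
    load: List[int] = field(default_factory=list)

    def __post_init__(self) -> None:
        # One usage counter per core, all starting empty.
        self.load = [0] * self.cores

    def free_core(self) -> Optional[int]:
        """Return the first core that still has a free slot, if any."""
        for core, used in enumerate(self.load):
            if used < self.capacity_per_core:
                return core
        return None


def allocate(servers: List[MediaServer]) -> Optional[Tuple[str, int]]:
    """Place one resource request for a room.

    Vertical first: fill the cores of the earliest server in the list.
    Horizontal next: move to the following server only when every core
    of the earlier ones is full.
    """
    for server in servers:
        core = server.free_core()
        if core is not None:
            server.load[core] += 1
            return server.name, core
    return None  # every server is full


if __name__ == "__main__":
    hosts = [MediaServer("ms-1", cores=2, capacity_per_core=2),
             MediaServer("ms-2", cores=2, capacity_per_core=2)]
    for _ in range(5):
        print(allocate(hosts))
```

Because the servers are always scanned in the same order, a request lands on the first server again as soon as it has a free slot, which matches the "until the first server has free resources to spare" behaviour.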
Let's look at the two key terms mentioned in the above paragraph.
Resource request: A request made to the media server to allocate some resources to a user so that the user can send / receive audio / video / screen-share media streams.
Just in time request: This load-balancer strategy is used when the load-balancer has no prior information about the size of the rooms and therefore cannot pre-allocate and reserve resources. Here the load-balancer has to work really hard to keep track of the real time resource usage of each media server and allocate / free resources in real time as users join / leave a room. This type of implementation is relatively complex compared to a pre-allocation and reservation based load-balancing strategy.
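To make the just in time idea concrete, here is a minimal Python sketch of the bookkeeping involved. It is an assumption-laden illustration rather than the CWLB implementation: the JitBalancer class, its join / leave methods and the per-server user limits are all hypothetical.

```python
from typing import Dict, Optional


class JitBalancer:
    """Hypothetical just in time tracker: nothing is reserved up front,
    resources are counted only while a user is actually in a room."""

    def __init__(self, capacity: Dict[str, int]) -> None:
        self.capacity = dict(capacity)               # server -> assumed max users
        self.used = {name: 0 for name in capacity}   # server -> live user count
        self.placement: Dict[str, str] = {}          # user id -> server

    def join(self, user_id: str) -> Optional[str]:
        """Allocate a slot at the moment the user joins."""
        if user_id in self.placement:
            return self.placement[user_id]
        for name, limit in self.capacity.items():
            if self.used[name] < limit:
                self.used[name] += 1
                self.placement[user_id] = name
                return name
        return None  # no capacity left; a real balancer would add a server here

    def leave(self, user_id: str) -> None:
        """Free the slot as soon as the user leaves."""
        name = self.placement.pop(user_id, None)
        if name is not None:
            self.used[name] -= 1


if __name__ == "__main__":
    lb = JitBalancer({"ms-1": 2, "ms-2": 2})
    for user in ("u1", "u2", "u3"):
        print(user, "->", lb.join(user))
    lb.leave("u1")
    print("u4 ->", lb.join("u4"))
    print(lb.used)
```

A reservation based balancer would instead set aside capacity for the whole room when it is created, which is simpler to track but wastes resources when the room stays smaller than expected.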
The Hybrid+ WebRTC scaling strategy
The hybrid+ scaling strategy includes everything that the hybrid scaling strategy has. In addition, it has some other important aspects which make it a really good choice for medium / large scale deployments.
An additional relay server between the client and the media server, which makes the media server completely stateless, i.e. the media server does not contain any kind of business logic.
Capable of creating / destroying media servers on demand using the APIs of cloud providers in a completely automated manner, with minimal manual intervention.
Capable of utilizing advanced techniques like media server cascading to keep latency to a minimum while catering to a global user base. Media servers in different geographic locations need to run simultaneously to enable media server cascading.
Capable of an HA (High Availability) setup where standby media servers can take up the load when primary media servers fail while in use. Additional standby media servers need to run to ensure HA.
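The on demand creation / destruction of media servers and the standby servers for HA both come down to one decision: how many servers should be running right now. The Python sketch below shows one simple way to express that decision. The function names, the users_per_server capacity figure and the default of one standby server are our own assumptions for illustration. The actual calls to a cloud provider's API are deliberately left out.

```python
import math


def desired_server_count(active_users: int, users_per_server: int,
                         standby: int = 1) -> int:
    """Servers needed for the current load, plus idle standbys for HA.

    users_per_server is an assumed capacity figure, not a measured one.
    """
    if users_per_server <= 0:
        raise ValueError("users_per_server must be positive")
    needed = math.ceil(active_users / users_per_server)
    return needed + standby


def scaling_delta(running: int, desired: int) -> int:
    """Positive: servers to create. Negative: servers to destroy. Zero: no change."""
    return desired - running


if __name__ == "__main__":
    desired = desired_server_count(active_users=1000, users_per_server=300)
    print("desired:", desired, "delta:", scaling_delta(running=3, desired=desired))
```

In a real deployment the delta would be passed to whatever code creates or destroys instances on the selected cloud, and the calculation would be repeated as users join and leave.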
CWLB 1.0, which was released in June 2022, had vertical scaling and horizontal scaling, and used AWS EC2 instances for auto-scaling of media servers. This was good enough for small and medium use cases. But for large and very large use cases like large scale events, it had 2 disadvantages. The first was that the load-balancer used to take up more media server resources than it ideally should have, and the second was the data transfer cost each room incurred while using AWS EC2 instances.
In CWLB 2.0, we have now addressed these 2 points along with many other improvements.
First, the core load-balancer algorithm is now fully JIT request compatible. This means it now uses media server resources very efficiently by keeping track of each media server's usage in real time and allocating / de-allocating resources based on real time user resource consumption demands. It now has all strategies enabled, i.e. vertical scaling, horizontal scaling, and a mix of both, aka hybrid scaling.
Second, we have integrated another cloud provider, DigitalOcean, into the load-balancer, which has relatively lower data transfer costs than AWS EC2. Let's take an edtech use case as an example and compare the data transfer costs between AWS EC2 and DigitalOcean, so that you can understand why this is important.
A maths tutoring company in India runs online maths tutoring classes for high school students. Here each maths teacher teaches high school maths to 1000 students in one online session. They conduct 6 such sessions every day, 6 days a week, with each session lasting 90 minutes. Let's try to calculate an approximate data transfer cost for a month. We will use a few assumptions to keep the example realistic.
Let's calculate the amount of data being transferred from the media servers in the cloud to the students who have joined the class.
The teacher is speaking while sharing either his / her camera or screen for the whole of the class time, i.e. 90 minutes.
Let's assume that the audio consumes 40Kb/second and the video / screen share consumes 500Kb/second of internet bandwidth. So each student consumes 540Kb/second of data.
Here is what the maths looks like.
540 * 60 * 90 = 2,916,000 Kb, or about 2.78Gb after dividing by 1024 twice, is what one student consumes over the whole 90 minute session.
If there are 1000 students in that session, the total data consumption for that session would be 2780.9 Gb or 2.71 Tb.
If 6 such sessions happen each day, then the data transfer amount for each day would be 16.29 Tb.
Considering that these sessions happen 6 days a week, the data transfer amount for the week would be 97.76 Tb.
Considering 4 weeks in a month, the data transfer amount for the whole month would be 391.06Tb. That's a lot of data being transferred!
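For readers who want to plug in their own numbers, the same calculation can be written as a short Python script. It is a sketch that uses exactly the assumptions stated above (40 + 500 Kb/second per student, 90 minute sessions, 1000 students, 6 sessions a day, 6 days a week, 4 weeks a month) and the same 1024 based unit conversion. The variable names are ours. Since the script keeps full precision, its results can differ from the figures above in the last decimal place.

```python
# Assumptions taken from the example above.
AUDIO_KB_PER_SECOND = 40
VIDEO_KB_PER_SECOND = 500
SESSION_MINUTES = 90
STUDENTS_PER_SESSION = 1000
SESSIONS_PER_DAY = 6
DAYS_PER_WEEK = 6
WEEKS_PER_MONTH = 4

# Data received by one student during one session.
per_student_kb = (AUDIO_KB_PER_SECOND + VIDEO_KB_PER_SECOND) * 60 * SESSION_MINUTES
per_student_gb = per_student_kb / 1024 / 1024

# Scale up to a session, a day, a week and a month.
session_gb = per_student_gb * STUDENTS_PER_SESSION
session_tb = session_gb / 1024
day_tb = session_tb * SESSIONS_PER_DAY
week_tb = day_tb * DAYS_PER_WEEK
month_tb = week_tb * WEEKS_PER_MONTH

print(f"per student : {per_student_gb:.2f} Gb")
print(f"per session : {session_gb:.1f} Gb = {session_tb:.2f} Tb")
print(f"per day     : {day_tb:.2f} Tb")
print(f"per week    : {week_tb:.2f} Tb")
print(f"per month   : {month_tb:.2f} Tb")
```

Changing any of the constants at the top, for example the number of students or the video bitrate, immediately shows how sensitive the monthly total is to that assumption.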
Now let's look at the cost. AWS EC2 charges $0.08/Gb for outbound data transfers from AWS EC2 to the public internet. This essentially means AWS doesn't charge for the teacher sending his / her audio and video streams to the media server, but it does charge for the students who receive the audio and video streams relayed by the media server hosted in AWS EC2.
The maths looks like this.
391.06 * 1024 * 0.08 = $32,036
Here the amount of data consumed per month in Tb is converted to Gb by multiplying by 1024, and then multiplied by the AWS data transfer cost per Gb. This is the cost of the data transfer only and doesn't include the cost of running AWS EC2 instances for the media servers. That cost will be added on top of this, based on actual usage.
Now let's look at the maths for running the same number of maths tutoring sessions with media servers running on DigitalOcean.
There will be no change in the total amount of data transferred, which is 391.06Tb. Only the rate changes, to $0.01/Gb.
The maths will look like this.
391.06 * 1024 * 0.01 = $4004
This cost will come down further, as there is free data transfer bundled with each DigitalOcean droplet. For example, a 4 vCPU, 8GB CPU optimised instance comes with 5TB of free data transfer per month. With DigitalOcean, we can consider the final cost to be in the range of $3200 ~ $3500.
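The two cost calculations can be compared side by side with a small Python helper. This is a sketch that uses the rates quoted in this post ($0.08/Gb and $0.01/Gb) and the 391.06Tb monthly total. The transfer_cost function and the example of 10 droplets with 5TB of free transfer each are our own assumptions, chosen only to show how the bundled allowance pulls the DigitalOcean bill towards the $3200 ~ $3500 range. We also assume here that the allowances of all droplets simply add up.

```python
MONTHLY_TRANSFER_TB = 391.06   # monthly total from the calculation above
AWS_PRICE_PER_GB = 0.08        # rate quoted in this post
DO_PRICE_PER_GB = 0.01         # rate quoted in this post


def transfer_cost(total_tb: float, price_per_gb: float, free_tb: float = 0.0) -> float:
    """Monthly data transfer cost in dollars, after subtracting free transfer."""
    billable_tb = max(total_tb - free_tb, 0.0)
    return billable_tb * 1024 * price_per_gb


aws_cost = transfer_cost(MONTHLY_TRANSFER_TB, AWS_PRICE_PER_GB)
do_cost = transfer_cost(MONTHLY_TRANSFER_TB, DO_PRICE_PER_GB)

# Hypothetical fleet: 10 droplets, each bundling 5TB of free transfer.
ASSUMED_DROPLETS = 10
FREE_TB_PER_DROPLET = 5
do_cost_with_allowance = transfer_cost(
    MONTHLY_TRANSFER_TB, DO_PRICE_PER_GB,
    free_tb=ASSUMED_DROPLETS * FREE_TB_PER_DROPLET,
)

print(f"AWS EC2                     : ${aws_cost:,.0f}")
print(f"DigitalOcean                : ${do_cost:,.0f}")
print(f"DigitalOcean with allowance : ${do_cost_with_allowance:,.0f}")
```

The number of droplets actually running depends on the load at any given time, so the free allowance, and with it the final bill, will vary from month to month.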
Due to this difference in data transfer costs, we integrated DigitalOcean into CWLB 2.0 to provide an alternative to AWS EC2 for running media servers at a lower cost. But this is purely optional and configurable from the loadbalancer settings of the admin dashboard.
Any organization admin can change their cloud vendor in the dashboard from AWS to DO or vice versa with a button click, and the media servers will run in the cloud selected by the admin. The default cloud setting for running the media servers is now DO (DigitalOcean). It can be changed to AWS EC2 at any time in the loadbalancer settings.
Some other important updates in CWLB 2.0 are as below.
Loadbalancing recording servers
Like media servers, the servers responsible for handling meeting recordings can get exhausted quickly if there is a lot of demand for recordings. To solve this, we have now integrated recording server autoscaling into the load-balancer. The load-balancer can now auto scale not only media servers but also recording servers in a fully automated manner.
Loadbalancing breakout rooms
Breakout rooms were already available in CWLB 1.0 but they were not very resource efficient. Customers had to use the same amount of credits for breakout rooms as for the main room. With CWLB 2.0, breakout rooms are fully integrated into the load-balancer's JIT request handling mode, so customers need not pay anything extra for using breakout rooms. Resource usage is completely dynamic, based on the actual usage of the breakout rooms, irrespective of the main room size.
Due to current work pressure, we are not able to write an exhaustive list of all the updates that happened in CWLB 2.0, though we would love to write one when time permits. Until then, if you have any query / suggestion related to CWLB 2.0, please feel free to drop us a mail at firstname.lastname@example.org.