Ever been frustrated about how long it takes for your Netflix episode to buffer or how blurry the resolution of your YouTube video is?
A team of computer science researchers at Stanford believes it can improve how video is streamed. Francis Yan, a Stanford computer science doctoral student, developed a research project known as Puffer to test existing algorithms and train novel ones to raise the quality of video streaming. He leads a team composed of fellow doctoral students Sadjad Fouladi and Hudson Ayers and Tsinghua University student Chenzhi Zhu. The team, advised by Keith Winstein and Phil Levis, is conducting research in an attempt to eliminate the problems that often plague online video.
The Puffer research team recently submitted a paper evaluating the preliminary results of their research to a well-known computer science conference. Because the paper was submitted anonymously, the name of the conference has been withheld.
The paper reported that the new video-streaming algorithm developed by Yan delivers a significantly better video-streaming experience than prior algorithms. Puffer’s algorithm is unique in its ability to learn continuously from a real-world environment, whereas existing algorithms are trained only once, on simulated data.
How it works
While Puffer is, first and foremost, an academic exploration and by no means a commercial endeavor, the Puffer team launched a quasi-product to gather data for research. Yan and his team built a live cable television streaming platform that is open to the public. The website allows users, whom Puffer’s researchers call “academic study participants,” to create an account and watch live cable television on their computers, ad-free and at no cost. Puffer currently receives signals from the NBC, CBS, ABC, PBS, FOX and Univision networks through an antenna perched atop the Packard Electrical Engineering Building.
“The purpose of this website is to recruit people [from] all across the country on different kinds of internet connections so that the different algorithms can learn what the real internet is like,” Winstein said. “So then we can learn what algorithms work best.”
Puffer’s secret sauce is that video is streamed over real internet connections to real users so that its algorithms can learn from live traffic in situ, not from simulated data. Historically, researchers developing network algorithms have had to study and test those algorithms in simulation or in small, real-world testbeds. Yan’s research, however, has shown that the performance of algorithms learned in simulation or via small-scale experiments doesn’t correlate with performance in live traffic.
The more diverse the internet connections that study participants use, the more data Puffer’s site gathers, and the more robust the algorithms generated will be. As real users stream video over their internet connections, the Puffer website automatically tunes parameters that control the timing and quality of video and monitors the efficacy of the resulting computer-designed algorithm.
The algorithms
Puffer’s adaptive, computer-generated algorithm combines reinforcement learning with continual learning, so it keeps learning rather than training only once on a single data set.
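To make the contrast between learning once and learning continually concrete, here is a deliberately simple sketch. It is not Puffer’s algorithm and does not use reinforcement learning: it swaps in a plain moving-average throughput estimate, and the class names, the `alpha` parameter and the sample numbers are all assumptions made for illustration. One predictor is frozen after its initial training; the other keeps updating from every live observation.

```python
class FrozenPredictor:
    """Learns a mean throughput once, from a fixed training set (hypothetical)."""

    def __init__(self, training_samples):
        self.estimate = sum(training_samples) / len(training_samples)

    def observe(self, sample):
        pass  # trained once: live observations are ignored

    def predict(self):
        return self.estimate


class ContinualPredictor:
    """Starts from the same training set but keeps updating (hypothetical)."""

    def __init__(self, training_samples, alpha=0.5):
        self.estimate = sum(training_samples) / len(training_samples)
        self.alpha = alpha  # weight given to each new observation (assumed value)

    def observe(self, sample):
        # Exponentially weighted moving average of observed throughput.
        self.estimate = (1 - self.alpha) * self.estimate + self.alpha * sample

    def predict(self):
        return self.estimate


def mean_abs_error(predictor, live_samples):
    """Predict each live sample before seeing it, then let the predictor observe it."""
    total = 0.0
    for sample in live_samples:
        total += abs(predictor.predict() - sample)
        predictor.observe(sample)
    return total / len(live_samples)


# Made-up numbers: training data suggests 10 Mbit/s, live traffic delivers 4 Mbit/s.
simulated = [10.0, 10.0, 10.0]
live = [4.0, 4.0, 4.0, 4.0]

frozen_error = mean_abs_error(FrozenPredictor(simulated), live)
continual_error = mean_abs_error(ContinualPredictor(simulated), live)
print(frozen_error, continual_error)
```

In this toy case the frozen predictor keeps repeating the same mistake, while the continually updated one closes the gap with each observation.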
The ultimate goal of the algorithm is to maximize a video-watcher’s overall quality of experience, measured by average video quality, the variability of video quality and the rebuffer rate.
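The article names the three ingredients of quality of experience but not how they are combined, so the following is only a sketch under stated assumptions. The function names, the use of mean absolute change as the variability measure, the definition of rebuffer rate as stalled time divided by watch time, and the weights `w_var` and `w_stall` are all hypothetical choices, not Puffer’s published metric.

```python
from statistics import mean


def qoe_components(chunk_qualities, stall_seconds, watch_seconds):
    """Return (average quality, quality variability, rebuffer rate) for one session.

    chunk_qualities: per-chunk quality scores, in any consistent unit (assumed).
    stall_seconds:   total time playback was stalled.
    watch_seconds:   total session time; must be positive.
    """
    if watch_seconds <= 0:
        raise ValueError("watch_seconds must be positive")
    avg_quality = mean(chunk_qualities)
    # Variability here is the mean absolute change between consecutive chunks.
    changes = [abs(b - a) for a, b in zip(chunk_qualities, chunk_qualities[1:])]
    variability = mean(changes) if changes else 0.0
    rebuffer_rate = stall_seconds / watch_seconds
    return avg_quality, variability, rebuffer_rate


def qoe_score(chunk_qualities, stall_seconds, watch_seconds,
              w_var=1.0, w_stall=100.0):
    """Combine the three components into one number (weights are assumptions)."""
    avg_quality, variability, rebuffer_rate = qoe_components(
        chunk_qualities, stall_seconds, watch_seconds)
    # Higher quality is rewarded; variation and stalls are penalized.
    return avg_quality - w_var * variability - w_stall * rebuffer_rate


# Made-up session: three chunks, 2 seconds stalled out of 100 seconds watched.
print(qoe_components([15.0, 17.0, 16.0], 2.0, 100.0))
print(qoe_score([15.0, 17.0, 16.0], 2.0, 100.0))
```

A single score like this makes it possible to compare algorithms directly, but the ranking depends on the weights, which is why they are flagged here as assumptions.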
Aside from developing and training a novel algorithm, Puffer is testing three types of existing algorithms: “congestion-control” algorithms that decide when to send a piece of data, “throughput forecasters” that predict the time it takes for a chunk of data to reach the user and “adaptive-bitrate” algorithms that determine the quality of video to send.
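As a rough picture of how a throughput forecaster and an adaptive-bitrate algorithm fit together, consider the sketch below. It uses a simple hand-written rule, not any of the algorithms Puffer tests: pick the highest-quality version of the next chunk that is predicted to arrive before the playback buffer runs dry. The function names, the version list and all numbers are hypothetical.

```python
def predict_transmission_time(chunk_bytes, predicted_throughput_bps):
    """Forecast how many seconds a chunk needs to reach the user."""
    return chunk_bytes * 8 / predicted_throughput_bps


def choose_version(versions, predicted_throughput_bps, buffer_seconds):
    """Pick which encoding of the next chunk to send.

    versions: list of (quality, size_in_bytes) tuples for the same chunk.
    Returns the highest-quality version predicted to arrive before the
    buffer empties, or the smallest version if none is predicted to.
    """
    feasible = [
        v for v in versions
        if predict_transmission_time(v[1], predicted_throughput_bps) <= buffer_seconds
    ]
    if not feasible:
        return min(versions, key=lambda v: v[1])
    return max(feasible, key=lambda v: v[0])


# Made-up example: three encodings of one chunk, a 4 Mbit/s forecast.
versions = [(10.0, 250_000), (14.0, 750_000), (17.0, 2_000_000)]
print(choose_version(versions, 4_000_000, buffer_seconds=2.0))
print(choose_version(versions, 4_000_000, buffer_seconds=0.2))
```

The congestion-control layer, which decides when each piece of data is actually sent, is left out of this sketch entirely.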
Learning in situ
“This is a kind of artificial intelligence research –– a kind of sequential decision-making under uncertainty,” Winstein said. “And it’s harder than other kinds of artificial intelligence research.”
Rather than, say, an image classification algorithm that can learn from a single set of training data (“Is this a stop sign, or not a stop sign?”), a sequential decision-making algorithm requires more sophisticated training.
“A training set isn’t sufficient,” Winstein said. “Because we aren’t just making one decision –– e.g. is it a stop sign or not? –– we’re making a series of decisions over time, and those decisions affect the environment and the subsequent decisions that the algorithm has to make.”
This quandary necessitates a training environment, rather than a set of training data.
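A toy example shows why a fixed training set falls short. In the sketch below, which is an illustrative model and not Puffer’s system, the environment tracks a playback buffer, and each decision about chunk size changes the buffer that the next decision inherits. The class name, the two-second chunk duration, the constant throughput and the chunk sizes are all assumptions.

```python
class StreamingEnv:
    """Toy streaming environment with a playback buffer (hypothetical model)."""

    CHUNK_SECONDS = 2.0  # assumed duration of video in each chunk

    def __init__(self, throughput_bps, buffer_seconds):
        self.throughput_bps = throughput_bps
        self.buffer_seconds = buffer_seconds

    def step(self, chunk_bytes):
        """Send one chunk; return how long playback stalled while it downloaded."""
        download_time = chunk_bytes * 8 / self.throughput_bps
        stall = max(0.0, download_time - self.buffer_seconds)
        # The buffer drains during the download, then gains one chunk of video.
        self.buffer_seconds = (
            max(0.0, self.buffer_seconds - download_time) + self.CHUNK_SECONDS
        )
        return stall


def run(actions, throughput_bps=4_000_000, buffer_seconds=2.0):
    """Apply a sequence of chunk-size decisions and record the stall after each."""
    env = StreamingEnv(throughput_bps, buffer_seconds)
    return [env.step(chunk_bytes) for chunk_bytes in actions]


SMALL, BIG = 250_000, 2_000_000  # made-up chunk sizes in bytes

# The second decision is identical in both runs, but its outcome differs
# because the first decision left the buffer in a different state.
print(run([BIG, BIG]))
print(run([SMALL, BIG]))
```

Because the outcome of sending the large chunk depends on what was sent before it, no static table of labeled examples can capture the right answer; the algorithm has to interact with an environment.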
Commercial companies like Twitch, Netflix and YouTube have access to a plethora of real-world data. However, these companies tend to be more conservative about experimenting on real customers or changing their low-level systems code to continuously test different algorithms.
Puffer occupies the sweet spot between commercial companies and academic research groups. “What we’re trying to do here with our Puffer website is get a real training environment like Netflix would have, but with the flexibility to experiment that academics would have,” Winstein said.
The highly dynamic nature of the internet poses another challenge: No two internet connections behave exactly the same way.
“The challenge here is that we’re doing machine learning on internet connections between us and uncontrolled people across the country,” Winstein said. “Our algorithm has to learn the best way to send data … over an internet connection we may have never experienced before.”
Puffer’s researchers aim for transparency. Not only is Puffer open-source, but users watching television on Puffer’s platform can also view the technical information the website is analyzing, from debugging statistics to the bit rate (the number of bits per second) of each channel.
Inception
According to Winstein, three-fourths of all internet traffic results from video streaming. Despite the prevalence and volume of video streaming that occurs online, very little research has been conducted to analyze how video streaming algorithms perform across real internet traffic.
Yan and his colleagues first began working on Puffer on Oct. 1, 2017.
Yan said that he initially felt “uncertain” about the prospect of enhancing video-streaming algorithms.
“At the very beginning, people thought there was little room for improvement in this area, because prior work pointed out that their algorithms were very close to what they called ‘optimal,’” Yan said.
“When we embarked on this project, we didn’t know if it was possible or not,” Winstein added.
Funded by grants from the National Science Foundation and the Defense Advanced Research Projects Agency, with additional support from Google, Huawei, VMware, Dropbox, Facebook and the Stanford Platform Lab, the Puffer television streaming website went live in the winter of 2018.
Touching real people
Winstein reiterated that Puffer is an academic research project, not a commercial startup endeavor. However, he understands that the involvement of Puffer’s “users” is crucial to gathering data for the project.
“And yet the fun part is that we get to touch real people, because that is fundamental to doing the research,” Winstein stated.
Puffer’s website observes the trends of real television viewers; for instance, the website experiences heavy traffic when a sporting event is broadcast and lighter traffic at 3 a.m., when most viewers are asleep.
“The behaviors of real people are slightly inconvenient for the purposes of research,” Winstein said. “But on the other hand, the whole point of the study is to build algorithms that actually are attuned to the behaviors of real people, which we wouldn’t get if people behaved the way we told them to.”
Computer science research typically involves fundamental mathematics, statistics, probability and machine learning, which can leave scholars feeling detached from reality. Winstein said that he has enjoyed supervising Puffer’s research because he “gets to learn something about the real world.”
Key results
According to Levis, when run in real traffic, the Puffer algorithm delivers a better video-streaming experience than existing algorithms “by significant margins.” Puffer’s algorithm rebuffers five to 13 times less often, delivers higher video quality and shows less variation in video quality. In other words, Puffer’s algorithm outperforms the existing algorithms on every metric used to measure quality of experience.
“The numbers were excellent,” Levis said. “So all of the design work … immediately bore out in practice. That was super encouraging.”
The performance of Yan’s algorithm is expected to keep improving as the algorithm continues to collect data and learn iteratively.
Next steps
“We would love to see what the community can do working together to explore different algorithms,” he said.
Levis agreed, adding that he sees the role of academic researchers as promulgating knowledge. He said he envisions other researchers using the Puffer website to test and improve their own video-streaming algorithms.
According to Levis, the goal is not to pitch Puffer’s algorithm to video-streaming companies for commercial purposes. Rather, he hopes that Puffer’s research not only “pushes the state of the art forward,” but also sets a precedent for “a trajectory that the [research] community can follow.”
“I hope this idea of algorithms that learn continuously in the same environment in which they’re running proves to be an influential one,” Winstein added. “It seems to be very useful in this context, and I think it probably has broader capability.”
Contact Alex Tsai at aotsai ‘at’ stanford.edu.