Georgia Tech Researchers Discover Breakthrough 'Gaussian Cooling' Algorithm in High Dimension

Advances in computing have enabled researchers to collect massive amounts of data with relative ease but have given them an even greater challenge to analyze those enormous collections of data. A single data point might have numerous features and the effort to examine patterns across a data set can reveal an exploding number of possibilities and relationships, a number that grows exponentially with each dimension.

These problems have proved nearly intractable, requiring weeks or even months of high performance processing to handle such data sets.

But not anymore.

Researchers at Georgia Tech’s School of Computer Science have discovered an algorithm – deemed “Gaussian Cooling” – to accurately compute the volume of convex bodies in a matter of minutes using off-the-shelf computers. 

“With this randomized algorithm, mathematicians and researchers can estimate the volume of high dimensional objects in real time,” said Santosh Vempala, a professor in the School of Computer Science (SCS) who developed the new algorithm with doctoral candidate, Ben Cousins. “It can handle bodies in 100 dimensions and greater, using a new, provably correct technique.”

The problem of estimating the volume of a set is ancient. It has spawned a stream of mathematical and algorithmic ideas throughout history, starting with the Egyptians and Greeks who developed formulas in only two or three dimensions.

However, in spite of rapid advances in computers and algorithms and intensive study over the past 25 years, sampling and volume computation for high-dimensional sets has evaded a practical and complete solution. Research has resulted in a suite of tools to address parts of the challenge, such as analyzing and manipulating high-dimensional objects, choosing representative samples, and learning useful properties.

With this latest advance, total volume computation is practical for the first time.

"Professor Vempala and his student, Ben Cousins, made significant and surprising improvements to take the theoretical algorithm for volume computation to where we can now solve volume problems on today's computers,” said Lance Fortnow, chair of SCS. “Their work brings a major tool to the algorithmic arsenal that should have applications across the sciences."

The algorithm can be used as a tool for high-dimensional data analysis. Such data involving great numbers of parameters abounds in many fields today, including biology, neuroscience, as well as applied areas such as medical research, which may involve numerous vital statistics across many patients, or in computer security, where the algorithm can be applied to a model of information flow to estimate number of instances where data might leak.

Ravi Kannan, a principal researcher at Microsoft Research India and a member of the first team to create algorithms in this field, calls the latest discovery a “tour de force.”

“It is the culmination of a decades-long effort by leading researchers to develop an efficient algorithm for the problem of estimating the volume of convex sets,” Kannan said. “I foresee many important consequences. Congratulations to Cousins and Vempala on their achievement.’’

Georgia Tech’s researchers have made their algorithm publicly available as a MATLAB implementation and report that the method has been downloaded  hundreds of times to date. More details of the research and the researchers’ paper are available here.

Related Media

Click on image(s) to view larger version(s)

  • Santosh and Cousins Research News

For More Information Contact

Phillip Taylor

News & Media Relations Manager

ptaylor@cc.gatech.edu