They Found a Way to Thematically Sort All of Wikipedia on a Laptop

Submitted by
Style Pass
2022-01-25 09:00:05

Computer scientist David Blei, with co-authors Matthew Hoffman and Francis Bach, is recognized with a Test of Time Award at NeurIPS, the world’s top machine learning conference, for scaling his topic modeling algorithm to billions of documents. 

No sooner had the internet put mountains of knowledge at our fingertips than a new problem emerged: finding what we needed. David Blei, a professor of computer science and statistics at Columbia, has helped us find those nuggets of gold with his statistical methods for organizing documents thematically, making it easier to search and explore massive bodies of text. 

Today his topic modeling algorithm is embedded in everything from spam filters to recommendation engines. Until a decade ago, however, topic modeling had limited reach: it was overwhelmed by datasets much larger than a few hundred thousand documents.

In a landmark paper, “Online Learning for Latent Dirichlet Allocation,” Blei and his co-authors Matthew Hoffman and Francis Bach introduced a way to extend topic modeling to millions and billions of documents. The three were recently recognized for that work with a Test of Time Award at the Neural Information Processing Systems (NeurIPS) conference.
