Text By the Bay has ended
Saturday, April 25 • 4:00pm - 4:40pm
Scalable Online Learning of Topic Models with Spark


This talk addresses how to learn topic models from large, constantly growing text corpora, such as those of online forums. As documents stream into your corpus, it is much more efficient to update an already learned topic model than to batch-process the entire corpus again. Furthermore, Apache Spark can be used to perform these sequential updates in a distributed fashion. The talk will also discuss how to use the learned topic model to classify the documents in your corpus by the topics they contain.
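The abstract does not say which algorithm the talk uses. As one plausible illustration of updating a learned topic model instead of retraining it, here is a minimal single-process sketch in the style of online variational Bayes for LDA. The class name `OnlineLDA`, the hyperparameter defaults, the `total_docs` estimate, and the `dominant_topic` helper are all assumptions made for this example, not details from the talk. Documents are assumed to be dicts mapping word ids to counts.

```python
import math
import random


def digamma(x):
    """Digamma for x > 0: shift x upward by recurrence, then use an asymptotic series."""
    shift = 0.0
    while x < 6.0:
        shift -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return shift + math.log(x) - 0.5 / x - series


def exp_dirichlet_expectation(params):
    """exp(E[log p]) for p ~ Dirichlet(params)."""
    total = digamma(sum(params))
    return [math.exp(digamma(p) - total) for p in params]


class OnlineLDA:
    """Hypothetical minimal online LDA; all defaults are illustrative assumptions."""

    def __init__(self, num_topics, vocab_size, total_docs,
                 alpha=0.1, eta=0.01, tau0=64.0, kappa=0.7, seed=0):
        rng = random.Random(seed)
        self.K, self.V, self.D = num_topics, vocab_size, total_docs
        self.alpha, self.eta, self.tau0, self.kappa = alpha, eta, tau0, kappa
        self.t = 0  # number of mini-batches seen so far
        # Topic-word variational parameters, randomly initialised near 1.
        self.lam = [[rng.gammavariate(100.0, 0.01) for _ in range(vocab_size)]
                    for _ in range(num_topics)]

    def _responsibilities(self, exp_theta, exp_beta, word):
        # Normalised per-topic weights for one word of one document.
        weights = [exp_theta[k] * exp_beta[k][word] for k in range(self.K)]
        norm = sum(weights) + 1e-100
        return [w / norm for w in weights]

    def _infer(self, doc, exp_beta, iters=20):
        # Local step: fit the document-topic parameters gamma for one document.
        gamma = [1.0] * self.K
        for _ in range(iters):
            exp_theta = exp_dirichlet_expectation(gamma)
            new_gamma = [self.alpha] * self.K
            for word, count in doc.items():
                phi = self._responsibilities(exp_theta, exp_beta, word)
                for k in range(self.K):
                    new_gamma[k] += count * phi[k]
            gamma = new_gamma
        return gamma

    def update(self, batch):
        """Fold one mini-batch of new documents into the existing model."""
        exp_beta = [exp_dirichlet_expectation(row) for row in self.lam]
        sstats = [[0.0] * self.V for _ in range(self.K)]
        gammas = []
        for doc in batch:  # independent per document, so this loop could be distributed
            gamma = self._infer(doc, exp_beta)
            exp_theta = exp_dirichlet_expectation(gamma)
            for word, count in doc.items():
                phi = self._responsibilities(exp_theta, exp_beta, word)
                for k in range(self.K):
                    sstats[k][word] += count * phi[k]
            gammas.append(gamma)
        # Global step: blend old parameters with the mini-batch estimate.
        rho = (self.tau0 + self.t) ** (-self.kappa)  # decaying step size
        scale = self.D / len(batch)  # treat the batch as a sample of D documents
        for k in range(self.K):
            for w in range(self.V):
                target = self.eta + scale * sstats[k][w]
                self.lam[k][w] = (1.0 - rho) * self.lam[k][w] + rho * target
        self.t += 1
        return gammas


def dominant_topic(gamma):
    """Classify a document by its highest-weight topic."""
    return max(range(len(gamma)), key=lambda k: gamma[k])
```

Calling `model.update(batch)` once per arriving mini-batch touches only the new documents, which is the efficiency argument made above. In a Spark setting, the per-document loop inside `update` is the natural part to run across partitions, with the summed statistics merged on the driver; that mapping is a design assumption here, not something the abstract specifies. The returned `gamma` vectors can be passed to `dominant_topic` to label each document with its most prominent topic.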


Alex Minnaar

VerticalScope
Software engineer at VerticalScope Inc. Previously completed an MSc in Machine Learning at University College London and a BSc in Math & Engineering at Queen's University.

Saturday April 25, 2015 4:00pm - 4:40pm
