Text By the Bay has ended
Friday, April 24 • 2:20pm - 3:00pm
A Web Worth of Data: Common Crawl for NLP

The Common Crawl corpus contains petabytes of web crawl data and is a treasure trove of potential experiments. To introduce you to the possibilities that web crawl data offers for NLP, we will take a detailed look at how the data has been used in various experiments and how to get started with the data yourself.

Stephen Merity

Senior Research Scientist, Salesforce Research
Stephen Merity is a senior research scientist at MetaMind, part of Salesforce Research, where he works on researching and implementing deep learning models for vision and text, with a focus on memory networks and neural attention mechanisms for computer vision and natural language...

Friday April 24, 2015 2:20pm - 3:00pm
