Each of us has a wealth of health data residing in unstructured, non-standard formats and silos. Bringing this data together can reveal powerful insights about our health, but doing so is a staggering technical challenge. Unstructured narratives contain key pieces of information that cannot easily be extracted without additional processing.
We are building a system to organize this unstructured data, classify it into known topics, and apply additional levels of normalization, all in near real time and at scale. This talk will cover some of the technical challenges we are facing and how we are solving them with machine learning and natural language processing techniques.