We have unprecedented access to information stored in digital documents, but not all of it is trustworthy. Online discussions on platforms such as Twitter, Reddit or hypothes.is can help readers navigate this uncertainty by providing background information and critical comments. However, it is often hard to get an overview of the comments, because they are spread across various sources and can relate to any aspect of a document.
The goal of this thesis is to develop NLP techniques that link comments from various sources to the exact part of a document they comment on. Such links would allow comments to be aggregated by the aspect they discuss, making them much easier to access.