Edwin Simpson

Edwin Simpson, DPhil

Postdoctoral researcher

+49 6151 16-21677
+49 6151 16-25295

Google Scholar profile

Hochschulstraße 10
64289 Darmstadt

Office: S2|02 B106

Research interests

I am interested in interactive learning for NLP and in Bayesian techniques for handling noisy, small and weak information. Weak information is often available when large amounts of reliable training data are not. It consists of noisy signals from different sources, such as crowdsourced annotations, implicit feedback from users of an application, or untrusted reports of an event on social media. By combining different sources of weak information and modelling their reliability, we can learn reliable models even when gold-labelled training data is insufficient. NLP tasks such as argument mining, in which we extract and collate arguments from text corpora, are a good target case: the types of arguments we may wish to find vary widely across domains and applications, so we need to gather data efficiently to adapt to new situations. I'm interested in how we can address such problems by transferring knowledge between domains and by reducing the cost of annotating training data through intelligent crowdsourcing and interactive learning with humans in the loop.

Some key topics: preference learning, Gaussian processes, crowdsourcing, classifier combination, argument mining, argumentation, sequence labelling, summarisation, disaster risk reduction.

Further keywords: natural language processing, Bayesian inference, approximate inference, scalable inference, interactive learning, computational argumentation.

Software

IBCC

IBCC (independent Bayesian classifier combination) is a Bayesian variant of the well-known Dawid and Skene (1979) model, which is often used to combine the outputs of multiple annotators or classifiers, e.g. to aggregate crowdsourced labels. We provide a fairly well-tested implementation based on variational Bayes, pyIBCC, which is designed for aggregating crowdsourced annotations and was shown to outperform rival methods. I'm always working on making the code more user-friendly and incorporating new models, so please get in touch if you would like some help using this method.
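To give a feel for the kind of model involved, here is a minimal sketch of the classic Dawid and Skene approach, fitted with plain EM rather than the variational Bayes used in pyIBCC. It is not the pyIBCC code or its API: the function name `dawid_skene`, the `(item, annotator, label)` input format and the `smoothing` pseudo-count are all assumptions made for this illustration.

```python
import math


def dawid_skene(labels, n_classes, n_iter=50, smoothing=0.01):
    """Aggregate crowd labels with a simple EM version of Dawid and Skene (1979).

    labels: list of (item, annotator, label) triples with integer ids from 0.
    Returns one list of class probabilities per item.
    """
    n_items = 1 + max(i for i, _, _ in labels)
    n_annotators = 1 + max(a for _, a, _ in labels)

    # Initialise the posteriors from smoothed vote counts (a soft majority vote).
    post = [[smoothing] * n_classes for _ in range(n_items)]
    for i, _, l in labels:
        post[i][l] += 1.0
    post = [[p / sum(row) for p in row] for row in post]

    for _ in range(n_iter):
        # M-step: class prior and one confusion matrix per annotator.
        # `smoothing` is a pseudo-count that keeps every probability above zero.
        prior = [smoothing + sum(row[k] for row in post) for k in range(n_classes)]
        total = sum(prior)
        prior = [p / total for p in prior]
        conf = [[[smoothing] * n_classes for _ in range(n_classes)]
                for _ in range(n_annotators)]
        for i, a, l in labels:
            for k in range(n_classes):
                conf[a][k][l] += post[i][k]
        for a in range(n_annotators):
            for k in range(n_classes):
                s = sum(conf[a][k])
                conf[a][k] = [c / s for c in conf[a][k]]

        # E-step: posterior over each item's true class, computed in log space.
        log_post = [[math.log(p) for p in prior] for _ in range(n_items)]
        for i, a, l in labels:
            for k in range(n_classes):
                log_post[i][k] += math.log(conf[a][k][l])
        post = []
        for row in log_post:
            m = max(row)
            exp_row = [math.exp(v - m) for v in row]
            z = sum(exp_row)
            post.append([e / z for e in exp_row])
    return post


# Three annotators label three items with binary classes.
votes = [(0, 0, 1), (0, 1, 1), (0, 2, 0),
         (1, 0, 0), (1, 1, 0), (1, 2, 0),
         (2, 0, 1), (2, 1, 1), (2, 2, 1)]
probs = dawid_skene(votes, n_classes=2)
```

Each annotator gets their own confusion matrix, so a consistently unreliable annotator is down-weighted instead of counting equally in a majority vote. A Bayesian treatment such as IBCC additionally places priors over these quantities.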

HeatmapBCC

HeatmapBCC is an extension to IBCC that can predict classifications of test data points given their input features. In other words, it trains a classifier directly from noisy, crowdsourced data without the need for a cleaning step. Underneath, it uses a Gaussian process. It was tested on a mapping task that aggregates weak signals extracted from social media messages and satellite images. For details, see our ECML PKDD 2017 paper, "Bayesian Heatmaps".

Bayesian preference learning with GPPL

Scalable GPPL is an inference method that we developed for Gaussian process preference learning (GPPL), which learns to rank items from pairwise labels. It uses stochastic variational inference to scale to larger datasets.
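As a rough illustration of what "learning to rank from pairwise labels" means, the sketch below scores a set of latent item utilities against observed pairwise preferences using a probit likelihood, a common choice in Gaussian process preference learning. It covers the likelihood only, not the stochastic variational inference, and it is not taken from the GPPL code: the function names and the `noise_std` parameter are assumptions made for this example.

```python
import math


def preference_probability(f_a, f_b, noise_std=1.0):
    """Probability that item a is preferred to item b under a probit likelihood.

    f_a and f_b are latent utilities; independent Gaussian noise with standard
    deviation `noise_std` (an assumed parameter) is added to each utility.
    """
    z = (f_a - f_b) / (math.sqrt(2.0) * noise_std)
    # Standard normal CDF written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def log_likelihood(utilities, pairs):
    """Log-likelihood of observed (winner, loser) index pairs given utilities."""
    return sum(
        math.log(preference_probability(utilities[w], utilities[l]))
        for w, l in pairs
    )


# Item 0 beat item 1, item 1 beat item 2, and item 0 beat item 2.
pairs = [(0, 1), (1, 2), (0, 2)]
consistent = log_likelihood([2.0, 1.0, 0.0], pairs)
reversed_order = log_likelihood([0.0, 1.0, 2.0], pairs)
```

Utilities that agree with the observed preferences receive a higher log-likelihood than utilities in the reverse order, and sorting items by their inferred utility gives the ranking.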

Biographical information

I joined the UKP lab in April 2016 as a postdoctoral researcher. I was previously a member of the Machine Learning Research Group at the University of Oxford from 2010 to 2016, where I completed my doctorate on decision making with crowds of unreliable classifiers. Before that, I worked as a research engineer at Hewlett Packard Labs in Bristol, UK. I hold a Master's degree in Computer Science from the University of Bristol.

Students

Students I have supervised:

  • Anshul Tak (master's): Personalized Scalable Clustering of Information
  • Hamid Zafartavanaelmi (master's): Scalable Infinite Mixture of Markov Chains for Natural Language Processing
  • Veronika Schindler (master's): Active Crowdsourcing Using HeatmapBCC for NLP
  • Igor Cherepanov (master's): Interactive fine-tuning of input representations using a GP loss
  • Melvin Laux (student assistant): Bayesian annotator combination

Additional publications

2019

Miller, Tristan ; Do Dinh, Erik-Lân ; Simpson, Edwin ; Gurevych, Iryna (2019):
OFAI–UKP at HAHA@IberLEF2019: Predicting the Humorousness of Tweets Using Gaussian Process Preference Learning.
In: Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2019), Bilbao, Spain, In: CEUR Workshop Proceedings, [Conference publication]

Simpson, Edwin ; Gurevych, Iryna (2019):
A Bayesian Approach for Sequence Tagging with Crowds.
In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing, Stroudsburg, PA, USA, Association for Computational Linguistics, In: The 2019 Conference on Empirical Methods in Natural Language Processing, Hong Kong, China, 03.11.2019--07.11.2019, [Conference publication]

Simpson, Edwin ; Do Dinh, Erik-Lân ; Miller, Tristan ; Gurevych, Iryna (2019):
A Bayesian Approach for Predicting the Humorousness of One-liners.
In: 2019 Conference of the International Society for Humor Studies, Austin, TX, USA, 24.06.2019--28.06.2019, [Conference publication]

Simpson, Edwin ; Do Dinh, Erik-Lân ; Miller, Tristan ; Gurevych, Iryna (2019):
Predicting Humorousness and Metaphor Novelty with Gaussian Process Preference Learning.
In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019), In: The 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019), Florence, Italy, 28.07.2019--02.08.2019, [Online edition: https://fileserver.ukp.informatik.tu-darmstadt.de/UKP_Webpag...],
[Conference publication]

Eger, Steffen ; Şahin, Gözde Gül ; Rücklé, Andreas ; Lee, Ji-Ung ; Schulz, Claudia ; Mesgar, Mohsen ; Swarnkar, Krishnkant ; Simpson, Edwin ; Gurevych, Iryna (2019):
Text Processing Like Humans Do: Visually Attacking and Shielding NLP Systems.
In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics, In: The 2019 Conference of the North American Chapter of the Association for Computational Linguistics, Minneapolis, USA, 02.06.2019--07.06.2019, [Conference publication]

2018

Gurevych, Iryna ; Meyer, Christian M. ; Binnig, Carsten ; Fürnkranz, Johannes ; Kersting, Kristian ; Roth, Stefan ; Simpson, Edwin
Gelbukh, Alexander (ed.) (2018):
Interactive Data Analytics for the Humanities.
In: Computational Linguistics and Intelligent Text Processing: Proceedings of the 18th International Conference, Berlin/Heidelberg, Springer, pp. 527-549, DOI: 10.1007/978-3-319-77113-7_41,
[Online edition: https://link.springer.com/chapter/10.1007%2F978-3-319-77113-...],
[Book chapter]

Simpson, Edwin ; Gurevych, Iryna (2018):
Finding Convincing Arguments Using Scalable Bayesian Preference Learning.
In: Transactions of the Association for Computational Linguistics, vol. 6, pp. 357-371, ISSN 2307-387X,
[Online edition: https://transacl.org/ojs/index.php/tacl/article/view/1304],
[Article]

2017

Simpson, Edwin ; Reece, Steven ; Roberts, Stephen J. (2017):
Bayesian Heatmaps: Probabilistic Classification with Multiple Unreliable Information Sources.
In: Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2017), Springer, Skopje, Macedonia, In: Lecture Notes in Computer Science, [Online edition: https://link.springer.com/chapter/10.1007/978-3-319-71246-8_...],
[Conference publication]

This list was generated on Wed Aug 21 04:51:04 2019 CEST.
