Journal Article

Snorkel MeTaL: Weak Supervision for Multi-Task Learning

Abstract

Many real-world machine learning problems are challenging to tackle for two reasons: (i) they involve multiple sub-tasks at different levels of granularity; and (ii) they require large volumes of labeled training data. We propose Snorkel MeTaL, an end-to-end system for multi-task learning that leverages weak supervision provided at multiple levels of granularity by domain expert users. In MeTaL, a user specifies a problem consisting of multiple, hierarchically-related sub-tasks — for example, classifying a document at multiple levels of granularity — and then provides labeling functions for each sub-task as weak supervision. MeTaL learns a re-weighted model of these labeling functions, and uses the combined signal to train a hierarchical multi-task network which is automatically compiled from the structure of the sub-tasks. Using MeTaL on a radiology report triage task and a fine-grained news classification task, we achieve average gains of 11.2 accuracy points over a baseline supervised approach and 9.5 accuracy points over the predictions of the user-provided labeling functions.
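The pipeline described in the abstract can be illustrated with a toy sketch. This is not the Snorkel MeTaL API: the labeling functions, class codes, keyword rules, and accuracy values below are all hypothetical, and the accuracies are set by hand purely for illustration, whereas MeTaL learns the re-weighting of labeling functions itself. The sketch only shows the general idea of combining noisy votes for two hierarchically related sub-tasks (a coarse topic, then a finer topic within it) using log-odds weights, a common choice under an independence assumption.

```python
import math
from collections import defaultdict

ABSTAIN = 0  # hypothetical convention: 0 means the labeling function casts no vote

# Hypothetical labeling functions for a coarse sub-task: 1 = sports, 2 = business.
def lf_coarse_sports(text):
    return 1 if "match" in text or "league" in text else ABSTAIN

def lf_coarse_business(text):
    return 2 if "shares" in text or "profit" in text else ABSTAIN

# Hypothetical labeling functions for a fine sub-task under "sports":
# 1 = soccer, 2 = tennis.
def lf_fine_soccer(text):
    return 1 if "goal" in text else ABSTAIN

def lf_fine_tennis(text):
    return 2 if "serve" in text else ABSTAIN

def weighted_vote(votes, accuracies):
    """Combine votes, weighting each by the log-odds of its assumed accuracy.

    Returns the winning label, or None if every labeling function abstained.
    """
    scores = defaultdict(float)
    for vote, acc in zip(votes, accuracies):
        if vote != ABSTAIN:
            scores[vote] += math.log(acc / (1.0 - acc))
    if not scores:
        return None
    return max(scores, key=scores.get)

def label_document(text):
    """Produce (coarse, fine) training labels; fine is only defined for sports."""
    coarse = weighted_vote(
        [lf_coarse_sports(text), lf_coarse_business(text)],
        [0.8, 0.7],  # assumed accuracies, hand-set for this sketch
    )
    fine = None
    if coarse == 1:
        fine = weighted_vote(
            [lf_fine_soccer(text), lf_fine_tennis(text)],
            [0.75, 0.75],  # assumed accuracies, hand-set for this sketch
        )
    return coarse, fine
```

Labels produced this way would serve as noisy training targets for a downstream multi-task model with one output head per sub-task; that model is out of scope for this sketch.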

Project page

A system for rapidly creating, modeling, and managing training data, focused on accelerating the development of structured or “dark” data extraction applications for domains in which large labeled training sets are not available or easy to obtain.
Author(s)
Alex Ratner
Braden Hancock
Jared Dunnmon
Roger Goldman
Christopher Ré
Journal Name
Proceedings of the Second Workshop on Data Management for End-To-End Machine Learning
Publication Date
June 2018
DOI
10.1145/3209889.3209898
Publisher
ACM