Publications

2017
Y. Liu and Y. Chen, “Sequential Peer Prediction: Learning to Elicit Effort using Posted Prices,” Proceedings of the 31st AAAI Conference on Artificial Intelligence (AAAI), San Francisco, CA, 2017.
Peer prediction mechanisms are often adopted to elicit truthful contributions from crowd workers when no ground-truth verification is available. Recently, mechanisms of this type have been developed to incentivize effort exertion, in addition to truthful elicitation. In this paper, we study a sequential peer prediction problem where a data requester wants to dynamically determine the reward level to optimize the trade-off between the quality of information elicited from workers and the total expected payment. In this problem, workers have homogeneous expertise and heterogeneous costs for exerting effort, both unknown to the requester. We propose a sequential posted-price mechanism to dynamically learn the optimal reward level from workers' contributions and to incentivize effort exertion and truthful reporting. We show that (1) in our mechanism, workers exerting effort according to a non-degenerate threshold policy and then reporting truthfully is an equilibrium that returns the highest utility for each worker, and (2) the regret of our learning mechanism with respect to offering the optimal reward (price) is upper bounded by Õ(T^{3/4}), where T is the learning horizon. We further show the power of our learning approach when the reports of workers do not necessarily follow the game-theoretic equilibrium.
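As a rough illustration of the learning half of this setup (not the paper's mechanism, whose pricing rule is tied to its equilibrium analysis), a posted-price loop can be sketched as a bandit over a discretized grid of reward levels, with the requester's realized utility as noisy feedback. The function names and the worker-response model below are assumptions:

```python
import math
import random

def posted_price_ucb(reward_levels, utility, T):
    """Learn a good posted price by UCB over candidate reward levels.

    `utility(r)` is a hypothetical oracle: the requester's stochastic
    utility in [0, 1] when reward r is posted for one task.
    """
    counts = [0] * len(reward_levels)
    means = [0.0] * len(reward_levels)
    for t in range(1, T + 1):
        if t <= len(reward_levels):      # try every reward level once
            i = t - 1
        else:                            # then pick by upper confidence bound
            i = max(range(len(reward_levels)),
                    key=lambda j: means[j] + math.sqrt(2 * math.log(t) / counts[j]))
        u = utility(reward_levels[i])    # post the price, observe the outcome
        counts[i] += 1
        means[i] += (u - means[i]) / counts[i]
    return reward_levels[max(range(len(reward_levels)), key=lambda j: means[j])]

# Toy worker model: effort (and hence a useful answer) becomes likelier as the
# posted reward r rises, while the requester's net utility shrinks in r.
best = posted_price_ucb(
    [0.1, 0.3, 0.5, 0.7],
    lambda r: (1.0 - r) * (random.random() < min(1.0, 2 * r)),
    T=5000)
```

Plain UCB ignores the incentive constraints that drive the paper's Õ(T^{3/4}) bound; the sketch only conveys the explore-exploit flavor of posting prices sequentially.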
L. Hu and Y. Chen, “Fairness at Equilibrium in the Labor Market,” Workshop on Fairness, Accountability, and Transparency in Machine Learning (FATML), Halifax, Nova Scotia, 2017.
Recent literature on computational notions of fairness has been broadly divided into two distinct camps, supporting interventions that address either individual-based or group-based fairness. Rather than privilege a single definition, we seek to resolve both within the particular domain of employment discrimination. To this end, we construct a dual labor market model composed of a Temporary Labor Market, in which firm strategies are constrained to ensure group-level fairness, and a Permanent Labor Market, in which individual worker fairness is guaranteed. We show that such restrictions on hiring practices induce an equilibrium that Pareto-dominates those arising from strategies that employ statistical discrimination or a "group-blind" criterion. Individual worker reputations produce externalities for collective reputation, generating a feedback loop termed a "self-fulfilling prophecy." Our model produces its own feedback loop, raising the collective reputation of an initially disadvantaged group via a fairness intervention that need not be permanent. Moreover, we show that, contrary to popular assumption, the asymmetric equilibria resulting from hiring practices that disregard group-fairness may be immovable without targeted intervention. The enduring nature of such equilibria that are both inequitable and Pareto inefficient suggests that fairness interventions are of critical importance in moving the labor market to be more socially just and efficient.
Y. Chen, C. Podimata, and N. Shah, “Strategyproof Linear Regression,” NIPS’17 Workshop on Learning in the Presence of Strategic Behavior, Long Beach, CA, 2017.
Designing machine learning algorithms that are robust to noise in training data has lately been a subject of intense research. A large body of work addresses stochastic noise [12, 7], while another one studies adversarial noise [11, 2] in which errors are introduced by an adversary with the explicit purpose of sabotaging the algorithm. This is often too pessimistic, and leads to negative results. The literature on game theory and mechanism design offers an interesting middle ground: strategic noise. In this paradigm, training data is provided by strategic sources that purposefully introduce errors for maximizing their own benefit. This is less pessimistic than adversarial noise, where the errors are introduced simply to harm the algorithm. There is a growing body of research on designing machine learning algorithms that are robust to strategic noise. This research can be categorized using three key axes: i) manipulable information, ii) goal of the agents, and iii) use of payments and incentive guarantee. On the first axis, most papers assume that independent variables (feature vectors) are public information, and dependent variables (labels) are private, manipulable information [6, 15, 18], though some papers also design algorithms robust to strategic feature vectors [8]. On the second axis, one body of research focuses on agents motivated by privacy concerns (with a tradeoff between accuracy and privacy) [5, 3], while another one focuses on agents who want the algorithm to make accurate assessments on their own samples, even if this reduces the overall accuracy. Such strategic manipulations have been studied for estimation [4], classification [13, 14, 15], and regression [19, 6] problems. On the third axis, the research differs on whether monetary payments to agents are allowed [3], and on how strongly to guarantee truthful reporting (the stronger “strategyproofness” requirement [18, 19, 15] versus the weaker Bayes-Nash equilibrium requirement [9, 5]). In this paper, we focus on the problem of linear regression, i.e., fitting a hyperplane through given data, which is studied extensively in statistics and machine learning. We consider agents who can manipulate their dependent variables in order to increase the algorithm’s accuracy on their own samples, and design strategyproof mechanisms without payments.
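A one-dimensional warm-up (the paper itself treats hyperplanes) shows why strategyproofness without payments is plausible here: when agents can misreport labels, the empirical mean is arbitrarily manipulable, whereas the median gives no agent an incentive to exaggerate. A minimal sketch with toy numbers:

```python
import statistics

# Honest reports: the mean and median of the estimates differ.
honest = [1.0, 2.0, 3.0, 4.0, 10.0]
print(statistics.mean(honest), statistics.median(honest))   # 4.0, 3.0

# The agent with true value 10.0 exaggerates to drag the estimate upward:
manipulated = [1.0, 2.0, 3.0, 4.0, 1000.0]
print(statistics.mean(manipulated))    # mean jumps to 202.0
print(statistics.median(manipulated))  # median stays 3.0
```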
2016
Y. Chen and B. Waggoner, “Informational Substitutes,” Proceedings of the 57th Annual IEEE Symposium on Foundations of Computer Science (FOCS), New Brunswick, NJ, 2016.
We propose definitions of substitutes and complements for pieces of information ("signals") in the context of a decision or optimization problem, with game-theoretic and algorithmic applications. In a game-theoretic context, substitutes capture diminishing marginal value of information to a rational decision maker. We use the definitions to address the question of how and when information is aggregated in prediction markets. Substitutes characterize "best-possible" equilibria with immediate information aggregation, while complements characterize "worst-possible", delayed aggregation. Game-theoretic applications also include settings such as crowdsourcing contests and Q&A forums. In an algorithmic context, where substitutes capture diminishing marginal improvement of information to an optimization problem, substitutes imply efficient approximation algorithms for a very general class of (adaptive) information acquisition problems. In tandem with these broad applications, we examine the structure and design of informational substitutes and complements. They have equivalent, intuitive definitions from disparate perspectives: submodularity, geometry, and information theory. We also consider the design of scoring rules or optimization problems so as to encourage substitutability or complementarity, with positive and negative results. Taken as a whole, the results give some evidence that, in parallel with substitutable items, informational substitutes play a natural conceptual and formal role in game theory and algorithms.
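On the algorithmic side, the headline consequence of substitutability is that the marginal value of information is submodular, so greedy signal acquisition inherits the usual (1 − 1/e)-style guarantee. Below is a minimal sketch under that assumption; `value` is a hypothetical oracle for the decision problem's expected utility given a set of acquired signals:

```python
def greedy_acquire(signals, value, budget):
    """Greedily acquire up to `budget` signals by marginal value."""
    chosen = set()
    for _ in range(budget):
        best = max((s for s in signals if s not in chosen),
                   key=lambda s: value(chosen | {s}) - value(chosen),
                   default=None)
        if best is None:
            break
        chosen.add(best)
    return chosen

# Toy submodular value: number of distinct facts covered by chosen signals.
facts = {"a": {1, 2}, "b": {2, 3}, "c": {3}}
print(greedy_acquire(facts, lambda S: len(set().union(*(facts[s] for s in S))), 2))
```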
Y. Liu and Y. Chen, “Learning to Incentivize: Eliciting Effort via Output Agreement,” Proceedings of the 25th International Joint Conference on Artificial Intelligence (IJCAI), New York, NY, 2016.
In crowdsourcing, when there is no verification for contributed answers, output agreement mechanisms are often used to incentivize participants to provide truthful answers, provided the correct answer is held by the majority. In this paper, we focus on using output agreement mechanisms to elicit effort, in addition to eliciting truthful answers, from a population of workers. We consider a setting where workers have heterogeneous costs of effort exertion and examine the data requester's problem of deciding the reward level in output agreement for optimal elicitation. In particular, when the requester knows the cost distribution, we derive the optimal reward level for output agreement mechanisms. This is achieved by first characterizing Bayesian Nash equilibria of output agreement mechanisms for a given reward level. When the requester does not know the cost distribution, we develop sequential mechanisms that combine learning the cost distribution with incentivizing effort exertion to approximately determine the optimal reward level.
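For concreteness, the basic output agreement payment rule analyzed here can be stated in a few lines: each worker is matched with a random peer and receives the posted reward only if their answers agree. The sketch below is a generic rendering, with an illustrative reward level:

```python
import random

def output_agreement_pay(answers, reward):
    """Pay each worker `reward` iff their answer matches a random peer's."""
    workers = list(answers)
    pay = {}
    for w in workers:
        peer = random.choice([v for v in workers if v != w])
        pay[w] = reward if answers[w] == answers[peer] else 0.0
    return pay

print(output_agreement_pay({"w1": "cat", "w2": "cat", "w3": "dog"}, reward=0.05))
```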
M. Yin and Y. Chen, “Predicting Crowd Work Quality under Monetary Interventions,” Proceedings of the 4th AAAI Conference on Human Computation and Crowdsourcing (HCOMP), Austin, TX, 2016.
Work quality in crowdsourcing task sessions can change over time due to both internal factors, such as learning and boredom, and external factors, like the provision of monetary interventions. Prior studies on crowd work quality have focused on characterizing the temporal behavior pattern as a result of the internal factors. In this paper, we propose to explicitly take the impact of external factors into consideration for modeling crowd work quality. We present a series of seven models from three categories (supervised learning models, autoregressive models and Markov models) and conduct an empirical comparison on how well these models can predict crowd work quality under monetary interventions on three datasets that are collected from Amazon Mechanical Turk. Our results show that all these models outperform the baseline models that don't consider the impact of monetary interventions. Our empirical comparison further identifies the random forests model as an excellent model to use in practice, as it consistently provides accurate predictions with high confidence across different datasets, and it also demonstrates robustness against limited training data and limited access to the ground truth.
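As a sketch of how such a predictive model might be set up (the feature design here is an assumption, not the paper's exact one), one can train a random forest on recent quality history plus an indicator for whether a bonus is offered on the current task:

```python
from sklearn.ensemble import RandomForestRegressor

# Assumed feature layout: [quality at t-2, quality at t-1, bonus offered at t?]
X = [[0.6, 0.7, 1], [0.7, 0.7, 0], [0.5, 0.6, 1], [0.8, 0.8, 0]]
y = [0.8, 0.7, 0.7, 0.8]   # observed quality at t (toy numbers)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
print(model.predict([[0.6, 0.7, 1]]))  # predicted quality if a bonus is offered
```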
Y. Liu and Y. Chen, “A Bandit Framework for Strategic Regression,” Proceedings of the 30th Annual Conference on Neural Information Processing Systems (NIPS), Barcelona, Spain, pp. 1821–1829, 2016.
We consider a learner's problem of acquiring data dynamically for training a regression model, where the training data are collected from strategic data sources. A fundamental challenge is to incentivize data holders to exert effort to improve the quality of their reported data, even though the quality is not directly verifiable by the learner. In this work, we study a dynamic data acquisition process where data holders can contribute multiple times. Using a bandit framework, we leverage the long-term incentive of future job opportunities to incentivize high-quality contributions. We propose a Strategic Regression-Upper Confidence Bound (SR-UCB) framework, a UCB-style index combined with a simple payment rule, where the index of a worker approximates the quality of his past contributions and is used by the learner to determine whether the worker receives future work. For linear regression and a certain family of non-linear regression problems, we show that SR-UCB enables an O(√(log T/T))-Bayesian Nash Equilibrium (BNE) where each worker exerts a target effort level that the learner has chosen, with T being the number of data acquisition stages. The SR-UCB framework also has some other desirable properties: (1) the indexes can be updated in an online fashion (hence computation is light); (2) a slight variant, namely Private SR-UCB (PSR-UCB), preserves (O(1/log T), O(1/log T))-differential privacy for workers' data, with only a small compromise on incentives (each worker exerting a target effort level is an O(log^6 T/√T)-BNE).
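A minimal sketch of the index bookkeeping the abstract describes, with the quality proxy, constants, and eligibility threshold all assumed for illustration: each worker's index is a running quality mean plus a confidence width, updated online in O(1), and future work goes only to workers whose index is high enough.

```python
import math

class WorkerIndex:
    """Running quality estimate plus confidence width, updated in O(1)."""
    def __init__(self):
        self.n, self.mean = 0, 0.0

    def update(self, quality):            # quality in [0, 1]; proxy assumed
        self.n += 1
        self.mean += (quality - self.mean) / self.n

    def index(self, t):
        if self.n == 0:
            return float("inf")           # untried workers get priority
        return self.mean + math.sqrt(2 * math.log(t) / self.n)

def gets_future_work(worker, t, threshold=0.5):  # threshold is an assumption
    return worker.index(t) >= threshold
```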
C. - J. Ho, R. M. Frongillo, and Y. Chen, “Eliciting Categorical Data for Optimal Aggregation,” Proceedings of the 30th Annual Conference on Neural Information Processing Systems (NIPS), Barcelona, Spain, 2016.
Models for collecting and aggregating categorical data on crowdsourcing platforms typically fall into two broad categories: those that assume agents are honest and consistent but have heterogeneous error rates, and those that assume agents are strategic and seek to maximize their expected reward. The former often leads to tractable aggregation of elicited data, while the latter usually focuses on optimal elicitation and does not consider aggregation. In this paper, we develop a Bayesian model wherein agents have differing quality of information but also respond to incentives. Our model generalizes both categories and enables the joint exploration of optimal elicitation and aggregation. This model enables our exploration, both analytically and experimentally, of optimal aggregation of categorical data and optimal multiple-choice interface design.
Y. Chen, A. Ghosh, M. Kearns, T. Roughgarden, and J. W. Vaughan, “Mathematical Foundations for Social Computing,” Communications of the ACM, vol. 59, no. 12, pp. 102–108, 2016.
Social computing benefits from mathematical foundations, but research has barely scratched the surface.
2015
N. S. Lambert, et al., “An Axiomatic Characterization of Wagering Mechanisms (Supersedes the EC’08 paper),” Journal of Economic Theory, vol. 156, pp. 389–416, 2015.
We construct a budget-balanced wagering mechanism that flexibly extracts information about event probabilities, as well as the mean, median, and other statistics from a group of individuals whose beliefs are immutable to the actions of others. We show how our mechanism, called the Brier betting mechanism, arises naturally from a modified parimutuel betting market. We prove that it is essentially the unique wagering mechanism that is anonymous, proportional, sybilproof, and homogeneous. While the Brier betting mechanism is designed for individuals with immutable beliefs, we find that it continues to perform well even for Bayesian individuals who learn from the actions of others.
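A runnable sketch of the mechanism's shape, hedged since the paper's exact normalization may differ: each individual wagers m_i and reports a probability p_i; payoffs rescale each wager by one plus the individual's Brier-style score minus the wager-weighted average score, which yields budget balance by construction.

```python
def brier_betting(reports, wagers, outcome):
    """Wager-weighted, Brier-scored payoffs that redistribute the total pot."""
    score = {i: 1 - (outcome - reports[i]) ** 2 for i in reports}  # in [0, 1]
    total = sum(wagers.values())
    avg = sum(wagers[i] * score[i] for i in reports) / total       # weighted mean
    return {i: wagers[i] * (1 + score[i] - avg) for i in reports}

payoffs = brier_betting({"a": 0.8, "b": 0.3}, {"a": 10.0, "b": 10.0}, outcome=1)
print(payoffs, sum(payoffs.values()))  # payouts sum to exactly the 20 wagered
```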
M. Yin and Y. Chen, “Bonus or Not? Learn to Reward in Crowdsourcing,” Proceedings of the 24th International Joint Conference on Artificial Intelligence (IJCAI), Buenos Aires, Argentina, pp. 201–207, 2015.
Recent work has shown that the quality of work produced in a crowdsourcing working session can be influenced by the presence of performance-contingent financial incentives, such as bonuses for exceptional performance, in the session. We take an algorithmic approach to decide when to offer bonuses in a working session to improve the overall utility that a requester derives from the session. Specifically, we propose and train an input-output hidden Markov model to learn the impact of bonuses on work quality and then use this model to dynamically decide whether to offer a bonus on each task in a working session to maximize a requester's utility. Experiments on Amazon Mechanical Turk show that our approach leads to higher utility for the requester than fixed and random bonus schemes do. Simulations on synthesized data sets further demonstrate the robustness of our approach in improving requester utility across different worker populations and worker behaviors.
R. M. Frongillo, Y. Chen, and I. A. Kash, “Elicitation for Aggregation,” Proceedings of the 29th Conference on Artificial Intelligence (AAAI), Austin, TX, 2015.
We study the problem of eliciting and aggregating probabilistic information from multiple agents. In order to successfully aggregate the predictions of agents, the principal needs to elicit some notion of confidence from agents, capturing how much experience or knowledge led to their predictions. To formalize this, we consider a principal who wishes to elicit predictions about a random variable from a group of Bayesian agents, each of whom has privately observed some independent samples of the random variable, and hopes to aggregate the predictions as if she had directly observed the samples of all agents. Leveraging techniques from Bayesian statistics, we represent confidence as the number of samples an agent has observed, which is quantified by a hyperparameter from a conjugate family of prior distributions. This then allows us to show that if the principal has access to a few samples, she can achieve her aggregation goal by eliciting predictions from agents using proper scoring rules. In particular, if she has access to one sample, she can successfully aggregate the agents' predictions if and only if every posterior predictive distribution corresponds to a unique value of the hyperparameter. Furthermore, this uniqueness holds for many common distributions of interest. When this uniqueness property does not hold, we construct a novel and intuitive mechanism where a principal with two samples can elicit and optimally aggregate the agents' predictions.
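A worked Beta-Bernoulli instance of this idea (the distributions and counts are illustrative): each agent's posterior over a coin's bias is Beta(a, b), with the hyperparameters acting as observation counts, so the principal can aggregate as if she had seen every sample by pooling the agents' private counts onto the common prior.

```python
prior = (1, 1)             # Beta(1, 1) common prior (assumed)
agents = [(3, 1), (5, 7)]  # each agent's posterior Beta(a_i, b_i)

# Subtract the shared prior from each posterior to recover private sample
# counts, then pool those counts back onto the prior.
a = prior[0] + sum(ai - prior[0] for ai, _ in agents)
b = prior[1] + sum(bi - prior[1] for _, bi in agents)
print((a, b))  # Beta(7, 7): as if all 12 private samples were observed directly
```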
Y. Chen, K. Nissim, and B. Waggoner, “Fair Information Sharing for Treasure Hunting,” Proceedings of the 29th Conference on Artificial Intelligence (AAAI), Austin, TX, 2015.
In a search task, a group of agents compete to be the first to find the solution. Each agent has different private information to incorporate into its search. This problem is inspired by settings such as scientific research, Bitcoin hash inversion, or hunting for some buried treasure. A social planner such as a funding agency, mining pool, or pirate captain might like to convince the agents to collaborate, share their information, and greatly reduce the cost of searching. However, this cooperation is in tension with the individuals’ competitive desire to each be the first to win the search. The planner’s proposal should incentivize truthful information sharing, reduce the total cost of searching, and satisfy fairness properties that preserve the spirit of the competition. We design contract-based mechanisms for information sharing without money. The planner solicits the agents’ information and assigns search locations to the agents, who may then search only within their assignments. Truthful reporting of information to the mechanism maximizes an agent’s chance to win the search, and voluntary participation is satisfied for large search spaces. In order to formalize the planner’s goals of fairness and reduced search cost, we propose a simplified, simulated game as a benchmark and quantify fairness and search cost relative to this benchmark scenario. The game is also used to implement our mechanisms. Finally, we extend to the case where coalitions of agents may participate in the mechanism, forming larger coalitions recursively.
J. Abernethy, Y. Chen, C. - J. Ho, and B. Waggoner, “Low-Cost Learning via Active Data Procurement,” Proceedings of the 16th ACM Conference on Economics and Computation (EC), Portland, Oregon, 2015.
We design mechanisms for online procurement of data held by strategic agents for machine learning tasks. The challenge is to use past data to actively price future data and give learning guarantees even when an agent's cost for revealing her data may depend arbitrarily on the data itself. We achieve this goal by showing how to convert a large class of no-regret algorithms into online posted-price and learning mechanisms. Our results in a sense parallel classic sample complexity guarantees, but with the key resource being money rather than quantity of data: With a budget constraint B, we give robust risk (predictive error) bounds on the order of 1/√B. Because we use an active approach, we can often guarantee to do significantly better by leveraging correlations between costs and data. Our algorithms and analysis go through a model of no-regret learning with T arriving pairs (cost, data) and a budget constraint of B. Our regret bounds for this model are on the order of √(T/B) and we give lower bounds on the same order.
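The procurement loop itself is easy to sketch under stated assumptions (a posted-price rule, an online update, and agents who sell iff their cost is at most the posted price); everything below is a schematic, not the paper's algorithm:

```python
def procure(stream, post_price, update, budget):
    """Posted-price data procurement: buy (cost, data) pairs while budget lasts.

    `post_price(model)` and `update(model, point)` are hypothetical hooks for
    the no-regret pricing rule and the online learning step, respectively.
    """
    spent, model = 0.0, None
    for cost, point in stream:
        price = post_price(model)
        if cost <= price and spent + price <= budget:
            spent += price                # pay the posted price, not the cost
            model = update(model, point)  # online learning step on bought data
    return model
```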
Y. Chen, X. A. Gao, R. Goldstein, and I. A. Kash, “Market Manipulation with Outside Incentives (Supersedes the AAAI’11 paper),” Autonomous Agents and Multi-Agent Systems, vol. 29, no. 2, pp. 230–265, 2015.
Much evidence has shown that prediction markets can effectively aggregate dispersed information about uncertain future events and produce remarkably accurate forecasts. However, if the market prediction will be used for decision making, a strategic participant with a vested interest in the decision outcome may manipulate the market prediction to influence the resulting decision. The presence of such incentives outside of the market would seem to damage the market’s ability to aggregate information because of the potential distrust among market participants. While this is true under some conditions, we show that, if the existence of such incentives is certain and common knowledge, in many cases, there exist separating equilibria where each participant changes the market probability to different values given different private signals and information is fully aggregated in the market. At each separating equilibrium, the participant with outside incentives makes a costly move to gain trust from other participants. While there also exist pooling equilibria where a participant changes the market probability to the same value given different private signals and information loss occurs, we give evidence suggesting that two separating equilibria are more natural and desirable than many other equilibria of this game by considering domination-based belief refinement, social welfare, and the expected payoff of either participant in the game. When the existence of outside incentives is uncertain, however, trust cannot be established between players if the outside incentive is sufficiently large and we lose the separability at equilibria.
A. Dreber, et al., “Using Prediction Markets to Estimate the Reproducibility of Scientific Research,” PNAS, vol. 112, no. 50, pp. 15343–15347, 2015.
Concerns about a lack of reproducibility of statistically significant results have recently been raised in many fields, and it has been argued that this lack comes at substantial economic costs. We here report the results from prediction markets set up to quantify the reproducibility of 44 studies published in prominent psychology journals and replicated in the Reproducibility Project: Psychology. The prediction markets predict the outcomes of the replications well and outperform a survey of market participants’ individual forecasts. This shows that prediction markets are a promising tool for assessing the reproducibility of published scientific results. The prediction markets also allow us to estimate probabilities for the hypotheses being true at different testing stages, which provides valuable information regarding the temporal dynamics of scientific discovery. We find that the hypotheses being tested in psychology typically have low prior probabilities of being true (median, 9%) and that a “statistically significant” finding needs to be confirmed in a well-powered replication to have a high probability of being true. We argue that prediction markets could be used to obtain speedy information about reproducibility at low cost and could potentially even be used to determine which studies to replicate to optimally allocate limited resources into replications.
M. R. Munafo, et al., “Using Prediction Markets to Forecast Research Evaluations,” Royal Society Open Science, vol. 2, no. 10, 150287, 2015.
The 2014 Research Excellence Framework (REF2014) was conducted to assess the quality of research carried out at higher education institutions in the UK over a six-year period. However, the process was criticized for being expensive and bureaucratic, and it was argued that similar information could be obtained more simply from various existing metrics. We were interested in whether a prediction market on the outcome of REF2014 for 33 chemistry departments in the UK would provide information similar to that obtained during the REF2014 process. Prediction markets have become increasingly popular as a means of capturing what is colloquially known as the ‘wisdom of crowds’, and enable individuals to trade ‘bets’ on whether a specific outcome will occur or not. These have been shown to be successful at predicting various outcomes in a number of domains (e.g. sport, entertainment and politics), but have rarely been tested against outcomes based on expert judgements such as those that formed the basis of REF2014.
2014
S. Jain, Y. Chen, and D. C. Parkes, “Designing Incentives for Online Question and Answer Forums (Supersedes the EC’09 paper),” Games and Economic Behavior, vol. 86, pp. 458–474, 2014.
In this paper, we provide a simple game-theoretic model of an online question and answer forum. We focus on factual questions in which user responses aggregate while a question remains open. Each user has a unique piece of information and can decide when to report this information. The asker prefers to receive information sooner rather than later, and will stop the process when satisfied with the cumulative value of the posted information. We consider two distinct cases: a complements case, in which each successive piece of information is worth more to the asker than the previous one; and a substitutes case, in which each successive piece of information is worth less than the previous one. A best-answer scoring rule is adopted to model Yahoo! Answers, and is effective for substitutes information, where it isolates an equilibrium in which all users respond in the first round. But we find that this rule is ineffective for complements information, isolating instead an equilibrium in which all users respond in the final round. In addressing this, we demonstrate that an approval-voting scoring rule and a proportional-share scoring rule can enable the most efficient equilibrium with complements information, under certain conditions, by providing incentives for early responders as well as the user who submits the final answer.
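Of the rules compared, the proportional-share idea is the most mechanical to state: the asker's points are split among responders in proportion to the marginal value each response added when posted. A tiny sketch, with illustrative values:

```python
def proportional_share(marginal_values, total_points):
    """Split `total_points` among responders by marginal value contributed."""
    total = sum(marginal_values.values())
    return {u: total_points * v / total for u, v in marginal_values.items()}

print(proportional_share({"u1": 3.0, "u2": 2.0, "u3": 1.0}, total_points=60))
# {'u1': 30.0, 'u2': 20.0, 'u3': 10.0} -- early, high-value answers still earn
```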
P. Zhang and Y. Chen, “Elicitability and Knowledge-Free Elicitation with Peer Prediction,” Proceedings of the 13th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), Paris, France, pp. 245–252, 2014.
The elicitation of private information from individuals is crucially important to many real-world tasks. But elicitation is most challenging when it is most useful: when objective (verifiable) truth is inaccessible or unavailable, and there is no "answer key" available to verify reports. Prior work has designed mechanisms that truthfully elicit private information without verification for some restricted set of possible information structures of the participants (i.e. the common prior joint distributions of participants' signals). In fact, no mechanism can elicit private information truthfully for all information structures without verification. In this paper, we identify the maximal set of information structures that are truthfully elicitable without verification, and provide a mechanism for such elicitation. This mechanism requires that the designer know the information structure of the participants, which is unavailable in many settings. We then propose a knowledge-free peer prediction mechanism that does not require knowledge of the information structure and can truthfully elicit private information for a set of information structures slightly smaller than the maximal set. This mechanism works for both small and large populations in settings with both binary and non-binary private signals, and is effective on a strict superset of information structures as compared to prior mechanisms that satisfy these properties.
Y. Chen, I. A. Kash, M. Ruberry, and V. Shnayder, “Eliciting Predictions and Recommendations for Decision Making (Supersedes the WINE’11 and AAMAS’11 papers),” ACM Transactions on Economics and Computation, vol. 6, no. 2, pp. 1–27, 2014.
When making a decision, a decision maker selects one of several possible actions and hopes to achieve a desirable outcome. To make a better decision, the decision maker often asks experts for advice. In this article, we consider two methods of acquiring advice for decision making. We begin with a method where one or more experts predict the effect of each action and the decision maker then selects an action based on the predictions. We characterize strictly proper decision making, where experts have an incentive to accurately reveal their beliefs about the outcome of each action. However, strictly proper decision making requires that the decision maker use a completely mixed strategy to choose an action. To address this limitation, we consider a second method where the decision maker asks a single expert to recommend an action. We show that it is possible to elicit the decision maker’s most preferred action for a broad class of preferences of the decision maker, including when the decision maker is an expected value maximizer.
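The standard way to see why a completely mixed strategy is needed (used here as an illustration, not necessarily the paper's exact construction) is importance weighting: score only the prediction for the action actually taken, with a proper scoring rule, and divide by that action's selection probability, so the expert's expected score for every action's prediction matches direct scoring.

```python
import random

def quadratic_score(p, outcome):          # proper scoring rule, binary outcome
    return 1 - (outcome - p) ** 2

def decision_round(predictions, q, realized_outcome):
    """Pick action a ~ q (all q[a] > 0), pay score(p_a, outcome) / q[a]."""
    actions = list(q)
    a = random.choices(actions, weights=[q[x] for x in actions])[0]
    return a, quadratic_score(predictions[a], realized_outcome) / q[a]

# Expert predicts each action's success probability; requester leans "hire".
print(decision_round({"hire": 0.7, "pass": 0.4}, {"hire": 0.9, "pass": 0.1}, 1))
```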