Abstract: Our goal is for every data scientist to practice data ethics. In this workshop, you will learn how to make data ethics actionable as a data scientist.
For this workshop, we’ll use the resources collected in deon (http://deon.drivendata.org), an open-source command-line tool that integrates an ethics checklist into your existing data science workflow. The goal of deon is to enable teams to flexibly carry out the ethical discussions most relevant to them, and to preemptively address issues they may overlook. Instead of relying on an “Ultimately True” philosophy or oath, deon encourages an upfront and ongoing dialogue about the different ethical aspects of your project. This dialogue will span a broad set of ethical issues that frequently arise in machine learning and data science contexts. Specific solutions often vary with the task at hand, but deon helps cultivate ethical intentionality in the first line of defense: the engineers who influence how data science actually gets done.
Throughout this interactive workshop, Casey and Jay will explain the rationale behind building deon, walk through the default checklist content, and provide concrete examples of cases where overlooking items on the ethics checklist caused unnecessary headaches or harm. Using real stories of improperly hashed NYC taxi data, congressional distortions of Planned Parenthood data, biased geometry in embedding spaces, and racial disparities in Amazon Prime delivery areas, you will consider and discuss a diverse set of ethical issues that commonly arise in the course of data science work.
You'll roll up your sleeves and dive into real-world data to explore the trade-offs and nuances of data ethics as you team up to navigate a set of scenarios. In a two-phase case study on public and private sector uses of personal health data, you’ll practice working through the checklist and examining the ethical implications of your team’s choices.
Come learn how to jump-start the ethics conversation all data teams should be having.
- Checklists connect principle to practice and enable data scientists to exercise their data ethics muscles, becoming better at issue-spotting, mitigation, and navigating subtle discussions.
- Having a structured process for data ethics makes it easier for teams to have tough conversations and helps ensure that important work doesn't get overlooked.
- Working through the ethics checklist in deon can help preempt ethical problems down the line.
This workshop is intended for data scientists and managers, the developers who influence how data science gets done. This means anyone who spends their days working directly with data, whether in data collection, data storage, analysis, modeling, or deployment.
Bio: Jay Qi is a Senior Data Scientist at DrivenData, where he uses data science for social good and helps mission-driven organizations leverage data to maximize their impact. Previously, Jay was a Lead Data Scientist at Uptake, where he used machine learning with streaming sensor data to predict failures on industrial machines like locomotives and heavy equipment.