Towards Goal-aware Responsible Data Science
Abstract: Data-based systems are increasingly used in applications that have far-reaching consequences and long-lasting societal impact. However, the development process remains highly specialized, tedious, and unscalable. The result is a manually fine-tuned, rigid solution that works for only one specific problem in one specific context. Such a system fails to adapt to a changing world and severely limits how fully valuable data can be used.
So, how can you avert this fate for your systems?
In this talk, I present my vision of context-aware systems that enable even non-expert users to develop correct, explainable, and equitable data-science pipelines. To achieve this, I will focus on i) using the downstream goal as additional information when designing data science pipelines, and ii) the importance of causal inference for trustworthy data analysis. I will present a data discovery framework that automatically identifies useful data on behalf of end-users for various tasks. Lastly, I will discuss my proposal to leverage counterfactual reasoning and causal inference to quantify the impact of an input on the outcome.
Bio: Sainyam Galhotra is an Assistant Professor in the Computer Science department at Cornell University. Before that, he was a Computing Innovation Postdoctoral Fellow at the University of Chicago. The goal of his research is to lay the foundation of responsible data science that enables efficient development and deployment of trustworthy data analytics applications. His research combines techniques from Data Management, Probabilistic Methods, Causal Inference, Machine Learning, and Software Engineering. He received his Ph.D. from the University of Massachusetts Amherst under the supervision of Prof. Barna Saha. He is a recipient of the Best Paper Award at FSE 2017 and the Most Reproducible Paper Award at SIGMOD 2017 and 2018. He is a DAAD AInet Fellow and the first recipient of the Krithi Ramamritham Award at UMass for contribution to database research.