55  Applied Ethics & Responsible Data Science

55.1 Algorithmic Bias Audit — Financial Services

Objective

Apply the seven sources of bias framework to conduct a comprehensive audit of algorithmic decision-making in financial services.

Scenario

You are a senior data scientist at a major regional bank that has been using an AI-powered loan approval system for the past two years. Recent complaints from community advocacy groups suggest the system may be discriminating against certain demographic groups. The bank’s chief risk officer has tasked you with conducting an independent bias audit.

Suresh and Guttag (2021) describe seven distinct sources of bias and associate them with steps in the modeling cycle, from data preparation to model deployment. Your audit must systematically examine each potential source.

Tasks

Part A: Historical Bias Analysis

For this part of the investigation you are provided with data on 15,000 loan applications from the period 2019–2023, in the file historical_loan_data.csv.

Analyze the approval rates in the pre-AI system by:

  • Race/ethnicity
  • Gender
  • Age groups
  • Geographic location (ZIP code level)
  • Income brackets

Write a 2-page report documenting historical lending patterns that might constitute historical (pre-existing) bias, which is rooted in social institutions, practices and attitudes that are reflected in training data.
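As a starting point, the group-level breakdown can be sketched in pandas as follows. The column names race and approved are assumptions for illustration; historical_loan_data.csv may use different names, and a real audit would repeat this for every attribute listed above.

```python
import pandas as pd

def approval_rates(df: pd.DataFrame, group_col: str,
                   outcome_col: str = "approved") -> pd.Series:
    """Approval rate (mean of a 0/1 outcome) per demographic group."""
    return df.groupby(group_col)[outcome_col].mean()

# Illustrative stand-in for historical_loan_data.csv; real columns may differ.
data = pd.DataFrame({
    "race":     ["A", "A", "B", "B", "B", "A"],
    "approved": [1,   1,   0,   1,   0,   0],
})

rates = approval_rates(data, "race")
disparity = rates.max() - rates.min()  # gap between best- and worst-served groups
```

A large disparity here is evidence worth documenting, not proof of discrimination; the 2-page report should pair such gaps with context about legitimate underwriting factors.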

Part B: Multi-Source Bias Assessment

For each of the seven bias sources, create:

  • Risk Assessment Matrix: Rate the likelihood (Low/Medium/High) that each bias type affects your system
  • Evidence Collection: Document specific indicators you would look for
  • Impact Analysis: Estimate potential harm to affected groups

Focus particularly on:

  • Representation Bias: Analyze whether your training data represents your current customer base
  • Measurement Bias: Examine whether credit scores function equally across demographic groups
  • Evaluation Bias: Review whether your model validation process captures fairness metrics
  • Deployment Bias: Investigate how loan officers use algorithmic recommendations
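For the evaluation-bias item in particular, two standard fairness metrics can be computed directly from validation predictions. This is a minimal sketch; libraries such as Fairlearn offer fuller implementations.

```python
from typing import Sequence

def demographic_parity_gap(y_pred: Sequence[int], groups: Sequence[str]) -> float:
    """Largest difference in positive-prediction (approval) rate between groups."""
    rates = {}
    for g in set(groups):
        idx = [i for i, gg in enumerate(groups) if gg == g]
        rates[g] = sum(y_pred[i] for i in idx) / len(idx)
    return max(rates.values()) - min(rates.values())

def equal_opportunity_gap(y_true: Sequence[int], y_pred: Sequence[int],
                          groups: Sequence[str]) -> float:
    """Largest difference in true-positive rate (recall) between groups."""
    tprs = {}
    for g in set(groups):
        pos = [i for i, gg in enumerate(groups) if gg == g and y_true[i] == 1]
        if pos:
            tprs[g] = sum(y_pred[i] for i in pos) / len(pos)
    return max(tprs.values()) - min(tprs.values())
```

Reporting both metrics matters because they can disagree: a model can approve groups at equal rates overall while still missing qualified applicants in one group.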

Part C: Stakeholder Communication Plan

Design communications for:

  • Executive Summary (1 page): For C-suite executives, focusing on regulatory and reputational risks
  • Technical Brief (2 pages): For the AI development team, detailing specific model improvements
  • Community Response (1 page): For advocacy groups, outlining remediation steps

55.2 Case Study Analysis

Objective

Develop awareness of ethical considerations in data science practice.

Research and analyze two real-world data science ethical failures (for example, biased hiring algorithms, discriminatory lending models, or privacy breaches). For each case:

  • Identify what went wrong
  • Analyze the impact on stakeholders
  • Propose how it could have been prevented
  • Discuss lessons learned

55.3 Ethical Framework Development

Objective

Develop a personal framework for ethical decision-making in data science.

Create a personal ethical decision-making framework for data science projects. Include:

  • Key questions to ask during each project phase
  • Red flags that should halt a project
  • Stakeholder consideration checklist
  • Bias detection strategies

55.4 Privacy Impact Assessment

Design a template for assessing privacy implications of data science projects, considering:

  • Data collection and storage
  • Model transparency requirements
  • Consent and data rights
  • Long-term implications
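One way to make the template auditable is to express it as a structured record whose fields mirror the checklist above. The field names and sign-off rules here are illustrative assumptions, not a regulatory standard.

```python
from dataclasses import dataclass, field

@dataclass
class PrivacyImpactAssessment:
    """Minimal machine-readable PIA record; fields mirror the checklist above."""
    project: str
    data_collected: list = field(default_factory=list)  # categories of personal data
    retention_days: int = 0                             # storage duration
    consent_obtained: bool = False                      # consent and data rights
    model_explainable: bool = False                     # transparency requirements
    long_term_risks: list = field(default_factory=list)

    def open_issues(self) -> list:
        """Flag items that block sign-off under these illustrative rules."""
        issues = []
        if not self.consent_obtained:
            issues.append("no documented consent")
        if not self.model_explainable:
            issues.append("model transparency not addressed")
        if self.retention_days > 365:
            issues.append("retention exceeds one year")
        return issues
```

A structured record like this can be versioned alongside the project code, so the assessment is revisited whenever the data pipeline changes.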

55.5 Generative AI in Media & Entertainment

Objective

Develop an ethical framework for deploying generative AI in content creation while addressing intellectual property, bias, and authenticity concerns.

StreamVision, a major streaming platform, wants to use generative AI to create promotional materials, subtitle translations, and personalized content recommendations. However, recent lawsuits against AI companies and concerns about biased content generation have made executives nervous. You are tasked with creating an ethical AI deployment strategy.

Generative AI models are, at their core, large machine learning models, so the bias considerations for machine learning (ML) apply here as well; intellectual property concerns add a further layer of complexity.

Intellectual Property Audit

Training Data Assessment

  • Catalog potential copyrighted material in AI training datasets
  • Assess fair use implications for different use cases
  • Design content filtering to avoid copyrighted material reproduction

Artist Rights Protection

  • Create protocols for obtaining artist consent for style mimicry
  • Design attribution systems for AI-assisted content
  • Develop licensing frameworks for AI-generated derivative works

Bias Mitigation in Content Generation

Apply the bias framework to content creation:

Historical Bias in Entertainment

  • Analyze how historical bias in media representation affects AI-generated content
  • Design prompting strategies to counteract stereotypical representations
  • Create diversity benchmarks for generated content

Representation Bias in Global Content

  • Assess whether training data represents your global audience
  • Design region-specific content generation guidelines
  • Create cultural sensitivity review processes

Constitutional Prompting Framework

  • Develop system prompts that promote inclusive representation
  • Create bias detection algorithms for generated content
  • Design human-in-the-loop review processes
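The three elements above might be wired together as follows. The constitutional principles and the flag list are illustrative placeholders; a production detector would use a trained classifier rather than keyword matching, with flagged outputs routed to human reviewers.

```python
# A sketch of a "constitutional" system prompt plus a trivial post-hoc check.
CONSTITUTION = (
    "When generating promotional content: depict characters from a range of "
    "backgrounds, avoid stereotypical role assignments, and never infer "
    "sensitive attributes that were not explicitly provided."
)

# Illustrative terms only; real systems need context-aware detection.
STEREOTYPE_FLAGS = {"bossy", "exotic", "articulate for"}

def flagged_terms(generated_text: str) -> set:
    """Return flagged phrases found in the output, for human-in-the-loop review."""
    lower = generated_text.lower()
    return {t for t in STEREOTYPE_FLAGS if t in lower}
```

The value of the sketch is the pipeline shape, not the word list: a constitution steering generation, an automated screen, and a human review queue for anything the screen catches.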

Environmental Impact Assessment

Address the environmental impacts of generative AI.

Carbon Footprint Analysis

  • Calculate energy consumption for different AI use cases
  • Compare environmental costs to traditional content creation methods
  • Design efficiency optimization strategies
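A back-of-the-envelope energy estimate is often enough for the comparison asked for above. All the default values in this sketch (GPU power draw, datacentre overhead, grid carbon intensity) are illustrative assumptions and should be replaced with measured figures.

```python
def inference_co2_kg(gpu_hours: float, gpu_watts: float = 300.0,
                     pue: float = 1.5, grid_kg_per_kwh: float = 0.4) -> float:
    """Rough CO2 estimate: GPU energy x datacentre overhead (PUE) x grid intensity.

    Defaults are illustrative assumptions, not measured values.
    """
    kwh = gpu_hours * (gpu_watts / 1000.0) * pue
    return kwh * grid_kg_per_kwh

# e.g. 1,000 GPU-hours at the defaults:
# 1000 h * 0.3 kW * 1.5 = 450 kWh -> 180 kg CO2
```

The same formula applied to a traditional content-creation workflow (rendering farms, reshoots, travel) gives the comparison baseline the task asks for.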

Sustainable AI Practices

  • Create guidelines for minimizing unnecessary AI usage
  • Design content caching to reduce redundant generation
  • Establish green energy requirements for AI infrastructure

Authenticity and Disclosure Framework

  • Content Labeling: Design transparent disclosure systems for AI-generated content
  • Deepfake Prevention: Create detection and prevention systems for malicious AI use
  • Editorial Standards: Establish quality control processes for AI-assisted content
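A content label can be as simple as a structured disclosure record attached to each asset. The schema below is an illustrative sketch, not an industry standard (standards such as C2PA content credentials exist for production use).

```python
import json
from datetime import datetime, timezone

def ai_content_label(asset_id: str, model_name: str, human_reviewed: bool) -> str:
    """Build a JSON disclosure record to attach to an AI-generated asset.

    Field names are illustrative, not a standardized schema.
    """
    return json.dumps({
        "asset_id": asset_id,
        "ai_generated": True,
        "generator": model_name,
        "human_reviewed": human_reviewed,
        "labeled_at": datetime.now(timezone.utc).isoformat(),
    })
```

Recording whether a human reviewed the asset ties the labeling system back to the editorial standards item above.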

55.6 Multi-Stakeholder Ethics Simulation

Objective

Conduct a role-playing simulation that demonstrates the complexity of ethical decision-making when multiple stakeholders have conflicting interests.

You will facilitate a simulated ethics committee meeting at “Global Health Analytics”, a company developing AI tools for pandemic response. The committee must decide whether to share proprietary COVID contact-tracing data with public health officials, despite privacy concerns and competitive disadvantages.

This assignment synthesizes multiple ethical frameworks from the course, requiring students to navigate trade-offs between privacy, public health, competitive advantage, and social responsibility.

Stakeholder Role Assignments

Each participant represents a different stakeholder:

  • Chief Data Officer: Responsible for data governance and privacy compliance
  • Public Health Advocate: Community health organization representative
  • Chief Legal Counsel: Concerned about liability and regulatory compliance
  • Chief Technology Officer: Focused on technical feasibility and security
  • Marketing Director: Concerned about competitive advantage and customer trust
  • Privacy Rights Advocate: Representing digital rights organizations
  • Epidemiologist: Academic researcher studying pandemic spread
  • Board Member: Representing shareholder interests

Position Papers (Individual)

Each participant must research and write a 2-page position paper that includes:

  • Stakeholder Interests: What their constituency cares about most
  • Ethical Framework: Which ethical principles they prioritize
  • Risk Assessment: What they see as the biggest risks
  • Preferred Solution: Their ideal outcome and reasoning

Simulation

Opening Statements

  • Each stakeholder presents their position (5 minutes each)
  • Facilitator introduces the decision framework
  • Initial positions and conflicts are identified

Working Groups

  • Technical Working Group: CTO, CDO, Epidemiologist discuss feasibility
  • Legal Working Group: Legal Counsel, Privacy Advocate, Board Member assess risks
  • Public Interest Group: Public Health Advocate, Privacy Advocate, Epidemiologist explore impact

Negotiation Session

  • Full committee attempts to reach consensus
  • Facilitator guides discussion using structured ethical decision-making framework
  • Compromises and trade-offs are negotiated

Final Decision

  • Committee votes on final recommendation
  • Dissenting opinions are recorded
  • Implementation plan is developed

Each participant writes about:

  • Perspective Changes: How their views evolved during the simulation
  • Ethical Trade-offs: Which compromises were most difficult and why
  • Process Insights: What worked well/poorly in the decision-making process
  • Real-World Applications: How these dynamics apply to their future work

Group Case Study Development

Collectively develop:

  • Decision Documentation: Comprehensive record of the final decision and reasoning
  • Process Evaluation: Assessment of the decision-making framework used
  • Alternative Scenarios: How different circumstances might change the outcome
  • Best Practices: Recommendations for similar future decisions

Deliverables

  • Individual position papers (2 pages each)
  • Simulation notes and decisions
  • Individual reflection papers (2 pages each)
  • Group case study document

55.7 Assessment Criteria

  • Demonstrates understanding of ethical frameworks from the course
  • Applies multiple ethical perspectives to complex problems
  • Recognizes and articulates ethical trade-offs
  • Shows practical understanding of how ethics applies in real business contexts
  • Considers stakeholder perspectives and business constraints
  • Proposes implementable solutions
  • Connects ethical principles to technical design decisions
  • Demonstrates understanding of how bias enters algorithmic systems
  • Proposes technical solutions to ethical problems
  • Communicates ethical concepts clearly to diverse audiences
  • Creates actionable recommendations
  • Considers implementation challenges and change management