Practicing Responsible AI

Artificial Intelligence—particularly Generative AI—has opened up exciting possibilities for learning data analysis and working with tools like Pandas and Matplotlib. But with great power comes great responsibility 💡

Whether you’re using AI to support your data analysis learning or building data-driven tools and dashboards powered by AI, it’s important to act ethically, thoughtfully, and responsibly.


🔐 Protecting Privacy: Understanding Data Flow

🌐 Where Your Data Goes When You Use AI

When you send a prompt to an AI tool like ChatGPT, Claude, or Gemini, that information doesn’t stay on your computer. Instead, your prompt gets sent across the internet to the company’s massive data centers where their AI models are running.

Here’s what happens to your data:

  • It travels to company servers - Your prompts are processed on OpenAI’s, Anthropic’s, Google’s, or Microsoft’s infrastructure
  • Companies have different policies - Each AI provider has their own terms about how they handle, store, and potentially use your data
  • You’re trusting them with your information - While most companies have privacy policies, they ultimately have access to everything you send

The key concern: even assuming good intentions and solid privacy policies, these companies have the technical capability to access your data. Some may use it to improve their models, others promise not to, and policies can change.

This is why data security must be at the forefront of your mind when working with any confidential or sensitive information. Never assume your prompts are private.

🧩 The Mosaic Data Problem

Even when you think you’re only sharing “harmless” parts of your data, combining seemingly innocent information can reveal someone’s identity. This is called the mosaic data problem.

For example, sharing just age, suburb, and occupation might seem safe, but these three pieces together could uniquely identify someone in a small community. This is why it’s crucial to use dummy data when getting AI help with your analysis.
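A quick sketch makes the mosaic problem concrete. The snippet below uses a tiny made-up dataset (all names and values are hypothetical) and counts how many rows share each combination of the three "harmless" attributes; any combination held by only one row uniquely identifies a person:

```python
import pandas as pd

# Hypothetical records: no names, just three "harmless" attributes
df = pd.DataFrame({
    "age":        [34, 34, 29, 34],
    "suburb":     ["Toowong", "Toowong", "St Lucia", "Toowong"],
    "occupation": ["nurse", "nurse", "teacher", "teacher"],
})

# Count how many rows share each (age, suburb, occupation) combination.
# A group of size 1 means that combination points to exactly one person.
group_sizes = df.groupby(["age", "suburb", "occupation"]).size()
unique_combos = group_sizes[group_sizes == 1]
print(unique_combos)
```

Even in this four-row example, two of the three combinations single out exactly one individual. The same check, run on your real data before sharing anything, tells you how identifying your "safe" columns actually are.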

🎭 Creating Dummy Data with AI

The good news? AI can help you create realistic dummy data that maintains your dataset’s structure without privacy risks:

  • Ask AI to generate sample data that matches your column types and patterns
  • Describe your data structure (column names, data types, ranges) rather than sharing actual records

This way, you get the coding help you need while keeping real people’s information safe.
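One way to follow the second bullet is to generate the dummy data yourself and share only that (or the generating code) with the AI. The sketch below uses the standard library plus pandas; the column names, ranges, and suburb labels are all hypothetical placeholders for your real schema:

```python
import random

import pandas as pd

random.seed(42)  # reproducible dummy data

# Describe your real dataset's structure (hypothetical columns shown here),
# then generate fake rows matching the types and ranges -- never real records.
n_rows = 5
dummy = pd.DataFrame({
    "customer_id": range(1000, 1000 + n_rows),
    "age":         [random.randint(18, 90) for _ in range(n_rows)],
    "suburb":      [random.choice(["Suburb A", "Suburb B", "Suburb C"])
                    for _ in range(n_rows)],
    "spend_aud":   [round(random.uniform(10, 500), 2) for _ in range(n_rows)],
})

print(dummy)
```

Because `dummy` has the same shape as your real data, code the AI writes against it should run unchanged on the real dataset on your own machine.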


💻 For Builders: Responsible Use When Creating AI-Enabled Data Tools

If you’re building dashboards, data analysis tools, or applications using GenAI, you carry additional responsibilities. Your design decisions can amplify or mitigate harm, support or undermine users, and shape how others use AI with data.

🔧 Principles for Ethical AI Integration

  • ⚖️ Fairness and Inclusion
    Ensure your data tool doesn’t exclude or harm any group. Be mindful of biases in datasets and how your analysis might affect different communities.

  • 🔍 Transparency
    Clearly communicate what the AI is doing, when users are interacting with AI-generated analysis, and how insights are generated. Make data sources and methods visible.

  • 🛠️ Human-in-the-Loop
    For critical or sensitive data analysis tasks, design your system to involve human review and oversight. Don’t automate decisions that require domain expertise.

  • 📚 Educate Your Users
    Help users understand both the benefits and limitations of your AI-powered data tools. Make it clear when results need validation.

  • 🌿 Reflect and Improve
    Monitor how your tool is being used. Be open to feedback and make improvements that promote responsible usage.

🛡️ Protecting Yourself: Package Security

When AI suggests installing Python packages, you need to be careful about typosquatting attacks. These are malicious packages with names that look similar to popular libraries.

What is typosquatting?

Attackers create fake packages with names that are almost identical to legitimate ones. A single typo in your installation command could download malware instead of the real library.

Common typosquatting examples:
  • pandas (correct) vs panda, pandsa, or python-pandas (potentially malicious)
  • numpy (correct) vs numppy, nunpy, or numpy-python (potentially malicious)
  • matplotlib (correct) vs matplotlb or matplotlib-python (potentially malicious)

How to protect yourself:

  1. Always verify package names on PyPI.org before installing
  2. Check for official documentation links and high download counts
  3. Look for recent updates and active maintenance
  4. When in doubt, search for “[package name] official documentation”
  5. Be extra careful when AI suggests packages you haven’t heard of

This is especially important with AI assistance, as it can hallucinate package names or suggest outdated alternatives that attackers might exploit.
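As one small check in that routine, you can confirm a package name at least exists on PyPI before installing it, using PyPI's public JSON API. This is a minimal sketch, and note the caveat: existence is not safety, since typosquatted packages are real packages too.

```python
import json
import urllib.error
import urllib.request


def package_exists_on_pypi(name: str) -> bool:
    """Return True if a package with exactly this name exists on PyPI.

    Uses PyPI's public JSON API (https://pypi.org/pypi/<name>/json).
    Existence alone is not proof of safety: still check the project page,
    documentation links, download counts, and recent maintenance before
    installing anything unfamiliar.
    """
    url = f"https://pypi.org/pypi/{name}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            data = json.load(resp)
        return "info" in data
    except OSError:  # covers HTTP 404 (no such package), timeouts, no network
        return False


# "pandas" is the real package; a typo like "pandsa" either won't exist
# or, worse, may exist as a malicious lookalike.
print(package_exists_on_pypi("pandas"))
```

Treat this as a first filter only; steps 2-4 above (official docs, download counts, active maintenance) are what actually establish trust.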

Tip: Build for Trust

Responsible AI design isn’t just ethical—it builds trust, improves user experience, and helps your data tools stand the test of time.


🎓 Academic Integrity and Learning

Now that we’ve covered the technical and security risks of using AI, everything comes back to one core principle: use AI to enhance your learning, not offload it. Keep this in mind throughout your education.

The University of Queensland (UQ) has excellent resources to help you use AI ethically in your studies:

🔗 Explore the AI Student Hub

📋 Key Guidelines to Remember:

  • ✅ Verify AI Outputs
    Always cross-check information generated by AI against reliable sources. When AI generates data analysis code, verify the results make sense in your domain context.

  • 🧠 Think Critically
    Don’t blindly accept what AI tells you—use it to spark ideas, not do your thinking for you. Remember: AI can hallucinate functions or generate code that runs but produces incorrect results.

  • ✍️ Acknowledge AI Use
    Clearly state when and how you used AI in assignments to maintain academic integrity.

  • 🤝 Complement, Don’t Replace
    Use AI to enhance your learning, not replace it—see Principle 3.

  • 🌏 Respect Cultural Sensitivities
    Consider issues like Indigenous data sovereignty and cultural perspectives when working with data and prompts.
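The "Verify AI Outputs" guideline can be made routine with a few lines of code: recompute key figures independently and assert basic domain facts about the result. The sketch below uses a hypothetical sales DataFrame standing in for the output of AI-generated analysis code:

```python
import pandas as pd

# Hypothetical result of AI-generated analysis code
sales = pd.DataFrame({
    "month":   ["Jan", "Feb", "Mar"],
    "revenue": [1200.0, 950.0, 1430.0],
})
monthly_total = sales["revenue"].sum()

# Simple domain sanity checks: do the numbers make sense?
assert sales["revenue"].ge(0).all(), "Revenue should never be negative"
assert sales["month"].is_unique, "Each month should appear once"
print(f"Total revenue: {monthly_total:.2f}")
```

Checks like these won't catch every hallucinated function or subtle logic error, but they routinely catch the code-runs-but-answer-is-wrong failures mentioned above.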

Warning

Misuse of AI—such as submitting AI-generated data analysis work without acknowledgement or understanding—can breach academic integrity policies. Use AI with honesty and transparency.


🧠 Reflective Questions for All Users

Whether you’re a student learning data analysis or a builder creating data-driven tools, pause regularly and ask yourself how your use of AI measures up against the Core Principles.


Task: Addressing Privacy Issues

How can you address privacy concerns while still getting an LLM to generate data analysis code that is specific to your dataset?