
Responsible AI: Bias & Equity Concerns in Training AI Models

By Manoj Jha, PhD

Associate Adjunct Professor

Artificial Intelligence (AI) is no longer an abstract concept confined to research labs or tech giants. It’s now embedded in everyday life: making predictions, offering recommendations, and even assisting in decisions about hiring, healthcare, and public safety. Having worked with AI applications in cybersecurity and infrastructure analytics for over 20 years, I have seen both its power and its pitfalls.

One of the most pressing concerns I encounter is bias in training data, a problem that isn’t just technical but deeply ethical. AI systems don’t see the world as it is; they see it through the data we give them. And if that data is flawed, the decisions made by AI can reinforce injustice instead of reducing it.

If you’re considering a career in data science, IT, or cybersecurity, or are already in the field, understanding this issue is not optional. It’s central to designing systems that work fairly and responsibly.

The Role of Training Data: Where AI Learns to Think

Think of AI as a student. Like any learner, it forms its understanding from the examples it sees. In AI’s case, those examples are the training datasets. The problem? Those datasets are built by people, and people are never perfectly neutral.

If an AI model is trained mostly on data from a specific group, such as English-speaking users from Western countries, it will likely perform best for that group. This may sound like a technical oversight, but in practice, it means people outside that data profile might be misjudged, excluded, or harmed.

A widely cited study by the National Institute of Standards and Technology (NIST) found that some facial recognition algorithms misidentified Black and Asian faces at rates up to 100 times higher than white faces. These kinds of disparities can have real consequences, especially when AI is used in law enforcement or border security.

How Bias Enters AI Systems

Bias isn’t just about who is represented in the dataset. It can creep in through other avenues. Some examples include:

  • Sampling Bias: The training data doesn’t include enough variation to represent everyone fairly.
  • Labeling Bias: Humans apply labels to data. If their judgment is influenced by stereotypes, consciously or not, the AI will learn those biases.
  • Historical Inequity: AI trained on past data may reproduce patterns of discrimination that already exist, like gender imbalances in hiring.
  • Proxy Bias: Sometimes we use a stand-in for a harder-to-measure variable. For instance, healthcare AI might assume that more money spent equals more need, ignoring unequal access to care (see the short sketch below).

Once trained, the AI doesn’t know that these patterns are unjust. It simply sees them as “normal.” That’s where the risk lies.
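
To make the proxy problem concrete, here is a minimal sketch in Python using pandas and hypothetical column names (healthcare_spend as the candidate feature, group as a protected attribute). It is only an illustration of the kind of quick check an analyst might run, not a complete fairness analysis: if a feature’s distribution differs sharply across groups, a model that relies on it may be learning group membership indirectly.

    import pandas as pd

    # Toy data: a hypothetical feature and a protected attribute.
    df = pd.DataFrame({
        "healthcare_spend": [5200, 4800, 6100, 1900, 2100, 1700],
        "group": ["A", "A", "A", "B", "B", "B"],
    })

    # Compare the feature's average by group; a large gap is a red flag
    # that the feature may act as a stand-in for group membership.
    print(df.groupby("group")["healthcare_spend"].mean())

    # Another quick signal: how strongly does the feature track a binary
    # group indicator? (A simple Pearson correlation here.)
    df["is_group_b"] = (df["group"] == "B").astype(int)
    print(df["healthcare_spend"].corr(df["is_group_b"]))

A gap like this does not prove bias on its own, but it tells you where to look before the model turns the pattern into a decision rule.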

Why Students Should Care About Bias in AI

As a professor and industry professional, I encourage people to see AI not only as a tool, but as a responsibility. Technology doesn’t operate in a vacuum. The choices we make as data scientists, IT specialists, or cybersecurity analysts will shape who gets access, who is denied, and who gets left behind.

In fact, your understanding of AI bias could be what sets you apart in a competitive job market. Companies are increasingly being held accountable for their use of algorithms. They are looking for talent that can build not just high-performing systems, but fair ones.

Real-World Examples of Biased AI

The risk isn’t theoretical. Here are a few examples of how bias in AI can play out:

  • In Hiring: An AI system trained on previous hires from a male-dominated company may reject resumes that mention “women’s college” or include female-associated names.
  • In Healthcare: A risk prediction tool may use healthcare spending as a stand-in for how sick a patient is, which can underestimate the needs of underserved communities and patients.
  • In Lending: Credit models may penalize applicants from certain zip codes as an indirect proxy for race or income, despite equivalent creditworthiness.

These are not just system errors. They are structural problems embedded into the data and magnified by automation.

What We Can Do: Practical Steps Toward Responsible AI

Solving bias in AI doesn’t mean throwing out algorithms altogether. It means being more deliberate in how we design them. Here are some ways we can do better:

  1. Diversify the Data
    The first step is ensuring that training datasets reflect the full range of the population they are meant to serve. That includes racial, cultural, linguistic, and geographic diversity, not just technical correctness.
  2. Audit for Bias
    AI models should be stress-tested just like any other system. That includes measuring performance across different demographic groups to detect unintended disparities (see the sketch after this list).
  3. Document the Process
    Every dataset and model should come with clear documentation. What were the assumptions? Who labeled the data? What is missing? Transparency builds accountability.
  4. Keep Humans in the Loop
    Even in automated systems, human oversight is crucial, especially in sensitive applications like healthcare, finance, and public safety.
  5. Teach Ethics Early
    Courses in AI and data science should include ethics as a core component, not a footnote. Students need to be trained to spot risks, question defaults, and design responsibly from day one.
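
To show what step 2 can look like in practice, here is a minimal bias-audit sketch in Python. It assumes a pandas DataFrame with hypothetical columns y_true (actual outcomes), y_pred (model predictions), and group (a demographic category), and reports a few per-group metrics. A production audit would use an established fairness toolkit and carefully chosen group definitions; this is only the shape of the idea.

    import pandas as pd

    def audit_by_group(df: pd.DataFrame) -> pd.DataFrame:
        """Report simple per-group metrics for a binary classifier."""
        rows = []
        for group, sub in df.groupby("group"):
            false_positives = ((sub["y_pred"] == 1) & (sub["y_true"] == 0)).sum()
            negatives = (sub["y_true"] == 0).sum()
            rows.append({
                "group": group,
                "n": len(sub),
                "accuracy": (sub["y_pred"] == sub["y_true"]).mean(),
                "selection_rate": (sub["y_pred"] == 1).mean(),
                "false_positive_rate": false_positives / negatives if negatives else float("nan"),
            })
        return pd.DataFrame(rows)

    # Toy example: compare how the model treats two groups.
    df = pd.DataFrame({
        "y_true": [1, 0, 1, 0, 1, 0, 1, 0],
        "y_pred": [1, 0, 1, 1, 0, 0, 1, 1],
        "group":  ["A", "A", "A", "A", "B", "B", "B", "B"],
    })
    print(audit_by_group(df))

Large gaps in selection rate or error rates across groups are a signal to investigate the data and the model, not proof of a specific cause, and that is exactly why the documentation and human oversight in steps 3 and 4 matter.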

Trust Is the Real Innovation

In a world where machines increasingly influence decision-making, trust is everything. Users, regulators, and the public need to feel confident that AI systems are fair, accurate, and explainable. That starts with the data, and with the professionals who build and manage those systems.

At my company, where I lead our advanced analytics and data science team, we are constantly evaluating not just what our models predict, but how and why. And in my classes at UMGC, I encourage students to do the same. If we want AI to help solve society’s biggest challenges, it must work for everyone, not just those whose data dominates the training set.

Final Takeaway: This Is Your Moment

Whether you are just beginning your degree at UMGC or already working in the tech field, you are entering one of the most exciting and ethically demanding eras in technology. The ability to code or analyze data is valuable, but the ability to do so responsibly is essential.

As AI becomes more powerful, we don’t just need more engineers. We need ethical engineers. We need professionals who understand that data isn’t just information; it represents people. And people deserve fairness.

