David vs. Goliath – Data Ethics and AI

Oct 3, 2023

Justin E. Lane

Co-Founder & CEO

“17 sites were trading my information actively. I don’t get a penny for this information… but these companies are making billions.”

Social media platforms have long made their living turning their users into products for advertisers and businesses to profit from; collecting and selling data has been business as usual for so long that it almost feels like an immutable truth. But it doesn’t have to be. As the AI revolution takes shape and continues to weave its way into our daily lives, the questions around data ethics become increasingly imperative.

Why? Because data – and lots of it – is the backbone of all the emerging generative AI systems. We sat down with our CEO, Justin Lane, to dive into data ethics, the trouble with big data, and the importance of building AI tools that address the issues.

Defining Data Ethics

The ethical use of data, no? Sure, but what does that mean in practice? To get to the heart of it, we need to get more specific. Data ethics revolves around “not violating the trust and rights of other people in terms of how you're using data.”

This includes:

Not using data for purposes that the individual wouldn’t consent to
Not using data for things that could potentially harm somebody else
Not using data to create an AI system that is solely tuned to doing something bad

As simple as these principles seem, data is a billion-dollar business where right and wrong get murky quickly.

The Folly of Big Data

“The fact is sometimes you don't need bigger data; you just need better algorithms.” Marketers have long relied on demographics like race, religion, sex, age, familial status, nationality, etc. to segment people. The case of Meta’s discriminatory housing ads is a textbook example of coupling those old-school marketing tactics with an almost blind overreliance on big data. As the Department of Justice noted, “the complaint alleges that Meta uses algorithms in determining which Facebook users receive housing ads, and that those algorithms rely, in part, on characteristics protected under the Fair Housing Act.” The thing is, it’s not actually the demographics that lead to and motivate buying decisions – it’s beliefs.

“Why would you want to segment that and just say, I'm not even going to talk to these people? When really you want to talk to everyone who would be interested. That's what's going to make your marketing most effective.”

New-School, Belief-Based Marketing

“You're not making products for a demographic; you're making products for people who need them and the demographics were just proxies for those people. With CulturePulse, we’re able to quantify people's beliefs, their values and the things that they're interested in.” If a person believes they should live a healthy lifestyle, for example, you should be able to focus your advertising of vitamins, supplements, exercise regimens, books, podcasts and whatnot on them specifically because they’re health conscious. That has nothing to do with their race, their income demographic, etc. – the belief transcends.

“You can be more ethical by not incorporating that data.”

Corporate Responsibility: The Status Quo vs. The People

With great power comes great responsibility, as they say. However, legacy companies like Meta and Twitter have made their billions trading our personal data. Remember, even if you’ve deleted your profile, in many cases they keep your data. “Now that there's this idea in mind of needing to pay people for their data, I think that that's why you're seeing a lot of activation in the legal space and you're going to see more lawsuits coming against the big guys around AI and data ethics in particular.” This dovetails with the Constitution and the Fourth Amendment – see what we mean about murky? “If we view our data as our property and therefore your security, then it means that someone can't take the data you produce and just sell it without your consent or your knowledge. Or you're being paid for it so you can agree to, say, a minuscule percent of Facebook’s profit.” And it’s not just about compensating individuals for the trillions of data points they’ve fed these companies over the years; AI models are being trained on the same data.

Corporation-on-Corporation Crime

“Reddit Wants to Get Paid for Helping to Teach Big A.I. Systems”, reads a recent headline from the New York Times. Google, OpenAI, Microsoft and more have been freely crawling and scraping Reddit data to build out their AI tools, creating massive value for themselves in the process without delivering any value back to Reddit – or the users who provided the data. [Note: CulturePulse pays Reddit for this same data.] Of course, users signed away their rights in the never-read terms of service, a classic case of what’s ethical and what’s legal not being quite aligned.

What’s Next?

That’s the big question that CulturePulse will be working towards answering as AI evolves. As Justin points out, “a lot of what I see in the startup community is people wanting to change things. They want to do it. They’re seeking ways to be more innovative in terms of their data ethics and more broadly ethical in terms of their innovations.” It’s less David vs. Goliath and more David feeding Goliath. Maybe the bigger question is, how much longer can that last?

Justin spoke on setting ethical standards at the previous Reflect Festival; check out his thoughts:
