Beauty in the AI of the Beholder

ODSC - Open Data Science
4 min read · Mar 29, 2023

Editor’s note: Bonny is a speaker for ODSC East 2023 this May 9th-11th. Be sure to check out her talk, “Telling Geospatial Stories with Open Source Solutions,” there!

I don’t know about you but I only see the limits of AI in the realms of critical thinking and decision-making. I think its moment is Bayesian and should be focused on calculating probabilities of profit (if that is your thing). There are also prediction models I can get behind, also Bayesian, where nodes update information based on a series of priors. A good example is deciding to screen for certain health risks based on genetics, exposure, and other factors.
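To make the screening example concrete, here is a minimal sketch of a Bayesian update in Python. Every number in it is hypothetical and chosen only for illustration (none are clinical values), and the function names are my own. It also assumes each piece of evidence is conditionally independent of the others, which is a simplification a real model would need to check.

```python
def bayes_update(prior: float, p_evidence_if_risk: float, p_evidence_if_no_risk: float) -> float:
    """Return P(risk | evidence) using Bayes' rule."""
    numerator = p_evidence_if_risk * prior
    denominator = numerator + p_evidence_if_no_risk * (1.0 - prior)
    return numerator / denominator


def sequential_update(prior: float, evidence: list[tuple[float, float]]) -> float:
    """Fold each piece of evidence into the belief; each posterior becomes the next prior."""
    belief = prior
    for p_if_risk, p_if_no_risk in evidence:
        belief = bayes_update(belief, p_if_risk, p_if_no_risk)
    return belief


# Hypothetical values, for illustration only.
BASELINE_RISK = 0.01          # assumed prevalence before any evidence
EVIDENCE = [
    (0.60, 0.10),             # assumed: genetic marker seen in 60% of at-risk, 10% of others
    (0.50, 0.20),             # assumed: exposure reported by 50% of at-risk, 20% of others
]
SCREENING_THRESHOLD = 0.05    # assumed cut-off for recommending a screen

posterior = sequential_update(BASELINE_RISK, EVIDENCE)
should_screen = posterior >= SCREENING_THRESHOLD

print(f"posterior risk: {posterior:.3f}, screen: {should_screen}")
```

The point of the sketch is the shape of the reasoning: each factor nudges the probability up or down, and the decision to screen comes from the updated probability rather than from any single factor.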

I remember attending a health conference years ago where IBM Watson was touted as the next big thing. I have read estimates that over 2.5 million scientific papers are published each year, meaning it is not even remotely possible for your healthcare professional to know the latest and greatest, or even the latest and dangerous. Claims that IBM Watson would spit out treatment algorithms were preposterous to those of us who work in data science and are familiar with the sleight of hand all too common in many (but not all) clinical trials.

“Then again, most people don’t consult high-end journals to guide their scientific thinking. Should the devious be so inclined, these chatbots can spew an on-demand stream of citation-heavy pseudoscience on why vaccination doesn’t work, or why global warming is a hoax. That misleading material, posted online, can then be swallowed by future generative AI to produce a new iteration of falsehoods that further pollutes public discourse. The merchants of doubt must be rubbing their hands.” (From “Generative AI is sowing the seed of doubt in serious science,” gift article shareable link: https://on.ft.com/3SHf3yM)

Why is this relevant? My point in much of what I write, and definitely in what I speak about at conferences and panel discussions, is being able to hold tension. A listener once commented that he thought I meant “hold attention.” Well, that too, but let me explain holding a tension. I would argue that emotionality and motivation are lost when we simply exchange information. Taking away the focus, attention (and tension) leaves a narrative that falls flat. This means that once you create a data question and gather the data, you accept the responsibility of telling the story. Not just the bright shiny story you hoped to share but the curious, kinesthetic, auditory remnants of the data.

In the video snippet of Boston during the 1930s Home Owners’ Loan Corporation (HOLC) risk calculation, you are looking at actual Underwriter Manual text used to decide who would be granted a loan.

In the still image below I selected an area known as Roxbury. As you can see, this area is red-lined and determined not to be worthy of a government-sponsored home loan. The inhabitants were classified as infiltrating foreign-negro and the topography of the area was not favorable for investment.

You might be thinking, well Bonny, this was almost 100 years ago. But what if I can show you that many of these areas are still experiencing social vulnerabilities described as the risk of poverty, decreased lifespan, poorer health outcomes as well as deprivation of the intergenerational wealth transfer possible with home ownership?

Learning about the history of home ownership and how there were populations denied these opportunities should make us all curious to see what else we can reveal about the built infrastructure of our communities or any locations of interest. The image below is a modern-day snapshot with a few data layers engaged around the same area.

For example, a phenotype isn’t something we can legislate or address to improve patient outcomes or lower social vulnerabilities.

Geospatial analysis, on the other hand, allows examination of the built infrastructure in a community using satellite imagery and vector data (buildings, land use, highways…) to pinpoint potential contributors to poor health outcomes that are independent of phenotype. For example, pollution run-off from highways and roadways, overhead utility lines, pollutants in water and air, impervious surfaces, urban heat islands, noise pollution, lack of walkable streets, proximity to wastewater treatment, proximity to supermarkets, yada yada yada. You get the idea.
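As one small illustration of a proximity measure like the ones listed above, here is a sketch that finds the nearest supermarket to a point using the haversine (great-circle) distance. The coordinates and the names `store_a` and `store_b` are placeholders I made up, not real store locations. In practice you would likely pull points from vector layers in QGIS or a spatial database and measure along the street network; straight-line distance is only a rough stand-in.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometers between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_place(origin: tuple[float, float], places: dict[str, tuple[float, float]]) -> tuple[str, float]:
    """Return (name, distance_km) of the place closest to origin."""
    distances = (
        (name, haversine_km(origin[0], origin[1], lat, lon))
        for name, (lat, lon) in places.items()
    )
    return min(distances, key=lambda pair: pair[1])


# Placeholder coordinates, for illustration only.
neighborhood_point = (42.330, -71.090)
supermarkets = {
    "store_a": (42.335, -71.085),
    "store_b": (42.360, -71.060),
}

name, distance_km = nearest_place(neighborhood_point, supermarkets)
print(f"nearest: {name} at {distance_km:.2f} km")
```

Run over every block in a neighborhood, a measure like this becomes a data layer you can map next to the historical grades.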

The tools are open-source and the insights are powerful. There is genius in looking at the world with curiosity and intention.

My tagline has always been, making a big world seem smaller…

I will be elaborating more on geospatial tools and analysis at ODSC East 2023. When we tell stories, at least stories that bring us a fresh understanding of what may seem familiar, we enlarge our consciousness and refine our sensibility.

Using tools like QGIS, Python, and SQL, we will build complex themes about the built infrastructure in urban environments, flooding in Pakistan, the Brazilian rainforest, and post-disaster rebuilding on the island of Puerto Rico by highlighting individual stories, noticing connections, and synthesizing larger concepts and emerging ideas.

Originally posted on OpenDataScience.com
