Making Healthcare Data Usable with AI

Healthcare is drowning in data, but still thirsty for insight. Discover how analytics, AI, cloud, and smarter standards are finally helping the system make sense of the 97% of data we’ve barely touched.

It's astonishing that in some corners of the world, life-altering medical decisions still rest on the faint sounds in a stethoscope, a fleeting intuition, and a prayer. Much like lobotomies or bloodletting - eerie reminders of how far we’ve come - this once-standard practice is quietly retiring, giving way to new kinds of wisdom hidden in data and code. What once took hours of chart review and second-guessing, AI now deciphers in seconds, translating complex data into answers that help scientists accelerate discovery and healthcare providers act with confidence and care.

Healthcare Datasphere

With nearly 30% of the global datasphere rooted in healthcare,1 the industry churns out more information than most can fathom: from EHRs, high-res medical imaging, precision genomics, and, increasingly, patients themselves. This year, healthcare data will swell to 10 zettabytes, the equivalent of 76 million years of continuous video streaming. But despite this abundance, only 3% of that data is ever translated into decisions that shape care, operations, or policy. Meanwhile, studies consistently find that between 80% and 97% of this data goes unanalyzed and unused, left gathering digital dust.2

Structured vs Unstructured Health Data

Roughly 80% of healthcare data are unstructured, with only 20% falling into structured formats.3

So, what’s flooding the cloud and cramming the servers with gigabytes of health data? A major source is Electronic Health Records (EHRs), the digital pulse of modern medicine. They house structured data like demographics, diagnoses, medications, allergies, and vitals, alongside free-text provider notes. Many also include lab results and imaging reports. Nearly every U.S. hospital and clinic now uses EHRs; as of 2021, 78% of office-based physicians and 96% of hospitals had adopted a certified EHR.4 These systems alone churn out massive volumes of clinical data daily.

Healthcare generates enormous volumes of data not just from clinical records, but also from administrative and imaging sources. Providers and payers produce vast troves of administrative data tied to billing and insurance claims - structured datasets like Medicare, Medicaid, and private insurer records that log services provided, diagnoses made, and charges billed. Though primarily designed for reimbursement rather than clinical insight (e.g., noting that a test was ordered but not its result), these datasets are population-wide, standardized, and indispensable for operational and policy analysis.

Meanwhile, imaging data from X-rays and MRIs to CT scans and ultrasounds adds another colossal layer. High-resolution by nature, a single MRI can clock in at hundreds of megabytes, making imaging one of the biggest contributors to healthcare’s data footprint. While vital for diagnosing everything from fractures to tumors, these data have often lived in silos, used only by radiologists. But with the rise of AI/ML, there's a growing push to mine these images for diagnostic patterns and deeper insights. Pathology data adds to this richness, combining microscope images with text-based findings. Like radiology, it’s highly unstructured and pixel-based, requiring sophisticated algorithms to interpret. Together, these image-heavy datasets are reshaping what’s possible in healthcare analysis.

Beyond clinical records and claims, healthcare data flow from a mosaic of sources - lab tests, diagnostic reports, pharmacy records, genomic and precision medicine datasets, wearable devices, remote sensors, patient-reported outcomes, public health registries, and research or clinical trial data. Each brings its own strengths and complications:

  • EHRs are clinically rich but often fragmented and inconsistent
  • Claims data are clean and standardized, yet shallow in clinical nuance
  • Imaging and genomics offer profound insights but demand heavy technical infrastructure
  • Patient-generated data deliver real-world context, though they can be noisy and hard to integrate

Together, these sources form a sprawling, heterogeneous ecosystem. To make healthcare data truly actionable, we must understand not just what each type offers, but also how to navigate their quirks, limitations, and possibilities.

Disconnected Data by Design

One of the most fundamental barriers to unlocking healthcare data is the lack of interoperability: health IT systems often speak different languages and cannot easily share information. Hospitals and clinics may use different EHR vendors like Epic or Cerner, or even custom-built systems, each storing data in unique, often incompatible formats. Even within a single institution, departments like radiology, pharmacy, and the lab may run on separate software systems that don't communicate effectively, leaving valuable data trapped in digital silos. This fragmentation severely limits the ability to coordinate care or conduct meaningful analysis at scale. These silos aren’t just technical; they’re also organizational and historical.

Healthcare Data Sources

EHRs and imaging data make up the largest share of healthcare data, but sources are often fragmented and siloed. (Shares shown are approximate.)

Healthcare data are often dispersed across provider offices, hospitals, labs, insurers, and public health agencies, each holding just one piece of a patient's story. Some organizations treat data as a proprietary asset, reluctant to share it with competitors. Even internally, disconnected databases, such as a research system isolated from the clinical EHR, reinforce the fragmentation. The result is a patchwork that complicates everything from answering basic questions like “what actually improved this patient’s outcome?” to ensuring information is up to date. Siloed systems lead to duplicated efforts, inconsistent records, and, ultimately, a healthcare system that struggles to connect the dots and see the whole picture.

Even when data can be exchanged, mismatched standards and formats remain major obstacles. One system might label systolic blood pressure as “BP_sys,” another as “SystolicBloodPressure,” and each may record it in different units, making integration a tedious mapping task. With varied coding systems like ICD, CPT, SNOMED, and LOINC, and no universal adherence, combining datasets requires intensive cleaning and normalization. This lack of semantic interoperability means analysts spend more time reconciling definitions than gaining insights. On top of that, data quality often falls short: records are riddled with typos, gaps, outdated entries, and missing context (like when or how a value was measured). These flaws erode trust and force clinicians and analysts to second-guess the data before they can actually use it.
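To make the mapping task concrete, here is a minimal Python sketch of reconciling the two blood pressure labels mentioned above. The canonical field name, the record layout, and the assumption that one system reports in kPa are all hypothetical, chosen only for illustration; real integrations typically map to a shared vocabulary such as LOINC rather than an ad hoc dictionary.

```python
# Hypothetical mapping from source-specific field names to one canonical name.
FIELD_ALIASES = {
    "BP_sys": "systolic_bp_mmhg",
    "SystolicBloodPressure": "systolic_bp_mmhg",
}

# Conversion factors into mmHg; 1 kPa is about 7.50062 mmHg.
TO_MMHG = {"mmHg": 1.0, "kPa": 7.50062}


def normalize(record: dict) -> dict:
    """Return a copy of `record` with known fields renamed and converted to mmHg."""
    out = {}
    for name, payload in record.items():
        canonical = FIELD_ALIASES.get(name)
        if canonical is None:
            out[name] = payload  # unknown fields pass through untouched
            continue
        value, unit = payload["value"], payload["unit"]
        if unit not in TO_MMHG:
            raise ValueError(f"unsupported unit {unit!r} for field {name!r}")
        out[canonical] = round(value * TO_MMHG[unit], 1)
    return out


# Two made-up source records describing the same kind of measurement.
system_a = {"patient_id": "A-001", "BP_sys": {"value": 120, "unit": "mmHg"}}
system_b = {"patient_id": "B-774", "SystolicBloodPressure": {"value": 16.0, "unit": "kPa"}}

normalized_a = normalize(system_a)
normalized_b = normalize(system_b)
print(normalized_a)
print(normalized_b)
```

After normalization both records expose the same field in the same unit, which is the precondition for any pooled analysis. Rejecting unknown units, rather than guessing, keeps silent conversion errors out of the combined dataset.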

Privacy regulations like HIPAA in the U.S. and GDPR in Europe are essential for protecting patient information, but they also introduce complexity that can stifle data sharing even when it's in patients’ best interest. Strict rules, the fear of breaches, and hefty penalties lead many organizations to lock data away unless sharing is explicitly permitted. While these safeguards uphold confidentiality, they can make it difficult to use data for care coordination or innovation.

Layered on top of regulatory friction are cultural barriers: many clinicians, trained to prioritize patient interaction over paperwork, are skeptical that more data means better care, especially when data tools are clunky or intrusive. Poorly designed systems can contribute to alert fatigue and workflow disruption, making providers wary of digital tools. Meanwhile, many healthcare institutions lack a strong data-driven culture and the internal talent, like data scientists and informaticians, needed to translate raw information into actionable insights. Resource constraints further widen the gap, leaving even valuable data underutilized. In the end, data often sit dormant, trapped by both policy caution and cultural hesitation.

Healing the Data Divide

Here’s the bright side: healthcare is catching up. With robust data standards, next-gen analytics with AI/ML, and a growing push for better data stewardship, the gears are finally turning to make healthcare data more connected and actionable.

A key step toward solving interoperability in healthcare is adopting common data standards. One major breakthrough is HL7 FHIR (Fast Healthcare Interoperability Resources), a modern, web-based API standard that allows systems to exchange specific data (e.g., lab results, medications, or allergies) in structured formats such as JSON or XML. Under the U.S. 21st Century Cures Act, certified EHRs are now required to support FHIR APIs for core data elements.5 Complementing this, the U.S. Core Data for Interoperability (USCDI) defines a standardized set of health data classes that must be exchangeable, including demographics, problems, and labs. Together, these standards enable more consistent and on-demand access to health information. FHIR is also gaining traction globally, with national programs in Europe and Canada adopting it. Other vital standards include DICOM for imaging, LOINC for lab tests, SNOMED CT and ICD-10 for diagnoses, and RxNorm for medications.
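As a rough illustration of what "structured formats such as JSON" means in practice, the Python sketch below parses a small FHIR-style Observation and pulls out the fields an analyst usually needs. The resource is hand-written for this example (the patient reference, timestamp, and value are made up), and the helper name summarize_observation is hypothetical. It reads only a handful of common Observation fields and is not a full FHIR client or validator.

```python
import json

# Illustrative FHIR R4 Observation; the patient reference and values are made up.
observation_json = """
{
  "resourceType": "Observation",
  "status": "final",
  "code": {
    "coding": [
      {"system": "http://loinc.org", "code": "8480-6",
       "display": "Systolic blood pressure"}
    ]
  },
  "subject": {"reference": "Patient/example"},
  "effectiveDateTime": "2024-01-15T09:30:00Z",
  "valueQuantity": {
    "value": 120,
    "unit": "mmHg",
    "system": "http://unitsofmeasure.org",
    "code": "mm[Hg]"
  }
}
"""


def summarize_observation(resource: dict) -> dict:
    """Flatten the parts of an Observation that most analyses need."""
    if resource.get("resourceType") != "Observation":
        raise ValueError("expected an Observation resource")
    coding = resource["code"]["coding"][0]       # first coding only, for brevity
    quantity = resource.get("valueQuantity", {})  # absent for non-numeric results
    return {
        "patient": resource.get("subject", {}).get("reference"),
        "loinc_code": coding["code"] if coding.get("system") == "http://loinc.org" else None,
        "display": coding.get("display"),
        "value": quantity.get("value"),
        "unit": quantity.get("unit"),
        "measured_at": resource.get("effectiveDateTime"),
    }


summary = summarize_observation(json.loads(observation_json))
print(summary)
```

Because the measurement carries a LOINC code and a UCUM unit rather than a vendor-specific label, a receiving system can interpret it without a custom field mapping. That is the practical payoff of pairing FHIR with shared terminologies.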

Building on data standards, national frameworks are emerging to streamline data sharing. In the U.S., TEFCA (Trusted Exchange Framework and Common Agreement) is creating a “network of networks” that connects providers across regions and vendors under a unified set of rules, allowing them to securely query and share patient data. Private networks like CommonWell and Carequality are aligning with TEFCA to enhance interoperability. Meanwhile, open APIs are giving patients and third-party apps more access to their health data, moving us closer to seamless, real-time exchange.

Quantiles uses AI, big data, and advanced statistical modeling to analyze complex healthcare data.

On another front, AI/ML are helping make sense of the mountains of healthcare data, the exact challenge we're tackling at Quantiles. Tools like natural language processing extract structured data from free-text notes, while computer vision interprets medical images. AI also tackles messy data by matching patient records, mapping across coding systems, and flagging anomalies. Many health systems now use cloud-based platforms to unify data and apply ML for tasks like fraud detection or patient stratification. While ethical and technical hurdles remain, AI is quickly becoming essential to transforming raw health data into actionable insight.
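To give a feel for the record-matching task mentioned above, here is a deliberately simple Python sketch. It stands in fuzzy string similarity from the standard library for the probabilistic or ML-based linkage that production systems typically use, and it is not a description of any particular vendor's method. The record fields, the sample patients, and the 0.85 threshold are assumptions made for illustration.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two names, ignoring case and outer spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def is_probable_match(rec_a: dict, rec_b: dict, threshold: float = 0.85) -> bool:
    """Treat two records as the same patient if birth dates agree exactly
    and the names are similar enough. The threshold is an arbitrary choice."""
    if rec_a["dob"] != rec_b["dob"]:
        return False
    return name_similarity(rec_a["name"], rec_b["name"]) >= threshold


# Made-up records from two hypothetical source systems.
ehr_record = {"name": "Katherine O'Neil", "dob": "1984-03-12"}
lab_record = {"name": "katherine ONeil ", "dob": "1984-03-12"}
other_record = {"name": "John Smith", "dob": "1984-03-12"}
wrong_dob_record = {"name": "Katherine O'Neil", "dob": "1948-03-12"}

same_patient = is_probable_match(ehr_record, lab_record)
different_patient = is_probable_match(ehr_record, other_record)
dob_mismatch = is_probable_match(ehr_record, wrong_dob_record)
print(same_patient, different_patient, dob_mismatch)
```

Requiring an exact date-of-birth match before comparing names trades recall for precision: it avoids merging different people who share a name, at the cost of missing records where the birth date itself was mistyped. Real linkage pipelines weigh many more fields and tune that trade-off against labeled data.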

Healthcare’s digital backbone is undergoing a major transformation, from a patchwork of on-prem servers and locked-down silos to collaborative, integrated clouds. Increasingly, hospitals and health systems are turning to platforms like Amazon Web Services, Microsoft Azure, and Google Cloud to build data lakes and warehouses where EHRs, claims, device feeds, and more can flow into one cohesive stream, ready for analysis, AI model deployment, and secure sharing with trusted partners. Collaboration becomes easier when everyone’s working from the same digital map. Cloud platforms, built with security controls intended to support HIPAA compliance, are doing more than making storage easier. They're enabling a new era of scalable, intelligent health systems finally poised to harness the 97% of data we’ve barely touched. It’s still early days, but the cloud is quickly becoming healthcare’s launchpad for smarter, more connected care.

By combining AI and analytics, healthcare is beginning to turn overwhelming data into strategic intelligence that powers better care and smarter systems.

As healthcare becomes increasingly data-rich, the true opportunity lies in how we analyze and apply that information. By unifying data sources and refining the tools to analyze them, we unlock the ability to ask better questions and get faster, more actionable answers. Whether it's reducing hospital readmissions, identifying high-risk patients, or streamlining supply chains, analytics is becoming the lens that sharpens our vision. It allows us to shift from reactive to proactive, from fragmented to whole, and from intuition alone to informed, data-driven decisions at every level of care.
