Identifying Child Mental Health and Neurodevelopmental Conditions Using Real-World Clinical Data
Video Transcription
Welcome to our session on Identifying Child Mental Health and Neurodevelopmental Conditions Using Real-World Clinical Data. I'm Juliet Edgcomb and I'm joined by my colleagues, Dr. Benson and Dr. Angell, and our discussant, Dr. Alter, to share new and emerging research on advancing information extraction and accelerating the pace of discovery of children and adolescents with psychiatric and neurodevelopmental conditions. Please note the session is being recorded, and if you ask questions, please use the microphone so that they can be captured for the recording. These are our speaker disclosures. Dr. Benson has research support from the Brain and Behavior Research Foundation, volunteers for the Epic Behavioral Health Specialties Steering Board, and has a spouse who works for Cerebral Therapeutics. I've received research support from the Brain and Behavior Research Foundation, AFSP, the Sorenson Foundation, the Thrasher Research Fund, and the NIH; Dr. Angell has research support from the American Occupational Therapy Foundation and the NIH; and Dr. Alter has no financial relationships to disclose. So our session objectives for today are, first, to evaluate commonly used methods to detect populations of children with mental health and neurodevelopmental conditions from existing clinical data sets. Second, to develop reflective practices to mitigate implicit biases and errors when leveraging existing clinical data sets for detection of child mental health and neurodevelopmental conditions at the individual and population level. Third, to differentiate different clinical data types, including multi- and single-payer claims data and EHR data, and contrast the use of these data for detection across clinical presentations; we'll highlight three conditions, suicide, autism, and psychosis, in our talks today. And throughout the presentations, to critically review cases of childhood-onset mental health and neurodevelopmental conditions, analyze methods of detection, and demonstrate potential downstream correlates to quality, service delivery, and health equity. So here's our agenda today. First I'll introduce our speakers, then Dr. Benson will speak on identifying children and young adults with schizophrenia spectrum disorder diagnoses in large data sets. Then I'll speak on detection of self-injurious thoughts and behaviors among children receiving emergency care. And finally, Dr. Angell will talk about using machine learning with real-world data to identify signs of autism in children: can we reduce disparities in age of identification among girls and Latinx children? This will be followed by a panel discussion with our discussant, Dr. Alter, and our Q&A. So a little bit about us. Our first speaker is Dr. Benson. She is Associate Chief Medical Information Officer at McLean Hospital, Medical Director of Digital Solutions at Mass General Brigham, and an Assistant Professor in the Department of Psychiatry at Harvard Medical School. Dr. Amber Angell is an Assistant Professor in the Mrs. T.H. Chan Division of Occupational Science and Occupational Therapy at the University of Southern California. And she runs the DREAMS Lab, Disparity Reduction and Equity in Autism Services, which focuses on understudied and underserved groups of individuals on the autism spectrum. This is me. I'm Assistant Professor of Child Psychiatry and Associate Director of the Mental Health Informatics and Data Science (MINDS) Hub at UCLA. And our discussant is Dr. 
Alter, who is Associate Chair for Clinical Integration and Operations in the Department of Psychiatry and Behavioral Sciences at the University of Texas at Austin Dell Medical School, Professor in the Department of Psychiatry, and Chair of the APA Council on Quality Care. So without further ado, I'll turn it over to my co-presenter, Dr. Benson. All right. Thank you. Good morning. Thanks for coming on Wednesday of APA Week. So this is sort of an outline of what I'm going to be talking about. I'm going to first talk broadly about digital tools and applications in child and adolescent psychiatry, and then use informatics methods to review two case studies that think about some of the implications and challenges when we're trying to use large data sets to identify children with severe mental illness. So I'm going to start by discussing digital health tools more broadly and their application, as I mentioned, and then touch on some of the challenges within digital tools and how we can start to address them, or at least use the information of being aware of the challenges. So as you probably know, digital health tools have become a major disruptor in the health care industry, and I think really have the potential to significantly improve the delivery of health care services and care to our patients. And when I say digital health tool, I mean a variety of things, from mobile apps to wearable devices to electronic health records, prediction algorithms, and more. To give you a sense, this market is really growing. In 2019, it was estimated that the global digital mental health market was worth more than $100 billion, and by 2027, the youth wellness and mental health market specifically is estimated to be worth about $26 billion, just because there's been so much investment and funding in this area. And so what you can see on the screen is the results of a market scan by Telosity, which is basically a company that looked at around 850 startups in the digital wellness and mental health space, and they essentially created this paradigm of these tools being on a spectrum from health and well-being to access to care and support. And at the individual level, digital tools can be a part of clinical care. So these types of tools can facilitate early intervention. For example, they can act as digital spaces for help-seeking youth and provide first entry into clinical services, such as web-based screening platforms for emerging signs of psychosis. They can facilitate treatment engagement through text messaging or therapeutic video games like the FDA-approved EndeavorRx for ADHD. And they can also promote recovery, for example, capturing changes in mobility or socialness that can indicate a relapse even before the next office visit. At the population level, electronic health records contain large repositories of clinical data that we can use to do things like identify at-risk patients, facilitate diagnosis or treatment, and ideally develop risk profiles to predict who might be at risk for adverse events. And prediction using machine learning has really been growing in psychiatry, as I think you'll hear a little bit today. And there have been lots of studies showing machine learning approaches used to provide clinical support and learn diagnostic patterns for children with psychiatric disorders. And you can see in the figure on the right: the number of publications in child psychiatry alone that relate to machine learning is really growing exponentially. 
And that's the black line, which is just all of them. But while digital tools can provide a way to leverage complex data sources, there are a few major challenges that even the best machine learning or prediction algorithms struggle with. And what you can see on the right is the Gartner Hype Cycle, which we certainly can't have an informatics talk without. It essentially describes the technology life cycle. So first, when there's a major innovation, there are high expectations around what it could be. I wonder if ChatGPT is in this area. Who knows? It's then followed by disillusionment as interest wanes or the technology fails. If it then, though, overcomes that and shows increasing promise, there's a broader awareness and it can even be adopted mainstream. So on the figure on the right, I've highlighted machine learning in purple because it's estimated to be going down into this trough of disillusionment. And I think one of the major reasons for all the hype and now disappointment around machine learning is that these tools are truly limited if we don't understand our data sources and their limitations, how we're defining outcomes, and how we understand missing data. And so success in this area really requires understanding the data itself and the limitations of the data, so you know how to use it. So finding ways to examine outcomes, for example, in large data sets like insurance claims, which I'll talk about today, certainly has its own challenges. When we're using existing data sources, we have to remember that data need to be carefully reviewed, because most were not developed to examine quality of care and could lack simple things like procedure codes that align with evidence-based practice. And so we really need to understand the data and the capacity of the data to answer the question we're looking to answer. I think the other thing is that missingness in these data is really important. So depending on the data, they may not link across all sites where children access care, for example, from the clinic to the school. And it's not to say that these types of data can't be helpful, just that we need to understand if we're missing an entire scope of where somebody's receiving care or anything else. So using existing data repositories or clinical registries can help us identify, predict, and monitor treatment outcomes across a population. And if we do this well, we could really determine, ideally, up front, which patients are most or least likely to benefit from, say, a treatment or a treatment paradigm. And at the population level, registries can also help identify populations or geographic areas that may need more support or more resources. So we're going to talk through two case studies that try to address, is this possible right now? So first, we're going to look at some of the implications and challenges when trying to identify children with new-onset schizophrenia spectrum disorder in insurance claims. So to level set, though I imagine everybody knows, first episode psychosis is a broad term that describes the first episode when someone experiences psychotic symptoms. This can be defined based on when symptoms start or when individuals first seek treatment for these symptoms. When we're using population-level data, it is much more feasible to identify the episode when someone first seeks treatment, though of course, as we know, this can be quite delayed from symptom onset. 
And there are some studies that suggest the average delay in obtaining treatment from symptom onset is about 74 weeks, or almost a year and a half, and the range of that estimate is pretty wide. But I think it's important to recognize this because early treatment, particularly with coordinated specialty care, improves long-term outcomes, as demonstrated in the RAISE study, at least while you're in the program. So given the RAISE findings, there's a push for individuals with psychosis to receive CSC within the first few years after a diagnosis. We know that coordinated specialty care is more expensive than usual care, but it's thought to have better value given the improvement in outcomes for patients treated with this model. Congress has even recognized this and does provide funding for early psychosis programs. So I'm from Massachusetts, and given the increased federal funding to support coordinated specialty care, we've seen an increase in these types of clinics within Massachusetts. And you can see in the graph that, say, in 2016, there were about six clinics, and now there are about 18, with more opening. But I think despite the growth in coordinated specialty care, there is limited information about how many patients actually have early psychosis and what the true demand for this type of treatment is. To truly understand who has psychosis, or the disease burden across the population, we need several things. We need population-level data, diagnoses from all settings where someone might seek care, and then longitudinal information for the population to really assess the clinical history, so we can answer the question, is this a new onset of psychosis? And you need these types of large clinical data repositories we've basically been talking about. And so a lot of prior studies using insurance claims to examine incidence of things like early psychosis have used cutoffs of, say, 12 months. They essentially look for a diagnosis and then review the 12 months prior. And if they don't see the diagnosis in those 12 months, they consider them to be a new case. But I think this can be a little bit problematic. First of all, we don't know if 12 months is enough time. But also, given the limitations of data sets that may not allow for tracking individuals across insurance changes, there could be other diagnoses and other claims you're just not seeing. So in our study, we wanted to try to get a sense of this. We aimed to examine schizophrenia spectrum disorder diagnoses using insurance claims data and then varied the amount of historical data to get at the question: is 12 months enough? Is 24? Et cetera. And we went up to 48 months. We used an all-payer claims database in Massachusetts across five years. This is essentially a comprehensive medical and pharmacy claims database for commercially and publicly insured individuals. It doesn't cover everyone in the state, for a variety of reasons, but it's pretty close, 85%-ish. And most importantly, it allows for tracking of individuals across insurance types. So we started by identifying all individuals age 15 to 35 who had a schizophrenia spectrum disorder. We identified everyone who had the diagnosis code in 2016, which was the last year of our data, and reviewed up to four years of historical data. And we essentially classified somebody as having a new schizophrenia diagnosis if there were no prior diagnoses in the four years of history we reviewed. 
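To make the lookback logic concrete, here is a minimal sketch, not the study's actual code, of classifying a diagnosis as new-onset under different historical windows; the claims table and column names (person_id, service_date, ssd_dx) are hypothetical stand-ins for an all-payer claims extract.

```python
# Minimal sketch (not the study's code) of the lookback classification described above.
# Assumes a hypothetical claims DataFrame with columns: person_id, service_date, ssd_dx,
# where ssd_dx flags claims carrying a schizophrenia spectrum disorder diagnosis code.
import pandas as pd

def count_new_onset(claims: pd.DataFrame, index_year: int, lookback_months: int) -> int:
    """Count people whose first SSD claim in index_year has no SSD claim in the prior window."""
    ssd = claims.loc[claims["ssd_dx"]].copy()
    ssd["service_date"] = pd.to_datetime(ssd["service_date"])

    # First SSD claim per person during the index year.
    first_dx = (
        ssd[ssd["service_date"].dt.year == index_year]
        .groupby("person_id")["service_date"]
        .min()
    )

    new_onset = 0
    for person_id, index_date in first_dx.items():
        window_start = index_date - pd.DateOffset(months=lookback_months)
        prior = ssd[
            (ssd["person_id"] == person_id)
            & (ssd["service_date"] >= window_start)
            & (ssd["service_date"] < index_date)
        ]
        if prior.empty:
            new_onset += 1
    return new_onset

# Shorter windows count more cases as "new" and inflate incidence estimates:
# for months in (12, 24, 36, 48):
#     n_new = count_new_onset(claims, index_year=2016, lookback_months=months)
#     print(months, n_new)  # incidence per 100,000 ~= n_new / enrolled_population * 1e5
```

Running the same loop with 12-, 24-, 36-, and 48-month windows is what produces the kind of divergence in estimated incidence the study describes next.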
And while the study looked across ages 15 to 35, I'm going to focus on the youngest group we examined, which is individuals age 15 to 17. But just to give you a sense, across the population, there were just over 7,000 individuals 15 to 35 who had a schizophrenia spectrum disorder diagnosis in 2016. We then looked to see what proportion had 48 months of historical data that we could even look through, and that was about two-thirds. And of those, 1,100 did not have a schizophrenia diagnosis prior to 2016. And this included 83 kids age 15 to 17 who we would consider to have a new schizophrenia spectrum disorder diagnosis. Notably, in order to look at the full 48 months of history, we needed information from at least two payers for two-thirds of the cohort. So we then looked at some of the characteristics of who was newly diagnosed. And I've put some red boxes around things I find particularly interesting. That is, most diagnoses were received in the emergency room or an outpatient mental health clinic. And the majority of patients had Medicaid at the time of their diagnosis. And so in addition to obtaining an estimate of the incidence of schizophrenia, one of the things I mentioned we were interested in quantifying was, how much history do we actually need to review to be relatively confident that the diagnosis is truly a new diagnosis? And so we started with 2016 and then looked back in 12-month increments to examine for prior schizophrenia spectrum disorders. And you can see the lines here basically showing you how many people had them in the four years before. And we find around 60%, the black line, I don't think I can point, but the black line near the top is 60% in the 12 months before, and it goes up to closer to 80% when you go a full four years back. And of course, this varies by age. So if you look at the green lines, which are the kids 15 to 17, while they do have diagnoses in the 12 to 48 months before, it's much less dramatic, which I think corresponds to the age of onset of the disorder as well. But when we examine incidence using a 48-month review, we find an estimated incidence of 66 per 100,000 individuals. And looking specifically at the 15 to 17-year-old age group, we see that young females had the lowest incidence of 37 per 100,000, and young males had an incidence around 44 per 100,000. And notably, if we had, say, looked only at 12 months, it would have given us estimated rates that were almost double what we found, suggesting that in that scenario we may be erroneously identifying cases as new and inflating our incidence rates. So we were also interested, and we've done more work looking at the period before a diagnosis, but we were interested in understanding, are people receiving treatment prior to a diagnosis? So we looked at antipsychotic medication use, and we find that more than two-thirds of females age 15 to 17 received treatment with an antipsychotic prior to a schizophrenia spectrum disorder diagnosis, but just over half of males had a prescription. So even before receiving a schizophrenia spectrum disorder diagnosis, patients are being treated for psychotic symptoms. I think this ties into some of the limitations, and there are certainly several limitations to a study like this. There's really no gold standard for identifying patients with schizophrenia spectrum disorder in administrative data, though we're working on it. 
And because we're relying on billing diagnoses, we may miss patients where clinicians, say, suspect schizophrenia spectrum disorder but don't code for it and put something like unspecified psychosis. I think this explains why we see so many patients receiving treatment prior to the actual schizophrenia diagnosis. So there are lots of reasons to explain this, but I think the point is still there, that we find rates of schizophrenia spectrum disorder to be lower than suggested in the literature, and that we really need to review as much history as we can to make sure that we're preventing misclassification. For younger patients, we could probably get away with a shorter window, but I think the important thing is to recognize that these limitations exist and that the duration of feasible historical review will really vary by data set. So in that case example, we looked at how to identify individuals with a particular diagnosis in large data sets to understand the limitations with respect to amount and type of data. A next step is to try to understand who actually has the disorder if they have the diagnosis in the data. So we're going to discuss a clinical registry case study that looks at the accuracy of diagnoses. We're switching gears to pediatric bipolar disorder, which, to level set again, is bipolar disorder that starts in childhood or adolescence. It's associated with severe morbidity and can certainly disrupt the normal developmental trajectory. It's generally diagnosed based on symptoms, including changes in mood or behavior that are more extreme than what we might think are developmentally appropriate. The estimated prevalence is just under 2% in individuals age 7 to 21, so that's much more common than schizophrenia and much less common than depression or ADHD in this cohort. So I really like publication graphs apparently, but just to give you a sense of the literature out there, over the last several decades, there's been significant growth in the number of publications about pediatric bipolar disorder. So we're going to talk about clinical data registries, which typically contain information about a patient's health and treatment, often grouping similar conditions together. And one nice example is from Denmark, where all patients are part of a national registry. All Danish citizens are given a civil person registration number and can be tracked across registries. And so you can see some examples here of the registries that they have. Notably, they have the Danish Psychiatric Central Research Register, which contains information about psychiatric diagnoses. So using these registries, a research group wanted to understand what proportion of the individuals on the registry with a diagnosis of pediatric bipolar disorder actually had pediatric bipolar disorder when they reviewed the chart. So getting to the question of who has the disorder, they started by identifying all children and adolescents below 18 years of age at the time of a first mania, hypomania, or bipolar disorder diagnosis. They found about 500 individuals who met that criteria, took a sample of 25%, and then basically did a chart review and tried to say, based on chart review, is it yes, likely, or no, not likely, that they have bipolar disorder? Of the 106 charts they reviewed, just under half, so that's 48 patients, had confirmed bipolar disorder diagnoses based on symptoms listed in the chart. 
And in the table, you can see the demographic and clinical characteristics for those with and without a confirmed bipolar disorder diagnosis. Largely the groups are similar, though the group with confirmed bipolar disorder is slightly older and more likely to be diagnosed during an inpatient hospitalization. But essentially, this study found that fewer than half of patients on a pediatric bipolar disorder registry had confirmed pediatric bipolar disorder upon clinical review, suggesting that if someone is just using these data and relying on diagnoses, they may not actually reflect the patient's true clinical condition. And there are certainly lots of limitations to this study. First of all, they did this assessment or adjudication based on clinical chart review, and not by talking to the patients or doing a clinical evaluation. Also, they didn't look at negatives, or those who have pediatric bipolar disorder but aren't on the registry. But I think it's still important to keep in mind these limitations in these types of data. So I think the question here is, how does understanding these challenges impact future directions? Well, I think if we can accurately identify patients with a disorder in our data, then we can begin to build prediction models to help with identifying at-risk individuals or individuals at risk of adverse clinical outcomes. So for example, work has been done in trying to understand response to antidepressants in children, demonstrating that certain symptoms, like difficulty having fun or social withdrawal, may be helpful in predicting type and extent of response. There's other work that's been done examining what factors might predict the development of PTSD after a child experiences an acutely traumatic event. However, often there are limitations in how applicable or implementable these findings are. And one really prominent area of research is predicting suicide. And in work to date, most models that have been developed have pretty low positive predictive values, for example, averaging around 1%-ish that a positive prediction results in a suicide, which is really too low for use in clinical care. But for any problem we're looking to solve, ultimately we are gonna need models that are translatable across systems and consistent in their prediction. So if they predict suicide in my system, they should ideally do it in yours. And that's to say that just because a model works in one system, it doesn't mean it will work in others. And CMS has recently developed this idea of augmented intelligence rather than artificial intelligence as a best practice, meaning that algorithms or prediction models really need human review and oversight, and that using them as an adjunct to existing human processes is the key to success. So as this type of work progresses, understanding the limitations and the ways that things like culture and data acquisition processes may limit and bias the translational value of machine learning models will be critical. And I know we're not quite there yet, though we're actively working on it, but I think we can get there, and these types of models will hopefully augment the excellent clinical care being provided to our patients. All right, so thank you very much for your attention. Dr. Edgcomb is next. Hi everyone, good morning again. So I'm Dr. Juliet Edgcomb and I'm a child psychiatrist at UCLA, and I'll be talking on detection of self-injurious thoughts and behaviors among children receiving emergency care. 
The objectives of this presentation are to distinguish methods to detect self-injurious thoughts and behaviors among youth using health data; to apply existing case surveillance definitions to discover self-injurious thoughts and behaviors among children, and I'll use the acronym SITB for self-injurious thoughts and behaviors; to integrate and interpret machine learning based approaches to detection; and to critically review and analyze cases of children presenting for SITB and associated potential detection biases. We know as psychiatrists, there is an extremely high clinical need for effective suicide prevention interventions. Suicide is the second leading cause of death among children 10 to 14 years old in the US, and astoundingly one in 13 US children attempt suicide before adulthood. When children are at risk for suicide, emergency departments are often their first point of access to mental health care, and there are over one million annual suicide-related emergency department visits for children in the United States. As shown by this recent study in JAMA Pediatrics, mental health visits increased 8% annually between 2015 and 2020 versus 1.5% for other health conditions. And the same study found that about one in 10 children will return for emergency mental health care within six months. In this context, there is an urgent need to find better ways to understand, monitor, and prevent self-injurious thoughts and behaviors in childhood and adolescence. This demand for understanding and preventing suicide and suicide-related behavior among youth, as Dr. Benson has discussed, has really risen concurrently with an interest across medicine in the interface of computers and mental health, and specifically a growing interest in using electronic health record data to detect, predict, and start to treat and prevent mental health conditions across the lifespan. And the hope is that we can work together, leveraging new and existing clinical data sets, to really make a difference here in terms of reducing rates of suicide and self-harm. But in addition to the trough of disillusionment, we see the call to action quickly turning into gimmicks and hype, illustrated by imagery of medical professionals kind of floating through data, and we get concerned about black boxes of predictive analytics. And so my challenge for you is to go back to the patient in the room and think about the population of patients that you treat. Every time you see a patient, you are probably using an electronic health record. And by using that record, you're generating data about that patient. And the data is stored somewhere. And if we as psychiatrists are ill-equipped to understand and interpret these data, others will. So I'll start today by asking, what is electronic health record phenotyping? The official definition is the process of ascertaining a clinical condition or characteristic by means of a computerized query to an electronic health record system or clinical data repository using a defined set of data elements and logical expressions. In other words, what are the characteristics visible in an electronic health record that distinguish a patient with a disease or a condition from those without? And EHR phenotyping is important because, as Dr. Benson touched on, it's a method of detection. And as a clinician, when you think about asking a question relevant for clinical care, one of the first things you're going to try to do is detect the people in your health system who have that disease or condition. 
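To make "a defined set of data elements and logical expressions" concrete, here is a minimal, hypothetical rule-based phenotype sketch; the field names and the two illustrative code lists are assumptions for illustration only, not a validated phenotype definition or the speaker's code.

```python
# Minimal, hypothetical sketch of a rule-based EHR phenotype: a few data elements
# plus a logical expression over them. Field names and code lists are illustrative only.
from dataclasses import dataclass, field

# Illustrative subsets; a real phenotype would use a validated, comprehensive code list.
SUICIDE_RELATED_ICD10 = {"R45.851", "T14.91"}             # suicidal ideation, suicide attempt
SUICIDE_RELATED_COMPLAINTS = {"suicidal", "suicide attempt", "self-harm"}

@dataclass
class Encounter:
    icd10_codes: set = field(default_factory=set)
    chief_complaint: str = ""

def meets_phenotype(enc: Encounter) -> bool:
    """Logical expression: any suicide-related diagnosis code OR chief complaint."""
    has_code = bool(enc.icd10_codes & SUICIDE_RELATED_ICD10)
    has_complaint = enc.chief_complaint.strip().lower() in SUICIDE_RELATED_COMPLAINTS
    return has_code or has_complaint

# A visit coded only as adjustment disorder still gets flagged via the chief complaint.
print(meets_phenotype(Encounter(icd10_codes={"F43.20"}, chief_complaint="Suicidal")))  # True
```

The cases that follow show exactly where a rule like this starts to miss children.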
So of the millions of patients and millions of visits within your health system, whether you're interested in treating or studying autism or anorexia or bipolar disorder or psychosis or suicidality, you gotta be able to figure out who are the children in your health system, who are the people in your health system, who have these conditions. So how do you do this? Well, you could use ICD codes. You could use chief complaint. You could use some kind of proxy measure like medications or labs or visit type. But would you really be effective in detecting all of the people who have these conditions in your health system? The hope is that if you can detect, you can start to predict and possibly prevent or intervene. But if you can't detect, you're gonna have problems with the other three. And if this sounds familiar, it's because we just spent the last three years in a pandemic in which detection of disease was clearly one of the first steps toward prediction and prevention and intervention. So to illustrate the importance of EHR phenotyping, I'm gonna share a few cases with some details changed to protect identity. But as I go through these, I want you to try to think about how you would detect whether these children are presenting for self-harm or suicide-related reasons. So the first case is Ava. She's 11 years old, and she's brought in by her father for suicidal ideations and an attempt at self-harm yesterday. Per her father, she attempted to wrap a cord around her neck last night, and she stated she wanted to end it and be done with it. The emergency department consults child psychiatry, who reports in their documentation that the patient told her father that she tied a foam cord around her neck in a suicide attempt yesterday. So that's her unstructured data, her clinical text data. Her structured data is that her chief complaint was suicidal. Her emergency department disposition is that she's eventually discharged home after safety planning. Her legal status remains voluntary, and she's given ICD codes of adjustment disorder, unspecified disorder of adult personality and behavior, family history of mental and behavioral health disorders, and an allergy status to narcotic agents. She is given no medications and has no lab tests done. We have some demographics on her, and she comes from a moderately affluent area in Los Angeles. The next patient is Mia. She's 12 years old. She comes in with a history of unspecified psychiatric disorder and is brought in by a clinical social worker who notes that she endorses suicidal ideation as well as homicidal ideations and auditory and visual hallucinations. Child psychiatry is consulted again, and in that consult note, she states that she threatens to hurt herself with gestures including wrapping a string tightly around her wrists and scratching herself with pencils, and that she always thinks about harming or killing herself. Her structured data shows that her chief complaint is psychiatric evaluation. She is admitted to psychiatry on an involuntary mental health detainment for danger to self and danger to others. Her ICD diagnoses are anxiety disorder, ADHD, restlessness and agitation, and other signs and symptoms involving appearance and behavior. She's given Depakote and Tenex and has a negative pregnancy test. We have demographics, and she comes from a slightly poorer area in Los Angeles. So thinking about these kids, one of the most prevalent ways we can identify individuals receiving care for suicide and self-harm is to use their ICD code. 
And in fact, the CDC has this really beautiful list of all of the self-harm and suicide-related ICD codes. And so if you look at this National Health Statistics Report, in the appendix they have a list of all of the ICD codes that fall under this category. So this should work to find our kids with suicide-related behavior, right? Well, let's look. Ava doesn't get an ICD code for suicide or self-harm. Mia does not get an ICD code for suicide or self-harm. So what else could we use? Well, maybe we use suicide-related chief complaint. And in our health system, and this is tricky because it's a little bit health system-dependent, but in our health system, the triage nurse in the emergency room makes a selection using an Epic speed button, and they can pick the selection of suicidal. And that's how a patient gets assigned a chief complaint. So let's try this. Well, if we use this criterion, we detect Ava, who has a chief complaint of suicidal, but we miss Mia, who has a chief complaint of psychiatric evaluation. So this problem is not a new one, and people have thought about this. So what do we know about detecting self-injurious thoughts and behaviors using EHRs? Here's a brief lay of the land of some recent studies. This is a really nice study that Greg Simon's group put out last year in JAMIA on the accuracy of ICD-10 encounter diagnoses from health records for identifying self-harm events. They used medical records from adults at seven health systems and found that the vast majority of people who receive an ICD code for suicide or self-harm have documentation of that suicide or self-harm in their notes. In other words, ICD codes have good evidence of being specific or confirmatory. And this may vary across disease conditions, as we just heard from Dr. Benson in pediatric bipolar disorder. But for suicide, it looks like it is fairly specific if you receive an ICD code. On the flip side, we have some nice work by Nick Carson a few years back on identifying suicidal behavior among psychiatrically hospitalized adolescents. This is a single health system, and they did a proof of concept study looking at notes to see if we can sensitively detect whether or not an adolescent was hospitalized for a suicide attempt, and found good sensitivity of using NLP and ML approaches to identify suicide-related behavior. There's a group in the UK, Johnny Downs and Sumithra Velupillai, who have done a series of studies on using medical records and NLP to detect children with positive suicidality and found both sensitive and specific means of detection relying on notes. And Marika Cusick and Jyotishman Pathak put out a cool study two years ago now on using weak supervision and convolutional neural networks, deep learning, to classify clinical notes, and, a little bit in parallel to Greg Simon's work, found that by using these approaches they were able to confirm that if you have documentation, then you likely have suicidality. So specific, again. And Marika Cusick, Jyotishman Pathak, Johnny Downs, and Sumithra Velupillai actually cross-validated this across the Atlantic and shared their code, which is really cool, and were able to show some portability of this approach to using NLP across health systems to detect suicidality. So this is really exciting, but what remains? Well, to back up a bit, we still don't have a great sense of whether ICD codes are sensitive at detecting self-injurious thoughts and behaviors. 
We've got evidence that they're specific, and we've got evidence that NLP is great for notes, but how well do these ICD codes do in terms of sensitivity? Does other codified data help? Notes are awesome when we have them, but oftentimes we don't have unstructured data; we have structured, de-identified data. How well do we do with that data? Proxy measures, like an elevated serum acetaminophen level: does that help in detecting people with suicide attempts? And then another question is, are some children more likely to be detected than others? So even if we have a system that's very specific, who are we missing? And does it vary based on the child? And finally, just to note that children are not small adults. There is an ontogenetic arc of how children develop, and suicidality in a six-year-old doesn't look like suicidality in a 16-year-old, which doesn't look like suicidality in a 36-year-old. So this was the inspiration for our pilot study, the Child Suicidality EHR Phenotyping, or CSEP, study. In this study, we tried to evaluate the performance of suicide-related ICD-10 codes and chief complaint in identifying children with self-injurious thoughts and behaviors compared with manual chart review. We explored variation in how well these suicide-related structured data indicators do by child sociodemographics, and we tried to do better by developing and comparing supervised machine learning approaches using codified EHR data to improve detection. This was a cross-sectional observational study of children 10 to 17 who had an emergency department visit for mental health-related reasons. We selected 600 visits and intentionally oversampled so that about half of the children were likely to have a suicide-related visit and half were not. And we used data from a single health system with two emergency rooms and four hospitals in urban Los Angeles. We looked at two major metrics, so these are the two I just discussed: the CDC list of ICD codes related to suicide and self-harm, and our chief complaint related to suicide. And we also added 82 more pieces of data, all structured data: the child's demographics, the area deprivation index of their neighborhood, their legal status, their medications, laboratory tests, other chief complaints, prior care use, diagnostic codes, and the sex of the ED provider. And how did we develop a gold standard, or in other words, try to determine who really has suicide or self-harm? We used a system that has previously been validated called the C-CASA, or the Columbia Classification Algorithm of Suicide Assessment, and we had a team-based approach where two research assistants reviewed the records and assigned a preliminary classification. If their ratings were discordant, they were then reviewed by a psychiatric nurse practitioner and a child psychiatrist. And we adjusted for sampling probability, we then compared our rule-based classification to the manual review, and we developed and cross-validated a couple of machine-learning-based classifiers using supervised learning, a random forest and a lasso-penalized logistic regression, compared those to manual review, and then we compared our classifiers and saw how well they did. So there were 600 children, and on manual review, again remembering that we intentionally oversampled, 284 were classified as having self-injurious thoughts and behaviors. About 46% were male, and 54% were female. Most were 13 to 17, but we did have one in five pre-teens. 
About half were white and non-Hispanic, a quarter were Hispanic, 10% black, and the remainder Asian or other. And most came from relatively affluent areas in LA, which is kind of consistent with our medical center location. The most common diagnoses were depressive disorders, followed by anxiety disorders and suicide or self-injury-related diagnoses. So how did we do? Well, how did suicide-related ICD code and chief complaint do? ICD code missed 85 of 284 cases, nearly one-third, and ICD code and chief complaint together missed 64 of 284 cases. Correspondingly, the sensitivity is nothing to write home about, 0.70 and 0.77. Of note, the specificity is very good. And so when you put these together, the accuracy is kind of only fair. And then we looked at how well chief complaint and ICD code did across demographic groups. And we found that the sensitivity was poorer for males than for females, and poorer for preteens than for adolescents, meaning we're missing more males and we're missing more preteens if we rely on these metrics. And there was a trend toward poorer sensitivity among black and Hispanic youth compared to youth who are white, Asian, or other race or ethnicity. So taken together, we tried to develop a model that would be more sensitive at detecting kids. And so this is what our machine learning-based model looked like; how did it do? Well, the false negatives drop, so we miss fewer cases. The sensitivity was higher. On the other hand, the specificity drops a bit, and so we end up with a few more false positives. So there are trade-offs here: the specificity drops, but the sensitivity is higher. We then asked, do you really need all 84 pieces of structured data in order to do a better job of identifying kids? And so in order to test this out, we used a series of models. The first had all 84 pieces of data. The second had suicide-related ICD codes and chief complaint but then everything else: labs, medications, prior use. And then the other model had just mental health-related ICD codes. And we found that actually the mental health-related ICD codes did pretty well. The sensitivity was 0.84, which was a significant increase from the suicide-related ICD codes, showing that maybe we can get away with just using mental health-related ICD codes to help us in our detection task. So do we really need all 84 pieces of data? No. This graph is too small to read, but it gives you an idea that we can also see which variables are most important in detecting cases. So as you move to the right, the variables are more important, and in the middle, they're less important, but still not zero importance. And we find that if you zoom in at the top, of course, suicide-related ICD code is the most important, but what's interesting is that after that, we have things like mental health-related chief complaint as more important than suicide-related chief complaint. We have trauma and stressor-related disorders and depressive disorders coming up as highly important, in addition to some demographics. Then we have, in our less important but still important group, things like prior-year emergency department visits, or whether or not they've received an antipsychotic. And then on the flip side, variables that are important in distinguishing non-cases, or children without suicide-related visits, include things like whether they presented to the community hospital versus the academic site, and whether they were medically admitted. 
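To make the modeling and evaluation steps above concrete, here is a minimal sketch, not the CSEP study's code, of fitting the two classifier types named earlier, cross-validating them, scoring sensitivity and specificity against chart-review labels, and ranking feature importances; the data are random placeholders standing in for the 84 structured indicators.

```python
# Minimal sketch (not the CSEP study code) of the supervised-learning comparison described
# above: random forest and lasso-penalized logistic regression, 5-fold cross-validation,
# sensitivity/specificity against chart-review labels, and feature-importance ranking.
# X, y, and feature_names are random placeholders, not real patient data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(600, 84)).astype(float)  # stand-in for 84 structured indicators
y = rng.integers(0, 2, size=600)                       # stand-in for chart-review (C-CASA) labels
feature_names = [f"indicator_{i}" for i in range(84)]

def sensitivity_specificity(y_true, y_pred):
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    return tp / (tp + fn), tn / (tn + fp)

models = {
    "random_forest": RandomForestClassifier(n_estimators=500, random_state=0),
    "lasso_logistic": LogisticRegression(penalty="l1", solver="liblinear", C=1.0),
}

for name, model in models.items():
    y_pred = cross_val_predict(model, X, y, cv=5)  # out-of-fold predictions
    sens, spec = sensitivity_specificity(y, y_pred)
    print(f"{name}: sensitivity={sens:.2f}, specificity={spec:.2f}")

# Which indicators drive detection (analogous to the variable-importance plot described):
rf = models["random_forest"].fit(X, y)
ranked = sorted(zip(feature_names, rf.feature_importances_), key=lambda t: -t[1])
for indicator, importance in ranked[:5]:
    print(indicator, round(importance, 3))
```

The same sensitivity function, applied within demographic subgroups, is how one would check whether detection is poorer for boys or preteens, as reported above.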
So what are some takeaways? ICD codes and chief complaint in our study significantly underestimated child emergency department visits for SITB, missing at least one in five children, with more false negatives among boys and preteens. Machine learning improved detection sensitivity, but there was a trade-off with specificity. And detection sensitivity may be improved with just a focused set of indicators. Selection of a detection approach ultimately likely depends on the clinical use case. Are we trying to use a screening method to search a large data set to identify children who possibly have self-injurious thoughts and behaviors, and then confirm with manual chart review? Or do we want to, for example, put a chart flag in somebody's medical record that they've had a prior suicide attempt, in which case we probably want something very specific and might be able to get away with just using the ICD code or the chief complaint? So, thinking ahead to next steps: validation in other health systems and settings, incorporating text data, separate classification models across age groups, trying to increase power to detect differences, and exploring geographic linkage to social determinants of health. And ultimately, the last mile is the hardest. Can we integrate improved detection into these risk prediction models and then embed them into clinical decision support at the point of care? This is a really exciting area for the future. So thank you to my team, including the Mental Health Informatics and Data Science Hub, my mentor, Bonnie Zima, and our Department of Medicine Statistics Core and the Biomedical Informatics Program. And with that, I'll turn it over to my colleague, Dr. Angell. Thank you. All right, good morning. My name is Amber Angell, and today I'll be talking about an ongoing study that we have that's funded by the National Institute of Mental Health, titled Using Machine Learning with Real-World Data to Identify Signs of Autism in Children. So today we're gonna look at different ways of detecting populations of autistic children, talk about autism discriminators and features associated with autism, and talk about potential bias in identification, with a focus on girls and Latino children. So this research is based in my lab at USC, the Disparity Reduction and Equity in Autism Services, or DREAMS, Lab. And in addition to some amazing students, some pictured here, we also have a group of autistic stakeholders, also some pictured here, all of whom identify as women of color. And they really collaborate in all aspects of our research. So I'll share a little bit at the end of this talk about the contributions they've made to this study. This group calls themselves the Autistic Lived Experience Collaborators. So first, to give some background on our work. As the prevalence of autism has steadily increased over the past several decades, the most recent CDC report, which came out this March, estimates prevalence at one in 36 children age eight in the US. So there have been numerous initiatives and efforts to improve early and accurate identification from the CDC, the APA, special committees like the Interagency Autism Coordinating Committee, and initiatives that have earmarked federal funding to this cause. And there is a lot of evidence that early and accurate identification can lead to positive developmental outcomes. So this latest report I mentioned, which was published in late March of this year, actually showed some important changes in prevalence that are different from when we submitted this grant, from what we sort of knew then. 
So I will kind of touch on these as we go through the background. Generally speaking, despite efforts to reduce the age of diagnosis, there remain large gaps in the average age of identification. The 2021 CDC prevalence report put out by the Autism and Developmental Disabilities Monitoring Network, which used 2018 data, showed there was a gap of more than two years from the age that autism can reliably be diagnosed, which is 18 months, to the average age of diagnosis, which is 50 months. And then this most recent report uses data from 2020 and shows not a lot of difference. Overall, the median age at earliest known diagnosis was 49 months. For both the 2018 and 2020 data, among children who met the case definition, only about 75% had a recorded clinical diagnosis. So we're looking at a fair number of kids, a quarter or more, who meet the case definition but are not diagnosed or identified. And late or missed diagnosis has documented detrimental outcomes, including psychological distress, challenges with daily functioning that are not accurately understood or addressed, and suicidal ideation. We now have several decades of research showing that children of color are disproportionately affected by the problem of delayed and under-identification of autism. More recently, there's also a growing body of research focused on girls. There's now consensus that genetics accounts for some proportion of the three-to-one or four-to-one ratio of boys to girls, but that part of this is actually most likely due to the under-identification of girls. And it may be due in part to their propensity to camouflage or mask their autistic traits. I mentioned earlier some of these changes in the most recent CDC prevalence. For the first time in 2021, using 2018 data, there were no differences between prevalence in white and black children. And in that report, girls and Latino children still had significantly lower prevalence than boys and white children. And Latino children were diagnosed later and were less likely to receive a comprehensive eval by 36 months. In this most recent report using 2020 data, for the first time, prevalence was lower among white children compared to black and Latino children. However, some important caveats. This is the estimated prevalence regardless of a diagnosis, and without a diagnosis, kids don't have access to services. This report didn't speak to racial, ethnic, or sex differences in age at first diagnosis or age at first developmental eval, or who has access to gold-standard developmental diagnostic evaluations. The numbers vary dramatically by site, and this represents just 11 sites across the country. And it doesn't mean we should throw all the other evidence out the window, coming from a range of different methods, that has shown disparities in who gets diagnosed and when they get identified. So while this does add something to the background of our study, Latino kids may still be at risk for later or under-identification, and girls still appear to be under-identified. Many signs of autism and features associated with autism are evident in early childhood. And caregivers often report these concerns to their healthcare provider before the children are three years old. 
Parents of autistic children, however, are less likely to receive a proactive response compared to parents of children with other disabilities. Although parents' concerns may be influenced by culture and gender expectations, these don't appear to significantly impact the timing or type of parents' first concerns. However, perceptions of autism as a white male disorder may make professionals less likely to consider autism in girls or Latino children and refer them for an evaluation. So this could constitute implicit or explicit bias. And by explicit bias, I mean that clinicians often know these numbers, for example, that autism is much higher in boys than girls, and they may have these numbers in mind when they're deciding whether to send a child for a full evaluation. For example, Giarelli et al. found no sex differences in the presence of autism symptoms in children's records based on parents' reported concerns, but clinicians were more likely to refer boys than girls for an evaluation. So for years, the CDC prevalence estimation was conducted by manual chart review: in the effort to get at actual prevalence, people would literally sit and scour through health and educational records. And what they were looking for was indications of autism, whether the child was diagnosed or not. In recent years, they began to use clinical informatics approaches like machine learning rather than manual chart review. So for example, in 2016, Maenner's group found that they could discriminate between children that did and did not meet autism surveillance criteria using machine learning. They did find it was important to have access to children's developmental evaluations, which included autism-specific keywords. So there is an existing computable phenotype — you heard about phenotyping earlier — that's been developed for autism. I don't know if that's available, but it uses only ICD-9 codes, and those findings were published in 2016. A later study by Lee and colleagues in 2019 tried multiple types of algorithms to predict whether kids met the case definition of autism using only words from their developmental evals. And they found that traditional and deep learning models performed similarly and were able to predict clinician-assigned cases within the CDC surveillance system. Those prior studies had several limitations that we wanted to try to address in ours. So basically, those studies relied on older ICD-9 codes, lacked up-to-date NLP methods, had non-representative training samples, and importantly, all of these prior studies that have so far used clinical informatics to look at autism identification have lacked verified autism diagnoses. They relied on clinician chart review and did not have access to gold-standard diagnostic tools for verification. Those prior studies also showed a lack of attention to subgroups that are likely to experience delays or to be under-identified, and there is a need to evaluate models for fairness vis-a-vis these sex and ethnic disparities. And so conversely, we had several strengths that we were working from for our study, including more diverse, large data sets and attention to model fairness vis-a-vis sex and ethnic disparities. So, on to the study. To avoid, or to reduce, delay and under-identification of autism, particularly in these underserved groups, we need a way to predict possible autism at the population level. 
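As an illustration of the general approach the surveillance studies above describe, classifying developmental-evaluation text against the autism case definition, here is a minimal, hypothetical sketch; the toy notes, labels, and model choice are placeholders, not any of those groups' actual pipelines.

```python
# Minimal, hypothetical sketch of classifying developmental-evaluation text against a
# surveillance case definition, in the spirit of the prior studies described above.
# The toy texts and labels are placeholders, not real clinical data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

eval_texts = [
    "limited eye contact, lines up cars repetitively, delayed social communication",
    "age-appropriate play and language, no restricted interests noted",
]
meets_case_definition = [1, 0]  # placeholder labels a trained reviewer might assign

clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),   # unigrams and bigrams from the note text
    LogisticRegression(max_iter=1000),
)
clf.fit(eval_texts, meets_case_definition)

# Score a new, unseen evaluation note:
print(clf.predict(["tends to smell objects, spins objects, poor eye contact"]))
```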
And I wanna point out briefly that in my lab, we now avoid the use of the word risk in this context, even though I did use it when I submitted the grant and it is in the official grant title. This is an example of the contribution of our autistic stakeholders, who have asked that we not compare what they experience as an integral part of their identity and being to something negative or harmful or an illness. So here's my team for the project. These are just the co-investigators: my data science collaborators at the University of Florida, and then informatics and clinical collaborators at Children's Hospital Los Angeles. At CHLA, Children's Hospital Los Angeles, we have access to developmental evaluations from the Boone Fetter Clinic. They do multidisciplinary, full comprehensive evaluations there. And in most cases, those kids also have full EHR records at CHLA and they're linked. They use the ADOS and the Child Behavior Checklist, which allow us to verify autism diagnoses. And then we also have access to those children's full records with both structured and unstructured data. So you heard those terms earlier: structured data, meaning things like diagnosis codes and procedures, and unstructured data, clinical notes or narratives. The One Florida Data Trust is a PCORnet database, from the Patient-Centered Outcomes Research Institute. And this includes all major university health systems in Florida, many private insurers, and Medicaid. So it's health records for most of the state. We'll get structured data from the larger network, and then both structured and unstructured data from the UF Health System, the University of Florida Health System in and around Gainesville, Florida. We're still in the process of getting all of our data and kind of starting analysis, but I just wanted to give you our initial estimates so you have a general idea of these numbers. We have about 140 patients across the entire One Florida network. And then CHLA is about 5,500, and about 1,700 of those kids have a full developmental eval. So the first aim of our study is to take that computable phenotype that exists for autism, but update it using ICD-10 codes, and to develop an NLP pipeline to be able to extract relevant keywords that could indicate possible autism. So we'll use both structured data, such as diagnosis codes, and unstructured data from clinical notes to develop these algorithms. Unfortunately, with autism we don't have other sorts of reliable vital signs or biomarkers or things like that, which some phenotypes do include. So something that's important when thinking about how to identify possible autism using clinical records is that in addition to autism discriminators that are related to diagnostic criteria, we also have features associated with autism, which are traits or challenges that are highly associated. So they could be an indicator of autism, but they are not part of the diagnostic criteria. For example, in this table, you can see the discriminators related to things like social communication challenges and restricted interests, and then examples on the right. I don't know if you can see it, if you can make it out, but the examples on the right are phrases of how these might show up in clinical notes. And then below, there are features associated with autism that are not diagnostic, but that could help to identify potential autism in children. 
So some common examples are sleep problems, eating and motor challenges, and attention and behavioral challenges. And then again, we have some examples on the right of the phrases and ways these might show up in notes. So far, we have a cohort from UF, the University of Florida, and CHLA, Children's Hospital Los Angeles. And this is our plan, our original plan, for developing the algorithm. I'll just say in short: using the previous computable phenotype and expanding it as needed, and then developing and testing algorithms and evaluating their performance on UF data and then CHLA. So that will give us a data set with which to complete the second aim of our study, where we want to develop machine learning prediction models. This will include all the children we identify using the computable phenotype and a matched cohort of non-autistic children; we will also exclude those with developmental disability or intellectual disability from the comparison group, given how highly those co-occur with autism and are often diagnosed before autism. So we'll explore different widely used machine learning algorithms. And then after the model is developed, we'll assess it for fairness. In other words, how does it do particularly in identifying girls and Latino children? Or is there bias, especially if these groups are under-identified in the data sets, that could lead to the under-detection of these groups? So far, we've obtained the data that we need very recently and started with keywords from the previous computable phenotypes. And we ended up expanding them to keyword phrases because many of the keywords were not specific enough on their own. So for example, cars was a keyword. And we found that phrases like lines up cars or plays with cars, usually followed by something like repetitively, unusually, or stereotypically, are more useful. So we've done manual chart review on random samples of the UF health records, the clinical notes, to develop these keyword phrases, consulting with our clinical experts. And then we will be refining this really long list using CHLA data, where we have this gold-standard data set of verified evals. So in other words, we have a population of children we know for sure are autistic; they have been evaluated using the ADOS. And then we'll calculate the precision of each keyword phrase and exclude those with a precision lower than 0.3. So I just wanted to give some examples of how these keywords are being used and where in some cases we find phrases more useful. So like here with smell: I'm sure you can imagine that in clinical notes, there are many, many more irrelevant uses of the word smell than relevant ones. And when we expand it to things like smell things or tends to smell, we get a much more accurate hit. We had eye as a keyword we were working with initially; eye contact is much more useful. And spin objects was much better than spin alone. 
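As an aside, the keyword-phrase precision filter just described can be sketched minimally as below; the example notes, phrases, and field names are illustrative, not the project's actual list or code, and only the 0.3 threshold comes from the talk.

```python
# Minimal sketch of the keyword-phrase precision filter described above. For each phrase,
# precision = (notes containing the phrase that belong to verified autistic children) /
# (all notes containing the phrase); phrases below the 0.3 threshold are dropped.
# The notes and phrase list are illustrative placeholders.
import re

notes = [
    {"text": "lines up cars repetitively; avoids eye contact", "verified_autism": True},
    {"text": "tends to smell objects before eating",            "verified_autism": True},
    {"text": "plays with cars and trucks with his brother",     "verified_autism": False},
]
phrases = ["lines up cars", "plays with cars", "eye contact", "tends to smell"]

def phrase_precision(phrase: str, notes: list) -> float:
    hits = [n for n in notes if re.search(re.escape(phrase), n["text"], re.IGNORECASE)]
    if not hits:
        return 0.0
    return sum(n["verified_autism"] for n in hits) / len(hits)

kept = [p for p in phrases if phrase_precision(p, notes) >= 0.3]
print(kept)  # phrases retained for the updated computable phenotype
```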
So largely clinic visits and telephone, but others as well. And some very early numbers coming from the CHLA dataset that we just got. We have medical record numbers, MRNs, which represent unique patients, by ethnicity at the top, and you can see a far greater proportion of Hispanic/Latino patients there. Sex, at the bottom, again is generally what we expect. And then here's the age distribution on the left and, on the right, unique patients by racial group; some of the top ones are Latino and white. And then finally, we've just been exploring some of the CHLA data in terms of the types of encounters we have, such as specialty visits. Looking at the left, a lot of neuro, mental health, ED, and lab visits, and then going right, somewhat fewer, but still a good number, of things like GI and inpatient behavioral health. So ultimately, in addition to creating this updated computable phenotype, which can be used for cohort identification for research or clinical practice, what we hope to do later on is develop a clinical decision support tool. That is, an algorithm that can be used in an EHR to recognize these signs of possible autism in a child's records, flag that child as possibly autistic, and suggest to the provider that they may want to send them for a full assessment. So particularly with sex differences, we're hoping that this can reduce some of the disparities that are happening based on clinicians not expecting to see possible autism in girls as often. And as I mentioned before, even though we might be closing the gap between Latino and non-Latino children, we aren't in the clear yet. So we similarly hope that this could help to identify these kids earlier and more accurately. This last piece I wanted to leave you with is some perspectives from my autistic collaborators. Ideally, in our lab, we're working towards me not speaking on my own about these things, but always bringing one of my autistic collaborators. It doesn't always work out, but I will say that as some conferences move towards mechanisms like dramatically reduced rates for community members or patient populations, that's been a really positive thing. But in lieu of any of them being able to come, they wanted me to share their perspectives. It's really important to them that we not just keep trying to change autistic people. In other words, when we're looking at disparities, they are really passionate about us thinking about things also on the provider end. What changes might need to happen in clinical practice? What attitudes or behaviors of those who are on the other end of clinical encounters, who often hold gatekeeping positions to diagnoses and services? And, as I talked about earlier, where can implicit or explicit bias have an impact on the lives and wellbeing of autistic people and their families and their access to services? So interestingly, and full disclosure, whereas now all of our studies run through this autistic group, including at the initial idea stage, I developed this study before I had this group. And when I told some of my collaborators about this study, these were some of the questions that they had, some of which, to be honest, caught me off guard completely.
It had not occurred to me to see things in the way that they did, which speaks to the importance of having a diversity of perspectives in research. So they had questions about safety. Their first question, which totally surprised me, was: are we outing people who are successfully masking, and what if the clinician is not safe for that person? Remember masking; there's an emerging body of research on hiding your autistic traits. Concerns about views of autism: are they pathologizing or othering? Concerns about cultural vulnerabilities: are Latino communities determining it isn't safe for their child to be singled out as different, and that there's more safety in the community and in not being labeled? Considering the benefits and the costs of identification, especially in Latino communities and for girls. And considering the benefits and costs of research: are you putting in similar efforts to make society safer for autistic people to be labeled? So I'll leave you with those questions from them, and I think I'll turn it over to Dr. Alter. I do not have slides, and I'm going to try my best. So I'm Carol Alter. As you know, my day job is at the University of Texas at Austin at Dell Medical School, but I think the relevant point for me today is that I am the chair of the Council on Quality Care for the American Psychiatric Association. I have also been a member of the presidential work group to look at the future of psychiatry. For me, as someone who is not a data scientist, this was like a whole new vocabulary, and you guys have done a really remarkable job. The key part for me, and I'm going to talk a little bit more about this in a second, is that when we started to think about the future of psychiatry, we really started with diagnosis and with quality care. That may sound overly simplistic, but I think what you all have illustrated is that it is absolutely at the core of everything we do. If we don't adequately diagnose, we cannot really utilize the new information we have about the biology of mental illness, and we can't ever get to a point where we can really deliver quality care. An important concept within the quality world is the power of measurement: if we don't measure it, it doesn't exist. And I think you all did a really splendid job, in a much more complex way, of highlighting those things. So I think that Nicole did a remarkable job laying out the environment and explaining, at least for me, that not all data is good, that not all analysis is good, and that not all technology and tools are good; that we really have to think carefully about how we're documenting information; and that we have biases in our heads, not just from a cultural, racial, or socioeconomic perspective, but also because of the way that we have learned to convert our observations of behavior and affect and content into diagnostic certainty. I think that's really important.
But I also think that the work you described regarding first-episode psychosis and bipolar illness really drives home the point that we get it wrong, and that we don't necessarily see the whole picture. That was what was really interesting, in addition to the hierarchy of the data that's available and understanding what that means and what it really can and cannot show you. It's really a new world. We've always known that claims data is very limited in terms of what it can show us, and we say, oh, well, the electronic health record is better, and of course the best thing we can do is a clinical exam. Well, in reality, maybe that's not true. Maybe it's about the questions we ask and how we ask them, or the time at which we look at them. So as we start to think about how we get the right diagnosis and how we get to a point of being able to get to outcomes, we really have to take a lot of this detail into consideration, and we also have to be careful as we begin to build these models to understand the value and the limitations of each of those data points. Juliet, your work is obviously very important, and I'm going to take you back to one of your first slides, where you talked about the fact that we have no way of predicting when somebody is at risk for harm versus who actually harms. We don't know, and I think that is really important and should continue to drive what we've been doing. What was really interesting to me was that when you look at those ICD codes for suicide or suicidal behavior, which we've actually been working on in the Quality Council and in the APA's development of measures, those codes really reflect a behavior and not a diagnosis. To me that really illustrated the subjectivity of our diagnoses and the necessity to help us become more granular. Now, of course, what we'd really like to do is to not have to do any of it. We want somebody to come in and, I think it was your slide, Juliet, that had this sort of Star Trek-esque pulling out of the data sources and all that. We really do want to try to come up with the minimum amount of data we need in order to make an accurate diagnosis. And that, for me, is part of the future, but the risk, right, is can we get it to the point that you don't need to have the clinical exam on everybody? Is there a validated rating scale that will get you there? Is there a combination of three pieces of data versus 100 pieces of data that will do the same thing? And that is in some ways, I guess, a fallacy for all of us, because we don't know what we don't know, which really speaks to what Amber was talking about and the importance of bias.
And I think all three of you mentioned this: we don't necessarily know which factors, which piece of data, which behavior matter, and certainly we don't know the biology well enough to be able to predict, because there's always a question mark out there. Again, to Nicole's point, if we think about the new tools and technology, such as chatbots and all of the AI that's out there, it's coming from a data source that does not include all of the data. So anyway, I think what you all have done, in a very broad sense, is identify those core pieces about how we do our work and how to be careful about that, and then to examine, across large populations as well as within a single patient, what we are seeing and what we are documenting. Because if we aren't doing it right, if we don't know how to get there, we cannot really develop the right diagnosis. We can't predict risk and we certainly can't identify the right treatments, which is something I think this really has to get to. I was also reminded, because as you all may or may not know, I am not a child psychiatrist, and one of the things I know as an adult psychiatrist is that if we don't get the right diagnosis, we really don't ever get to the right treatment. And within child psychiatry, one of the things you didn't really get to is this whole notion of being able to even arrive at a diagnosis in that developmental period of time. So even more important are the behaviors and the risk factors: how do you create a model that gets you to a place where it's okay not to have a big diagnostic category, but you can in fact look at how those symptoms correlate with treatments, correlate with outcomes, et cetera? I think that's really important, not just in general for psychiatry, but specifically within the population of children and adolescents. So, because I have a microphone, I'm going to get to talk for 30 more seconds about the future of psychiatry work. This idea of looking carefully at how we think about diagnoses, and how we then learn to refine our work in the assessment and evaluation of patients, whether through our own clinical evaluations or the data we collect, is really critical. In that work, we start with the fact that we need to really look at the DSM and at the process of how we're developing the DSM. We've done field trials forever, and the big question for me was, do we really need to do field trials? Maybe we should do really sophisticated data analyses instead. So maybe it's a data strategy. The other thing is that we've always imagined or dreamed that the diagnoses would be based on some kind of biological construct. We may or may not be there. We don't really have firm biomarkers, but are we at a place, with the level of specificity that you all have described, where we can do more in terms of looking at the biology, the epigenetics, those phenomena, to see if there is some way we can use that data? I think that's really important.
So I think one of the other key components of the presidential work group was to look at not only that, which to me again is the basis of this, but then at evidence and guidelines: how do we develop guidelines for practice that take this data into consideration, and then really create an opportunity to build on the data by consistently measuring outcomes, and specifically measuring exactly what interventions people received? So I think this is the basis for us: to have knowledge, to then create more knowledge, to then be able to be more refined in our treatments and improve our outcomes. The rest of it is about things like quality measurement and education and advocacy, but we can't get there unless we do this work. So I want to commend you all on the work that you have done and are going to continue to do, and I'm going to stop talking and let you all come up here and answer questions from the audience. Thank you so much for the opportunity to participate with you. If you could please ask questions into the microphone for the recording, thank you. Are you ready? Hi, Alan from the Diagnostic Unit in Copenhagen Mental Health Care. Thank you for the really interesting data. I was wondering, if you had a choice between only using structured data or unstructured data, what would you choose today? And given your powers of imagination, 10 years from now, what would you choose, structured or unstructured data? That's a great question. I love questions that have powers of imagination included. I think this gets at the idea that we are limited in some ways by the data we have, and can we make the most of what is readily accessible? Clearly, there's huge promise in using text data, and I don't think you're going to find anyone, maybe you would, who will say we have to stick with the codified data, we have to stick with the structured data. It's always seemed to me like a proxy for what we really want, which is all of the richness of the data that's in the notes. That being said, we work with what we have, right? And especially when you look at big data sets like the PCORnet data and standardized data models, it often seems like the bigger the N and the more comprehensive the data set, the smaller the number of variables and the more structured they are. There's a tradeoff there. So we have very granular data at UCLA, but it's one health system. How generalizable is that? I was just going to say I totally agree, and to answer your question concretely, based on the methods we have, I would probably choose structured data today, knowing that we're missing so, so much more, which both of you talked about. My hope is that in 10 years we can use the notes, unless of course our entire framework for documentation has changed. But if we really have methods where we can reliably mine all sorts of electronic health record data and pull things from notes in a consistent way across systems, that's my dream, that would be really great, because I really do think that clinicians put a lot of work into writing their notes and much less work, for better or for worse, into coding, or maybe they don't even code. We have billers in some of our hospitals who code to maximize reimbursement, in reality. But the notes are really where clinicians often put their nickel down, and they are probably the most valuable.
But they're hard today, I think, to really get what we need from them. Sure. I agree and echo those. And it's a cool question, because I think the general answer is it depends on what you're asking, right? I totally agree: a bigger N is great, and you're going to tend to get more structure and fewer variables. But as I'm thinking about your question, a lot of the questions I am personally interested in, I need the clinical narratives to get to. And they are hard to access, as we've talked about today. So I do hope that there are more systems and ways in the future to access the richness of those, but there are a lot of access challenges now. I think another dream for the future is that ultimately, when we look at electronic health record data or claims data, this is a tiny slice of somebody's life, right? It's only when they're seeking medical care. The hope 10 years from now is, can we incorporate ecological momentary assessment? Can we incorporate social media? Can we get a more comprehensive picture of who the person is? In a non-creepy way. Yes. Thank you. You know, you mentioned the idea of data outside of the health care environment. And even with the caveat of non-creepy data, the question is how we figure out which questions would then be meaningful. It comes back to we don't know what we don't know. But I'm wondering, if you all could design that, do you know what you would look for? What information would you wish to have that you don't have right now, especially if you're fishing, right? So the question is, do you have thoughts about what data you think needs to be there in the non-medical setting? Yeah, I'll start. I think it's such a great question, because for autism specifically, and you touched on this earlier, kids get services across multiple systems of care, and I don't know of anywhere where there's a good integrated data set. Now there are these registries, and they're very well characterized. The kids have verified diagnoses, and they are mostly white kids, right? It requires their parents to come back year after year; they have to be recruited, and they come back year after year. So those have not been able to reach lower-income or diverse children. And I've struggled with that in LA. Kids get regional center services. They get school services. They get medical services; it might be through CHLA, it might be somewhere else, it's kind of everywhere. So getting a comprehensive picture of what that child gets is almost impossible. I think about that a lot. But then if you were to start integrating those data sets, at least within health systems, we have some similar structured data. And if you start bringing in education systems and others, right, it would be a big mess, but it would give us a better picture. So if I could dream big, I think it would be knowing what services they're getting, at what intensity, in what systems. And then what I would really love to know is what they think about the services and what their families think about the services. We have a whole generation of autistic adults who are saying, the services I got as a child were really horrible for me, or they weren't working on the things that were important to me; these are the outcomes that are important to me. And you have no way of necessarily knowing that within charts.
You might see progress, but is that the area the family wanted the child to develop in? So those are all things that would be great if we could somehow capture them at some point. I feel conflicted about this, because I do think the more data the better, and we should be incorporating it into holistic care for individuals, particularly if we can improve access and equity in access, since it's not hard to get a smartphone; almost everybody can do it, though bandwidth is a different issue. But I wonder, from the provider level, what does it mean to have all sorts of data 24-7 about a patient you see every three months? How do you aggregate it? Who's going to aggregate it for you? I think there are a lot of open questions about what we do with data between visits, particularly if we're talking about the whole world of data we could incorporate in some way, if we're dreaming. And then I think back to, Amber, what you brought up about outing people, and do they want to be? Certainly we know teens and adults use social media differently, but do teens use social media in a way where they hope their physician will see it and maybe incorporate it into treatment? I don't think so. It could be useful from the provider perspective, but I don't know if patients will want that. So I think there are a lot of open questions, and it's super interesting. I really do think we're headed toward incorporating all the data, at least in that direction in studies, and it will be interesting to see how we as a community, and also patients, manage that. I think we have time for one more question if anyone else has questions. Or else I'll add school-based data to the wish list. Oh, yes. It's a shorter-term ask. I know that getting all the data from different sectors is an endless wish list, but just doing a little better job of capturing, you know, what grade is the child in? Are they failing school? Were they flagged in preschool? Having some data about school would be just a next-step ask in terms of what we can connect. Well, thank you all for coming today. It was lovely to see you. And please come up with questions; we love to connect. Thanks. Thank you.
Video Summary
In the session titled "Identifying Child Mental Health and Neurodevelopmental Conditions Using Real-World Clinical Data," a group of experts, including Dr. Juliet Edgcomb, Dr. Nicole Benson, and Dr. Amber Ongle, explored the potential of clinical data in improving the diagnosis and care of children with mental health issues. The main objectives discussed included evaluating commonly used methods for detecting mental health and neurodevelopmental conditions in children and adolescents, developing practices to reduce implicit biases, and examining different types of clinical data.

Dr. Benson discussed the use of digital health tools and their potential to improve healthcare delivery and outcomes through early intervention, treatment engagement, and recovery monitoring. She presented case studies on identifying schizophrenia in children using insurance claims and questioned the sufficiency of 12-month data windows to accurately identify new diagnoses, suggesting longer data histories might be necessary. The limitations and implications of using existing data sources for accurate diagnosis were also highlighted.

Dr. Edgcomb focused on detecting self-injurious thoughts and behaviors using electronic health record (EHR) data. She examined the efficacy of ICD codes and chief complaints in identifying suicide-related behaviors, finding issues in sensitivity, particularly in detecting cases among male, preteen, and minority populations. Machine learning models were introduced as a potential method to improve detection.

Dr. Ongle's research aimed at improving the identification of autism in underserved populations, particularly girls and Latino children, using machine learning on real-world data. The importance of developing unbiased models that consider cultural and gender differences in symptom presentation was emphasized.

The session underscored the significant potential of using real-world data to enhance identification and care for children with mental health and neurodevelopmental conditions, while acknowledging the challenges and limitations in ensuring accurate and equitable diagnosis and treatment.
Keywords
Child Mental Health
Neurodevelopmental Conditions
Real-World Clinical Data
Digital Health Tools
Early Intervention
Electronic Health Records
Machine Learning
Implicit Biases
Autism Identification
Schizophrenia Diagnosis
Self-Injurious Behaviors
Cultural and Gender Differences