Systems Neuroscience of Substance Use
Video Transcription
So, I'd like to thank everybody for rising bright and early to share with us some of the presentations and perspectives on the application of systems neuroscience to the study of complex neuropsychiatric conditions, with a particular eye on how these tools and approaches can be leveraged in the study of substance use and substance use disorders. I'm Tristan McClure-Begley. I'm the chair of the Integrative Neurosciences Branch at the National Institute on Drug Abuse. So first, for the interested and curious in the audience, I'll define some terms. Systems neuroscience is basically a subcategory of neuroscience that focuses on the identity and connectivity of functional units within the networks of the brain, and on how these circuits give rise to the complex signal processing and integration that ultimately govern observable traits like behavior and the disorders associated with behavior. So why are we so interested in systems neuroscience perspectives and applications to substance use? Well, first of all, because we're NIDA. The other excellent talks during this meeting have addressed the movement in the clinical domain away from acute intervention and simple abstinence from all substance use as the indicators and guideposts of therapeutic success, which I think is appropriate, and it also underscores the nature of substance use as a spectrum disorder with a long-term dynamic course. Now, key challenges to effective therapeutic intervention and diagnostic categorization come in the form of being able to predict responses to certain interventions and prioritizing comprehensive care along the trajectory of each individual's condition. As such, the more we can address questions such as the ones I've posed here, the better we can succeed in those endeavors.
So if we consider the observation that substantial individual variability exists even within categories of substance use disorder, does that then indicate that there are subtypes of addictive behaviors that can be explained by, or that exhibit, distinct physiological mechanisms? Furthermore, acknowledging that the brain reward circuitry is complex and multifactorial, can we apply systems biology principles to help quantify the relative contributions of functional subnetworks within those defined circuits? In other words, can we use systems neuroscience to help us explain the physiology behind why a spectrum of observable traits in substance use disorder exists, and can that physiology point us to new therapeutic interventions and potential? Now, considering the brain reward circuitry and the associated components that give rise to the features we associate with substance use disorder, it helps to think of them as a massive, complex, adaptive system. And a key and very cool feature of complex adaptive systems is that perturbations to them induce responses, and those responses add functionality to the system. Learning, memory, and association are great examples of this. If you experience something new, that sensory input is essentially a perturbation to the state that existed before. And as you persist down that road, you add features to that functionality: you increase associative memory, and you enrich the contextual information of a given experience. When it works well, all these things are meant to accelerate active learning and improve recall. When the system becomes dysfunctional, you can think of it as being perturbed in a pathological direction, where the association with a given stimulus is now inappropriately represented or is outweighing other representations that are of equivalent value.
So in the case of addictive drugs, that perturbation results in a perpetual shift in the contextual representation of the environment and the self. And sometimes it is to the extent that even basic self-preservation behaviors are impaired. That's an example of the profound power of addictive pharmacological substances: you can actually shift the balance of value to the extent that the individual is now acting in a way that is deleterious to their own continued survival. So when we envision the brain, the reward system, and all of the contributing subsystems, we ask: how do we assemble robust mechanistic models that account for the contributions of so many functional units and contextual modifiers? Now, fortunately, we are in an era where advances in both molecular measurement and computational models have matured to the point where we can start deriving what we would consider actionable inferences from experimentally tractable model systems. Key to this premise, though, is the emphasis on the points that I make here. Using the metaphor of navigating by the stars, we're trying to look at where things are in relation to where we are in order to point research investment along a productive trajectory. So, some important points to consider. First of all, when considering models of complex adaptive systems, consider the assumptions that are made and how we test for their impact. How reliable are the things that we can observe from extremely dense, information-rich contextual models? How well is that going to translate into a system that doesn't necessarily recapitulate it point for point? And model diversity, in terms of the model system itself, whether we're talking about an animal model of behavior or even a cellular model of signaling, is often key to translational value. So with that, I would like to pass the baton first to Dr.
Trey Ideker from the University of California at San Diego.
Thanks very much, Tristan. Here we go, let's see. Which computer am I operating off of here? Okay. So I'm going to talk today about using molecular network mapping to understand the convergence of psychiatric phenotypes and the genetics behind those psychiatric phenotypes. I'm going to tell two vignettes, or two stories. One is within humans and the other is across species. But the underpinning of what I'm going to be talking about today, and I think Nevin will also touch on this, is: what happens when we do genetic mapping for a disease? Well, in modern genetics, we do something like a genome-wide association study, or GWAS, for that disease. Of course, any trait can be run in this way, and many have been. The ideal situation would be a plot like the one shown down here in the bottom left. These are Manhattan plots, where you run along all of the genetic variants that you're measuring on the X-axis, starting with chromosome one all the way to the end. And then you're measuring to what degree that variant, or mutation in the case of somatic mutation diseases, associates with your cases versus controls. You're hopeful to find peaks like this one, of strong association at one or more loci. But very often what you see is a plot that looks more like the one on the right-hand side, where you have many suggestive peaks, but none or few of which rise above the genome-wide significance threshold required to call them truly significant. Now you can, of course, add more patients to your cohort and perhaps raise some of these peaks above that level as you get more and more individuals in your analysis. But then you have an issue, perhaps, of mechanistic interpretation, because what often happens at that point is that many peaks now start to rise slightly above that significance value. And now, of course, you would see thousands of peaks associated with the disease.
But as soon as you have thousands of variants and thousands of genes associated with a single trait, what have you learned? So there are three issues I want to talk about today. One is this issue of power: how do we recognize the loci underlying a disease? Two, how do we begin to interpret those loci in terms of molecular and other mechanisms? And three, a corollary of this: how do we see what these mechanisms are across cohorts and across species? The talk I'm going to give today, and I think Nevin's also, is really choosing an example of what's becoming quite a popular approach, which uses further information painted on top of those variants and genes to inform how they work together. So the premise is that if the true pressure of the disease, or the true units of mutation, are not variants or genes but systems, then what you want is a map of those protein complexes and higher-order systems on which mutations converge. Here today I'll be talking about protein-protein network maps, but there's a variety of knowledge bases you could bring to this goal. The idea behind all of them is shown up here at the upper right. If I have three variants and three genes, all of which are interacting in a common pathway or protein assembly, then what I can do is, by looking at the convergence of these variants, raise the significance value of that assembly by testing one assembly at a time. So the idea is that we're now testing higher-order units above the gene level. The two applications I'm going to talk about today are, one, looking for common networks underlying two comorbid conditions, autism and congenital heart defects, and two, looking at the exact same trait, for which the genetics have not agreed between humans and rats, and looking at networks as a way to unify those genetics.
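To make the "testing one assembly at a time" idea concrete, here is a minimal sketch of how several sub-threshold gene-level signals can add up to a significant assembly-level signal. It uses Fisher's method for combining p-values purely as a stand-in; the talk does not say which statistic is used, and the gene names, p-values, and the assumption that the three tests are independent are all hypothetical.

```python
import math

GENOME_WIDE_THRESHOLD = 5e-8  # conventional GWAS cutoff, used for illustration


def fisher_combined_pvalue(pvalues):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p) is chi-square distributed with 2k
    degrees of freedom. For even degrees of freedom the upper tail has the
    closed form exp(-X/2) * sum_{i<k} (X/2)**i / i!.
    """
    k = len(pvalues)
    half_x = -sum(math.log(p) for p in pvalues)
    tail_terms = sum(half_x ** i / math.factorial(i) for i in range(k))
    return math.exp(-half_x) * tail_terms


# Hypothetical assembly: three interacting genes, each only suggestive alone.
gene_pvalues = {"geneA": 1e-4, "geneB": 1e-4, "geneC": 1e-4}
assembly_p = fisher_combined_pvalue(list(gene_pvalues.values()))

for gene, p in gene_pvalues.items():
    print(gene, "significant alone:", p < GENOME_WIDE_THRESHOLD)
print("assembly significant:", assembly_p < GENOME_WIDE_THRESHOLD)
```

In this toy case no single gene clears the threshold, but the assembly does. A real analysis would also have to correct for the number of assemblies tested and for correlation between neighboring variants.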
All of what I'll be talking about today is built on a platform, Network Maps for Genomics. Both Nevin and I participate in overlapping centers that use these techniques. I'll be talking about work from a center of which Abe Palmer is the PI, funded by NIDA, the Center for GWAS in Outbred Rats. But there's a lot of crossover here with PCMI, the Psychiatric Cell Map Initiative. In fact, the first project is co-funded by both of these centers, as well as by cancer and other projects where the common platform is advanced and then applied in different contexts. So here's the first study, autism versus congenital heart defects. The simple question that motivates the study is that babies with heart defects are found to be at higher risk for a later autism diagnosis, without a clear understanding of why that is. One of the reasons for the lack of understanding is that, in the genetic mapping for those two diseases, essentially different sets of genes had been identified. For ASD, at the time of this study by Brin Rosenthal, 65 genes had been mapped for autism. This number has essentially doubled at this point in time. That's from the Simons Simplex Collection, also known as the SFARI database. And around the same number of genes had been mapped for congenital heart defects, about 66 genes. There are a few genes that overlap between the sets, but largely these are non-overlapping sets. So to ask whether these gene variants converge at a higher level of biological systems, we brought in, here in the middle, this resource called PCNet. It's a literature-curated database of protein-protein interactions and other kinds of gene-gene and protein-protein associations in the public domain. So it's sort of a catch-all for what's been published systematically or in the literature. It covers almost all human proteins, with about 2.8 million interactions.
So it's a very large resource, and we'll talk more about how to make this better in the future, but that's what was used here to try to get coverage of network connectivity among most human proteins. And the way that we use that resource is the following. We invoke this technique known as network propagation, which has become, I think, in a lot of people's hands, a very common approach for understanding the impact of variants on protein networks and the common subnetworks they may align against. The way it works is this: if you have two diseases or traits, like I do here for ASD and CHD, imagine each gene that's been independently linked to those traits is colored here in blue versus yellow, respectively. Notice that even in this toy example, I've tried to model the situation where you have maybe one gene, in green, that is identified as underlying both diseases, but by and large the yellow and blue do not overlap at the gene level. The edges, of course, are protein-protein or gene-gene interactions. But now what I do is pretend each of those colors is a source of heat, and I run essentially a heat propagation algorithm with a certain diffusion coefficient. So there's a single parameter here that controls how much spreading I get. And notice what happens after some spreading of that heat outward. Common regions of the network are implicated, including genes that were implicated in neither of those two genome-wide association studies but that nonetheless sit in common protein complexes and assemblies on which the two sets converge. And that's why you get this bright green area, or areas, of the network. So here's what that looks like for this particular pair of comorbid diseases, ASD and CHD. I've simply segregated the network here, pulled it apart in the layout so you can see what's happening. On the left are the ASD-risk genes, on the right are the CHD-risk genes, as originally identified in those separate studies.
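The heat-spreading picture described above can be sketched as a random walk with restart on a toy graph. This is an illustrative sketch only, not the NetColoc implementation: the graph, the node names, and the parameter name `alpha` (standing in for the single spreading parameter mentioned in the talk) are assumptions made for the example.

```python
def propagate(adjacency, seeds, alpha=0.5, n_iter=100):
    """Spread 'heat' from seed nodes over an undirected graph.

    Iterates F <- alpha * W F + (1 - alpha) * F0, where W divides each
    node's heat equally among its neighbors and F0 puts equal heat on seeds.
    A larger alpha lets heat spread further from the seeds.
    """
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in adjacency}
    heat = dict(start)
    for _ in range(n_iter):
        heat = {
            n: alpha * sum(heat[m] / len(adjacency[m]) for m in adjacency[n])
            + (1 - alpha) * start[n]
            for n in adjacency
        }
    return heat


# Hypothetical toy network: MID links the two seed sets, FAR hangs off ASD1.
adjacency = {
    "ASD1": ["ASD2", "FAR"],
    "ASD2": ["ASD1", "MID"],
    "MID": ["ASD2", "CHD2"],
    "CHD2": ["MID", "CHD1"],
    "CHD1": ["CHD2"],
    "FAR": ["ASD1"],
}
asd_seeds = {"ASD1", "ASD2"}
chd_seeds = {"CHD1", "CHD2"}

asd_heat = propagate(adjacency, asd_seeds)
chd_heat = propagate(adjacency, chd_seeds)

# A node is "convergent" when it is warm under both propagations.
convergence = {n: asd_heat[n] * chd_heat[n] for n in adjacency}
for node in sorted(convergence, key=convergence.get, reverse=True):
    print(f"{node:5s} {convergence[node]:.5f}")
```

In this toy graph `MID` is a seed for neither trait, yet it scores far higher than `FAR` because heat from both seed sets reaches it. Judging whether a real overlap is larger than chance would additionally require comparison against randomized seed sets, which this sketch omits.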
And now in the middle here is the convergent subnetwork that's identified. In the interest of time, I'm going to spare you all the statistics that show this, but this is a significantly and substantially larger set of proteins and protein interactions than one would expect by chance. As you can see, it contains dozens to hundreds of proteins in that intersection. And so then what one can do in these kinds of studies, and I'll keep moving here to get to the second study, is show you some examples of how we have begun to validate these genes that arise in the middle. Again, they did not come up in any single GWAS, but they are found at the intersection of both diseases. Now in some cases, what one can do is go back to the original GWAS and see that that gene is, in fact, harboring or is nearby a variant that almost reached genome-wide significance but was just underneath the threshold you needed. And that's the case a lot of the time. Whether or not that's the case, one likes to go in and then do functional validation. So with Helen and Jeremy Willsey, and this is mostly the work of Helen and her postdoc, Yu Zhao, we went into the Western clawed frog, Xenopus, which is a common model for both heart disease and autism genetics. And we injected CRISPR reagents to knock down genes. In the brain, you do one hemisphere, so the left is the control and the right is what's perturbed with the CRISPR reagent. In the heart, you inject bilaterally, and then you score morphological differences of various kinds, which Helen has simply summarized here as mild or severe brain phenotypes in green, or mild or severe heart phenotypes in red. And so you can see here are the controls, positive controls, I should say, where we expected to see phenotypes, and certainly CUL3 shows robust phenotypes in both brain and heart phenotyping.
And then over here next to it are 10 additional genes which came up from the network analysis. It's not always both: in the case of ANKRD11, for instance, we got a phenotype in brain and not in heart. But by and large we are able to see reproducible phenotypes in the morphology of both organs, providing some validation for those initial network genetic findings. Okay, so that's the first vignette. In the second vignette, what I want to do is turn to a related problem with the same methodology, that common network genetic approach here at the center of things. But now, rather than looking at comorbid diseases, let's look at the same condition, and I'm going to start here with body mass index. You ask, why am I talking about BMI at a psychiatric conference? Just wait. You'll see. So here we're going to look at BMI as a model trait in two species, humans and rats, where you would have expected there to be underlying genetics in common, but in fact, as you'll see in a second, none have been identified. That leads to the question: are these really two phenotypes that are mechanistically distinct in rats and humans, or do they actually share some common mechanism that we just haven't gotten at yet? This, of course, is a test case for many, many different drug abuse phenotypes that Abe Palmer, the PI of my center in this case, is interested in. But we figure if you can't do this with BMI, where you have essentially the largest number of rats and humans phenotyped for pretty much any trait, you're not going to be able to do it with drug abuse phenotypes either. So in this case, the data sets we are using are shown up at the upper right. The human cohort is very large. Now it's got 700,000 individuals where you have BMI and genetics.
And in rodents, we used this previous study of 3,000 outbred rats, which Abe has supplemented with an additional 2,000 animals not in that original study, and we've recomputed all the GWAS statistics for both humans and rodents. And this method I just illustrated, which uses network diffusion to understand the network relationship between two sets of genetic variants, we've now published as a Nature Protocols paper, but it's essentially the same method I was just talking about. So here's what the non-overlap looks like in those Manhattan plots. To start off with, given the larger number of human individuals, 700,000, this is 1,929, so almost 2,000 genes have been linked to BMI. In rats, it's a smaller number, most likely just because of the smaller number of individuals that have been analyzed in rodents. And if you look at the overlap between the genes in humans and their rat orthologs, you find essentially the expected number at the intersection of that Venn diagram: you find 29 genes in common, which is not a significant overlap. So that's the conundrum, and again, this is a model of the conundrum that you often face when trying to translate genetic results between a model species and Homo sapiens. And so here's essentially the same pipeline I talked about before with network propagation. It's now running horizontally across the page. We start with this PCNet resource, which again contains most human proteins. We project onto that network the variants that have been mapped for human or rodent BMI, and then you look at their convergence after propagating and spreading the effect of those variants out to their protein assemblies and network neighborhoods. What you find is that, in fact, you now get a significant convergence. There is the p-value of that overlap.
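The statement that 29 shared genes is "essentially the expected number" can be checked with a standard hypergeometric overlap test, sketched below. The 1,929 human genes and the 29 shared genes are the figures quoted in the talk; the rat gene count (`n_rat = 300`) and the background of 20,000 orthologous genes are placeholder assumptions chosen only so the example runs, since the talk does not state them.

```python
from math import comb


def overlap_pvalue(background, n_a, n_b, overlap):
    """Probability of seeing at least `overlap` shared genes by chance.

    Treats set B as n_b draws without replacement from `background` genes,
    of which n_a belong to set A (upper tail of the hypergeometric).
    """
    favourable = sum(
        comb(n_a, k) * comb(background - n_a, n_b - k)
        for k in range(overlap, min(n_a, n_b) + 1)
    )
    return favourable / comb(background, n_b)


background = 20_000   # ASSUMED number of genes with rat-human orthologs
n_human = 1_929       # human BMI genes quoted in the talk
n_rat = 300           # ASSUMED placeholder; not stated in the talk
shared = 29           # overlap quoted in the talk

expected = n_human * n_rat / background
p_value = overlap_pvalue(background, n_human, n_rat, shared)
print(f"expected by chance: {expected:.1f}, observed: {shared}")
print(f"p-value for the overlap: {p_value:.2f}")
```

Under these placeholder values the observed overlap sits right at the chance expectation, which is the situation the speaker describes. With the real gene counts the numbers would differ, so this only illustrates the form of the test.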
This is now the expanded neighborhood in the network around the human variants, in blue, and the rodent variants, in purple. So what's happening here is fairly simple conceptually. And now the intersection is shown here in the middle, and we actually applied an even more stringent filter to get down to about 657 genes, or I should say orthologs, because it's rat-human gene pairs I'm now talking about, that are nearby variants in both GWAS. That's the key word, nearby, which could mean they are overlapping those variants, or it could mean they're in the same protein complex or assembly. So that's what this entire network looks like. Of course, it's hard to analyze hairballs that look like that. So the next step I'm going to talk about again draws from methodology that was developed in the other centers I mentioned, which I'm now going to apply to this particular GWAS challenge. Here's something we had developed originally for cancer, where we had a very large cancer protein-protein interaction network. And again, we wanted to organize that network to elucidate the protein complexes and larger assemblies hidden in that hairball. Already, when you look at these networks, you can start to see that structure is there in the network; you just have to be able to identify it. So what we do is apply this technique known as multiscale community detection. It's essentially a variant of clustering with some important variations that I'm happy to talk about in the discussion period. Our particular algorithm for doing this is called HiDeF, and it's available online. If you search for that string, you can read the paper and everything. But the idea here is that you start at a very close distance in these network neighborhoods. We can calibrate that distance, in fact, to physical distances in nanometers. Again, I'm happy to discuss how we think we can do that.
And now we look at the communities of proteins that form. So here in one region of the network, here is a community. And now we expand that radius and look at how the cluster expands, how that protein assembly gets bigger. Essentially what I'm doing is dropping stringency a little bit and looking at what else is added to that protein complex. And eventually what will happen, as you decrease stringency and raise physical scale, is that we've merged this assembly on the left with this assembly on the right to form a larger cluster, shown at the top of the page. And in fact, what you're looking at here is the precatalytic spliceosome, the U1 branch and the U2 branch. So to make sense of the complexity, what I can do is reduce a network to this hierarchy of protein assemblies, shown here in the middle. Every node now represents a protein complex or assembly inside of a larger protein complex or assembly. And the joining of two assemblies becomes a join in that hierarchical tree. So here is what that algorithm looks like when I apply it to this BMI network that was found to be conserved between rats and humans. Once again, I had 657 proteins that were implicated by the common convergence of those variants across species. And now, going top-down, I'm factoring that big network into smaller and smaller pieces, and the small ones are at the bottom. So you're passing through large processes, then organelles, and then finally arriving at smaller protein complexes. So let's look globally at the regions of this network that are implicated by variants in both species. I had thought, naively, not being a diabetes or BMI researcher, that we'd be implicating adipose tissue or liver-type functions. In fact, half of this is brain. It's feeding behavior, it's feeding drive that's underlying these common phenotypes.
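The expanding-radius procedure described above can be illustrated with a deliberately simplified sketch: at each radius, proteins joined by edges shorter than that radius form a community, and communities found at larger radii become parents of the ones they contain. This is plain connected-component clustering, not the HiDeF algorithm, and the protein names and distances are hypothetical.

```python
def communities_at(edges, radius):
    """Groups of 2+ nodes connected by edges with distance <= radius."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for a, b, distance in edges:
        root_a, root_b = find(a), find(b)
        if distance <= radius and root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return {frozenset(g) for g in groups.values() if len(g) > 1}


def build_hierarchy(edges, radii):
    """Map each community to the smallest larger community containing it."""
    found = set()
    for radius in sorted(radii):
        found |= communities_at(edges, radius)
    hierarchy = {}
    for child in found:
        supersets = [c for c in found if child < c]
        hierarchy[child] = min(supersets, key=len) if supersets else None
    return hierarchy


# Hypothetical distances: two tight assemblies joined by one longer edge.
edges = [
    ("U1_a", "U1_b", 1.0), ("U1_b", "U1_c", 1.0),
    ("U2_a", "U2_b", 1.0),
    ("U1_c", "U2_a", 3.0),
]
hierarchy = build_hierarchy(edges, radii=[1.0, 2.0, 3.0])
for child, parent_community in hierarchy.items():
    print(sorted(child), "->", sorted(parent_community) if parent_community else None)
```

At the smallest radius the two assemblies appear separately; at the largest they merge into one parent, which becomes a join in the tree. Real multiscale community detection also has to decide which communities are stable enough across scales to keep, which this sketch does not attempt.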
You can see nervous system development, the synaptic signaling branch and the different pieces or ways in which the synapse gets repurposed, GPCR signaling, and so on and so forth. So let's take a look at going in now and dissecting one of those networks. Here's the GPCR signaling system that, again, is the convergence of variants in both rats and humans. There's the assembly of protein interactions shown blown up here. There are four categories of proteins, if you think about it. There are proteins that were identified in the original human BMI GWAS. Those are the light blue ones. There are the proteins that were identified as being near variants in the original rat GWAS. Those are the rat orthologs in purple. The third category would be genes near variants in both species; in this case, there are none. And then the fourth category is proteins that are implicated simply by the network structure. So they're there because of that protein complex, whether or not they're actually near variants in those GWAS. So now, just like we had this important validation follow-up for the autism study, we're going to go in and validate here. In this case, since we have rat data aligned, we can use the power of three decades of mouse and rat genetics and, without doing any more lab work, simply go into the Mouse Genome Database and avail ourselves of the tens of thousands of mouse knockout data sets that appear there. This was a very good insight of Abe Palmer, who realized we didn't need to go and make these mouse knockouts ourselves, because many of them have already been made. And so we go into the MGD, and we simply look: has that gene been knocked out and reported in that database? And what are the phenotypes that have been linked to that knockout mouse? And you see here, in brighter green, that three of the genes already have well-known links in MGD. When you knock out that gene, you affect BMI and obesity.
The light green ones have suggestive phenotypes for other body size traits. So for humans, it's height; for rodents, it's body length without tail. That's one example. And there are, of course, many other phenotypes that are suggestive or are co-clustered with BMI itself that one can look for. So only two of the proteins in this network did not have useful information in the MGD, more or less allowing us to validate this cluster. Another way of saying this is that you're looking at the convergence of three datasets: rat BMI GWAS, human BMI GWAS, and mouse gene knockouts, all of which implicate GPCR signaling in obesity and BMI. Okay, now, if this seems like an anecdote, it is. I have to drill down and show you one of these things. But in fact, the majority of these systems were validatable in this way. And I should say, I think I may have failed to show the citation: this paper is soon to appear in Cell Reports. It's not quite out yet, but you'll be able to read about it soon. Okay, so that's my story today. The key takeaways are as follows. GWAS results for related diseases and traits often show little agreement at the level of the SNV, the single nucleotide variant, or at the gene level. The same is true for model species, even for the same trait. Within humans, you want to find commonalities between related traits, and that can be difficult. But it's even more puzzling that, often, in model species you don't identify the same genes as you did in the human genetic study. But in these two examples, we've seen significant convergence within biological networks, and in this particular case, these are mostly protein-protein interaction complexes that we're looking at in these networks.
So as I've said a couple of times here, these results are emblematic of, or illustrate, a more general paradigm for the study of associations of genes with traits, both within and across species. What I did not take the time to discuss today, and this would be a different talk, is that in my group we've also put a lot of machinery in place for taking this analysis to the next level and building formal genotype-to-phenotype translation systems. Here you'd be trying to predict risk for diabetes, for instance, from someone's genetics, either in humans or in rodents. To do that, you turn to the burgeoning field of deep learning, and the advance that we've recently seen, both in our group and in others, is that you can entrain those deep learning systems to the protein networks I just talked about. So you get not just black-box predictions of phenotype given the genotype of an individual, but you can start to understand the molecular pathways through which that information is transferred towards prediction of phenotype. So again, I didn't talk about that today; if I had an hour, I would go into it, but you're welcome to ask me questions about it in the discussion period. And then, I don't know exactly what Nevin is going to talk about, but I expect he'll say a few words now about what happens if we replace PCNet here with an actual, honest-to-God, experimentally derived protein interaction network that is, in fact, relevant and tailored to that particular disease or trait. And I think you'll see a lot of beautiful examples of that shortly. So with that, I'd like to thank my funding. As I said, the common platform is funded in a variety of ways, but this last example, in particular, was funded by NIDA through Abe Palmer's rat GWAS center.
The resources up here in the upper right are probably most important, although, of course, so are the labs and people. If you've enjoyed some of this and you want to try to apply these methods to your own genetics or network data, or both, please check out the following links. NDEx, which stands for Network Data Exchange, is a public, open-source repository of network knowledge, and PCNet was downloaded from NDEx in this case. NetColoc is what Brin now calls her dual network propagation method on those networks, which was applied here both to look at autism and CHD and to look at BMI translation from model species to humans. All of that is happening in the Cytoscape network visualization framework. I'm not giving you the full URLs here; cytoscape.org is that one, but I think you can find all of these pretty easily with Google. And then the community detection, and that hierarchy of communities that was shown, is the result of this algorithm, HiDeF. Last but not least, I would like to acknowledge the people who did all this work: Brin Rosenthal and Sarah Wright in my group, Brittany Leger in the Palmer group, and Yu Zhao in Helen and Jeremy Willsey's group. And of course, all of this is a collaboration with Abe, Jeremy, Helen, and of course Nevin, who's up next. So thank you very much.
All right, so thanks, Trey, for that nice introduction, and also thanks to Sarah and Tristan and NIDA for giving us the opportunity to talk about our ongoing work. What I'd like to do initially is tell you about the motivation behind the work that we've been involved in over the last couple of decades, really. It's work that's being done closely with Trey Ideker and others. And it is really motivated by the fact that genes and proteins do not work by themselves. They work in groups, they work in clusters; there are protein complexes that operate in functional pathways.
And in order to understand the cell, you really need to use these quantitative network approaches, many of which Trey just talked about, to understand the biology, because that's the real biology. And you need the real biology to point you in the right therapeutic direction. I would argue that, recently, there's been a convergence of technologies, both experimental and computational, many of which we've been involved in developing, that is allowing for unprecedented resolution and clarity of the cell. And this is allowing for the understanding of complex biological phenomena at a rate that's never been achieved before, and an understanding of the underlying biology behind many different disease areas, which, again, you need in order to develop the right therapeutic strategy. So, technologies are very foundational to all the work that we're involved in. This is where I want to start. What I've been involved in over the last couple of decades, and again, working closely with Trey on this, is combining protein physical interaction data with genetic interaction information. More recently, I would say, structural biology hasn't become high throughput, but it's becoming medium throughput with cryo-EM, which is incredibly exciting. We combine this with chemistry and chemical biology, and of course bioinformatics is key, both for the individual data sets and, even more importantly, for combining all these different types of information together. So, what type of data are we involved in creating here? Just a little bit more detail on the type of information. There's protein-protein interaction data using standard affinity purification mass spec based approaches; APEX-based approaches for transient interactions; cross-linking mass spec; global proteomics, covering abundance and post-translational modifications; and genetic interactions, from single perturbation and double perturbation studies, now using mostly CRISPR.
Structural information, I talked about cryo-EM, but also crystallography and integrative modeling. Chemistry, chemical biology, bioinformatics, and then patient information. We're uniquely situated here because we're close to a great medical school just down the street, so that we can get medical records and also patient samples to test some of the predictions that we're generating in cell lines and primary cells. And the goal here is to put all this information together to define molecular networks that we think are important for different disease areas. Networks that hopefully have therapeutic value. And just to go a little bit deeper into the motivation behind the work, and again, this is very synergistic with Trey. I kept this one slide in here because it's a good segue to my next slide, but I can go through this very quickly because Trey did a great job explaining this, the fact that most diseases are not monogenic, and you often get dozens, if not hundreds, of different mutations associated with different disease states, especially if you get the cancer, but unfortunately, most of the key mutations fall below what's considered statistically significant, as an example here with this toy Manhattan plot. But if you apply this suite of tools, you can make sense of this genomic data by looking at the cell with respect to complexes, modules, and pathways. Here, you just got a list of genes. You don't know if they're connected or not, but down here, it's saying, ah, the corresponding proteins are in a protein complex. So this network information, you'd agree with me, I think, makes this genomic data much more interpretable, and therefore, what we need here is a Manhattan plot, not of individual genes or genomic loci, but of pathways or complexes. Ah, now this becomes statistically significant, but the sad fact is, we don't know what the vast majority of those are in any cell system, be that healthy or diseased, but this is a problem that we're trying to solve. 
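The idea of a Manhattan plot over complexes rather than over individual genes can be illustrated with a small sketch. It assumes gene-level p-values are combined per complex with Fisher's method, which is one common choice and not necessarily what the speakers' pipeline uses; the gene names, p-values, complex membership, and the 5e-8 threshold are all hypothetical. Fisher's method also assumes the p-values are independent and that the complex was defined without looking at the association data.

```python
import math

def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    X = -2 * sum(ln p_i) is chi-square with 2k degrees of freedom under the
    null. For even degrees of freedom the survival function has the closed
    form exp(-x/2) * sum_{i<k} (x/2)^i / i!, so no SciPy is needed.
    """
    k = len(p_values)
    if k == 0:
        raise ValueError("need at least one p-value")
    half_x = -sum(math.log(p) for p in p_values)  # this is X / 2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i  # (x/2)^i / i!, built up incrementally
        total += term
    return math.exp(-half_x) * total

# Hypothetical gene-level p-values: none passes a genome-wide threshold.
GENOME_WIDE = 5e-8
gene_p = {"G1": 1e-4, "G2": 1e-4, "G3": 1e-4, "G4": 1e-4, "G5": 1e-4, "G6": 1e-4}

# Hypothetical complex membership taken from a protein interaction map.
complexes = {"complex_A": ["G1", "G2", "G3", "G4", "G5", "G6"]}

complex_p = {
    name: fisher_combined_p([gene_p[g] for g in members])
    for name, members in complexes.items()
}

for name, p in complex_p.items():
    print(name, p, "significant" if p < GENOME_WIDE else "not significant")
```

In this toy case no single gene clears the threshold, but the complex does, which is the point being made with the toy Manhattan plot.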
And I would also argue that you can use this platform to get insight into specific mutations, and that insight can point you in the right modality direction. So what am I talking about here? Here's a piece of the toy network from the previous slide. Say this protein here is mutated, let's say it's a well-characterized mutated protein, say, p53, and this is its network in the wild-type situation. If you introduce the mutation, there's a variety of different insights you can gain. So for example, you could get a strengthening of a protein-protein interaction with that particular mutation. You can establish a brand new interaction that doesn't exist unless the mutation is there, and as you could probably guess, the third scenario here is that you can actually lose a specific interaction. And depending on what the mutation does to the network, it would point you in different modality directions. So for example, here, you may want to develop an inhibitor or a degrader for an interacting protein, not the protein that all the drug companies are looking at, which may or may not be druggable, but the interacting protein here. And so here's inhibition or degradation of the interacting protein, and here you may want to have a molecular glue, where you want to resurrect the protein-protein interaction that the mutation is disrupting. So these approaches allow you to identify the modules in healthy situations, which gets at the biology, number one. Number two, you can put the mutations into these pipelines and get insight into what they're doing to the network, and that can point you in the right modality direction. And number three, it actually allows you to carry out patient stratification, because you're starting with the mutations that are associated with a set of patients. So if you were to derive a treatment based on this information, you'd know which set of individuals to target.
And I don't think there are too many approaches out there that'll allow you to do all three of these things. Okay, so this is obviously an unbiased platform or pipeline, and as Trey alluded to, we've actually started a number of different cell mapping initiatives to focus this powerful pipeline on specific disease areas. It allows you to group together individuals, and it allows you to raise money as well if you're focused on specific disease areas. So starting with the Cancer Cell Map Initiative, we started this with Trey Ideker several years ago. The Host Pathogen Mapping Initiative, this was started with Jeff Cox at UC Berkeley. And then, of course, Trey talked about the Psychiatric Cell Map Initiative, where we're using these unbiased approaches to study autism, schizophrenia, OCD, et cetera, and this is work that was started with Matthew State here, the Chair of Psychiatry at UCSF, and Trey Ideker. So, because of this meeting, I'm going to focus mostly on our work on the Psychiatric Cell Map Initiative, but I just want to tell you a little bit about the cancer work, because I think it laid the foundation for all the work that we're now doing in the neuropsychiatric space. So, I'm going to allude here briefly to three papers that were published back-to-back-to-back about a year and a half ago.
In this particular study, what we did was just unbiasedly take the top 40 genes genomically connected to breast cancer through TCGA and generate protein-protein interaction maps in disease cells and non-disease cells, plus and minus mutations where relevant. This was work led by Minkyu Kim, a scientist in the lab who's now started his own lab at the University of Texas, San Antonio. We did the same thing looking at head and neck cancer, taking, again, the top 40 genes mutated in head and neck cancer and generating the same type of maps, work led by Danielle Swaney. What we showed in these papers is that we could use these maps in a variety of different ways. Number one, we could use them with respect to prognosis. If people have a set of mutations associated with the tumor, we can lay them on the map and be predictive about lifespan. Number two, we can carry out patient stratification, in that if there are treatments available for the cancer, we're able to predict which group should get them. And number three, which is most exciting to me, is that it opened up brand new areas of biology, which opened up brand new therapeutic strategies that we and others are now employing. And we put all this information together in the third paper. Trey alluded to this work. This was led by Fan Zheng in Trey's group, where we came up with a hierarchical model of the cancer cell, and we're using this model to be more predictive with respect to these different approaches that I just talked about. All right, so this is the playbook, and I just want to go into a little bit more detail on one of these papers, looking at PIK3CA. So this is a highly mutated gene in head and neck cancer, and it's also mutated in a variety of other cancers as well. Here's the lollipop plot of the mutations connected to head and neck cancer.
So we took 15 different mutations, introduced them into PIK3CA, purified the mutant proteins, did the mass spec, and compared the interaction landscape to the wild-type protein. We wanted to see what effect the different mutations had on the interaction landscape. So on the y-axis, we have the mutations. On the x-axis, we have the interacting factors. In this heat map here, if a box is blue, it means the mutation results in a reduction or a loss of an interaction. If it's red, you get a gain or a strengthening of the interaction. There's a ton of biology here on PIK3CA. I'm just going to zoom in on one slice of this heat map, and it involves the interaction between HER3 and PIK3CA, a well-known protein-protein interaction. You see a number of red boxes here, suggesting that in these mutant backgrounds, there's a stronger association between HER3 and PIK3CA. And you see I put in the yellow boxes here, including the two most prevalent mutations, which are here on the lollipop plot, in the helical domain, the E542K and E545K mutations. So reviewers didn't believe quantitative mass spec. Hopefully, those days are coming to an end soon. But we confirmed this by IP Western. You know, most of the antibodies out there don't recognize the protein that you think they do, but that's a whole other story. In this case, it looked like it did. So we confirmed the tighter association of these two mutated proteins with HER3. So the question here is, could this differential interaction mapping that we're doing be predictive with respect to treatment, if you're able to target this particular pathway in individuals that have tumors with these specific PIK3CA mutant backgrounds? And to test this, we collaborated with Jennifer Grandis here at UCSF. She's involved in a clinical trial using an antibody against HER3, which works upstream of PIK3CA, CDX-3379. And we told her about this data.
She actually said, well, for patients that have tumors with these mutations, you actually don't try to inhibit the HER3 protein. And we said, why? And it's like, well, it works in the same pathway. You just don't do that. She wasn't the only one who said that. Others said that as well, including Silvio Gutkind, who we work closely with at UCSD. And actually, Silvio carried out the following experiment for us to look at these mutations and these predictions in more depth, where he's introducing cancer cells into mice, treating them with the antibody or not, and seeing what happens to the tumor. So the first mutant that he looked at was H1047R, another prevalent mutation here in PIK3CA. And what he found in this mutant background was that there was no effect. And he said, ah, this is what you'd expect. You know, all the mutations are going to behave this way. But the big surprise here was that in two of the mutant backgrounds, there was a big effect when you inhibited HER3. And surprisingly, it was the two most prevalent mutations that we and others found in PIK3CA. So then we told this to Jennifer Grandis, and she looked retrospectively at her clinical trials. She said, yes, people with these mutations were actually responding better to HER3 inhibition. So obviously, the goal now is to go prospective with this type of information, which is what we're doing. And these mutations are not just predictive with respect to inhibition of this pathway or treatment. They're also allowing for a better biochemical and structural characterization of PIK3CA binding to HER3. So there's been ongoing work here at UCSF and the CCMI led by Natalia Jura and Kliment Verba structurally characterizing HER3 with HER2 and neuregulin. This is a known complex. One of the next holy grails is to get HER3 bound to PIK3CA. So in these mutant backgrounds, we're getting a tighter association in vitro, and there's cryo-EM work that's ongoing.
What we really want here is PIK3CA bound to HER3, and I think by cryo-EM we'll have that structure very soon, which will hopefully provide even more information about this key complex and pathway, and greater insight into the therapeutic direction we can go. And I really love this example because it combines proteomics, genetics, structural biology, and now chemical biology, because Kevan Shokat now has chemical matter targeting HER3, and if that works, we know which mutant backgrounds to go after. So this is what we want to do again and again and again for key complexes and key pathways in cancer and also in other disease areas, including neuropsychiatric disorders. So what we need here for all of this work are sets of genes, right? For cancer, we use TCGA, but we need a set of genes linked to different neuropsychiatric disorders. And what we, the members of the Psychiatric Cell Map Initiative, wrote here a couple of years ago was kind of a manifesto about what we could do if we actually got a set of genes connected to autism. And happily, a group generated a set of genes linked to autism, 102 genes; it's called the Satterstrom data set. This is perfect. This is what we need to feed into our pipeline, and this is the work that's been ongoing. Again, the same playbook that we used for cancer we're using here for autism, looking at protein-protein interactions and looking at protein-DNA interactions. About a third of these 100 genes are chromatin and transcription factors, so it makes sense to look at protein-DNA interactions.
Structural analysis on key complexes using cryo-EM, genetic approaches as well, single and double perturbation studies, and the idea is to put all this together collaboratively with Trey to come up with an ontology of autism, a hierarchical model of the part of the cell that's connected to autism, and then use these models to make predictions that we could test in these higher-order phenotypes using iPSCs, brain slice cultures with Steve Finkbeiner here, and then frog models with Helen Willsey, and Trey talked about some of that ongoing work. And then depending on the results that we get, in an iterative way, it feeds back into this particular pathway. So this is unpublished data. It's one of the first times that I've presented this work, but we're really excited about it, although it's preliminary. So out of the 102 genes, we were able to take and purify 100 of these in HEK293 cells and look at the interacting factors. This resulted in about 2,000 interactions. Eighty-five percent of these had never been described before, even though many of these proteins have been looked at in great detail with the same approaches that we're using. We get a range of interactions here for each bait. On average here, I think we're looking at, you know, 12, 14 interactions, which makes sense based on some of our previous work. And some people could say, well, you're looking at these interactions in HEK293 cells. Are these relevant when you get to a more relevant cell? It's a fair question, and we are generating this data now in NPCs. But what we did was look at the interacting factors and ask, are they actually expressed in the part of the brain that autism is connected to? And indeed they are. Here's the set of 102 genes being expressed in the prenatal part of the brain. Here are the interacting proteins, and here are all the other proteins in the proteome.
So there's indeed an enrichment of the interacting proteins being expressed in the right part of the brain, suggesting that we do have a physiologically relevant protein-protein interaction data set. So there's a ton of information here. I'm not going to go into all of this. What we ended up doing was breaking it up into seven different clusters, and I just wanted to show you one cluster. It involves 20 baits, or 20 percent of the data set, where the pink nodes correspond to the autism proteins, and then, of course, we've got the co-purifying proteins. We've ordered this based on protein complexes. And what you can see here is that there's a lot of connectivity between these different autism genes. And there's one particular connection that we're following up on right now involving these three autism genes, connections that had not been previously described, involving a well-characterized autism gene called DYRK1A. All right, so we've got a wealth of information looking at the wild-type proteins, but then, just like with cancer, we're introducing the different mutations and seeing what effects they have on the interaction landscape. So out of the 102 genes, there are 43 genes which have 87 patient-derived mutations. Just like we did with PIK3CA, we're putting them in one at a time and seeing what effects they have on the protein-protein interaction landscape. Either interactions go up, in red, or down, here in blue. So this is a different type of representation from what I showed previously with PIK3CA. Instead of a heat map, I'm showing you a network representation where each gray diamond here corresponds to an autism mutation, and the edges correspond to differential interactions. So if an edge is red, it means that a particular mutation results in the gain or strengthening of a protein-protein interaction.
If an edge is blue, it means you reduce or you lose an interaction because of that particular mutation. And if the lines or edges are dashed, it means there's a qualitative change. It's completely there or it's completely lost. So a ton of mechanistic data here, the foundation of mechanistic data, connected to autism. And I just want to zoom in on a couple of different pieces. As I told you, there's a number of different transcription chromatin factors in these 102 genes. Here's one connection with RELA and PREV-EP. This is a well-known protein-protein interaction. This is a well-known transcription complex. And there's a mutation here in PREV-EP connected to autism through the genomic work that we predicted by mass spec where you get a loss of an interaction between these two key transcription factors. This is where the mutation falls in PREV-EP. We confirmed this here by IP Western. And then when we get the structure, you can see where the mutation lies. It makes sense that this particular mutation would result in a disruption of an interaction of these two key transcription factors. We're going deeper into this particular mutation. Here's another transcription connection. So there's a series of mutations in FOXP1. This is another well-characterized transcription factor, and it's known to bind to FOXP4. Well, our mass spec data predicted that because you got these blue edges here that these mutations would result in a loss of an interaction with FOXP4. And again, when we did an IP Western here, we confirmed that, indeed, this is what you see. And what we've done here with Tom Nowakowski, who's just down the street here at UCSF, was introduced these mutations into the iPSCs, differentiated them to neurons, made organoids. We've done a series of different transcriptional assays, which has confirmed these mutations do, in fact, have a big effect on transcriptional regulation, as well as differentiation in to neurons. 
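The edge convention used in the differential interaction network described above (red for a gained or strengthened interaction, blue for a reduced or lost one, dashed when the change is qualitative) can be sketched as follows. This is a minimal illustration, not the group's scoring pipeline: the scores, the protein names, and the two-fold cutoff are hypothetical, and real differential interaction calls would rest on replicate-based statistics.

```python
def classify_edge(wt, mut, fold=2.0):
    """Return (color, style) for one bait-prey pair, or None if unchanged.

    wt and mut are hypothetical interaction scores (for example normalized
    prey abundance) measured with the wild-type and the mutant bait.
    A score of 0 means the prey was not detected.
    """
    if wt <= 0 and mut <= 0:
        return None                    # prey never detected
    if wt <= 0:
        return ("red", "dashed")       # interaction exists only with the mutant
    if mut <= 0:
        return ("blue", "dashed")      # interaction completely lost
    ratio = mut / wt
    if ratio >= fold:
        return ("red", "solid")        # strengthened
    if ratio <= 1.0 / fold:
        return ("blue", "solid")       # weakened
    return None                        # within the cutoff: no differential edge

# Hypothetical measurements: (mutation, prey) -> (wild-type score, mutant score)
measurements = {
    ("MUT_1", "PREY_A"): (4.0, 0.0),
    ("MUT_1", "PREY_B"): (1.0, 3.0),
    ("MUT_2", "PREY_A"): (4.0, 1.0),
    ("MUT_2", "PREY_C"): (0.0, 5.0),
    ("MUT_2", "PREY_D"): (2.0, 2.5),
}

# Keep only pairs that produce a differential edge.
edges = {
    pair: call
    for pair, (wt, mut) in measurements.items()
    if (call := classify_edge(wt, mut)) is not None
}

for (mutation, prey), (color, style) in edges.items():
    print(f"{mutation} -> {prey}: {color}, {style}")
```

Each surviving entry corresponds to one edge from a mutation node to a prey; unchanged pairs are simply left out of the network.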
And this is really what we're setting up here now: systematic introduction of these mutations connected to autism, one at a time, into iPSCs, looking at the protein-protein interaction landscape and looking at differentiation to neurons. And we're not just stopping at autism. There are about 30 genes connected to schizophrenia that we're looking at that have mutations that are tractable for this approach, as well as OCD and anxiety disorder, and then, of course, what we want to get to is addiction going forward. So we have the tools in place to do this systematically across all the different neuropsychiatric disorders. And I've got a couple more slides just to say that we're also focusing our tools on neurodegeneration. This is a paper that just recently came out in collaboration with Danielle Swaney and Zainab Takur at UCSF. This is where we were trying to understand APOE4-driven Alzheimer's disease, where we're taking cells that have the APOE3 allele or the APOE4 allele associated with Alzheimer's disease and looking globally at proteomics. What we found here is that there is a phosphorylation of one protein called VASP that's regulated by APOE4. We mapped which kinase was responsible, and we found that there was a significant change in the interaction partners of VASP, which is a cytoskeletal protein, when you introduce a mutation into the phosphorylation site that's being regulated by APOE4. And then when we used a PKA inhibitor, what we found is that inhibition of VASP phosphorylation enables neurite outgrowth, right, so we're getting a deeper understanding of APOE3 and APOE4 using these approaches. And on the last data slide here, with Li Gan and, again, with Danielle Swaney, we're looking at Tau. This was published last year, looking at two mutations in Tau that are connected to FTD.
Just like we did with PIK3CA, just like we did with all the autism genes, what we found here when we introduced in these mutations into Tau and did a protein-protein interaction study, what we found is that the FTD mutations reduced Tau interactions with mitochondrial proteins specifically, and consistent with that, the FTD mutations impair bioenergetics. And then when we look at patient samples, what's interesting with those that have Alzheimer's or dementia, the Tau-interacting proteins are actually down-regulated. So we can get insight using these unbiased approaches across a wide variety of different neurodegenerative diseases and neuropsychiatric disorders. And with respect to the neurodegeneration work, we're, again, systematically looking at all the mutations that have been connected to Alzheimer's, Parkinson's, and other diseases connected to dementia. Okay, so this is my last slide here. So what I've told you here is that there's great insight that can be garnered using these unbiased approaches across many different disease areas. And I know this is a meeting focused on neuropsychiatric disorders, but there's great advantage to looking across many disease areas using these unbiased approaches. Why? Because it's the same genes being mutated in autism that are hijacked by Zika. It's the same genes being mutated in cancer that are hijacked by SARS-CoV-2. And to me, this makes sense, because you've got these Achilles heels of the cell. Viruses have evolved to attack them and hijack them, and you also get mutations in them as well that cause disease X, Y, and Z. And there's great value in looking across a multitude of different disease areas to find these commonalities. It may not be the exact same gene that's mutated, but it could be this same complex or the same pathway. And this comes back to this idea of you've got to look at the cell using the right lens in order to find this overlap. 
So this is a big focus of what we're doing, obviously, across a multitude of different disease areas. And as you can tell, I'm quite excited about these efforts. Okay, so this is my group, and obviously a lot of people are involved in this type of work. I'll just segue to this picture. This is one that was recently taken just down the street. There's the Chase Center. Unfortunately, the Warriors are no longer playing. We often frequent this place, Harmonic Brewing. Tristan, we're going to take you there sometime soon. You'll like this beer. And the people here that were really heavily involved in the work that I talked about, let me see if I can find them here. So there's Rasika, there's Kirsten, and there's Danielle up front, who's heavily involved in this work, as well as all the projects that we have ongoing. So I'd like to thank them. It's great to work with them. And I'd like to thank you all for your attention. I don't know if we'll take questions now or if Sarah's going to come up. Okay. Thanks for your attention.
Hello, everyone. My name is Susan Wright, and I'm the Associate Director for Data Science at NIDA. I'm just going to give a quick overview of our data science program and show why we're so interested in this. So at NIH, we're supporting investigators that are generating a large volume of data. This data is incredibly complex, and the amount of it is rapidly increasing. There's a wide variety of data that includes basic, translational, and clinical. And for this data to be useful, it's very important that the collection, storage, analysis, use, and sharing of it follow the FAIR principles: findable, accessible, interoperable, and reusable. And at NIDA, our mission is to advance science on the causes and consequences of drug use and addiction. And then we apply that knowledge to improve individual and public health.
So our data science program is cross-cutting, and it focuses on the integration of existing data sets and tools with those that are being newly developed; making the data sets FAIR, or findable, accessible, interoperable, and reusable; the development or improvement of statistical and analytical methods and tools; data storage and management; and also promoting stewardship and sustainability. And you may be aware that NIH has a new data management and sharing policy that recently went into effect, and the website is here. And just a quick overview of this policy. So this went into effect on January 25th of this year. And this applies to competing grant applications that are submitted for the January 25th date and anything after that. Under this policy, NIH is requiring researchers to submit a data management and sharing plan and also to comply with this data management and sharing plan once it's approved by their funding institute or center. During a research project's funding period, compliance with this plan is determined by the center or institute that's funding it. And after the end of the funding period, noncompliance with this DMS plan may be taken into account for future funding decisions for the recipient institution. So this does apply to all research funded or conducted in whole or in part by NIH that results in the generation of scientific data. It does not apply to research and activities that do not generate scientific data, such as training, fellowships, infrastructure development, and non-research activities. And there is a comprehensive listing of the mechanisms and codes that require applicants to submit this plan on the OER website. If a mechanism does not show up on this list, it does not require submission of a data management and sharing plan. There are six elements that should be included in these data management and sharing plans.
The data type; related tools, software, or code; standards; data preservation, access, and associated timelines; access, distribution, or reuse considerations; and oversight of data management and sharing. And there are several links on the OER website that are useful if you're planning to submit something. And I'm happy to talk one-on-one with anybody if you have any questions about it. But obviously data sharing is very important to us, as you've heard from these talks. And that's all I have. So I guess we can move into the question period now. Thank you.
Just to reiterate from before, for questions, please use the microphones in the center of the aisles. I'm responsible, so I'll stand up at the top. Well, I can get us started. One of the things that comes to mind in looking at both of your talks is that at some point, after looking at an empirically derived and unbiased network assembly, you have to follow up and validate the discrete contributions of the individual targets. How do you overcome annotation bias when you're working from these enriched novel networks? In other words, how do you not cherry pick things to follow up on based on what everybody else has already done? Or based on what they haven't done?
Yes. So maybe I'll start. Go ahead.
Yeah, so it's funny, we were just talking about this over the past couple of days. So, as you point out, you start to have novel protein complexes come out of this analysis, which is the main difference between Nevan's talk and my talk. With public data for networks, you can find what's known, and that's useful for understanding genetics. But then in the second activity, you can find both known and novel complexes. So first of all, how do you know? We start by taking that whole hierarchy of assemblies and matching it to KEGG, GO, CORUM, all of the public knowledge bases of known protein complexes and pathways.
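Matching an assembly against a knowledge base of known complexes and pathways is commonly done with an over-representation test. The sketch below uses a one-sided hypergeometric test, which is a standard choice but is an assumption here, since the speakers do not say which statistic they use; the sizes in the example are hypothetical.

```python
from math import comb

def enrichment_p(overlap, cluster_size, term_size, universe_size):
    """One-sided hypergeometric p-value, P(X >= overlap).

    X is the number of proteins from an annotated term (term_size of them,
    within a background of universe_size proteins) that land in a cluster of
    cluster_size proteins drawn without replacement.
    """
    upper = min(cluster_size, term_size)
    total = comb(universe_size, cluster_size)
    tail = sum(
        comb(term_size, k) * comb(universe_size - term_size, cluster_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Hypothetical example: a 7-protein assembly, 5 of which belong to a
# 50-protein annotated complex, against a 20,000-protein background.
p = enrichment_p(overlap=5, cluster_size=7, term_size=50, universe_size=20000)
print(f"enrichment p-value: {p:.3g}")
```

A small p-value says the assembly overlaps something already annotated; a large one says nothing about what a novel assembly does, which is the limitation raised in the answer.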
The problem with GO enrichment is that one of two things happens. Either it finds something, in which case it's not novel, or it fails to find something, in which case you don't know anything more than when you started. What we've now been doing instead is feeding ChatGPT a list of proteins and saying, ChatGPT, oh, great one. Because you should always be polite to AIs that might kill you one day. And we say, here are seven proteins that are linked to autism in this protein complex. What do these do? And it's interesting, by the way: GPT-4, as well as some of the other large language models, has access to PubMed. They have read every paper in PubMed. They're the only entity that has done that. They have access to Gene Ontology, all those databases that we can essentially enrich against. It's sort of working. It also has annotation bias. It tends to cite papers in Science, Nature, and Cell only. And everything else tends to be BS. Anyway, it's an interesting question. You clearly want to get at understudied proteins. And in doing so in an unbiased way, clearly you make decisions, and you get at understudied proteins sooner or later. But how to do it, I think, is an unknown.
Maybe I'll just add, actually, ChatGPT is working better than I thought in terms of identifying groups of proteins that are working together and calling them out. But it's still not telling you what experiment you've got to do. And that's how you tell the story. And that still comes down to a lot of traditional work. So it'll only get you so far. And we're in an era now where we can collect so much information. Students and postdocs come in and they can just collect a whole slew of information. And they're like, what do I do? What do I do with it? Well, you have to read the literature. You have to spend a lot of time looking at what's been done previously to generate hypotheses. Maybe AI will get there eventually. It'll group the proteins and it's reading the literature.
But it's still not telling you what experiments you have to do. Maybe, hopefully, it'll get there at some point. But then the experiments still have to get done. So it's maybe not a satisfying answer. But it's looking at PubMed, reading the literature, getting up to speed on what's been done and what should be done in the future. And a lot of trainees don't like that answer. But that's still the case. And you know, the other converse of this is, no, it's fine. OK, can it work? The converse of this, what was I going to say? I've totally forgotten what I was going to say. There was a converse. Never mind. It was going to be really good. It was going to be really good. It was going to be good. We'll have to wait for that. OK? Is this the mic? Yeah, that's fine. Yeah, that's fine. This one is going to do it? Mm-hmm. It's going to do it. It's going in and out. I think we're good. I think we're good. OK, all right. I think we're good. You can just shout. That's fine. Oh, OK. Check one, two. Oh, it's very directional, is your point. It's a very directional mic. Fair enough. It's attached is the problem. All right. So let's see. Another question I had that maybe we could spend a little bit of time discussing is very pertinent to the field of substance use from a therapeutic development perspective, which is when you look at these different networks assembled from discrete molecular entities, the data that's informing them oftentimes when you start, like with the mutation profiles from ASD, et cetera, those are coming from sort of natural experiments, right? You're looking across a population. Exactly. What kinds of experiment designs do you think are best applied to cross-sectional versus longitudinal data sets in order to give you the richest network? So by longitudinal, you mean, yeah, now it's gone. Yeah, yeah. I don't hear myself at all. Do you guys hear me? Oh, you do. OK. So by longitudinal, you mean tracking patients over time. Exactly. Right. 
Which I don't believe I do. I certainly talk about cross-sectional data, right? That's an excellent question. I'm trying to think, do we have? Certainly, you get into the issue of dynamics. So presumably, in a GWAS study, unless you're talking about cancer mutations, the genome is constant. It's the phenotypes that are changing. Although, again, if you're studying diseases like cancer, both are changing. And so presumably, as those data become available, you would either link to a dynamic phenotype, like as a rate, or you would simply perform what I showed at multiple time points. Yeah, I guess it's a good question. And as Trey's alluding to, we don't yet have the data. But what you're referring to is the next frontier, right, of these networks, dynamics and spatial. How do things change as you go around the cell? But what I would say is, where we're actually generating the best data, temporal type of data, is with viruses, right? Because then you can force infection, or you can just monitor the SARS-CoV-2, how it is mutating, and then introduce those mutations in, and see how it changes. Or you do it in the lab. You can mutate the virus in the lab. So we're learning how to generate that data and how to analyze it. Once we get this type of information, we'll be in a better position to handle it on other disease areas. And cancer is the other one, like I alluded to. So age of diagnosis, response to therapy, or time, you know, progression, pre-survival. Yeah. Come back to that, but I want to get to the question in the audience. Thank you for the excellent presentation. I had a question about, you know, well, you study about cancer biology and also psychiatric things, but when we compare the cancer genomics and the psychiatric genomics, I believe that we have too much noise in psychiatry, especially the labeling and the diagnosis, and diagnosis are fluid. Like 50% of the diagnosis can change within 10 years sometimes. 
And do you think that when you do the genetic studies on psychiatric diagnoses, is there so much noise in those studies? Like how do you deal with the noise in these? Because it's totally different from the cancer studies. There the diagnosis is so solid, and like they don't change the diagnosis. Bladder cancer is always bladder cancer, but sometimes when you diagnose bipolar disorder, it can turn into, not turn into, but like end up with a schizophrenia label or something like that. That was my question. Yeah, sure, so great point. So what do you do when the phenotype is noisy or hard? Yes, yeah. Hard to measure, I guess, would be a related or a corollary. Yeah, it's so noisy. Accurate, that is to say. So one thing that is a big movement in the field is endophenotyping, so the idea that the final phenotype, the coarsest thing you can say, is: is the person diseased or not? Do they have schizophrenia or do they not have schizophrenia? If you can push those phenotypes down, that gross phenotype down to a constellation, a small constellation of tissue-based phenotypes or behavioral phenotypes that are more specific, that's obviously nothing that we're pioneering here, but it's a huge movement in the field that approaches like ours will avail themselves of because now the final phenotype changes. And if you think about that, what I like about that idea that's, again, emerging in the field is it's sort of the opposite of what we've done here. So what we, you have genotype goes to phenotype. And then you change it. So genotypes down here, phenotypes up here. As I percolate up through the system, what we're doing is pushing genotype inwards towards protein complexes and pathways. What the rest of the field is doing is pushing phenotype downwards into endophenotypes. And eventually they're gonna meet, they're gonna be able to phenotype the activity of individual circuits and individual cell types for which we have the molecular underpinnings in our network maps. 
Well, and maybe just to provide more of a concrete example of that, if you look at autism, and with Matt State here, he tells me all the time he's treated autistic children for decades, but autism spectrum disorder is not one disease, it's multiple diseases at the end of the day. And the 102 genes that were derived were from probably 20 different diseases. So that needed to be a strong signal. So wouldn't it be great to be able to more accurately diagnose these groups, then sequence them individually and look for this signal. But as Trey's alluding to, maybe we can help with that. Maybe our network approaches could help with the stratification, say for example, with autism spectrum disorder. That's what I think. I mean, it should be from genotype to phenotype, not the phenotype to genotype kind of thing. And then meet in the middle. It's both ways. Yeah, both ways. Yeah, yeah. It's better to stratify the genotype and then to see the label later or the phenotype after that. Can we cluster the genotype first and then see how it fits with the phenotype? I mean, that's exactly what these approaches are doing is they're essentially clustering genotypes on the network maps, right? So you can think about it equally well as you're subtyping people based on their protein complexes or pathways. You know, this person is mutating the synapse, you know, the post-synaptic density versus the presynapse. And so that's basically molecularly typing individuals based on what genetic variants they have and in which protein complexes. Okay. And thank you so much. I had also another question. It's kind of probably a basic level question, but when we talk about the mutation, are we talking about the malignant variant and also the likely malignant variant? And do we also include the, like, VUS, like, you know, variants of undetermined significance when we talk about the, like, mutation? Do we also include those things? I'm sorry, you said, is it? I misunderstood your question. 
Sorry, the question is, when we're talking about the mutation, do we talk about, like, variants like malignant variants, like likely malignant variants, or, like, variants of undetermined significance, likely benign variants, and also benign variants? So there are some variants, like, they change the, like- Well, I guess, I mean, what we hopefully will be doing is changing the category. So maybe some benign variants may actually not be benign variants when you look at them in the context of, you know, complexes and pathways, right? So I'm trying to understand what is- For example, there are some mutations that cause a change in the nucleotide, but do not cause a change in the protein structure. Oh, I see, I see, I see. Yeah, I mean, there's a lot of silent mutations that people have disregarded, but I think that some recent work has suggested that there's, you know, effects that are being had that we should be looking at in greater detail, right? So I wouldn't necessarily dismiss mutations and say, ah, these aren't doing anything. We just may not be able to look at them the right way. Or are you talking about, I mean, there's the silent mutations, but what about non-coding mutations? That's another huge area where, you know, so much, I feel like, of systems biology and omics has been directed towards transcription. And there's so much work out there that's trying to connect non-coding variants right now to gene expression patterns. It affects gene expression, but not the protein structure, right? Exactly, and so I think the idea for us, and we have some work on that collectively, separately and together, that we didn't talk about, which simply adds a first step. 
So if you have non-coding mutations, we first use the beautiful work of others to link those to genes, you know, using so-called expression quantitative trait loci, eQTLs, and other structural data like Hi-C, you know, this bag of tricks that people use for linking a non-coding variant to control of the expression of a given gene. And then that feeds into the models we talked about. So it's essentially a bolted-on pre-processing step. But ultimately, even for non-coding variants, you can't stop at explaining the expression of one gene. You have to understand how those changes propagate through, you know, cell biology and tissue biology. Yeah, as proteins go up and down, they're going to be changing the network at the end of the day. Thank you so much. So back to the sort of earlier discussion in line with this in terms of looking at the temporal process of these complex diseases. You know, one thing that substance use has in common with cancer, actually, is that it is considered a chronic, often relapsing condition. You know, that there is often a period of remission followed by a change in the state and then a return to active disease. Now, what's interesting is that the temporal separation of those different states often informs subsequent treatments, right? So if you have a cancer that relapses after first-line treatment within the first year, the treatment option for salvage is different than if the relapse occurs five years from the initial remission. So what kind of information do you think can be captured that actually allows you to stratify patient risk? Well, that sounds a lot like progression-free survival in response to therapy, which is absolutely something we, and many others, are predicting in the cancer space. So again, if this were ACR, not APA, I would have talked about that. But that's exactly the right idea, is that becomes the phenotype like you alluded to before. But I guess the question is the mutational status. 
If you look at a cancer, there's a tumor that's mutating. With respect to addiction, you know, I don't know if that person's, that person's not gonna be mutating. So then is it gonna be epigenetic? Is it gonna be other types of information that we need? Probably, outside of genetics, but we could still use our maps to help interpret the changes. So maybe proteomic profiles, gene expression analysis, that type of thing, which I think is what you're kind of alluding to. So it's an interesting problem. I think it's similar in ways to cancer, but then also different as well. Yeah, that's the thing, is that if we're interested in understanding more of that complex physiology as the states change, right, that becomes important for predicting or for even going after therapeutic intervention modalities. So in a way, in the G to P equation, it's basically, it's almost E, G plus E. Yes. It's the precondition, right, which I guess we consider the E. We need a third modifier in that, right? You know, there needs to be some kind of an intermediate. G plus E equals P, isn't it? Yeah. Yeah, you know. Okay, nobody knows what the hell you guys are talking about here. Speaking in tongues. Okay, if there are no questions, I think we can end. Thank you very much, everybody, for your attention. Thank you for the- Thank you very much. Thank you.
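The two steps described in the discussion above (first linking non-coding variants to genes with eQTL-style evidence, then typing individuals by which protein complexes their variants fall in) can be illustrated with a minimal Python sketch. This is not the speakers' pipeline: the lookup tables, gene names, variant IDs, and function names are all hypothetical placeholders, and real analyses would use curated eQTL and protein-interaction resources.

```python
from collections import defaultdict

# Hypothetical eQTL-style lookup: non-coding variant -> gene it is linked to.
VARIANT_TO_GENE = {
    "var_nc_1": "GENE_A",
    "var_nc_2": "GENE_C",
}

# Hypothetical network-map membership: gene -> protein complex label.
GENE_TO_COMPLEX = {
    "GENE_A": "postsynaptic_density",
    "GENE_B": "postsynaptic_density",
    "GENE_C": "presynapse",
    "GENE_D": "chromatin_remodeling",
}


def genes_for_individual(coding_genes, noncoding_variants):
    """Pre-processing step: fold non-coding variants into the gene set."""
    genes = set(coding_genes)
    for variant in noncoding_variants:
        gene = VARIANT_TO_GENE.get(variant)
        if gene is not None:  # variants with no known link are skipped
            genes.add(gene)
    return genes


def complex_profile(genes):
    """Set of protein complexes hit by an individual's affected genes."""
    return frozenset(GENE_TO_COMPLEX[g] for g in genes if g in GENE_TO_COMPLEX)


def subtype_by_complex(individuals):
    """Group individuals whose variants fall in the same complexes."""
    groups = defaultdict(list)
    for person_id, (coding, noncoding) in individuals.items():
        profile = complex_profile(genes_for_individual(coding, noncoding))
        groups[profile].append(person_id)
    return dict(groups)


# Toy cohort: person -> (genes with coding variants, non-coding variant IDs).
individuals = {
    "p1": (["GENE_B"], []),
    "p2": ([], ["var_nc_1"]),
    "p3": ([], ["var_nc_2"]),
    "p4": (["GENE_D"], ["var_unknown"]),
}
subtypes = subtype_by_complex(individuals)
```

In this toy cohort, p1 and p2 carry different variants in different genes yet land in the same group, because both genes sit in the same (hypothetical) complex. That is the sense in which clustering on a network map can stratify people whose raw genotypes do not overlap.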
Video Summary
The session focused on systems neuroscience and its application in studying complex neuropsychiatric conditions, particularly substance use disorders. Systems neuroscience explores brain networks and how they lead to behaviors and associated disorders. This approach is crucial as it moves beyond simple abstinence and acute interventions by recognizing substance use as a spectrum disorder with dynamic courses. The variability within substance use disorders suggests potential subtypes with distinct physiological mechanisms. The complexity of brain reward circuitry necessitates systems biology to quantify subnetwork contributions and understand the spectrum of traits in substance use disorders, potentially identifying new therapeutic interventions.

Dr. Trey Ideker discussed molecular network mapping for understanding psychiatric phenotypes and their genetic underpinnings. He illustrated how genetic mapping often yields numerous suggestive peaks, but few significant associations unless augmented by additional patient data. Ideker emphasized the use of protein-protein interaction networks to find convergent elements in comorbid conditions like autism and congenital heart defects, which individually show little genetic overlap. He explored BMI-related genetics in humans and rats, showcasing the power of systems biology in aligning findings across species.

Nevan Krogan highlighted the importance of quantitative network approaches in understanding molecular interactions within cells, crucial for disease biology and therapeutic direction. His work in the Cancer Cell Map Initiative and the Psychiatric Cell Map Initiative integrates genomics with molecular interaction data to elucidate disease mechanisms, offering insights into therapeutic strategies.

The discussion pointed out the relevance of understanding dynamic, complex systems in chronic conditions like addiction, suggesting parallels with cancer in relapse and treatment adaptation. The need for integrating genetic and environmental data to better predict and interpret complex disease physiology and treatment responses was emphasized.
Keywords
systems neuroscience
neuropsychiatric conditions
substance use disorders
brain networks
systems biology
molecular network mapping
genetic underpinnings
protein-protein interaction
therapeutic interventions
quantitative network approaches
disease mechanisms