Bruce Beutler

Good morning. I'd like to talk to you today about our work with immunity, and about how the mouse has improved quite dramatically as a model organism for forward genetics. My work began with a kind of old premise. More than 100 years ago, when microbes had only recently been recognized as the causative agents of infectious disease, people had already begun to wonder how they harmed the host. And Richard Pfeiffer, shown here in the background behind Robert Koch, had an interesting observation in that respect. In 1891, he coined the term "endotoxin" to describe something intrinsically toxic associated with Gram-negative bacteria. He noticed that even heat-killed organisms could, if injected into guinea pigs, cause shock and death, reminiscent of an authentic infection, although he couldn't recover any viable organisms from the peritoneal cavity after he administered the microbes. Richard Pfeiffer became very famous for this observation, and although he's a rather obscure character today, in his lifetime he was nominated 33 times to receive the Nobel Prize in physiology or medicine. The reason for that was that, then as now, hundreds and maybe even thousands of people die every day of endotoxin-induced shock, and this is what a typical patient with Gram-negative septicaemia might look like. It's a child with meningococcal sepsis. This is a severe systemic form of inflammation, and it has to be recognized that all inflammation had obscure origins in those days, and here Pfeiffer had perhaps identified a single molecular species that could cause inflammation. Over the years that followed his initial report, it was found that endotoxin, which we now call lipopolysaccharide or LPS, was associated with all Gram-negative bacteria. It was a structural component of the outer leaflet of the outer membrane. It had a lipid and a polysaccharide moiety, and the Lipid A moiety was the toxic part of the molecule.
Eventually, Lipid A molecules were synthesized entirely artificially, and they can do most of what Pfeiffer recognized long ago. We might draw a typical LPS structure like this, and from our work and the work of others, it emerged more recently that LPS activates macrophages to make cytokines. These cytokines, and especially tumour necrosis factor among all of them, bind to receptors on many other cells and trigger the release of terminal mediators of vasodilation, and lead to shock and to aggregation of platelets and all of the things that contribute to the clinical syndrome. A major question, obviously, was, "What's the receptor for lipopolysaccharide?" And this was a tough problem that resisted direct attempts to crack it biochemically. By 1990, from the work of Ulevitch and Wright, it was known that antibodies against the surface antigen CD14, present on macrophages, would inhibit the LPS response very strongly. But CD14 was a GPI-anchored protein. It had no cytoplasmic domain, and it was guessed that it could only work by associating with some kind of mysterious co-receptor that did have a cytoplasmic domain and could transduce the signal across the membrane. The nature of that receptor was unknown. However, where TNF was concerned, it was known by that time that NF kappaB was the critical transcription factor that would lead to activation of the TNF gene. Once the TNF mRNA was made and processed, it was sequestered in a locked form, today we would say in P-bodies most likely, and that had to be unlocked in order to allow translation of the mRNA and production and processing of the protein. The big question still loomed: What was the true membrane-spanning, activating receptor for LPS? I felt that that question, if solved, would offer the key to understanding how the host becomes aware of infections within the first minutes after inoculation with bacteria. In short, how innate immune sensing works.
Finding the LPS receptor ultimately depended on two unrelated sub-strains of mice that couldn't respond to LPS because of spontaneous mutations. The first of these had been identified in 1965 by Heppner and Weiss; it was the C3H/HeJ strain. It was shown to be refractory to any amount of LPS, yet the animals were highly susceptible to Salmonella typhimurium, as was later shown, and also to other Gram-negative microbes. In 1978 a completely unrelated strain of mouse was found to be refractory to LPS. This was the C57BL/10ScCr mouse, and by crossing the two strains, both of which had recessive problems, one found that the F1 hybrid offspring were refractory to LPS as well. So it was guessed that the two animals had allelic defects, and in both cases there were closely related control strains that showed a normal LPS response. This set the stage for positional cloning of what, by then, was called the LPS Locus. Positional cloning in the classical sense isn't practiced anymore, and not all of you know exactly what it is, but essentially, it's cloning by phenotype. It's possible, starting from a phenotype alone, to find the location of a gene by first establishing a critical region, and that's the phase of genetic mapping. Then one clones all of the DNA across the critical region in a physical mapping effort, and finally, having identified all of the candidate genes within the critical region, one goes and looks for the causative mutation. Typically there would be only one in this sort of circumstance. This was a very difficult kind of cloning to accomplish. Back in 1993, we set about to do genetic mapping using only 11 markers on mouse chromosome 4, where we knew the LPS Locus was, and that covered most of the chromosome. We expanded the area and, on 2,093 meioses, we established a critical region between two new markers, B and 83.3, and we felt that this was about 2.6 million base pairs, though today we know it was about twice as large.
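The genetic mapping phase can be sketched in code. This is a minimal sketch, assuming each meiosis is summarised as a map from marker name to whether the animal carries the mutant-strain genotype at that marker, together with a yes/no phenotype. The marker names `M1` to `M5`, the function `critical_region` and the data are hypothetical and are not taken from the original mapping.

```python
def critical_region(marker_order, animals):
    """Return the two markers that flank the block of markers
    co-segregating perfectly with the phenotype.

    marker_order: marker names in chromosomal order.
    animals: list of (genotypes, phenotype) pairs, where genotypes maps
             marker name -> True if the mutant-strain genotype is present,
             and phenotype is True if the animal is LPS-unresponsive.
    Assumes the concordant markers form one contiguous block.
    """
    concordant = [
        i for i, marker in enumerate(marker_order)
        if all(geno[marker] == pheno for geno, pheno in animals)
    ]
    if not concordant:
        raise ValueError("no marker co-segregates with the phenotype")
    first, last = concordant[0], concordant[-1]
    # A recombinant animal excludes a marker; the nearest excluded marker
    # on each side bounds the critical region.
    left = marker_order[first - 1] if first > 0 else None
    right = marker_order[last + 1] if last + 1 < len(marker_order) else None
    return left, right


markers = ["M1", "M2", "M3", "M4", "M5"]
animals = [
    ({m: True for m in markers}, True),                      # non-recombinant, affected
    ({m: False for m in markers}, False),                    # non-recombinant, unaffected
    ({"M1": False, "M2": True, "M3": True,
      "M4": True, "M5": True}, True),                        # crossover between M1 and M2
    ({"M1": False, "M2": False, "M3": False,
      "M4": True, "M5": True}, False),                       # crossover between M3 and M4
]
flanks = critical_region(markers, animals)
```

With more meioses, more crossovers fall close to the locus, so the flanking markers move inward and the interval shrinks.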
We then did physical mapping, cloning a large number of bacterial artificial chromosomes in an overlapping format to span this region, and then we began to sequence them, starting at the middle of the interval and working outward bidirectionally. That was how our life went for about three years. We would fragment BACs, sequence them, blast the results against libraries of expressed sequence tags that were maintained at NCBI, and look for genes. And over all that time we found only a collection of pseudogenes, though of course they didn't come with a label saying so, and we had to exclude that they were authentic genes with a mutation that would distinguish one strain from the other. We were getting rather scared by the summer of 1998, because we had covered more than 90% of the region and we were running out of material to sequence. And still we hadn't found the gene. Then, of course, it's always in the last place you look. Far toward the proximal end of the interval we found a gene encoding an orphan receptor, called Toll-like receptor 4. Now, this was very exciting right from the start. First of all, the gene we had found in our critical region had leucine-rich repeats in its ectodomain, just like CD14 did, and we could imagine that perhaps by proximity or by transfer, LPS would go from CD14 to TLR4 and then trigger a response. And this was, indeed, a single-spanning transmembrane protein. Second, on the cytoplasmic side, there was strong homology between the TLR4 receptor and the interleukin-1 receptor. The interleukin-1 receptor was known to have inflammatory effects when triggered by a protein ligand, interleukin-1, and it could activate NF kappaB. So again, we thought this motif would probably work to drive the activation of the TNF gene and other inflammatory cytokine genes, when stimulated.
Third, there was an observation, then two years old, by Jules Hoffmann and his colleagues, who had been looking for mutations that would cause susceptibility to fungal infection in the fly. And in a beautiful paper, in Cell in 1996, Jules and his colleagues showed that flies with mutations in Toll, the namesake of this super-family of proteins, would be susceptible to infection by fungi, specifically Aspergillus fumigatus. Here you see a dead fly with hyphae sprouting from its thorax because the fly couldn't make a critical antimicrobial peptide, drosomycin. This seemed a parallel story to the case that we were concerned with, where mutation made mice highly susceptible to Gram-negative bacterial infection. Of course, all of those ideas would amount to nothing if we didn't find a mutation, but in due course, we did. And we found that in the C3H/HeJ strain there was a single base pair change that altered the cytoplasmic domain of TLR4, making it unable to signal. And in the C57BL/10ScCr strain there was a deletion encompassing all of the exons of the Toll-like receptor 4 gene, a 74 kb deletion, and we ultimately defined its exact margins. And these two allelic defects, which weren't present in the control strains, convinced us completely that this was the gene we were after. There was still some question about whether TLR4 was actually a receptor for LPS. And over a period of a year it was determined by Kensuke Miyake and colleagues that there was another sub-unit to the complex, called MD-2, shown in magenta here, a basket-shaped protein that interacts strongly with TLR4, which is shown in cyan. It has all of these leucine-rich repeats, which make a kind of slinky-shaped molecule, and here you see that the molecule is dimeric and that LPS, as was shown in 2009 by Jie-Oh Lee and colleagues, who finally crystallized the complex, fits into the pocket of MD-2 but has some contact with the backbone of TLR4 as well.
This, when it occurs, creates a conformational change that's sensed across the membrane. And this is where all of the inflammatory effects of lipopolysaccharide begin, with this one molecule. The next questions we wanted to address had to do with signalling by TLR4 and how that worked. We were enamoured of the forward genetic approach by that time, but there were no other spontaneous mutations in mice that could tell us anything about how LPS signalling worked. So we decided that we'd have to create new phenotypes using a mutagen, and we began to do so in 2000. Over the next 11 years or so, we identified many phenotypes that were related to TLR signalling and we also focused on other aspects of immunity, keeping them under surveillance with different screens at the same time. In those days, ENU mutagenesis was a blind process. ENU, or ethylnitrosourea, is the only mutagen that one can really use effectively in the mouse, the only chemical mutagen. It's given to male mice, it mutates the spermatogonia, and mutations are transmitted to the sons of these mice, the G1 generation. A single G1 defines a pedigree, and the G1 mice were bred to B6 mice to produce daughters. They were then back-crossed to their own daughters, and that brought some of the mutations to homozygosity in every G3 offspring born to that cross. We typically made very small pedigrees because we didn't want to be in the position of screening the same mutations over and over, and our thinking on that has totally changed, as you'll see in a moment. We didn't know it at that time, but we know today that the average sperm derived from a G0 animal has 60 to 70 mutations that change coding sense. And it's known from long experience that if you see a phenotype, it's almost always from a coding change, rather than an intergenic change of some kind. Now, this was a blind process, as I've said. The only way we knew the ENU was working was by seeing phenotypes or detecting them in our screens.
And it was encouraging to see that we saw a lot of peculiar mice, and of course we would track down all of the mutations, including those behind anything we happened to see visibly. Over 11 years we found 34 mutations that fell into 20 genes, and they informed us quite a bit about how TLR signalling worked. We had mutations in the Toll-like receptors themselves, of which there are 12 in the mouse, and I'm only illustrating some of those here. We also found mutations in co-receptors, in addition to those that I've mentioned already. There were mutations in chaperones like UNC93B, which bring the TLRs where they need to be. Some channel proteins are required for signalling from the endosome by TLRs as well. Then there are adaptor proteins that are recruited to the receptor in order to signal further. There's a layer of kinases that become activated, then a ubiquitin ligase, TRAF6, becomes activated as a result. It ubiquitinates itself and other proteins, and TAB2 brings these all together to allow signalling to proceed as it should. And finally, one has another layer of kinases that ultimately phosphorylate I kappaB and lead to its degradation and to NF kappaB translocation, and there are still other proteins that are needed for TNF to be processed and released from the cell. We still took TNF production as the endpoint of our screen. In the beginning, this was very hard, just like it had been before. But it got easier when the mouse genome was sequenced and annotated. Then one didn't have to make contigs anymore, you knew what all the genes were, it wasn't terra incognita, like in the old days. It also got faster when better sequencing technologies came online, first capillary sequencing, then after that, massively parallel sequencing platforms. But by 2011, it was clear that the rate-limiting step in mutation finding was genetic mapping.
The usual paradigm of outcrossing an established stock to a marker strain, then backcrossing, making a critical region assignment, and then looking for the mutation there, had begun to slow us down. And we could declare many more phenotypes than we could actually solve. Sometimes a year or more was required to track down a mutation. So a new approach was needed. I began to think of what the most magical approach would be, and I thought in terms of Google Glass in those days. I wanted to have a magical pair of glasses, with which I could look at a family of mice, like this one here, and even if the mutation were not obvious, as I've shown it, it would tell you which mice were affected by a mutation. And not only that, but in the blink of an eye, it would tell you, this is a mutation in SOX10. These are the coordinates of the mutation, the amino acid change, the motif, the human homologue. And if there were structural data, it would even show that to you as well. Well, actually, this is all a reality now, and we are able to find mutations in real time, and I'll tell you precisely how it's done. First, we make G1 mice, just as we always did, but then we whole-exome sequence every G1 mouse that's produced, up front, to find every mutation they might transmit into the pedigree. We've been at this for quite a while and we found that the mean number of mutations that change coding sense was 63 per G1 mouse, and the modal number was 70. And there are ways to make it higher than that, but we'd prefer not to because we wind up with too much G3 lethality. If the number is greater than 30 mutations, then we move forward with that pedigree, otherwise it's discarded. Moving forward means we order an AmpliSeq panel, which is a collection of PCR primers calculated not to interfere with one another, that target every one of the mutation sites and allow us to genotype them. Then, the G2 mice and G3 mice are all genotyped at all the mutation sites that we've created with ENU.
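The pedigree triage step described above is simple enough to sketch. This is a minimal sketch, assuming each G1 mouse is represented by the list of coding mutation sites found by exome sequencing. The class `G1Mouse`, the function `triage` and the example mice are hypothetical, and only the threshold of more than 30 mutations comes from the talk.

```python
from dataclasses import dataclass, field

MIN_CODING_MUTATIONS = 30  # a pedigree goes forward only above this count


@dataclass
class G1Mouse:
    mouse_id: str
    # Each site is a (chromosome, position) pair found by exome sequencing.
    coding_mutations: list = field(default_factory=list)


def triage(g1_mice):
    """Split G1 mice into pedigrees to pursue and pedigrees to discard."""
    keep, discard = [], []
    for mouse in g1_mice:
        if len(mouse.coding_mutations) > MIN_CODING_MUTATIONS:
            keep.append(mouse)
        else:
            discard.append(mouse)
    return keep, discard


def genotyping_targets(mouse):
    """Sites that a genotyping panel for this pedigree would need to cover."""
    return sorted(set(mouse.coding_mutations))


# Made-up mice: one near the reported mean of 63, one at the cutoff, one just above.
mice = [
    G1Mouse("G1-a", [("chr4", i) for i in range(63)]),
    G1Mouse("G1-b", [("chr4", i) for i in range(30)]),
    G1Mouse("G1-c", [("chr4", i) for i in range(31)]),
]
kept, discarded = triage(mice)
```

Every G2 and G3 mouse in a kept pedigree would then be genotyped at each site returned by `genotyping_targets`.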
Only then are the mice released for phenotypic screening. In this case it involves visual inspection, weighing the mice, and giving them a glucose tolerance test, then subjecting them to a battery of tests for innate immune performance by macrophages, immunizing them, and doing flow cytometry to assess adaptive immune development and performance. We then do a DSS challenge, we infect them with mouse cytomegalovirus, and then they are passed on for other screens that are in the area of neurobehavioural responses. As of June 28, 2015, we had created nearly 64,000 mutations in this way, and now it's no longer a blind process. We know what every mutation is, and we know what genes they affect. These mutations fell into 17,204 genes, or upwards of 2/3, I think, of the 24,981 genes that the mouse has in total. Now, this is an enormous number of mutations. If they were present, even in the heterozygous state, in one G1 mouse, they would almost certainly be lethal. But, of course, they're distributed among more than 1,000 pedigrees. We are able to calculate that we've mutated 17% of all genes to a state of phenovariance, and I'll come back to what I mean by that, and tested them in the homozygous state three times or more, in at least one of our screens. In all, we have about 135 screens, and that's what most of the mice were subjected to. Where adaptive immune performance alone is concerned, we came across 60 genes that were already known to be required for immune development or function, and we detected them by phenotype. But along with those, we found hundreds of genes that were previously not known to be involved in immunity. And all this suggests that a large fraction of our genome is needed for immune defence, as I would've guessed. But now we're in a position, maybe, to make more precise estimates about just how large that fraction is. To look through these data, one needs software that lets the observer explore all of the mutations.
And we wrote a programme called Linkage Analyser, and a browsing programme called Linkage Explorer that makes that possible. So one may focus on any particular gene, on any screen of interest, on a subset of the mice, or on a trivial phenotype name. One can restrict the search to different types of mutations, and one can insist on looking only at large pedigrees if he or she wants to. The number of observations in the homozygous state can be controlled, and the observer also chooses the p-value of association between the phenotype and the genotype of interest. And this is done by altering this value here. And there are other ways, as well, to restrict the quality of the observations. To give you an example, we might say that we're interested just in assays having to do with CD8 cells, with their number or activation state. So you can key in CD8 under the screen name, we insist on seeing the mutation three or more times in the homozygous state, we insist on a relatively strong p-value of association, 0.0005, and we check these other items as well, I won't go through all of them, and we click "submit." Then, in short order, you get back a list of genes, in this case, a list of 102 variant alleles of 100 implicated genes that come from 70 different pedigrees. From this fact alone you can see that we don't always resolve to a single mutation. Sometimes we have linkage of two mutations that fall within one linkage peak, as you might guess. But usually we get down to a single mutation that's implicated. In the first column you see gene names, and some of these, if any of you are immunologists, will be familiar to you. Themis is known to be involved in positive selection of T cells, and it shows up in a screen for CD8 cells or the CD4-CD8 ratio. Some are unknown. I doubt any of you are aware that SNRNP40, which is a component of the U5 spliceosome, has a selective role in immunity, but it does.
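The kind of query described above can be sketched as a filter over linkage results. This is a minimal sketch and not the actual Linkage Explorer interface: the record type `LinkageHit`, the function `query`, the field names and every record in the example are hypothetical. Only the filter settings (screen name containing CD8, three or more homozygous observations, p-value of 0.0005) come from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkageHit:
    gene: str
    screen: str
    pedigree: str
    hom_observations: int  # G3 mice homozygous for the mutation
    p_value: float         # association between genotype and phenotype


def query(hits, screen_contains, min_hom=3, max_p=0.0005):
    """Keep hits from matching screens that meet both quality cutoffs."""
    key = screen_contains.lower()
    return [
        h for h in hits
        if key in h.screen.lower()
        and h.hom_observations >= min_hom
        and h.p_value <= max_p
    ]


# Invented records, for illustration only.
hits = [
    LinkageHit("Themis", "CD4:CD8 ratio", "R0001", 5, 1e-6),
    LinkageHit("Snrnp40", "CD8 T cell frequency", "R0002", 4, 2e-5),
    LinkageHit("GeneX", "CD8 T cell frequency", "R0003", 2, 1e-7),  # too few homozygotes
    LinkageHit("GeneY", "B cell frequency", "R0004", 6, 1e-8),      # different screen
    LinkageHit("GeneZ", "CD8 activation", "R0005", 3, 0.01),        # weak association
]
selected = query(hits, "CD8")
genes = sorted({h.gene for h in selected})
pedigrees = {h.pedigree for h in selected}
```

Counting distinct alleles, genes and pedigrees in the result is what gives summary figures of the kind quoted in the talk.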
In the next column you see the coordinates of the mutation and estimates of what the mutation does, and you also see the screens in which scoring was observed. And then farther over, you see the number of observations of homozygous reference allele, heterozygotes or homozygotes for the mutation. In these three columns, you see the score for linkage in either an additive, a recessive, or a dominant model of inheritance. And if you want to actually look at the plot of inheritance, you click on one of those numbers, and you see a Manhattan plot. This is a log-scale plot of the probability of linkage, and you might mouse over any of the mutations that you see here. All of these are the mutations in the pedigree, only one of them shows strong linkage above the Bonferroni correction line, and if you mouse over it you find that this is SNRNP40. You might not know what SNRNP40 is, so you can click on that and you get some information about the gene, which has been precalculated. You see that our mutation makes a shortened version of the protein, and this would be interactive in the real programme, where you could mouse over and see the domain structure. You can click on the gene model, and you find that the mutation is near exon 5 and is believed to remove exon 5, which creates an in-frame product. And you find much other information as well, it's all been precalculated. If you'd like to see the authentic data, the raw data, you can click on the peak value as well, right click, and there you see the phenotypic performance of the homozygous mutants, the heterozygotes, and the reference allele homozygotes. Now, there's overlap between the heterozygotes and homozygotes. This would've been a terrible thing to try to map using a qualitative approach, as we always used in the old days, and you might not even completely believe these data because, after all, we have only a limited number of mice here. And you might think it's some kind of a fluke.
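The three inheritance models and the Bonferroni line lend themselves to a small sketch. It assumes genotypes are coded as 0, 1 or 2 copies of the mutant allele, and it uses a permutation test on the absolute correlation as a stand-in statistic, since the talk does not say which test the software applies. All function names and the twelve-mouse data set are hypothetical.

```python
import math
import random


def encode(genotypes, model):
    """Recode mutant-allele counts (0, 1, 2) under one inheritance model."""
    if model == "additive":
        return list(genotypes)
    if model == "recessive":
        return [1 if g == 2 else 0 for g in genotypes]
    if model == "dominant":
        return [1 if g >= 1 else 0 for g in genotypes]
    raise ValueError(f"unknown model: {model}")


def abs_corr(x, y):
    """Absolute Pearson correlation; 0.0 if either variable is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / math.sqrt(sxx * syy)


def permutation_p(x, y, n_perm=2000, seed=0):
    """Share of phenotype shuffles that associate at least as strongly."""
    rng = random.Random(seed)
    observed = abs_corr(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs_corr(x, shuffled) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def bonferroni_line(n_mutations, alpha=0.05):
    """Height of the correction line on a -log10(p) axis."""
    return -math.log10(alpha / n_mutations)


# Invented G3 data: only homozygous mutants show a reduced value.
genotypes = [0] * 4 + [1] * 4 + [2] * 4
phenotype = [10, 11, 9, 10, 10, 9, 11, 10, 4, 5, 3, 4]
p_values = {
    model: permutation_p(encode(genotypes, model), phenotype)
    for model in ("additive", "recessive", "dominant")
}
scores = {model: -math.log10(p) for model, p in p_values.items()}
```

Each mutation in a pedigree would get one such score per model, and the Manhattan plot shows those scores against a line at `bonferroni_line(n)`, where `n` is the number of mutations tested in that pedigree.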
But you have to keep in mind that gradually, as we approach saturation, we hit the same genes over and over again, and the computer detects this and generates "superpedigrees" whenever this occurs, either with identical alleles transmitted from the same G0 or with different alleles that hit the same gene. They get combined into a single large artificial pedigree. Eventually all mutations will be incorporated that way. At present, more than half of all genes are falling into superpedigrees and the number is climbing quickly. With multiple alleles, confidence about an association between phenotype and genotype increases. The same kind of browsing programme is used to examine superpedigrees. In the case of SNRNP40, which I've looked up for you here, we have a total of 16 pedigrees, 16 G1 mice, but only four different alleles, so there is the one I showed you and three others. The one I showed you is a probably null allele by our estimation. And again, you have the same kinds of models of additive, recessive and dominant inheritance. If we click on one of these, we see something rather different. We find that now we've assayed 376 G3 mice from all of these pedigrees. Here are all the mutations contained in all of the pedigrees, those in dark blue have multiple alleles of their own, and if you mouse over the top value, that of course is SNRNP40, shown in red. You see now there's really no ambiguity, you have a clear shift in the phenotypic performance of homozygotes compared to heterozygotes or reference allele. And that's what gives this spectacular p-value. I would say, in this case, you don't even really need further confirmation, but our standard procedure is to make a CRISPR-targeted allele in every case. Now the great question that we can address nowadays that we couldn't before is, how much damage have we done to the genome with our 64,000 mutations?
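Pooling pedigrees into a superpedigree amounts to grouping G3 observations by gene. The following is a minimal sketch under that assumption. The function `build_superpedigrees`, the tuple layout and the records are hypothetical, and the real software may treat identical and distinct alleles differently.

```python
from collections import defaultdict


def build_superpedigrees(observations):
    """Pool G3 observations for every gene hit in more than one pedigree.

    observations: iterable of (gene, pedigree, allele, genotype, value),
    where genotype is the mutant-allele count (0, 1, 2) and value is the
    measured phenotype.
    Returns {gene: {"pedigrees", "alleles", "data"}} for pooled genes only.
    """
    pedigrees = defaultdict(set)
    alleles = defaultdict(set)
    data = defaultdict(list)
    for gene, pedigree, allele, genotype, value in observations:
        pedigrees[gene].add(pedigree)
        alleles[gene].add(allele)
        data[gene].append((genotype, value))
    return {
        gene: {
            "pedigrees": sorted(pedigrees[gene]),
            "alleles": sorted(alleles[gene]),
            "data": data[gene],
        }
        for gene in data
        if len(pedigrees[gene]) > 1
    }


# Invented observations: one gene hit in three pedigrees by two alleles,
# another gene hit in a single pedigree only.
observations = [
    ("Snrnp40", "R0002", "allele-1", 2, 4.0),
    ("Snrnp40", "R0002", "allele-1", 1, 10.0),
    ("Snrnp40", "R0107", "allele-1", 2, 5.0),
    ("Snrnp40", "R0311", "allele-2", 2, 3.5),
    ("GeneX", "R0003", "allele-1", 2, 9.0),
]
superpedigrees = build_superpedigrees(observations)
```

The pooled `data` list can be scored with the same additive, recessive and dominant models as a single pedigree, only with many more mice behind each genotype class.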
If we just concentrate on one screen, the CD8 screening, then we see all the mutations in the pedigrees that were examined, but not all of these were even transmitted once to homozygosity. But mutations in 45.9% of genes were transmitted to homozygosity three times or more. That says nothing about how much damage those mutations caused, but we can look at that as well. If we're talking about probably null alleles, premature stop codons or critical splice junction errors, then nearly 6% of all genes have been affected and examined three times or more in the homozygous state. This would be a very conservative estimate of how much damage was done. If we consider probably null or probably damaging alleles, where probably damaging is established by a programme called PolyPhen-2, which guesses how much damage a particular amino acid change does, then we get to 24.91% of all genes mutated and examined three or more times in the homozygous state. This would be a very liberal estimate of how much damage was done to the genome. And the true value, we know, is bracketed by these two estimates. We don't know exactly where it lies between them, we have a sense that it's somewhere near the middle, and so we're inclined to say that about 16 to 17% of all of the protein-encoding genes have been mutated to phenovariance. We're only quite near the beginning of the process, of course, and we could draw a red line and say that these are the conservative and liberal estimates of damage. As time goes on, these two curves will converge with each other, though they'll never quite touch, and we'll always be somewhat in doubt about the exact amount of damage we've done to the genome. But what have we accomplished? In the old days it took us five years to positionally clone just one gene, now it takes about one hour. Then, one phenotype was solved in five years. Now, one or two phenotypes are solved every day in our lab. That means we proceed 3,000 times as quickly as before.
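The bracketing estimate and the speed-up can be checked with a few lines of arithmetic. This sketch treats "nearly 6%" as exactly 6.0 and a year as 365 days, both of which are assumptions, and the variable names are illustrative.

```python
# Bracketing estimate of genes mutated to phenovariance (percent of all genes).
conservative_pct = 6.0   # probably null alleles only ("nearly 6%")
liberal_pct = 24.91      # probably null or probably damaging alleles
midpoint_pct = (conservative_pct + liberal_pct) / 2

# Speed-up: one phenotype solved per five years then, one or two per day now.
days_per_phenotype_then = 5 * 365
speedup_low = days_per_phenotype_then * 1    # one phenotype solved per day
speedup_high = days_per_phenotype_then * 2   # two phenotypes solved per day

print(f"midpoint of the bracket: {midpoint_pct:.2f}%")
print(f"speed-up range: {speedup_low} to {speedup_high} times")
```

The plain midpoint comes out slightly below the 16 to 17% quoted, and the figure of 3,000 sits inside the computed speed-up range.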
We're limited now only by the rate at which mutations can be produced and screened. And this means that we can interrogate about 1,400 mutations every week. And many of them, about 1/2% of them or so, cause a phenotype in at least one of our screens. We can project that we'll destroy the majority of genes and analyse their phenotypic consequences within about three years, and then we'll know what most of the genes are that are needed for robust immunity, as we define it. This story was a very long one in terms of time, and I have especially to credit Alexander Poltorak for the positional cloning of the LPS Locus. I now have a much larger group than I did back in those days, and it's mainly in the present group the computational people, Chun Hui Bu, Stephen Lyon, Sara Hildebrand, David Pratt and Xiaowei Zhan, who deserve the credit for the automated positional cloning that I showed you. And we had help as well from Tao Wang and Yang Xie in the Center for Computational Biology. Thanks very much to all of you.

Guten Morgen. Heute möchte ich gerne über unsere Arbeit zur Immunität sprechen, und wie sich die Maus ziemlich weitreichend als Modellorganismus für Forward Genetics vervollkommnet hat. Meine Arbeit begann mit einer eher alten Prämisse. Bereits vor mehr als 100 Jahren, als man Mikroben erst als Erreger von Infektionskrankheiten erkannt hatte, fragte man sich bereits, wie sie den Wirt schädigten. Richard Pfeiffer, hier im Hintergrund hinter Robert Koch zu sehen, machte dazu eine interessante Beobachtung. um damit etwas intrinsisch Toxisches im Zusammenhang mit Gram-negativen Bakterien zu beschreiben. Er bemerkte, selbst durch Hitze abgetötete Organismen konnten, wenn sie Meerschweinchen injiziert wurden, in Form eines echten Infekts, Schock oder Tod verursachen, obgleich er keine lebensfähigen Organismen aus der Bauchhöhle gewinnen konnte, nachdem er die Mikroben verabreicht hatte. Richard Pfeiffer wurde wegen dieser Beobachtung sehr berühmt, und obgleich er heute als ziemlich zweifelhafter Charakter angesehen wird, wurde er zu seinen Lebzeiten 33 Mal für den Nobelpreis in Physiologie oder Medizin nominiert. Grund dafür war, damals wie heute, sterben hunderte, vielleicht tausende Menschen täglich an Endotoxin-induziertem Schock und so etwa könnte ein typischer Patient mit Gram-negativer Septikämie aussehen. Das ist ein Kind mit einer Meningokokken-Blutvergiftung. Dies ist eine schwere systemische Form einer Entzündung und man muss verstehen, dass damals der Ursprung alle Entzündungen im Dunkeln lag. Pfeiffer hatte hier vielleicht eine einzelne molekulare Gattung identifiziert, durch die eine Entzündung hervorgerufen wurde. Im Laufe der Jahre, die auf seinen ersten Bericht folgten, entdeckte man, dass Endotoxin, nun Lipopolysaccharide oder LPS genannt, in Zusammenhang mit allen Gram-negativen Bakterien stand. Es war eine strukturelle Komponente der äußeren Schicht der äußeren Membran. 
Sie hat einen Lipid- und Polysaccharid-Teil und der Lipid A Teil war der toxische Teil des Moleküls. Lipid A Moleküle wurden schließlich vollständig künstlich hergestellt und vermögen das meiste von dem was Pfeiffer lange vorher erkannt hatte. Wir können eine typische LPS-Struktur so zeichnen und aus unserer Arbeit und der Arbeit anderer ergab sich kürzlich, dass LPS Makrophagen aktiviert, um Zytokine zu erzeugen. Diese Zytokine, und unter diesen vor allem der Tumor-Nekrose-Faktor, binden die Rezeptoren an viele andere Zellen und lösen die Freisetzung terminaler Mediatoren von Vasodilatation aus und führen zum Schock und der Aggregation von Blutplättchen und all solchen Dingen, die zum klinischen Krankheitsbild gehören. Eine der Hauptfragen war: „Welches ist der Rezeptor für Lipopolysaccharide?“ Dies war ein schwieriges Problem, das direkten Versuchen, es biochemisch zu knacken, widerstand. Ab 1990 war durch die Arbeiten von Ulevitch und Wright bekannt, dass Antikörper gegen das Oberflächenantigen CD14, welches an Makrophagen vorhanden ist, die LPS-Reaktion stark hemmen würden. Doch CD14 ist ein GPI-verankertes Protein. Es verfügte über keinen cytoplasmischen Bereich und man vermutete, es könnte nur funktionieren, indem man es mit einer Art mysteriösem Co-Rezeptor, der über einen cytoplasmischen Bereich verfügte und das Signal über die Membran leitete, verbinden würde. Die Art dieser Rezeptoren war nicht bekannt, aber was TNF betraf, war zu der Zeit bekannt, dass NF kappaB ein wesentlicher Transkriptionsfaktor war, der zur Aktivierung des TNF-Gen führen würde. Sobald das TNF mRNA hergestellt und verarbeitet wurde, wurde es in einer geschlossenen Form abgesondert, heute würden wir wahrscheinlich sagen in P-bodies, und diese mussten aufgeschlossen werden, um die Translation von mRNA und die Herstellung und Verarbeitung des Proteins zu ermöglichen. Die größte Frage stand nach wie vor im Raum: Was war der wirkliche Membran-überspannende aktivierende Rezeptor für LPS? 
Ich hatte das Gefühl, wenn dies gelöst ist, würde es den Schlüssel zum Verständnis liefern, wie der Wirt Kenntnis von den Infektionen innerhalb der ersten Minuten nach Beimpfung mit Bakterien erhält. Kurz gesagt, wie funktioniert die angeborene Immun-Erfassung. Das Finden des LPS Rezeptors ging letztlich von zwei unzusammenhängenden Unterstämmen von Mäusen aus, die auf LPS aufgrund spontaner Mutationen nicht ansprachen. Der erste Stamm wurde 1965 von Heppner und Weiss identifiziert, es war der C3H/HeJ Maus-Stamm. Er erwies sich widerstandsfähig gegen jede Menge von LPS, die Tiere waren aber hoch empfindlich gegen Salmonella Typhimurium, wie sich später zeigte, und auch gegen LPS und andere Gram-negativen Mikroben. Das war die C57 schwarz/10ScCr Maus, und durch Kreuzen der beiden Stämme, die beide rezessive Probleme hatten, fand man heraus, auch die F1 hybriden Nachkommen waren gegen LPS widerstandsfähig. Es wurde vermutet, dies beiden Tiere hätten defekte Allele und in beiden Fällen gab es eng verwandte Kontroll-Stämme mit normaler LPS-Reaktion. Damit waren die Voraussetzungen für Positionsklonierung geschaffen, was damals LPS Locus genannt wurde. Positionsklonierung im klassischen Sinne wird heute nicht mehr angewendet und nicht alle hier wissen genau, was das ist; im Wesentlichen ist es aber Klonierung nach Phänotyp. Es ist möglich, nur den Phänotyp zu nehmen und die Lokalisation eines Gens zu finden, indem man zuerst einen kritischen Bereich etabliert - das ist die Phase der genetischen Entschlüsselung - dann klont man die gesamte DNA quer durch den kritischen Bereich im Bestreben eine physikalische Abbildung zu erlangen, und schließlich hat man alle Gen-Kandidaten im kritischen Bereich identifiziert und hält dann nach einer Mutation Ausschau, nämlich der ursächlichen Mutation. Normalerweise gibt es unter diesen Umständen nur eine. Diese Klonierung war sehr schwer zu erreichen. 
Back in 1993 we set out on genetic mapping, using just 11 markers on mouse chromosome 4, where we knew the LPS locus to be, and these covered most of the chromosome. We expanded the effort, and after 2,093 meioses we had a critical region between two new markers, B and 83.3, which we believed to span about 2.6 million base pairs, although today we know it was about twice that. We then turned to physical mapping, cloning a large number of bacterial artificial chromosomes (BACs) in an overlapping format to span this region, and we began to sequence them, starting in the middle of the interval and working outward in both directions. That was our life for about three years. We fragmented BACs, sequenced them, compared the results against the libraries of expressed sequence tags held at NCBI, and searched for genes. In all that time we found only a small number of pseudogenes, which of course do not come labelled as such; we could not rule out that they were authentic genes carrying some mutation that distinguished one strain from the other. In the summer of 1998 we became frightened, because we had covered more than 90% of the region and were running out of material to sequence, and we still had not found the gene. Of course, as always, it was in the last place we looked. Quite near the end of the interval we found a coding gene, an orphan receptor called Toll-like receptor 4. It was very exciting from the start. For one thing, the gene we had found in the critical region had leucine-rich repeats in its ectodomain, just as CD14 does, and we could imagine that LPS might pass from CD14 to TLR4, by proximity or by transfer, and then trigger the response. It was indeed a single-spanning transmembrane protein.
Second, on the cytoplasmic side there was strong homology between TLR4 and the interleukin-1 receptor. The interleukin-1 receptor was known to have inflammatory effects when triggered by a protein ligand, interleukin-1, and it could activate NF kappaB. We believed that this motif would probably drive activation of the TNF gene and of other inflammatory cytokine genes upon stimulation. Third, there was an observation that Jules Hoffmann and his colleagues had made two years earlier while looking at mutations that caused susceptibility to fungal infection in the fly. In a beautiful paper in Cell in 1996, Jules and his colleagues showed that flies with mutations in Toll, the namesake of this superfamily of proteins, are susceptible to fungal infection, in particular by Aspergillus fumigatus. Here you see a dead fly with hyphae growing out of the thorax, because the fly could not make an important antimicrobial peptide, drosomycin. This seemed to be a parallel to the case we were concerned with, in which a mutation made mice highly susceptible to Gram-negative bacterial infection. Of course, all these thoughts would have come to nothing had we not found a mutation, but we did. We found that in the C3H/HeJ strain there was a single base pair change that altered the cytoplasmic domain of TLR4, leaving it unable to signal. In the C57BL/10ScCr strain there was a deletion encompassing all exons of the Toll-like receptor 4 gene, a 74 kb deletion. We were eventually able to define its exact boundaries. These two defective alleles, absent from the control strains, fully convinced us that this was the gene we had been looking for.
There were still some questions as to whether TLR4 really was a receptor for LPS. Within a year, Kensuke Miyake and colleagues determined that the complex contained another subunit, called MD-2, seen here in purple, a basket-shaped protein that interacts strongly with TLR4, seen here in turquoise. TLR4 has all these leucine-rich repeats, which give the molecule its smoothly curved shape, and you can see here that the complex is dimeric. As Jie-Oh Lee and colleagues showed in 2009, when they finally crystallized the complex, LPS fits into the pocket of MD-2 but also makes some contact with the backbone of TLR4. When this happens, it produces a conformational change that is sensed across the membrane. This is where all the inflammatory effects of lipopolysaccharide begin, with this one molecule. The next question we wanted to address concerned TLR4 signalling and how it proceeds. We were fascinated by the forward genetic approach at that time, but there were no further spontaneous mutations in mice that could tell us anything about how LPS signalling works. We therefore decided that we had to create new phenotypes using a mutagen, and we began doing so in 2000. Over the next 11 years or so we identified many phenotypes related to LPS signalling, and we also turned to other aspects of immunity, monitoring them with several screens at once. At that time ENU mutagenesis was a blind process. ENU, or ethylnitrosourea, is the only mutagen, and the only chemical one, that can be used really effectively in mice. You give it to male mice, it mutagenizes the spermatogonia, and mutations are transmitted to the sons of that mouse, the G1 generation.
A single G1 male founds the pedigree, and the G1 is bred to C57BL/6 mice to obtain daughters. He is then backcrossed to his own daughters, which brings some of the mutations to homozygosity in each G3 offspring born of that cross. We used to make very small pedigrees, because we did not want to find ourselves screening the same mutations over and over again; our view on that has changed fundamentally, as you will see in a moment. We did not know it then, but today we know that the average sperm of a G0 animal carries 60 to 70 mutations that change coding sense. From many years of experiments we know that when you see a phenotype, it always comes from a coding change rather than from an intergenic change of some kind. As I said, this was a blind process. The only way we knew that ENU was working was by the appearance of phenotypes in our screens. It was encouraging to see a lot of unusual mice, and of course we tried to track down the mutations behind everything we saw. Over the course of 11 years we found 34 mutations in 20 genes, which told us a good deal about how TLR signalling works. We had mutations in the Toll-like receptors themselves, of which there are 12 in mice. I am showing only some of them here. We also found mutations in co-receptors, in addition to the ones I have already mentioned, and mutations in chaperones such as UNC93B, which take the TLRs to where they are needed. Certain channel proteins are also required for signalling by TLRs from the endosome. Then there are adaptor proteins that are recruited to the receptor to signal onward. There is a layer of kinases that become activated, and then in turn a ubiquitin ligase, TRAF6, is activated.
It ubiquitinates itself and other proteins, and TAB2 brings all of this together so that signalling proceeds as it should. Finally there is another layer of kinases, whose activity ultimately leads to phosphorylation and degradation of I kappaB and to NF kappaB translocation, and there are further proteins needed to process TNF and release it from the cell. We took TNF as the endpoint of our screen. At first, finding the mutations was very difficult, just as difficult as before. But it became easier once the mouse genome had been sequenced and annotated: you no longer had to build contigs, and you knew what all the genes were. It was no longer terra incognita, as it had been. It also became faster as better sequencing technologies came online, first capillary sequencing and then massively parallel sequencing platforms. But by 2011 it was clear that the limiting step in mutation finding was genetic mapping. The usual paradigm of outcrossing a mutant stock to a marker strain, mapping the mutation to a critical region and then searching there for mutations was slowing us down. We could also detect far more phenotypes than we could solve. Sometimes it took a year to track down one mutation. A new approach was needed. I began to think about what the perfect approach might be, and I thought of something like Google Glass. I wished for a pair of magic glasses through which I could look at a family of mice like this one and, even if the mutation were not as obvious as the one I have shown, tell which mice are affected by it. And not only that: I could also say in an instant, this is a mutation in SOX10.
These are the coordinates of the mutation, the amino acid change, the motif, the human homologue. If there were structural data, even those would be displayed. That is now a reality, and we are able to find mutations in real time. I will tell you exactly how it works. First we produce a G1 mouse, just as we always did, but then we perform whole-exome sequencing on every G1 mouse in advance, to find every mutation it could transmit into the pedigree. We have been at this for a while, and we have found that the mean number of mutations that change coding sense is 63 per G1 mouse, and the mode is 70. There are ways to increase this, but we chose not to, because we ended up with too much G3 lethality. If there are more than 30 mutations we proceed with that pedigree; otherwise it is set aside. Proceeding means that we order an AmpliSeq panel, a collection of PCR primers designed so that they do not interfere with one another, which target each of the mutation sites so that we can genotype them. All the G2 and G3 mice are then genotyped at the mutation sites that we created with ENU. Only then are the mice released for phenotypic screening. In this case that includes visual inspection, weighing the mouse and a glucose tolerance test. The mice are then tested in a series of assays of innate immune function (using macrophages); we immunize them and use flow cytometry to assess adaptive immune development and function. After that we give a DSS challenge, we infect the mice with cytomegalovirus, and then they are passed on for further testing of neurobehavioural function.
As of 28 June 2015 we had created nearly 64,000 mutations in this way, and it is no longer a blind process. We know what each mutation is, and we know which genes they affect. These mutations fell in 17,204 genes, or more than two thirds of all genes; I believe the mouse has 24,981 genes in total. That is an enormous number of mutations. Were they all present in a single G1 mouse, even in the heterozygous state, they would almost certainly be lethal. They are of course spread across more than 1,000 pedigrees and were carried by a total of 26,455 G3 mice. We can calculate that we have mutated 17% of all genes to phenovariance, a point I will return to, and tested them three times or more in the homozygous state in at least one of our screens. Altogether we ran 135 screens, to which most of the mice were subjected. Looking at adaptive immune function alone, we hit 60 genes already known to be required for immune development or function, and we detected them by phenotype. Along with these, however, we discovered hundreds of genes that were not previously known to have anything to do with immunity. All of this indicates that a large part of our genome is needed for immune defence, as I had suspected. But we are now in a position to make perhaps more precise estimates of how large that fraction is. To sift through these data you need software that lets the viewer examine all the mutations. We wrote a program called Linkage Analyser and a browsing program called Linkage Explorer that make this possible. You can focus on any particular gene, in any screen you like, on a subset of mice, or on the trivial name of a phenotype.
You can restrict the search to particular kinds of mutation, and you can choose to look only at large pedigrees if you wish. You can set the number of observations in the homozygous state, and the viewer also chooses the p-value for the association between the phenotype and the genotype of interest. That is done by changing this value here. And there are other ways of constraining the quality of the observations. To give you an example, suppose we are interested only in assays related to CD8 cells, their number or their activation state. You enter CD8 as the screen name; we insist on having seen the mutation three times or more in the homozygous state; we insist on a fairly stringent p-value of association, 0.0005; we also set these other items, which I will not go through; and we click "submit". You are immediately returned a list of genes, in this case 102 alleles of 100 genes from 70 different pedigrees. You can see from this alone that we are not always dealing with a single mutation; sometimes two linked mutations fall under one linkage peak, as you can well imagine. Usually, though, a single mutation is involved. In the first column you see the names of the genes, and those of you who are immunologists will be familiar with some of them. Themis is known for its role in positive selection of T cells, and it shows up in a screen for CD8 cells or for the CD4:CD8 ratio. Some are unfamiliar. I doubt that any of you is aware that SNRNP40, a component of the U5 spliceosome, plays a selective role in immunity, but it does.
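The query described here (screen name, a minimum number of homozygous observations, a p-value cutoff) can be sketched in a few lines of Python. This is only an illustration of the filtering logic, not the Linkage Explorer code: the record type `LinkageHit`, the function `filter_hits`, the field names and all example rows are hypothetical, and the gene names are placeholders rather than real results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkageHit:
    """One candidate mutation/phenotype association (hypothetical record layout)."""
    gene: str
    pedigree: str
    screen: str
    hom_observations: int  # mice observed homozygous for the mutation
    p_value: float         # p-value of association between genotype and phenotype


def filter_hits(hits, screen_term, min_hom=3, max_p=0.0005):
    """Keep hits whose screen name contains screen_term, that were seen
    at least min_hom times in the homozygous state, and whose p-value
    is at or below max_p."""
    term = screen_term.lower()
    return [
        h for h in hits
        if term in h.screen.lower()
        and h.hom_observations >= min_hom
        and h.p_value <= max_p
    ]


# Invented example rows, for illustration only.
hits = [
    LinkageHit("gene_a", "R0001", "FACS CD8+ T cells", 5, 1e-6),
    LinkageHit("gene_b", "R0002", "FACS CD8+ T cells", 2, 1e-7),   # too few homozygotes
    LinkageHit("gene_c", "R0003", "FACS CD4:CD8 ratio", 4, 0.01),  # p-value too weak
    LinkageHit("gene_d", "R0004", "TNF response to LPS", 6, 1e-9), # different screen
]

selected = filter_hits(hits, "CD8")
print([h.gene for h in selected])
```

The defaults mirror the values quoted in the talk (three or more homozygous observations, p-value of 0.0005); everything else is an assumption.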
In the next columns you see the coordinates of the mutation, predictions of what the mutation does, and the screens in which it scored. Further over, you see the number of mice observed that were homozygous for the reference allele, heterozygous, or homozygous for the mutation. In these three columns you see the linkage score under an additive, a recessive or a dominant model of inheritance. If you want to look at the linkage plot, you click on this number and you see a Manhattan plot. This is a log-scaled plot of the probability of linkage, and you can hover the mouse over any of the mutations you see here; these are all the mutations in the pedigree, and only one of them shows strong linkage, above the Bonferroni correction line. When you hover over it, you see that it is SNRNP40. You may not know what SNRNP40 is, so you can click on it and get some precomputed information about the gene. You see that our mutation produces a shorter version of the protein; in the real program this would be interactive, and you could drag the mouse over it to see the domain structure. You can click on the gene model and see that the mutation is near exon 5 and is predicted to cause skipping of exon 5, which yields an in-frame product. There is a great deal of further information, all of it precomputed. If you want to see the authentic data, the raw data, you can click on the peak as well, and here you see the phenotypic performance of the homozygous mutants and of the mice homozygous for the reference allele. Now, there is some overlap between the heterozygotes and the homozygotes.
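To make the three inheritance models and the Bonferroni line concrete, here is a minimal Python sketch. The talk does not say which statistical test produces the linkage scores, so the sketch covers only two generic pieces: how a genotype is conventionally coded under each model, and where a Bonferroni-corrected threshold falls on a -log10(p) axis such as a Manhattan plot. The function names, the significance level of 0.05 and the use of 63 tests (the mean number of coding mutations per G1 mouse quoted earlier) are assumptions for illustration.

```python
import math


def encode(mutant_copies, model):
    """Code a genotype (0 = homozygous reference, 1 = heterozygous,
    2 = homozygous mutant) as a predictor under an inheritance model."""
    if mutant_copies not in (0, 1, 2):
        raise ValueError("mutant_copies must be 0, 1 or 2")
    if model == "additive":
        return float(mutant_copies)                 # effect scales with copy number
    if model == "recessive":
        return 1.0 if mutant_copies == 2 else 0.0   # only homozygous mutants differ
    if model == "dominant":
        return 1.0 if mutant_copies >= 1 else 0.0   # one copy is enough
    raise ValueError(f"unknown model: {model}")


def bonferroni_line(alpha, n_tests):
    """Height of the Bonferroni threshold on a -log10(p) axis."""
    return -math.log10(alpha / n_tests)


def exceeds_line(p_value, alpha, n_tests):
    """True if a mutation's -log10(p) lies above the Bonferroni line."""
    return -math.log10(p_value) > bonferroni_line(alpha, n_tests)


# Assumed values: alpha = 0.05, 63 mutations tested in one pedigree.
line = bonferroni_line(0.05, 63)
print(round(line, 2))
print(exceeds_line(1e-6, 0.05, 63), exceeds_line(0.01, 0.05, 63))
```

With these assumed numbers the line sits a little above 3 on the -log10 scale, so a mutation with p = 0.01 stays below it while one with p = 1e-6 clears it.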
With a qualitative approach, which is what we always used in the past, trying to map this would be a terrible undertaking, and you might not even fully trust these data, because after all we have only a limited number of mice here. You might think this is some kind of fluke. Note, however, that as we gradually approach saturation we hit the same genes again and again, and the computer recognizes this and flags "superpedigrees" whenever it encounters them, either with identical alleles transmitted from the same G0 or with different alleles that hit the same gene. They are combined into a single large synthetic pedigree. Eventually all mutations will be integrated in this way. At present more than half of all genes fall into superpedigrees, and the numbers are rising fast. With multiple alleles, confidence in a link between phenotype and genotype grows. The same browser program is used to examine superpedigrees. In the case of SNRNP40, which I have looked up for you here, we have a total of 16 pedigrees, 16 G1 mice, but only four different alleles: the one I showed you and three others. The one I showed you is, by our assessment, probably the null allele. And again you have the same additive, recessive and dominant models of inheritance. If we click on one of these, we see something quite different. We have now examined 376 G3 mice from all of these pedigrees; here are all the mutations contained in all of these pedigrees. The dark blue ones have several distinct alleles, and when you hover over the peak it is of course SNRNP40, which you see here in red.
There is really no ambiguity here: there is a clear shift in the phenotypic performance of the homozygotes compared with the heterozygotes or the reference-allele mice. That is what gives this spectacular p-value. In this case I would say no further confirmation is needed, but our standard practice is to generate a targeted allele in every case. The big question we can ask today, and could not ask before, is: how much damage have we done to the genome with our 64,000 mutations? If we concentrate on just one screen, the CD8 screen, we see that across all the pedigrees examined at least 67.76% of all genes were hit, but not all of those were brought to homozygosity. Mutations in 45.9% of genes, however, were brought to homozygosity three times or more. That says nothing about how much damage these mutations caused, but we have looked at that as well. If we consider only null alleles, such as premature stop codons or critical splice-site errors, then nearly 6% of all genes were affected and examined three times or more in the homozygous state. That would be a very conservative estimate of how much damage was done. If we consider null alleles or probably damaging alleles, where "probably damaging" is assigned by a program called PolyPhen-2, which predicts how damaging a given amino acid change is likely to be, then 24.91% of all genes were mutated and examined three times or more in the homozygous state. That would be a very generous estimate of the damage done to the genome. The true value, we know, is bracketed by these two estimates.
We do not know exactly where between the two it lies; we have a sense that it is somewhere near the middle, which is why we tend to say that about 16-17% of all protein-coding genes have been mutated to phenovariance. We are of course only at the beginning of this process, and we could draw one curve for the conservative estimate of damage and one for the generous estimate. Over time these two curves will converge, but they will never actually touch, and we will always be somewhat in doubt about the exact extent of the damage we have done to the genome. But what have we achieved? It used to take us five years to positionally clone just one gene; now it takes about an hour. Back then one phenotype was solved in five years; now one or two phenotypes are solved in our laboratory every day. That means we are moving about 3,000 times faster than before. We are currently limited only by the rate at which mutations can be generated and screened. That means we can interrogate about 1,400 mutations every week. And a good number of them, about 0.5%, cause a phenotype in at least one of our screens. We can project that we will destroy the majority of genes and analyse the phenotypic consequences within about three years, and then we will know most of the genes that are required for robust immunity as we have defined it. This has been a very long story, and I must especially thank Alexander Poltorak for the positional cloning of the LPS locus. I have a much larger group now than I did then, and in the present group it is especially the computational people, Chun Hui Bu, Stephen Lyon, Sara Hildebrand, David Pratt and Xiaowei Zhan, who deserve credit for the automated positional cloning that I have shown you.
We have also been supported by Tao Wang and Yang Xie in the Center for Computational Biology. Thank you all very much.

Abstract

Beginning with an exception to normal function caused by a genetic aberration, one may hope to find at least one protein with non-redundant function in a certain biological process. This approach permitted the identification of the receptor for bacterial endotoxin (lipopolysaccharide; LPS) in mammals, which revealed a conserved system for the activation of innate immune responses, represented in plants, insects, mammals, and other taxa. But to be comprehensive, it is necessary to deliberately create mutations at random. This is best done with the efficient chemical mutagen N-ethyl-N-nitrosourea. The general strategy is known as “forward genetics,” and it offers a chance of finding all the key constituents of immunity, as queried through screens for aberrant phenotype. Historically, forward genetics was slow and expensive in mammals. We have developed techniques that make it several thousand-fold faster. We use massively parallel sequencing platforms to identify all ENU-induced mutations that cause coding change—and determine their zygosity—in all mutant mice prior to phenotypic screening. Statistical computation can then be used to identify mutations that cause phenotype as soon as phenotypic data are uploaded. At present writing, we have created more than 60,000 coding changes in the mouse genome, altering nearly 69% of all protein-encoding genes. We estimate that we have mutated about 18% of all genes to phenovariance and examined the mutant alleles three or more times in the homozygous state. Alongside a large number of genes that were previously known to support immunity, we have identified many genes with novel immune functions. We project the destruction of more than half of all genes within the next two years, and the development of an increasingly comprehensive understanding of sensing, signaling, and effector mechanisms that protect us from infectious diseases.
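Two of the headline figures can be checked with simple arithmetic. The sketch below uses only numbers quoted in the talk (five years per phenotype before, one or two phenotypes per day now, 17,204 mutated genes out of 24,981); the helper names and the 365.25-day year are assumptions made for the illustration.

```python
DAYS_PER_YEAR = 365.25  # assumed calendar-year length


def fold_speedup(old_years_per_phenotype, new_phenotypes_per_day):
    """Ratio of the old time per solved phenotype to the new one."""
    old_days = old_years_per_phenotype * DAYS_PER_YEAR
    new_days = 1.0 / new_phenotypes_per_day
    return old_days / new_days


def percent(part, whole):
    """Share of `whole` represented by `part`, in percent."""
    return 100.0 * part / whole


low = fold_speedup(5, 1)            # one phenotype solved per day
high = fold_speedup(5, 2)           # two phenotypes solved per day
genes_hit = percent(17204, 24981)   # genes carrying at least one coding change

print(f"speed-up between {low:.0f}x and {high:.0f}x")
print(f"genes with a coding change: {genes_hit:.1f}%")
```

Under these assumptions the speed-up lands between roughly 1,800-fold and 3,700-fold, which is consistent with "several thousand-fold" and with the figure of about 3,000 given in the talk, and the gene fraction agrees with the "nearly 69%" stated in the abstract.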