Confirmation Number: 334093 Event Started: 3/1/2005

(SACGHS meeting March 1, 2005)

I think we're getting close to being able to have the slides ready for the presentation. So, can I call the committee to order? Thank everyone for being here on day two. Webcast, are we okay? We'll have to go ahead and -- we'll go ahead without the webcast for the moment and you'll catch up with us as we go. That's okay?

Let me thank everybody for a very intense day yesterday. Very hard work. There are a couple of things we want to let you know that are germane. On the discussion of coverage and reimbursement, there has been some subcommittee work last night and this morning, and at lunchtime we will have a working lunch and we will present to you a schematic for, we hope, an organized and very precise discussion that will get us to some conclusion at the end of the lunch session. It will take everybody really paying attention and working hard to get there, but we believe that we can accomplish what we need to accomplish during the lunch hour. To facilitate a working lunch, you have at your desk the lunch menu. So you need to fill that out and we'll pick them up at the break. By 10:00 we have to have all the food ordered so you can get your food and come back here and work. So, this is a critical small ingredient we want you to attend to.

With that, let me also let you know that at the break, by the way, we were telling people to go away from the food cart and it turns out that we don't need to tell you to go. You're not supposed to bring bags with you, but that's actually available for everybody at the little food area out there. So it's okay. Our people in the audience, you can get coffee out there and so forth. And we're not going to make you leave if you don't bring your lunch pail.

Today, from 8:30 to 2:45, we'll talk about large population studies, the opportunities and challenges. The topics are the human genome and the challenge of translating that wealth of information into improved health, the environmental and genetic components of common complex diseases, and the interplay of genetic and environmental factors. Large population studies have been proposed as an important and necessary way to translate the human genome sequence into useful clinical and public health strategies. While many different approaches can be taken, all intend to build on the information provided by the sequencing of the genome. These studies are complex and they raise a number of scientific, logistical, ethical, legal and social concerns. We decided during our priority-setting process that it was important to understand the opportunities and the challenges posed by these large population studies and that these questions required in-depth study. NIH has also asked us to provide feedback on the need for such a study. As such, the Large Population Studies Task Force was appointed in June of '04 to begin work on this issue. I'd like to thank the Task Force members for organizing this session. Hunt, Ed McCabe, Ellen Fox, and all the members of the committee, we want to thank you. We want to thank the staff, particularly Amanda, as well as Holly Campbell-Rosen, for their work in developing the backgrounder that we've been supplied. By the end of that session we hope to have gained a deeper understanding of what the large population studies are and why they're under consideration at this time. The goals of the first three presentations are to inform us about different approaches to large population studies and provide us with a broad introduction to this topic. We're very pleased that David Goldstein will discuss the conceptual basis, Gilbert Omenn will present the public health perspective, and Teri Manolio will present an overview of national and international large population studies.
I would urge you to turn to tab one of your briefing book and you'll see the biographies of each of these three distinguished people, and so I'm not going to go through those right now. So, to begin, let me just thank David for coming, and we're very interested in the next half hour to hear you talk to us about the conceptual basis for large population studies of variation and common disease. David, thank you.

You're welcome.

By the way, I think what we'll do, depending on how long the presentation takes, if they stick to their half-hour allotment, is this: if you have an urgent, burning question you want to ask the individual speaker, we can probably take one or two of those right after, but then we'll also try to query the panel later.

Thanks very much for the invitation to come here and talk about the conceptual basis for large population studies. What I'd like to do is, in half an hour, try to cover two things. One is why we might want to undertake such an enterprise. And secondly, how we might go about it in terms of what the technical requirements would be. I'm going to kind of balance this back and forth between those two things. Kicking off, why would we want to set up a powerful framework for studying the genetics of common diseases? The basic motivations are indicated there. We would like to be able to predict risk, but importantly, and I'm going to come back to this a few times, we would like to not only be able to predict risk but do something about it. It's not good enough to predict risk. This is not for insurance companies. It's not good enough to predict. We want to be able to intervene. That's something that's going to come up, I think, in a few places.

The other motivation is not about prediction and intervention but about identifying genes and pathways that might help us in the drug development process. And finally, the aim would be to identify genetic determinants of treatment response. That's traditionally thought of in terms of pharmacogenomics, which I'll talk about: the genetic determinants of which drugs are safest and work best. You can also think about the genetic determinants of other treatment responses, like when there are options for surgical procedures, nonsurgical procedures and so on. So, in general, the genetics of treatment response. So the first thing that we need to be clear about is what kind of genetic variation we're talking about. The first thing that needs to be said is we're not talking about the kind of genetic differences indicated on this slide, where you have a mutation that's segregating in a family that causes a disease. So, in that simple case, there is a one-to-one correspondence, often, between a genetic difference and the disease that we're interested in, and that's actually quite straightforward to work with genetically, and the community is now extremely good at finding those kinds of causes of disease. Now, unfortunately, common diseases aren't like that. The genetic contributors to common disease don't have that kind of one-to-one correspondence. The kind of genetic variation that we're talking about here is illustrated with this cartoon, so the idea is that our genome is a big place. There are many places in the genome where individuals tend to differ from one to the next. In fact, there are now estimated to be more than 10 million common polymorphisms, that is, sites where the rarer form has a frequency of more than 1%. More than 10 million of those different places in the human genome, and if you allow for rarer variants there are many more than that.
And these variants, the different forms at many of these sites, we know, often have very subtle effects, so they change physiology in some subtle way. That's very difficult to measure. And then these variants influence the phenotypes that we're interested in, the kinds of diseases people get, in some kind of complicated interaction, both with other genetic differences in our genetic makeup and with the environment. That's what really creates the challenge. A large number of variable sites in our genetic makeup. They interact with one another and the environment and ultimately they have some kind of influence on what we're interested in looking at, and that is the health of the individual. And I really just want to walk you through that to emphasize that, at the end of the day, what we're talking about is the probability of certain conditions being influenced by these variants. The variants don't determine the condition. For that reason, I think it really isn't appropriate to talk about genes for diseases. We're not doing the same thing as we did with Mendelian disease. We're not finding the gene for diabetes and the gene for asthma and so on. We're understanding how genetic differences influence these conditions. It's a different kind of thing.

Okay. So that's what our aim is: to understand how all those genetic differences we have influence our health. That's the aim. And it looks like it's going to be difficult. There is now really no question about that. But what I'll now turn to is some of the technical requirements that we're going to need in order to be able to make progress. I'll spend the most time talking about the requirements to efficiently represent genetic variation. Two reasons for that. One, I was explicitly asked to do that. The other reason is that's where we're farthest along. When you hear people talk about the genetics of common disease, nine times out of ten people talk about how good we're getting at sequencing and genotyping and how much we know about genetic variation. We've gotten good at that side of it. That's the easiest side. The difficult side is where we haven't made much progress, which is knowing exactly how to measure in patients what we need to measure, and knowing how to relate that to the genetic variation. That's the harder bit. I'll spend more time talking about what we're better at, and telegraph what we're not good at and ideas of how we might improve on that.

First, kicking off, the genome is a big place with a lot of genetic variation. As good as we are at sequencing and genotyping, we can't get very, very large numbers of individuals that suffer from a certain condition, and individuals that don't, and exhaustively compare them genetically. We're not capable of doing that. We might be at some point; that kind of capacity has been promised to be around the corner but it never quite arrives. What people think about are more efficient ways to make the comparisons, more economical ways. And something that's getting a lot of attention right now is called -- tagging, which I'll spend a few minutes talking about. The basic idea here is to find a framework for efficiently representing the genetic variation, either in a region of the genetic makeup you're interested in or in the entire genome. I don't know how well you can see this, but what's shown here is a cartoon representing a stretch of genome. You can consider that a gene. And indicated are each of the sites in that stretch of genome that differ, where there's a polymorphism. 12 sites indicated there. Just pointing here: this group, those are three, four polymorphisms that are indicated in the gene, and you see the first row is one chromosome you might sample from the population. In that chromosome, that first site has a -- and then the fifth chromosome you might sample has the -- and then the next site, which has -- and so on. The point here is members of the green group are all associated with one another. In this case, if you know the allele present at the first site, it tells you the allele that's present at the second site in the green group, and the third and the fourth. Okay? Those associations among variable sites in our genome are due to a whole raft of population genetic forces which I won't go into. But they do exist. There are these associations. Usually not perfect. I'll say something about that in a minute. They do exist.
Because of that, if you were interested in looking to see if any of those sites is associated with a trait you were interested in, you wouldn't have to directly assay all of them. You could assay one member of the green group and it will tell you about the others. You could assay one member of the pink group, or whatever color, and it will tell you about the others, and so on. They're -- another name is linkage disequilibrium mapping. These associations do exist. If you understand the nature of these associations, you know how to select out a subset of the variable sites that tell you about the others. And in this particular case, obviously, the subset that you can use is one member of each color group. There is no loss of information at all because each member is telling you about the others. So, if one of the ones that you didn't assay was influencing the phenotype, you would see it through the one you did look at. That is, at its conceptual core, the entirety of the mapping, or linkage disequilibrium mapping, and it is, in fact, the primary motivation, I think, as far as I'm concerned and most people are concerned, the primary motivation for the HapMap project: an effort to characterize these patterns of association among variable sites so that you can select out a subset that efficiently represents the variation in our genetic makeup. So that is an extremely important tool currently, because we can't look at variation exhaustively, and that's the conceptual core. In fact, because we're doing biology here, these associations are never perfect. You have to use a bunch of statistics to go through the step of choosing one member of each color group. That's a technical detail. This is the basic aim.
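
As a minimal sketch of the color-group idea described above, and only for the idealized cartoon case of perfect association, the following Python groups sites whose allele patterns across sampled chromosomes are identical (or exact mirror images, since allele labels are arbitrary) and keeps one site per group. The function name `pick_tags` and the toy haplotypes are assumptions made for illustration, not the speaker's method; as he notes, real data need statistical selection because the associations are never perfect.

```python
from collections import defaultdict

def pick_tags(haplotypes):
    """Return {tag_site: [sites it represents]} for perfectly associated sites.

    haplotypes: list of chromosomes, each a list of 0/1 alleles per site.
    Hypothetical helper: it only handles the idealized, perfect-association case.
    """
    n_sites = len(haplotypes[0])
    groups = defaultdict(list)
    for site in range(n_sites):
        # Allele pattern of this site across all sampled chromosomes.
        column = tuple(h[site] for h in haplotypes)
        flipped = tuple(1 - a for a in column)
        # Treat a pattern and its mirror image as the same group.
        key = min(column, flipped)
        groups[key].append(site)
    # The first site in each group serves as its tag.
    return {members[0]: members for members in groups.values()}

# Toy data (made up): 4 chromosomes, 6 variable sites.
haplotypes = [
    [0, 0, 1, 0, 1, 1],
    [1, 1, 0, 0, 0, 1],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 1, 1, 0],
]
tags = pick_tags(haplotypes)
for tag, members in tags.items():
    print(f"assay site {tag} to learn about sites {members}")
```

In this toy example two assays stand in for six sites, which is the "no loss of information" situation of the cartoon.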

What I'd now like to do is take a couple of minutes addressing the issue of how well we expect this to work. So, can we feel comfortable that we really do have a good framework in hand for efficiently representing variation? And I'm going to try to give a "yes or no" answer to that question. I'll illustrate that with some work that we did on a data set that we collected together with GlaxoSmithKline, where we looked at the patterns of association in 55 genes that encode metabolizing enzymes. And there were a bunch of these sites that were assayed in a number of individuals, both of European ancestry and Japanese ancestry. That's the data set. This indicates the way this sort of analysis is carried out. This is a stretch of sequence indicated, and there are genes indicated, and there are all the polymorphisms that were looked at, indicated as thin lines, and there's about 60 plus of them spread through four genes that are contiguous. And what you do is a statistical version of selecting one member of each color group, and you identify nine out of those 60-plus polymorphisms that you assess are able to represent the other variation that's there. And then the question you want to answer is how well that is really going to work in representing variation that, A, you don't know about and, B, is in a somewhat different population from the one that you looked at originally. That's important. You have to remember the way this works, for example, the way we're all going to use the HapMap data. The HapMap looks at a number of individuals, for example, from the repository, and selects these special tagging SNPs, and goes and applies them in a different group, for example, in our case, patients with epilepsy and so on. You have to ask the question "how well do they represent variants that you may not know about initially, and in a somewhat different population?" So you need an answer. In this case we find the nine to represent all these others.
What you want to know about is how well they represent the ones you don't know about in a somewhat different population.

So you think of some statistical ways to do that, which I won't talk about, and evaluate how well they do. We went through a few of those exercises, which, as I said, I'll skip. What I'll do instead is show a direct evaluation of whether or not they work. And that is taking these SNPs you identify out to a brand new population sample and assessing whether or not they predict variable sites that we know are functional. So there are, in these particular genes, lots of sites that we know change the activities of the enzymes, for example. Those are exactly the kind of differences we're looking for, and we can ask "do these tagging SNPs work?" And this shows the result. Shown here is the minor allele frequency of the SNPs we're trying to predict, the ones we're proposing not to type. And here is a measure of how well we can predict them. It doesn't really matter how that measure works. What does matter is that if you're up here at the top in this performance measure, that is exactly the situation of the cartoon. You can show this formally. If you're up here at one in this performance measure, it's exactly like taking one member of each color group that exactly predicts the others. No loss of power whatsoever. If you're in this range you do very well. If you're down here you do very badly. Which is to say, if there was a SNP down here you didn't type and it was influencing the condition, you wouldn't see it. Okay? So how do you do? Here's the minor allele frequency of what you're trying to predict. Once you are above 5% you do great. It's fair to say, the short nontechnical version is that, out here, if any of this stuff was influencing the phenotype and we only typed the tagging SNPs, not these things directly, we would still see it. That's really encouraging. This is the discouraging note, with a small sample so far: these rare things may not be predicted at all. Sometimes you predict them. And sometimes you don't.
We've gone on and done a bit more of that kind of thing and our impression is that this is a fairly general outcome. That in this framework you just can't reliably pick up the variants that are rare in the population where rare is something between 3 and 5% as a cut-off. More work needs to be done but that's how it looks to us at the moment.
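
The speaker does not say which performance measure was plotted. As an illustration only, one widely used measure of association between two biallelic sites is the squared correlation r², which equals 1 when one site exactly predicts the other. The sketch below, with a hypothetical function name `r_squared` and made-up haplotypes, shows why a rare untyped variant can be poorly captured by a common tag even when the two are not independent.

```python
def r_squared(site_a, site_b):
    """Squared correlation between two biallelic sites.

    site_a, site_b: lists of 0/1 alleles, one entry per sampled chromosome.
    Uses r^2 = D^2 / (pA(1-pA) pB(1-pB)) with D = pAB - pA*pB.
    Assumed measure for illustration; not necessarily the one on the slide.
    """
    n = len(site_a)
    p_a = sum(site_a) / n
    p_b = sum(site_b) / n
    # Frequency of chromosomes carrying allele 1 at both sites.
    p_ab = sum(1 for a, b in zip(site_a, site_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic in the sample")
    return d * d / denom

# Toy data (made up): 8 chromosomes.
tag            = [0, 1, 1, 0, 1, 0, 0, 1]   # common tag SNP, frequency 0.5
common_untyped = [0, 1, 1, 0, 1, 0, 0, 1]   # same pattern as the tag
rare_untyped   = [0, 1, 0, 0, 0, 0, 0, 0]   # occurs only on one tag-carrying chromosome

r2_common = r_squared(tag, common_untyped)
r2_rare = r_squared(tag, rare_untyped)
print(r2_common, r2_rare)
```

Here the common untyped site sits "at one" on such a scale, while the rare site scores far lower, in keeping with the pattern described for variants below the 3 to 5% range.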

What's the conclusion from that? What I'd like to emphasize is that we're talking about a truly dramatic economy. In the 55 genes that we looked at, we estimated that there are 4,000 common polymorphisms. What we show is that about 200 of these specially selected SNPs can represent the other 4,000. Now, you can select these in different ways and some people would use methods that would result in a number slightly larger than 200. But it is some really dramatic economy that you can achieve this way. And I would assert that it is now not controversial whether or not you can represent common variation in this framework. It's still discussed a little bit in the literature but I think that debate really now has gone out of date. I think it should be viewed as demonstrated that this can efficiently represent common variation. I should say that I have no association with the HapMap project, so I don't feel any need to support the necessity of the HapMap project. So it's just a technical evaluation. That framework has been demonstrated to work well in representing common variation. So I think that's encouraging. Of course, these data we have are by no means the only data that make this case.

So common variation can be efficiently represented. We should view that as noncontroversial. It seems unlikely rare variation can be efficiently represented. So for that we don't have an economical approach. If we want to also identify the rare variants that influence both common diseases and responses to treatment, we're going to have to do more difficult and expensive things. And we should, because without a doubt rare variants will also contribute. I'm not going to go into that whole debate. But I think it's quite clear to most people that both common variants and rare variants contribute to common diseases. The relative importance of those two things we don't know, but they'll both make some contribution. We have a good, very economical method for representing common variation. We don't for representing rarer variation. I don't expect tagging will serve the purpose, but if you find more clever methods to do it, perhaps. We probably need to think about alternatives. So, I think in terms of representing common variation, the genetic side, we really are now in pretty good shape. Even though we've got a challenge for rarer variation, it's terrific that we can now start asking questions about those 10 million genetic differences among us all. That's terrific. That's a real tool. It will, no doubt, lead to advances. But what is much, much more complicated is deciding how to look at the individuals that are being studied genetically, both individuals with diseases and individuals that don't have diseases. For example, if you're thinking about prospective studies, and many have been making arguments for the advantages of prospective studies, that is where you enroll people, random samples from the population, for example, in one design, and monitor them over time, and as they become affected by different common diseases you can then carry out genetic studies, knowing about the background of the individual because they've been in your study for a while.
So, as we move to carry out those kinds of studies, which do have a lot of advantages, we need to think about exactly what information we need about individuals at the time of enrollment. I don't have time to go into details here. But I would say that that's something we really don't have a very good idea about. For example, if you're interested in cardiovascular disease, exactly how much information do you need at the time of enrollment for a large population sample in order to understand the state of the person when they're 50 well enough that it really tells you extra things about why they had a heart attack when they were 66? We don't know exactly what we should be looking at when we enroll individuals for cardiovascular disease or other things. We really don't know. If we move towards very large prospective population studies, that's something we'll have to figure out. Obviously, lots of people have ideas, but it's not like the genetic side where we really know what we're doing. Definitely an area of active work.

The other thing I'd like to raise as an issue is the question of what types of information are the most important. So, for example, we've been carrying out a variety of studies in epilepsy, and a common way people think of doing epilepsy work is the sort of thing that people usually do, which is you get a lot of individuals with epilepsy and compare them to individuals that don't have epilepsy. And yet epilepsy has quite a striking potential, in that in cases where patients don't respond to pharmacological treatment, surgery is carried out and the actual affected tissue is available for study, so you can look at the seizure-focus tissue in those patients that have to undergo surgery. That is basically not being done in epilepsy research. You can actually write out a long list of striking opportunities like that if we look at the right place and interface correctly with the actual clinical care of patients, where we might really figure new things out if we actually look at the right kind of information. And sometimes that right kind of information doesn't come from simply enrolling a million people in a study. I'm not disparaging that. I'm saying there are other kinds of data available that emerge from clinical care that we're not making systematic use of. In the area I'm familiar with, that is certainly the case, and in a variety of other areas. We have to think very carefully about how we interface genetics work with healthcare to make sure we really do capitalize on the most important types of information, as, for example, we're most certainly not doing in epilepsy, and we're trying to change that.

Another point I'd like to raise in that context is the overwhelming importance of detailed information about how patients respond to treatment. I'm not going to have a lot of time to talk about this but I'll talk a little bit about it. I think it's very, very clear that genetics plays a major role in influencing treatment response, in particular, responses to medicine. In order to make progress in identifying the genetic differences among patients that influence how they respond to medicine, it's essential to have very detailed information about what medicine they were given, in what doses, in what combinations, and exactly how they responded. We're not going to be able to make progress unless we have that available, and that's very, very difficult to get. And in that context I'll mention that one opportunity for getting that kind of information may, in fact, be through managed healthcare providers, where patient records that have been made electronic may be a framework for getting exactly the kind of information about drug response that you need. In thinking about large population studies, I would say that it is absolutely essential to make sure you do the best job you can do in representing how patients respond to medicine.

So I'd like to just end in the last four or five minutes with a couple of thoughts. A, about what we're trying to do. And B, about the case for more serious attention to pharmacogenomics. And first, on the matter of what we're trying to do, I'd like to raise the issue that in academic genetics research there's been a real focus on a final and accurate determination of whether a given polymorphism really is a risk factor for a given disease. And in some contexts, that's something you would like to know. For example, in prediction, you would like to know whether a polymorphism really is a risk factor. One thing I think is not so well appreciated is that there are contexts where you don't need to know with certainty whether a polymorphism really is a risk factor. It's good enough to have an educated guess, and I'd like to make that point by reference to a project at GlaxoSmithKline that I've not been involved with, but I report this with permission. They've done a genetic study comparing individuals with and without type two diabetes, and they tried to identify polymorphisms that are associated with diabetes. And what they did is look at 400 individuals with diabetes first and 400 individuals without, and had a follow-up. And studies of that size, and we know this already from calculations you can do in advance, are not sufficiently powered to reach a final determination, with any degree of statistical confidence, that a given polymorphism really is a risk factor for diabetes. In fact, reaching that final point of confidence is hugely expensive in diabetes. We know the effect sizes are small. However, what they did come up with, when they went through the exercise, is a set of 21 gene variants, genetic differences that appear to be associated. None of those 21 clearly, with statistical confidence, is, in fact, a risk factor. But you can ask the question in a somewhat different way.
You can say "I don't care about any single one of those, I care about the set of 21". What is the probability that at least five or six out of the 21, even though you don't know which ones they are, what's the probability that at least five or six really are disease-associated? That's a completely different calculation. And, in fact, in this case, and I won't go through the details, what you find is that probably, with fairly good confidence, five out of the 20 are real, but you don't know which. Okay? That's actually still very useful, because in the context of drug development that means you can take all 20 and start working on them. You don't have to know which one it is. You can ask the question: if it's going to cost you another $250 million to get really precise assessments for each of those 21, maybe it's actually better to spend $100 million and start screening some of them. So what I'd like to say is that when we're thinking of drug development, it's not necessarily always just a matter of reaching a final conclusion, no matter what the cost is, about whether a given polymorphism is, in fact, a risk factor.
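
The details of that set-level calculation are not given in the talk. As a rough sketch of the kind of reasoning involved, assume, purely hypothetically, that each of the 21 candidates independently has the same probability `p_real` of being a true association; the chance that at least k of them are real is then a binomial tail. The value 0.4 below is invented for illustration and is not a figure from the study.

```python
from math import comb

def prob_at_least(n, k, p):
    """P(at least k of n independent candidates are real), each real with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

n_candidates = 21   # size of the candidate set mentioned in the talk
p_real = 0.4        # hypothetical per-candidate probability (assumption)

p_five = prob_at_least(n_candidates, 5, p_real)
p_six = prob_at_least(n_candidates, 6, p_real)
print(f"P(at least 5 real) = {p_five:.3f}")
print(f"P(at least 6 real) = {p_six:.3f}")
```

The point of the sketch is only that confidence about the set can be high even when no single candidate reaches statistical confidence; a real analysis would not assume equal, independent probabilities.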

And the ending two minutes is the case for pharmacogenomics. I think that in academic research, as far as I'm concerned, there is a slightly inappropriate overemphasis on studying predisposition directly as opposed to treatment response. It's starting to change, but I think it hasn't changed enough. I want to make the case that variable response to medicines is, A, hugely important and, B, easier to do than directly studying disease predisposition. So these numbers, the study they're based on has many method issues and they are highly debated, but nonetheless, however you look at it, it's quite clear variable response to medicines is hugely important. It's estimated that adverse reactions to medicines cause over 100,000 deaths in the U.S. alone, ranking as the fourth or fifth leading cause of death. And in terms of variable efficacy, as, in fact, a senior vice president for GlaxoSmithKline pointed out, medicines typically don't work. The average rate at which a given medicine does what it's supposed to do is about 50%. And it varies across therapeutic areas. A lot of the variation is genetic. We know it but we haven't found it. I'll close by saying that when you actually start looking in detail at the genetic determinants of drug response, what you find out is it's usually quite a bit simpler than the genetic basis of common disease. And that has -- sorry about this -- that has two components. One is that you often know where in the genome to look for possible genetic determinants of drug response. And, two, the genetic determinants of variable drug response often are common, so they're not the rare things that are hard to find. And the final thing is that when you find a genetic determinant of variable drug response, there's often the possibility of doing something about it clinically. The possibility. It's not immediate, but you often, for example, have the possibility of suggesting you use drug A instead of drug B, or you change the dose.
And that, the final thing I want to say, is in sharp contrast to predisposition studies of common disease where, sometimes, you find things that really are risk factors and there's nothing whatsoever you can do about it. So, maybe we shouldn't do common disease predisposition, but it certainly means that in thinking about these large population-based studies we have to take the drug response side, and the treatment response side more generally, very, very seriously. I'd like to end there, and I should mention the people that work on some of the stuff I talked about. Thanks.

Thank you very much. Very well done.

Is there one hot, burning question? If not we'll come back and do it with the panel. Terrific. David, thank you very much.

Gil Omenn, terrific to have you with us. And we're looking forward to your perspectives, from the public health point of view, on large population studies of variation, the environment and common disease. Speakers, by the way, so you know, there's a timer sitting beside Sarah, and if you want to gauge where you are, it's right there, with the usual yellow light and so forth.

Thank you.

Okay, thank you very much. Great pleasure to join you. This is an area in which I've been intensely interested for decades, at least 35 years in pharmacogenomics and ecogenetics, so the chance to share with you how I think about this, and how I think many people in public sciences and public health think about the opportunities to really make a difference as we expand our knowledge base from genetics and other fields, is especially welcome, and thank you for having me.

So, here's a vision, which is actually a short-term vision, but it will carry on for decades of work. As you just heard from Goldstein, we already have the beginnings of an avalanche of genomic and genetic information: validated SNPs and validations of the HapMap, genes and alleles, and especially many candidate genes for particular disease risks. The second bullet has been very much less addressed. And this is improvement of our environmental and behavioral data sets and, most importantly, their linkage with genetic information. In fact, we have many proposed statutes and regulations that would make this impossible. I'll come back to that at the end. The third is to carry out both of the first two items with well-established and, in the public mind and the legal mind, credible privacy and confidentiality protections, both for genetic and nongenetic information. And finally, I think we can be quite confident the technologies we have in hand and the concepts being developed will yield breakthrough tests, vaccines, drugs, behavior change schemes and regulatory actions, all of which would be aimed at reducing health risks and treating patients cost-effectively in this country and globally. You know, in medicine we say we save one life at a time. The school of public health at Johns Hopkins adopted this: "we save lives, millions at a time." That's the public health perspective.

Okay. The new world in which we live is well known to all of you here. We're excited about the new biology. Most of us recognize that the developments in biology have been made conceivable by new technologies. You know, there's this notion that you go from science to technology to application. There's a huge feedback loop from technologies. It's reflected in the proteomics which I'm working on and in bioinformatics. And on the medical side, and increasingly on the community health services, public health preventive services side, we talk about evidence-based medicine. Many of you have heard that phrase, evidence-based medicine. When you use it in a Rotary Club or someplace else, you can see mouths open and jaws drop, and finally someone asks, "If this is exciting and new, what have you folks been doing up until now?" We're doing better. We're trying harder. And of course sometimes the hardest sell is with our own clinical colleagues. The vision from all this is a kind of healthcare and community-based services that would be personal, predictive, and preventive. And this takes people prepared to carry out such programs. The Institute of Medicine two or three years ago issued this report, in which they stated that with the arrival of the era in which we'll have the ability to understand gene-environment interactions comes not only the era of genomic medicine but of genomics-based health. This is essential for an effective public health workforce, and the CDC is particularly well represented here today, appropriately so. Here are the centers that CDC established several years ago, including the one we're proud to have at the University of Michigan, another I was pleased to help get started at the University of Washington, and the third in North Carolina. They collaborate effectively and have a website you can check. The mission is exactly the mission of this discussion.

Now, just so we're on the same wavelength, and especially for those who are likely to be aware of this meeting and not actually particularly involved, definitions do matter. There is something of a struggle over which is the broader term, "genetics" or "genomics," and in recent reports we've tried to help the public and ourselves understand that genetics is the broader, historical, broader scientific approach to the role of genes in health and disease, genomics being the set of powerful new tools from molecular biology and science that permit us, when we choose, to examine the entire complement of genes and their gene products all together. As you just heard, generalizing across all the genes is a formidable task, and we end up focusing pretty quickly.

These global analyses do permit us, in fact require us, very usefully, to go beyond what we sometimes speak of as looking under the lamp post, where we know about a gene or a phenotype we're interested in, or a desired effect of a drug, and ignore the off-target actions of the same drug that lead to nasty complications. Same thing on the protein side: individual proteins or proteins as a class, versus proteomics, looking globally at as many as possible of the very much larger number of proteins and protein forms coded by those genes.

So we already had a good introduction to the genomic information from the global analyses: the International HapMap Consortium; the direct associations of individual SNP alleles with various disease phenotypes; the very substantial SNP database, which we heard is over 10 million; and the haplotype structure work, which is really still emerging, with a lot of clever efforts to use tagging SNPs and variable linkage disequilibrium, recombination hot spots and other details of structure.

Where can we get information about environmental variables to put together with the genomic information? Well, I'll give you a few examples, and you'll hear more from Dr. Teri Manolio this morning. The Centers for Disease Control's National Center for Health Statistics has conducted, for 40 years, surveys of the American population, with increasing numbers of laboratory analyses. Now, we're going to hear later, and I will come to a slide about, what is really the set of categories called environmental or nongenetic in the U.K. Biobank, but here I want to focus particularly on chemical and environmental exposures, complementary to the behavioral traits and history that you'll hear more about from others. NHANES is proud of major impacts: a major contributing factor in the removal of lead from gasoline, one of the public health triumphs of the last century; calibration of pediatric growth charts; prevalence estimates for cholesterol, blood pressure, hepatitis C and other important variables. These are the environmental exposures that are assayed currently, and this is ongoing. So lead, and a lead biomarker; cadmium, mercury, arsenic; and also organic chemicals, acrylamide and metals; IgE antibodies showing latex allergy; the hydrocarbons; estrogens, dioxins and a whole bunch of markers for these exposures. And also the smoking history or, if a nonsmoker, environmental tobacco smoke exposure, and a lot of other types of measures in the laboratory. This is a rich data resource. Over the years, NHANES II, which concluded in the '80s, had 14,000 people; NHANES III, 34,000 people. I couldn't find in the very extensive website of NCHS the number for the current ongoing study. They told me there would be about 7,000, 6,000 or 7,000 so far, who have DNA samples taken. I think that might be about a 10% sample of the total. NIEHS is interested in environmental and genetic interactions. I served on an advisory committee on personalized exposure assessment. 
The approaches we highlighted in our report, which will be out shortly in Environmental Health Perspectives, were the use of geographic information systems, and the example there is the NIEHS set of children's health studies, where they combined GIS and wireless devices to track exposures to pesticides and to validate diary entries. These are diary entries not just of diet but of potential activities that would be tied to those exposures, including children who might be exposed in migrant worker families or children who would be exposed at home, with information about pesticides in the house and garden. They are developing spatial models for households at risk for lead poisoning and a variety of other exposures.

The second comes from the technology side: the biosensors and the devices that permit feasible measurement of exposures in the individual and relate them, then, to actual body burden measures of the sort that NHANES does.

The third category is molecular signatures of exposure, early effect and variation in susceptibility. The conceptual strategy here, of really building a program which would fit nicely with what was just described and what will be described in the Biobank and some other large studies, may be applied in proper settings to retrospective or case-control studies as well, of course. You have to be able to identify what your priority diseases are and the plausible or hypothesized environmental factors. This is nontrivial. We basically punted in this study, for later work to be done on this. Identify potential genetic determinants and model systems for exploring the gene-environment interactions. Identify target study populations for feasible measurement. Define the genetic determinants of susceptibility. Conduct targeted exposure assessments. Identify and validate biomarkers, and bring it all together with gene-environment interactions. One thing that should be emphasized is that the era of fighting as to whether things are nature or nurture, genetic or environmental, is behind us. We're now all thinking about contributing genetic and nongenetic factors and the specific ways they interact. I would even say I cringed a little at the comment in the last talk that in dealing with single-gene disorders we know what the genotype-phenotype pattern is. It's a lot more direct than for multifactorial disease, but the variation can be quite stunning for the single-gene disorders, the most dramatic being, over the last decade, from Saudi Arabia, of people with hemoglobin and -- with no apparent clinical phenotype or biochemical phenotype, and many other examples.

Technologies and approaches. Some are listed here. I think I've already basically mentioned them. This is natural language processing to try to search the vast literature. Very good tools are now becoming available for doing this in an automated way to assist us limited humans. GIS I mentioned: mapping and systems. One of the questions I asked was the extent to which the NHANES findings, sampled through the American population, are actually being mapped, as EPA is trying to do for other purposes, to states, locations, neighborhoods, maybe all the way to individuals, and so forth. This is one of the most important things for the laboratory sciences, which is to link prospective sensors and molecular biomarkers in animals and in humans with in vitro and in vivo studies, to try to make the bridge between toxicology and epidemiology which has been needed for so long.

EPA. EPA, of course, regulates the air, water, soil and, together with FDA, foods for contaminants. The EPA has many measurement and modeling programs, of which this may be the most relevant for our purposes today. It's called the Multimedia Integrated Modeling System, MIMS. The primary application is to simulate airborne substances in urban settings, and the spatial scales they look at range from 10 kilometers down to less than one kilometer, which gets to be interesting.

They're working on prototypes and successive generations of support tools, and this is for air pollution and for Homeland Security. You can easily imagine that. These tools bridge gaps between two previously quite different approaches. One is the chemical grid modeling. The other is the dispersion modeling common to water and air pollution. These models capture ground-level concentrations of air toxics and hazardous releases from stationary sites and may reveal enough hot spots to be quite interesting in terms of human studies.

There's a sort of progression: to make measurements in the air wherever there is a monitoring station, and where the stations are placed, of course, is highly irregular and has never been systematized around the country. There are personal monitors. We're familiar with these in the workplace, in industrial hygiene, but they are available for community sampling studies. There's biomonitoring, as shown here for several examples. Of course, with biomonitoring in isolation, as with NHANES or the biomonitoring in the studies done under these genetic population studies, there's little information about the source of the agent measured, and that needs to be thought about in advance. And finally, the national scale, sort of the summation of all this. The CDC 2003 data had 116 environmental chemicals, including the ones I listed for you a moment ago.

Here, John, is my take from the web, and I was at a planning meeting in Dublin four years ago. I wasn't aware when I prepared my slides that we were going to have an expert talk about this from the people who are actually doing it, so I'll be very quick. Maybe it would be interesting to see the perspective of someone across the ocean, so we know what's going on. This is a genetic data bank to be developed from blood samples from half a million people. I understand the studies will be based on proposals from researchers. The -- general practices, many of them, in regional combines. With a ten-year follow-up. The age at recruitment is 45 to 69, and there are expected to be substantial numbers of deaths over that period of time from common diseases. So that would be of great interest here.

There will be a questionnaire on risks, lifestyle, diet. A blood sample will be taken. Not too much has been said yet about what the blood sample will be used for. Maybe we'll hear today.

Statistical power estimates, very important in planning studies. They expect over 5,000 cases per year for diabetes, heart disease, colorectal cancer and breast cancer, and you see the projected relative risks and interaction ratios they would be able to detect with these numbers and that power. And notice that should be 1% significance. Isn't that right?

And at a lower incidence, there would be arthritis, Parkinson's disease, bladder cancer, and others, with, again, power estimates. And they have a very high expectation that 40 to 50% of the patients in each practice would actually enroll. This would be astonishing in America. Maybe they can do it in the U.K. For the blood sample, there is a very interesting question as to what form of serum or plasma is to be used, and in the separate, big collaboration that I lead about this, on plasma and serum, we have similarly given high grades to EDTA and higher to the plasma. There will be case-control and cross-sectional studies, including a variety of family-based studies.

There have been some criticisms of the design, naturally. One is that even half a million people is too small to analyze complex diseases, and the heterogeneity within these disease diagnostic categories is extreme. When I was in Ireland, there was a big discussion about a proposal to actually enroll pairs, which would be particularly informative for genetic studies. I'm curious about the status of that. I couldn't find any mention on the website. The age of 45 to 69, of course, is a late time to be gathering information about crucial determinants of early stages of latent diseases, long-gestating diseases. And, of course, relying on medical records, while maybe they're better than here, is still a limitation.

There's some comment that there might be an overemphasis on genetic factors because of the reliance on medical records and because of the lack of much collection of other kinds of environmental factors, and there have been vigilant consumer and patient groups looking out for confidentiality and opposing any kind of genetic behavior studies, and some other concerns. These are the exposure categories as I understand them. You can see them all listed here. There are no categories, and no specific mention, of environmental chemicals, which, in this country, would be at the top of the public's list.

An example of the kinds of studies that can be undertaken you see here. All of them are interesting, but they are a subset of the variety that I've indicated would make up a broader gene-environment interaction agenda. Now, other large-scale studies are underway in various places, and on the Biobank site they mention the much-publicized studies in Iceland and the much-publicized one in development in Canada, a big European collaborative study called EPIC, and others which Teri Manolio, I guess, is presenting and on which you have received materials for this meeting. In this country, the most remarkable study of the last decade has been the Women's Health Initiative, with 160,000 women participating in both observational and randomized studies, and as you know the outcomes have been front-page news for months. Let me bring this into a broader perspective from the public health view.

This is about genetics and environment and how we share a lot of interests. We both aim to bring together the digital code of information with the environmental cues, as some people call them, from nutrition, metabolism, pharmaceuticals, and don't forget the nutraceuticals, and the chemical exposures. The broad way to think of this is a systems biology approach, to look at the inputs and then the genomic, epigenomic, proteomic -- levels of integrating the molecular information. Ecogenetics has been the focus of my talk. I'll carry on about environmental and occupational exposures and variation in susceptibility. It can be looked at for infectious diseases, chronic diseases, nutrition and unhealthful behaviors, and it means we should include genetics prominently in the case for disease prevention, and these would include host interactions as well as drug and vaccine development. I've already mentioned the training need.

Put all that together and this should be, in the next decade or two, a golden age for public health sciences. We need these kinds of population-based disciplines in order to make sense of genetic variation. It would be a tragedy, in my view, if we had extensive information on genetic variation and couldn't make the relationship, or answer people's questions about what you could do with the information to reduce your health risks.

Going to the chemical exposures specifically, there is a discipline of risk assessment, risk management and risk communication that I've developed over the last 25 years. It's all addressed at this observation: scientists disagree. This is extremely bewildering and disconcerting to a lot of people. In the current debate of faith-based ways of thinking and scientific ways of thinking, the characterization of scientific ways of thinking as based on fact and certainty is a huge failure of our communication. We are typically most interested in what we don't know, what is uncertain, and how we could learn more and make it useful. There's a framework for this kind of thing with regard to regulatory decision making on chemicals and other factors, especially chemicals: to identify if there's a potential for hazard, with all these methods, which is what I'm talking about; to characterize the risk, characterize, not just to quantify, but to describe, to have a useful narrative about the nature of the types, how reversible they are, how serious they are, related to potency; exposure analysis, which until recently was very underexplored; and the assaying of variation in susceptibility; and then to do something about it. Very often, information, long before there's a regulatory action, has a powerful effect.

The toxicogenomics I mentioned, at the National Toxicology Program. Here is a framework which says we need to put any toxicology, an environmental scare or scientific finding, into a broader health context and have an orderly process to develop the assessment of the risk and reasonable options, make decisions and carry them out, and evaluate what we accomplished, if we did. All of this, from the beginning, with proactive information for the stakeholders, as they've been doing. Context means, in the environmental world, going beyond the statutory approach of one chemical, one environmental medium, one health effect at a time, to think about the total public health of a community or any group. This requires multiple molecular markers and, especially, a comprehensive public health view. Context means multiple sources of the same agent, multiple pathways of exposure, multiple risks of one agent, or multiple agents causing the same effect.

Data, surveillance, interaction with the environment, and crucial issues about health disparities, environmental injustice, social and cultural traditions and differences in perception about risks and what should be done about them. Finally, I want to point out some good work from an organization called Partnership for Prevention, engaging with the states, and CDC is very active with that. There is a lot of action at the state level. With federal legislation pending on protecting people from insurance or employment discrimination for genetic diagnosis, some 38 states, at least, have passed their own patchwork of legislation.

The aims for states are shown here. Monitor what's happening, to assure that we have applications not just for treatment of people with specific diseases, but for health promotion and disease prevention. These are the two key findings. The first we've already covered: a lot of opportunity in this genomic era. The second is a hot policy debate. It was the position of the Partnership for Prevention that genetics and genomics should be integrated into existing health, social and environmental policy rather than into stand-alone genetics programs. This is a quote from that report, citing a very highly regarded report, which I was not personally involved in, from the state of Michigan, the Governor's Commission on Genetic Privacy and Progress. At a time when many state policies were based on exceptionalism, taking genetics out from the mainstream of medicine and public health, Michigan adopted an integrative perspective and recommended that genetics issues be dealt with under the overall values and principles. All health conditions have some degree of genetic basis. It's very hard to draw a line between what is genetic and what is not. Most common diseases, as we're emphasizing, result from gene-environment interactions, so the genetic advances are likely to extend and expand, and not to supplant, environmental protection. Some genetic variations are associated with greater health risks than others, so covering the huge range with a one-size-fits-all policy is inappropriate. There are issues about ethics, costs, societal issues. Medical care decisions should be linked with research, insurance, and broader health policies. The intersection with public policy is immediate and long-term, warranting close monitoring. I added this line on the bottom, which is that in this era, when in the clinic, where I'll be all day tomorrow, we have to tell patients that it would be wise to make sure your insurance is complete and adequate before you have any tests done, prohibiting discrimination based on test results or genetic diagnosis is necessary.

The kinds of research we want to stimulate in populations and communities require certain principles. Albert Jonsen, a prominent bioethicist, observed in a seminar in Seattle that while we had developed widely accepted concepts and tools for ethics in medicine, namely the informed consent principle and the principle of autonomy of the individual participant, we had no corresponding highlighted principles for public health or community-based research. So he, I and others developed and published this scheme about engaging community partners early in the planning process, keeping them posted, seeking their input in the analysis and interpretation, building productive partnerships that last, and empowering people to propose studies. There are sources of information shown here, and a final comment, six years ago, from Francis Collins, that what we're engaged in collectively, mapping the human genetic terrain, may rank with the great expeditions. It's clear that to get maximum value and meet our public responsibilities we need to understand that progression from genes to proteins, and from molecular and laboratory interests to, of course, clinical translation, and, more broadly, to address the issue of this meeting, which is to link genetic variation with the many kinds of nongenetic variables.

Thank you very much.

Terrific, thank you very much, Gil. Again, any one particularly pressing question? Great. Thank you, we'll come back to you in just a bit.

Now, Teri Manolio will give us a sense of the overview of this issue from the international and national perspective. Thank you so much, Teri.

Great, thank you very much. I appreciate being invited to comment on international and national studies. There are a large number of them and we won't be able to do them all justice, but several will be discussed in more detail. I was asked to review the studies and then talk somewhat more about design as well: the design of prospective studies versus case-control studies and, I probably won't have a chance to get to the last one, the use of existing cohorts versus new cohorts. But if we have time we will.

New ones are cropping up every day. Very few of them have gotten into the field and gotten going. The Public Population Project and the U.K. Biobank you'll hear about from subsequent speakers, so I won't focus as much on them. Biobank Japan I can talk about. This one I can go into in a little more detail because it's the one furthest along and generating results. I'll comment on the Marshfield project and the National Children's Study, and there are a variety of other clinical samples I won't go into.

A broad overview of several international ones. Biobank Japan, obviously in Japan, is anticipated to be 300,000 people, ages 20 and above, and focuses at present on 47 common complex diseases, which, as we heard before, are diseases that don't seem to have the pattern of inheritance related to a single gene but probably to multiple genes. Access to those data and samples at present is limited to Japan and Japanese researchers. deCODE Genetics in Iceland: they anticipate having, most likely, the entire population if they keep going, or at least all of those that consent, at least 200,000, of all ages. Fifty common diseases, and access is possible with collaboration.

The genome project in Estonia has varying estimates. The total size of the country is 1.3 million, and they initially talked about trying to get a million of those; now they're scaling back closer to 100,000. The age I'm not sure of. I assume all adults, but I don't know. Common diseases, and access with collaboration.

And you heard much about the U.K. biobank and we'll hear much more about that.

CARTaGENE is a Canadian study in Quebec, anticipated to be about 50,000 people, age 25 to 74, focusing on common diseases, and they -- whose flight was cancelled, will tell you more about that. GenomEUtwin, similarly, is part of that collaboration. It has seven European countries with 800,000 twin pairs. Twin pairs are a very interesting genetic model, with great strengths as well as weaknesses. I'm sure you'll hear about that. They are focusing on seven key outcomes at present, and the data are available with collaboration.

The Marshfield Personalized Medicine Project relies on the Marshfield Clinic. It is anticipated to be 40,000 people, 18 and above, with a large focus on adverse drug reactions. David Goldstein spoke to you earlier about the importance of adverse drug reactions, and I would think you could find really exciting information about this.

The National Children's Study, which we'll talk about later, is to include 100,000 infants and mothers and to follow them for 21 years.

Just briefly, to comment on Biobank Japan. The goal of the study is to clarify, on a large basis, the causes of diseases and medication side effects in relation to genetic variation and, ultimately, to develop new drugs and diagnostics. This is the goal of many of the large biobanks, focusing on drugs and diagnostics, not only to contribute to the field but to help support the biobank itself. Samples and data will be collected by a network of collaborating organizations and private universities; public universities are not involved. That has raised some eyebrows outside of Japan, but the Japanese are quite happy with it, and it's their study. These are some of the universities involved. They hope that their project will stimulate the development of legislation in Japan to protect personal research information, not only genetic information but research information in general, which is an interesting sideline of a biobank. It was begun in 2003. Ninety thousand samples have been collected to date, and that's 120,000 disease cases, because each person they collected has more than one disease. This is unlikely to be a random population sample. It is more patient-based because it's working with hospitals, so its relevance to a general population is a little more questionable. And distribution of DNA and serum to Japanese researchers has already begun.

The Estonian project has a similar goal: to find links between genes, environmental factors and common diseases, and to apply it to improved healthcare. Maybe as many as a million persons, but scaling down to perhaps 100,000. It was begun in October of 2000, with about 10,000 recruited in an initial pilot as of 2004, in three Estonian counties. Written informed consent. A 60- to 90-minute questionnaire, including information going back two or three generations; simple measures, height, weight, blood pressure, heart rate; and a blood sample. Personalized information is intended to be provided back to participants, with their consent and interest, and to their physicians, with their consent. People who participate are called gene donors, and participants can actually go on to the project's website in Estonia and ask a series of questions about their involvement and what it means.

There's a nonprofit Estonian Genome Project Foundation, in public-private partnership with EGeen, Inc. They just recently dissolved their arrangement with EGeen, in 2004, and they are now looking for other sources of funding.

The Marshfield project, as I said, is based out of the Marshfield Clinic in Wisconsin, a very large private set of clinics. It is intended, also, to translate genetic data into knowledge that will enhance patient care. It utilizes the Marshfield Epidemiologic Study Area in central Wisconsin, which has a long-standing electronic medical record, and utilizes the strength of having an ongoing electronic record. I would comment that the clinicians are still clinicians in Wisconsin and don't always record things in a standard way. Just because it's electronic doesn't mean it's reliable.

They intend to recruit up to 40,000 people, age 18 and older. It was begun in September of 2002, and they have 17,000 recruited so far. The response rate, 45%, is fairly respectable for a study of this size and scope. In epidemiology studies we like it to be much higher, but for a variety of reasons this is quite good. Written informed consent. Questionnaires, DNA extraction and blood. Data is encrypted, which means that no one with access to the identifiable clinical information also has access to the genetic information, and there's a link that can be broken by a third party.

deCODE Genetics is the Icelandic group, a biopharmaceutical group that studies the development of drugs for common diseases. They utilize the unique resources of Iceland. It's relatively isolated. There are founder effects there, which means the island was settled by a relatively small number of people, probably still in the tens of thousands, in the early 10th century, and has remained isolated since then. It has also gone through a series of population bottlenecks: famine, disease, volcanic eruptions and things. They also have an extensive genealogical database going back to the settlement of the island in 900 A.D., and good record systems. They currently have DNA data on 110,000 consenting Icelanders and 20,000 non-Icelanders from various parts of Europe with whom they have collaborations. It was begun in 1998. There was tremendous controversy generated by this project because of their proposal for an opt-out consent for access to medical records, a proposal to have what was called a health sector database that included everyone unless --

Three minutes remaining.

This conference is showing no activity. If you would like to continue the conference, press star 1 now.

Well, what can I say? The Lord has spoken. But I'll try to have a little more activity here. Okay? As I was saying, this opt-out consent caused a big problem, and that eventually was abandoned. Whether the plans will be revisited or not in Iceland is not clear. But there's written informed consent for all the genetic studies, and third-party encryption as well. In the interest of full disclosure, I should mention I am collaborating with this group, and that's why I know a little bit more about it, so take my comments in that context. The uniqueness of this population: it was founded by settlers of mixed European descent, from Norway and Sweden, who went by the British Isles and picked up passengers, sometimes willing and sometimes not, and went to Iceland from there. The current population is about 285,000, which is almost exactly 1/1000th of the U.S. Another tremendous resource is their careful genealogic records. This is almost an obsession; they all know who they are related to. When two Icelanders meet they'll say, you're so-and-so's grandson, and my cousin went to school with your aunt, and they can all relate each other to various relatives. And it's an interesting -- without, you know, any -- it's not like there are feuds and that sort of thing. But it's clearly something they're very interested in, and they have kept very good records. The relatively small founder population, relatively similar genetic background and the isolation following that mean fewer variants to study. As for what has been done with the records: in any family, if you visited an Icelandic home, after dinner they'll show you the books. These have been computerized, and every Icelander has a password. This is the genealogy of the founder of this company, and he can go into this, as can any Icelander, and trace his genealogy back six generations to this person and then click on this next button. 
She was born in 1776, and he can trace her back another six generations, and then the next one, born in the 16th century, and in the 14th century and in the 12th century and, finally, back into the 10th century, back to their original Norwegian founders, and all of them can do it. It's really quite remarkable. They can also, when they meet someone, go home and look them up in this database and find out who they're related to and how closely they're related to each other. Married couples were very interested and they were like, oh, gee, we're related back five or six generations. Maybe that's why our son, Charlie, is so strange. More often it's just an interesting hobby they have. They're very interested. They'll say, I can go home and check and see who I'm related to. This is a big deal for them.

It's also a big deal for science. What one can do is take two people that happen to have the same disease and see how they're related to each other, and pull out groups of cases that actually are related in very large pedigrees, and that's done in a project. This is a pedigree with 69 patients. Not the largest one they had. One was 700, but this one fit on the page. All these people in these little black boxes and circles are a tremendous resource for finding genes. The purpose of this study is to identify genes related to common diseases. And what we did with this, then, was recognizing that the common diseases don't show the inheritance patterns, and very often you don't have related affected siblings, which is the model most often used in this country, but you often have people with more distant relatives. So you can look at the degree of relatedness: if you have a person with a disease, his or her first-degree relatives are 77% more likely to have the problem, too, than people without a relative with the disease. If you exclude the first-degree relatives, which are mothers, fathers, sisters, brothers, daughters and sons, the relative risk is still 36% higher. It's 18% higher if you look at third-degree relatives. Then 10%, 5%, and very few populations can go to this level of detail in relationships. And what's interesting about this particular example is that the decline by halves in the degree of relative risk parallels the decline in sharing of genetic variants through generations. So it's a very strong suggestion that there's something genetic here that's related to this disease. They've used this approach to map diseases, which means finding areas of chromosomes that are likely to be related to them, for all of these diseases shown in white; for those shown in blue, sorry, we actually identified what looks to be a variant within a gene and a possibility of a variant related to it. And then the purple ones are ones they developed drugs for and are in clinical trials to try to reduce. 
So, again, a very powerful way for finding genetic variants.

One of the challenges in identifying genes is to actually understand, as Gil was alluding to earlier, the population impact of these. I would quibble a little bit with Dr. Goldstein's comment that just because you know a gene doesn't mean you can do anything about it. Sometimes they -- it may be that one would want to really reduce those other risk factors as a way of, perhaps, reducing the risk before. That's a reasonable research question that needs to be pursued. If you consider genes to be risk factors passed from parents to children, the epidemiologists know what to do. You look at associations and prevalence that are identified in family studies or other studies and assess their magnitude and independence, recognizing that common risk factors are generally not strong ones and strong risk factors are generally not common; if they were, we'd all have them and get sick. Those get weeded out and we end up with the smaller effects of much more common ones. One can define associations with a variety of types, perhaps related to other diseases as well, and identify factors, particularly environmental factors, because these are the things we can change. They've changed in the past 30 years to give us the incredible epidemic of obesity. If we can identify those things and have an impact on them, we may particularly want to do that within the genetically susceptible individual. This shows just three of the variants that they've identified. There is a little bit known on the frequency and risk associated with these in the Icelandic population, which is for a variety of reasons very different from the U.S. population, and one would want to know not only the frequency and risk but what other types of associations there are with this particular variant and, particularly, what modifies them. Very little of that work has been done and that's what needs to be done in these larger biobanks.

Francis Collins published a paper earlier this year talking about the need for large studies, and we'll have comments later -- the actual quantitative contribution of the environment and genetic factors, the interactions among them, and other disorders that may share common risk factors. If you get heart disease, does that affect your risk for asthma or cancer? It probably does. He recognized and pointed out that replication of associations, and estimating their magnitude and consistency and time relationship, is best done through prospective studies.

Just briefly, these studies are prospective: from before the time the disease develops into the future, an investigation of a representative sample. Representative meaning you can relate that to the population from which it was drawn. You're not just studying truck drivers, who may be different from the rest of the population. You're not just studying Air Force pilots, but a sample that's representative of the entire group. You follow them for development of specified end points. You identify things and look for them actively, so they don't just happen to be picked up but are surveyed and picked up systematically. The purpose mentioned before is to identify the risk factors predisposing to disease in the general population. Particularly, you want this design when you're looking for risk factors affected by disease, so you can't measure them after the disease has occurred, or things that are affected by treatment or lifestyle changes; when people are sick, they think, I need to do something to protect myself from disease. This can have an impact. You particularly want to look at those that are difficult to recall and in which there is biased recall once somebody develops a disease, and we'll talk about that. Something that has an impact early on, and later on may not have much of an effect at all, you're more likely to pick up in prospective studies rather than waiting until the disease occurs.

(Captioner switch)



Thank you all very much. Our next three presentations will explore the ethical and legal aspects of large population studies. We're very pleased that Marlene has been able to join us on very short notice. It turns out that Bartha Knoppers is in Canada and there's something like a snowstorm up that way and she couldn't get in. So Marlene was very, very kind to come in and help us out here. She will present an overview of the ELSI issues, followed by Charles, who will explore the dichotomy between social identity and ancestry and the issues raised by this dichotomy, and John Newton, on the effort to develop the biobank. Let's turn to Marlene on large population studies. Thank you so much. There's a little timer there in case you need to time yourself.

Good morning, and thank you for the opportunity to talk about biobanks. I learned yesterday that I would be giving this presentation because Bartha's plane was cancelled, so I hope I will be able to convey her ideas, because this is her presentation. The presentation is divided in three parts. I will first talk about the legal and ethical framework, and I think we're still in search of an adequate one, so I will comment on these. I will kind of skip the second part because I think earlier Teri Manolio talked about these existing aspects, and I will focus on the third part, the challenges and the issues of population biobanks, and I'll talk to you lastly about P3G at the end of my presentation. Let's start with a small, brief introduction. I think it's clear now that the way we do research has changed in recent years. We first looked more into single-gene disorders and now we're more into complex diseases. We're really now focused on national and international collaboration and, in fact, they're pivotal to research on complex diseases. And we went from what we call research on traditional biobanks, the ones that, you know, are small and in a researcher's lab, towards human genetic research databases per se. Finally, it's interesting to notice that tissues were at some point considered almost waste and now they're kind of raised to the level of almost equivalent to the person from whom they came, and there's been some recent bureaucratic review. I don't think the process was intended to be as complex and bureaucratized as it is right now, but it's certainly an element we need to take into account. What is a biobank? For the purpose of this presentation we're focusing on a collection of information that is organized, that is searchable; it's not just a large bulk of samples. You need to have a way to search through it. 
It's interesting to note that in the legal and ethical literature, oftentimes biobanks, collections, cohorts, these are words that are used as if they were all synonyms, and we need to make sure we use the appropriate wording. I'll focus in this presentation on the reality of, meaning, these large population databases including at least 10,000 individuals. What are the legal and ethical frameworks and what struggles do we have in those? I can see two things. First, there's really a trend toward a proliferation and specialization of international and national policies, and I'll tell you about this in a minute, and through this we see that this demonstrates a need for harmonization of some of the principles but, most importantly, of the terminology, and I will tell you more about this too in a second. But talking about the proliferation and specialization of policy, here you see at the international level, within the past three years, some of the international guidelines, legislation or declarations, I should say, that have been adopted by various organizations like UNESCO or the World Health Organization. If you look now at the national level, and the title says it all, it's a very uneven playing field; you see a great disparity between all jurisdictions. Here you have a few countries with legislation that specifically regulates human genetic research databases and, interestingly enough, the examples we have here all come from the northern part of Europe. If you look at other jurisdictions, some of them just rely on the current data legislation, public health legislation, traditional consent legislation, and this creates some confusion and conflicts and overlap, and some areas are sometimes left even unregulated. I think this quote says several systems co-exist, so there are different angles that ignore each other. You try to regulate by pieces that are not well adapted to human research databases. 
You can see an increased interest in debate, including on the databases, and these are examples of very recent documents that were issued by advisory committees or law reform commissions in various countries, the Canadian Biotechnology Advisory Committee being the most recent one that we have here, so we see that there's an interest and some discomfort, at least in these countries, with respect to the current situation. If we go to the second part, the challenge of harmonization, at the international level it's very clear that there's an increased need for harmonization. I think the lack of internationally agreed upon rules but, most importantly, of a common taxonomy is really detrimental to research collaboration; it's an impediment to being able to exchange your samples with other countries or to transfer information, so we need to acknowledge this problem, and it's already being acknowledged by various organizations such as the WHO. Here you have the Tower of Babel; really, I think that's how researchers often feel, and this UN Secretary-General quote says it all. It says despite the existence of codes dealing with genetic data, the changing conditions call for the establishment of an international instrument that would enable states to agree on ethical principles to transpose into their legislation. This is a wish. But it's a tool that we really need right now for the type of genetic research that we need to do. At the national level now, there's a need to recognize the specificity of human genetic research databases. These are no longer research projects but research resources that will be used for multiple future uses, so it's quite a different thing. There are limits to the traditional consent and privacy legislation. This legislation oftentimes was created in the context of research on candidate genes for diseases and is not really appropriate in the case of databases like the ones that we're talking about here. 
There's also a need, in personal data and privacy legislation, to have a more common language. We know that there's a huge problem with the vocabulary that's being used right now: coded, anonymized, delinked, de-identified; from one country to another the same word will mean something different. And so when you want to respect participants and make sure the consent that follows the sample will really show your partners how they should use the sample, it's a problem, because we're not even sure how it's understood between partners. So there's also a call for the implementation of a more comprehensive regulatory framework so that it will be easier, I would say, to conduct these types of research. There is some consensus on what we should be working on. The first thing is certainly to work on the tailoring of traditional consent to research databases. We can no longer use the traditional consent models. I don't think they're appropriate either for participants or for the needs of researchers. We need to have a better correlation between data identifiability and the obligations that come with it. It's more interesting, of course, to have data that are coded and that we can link to a participant, but it comes with obligations, and what are we going to do 20 years from now? Will we have the obligation to bring results to these participants? That's something that we need to clarify. The need for ethical oversight from the inception of a database, as well as monitoring mechanisms: that's something we certainly need to work on as fast as we can. Initiating and promoting and strengthening the professional and public dialogue: this is fundamental to the type of enterprise we're talking about and we certainly need to work on it. And it's kind of related to the last point, the need to develop a benefit-sharing policy. 
We need to do, I think, a better job at really being able to identify what the benefits are, and it's difficult because we know the benefits are long term, but for the participants and the funders, and to be able to justify such an important investment, we need to be able to have better communication with the public about this. Some controversial issues: funding. I think this is a very sensitive issue. If we want these human genetic databases to stay in the public domain, the way they will be funded has a tremendous impact. This issue about the original consent form and secondary use of samples is also one that's controversial. Are we going to go into this blanket consent? We have very big doubts that that's something that's going to be accepted in the legal system, but it could be possible. There are suggestions about the authorization model; maybe it's a new way we should explore, but certainly what the appropriate type of consent is here is something we need to further discuss, and it's really a sensitive issue because it will have an impact on genetic research and any other research. Protecting privacy: again, the choice of words is very important. Personal feedback, as I said: what are we going to do in a large-scale setting? Is it reasonable to think we will bring back results, individual results? Is this something that's reasonable and feasible? The status of genetic material, ownership: who owns these databases and the tissue? In certain jurisdictions, the mere fact that you would own tissue is counterintuitive, I would say, and against most of the basic fundamental principles. Looking into checks and balances is something I will talk a little bit more about in a second, and ethical review for multicenter research projects is also quite challenging these days. I will skip this part and go right through now to the challenges.

So if you were to establish a human genetic research database right now, what would you consider? What are the fundamental elements you need to think about? We think there are at least three elements you'd like to go through. The first one is ensuring the legitimacy of your database. You'd like to look into adequate protection, building trust, making sure that it's well protected, and you'd like to make sure that there are appropriate checks and balances, and let me go into more detail on all three of these levels. If we're looking into legitimacy, you need to justify putting so much research, money and resources into these huge human genetic research databases. What are the benefits? And we need to explain these benefits. So this is key to the funding and the support of the community, and we need to work on this, I think. Legitimacy can come in different ways. In some countries they've chosen the democratic forum, through parliament and legislation, to start these human genetic research databases. You have Estonia and Iceland, where they've created the genetic database that way. Is parliament the most appropriate way, or the appropriate forum by which you could engage the public and make sure that there's legitimacy there? A question that we had is, if there's not enough public consultation, public communication, prior to this parliamentary enactment of the legislation, we might have questions with respect to the process. But nevertheless, in many countries, at least it's very clear: whenever there's legislation, you know what the rules are and what is being done. In other projects, instead of going through parliament, the initiative is a project that's started by scientists themselves, and they are adapting the science to the community's needs and the population's desires through discussion, and again in this case it's more, I would say, in a way self-regulated. 
But the participants have really, again here, a chance to discuss the regulatory framework that's being built. So these are two different ways in which you could approach it. Now, for a transnational enterprise it's a little bit more complex, like GenomEUtwin. These are transnational, international collaborations, and they depend on trust and communication between members and are based on a common understanding of the issues, agreement on the scientific, ethical, legal and social issues, and a common philosophy, so this is quite challenging, but at the same time the benefits are, I think, incredible. Now the second part is about building trust. Building trust at different levels. First, ensuring public representation, and ideally inclusion of all the groups that could be represented in the sample population, but we know that there are financial constraints and it's not always possible. Building trust with the community relies on -- depends on your communication strategy, and we cannot emphasize enough how important it is really to create a communication strategy that includes the community from the start and that will really enable bilateral communication, if I should say so. Ensuring data collectors' participation and expertise, making sure that the people who will collect data are properly trained, that the researchers are sensitive to all these ethical, legal and social issues: that's something you will want to think about. Privacy, again: privacy is oftentimes the thing that worries, I would say, communities. That's the first thing that will come up, and in a way it's legitimate, because in these genetic databases you're putting in sensitive information and really concentrating it in one spot. So it's legitimate that they have this concern, but we have to be able to answer with appropriate tools: choosing an appropriate consent process, looking into our security mechanisms and looking into the type of identifiability of the samples that you're going to look into. 
Individual feedback and general results: again, that's something that the research team will have to make a decision about. You see here different options. In Estonia, they chose to respect the right to know, in a way, and in other projects there will be no research results returned except for the medical examination from the start. So that's another element you'll need to consider, but is it possible -- that's the question we're considering -- is it possible to get the appropriate genetic counseling to make sure that you don't fall into the potential problems of genetic discrimination or misinterpretation of results? Finally, stigmatization and discrimination are issues you want to consider, and the commercial aspect. This is a very tough one: making sure that you get free public access, yet at the same time we need to respect all these intellectual property rights that are involved, and the involvement of industry. I think, with the financial resources needed for these types of projects, we will for sure need the involvement of industry, but how to do it, at what level, and how to appropriately manage it, that's the question.

And finally, checks and balances. Thinking about checks and balances, you need to think about it from the start, to get approval not only of the protocols, but you need to look into the framework itself and a stamp of approval from the authorities; it could be anybody from the ethics review to other types of authorities, making sure that the public is recognized again as a true partner and will have a say in the establishment and the creation of the framework itself, and you need to build in mechanisms for the review procedure; it needs to be there from the start. If you look into research project review and monitoring, this is really, I think, quite a challenging area, because we want to set up mechanisms to really make sure that there will be appropriate ongoing monitoring not only of the research projects but, again, of these public resources and how they will be used. Biobank did something interesting. I think there are very innovative solutions out there, but we need to work on those. And finally, the management structures. In each of these projects, they've built interesting charts on how the project would be managed, appropriately balanced, et cetera, et cetera, so we need to ensure transparency and independence and integrity, but to create and conceptualize these management structures is quite challenging for researchers as well. I want to talk to you a little bit about the P3G project because I've been talking about some of the challenges, the problems of having different taxonomies to designate similar things. The Public Population Project in Genomics is a not-for-profit organization that is currently building an international consortium to promote the type of discussion that we need in the field of genetic research. We want to foster this international harmonization at all levels. 
At the scientific level, to be able, for instance, to have common words to designate the type of research and common ways to collect data, and also at the ethical, legal and social level, to make sure that people are provided with the right types of tools and that we can benefit from the experience of other population genetic research databases that are already out there. And we want ultimately to create a body of knowledge that will be publicly available, so that all the human genetic research databases that are out there will have the opportunity to really be able to communicate with each other, to compare data if it's interesting, to exchange data, because they will have had a chance to talk about this harmonization of taxonomy and to talk about some of these issues, making sure we have a common approach and common vocabulary. The partners of the P3G project, and I'll go back to the slide of the website: the partners are GenomEUtwin, the Estonian Genome Project, and CIGMR, which is a Manchester project. We have other partners that are coming into the project right now. And the chair of the board for this project is Bartha Knoppers. I invite you basically to go see our website. So just in conclusion, I think we're building really unprecedented, very interesting research tools that will be used for generations to come. But I think the legal and ethical tools right now might not deal appropriately with all the issues that are raised. I think oftentimes they were created, as I mentioned earlier, for drug research or Mendelian research, and if we want these biobanks to really stand the test of time we need to look at these three things. We need to probably revisit the current ethical and legal framework and make sure that participants are on board and communities are on board very early on in these types of projects. 
Because I think ultimately the success of these types of human genetic research databases will rely on their trust in these types of tools. And we have a common goal here. It's really to benefit the health of everybody. I think we should then have a common vocabulary, and we still don't have this yet, so we need to work on this. Thank you very much.

Thank you very much, Mary. That was terrific on its own merit but even more terrific for having stepped in at the last second. I'm looking forward to Hunt's opportunity to lead the roundtable with all of our participants and the opportunity to query each of you at that time. Let's turn now to Charles Rotimi, who will share his thoughts on the dichotomy between social identity and ancestry in large population studies. And, Charles, thank you. Again, Charles is acting director of the National Human Genome Center at Howard University.

Thank you. Thanks for inviting me. What I thought I'd do today is to share with you some of my thoughts, some of my biases, and how I think about some of these issues in relation to how we do large population studies and how we try to represent different groups, or not represent different groups, for various reasons. One of the first comments I want to make is that, depending on what we're doing, we desire different levels of resolution. For example, if we are trying to identify how common alleles, at least five percent or higher, impact on disease, we will define our study in such a way that we have a level of resolution to get at that. For example, if we want to identify people who eat beef, that's one level of resolution. And if we want to identify people who not only eat beef but eat it in a certain way, cook it in a certain way, that's another level of resolution, and you may have to go to some parts of the world and not go to other parts of the world. So again, depending on how we are defining ourselves and our identity, we do stop at different parts of this. 
If you really look only in terms of our history, one can say that we are, indeed, all Africans, and we started somewhere in terms of the roots and trunk of evolutionary history, somewhere in Africa, but, of course, time did not stop and we migrated to different parts of the world, and depending on your socialization, what you were willing to accept, how you want to define yourself, the question of survival, and the identity you want to put forward, your level of resolution does differ. We have to always bring that to bear, and that is why it's extremely important, when we are defining large-scale studies like what we are planning here, which are capable of impacting our health for a very long time, that we be extremely careful who is at the table and who is making decisions, not just in terms of science but in terms of how this is representing the people, especially if you're using taxpayers' money. So it's extremely important for us to appreciate all of that. Scientists were socialized before they became scientists; we bring all of our baggage to these issues.

Also, I want to again make some distinction here, and that is between understanding etiology and eliminating health disparity. Sometimes we say these things are the same and overlap. I wanted to make the overlap a little bigger but I couldn't figure it out in PowerPoint. It is a little bigger than that, but they don't completely overlap. For example, if you are interested in eliminating health disparity, you may be interested in how people get access to care. That may have nothing to do with etiology, so we need to be clear as to what it is that we want to do. A strategy to look at disparity may, indeed, be a more involved strategy at a social level. Again, typically we look at a diagram like this, or we usually use this to represent a disparity and sometimes to point out etiology. One of the things I want to point out here is that when you look at a 50 percent prevalence of type II diabetes among Pima Indians, one has to wonder, in the same United States, what is going on, and the genes haven't changed that much. It doesn't mean genetics isn't involved, but it hasn't changed much over the years. One of the things we do know is that certain characteristics have changed. Looking at the etiology and disparity and maybe addressing both, now this is looking at populations of the African diaspora. This is where I used to stay when I was working at Loyola medical center in Chicago. It's 34 percent, African-American, and this cohort is over 10,000 people from different parts of the world. What you see again is that this is clearly a disparity issue, but among people who are of recent African ancestry. About 50 percent here, about 34 percent here, and you do see a dramatic increase in body mass index, so clearly how heavy you are and the environment you find yourself in have serious implications for hypertension. 
There's a new study that is extremely important in terms of how we address some of these issues: what are we calling disparity, and how we place groups in different parts of the continuum in terms of their experience with the problem of hypertension. This was done with Richard Cooper and colleagues recently, and what do you see again? Clearly, depending on where you are, you do have very different rates. What I want to point out here is that when you look at whites, a group we call whites within the United States, in relation to our group, typically we say there's a huge disparity. There is a huge health disparity, but if you place all of these populations and look at them together, you see that it is truly a human experience. And when you are in Germany, your rate of hypertension is really, really high. U.S. whites tend to be quite healthy in terms of European populations, and therefore exaggerate to a large extent who gets hypertension and who does not. This is really important, that we have to bring to bear cross-cultural experiences and international experiences, so that when we're defining variables and we're defining strategy, we take those into consideration. This is the same set of studies. Now if you group the Europeans, and all the African populations, you do see that Europeans have a much higher level of systolic blood pressure, but you don't hear this when you hear people talking about their experiences of blood pressure and hypertension. So again, cross-cultural and international comparisons are extremely important when we're doing these large-scale studies. Also, what do we want this large-scale study to answer? We want to define this for the study. Do we want it to give us the level of who gets diabetes, yes, no; who is reacting to a drug, yes, no; or do we also want it to tell us some stories about who we are, where we are from, how we are represented? It may be useful for this type of activity. 
If, indeed, it is, we need to bring to bear a design strategy that will help us to say this in a way that does not reinforce notions about who we are. So in that regard, ancestry, in my opinion, becomes a very critical thing for us to consider. I like this slide a lot, because every time people talk about this whole issue of race and ethnicity, and I'm getting so tired of the whole issue, I always ask myself where we draw boundaries and how we draw boundaries. Again, it really depends on where you grew up, how you were socialized, the things that you are afraid of and the things that you like. Who is black? This is the whole spectrum of who is black, and this spectrum is limited. Hispanics, there's no -- . This is one of the pictures I've seen on a PBS website, where they actually show that you can see all the variations of complexion right there in Africa, all of the skin complexions you can see, and I'll show you some of my experiences in Brazil. But you see that all of these could be considered black. But there again, there are radically different ancestral histories, through Ethiopia and different parts of the world. I put a slide here to tell a story in terms of what we're doing in the type 2 diabetes study in the African diaspora.

The real intention, what we're trying to get at, is why there is a high rate of type 2 diabetes in African-Americans, and to get at that we need to go back to the ancestral populations of African-Americans. We know from the history of the Middle Passage that most African-Americans came from this part of West Africa, and again Mozambique. But the story I really want to point out here is that when we started writing manuscripts reporting the results of this study, one of the things that I was called to task on was: how are you sure that you can combine all of these groups together? Because this is an affected-pair design, and we analyzed the cohort, about 400 pairs with type 2 diabetes, as one group. But repeatedly, reviewers say, why do you think you can combine all of these groups together? The point here is that I have done a lot of work in African-Americans, and no reviewer has ever taken me to task on why I think African-Americans are a uniform group. You see how the way we are socialized affects how we review the work and what we will find, because this cannot work if you're -- anything can be queried for that reason only, that reasoning. But you could see that the ancestral history of African-Americans is even broader than what we have here. Nobody takes us to task on it, because the assumption is that we are dealing with a uniform, homogeneous group. So we need to be very conscious of what we're talking about. The problem, I say, is that group identity is confused with ancestry. Identification is confused with the more complex tapestry of ancestry. Now, when I prepared a slide for this talk: if you think the issue of African-Americans is confusing, talk about the group called Hispanic. What is the group we call Hispanic? It is completely mind-blowing when you think about who we classify under that umbrella.
To assume unity, to me, begs the question of what we are doing, and it may indicate why we are not getting consistent results in some of the work we've been doing, because we lump people together and build on some very interesting groupings. When we look at the census, it's pretty clear. We are the ones that confuse it. They're not doing anything based on biology; they're looking at how society labels itself and collecting information on that. When we do our studies we want to impose biology on that. Sometimes it works, and sometimes it doesn't work. So, Hispanic: you can be of any race, okay. This is just to point out some of the groups we call Hispanic: Mexican, you know, South American, Cuban and Puerto Rican. This is a whole list of people who have radically different ancestry, if you really go into the history, and I put a slide here. This is a picture from my only visit so far to Rio, which was extremely informative for me, and I enjoyed myself quite a bit. I was flabbergasted when I drove onto a major road going to the university in Rio and I saw this junction. It took me back to my young elementary school days, when I was in Nigeria going to school. We used to carry a school bag; ours was made out of a metal box. We put it on here. We were so good -- on our way to school. But what it turns out is that this is a sacrifice. In Rio, they follow the Yoruba tradition, and I was extremely surprised by that. You have these chickens and pots and oil and wine, you know, making offerings to the gods for protection. This was three years ago in Rio. Now talk about gene-environment interaction. If you're studying this group, you had better take into consideration the African ancestry and history, and why this group has kept this experience over the years. And what does it mean, therefore, to have Cuba, Mexico and Brazil as Hispanic and to study them as a group? This is again to show how we lump people and sometimes lose quite a bit of information.
If you look at people who are under 18 and 65 plus, you do see that, depending on which Hispanic population we are sampling, you could be doing yourself a disservice or a service. Again, this is just to show that, and the same thing is also true here in terms of education: radically different experiences, again, and relevant. I think the same story is true when we look at Asians. We group all of these groups and we call them Asian. Now take, for example, Chinese: how does that represent the experiences of this group and the ancestral history of these people? And if, indeed, there is something that has been selected on over the years, and these are their own experiences, whether that may indeed be captured, I don't know. But we need to be conscious of who we're calling Asians. At the other extreme, my experience in working and actually living in the United States is that, depending on how you see yourself and how you relate to your environment, you tend to lose some of the social identity that you have. It's not important any more to be German American. It doesn't offer you any extra advantage. Okay. Whereas it may be extremely important to identify as Native American or, you know, Hispanic, or however you want to do it. But, again, this shows that, depending on the group and who is sitting at the table, certain things have maximum relevance and others have none. So we need to be very careful, again, as to why we're using this label, how this label came about and what its present relevance is. Okay. Now, to sort of round out here, looking at ethnicity and identity in terms of Africa: one of the things that has happened over the years -- and this is just one of the issues I take to anthropologists, I tend to single them out -- is this whole notion that things within that part of the world, or in remote environments, are sort of static and don't change, or that we don't want them to change. So if people have been cooking in a particular way, we want them to continue to cook that way.
Whereas in our environment we are creating jets that can carry 1800 people now. We like our society to evolve and want others to stay static. I don't know the rationale behind it, but the point is that, just like anywhere in the world, identity changes, how we look at ourselves changes, and these changes are based on economics, politics and whatever else weighs on us, especially issues of survival. Obviously, we're extremely efficient in the way we identify differences, because I do believe that somewhere down the road we needed it to be so. We need to know who is family, who is friends and who's -- so we're very good at seeing differences, and that may not be the reality when you're looking at the genes. So the message here really is that these things need to be studied: identity changes, it's multilayered, and depending on where you're looking, genetics may be important or may not be. Making the sacrifices may be more relevant in terms of the issue. So I'd like to end by again bringing us to some theory in terms of who is telling the story. Depending on who is telling the story, who is designing the study, who is present and who is funding the study, you can tell stories and history in a very, very different way. Okay. For example, during the interaction between Europeans and Africans, there were some surprises that were not anticipated. And because of the biases that came with that, or preconceived notions, certain things were difficult to accept. By the way, this is where I grew up. On some of the issues we have, again, we're still trying to get back some of the work that went away a long time ago. But the typical message for this particular slide is that we need to think more comprehensively if we are going to design very large studies, and especially if we're going to look for gene-environment interactions. Again, this is the same set of points. I'm just going to skip this. But where do we sample? Again, because that's relevant.
Because, very interestingly, European Americans -- again, that's a very broad term, and I'm not quite sure who is under that umbrella -- you can sample anywhere in the United States for that group. But if you're interested in American Indians, Eskimos, Asians, blacks or Hispanics, you have to go to different parts of the United States. For example, you see that most African-Americans are here. The people we call Hispanics are here. So again, it is very, very important, if you want to emphasize efficiency, that you go where they are, and depending also on who you're putting under that umbrella of Hispanic, it may do you better to be in Florida than to be in California. Again, this is just for us to be conscious of that. And this is something that we did recently at Howard University with -- genetics. Some of the people who are here actually contributed to that effort. It's really to get at how we explain the fact that yes, there is variation at the genome level, and that variation is to be studied, and how we do it in such a way that we don't bring old notions to it, and let it tell its own story so we can really know how we relate. But the point I also want to make with this slide is that it depends on where you draw the circles: here, here, here or here. The genetic variation will tell you a story. If you move, it will tell you a story. There will be overlap, and there might be some differences in frequency. Usually what happens is that you don't have uniqueness. So in terms of large scale, I look at large scale as this big umbrella, and we are trying to fit a lot of things under this big umbrella. It depends on how many things we want to fit under this umbrella, the timing, and the level of compromise that we are going to have to make. This could be prostate cancer, this could be heart disease, this could be infectious diseases, HIV or whatever.
So depending on what it is that we want to put under this umbrella, we are going to have to make some compromises, and I want to say at this point that the real critical thing here is the type of phenotyping that is going to drive all of this effort. At some point in the very near future, five years or so down the road, we'll probably have our genetic variants, and we'll put them around our neck like an ID card. But the environment is interesting because it's ever-changing. Depending on how we feel today, my blood pressure can be high or it can be low. Just looking at you, I can be smiling and things are happening to my physiology. How do we capture that in a way that we can relate it to genes that are supposed to be acting on this environment? I think we need to think carefully about how many things we want to put under this umbrella, or how many questions we want to answer. So, as a final note, the whole point I've been trying to make in my presentation was well articulated here: historical and anthropological thinking on the populations which are researched -- the correlation, a superficial understanding of the present ethnic populations or how these populations were developed. Future drug therapy will not depend on the emphasis or significance of race or ethnicity, but on individual adaptation, and David made this point earlier. It's not to eradicate or -- or redefine, or move beyond, labels such as race. Thank you very much.

Thank you very much as well. We very much appreciated that. Thank you. Now let me invite John Newton from the U.K. Biobank to share his perspectives.

I'll try to fill in a bit more detail. We're starting with 500,000 people. We've changed our age range and gone down to age 40 to 69, for a reason which I could -- . The essential idea is relatively simple. We identify volunteers at baseline. We collect information on environmental exposures, and we take certain measurements from them. Then we take biological samples: blood and urine. (Please stand by.)

So it's important not to oversell these projects. It's only part of a strategy to answer these questions. Testing -- scientific objectives.

There will be large numbers of people with diseases. If you choose the right diseases, for example things like psoriasis, you can do rather nice studies on the cases. We can also do the classic studies looking at people with a particular exposure, an environmental exposure, perhaps exposure to pesticides or other -- perhaps smoke, or some occupational class, and follow them up as a group. An interesting variant on these studies is -- driven clinical -- . We're recruiting half a million people, and there's an expectation that perhaps within five years it will be possible to genotype the whole cohort for at least a limited number of SNPs. It would then be possible to identify people with particular SNPs and invite them, so they can volunteer in -- fashion to take part in studies looking at the effect of genotypes in a representative group of people, as opposed to people who have been identified because they're ill. This is very powerful. It raises a whole set of ethical and legal problems, even on top of the ones that Marlene described. But nevertheless we've had some interesting discussions with the groups in the U.K., suggesting this is likely to be feasible provided it's done carefully.

The third big area of interest, of course, is in identifying biomarkers as early risk -- , not just as a potential diagnostic tool but as something that helps us to explain the model. The fact that a substance is raised before someone has developed the disease may give clues to the disease mechanism. In general, I think the point is that studies like Biobank and all the others we've talked about, and indeed complementary studies, will help us to understand disease models in a way we've never done before. And that, of course, is really the holy grail of biomedical research. What we do with this is a separate question. Scientific justification for prospective studies: of course, you've heard this before.
One or two things. Having information on people regardless of severity is important. Take coronary heart disease: in many people who develop coronary heart disease it arises as sudden death, and not having samples beforehand can be a problem, or indeed risk factors beforehand. And ascertaining blood samples generally, for proteomics, not just genetics, is important. The point about genetic studies is that if you take genes as just another risk factor, it's important that, perhaps as Charles points out, you have no preconceptions about what the disease-risk factor relationships might be. If you start with case-control studies, you will rarely detect relationships with diseases that you haven't thought of. So if a particular gene causes, say, Parkinson's rather than breast cancer, and you are doing a case-control study of cancer, you won't detect that relationship. It's important to pick up things you weren't expecting, and it's important to be able to study health as well as disease. I would argue you can only really do that by taking samples of the whole population, not just a group of apparently representative cases and controls.

So to recap, the general benefits of U.K. Biobank lie in public health, looking at how factors work together in populations; in clinical medicine, understanding disease groups better, particularly looking at prognosis, which is the essence of good clinical medicine; and in bioscience, particularly the biomarker-disease associations. And the process of doing Biobank raises a whole lot of issues that we've had to work through, and we think that will have benefits for others, particularly our work on ethics and governance. The whole approach tends to provide better access to resources for scientists, it promotes international collaboration, and in some senses it is efficient and economically beneficial as well.

Moving on to the details of Biobank itself. How is U.K.
Biobank funded? Well, four research funders came together. The cost is 61 million pounds, about -- million, of which the lion's share comes from the Medical Research Council and the Wellcome Trust, a large biomedical research charity, as well as the government Department of Health. Is that a lot of money? It's approximately the cost of a film: Terminator 3 cost the same as Biobank. Some will argue that Terminator 3 made a profit. Biobank may make a profit too. Of course, the point is that the value statement for Biobank is that the value of the resource is worth a lot more than the cost of collecting it, and that becomes increasingly true as time goes on. The health service in the U.K. spends the same amount in eight hours, so if we can have some benefit on health care it will seem a small amount of money. Again, another comparative cost: the cost of Biobank is about 1 percent of the spend on biomedical research in the U.K. So funding a project like Biobank isn't really distorting funding priorities in the U.K. Well, that's my bit on the funding.

How have we established Biobank?
It's important to do it properly. It seems like hard work, but I'm sure it's worthwhile. We have a board; Biobank itself is a company, a charity with -- aims, but an independent company. There is a science committee, which advises Biobank on all matters scientific. There is, on the other side, a separate Ethics and Governance Council, which is independent and chaired by a professor of bioethics, which advises Biobank on ethics and governance, particularly in relation to the interests of participants, and which will continue to advise Biobank and will speak publicly about whether Biobank is conforming to its ethics and governance policies. We have collaborating centers, which are scientific groups around the country, comprising 22 universities in all.

The approach is to try to be as efficient as possible. This is a very large-scale process. If we're not efficient, we will fail. It's easy to spend 61 million pounds and not deliver Biobank. I think it is possible to spend 61 million pounds and deliver Biobank, but it is an industrial-scale process. I would emphasize the need for process planning and project planning early on, and we've done a lot of that. A distributed scientific collaboration is, I think, the only way to do it, but you do have to have strong central -- . There is the potential to build a Tower of Babel in producing these big projects, and there's a fine line to be cut between having masses and masses of talk and no action, and having enough talk to make sure that you have covered all the bases that need covering. We particularly value the collaborations, and we've had a number of meetings with people in the United States, which helped a lot. We send out our material for comment quite widely, and again we very much appreciate the comments that we receive from our peers.

So, how we'll recruit participants. We're probably not going to use practices themselves that much. Essentially, recruiting to Biobank is like launching a mobile phone network. You've got to try, with direct mailing, to attract half a million people to buy into your idea, and so after considerable thought and planning we are probably going to take more of that sort of line. So we will
have an increasing -- we're going to start off relatively small, try to get the procedures absolutely right in the first year, and then roll it out in a mass way. You have to account in this study for the fact that you tend to overshoot at the end if you don't stop.

How will participants enter Biobank? Well, they'll attend the clinic. We set up a dedicated clinic and do the data collection, and again, the efficiency of these processes is so important that we think dedicated clinics are the only way to do it. Samples are transported to the central resource along with the data. The questionnaire, we hope, will be on touch-screen entry, so the data will instantly be amalgamated into the resource. And there is a big emphasis on archiving and curating the samples and the data for long-term use. And of course box five is very important; it's always easy to forget this. In the end, the resource is only as good as the extent to which you can distribute and make available the data and samples for future use. It's important to put resources into that now as well.

Data management is a big challenge. I'll just flick through this relatively quickly. A lot of data is acquired at recruitment, to do with the questionnaire, the samples and how the samples are stored. Then, as we follow people up, we have information coming in from the NHS particularly, but also research input from dedicated follow-up procedures. And it has to be amalgamated into a secure database. All this is new; it's got to be developed. There's a lot of interest from commercial suppliers, and we're working with some of them to develop these systems, although mostly it's the experience of researchers that really tells you what's going to happen. We also have a big investment in the U.K.
national program for IT. Many millions of pounds are being spent bringing together these data sources, which may or may not be useful for us. We're not dependent on them, but they would help.

We've done a lot of work on this. There was a group that pondered this, reviewed the -- , and produced a report, which is available on the web, for peer review. And in the end we decided this is what we're going to do. We will get things wrong, but we think the mistakes we've made will be pardonable in the future because of the way we've approached it. In essence, we're storing blood in various different ways so that samples can be made available for the things that scientists want to do. So there's going to be plasma and serum; we can do baseline hematology and baseline biochemistry. The key is storing blood in such a way that people can do genetic studies, as well as urine, particularly for metabolomic studies. We'll store blood so we can immortalize white cells in the future.

Just to emphasize the volume of work involved: at peak we'll be recruiting 750 people a day. That's 3,750 bottles arriving in the -- every day. The storage will generate 24 million tubes, each of which is identified with a two-dimensional marker. This is a huge, huge resource, and quite a challenge to manage. The samples will be stored in two ways. One is liquid nitrogen. You partly need that for whole blood, to be able to immortalize white cells at that low temperature. The problem is, putting blood into these things is fine; getting it out is a lot more difficult. Traditionally people have used liquid nitrogen storage facilities, and they are secure. But we've also used an automated minus 80 storage. So this is a system where the tubes, you will see, are stored in racks in here, and these are held at minus 80 degrees. The robot operates at minus -- . This is a mock-up working in a factory, but it's very similar to the one that will be built in our storage facility. The robot then essentially processes the samples according to protocols, which are computerized, and uses a laser to recognize the tube, so it knows which tube it's handling all the
time. They're used quite widely in the pharmaceutical industry, and they're used everywhere; even restaurants have them for picking bottles from their cellars, so if it's good enough for them -- . Of course, the huge advantage is that you set the thing running according to the protocol that the scientist has defined, and it can issue up to -- samples a day, which can then be made available for analysis. Whereas to extract tubes by hand from liquid nitrogen, it can take up to two months to get 4,000 to 6,000 samples out: that's one person working for months. Apart from that, it's extremely unpleasant work, so there are health and safety issues. So this is, I think, the way to go. This is the way to do things in the future. And it -- cost, on the sort of scale we're doing: the cost of the minus 80 storage is about the same as the cost of the liquid nitrogen storage. So there we are.

Ethics and governance. There's a huge amount I could say about this. To summarize briefly, Biobank is based on the fact that people are volunteers, most of all; that they can withdraw at any time; and that they give broad consent to future use. And this is a huge issue. I think I would be more optimistic. I think broad consent has been quite widely accepted, particularly in Europe, as an essential approach to prospective research. The question of what broad consent means, and what safeguards you have to put in place to allow broad consent to be reasonable, is a big issue and needs careful consideration. Confidentiality has to be assured, and there's a lot of work that has to be done on this. We've chosen to retain control of the samples. We think people are wary of their DNA being widely distributed, and therefore we have tight control of the samples.
On the other hand, we have full access to evaluation of the samples, and tests on the samples and data, for appropriate purposes. The word "appropriate" needs to be defined, so -- internal and external -- of the science of potential uses of Biobank. One of the safeguards that covers a lot of this is our independent Ethics and Governance Council. We undertook a lot of public consultation before we drew this up, and that was one of the issues that came out of that public consultation: people felt an independent group who could speak on their behalf was important. We've also had a lot of support from Parliament. We've done a lot of work for the House of Lords and the House of Commons. In fact, there's a very big report from the House of Lords on genetic databases, which was done as early as 2001.

Biobank's a big study, 500,000, but it's not big enough, and you quickly run out of individuals for a lot of studies. So it's essential we can collaborate. And collaboration means two things. It means encouraging people to set up similar studies and working with them, but it also means -- it's no good if we all set up studies which don't talk to each other. Which is why the work of btg is important, and the work from CDC looking at the outcome of the research studies. There we are, the U.K. These population studies lend themselves to countries where you have population registration, so there's a natural tendency for Canada, the U.K. and the Scandinavian countries to think of setting up these studies. I was in a meeting in Singapore last week, and we're hoping the U.S.A. will make a contribution. There are already studies which clearly will make a contribution, and I really hope -- I will be astonished if the U.S.A. doesn't really make an important contribution to this worldwide -- . Of course, you are very welcome to use our -- . It would be great if we could swap.

How far have we got?
This is the timeline. We're doing pilot studies, testing the sample handling procedures and testing the clinical procedures. We'll start integrated pilot studies, which will look very much like the real study, in September, and we start the main study in January of 2006. And from then on it's one person every five minutes for five years. What are we doing at the moment? Why are we all looking so tired? It is hard work setting up these big studies. There's a lot to do. We're doing the piloting. We're setting up the IT infrastructure. And we're planning how we approach the general public, developing a communication strategy to support recruitment. The participants are fundamental to the studies. If you don't have the trust of the participants, if you don't convey the fact that we think they are participants, not subjects, then people will walk away from us. So we take that very seriously. We're developing the protocol; the one which was published about two years ago was really a proposal. There is a huge amount of work to