Gresham College Lectures

The AI Revolution in Cancer Imaging - Dr Richard Sidebottom

January 11, 2024 Gresham College

AI will be one of the most disruptive technologies, enabling safer, faster and more accurate healthcare. It will unlock smarter cancer imaging and new insights from medical scans that were indiscernible to the human eye.

This lecture will demystify the AI technological revolution and explore “why now?” and how to ensure AI is deployed safely and meaningfully.

It will discuss how AI deployed in radiology can empower healthcare professionals to provide compassionate and precision care for patients with cancer.


This lecture was recorded by Dr Richard Sidebottom on 9th January 2024 at Barnard's Inn Hall, London.

The transcript and downloadable versions of the lecture are available from the Gresham College website:
https://www.gresham.ac.uk/watch-now/ai-cancer

Gresham College has offered free public lectures for over 400 years, thanks to the generosity of our supporters. There are currently over 2,500 lectures free to access. We believe that everyone should have the opportunity to learn from some of the greatest minds. To support Gresham's mission, please consider making a donation: https://gresham.ac.uk/support/

Website:  https://gresham.ac.uk
Twitter:  https://twitter.com/greshamcollege
Facebook: https://facebook.com/greshamcollege
Instagram: https://instagram.com/greshamcollege


So, thanks ever so much for inviting me to present this talk here at this ancient place. As Martin said, I'd like to tell you about what's happening in this area of AI as applied to medical imaging. I'm gonna try and explain the understanding that I've built up, as I've done some work in this area, about what's going on under the bonnet of these systems and some of the challenges in developing them, and then move on to a bit of speculation about what might happen next. So I'm hoping I can encourage all our curiosity and express some of my optimism and enthusiasm for what's going on. But at the same time, I wanna steer away from the hype and feverishness that is really prevalent in this area.
As a bit of concrete context for this, there's a guy called Geoffrey Hinton, who's a very renowned pioneer in AI in the US. At a research meeting in 2016, he said that we should stop training radiologists. In his words, "it's just completely obvious that within five years deep learning will do better than radiologists." Well, I was just finishing my radiology training at that stage <laugh>, but thankfully most people didn't listen to him, 'cause we'd be in a real mess if we had no new radiologists arriving at the moment.
So I'm a radiologist. I'm a clinical doctor, specialized in imaging. I'm not a computer scientist or a mathematician, but I do want to stray into that area of looking into the clockwork of these systems, partly because I think it's really interesting, and I hope you will too, and partly because I think it's quite often glossed over in these types of talks. I think a wide appreciation and understanding throughout society of how these systems work is gonna be helpful. So you've heard a bit about me from Martin.
I wonder if I could find out a bit about this audience. I dunno if you are willing to raise hands if anyone's worked in healthcare here. There's lots. Great. How about mathematicians or computer scientists or AI experts? Oh, right, we've got quite a few as well. So I hope I don't raise too many of your eyebrows with my explanations about what's going on. All of us here are healthcare users, and all of us therefore generate healthcare data. This is an area that affects all of us. About half of us are gonna encounter cancer in our lifetimes, and perhaps some of you here have already had that experience. So this is important for everyone, I think. I'm gonna be largely reading my presentation, 'cause I just get mixed up if I don't. If I end up talking about the next slide without moving it on, somebody shout "slide" at me. <laugh>
First of all, before we dive into the AI side of things, I'd like to give you a whistle-stop tour of the incredible and varied technology that we use in medical imaging. Medical imaging has evolved from experiments producing the first radiographs at the end of the 19th century to becoming completely ubiquitous and central to the provision of healthcare.
We've got the famous radiograph here of Anna Röntgen, the wife of the X-ray discoverer, Wilhelm Röntgen. These types of plain X-rays are routinely used in healthcare and carry on being an essential part of practice today. They're fast, they're inexpensive, and very often they're all we need to make the correct diagnosis. But in the late sixties and early seventies, Godfrey Hounsfield invented a new way of using X-rays, where he scanned patients using a series of exposures to a narrow beam of X-rays, which he could then reconstruct to provide cross-sectional imaging, like slices through a patient. That's the birth of CT scanning, and CT scanning has been iteratively improved ever since. Now we can scan a patient from head to toe in a matter of moments, giving us vast volumes of incredibly detailed and useful information.
As an example of the difference in power between CT scans and plain X-rays, we've got a patient who arrived in the emergency department and had a plain chest X-ray. They had chest pain and shortness of breath, and they were unwell. If you look at the original image, there are some very subtle changes in the vascular markings in the right lung. But the CT scan that they then had shows this extremely obvious pulmonary embolus, a big blood clot that's a threatening problem. It's this precision that makes CT essential to our hospital care.
Other modalities have been developed, like ultrasound, where high-frequency sound waves are pinged from a transducer held against the patient into the tissues, and an image is generated by listening to the reflected echoes. We're all familiar with seeing ultrasound in the context of pregnancy scans, but ultrasound's also a vital tool for imaging cancer patients.
And it's one of the main tools that we use in breast imaging. It has the benefit of being a portable tool as well, which allows us to use it at the bedside, and now there are even tiny handheld units that plug into your phone. We also use it all the time for performing image-guided interventions. This image shows shear wave elastography, one of the newer developments in ultrasound, which gives us information about the stiffness of tissue as well as its shape. We can also get information about blood flow. We can see that there's a small breast cancer on the right that's made more obvious by the elastography. And this much larger lump on the right of the screen here is a benign fibroadenoma.
Next, if we move on to MRI, this was also invented in the seventies, later on in the seventies, by a physicist called Peter Mansfield. I think this technology has a strong claim to be the most incredible technology yet developed in any domain of human endeavor, really. We lie a patient down inside a tube surrounded by an incredibly strong superconducting magnet that's cooled down to minus 269 degrees C. That's four degrees above absolute zero. This lines up the protons within the patient, and we ping these protons with a series of radiofrequency pulses and then listen to the very quiet replies. We can generate images from these signals using powerful computers and mathematics to reconstruct them. MRI has also been continuously developed since it was invented, and it doesn't really show any signs of stopping. Recently, AI-based image reconstruction has been deployed, and that can reduce the amount of time we need to keep patients in those scanners by up to a factor of about four whilst maintaining the image quality.
MRI shows us all sorts of different tissue properties depending on how the physicists build the sequence, and again, it generates huge volumes of data. A typical breast MRI that I might report contains almost 2,000 images to review. Here there's an image of a patient who has breast cancer in their right breast, and this zoomed-in side view shows in great detail that it's invading the pectoralis muscle. That's really useful for the surgeons to know about before an operation. This is a beautiful image of a full-body MRI in coronal section, as if it were a slice all the way through the patient. And on the right we've got the beautiful image of MR tractography, which shows the neural pathways in the brain. There are loads of other ways that MRI is useful.
Another type of imaging concentrates on the functional aspect of health and disease. This is called nuclear medicine, and it's based on radionuclides. It sometimes, rather ungenerously, used to be known as "unclear medicine" because of the rather blobby images that you get, which are low resolution compared to other bits of radiology. But this, again, is changing dramatically, and there's a technology called time-of-flight PET imaging that's just a mind-blowing technique. PET imaging uses radioisotopes that emit antimatter particles called positrons, the antimatter equivalent of electrons. These radioisotopes are generated in a particle accelerator called a cyclotron. We then bind these positron emitters to a sugar analog molecule that's designed to accumulate in highly active cells, including cancer cells.
And when the emitted positrons encounter an electron, something called a matter-antimatter annihilation reaction occurs. It releases the energy from those two particles, which no longer exist, as two gamma rays heading off at 180 degrees to each other. These gamma rays are then picked up by detectors, and the scanner can measure the difference in the time at which each one arrives. From that it can calculate where along the line between the detectors the annihilation reaction occurred. These gamma rays are moving at the speed of light, and they're moving over a distance of less than a meter. I just find it completely incredible that someone has created technology with that kind of precision. The whole thing with PET imaging just sounds like science fiction, with antimatter particles and annihilation reactions, but it's happening every day throughout the developed world. It's amazing.
This is a PET scan of a lady with lymphoma, and it's affecting her breasts and some of her lymph nodes. We can also see some of the normal places where the PET tracer accumulates: in the kidneys, where it's being excreted, and into the bladder, and also up in the brain, 'cause the brain is highly active tissue. And I hope all of yours is during the talk. At the same time as doing the PET scan, we do a CT scan as well, and that's PET-CT. So we get the highly detailed information from the CT scan and the functional information from the PET scan, giving us these fantastically useful images.
So we've got all these incredible sources of imaging and functional information coming from our radiology departments, and in a way this is both a blessing and a curse. There's ever-increasing pressure on radiology departments from the sheer weight of the clinical imaging which is required. And there's also undoubtedly some information within this imaging which we struggle to appreciate.
A very simple example is shown by the way that we have to review CT scans. Humans are able to perceive approximately 32 levels of gray, but the information recorded by a CT scanner includes a much, much greater range of gray levels. We therefore use what we call windowing, where we limit the range of gray levels displayed in order to demonstrate different features. For example, we might review a CT scan first on so-called soft tissue windows, which give us the best gray levels to show lesions within the soft tissue. So we can see the lumps here within the liver of this cancer patient, which unfortunately are liver metastases. But if we want to see the lung lesions, we can't see them clearly on this soft tissue window. So we move to lung windows, where we can see some lung metastases at the top of the right lung there, and then some probable bone metastases on the bone windows.
There's a whole area of study called radiomics, which aims to identify and use quantitative features within imaging that might otherwise go unappreciated. It's part of the application of AI in radiology, but I'm not gonna discuss radiomics in detail. I'm gonna concentrate instead on the broader image classification type of technologies. Using computers to analyze medical images has a really long history of trying to address some of these problems and opportunities. This has really heated up in the past few years, because we've now got so much data stored digitally in hospital systems, including information that we know we're not making the most of, and we've got the advancements in computing power and in artificial intelligence techniques. So we've got really good questions to ask, and we've got some really good ways to try and answer them.
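The windowing described above can be sketched in a few lines of Python. This is only an illustration, not how any particular scanner or viewer implements it: the function name `apply_window` is made up for this note, and the window center and width values (in Hounsfield units) are assumed, typical-looking presets rather than figures from the lecture.

```python
def apply_window(hu_values, center, width, levels=256):
    """Map CT values (Hounsfield units) to display gray levels.

    Values below the window become black (0), values above it become
    white (levels - 1), and values inside are spread linearly between.
    """
    low = center - width / 2
    high = center + width / 2
    out = []
    for hu in hu_values:
        clipped = min(max(hu, low), high)           # clamp to the window
        fraction = (clipped - low) / (high - low)   # 0.0 .. 1.0
        out.append(round(fraction * (levels - 1)))
    return out


# Assumed, illustrative presets: window center and width in HU.
SOFT_TISSUE = {"center": 40, "width": 400}
LUNG = {"center": -600, "width": 1500}

# Sample voxels: air, aerated lung, fat, liver-like soft tissue, dense bone.
voxels = [-1000, -850, -100, 60, 1000]

soft = apply_window(voxels, **SOFT_TISSUE)
lung = apply_window(voxels, **LUNG)
print("soft tissue window:", soft)
print("lung window:       ", lung)
```

On the soft tissue window, air and aerated lung both clip to black, so detail in the lungs is invisible; the lung window spends its gray levels on that range instead, which is why the reader has to switch windows.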
Before we dive into these details, let's define a few terms. The UK National AI Strategy defines AI as machines that perform tasks normally requiring human intelligence, especially when the machines learn from data how to do those tasks. There's a guy called Stuart Russell, who's a professor at Berkeley, and he did the BBC Reith Lectures a couple of years ago. I really recommend them. He says that AI is about building machines that do the right thing, that act in ways which can be expected to achieve their objectives. Importantly, he goes on to discuss how we set these objectives, and the possibility and dangers of misspecifying them.
To define a few more terms: machine learning refers to the development of programs that are able to alter their internal parameters on the basis of the data presented to them, without being explicitly programmed. Neural networks are a design of computer program that's loosely based on our understanding of how brains process information. And deep learning refers to neural network designs with multiple steps, or layers. It's these types of networks that all of the recent excitement has been about.
To look at some examples of what's been going on, I'll concentrate on my home turf of breast imaging. Let's illustrate this with an imagined patient who is a composite of many different patients I might see in my clinics. Sheila attended her first breast screening appointment aged 51. No concerns were found on her mammograms, although we did notice that she had quite dense breasts, and we know that we are more likely to miss things with so-called dense breasts. She diligently self-examined on a regular basis, and 18 months later she found a lump in her right breast. So her GP referred her to the hospital breast clinic for further assessment.
And there we found that she had a concerning lump on examination, and we could see it on ultrasound. On the mammogram we did that day as well, we could see an area of tissue that probably did correspond to that lump, though it was difficult to appreciate, 'cause it was hard to distinguish from the surrounding tissue. In that sort of situation, we always go back to the last screening mammogram as well and scrutinize that, looking for whether we missed something obvious, really, and making sure we learn from cases. Possibly there was a lump there. Again, it looked indistinguishable from the rest of the tissue, and that's not an uncommon situation.
So it's fair to say that breast screening is by no means perfect. We only use mammograms, and we only make a binary decision: do they probably have cancer at that moment or not? In the UK and most of Europe, we use double reading, where each set of mammograms is read by two different readers, and if there's a disagreement between them, the case is decided by a third reader or a group of readers. Despite our best efforts, three out of four patients that we recall for further imaging because we are concerned about the appearance of their mammograms don't have cancer. They're false alarms. And unfortunately, like Sheila's, about one third of cancers that occur in screened women aren't detected by screening. This room for improvement in breast screening, together with the high volume of imaging and the well-ordered clinical data, where perhaps the objectives are a little easier to specify than in many areas of radiology and medicine, has made breast imaging an attractive area in which to try and apply computer analysis.
Initial approaches used basic image processing techniques and involved radiologists and computer scientists working together to create mathematical descriptions of the features that we look for in mammograms, like size and shape and density. These handcrafted features were then identified and classified using advanced statistics and classical machine learning. This is the basis of older-generation computer-aided detection, or CAD, systems. These older CAD systems have been widely used in America, but they never gained traction in Europe, perhaps because they never came close to human levels of specificity. That is, they had too many false alarms. In a 2015 study done in the US, Professor Lehman and her colleagues looked at over half a million cases and concluded that CAD does not improve the diagnostic accuracy of mammography.
The new approaches are based on artificial neural networks and deep learning. Artificial neural networks are inspired by real neurons, which have multiple inputs, some kind of information processing going on in the cell body, and then an output to multiple other cells. A single artificial neuron has a similar configuration, with lots of connections to both input neurons and output neurons and some very simple information processing, just a sum really, in the mathematical neuron. These systems must be able to alter their internal parameters to enable them to learn to produce a useful output. So on each of the connections there is a weight, which increases or decreases the contribution of that connection. So we can see these are roughly analogous. These artificial neurons can be built into a network. This has an input layer, an output layer, and hidden layers in between, where incremental information processing occurs.
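The artificial neuron and the layered network described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the design of any real mammography system: the weights are arbitrary hand-picked numbers, the three "pixel" inputs are invented, and the sigmoid squashing function is an assumption added here (the lecture only mentions the weighted sum).

```python
import math


def sigmoid(x):
    """Squash any number into the range 0..1 (assumed activation)."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron(inputs, weights, bias):
    """One artificial neuron: a weighted sum of its inputs, then squashed."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(total)


def layer(inputs, weight_rows, biases):
    """A layer is several neurons that all look at the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]


def tiny_network(inputs, hidden_w, hidden_b, out_w, out_b):
    """Input layer -> one hidden layer -> a single output neuron."""
    hidden = layer(inputs, hidden_w, hidden_b)
    return neuron(hidden, out_w, out_b)


# Hypothetical values, chosen only to make the example run.
pixels = [0.2, 0.9, 0.4]
hidden_w = [[0.5, -0.3, 0.8], [-0.6, 0.1, 0.4]]
hidden_b = [0.0, 0.1]
out_w = [1.2, -0.7]
out_b = 0.05

p_cancer = tiny_network(pixels, hidden_w, hidden_b, out_w, out_b)
print(f"output between 0 and 1: {p_cancer:.3f}")
```

Each weight scales the contribution of one connection, so changing a weight changes the output; training is the process of finding weights that make the output useful.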
In our case, we might want to input a mammogram image and output a classification of cancer or not cancer. Initially, if the weights are set randomly, the output will be non-discriminatory, so we need to train the network. This is done by showing the network an example together with its ground truth and calculating the error that the network makes compared to this truth. This is known as supervised learning: there's a classification label that's provided to the AI system. They then use a very clever mathematical technique called back propagation, which incrementally optimizes the weights throughout the entire network to minimize this error. So the weights are adjusted to strengthen connections that lead to a useful output. This is repeated and repeated until the output is optimized, and it leads to the pathways and features which have stronger predictive value becoming dominant. The final output of these systems will be a probability for each class. So we'd hope that for a cancer case we get a much higher probability that it's a cancer. But as for human readers, there is some nuance to this. Some cases are really easy to be definite about, and some we are much more uncertain of.
The type of network which has been widely used recently for image classification tasks is called a convolutional neural network. It's a design also inspired by biology, this time by the biology of our visual systems, having receptive fields and layers of analysis. The maths that's performed by the network is a convolution, or a filtering, and a filter is like a little piece of an image that contains a specific feature or pattern. In this case, I've drawn a diagonal line. This filter is applied, or translated, across the input image, or the representation from the previous layer, and it makes an activation map, or feature map, of where the feature is identified. Then this map is passed on to the next layer.
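The filtering step described above can be shown with a toy example. This is a sketch under assumptions: the 5x5 "image" and the 3x3 diagonal filter are invented for illustration, and in a real network the filter values would be learned rather than written by hand. As in most deep learning software, the filter is slid across the image without being flipped (strictly a cross-correlation, though it is conventionally called a convolution).

```python
def convolve2d(image, kernel):
    """Slide the kernel over every position where it fully fits and
    record how strongly the image matches it (the feature map)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map


# A filter that responds to a top-left to bottom-right diagonal line.
diagonal_filter = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]

# A toy image containing exactly that diagonal line.
image = [
    [1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
]

feature_map = convolve2d(image, diagonal_filter)
for row in feature_map:
    print(row)
```

The feature map is largest (3) wherever the filter sits exactly on the line and zero elsewhere, which is the sense in which it records where the feature was found.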
The filters are described by the weights, which are trained and optimized during the back propagation process. So the filters are adjusted to identify the important features that lead to improved accuracy of classification. As the network is trained, the initial layers might contain simple representations like lines and edges and corners, but the deeper layers become increasingly abstract. This complexity allows the network to accurately classify the input, but it also means that it becomes effectively impossible to understand what's going on in these deeper layers of the network. Various different types of layers make up these networks: convolutional layers used for feature extraction, pooling layers to make the maths work, and output layers used to perform classification. The networks used for medical image classification might have over a hundred layers and millions and millions of connections and weights. It's this complexity that allows this to be a much more versatile and powerful approach than the previous efforts with handcrafted rules. So that's my understanding of the fundamentals of how convolutional neural networks learn.
Next, we need to think about how to train them. They need lots of data. Published studies and findings from employing these networks in non-medical settings suggest that basically the more data, the better. Data sets used in the context of breast imaging typically contain tens or hundreds of thousands, or even millions, of images. The data set will be divided into training, validation and test sets. The training set is used by the network to set its weights, again and again and again. The validation, or tuning, set is used to evaluate the performance on cases not seen during training, to check for generalizability.
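The division of a data set into training, validation and test sets can be sketched like this. The 70/15/15 proportions, the function name `split_dataset` and the integer case identifiers are all assumptions made for illustration; the lecture does not give the proportions used in any particular study.

```python
import random


def split_dataset(case_ids, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle the cases once, then cut them into three non-overlapping
    sets. Whatever is left after training and validation is the test set."""
    ids = list(case_ids)
    random.Random(seed).shuffle(ids)   # fixed seed so the split is repeatable
    n_train = round(len(ids) * train_frac)
    n_val = round(len(ids) * val_frac)
    return {
        "train": ids[:n_train],
        "validation": ids[n_train:n_train + n_val],
        "test": ids[n_train + n_val:],
    }


# Hypothetical identifiers, assumed to be one per patient.
splits = split_dataset(range(1000))
for name, ids in splits.items():
    print(name, len(ids))
```

One design point worth noting: the identifiers here are assumed to be patient-level. If the split were made image by image, pictures from the same patient could land in both the training and test sets, and the test would no longer be independent.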
The results on the validation set are used to monitor how well the model is learning and to tune the network architecture, like changing the number and order of the hidden layers. The test set shouldn't be used until you want to demonstrate the performance. Ideally, it should be totally independent from the development set. As an analogy, the test set is like the final exam. The validation set is all your past papers, which you can refer to as many times as you like. And the training set is all your textbooks, all your notes, and all your relevant experience. The test set should be representative of the real-world use case where you are hoping to use the system. If there are biases in the validation and especially the test sets, then these systems may not perform as expected. So training data is vital, and we'll come back to that right at the end. There must be lots of it, it must be reliable, and the outcomes that you're hoping the model will help with must be accurately represented in this data.
So where are we up to now? I'd like to have a think about how we look at the performance of these systems and where we're up to in terms of the research and the use. Again, I'm gonna concentrate on breast screening. This paper from 2020 was the first to do a really good comparison of breast screening AI systems. You can see that they present the performance of these systems as curves. These are known as receiver operating characteristic curves. Essentially, these plot how good the system is at correctly classifying cancers against how many false alarms it causes, and the curve is traced out as you change the probability threshold you use for calling a case a probable cancer. So the ideal performance will be going straight up to the top left corner.
And then across the top. These curves work really well for comparing algorithms as they're under development or, as in the case of this study, for comparing different algorithms, provided you use them on precisely the same set of cases. But if you want to compare with human readers, you can't get a curve. You'll only get one value per reader, really, and that's gained from looking at lots of cases. Actually, this is usually averaged over lots of readers in order to reduce the margins of error in the statistics. So with this study, if we add the values of sensitivity and specificity from the average human performance for the first reader and the second reader, and then the consensus, which is what I was talking about with double reading in breast screening, you find that the performance is close to the first and second readers, but can't quite reach the consensus opinion. This study that I was involved with had very similar findings. This was also in 2020, and it looked like the system probably outperformed the first reader and was similar to the second, but couldn't quite reach consensus. And we have to remember that the consensus opinion is our standard of care. It looks like the performance has probably been inching up a little bit in the interim, but basically expert human and machine performance are similar. Perhaps on average machine performance is now beginning to outperform a single expert, but it's something in the same range.
Now, these studies have been using retrospective data, and this approach has several strengths, including access to really large volumes of data from hospital systems. That contains real clinical results of what happened, both in terms of the radiologists' interpretation at the time and then, crucially, in terms of longer-term outcomes.
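The difference between an algorithm's receiver operating characteristic curve and a human reader's single point can be sketched with toy numbers. Everything here is invented for illustration: the eight cases, the AI scores and the reader's recall decisions are not data from any of the studies mentioned.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Call a case 'probable cancer' when its score reaches the threshold,
    then compare those calls with the ground truth (1 = cancer, 0 = not)."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def roc_points(scores, labels):
    """One (false alarm rate, sensitivity) point per possible threshold."""
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        sens, spec = sensitivity_specificity(scores, labels, threshold)
        points.append((1 - spec, sens))
    return points


# Toy data: four cancers and four normal cases.
labels = [1, 1, 0, 1, 0, 1, 0, 0]
ai_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

# The AI gives a whole curve, because its threshold can be moved.
curve = roc_points(ai_scores, labels)

# A human reader makes yes/no recall decisions, so gives a single point.
reader_recalled = [1, 1, 1, 0, 0, 0, 0, 0]
reader_point = sensitivity_specificity(reader_recalled, labels, threshold=1)

print("AI curve:", curve)
print("reader (sensitivity, specificity):", reader_point)
```

Plotting the reader's point on the same axes as the curve shows whether the algorithm could match the reader's sensitivity at the same false alarm rate, which is essentially the comparison the studies make.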
Those longer-term outcomes include both interval cancers that present symptomatically, like our imaginary patient Sheila's did, and cancers detected at the next round of screening. There are also a few reasons to be cautious about this approach. It's inherently a look at older data, and therefore it may not truly represent the performance of a system on current screening images that are acquired today, and obviously that's what we actually need to use in practice. It also doesn't capture the result of the human interaction with the machine, which will be vital to check that we can correctly act on the output of these systems. But happily, there is now increasing positive evidence from real-life prospective work. There have been three papers out of Scandinavia just over the last few months showing really promising work.
This paper, published over the summer, shows the preliminary results of the MASAI trial, which has been done at four sites in Sweden. They've looked at modifying the normal double-read process, whereby cases are considered by the AI system and a risk is attached to them. If they're in the lowest 90% of risk according to the AI system, they have only a single human read, but the human reader does have access to the AI output as well. If they're in the highest 10%, they have a double human read, and again both readers have access to the AI output. These preliminary results have shown a small increase in the number of cancers detected with no increase in the number of false alarms, and overall a 44% reduction in the number of mammogram reads needed because of this change from double reading to single reading. It's also, I think, really important and interesting that it's very well accepted by the screening population, with only a fraction of 1% declining to participate in this study.
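The workload saving from this kind of triage follows from simple arithmetic, sketched below. This is a deliberately simplified model, not the trial's actual protocol: it assumes an exact 90/10 split by AI risk rank, one read for the lower group and two for the upper, and ignores any extra reads from arbitration. The function names and the ten risk scores are invented.

```python
def reads_per_case(low_risk_fraction, reads_low=1, reads_high=2):
    """Average number of human reads per case under risk-based triage."""
    high_risk_fraction = 1 - low_risk_fraction
    return low_risk_fraction * reads_low + high_risk_fraction * reads_high


def assign_reads(risk_scores, low_risk_fraction=0.9):
    """Give the lowest-risk share of cases one read and the rest two."""
    order = sorted(range(len(risk_scores)), key=lambda i: risk_scores[i])
    n_low = round(len(risk_scores) * low_risk_fraction)
    reads = [2] * len(risk_scores)
    for i in order[:n_low]:
        reads[i] = 1
    return reads


standard = 2.0                         # conventional double reading
ai_supported = reads_per_case(0.9)     # 0.9 * 1 + 0.1 * 2 = 1.1 reads per case
reduction = 1 - ai_supported / standard
print(f"reads per case: {ai_supported:.2f}, reduction: {reduction:.0%}")

# Ten hypothetical AI risk scores; only the highest gets a double read.
scores = [0.05, 0.92, 0.10, 0.30, 0.22, 0.15, 0.08, 0.40, 0.12, 0.18]
reads = assign_reads(scores)
print(reads, "total reads:", sum(reads))
```

Under these assumptions the reduction comes out at about 45%, close to the 44% quoted; the small difference presumably reflects details of the real workflow that this sketch leaves out.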
In terms of what's going on in wider radiology, there are loads of interesting research papers coming out all the time showing all kinds of novel findings, like being able to accurately predict valvular disease from chest X-rays. In some ways we know that we can do that ourselves, but it's certainly looking more impressive than what we can achieve just by looking at them. In terms of products available for clinical use, there's a website from Radboud University that's really helpful. It lists all the AI interpretation products which have been awarded a CE marking. Last time I checked, there were 220 products on this website, and there's a similar one for the US. This covers most areas of radiology. So far, most of these products seem to be quite narrow in their scope, and many aim to give us improved efficiency in tasks that we can already do, like monitoring lesions over time. But quite a few offer capabilities which we're not able to do explicitly ourselves, like quantitative risk prediction.
So on to the next bit of my talk. As you might be aware, the next generation of AI developments is now arriving, and it might potentially render most of what I've just discussed obsolete, or certainly change it. This is the advent of the so-called large language models, which are based on transformer networks. Hollywood has kindly named one of its latest films Transformers: Rise of the Beasts, and even written it on the side of a bus, which gives us a nice slide to start off this section with. Perhaps a question for researchers like me is: do we try and get on that bus? It looks like they're gonna be central to developments in imaging AI and healthcare AI in general. So again, I'm gonna try and look at how this approach functions under the bonnet.
So transformer models were proposed by Google researchers in 2017 as an efficient and powerful method, originally demonstrated for translation from one language to another, and they've become the basis of these large language models, which as far as I can tell have surprised everyone, including their designers, with the kind of power and capabilities that these systems seem to have. So they're based on the idea of self-supervised learning, originally from text, but now also using multimodal data such as images. So part of the power of this approach is that they can ingest a piece of information and use the trick of "guess what comes next", and that can inform their learning. So all they need to do is to mask out what word or piece of information comes next, make a guess, compare that guess to what's really there when you unmask it, and then update the weights to minimize this error, in a similar way to what we saw earlier. This means that the training sets are kind of everything. They're enormous: all of the text on the internet and all the books they can get their hands on, that's what they've used. This "guess what comes next" needs to be based on context. So these models have to learn and refine their representations of language and concepts in order to be able to do that "guess what comes next" accurately. So let's use this example sentence, which is a slightly strange sentence, but it allows me to show off some of my children's artworks. Do you? <laugh> So the system will assign a numerical value to the words. It's known as tokenization, and that's using some kind of learned dictionary approach. And it will also record the position of the words in the sequence, in a much more complicated way than I've written here.
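As a rough sketch of the two ideas just described (giving each word a number and a position, and then hiding the next word to create a training example), here is a toy version in Python. The sentence is a made-up stand-in for the one on the slide, and the whole-word dictionary is an assumption for illustration: real systems use learned sub-word vocabularies and a much richer position encoding.

```python
def build_vocab(text):
    """Toy dictionary: each distinct word gets the next free integer id."""
    vocab = {}
    for word in text.lower().split():
        vocab.setdefault(word, len(vocab))
    return vocab


def tokenize(text, vocab):
    """Return (position, token id) pairs for every word in the text."""
    return [(pos, vocab[word]) for pos, word in enumerate(text.lower().split())]


def next_token_examples(token_ids):
    """'Guess what comes next': each example is (visible context, hidden next token)."""
    return [(token_ids[:i], token_ids[i]) for i in range(1, len(token_ids))]


sentence = "the cat drinks the milk"  # hypothetical example sentence
vocab = build_vocab(sentence)
tokens = tokenize(sentence, vocab)
examples = next_token_examples([token_id for _, token_id in tokens])
```

During training, the model would make a guess for the hidden token in each example, compare it with the real one, and adjust its weights to reduce the error. That update step is left out here.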
It will then generate a series of numbers called a vector, which can be thought of as a coordinate or an address for how this word should be represented in some kind of meaning space, known as an embedding. So it calculates this depending on an attention mechanism that it learns. So it learns the significant associations within sentences or longer pieces of text to give the words themselves proper context. So the model architecture here is shown on the left of the screen, and as you can see, even though it's a simplified diagram, it's quite complex. Again, in a similar way to the multi-layered approach used by convolutional neural networks, these transformers have multiple layers. And again, this leads to an increasingly abstract and complex representation of the data that they're presented with. So there's also a decoder part of this network, which extracts this information using some of the things it's learned to encode it in the first place. And then the output is used as the feedback to alter the weights and improve the accuracy in predicting "guess what comes next". So if we represent this meaning space as a simple 2D graph, we can show that the cat has been placed close to the dog in the X value and that the cat is close to the milk in the Y direction, but the milk and the dog are further apart. So if we then add a third dimension, then all of these words might be related as words that children first learn to read, so they have similar values in the Z direction. So I can't carry on this simile any further 'cause I can't visualize after three dimensions. But if we simply think of a list of coordinates, then you can imagine this extending into multiple dimensions mathematically.
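The cat, dog and milk picture can be written down as a list of coordinates. The numbers below are invented purely to match the description (cat near dog in X, cat near milk in Y, all three similar in Z); they are not taken from any real model.

```python
import math

# Hypothetical 3D "meaning space" coordinates (x, y, z), chosen for illustration.
embedding = {
    "cat": (1.0, 1.0, 0.90),
    "dog": (1.1, 3.0, 1.00),
    "milk": (3.0, 1.1, 0.95),
}


def distance(word_a, word_b, space=embedding):
    """Straight-line distance between two words; works for any number of dimensions."""
    return math.dist(space[word_a], space[word_b])


def nearest(word, space=embedding):
    """The other word whose coordinates lie closest to the given word."""
    others = [w for w in space if w != word]
    return min(others, key=lambda w: distance(word, w, space))
```

Nothing in `distance` depends on there being three numbers per word, which is the point of thinking in lists of coordinates: the same calculation carries over unchanged to vectors with thousands of entries.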
These coordinates can extend into many dimensions whilst maintaining a kind of meaningful relationship between the information they code for. So apparently ChatGPT, sorry, GPT-3, which is now an obsolete model, has just over 12,000 dimensions in this embedding space. You know, another dimension with more abstract concepts might include where pet cats are related to factors leading to the loss of wildlife. Anyway, these so-called foundation models can then be fine-tuned for specific tasks with smaller targeted data sets, so perhaps in the medical domain, and a variety of methods are then used to make them behave in a specific way, such as is done with the reinforcement learning from human feedback that they use to make a conversational chatbot like ChatGPT. So already these systems are at kind of genius level for language tasks. They've got a massive working memory, and they show what are known as emergent skills, where they can handle tasks that they've never encountered in training, and they can now handle multimodal data as well. So overall, does that sound a little bit like what we hope to do when we practice medicine? Possibly. On the downside, we also know that they may be biased, they might make stuff up that's blatantly untrue, and have major reasoning errors. And at the same time they'll sound scarily credible because their grammar and language will be perfect. And if we're feeling introspective, that also sounds a bit familiar to what we do as medics. But it's clear, whatever, that these are really powerful technologies, and some people are even claiming it's the start of artificial general intelligence. So the way this information is handled by these multiple layers of abstraction means that it's so complex that nobody really understands what's going on in there. But I think it's fascinating how these work and what the implications of these are.
Forgive me for going off into a bit of blatant speculation for the next bit, but at some level they must be generating some kind of reasonably accurate model of the world just from the information that they've scraped from the internet. And if this is possible, then presumably they can also be persuaded to make realistic models of humans in health and disease. So the possibilities are pretty exciting, I think. So people have talked about the idea of precision medicine for a long time, and whilst we're inching towards that, these types of technologies could really deliver a transformational change towards achieving this, I think. This diagram is from a paper published by my colleagues at the Royal Marsden regarding our ambition to move towards an integrated diagnostics and discovery system. This illustrates the strands of data about a patient, like pathology and radiology and genomics and medical records, that are currently not exploited as we'd hoped they could be, and the idea that these could be woven together to deliver both improved care for the patient and also new research insights. So this paper is about the data and the idea and the possibilities of joining this data together. It's not about these new AI systems, but it seems to me like these could go together very nicely. Being rather speculative again, this type of technology could allow us to understand the messy and complex information that's routinely collected in clinical care, and approach questions that we currently aren't able to answer because of this complexity, and even make new discoveries.
That would essentially be kind of enrolling everyone in an observational clinical trial. You could envisage such a technology having implications throughout the patient pathway, for example suggesting personalized screening based on family history, general healthcare information, and possibly genomics, in addition to using the imaging in a more advanced way than just cancer or no cancer. So if we go back to Sheila, perhaps a system that could have suggested she needed additional imaging using a different modality might have detected her cancer earlier. So once a cancer has been diagnosed, we might offer accurate, tailored prognostic information and guidance on what treatments might be best. So going back to that idea of "guess what comes next", it's exactly what we're trying to do when we're making these decisions about a patient's treatment. So for example, a model that can combine imaging from both radiology and pathology might be able to more accurately predict the utility of different chemotherapy regimes. If we add in information from the electronic health record, we could also predict the toxicity risks for that patient, and that might help us tailor further the treatment recommendations. So it could be useful for helping us with surgical planning, or making really effective and efficient post-treatment surveillance regimes rather than just sticking to a standard model. So ultimately everyone attending hospital could benefit from this kind of individualized precision approach, and in turn their data could be used to update models and improve performance for future patients. So before I finish and we move on to a question and answer session, I wanna get some questions and concerns in there, and perhaps we can all discuss these together. The reliability of these systems needs to be demonstrated, and we can do that both with retrospective data and with prospective studies, as we've discussed already.
But then we must continuously monitor the performance of these systems and actively look for adverse outcomes or drift in performance occurring. For example, we know that changes to a mammography machine will produce differences in an AI system's performance if it's been trained on the older version. It might require recalibration, or it might require additional training to overcome this. And this raises an interesting question for when a new mammography system, or any system in a similar context, if you see what I mean, is introduced: we'll have no prior cases to learn from or to test performance on. So yeah, it could put us into quite a difficult position, and particularly in the context of our rather under-resourced NHS, I can envisage a situation where we no longer have sufficient trained people to return to a human double read without any AI assistance. And so do we remain stuck with the old system? Do we have to tolerate periods of increased uncertainty about performance? Should we just keep our fingers crossed that further research and development will give us robustness of AI systems, which means that we don't have to worry about this problem so much? Or should we wait until that's actually been properly developed and delivered before we start to deploy these systems? And overreliance on these systems in general could make our hospital systems much more prone to causing serious harm when IT failures occur. And we'd hope that as we get more advanced, IT failures become less common. But I don't know, in my experience it seems like they're becoming more common, for all sorts of just crazy, quite trivial reasons. They can just knock hospital systems out for half a day or a day, so there's a real danger there. The human-machine interaction is key to this.
There's an inherent black box element to all these systems, where it's usually not possible to understand explicitly how these machines have made their decisions. There's lots of research into this, and I'm not so worried about this aspect, partly because I think by familiarity with these systems we'll get better at using them, and partly because I think there will be advancement in explainable AI, because there is lots of effort there. And after all, I don't really understand what's going on in my head. Well, I don't understand at all what's going on in my head in terms of how I make the judgment that I'm worried enough about that patient. But I do learn how to explain what I've seen on those scans to my colleagues, so I can convince them of what I've seen and they can decide whether or not I'm spouting nonsense. And at the moment we're in, I think, a relatively comfortable position, where I think in almost all circumstances the way to achieve best performance is to have a human and an AI system working together. And this is also what patients are happiest with, from what I've seen from some patient involvement events. However, I don't see any reason why the performance of these systems won't continue to improve further, and that puts us into the uncomfortable position that if I as a clinician intervene or overrule the AI decision, then on average I'll just introduce error and I'll make outcomes worse for my patients overall, though of course it might be that for that particular patient on that particular day I am right and the AI is not. I think there's an issue about, in some situations, too much information. Ignorance is bliss, isn't it? One likely outcome from all this is we'll be able to give patients lots more, and lots more accurate, information about their risks and prognosis. And this is not something that we're all that familiar with, really.
We just don't know how people will respond to this, both clinicians and patients. So currently in clinical genetics, there are very careful conversations that happen pre-testing to forewarn patients that their tests might show results with profound consequences. And that leads some people to choose not to have testing at all, and in some situations that's probably a good choice. I think that's just something we need to be aware of. And something that I'm a bit obsessed with is this aspect of who owns what, and control of these systems. And as will become more pronounced in other areas of society where AI impacts our lives, I'm not wild about the idea that adoption of these systems could cause a centralization of power and control. And this funding and control is such an interesting subject for healthcare, in my opinion. At the moment, research organizations and AI companies tend to be granted access to anonymized patient data under a variety of agreements, usually for time-limited periods. And they develop products from them, and these are owned by the companies, who will then need to make some money from them. In some ways these systems are a distillation of the data presented to them. They're a very complex digital filter. To make these products work as well as possible for all of us, as much data as possible needs to be available. And if essentially the whole population is contributing to this work, should we all somehow have a say in, and control of, what's produced? I'm not sure. So to sum up, while there are huge concerns and problems to be carefully addressed, overall I'm excited and enthusiastic about the application of AI in medical imaging and in oncology in general. The revolution alluded to in the title is underway, but it's not yet very tangible in its impact in the clinic.
So we can see that these new tools that are now beginning to be developed and deployed look like they're having real clinical benefit now. The speed and the scope of the AI developments are startling at the moment. However, the approach to medical research is necessarily much more cautious. Robust evidence from trials and practical experience from the early deployments will need to accumulate before we're ready to rely on totally novel methods. But if the new generation of AI technologies is able to deliver anything like their promise, then we might just see some entirely new approaches to some aspects of healthcare. So thanks very much indeed. I'm hoping that some people, in addition to questions, have also got comments and statements and answers to some of those things that I've raised. And yeah, so thank you.
Thank you so much, Richard, for a really wonderful, thought-provoking talk, and for sharing your children's really wonderful art <laugh>. Thank you. It was very, very good. We do have some questions on the tablet already, so if you're happy for me to ask the first one, and to the questioner who asked this, I'm going to elaborate on it a little bit if that's okay. So the question is, can this detect lobular breast cancer? And how I would like to expand upon it is: when we have the publications which looked at the concordance between the AI and obviously the human read, was there any difference between the subtypes, lobular, ductal, et cetera?
Yeah, that's a really good and important question. And people are looking into that, but I think we don't know enough about it. My observation is that the humans and the models work in a more similar way than I perhaps expected that they did. But there is more overlap between what AI models detect in terms of breast cancer compared to two humans.
So that gives an opportunity that perhaps between the two, between the human reader and the AI reader, you can detect more cancers altogether, if you then arbitrate the disagreements correctly, or somehow you make the right decision about which ones you're gonna act on, because you still get lots of false alarms as well. I think that question about lobular cancer is really interesting because it's my view that really what we should be doing with these AI systems is using massive volumes of training data, population-level volumes. And then we can start to train an AI system to look just at the most dangerous cancers, to classify things not as cancer or not cancer, but, you know, it thinks this is a grade two lobular cancer, or it thinks this is an eight millimeter possible high-grade cancer. 'Cause then we could start to use approaches where, if it's falling into that category of a much more dangerous cancer, then we accept many more false positives in that situation, false alarms, if we're gonna catch the ones that are most likely to lead to bad outcomes, compared to if we try and train for much more indolent cancers that actually may not matter, may not need diagnosing at all, because they might just stay there for the rest of the patient's life. 'Cause we know that that situation occurs as well. But to be able to train for that, you need really large volumes of cancers in your training set. And I think that's part of why the question about population-level support and participation in the technologies that are developed is important.
Thank you very much. Do you think that privacy laws and new regulations will accelerate or slow AI's integration into medicine?
Oh, well, I think it's vital that we protect the privacy of patients.
Can you imagine if you go and have your mammogram and then you start being advertised bras of a particular size that's just right? You know, it is absolutely vital, isn't it? This area of regulation of privacy is just evolving all the time. I think people quite often do feel that it impedes things, but my view is that it is just so important to get it right, because we need everyone to trust what's going on. So if it takes us a little bit longer, you know, it needs to be trusted and everyone needs to be happy with it so that we can make the best of it.
Hi, thank you very much. It's a great talk. So I'm coming from more the AI side rather than radiography; I dunno anything about radiography, to be honest. So I was just more curious about the training or data pipelines, if we think about the deep learning models for now. Like, is this data publicly available? How large are the data sets that we're talking about? Yeah, just curious about that training process.
Yeah, so there are some publicly available data sets, particularly from America actually, 'cause I think their research infrastructure demands that in certain situations you make things publicly available. So I've been involved with a CRUK project called OPTIMAM. And that started as a kind of physics project looking at what factors in mammograms made cancers either be perceptible or, you know, the kind of physics of mammography. And that changed into a big data set that's been useful for training AI systems. So that database originally came from three hospitals. I think it's now expanding to six or seven. And it's not publicly available, but the public can apply for access to it.
So academics can have access to it for a small fee, and commercial companies can have access to it for a large fee, is kind of my understanding. And that's run by CRUK, and it's the guys at the Royal Surrey County Hospital who have kind of built that.
I wonder, COVID demonstrated that the NHS provided a national basis for research, and a lot went on in Southampton. And there was a huge commitment amongst that local population to make their data available, to participate in that research. I just wondered, as regards your topical inquiry, whether concentrating on an elderly cohort would enable you to get more results, more effective results, rather than a population-wide study. Obviously older people are gonna exhibit those signs of cancer more. There's gonna be more of them, literally.
Yes. I mean, certainly that could be the case for your training cases. You can have a kind of biased population, for example oversampled, so you have many more cancers in that population than you'd see. And so you could train a system using a kind of unbalanced population like that, so long as you then also have sets of data that represent what you want it to achieve for the validation bits and then, crucially, the test sets. So yes, I think so.
So this is a question I'm really interested in your thoughts on, and indeed those in the room. For rare cancers, which prove challenging for clinical trials, how can we overcome the lack of data on them to train the AI with and have that test data?
Yeah, that's a really good question. And these systems, I mean, why breast cancer has been attractive is 'cause it's a common cancer with loads of data. And our brains are very good at being adaptable, and so when we learn to do one thing, we can kind of apply that. So yeah, that's a real problem.
It might be less of a problem for this new generation of AI systems, where they claim kind of zero-shot learning, where it can do things that you just wouldn't expect it to be able to do, that it hasn't seen in training. So who knows, that might help address it.
I think it's definitely an area where there'll be focus.
Say again, sorry?
I say I think it'll be an area where there'll be focus, considering the high unmet need for rare cancers.
Yeah.
My question was very simple, I think. So how do you think genes or silent genes would be used for detecting cancer from genomics in a person who's exhibiting cancer from your radiography screen? So how would you use the gene or a silent gene to confirm cancer, is my question. How would you detect cancer from a silent gene or a gene from your radiography?
I mean, both imaging and genomics have a really strong potential to kind of work together. I mean, the kind of genomic medicine, these kind of liquid biopsy type techniques, might well replace the need for image-based breast screening altogether ultimately. So yeah, I don't know how they'll go together, but I can imagine that working out the nuances of what's important about a cancer, both to be able to diagnose it accurately in the first place and then to know how to treat it, that could work very nicely together.
What are the biggest obstacles to get this adopted on a wide scale in the NHS?
Kind of logistics, I think. We've got quite an obsolete information system that breast imaging runs on. And there is a call out currently for running a big trial of AI in screening. So that funding call hasn't been concluded, and that trial certainly hasn't started.
But I think the hope from the people who wrote that call is that that trial will also kind of force a system to be developed that can then be turned into the next generation of screening information system.
Thank you very much again, Richard.
Well, thank you.
Thank you, Nikal, thank you to Novartis for sponsoring it, and we're very much looking forward to seeing you at the next lecture. Thank you.
Thanks.