Artificial Intelligence in Drug Development
Innovations and Challenges
Donna Snyder, MD, MBE, Executive Physician, WCG
Artificial Intelligence (AI) has the potential to impact all areas of drug development from the identification of viable compounds to the recruitment of participants, design of the protocol, analysis of data and everything in between. AI may have transformative effects on drug development if potential risks and biases are appreciated.

Although Artificial Intelligence (AI) took root in the mid-twentieth century, shortly after the advent of computers, it has advanced rapidly over the last several years as computers have gained the capacity to handle and analyze large datasets far faster than the human mind can. The release of the ChatGPT (Generative Pretrained Transformer) chatbot in November 2022 introduced the public to the power of one application of AI, Large Language Models (LLMs). It is no wonder that AI already has, and will continue to have, an impact on the development of pharmaceutical products. As these tools become more powerful, it is imperative to understand where AI might be implemented and to recognize the benefits and risks of its use in drug and biologic product research.
Candidate Drug Selection
Before clinical research can begin, promising candidate drugs must be identified. Many large pharmaceutical companies and some start-ups are using AI as part of the drug discovery process. One example is the development of tools to identify molecules or proteins based on a causal relationship between a target and a disease. This may take the form of a literature-based search to identify relevant target/disease pairs or an analysis of an inferred reaction using tools such as SPIDER (self-organizing map-based prediction of drug equivalence relationships). Other areas where AI may play a role are evaluating drug-drug interactions, optimizing compounds using predictive models, and evaluating disease mechanisms. One challenge with the use of AI in drug discovery is that, because the datasets available for developing models are so varied and variable, researchers may have difficulty being confident that the predictions are accurate. Currently, deep learning models (DLMs) are complex and may not be transparent or easily interpretable. Methods must be developed to optimize and query these models to help ensure the accuracy of their results.
Once a viable compound has been identified, AI can aid evaluation of the product in non-clinical animal models. AI models that evaluate the toxicity of products before use in human participants may speed up product development and reduce the number of animals needed to conduct the research. Computational tools can also be used to predict pharmacokinetic (PK) and pharmacodynamic (PD) responses of drug products prior to investigation in humans or in animals. Using AI/machine learning (ML) recurrent neural network models to complement traditional PK/PD methods may improve the predictive accuracy of the data prior to use in clinical trials, especially when dealing with complex PK/PD data analysis sets.
Clinical Trial Operations
AI has the potential to impact all aspects of clinical trial design and operation, such as site selection, recruitment of participants, selection of eligible participants, stratification when appropriate, dose selection and optimization, study adherence, study retention, and data analysis. AI can also be used to enable predictions of disease outcomes or progression to improve study design. Recruiting sufficient participants to complete the clinical trial is paramount, so appropriate site selection is key to a successful trial. Electronic Medical Records (EMRs) can be screened using AI to identify sites that will have the most eligible participants, or algorithms could be used to evaluate site performance to establish which sites might be more likely to stay on schedule or fall behind. One concern noted when using automated methods to screen for eligible sites is that the sites with smaller populations or low volumes of clinical trials could be missed using these methods.
Participant Recruitment
Regarding recruitment, AI can shorten the time to enrollment in clinical trials, reduce workload by as much as 90%, and improve the accuracy of identification of appropriate study participants. Rather than researchers extracting data manually, AI systems can rapidly sort through EMRs using natural language processing (NLP) methods, select appropriate participants, and assemble study cohorts. Neural NLP can analyze notes and free text to further evaluate the appropriateness of enrolling certain individuals in a trial. Barriers to implementing AI in participant recruitment include difficulties in merging participant information when the data is dispersed across different departments and institutions, and the lack of clear guidelines on how the data is processed and identified by the AI system (for example, whether the system relies on an AI model, which may have unidentified biases, to identify participants). Describing within the protocol how AI is used as part of recruitment would help in understanding and evaluating the reliability and efficiency of recruitment using AI methods and should be encouraged.
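To make the screening step concrete, the following is a minimal sketch of a pre-screen over EMR-style records. It is a deliberately simple rule-based keyword filter standing in for the neural NLP methods mentioned above, not an implementation of them. The record fields (`id`, `age`, `note`), the eligibility terms, and the sample records are all hypothetical.

```python
def is_candidate(record, min_age=18, max_age=75,
                 required_terms=("type 2 diabetes",),
                 excluded_terms=("pregnan",)):
    """Return True if a record passes a crude eligibility pre-screen.

    Assumption: each record is a dict with an integer 'age' and a
    free-text 'note'. Matching is a plain substring search.
    """
    if not (min_age <= record["age"] <= max_age):
        return False
    note = record["note"].lower()
    has_required = all(term in note for term in required_terms)
    has_excluded = any(term in note for term in excluded_terms)
    return has_required and not has_excluded


# Hypothetical records for illustration only.
records = [
    {"id": "A", "age": 54, "note": "History of type 2 diabetes, on metformin."},
    {"id": "B", "age": 16, "note": "Type 2 diabetes diagnosed last year."},
    {"id": "C", "age": 33, "note": "Type 2 diabetes; currently pregnant."},
    {"id": "D", "age": 61, "note": "Hypertension only."},
]

eligible = [r["id"] for r in records if is_candidate(r)]
print(eligible)
```

A filter like this would wrongly accept a note reading "no history of type 2 diabetes," which is one reason free-text screening tends to need more capable language models, and why documenting the selection logic in the protocol matters.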
Trial Enrichment
Enrichment is a process whereby participants are selected for trials based on specific characteristics that make them more likely to respond to a treatment. If AI predictive models could select participants to enrich a trial, the number of participants needed to complete the trial may be reduced. As a result, the trial may be easier to complete because of increased efficiency and feasibility. However, the drawback to such an approach may be that the results of the study might not be generalizable. This is not a drawback related to the use of AI in clinical trials per se, but an overall issue with enrichment. As precision medicine becomes more mainstream and more products are targeted to individuals with specific genetic or disease characteristics, this may become less of a concern.
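The effect of enrichment on trial size can be illustrated with the standard normal-approximation formula for a two-arm comparison of means, n per arm = 2(z_(1-alpha/2) + z_power)^2 (sd / effect)^2. The sketch below assumes, purely for illustration, that enrichment doubles the expected treatment effect; the effect sizes and standard deviation are hypothetical numbers, not values from any trial.

```python
import math
from statistics import NormalDist


def per_arm_sample_size(effect, sd, alpha=0.05, power=0.80):
    """Participants per arm for a two-sided, two-arm comparison of means
    (normal approximation, equal allocation, equal variances)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) * sd / effect) ** 2)


# Hypothetical scenario: enrichment doubles the expected effect size.
unselected = per_arm_sample_size(effect=2.0, sd=10.0)
enriched = per_arm_sample_size(effect=4.0, sd=10.0)
print(unselected, enriched)
```

Because the required sample size scales with the inverse square of the effect size, doubling the expected effect cuts the number of participants per arm to roughly a quarter under these assumptions.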
Refining Dosing
AI models and methods could be used to further refine dosing within a trial, or to help establish dosing for special populations, such as children, where rigorous PK/PD studies may be challenging due to ethical concerns. As modeling and simulation algorithms are refined, potentially effective doses can be identified for use in clinical trials. In pediatric studies, data from adults may be used to predict doses in children, eliminating the need for dedicated PK trials. Studies can be designed using an adaptive approach that minimizes the number of participants needed to evaluate the safety and effectiveness of the product, which can streamline product development. For example, nonparametric Bayesian learning can help with dose selection using Bayesian logistic regression models that allow data-driven borrowing across multiple populations to improve the accuracy of the estimated optimal dosing level.
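As a rough illustration of model-based dose selection, the sketch below fits a one-parameter Bayesian logistic dose-toxicity model by grid approximation and picks the dose whose posterior mean toxicity is closest to a target rate. It is a simplified single-population example and does not implement the nonparametric borrowing across populations described above. The fixed intercept, the flat prior on the slope, the 25% target, and the dose and outcome data are all assumptions made for illustration.

```python
import math


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def posterior_mean_toxicity(doses, treated, toxic, intercept=-3.0):
    """Posterior mean toxicity probability at each dose.

    Assumed model: P(tox | dose) = logistic(intercept + slope * log(dose / doses[0])),
    with a fixed intercept and a flat prior on slope over a grid of 0.05..5.0.
    """
    x = [math.log(d / doses[0]) for d in doses]
    slopes = [0.05 * k for k in range(1, 101)]
    weights = []
    for b in slopes:
        log_lik = 0.0
        for xi, n, y in zip(x, treated, toxic):
            p = logistic(intercept + b * xi)
            log_lik += y * math.log(p) + (n - y) * math.log(1.0 - p)
        weights.append(math.exp(log_lik))  # flat prior: posterior is proportional to likelihood
    total = sum(weights)
    return [
        sum(w * logistic(intercept + b * xi) for w, b in zip(weights, slopes)) / total
        for xi in x
    ]


def recommend_dose(doses, treated, toxic, target=0.25):
    """Pick the dose whose posterior mean toxicity is closest to the target."""
    tox = posterior_mean_toxicity(doses, treated, toxic)
    best = min(range(len(doses)), key=lambda i: abs(tox[i] - target))
    return doses[best], tox


# Hypothetical cohort data: participants treated and toxicities seen per dose (mg).
doses = [10, 20, 40, 80]
dose, tox = recommend_dose(doses, treated=[3, 3, 3, 3], toxic=[0, 0, 1, 2])
print(dose, [round(t, 3) for t in tox])
```

In an adaptive design, a calculation of this kind would be repeated as each cohort's outcomes arrive, so that later participants are assigned doses informed by all data collected so far.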
Participant Engagement
Once participants are enrolled in the clinical trial, it is important to ensure that they are engaged in the trial and willing to adhere to study requirements. AI tools may help with this process by improving communication, evaluating drug compliance, and accurately collecting data on study outcome measures. For example, smartphone applications (apps) might send medication reminders and might even use facial recognition to evaluate whether a drug has been taken. Smart pill boxes can track whether medication has been accessed. Chatbots can provide personalized text messages to participants to keep them interested in the research, and robotic assistants can help with disease management. According to one article, the use of AI robotic assistants helped children with self-management of diabetes and insulin control. Passive collection of data through clinical practice or digital health trackers might also automate data collection and eliminate or reduce the number of study visits required. Use of these tools may result in better clinical outcomes, better data collection, and less attrition, and may help participants remain engaged in the trial.
Participant Challenges
However, some limitations may exist when incorporating AI tools as part of research procedures. Some participants may have problems using the platforms or may have performance anxiety related to use. Others may not trust the app or prefer talking to a real person about their health rather than a computer. This may result in less diversity in the clinical trial because only those who have experience using apps overall may be comfortable using them in the context of a clinical trial. Participants with cognitive delay, from lower socioeconomic backgrounds, or on multiple drugs, who might be at risk for non-adherence, may find these systems harder to use than traditional methods.
Data Management
Data management and analysis are critical aspects of clinical trials. AI tools can be used to integrate data within a clinical trial by converting data to compatible formats and imputing missing data elements. Duplicate entries can be removed, and data can be better curated for analysis. AI can also be used to monitor the quality of data from studies and sites more effectively. This is done using a technique called “smart monitoring,” in which AI/ML tools learn from trial data as it accumulates through a series of cross-database checks. An advanced form of risk-based management using these AI tools could be developed that improves efficiency and reduces cost. If quality is improved, this could have a “transformative effect on [the] sponsor’s ability to protect patient safety, reduce trial duration, and trial cost.” Safety signals can be assessed while clinical trials are in progress using AI/ML models that evaluate various sources, such as social media or digital health tracking information. Such evaluations may not be possible using traditional methods. Regulatory agencies are evaluating how AI/ML might be used to process individual case safety reports and to detect and evaluate safety findings when conducting post-marketing safety surveillance.
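The data integration steps described at the start of this section (removing duplicate entries and imputing missing elements) can be sketched as follows. Median imputation is used here as a simple stand-in for the ML-based imputation an AI tool might apply. The field names (`participant_id`, `visit`, `sbp`) and the sample rows are hypothetical.

```python
from statistics import median


def clean_records(records, key_fields=("participant_id", "visit"), numeric_field="sbp"):
    """Drop duplicate rows (same key fields) and fill missing numeric values.

    Missing values (None) are replaced with the median of observed values,
    and every row is flagged so imputed data stays traceable.
    """
    seen = set()
    unique = []
    for rec in records:
        key = tuple(rec[f] for f in key_fields)
        if key in seen:
            continue  # keep only the first occurrence of each key
        seen.add(key)
        unique.append(dict(rec))  # copy so the input is not modified

    observed = [r[numeric_field] for r in unique if r[numeric_field] is not None]
    fill = median(observed) if observed else None
    for r in unique:
        r["imputed"] = r[numeric_field] is None
        if r["imputed"]:
            r[numeric_field] = fill
    return unique


# Hypothetical rows: one exact duplicate and one missing blood pressure value.
rows = [
    {"participant_id": "P01", "visit": 1, "sbp": 120},
    {"participant_id": "P01", "visit": 1, "sbp": 120},
    {"participant_id": "P02", "visit": 1, "sbp": None},
    {"participant_id": "P03", "visit": 1, "sbp": 140},
]
cleaned = clean_records(rows)
print(cleaned)
```

Flagging imputed values, as the `imputed` field does here, keeps the curated dataset auditable, which matters when the same data later feeds quality monitoring or safety analyses.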
Protocol Design
Finally, AI may be used to facilitate protocol design. Digital twins have been proposed to either replace or reduce the number of participants that will be assigned to a placebo arm of a trial. “A digital twin is a computer simulation that allows [researchers] to generate biologically realistic data of a target patient.” Participants may be more interested in participating since they know they will be more likely to receive the study treatment. Studies may be easier to complete because fewer participants are necessary to complete the trial. Other examples using AI to facilitate study design include evaluating previous trial designs to see what designs are more likely to be successful and using those designs for new studies.
Challenges with AI
AI has the potential to significantly improve the efficiency, speed, and safety of clinical trials and of drug development, but as already alluded to within this article, there are potential risks and harms that may be introduced when using these technologies in clinical research. The output generated by an AI model is only as good as the data that is used to build the model. For example, an AI model used to identify participants for a trial that does not pull from a heterogeneous and racially diverse set of potential participants may bias the study results. Similarly, an AI model used to generate molecules that might be effective in a particular disease condition must draw from robust, reliable datasets on the molecule’s potential action or composition to generate viable compounds. Additionally, a certain level of detail in the data may be necessary to construct a valid AI model, but this may require utilizing data that might be considered private, and an individual’s confidential information might be inadvertently compromised. Unlike ChatGPT, which was created by pulling data from public sources, models for research may need to be developed from private sources (such as EMRs). Adequate protections must be in place to de-identify that information so that it does not result in harm to participants.
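One small piece of the protection described above can be sketched in code: removing direct identifiers from a record and replacing the medical record number with a keyed pseudonym before the data is used for modeling. This is pseudonymization only, which is a weaker guarantee than full de-identification, since remaining fields such as dates or rare diagnoses can still identify a person. The field names, the list of direct identifiers, and the secret key are hypothetical.

```python
import hashlib
import hmac

# Assumed set of fields treated as direct identifiers in this example.
DIRECT_IDENTIFIERS = {"name", "address", "phone", "mrn"}


def pseudonymize(record, secret_key, id_field="mrn"):
    """Return a copy of the record without direct identifiers, plus a
    stable pseudonym derived from id_field with a keyed hash (HMAC-SHA256)."""
    token = hmac.new(
        secret_key, str(record[id_field]).encode("utf-8"), hashlib.sha256
    ).hexdigest()[:16]
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    cleaned["pseudo_id"] = token
    return cleaned


# Hypothetical record and key for illustration only.
patient = {"mrn": "12345", "name": "Jane Doe", "age": 54, "diagnosis": "T2DM"}
print(pseudonymize(patient, secret_key=b"example-key-kept-outside-the-dataset"))
```

Using a keyed hash rather than a plain hash means the pseudonym cannot be recomputed from a known record number without the key, while the same participant still maps to the same pseudonym across data sources.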
Regulatory Perspective
Members of the European Parliament passed the European Union AI Act in March 2024, and the United States (US) issued an executive order in October 2023 to establish guidelines that may ultimately lead to regulations for the safe and trustworthy use of AI in the US. These initiatives should provide, over time, significant oversight of AI development and implementation. Meanwhile, given the pace at which AI technology is being implemented, it is imperative that models used in research are developed with a focus on three principles: transparency (knowing how the model is constructed), accountability (explaining why certain decisions were made in creating the model), and responsibility (protecting the privacy and confidentiality of the data employed), in order to prevent inadvertent harms associated with the use of AI. There also appears to be consensus that keeping a human in the loop whenever possible may raise trust levels when AI is utilized in place of processes that would typically be performed by humans.
Conclusion
In summary, AI has the potential to accelerate the drug development process by streamlining many aspects of drug discovery, protocol design and implementation, study analysis and safety monitoring, and innumerable areas in between these steps. However, it is important to recognize that in addition to the benefits, there may be harms associated with the use of these tools. Understanding the various areas where AI may impact the drug development process and considering the impact on individuals and the research enterprise in general are the first steps in ensuring that AI has a positive impact in drug and biological product development.
References
1. Anyoha R. The History of Artificial Intelligence. Science in the News. [Online] Harvard University, August 28, 2017. [Cited: March 15, 2024.] https://sitn.hms.harvard.edu/flash/2017/history-artificial-intelligence
2. Farina M, Lavazza A. ChatGPT in society: emerging issues. Frontiers in Artificial Intelligence. Sec. Natural Language Processing, 2023, Vol. 6.
3. Reker D, Rodrigues T, Schneider P, Schneider G. Identifying the macromolecular targets of de novo-designed chemical entities through self-organizing map consensus. Proc Natl Acad Sci USA. March 3, 2014, Vol. 111, 11.
4. Qureshi R, Irfan M, Gondal TM, Khan S, Wu J, Hadi MU, Heymach J, Le X, Yan H, Alam T. AI in drug discovery and its clinical relevance. Heliyon. June 26, 2023, Vol. 9, 7.
5. Vijayan R, Kihlberg J, Cross J, Poongavanam V. Enhancing preclinical drug discovery with artificial intelligence. Drug Discovery Today. April 2022, Vol. 27, 4, pp. 967-984.
6. FDA. About Alternative Methods. www.fda.gov. [Online] [Cited: March 23, 2024.] https://www.fda.gov/science-research/advancing-alternative-methods-fda/about-alternative-methods.
7. FDA. Using Artificial Intelligence & Machine Learning in the Development of Drug & Biologic Products.
8. Askin S, Burkhalter D, Calado G, El Dakrouni S. Artificial Intelligence Applied to clinical trials: opportunities and challenges. 2023, Vol. 13, 2.
9. Laaksonen N, Bengtström M, Axelin A, Blomster J, Scheinin M, Huupponen R. Clinical trial site identification practices and the use of electronic health records in feasibility evaluations: An interview study in the Nordic countries. Clin Trials. 2021, Vol. 18, 6.
10. Ismail A, Al-Zoubi T, El Naqa I, Saeed H. The role of artificial intelligence in hastening time to recruitment in clinical trials. BJR Open. May 16, 2023, Vol. 5, 1.
11. FDA. Enrichment Strategies for Clinical Trials to Support Determination of Effectiveness of Human Drugs and Biological Products, Guidance for Industry. 2019.
12. FDA. Adaptive Designs for Clinical Trials of Drugs and Biologics Guidance for Industry. [Online] 2019. [Cited: March 24, 2024.] https://www.fda.gov/media/78495/download.
13. Kolluri S, Lin J, Liu R, Zhang Y, Zhang W. Machine Learning and Artificial Intelligence in Pharmaceutical Research and Development: a Review. AAPS J. January 4, 2022, Vol. 24, 1.
14. Babel A, Taneja R, Mondello Malvestiti F, Monaco A, Donde S. Artificial Intelligence Solutions to Increase Medication Adherence in Patients With Non-communicable Diseases. Front Digit Health. June 29, 2021, Vol. 3.
15. Zhang X, Yan C, Gao C, Malin BA, Chen Y. Predicting Missing Values in Medical Data via XGBoost Regression. J Healthc Inform Res. December 2020, Vol. 4, 4, pp. 382-394.
16. Emmert-Streib F, Yli-Harja O. What Is a Digital Twin? Experimental Design for a Data-Centric Machine Learning Perspective in Health. Int J Mol Sci. 2022, Vol. 23, 21.
17. Hutson M. Cutting to the Chase. Nature. 2024, Vol. 627, pp. 82-85.
18. European Commission. Shaping Europe's Digital Future. [Online] March 6, 2024. [Cited: March 28, 2024.] https://digital-strategy.ec.europa.eu/en/policies/regulatory-framework-ai.
19. The White House. Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence. [Online] October 30, 2023. [Cited: March 28, 2024.] https://www.whitehouse.gov/briefing-room/presidential-actions/2023/10/30/executive-order-on-the-safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence/.
20. Radclyffe C, Ribeiro M, Wortham RH. The assessment list for trustworthy artificial intelligence: A review and recommendations. Front Artif Intell. March 9, 2023, Vol. 6.
21. Laux J, Wachter S, Mittelstadt B. Trustworthy artificial intelligence and the European Union AI act: On the conflation of trustworthiness and acceptability of risk. Regul Gov. January 2024, Vol. 18, 1, pp. 3-32.
