Jennifer is a speaker for ODSC East 2020 this April 13-17 in Boston. Be sure to check out her talk, “Impacting Clinical Development with Advanced Analytics: Challenges & Opportunities,” there!

Why did I become a statistician in Big Pharma? When I completed my training, I had never considered it. Big Pharma statisticians were generally considered pencil pushers, engaged in the most boring work possible. But when a position opened up as a research statistician at Genentech (a member of the Roche group), I wanted it, because Genentech has always had a reputation for being at the forefront of biological science, and research in particular seemed more promising in terms of interesting work. With Roche's acquisitions of Flatiron Health and Foundation Medicine and partnerships such as the one with PicnicHealth, science is now more broadly defined to include data science as well. With this large-scale transformation occurring within the company, opportunities to pursue methodology development on novel data types now exist in all functions of the company, and embedded statisticians are in turn uniquely positioned to directly impact our pipeline.

Pharma generally recognizes the need to embrace change in how we analyze and handle data internally, a process clinical teams did not prioritize in the past, since the end result of a trial is meant to be a filing, not a curated data mart. Now that we are focused on curated and integrated datasets moving forward, there is naturally a commensurate pressure to gain additional insights from this data. Expectations are higher: we have already run the logistic regression or mixed model for repeated measures (MMRM) on the primary, secondary, and exploratory endpoints. In order to justify the heavy lift needed to wrangle curated data out of our incurred technical debt, the algorithm used to analyze this data must be proportionally complex. Terms start flying around like advanced analytics, machine learning, deep learning, and artificial intelligence.

What is the bar to adopt these analytical methodologies? The core of Big Pharma is to develop drugs and transition new molecules through the pipeline by building progressively stronger evidence in support of helping patients. We expect that our clinical development plans are informed by data and evidence. Just like any modeling exercise, there are training and test data. In this setting, the training data is a large non-interventional cohort, which does not typically reflect the highly ascertained patient population of a trial. As an example, this atezolizumab trial in Non-Small Cell Lung Cancer has 35 inclusion/exclusion criteria, many of which are based on variables that will not typically be collected in an observational cohort or electronic health record (EHR) database. This is simply the first obstacle, but assuming that we can clear the bar, what are the remaining obstacles unique to Pharma that we will need to address? What do our first forays at the forefront of integrating data science into our pipelines look like? Come find out at my talk, where I would love to hear your feedback!
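To make the train/test mismatch concrete, here is a minimal sketch of applying trial-style eligibility criteria to a synthetic observational cohort. The variables and the three criteria are made up for illustration (they are not taken from the atezolizumab protocol); the point is simply that even a handful of criteria can shrink the population the model was trained on, before considering the many protocol variables an EHR never captures at all.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical observational cohort with a few variables an EHR might hold.
cohort = {
    "age": rng.normal(65, 12, n),          # age in years
    "ecog": rng.integers(0, 4, n),         # ECOG performance status 0-3
    "prior_lines": rng.integers(0, 5, n),  # prior lines of therapy
}

# Three illustrative trial-style inclusion criteria (assumed, not from
# any real protocol): age 18-75, ECOG 0-1, at most one prior line.
eligible = (
    (cohort["age"] >= 18) & (cohort["age"] <= 75)
    & (cohort["ecog"] <= 1)
    & (cohort["prior_lines"] <= 1)
)

print(f"{eligible.mean():.1%} of the cohort meets all three criteria")
```

A real protocol with 35 criteria, many of them unmeasurable in the observational source, only widens this gap between training cohort and trial population.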

More on the author/speaker:

Jennifer Tom joined Genentech in 2014 as a statistician in bioinformatics and computational biology and moved into product development in 2018. She has supported human genetics, microbiome, imaging, early clinical development, and biomarker activities across various non-oncology indications. Previously she worked as a software engineer at Agilent and as the Neyman Visiting Assistant Professor of Statistics at UC Berkeley. Jennifer received a BA in Molecular and Cellular Biology from UC Berkeley and an MS and PhD in Biostatistics from UCLA.