Microbiome Biostatistician


Los Angeles, California

Date Posted January 16, 2023
Industry Clinical Research
Specialty Not Specified
Job Status Full-time
Salary Not Specified



Seed Health is a microbiome science company pioneering innovations in probiotics and living medicines to impact human and planetary health. Our scientific board comprises leading scientists, researchers, and clinicians across the fields of microbiology, immunology, bioinformatics, dermatology, oral health, vaginal health, gastroenterology, mental health, pediatrics, and nutrition.

Consumer innovations are commercialized under Seed® with a mission to bring much-needed precision, efficacy, education and perspective-shifting science communication to the global category of probiotics. Our efforts to set a new standard in probiotics, microbial innovation and translational communication have earned various accolades, including Fast Company's World Changing Ideas in 2019, 2020, 2021, and 2022, and TIME's Best Inventions 2018.

We focus on categories where microbial innovations and microbiome-related products will disrupt and capture global market share in the coming years (e.g. oral care, skin care, infant health, etc.). We have built a strong foundation with breakthrough research and strong intellectual property, and believe there is a substantial opportunity to emerge as an innovative product line, backed by the most rigorous science.

Who You Are

You are passionate about diving into big microbiome datasets (e.g. from clinical trials or multiple cohorts). In your current role, you are immersed in statistical analysis and work fluently with large datasets. You are capable of designing and executing both detailed and open-ended analysis plans in a fast-paced environment. Ideally, you have familiarity with the nuances and requirements of clinical trial or human cohort datasets, including the need for detailed documentation, accurate metadata, statistical analysis plans, regulatory compliance, and cohort or clinical trial designs.

Teachers and mentors would describe you as a focused, analytical mind fluent with statistical techniques and able to apply them quickly. You consistently deliver impeccable work using large data sets (such as multi-omics) and can communicate the results in biological context to a diverse team with multiple backgrounds. You conduct the type of work that drives forward innovation and you effectively collaborate and communicate with a highly intellectual, diverse team and with company stakeholders.

Importantly, you respect deadlines, starting on time and finishing on time. In conversation, you're quick on your feet, employ an impressive vocabulary, and rarely fumble for words. You hold yourself and colleagues accountable for deliverables, work fast, and loathe stagnancy and procrastination. You have a proven track record in an R&D or clinical analysis team using an array of biostatistics and other statistical techniques to interrogate clinical datasets containing various microbiological (sparse compositional count data) and clinical or human cohort (e.g. human health) data. You can hit the ground running and launch into executing a suite of statistical analyses appropriate to the clinical trial or cohort design on high-throughput, high-dimensional omics/microbiome human data, under tight deadlines. Ideally, you are familiar with concepts in clinical trial analysis, including power calculation, differences between primary/secondary/exploratory endpoints, and various trial designs including longitudinal vs observational sampling, nesting, multiple arms, confounding variables, blinding, interim analysis (alpha spending function), how to pre-specify appropriate statistical methods for a study up front before data arrival (in contrast to post hoc exploratory analysis afterward), etc.

Overall, you will help Seed maintain a data-driven culture that reflects the intentionality of our brand and the human-centric ethos of our company.
You can rapidly become proficient in running bioinformatics analyses internally and/or using cloud resources. You obsess over data, new trends, technologies, and the insights that come from experience and deep knowledge of our brand.

Projects under your scope will proceed quickly across many different dimensions – you are comfortable independently organizing, logging, and communicating your thoughts and processes. You have both willingness and ability to speak up, think critically, and explain the complexity of biostatistics to people at all levels of mastery.

What You'll Do

  • Integrate and analyze human microbiome clinical trial and cohort data, including use of count tables derived from metagenomic sequencing
  • Use bioinformatic tools for precision pre/probiotic design and microbial comparative genomics
  • Analyze clinical trial data with respect to primary/secondary endpoints, for multiple trials
  • Analyze clinical trial data post hoc for the development of new microbial therapies
  • Use clinical biostatistical best practices for each analysis, including methods optimized for cross-sectional and time series experiments, control for confounders and study structure, and report significance and effect sizes
  • Contribute to clinical trial (statistical) design, SAPs (statistical analysis plans), analysis documentation, and/or academic papers for post hoc analysis
  • Communicate biostatistical analysis results with academic and industry partners around the globe

Required Qualifications

  • PhD / MA + 2 years of industry experience, or BA + 4 years of industry experience, in biostatistics or a closely related statistical discipline
  • Experience in analysis of high dimensional biological datasets
  • Knowledge of the R programming language, the Bioconductor repository, and various biostatistics packages (other statistical languages optional, e.g. SAS, SPSS, and Stata)
  • Ability to work in dynamic and fast-paced environments with tight deadlines
  • Experience with clinical trial or cohort design and/or relevant statistical procedures, including SAPs (statistical analysis plans) and pre-specification of analysis methods
  • Expert knowledge of and proven experience with clinical trial-applicable biostatistics using sparse high-dimensional data (such as 'omics). Please see the desired qualifications for examples (not all are required, but many or most are).

Desired Qualifications

  • Working knowledge or willingness to learn microbiome bioinformatics workflows in a Unix/Linux environment (to generate the variables and count tables needed for statistical analysis)
  • Experience working with WGS microbiome short-read metagenomic datasets
  • Experience with machine learning, including random forests, statistical models, classification, clustering
  • Big data handling and raw data storage, infrastructure, database generation
  • AWS/cloud compute (or other HPC environment for running software)
  • Algorithms and applied discrete mathematical modeling of microbial ecosystems
  • Comparative genomics expertise
  • Present or communicate results to a wider audience, including reports, meetings, and academic publications
  • Familiarity with the majority of the following statistical techniques:
    • Common clinical trial designs, structures, endpoints (primary, secondary, exploratory)
    • Longitudinal population analysis, including repeated-measures ANOVA, Friedman test, MANOVA, GEE (generalized estimating equations), MER (mixed effect regression, including LME models)
    • Correlation and integration of omics (e.g. mixOmics, Mantel tests, etc.)
    • Subgroup identification and analysis (e.g. responder vs non-responder analysis)
    • Data imputation techniques, including zero-imputation and zero-inflated distributions (ZINB, ZIP), hurdle models, and associated statistical tests
    • Power calculation
    • Nested clinical trial considerations
    • Multiple arm trial considerations (e.g. use of two-way ANOVA, Dunnett's test, etc.)
    • Methods for adjusting and controlling for confounding variables
    • Interim analysis considerations (including alpha spending function)
    • Robust toolkit of univariate and multivariate statistics and statistical models with covariates and interactions
    • Non-parametric statistics (Wilcoxon rank-sum, Friedman, Shapiro-Wilk, Kruskal-Wallis, rank correlations, etc.)
    • Ecological alpha and beta diversity metrics (Shannon index, Bray-Curtis dissimilarity, etc.), including longitudinal or delta correlation thereof

The annual pay range for this full-time position is $110k-$180k + equity + benefits across all US locations (this position is 100% remote-US). Our pay ranges are guided by discipline, level, and experience required. Within the range, individual pay may vary based on additional factors, including your specific location, desired skills/technical competency, relevant experience, and advanced education/training. Benefits include: Medical, Dental, Vision, Life, AD&D, LTD, Mental Wellness, EAP, Wellness Stipend + 401(k) match.

Seed is an equal opportunity employer. For us, diversity isn't an HR metric—it is the result of billions of years of evolution; it's our nature. To serve our community inclusively means to cultivate a relative abundance of perspectives, backgrounds, geographies, and experiences. Like in biology, each role and its function is key to the productivity, sustainability, and resilience of our ecosystem.