Example projects

Case studies in turning data into clear, defensible decisions.

De-identified summaries · Real problems.

The examples below are simplified and anonymized, but they reflect the types of questions we regularly help with: probability and risk modeling, sample size and design questions, assay interpretation, and relationships among complex measurements.

Details are adjusted to protect client confidentiality, but the statistical thinking and structure of the solutions mirror the real work.

We do not share client data. These writeups are high-level narrative summaries only.

Case studies

Selected de-identified examples.

Each case shows a different type of question, but the same overall pattern: clarify the objective, build an appropriate statistical framework, and turn the results into recommendations that support real-world choices.

Batch assay result near the lower limit

Manufacturing · Assay reliability · Sample size

A manufacturing team observed an average assay result just under 96% for a large batch of product. This was technically within the specification range but noticeably below the long-term trend, which had been close to 100%. The question went beyond a simple “pass or fail” call:

  • Setting: Biologic product, routine release assay
  • Core question: How many units should we re-test to be highly confident the true mean is not below a critical threshold?
  • We translated the concern into a one-sided statistical question: with known assay variability (on the order of 1%), what sample size is needed to achieve a pre-specified confidence level that the true batch mean is above a minimum acceptable value?
  • Using standard sampling theory and conservative assumptions, we derived the smallest sample size that would yield roughly 99% confidence that the true mean is not below the critical threshold, assuming the observed average holds up on retesting (a simplified version of this calculation is sketched after the list).
  • We provided a simple decision rule: if the re-sampled mean and variability remained within defined bounds, the batch could be released with clearly documented statistical justification.
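
A minimal sketch of the underlying calculation, with purely illustrative numbers: the threshold, observed mean, and assay SD below are hypothetical placeholders, and the assay SD is treated as known.

    from math import ceil
    from scipy.stats import norm

    sigma = 1.0           # assumed known assay SD, in percent units (illustrative)
    observed_mean = 95.8  # hypothetical retest mean, "just under 96%"
    threshold = 95.0      # hypothetical minimum acceptable true batch mean
    conf = 0.99           # one-sided confidence level

    z = norm.ppf(conf)    # ~2.33 for 99% one-sided confidence
    # Require: observed_mean - z * sigma / sqrt(n) >= threshold
    n = ceil((z * sigma / (observed_mean - threshold)) ** 2)
    print("smallest sample size:", n)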

Relating three test methods with different units

Method relationships · CT / TOC / BIO style data

A client used three different test methods that measure related but not identical aspects of a product. Each method reported results in different units and on different scales, making direct comparison difficult.

  • Setting: Three analytical methods with different units
  • Core question: Do the readings move together, and can one test help predict another?
  • We first visualized the data using standardized scales, correlation plots, and simple regression fits, so trends were visible without requiring the client to think in transformed units.
  • We then fit regression models where each method in turn was treated as a “response” and the other methods as predictors, checking linearity, residual patterns, and uncertainty in the estimated relationships (the sketch after this list illustrates the workflow).
  • The final report focused on plain-language conclusions: where readings tended to agree, where they diverged, and in what ranges it was reasonable (or not) to treat one method as a proxy for another.
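
A minimal sketch of that workflow on simulated stand-ins for the three methods; the column names (“ct”, “toc”, “bio”), the relationships between them, and all values are hypothetical, not client data.

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    # Simulated stand-ins for three related methods on different scales
    rng = np.random.default_rng(1)
    n = 60
    ct = rng.normal(50, 5, size=n)
    toc = 0.8 * ct + rng.normal(0, 3, size=n)        # loosely tracks ct
    bio = 120 - 1.5 * ct + rng.normal(0, 8, size=n)  # moves opposite to ct
    df = pd.DataFrame({"ct": ct, "toc": toc, "bio": bio})

    # Put all methods on a common standardized scale, then check correlations
    standardized = (df - df.mean()) / df.std()
    print(standardized.corr().round(2))

    # Treat each method in turn as the response and the others as predictors
    for response in df.columns:
        X = sm.add_constant(df.drop(columns=response))
        fit = sm.OLS(df[response], X).fit()
        print(response, "R^2 =", round(fit.rsquared, 3))
        # Plots of fit.resid against fitted values flag curvature or
        # unequal spread that a single linear fit would gloss over.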

Sampling when a suspect load is mixed with good product

Probability · Hypergeometric logic · Practical sampling

Due to a procedural error, a suspect autoclave load (potentially over-processed) was accidentally mixed with correctly processed loads of the same product. All units were now indistinguishable.

  • Setting: One suspect subgroup mixed into a larger batch
  • Core question: How many units must be sampled so that, with high probability, the sample contains at least a small number of units from the suspect group?
  • We represented the situation with a finite-population model: a known number of “suspect” units and a known number of “non-suspect” units combined into a single pool.
  • Using the hypergeometric distribution, we computed the probability of capturing at least a target number of suspect units for different sample sizes (see the sketch after this list).
  • From this, we recommended a minimum sample size that achieved roughly a 95% chance of including at least the desired number of suspect units, along with a simple table the client could use in similar situations.
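
A minimal sketch of that search, with illustrative counts only: the pool size, number of suspect units, and capture target below are hypothetical.

    from scipy.stats import hypergeom

    total = 240     # hypothetical total units in the mixed pool
    suspect = 24    # hypothetical number of suspect units in that pool
    target = 1      # want at least this many suspect units in the sample
    goal = 0.95     # desired probability of capturing the target

    for n_sample in range(1, total + 1):
        # P(at least `target` suspect units in a random sample of size n_sample)
        p = hypergeom.sf(target - 1, total, suspect, n_sample)
        if p >= goal:
            print("smallest sample size:", n_sample, "probability:", round(p, 3))
            break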

Interpreting 4PL parallelism and relative potency outputs

4PL · Parallelism · Relative potency · Plain-language report

A client used a 4-Parameter Logistic (4PL) model in their plate reader software to analyze dose–response curves for reference and test materials. The software produced chi-squared statistics, F-tests, p-values, R² values, and relative potency estimates — but the practical meaning of these numbers was unclear to non-statistical stakeholders.

  • Setting: 4PL dose–response curves (reference + tests)
  • Core question: How do we interpret the parallelism tests and relative potency estimates in a way that guides pass/fail decisions?
  • We explained the idea of “parallel” 4PL curves: if the shapes (slope and upper/lower levels) are statistically compatible, their difference can be summarized as a horizontal shift — which is exactly what relative potency captures.
  • We reviewed the chi-squared and F-test outputs from the software, defining what it means when a test supports the parallelism assumption versus when the data suggest meaningful shape differences (the sketch after this list shows the underlying model comparison).
  • Instead of prescribing rigid universal cutoffs, we provided example ranges for “clearly good,” “borderline,” and “problematic” results (e.g., high R², p-values that do not signal strong lack of fit), emphasizing that these thresholds should be embedded in the client’s broader assay strategy and regulatory context.
  • Finally, we showed how to interpret the relative potency estimate and its confidence interval in simple language: when a test batch is effectively similar to the reference, and when the data suggest meaningfully higher or lower potency.
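
A simplified sketch of the comparison behind a parallelism F-test and a relative potency estimate. The doses, responses, and 4PL parameterization below are simulated and hypothetical, and plate reader software typically adds weighting and replicate handling, so this is conceptual rather than a reproduction of any vendor's method.

    import numpy as np
    from scipy.optimize import curve_fit
    from scipy.stats import f as f_dist

    def four_pl(x, lower, upper, ec50, slope):
        # 4PL curve: response rises from `lower` toward `upper` as dose x grows
        return lower + (upper - lower) / (1.0 + (ec50 / x) ** slope)

    # Simulated reference and test curves; the test differs only by a
    # horizontal shift (different EC50), i.e. the curves are truly parallel
    rng = np.random.default_rng(2)
    dose = np.tile([0.1, 0.3, 1.0, 3.0, 10.0, 30.0, 100.0], 3)  # 3 replicates
    ref = four_pl(dose, 0.05, 1.8, 8.0, 1.2) + rng.normal(0, 0.04, dose.size)
    test = four_pl(dose, 0.05, 1.8, 12.0, 1.2) + rng.normal(0, 0.04, dose.size)

    # Unconstrained fit: each curve gets its own four parameters (8 total)
    p_ref, _ = curve_fit(four_pl, dose, ref, p0=[0, 2, 10, 1])
    p_test, _ = curve_fit(four_pl, dose, test, p0=[0, 2, 10, 1])
    rss_full = (np.sum((ref - four_pl(dose, *p_ref)) ** 2)
                + np.sum((test - four_pl(dose, *p_test)) ** 2))

    # Constrained "parallel" fit: shared asymptotes and slope, separate EC50s
    x_all = np.concatenate([dose, dose])
    y_all = np.concatenate([ref, test])
    is_test = np.concatenate([np.zeros(dose.size), np.ones(dose.size)])

    def parallel_model(x, lower, upper, ec50_ref, ec50_test, slope):
        ec50 = np.where(is_test == 1, ec50_test, ec50_ref)
        return lower + (upper - lower) / (1.0 + (ec50 / x) ** slope)

    p_par, _ = curve_fit(parallel_model, x_all, y_all, p0=[0, 2, 10, 10, 1])
    rss_par = np.sum((y_all - parallel_model(x_all, *p_par)) ** 2)

    # Extra-sum-of-squares F-test: does forcing parallelism fit much worse?
    n_obs, p_full, p_parallel = y_all.size, 8, 5
    F = ((rss_par - rss_full) / (p_full - p_parallel)) / (rss_full / (n_obs - p_full))
    p_value = f_dist.sf(F, p_full - p_parallel, n_obs - p_full)

    # If parallelism is supported, relative potency is the horizontal shift;
    # here defined as EC50_ref / EC50_test (conventions vary by lab)
    rel_potency = p_par[2] / p_par[3]
    print("F =", round(F, 2), " p =", round(p_value, 3), " RP =", round(rel_potency, 2))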