Definitions of Statistics


Many statistics are used in flow cytometric analysis. Here we define the most common ones and describe how FlowJo™ calculates them.

  1. Median — The median is the relative intensity value below which 50% of the events are found; i.e., it is the 50th percentile. In general, the median is a more robust estimator of the central tendency of a population than the mean.
  2. Mean — The arithmetic mean. For a normal distribution, the mean = median = mode.
  3. Geom. Mean — The geometric mean. It can be a more appropriate metric for a log-normal distribution. It is always less than or equal to the arithmetic mean. In FlowJo™ this is calculated as the geometric mean of the graph space, which makes it usable on data that may include zeros or negative numbers.
  4. Robust CV — The robust coefficient of variation. It equals 100 * 1/2 * (Intensity[at 84.13th percentile] - Intensity[at 15.87th percentile]) / Median. The robust CV is not as skewed by outlying values as the CV.
  5. Robust SD — The robust standard deviation. The 68.26% of events centered on the median are used for this calculation, and an upper range and a lower range are set. The robust standard deviation equals (upper range + lower range) / 2. If the upper range is off scale, the robust standard deviation equals the lower range; if the lower range is off scale, it equals the upper range. The robust standard deviation is not as skewed by outlying values as the standard deviation.
  6. CV — The Coefficient of Variation is a normalized Standard Deviation. CV = StdDev/Mean. In FlowJo, the CV statistic is displayed in percent (i.e. a CV of 0.15 is displayed as 15). 1/CV is a common way to define the Signal to Noise Ratio.
  7. SD — The Standard Deviation is a measure of the spread of the dataset. Lower values indicate the data points are closer to the mean and give higher confidence to the mean value.
  8. Percentile — This is the relative intensity below which n% of the events are found, where n is the selected value. n=50 is equivalent to the median.
  9. MADP* — The Median Absolute Deviation Percentile is 100 * MAD / median. It expresses the spread on a normalized scale to aid interpretation.
  10. Median Abs Dev — The Median Absolute Deviation (MAD) is a robust measure of population spread. It is calculated as the median of the absolute deviations of each cell's measurement from the population median.
  11. Freq. of Parent — The percentage of events (cells) in this population out of the parent population (one level up).
  12. Freq. of Grandparent — The percentage of events (cells) in this population out of the population two levels up.
  13. Freq. of — The percentage of events (cells) in this population out of the total number of events in a selected upstream population.
  14. Freq. of Total — The percentage of events (cells) in this population out of the total number of events in the sample.
  15. Count — The absolute number of events (cells) in a population.
  16. Mode — The relative intensity value that is most frequently found for a given parameter. This is the intensity value at which the highest point on a histogram is found.
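
The percentile-based definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not FlowJo's implementation: the helper names, the linear-interpolation percentile rule, the use of the population standard deviation, and the sample values are all our own choices. The upper and lower ranges are read here as the distances from the median to the 84.13th and 15.87th percentiles, which agrees with the Robust CV formula, and the off-scale handling for the Robust SD is left out.

```python
import math
import statistics


def percentile(values, p):
    """p-th percentile (0-100) by linear interpolation between ranks.

    Assumption: the interpolation rule is our choice; the text does not
    say which rule FlowJo uses, so results may differ slightly.
    """
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100.0
    lower, upper = math.floor(rank), math.ceil(rank)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def cv_percent(values):
    """CV in percent: 100 * SD / mean (population SD is an assumption)."""
    return 100.0 * statistics.pstdev(values) / statistics.fmean(values)


def robust_sd(values):
    """Average of the upper and lower ranges around the median."""
    median = percentile(values, 50)
    upper_range = percentile(values, 84.13) - median
    lower_range = median - percentile(values, 15.87)
    return (upper_range + lower_range) / 2.0


def robust_cv_percent(values):
    """Robust CV in percent: 100 * robust SD / median."""
    return 100.0 * robust_sd(values) / percentile(values, 50)


def median_abs_dev(values):
    """Median of the absolute deviations from the median."""
    median = statistics.median(values)
    return statistics.median([abs(v - median) for v in values])


def madp(values):
    """MADP: 100 * MAD / median."""
    return 100.0 * median_abs_dev(values) / statistics.median(values)


# Hypothetical intensity values, for illustration only.
events = list(range(1, 12))  # 1, 2, ..., 11

stats = {
    "Median": percentile(events, 50),
    "Mean": statistics.fmean(events),
    # Plain geometric mean; requires strictly positive values, unlike the
    # graph-space variant described in the list above.
    "Geom. Mean": statistics.geometric_mean(events),
    "CV (%)": cv_percent(events),
    "Robust SD": robust_sd(events),
    "Robust CV (%)": robust_cv_percent(events),
    "Median Abs Dev": median_abs_dev(events),
    "MADP": madp(events),
}
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

Because the sample values are symmetric around the median, the upper and lower ranges are equal here; on skewed data they would differ, and averaging them is what the Robust SD definition calls for.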

Note: A common question is “Which statistic should I use?” The answer depends on how your cells express the markers of choice and on the scale used to display the data. Means are appropriate for linear scales, geometric means are appropriate for log or biexponential scales, and medians are appropriate for either. By default, FlowJo tends to use medians, which are also less affected by outliers.
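
To make the point about outliers concrete, the following Python sketch compares the three central-tendency statistics before and after a single extreme event is added. The intensity values are hypothetical and chosen only to be roughly log-spaced; the `summarize` helper is our own name, not part of FlowJo.

```python
import statistics

# Hypothetical, roughly log-spaced intensities (illustration only).
intensities = [100, 200, 400, 800, 1600]
with_outlier = intensities + [100000]  # one extreme event added


def summarize(values):
    """Return the mean, geometric mean, and median of positive values."""
    return {
        "mean": statistics.fmean(values),
        "geometric_mean": statistics.geometric_mean(values),
        "median": statistics.median(values),
    }


before = summarize(intensities)
after = summarize(with_outlier)

for name in before:
    ratio = after[name] / before[name]
    print(f"{name}: {before[name]:.1f} -> {after[name]:.1f} (x{ratio:.2f})")
```

In this example the mean shifts by more than an order of magnitude, while the median and the geometric mean move far less, which is why they are usually preferred for skewed or log-scaled data.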

One clear recommendation: when using the term “MFI”, define it explicitly in your context, since it can refer to the mean, median, or geometric mean fluorescence intensity.