Statistics

All Research Uses Statistics

Uncertainty

Uncertainty is the lack of complete knowledge about an event or object. In research, the goal is to generate knowledge: in the typical outcome, some new fact or attribute is obtained through experimentation and the uncertainty surrounding the result is quantified. Over time, more pieces of the object of research are uncovered and the uncertainty surrounding it decreases, but it never reaches zero. Working soundly with only a fraction of the truth is central to good research outcomes and requires careful attention to what is known, how it came to be known, and how reliable that knowledge is. While uncertainty can decrease over time through careful experimentation and analysis, it is also introduced at every step along the way.

Sources of error and their propagation

Uncertainty goes by many names in different contexts: error in measurements, variance in distributions, noise in signal analysis, and risk in most of life. In research, uncertainty is typically discussed and quantified in terms of the error introduced by an experimental method and the error within the outcome or measurement (error bars). These two depend on each other: the type of experiment performed generates a specific type of error that propagates through to the results, and the inherent variance in the object of interest determines what type of experiment will reveal novel characteristics. Experimental procedures evolve as results come into view, so every set of experiments requires analysis before the next can occur, allowing uncertainty to be minimized at each step. Some sources of error that are common in experiments include:

Resolution errors – These are also known as rounding errors. If a piece of equipment is not sensitive enough to detect changes in the range where your phenomenon is occurring, you may either miss a difference that is really there or see a much larger difference than exists, depending on whether the true value falls close to or far from a boundary of the bins the equipment reports.

Human errors – These can include improperly following a procedure, reading a value on an instrument incorrectly, using different equipment than usual, incorrect labeling, personal bias, not being blind to test conditions, data manipulation, using the wrong reagents, etc.

Environmental errors – Anything that differs in the space where the experiment takes place can potentially change the outcome. This includes temperature, lighting, airflow, volume, electrical draw, background electromagnetic noise, mechanical vibrations, gas concentrations, air pressure, humidity, etc.

Instrumental errors – These include all types of equipment malfunctions: a light not turning on at the right time, a camera's shutter opening late, an electrical surge within some components, a laser's power drooping, a pipette being mis-calibrated, a lens being dirty, a hard drive error, a computer shutting off or stuttering, the internet going out briefly, etc.

All of these types of errors can skew measurements and results. Everything needs to be documented, and all errors should be propagated to the final measurement. Every piece of equipment in a lab either has an uncertainty/error range from the manufacturer or should be tested with known standards to determine its range for specific measurements. A worked sketch of error propagation follows the resources below. Some resources for learning error propagation are:


http://www.geol.lsu.edu/jlorenzo/geophysics/uncertainties/Uncertaintiespart2.html

http://ipl.physics.harvard.edu/wp-uploads/2013/03/PS3_Error_Propagation_sp13.pdf

https://courses.washington.edu/phys431/propagation_errors_UCh.pdf

https://www.itl.nist.gov/div898/handbook/mpc/section5/mpc55.htm
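For independent error sources, the standard rule (covered in the links above) is to add the contributions in quadrature. Below is a minimal Python sketch of that rule; the measurement values, uncertainties, and the helper name propagate_quadrature are hypothetical, chosen only for illustration.

    import math

    def propagate_quadrature(partials_and_sigmas):
        """Combine independent uncertainties in quadrature:
        sigma_f = sqrt(sum((df/dx_i * sigma_i)**2))."""
        return math.sqrt(sum((p * s) ** 2 for p, s in partials_and_sigmas))

    # Hypothetical example: density rho = m / V from two measurements.
    m, sigma_m = 12.4, 0.1    # mass in grams, with instrument uncertainty
    V, sigma_V = 3.0, 0.05    # volume in mL, with pipette uncertainty

    rho = m / V
    # Partial derivatives: d(rho)/dm = 1/V and d(rho)/dV = -m/V**2
    sigma_rho = propagate_quadrature([(1 / V, sigma_m), (-m / V**2, sigma_V)])

    print(f"rho = {rho:.3f} +/- {sigma_rho:.3f} g/mL")

The same pattern extends to any function of several measured quantities, as long as the error sources are independent of each other.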

What Are Statistics?

Uncertainty and error can be complicated, but luckily much work has been done to formalize how to deal with different cases and magnitudes of error. This field of study, statistics, “concerns the collection, organization, analysis, interpretation and presentation of data”. Due to the sources of error outlined above, measurements collected during an experiment will not be identical, but in bulk they have attributes that can be used to interpret the results.

Distributions

The set of data collected from an experiment forms a distribution. The most commonly encountered form is the Gaussian (normal) distribution shown below:

[Figure: statistics 1.png – a Gaussian (normal) distribution built from binned measurements]

Here, data from an experiment are sorted into small bins, and the count in each bin traces out the curve of the distribution. The shape of the distribution is formally related to the error or uncertainty in the measurements, and it has some useful attributes, which are formally called its statistics.
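As a concrete sketch of that binning process, the following Python snippet (assuming numpy is available; the true value, error size, and bin count are made up for illustration) simulates repeated noisy measurements and bins them into a histogram:

    import numpy as np

    rng = np.random.default_rng(seed=0)

    # Simulate 1,000 repeated measurements of a quantity whose true value
    # is 5.0, with Gaussian measurement error of standard deviation 0.5.
    measurements = rng.normal(loc=5.0, scale=0.5, size=1000)

    # Bin the measurements into small containers; the bin counts trace
    # out the bell-shaped curve of the distribution.
    counts, bin_edges = np.histogram(measurements, bins=30)

    for left, right, n in zip(bin_edges[:-1], bin_edges[1:], counts):
        print(f"{left:5.2f} to {right:5.2f}: {'#' * int(n // 5)}")

Printed as rows of # marks, the bin counts rise and fall in the bell shape of the figure above.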

Averages

The first statistics that are typically taught are the averages: the arithmetic mean, the median, and the mode of a distribution. Each of these attempts to locate the middle of the distribution above, and each has different uses. The arithmetic mean (the most commonly used average) is the sum of all measurement values divided by the number of measurements, and it is the basis for many statistical tests. The median is the value in the center of a data set that has been ordered from lowest to highest, and it is less affected by outliers (single values far away from the rest) than the arithmetic mean is. The mode is the most commonly occurring value in the measurement set. For the perfect normal (Gaussian) distribution above, all of these numbers are the same, but in a real set of measurements they rarely are. When generating figures, the arithmetic mean is almost always used, but its relationship to the others can give useful insight into what your distribution looks like (see below).

[Figure: statistics 2.png – the relationship between the mean, median, and mode in a distribution]
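To make those definitions concrete, here is a small Python example using the standard library's statistics module; the measurement values are made up, with one deliberate outlier to show how the averages diverge:

    from statistics import mean, median, mode

    # A small set of repeated measurements with one outlier (9.8).
    values = [2.1, 2.3, 2.3, 2.4, 2.5, 2.6, 9.8]

    print(f"mean:   {mean(values):.2f}")   # pulled upward by the outlier
    print(f"median: {median(values):.2f}") # robust to the outlier
    print(f"mode:   {mode(values):.2f}")   # most frequent value: 2.3

The outlier drags the mean well above the median, exactly the kind of mean-to-median gap that hints at a skewed distribution.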

Standard deviations

Another statistic is the standard deviation (the square root of the variance), which quantifies how far values typically sit from the mean – formally, it is the square root of the average squared deviation. Stated simply, it is the spread of the values within a distribution. Distributions with greater standard deviations appear wider (see below).

The error in measurements is directly related to the standard deviation; more or larger errors will result in a greater standard deviation.

[Figure: standard deviations.png – distributions with larger standard deviations appear wider]
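The definition can be checked directly in a few lines of Python; the values below are invented, and the manual formula is compared against the standard library's stdev:

    import math
    from statistics import mean, stdev

    values = [2.1, 2.3, 2.3, 2.4, 2.5, 2.6]
    m = mean(values)

    # Sample standard deviation from its definition: the square root of
    # the average squared deviation from the mean (dividing by n - 1
    # because this is a sample rather than a whole population).
    manual = math.sqrt(sum((x - m) ** 2 for x in values) / (len(values) - 1))

    print(f"manual:  {manual:.3f}")
    print(f"stdev(): {stdev(values):.3f}")  # agrees with the manual formula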

Higher moments

The mean is the first moment of a distribution, and the variance (the square of the standard deviation) is its second central moment. There are higher moments for each distribution, but the higher they are, the less frequently they are used in research. The next two, skewness and kurtosis, are related to the third and fourth moments and can be found in many textbooks or on Wikipedia.
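If scipy is available, skewness and kurtosis can be computed directly; in this sketch the data are random draws from a normal distribution, for which both values should come out near zero:

    import numpy as np
    from scipy.stats import skew, kurtosis

    rng = np.random.default_rng(seed=0)
    data = rng.normal(loc=0.0, scale=1.0, size=10_000)

    # For a perfect Gaussian, skewness is 0 (the curve is symmetric) and
    # excess kurtosis is 0 (the tails match the normal distribution).
    print(f"skewness:        {skew(data):.3f}")
    print(f"excess kurtosis: {kurtosis(data):.3f}")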

Hypothesis testing

In an experiment, even when there was no hypothesis at the outset, analysis is conducted using hypothesis testing. It centers on assuming that there is no relation between a condition and a result, and then concluding, with a quantified level of uncertainty, whether the data justify rejecting that assumption. The method is as follows:

Generate the null hypothesis – the null hypothesis is a general statement or default position that there is no relationship between two measured phenomena or no association among groups.

Generate the alternative hypothesis – the alternative hypothesis is the position that something is happening: a new effect exists, or a new theory is true instead of the old one (the null hypothesis).

Determine the correct statistical test and calculate the test statistic (for example, a t-value) – this is not trivial; here are some resources for choosing the right test:

https://stats.idre.ucla.edu/other/mult-pkg/whatstat/

https://www.graphpad.com/support/faqid/1790/

http://www.biostathandbook.com/testchoice.html

https://www.scribbr.com/statistics/statistical-tests/

Calculate the p-value – this is the probability, under the null hypothesis, of obtaining a test statistic at least as extreme as the one that was observed.

https://en.wikipedia.org/wiki/P-value

[Figure: statistics 4.png – illustration of the p-value]

Reject the null hypothesis, in favor of the alternative hypothesis, if and only if the p-value is less than or equal to the significance level threshold, alpha (a probability selected before the experiment). The full procedure is sketched in code below.
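As an end-to-end sketch of these steps (assuming scipy is available; the group sizes, means, and alpha are invented), the snippet below compares two simulated groups with an independent two-sample t-test:

    import numpy as np
    from scipy.stats import ttest_ind

    rng = np.random.default_rng(seed=1)

    # Hypothetical measurements from a control group and a treated group.
    control = rng.normal(loc=5.0, scale=0.5, size=30)
    treated = rng.normal(loc=5.4, scale=0.5, size=30)

    # Null hypothesis: the two groups have the same mean.
    # Alternative hypothesis: their means differ.
    result = ttest_ind(control, treated)

    alpha = 0.05  # significance level chosen before looking at the data
    print(f"t = {result.statistic:.2f}, p = {result.pvalue:.4f}")
    if result.pvalue <= alpha:
        print("Reject the null hypothesis: the group means differ.")
    else:
        print("Fail to reject the null hypothesis.")

Note that failing to reject the null hypothesis is not the same as proving it true; it only means the data were not extreme enough at the chosen alpha.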

Types of Data

The type of data that you have will determine the type of statistical test that is appropriate to run. These definitions are included to aid in the search for statistical tests.

Numerical

This data is simply a list of numbers corresponding to measurements under different conditions. It can be voltages read off a piece of equipment, the number of objects in an image, concentrations, cell counts, intensity in a fluorescent image or test, distance traveled, height, weight, age – anything that is a number.

Categorical

This includes data that fall into distinct bins with no inherent numerical value, such as race, sex, hair color, eye color, state of residence, and education level.

Rank

This data captures the order of observations but not their numerical values or the distances between them. Some examples are satisfaction level, places in a tournament, and school rankings. The short sketch below shows all three types side by side.
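Here is a brief illustration (assuming scipy is available; the values are invented) of how each data type looks in practice and the kind of test it pairs with:

    from collections import Counter
    from scipy.stats import spearmanr

    # Numerical data: actual measured values; parametric tests such as
    # the t-test operate directly on these numbers.
    heights_cm = [162, 170, 168, 181, 175]

    # Categorical data: labels with no inherent numerical value; tests
    # such as chi-squared work on the counts per category.
    eye_colors = ["brown", "blue", "brown", "green", "blue"]
    print(Counter(eye_colors))  # Counter({'brown': 2, 'blue': 2, 'green': 1})

    # Rank data: only the ordering is meaningful; rank-based tests such
    # as Spearman correlation ignore the distances between values.
    tournament_places = [1, 2, 3, 4, 5]
    satisfaction_rank = [1, 3, 2, 5, 4]
    print(spearmanr(tournament_places, satisfaction_rank))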

Conclusion

Statistics are complicated but critical to use correctly in research. The workflow for even a seasoned analyst is to look up many candidate tests and read through all of their assumptions before proceeding with analysis, and many tests may have to be performed on the way to the final answer. If a career in research is on your potential path, try to take at least one statistics course during college (or online) and keep looking things up and asking questions. No one feels completely certain about which test to use (get it?), and strictly following the interpretation guidelines is the only way to learn anything from analysis.

References:
http://www.geol.lsu.edu/jlorenzo/geophysics/uncertainties/Uncertaintiespart2.html

http://ipl.physics.harvard.edu/wp-uploads/2013/03/PS3_Error_Propagation_sp13.pdf

https://courses.washington.edu/phys431/propagation_errors_UCh.pdf

https://www.itl.nist.gov/div898/handbook/mpc/section5/mpc55.htm

https://stats.idre.ucla.edu/other/mult-pkg/whatstat/

https://www.graphpad.com/support/faqid/1790/

http://www.biostathandbook.com/testchoice.html

https://www.scribbr.com/statistics/statistical-tests/

https://en.wikipedia.org/wiki/P-value