<![CDATA[Anesthesia & Analgesia - Biostatistics, Epidemiology and Study Design: A Practical Online Primer for Clinicians]]>
https://journals.lww.com/anesthesia-analgesia/pages/collectiondetails.aspx?TopicalCollectionId=188
en-us
Mon, 20 Jan 2020 06:27:40 -0600
Wolters Kluwer Health RSS Generator
https://cdn-images-journals.azureedge.net/anesthesia-analgesia/XLargeThumb.00000539-202002000-00000.CV.jpeg
https://journals.lww.com/anesthesia-analgesia/Fulltext/2017/05000/In_the_Beginning_There_Is_the_Introduction_and.50.aspx
<![CDATA[In the Beginning—There Is the Introduction—and Your Study Hypothesis]]>Writing a manuscript for a medical journal is akin to writing a newspaper article, albeit a scholarly one. Like any journalist, you have a story to tell. You need to tell your story in a way that is easy to follow and makes a compelling case to the reader. Although recommended since the beginning of the 20th century, the conventional Introduction-Methods-Results-And-Discussion (IMRAD) scientific reporting structure has only been the standard since the 1980s. The Introduction should be focused and succinct in communicating the significance, background, rationale, study aims or objectives, and the primary (and secondary, if appropriate) study hypotheses. Hypothesis testing involves posing both a null and an alternative hypothesis. The null hypothesis proposes that no difference or association exists on the outcome variable of interest between the interventions or groups being compared. The alternative hypothesis is the opposite of the null hypothesis and thus typically proposes that a difference in the population does exist between the groups being compared on the parameter of interest. Most investigators seek to reject the null hypothesis because of their expectation that the studied intervention does result in a difference between the study groups or that the association of interest does exist. Therefore, in most clinical and basic science studies and manuscripts, the alternative hypothesis is stated, not the null hypothesis. Also, in the Introduction, the alternative hypothesis is typically stated in the direction of interest, or the expected direction. 
However, when assessing the association of interest, researchers typically look in both directions (ie, favoring 1 group or the other) by conducting a 2-tailed statistical test because the true direction of the effect is typically not known, and either direction would be important to report.]]>Thu, 02 May 2019 15:26:31 GMT-05:00 00000539-201705000-00050
https://journals.lww.com/anesthesia-analgesia/Fulltext/2017/06000/Magic_Mirror,_on_the_Wall_Which_Is_the_Right_Study.47.aspx
<![CDATA[Magic Mirror, on the Wall—Which Is the Right Study Design of Them All?—Part I]]>The assessment of a new or existing treatment or intervention typically answers 1 of 3 research-related questions: (1) “Can it work?” (efficacy); (2) “Does it work?” (effectiveness); and (3) “Is it worth it?” (efficiency or cost-effectiveness). There are a number of study designs that on a situational basis are appropriate to apply in conducting research. These study designs are classified as experimental, quasi-experimental, or observational, with observational studies being further divided into descriptive and analytic categories. This first of a 2-part statistical tutorial reviews these 3 salient research questions and describes a subset of the most common types of experimental and quasi-experimental study designs. Attention is focused on the strengths and weaknesses of each study design to assist in choosing which is appropriate for a given study objective and hypothesis as well as the particular study setting and available resources and data. Specific studies and papers are highlighted as examples of a well-chosen, clearly stated, and properly executed study design type.]]>Thu, 02 May 2019 15:27:39 GMT-05:00 00000539-201706000-00047
https://journals.lww.com/anesthesia-analgesia/Fulltext/2017/07000/Magic_Mirror,_On_the_Wall_Which_Is_the_Right_Study.49.aspx
<![CDATA[Magic Mirror, On the Wall—Which Is the Right Study Design of Them All?—Part II]]>The assessment of a new or existing treatment or other intervention typically answers 1 of 3 central research-related questions: (1) “Can it work?” (efficacy); (2) “Does it work?” (effectiveness); or (3) “Is it worth it?” (efficiency or cost-effectiveness). There are a number of study designs that, on a situational basis, are appropriate to apply in conducting research. These study designs are generally classified as experimental, quasiexperimental, or observational, with observational studies being further divided into descriptive and analytic categories. This second of a 2-part statistical tutorial reviews these 3 salient research questions and describes a subset of the most common types of observational study designs. Attention is focused on the strengths and weaknesses of each study design to assist in choosing which is appropriate for a given study objective and hypothesis as well as the particular study setting and available resources and data. Specific studies and papers are highlighted as examples of a well-chosen, clearly stated, and properly executed study design type.]]>Thu, 02 May 2019 15:28:24 GMT-05:00 00000539-201707000-00049
https://journals.lww.com/anesthesia-analgesia/Fulltext/2017/08000/Defining_the_Primary_Outcomes_and_Justifying.45.aspx
<![CDATA[Defining the Primary Outcomes and Justifying Secondary Outcomes of a Study: Usually, the Fewer, the Better]]>One of the first steps in designing and conducting a research study is identifying the primary and any secondary study outcomes. In an experimental, quasi-experimental, or analytic observational research study, the primary study outcomes arise from and align directly with the primary study aim or objective. Likewise, any secondary study outcomes arise from and directly align with any secondary study aim or objective. One designated primary study outcome then forms the basis for and is incorporated literally into the stated hypothesis. In a Methods section, authors clearly state and define each primary and any secondary study outcome variable. In the same Methods section, authors clearly describe how all primary and any secondary study outcome variables were measured. Enough detail is provided so that a clinician, statistician, or informatician can know exactly what is being measured and that other investigators could duplicate the measurements in their research venue. The authors provide published substantiation (preferably) or other documented evidence of the validity and reliability of any applied measurement instrument, tool, or scale. A common pitfall—and often fatal study design flaw—is the application of a newly created (“home-grown”) or ad hoc modification of an existing measurement instrument, tool, or scale—without any supporting evidence of its validity and reliability. An optimal primary outcome is the one for which there is the most existing or plausible evidence of being associated with the exposure of interest or intervention. Including too many primary outcomes can (a) lead to an unfocused research question and study and (b) present problems with interpretation if the treatment effect differed across the outcomes. Inclusion of secondary variables in the study design and the resulting manuscript needs to be justified. 
Secondary outcomes are particularly helpful if they lend supporting evidence for the primary endpoint. A composite endpoint is an endpoint consisting of several outcome variables that are typically correlated with each other. In designing a study, researchers limit components of a composite endpoint to variables on which the intervention of interest would most plausibly have an effect, and optimally with preliminary evidence of an effect. Ideally, components of a strong composite endpoint have similar treatment effect, frequency, and severity, with the most important being similar severity.]]>Thu, 02 May 2019 15:47:07 GMT-05:00 00000539-201708000-00045
https://journals.lww.com/anesthesia-analgesia/Fulltext/2017/09000/Bias,_Confounding,_and_Interaction__Lions_and.46.aspx
<![CDATA[Bias, Confounding, and Interaction: Lions and Tigers, and Bears, Oh My!]]>Epidemiologists seek to make a valid inference about the causal effect between an exposure and a disease in a specific population, using representative sample data from a specific population. Clinical researchers likewise seek to make a valid inference about the association between an intervention and outcome(s) in a specific population, based upon their randomly collected, representative sample data. Both do so by using the available data about the sample variable to make a valid estimate about its corresponding or underlying, but unknown population parameter. Random error in an experiment can be due to the natural, periodic fluctuation or variation in the accuracy or precision of virtually any data sampling technique or health measurement tool or scale. In a clinical research study, random error can be due to not only innate human variability but also pure chance. Systematic error in an experiment arises from an innate flaw in the data sampling technique or measurement instrument. In the clinical research setting, systematic error is more commonly referred to as systematic bias. The most commonly encountered types of bias in anesthesia, perioperative, critical care, and pain medicine research include recall bias, observational bias (Hawthorne effect), attrition bias, misclassification or informational bias, and selection bias. A confounding variable (confounding factor or confounder) is a factor that is associated, positively or negatively, with both the exposure of interest and the outcome of interest. Confounding is typically not an issue in a randomized trial because the randomized groups are sufficiently balanced on all potential confounding variables, both observed and nonobserved. However, confounding can be a major problem with any observational (nonrandomized) study. 
Ignoring confounding in an observational study will often result in a “distorted” or incorrect estimate of the association or treatment effect. Interaction among variables, also known as effect modification, exists when the effect of 1 explanatory variable on the outcome depends on the particular level or value of another explanatory variable. Bias and confounding are common potential explanations for statistically significant associations between exposure and outcome when the true relationship is noncausal. Understanding interactions is vital to proper interpretation of treatment effects. These complex concepts should be consistently and appropriately considered whenever one is not only designing but also analyzing and interpreting data from a randomized trial or observational study.]]>Thu, 02 May 2019 15:48:22 GMT-05:00 00000539-201709000-00046
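As a rough illustration of how ignoring a confounder can distort an estimate, the sketch below compares a crude risk ratio with stratum-specific risk ratios. The counts, the stratum names, and the `risk_ratio` helper are all hypothetical and were chosen only so the arithmetic is easy to follow; they do not come from the tutorial.

```python
# Hypothetical counts, invented purely for illustration.
# Each stratum: (events_exposed, n_exposed, events_unexposed, n_unexposed)
strata = {
    "younger": (5, 100, 20, 400),
    "older": (80, 400, 20, 100),
}


def risk_ratio(events_exp, n_exp, events_unexp, n_unexp):
    """Risk in the exposed group divided by risk in the unexposed group."""
    return (events_exp / n_exp) / (events_unexp / n_unexp)


# Risk ratio within each level of the potential confounder (age group).
stratum_rr = {name: risk_ratio(*counts) for name, counts in strata.items()}

# Crude risk ratio: pool the strata and ignore age entirely.
pooled = [sum(column) for column in zip(*strata.values())]
crude_rr = risk_ratio(*pooled)

print(stratum_rr)          # risk ratio of 1.0 within each age group
print(round(crude_rr, 3))  # 2.125 when age is ignored
```

In this made-up data set, age is associated with both exposure and outcome, so the pooled estimate suggests a doubling of risk even though no association exists within either age group.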
https://journals.lww.com/anesthesia-analgesia/Fulltext/2017/10000/Fundamentals_of_Research_Data_and_Variables__The.45.aspx
<![CDATA[Fundamentals of Research Data and Variables: The Devil Is in the Details]]>Designing, conducting, analyzing, reporting, and interpreting the findings of a research study require an understanding of the types and characteristics of data and variables. Descriptive statistics are typically used simply to calculate, describe, and summarize the collected research data in a logical, meaningful, and efficient way. Inferential statistics allow researchers to make a valid estimate of the association between an intervention and the treatment effect in a specific population, based upon their randomly collected, representative sample data. Categorical data can be either dichotomous or polytomous. Dichotomous data have only 2 categories, and thus are considered binary. Polytomous data have more than 2 categories. Unlike dichotomous and polytomous data, ordinal data are rank ordered, typically based on a numerical scale that is composed of a small set of discrete classes or integers. Continuous data are measured on a continuum and can have any numeric value over this continuous range. Continuous data can be meaningfully divided into smaller and smaller or finer and finer increments, depending upon the precision of the measurement instrument. Interval data are a form of continuous data in which equal intervals represent equal differences in the property being measured. Ratio data are another form of continuous data, which have the same properties as interval data, plus a true definition of an absolute zero point, and the ratios of the values on the measurement scale make sense. The normal (Gaussian) distribution (“bell-shaped curve”) is one of the most common statistical distributions. Many applied inferential statistical tests are predicated on the assumption that the analyzed data follow a normal distribution. The histogram and the Q–Q plot are 2 graphical methods to assess if a set of data have a normal distribution (display “normality”). 
The Shapiro-Wilk test and the Kolmogorov-Smirnov test are 2 well-known and historically widely applied quantitative methods to assess for data normality. Parametric statistical tests make certain assumptions about the characteristics and/or parameters of the underlying population distribution upon which the test is based, whereas nonparametric tests make fewer or less rigorous assumptions. If the normality test concludes that the study data deviate significantly from a Gaussian distribution, rather than applying a less robust nonparametric test, the problem can potentially be remedied by judiciously and openly: (1) performing a data transformation of all the data values; or (2) eliminating any obvious data outlier(s).]]>Thu, 02 May 2019 15:49:09 GMT-05:00 00000539-201710000-00045
https://journals.lww.com/anesthesia-analgesia/Fulltext/2017/11000/Descriptive_Statistics__Reporting_the_Answers_to.48.aspx
<![CDATA[Descriptive Statistics: Reporting the Answers to the 5 Basic Questions of Who, What, Why, When, Where, and a Sixth, So What?]]>Descriptive statistics are specific methods basically used to calculate, describe, and summarize collected research data in a logical, meaningful, and efficient way. Descriptive statistics are reported numerically in the manuscript text and/or in its tables, or graphically in its figures. This basic statistical tutorial discusses a series of fundamental concepts about descriptive statistics and their reporting. The mean, median, and mode are 3 measures of the center or central tendency of a set of data. In addition to a measure of its central tendency (mean, median, or mode), another important characteristic of a research data set is its variability or dispersion (ie, spread). In simplest terms, variability is how much the individual recorded scores or observed values differ from one another. The range, standard deviation, and interquartile range are 3 measures of variability or dispersion. The standard deviation is typically reported for a mean, and the interquartile range for a median. Testing for statistical significance, along with calculating the observed treatment effect (or the strength of the association between an exposure and an outcome), and generating a corresponding confidence interval are 3 tools commonly used by researchers (and their collaborating biostatistician or epidemiologist) to validly make inferences and more generalized conclusions from their collected data and descriptive statistics. A number of journals, including Anesthesia & Analgesia, strongly encourage or require the reporting of pertinent confidence intervals. A confidence interval can be calculated for virtually any variable or outcome measure in an experimental, quasi-experimental, or observational research study design. 
Generally speaking, in a clinical trial, the confidence interval is the range of values within which the true treatment effect in the population likely resides. In an observational study, the confidence interval is the range of values within which the true strength of the association between the exposure and the outcome (eg, the risk ratio or odds ratio) in the population likely resides. There are many possible ways to graphically display or illustrate different types of data. While there is often latitude as to the choice of format, ultimately, the simplest and most comprehensible format is preferred. Common examples include a histogram, bar chart, line chart or line graph, pie chart, scatterplot, and box-and-whisker plot. Valid and reliable descriptive statistics can answer basic yet important questions about a research data set, namely: “Who, What, Why, When, Where, How, How Much?”]]>Thu, 02 May 2019 15:49:49 GMT-05:00 00000539-201711000-00048
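The measures of central tendency and dispersion named in this abstract can be sketched with Python's standard `statistics` module. The data values below are hypothetical (imagine lengths of stay in days), and the quartiles use the module's default "exclusive" method, which is only one of several conventions.

```python
import statistics

# Hypothetical right-skewed sample (for example, length of stay in days).
values = [2, 3, 3, 4, 5, 6, 7, 9, 15]

mean = statistics.mean(values)
median = statistics.median(values)
mode = statistics.mode(values)
std_dev = statistics.stdev(values)  # sample standard deviation (n - 1)

# Quartiles with the module's default "exclusive" method.
q1, q2, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1
value_range = max(values) - min(values)

print(f"mean {mean}, SD {std_dev:.2f}")
print(f"median {median}, IQR {q1} to {q3}")
print(f"mode {mode}, range {value_range}")
```

Because the single large value pulls the mean above the median, this sample is the kind of case where reporting the median with the interquartile range is usually the more informative choice.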
https://journals.lww.com/anesthesia-analgesia/Fulltext/2017/12000/Fundamental_Epidemiology_Terminology_and_Measures_.45.aspx
<![CDATA[Fundamental Epidemiology Terminology and Measures: It Really Is All in the Name]]>Epidemiology is the study of how disease is distributed in populations and the factors that influence or determine this distribution. Clinical epidemiology denotes the application of epidemiologic methods to questions relevant to patient care and provides a highly useful set of principles and methods for the design and conduct of quantitative clinical research. Validly analyzing, correctly reporting, and successfully interpreting the findings of a clinical research study often require an understanding of the epidemiologic terms and measures that describe the patterns of association between the exposure of interest (treatment or intervention) and a health outcome (disease). This statistical tutorial thus discusses selected fundamental epidemiologic concepts and terminology that are applicable to clinical research. Incidence is the occurrence of a health outcome during a specific time period. Prevalence is the existence of a health outcome during a specific time period. The relative risk can be defined as the probability of the outcome of interest (eg, developing the disease) among exposed individuals compared to the probability of the same event in nonexposed individuals. The odds ratio is a measure of risk that compares the frequency of exposure to a putative causal factor in the individuals with the health outcome (cases) versus those individuals without the health outcome (controls). Factors that are associated with both the exposure and the outcome of interest need to be considered to avoid bias in your estimate of risk. Because it takes into consideration the contribution of extraneous variables (confounders), the adjusted odds ratio provides a more valid estimation of the association between the exposure and the health outcome and thus is the preferably reported measure. 
The odds ratio closely approximates the risk ratio in a cohort study or a randomized controlled trial when the outcome of interest does not occur frequently (<10%). The editors, reviewers, authors, and readers of journal articles should be aware of and make the key distinction between the absolute risk reduction and the relative risk reduction. In assessing the findings of a clinical study, the investigators, reviewers, and readers must determine if the findings are not only statistically significant, but also clinically meaningful. Furthermore, in deciding on the merits of a new medication or other therapeutic intervention, the clinician must balance the benefits versus the adverse effects in individual patients. The number needed to treat and the number needed to harm can provide this needed additional insight and perspective.]]>Thu, 02 May 2019 15:50:31 GMT-05:00 00000539-201712000-00045
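The measures discussed in this abstract all follow from a 2 × 2 table of event counts. The sketch below is a minimal illustration with hypothetical trial counts; the function name `two_by_two_measures` and its dictionary keys are assumptions made here, and the number needed to treat is only meaningful when the absolute risk reduction is positive.

```python
def two_by_two_measures(events_trt, n_trt, events_ctl, n_ctl):
    """Common effect measures from event counts in two groups."""
    risk_trt = events_trt / n_trt
    risk_ctl = events_ctl / n_ctl
    odds_trt = events_trt / (n_trt - events_trt)
    odds_ctl = events_ctl / (n_ctl - events_ctl)
    arr = risk_ctl - risk_trt  # absolute risk reduction
    return {
        "risk_ratio": risk_trt / risk_ctl,
        "odds_ratio": odds_trt / odds_ctl,
        "absolute_risk_reduction": arr,
        "relative_risk_reduction": arr / risk_ctl,
        "number_needed_to_treat": 1 / arr,
    }


# Hypothetical trials: a common outcome (10% vs 20%) and a rare one (1% vs 2%).
common = two_by_two_measures(10, 100, 20, 100)
rare = two_by_two_measures(1, 100, 2, 100)

for label, result in (("common", common), ("rare", rare)):
    print(label, {key: round(value, 3) for key, value in result.items()})
```

Both hypothetical trials share a relative risk reduction of 50%, yet the absolute risk reduction falls from 10 to 1 percentage point and the number needed to treat rises from 10 to 100. The odds ratio (about 0.44 versus about 0.49) also moves closer to the risk ratio of 0.5 as the outcome becomes rarer.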
https://journals.lww.com/anesthesia-analgesia/Fulltext/2018/01000/Unadjusted_Bivariate_Two_Group_Comparisons__When.48.aspx
<![CDATA[Unadjusted Bivariate Two-Group Comparisons: When Simpler is Better]]>Hypothesis testing involves posing both a null hypothesis and an alternative hypothesis. This basic statistical tutorial discusses the appropriate use, including the underlying assumptions, of the common unadjusted bivariate tests for hypothesis testing and thus comparing study sample data for a difference or association. The appropriate choice of a statistical test is predicated on the type of data being analyzed and compared. The unpaired or independent samples t test is used to test the null hypothesis that the 2 population means are equal; rejecting it supports the alternative hypothesis that the 2 population means are not equal. The unpaired t test is intended for comparing a continuous (interval or ratio) dependent variable between 2 independent study groups. A common mistake is to apply several unpaired t tests when comparing data from 3 or more study groups. In this situation, an analysis of variance with post hoc (posttest) pairwise group comparisons should instead be applied. Another common mistake is to apply a series of unpaired t tests when comparing sequentially collected data from 2 study groups. In this situation, a repeated-measures analysis of variance, with tests for group-by-time interaction, and post hoc comparisons, as appropriate, should instead be applied in analyzing data from sequential collection points. The paired t test is used to assess the difference in the means of 2 study groups when the sample observations have been obtained in pairs, often before and after an intervention in each study subject. The Pearson chi-square test is widely used to test the null hypothesis that 2 unpaired categorical variables, each with 2 or more nominal levels (values), are independent of each other. When the null hypothesis is rejected, one concludes that there is a probable association between the 2 unpaired categorical variables. 
When comparing 2 groups on an ordinal or nonnormally distributed continuous outcome variable, the 2-sample t test is usually not appropriate. The Wilcoxon-Mann-Whitney test is instead preferred. When making paired comparisons on data that are ordinal, or continuous but nonnormally distributed, the Wilcoxon signed-rank test can be used. In analyzing their data, researchers should consider the continued merits of these simple yet equally valid unadjusted bivariate statistical tests. However, the appropriate use of an unadjusted bivariate test still requires a solid understanding of its utility, assumptions (requirements), and limitations. This understanding will mitigate the risk of misleading findings, interpretations, and conclusions.]]>Thu, 02 May 2019 19:40:57 GMT-05:00 00000539-201801000-00048
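For readers who want to see the arithmetic behind two of the tests named above, the sketch below computes an equal-variance unpaired t statistic and a Pearson chi-square statistic for a 2 × 2 table using only the standard library. The data are hypothetical, no continuity correction is applied, and the chi-square P value relies on the identity that a chi-square variable with 1 degree of freedom is a squared standard normal variable. In practice a statistical package would normally be used instead.

```python
import math
import statistics


def pooled_t_statistic(group_a, group_b):
    """Equal-variance (Student) unpaired t statistic and degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t_stat = (statistics.mean(group_a) - statistics.mean(group_b)) / std_err
    return t_stat, n_a + n_b - 2


def chi_square_2x2(table):
    """Pearson chi-square statistic and P value for a 2 x 2 table (1 df)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical continuous outcome in 2 independent groups.
t_stat, df = pooled_t_statistic([4, 5, 6, 7, 8], [6, 7, 8, 9, 10])

# Hypothetical counts: rows are groups, columns are event / no event.
chi2, p_value = chi_square_2x2([[10, 90], [20, 80]])

print(round(t_stat, 3), df)
print(round(chi2, 3), round(p_value, 4))
```

The t statistic would still need to be referred to a t distribution with `df` degrees of freedom to obtain a P value, which the standard library does not provide.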
https://journals.lww.com/anesthesia-analgesia/Fulltext/2018/02000/Significance,_Errors,_Power,_and_Sample_Size__The.49.aspx
<![CDATA[Significance, Errors, Power, and Sample Size: The Blocking and Tackling of Statistics]]>Inferential statistics relies heavily on the central limit theorem and the related law of large numbers. According to the central limit theorem, regardless of the distribution of the source population, the sampling distribution of a sample estimate (eg, the sample mean) will be approximately normal, provided the sample is large enough. The related law of large numbers holds that a sample estimate converges to the true population value as the random sample grows; for the central limit theorem, a sample is conventionally considered large enough at n ≥ 30. In research-related hypothesis testing, the term “statistically significant” is used to describe when an observed difference or association has met a certain threshold. This significance threshold or cut-point is denoted as alpha (α) and is typically set at .05. When the observed P value is less than α, one rejects the null hypothesis (H0) and accepts the alternative. Clinical significance is even more important than statistical significance, so treatment effect estimates and confidence intervals should be regularly reported. A type I error occurs when the H0 of no difference or no association is rejected, when in fact the H0 is true. A type II error occurs when the H0 is not rejected, when in fact there is a true population effect. Power is the probability of detecting a true difference, effect, or association if it truly exists. Sample size justification and power analysis are key elements of a study design. Ethical concerns arise when studies are poorly planned or underpowered. When calculating sample size for comparing groups, 4 quantities are needed: α, type II error, the difference or effect of interest, and the estimated variability of the outcome variable. Sample size increases for increasing variability and power, and for decreasing α and decreasing difference to detect. 
Sample size for a given relative reduction in proportions depends heavily on the proportion in the control group itself, and increases as the proportion decreases. Sample size for single-group studies estimating an unknown parameter is based on the desired precision of the estimate. Interim analyses assessing for efficacy and/or futility are great tools to save time and money, as well as allow science to progress faster, but are only 1 component considered when a decision to stop or continue a trial is made.]]>Thu, 02 May 2019 19:42:57 GMT-05:00 00000539-201802000-00049
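The way the 4 quantities combine can be sketched with the widely used normal-approximation formula for comparing 2 means, n per group = 2(z₁₋α/₂ + z₁₋β)²σ²/δ². This formula is a common textbook approximation assumed here for illustration rather than one quoted from the tutorial, and the function name `n_per_group` and the input values are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(alpha, power, sd, difference):
    """Approximate sample size per group for a 2-sided comparison of 2 means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)  # power = 1 - type II error
    return math.ceil(2 * ((z_alpha + z_beta) * sd / difference) ** 2)


baseline = n_per_group(alpha=0.05, power=0.80, sd=10, difference=5)
smaller_difference = n_per_group(alpha=0.05, power=0.80, sd=10, difference=2.5)
higher_power = n_per_group(alpha=0.05, power=0.90, sd=10, difference=5)

print(baseline, smaller_difference, higher_power)  # 63 252 85
```

Halving the difference to detect roughly quadruples the required sample size, and raising power from 80% to 90% also increases it, consistent with the relationships described above. Exact calculations based on the t distribution give slightly larger numbers.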
https://journals.lww.com/anesthesia-analgesia/Fulltext/2018/03000/Statistical_Significance_Versus_Clinical.48.aspx
<![CDATA[Statistical Significance Versus Clinical Importance of Observed Effect Sizes: What Do P Values and Confidence Intervals Really Represent?]]>Effect size measures are used to quantify treatment effects or associations between variables. Such measures, of which >70 have been described in the literature, include unstandardized and standardized differences in means, risk differences, risk ratios, odds ratios, or correlations. While null hypothesis significance testing is the predominant approach to statistical inference on effect sizes, results of such tests are often misinterpreted, provide no information on the magnitude of the estimate, and tell us nothing about the clinical importance of an effect. Hence, researchers should not merely focus on statistical significance but should also report the observed effect size. However, all samples are to some degree affected by randomness, such that there is some uncertainty about how well the observed effect size represents the actual magnitude and direction of the effect in the population. Therefore, point estimates of effect sizes should be accompanied by the entire range of plausible values to quantify this uncertainty. This facilitates assessment of how large or small the observed effect could actually be in the population of interest, and hence how clinically important it could be. This tutorial reviews different effect size measures and describes how confidence intervals can be used to address not only the statistical significance but also the clinical significance of the observed effect or association. Moreover, we discuss what P values actually represent, and how they provide supplemental information about the significant versus nonsignificant dichotomy. This tutorial intentionally focuses on an intuitive explanation of concepts and interpretation of results, rather than on the underlying mathematical theory or concepts.]]>Thu, 02 May 2019 19:44:06 GMT-05:00 00000539-201803000-00048
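To make the distinction between statistical significance and clinical importance concrete, the sketch below computes a risk difference with a simple Wald (normal-approximation) 95% confidence interval. The counts, the function name `risk_difference_ci`, and the threshold `smallest_important_reduction` are hypothetical, and the Wald interval is only one of several possible methods.

```python
import math
from statistics import NormalDist


def risk_difference_ci(events_trt, n_trt, events_ctl, n_ctl, level=0.95):
    """Risk difference (treatment minus control) with a Wald interval."""
    p_trt = events_trt / n_trt
    p_ctl = events_ctl / n_ctl
    diff = p_trt - p_ctl
    std_err = math.sqrt(
        p_trt * (1 - p_trt) / n_trt + p_ctl * (1 - p_ctl) / n_ctl
    )
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return diff, diff - z * std_err, diff + z * std_err


# Hypothetical trial: 10/100 events with treatment vs 20/100 with control.
diff, lower, upper = risk_difference_ci(10, 100, 20, 100)

# Hypothetical smallest reduction that clinicians would consider worthwhile.
smallest_important_reduction = 0.05

excludes_zero = upper < 0
may_be_unimportant = upper > -smallest_important_reduction

print(f"risk difference {diff:.3f}, 95% CI {lower:.3f} to {upper:.3f}")
print(excludes_zero, may_be_unimportant)
```

In this made-up example the interval excludes zero, so the result is statistically significant at the .05 level, but it also includes reductions far smaller than the assumed clinically important threshold, so the clinical importance of the effect remains uncertain.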
https://journals.lww.com/anesthesia-analgesia/Fulltext/2018/05000/Correlation_Coefficients__Appropriate_Use_and.50.aspx
<![CDATA[Correlation Coefficients: Appropriate Use and Interpretation]]>Correlation in the broadest sense is a measure of an association between variables. In correlated data, the change in the magnitude of 1 variable is associated with a change in the magnitude of another variable, either in the same (positive correlation) or in the opposite (negative correlation) direction. Most often, the term correlation is used in the context of a linear relationship between 2 continuous variables and expressed as Pearson product-moment correlation. The Pearson correlation coefficient is typically used for jointly normally distributed data (data that follow a bivariate normal distribution). For nonnormally distributed continuous data, for ordinal data, or for data with relevant outliers, a Spearman rank correlation can be used as a measure of a monotonic association. Both correlation coefficients are scaled such that they range from –1 to +1, where 0 indicates that there is no linear or monotonic association, and the relationship gets stronger and ultimately approaches a straight line (Pearson correlation) or a constantly increasing or decreasing curve (Spearman correlation) as the coefficient approaches an absolute value of 1. Hypothesis tests and confidence intervals can be used to address the statistical significance of the results and to estimate the strength of the relationship in the population from which the data were sampled. The aim of this tutorial is to guide researchers and clinicians in the appropriate use and interpretation of correlation coefficients.]]>Thu, 02 May 2019 19:44:55 GMT-05:00 00000539-201805000-00050
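The difference between a linear and a monotonic association can be sketched in a few lines. The helper functions `pearson`, `average_ranks`, and `spearman` and the data below are hypothetical illustrations written for this example; the Spearman coefficient is computed as the Pearson coefficient of the ranks, with tied values given their average rank.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: y always increases with x, but the last value is an outlier.
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 9, 60]

r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
print(round(r_pearson, 3), round(r_spearman, 3))
```

The relationship here is perfectly monotonic, so the Spearman coefficient is 1, while the outlier pulls the relationship away from a straight line and the Pearson coefficient is noticeably lower (about 0.78).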
https://journals.lww.com/anesthesia-analgesia/Fulltext/2018/06000/Agreement_Analysis__What_He_Said,_She_Said_Versus.49.aspx
<![CDATA[Agreement Analysis: What He Said, She Said Versus You Said]]>Correlation and agreement are 2 concepts that are widely applied in the medical literature and clinical practice to assess for the presence and strength of an association. However, because correlation and agreement are conceptually distinct, they require the use of different statistics. Agreement is a concept that is closely related to but fundamentally different from and often confused with correlation. The idea of agreement refers to the notion of reproducibility of clinical evaluations or biomedical measurements. The intraclass correlation coefficient is a commonly applied measure of agreement for continuous data. The intraclass correlation coefficient can be validly applied specifically to assess intrarater reliability and interrater reliability. As its name implies, the Lin concordance correlation coefficient is another measure of agreement or concordance. In undertaking a comparison of a new measurement technique with an established one, it is necessary to determine whether they agree sufficiently for the new to replace the old. Bland and Altman demonstrated that using a correlation coefficient is not appropriate for assessing the interchangeability of 2 such measurement methods. They in turn described an alternative approach, the since widely applied graphical Bland–Altman Plot, which is based on a simple estimation of the mean and standard deviation of differences between measurements by the 2 methods. In reading a medical journal article that includes the interpretation of diagnostic tests and application of diagnostic criteria, attention is conventionally focused on aspects like sensitivity, specificity, predictive values, and likelihood ratios. However, if the clinicians who interpret the test cannot agree on its interpretation and resulting typically dichotomous or binary diagnosis, the test results will be of little practical use. 
Such agreement between observers (interobserver agreement) about a dichotomous or binary variable is often reported as the kappa statistic. Assessing the interrater agreement between observers, in the case of ordinal variables and data, also has important biomedical applicability. Typically, this situation calls for use of the Cohen weighted kappa. Questionnaires, psychometric scales, and diagnostic tests are widespread and increasingly used by not only researchers but also clinicians in their daily practice. It is essential that these questionnaires, scales, and diagnostic tests have a high degree of agreement between observers. It is therefore vital that biomedical researchers and clinicians apply the appropriate statistical measures of agreement to assess the reproducibility and quality of these measurement instruments and decision-making processes.]]>Thu, 02 May 2019 19:45:39 GMT-05:00 00000539-201806000-00049
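Two of the agreement measures described in this abstract reduce to short calculations. The sketch below computes an unweighted Cohen kappa for 2 raters and a binary diagnosis, and the bias and 95% limits of agreement that underlie a Bland-Altman plot (mean difference ± 1.96 standard deviations). All counts and measurements are hypothetical, and the function names are assumptions made for this example.

```python
import statistics


def cohen_kappa(table):
    """Unweighted Cohen kappa from a square table of rater A by rater B counts."""
    total = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(len(table))) / total
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)


def limits_of_agreement(method_a, method_b):
    """Bland-Altman bias and 95% limits of agreement for paired measurements."""
    differences = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(differences)
    spread = 1.96 * statistics.stdev(differences)
    return bias, bias - spread, bias + spread


# Hypothetical ratings: rows are rater A (positive, negative), columns rater B.
kappa = cohen_kappa([[40, 10], [5, 45]])

# Hypothetical paired measurements from an established and a new method.
bias, lower, upper = limits_of_agreement(
    [10, 12, 14, 16, 18], [11, 12, 16, 15, 20]
)

print(round(kappa, 2))
print(round(bias, 2), round(lower, 2), round(upper, 2))
```

Observed agreement in the made-up table is 85%, but half of that agreement is expected by chance, which is why kappa is 0.70. Whether the limits of agreement are narrow enough for the new method to replace the old one is a clinical judgment, not a statistical one.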
https://journals.lww.com/anesthesia-analgesia/Fulltext/2018/07000/Regression__The_Apple_Does_Not_Fall_Far_From_the.45.aspx
<![CDATA[Regression: The Apple Does Not Fall Far From the Tree]]>Researchers and clinicians are frequently interested in either: (1) assessing whether there is a relationship or association between 2 or more variables and quantifying this association; or (2) determining whether 1 or more variables can predict another variable. The strength of such an association is mainly described by the correlation. However, regression analysis and regression models can be used not only to identify whether there is a significant relationship or association between variables but also to generate estimations of such a predictive relationship between variables. This basic statistical tutorial discusses the fundamental concepts and techniques related to the most common types of regression analysis and modeling, including simple linear regression, multiple regression, logistic regression, ordinal regression, and Poisson regression, as well as the common yet often underrecognized phenomenon of regression toward the mean. The various types of regression analysis are powerful statistical techniques, which when appropriately applied, can allow for the valid interpretation of complex, multifactorial data. Regression analysis and models can assess whether there is a relationship or association between 2 or more observed variables and estimate the strength of this association, as well as determine whether 1 or more variables can predict another variable. Regression is thus being applied more commonly in anesthesia, perioperative, critical care, and pain research. However, it is crucial to note that regression can identify plausible risk factors; it does not prove causation (a definitive cause and effect relationship). The results of a regression analysis instead identify independent (predictor) variable(s) associated with the dependent (outcome) variable. 
As with other statistical methods, applying regression requires that certain assumptions be met, which can be tested with specific diagnostics.]]>Thu, 02 May 2019 19:46:22 GMT-05:0000000539-201807000-00045
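For readers who want to see the simplest case mentioned above, here is a minimal sketch of simple linear regression fitted by ordinary least squares. The function name and the input data are hypothetical; the abstract itself does not specify an implementation, and the sketch performs none of the assumption diagnostics it mentions.

```python
from statistics import mean


def simple_linear_regression(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (intercept, slope). Assumes x has at least 2 distinct values.
    """
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of x, and cross-products of x and y
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope
```

The slope estimates the average change in the outcome per unit change in the predictor, which describes an association and not, on its own, a causal effect.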
https://journals.lww.com/anesthesia-analgesia/Fulltext/2018/08000/Repeated_Measures_Designs_and_Analysis_of.41.aspx
<![CDATA[Repeated Measures Designs and Analysis of Longitudinal Data: If at First You Do Not Succeed—Try, Try Again]]>Anesthesia, critical care, perioperative, and pain research often involves study designs in which the same outcome variable is repeatedly measured or observed over time on the same patients. Such repeatedly measured data are referred to as longitudinal data, and longitudinal study designs are commonly used to investigate changes in an outcome over time and to compare these changes among treatment groups. From a statistical perspective, longitudinal studies usually increase the precision of estimated treatment effects, thus increasing the power to detect such effects. Commonly used statistical techniques mostly assume independence of the observations or measurements. However, values repeatedly measured in the same individual will usually be more similar to each other than values of different individuals and ignoring the correlation between repeated measurements may lead to biased estimates as well as invalid P values and confidence intervals. Therefore, appropriate analysis of repeated-measures data requires specific statistical techniques. This tutorial reviews 3 classes of commonly used approaches for the analysis of longitudinal data. The first class uses summary statistics to condense the repeatedly measured information to a single number per subject, thus basically eliminating within-subject repeated measurements and allowing for a straightforward comparison of groups using standard statistical hypothesis tests. The second class is historically popular and comprises the repeated-measures analysis of variance type of analyses. However, strong assumptions that are seldom met in practice and low flexibility limit the usefulness of this approach. The third class comprises modern and flexible regression-based techniques that can be generalized to accommodate a wide range of outcome data including continuous, categorical, and count data. 
Such methods can be further divided into so-called “population-average statistical models” that focus on the specification of the mean response of the outcome estimated by generalized estimating equations, and “subject-specific models” that allow a full specification of the distribution of the outcome by using random effects to capture within-subject correlations. The choice of approach partly depends on the aim of the research and the desired interpretation of the estimated effects (population-average versus subject-specific interpretation). This tutorial discusses aspects of the theoretical background for each technique and, with specific examples of studies published in Anesthesia & Analgesia, demonstrates how these techniques are used in practice.]]>Thu, 02 May 2019 19:47:24 GMT-05:0000000539-201808000-00041
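The first class of approaches described above, condensing repeated measurements to a single number per subject, can be sketched as follows. Using the per-subject mean as the summary statistic, the dictionary input layout, and the function names are all assumptions chosen for illustration; other summaries (for example a slope or an area under the curve) would follow the same pattern.

```python
from statistics import mean


def subject_summaries(measurements_by_subject):
    """Collapse each subject's repeated measurements to their mean.

    measurements_by_subject maps a subject ID to a list of repeated values.
    """
    return {
        subject: mean(values)
        for subject, values in measurements_by_subject.items()
    }


def group_mean_difference(group_a, group_b):
    """Difference in group means of the per-subject summaries (A minus B)."""
    summary_a = list(subject_summaries(group_a).values())
    summary_b = list(subject_summaries(group_b).values())
    return mean(summary_a) - mean(summary_b)
```

Because each subject now contributes exactly one value, the summaries can be passed to a standard two-sample test without violating the independence assumption.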
https://journals.lww.com/anesthesia-analgesia/Fulltext/2018/09000/Survival_Analysis_and_Interpretation_of.32.aspx
<![CDATA[Survival Analysis and Interpretation of Time-to-Event Data: The Tortoise and the Hare]]>Survival analysis, or more generally, time-to-event analysis, refers to a set of methods for analyzing the length of time until the occurrence of a well-defined end point of interest. A unique feature of survival data is that typically not all patients experience the event (eg, death) by the end of the observation period, so the actual survival times for some patients are unknown. This phenomenon, referred to as censoring, must be accounted for in the analysis to allow for valid inferences. Moreover, survival times are usually skewed, limiting the usefulness of analysis methods that assume a normal data distribution. As part of the ongoing series in Anesthesia & Analgesia, this tutorial reviews statistical methods for the appropriate analysis of time-to-event data, including nonparametric and semiparametric methods—specifically the Kaplan-Meier estimator, log-rank test, and Cox proportional hazards model. These methods are by far the most commonly used techniques for such data in medical literature. Illustrative examples from studies published in Anesthesia & Analgesia demonstrate how these techniques are used in practice. Full parametric models and models to deal with special circumstances, such as recurrent events models, competing risks models, and frailty models, are briefly discussed.]]>Thu, 02 May 2019 19:47:59 GMT-05:0000000539-201809000-00032
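To make the role of censoring concrete, the following is a minimal sketch of the Kaplan-Meier product-limit estimator. The function name, the 1/0 coding of events versus censored observations, and the input format are assumptions for illustration; the sketch produces no confidence intervals and is not the tutorial's own code.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimates at each distinct event time.

    times:  follow-up time for each patient
    events: 1 if the event was observed at that time, 0 if censored
    Returns a list of (time, estimated survival probability) pairs.
    """
    survival = 1.0
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        # Patients still under observation just before time t
        at_risk = sum(1 for ti in times if ti >= t)
        # Events occurring exactly at time t (censored patients excluded)
        n_events = sum(1 for ti, ei in zip(times, events) if ti == t and ei)
        survival *= 1 - n_events / at_risk
        curve.append((t, survival))
    return curve
```

Censored patients never count as events, but they remain in the at-risk denominator until their censoring time, which is how the estimator uses their partial follow-up.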
https://journals.lww.com/anesthesia-analgesia/Fulltext/2018/10000/Diagnostic_Testing_and_Decision_Making__Beauty_Is.39.aspx
<![CDATA[Diagnostic Testing and Decision-Making: Beauty Is Not Just in the Eye of the Beholder]]>To use a diagnostic test effectively and consistently in their practice, clinicians need to know how well the test distinguishes between those patients who have the suspected acute or chronic disease and those patients who do not. Clinicians are equally interested in, and usually more concerned with, whether, based on the results of a screening test, a given patient actually: (1) does or does not have the suspected disease; or (2) will or will not subsequently experience the adverse event or outcome. Medical tests that are performed to screen for a risk factor, diagnose a disease, or estimate a patient’s prognosis are frequently a key component of a clinical research study. Like therapeutic interventions, medical tests require proper analysis and demonstrated efficacy before being incorporated into routine clinical practice. This basic statistical tutorial thus discusses the fundamental concepts and techniques related to diagnostic testing and medical decision-making, including sensitivity and specificity, positive predictive value and negative predictive value, positive and negative likelihood ratio, receiver operating characteristic curve, diagnostic accuracy, choosing a best cut-point for a continuous variable biomarker, comparing methods on diagnostic accuracy, and design of a diagnostic accuracy study.]]>Thu, 02 May 2019 19:49:13 GMT-05:0000000539-201810000-00039
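Several of the quantities listed in this abstract follow directly from a 2×2 table of test result versus true disease status. The sketch below shows the standard definitions; the function name, argument names, and dictionary keys are hypothetical, and the sketch assumes no cell combination produces a zero denominator.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Diagnostic accuracy measures from 2x2 table counts.

    tp/fn: diseased patients testing positive/negative
    fp/tn: non-diseased patients testing positive/negative
    """
    sensitivity = tp / (tp + fn)  # P(test positive | disease)
    specificity = tn / (tn + fp)  # P(test negative | no disease)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),  # P(disease | test positive)
        "npv": tn / (tn + fn),  # P(no disease | test negative)
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }
```

Sensitivity, specificity, and the likelihood ratios describe the test itself, whereas the predictive values also depend on how common the disease is in the tested sample.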
https://journals.lww.com/anesthesia-analgesia/Fulltext/2019/01000/Psychometrics__Trust,_but_Verify.27.aspx
<![CDATA[Psychometrics: Trust, but Verify]]>There is a continued mandate for practicing evidence-based medicine and the prerequisite rigorous analysis of the comparative effectiveness of alternative treatments. There is also an increasing emphasis on delivering value-based health care. Both these high priorities and their related endeavors require correct information about the outcomes of care. Accurately measuring and confirming health care outcomes are thus likely now of even greater importance. The present basic statistical tutorial focuses on the germane topic of psychometrics. In its narrower sense, psychometrics is the science of evaluating the attributes of psychological tests. However, in its broader sense, psychometrics is concerned with the objective measurement of the skills, knowledge, and abilities, as well as the subjective measurement of the interests, values, and attitudes of individuals, both patients and their clinicians. While psychometrics is principally the domain and content expertise of psychiatry, psychology, and social work, it is also very pertinent to patient care, education, and research in anesthesiology, perioperative medicine, critical care, and pain medicine. A key step in selecting an existing or creating a new health-related assessment tool, scale, or survey is confirming or establishing the usefulness of the existing or new measure; this process conventionally involves assessing its reliability and its validity. Assessing reliability involves demonstrating that the measurement instrument generates consistent and hence reproducible results; in other words, whether the instrument produces the same results each time it is used in the same setting, with the same type of subjects. This includes interrater reliability, intrarater reliability, test–retest reliability, and internal reliability. Assessing validity is answering whether the instrument is actually measuring what it is intended to measure. 
This includes content validity, criterion validity, and construct validity. In evaluating a reported set of research data and its analyses, in a similar manner, it is important to assess the overall internal validity of the attendant study design and the external validity (generalizability) of its findings.]]>Thu, 02 May 2019 19:49:57 GMT-05:0000000539-201901000-00027
https://journals.lww.com/anesthesia-analgesia/Fulltext/2019/02000/Statistical_Process_Control__No_Hits,_No_Runs,_No.24.aspx
<![CDATA[Statistical Process Control: No Hits, No Runs, No Errors?]]>A novel intervention or new clinical program must achieve and sustain its operational and clinical goals. To demonstrate that they have successfully optimized health care value, providers and other stakeholders must longitudinally measure and report the relevant associated outcomes. This includes clinicians and perioperative health services researchers who choose to participate in these process improvement and quality improvement efforts (“play in this space”). Statistical process control is a branch of statistics that combines rigorous sequential, time-based analysis methods with graphical presentation of performance and quality data. Statistical process control and its primary tool, the control chart, provide researchers and practitioners with a method of better understanding and communicating data from health care performance and quality improvement efforts. Statistical process control presents performance and quality data in a format that is typically more understandable to practicing clinicians, administrators, and health care decision makers and often more readily generates actionable insights and conclusions. Health care quality improvement is predicated on statistical process control. Undertaking, achieving, and reporting continuous quality improvement in anesthesiology, critical care, perioperative medicine, and acute and chronic pain management all fundamentally rely on applying statistical process control methods and tools. Thus, the present basic statistical tutorial focuses on the germane topic of statistical process control, including random (common) causes of variation versus assignable (special) causes of variation, Six Sigma versus Lean versus Lean Six Sigma, levels of quality management, run chart, control charts, selecting the applicable type of control chart, and analyzing a control chart. 
Specific attention is focused on quasi-experimental study designs, which are particularly applicable to process improvement and quality improvement efforts.]]>Thu, 02 May 2019 19:51:00 GMT-05:0000000539-201902000-00024
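One common control chart, the individuals (XmR) chart, can be sketched as below to show how common-cause variation is separated from a possible special cause. The choice of chart type, the function names, and the single "point outside the 3-sigma limits" rule are assumptions for illustration; the abstract does not say which chart or rules the tutorial recommends for a given data type.

```python
from statistics import mean


def individuals_chart_limits(values):
    """Return (lower, center, upper) control limits for an individuals chart.

    Sigma is estimated from the average moving range; 2.66 = 3 / 1.128,
    where 1.128 is the d2 constant for moving ranges of 2 observations.
    """
    center = mean(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = mean(moving_ranges)
    return center - 2.66 * mr_bar, center, center + 2.66 * mr_bar


def special_cause_points(values):
    """Indices of observations falling outside the control limits."""
    lower, _, upper = individuals_chart_limits(values)
    return [i for i, v in enumerate(values) if v < lower or v > upper]
```

Points inside the limits are treated as common-cause variation; a point outside them is a signal that an assignable cause may warrant investigation.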
https://journals.lww.com/anesthesia-analgesia/Fulltext/2019/03000/Systematic_Review_and_Meta_analysis__Sometimes.25.aspx
<![CDATA[Systematic Review and Meta-analysis: Sometimes Bigger Is Indeed Better]]>Clinicians encounter an ever-increasing and frequently overwhelming amount of information, even in a narrow scope or area of interest. Given this enormous amount of scientific information published every year, systematic reviews and meta-analyses have become indispensable methods for the evaluation of medical treatments and the delivery of evidence-based best practice. The present basic statistical tutorial thus focuses on the fundamentals of a systematic review and meta-analysis, against the backdrop of practicing evidence-based medicine. Even if properly performed, a single study is no more than tentative evidence, which needs to be confirmed by additional, independent research. A systematic review summarizes the existing, published research on a particular topic, in a well-described, methodical, rigorous, and reproducible (hence “systematic”) manner. A systematic review typically includes a greater range of patients than any single study, thus strengthening the external validity or generalizability of its findings and the utility to the clinician seeking to practice evidence-based medicine. A systematic review often forms the basis for a concomitant meta-analysis, in which the results from the identified series of separate studies are aggregated and statistical pooling is performed. This allows for a single best estimate of the effect or association. A conjoint systematic review and meta-analysis can provide an estimate of therapeutic efficacy, prognosis, or diagnostic test accuracy. By aggregating and pooling the data derived from a systematic review, a well-done meta-analysis essentially increases the precision and the certainty of the statistical inference. The resulting single best estimate of effect or association facilitates clinical decision making and practicing evidence-based medicine. 
A well-designed systematic review and meta-analysis can provide valuable information for researchers, policymakers, and clinicians. However, there are many critical caveats in performing and interpreting them, and thus, like the individual research studies on which they are based, there are many ways in which meta-analyses can yield misleading information. Creators, reviewers, and consumers alike of systematic reviews and meta-analyses would thus be well-served to observe and mitigate their associated caveats and potential pitfalls.]]>Thu, 02 May 2019 19:51:36 GMT-05:0000000539-201903000-00025
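The gain in precision from statistical pooling can be illustrated with a minimal inverse-variance sketch. It assumes a fixed-effect model (all studies estimate one common effect), which the abstract does not specify; the function name, arguments, and the 1.96 multiplier for a 95% confidence interval are likewise assumptions, and heterogeneity between studies is ignored.

```python
from math import sqrt


def fixed_effect_pool(effects, standard_errors, z=1.96):
    """Inverse-variance weighted pooled estimate under a fixed-effect model.

    effects:         per-study effect estimates on a common scale
    standard_errors: the corresponding per-study standard errors
    Returns (pooled_effect, pooled_se, (ci_lower, ci_upper)).
    """
    # More precise studies (smaller standard errors) receive larger weights
    weights = [1 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    pooled_se = sqrt(1 / total_weight)
    return pooled, pooled_se, (pooled - z * pooled_se, pooled + z * pooled_se)
```

Under this model the pooled standard error is smaller than that of any single contributing study, which is the sense in which pooling increases precision.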
https://journals.lww.com/anesthesia-analgesia/Fulltext/2018/01000/Writing_Research_Reports.47.aspx
<![CDATA[Writing Research Reports]]>Clear writing makes manuscripts easier to understand. Clear writing enhances research reports, increasing clinical adoption and scientific impact. We discuss styles and organization to help junior investigators present their findings and avoid common errors.]]>Thu, 02 May 2019 19:42:17 GMT-05:0000000539-201801000-00047