WHEN TO USE WHAT STATISTICAL TEST IN RESEARCH

There are many types of statistical tests for analyzing research data, and the challenge is often knowing which one to use when. This piece provides a simplified guide.

1️⃣t-test:

- Use when: You want to compare the means of two groups to determine if there's a significant difference.
- Example: You want to compare the average score of students who received traditional teaching vs. those who received innovative teaching.
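
To make the teaching example concrete, here is a minimal sketch of Welch's t statistic (a common t-test variant that does not assume equal variances), using only Python's standard library. The scores and the function name `welch_t` are made up for illustration, and the sketch stops at the statistic and degrees of freedom rather than computing a p-value.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    # Squared standard error contributed by each group (sample variance / n).
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical exam scores for the two teaching methods.
traditional = [70, 72, 68, 75, 71]
innovative = [78, 80, 77, 82, 79]

t, df = welch_t(traditional, innovative)
print(f"t = {t:.3f}, df = {df:.2f}")
```

A large absolute t relative to the t distribution with `df` degrees of freedom suggests the two group means differ.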

2️⃣ANOVA (Analysis of Variance):

- Use when: You want to compare the means of three or more groups to determine if there are significant differences.
- Example: You want to compare the average score of students from different schools to determine if there are significant differences in their performance.
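
As a rough sketch of what one-way ANOVA computes, the F statistic is the ratio of between-group variance to within-group variance. The school scores and the function name `one_way_f` below are hypothetical, and the sketch does not compute a p-value.

```python
from statistics import mean


def one_way_f(*groups):
    """Return the one-way ANOVA F statistic for two or more groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical scores from three schools.
school_a = [60, 70, 80]
school_b = [70, 80, 90]
school_c = [80, 90, 100]

f_stat = one_way_f(school_a, school_b, school_c)
print(f"F = {f_stat:.2f}")
```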

3️⃣Regression (Simple and Multiple):

- Use when: You want to examine the relationship between a dependent variable and one or more independent variables.
- Example: You want to examine the relationship between hours studied and exam scores (simple regression), or how hours studied and student motivation together predict exam scores (multiple regression).
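
For the simple regression case, the fitted line can be sketched with ordinary least squares in a few lines of plain Python. The data and the function name `fit_line` are assumptions for illustration only.

```python
def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Covariation of x and y, and variation of x, around their means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data: hours studied and the exam score obtained.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 61, 67, 72]

slope, intercept = fit_line(hours, scores)
print(f"score = {slope:.1f} * hours + {intercept:.1f}")
```

Here the slope is read as the expected change in exam score for each additional hour studied.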

4️⃣Chi-squared test:

- Use when: You want to determine if there's a significant association between two categorical variables.
- Example: You want to determine if there's a significant association between smoking and lung cancer.
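
The test statistic compares observed counts with the counts expected if the two variables were independent. Below is a minimal sketch for a contingency table. The counts are invented and the function name `chi_squared` is hypothetical.

```python
def chi_squared(table):
    """Return (statistic, degrees of freedom) for a contingency table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


# Hypothetical counts: rows are smokers / non-smokers,
# columns are lung cancer / no lung cancer.
table = [[30, 70],
         [10, 90]]

stat, df = chi_squared(table)
print(f"chi-squared = {stat:.2f}, df = {df}")
```

With one degree of freedom, a statistic above roughly 3.84 corresponds to significance at the 5% level.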

5️⃣Wilcoxon rank-sum test (Mann-Whitney U test):

- Use when: You want to compare the distributions of two independent groups, typically when the data are ordinal or not normally distributed.
- Example: You want to compare the distribution of scores between students who received traditional teaching and those who received innovative teaching.
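
The U statistic behind this test can be sketched by counting, over all pairs, how often a value from one group exceeds a value from the other. The scores and the function name `mann_whitney_u` are illustrative assumptions, and no p-value is computed.

```python
def mann_whitney_u(a, b):
    """Return the Mann-Whitney U statistic (the smaller of the two U values)."""
    # For each pair, score 1 if the value from a is larger, 0.5 for a tie.
    u_a = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)
    u_b = len(a) * len(b) - u_a
    return min(u_a, u_b)


# Hypothetical scores for the two teaching methods.
traditional = [55, 60, 62, 70]
innovative = [65, 72, 75, 80]

u = mann_whitney_u(traditional, innovative)
print(f"U = {u}")
```

A U close to zero means the two groups barely overlap.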

6️⃣Kruskal-Wallis H test:

- Use when: You want to compare the distributions of three or more independent groups.
- Example: You want to compare the distribution of scores among students from different schools.

7️⃣Friedman test:

- Use when: You want to compare the distributions of three or more related groups.
- Example: You want to compare the distribution of scores for the same students measured at three or more time points.

8️⃣Pearson correlation coefficient:

- Use when: You want to examine the linear relationship between two continuous variables.
- Example: You want to examine the relationship between hours studied and exam scores.
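
As a small sketch, the coefficient is the covariation of the two variables divided by the product of their spreads. The data and the function name `pearson_r` are assumptions for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: hours studied and the exam score obtained.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 61, 67, 72]

r = pearson_r(hours, scores)
print(f"r = {r:.3f}")
```

Values near +1 or -1 indicate a strong linear relationship, and values near 0 indicate little or none.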

9️⃣Spearman rank correlation coefficient:

- Use when: You want to examine the monotonic relationship between two variables, for example when the data are ordinal or not normally distributed.
- Example: You want to examine the relationship between ranking of favorite foods and ranking of nutritional value.
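
One way to sketch this coefficient is to replace each value by its rank and then compute the Pearson correlation of the ranks. The rankings and the function names below (`ranks`, `pearson_r`, `spearman_rho`) are hypothetical.

```python
from math import sqrt


def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson_r(x, y):
    """Return the Pearson correlation coefficient."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_rho(x, y):
    """Return Spearman's rho: the Pearson correlation of the ranks."""
    return pearson_r(ranks(x), ranks(y))


# Hypothetical rankings of five foods by preference and by nutritional value.
taste_rank = [1, 2, 3, 4, 5]
nutrition_rank = [5, 3, 4, 1, 2]

rho = spearman_rho(taste_rank, nutrition_rank)
print(f"rho = {rho:.2f}")
```

Because only ranks are used, a curved but consistently increasing relationship still gives a rho of 1.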

🔟Kendall's tau correlation coefficient:

- Use when: You want to examine the relationship between two variables when the data are ordinal, especially with small samples or many tied ranks.
- Example: You want to examine the relationship between socioeconomic status and education level.
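
The idea can be sketched by counting pairs of observations that are ordered the same way on both variables (concordant) or in opposite ways (discordant). This is the simpler tau-a variant, which makes no correction for ties, and the ordinal codes and the function name `kendall_tau_a` are made up for illustration.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Return Kendall's tau-a (no tie correction) for two equal-length sequences."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # pair ordered the same way on both variables
        elif direction < 0:
            discordant += 1      # pair ordered in opposite ways
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical ordinal codes (1 = lowest) for five people.
socioeconomic_status = [1, 2, 3, 4, 5]
education_level = [1, 3, 2, 5, 4]

tau = kendall_tau_a(socioeconomic_status, education_level)
print(f"tau = {tau:.1f}")
```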

1️⃣1️⃣ARIMA models:

- Use when: You want to forecast future values of a time series.
- Example: You want to predict stock prices based on past trends.

1️⃣2️⃣Exponential smoothing (ES):

- Use when: You want to forecast future values of a time series using a weighted average of past observations, with weights that decay exponentially for older data.
- Example: You want to predict sales based on past trends.
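
A minimal sketch of simple exponential smoothing: each new observation is blended with the running level using a smoothing factor between 0 and 1. The sales figures, the choice of 0.5 for the factor, and the function name are assumptions for illustration.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the one-step-ahead forecast after smoothing the whole series."""
    level = series[0]  # start the level at the first observation
    for value in series[1:]:
        # A higher alpha gives more weight to the most recent observation.
        level = alpha * value + (1 - alpha) * level
    return level


# Hypothetical monthly sales.
sales = [10, 20, 30]

forecast = simple_exponential_smoothing(sales, alpha=0.5)
print(f"next-period forecast = {forecast}")
```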

1️⃣3️⃣Seasonal decomposition:

- Use when: You want to decompose time series data into trend, seasonality, and residuals.
- Example: You want to analyze website traffic data to identify seasonal patterns.

1️⃣4️⃣Kaplan-Meier estimator:

- Use when: You want to estimate the survival function of a population.
- Example: You want to analyze the survival rate of patients with a specific disease.
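
The estimator can be sketched as a running product: at each time an event occurs, the survival probability is multiplied by the fraction of at-risk subjects who did not have the event. The follow-up times, the censoring flags, and the function name `kaplan_meier` are hypothetical.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival probability)] at each distinct event time.

    events[i] is True if the event was observed for subject i and
    False if that subject was censored at durations[i].
    """
    survival = 1.0
    curve = []
    event_times = sorted({d for d, e in zip(durations, events) if e})
    for t in event_times:
        # Subjects still under observation just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        deaths = sum(1 for d, e in zip(durations, events) if d == t and e)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical follow-up times in months; the second patient is censored.
durations = [1, 2, 3, 4]
events = [True, False, True, True]

curve = kaplan_meier(durations, events)
for time, probability in curve:
    print(time, probability)
```

The censored patient still counts as at risk up to month 2, which is what distinguishes this estimate from a simple proportion of survivors.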

1️⃣5️⃣Cox proportional hazards model:

- Use when: You want to examine the relationship between covariates and survival time.
- Example: You want to investigate the effect of treatment on survival time.

1️⃣6️⃣Log-rank test:

- Use when: You want to compare the survival curves of two or more groups.
- Example: You want to compare the survival rates of patients with different treatments.

1️⃣7️⃣K-means clustering:

- Use when: You want to group similar observations into clusters based on features.
- Example: You want to segment customers based on buying behavior.
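
To show the mechanics, here is a deliberately small sketch of k-means on a single feature: assign each value to its nearest center, then move each center to the mean of its cluster, and repeat. The spending figures, the starting centers, and the function name `kmeans_1d` are assumptions for illustration. Real use normally involves several features and randomized starting centers.

```python
def kmeans_1d(values, centers, iterations=10):
    """Run k-means on one-dimensional data from the given starting centers."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters


# Hypothetical monthly spend per customer.
spend = [10, 12, 11, 50, 52, 51]

centers, clusters = kmeans_1d(spend, centers=[10, 50])
print(centers, clusters)
```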

1️⃣8️⃣Hierarchical clustering:

- Use when: You want to group similar observations into clusters based on features, with a hierarchical structure.
- Example: You want to analyze gene expression data to identify clusters of genes.

1️⃣9️⃣DBSCAN (density-based spatial clustering of applications with noise):

- Use when: You want to group similar observations into clusters based on features, with noise handling.
- Example: You want to analyze spatial data to identify clusters of high density.

2️⃣0️⃣Principal component analysis (PCA):

- Use when: You want to reduce the dimensionality of a dataset by identifying principal components.
- Example: You want to analyze stock prices to identify principal components of variation.

2️⃣1️⃣Discriminant analysis:

- Use when: You want to predict group membership based on multivariate data.
- Example: You want to predict customer churn based on usage patterns.

2️⃣2️⃣Canonical correlation analysis:

- Use when: You want to examine the relationship between two sets of multivariate data.
- Example: You want to investigate the relationship between personality traits and behavior.

2️⃣3️⃣Bayesian inference:

- Use when: You want to update probabilities based on new data.
- Example: You want to update the probability of a hypothesis based on new evidence.
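
The update step is Bayes' rule. The sketch below applies it to a single piece of evidence, using made-up numbers: a hypothesis with a 1% prior probability, evidence that appears 90% of the time when the hypothesis is true and 10% of the time when it is false. The function name `posterior` and its parameter names are hypothetical.

```python
def posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """Return P(hypothesis | evidence) using Bayes' rule."""
    # Total probability of seeing the evidence under either outcome.
    evidence = p_evidence_if_true * prior + p_evidence_if_false * (1 - prior)
    return p_evidence_if_true * prior / evidence


updated = posterior(prior=0.01, p_evidence_if_true=0.9, p_evidence_if_false=0.1)
print(f"posterior = {updated:.3f}")
```

The updated value can serve as the prior when the next piece of evidence arrives.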

2️⃣4️⃣Bayesian regression:

- Use when: You want to model the relationship between variables using Bayesian methods.
- Example:

2️⃣5️⃣Bayesian networks:

- Use when: You want to model complex relationships between variables using Bayesian methods.
- Example: You want to model the relationship between genes and diseases.

2️⃣6️⃣Decision trees:

- Use when: You want to classify observations based on a tree-like model.
- Example: You want to predict customer churn based on usage patterns.

2️⃣7️⃣Random forests:

- Use when: You want to classify observations based on an ensemble of decision trees.
- Example: You want to predict disease diagnosis based on symptoms.

2️⃣8️⃣Support vector machines (SVMs):

- Use when: You want to classify observations by finding the hyperplane that best separates the classes.
- Example: You want to predict customer churn based on usage patterns.

2️⃣9️⃣Cluster analysis:

- Use when: You want to group similar observations into clusters based on features.
- Example: You want to segment customers based on buying behavior.

3️⃣0️⃣Factor analysis:

- Use when: You want to reduce the dimensionality of a dataset by identifying underlying factors.
- Example: You want to analyze survey data to identify underlying factors of satisfaction.

3️⃣1️⃣Survival analysis:

- Use when: You want to analyze time-to-event data.
- Example: You want to analyze the survival rate of patients with a specific disease.

3️⃣2️⃣Time-series analysis:

- Use when: You want to analyze data that is ordered in time.
- Example: You want to analyze stock prices to identify patterns and trends.

3️⃣3️⃣Non-parametric tests:

- Use when: You want to analyze data without assuming a specific distribution.
- Example: You want to compare the median scores of students who received traditional teaching vs. those who received innovative teaching.

3️⃣4️⃣Machine learning algorithms:

- Use when: You want to predict outcomes or classify observations based on large datasets.
- Example: You want to predict customer churn based on usage patterns.

The specific test or technique used depends on the research question, data type, and study design.



