PakMediNet - Medical Information Gateway of Pakistan

Discussion Forum For Health Professionals

Topic Review - Newest First (only newest 5 are displayed)

anwer_khur

Re: What is wrong with this

What we have discussed so far deals with nonlinear transformations (log, reciprocal, square). With a linear transformation, however (for example, converting Centigrade to Fahrenheit, F = (9/5) C + 32), the mean of the transformed data, converted back to the original scale, is exactly the mean of the untransformed data.
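To illustrate the linear case, here is a minimal Python sketch. The temperature values are made up for the example, and the helper names c_to_f and f_to_c are only illustrative.

```python
import math
from statistics import mean

celsius = [36.5, 37.0, 38.2, 39.1]  # made-up temperatures in degrees C

def c_to_f(c):
    """Linear transformation: F = (9/5) C + 32."""
    return 9 / 5 * c + 32

def f_to_c(f):
    """Inverse of c_to_f."""
    return (f - 32) * 5 / 9

mean_c = mean(celsius)                        # mean on the original scale
mean_f = mean([c_to_f(c) for c in celsius])   # mean on the transformed scale
back = f_to_c(mean_f)                         # back-transformed mean

print(mean_c, back)
# The two agree up to floating-point rounding.
print(math.isclose(mean_c, back))
```

The same agreement holds for any transformation of the form a*X + b, because the mean of a*X + b is a times the mean of X, plus b.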

rqayyum

Re: What is wrong with this

Thank you

anwer_khur

Re: What is wrong with this

Thanks for continuing the discussion. The question is whether back-transformation is legitimate. Like most questions in statistics, the answer is "it depends...". Strictly speaking, back-transformation is valid and useful for interpretation because it returns the results to the original measurement scale. However, once data have been transformed, interpreting what the transformed (or back-transformed) means, regression coefficients, CIs and differences among means represent requires special care and is not necessarily intuitive. In short, the old caution applies: if you transform data to meet the assumptions of a statistical test, the (medical, biological) interpretation of the output should be made with care.

A word of warning: with log and other non-linear transformations, the back-transformed mean of the transformed variable will generally not be the same as the mean of the original raw variable (the two coincide only when all the values are identical). Back-transforming the mean of the logs yields the so-called geometric mean of the variable, which isn't easily interpreted.
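A small Python sketch of this warning, using made-up, positively skewed values. The helper back_transformed_mean is a hypothetical name introduced only for this illustration.

```python
import math
from statistics import mean

raw = [1.0, 2.0, 4.0, 8.0, 16.0]  # made-up, positively skewed values

def back_transformed_mean(data, forward, inverse):
    """Transform, average on the transformed scale, then invert."""
    return inverse(mean([forward(x) for x in data]))

results = {
    "raw mean": mean(raw),
    "log": back_transformed_mean(raw, math.log, math.exp),
    "reciprocal": back_transformed_mean(raw, lambda x: 1 / x, lambda y: 1 / y),
    "square": back_transformed_mean(raw, lambda x: x * x, math.sqrt),
}

for name, value in results.items():
    print(f"{name:>10}: {value:.3f}")

# When every value is identical, the back-transformed mean matches the raw mean.
constant = back_transformed_mean([5.0, 5.0, 5.0], math.log, math.exp)
print(math.isclose(constant, 5.0))
```

For these values the raw mean is 6.2, while the back-transformed log mean is about 4 (the geometric mean). The reciprocal route gives a smaller value still and the square route a larger one, so each non-linear transformation answers a slightly different question about the "centre" of the data.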

rqayyum

Re: What is wrong with this

Thank you Prof. Khurshid. This detail is very helpful and makes quite a few things clearer.

What surprises me is that transforming data (any data, normal or non-normal) gives a different mean from the untransformed data, i.e. the mean is not stable. I am wondering whether I have misunderstood something, or whether the mean really is not the same once data go through a transformation. The reason I am spending so much time on this is that I have noticed many parametric tests use the mean. If the mean is not stable after transformation, should we really be transforming data?

anwer_khur

Re: What is wrong with this

Sometimes we come across data for which a linear regression model is not appropriate: a residual analysis suggests that one or more of the assumptions of the linear regression model are violated. Recall that the linear regression model assumes the following:

Independence:
The response variables are independent.

Normality:
The response variables are normally distributed.

Homoscedasticity:
The response variables all have the same variance.

Linearity:
The true relationship between the mean of the response variable and the explanatory variables is a straight line.

The need to transform data most often arises from non-normality, and also from unequal variances or a non-linear relationship; a transformation does not remedy non-independence. Data transformation may seem like a lot of manipulation at first glance, but it just involves placing the data on another scale.
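As a rough illustration of how a residual analysis can point to a transformation, here is a Python sketch. The data are made up and exactly exponential (an assumption chosen to make the pattern obvious), and fit_line and residuals are hypothetical helper names implementing an ordinary least-squares fit of a straight line by hand.

```python
import math
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def residuals(xs, ys):
    """Observed minus fitted values for the least-squares line."""
    slope, intercept = fit_line(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

xs = [1, 2, 3, 4, 5, 6]
ys = [2.0 * math.exp(0.5 * x) for x in xs]  # made-up, exactly exponential

raw_res = residuals(xs, ys)                         # fit on the original scale
log_res = residuals(xs, [math.log(y) for y in ys])  # fit on the log scale

print([round(r, 2) for r in raw_res])
print(max(abs(r) for r in log_res))
```

On the original scale the residuals are positive at both ends and negative in the middle, a curved pattern that contradicts the linearity assumption. On the log scale the relationship is a straight line, so the residuals are essentially zero. Real data will of course be noisier than this.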

As you have written, you are transforming the data. So the question is: what is a transformation, and why use one? By transformation we mean "a change in the scale for the values of a variable obtained by using some mathematical operations". Sometimes transformations are performed to simplify calculations. Frequently, transformations are made so that the transformed data satisfy the assumptions underlying a given statistical procedure.

The following is a brief summary of the three commonly used transformations that you mentioned.

1. Logarithmic transformation: It is used when (a) the variances are not equal (heterogeneity of variances), (b) the standard deviations are proportional to the means (the CVs are equal), (c) the data are positively skewed.

Procedure:
Step 1: Convert raw data into their logarithms by X' = log(X) or X' = log(X + 1), depending on the data (the second form is needed when the data contain zeros).
Step 2: Perform analysis on log data.
Step 3: Convert back into units of the raw data by taking the antilog of the results.

{If you take logarithms of the sample values (i.e., transform the sample), find the arithmetic mean of the logs, and then retransform back to the original scale (by taking antilogs), the result is the sample geometric mean.}
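The three steps above can be sketched in Python as follows. The two groups are made up so that the second is simply ten times the first (hence equal CVs), base-10 logs are an arbitrary choice for the illustration, and log_summary is a hypothetical helper name.

```python
import math
from statistics import geometric_mean, mean, stdev

low = [1.0, 2.0, 4.0]
high = [10.0, 20.0, 40.0]  # the same values times 10, so the CVs are equal

def log_summary(group):
    """Apply the three-step log procedure and report SDs on both scales."""
    logs = [math.log10(x) for x in group]   # Step 1: take logs
    mean_log = mean(logs)                   # Step 2: analyse on the log scale
    back_mean = 10 ** mean_log              # Step 3: antilog of the result
    return {"sd_raw": stdev(group), "sd_log": stdev(logs), "back_mean": back_mean}

low_s = log_summary(low)
high_s = log_summary(high)

for name, s in (("low", low_s), ("high", high_s)):
    print(name, {k: round(v, 4) for k, v in s.items()})

# The back-transformed mean agrees with the geometric mean.
print(math.isclose(low_s["back_mean"], geometric_mean(low)))
```

On the raw scale the standard deviation of the second group is ten times that of the first, but on the log scale the two standard deviations agree, which is the situation described in (b). The back-transformed means are the geometric means of the groups.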

2. Squared Transformation: It is used when (a) the standard deviation decreases as the mean increases, (b) the data are negatively skewed.

Procedure:
Convert raw data into their squares by X' = X^2.

3. Reciprocal Transformation: It is used when standard deviation is proportional to the square of the mean.

Procedure:
Convert raw data into their reciprocals by X' = 1/X or X' = 1/(X + 1) (the second form avoids division by zero when the original data contain zeros).
{If you take reciprocals of the sample values (i.e., transform the sample), find the arithmetic mean of the reciprocals, and then retransform back to the original scale, the result is the sample harmonic mean, which is sometimes used to average rates.}
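A short Python sketch of the harmonic mean as an average of rates. The speeds are made up and are assumed to apply to two legs of equal distance.

```python
import math
from statistics import harmonic_mean, mean

speeds = [40.0, 60.0]  # km/h over two legs of equal distance (made-up)

reciprocals = [1 / s for s in speeds]   # transform: take reciprocals
back = 1 / mean(reciprocals)            # average, then transform back

print(back, harmonic_mean(speeds), mean(speeds))

# The back-transformed mean agrees with the library's harmonic mean.
agrees = math.isclose(back, harmonic_mean(speeds))
print(agrees)
```

Here the back-transformed value is 48 km/h, which is the true average speed for the whole trip (total distance divided by total time), whereas the arithmetic mean of 50 km/h overstates it.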