I am trying to write a loop in Stata so that I can create mean_var and median_var for multiple variables. However, this code isn't creating any new variables; the collapsed_file is the same as my original data.

use appended_data, clear

// Define variables for loop (Renamed)

local renamed_vars varA varB

// Speaker-Interview Level Averages and Medians
preserve
foreach var of local renamed_vars {
      collapse (mean) mean_`var' = `var' ///
                    (median) median_`var' = `var', ///
                     by(interview_id speaker_num)
}
tempfile collapsed_file
save `collapsed_file', replace
restore

Can someone help? Thanks!

1 Answer

There are so far as I can see several problems and misconceptions here beyond your question.

Presumably your observations are identified by interview_id speaker_num, so each group contains just one observation and collapse returns essentially the original data, which is the result you reported. I guess that you should specify interview_id only. The general idea is that you should tell by() the variable(s) that define the groups you want.

I have four comments on your code otherwise.

  • You could slim it down. The indirection of creating a local macro only to use its contents immediately afterwards serves no good purpose.

use appended_data, clear

// Speaker-Interview Level Averages and Medians
preserve
foreach var in varA varB {
      collapse (mean) mean_`var' = `var' ///
                    (median) median_`var' = `var', ///
                     by(interview_id speaker_num)
}
tempfile collapsed_file
save `collapsed_file', replace
restore

  • You are collapsing twice, so the second collapse works on the previously collapsed data, which no longer contain varB. For your intended purpose, you would I think need to read in the original data again before the second collapse. The restore comes too late to do that. Or (better) collapse both variables in the same command.

  • Creating a new file is not obviously helpful. One solution is to create new variables and tag just one observation in each group, something like this:

use appended_data, clear

// Speaker-Interview Level Averages and Medians of varA varB
foreach var in varA varB {
      egen mean_`var' = mean(`var'), by(interview_id)
      egen median_`var' = median(`var'), by(interview_id)
}

egen tag = tag(interview_id)

Now you can compare means and medians using the qualifier if tag, which selects just one observation for each interview_id.

  • Putting results in temporary files may fit some wider strategy, but you have to be very careful, because a temporary file is deleted when the do-file or program that created it finishes. You are usually better served by using a more permanent filename.
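
If you do want a collapsed dataset, the single-command collapse suggested in the second point could be sketched roughly as follows. This is only a sketch under assumptions: it guesses, as above, that interview_id alone defines the groups, and the local macro names meanlist and medlist are made up for illustration. The loop only builds the lists of newvar = oldvar pairs; collapse itself runs once.

```stata
use appended_data, clear

// Build the "newvar = oldvar" lists for collapse.
// meanlist and medlist are hypothetical names used only in this sketch.
local meanlist
local medlist
foreach var in varA varB {
      local meanlist `meanlist' mean_`var' = `var'
      local medlist `medlist' median_`var' = `var'
}

// One collapse for all variables.
// Assumption: interview_id alone identifies the groups wanted.
preserve
collapse (mean) `meanlist' (median) `medlist', by(interview_id)
tempfile collapsed_file
save `collapsed_file'
restore
```

After restore the original data are back in memory, and `collapsed_file' can be used or merged until the do-file ends. Adding another variable only means adding its name to the foreach list.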