Unable to collapse the data in loop and create variables [closed]

Question

Closed. This question needs debugging details. It is not currently accepting answers.

Closed 4 months ago.

I am trying to write a loop in Stata so that I can create mean_var and median_var for multiple variables. However, this code isn't creating any new variables; the collapsed_file is the same as my original data.

use appended_data, clear

// Define variables for loop (Renamed)

local renamed_vars varA varB

// Speaker-Interview Level Averages and Medians
preserve
foreach var of local renamed_vars {
      collapse (mean) mean_`var' = `var' ///
                    (median) median_`var' = `var', ///
                     by(interview_id speaker_num)
}
tempfile collapsed_file
save `collapsed_file', replace
restore

Can someone help? Thanks!

Cross-posted at statalist.org/forums/forum/general-stata-discussion/general/… In any forum, it is courteous to tell people about cross-posting.
– Nick Cox, Commented Jan 16 at 9:05
You have experience across several languages but need to ask more focused questions based on minimal reproducible examples. In this case that would mean creating a small toy dataset. See stackoverflow.com/help/minimal-reproducible-example
– Nick Cox, Commented Jan 16 at 9:48
While the OP has (here and elsewhere) received advice that should help, this is too confused a question to have conceivable long-term value for future readers.
– Nick Cox, Commented Jan 16 at 11:55

Nick Cox · Accepted Answer · 2025-01-16 10:18:44Z

There are, so far as I can see, several problems and misconceptions here beyond your question.

Presumably your observations are uniquely identified by interview_id and speaker_num together, so each group passed to collapse contains just one observation. The mean and median of a single value are that value itself, which is why the result looks the same as the original data. I guess that you should specify interview_id only. The general idea is that by() should name the variable(s) that define the groups you want summarized.
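To make that concrete, here is a minimal sketch on invented toy data (the values are made up; only the variable names come from the question). It assumes, as guessed above, that interview_id and speaker_num jointly identify observations, and checks that with isid before collapsing both ways.

```stata
* Toy data (hypothetical values): two interviews, two speakers each,
* so interview_id and speaker_num together identify every observation
clear
input interview_id speaker_num varA
1 1 10
1 2 20
2 1 30
2 2 50
end

* isid exits with an error unless these variables uniquely identify observations
isid interview_id speaker_num

preserve
* Every group is a single observation, so mean and median just repeat varA
collapse (mean) mean_varA = varA (median) median_varA = varA, ///
    by(interview_id speaker_num)
list, sepby(interview_id)
restore

* Grouping on interview_id alone gives one summary row per interview
collapse (mean) mean_varA = varA (median) median_varA = varA, by(interview_id)
list
```

If isid fails on your real data, the groups hold more than one observation and this explanation does not apply as it stands.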

I have four comments on your code otherwise.

You could slim it down. The indirection of creating a local macro only to use its contents immediately afterwards serves no good purpose.

use appended_data, clear

// Speaker-Interview Level Averages and Medians
preserve
foreach var in varA varB {
      collapse (mean) mean_`var' = `var' ///
                    (median) median_`var' = `var', ///
                     by(interview_id speaker_num)
}
tempfile collapsed_file
save `collapsed_file', replace
restore

You are collapsing twice, so the second collapse works on the previously collapsed file. By then that file holds only the by() variables and the new summaries of varA, so varB is no longer there to be summarized. For your intended purpose, you would I think need to read in the original data again. The restore comes too late to do that. Or (better) collapse both variables in the same command.

Creating a new file is not obviously helpful. One solution is to create new variables and tag just one observation in each group, something like this:

use appended_data, clear

// Speaker-Interview Level Averages and Medians of varA varB
foreach var in varA varB {
      egen mean_`var' = mean(`var'), by(interview_id)
      egen median_`var' = median(`var'), by(interview_id)
}

egen tag = tag(interview_id)

Now you can compare means and medians using just one observation per interview, by adding if tag to whatever command you use.
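As a rough illustration of that last step, the sketch below repeats the egen approach on invented toy data and then lists and summarizes with if tag. The data values and the variable diff_varA are hypothetical, introduced only for this example.

```stata
* Toy data (hypothetical values)
clear
input interview_id speaker_num varA varB
1 1 10 1
1 2 20 3
1 3 60 8
2 1 30 2
2 2 50 4
end

* Group means and medians are repeated on every observation of each interview
foreach var in varA varB {
    egen mean_`var' = mean(`var'), by(interview_id)
    egen median_`var' = median(`var'), by(interview_id)
}

* tag is 1 for exactly one observation per interview and 0 otherwise
egen tag = tag(interview_id)

* One row per interview, with the original data still in memory
list interview_id mean_varA median_varA mean_varB median_varB if tag, noobs

* One possible comparison: mean minus median, summarized across interviews
generate diff_varA = mean_varA - median_varA
summarize diff_varA if tag
```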

Putting results in temporary files may fit some wider strategy, but you have to be very careful: a tempfile is erased automatically when the do-file or program that created it finishes. You are usually better served by using a more permanent filename.
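If a collapsed file is what you want after all, the single-command collapse mentioned above could be sketched as follows. The loop only builds the argument lists, and collapse runs once. The toy data stand in for appended_data, and the local macro names and the filename interview_summaries are hypothetical choices for this example.

```stata
* Toy data standing in for appended_data (hypothetical values)
clear
input interview_id speaker_num varA varB
1 1 10 1
1 2 20 3
1 3 60 8
2 1 30 2
2 2 50 4
end

* Build the target = source pairs for each statistic; nothing is collapsed yet
local means
local medians
foreach var in varA varB {
    local means "`means' mean_`var' = `var'"
    local medians "`medians' median_`var' = `var'"
}

preserve
* A single collapse call handles every variable and both statistics
collapse (mean) `means' (median) `medians', by(interview_id)
list

* Save under an ordinary filename (hypothetical) so it outlasts the do-file
save interview_summaries, replace
restore
```

After restore the original data are back in memory, and the summaries can be brought in later with use or merge.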
