How to automatically run a section of code with different parameter combinations

Question

I'm attempting to write some code in order to do some analyses on ecological data. What I'm currently doing is calculating the diversity index for certain stretches of a stream. I have a data frame of information for 5 stream sections (Site), over three periods of time (Survey), and the diversity is calculated from the number (Count) of each species (Species) we caught at whatever site during whatever survey.

This is the code I'm currently running.

library(tidyverse)
library(vegan)

# SHANNON DIVERSITY ----

# Species count by survey and site
count_sum_norun <- df %>% 
  group_by(Survey, Site, Species) %>%
  summarise(Count = sum(Count))%>%
  ungroup

## SPECIFY survey and site ---- 
sp_count <- count_sum_norun %>%
  filter(Survey == 1, Site == "B")

## Calculate Diversity ----
shannon_diversity_vegan <- diversity(sp_count$Count, index="shannon")

This is an example of my starting dataframe:

   Survey Site    Run Species Count
    <dbl> <chr> <dbl> <chr>   <dbl>
 1      1 A         1 rbt         1
 2      1 A         1 rbt         1
 3      1 A         1 rbt         1
 4      1 A         1 rbt         1
 5      1 A         1 rbt         1
 6      1 A         1 rbt         1
 7      1 A         1 rbt         1
 8      1 A         1 rbt         1
 9      1 A         1 rbt         1
10      1 A         1 rbt         1
# ℹ 1,963 more rows

This is the kind of dataframe that first chunk of code gives me:

A tibble: 92 × 4
   Survey Site  Species Count
    <dbl> <chr> <chr>   <dbl>
 1      1 A     brt         1
 2      1 A     cmm         1
 3      1 A     lnd         1
 4      1 A     rbt        95
 5      1 A     scul       92
 6      1 A     ws          2
 7      1 B     bnm         1
 8      1 B     brt         6
 9      1 B     rbt        95
10      1 B     scul       35
# ℹ 82 more rows

This is the kind of dataframe I get from the second chunk of code, which specifies the section of stream, and the time period:

 A tibble: 7 × 4
  Survey Site  Species Count
   <dbl> <chr> <chr>   <dbl>
1      3 B     brt       294
2      3 B     bsb         2
3      3 B     coho      176
4      3 B     rbt       381
5      3 B     scul      327
6      3 B     wbnd        1
7      3 B     ws          5

This is the kind of output I get when calculating the diversity from the selected data above:

shannon_diversity_vegan
[1] 1.388683

All of the above is correct, and runs great.

But I have further analyses to do with these diversity values, and I'd like a dataframe with one diversity value for every survey and site combination, so I can easily plot it against other data points I have.

Something that looks like this ideally, but with that last column being my Diversity values:

 Survey Site  Richness
    <dbl> <chr>   <int>
 1      1 A           6
 2      1 B           5
 3      1 C           4
 4      1 D           3
 5      2 A          11
 6      2 B           7
 7      2 C           6
 8      2 D           3
 9      2 E           9
10      3 A          13
11      3 B           7
12      3 D           5
13      3 E          13

So my question is, is there a straightforward way to run that second chunk of code I have

## SPECIFY survey and site ---- 
sp_count <- count_sum_norun %>%
  filter(Survey == 1, Site == "B")

over and over with the different combinations of Survey and Site filtered, so I can collect the results in a nifty little dataframe to work with? Right now I'm manually changing the parameters and copy-pasting each diversity value into Excel, which is super annoying and inefficient.

It seems like there should be a relatively easy way to do this, but I'm very new to R and don't know where to start to figure it out.

EDIT: I've been told that having a better dataframe in here as an example would be helpful (sorry, I did not know how to create a dataframe that would work with my code, but I've puzzled it out), so I think this code will be able to be fully run as a simplified example with some random numbers:

library(tidyverse)
library(vegan)

set.seed(1)     

df <- data.frame(
  Species = rep(c("brt", "rbt", "scul"), times = 15),
  Survey = rep(c(1,2,3), each = 15),
  Site = rep(c("a","b","c","d","e"), times = 3, each = 3),
  Count = sample(1:15)
)

## SPECIFY survey and site ---- 
sp_count_pe <- df %>%
  filter(Survey == 1, Site == "b")


## Calculate Diversity ----
shannon_diversity_vegan <- diversity(sp_count_pe$Count, index="shannon")

So it's still this code section that I'm trying to run automatically for all the different site and survey combos, without having to do it manually:

## SPECIFY survey and site ---- 
sp_count_pe <- df %>%
  filter(Survey == 1, Site == "b")

with the goal of collecting the diversity value from each combination in a data frame.

It's easier to help you if you include a simple reproducible example with sample input and desired output that can be used to test and verify possible solutions. Share data with dput() or generate some sample data in the question itself so we can run the code and test it ourselves.
– MrFlick, Commented Nov 18 at 18:54
Maybe count_sum_norun %>% dplyr::mutate(Diversity = diversity(Count, index="shannon"), .by=c(Survey, Site)) does what you want?
– MrFlick, Commented Nov 18 at 19:00
@MrFlick Thank you, I've edited my post and added modified code at the end so hopefully it can be run like you asked. I'll try your suggestion to see if it helps any.
– Ray, Commented Nov 18 at 20:37
If you use sample() to set up example data, make sure to use set.seed() to make it reproducible.
– Friede, Commented Nov 18 at 20:49

Friede · Accepted Answer · 2025-11-18 22:27:44Z

This is a grouped aggregation: compute the index once for each Survey and Site combination present in the data.

# base R
> aggregate(cbind(div_index=Count)~Survey+Site, X,  
+           vegan::diversity, index='shannon')
  Survey Site div_index
1      1    A 0.9002561
2      2    A 1.0114043
3      3    A 0.6365142
4      1    B 0.6829081
5      2    B 0.5004024
6      3    B 0.5982696
7      1    C 0.9002561
8      2    C 0.6730117
9      3    D 0.0000000
> 
> # dplyr
> dplyr::summarise(X, div_index=vegan::diversity(Count, index='shannon'), 
+                  .by=c(Survey, Site))
# A tibble: 9 × 3
  Survey Site div_index
   <dbl> <chr>    <dbl>
1      1 A        0.900
2      1 B        0.683
3      1 C        0.900
4      2 A        1.01 
5      2 B        0.500
6      2 C        0.673
7      3 A        0.637
8      3 B        0.598
9      3 D        0
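
To see what each div_index value represents, here is a minimal sketch that recomputes the Survey 1, Site A value by hand. It assumes the usual Shannon formula H = -sum(p * log(p)) with natural logarithms, which is the default base in vegan::diversity(). The name shannon_by_hand is a hypothetical helper for illustration, not part of vegan.

```r
# Hypothetical helper: Shannon index H = -sum(p * log(p)) with natural log,
# matching the default base of vegan::diversity(index = "shannon").
shannon_by_hand <- function(counts) {
  counts <- counts[counts > 0]   # zero counts contribute nothing to the sum
  p <- counts / sum(counts)      # proportion of each species in the group
  -sum(p * log(p))
}

# Counts for Survey 1, Site A in the constructed data (5, 2, 1)
h_1A <- shannon_by_hand(c(5, 2, 1))
round(h_1A, 7)
```

The rounded result agrees with the 0.9002561 reported for Survey 1, Site A in the base R output above.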

Constructed Data

X = tibble::tribble(
  ~Survey, ~Site, ~Species, ~Count,
  1, "A", "rbt", 5,
  1, "A", "bnc", 2,
  1, "A", "sts", 1,
  1, "B", "rbt", 3,
  1, "B", "scu", 4,
  1, "C", "rbt", 1,
  1, "C", "scb", 2,
  1, "C", "sts", 5,
  2, "A", "rbt", 2,
  2, "A", "bnc", 1,
  2, "A", "scu", 3,
  2, "B", "rbt", 4,
  2, "B", "scb", 1,
  2, "C", "sts", 3,
  2, "C", "scu", 2,
  3, "A", "rbt", 1,
  3, "A", "bnc", 2,
  3, "B", "rbt", 2,
  3, "B", "scb", 5,
  3, "D", "sts", 3
)
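
The constructed data has one row per species within each Survey and Site. The original data frame in the question appears to have one row per fish (Count of 1, with the same species repeated), so the counts would need to be summed per species before the diversity is computed. The sketch below shows that two-step version under this assumption. The raw data frame and its values are made up for illustration, and the .by argument assumes dplyr 1.1.0 or later.

```r
library(dplyr)
library(vegan)

# Hypothetical raw data: one row per fish caught, so a species can repeat
# within a Survey/Site group (as in the question's starting data frame).
raw <- tibble::tribble(
  ~Survey, ~Site, ~Species, ~Count,
  1, "A", "rbt",  1,
  1, "A", "rbt",  1,
  1, "A", "rbt",  1,
  1, "A", "brt",  1,
  1, "B", "rbt",  1,
  1, "B", "scul", 1
)

div_by_group <- raw %>%
  # Step 1: total count per species within each Survey/Site
  summarise(Count = sum(Count), .by = c(Survey, Site, Species)) %>%
  # Step 2: one Shannon value per Survey/Site, computed from those totals
  summarise(div_index = diversity(Count, index = "shannon"),
            .by = c(Survey, Site))

div_by_group
```

Skipping step 1 would treat every row as its own species: Site A would come out as log(4), about 1.386, instead of about 0.562 for the summed counts 3 and 1.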

(If this is just a simple aggregation question, it should be a duplicate. The question is about an hour old and has no dupe votes.)
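
For comparison, the literal version of what the question asks for (running the filter step once per combination) could look roughly like the sketch below. It wraps the filter-then-diversity step in a hypothetical helper, diversity_for, and calls it for every Survey and Site pair that occurs in the data. The counts data frame and its values are made up for illustration.

```r
library(vegan)

# Small stand-in data set (hypothetical values), one row per species per group
counts <- data.frame(
  Survey  = c(1, 1, 1, 1, 2, 2),
  Site    = c("A", "A", "B", "B", "A", "A"),
  Species = c("rbt", "brt", "rbt", "scul", "rbt", "brt"),
  Count   = c(3, 1, 1, 1, 2, 2)
)

# Hypothetical helper wrapping the question's filter-then-diversity step
diversity_for <- function(data, survey, site) {
  sub <- data[data$Survey == survey & data$Site == site, ]
  diversity(sub$Count, index = "shannon")
}

# Only the Survey/Site combinations that actually occur in the data
combos <- unique(counts[c("Survey", "Site")])

# Call the helper once per row of `combos` and store the result alongside
combos$div_index <- mapply(
  diversity_for,
  survey = combos$Survey,
  site   = combos$Site,
  MoreArgs = list(data = counts)
)

combos
```

This gives the same kind of table as the grouped summaries, with more code. The grouped versions are usually the simpler choice, while the helper pattern can be handy when the per-combination step is too involved to fit in a single summarise() call.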

edited Nov 18 at 22:27

answered Nov 18 at 19:10

Friede


1 Comment

Ray Nov 19 at 15:25

This worked for me, thank you!
