I'm currently teaching myself R, and I'm trying to write code that gives me a population estimate for different fish species in different stretches of a stream (Site) during different periods of time (Survey). The code I'm using to do so is this:
library(tidyverse)
library(FSA)
set.seed(1)
#random sample dataframe
df <- data.frame(
  Survey  = rep(c(1, 2, 3), each = 45),
  Site    = rep(c("a", "b", "c", "d", "e"), times = 3, each = 9),
  Run     = rep(c(1, 2, 3), 15, each = 3),
  Species = rep(c("brt", "rbt", "scul"), times = 45),
  Count   = sample(1:15))
# get number of species caught at each run per survey per site
count_summary <- df %>%
  group_by(Survey, Site, Run, Species) %>%
  summarise(n = sum(Count)) %>%
  rename("Count" = "n") %>%
  ungroup()
# SPECIFY catch data
sp_count <- count_summary %>%
filter(Species == "scul", Survey == 3, Site =="b")
# Make catch a vector so removal() will run
catch <- as.vector(sp_count$Count)
# Calculate population estimate
pop_est <- removal(catch)
# pulls the estimates (pop_est$est, a named numeric vector), turns them into
# a data frame, transposes so they form one row, and makes that a data frame again
df_pop_est <- as.data.frame(t(as.data.frame(pop_est$est)))
# keeps only the relevant numbers (pop estimate, upper CI, lower CI)
pe_ci <- df_pop_est %>%
  select(No, No.UCI, No.LCI) %>%
  rename(PE = No, LCI = No.LCI, UCI = No.UCI)
pe_ci
This works fairly well and returns a population estimate along with the upper and lower confidence limits for the estimate:
PE UCI LCI
pop_est$est 55 142.1708 -32.17078
However, I need to be able to compare these numbers across different sites, surveys, and species. I would like to be able to automatically run all parameter combinations, instead of having to manually change the parameters over and over in here:
# SPECIFY catch data
sp_count <- count_summary %>%
filter(Species == "scul", Survey == 3, Site =="b")
so I can just run it once and have a data frame of PE, UCI, LCI, for every site, survey, and species combination.
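One possible shape for this, as a minimal sketch rather than a definitive solution: group the summarised counts by Survey, Site, and Species and call removal() once per group with dplyr's group_modify(). The sketch assumes every combination has its catches in Run order and that removal() succeeds for each one; the name all_est is just an illustrative choice.

```r
library(dplyr)
library(FSA)

set.seed(1)
# Same random sample data as in the question
df <- data.frame(
  Survey  = rep(c(1, 2, 3), each = 45),
  Site    = rep(c("a", "b", "c", "d", "e"), times = 3, each = 9),
  Run     = rep(c(1, 2, 3), 15, each = 3),
  Species = rep(c("brt", "rbt", "scul"), times = 45),
  Count   = sample(1:15)
)

# Total catch per run, per survey, per site, per species
count_summary <- df %>%
  group_by(Survey, Site, Run, Species) %>%
  summarise(Count = sum(Count), .groups = "drop")

# One removal() call per Survey/Site/Species combination.
# Inside group_modify(), .x holds the rows of the current group.
all_est <- count_summary %>%
  arrange(Survey, Site, Species, Run) %>%  # catches must be in pass order
  group_by(Survey, Site, Species) %>%
  group_modify(~ {
    est <- removal(.x$Count)$est           # named numeric vector
    data.frame(PE  = est[["No"]],
               LCI = est[["No.LCI"]],
               UCI = est[["No.UCI"]])
  }) %>%
  ungroup()

all_est
```

A single combination can then be looked up with something like filter(all_est, Survey == 3, Site == "b", Species == "scul").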
I've asked a similar question on here and was able to figure out how to do something like this for some much simpler code. But for this I have vectors and lists and dataframes, as well as whatever this is:
pop_est
$est
No No.se No.LCI No.UCI p p.se p.LCI p.UCI
55.0000000 44.4757062 -32.1707823 142.1707823 0.1942446 0.1949422 -0.1878351 0.5763243
$int
k T X
3 27 26
$CS.prior
alpha beta
1 1
$CS.se
[1] "Zippin"
$catch
[1] 6 14 7
$lbl
[1] "Carle & Strub (1978) K-Pass Removal Method"
$method
[1] "CarleStrub"
$conf.level
[1] 0.95
attr(,"class")
[1] "removal"
all playing together. I struggle to understand the differences between them and how to work with them to begin with, so I'm tearing my hair out trying to figure out how to get all of that to run repeatedly for different parameters while also putting the outputs in a data frame.
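As a rough orientation, judging from the printout above, the object returned by removal() appears to be an ordinary list with a class label attached, and its est element is a named numeric vector. The sketch below pokes at those pieces using the catch values 6, 14, 7 shown under $catch; it assumes the default method (Carle & Strub), as in the output above.

```r
library(FSA)

catch <- c(6, 14, 7)        # the three-pass catch shown under $catch
pop_est <- removal(catch)   # default method, as in the printout above

class(pop_est)              # the class label attached to the list
is.list(pop_est)            # underneath, it is a plain list
names(pop_est)              # the list elements ($est, $int, ...)

est <- pop_est$est          # a named numeric vector, not a data frame
est[c("No", "No.LCI", "No.UCI")]   # pick elements of the vector by name

# Build a one-row data frame directly from the named elements
pe_ci <- data.frame(PE  = est[["No"]],
                    LCI = est[["No.LCI"]],
                    UCI = est[["No.UCI"]])
pe_ci
```

Indexing the vector by name avoids the transpose step, because each value goes straight into its own column.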
Something like this is ultimately what I'm trying for, though I don't know if it's possible to get it looking like this or not:
Survey  Site  Species  PE  UCI       LCI
3       b     scul     55  142.1708  -32.17078
3       a     scul     x   y         z
2       a     rbt      x   y         z
1       e     brt      x   y         z
1       d     scul     x   y         z
I've googled how to loop things, since that sounded like it might be what I'm trying to do, but that didn't clear anything up for me, and I'm not sure it's even what I'm looking for.
I've also tried figuring out how to make functions to see if that could help, but that didn't go well and I don't know if it would have helped me anyway.
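For reference, a function and a loop can be combined roughly as follows. This is only a sketch under stated assumptions: estimate_one() is a hypothetical helper name, toy is a small made-up data frame (its site "b" rows reuse the catches 6, 14, 7 from above), and each group is assumed to hold one row per run.

```r
library(FSA)

# Hypothetical helper: takes the rows for ONE Survey/Site/Species
# combination and returns a one-row data frame of estimates
estimate_one <- function(d) {
  d <- d[order(d$Run), ]             # make sure catches are in pass order
  est <- removal(d$Count)$est        # named numeric vector of estimates
  data.frame(Survey  = d$Survey[1],
             Site    = d$Site[1],
             Species = d$Species[1],
             PE      = est[["No"]],
             LCI     = est[["No.LCI"]],
             UCI     = est[["No.UCI"]])
}

# Made-up example data: two sites, one survey, one species, three runs each
toy <- data.frame(
  Survey  = 3,
  Site    = rep(c("a", "b"), each = 3),
  Run     = rep(1:3, times = 2),
  Species = "scul",
  Count   = c(12, 7, 3, 6, 14, 7)
)

# Split into one data frame per combination, then loop over the pieces
groups <- split(toy, list(toy$Survey, toy$Site, toy$Species), drop = TRUE)
pieces <- vector("list", length(groups))
for (i in seq_along(groups)) {
  pieces[[i]] <- estimate_one(groups[[i]])
}

# Stack the one-row results into a single data frame
results <- do.call(rbind, pieces)
results
```

The loop could equally be written as lapply(groups, estimate_one); the explicit for loop is used here only because it makes the repetition visible.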
Any thoughts are appreciated, even if it's just giving me something to look up that will help me better understand what I'm trying to do.
Where is removal() exported from?
aggregate(Count ~ Survey + Site + Species, count_summary, \(x) removal(x)$est).