I'm working on a proof of concept to replace a Samba file share with vsftpd (over TLS). This is not a public file store, so security is essential. Clients will only write to this location, and the server will then pick up and process the files.
But I ran into some issues when I started load testing the FTP part using JMeter. In the load test scenario, each user connects and logs in with a local account, uploads a small file, logs out, and waits one second.
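For anyone who wants to reproduce the scenario without JMeter, here is a minimal sketch of one simulated user using Python's standard `ftplib`. It is not my actual JMeter test plan. The host name, password, file name and payload are hypothetical placeholders, and the `client_factory` parameter is only there so the FTP client can be swapped out.

```python
import io
import time
from ftplib import FTP_TLS


def upload_once(host, user, password, filename, payload, client_factory=FTP_TLS):
    """One iteration of the scenario: connect, log in, upload a small file, log out."""
    ftp = client_factory()
    ftp.connect(host, 21)
    ftp.login(user, password)  # FTP_TLS secures the control channel before sending credentials
    ftp.prot_p()               # switch the data channel to TLS as well
    ftp.storbinary("STOR " + filename, io.BytesIO(payload))
    ftp.quit()


def run_user(iterations, pause=1.0, **upload_args):
    """Repeat the scenario, waiting `pause` seconds between iterations."""
    for _ in range(iterations):
        upload_once(**upload_args)
        time.sleep(pause)


# Hypothetical usage (host and password are placeholders, not real values):
# run_user(10, host="ftp.example.test", user="citsl_ftp",
#          password="hypothetical-password", filename="probe.txt", payload=b"hello")
```

Running a few hundred of these in parallel threads should roughly approximate the load described above, since every iteration triggers a fresh login and therefore a fresh password check.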
It works fine up to around 300 users. Beyond that point, waves of unix_chkpwd processes start causing massive CPU spikes every couple of seconds, which turn into a constant load after a minute or so. The load average sits between 60 and 80 on a system with 16 logical cores. I feel like decrypting and reading the shadow file shouldn't be this much of a burden.
Currently /etc/pam.d/vsftpd is configured as:
auth required pam_succeed_if.so user = citsl_ftp
auth required pam_unix.so
account sufficient pam_permit.so
Enabling either anonymous login, or editing /etc/pam.d/vsftpd to not auth with the pam_unix.so module "fixes" the issue. I made sure the number of processes didn't cap out: ps -A --no-header | wc -l showed only around 400-800 processes. But I also noticed that the PIDs rose quickly and wrapped around every 10 seconds or so (which looks a bit worrying, but I don't know enough about Linux to tell whether that is an actual problem).
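To put a number on the PID wrap-around, here is a back-of-the-envelope calculation. It assumes the common default PID limit of 32768; I have not confirmed the actual value on this machine, which can be read from /proc/sys/kernel/pid_max. It also assumes that every new process or thread consumes one PID.

```python
def forks_per_second(pid_max, wrap_seconds):
    """Estimate the process creation rate from how fast the PID counter wraps."""
    return pid_max / wrap_seconds


# Assumption: default limit; check /proc/sys/kernel/pid_max for the real value.
ASSUMED_PID_MAX = 32768

rate = forks_per_second(ASSUMED_PID_MAX, 10)
print(f"about {rate:.0f} new processes per second")
```

Under those assumptions the box is creating on the order of a few thousand short-lived processes per second, far more than the 400-800 that are alive at any one moment.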
I'm learning a lot of this as I go, so please tell me if I'm a fool and FTP is the wrong tool for a private write-only file store. But is unix_chkpwd behaving as you'd expect in this scenario, or is there something I've missed? Is using a password-protected local user the wrong way to go about it, and how vulnerable would I make myself if I used a highly restricted anonymous login instead?
Edit: As TooTea suspected, the hashing algorithm of the password was the culprit. Changing it from SHA-512 to MD5 brought the average CPU usage down to around 9. It is still a bit high for my liking considering the low load I'm putting on the server, but it gives me options.
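A rough way to see why the scheme matters: SHA-512 crypt runs 5000 rounds by default, while MD5 crypt uses a fixed 1000. The sketch below is only a crude stand-in that chains plain `hashlib` digests that many times. It is not the real crypt(3) algorithm, and the password is a hypothetical placeholder, so treat the timings as an illustration of relative cost rather than a measurement of unix_chkpwd.

```python
import hashlib
import time


def chained_digest_seconds(algorithm, rounds, password=b"hypothetical-password"):
    """Time `rounds` chained digests: a stand-in for key stretching, not crypt(3)."""
    start = time.perf_counter()
    digest = password
    for _ in range(rounds):
        # Feed the previous digest back in so each round depends on the last one.
        digest = hashlib.new(algorithm, digest + password).digest()
    return time.perf_counter() - start


sha512_cost = chained_digest_seconds("sha512", 5000)  # default round count of SHA-512 crypt
md5_cost = chained_digest_seconds("md5", 1000)        # fixed round count of MD5 crypt
print(f"sha512 x5000: {sha512_cost * 1000:.2f} ms, md5 x1000: {md5_cost * 1000:.2f} ms")
```

Whatever the per-login cost turns out to be, it is paid on every single login, so with hundreds of users reconnecting every second it adds up quickly.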
