I have some software written in Go that runs 500-3000 tasks. Each task does little computation and a lot of IO (HTTP requests). Each task needs its own unique transport (which may occasionally change) and its own cookie jar for its HTTP client, set up as shown below. Each task must use its own API responses and cannot share them with other tasks. My goal is to optimize for speed (every task should run as concurrently and as fast as possible). I know the bottleneck is mainly network latency.

My current approach launches every task in its own goroutine, and each task creates its own HTTP client. How can I optimize this setup? Should I use the fasthttp package, or is there another feasible way to speed up the IO?

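// Per-task transport that routes requests through the task's own authenticated proxy.
transport := &http.Transport{
    Proxy: http.ProxyURL(&url.URL{
        Scheme: "http",
        User:   url.UserPassword(task.Proxy.Username, task.Proxy.Password),
        Host:   task.Proxy.IP + ":" + strconv.Itoa(task.Proxy.Port),
    }),
}

// Per-task client; it is not shared with any other task.
client := &http.Client{Transport: transport}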
  • Reuse of clients has a purpose; it's not an end goal in itself. If your (rather strange) setup requires new proxies (and thus new TCP connections), and thus new Transports, and thus new Clients, then so be it. Try to understand why clients should be reused and why your case doesn't fit any normal HTTP behaviour. "It seems like there's a bunch of overhead with this." Of course, not only a bunch, a shitload. "Or is there a better way to do this?" Of course not. Unless there is something in your rather strange setup that could be exploited. Commented Nov 21, 2024 at 6:59
  • @Volker thanks. I want every task to run as quickly and concurrently as possible, but 500+ goroutines don't seem optimal (I'm not completely sure, but a huge number of threads seems like a lot of OS scheduling work). Maybe a smaller worker pool that handles x tasks would be better. Maybe I can exploit this by giving each worker its own client and reassigning the transport every time a worker picks up a new task. Commented Nov 21, 2024 at 7:27
  • You seem to struggle with concepts like "concurrent" and "parallel". Note that 500 goroutines is basically "nothing" for Go. Your worry about OS threads is somewhat strange, as the bottleneck of all your work is probably somewhere in the network (TCP handshake, proxy authentication, HTTP requests, maybe even port numbers and sockets) and not the CPU. All this hints at some fundamental misunderstanding of a) the problem space and b) how things actually work in computers. Commented Nov 21, 2024 at 7:56
