2

I have this class:

class Socket
{
    [...]
    pollfd* fdPtr; 
};

And I have a Manager class that creates the actual pollfd objects and adds them to a std::vector named pollFdsVec, to then poll on them. Meanwhile, the sockets themselves are kept in another std::vector named socketsVec.

When a socket is created (accepted) the fdPtr is set like this:

class Manager
{
    std::vector<pollfd> pollFdsVec
    std::vector<std::shared_ptr<Socket>> socketsVec;

    void Accept()
    {
        // When a new connection is created
        std::shared_ptr<Socket> sock = Accept(...); // listener returns a sharedptr
        socketsVec.push_back(sock);

        // Add the new connection to the pfds
        pollfd newPfd;
        [...]
        m_pollFdsVec.push_back(newPfd);

        // Set the pointer
        sock->fdPtr = &pollFdsVec[pollFdsVec.size() - 1]
    }

    void Poll()
    {
        int res = poll(pollFdsVec);
        [...]
    }
};

This works well, because I've made sure to call pollFdsVec.reserve(MAX_CLIENTS_CONECTED) during initialization, and because I make sure to never call push_back on the vector if the reserved size is reached, so that a relocation cannot happen.

However, I need to be able to delete elements from pollFdsVec when connections are dropped, and with this current design I'm facing the issue where after a pollFdsVec.erase() all the other fdPtr pointers lose meaning. Strangely enough, I don't get a crash, but I believe the issues I do get subsequently from the deletion come from this.

So, to fix this, at some point I collect the indexes of the elements inside pollFdsVec that I want to delete, and run this at the end of a poll cycle:

if (toRemove.size() > 0)
{
    // Remove socket from list and fds, in reverse order to avoid index shift
    std::sort(toRemove.begin(), toRemove.end(), std::greater<int>());
    for (int idx : toRemove)
    {
        std::swap(pollFdsVec[idx], pollFdsVec.back());
        std::swap(socketsVec[idx], socketsVec.back());

        // Update the socket that was sitting at the back and suffered the swap
        socketsVec[idx]->SetPfd(&m_poll_fds[idx]);

        m_poll_fds.pop_back();
        m_list.pop_back();
    }
}

But, this is my first time doing something like this. Even though it seems to work, I'm wondering if this is actually safe, and if some of you more experienced developers have ever seen situations where this could lead to potential issue in the future.

This polling cycle happens in a single thread, and yes I need each Socket to know its pollfd so I can dynamically adjust it to listen to POLLIN and/or POLLOUT events.

I've thought about saving and fixing indexes instead of a pollfd*`, but in that way I'd have to update (in the worst case) ALL the connections for every single removal, which seems less efficient.

3
  • [OT] pollFdsVec[pollFdsVec.size() - 1] can be replaced by pollFdsVec.back(). Commented Jul 17 at 18:26
  • I'm having trouble reading this question. Say you have FDs 1,2,3 And they are conveniently mapped to sockets 1,2,3. Socket 2 closes and is to be removed. std::swap(pollFdsVec[idx], pollFdsVec.back()); runs so now the FD list is 1,3,2 . Socket 3's FD is now at the wrong address. Commented Jul 17 at 18:49
  • When I want a fixed size array, I usually use std::array... I'd probably subclass pollfd to initialize with fd=-1, so all the array elements end up "empty". Commented Jul 18 at 16:41

2 Answers 2

7

Since your pollFdsVec is reserved to a fixed size, you don't really need to remove any pollfds from it at all. Simply set the pollfd::fd field to -1 when the corresponding Socket is closed, and then reassign the pollfd::fd when a new Socket is created. poll() ignores pollfds that have fd=-1:

https://man7.org/linux/man-pages/man2/poll.2.html

The field fd contains a file descriptor for an open file. If this field is negative, then the corresponding events field is ignored and the revents field returns zero. (This provides an easy way of ignoring a file descriptor for a single poll() call: simply set the fd field to its bitwise complement.)

Try this:

class Socket
{
    int fd;
    pollfd* pfdPtr;
    // ...
};

class Manager
{
    std::vector<pollfd> pollFdsVec;
    std::vector<std::shared_ptr<Socket>> socketsVec;

    Manager()
    {
        pollFdsVec.reserve(MAX_CLIENTS_CONECTED);
    }

    void Accept()
    {
        // When a new connection is created
        std::shared_ptr<Socket> sock = Accept(...); // listener returns a shared ptr

        // Add the new connection to the pfds
        auto pfdIter = std::find(pollFdsVec.begin(), pollFdsVec.end(),
           [](const pollfd &pfd){ return pfd.fd == -1; });
        if (pfdIter == pollFdsVec.end()) {
            pollFdsVec.emplace_back();
            pfdIter = --pollFdsVec.end();
        }

        pollfd &pfd = *pfdIter;
        pfd.fd = sock->fd;
        pfd.events = ...;
        pfd.revents = 0;

        // Set the pointer
        sock->pfdPtr = &pfd;

        socketsVec.push_back(sock);
    }

    void Poll()
    {
        int res = poll(pollFdsVec.data(), pollFdsVec.size(), 0);
        for (int i = 0; i < res; ++i)
        {
            // use pollFdsVec[i] as needed ...
        }
    }
};

...

// when a Socket is closed...
sock->fdPtr->fd = -1;

That being said, pollfd is a very light-weight type to copy, so you could simply get rid of the pollFdsVec vector altogether and store each pollfd directly inside its owning Socket object. poll() doesn't care where the array comes from, so its very trivial to create a new array each time poll() is called. This way, you don't have to worry about managing any pollfd* pointers at all.

Try this:

class Socket
{
    int fd;
    pollfd pfd;
    // ...
};

class Manager
{
    std::vector<std::shared_ptr<Socket>> socketsVec;

    void Accept()
    {
        // When a new connection is created
        std::shared_ptr<Socket> sock = Accept(...); // listener returns a shared ptr
        sock->pfd.fd = sock->fd;
        sock->pfd.events = ...;
        sock->pfd.revents = 0;

        socketsVec.push_back(sock);
    }

    void Poll()
    {
        pollfd pollFds[MAX_CLIENTS_CONECTED];
        nfds_t nfds = 0;

        for(auto &sock : socketsVec)
            pollFds[nfds++] = sock->pfd;

        int res = poll(pollFds, nfds, 0);
        for (int i = 0; i < res; ++i)
        {
            // use pollFds[i] as needed ...
        }
    }
};

Alternatively, you could use epoll() instead and get rid of pollfd completely, eg:

class Socket
{
    int fd;
    // ...
};

class Manager
{
    std::vector<std::shared_ptr<Socket>> socketsVec;
    int epfd;

    Manager()
    {
        epfd = epoll_create(MAX_CLIENTS_CONECTED);
        if (epfd == -1)
            throw ...;
    }

    ~Manager()
    {
         close(epfd);
    }

    void Accept()
    {
        // When a new connection is created
        std::shared_ptr<Socket> sock = Accept(...); // listener returns a shared ptr

        // Add the new connection to the epfd
        epoll_event event = {};
        event.events = ...;
        event.data = ...;

        if (epoll_ctl(epfd, EPOLL_CTL_ADD, sock->fd, &event) < 0)
            throw ...;

        socketsVec.push_back(sock);

        // ...
    }

    void Poll()
    {
        epoll_event events[MAX_CLIENTS_CONECTED];
        int res = epoll_wait(epfd, events, MAX_CLIENTS_CONECTED, 0);
        for(int i = 0; i < res; ++i)
        {
            // use events[i] as needed ...
        }
    }
};

The downside to this approach is that each time you want to flip the events to poll for a given Socket, you would have to call epoll_ctl(EPOLL_CTL_MOD) to update the epfd's interest list. You don't have to do that with poll() since you pass it an updated pollfd array each time you call it.

Sign up to request clarification or add additional context in comments.

1 Comment

Thank you so much for these! I really appreciate your insights. They make much more sense than my original design.
4

You are looking for rules for iterator invalidation, specifically, for std::vector::erase:

Iterators (including the end() iterator) and references to the elements at or after the point of the erase are invalidated.

While for std::vector::pop_back:

Iterators and references to the last element are invalidated. The end() iterator is also invalidated.

Hence, it is safe to store pointers to elements in the vector and use them after calling pop_back. Only pointers to the last element are invalidated.


For the sake of completeness, I also mention std::vector::push_back:

If after the operation the new size() is greater than old capacity() a reallocation takes place, in which case all iterators (including the end() iterator) and all references to the elements are invalidated. Otherwise only the end() iterator is invalidated.

Hence, your approach to reserve enough capacity upfront is sufficient to ensure push_back does not invalidate pointers to elements.


You can consider to use a std::list which is often inferior to std::vector performance wise, but it allows insertion without invalidating iterators and even std::list::erase only invalidates iterators to the erased element.

1 Comment

Although std::list() is ideal for fast insertions and removals, it can't be used as the backing array for poll() like std::vector can be.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.