4

I have an application, written in C, in which various entities communicate with each other via sockets. When an entity sends a long message to another entity, the recv() function may read this message in parts. Therefore, I have to reconstruct the message on the recipient side by appending all the received parts.

My question is a general socket programming question related to recv(). How does recv() know when a message has been fully read? Should I terminate a message with a special character like "\n"? Or should I send the size of the message as a header? What is the common practice?

3 Answers

8

As you've noticed, with stream sockets there is no built-in notion of message boundaries. You need to build some way of determining the end-of-message into your application-level protocol.

Both of the options you've suggested are common: either a length prefix (starting each message with the length of the message) or an end-of-message delimiter (which might just be a newline in a text-based protocol, for example). A third, lesser-used, option is to mandate a fixed size for each message. Combinations of these options are also possible - for example, a fixed-size header that includes a length value.
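As a rough illustration of the length-prefix option, here is a minimal receiver-side sketch. It assumes the sender writes a 4-byte big-endian length followed by the payload; the names recv_all, recv_msg, and MAX_MSG_LEN are hypothetical and not part of any library, and the sketch does not retry on EINTR.

```c
#include <arpa/inet.h>   /* ntohl */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/types.h>

#define MAX_MSG_LEN (1u << 20)   /* hypothetical sanity cap on message size */

/* Hypothetical helper: call recv() repeatedly until exactly len bytes
 * have arrived. Returns 0 on success, -1 on error or early close. */
int recv_all(int fd, void *buf, size_t len)
{
    unsigned char *p = buf;
    while (len > 0) {
        ssize_t n = recv(fd, p, len, 0);
        if (n <= 0)
            return -1;           /* 0 = peer closed, < 0 = error */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: read one length-prefixed message into a freshly
 * allocated buffer (the caller frees it). Returns 0 on success. */
int recv_msg(int fd, unsigned char **out, uint32_t *out_len)
{
    uint32_t hdr;
    assert(out != NULL && out_len != NULL);

    /* The header itself may arrive in pieces, so read it with recv_all. */
    if (recv_all(fd, &hdr, sizeof hdr) != 0)
        return -1;

    uint32_t len = ntohl(hdr);
    if (len > MAX_MSG_LEN)
        return -1;               /* refuse absurd lengths */

    unsigned char *buf = malloc(len ? len : 1);
    if (buf == NULL)
        return -1;
    if (recv_all(fd, buf, len) != 0) {
        free(buf);
        return -1;
    }
    *out = buf;
    *out_len = len;
    return 0;
}
```

The receiver never has to guess where a message ends: it reads exactly four bytes, then exactly as many bytes as the header announces, no matter how recv() happens to split the stream.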


1 Comment

Thanks...it really helps to know what the common practice is.
3

When you use send() and recv() you specify a buffer size.

Suppose you send a message like this:

send(new_socket,message,strlen(message),0);

The third parameter is the number of bytes to send from your buffer.

One way to learn whether you have sent a packet successfully, if you are using TCP sockets, is that send() and recv() would return the same values. You can check this on the sender side by comparing the message size with the value returned from send().

For checking on the receiver side, the easiest way is to append the end-of-string delimiter \0 to your string.
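A minimal sketch of how a receiver might use the \0 delimiter, assuming the sender transmits strlen(message)+1 bytes so that the terminator goes over the wire. The names struct rbuf, recv_cstring, and RBUF_CAP are hypothetical. Because one recv() call can return part of a message or run past the delimiter into the next one, leftover bytes are kept in a per-connection buffer.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

#define RBUF_CAP 4096   /* hypothetical maximum message size */

/* Hypothetical per-connection state: bytes received but not yet consumed. */
struct rbuf {
    char   data[RBUF_CAP];
    size_t used;
};

/* Copy the next '\0'-terminated message into out, terminator included.
 * Returns the message length without the terminator, or -1 on error,
 * closed connection, or a message that does not fit. */
ssize_t recv_cstring(int fd, struct rbuf *rb, char *out, size_t out_cap)
{
    assert(rb != NULL && out != NULL);

    for (;;) {
        char *end = memchr(rb->data, '\0', rb->used);
        if (end != NULL) {
            size_t msg_len = (size_t)(end - rb->data);
            if (msg_len + 1 > out_cap)
                return -1;
            memcpy(out, rb->data, msg_len + 1);
            /* Keep any bytes that belong to the next message. */
            rb->used -= msg_len + 1;
            memmove(rb->data, end + 1, rb->used);
            return (ssize_t)msg_len;
        }
        if (rb->used == RBUF_CAP)
            return -1;           /* no delimiter within the cap */

        ssize_t n = recv(fd, rb->data + rb->used, RBUF_CAP - rb->used, 0);
        if (n <= 0)
            return -1;           /* 0 = peer closed, < 0 = error */
        rb->used += (size_t)n;
    }
}
```

This only works for payloads that can never contain a \0 byte themselves; binary data would need a length prefix or an escaping scheme instead.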

4 Comments

For sending the delimiter, strlen(message)+1 would be perfectly fine.
Besides that, it is better to talk about "the sender side". From the start of the connection, both sides can send and receive, so using the term "server" is extremely misleading here.
@glglgl Thanks I fixed the server and client into sender and receiver. Also I assumed that string already has \0 delimiter, for that reason that would be already included within the length of string, but if string does not have delimiter, then you're right.
There is no reason whatsoever why send() and recv() should return the same values. TCP is a byte-stream protocol, not a message protocol. If you want message boundaries you have to implement them yourself. Recv() is perfectly entitled to return one byte at a time, for example. -1
0

As soon as one starts doing serious amounts of network programming in C, one realises quite quickly why higher-level languages are popular! Basically, they have a great deal of functionality built in that you soon find yourself wishing C had too.

First off, I would strongly encourage you to look at ZeroMQ (http://zeromq.org/bindings:c) and its C binding. This does a great deal of the horrible donkey work for you in terms of dealing with connections, message demarcation, etc. Plus, it's fast at runtime too; it's quick to develop with and quick to run, the hallmarks of a good library.

ZeroMQ is close to being the perfect sockets library. The only thing it doesn't do yet (AFAIK) is actively monitor the connection to see if it's collapsed - you only find out if you try and send something. You'd have to send your own connection test messages regularly if you wanted to keep a check on the health of the connection.

Secondly, I would encourage you to consider serialisation. As soon as you start having complex data structures which point at allocated memory, you'll start getting into complex and difficult territory. When faced with this problem I chose to use ASN.1 for defining and serialising my data structures, using the libraries and tools from Objective Systems (http://www.obj-sys.com/index.php). It costs money and takes a bit of getting used to, but I found it extremely worthwhile in terms of time saved in development.

As well as serialisation routines they give you some very handy extras that C doesn't provide. For instance, their code generator will give you routines to copy data types, which is quite handy if that datatype is a structure full of pointers referencing allocated memory.

There are probably some free tools and libraries out there too. A good alternative is Google's Protocol Buffers, which has a C binding (http://code.google.com/p/protobuf-c/).
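To make the point about pointer-bearing structures concrete, here is a hand-rolled sketch of what a serialisation tool generates for you. It is not ASN.1 or Protocol Buffers output; the struct person layout and the names person_pack and person_unpack are hypothetical, and the wire format (two 4-byte big-endian integers followed by the name bytes) is an assumption made only for this example.

```c
#include <arpa/inet.h>   /* htonl, ntohl */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record: name points at separately allocated memory, so
 * sending the raw struct bytes would transmit a meaningless pointer. */
struct person {
    uint32_t id;
    char    *name;
};

/* Flatten p into buf as: id (4 bytes), name length (4 bytes), name bytes.
 * Returns the number of bytes written, or 0 if buf is too small. */
size_t person_pack(const struct person *p, unsigned char *buf, size_t cap)
{
    assert(p != NULL && p->name != NULL && buf != NULL);

    size_t name_len = strlen(p->name);
    if (name_len > UINT32_MAX || cap < 8 || name_len > cap - 8)
        return 0;

    uint32_t id_be  = htonl(p->id);
    uint32_t len_be = htonl((uint32_t)name_len);
    memcpy(buf, &id_be, 4);
    memcpy(buf + 4, &len_be, 4);
    memcpy(buf + 8, p->name, name_len);
    return 8 + name_len;
}

/* Rebuild a person from buf, allocating a fresh name (the caller frees it).
 * Returns 0 on success, -1 on malformed input or allocation failure. */
int person_unpack(const unsigned char *buf, size_t len, struct person *out)
{
    uint32_t id_be, len_be;
    if (len < 8)
        return -1;
    memcpy(&id_be, buf, 4);
    memcpy(&len_be, buf + 4, 4);

    uint32_t name_len = ntohl(len_be);
    if (name_len > len - 8)
        return -1;               /* header claims more than we received */

    char *name = malloc((size_t)name_len + 1);
    if (name == NULL)
        return -1;
    memcpy(name, buf + 8, name_len);
    name[name_len] = '\0';

    out->id = ntohl(id_be);
    out->name = name;
    return 0;
}
```

Even for a two-field struct this is a fair amount of careful bookkeeping, which is the argument for letting a code generator produce it once the structures become nested.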
