3

I'm trying to reimplement the standard strlen function in C and replicate its behavior closely. I defined my function based on the standard C declaration for strlen:

size_t ft_strlen(const char *s);

and implemented it correctly.

However, I noticed something unexpected:

Calling strlen(NULL) causes a compile-time warning:

ft_strlen.c:36:29: warning: null passed to a callee that requires a non-null argument [-Wnonnull]
    printf("%zu\n", strlen(0));
                           ~^
1 warning generated.

Calling ft_strlen(NULL) compiles fine, but crashes at runtime with

zsh: segmentation fault (core dumped) ./a.out

Given that both functions have the exact same signature, are passed the same input, and are compiled with cc, why does strlen(NULL) trigger a compile-time warning, while ft_strlen(NULL) does not?

The compiler version:

Ubuntu clang version 12.0.1-19ubuntu3
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

I'm trying to understand the compiler-level behavior and how standard library functions differ from user-defined ones, and how can I potentially replicate strlen() myself.

5
  • Please edit your question to include the exact compiler + version you are using. Also add the full complete error messages you get from both executions/compilations. Commented Oct 12 at 11:38
  • Do they have exactly the same signatures? This link shows way more: extern size_t strlen (const char *__s) __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__pure__)) __attribute__ ((__nonnull__ (1))); (godbolt.org/z/55PM6z48P). It's really hard to answer this without exact information on the toolchain and library you're using. Commented Oct 12 at 11:50
  • You did not mention the compiler you are using Commented Oct 12 at 12:50
  • You will not get it without compiler-specific attributes or pragmas. Commented Oct 12 at 15:35
  • The correct solution to the root problem here is to sanitize your pointer at the point in your application where it might end up as a null pointer, rather than passing on the burden of checking your application-tier bugs to some completely unrelated library-tier function. That's just the creation of a dirty hack in order to fix symptoms of the bug, rather than the bug itself. Commented Oct 13 at 9:15

5 Answers

11

Why does strlen(NULL) cause a compile-time error, while my custom ft_strlen(NULL) only crashes at runtime?

Because your C implementation provides features beyond those contemplated by the C language specification, and uses them to detect this issue. The spec itself does not require compile-time null checks under any circumstances, but it does allow implementations to emit whatever diagnostics they like (in addition to required diagnostics for a variety of specific issues), and to reject sources more or less arbitrarily.

In your particular case, some of the possibilities include

  • the compiler has special knowledge of some functions from the standard library, including strlen(), which it uses to diagnose the issue
  • the system provides some way to mark arbitrary pointer-type function parameters as requiring non-null arguments, and uses it on strlen(). This would mean that the system's declaration for strlen() is not actually the same as that of your ft_strlen(). In Glibc, for example, there is a __nonnull__ attribute (a Glibc / GCC extension) indicating that the argument should not be null

The way to use standard C to express that the argument must not be null involves expressing the parameter's type as an array with static length:

size_t ft_strlen(const char s[static 1]);

That still declares the parameter as a pointer to const char, but the [static 1] expresses that the function relies on the caller to provide an argument that points to at least one element (which a null pointer definitely does not do). Whether your compiler will actually emit a diagnostic about null arguments in calls to such a function is a different question, but there's a decent chance of it.
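As a minimal sketch of the [static 1] form, here is one way the definition could look. The loop body is an assumption for illustration, not the asker's actual implementation.

```c
#include <stddef.h>  /* size_t */

/* [static 1] promises that s points to at least one char, so a null
   argument violates the contract. The parameter type is still
   const char *, exactly as in the plain declaration. */
size_t ft_strlen(const char s[static 1])
{
    size_t len = 0;

    while (s[len] != '\0')  /* count characters up to the terminator */
        len++;
    return len;
}
```

With this declaration, a call such as ft_strlen(NULL) gives the compiler enough information to diagnose the null argument. Clang is generally reported to do so under -Wnonnull; other compilers may stay silent, since the standard does not require a diagnostic here.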


1 Comment

It's also notable that gcc's standard library (glibc or equivalent) is considered part of the compiler/implementation and therefore does not need to follow standard C (or even be written in C). Whereas our custom functions have to be standard C. Therefore non-portable __attribute__((nonnull(1))) is acceptable for a standard lib function which assumes that the compiler is gcc, but should be avoided in application level code where const char s[static 1] is a portable solution.
7

I'm trying to understand the compiler-level behavior and how standard library functions differ from user-defined ones,…

Knowledge about strlen is built into the compiler. This is possible because the specification of strlen is written in the C standard and the standard reserves the name strlen (and others in the standard C library) for this purpose. C 2024 7.1.3 says “… All identifiers with external linkage in any of the following subclauses … are always reserved for use as identifiers with external linkage…”

In contrast, the compiler knows nothing about your function other than what you tell it.

… and how can I potentially replicate strlen() myself.

In this case, you can get the same behavior using the static keyword to tell the compiler the function expects a pointer to at least one element:

size_t ft_strlen(const char s[static 1]);

In general, replicating the compiler behavior for standard library functions may not be possible using only features in the C standard. It may be necessary to use compiler extensions and, even then, may not be possible for certain behaviors.
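Where a compiler extension is needed, one common pattern is to hide it behind a macro that expands to nothing on compilers that lack it. The sketch below assumes a hypothetical macro name, FT_NONNULL, and relies on the __has_attribute check that GCC and Clang provide; the function body is illustrative only.

```c
#include <stddef.h>  /* size_t */

/* FT_NONNULL is a hypothetical helper macro, not part of any library.
   The nested #if keeps compilers without __has_attribute from
   choking on the function-like form. */
#if defined(__has_attribute)
#  if __has_attribute(nonnull)
#    define FT_NONNULL(n) __attribute__((nonnull(n)))
#  endif
#endif
#ifndef FT_NONNULL
#  define FT_NONNULL(n)  /* no-op where the attribute is unavailable */
#endif

/* The attribute goes on the declaration that callers see. */
FT_NONNULL(1) size_t ft_strlen(const char *s);

size_t ft_strlen(const char *s)
{
    size_t len = 0;

    while (s[len] != '\0')
        len++;
    return len;
}
```

On compilers that do not support the attribute, the code still builds, but the null-argument diagnostic is simply lost.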


3

One way to have ft_strlen(NULL) flagged at compile time is to add this attribute to the function's declaration:

 __attribute__((nonnull(1)))

This is an attribute, a GCC/Clang compiler-specific directive that tells the compiler "The first argument to this function must not be NULL".
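A short sketch of where the attribute could be placed, assuming GCC or Clang and an illustrative function body:

```c
#include <stddef.h>  /* size_t */

/* GCC/Clang extension: parameter 1 must not be a null pointer.
   Attaching it to the prototype makes it visible at every call site. */
__attribute__((nonnull(1)))
size_t ft_strlen(const char *s);

size_t ft_strlen(const char *s)
{
    size_t len = 0;

    while (s[len] != '\0')
        len++;
    return len;
}
```

With this in place, Clang should report ft_strlen(NULL) under -Wnonnull, matching the warning quoted in the question; GCC is expected to need -Wall for the same diagnostic. The check only catches nulls the compiler can see at compile time, so a null pointer that arrives through a variable at runtime can still crash.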

2 Comments

I would still appreciate any other inputs, other reasoning, and other solutions. I would need to get the same result without using this attribute or GCC or Clang directives.
"I would need to get the same result without using this attribute or GCC or Clang directives" -- What is wrong with your current solution? The strlen function was never intended to hold the user's hand -- if the user of the function gives it a null pointer, that's the user's issue, not strlen's. All the "classic" strlen functions (without any attributes given) do not have the NULL check -- it is just that gcc happens to use this attribute. strlen is supposed to do no checking whatsoever, and instead leaves it up to the caller to make sure what it is given is valid.
3

I understand your question is multi-layered, and the run-time crashing of your custom strlen() is nearly a side note, but I thought I'd address just this one aspect nonetheless. Does your code care for the possibility of a NULL parameter as does the following custom strlen()?

Runnable code here: https://godbolt.org/z/5fGTrjaKE

#include <stdio.h>   /* printf()  */

size_t mstrlen( const char *str )
{
    size_t len = 0;

    if( str )  /* Prevent run-time crash on NULL pointer. */
    {
        for(; str[len]; len++);
    }
    return len;
}


int main()
{

    char s[] = "stars";

    printf("mstrlen(%s) = %zu\n", s, mstrlen(s));
    printf("mstrlen(NULL) = %zu\n",  mstrlen(NULL));

    return 0;
}

3 Comments

How does it answer the question? He did not ask about preventing NULL pointer dereferencing, only about compiler behaviour.
It may be tangential to OP's direct question, yet it is related to the area of concern.
I guess one point here is that because the behavior resulting from passing a null pointer to the standard library's strlen() is undefined, an implementation that reliably returns 0 under such circumstances is consistent with the spec. Within the scope of any particular C implementation, UB does not have to manifest as program crashes or as unpredictable behavior.
1

Look at the prototype of strlen() in <string.h> and you will probably see the reason:

  • on my system it is declared as __pure, which probably means it is an intrinsic function of the compiler with different checking in the compiler than is done for normal user functions.

  • Of course, you can check inside the function for a NULL value passed as the argument. Did you do that? Probably not.

