ISO/IEC 9899:2011 7.22.1.4 The strtol, strtoll, strtoul and strtoull f...

Note that names beginning with an underscore are reserved for the implementation; it is best to avoid using such names in your code. Hence, _val should be just val.

The full specification of error handling for strtol() and its relatives is surprisingly complex when you first run across it. One thing you're doing absolutely right is using a function to invoke strtol(); using it 'raw' in code is probably not correct.

Since the question is tagged with both C and C++, I will quote from the C2011 standard; you can find the appropriate wording in the C++ standard for yourself.

long int strtol(const char * restrict nptr, char ** restrict endptr, int base);

2 [...] First, they decompose the input string into three parts: an initial, possibly empty, sequence of white-space characters (as specified by the isspace function), a subject sequence resembling an integer represented in some radix determined by the value of base, and a final string of one or more unrecognized characters, including the terminating null character of the input string. [...]

7 If the subject sequence is empty or does not have the expected form, no conversion is performed; the value of nptr is stored in the object pointed to by endptr, provided that endptr is not a null pointer.

8 The strtol, strtoll, strtoul, and strtoull functions return the converted value, if any. If no conversion could be performed, zero is returned. If the correct value is outside the range of representable values, LONG_MIN, LONG_MAX, LLONG_MIN, LLONG_MAX, ULONG_MAX, or ULLONG_MAX is returned (according to the return type and sign of the value, if any), and the value of the macro ERANGE is stored in errno.

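You therefore need to set errno to 0 before calling strtol(), so your parseLong() function might look like:

#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

static long parseLong(const char *str)
{
    errno = 0;      /* library functions never reset errno to 0 themselves */
    char *temp;
    long val = strtol(str, &temp, 0);

    /* No digits converted, trailing junk, or value out of range */
    if (temp == str || *temp != '\0' ||
        ((val == LONG_MIN || val == LONG_MAX) && errno == ERANGE))
        fprintf(stderr, "Could not convert '%s' to long and leftover string is: '%s'\n",
                str, temp);
        // cerr << "Could not convert '" << str << "' to long and leftover string is '"
        //      << temp << "'\n";
    return val;
}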

Note that on error, this returns 0 or LONG_MIN or LONG_MAX, depending on what strtol() returned. If your calling code needs to know whether the conversion was successful or not, you need a different function interface (see below). Also, note that errors should be printed to stderr rather than stdout, and error messages should be terminated by a newline \n; if they're not, they aren't guaranteed to appear in a timely fashion.

Now, in library code you probably do not want any printing, and your calling code might want to know whether the conversion was successful or not, so you might revise the interface too. In that case, you'd probably modify the function so it returns a success/failure indication:

bool parseLong(const char *str, long *val)
{
    char *temp;
    bool rc = true;
    errno = 0;
    *val = strtol(str, &temp, 0);

    if (temp == str || *temp != '\0' ||
        ((*val == LONG_MIN || *val == LONG_MAX) && errno == ERANGE))
        rc = false;

    return rc;
}

You could use this like:

if (parseLong(str, &value))
{
    /* conversion successful: use value */
}
else
{
    /* handle error */
}

If you need to distinguish between 'trailing junk', 'invalid numeric string', 'value too big' and 'value too small' (and 'no error'), you'd use an integer or enum instead of a boolean return code. If you want to allow trailing white space but no other characters, or if you don't want to allow any leading white space, you have more work to do in the function. The code allows octal, decimal and hexadecimal; if you want strictly decimal, you need to change the 0 to 10 in the call to strtol().
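A minimal sketch of the enum-returning variant described above. The type ParseStatus, its enumerators, and the function name parseLongStatus() are hypothetical names chosen for illustration; they are not part of the original answer or of any library.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical status codes; the names are illustrative only. */
typedef enum {
    PARSE_OK,
    PARSE_INVALID,      /* no digits were converted */
    PARSE_TRAILING,     /* a number followed by other characters */
    PARSE_TOO_BIG,      /* value above LONG_MAX */
    PARSE_TOO_SMALL     /* value below LONG_MIN */
} ParseStatus;

static ParseStatus parseLongStatus(const char *str, long *val)
{
    char *end;

    errno = 0;
    long result = strtol(str, &end, 0);

    if (end == str)
        return PARSE_INVALID;
    if (result == LONG_MAX && errno == ERANGE)
        return PARSE_TOO_BIG;
    if (result == LONG_MIN && errno == ERANGE)
        return PARSE_TOO_SMALL;
    if (*end != '\0')
        return PARSE_TRAILING;

    *val = result;      /* only written on success */
    return PARSE_OK;
}
```

The range checks come before the trailing-junk check here, so an out-of-range number followed by junk is reported as out of range; the opposite order would be just as defensible.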

If your functions are to masquerade as part of the standard library, they should not set errno to 0 permanently, so you'd need to wrap the code to preserve errno:

int saved = errno;  // At the start, before errno = 0;

/* ... rest of function ... */

if (errno == 0)     // Before the return
    errno = saved;
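Put together, the errno-preserving wrapper might look like the following sketch. The name parseLongKeepErrno() is hypothetical; the body is the boolean parseLong() shown earlier with the save and restore steps added.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical variant of parseLong() that leaves the caller's errno
 * untouched unless strtol() itself reports something. */
static bool parseLongKeepErrno(const char *str, long *val)
{
    int saved = errno;      /* remember the caller's errno */
    char *end;
    bool ok = true;

    errno = 0;
    *val = strtol(str, &end, 0);
    if (end == str || *end != '\0' ||
        ((*val == LONG_MIN || *val == LONG_MAX) && errno == ERANGE))
        ok = false;

    if (errno == 0)         /* nothing reported: restore the old value */
        errno = saved;
    return ok;
}
```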

Thanks for the extensive answer! But why do you explicitly check for "errno == ERANGE" instead of "errno != 0"? If the user could specify their own base for conversion, errno could also be set to EINVAL... Also, "man strtol" (linux.die.net/man/3/strtol) uses the following code for error checking, and I really don't get the reason for this: "if ((errno == ERANGE && (val == LONG_MAX || val == LONG_MIN)) || (errno != 0 && val == 0)){ error }". Why isn't this a simple "errno != 0" as well?

The standard doesn't mention setting errno == EINVAL for values of base other than 0 or 2..36, but it is a reasonable thing to do. In general, you should be cautious about trying to detect error conditions with errno rather than the return from a function; the library can set errno to a non-zero value even if the function succeeds. (On Solaris, if the output was not a terminal, you'd find errno == ENOTTY after a successful operation.) In theory, strtol() could convert "1" to 1 and set errno to a non-zero value and this would be legitimate but perverse (and successful).

Is there a reason errno == ERANGE is checked unconditionally, whether strtol returned LONG_MIN/LONG_MAX or not? (For the reason you give in the comment, a library function may set errno on success.)

@mafso: Originally, some variation on the theme of exhaustion, laziness or carelessness. I've updated the answer to address your valid point, and miscellaneous other minor issues (spelling, etc).

There's an error in your example. val is a long int *, but you do the check val == LONG_MIN, it should be *val == LONG_MIN...

c++ - Correct usage of strtol - Stack Overflow

c++ c

It is only sensible to consider strtol() and strtoul() (or strtoll() or strtoull() from <stdlib.h>, or perhaps strtoimax() or strtoumax() from <inttypes.h>) if you care about error conditions. If you don't care about error conditions on overflow, any of them could be used. Neither atoi() nor atol() nor sscanf() gives you control if the values overflow. Additionally, neither atoi() nor atol() provides support for hex or octal inputs (so in fact you can't use those to meet your requirements).

Note that calling the strtoX() functions is not entirely trivial. You have to set errno to 0 before calling them, and pass a pointer to get the end location, and analyze carefully to know what happened. Remember, all possible return values from these functions are valid outputs, but some of them may also indicate invalid inputs; errno and the end pointer help you distinguish between them all.

If you need to convert to int after reading the value using, say, strtoll(), you can check the range of the returned value (stored in a long long) against the range defined in <limits.h> for int: INT_MIN and INT_MAX.
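A sketch of that range check, assuming a hypothetical helper named parseInt() that reads with strtoll() and then narrows to int only when the value fits:

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: parse with strtoll(), then check against int's range. */
static bool parseInt(const char *str, int *out)
{
    char *end;

    errno = 0;
    long long ll = strtoll(str, &end, 0);

    if (end == str || *end != '\0')
        return false;                   /* nothing converted, or trailing junk */
    if ((ll == LLONG_MIN || ll == LLONG_MAX) && errno == ERANGE)
        return false;                   /* did not even fit in long long */
    if (ll < INT_MIN || ll > INT_MAX)
        return false;                   /* fits in long long but not in int */

    *out = (int)ll;
    return true;
}
```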

Note that none of these functions tells you which conversion was used. You'll need to analyze the string yourself. Quirky note: there is no decimal 0 in C source; when you write 0, you are writing an octal constant (because its first digit is a 0). There are no practical consequences to this piece of trivia.
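One way to analyze the string yourself is sketched below. The function detectBase() is a hypothetical helper, not a library call; it mirrors the prefix rules that strtol() applies when base is 0 and assumes the string does start with a number.

```c
#include <ctype.h>

/* Hypothetical helper: report which radix strtol(str, &end, 0) would use.
 * Returns 16, 8 or 10; it does not check that the digits are valid. */
static int detectBase(const char *s)
{
    while (isspace((unsigned char)*s))
        s++;                            /* strtol() skips leading white space */
    if (*s == '+' || *s == '-')
        s++;                            /* optional sign */
    if (s[0] == '0') {
        if ((s[1] == 'x' || s[1] == 'X') && isxdigit((unsigned char)s[2]))
            return 16;                  /* 0x or 0X followed by a hex digit */
        return 8;                       /* leading 0, including plain "0" */
    }
    return 10;
}
```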

Overflow occurs when the value entered is larger (or smaller) than will fit into the integer type. The returned value is clamped to the end of the range supported by the type, but errno == ERANGE indicates that the overflow occurred. If you've got 32-bit long values, this could happen with a value such as 5,000,000,000 (minus the commas), even though all the digits were used.

c - atoi vs atol vs strtol vs strtoul vs sscanf - Stack Overflow

c

If the correct value is outside the range of representable values, LONG_MIN, LONG_MAX, LLONG_MIN, LLONG_MAX, ULONG_MAX, or ULLONG_MAX is returned (according to the return type and sign of the value, if any), and the value of the macro ERANGE is stored in errno.

Since the correct value of your input string is too large for the type, you get LONG_MAX, and errno is set to ERANGE.

Whenever one of the strto(u)l(l) functions returns one of the TYPE_MAX or TYPE_MIN values, you need to check errno to find out whether it's a correct result, or your input was out-of-range.
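As a small illustration of that rule, the sketch below (with a hypothetical function name, overflowed()) treats LONG_MAX or LONG_MIN as an error only when errno is also ERANGE, so a string that spells out exactly LONG_MAX is still accepted:

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: true only if the input was out of range for long. */
static bool overflowed(const char *str)
{
    errno = 0;
    long v = strtol(str, NULL, 10);
    return (v == LONG_MAX || v == LONG_MIN) && errno == ERANGE;
}
```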

Damn, I was just about to paste the same. :-D @PascalCuoq: Behaviour of the strto... functions is very well-defined even in case of overflows. That's what makes them superior to e.g. *scanf() functions for parsing numerical input.

Can you also specify from where you're quoting?

@DevSolar Not everybody knows that :-)

In contrast, why can (int)strtoul("-2",NULL,0) return the "correct" value?
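A possible explanation, sketched in code: the standard says that when the subject sequence begins with a minus sign, the converted value is negated in the return type. For strtoul() that negation is done in unsigned long, so "-2" is not a range error. The function name negatedExample() is hypothetical.

```c
#include <limits.h>
#include <stdlib.h>

/* "-2" converts to 2, which is then negated as an unsigned long,
 * giving ULONG_MAX - 1 without setting errno to ERANGE. */
static unsigned long negatedExample(void)
{
    return strtoul("-2", NULL, 0);
}
```

Converting that result to int is implementation-defined, but on common two's-complement platforms it typically yields -2, which is why the cast appears to give the "correct" value.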

c - What is the difference between strtol and strtoul? - Stack Overflo...

c std strtol

Since your return_val is an unsigned int, you should probably be using strtoul() which has been standard since C89 and is therefore supported by MSVC (whereas strtoll() has only been standard since C99 and is not supported by MSVC).

Your testing of the error conditions is not adequate. You need to set errno to zero before calling the conversion function; you also need to detect whether an error was reported, which is trickier than it seems.

Section 7.20.1.4 'The strtol, strtoll, strtoul, and strtoull functions' of the C99 standard says:

The strtol, strtoll, strtoul, and strtoull functions return the converted value, if any. If no conversion could be performed, zero is returned. If the correct value is outside the range of representable values, LONG_MIN, LONG_MAX, LLONG_MIN, LLONG_MAX, ULONG_MAX, or ULLONG_MAX is returned (according to the return type and sign of the value, if any), and the value of the macro ERANGE is stored in errno.

You also have to look at the value stored through the endptr parameter of the conversion functions to tell that no conversion was performed (as opposed to a valid zero being converted).

If the subject sequence is empty or does not have the expected form, no conversion is performed; the value of nptr is stored in the object pointed to by endptr, provided that endptr is not a null pointer.

So, you must write code more like this (omitting the test against EINVAL because the standard does not mention these functions setting errno to EINVAL):

unsigned int return_val = 0;

if (index + 1 <= argc - 1)
{
    char *end;
    unsigned long ul;
    errno = 0;
    ul = strtoul(argv[index+1], &end, 10);
    if ((ul == 0 && end == argv[index+1]) ||    /* no conversion performed */
        (ul == ULONG_MAX && errno == ERANGE) || /* overflowed unsigned long */
        (ul > UINT_MAX))                        /* too big for unsigned int */
    {
        fprintf(stderr, "Could not parse argument %s for switch %s!\n",
                argv[index+1], argv[index]);
        return 0;
    }
    return_val = (unsigned int)ul;
}

Note that this is simpler than the test for a signed integer conversion which must take into account the negative <type>_MIN limit as well as the <type>_MAX limit.

Also note that you really should record the result in an unsigned long and then check whether it fits within your designated range, which may be limited to UINT_MAX (which can be less than ULONG_MAX in a Unix-like 64-bit environment).
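The same checks can be packaged as a function. This is only a sketch: parseUInt() is a hypothetical name, and like the fragment above it does not reject trailing characters or a leading minus sign.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: convert to unsigned long first, then narrow. */
static bool parseUInt(const char *str, unsigned int *out)
{
    char *end;

    errno = 0;
    unsigned long ul = strtoul(str, &end, 10);

    if (end == str)
        return false;                   /* no conversion performed */
    if (ul == ULONG_MAX && errno == ERANGE)
        return false;                   /* overflowed unsigned long */
    if (ul > UINT_MAX)
        return false;                   /* fits unsigned long, not unsigned int */

    *out = (unsigned int)ul;
    return true;
}
```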

errno = ERANGE

also, you should check for retval > UINT_MAX as well (before downcasting to unsigned int), in case it's on a 64-bit long platform.

@bdonlan: yes - thanks (to both comments). I fixed the downcast with the last paragraph without seeing your comment (but the first edition of the answer didn't include that, and your comment is spot on). I had missed my '=' instead of '==' typo (but the compiler would have warned me about it had I compiled the code).

You've fixed the downcast but left in the ul > UINT_MAX test which will now always be false.

@DavidConrad: why do you think that test will always be false? On a 64-bit (Unix) machine, it is trivial to have an unsigned long value that is much larger than will fit in unsigned int. On Windows 64 or a 32-bit system, you're correct; however, I believe the code will work correctly on those too though a compiler might decide to warn about the test. If that was a problem, I'd arrange for conditional compilation, probably testing for ULONG_MAX != UINT_MAX and only making the comparison if the two are different.

What is wrong with the following C code? - Stack Overflow

c

You're running into overflow of the long type, which is signed.

You probably should use:

printf("%lx\n", strtoul("0xfffff70a", NULL, 0));
                     ^
                     |
                  important!

instead; note the 'u' for "unsigned" (see manual page).

Also note that you can't print an unsigned long with plain %x, you need to qualify it as being bigger than int.
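A self-contained sketch of the same idea, writing into a buffer so the result can be inspected. The function formatHex() is a hypothetical helper; the point is the pairing of strtoul() with the l length modifier in "%lx".

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: parse text with strtoul() and format it as hex.
 * The 'l' in "%lx" matches the unsigned long argument. */
static void formatHex(const char *text, char *buf, size_t size)
{
    unsigned long value = strtoul(text, NULL, 0);
    snprintf(buf, size, "%lx", value);
}
```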

c - What is the difference between strtol and strtoul? - Stack Overflo...

c std strtol

It might be easier to understand if you started at 7.20.1.4 (The strtol, strtoll, strtoul, and strtoull functions) 2 of the C99 standard, instead of 4:

2 The strtol, strtoll, strtoul, and strtoull functions convert the initial portion of the string pointed to by nptr to long int, long long int, unsigned long int, and unsigned long long int representation, respectively. First, they decompose the input string into three parts: an initial, possibly empty, sequence of white-space characters (as specified by the isspace function), a subject sequence resembling an integer represented in some radix determined by the value of base, and a final string of one or more unrecognized characters, including the terminating null character of the input string. Then, they attempt to convert the subject sequence to an integer, and return the result.

3 If the value of base is zero, the expected form of the subject sequence is that of an integer constant as described in 6.4.4.1, optionally preceded by a plus or minus sign, but not including an integer suffix. If the value of base is between 2 and 36 (inclusive), the expected form of the subject sequence is a sequence of letters and digits representing an integer with the radix specified by base, optionally preceded by a plus or minus sign, but not including an integer suffix. The letters from a (or A) through z (or Z) are ascribed the values 10 through 35; only letters and digits whose ascribed values are less than that of base are permitted. If the value of base is 16, the characters 0x or 0X may optionally precede the sequence of letters and digits, following the sign if present.

4 The subject sequence is defined as the longest initial subsequence of the input string, ...

In particular, paragraph 3 clarifies what a subject sequence is.
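A few concrete conversions may make paragraph 3 easier to follow. The wrapper convert() is a hypothetical helper that just forwards to strtol() and ignores errors.

```c
#include <stdlib.h>

/* Hypothetical helper: forward to strtol() with the given base. */
static long convert(const char *str, int base)
{
    return strtol(str, NULL, base);
}

/*
 * With base 0 the subject sequence has the form of a C integer constant:
 *   convert("0x1A", 0)  is 26  (hexadecimal)
 *   convert("017", 0)   is 15  (octal)
 *   convert("42", 0)    is 42  (decimal)
 * With an explicit base, letters stand for the values 10 to 35:
 *   convert("0x1A", 16) is 26  (the 0x prefix is optional in base 16)
 *   convert("z", 36)    is 35
 *   convert("19", 8)    is 1   (9 is not an octal digit, so parsing stops)
 */
```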

c - Confusing language in specification of strtol, et al - Stack Overf...

c standards-compliance language-lawyer strtol

Your architecture has a 32-bit long type. 0xfffff70A is not representable as a signed 32-bit long. errno should have been set to ERANGE.

A signed 32-bit long can only represent values from -0x80000000 to 0x7fffffff.

c - What is the difference between strtol and strtoul? - Stack Overflo...

c std strtol

It would appear that you have a configuration where sizeof(long) == 4 (i.e. 32-bit long). You might want to look into strtoull()/strtoll() instead of strtoul()/strtol(), and use [unsigned] long long variables instead...

Edit: actually, never mind the [unsigned] long long bit, as you already have uint64_t...
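A sketch of reading a 64-bit value with strtoull() into a uint64_t. The function parseU64() is a hypothetical helper, and trailing characters are deliberately not rejected so that it can be used while walking along a line of a file.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: strtoull() works even where long is only 32 bits. */
static bool parseU64(const char *str, uint64_t *out)
{
    char *end;

    errno = 0;
    unsigned long long ull = strtoull(str, &end, 0);

    if (end == str)
        return false;                   /* no conversion performed */
    if (ull == ULLONG_MAX && errno == ERANGE)
        return false;                   /* out of range for unsigned long long */
#if ULLONG_MAX > UINT64_MAX
    if (ull > UINT64_MAX)
        return false;                   /* only possible if long long is wider */
#endif
    *out = (uint64_t)ull;
    return true;
}
```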

Yes, using strtoull() solved the problem. Thanks.

c - Using strtol() to Parse Long Values out of Lines of a File - Stack...

c parsing strtol

[C++11: 21.5/3]: Throws: invalid_argument if strtol, strtoul, strtoll, or strtoull reports that no conversion could be performed. [..]

[C99: 7.20.1.4/5]: If the subject sequence has the expected form and the value of base is zero, the sequence of characters starting with the first digit is interpreted as an integer constant according to the rules of 6.4.4.1. If the subject sequence has the expected form and the value of base is between 2 and 36, it is used as the base for conversion, ascribing to each letter its value as given above. [..]

No semantics are specified in C99 for the case when base is not zero or between 2 and 36, so the result is undefined. This does not necessarily satisfy the excerpt from [C++11: 21.5/3].

In short, this is UB; you'd expect an exception only when the base is valid but the input value is inconvertible in that base. This is a bug in neither GCC nor the standard.
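Since the behaviour for an invalid base is not specified, one defensive option in C is to validate the base before calling strtol() at all. The following is a sketch under that assumption; checkedStrtol() is a hypothetical wrapper, not something GCC or any standard library provides.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical wrapper: reject a base outside 0 or 2..36 up front,
 * so strtol() is never called with a value the standard does not cover. */
static bool checkedStrtol(const char *str, int base, long *out)
{
    char *end;

    if (base != 0 && (base < 2 || base > 36))
        return false;                   /* invalid base: do not call strtol() */

    errno = 0;
    long v = strtol(str, &end, base);
    if (end == str)
        return false;                   /* no conversion performed */
    if ((v == LONG_MIN || v == LONG_MAX) && errno == ERANGE)
        return false;                   /* out of range */

    *out = v;
    return true;
}
```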

Well, it sounds like a bug in the C Standard to me: it should specify precisely what happens when base is not valid. Not specifying it is quite lazy. (Yes, I do understand that at this point there are tons of incompatible implementations, and part of the reason is precisely because the Standard never cared about clarifying that.)

You may be right, but it's tragic that GCC produces undefined behavior when it would be so trivial to produce an intuitive, implementation-defined behavior of throwing when the base is invalid. All it would take is for their __stoa() to initialize the endptr to the input string, and then magically invalid_argument would be thrown in that case. This seems like a case where C fell short by not specifying what happens when base is invalid, then common implementations (GCC, Clang) specified it (EINVAL, as one would expect). Then C++ left it unspecified again, but this time GCC left it.

@peppe: Are you essentially stating that the standard is buggy if it does not "bother" to specify semantics for a precondition failure? Sorry, but no. That's not how the world works. It shouldn't even be how your own code works!!

Well, okay, that said, in this particular instance I might have expected better.

Come on, it's not even remotely comparable to operator[]. It's not inline. It's not O(1). The actual code that gets run is difficult to estimate. It has a loop and some integer math in it, including divisions and modulos. Not standardising the outcome of such a basic check (which notably is already being done by glibc and BSD's libc, and which you need to do anyhow for the base == 0 case), if not lazy, it is pretty close to being lazy.

c++ - C++11 std::stoi silently fails when base not in [2,36] (GCC) - S...

c++ gcc c++11 std

7.20.1.4 The strtol, strtoll, strtoul, and strtoull functions

Argument indexes count from 0 with the name of the program. argv[1] contains the string "-b", while argv[2] contains the string "101".

To compare the option string you need #include <string.h> and strcmp().

Internally, the number will be binary regardless of how you convert it. However, there isn't an easy way of printing a binary number in C. You can use strtol() (from <stdlib.h>), or one of its relatives, to convert the string as a binary value. You should probably use that to convert the string into a value (though, in general, you need to be rather careful detecting the error conditions from strtol(); it is subtle with its returns).

long strtol(const char * restrict nptr, char ** restrict endptr, int base);

7 If the subject sequence is empty or does not have the expected form, no conversion is performed; the value of nptr is stored in the object pointed to by endptr, provided that endptr is not a null pointer.

8 The strtol, strtoll, strtoul, and strtoull functions return the converted value, if any. If no conversion could be performed, zero is returned. If the correct value is outside the range of representable values, LONG_MIN, LONG_MAX, LLONG_MIN, LLONG_MAX, ULONG_MAX, or ULLONG_MAX is returned (according to the return type and sign of the value, if any), and the value of the macro ERANGE is stored in errno.

You would want to ensure that all the non-blank characters were converted, amongst other things.
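A sketch of converting such an argument with base 2 while checking that the whole string was consumed. The function parseBinary() is a hypothetical helper; unlike the wording above, it does not tolerate trailing blanks.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: interpret str as a binary number, e.g. "101" is 5. */
static bool parseBinary(const char *str, long *out)
{
    char *end;

    errno = 0;
    long v = strtol(str, &end, 2);

    if (end == str || *end != '\0')
        return false;                   /* empty, or contains a non-binary digit */
    if ((v == LONG_MIN || v == LONG_MAX) && errno == ERANGE)
        return false;                   /* too many digits for long */

    *out = v;
    return true;
}
```

With the command line from the question, this would be called as parseBinary(argv[2], &value) once argv[1] has been recognized as "-b".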

if (strcmp(argv[1], "-b") == 0)

Reading command line arguments in a C program - Stack Overflow

c command-line-arguments