C library functions that make changes to arrays or objects take at least two arguments: a pointer to the array or object and an integer indicating the number of elements or bytes to be manipulated. For the purposes of this rule, the element count of a pointer is the size of the object to which it points, expressed by the number of elements that are valid to access. Supplying incorrect arguments to such a function might cause the function to form a pointer that does not point into or just past the end of the object, resulting in undefined behavior.
Annex J of the C Standard [ISO/IEC 9899:2011] states that it is undefined behavior if the "pointer passed to a library function array parameter does not have a value such that all address computations and object accesses are valid."
In the following code,
int arr[5];
int *p = arr;
unsigned char *p2 = (unsigned char *)arr;
unsigned char *p3 = (unsigned char *)(arr + 2);
void *p4 = arr;
the element count of the pointer p is sizeof(arr) / sizeof(arr[0]), that is, 5. The element count of the pointer p2 is sizeof(arr), that is, 20, on implementations where sizeof(int) == 4. The element count of the pointer p3 is 12 on implementations where sizeof(int) == 4, because p3 points two elements past the start of the array arr. The pointer p4 is treated as though it had type unsigned char * instead of void *, so its element count is the same as that of p2.
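The arithmetic above can be checked mechanically. The following is a minimal sketch rather than part of the rule; the macro ELEMENT_COUNT and the function bytes_from_element are hypothetical names introduced only for illustration. It computes how many bytes remain valid to access through an unsigned char * that points at a given element of an int[5] array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of elements in a true array (not a pointer). */
#define ELEMENT_COUNT(a) (sizeof(a) / sizeof((a)[0]))

/*
 * Hypothetical helper: bytes that remain valid to access through an
 * unsigned char * pointing at element `index` of an int[5] array.
 */
size_t bytes_from_element(size_t index) {
  int arr[5];
  assert(index <= ELEMENT_COUNT(arr)); /* one past the end is allowed */
  return sizeof(arr) - index * sizeof(arr[0]);
}
```

On an implementation where sizeof(int) == 4, bytes_from_element(0) yields 20 and bytes_from_element(2) yields 12, matching the element counts given for p2 and p3.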
Pointer + Integer
The following standard library functions take a pointer argument and a size argument, with the constraint that the pointer must point to a valid memory object of at least the number of elements indicated by the size argument.
fgets() | fgetws() | mbstowcs() [1] | wcstombs() [1]
mbrtoc16() [2] | mbrtoc32() [2] | mbsrtowcs() [1] | wcsrtombs() [1]
mbtowc() [2] | mbrtowc() [1] | mblen() | mbrlen()
memchr() | wmemchr() | memset() | wmemset()
strftime() | wcsftime() | strxfrm() [1] | wcsxfrm() [1]
strncat() [2] | wcsncat() [2] | snprintf() | vsnprintf()
swprintf() | vswprintf() | setvbuf() | tmpnam_s()
snprintf_s() | sprintf_s() | vsnprintf_s() | vsprintf_s()
gets_s() | getenv_s() | wctomb_s() | mbstowcs_s() [3]
wcstombs_s() [3] | memcpy_s() [3] | memmove_s() [3] | strncpy_s() [3]
strncat_s() [3] | strtok_s() [2] | strerror_s() | strnlen_s()
asctime_s() | ctime_s() | snwprintf_s() | swprintf_s()
vsnwprintf_s() | vswprintf_s() | wcsncpy_s() [3] | wmemcpy_s() [3]
wmemmove_s() [3] | wcsncat_s() [3] | wcstok_s() [2] | wcsnlen_s()
wcrtomb_s() | mbsrtowcs_s() [3] | wcsrtombs_s() [3] | memset_s() [4]
[1] Takes two pointers and an integer, but the integer specifies the element count only of the output buffer, not of the input buffer.
[2] Takes two pointers and an integer, but the integer specifies the element count only of the input buffer, not of the output buffer.
[3] Takes two pointers and two integers; each integer corresponds to the element count of one of the pointers.
[4] Takes a pointer and two size-related integers; the first size-related integer parameter specifies the number of bytes available in the buffer; the second size-related integer parameter specifies the number of bytes to write within the buffer.
For calls that take a pointer and an integer size, the given size should not be greater than the element count of the pointer.
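One way to enforce this constraint at a call site is to carry the size of the object alongside the pointer and compare before calling the library function. The following is a minimal sketch under that assumption; checked_memset is a hypothetical wrapper, not a standard function, and it can only be as accurate as the buf_size value the caller supplies.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical wrapper: fill the first n bytes of buf with c, but only
 * if n does not exceed buf_size, the caller-supplied size of the object
 * in bytes. Returns 0 on success and -1 if the request is rejected.
 */
int checked_memset(void *buf, size_t buf_size, int c, size_t n) {
  if (buf == NULL || n > buf_size) {
    return -1; /* would form a pointer past the end of the object */
  }
  memset(buf, c, n);
  return 0;
}
```

For an array declared as char buf[8], the call checked_memset(buf, sizeof(buf), 0, sizeof(buf)) succeeds, whereas a request for sizeof(buf) + 1 bytes is rejected before memset() is reached.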
Noncompliant Code Example (Element Count)
In this noncompliant code example, the incorrect element count is used in a call to wmemcpy(). The sizeof operator returns the size expressed in bytes, but wmemcpy() uses an element count based on wchar_t *.
#include <string.h>
#include <wchar.h>
static const char str[] = "Hello world";
static const wchar_t w_str[] = L"Hello world";
void func(void) {
  char buffer[32];
  wchar_t w_buffer[32];
  memcpy(buffer, str, sizeof(str)); /* Compliant */
  wmemcpy(w_buffer, w_str, sizeof(w_str)); /* Noncompliant */
}
Compliant Solution (Element Count)
When using functions that operate on pointed-to regions, programmers must always express the integer size in terms of the element count expected by the function. For example, memcpy() expects the element count expressed in terms of void *, but wmemcpy() expects the element count expressed in terms of wchar_t *. Instead of applying the sizeof operator, this compliant solution calls functions that return the number of elements in the string, which matches the element count expected by the copy functions. Alternatively, because the arguments in this solution are arrays, for an array A with elements of type T, the expression sizeof(A) / sizeof(T), or equivalently sizeof(A) / sizeof(*A), can be used to compute the number of elements in the array.
#include <string.h>
#include <wchar.h>
static const char str[] = "Hello world";
static const wchar_t w_str[] = L"Hello world";
void func(void) {
  char buffer[32];
  wchar_t w_buffer[32];
  memcpy(buffer, str, strlen(str) + 1);
  wmemcpy(w_buffer, w_str, wcslen(w_str) + 1);
}
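The sizeof(A) / sizeof(*A) form mentioned in the text can also be used directly when the source is a true array. The following is a minimal sketch of that alternative; w_src and copy_wide_literal are hypothetical names, and the destination capacity is assumed to be passed in by the caller as an element count.

```c
#include <stddef.h>
#include <wchar.h>

static const wchar_t w_src[] = L"Hello world";

/*
 * Hypothetical helper: copy w_src, including its terminating null wide
 * character, into dst, which holds dst_count wchar_t elements.
 * Returns the number of elements copied, or 0 if dst is too small.
 */
size_t copy_wide_literal(wchar_t *dst, size_t dst_count) {
  /* Element count of the array, not its size in bytes. */
  const size_t src_count = sizeof(w_src) / sizeof(*w_src);
  if (dst == NULL || dst_count < src_count) {
    return 0;
  }
  wmemcpy(dst, w_src, src_count);
  return src_count;
}
```

This form works only while w_src is an array in scope; once the array has decayed to a pointer, sizeof reports the size of the pointer, and a length function such as wcslen() is needed instead.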
Noncompliant Code Example (Pointer + Integer)
This noncompliant code example assigns a value greater than the number of bytes of available memory to n, which is then passed to memset():
#include <stdlib.h>
#include <string.h>
void f1(size_t nchars) {
  char *p = (char *)malloc(nchars);
  /* ... */
  const size_t n = nchars + 1;
  /* ... */
  memset(p, 0, n);
}
Compliant Solution (Pointer + Integer)
This compliant solution ensures that the value of n is not greater than the number of bytes of the dynamic memory pointed to by the pointer p:
#include <stdlib.h>
#include <string.h>
void f1(size_t nchars) {
  char *p = (char *)malloc(nchars);
  /* ... */
  const size_t n = nchars;
  /* ... */
  memset(p, 0, n);
}
Noncompliant Code Example (Pointer + Integer)
In this noncompliant code example, the element count of the array a is ARR_SIZE elements. Because memset() expects a byte count, the size of the array is scaled incorrectly by sizeof(int) instead of sizeof(long), which can form an invalid pointer on architectures where sizeof(int) != sizeof(long).
#include <string.h>
void f2(void) {
  const size_t ARR_SIZE = 4;
  long a[ARR_SIZE];
  const size_t n = sizeof(int) * ARR_SIZE;
  void *p = a;
  memset(p, 0, n);
}
Compliant Solution (Pointer + Integer)
In this compliant solution, the element count required by memset() is properly calculated without resorting to scaling:
#include <string.h>
void f2(void) {
  const size_t ARR_SIZE = 4;
  long a[ARR_SIZE];
  const size_t n = sizeof(a);
  void *p = a;
  memset(p, 0, n);
}
Two Pointers + One Integer
The following standard library functions take two pointer arguments and a size argument, with the constraint that both pointers must point to valid memory objects of at least the number of elements indicated by the size argument.
memcpy() | wmemcpy() | memmove() | wmemmove()
strncpy() | wcsncpy() | memcmp() | wmemcmp()
strncmp() | wcsncmp() | strcpy_s() | wcscpy_s()
strcat_s() | wcscat_s()
For calls that take two pointers and an integer size, the given size should not be greater than the element count of either pointer.
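The two-sided constraint can be sketched as a wrapper that checks the requested size against both objects before copying. This is an illustrative sketch only; checked_copy is a hypothetical name, and the sizes of both objects are assumed to be known to the caller and passed in as byte counts.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical wrapper: copy n bytes from src to dst only if n fits in
 * both objects. dst_size and src_size are the caller-supplied sizes of
 * the two objects in bytes. The objects must not overlap, as required
 * by memcpy(). Returns 0 on success and -1 if the request is rejected.
 */
int checked_copy(void *dst, size_t dst_size,
                 const void *src, size_t src_size, size_t n) {
  if (dst == NULL || src == NULL || n > dst_size || n > src_size) {
    return -1;
  }
  memcpy(dst, src, n);
  return 0;
}
```

Copying a 10-byte string into a 40-byte array with n equal to the string size succeeds; passing n equal to the destination size is rejected because it exceeds the size of the source.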
Noncompliant Code Example (Two Pointers + One Integer)
In this noncompliant code example, the value of n is incorrectly computed, allowing a read past the end of the object referenced by q:
#include <string.h>
void f4(void) {
  char p[40];
  const char *q = "Too short";
  size_t n = sizeof(p);
  memcpy(p, q, n);
}
Compliant Solution (Two Pointers + One Integer)
This compliant solution ensures that n does not exceed either the size of the character array p or the size of the string referenced by q, including its terminating null character:
#include <string.h>
void f4(void) {
  char p[40];
  const char *q = "Too short";
  size_t n = sizeof(p) < strlen(q) + 1 ? sizeof(p) : strlen(q) + 1;
  memcpy(p, q, n);
}
One Pointer + Two Integers
The following standard library functions take a pointer argument and two size arguments, with the constraint that the pointer must point to a valid memory object containing at least as many bytes as the product of the two size arguments.
bsearch() | bsearch_s() | qsort() | qsort_s()
fread() | fwrite()
For calls that take a pointer and two integers, one integer represents the number of bytes required for an individual object, and a second integer represents the number of elements in the array. The resulting product of the two integers should not be greater than the element count of the pointer were it expressed as an unsigned char *.
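The product constraint can be sketched as a small predicate. The following is a minimal illustration, assuming the caller knows the size of the buffer in bytes; product_fits is a hypothetical name. It also guards against the product itself wrapping, which would otherwise make an oversized request look small.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns 1 if size * nmemb bytes fit within an
 * object of buf_bytes bytes, and 0 otherwise. The division-based test
 * rejects products that would wrap around SIZE_MAX.
 */
int product_fits(size_t buf_bytes, size_t size, size_t nmemb) {
  if (size != 0 && nmemb > SIZE_MAX / size) {
    return 0; /* size * nmemb would wrap */
  }
  return size * nmemb <= buf_bytes;
}
```

For a 64-byte buffer, 16 objects of 4 bytes fit, whereas 17 do not.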
Noncompliant Code Example (One Pointer + Two Integers)
This noncompliant code example allocates a variable number of objects of type struct obj. The function checks that num_objs is small enough to prevent wrapping, in compliance with INT30-C. Ensure that unsigned integer operations do not wrap. The size of struct obj is assumed to be 16 bytes to account for padding to achieve the assumed alignment of long long. However, the padding typically depends on the target architecture, so this object size may be incorrect, resulting in an incorrect element count.
#include <stdint.h>
#include <stdio.h>
struct obj {
  char c;
  long long i;
};
void func(FILE *f, struct obj *objs, size_t num_objs) {
  const size_t obj_size = 16;
  if (num_objs > (SIZE_MAX / obj_size) ||
      num_objs != fwrite(objs, obj_size, num_objs, f)) {
    /* Handle error */
  }
}
Compliant Solution (One Pointer + Two Integers)
This compliant solution uses the sizeof operator to correctly provide the object size and num_objs to provide the element count:
#include <stdint.h>
#include <stdio.h>
struct obj {
  char c;
  long long i;
};
void func(FILE *f, struct obj *objs, size_t num_objs) {
  const size_t obj_size = sizeof *objs;
  if (num_objs > (SIZE_MAX / obj_size) ||
      num_objs != fwrite(objs, obj_size, num_objs, f)) {
    /* Handle error */
  }
}
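To see why a hard-coded object size is fragile, the padding can be measured rather than assumed. The following is a small sketch; obj_padding_bytes is a hypothetical helper, and the value it returns is implementation-defined.

```c
#include <stddef.h>

struct obj {
  char c;
  long long i;
};

/*
 * Hypothetical helper: number of padding bytes the implementation
 * inserts into struct obj. The result is implementation-defined.
 */
size_t obj_padding_bytes(void) {
  return sizeof(struct obj) - (sizeof(char) + sizeof(long long));
}
```

On a typical x86-64 ABI this yields 7, which produces the 16-byte total assumed in the noncompliant example, but other targets may align long long differently, so sizeof remains the only portable source for the object size.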
Noncompliant Code Example (One Pointer + Two Integers)
In this noncompliant code example, the function f() calls fread() to read nitems of type wchar_t, each size bytes in size, into an array of BUFFER_SIZE elements, wbuf. However, the expression used to compute the value of nitems fails to account for the fact that, unlike the size of char, the size of wchar_t may be greater than 1. Consequently, fread() could attempt to form pointers past the end of wbuf and use them to assign values to nonexistent elements of the array. Such an attempt is undefined behavior. (See undefined behavior 109.) A likely consequence of this undefined behavior is a buffer overflow. For a discussion of this programming error in the Common Weakness Enumeration database, see CWE-121, "Stack-based Buffer Overflow," and CWE-805, "Buffer Access with Incorrect Length Value."
#include <stddef.h>
#include <stdio.h>
void f(FILE *file) {
  enum { BUFFER_SIZE = 1024 };
  wchar_t wbuf[BUFFER_SIZE];
  const size_t size = sizeof(*wbuf);
  const size_t nitems = sizeof(wbuf);
  size_t nread = fread(wbuf, size, nitems, file);
  /* ... */
}
Compliant Solution (One Pointer + Two Integers)
This compliant solution correctly computes the maximum number of items for fread() to read from the file:
#include <stddef.h>
#include <stdio.h>
void f(FILE *file) {
  enum { BUFFER_SIZE = 1024 };
  wchar_t wbuf[BUFFER_SIZE];
  const size_t size = sizeof(*wbuf);
  const size_t nitems = sizeof(wbuf) / size;
  size_t nread = fread(wbuf, size, nitems, file);
  /* ... */
}
Noncompliant Code Example (Heartbleed)
CERT vulnerability 720951 describes a vulnerability in OpenSSL versions 1.0.1 through 1.0.1f, popularly known as "Heartbleed." This vulnerability allows an attacker to steal information that under normal conditions would be protected by Secure Sockets Layer/Transport Layer Security (SSL/TLS) encryption.
Despite the seriousness of the vulnerability, Heartbleed is the result of a common programming error and an apparent lack of awareness of secure coding principles. Following is the vulnerable code:
int dtls1_process_heartbeat(SSL *s)
{
  unsigned char *p = &s->s3->rrec.data[0], *pl;
  unsigned short hbtype;
  unsigned int payload;
  unsigned int padding = 16; /* Use minimum padding */
  /* Read type and payload length first */
  hbtype = *p++;
  n2s(p, payload);
  pl = p;
  /* ... More code ... */
  if (hbtype == TLS1_HB_REQUEST) {
    unsigned char *buffer, *bp;
    int r;
    /*
     * Allocate memory for the response; size is 1 byte
     * message type, plus 2 bytes payload length, plus
     * payload, plus padding.
     */
    buffer = OPENSSL_malloc(1 + 2 + payload + padding);
    bp = buffer;
    /* Enter response type, length, and copy payload */
    *bp++ = TLS1_HB_RESPONSE;
    s2n(payload, bp);
    memcpy(bp, pl, payload);
    /* ... More code ... */
  }
  /* ... More code ... */
}
This code processes a "heartbeat" packet from a client. As specified in RFC 6520, when the program receives a heartbeat packet, it must echo the packet's data back to the client. In addition to the data, the packet contains a length field that conventionally indicates the number of bytes in the packet data, but there is nothing to prevent a malicious packet from lying about its data length.
The p pointer, along with payload and pl, contains data from a packet. The code allocates a buffer sufficient to contain payload bytes, with some overhead, then copies payload bytes starting at pl into this buffer and sends it to the client. Notably absent from this code are any checks that the payload integer variable extracted from the heartbeat packet corresponds to the size of the packet data. Because the client can specify an arbitrary value of payload, an attacker can cause the server to read and return the contents of memory beyond the end of the packet data, which violates INT04-C. Enforce limits on integer values originating from tainted sources. The resulting call to memcpy() can then copy the contents of memory past the end of the packet data and the packet itself, potentially exposing sensitive data to the attacker. This call to memcpy() violates ARR38-C. Guarantee that library functions do not form invalid pointers. A version of ARR38-C also appears in ISO/IEC TS 17961:2013, "Forming invalid pointers by library functions [libptr]." This rule would require a conforming analyzer to diagnose the Heartbleed vulnerability.
Compliant Solution (Heartbleed)
OpenSSL version 1.0.1g contains the following patch, which guarantees that payload is within a valid range. The range is limited by the size of the input record.
int dtls1_process_heartbeat(SSL *s)
{
  unsigned char *p = &s->s3->rrec.data[0], *pl;
  unsigned short hbtype;
  unsigned int payload;
  unsigned int padding = 16; /* Use minimum padding */
  /* ... More code ... */
  /* Read type and payload length first */
  if (1 + 2 + 16 > s->s3->rrec.length)
    return 0; /* Silently discard */
  hbtype = *p++;
  n2s(p, payload);
  if (1 + 2 + payload + 16 > s->s3->rrec.length)
    return 0; /* Silently discard per RFC 6520 */
  pl = p;
  /* ... More code ... */
  if (hbtype == TLS1_HB_REQUEST) {
    unsigned char *buffer, *bp;
    int r;
    /*
     * Allocate memory for the response; size is 1 byte
     * message type, plus 2 bytes payload length, plus
     * payload, plus padding.
     */
    buffer = OPENSSL_malloc(1 + 2 + payload + padding);
    bp = buffer;
    /* Enter response type, length, and copy payload */
    *bp++ = TLS1_HB_RESPONSE;
    s2n(payload, bp);
    memcpy(bp, pl, payload);
    /* ... More code ... */
  }
  /* ... More code ... */
}
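The bounds check at the heart of the patch can be isolated as a standalone predicate. The following is a simplified sketch and not OpenSSL code; heartbeat_length_ok and the HB_* constants are hypothetical names, and the layout (1 type byte, a 2-byte length field, the payload, and at least 16 bytes of padding) is taken from the comments in the code above.

```c
#include <stddef.h>

/* Hypothetical constants mirroring the layout described in the patch. */
enum { HB_TYPE_LEN = 1, HB_LENGTH_LEN = 2, HB_MIN_PADDING = 16 };

/*
 * Hypothetical predicate: returns 1 if a heartbeat record of
 * record_length bytes can hold the type byte, the 2-byte length field,
 * a payload of claimed_payload bytes, and the minimum padding.
 */
int heartbeat_length_ok(size_t record_length, size_t claimed_payload) {
  const size_t overhead = HB_TYPE_LEN + HB_LENGTH_LEN + HB_MIN_PADDING;
  if (record_length < overhead) {
    return 0; /* too short to carry even an empty payload */
  }
  /* Compare by subtraction so that the sum cannot wrap. */
  return claimed_payload <= record_length - overhead;
}
```

A record of 24 bytes can carry a claimed payload of 5 bytes, but a claimed payload of 65535 bytes in the same record is rejected, which is the case the unpatched code failed to detect.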