Day 82: A Pointer to a 2-D Array is…?

While cleaning up my notes I stumbled upon the one line of code that brought it all together. Since I started working with arrays I had forgotten how pointers are normally declared. Forgetting that led to confusion that carried over when I was dealing with more complex pointers and arrays. Now it all makes sense. A pointer to an array has the type “pointer to that whole array type”. A pointer to the first element of the array has the type “pointer to the element type”. These two types are distinctly different, even when both hold the same address. I just didn’t understand that, even if it may seem obvious.

TL;DR

Okay, so here are the highlights of what I did:

  • I continued cleaning up my notes on arrays and pointer return types in C. I organized them into general info -> multidimensional arrays -> array arithmetic -> array pointer types. So far it makes sense to me and is a lot more concise than before. I didn’t really get to do anything else, but this was the one line of code that helped me out: char (*p)[5][7][6] = &arr;. That & really made the difference. I had forgotten that taking the address of a whole array was a thing in C. I rarely see it, since most of the time we just want to access what’s inside the array and not the address of the array itself. Now I think I understand how this fits with the rest of what I know about C. My return type errors should stop now. I get what I was doing wrong.
  • Unfortunately, I didn’t get to work on anything else today. Detailed cleaning takes time LOL.

Notes on C Multidimensional Arrays So Far

Basic Arrays in C

In C, an array is a contiguous block of memory holding one or more elements of the same data type. It can be bound to a variable just like a single value. Single values and arrays share some similarities, but these notes focus on the differences: the syntax and behavior of arrays, especially in relation to return values. Here is an example program that constructs an array of integers and then prints the memory address and value of each element.

#include <stdio.h>

int main(void) {
  int x[4] = {20, 55, 80, 105};

  // %p expects a void *, so cast each address before printing
  for(int i = 0; i < 4; ++i)
    printf("&x[%d] = %p\tx[%i] = %i\n", i, (void *)(x+i), i, x[i]);

  printf("Address of array x: %p\n", (void *)x);
  return 0;
}
# Output with increment by i

&x[0] = 0061FF1C        x[0] = 20
&x[1] = 0061FF20        x[1] = 55
&x[2] = 0061FF24        x[2] = 80
&x[3] = 0061FF28        x[3] = 105
Address of array x: 0061FF1C

Consecutive elements of x are 4 bytes apart because the size of int is 4 bytes (on our compiler). Notice that &x[0] and x print the same address. That is because, in this expression, the array name x is converted to a pointer to its first element. From the above example, it is clear that &x[0] is equivalent to x, and x[0] is equivalent to *x.

Similarly,

  • &x[1] is equivalent to x+1 and x[1] is equivalent to *(x+1).
  • &x[2] is equivalent to x+2 and x[2] is equivalent to *(x+2).

Basically, &x[i] is equivalent to x+i and x[i] is equivalent to *(x+i). Remember, x used in an expression like this yields a pointer to the start of the array.

So here is something interesting: I thought I needed to multiply i by the size of each element in the array. In the example above each element is 4 bytes, so I expected 4 * i to be the increment needed to reach the address of each element. That’s not what happened. When I incremented by i alone I got the addresses shown above, 4 bytes apart. When I incremented by (i*4) instead, I got the same values but addresses 16 bytes apart.

# Output with increment by i x 4

&x[0] = 0061FF1C        x[0] = 20
&x[1] = 0061FF2C        x[1] = 55
&x[2] = 0061FF3C        x[2] = 80
&x[3] = 0061FF4C        x[3] = 105
Address of array x: 0061FF1C

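At first I was not sure why the addresses came out this way. The reason is that pointer arithmetic is already scaled by the size of the pointed-to type: x + i advances i elements, which is i * sizeof(int) bytes. Writing x + i*4 therefore advances 16 * i bytes, which is why the addresses above are 0x10 apart. The values look unchanged only because the printf still reads x[i] rather than *(x + i*4). Those larger addresses fall outside the array (only x + 4, one past the end, is even valid to compute), so dereferencing them would be undefined behavior.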

In most contexts, an array name decays to a pointer to its first element. The main exceptions are when it is the operand of sizeof or of the unary & operator. That decay is the reason you can use pointer syntax to access array elements. However, you should remember that pointers and arrays are not the same.

Pointers Review

— Fill in with some content on pointers from somewhere else in my notes —

Pointers and Multidimensional Arrays

To start, we must be aware that multidimensional arrays change the relationship between pointers and the array variable itself. With one-dimensional arrays there was an apparent duality: in most expressions the array variable and a pointer to its first element were interchangeable, because the array name decays to a pointer to the first element.

The difficulty in working with multidimensional arrays is that they don’t behave the way we might assume. A multidimensional array is really an array of arrays laid out in one contiguous block, so each level of indexing has its own type and its own step size. It’s not completely crazy but it takes some getting used to, and it is one area that confuses many students who are learning C.

There are two key components to understanding how to work with multidimensional arrays in C:

  1. Multidimensional Array Pointer Arithmetic
  2. Multidimensional Array Data Types

Both of these topics will be covered below with a few examples. Hopefully, by the end you will have a better understanding of how to work with multidimensional arrays.

Multidimensional Pointer Arithmetic

In C/C++, arrays and pointers can be used with similar syntax, but they carry different type information.

As an example, given a 3D array:

int buffer[5][7][6];

An element at location [2][1][2] can be accessed as buffer[2][1][2] or *( *( *(buffer + 2) + 1) + 2).

Observe the following declaration:

T *p; // p is a pointer to an object of type T

When a pointer p points to an object of type T, the expression *p is of type T. For example, buffer is an array of 5 two-dimensional arrays, and in an expression it decays to a pointer to its first element. The type of the expression *buffer is therefore “array of arrays” (i.e. a two-dimensional array, int [7][6]).

Based on the above concept translating the expression *( *( *(buffer + 2) + 1) + 2) step-by-step makes it more clear:

  1. buffer – An array of 5 two dimensional arrays, i.e. its type is “array of 5 two dimensional arrays” (int [5][7][6]). In an expression it decays to a pointer to its first element, int (*)[7][6].
  2. buffer + 2 – displacement to the 3rd element in the array of 5 two dimensional arrays.
  3. *(buffer + 2) – dereferencing, i.e. its type is now a two dimensional array (int [7][6]), which in turn decays to int (*)[6].
  4. *(buffer + 2) + 1 – displacement to the 2nd element in the array of 7 one dimensional arrays.
  5. *( *(buffer + 2) + 1) – dereferencing (accessing), now the type of the expression is an array of 6 integers (int [6]), which decays to int *.
  6. *( *(buffer + 2) + 1) + 2 – displacement to the element at the 3rd position in the one dimensional array of integers.
  7. *( *( *(buffer + 2) + 1) + 2) – accessing the element at the 3rd position (the overall expression type is int now).

The compiler calculates an “offset” to access an array element. The offset is calculated from the dimensions of the array. In the above case, offset = 2 * (7 * 6) + 1 * (6) + 2 = 92. Both (7 * 6) and (6) come from the dimensions; note that the first (outermost) dimension (5) is not used in the offset calculation. The compiler knows the dimensions of the array at compile time. Using the offset we can access the element as shown below,

element_data = *( (int *)buffer + offset );

It is not always possible to declare the dimensions of an array at compile time. Sometimes we need to interpret a buffer as a multidimensional array object. For instance, when we are processing a 3D image whose dimensions are determined at run-time, the usual array subscript rules can’t be applied to a plain pointer, because the dimensions are not fixed at compile time. Consider the following example:

int *base;

Where base is pointing to a large image buffer that represents a 3D image of dimensions l x b x h, where l, b and h are variables. If we want to access an element at location (2, 3, 4) we need to calculate the offset of the element as

offset = 2 * (b * h) + 3 * (h) + 4, and the element is located at base + offset.

Generalizing further, given the start address (say base) of an array with dimensions [l x b x h], we can access the element at an arbitrary location (i, j, k) in the following way,

data = *(base + i * (b * h) + j * (h) + k); // Note that we haven’t used the first dimension l.

The same concept can be applied to any number of dimensions. We don’t need the first (outermost) dimension to calculate the offset of any element in a multidimensional array. That is the reason we can omit the first dimension when we pass multidimensional arrays to functions. The first dimension is needed only to know how many elements there are along it, for example as a loop limit when iterating.

In summary, we can produce a pointer to a whole multidimensional array like int buffer[5][7][6] by declaring a pointer whose pointed-to type T is the full array type and initializing it with the address of the array. Here it would be int (*pointer)[5][7][6] = &buffer;.

However, when buffer is used on its own in an expression, it decays to a pointer to its first element, which is the first 2-D array in buffer (int (*pointer)[7][6] = buffer).

// A C/C++ puzzle, predict the output of following program

#include <stdio.h>

int main(void) {
   char arr[5][7][6];
   char (*p)[5][7][6] = &arr;

   /* Hint: &arr is of type pointer to an array of
      5 two dimensional arrays of size [7][6] */

   /* Pointer differences have type ptrdiff_t, so print them with %td.
      Byte distances are taken through char * rather than by casting
      pointers to unsigned, which can truncate on 64-bit systems. */
   printf("%td\n", (&arr + 1) - &arr);
   printf("%td\n", (char *)(&arr + 1) - (char *)&arr);
   printf("%td\n", (char *)(arr + 1) - (char *)arr);
   printf("%td\n", (char *)(p + 1) - (char *)p);

   return 0;
}
Output:

1
210
42
210

Conclusion

That’s all for today. This is my sixth round of the “#100daysofcode” challenge. I will be continuing my work from round five into round six. I am currently working through the book “Cracking the Coding Interview” by Gayle Laakmann McDowell. My goal is to become more familiar with algorithms and data structures. This goal was derived from my goal to better understand operating systems and key programs that I use in the terminal regularly, e.g. Git. That goal was in turn derived from my desire to better understand the fundamental tools used for coding outside of popular GUIs. This in turn was derived from my desire to be a better back-end developer.

I have no idea if my path is correct but I am walking down this road anyways. Worst case scenario I learn a whole bunch of stuff that will help me out on my own personal projects.