Blog 2023 08 09 C++23: multidimensional operator[]

C++23: multidimensional operator[]

Two weeks ago we talked about how C++23 allows the call and subscript operators to be static. This time, we stay in the land of operators and continue discussing the subscript operator. We are going to see how it becomes multidimensional thanks to the authors of P2128R6.

Motivations

If you ask C++ developers about how they access the elements of multidimensional arrays, it’s likely that you’ll get several different answers depending on their experience.

If you ask someone who is not so experienced or works in a non-mathematical domain, there is a fair chance that the answer will be that you should use multiple subscript operators in a row: myMatrix[x][y].

There are a couple of problems with this approach:

  • it implies that myMatrix[x] is a valid expression
  • it can hurt performance when the compiler fails to inline the intermediate calls
  • myMatrix[x] might even perform a deep copy!
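To make the first point concrete, here is a minimal sketch of what a type has to do to support chained subscripts. ChainedMatrix and RowProxy are hypothetical names invented for this illustration, and the row-major storage is an assumption. The point is only that the first subscript has to return some intermediate object, which makes myMatrix[x] a valid expression on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical matrix that supports m[row][col] through a proxy object.
class ChainedMatrix {
public:
    // What m[row] evaluates to: a lightweight view of one row.
    class RowProxy {
    public:
        RowProxy(int* row_start, std::size_t cols)
            : row_start_(row_start), cols_(cols) {}

        // Second subscript: picks the element inside the row.
        int& operator[](std::size_t col) const {
            assert(col < cols_);
            return row_start_[col];
        }

    private:
        int* row_start_;
        std::size_t cols_;
    };

    ChainedMatrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0) {}

    // First subscript: it cannot return an element yet, so it has to
    // return something else that is itself subscriptable.
    RowProxy operator[](std::size_t row) {
        assert(row < rows_);
        return RowProxy(data_.data() + row * cols_, cols_);
    }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<int> data_;  // row-major storage (an assumption of this sketch)
};
```

With this sketch, m[1][2] = 7 works, but so does auto row = m[1], whether or not the author wanted to expose rows. A design that returned a copy of the row instead of a view would be where the deep copy comes from.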

If you ask someone who is familiar with scientific libraries, the answer to how to access a multi-dimensional array will likely be to use the call operator myMatrix(x, y).

While it can solve the problems mentioned for chained subscript operators, it has other issues.

  • it’s far from intuitive to read myMatrix(x, y) = 42. Are we assigning a value to a function call or what?
  • it’s not consistent with the one-dimensional access of myMatrix[x]
  • it might be difficult to distinguish invocables from multi-dimensional arrays, even for the compiler
  • even library authors often consider this a workaround

There is yet another method. You might pass a tuple to the subscript operator: myMatrix[{x, y}]. Well, this is not very intuitive either and is still inconsistent with one-dimensional access.
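As a rough illustration of the braced-index approach, the sketch below uses a single-parameter operator[] that takes a std::pair. PairIndexedMatrix is a made-up name, and std::pair is just one possible choice here; a std::tuple or std::array parameter would follow the same idea.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical matrix indexed as m[{row, col}].
class PairIndexedMatrix {
public:
    using Index = std::pair<std::size_t, std::size_t>;

    PairIndexedMatrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0) {}

    // Still a one-argument operator[]: the braces at the call site
    // build the Index object that is passed in.
    int& operator[](Index index) {
        assert(index.first < rows_ && index.second < cols_);
        return data_[index.first * cols_ + index.second];
    }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<int> data_;  // row-major storage (an assumption of this sketch)
};
```

A call then looks like m[{1, 2}] = 5. The extra braces are exactly the noise the text complains about.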

Using the subscript operator with several parameters (myMatrix[x, y]) would solve these issues.

What is changing?

C++20 deprecated uses of the comma operator in subscripting expressions. Let’s see what changed through an example:

std::vector<int> numbers = {1, 2, 3, 4, 5};

// The output of this in C++17:
// warning: left operand of comma operator has no effect [-Wunused-value]
// 2
std::cout << numbers[0, 1] << '\n';


// The output of this in C++20:  
// warning: top-level comma expression in array subscript is deprecated [-Wcomma-subscript]
// warning: left operand of comma operator has no effect [-Wunused-value]
// 2
std::cout << numbers[0, 1] << '\n';

In the above example, the built-in comma operator is applied inside the brackets, so the first operand is evaluated and discarded, and numbers[0, 1] means numbers[1]. The compiler already warns about the unused operand, and in C++20 it adds a second warning about the deprecation.

As this deprecation is quite new, the changes presented by P2128R6 only apply to new standard library types and to user-defined types. Starting from C++23, operator[] can accept zero or more arguments, including variadic arguments.
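For a user-defined type, this could look like the sketch below. Matrix and store_and_load are hypothetical names invented for this example, and the row-major layout is an assumption. The two-parameter operator[] is guarded by the __cpp_multidimensional_subscript feature-test macro so that the snippet still builds in older language modes, where it falls back to the call operator.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical row-major matrix type, for illustration only.
class Matrix {
public:
    Matrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0) {}

#if defined(__cpp_multidimensional_subscript)
    // C++23: operator[] may declare more than one parameter.
    int& operator[](std::size_t row, std::size_t col) {
        return data_[index(row, col)];
    }
#endif

    // Pre-C++23 spelling, kept so the sketch also builds without C++23.
    int& operator()(std::size_t row, std::size_t col) {
        return data_[index(row, col)];
    }

private:
    // Maps a (row, col) pair to an offset in the flat storage.
    std::size_t index(std::size_t row, std::size_t col) const {
        assert(row < rows_ && col < cols_);
        return row * cols_ + col;
    }

    std::size_t rows_;
    std::size_t cols_;
    std::vector<int> data_;
};

// Writes a value and reads it back, using m[row, col] when the compiler
// supports the multidimensional subscript and m(row, col) otherwise.
inline int store_and_load(Matrix& m, std::size_t row, std::size_t col, int value) {
#if defined(__cpp_multidimensional_subscript)
    m[row, col] = value;
    return m[row, col];
#else
    m(row, col) = value;
    return m(row, col);
#endif
}
```

In C++23 mode, m[1, 2] = 42 calls the two-parameter operator[] directly. No proxy object or extra braces are involved.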

C-arrays, vectors, arrays, and other already existing containers do not benefit from this change. At least, not yet:

std::vector<std::vector<int>> numbers = {
  {1, 2, 3, 4, 5},
  {11, 12, 13, 14, 15}
};

std::cout << numbers[0, 1] << '\n';
/*
<source>:10:25: warning: top-level comma expression in array subscript changed meaning in C++23 [-Wcomma-subscript]
   10 |     std::cout << numbers[0, 1] << '\n';
<source>:10:15: error: no match for 'operator<<' (operand types are 'std::ostream' {aka 'std::basic_ostream<char>'} and '__gnu_cxx::__alloc_traits<std::allocator<std::vector<int> >, std::vector<int> >::value_type' {aka 'std::vector<int>'})
   10 |     std::cout << numbers[0, 1] << '\n';
*/

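This example fails because std::vector has no two-argument operator[]. As the warning shows, the compiler falls back to the old comma-operator meaning, so numbers[0, 1] still accesses numbers[1], a std::vector<int> that cannot be streamed, instead of numbers[0][1].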

On the other hand, new types such as mdspan in C++23 or mdarray most probably in C++26 will get this feature automatically.

For existing types, the new meaning might get adopted starting from C++26.

For the time being, it’s not so easy to showcase this feature. One solution is to use the mdspan library from Kokkos (Godbolt):

#include <vector>
// Including a header by URL only works on Compiler Explorer (Godbolt)
#include <https://raw.githubusercontent.com/kokkos/mdspan/single-header/mdspan.hpp>
#include <iostream>

int main()
{
  std::vector v = {1,2,3,4,5,6,7,8,9,10,11,12};

  // View data as contiguous memory representing 2 rows of 6 ints each
  auto multispan = std::experimental::mdspan(v.data(), 2, 6);
  std::cout << multispan[1, 1] << '\n';
}

/*
8
*/

Conclusion

Today we discussed the different options we currently have for accessing the items of multi-dimensional arrays. As we saw, chained subscript operators, the function call operator, and passing a tuple to the subscript operator all have their own shortcomings.

Starting from C++23, at least for new types, the subscript operator can take several comma-separated arguments thanks to P2128R6.


This post is licensed under CC BY 4.0 by the author.