Expresso Documentation

2.0.2

This document describes the C++ array class template package Expresso which provides efficient and flexible arrays for numerical computations with arbitrary array element types.

The library is available for download under the LGPL copyleft terms (see COPYING) from the Download page Printable versions of this document (pdf and ps) are available on the above download page.

Key features

Easy As simple and straightforward to use as the c++ language syntax allows.
Safe memory handling is completely automatic using reference counting. No memory leaks.
Efficient by using expression template techniques there is no need for temporary objects in expressions. Fast elementwise access with parenthesis operator.
Flexible: The rank and the element type is arbitrary (template arguments). The size and index ranges of an array can be reallocated at runtime. Dimensions may be permuted and reversed, much like in matlab. Arrays may refer to subsets of other arrays.
Fortran compatible Pointers can be passed to fortran routines where they are declared as fortran arrays and indexed as usual. The indexing convention is Fortran (innermost rank first).
Extendible The expert may define new array specializations with new ways of storing data, for example banded, sparse ... If certain restrictions are fulfilled, these array types may be used with any other array types or scalars to form expressions using the template operators.

Platforms

Everything is strict ANSI standard C++. Most compilers now implement all the features of ANSI C++ that are needed by expresso.

The following compilers have been found to work fine with expresso. Newer versions of the same compiler is very likely to work just as well.

gcc 3.0
intel icc 7.0
Digital Dec cxx 6.0
Portland group C++ 1.7
KAI C++ 3.2f
MIPSpro Compilers: Version 7.41

The following old compiler versions have been found to fail compiling expresso:

gcc 2.8.1
Sun C++ 4.2
Microsoft Visual C++ 6
MIPSpro Compilers: Version 7.20

Quick start

Use the expresso array template as follows

Unpack the package where you want it

> cd /usr/local/include
/usr/local/include> gunzip < expresso.tgz | tar xf -

Edit your sourcefile in the working directory for example as follows:

#include "expresso.h"
using namespace esso;

main() {
  Array<3,float> x(10,20,30);

  // You have now a fortran compatible 3-dimensional array of floats
  // with index range (0:9,0:19,0:29)

  x<<=0;            // Efficient elementwise copy from scalar
  x(2,5,14)=1;      // Fortran-like element access
  x<<=25*Sqrt(x)-x; // Efficient arithmetic expressions
}

Compile your program with an ANSI-C++ compiler. The performance depends on the compilers ability to optimize the code, expresso gives the compiler maximal opportunities to do a good job. The free compiler gcc will give fast code with the following options:
```
> gcc -I/usr/local/include/expresso -O3 testc.cc
```

No special libraries need to be linked in.

This section contains a more thourough introduction to expresso. The most important features are presented. After reading this short text, you should be able to use expresso for most tasks, however, first read the exremely short "Quick start" section.

The Array template

An esso::Array object is declared as follows:

  Array<3,float> a(10,20,30);

where '3' is the rank (i.e. the number of dimensions), 'float' is the element type and the parameters 10,20,30 are index ranges. The index range of this array is (0:9, 0:19, 0:29). The corresponding fortran declaration would be

  real a(0:9,0:19,0:29);

There is a second flavor of Array which may give more efficient code but is somewhat less flexible.

  Array<3,Fix<float> > a(10,20,30);

The difference is a choice between speed and flexibility: the standard Array is more flexible and Fix is faster.

The most useful constructors for Arrays follow (other exist), see esso::Array class documentation):

  Ix<3> ix(10,20,30);

  // Range as defined in 'ix'
  Array<3,float> a(ix); 

  // An array with start index is 1 instead of 0
  Array<3,float> b(ix,Ix<3>(1));

  // Copy constructor, this creates a reference to the same elements
  Array<3,float> c(b);

The class template Ix is a small vector of integers whose length is known at compile time, see esso::Ix.

Element access

A simple, powerful and fast way of accessing an element is using operator (). This works just like in fortran:

  Array<3,Fix<float> > x(10,10,10);

  for(int k=x.L(2); k<x.U(2); k++)
    for(int j=x.L(1); j<x.U(1); j++)
      for(int i=x.L(0); i<x.U(0); i++)
        x(i,j,k)=x(i,j,k)*x(i,j,k)-1;

This produces optimal code in many cases, because the index operator is inlined and carefully designed for optimal performance. (The Fix Array specialization is recommended for best performance). However, in general array expression techniques (see the section on Arithmetic operations) give equal or better performance and in addition more safety because the programmer need not be concerned with index bounds.

Note the usage of member functions L and U. These functions return lower and upper index bounds for the Array.

Just as in fortran, the innermost loop should be over the first index for best performance, this is because the elements are by default stored in memory in the same way as fortran (unless you have permuted the ranks in your code). Note that the storage order of ranks is thus by default reversed compared to built-in C and C++ arrays.

The Ix template

The esso::Ix template is used to specify an small vector. An Ix object can be declared as follows:

  Ix<3,T> i;

Replace T by the desired element type (default is a signed integer type) The index elements are accessed with operator []. The obvious constructors and operators are defined:

  Ix<3> i(10,15,20);   // All elements are initialized
  Ix<3> j(10);         // 10 is used to initialize all elements
  Ix<3> k;             // Uninitialized
  Ix<3> l(i);          // Elements of Ix object i are copied

  j[1]=25;             // Element access
  j=25;                // Assign all elements with 25
  j=i;                 // Copy elements from Ix object
  cout << j[0] <<' '<< j[1] <<' '<< j[2] <<endl;

The output of the above code is "10 25 20".

Index objects can (of course) be used to access elements in an array:

  Ix<2> i(2,2);

  Array<2,float> x(i);   // Allocate array using length given by i

  i = Ix<2>(1,0);
  a[i] = 3.15;      // Access an element using an Ix object as index
  cout << a;

Note that the stream operator is defined for Array objects.

It is recommended that the ()-operator are used for indexing an Array rather than Ix objects when performance is important, because most of todays compilers are better at optimizing such code. Ix objects are still very commonly used though, for example as arguments to logical operators and to declarations of Arrays.

Another possibility is to use Ix objects as element types for Array objects. The following code snippet shows some of the flexibility by such technique:

  Array<2,Ix<3,float> > a(10,20);
  a(2,4)[1] = 0;          // Access a single float element
  a(2,4) = 1;             // Access a Ix object and assign with a scalar
  a <<= Ix<3,float>(1,2,3); // Assign all Ix elements with an Ix object
  a <<= 3.14;             // Assign all float elements with a scalar

Reference counting

One or more Array objects can refer to the same memory block, or different parts of the same memory block. This is all handled automatically, so the programmer need seldomly be concerned with the memory allocation.

How is this done?

When an array is created memory is allocated automatically to fit it's declared index range (unless declared with the default constructor, in which case no memory is allocated). When another array is assigned from the first, a counter in the memory block object is increased, so that at all times the memory block knows how many Array objects are referring to it. When an Array object stops referring to a certain memory block the counter is decreased one step. When the counter reaches zero, the memory block is deallocated automatically.

As an example, it is safe and reasonably efficient to return a locally declared Array object from a function, because the memory will continue to exist until the last Array object referring to it is destroyed or assigned some other memory block.

  Array<3,float> f(float q) {
    Array<3,float> array(10,10,10);
    array<<=q;
    return array;       // Safe and fast!
  }

  main() {
    Array<3,float> x;

    f(2.8);             // The returned memory is safely deleted

    x=f(3.14);          // x now points to the memory allocated in the
                        // call to f()
  }

Other examples where arrays refer to the same memory area:

  Array<3,float> a(10,10,10);

  Array<3,float> b(a);   // b refers to the same memory as a
  Array<2,float> c;
  c = Rmdim(a,2,5);            // c refers to the slice resulting
                               // from fixing dimension 2 to index value 5

  void f(Array<3,float>);

  f(a);                        // The local array object in f
                               // refers to the same memory as a

What is a slice? It is the (N-K)-dimensional array obtained by taking a N-dimensional array and specifying K of it's indexes.

In many senses the Array objects work as a pointer with automatic memory allocation mechanism and an index range.

Assignment operations

If elementwise assignment is wanted, use operator <<=

  Array<3,float> a(10,10,10),b(15,10,5);

  a<<= 3.14;                   // Copy 3.14 to all elements

  b<<=a;                       // Copy elements from Array a to b

If the index ranges differ between the source and destination in an assignment (as in the above example) the loops are only performed over the intersection of the source and destination array index ranges. For more info on assignment and expressions of arrays, see documentation of operator <<=.

The operators +=, -=, *= and /= work the same way as <<= but also performs an arithmetic operation.

To create a reference rather than copying elements, use operator =

  Array<3,float> a(10,10,10),b(15,10,5);

  a<<= 3.14;                   // Copy 3.14 to all elements

  b=a;                         // Do not copy elements, create reference

In the above example, the index range of b is identical to that of a after the assignment.

Logical operations

An array can be manipulated in many more ways at runtime than is possible in fortran 77 or fortran 90. The index range can be shifted or restricted, dimensions may be permuted, and the memory block can be reinitialized with a new memory area. These manipulations (which do not change the actual values of the elements) are refered to as Logical array operations.

  Array<3,float> a(10,10,10);

  a.Shift(0,10);          // The index range now is (10:19,0:9,0:9)
  a.Permute(0,2);         // Ranks 0 and 2 are permuted
  a.Reverse(1);           // The order of elements in dimension 1 is reversed

Many more logical functions exist that takes different arguments, see esso::Array class documentation.

These member functions also exist in a function flavors:

  Array<3,float> a(10,10,10),b,c,d;
  b = Shift(a,0,10);
  c = Permute(a,0,2);
  d = Reverse(a,1);

See esso namespace documentation for listing of more logical functions.

The Permutation class

To improve safety and convenience, a special class is provide to help dealing with permutations of ranks. Using this object you can create elementary permutations easily, permute arrays and Ix objects using any permutation, compose permutations as product of other permutations, invert and transform permutations in different ways. See esso::Permutation for more details.

Arithmetic operations

The usual arithmetic operations can be used to form expressions that can in turn be used as right hand side in an array assignment operation. Any combination of arrays and scalar (element types) can be mixed in expressions. (The arrays must have the same rank of course).

  Array<2,float> a,b,c,d;
  a<<=a-b*c+3.14;

The assignment is done over the intersection of all arrays.

In addition to the four arithmetic operations there are Max, Min, Abs, Sqr, Sqrt, Pow and some other functions defined to be used in arithmetic expressions, see esso namespace documentation for a complete listing.

Of course, the result of logical functions can also be used in arithmetic expressions. (However, arithmetic expressions can in most cases not be used as input to logical functions).

When compiled, the assignment with an arithmetic expression right hand side results in a single loop, i.e. no temporary array objects are created. This means that a good compiler may produce optimal code for such array expression assignments.

Experience with some of todays compilers indicate that best performance is attained by forming expressions that do not contain to many operands. As a rule of thumb, use of ten or more array operands is not recommended. Also, using arithmetic assignment operators (i.e. +=, -= ...) wherever possible is recommended. (i.e. rather than a<<=a+1, use a+=1).

Fixed index range

The Fix type arrays can have one or more of it's innermost index ranges fixed at compile time. The syntax is as follows:

  Array<4,Fix<float[30][20]> > a(10,5); // Index range is (0:29,0:19,0:9,0:4)
  a(0,1,10,20) = 3.14;    // The fixed ranks are indexed just as usual.

All index ranges can not be fixed, at least one rank must be variable (namely the last).

Note that an alternative is to use Ix objects as elements:

  Array<2,Ix<20,Ix<30,float> > > a(10,5); // Index range is (0:29,0:19,0:9,0:4)
  a(0,1)[10][20] = 3.14;

There are advantages and disadvantages with both techniques, choose the way you find most convenient and efficient for your application.

MPI support

For parallel programs using MPI expresso provides functions for creating an MPI datatype corresponding to an expresso array. Both standard C MPI bindings and C++ bindings are supported. See esso::CreateMPIArray and esso::CreateMPIPPArray You must include expresso/mpi.h and define USE_MPI or USE_MPIPP macro to use these functions.

Debugging support

Defining the macro EXPRESSO_DEBUG activates a large number of runtime tests in the expresso code. For instance, all array access operations are checked against the valid index bounds. Almost all arguments to expresso functions are checked at runtime. On error an assertion fails and the program aborts. Together with a standard debugger this will hopefully help in finding your bugs easily.

Beware that the runtime checks activated by the EXPRESSO_DEBUG macro will decrease efficiency drastically. It is only meant for use while developing or debugging.

Also beware that compiling only some files with EXPRESSO_DEBUG may sometimes produce strange results because inlined functions may be defined differently in different object files. It is recommended not to link together files compiled with different EXPRESSO_DEBUG macro settings.

Compilation errors

In the design of expresso emphasis has been made to maximize the compiler's ability to discover typos and other errors at compile time wherever possible. On the other hand it is impossible to control how each compiler reports a compile error. In many cases a syntax error can produce a very long and cryptic compiler error. The reason is the heavy use of recursively defined template classes needed to attain the features and high performance that expresso provides. It is often quite impossible to understand what the problem is by just looking at the compiler output.

The recommended way of dealing with this is to just look at the compiler output to determine which line in your code caused the error, then look in your code at that line to see what the problem could be.

The good part is that (in contrast to fortran and other less type-safe languages) if your code compiles there is a very good chance that it actually does what you expected.

Author:: Oskar Enoksson, 2004