User manual

This user manual covers compiling OpenBLAS itself, linking your code to OpenBLAS, example code that uses the C (CBLAS) and Fortran (BLAS) APIs, and some troubleshooting tips. Compiling OpenBLAS is optional, since you may be able to install it with a package manager.

Note

The OpenBLAS documentation does not contain API reference documentation for BLAS or LAPACK, since these are standardized APIs, the documentation for which can be found in other places. If you want to understand every BLAS and LAPACK function and definition, we recommend reading the Netlib BLAS and Netlib LAPACK documentation.

OpenBLAS does contain a limited number of non-standard functions; these are documented at OpenBLAS extension functions.

Compiling OpenBLAS

Normal compile

The default way to build and install OpenBLAS from source is with Make:

make  # add `-j4` to compile in parallel with 4 processes
make install

By default, the CPU architecture is detected automatically when invoking make, and the build is optimized for the detected CPU. To override the autodetection, use the TARGET flag:

# `make TARGET=xxx` sets target CPU: e.g. for an Intel Nehalem CPU:
make TARGET=NEHALEM

The full list of known target CPU architectures can be found in TargetList.txt in the root of the repository.

Cross compile

For a basic cross-compilation with Make, three steps need to be taken:

  • Set the CC and FC environment variables to select the cross toolchains for C and Fortran.
  • Set the HOSTCC environment variable to select the host C compiler (i.e. the regular C compiler for the machine on which you are invoking the build).
  • Set TARGET explicitly to the CPU architecture on which the produced OpenBLAS binaries will be used.

Cross-compilation examples

Compile the library for ARM Cortex-A9 Linux on an x86-64 machine (note: install only the gnueabihf versions of the cross toolchain; see this issue comment for why):

make CC=arm-linux-gnueabihf-gcc FC=arm-linux-gnueabihf-gfortran HOSTCC=gcc TARGET=CORTEXA9

Compile OpenBLAS for a Loongson 3A CPU on an x86-64 machine:

make BINARY=64 CC=mips64el-unknown-linux-gnu-gcc FC=mips64el-unknown-linux-gnu-gfortran HOSTCC=gcc TARGET=LOONGSON3A

Compile OpenBLAS for a Loongson 3A CPU with the loongcc (based on Open64) compiler on an x86-64 machine:

make CC=loongcc FC=loongf95 HOSTCC=gcc TARGET=LOONGSON3A CROSS=1 CROSS_SUFFIX=mips64el-st-linux-gnu- NO_LAPACKE=1 NO_SHARED=1 BINARY=32

Building a debug version

Add DEBUG=1 to your build command, e.g.:

make DEBUG=1

Install to a specific directory

Note

Installing to a directory is optional; it is also possible to use the shared or static libraries directly from the build directory.

Use make install with the PREFIX flag to install to a specific directory:

make install PREFIX=/path/to/installation/directory

The default directory is /opt/OpenBLAS.

Important

Any flags passed to make during the build should also be passed to make install to avoid install errors, e.g. some headers not being copied over correctly.

For more detailed information on building/installing from source, please read the Installation Guide.

Linking to OpenBLAS

OpenBLAS can be used as a shared or a static library.

The shared library is normally called libopenblas.so, but note that the name may differ as a result of the build flags used or naming choices made by a distro packager (see [distributing.md] for details). To link a shared library named libopenblas.so, the flag -lopenblas is needed. To find the OpenBLAS headers, a -I/path/to/includedir flag is needed. Unless the library is installed in a directory that the linker searches by default, -L and -Wl,-rpath flags are needed as well. For a source file test.c (e.g., the example code under Call CBLAS interface further down), the shared library can then be linked with:

gcc -o test test.c -I/your_path/OpenBLAS/include/ -L/your_path/OpenBLAS/lib -Wl,-rpath,/your_path/OpenBLAS/lib -lopenblas

The -Wl,-rpath,/your_path/OpenBLAS/lib linker flag can be omitted if you ran ldconfig to update the linker cache, put /your_path/OpenBLAS/lib in /etc/ld.so.conf or a file in /etc/ld.so.conf.d, or installed OpenBLAS in a location that is part of the ld.so default search path (usually /lib, /usr/lib and /usr/local/lib). Alternatively, you can set the environment variable LD_LIBRARY_PATH to point to the folder that contains libopenblas.so. Otherwise, the build may succeed, but loading the library at runtime will fail with a message like:

cannot open shared object file: no such file or directory

More flags may be needed, depending on how OpenBLAS was built:

  • If libopenblas is multi-threaded, please add -lpthread.
  • If the library contains LAPACK functions (which is usually the case), please add -lgfortran (other Fortran libraries may also be needed, e.g. -lquadmath). Note that if you only make calls to LAPACKE routines, i.e. your code has #include "lapacke.h" and calls functions like LAPACKE_dgeqrf, then -lgfortran is not needed.

Tip

Usually a pkg-config file (e.g., openblas.pc) is installed together with a libopenblas shared library. pkg-config is a tool that will tell you the exact flags needed for linking. For example:

$ pkg-config --cflags openblas
-I/usr/local/include
$ pkg-config --libs openblas
-L/usr/local/lib -lopenblas

Linking a static library is simpler: add the path to the static OpenBLAS library to the compile command:

gcc -o test test.c /your/path/libopenblas.a

Code examples

Call CBLAS interface

This example shows calling cblas_dgemm in C:

#include <cblas.h>
#include <stdio.h>

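int main(void)
{
  int i = 0;
  /* A and B are 3x2 matrices stored in column-major order (leading dimension 3) */
  double A[6] = {1.0, 2.0, 1.0, -3.0, 4.0, -1.0};
  double B[6] = {1.0, 2.0, 1.0, -3.0, 4.0, -1.0};
  /* C is a 3x3 matrix, also column-major */
  double C[9] = {.5, .5, .5, .5, .5, .5, .5, .5, .5};
  /* C := 1.0 * A * B^T + 2.0 * C, with m=3, n=3, k=2 */
  cblas_dgemm(CblasColMajor, CblasNoTrans, CblasTrans, 3, 3, 2, 1.0, A, 3, B, 3, 2.0, C, 3);

  for (i = 0; i < 9; i++)
    printf("%lf ", C[i]);
  printf("\n");
  return 0;
}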

To compile this file, save it as test_cblas_dgemm.c and then run:

gcc -o test_cblas_open test_cblas_dgemm.c -I/your_path/OpenBLAS/include/ -L/your_path/OpenBLAS/lib -lopenblas -lpthread -lgfortran

This produces a test_cblas_open executable.
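
To check the result of the example by hand, it can help to compare against a plain-C loop that computes the same operation, C := alpha * A * B^T + beta * C, in column-major storage. The following is a minimal sketch for illustration only; ref_dgemm_nt is a hypothetical helper name, not an OpenBLAS function, and it covers only the NoTrans/Trans case used above.

```c
#include <stddef.h>

/* Hypothetical reference helper (not part of OpenBLAS):
 * C := alpha * A * B^T + beta * C, all matrices column-major.
 * A is m-by-k (leading dimension lda), B is n-by-k (ldb), C is m-by-n (ldc). */
void ref_dgemm_nt(int m, int n, int k, double alpha,
                  const double *a, int lda,
                  const double *b, int ldb,
                  double beta, double *c, int ldc)
{
  for (int j = 0; j < n; j++) {
    for (int i = 0; i < m; i++) {
      double sum = 0.0;
      for (int p = 0; p < k; p++) {
        /* A(i,p) is a[i + p*lda]; B(j,p) is b[j + p*ldb] */
        sum += a[i + (size_t)p * lda] * b[j + (size_t)p * ldb];
      }
      c[i + (size_t)j * ldc] = alpha * sum + beta * c[i + (size_t)j * ldc];
    }
  }
}
```

Called with the same arguments as the cblas_dgemm example (m=3, n=3, k=2, alpha=1, beta=2, all leading dimensions 3), this yields the values 11, -9, 5, -9, 21, -1, 5, -1, 3 in C, which is what the OpenBLAS-linked program should print.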

Call BLAS Fortran interface

This example shows calling the dgemm Fortran interface in C:

#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>  /* gettimeofday */
#include <time.h>      /* time */

extern void dgemm_(char*, char*, int*, int*,int*, double*, double*, int*, double*, int*, double*, double*, int*);

int main(int argc, char* argv[])
{
  int i;
  printf("test!\n");
  if(argc<4){
    printf("Input Error: expected three arguments <m> <n> <k>\n");
    return 1;
  }

  int m = atoi(argv[1]);
  int n = atoi(argv[2]);
  int k = atoi(argv[3]);
  int sizeofa = m * k;
  int sizeofb = k * n;
  int sizeofc = m * n;
  char ta = 'N';
  char tb = 'N';
  double alpha = 1.2;
  double beta = 0.001;

  struct timeval start,finish;
  double duration;

  double* A = (double*)malloc(sizeof(double) * sizeofa);
  double* B = (double*)malloc(sizeof(double) * sizeofb);
  double* C = (double*)malloc(sizeof(double) * sizeofc);
  if (A == NULL || B == NULL || C == NULL) {
    printf("Memory allocation failed\n");
    return 1;
  }

  srand((unsigned)time(NULL));

  for (i=0; i<sizeofa; i++)
    A[i] = i%3+1;//(rand()%100)/10.0;

  for (i=0; i<sizeofb; i++)
    B[i] = i%3+1;//(rand()%100)/10.0;

  for (i=0; i<sizeofc; i++)
    C[i] = i%3+1;//(rand()%100)/10.0;
  printf("m=%d,n=%d,k=%d,alpha=%lf,beta=%lf,sizeofc=%d\n",m,n,k,alpha,beta,sizeofc);
  gettimeofday(&start, NULL);
  dgemm_(&ta, &tb, &m, &n, &k, &alpha, A, &m, B, &k, &beta, C, &m);
  gettimeofday(&finish, NULL);

  duration = ((double)(finish.tv_sec-start.tv_sec)*1000000 + (double)(finish.tv_usec-start.tv_usec)) / 1000000;
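  /* A matrix multiply performs about 2*m*n*k floating-point operations;
     scaling by 1e-6 gives MFLOPS */
  double mflops = 2.0 * m * n * k;
  mflops = mflops / duration * 1.0e-6;

  /* Append the timing result to timeDGEMM.txt */
  FILE *fp = fopen("timeDGEMM.txt", "a");
  if (fp != NULL) {
    fprintf(fp, "%dx%dx%d\t%lf s\t%lf MFLOPS\n", m, n, k, duration, mflops);
    fclose(fp);
  }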

  free(A);
  free(B);
  free(C);
  return 0;
}

To compile this file, save it as time_dgemm.c and then run:

gcc -o time_dgemm time_dgemm.c /your/path/libopenblas.a -lpthread

You can then run it as ./time_dgemm <m> <n> <k>, where m, n, and k are the matrix dimensions passed to dgemm. The timing result is appended to the file timeDGEMM.txt in the current directory.
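
The rate reported by this program follows from the usual operation count for a matrix multiply: roughly 2*m*n*k floating-point operations (one multiply and one add per inner-product term). The sketch below isolates that arithmetic in a standalone helper. The name gemm_mflops is hypothetical and used only for illustration, and the zero-duration guard is an added assumption, not something the example above does.

```c
/* Hypothetical helper: MFLOPS achieved by an m x n x k matrix multiply
 * that took `seconds` of wall-clock time. Returns 0.0 for a non-positive
 * duration, which can occur when the timer resolution is too coarse for
 * very small problem sizes. */
double gemm_mflops(int m, int n, int k, double seconds)
{
  if (seconds <= 0.0)
    return 0.0;
  /* Convert to double before multiplying so large sizes cannot overflow int. */
  double flops = 2.0 * (double)m * (double)n * (double)k;
  return flops / seconds * 1.0e-6;
}
```

For example, a 1000x1000x1000 multiply that takes 2 seconds corresponds to 1000 MFLOPS (1 GFLOPS).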

Note

When calling the Fortran interface from C, you have to deal with symbol name differences caused by compiler conventions. That is why the dgemm_ function call in the example above has a trailing underscore. This is the convention used by gcc/gfortran; other compilers may use a different one, so portable code requires extra support code. The CBLAS interface may be more portable when writing C code.

When writing code that needs to be portable and work across different platforms and compilers, the above code example is not recommended. Instead, we advise looking at how OpenBLAS (or BLAS in general, since this problem isn't specific to OpenBLAS) functions are called in widely used projects like Julia, SciPy, or R.
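
As a rough illustration of what such support code can look like, the sketch below hides the trailing underscore behind a macro so that the suffix can be changed in one place. It is a minimal sketch, not something OpenBLAS provides: BLAS_SYMBOL_SUFFIX, BLAS_FUNC, and blas_dgemm_symbol_name are hypothetical names, and the default suffix assumes the gcc/gfortran convention described above. It only handles the suffix; other differences between compilers are out of its scope.

```c
/* Hypothetical support macros; not part of OpenBLAS. */
#ifndef BLAS_SYMBOL_SUFFIX
#define BLAS_SYMBOL_SUFFIX _  /* assumption: gcc/gfortran trailing underscore */
#endif

/* Two-level macros so BLAS_SYMBOL_SUFFIX is expanded before pasting. */
#define BLAS_CAT_(a, b) a##b
#define BLAS_CAT(a, b) BLAS_CAT_(a, b)
#define BLAS_FUNC(name) BLAS_CAT(name, BLAS_SYMBOL_SUFFIX)

#define BLAS_STR_(x) #x
#define BLAS_STR(x) BLAS_STR_(x)

/* With the default suffix this declares dgemm_, as in the example above. */
extern void BLAS_FUNC(dgemm)(char*, char*, int*, int*, int*, double*,
                             double*, int*, double*, int*, double*,
                             double*, int*);

/* Returns the symbol name the macro produces; handy for checking a
 * build configuration without linking against any BLAS library. */
const char *blas_dgemm_symbol_name(void)
{
  return BLAS_STR(BLAS_FUNC(dgemm));
}
```

A call site would then read BLAS_FUNC(dgemm)(&ta, &tb, &m, &n, &k, &alpha, A, &m, B, &k, &beta, C, &m) instead of naming dgemm_ directly, and a build for a toolchain with a different convention could override BLAS_SYMBOL_SUFFIX on the compiler command line.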

Troubleshooting

  • Please read the FAQ first; your problem may be described there.
  • Please ensure you are using a recent enough compiler that supports the features your CPU provides (for example, GCC versions before 4.6 are known not to support AVX kernels, and versions before 6.1 do not support AVX512CD kernels).
  • The number of CPU cores supported by default is <=256. On Linux x86-64, there is experimental support for up to 1024 cores and 128 NUMA nodes if you build the library with BIGNUMA=1.
  • OpenBLAS does not set processor affinity by default. On Linux, you can enable processor affinity by commenting out the line NO_AFFINITY=1 in Makefile.rule.
  • On Loongson 3A, make test is known to fail with a pthread_create error and an EAGAIN error code. However, the same test case succeeds when you run it directly in a shell.