Introduction
- Note
- Please refer to the Python tutorial for a short overview of the NPZ format from a Python point of view.
The NPY / NPZ ("a zip file containing multiple NPY files") file format is a "standard binary file format in NumPy", appropriate for binary serialization of large chunks of data. A description of the NPY format is available here.
The C++ implementation of this binary format relies on the rogersce/cnpy library, available under the MIT license. Additional example code can be found directly from the rogersce/cnpy repository.
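As a rough illustration of how an NPY file begins, the sketch below builds a version 1.0 header by hand using only the C++ standard library. It is not the cnpy implementation: the function name `make_npy_header` is hypothetical, and the element type is fixed to little-endian double (`'<f8'`) to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: builds an NPY v1.0 header for a C-ordered array of
// little-endian doubles ('<f8') with the given shape.
std::string make_npy_header(const std::vector<std::size_t> &shape)
{
  // The header body is a Python dict literal describing the array.
  std::string dict = "{'descr': '<f8', 'fortran_order': False, 'shape': (";
  for (std::size_t i = 0; i < shape.size(); ++i) {
    dict += std::to_string(shape[i]);
    if (shape.size() == 1) {
      dict += ",";  // a one-element Python tuple is written (3,)
    }
    else if (i + 1 < shape.size()) {
      dict += ", ";
    }
  }
  dict += "), }";

  // Magic string (6 bytes) + version (2 bytes) + header length (2 bytes).
  const std::size_t preamble_size = 10;
  // Pad with spaces so that the whole header, including the final newline,
  // has a size that is a multiple of 64 bytes.
  while ((preamble_size + dict.size() + 1) % 64 != 0) {
    dict += ' ';
  }
  dict += '\n';
  assert(dict.size() <= 0xFFFF);  // v1.0 stores the length on 16 bits

  std::string header;
  header.push_back(static_cast<char>(0x93));
  header += "NUMPY";
  header.push_back(static_cast<char>(1));  // major version
  header.push_back(static_cast<char>(0));  // minor version
  const std::uint16_t len = static_cast<std::uint16_t>(dict.size());
  header.push_back(static_cast<char>(len & 0xFF));         // length, low byte first
  header.push_back(static_cast<char>((len >> 8) & 0xFF));  // length, high byte
  header += dict;
  return header;
}
```

The raw array bytes follow this header directly in the file. An NPZ archive is then a zip file in which each entry is one such NPY file.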
Comparison with some other file formats
The NPZ binary format is intended to provide a quick and efficient means to read/save large arrays of data, mostly for debugging purposes. While the first and most direct option for saving data would be to use a text file, the choice of the NPZ format presents the following advantages:
- it is a binary format, that is the resulting file size will be smaller compared to a plain text file (especially with floating-point numbers),
- it provides an exact floating-point representation, so there is no need to worry about floating-point precision (see for instance the std::setprecision or std::hexfloat manipulators),
- it provides some basic compatibility with the NumPy NPZ format (numpy.load and numpy.savez),
- large arrays of data can be easily appended, with support for multi-dimensional arrays.
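The point about exact floating-point representation can be illustrated with a minimal sketch that uses only the standard library (no ViSP or cnpy call is involved, and the function names are hypothetical). It compares a round trip through text with the default stream precision against a round trip through raw bytes, which is what a binary format stores.

```cpp
#include <cassert>
#include <cstring>
#include <sstream>

// Round trip through text using the default stream precision (6 significant digits).
double roundtrip_text(double value)
{
  std::stringstream ss;
  ss << value;
  double out = 0.0;
  ss >> out;
  return out;
}

// Round trip through raw bytes, as done by a binary format.
double roundtrip_binary(double value)
{
  char bytes[sizeof(double)];
  std::memcpy(bytes, &value, sizeof(double));
  double out = 0.0;
  std::memcpy(&out, bytes, sizeof(double));
  return out;
}

// Small self-check showing the difference between both approaches.
void check_roundtrips()
{
  const double value = 0.1 + 0.2;            // not exactly 0.3 in binary floating point
  assert(roundtrip_binary(value) == value);  // the bytes are preserved exactly
  assert(roundtrip_text(value) != value);    // printed as "0.3", so digits are lost
}
```

With a text file, the loss can be avoided by raising the precision or by using std::hexfloat, at the cost of larger files.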
On the other hand, the main disadvantages are:
- it is not a human-readable format: it is suitable for saving large arrays of data, but not for debugging by simply inspecting the file,
- saving string data is not direct, since it must be treated as a vector of char data,
- the current implementation only works on little-endian platforms (the most common endianness nowadays).
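Regarding the little-endian restriction, the following sketch shows one common way to detect the host byte order at run time and to swap the bytes of a 32-bit value. It is only an illustration with hypothetical function names, not something the ViSP implementation is known to do.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Returns true when the least significant byte is stored first in memory.
bool is_little_endian()
{
  const std::uint32_t probe = 1;
  unsigned char first_byte = 0;
  std::memcpy(&first_byte, &probe, 1);
  return first_byte == 1;
}

// Reverses the byte order of a 32-bit value.
std::uint32_t byte_swap32(std::uint32_t v)
{
  return (v >> 24) | ((v >> 8) & 0x0000FF00u) | ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Converts a host value to its little-endian byte layout.
std::uint32_t to_little_endian32(std::uint32_t v)
{
  return is_little_endian() ? v : byte_swap32(v);
}

// Small self-check of the byte swap.
void check_byte_swap()
{
  assert(byte_swap32(0x11223344u) == 0x44332211u);
}
```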
You can refer to this Wikipedia page for an exhaustive comparison of data-serialization formats.
Hands-on
How to save/read string data
Saving C++ std::string data can be achieved the following way:
- create a string object and convert it to a vector<char> object:
const std::string save_string = "Open Source Visual Servoing Platform";
std::vector<char> vec_save_string(save_string.begin(), save_string.end());
- add and save the data to the .npz file; the identifier is the variable name, "w" means write and "a" means append to the archive:
const std::string npz_filename = "tutorial_npz_read_write.npz";
const std::string identifier = "My string data";
visp::cnpy::npz_save(npz_filename, identifier, &vec_save_string[0], { vec_save_string.size() }, "w");
Reading back the data can be done easily:
- load the data:
const std::string npz_filename = "tutorial_npz_read_write.npz";
visp::cnpy::npz_t npz_data = visp::cnpy::npz_load(npz_filename);
- the identifier is then needed,
- a conversion from vector<char> to a std::string object is required:
const std::string identifier = "My string data";
if (npz_data.find(identifier) != npz_data.end()) {
  // npz_t is a std::map<std::string, NpyArray>
  visp::cnpy::NpyArray arr_string_data = npz_data[identifier];
  std::vector<char> vec_arr_string_data = arr_string_data.as_vec<char>();
  const std::string read_string = std::string(vec_arr_string_data.begin(), vec_arr_string_data.end());
  std::cout << "Read string: " << read_string << std::endl;
}
- Note
- In the previous example, there is no need to save a terminating null character, since reading relies on the std::string constructor that takes iterators to the beginning and end of the string data. Additional information can be found here. The other approach would consist in:
- appending the null character '\0' to the vector: "vec_save_string.push_back('\0');"
- and using the constructor that accepts a pointer to the data: "std::string read_string(arr_string_data.data<char>());"
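Both approaches described in this note can be compared with a small standard-library sketch that does not touch any file. The helper names are hypothetical; a plain std::vector<char> stands in for the buffer returned by the NPZ reader.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Converts a string to the char buffer that would be saved, optionally
// appending a terminating null character.
std::vector<char> to_chars(const std::string &s, bool null_terminated)
{
  std::vector<char> v(s.begin(), s.end());
  if (null_terminated) {
    v.push_back('\0');
  }
  return v;
}

// First approach: iterator-range constructor, no terminator needed.
std::string from_range(const std::vector<char> &v)
{
  return std::string(v.begin(), v.end());
}

// Second approach: pointer constructor, which requires the terminator.
std::string from_c_string(const std::vector<char> &v)
{
  assert(!v.empty() && v.back() == '\0');
  return std::string(v.data());
}
```

Note that mixing the two approaches, that is saving the terminator and reading with the iterator-range constructor, yields a string with one extra '\0' character at the end.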
How to save basic data types
Saving C++ basic data types such as int32_t, float or even std::complex<double> is straightforward:
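const std::string npz_filename = "tutorial_npz_read_write.npz";
const std::string int_identifier = "My int data";
int int_data = 99;
// A scalar is saved as an array of shape { 1 }; "w" creates the archive
visp::cnpy::npz_save(npz_filename, int_identifier, &int_data, { 1 }, "w");
const std::string double_identifier = "My double data";
double double_data = 3.14;
// "a" appends to the existing archive
visp::cnpy::npz_save(npz_filename, double_identifier, &double_data, { 1 }, "a");
const std::string complex_identifier = "My complex data";
std::complex<double> complex_data(int_data, double_data);
visp::cnpy::npz_save(npz_filename, complex_identifier, &complex_data, { 1 }, "a");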
Reading back the data can be done easily:
const std::string npz_filename = "tutorial_npz_read_write.npz";
const std::string int_identifier = "My int data";
const std::string double_identifier = "My double data";
const std::string complex_identifier = "My complex data";
visp::cnpy::npz_t npz_data = visp::cnpy::npz_load(npz_filename);
visp::cnpy::npz_t::iterator it_int = npz_data.find(int_identifier);
visp::cnpy::npz_t::iterator it_double = npz_data.find(double_identifier);
visp::cnpy::npz_t::iterator it_complex = npz_data.find(complex_identifier);
if (it_int != npz_data.end() && it_double != npz_data.end() && it_complex != npz_data.end()) {
  // Each map entry holds the identifier (first) and the NpyArray (second)
  visp::cnpy::NpyArray arr_data_int = it_int->second;
  visp::cnpy::NpyArray arr_data_double = it_double->second;
  visp::cnpy::NpyArray arr_data_complex = it_complex->second;
  int int_data = *arr_data_int.data<int>();
  double double_data = *arr_data_double.data<double>();
  std::complex<double> complex_data = *arr_data_complex.data<std::complex<double>>();
  std::cout << "Read int data: " << int_data << std::endl;
  std::cout << "Read double data: " << double_data << std::endl;
  std::cout << "Read complex data, real: " << complex_data.real() << " ; imag: " << complex_data.imag() << std::endl;
}
How to save a vpImage
Finally, one of the advantages of the NPZ format is the possibility to save multi-dimensional arrays easily. As an example, we will first save a vpImage<vpRGBa>.
The following code shows how to read an image:
const std::string img_filename = "ballons.jpg";
vpImage<vpRGBa> img;
vpImageIo::read(img, img_filename);
Then, saving a color image can be achieved as easily as:
if (img.getSize() != 0) {
const std::string npz_filename = "tutorial_npz_read_write.npz";
const std::string img_identifier = "My color image";
visp::cnpy::npz_save(npz_filename, img_identifier, &img.bitmap[0], { img.getRows(), img.getCols() },
"w");
}
We have passed the address of the bitmap array, that is, an array of vpRGBa. The shape of the array is thus "height x width", since each basic element of the bitmap is already of vpRGBa type (4 unsigned char elements).
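To make the relation between shape and memory layout concrete, here is a minimal sketch with hypothetical helper names, assuming a row-major layout (the default C order in NumPy). It shows that an H x W array of 4-byte pixels occupies as many bytes as an H x W x 4 array of unsigned char, and how a (row, column) pair maps to a flat index.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Number of elements described by a shape, e.g. { H, W } gives H * W.
std::size_t element_count(const std::vector<std::size_t> &shape)
{
  std::size_t n = 1;
  for (std::size_t dim : shape) {
    n *= dim;
  }
  return n;
}

// Flat index of pixel (row, col) in a row-major image of the given width.
std::size_t flat_index(std::size_t row, std::size_t col, std::size_t width)
{
  assert(col < width);
  return row * width + col;
}
```

For a 640 x 480 image, the pixel at row 2 and column 3 is element 2 * 640 + 3 = 1283 of the bitmap.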
Reading back the image is done with:
const std::string npz_filename = "tutorial_npz_read_write.npz";
const std::string img_identifier = "My color image";
visp::cnpy::npz_t npz_data = visp::cnpy::npz_load(npz_filename);
visp::cnpy::npz_t::iterator it_img = npz_data.find(img_identifier);
if (it_img != npz_data.end()) {
  visp::cnpy::NpyArray arr_data_img = it_img->second;
  const bool copy_data = false;
  // shape[0] is the image height, shape[1] is the image width
  vpImage<vpRGBa> img(arr_data_img.data<vpRGBa>(), arr_data_img.shape[0], arr_data_img.shape[1], copy_data);
  std::cout << "Img: " << img.getWidth() << "x" << img.getHeight() << std::endl;
  std::unique_ptr<vpDisplay> ptr_display;
#if defined(VISP_HAVE_X11)
  ptr_display = std::make_unique<vpDisplayX>(img);
#elif defined(VISP_HAVE_GDI)
  ptr_display = std::make_unique<vpDisplayGDI>(img);
#endif
  vpDisplay::display(img);
  vpDisplay::flush(img);
  vpDisplay::getClick(img);
}
The vpImage constructor accepting a vpRGBa pointer is used, with the appropriate image height and width values.
Finally, the image is displayed.
How to save a multi-dimensional array
Similarly, the following code shows how to save a multi-dimensional array with a shape corresponding to {H x W x 3}:
const std::string img_filename = "ballons.jpg";
vpImage<vpRGBa> img;
vpImageIo::read(img, img_filename);
if (img.getSize() != 0) {
  std::vector<unsigned char> vec_data_img;
  vec_data_img.resize(3*img.getSize());
  // Drop the alpha channel: H x W vpRGBa pixels become H x W x 3 bytes
  vpImageConvert::RGBaToRGB(reinterpret_cast<unsigned char *>(img.bitmap), vec_data_img.data(), img.getSize());
  const std::string npz_filename = "tutorial_npz_read_write.npz";
  const std::string img_identifier = "My RGB image";
  visp::cnpy::npz_save(npz_filename, img_identifier, &vec_data_img[0], { img.getRows(), img.getCols(), 3 }, "w");
}
Finally, the image can be read back and displayed with:
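const std::string npz_filename = "tutorial_npz_read_write.npz";
const std::string img_identifier = "My RGB image";
visp::cnpy::npz_t npz_data = visp::cnpy::npz_load(npz_filename);
visp::cnpy::npz_t::iterator it_img = npz_data.find(img_identifier);
if (it_img != npz_data.end()) {
  visp::cnpy::NpyArray arr_data_img = it_img->second;
  // The saved shape is { H, W, 3 }
  const unsigned int height = static_cast<unsigned int>(arr_data_img.shape[0]);
  const unsigned int width = static_cast<unsigned int>(arr_data_img.shape[1]);
  vpImage<vpRGBa> img(height, width);
  vpImageConvert::RGBToRGBa(arr_data_img.data<unsigned char>(), reinterpret_cast<unsigned char *>(img.bitmap), img.getSize());
  std::unique_ptr<vpDisplay> ptr_display;
#if defined(VISP_HAVE_X11)
  ptr_display = std::make_unique<vpDisplayX>(img);
#elif defined(VISP_HAVE_GDI)
  ptr_display = std::make_unique<vpDisplayGDI>(img);
#endif
  vpDisplay::display(img);
  vpDisplay::flush(img);
  vpDisplay::getClick(img);
}
A specific conversion from RGB to RGBa must be done for compatibility with the ViSP vpRGBa format.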
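For readers unfamiliar with this conversion, the sketch below shows what moving between 3-channel and 4-channel pixel buffers amounts to. It is a plain standard-library illustration with hypothetical function names, not the vpImageConvert implementation; in particular, the alpha value of 255 (fully opaque) written when converting to RGBa is an assumption of this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Drops the alpha channel: 4 bytes per pixel become 3 bytes per pixel.
std::vector<unsigned char> rgba_to_rgb(const std::vector<unsigned char> &rgba)
{
  assert(rgba.size() % 4 == 0);
  std::vector<unsigned char> rgb;
  rgb.reserve(rgba.size() / 4 * 3);
  for (std::size_t i = 0; i < rgba.size(); i += 4) {
    rgb.push_back(rgba[i]);      // R
    rgb.push_back(rgba[i + 1]);  // G
    rgb.push_back(rgba[i + 2]);  // B
  }
  return rgb;
}

// Adds an alpha channel: 3 bytes per pixel become 4 bytes per pixel.
// The default alpha of 255 is an assumption made for this sketch.
std::vector<unsigned char> rgb_to_rgba(const std::vector<unsigned char> &rgb, unsigned char alpha = 255)
{
  assert(rgb.size() % 3 == 0);
  std::vector<unsigned char> rgba;
  rgba.reserve(rgb.size() / 3 * 4);
  for (std::size_t i = 0; i < rgb.size(); i += 3) {
    rgba.push_back(rgb[i]);      // R
    rgba.push_back(rgb[i + 1]);  // G
    rgba.push_back(rgb[i + 2]);  // B
    rgba.push_back(alpha);       // A
  }
  return rgba;
}
```

Saving the 3-channel buffer instead of the vpRGBa bitmap reduces the stored data by a quarter, at the price of this extra conversion on both the writing and the reading side.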