cars.data_structures.cars_dataset

cars_dataset module:

Module Contents

Classes

CarsDataset

CarsDataset.

Functions

run_save_arrays(future_result, file_name[, tag, ...])

Save future result when it arrives

run_save_points(future_result, file_name[, overwrite])

Save future result when it arrives

load_single_tile_array(→ xarray.Dataset)

Load an xarray tile

load_single_tile_points(tile_path_name)

Load a pandas DataFrame

save_single_tile_array(dataset, tile_path_name)

Save xarray to directory, saving the data in a different file than the attributes.

save_single_tile_points(dataframe, tile_path_name)

Save DataFrame to directory, saving the data in a different file than the attributes.

fill_dataset(dataset[, saving_info, window, profile, ...])

From a full xarray dataset, fill info properly.

fill_dataframe(dataframe[, saving_info, attributes])

From a full pandas dataframe, fill info properly.

fill_dict(data_dict[, saving_info, attributes])

From a full dict, fill info properly.

save_dataframe(dataframe, file_name[, overwrite])

Save DataFrame to csv format. The content of the dataframe is merged with the content of the existing saved DataFrame if overwrite is False.

save_dataset(dataset, file_name, tag[, ...])

Reconstruct and save data.

create_tile_path(→ str)

Create path of tile, according to its position in CarsDataset grid

save_numpy_array(array, file_name)

Save numpy array to file

load_numpy_array(→ numpy.ndarray)

Load numpy array from file

create_none(nb_row, nb_col)

Create a grid filled with None. The created grid is a 2D list.

overlap_array_to_dict(overlap)

Convert matrix of overlaps, to dict format used in CarsDatasets.

window_array_to_dict(window[, overlap])

Convert matrix of windows, to dict format used in CarsDatasets.

dict_profile_to_rio_profile(→ Dict)

Transform a serializable Dict, obtained from a rasterio Profile, back into a rasterio profile.

rio_profile_to_dict_profile(→ Dict)

Transform a rasterio profile into a serializable Dict.

save_dict(dictionary, file_path[, safe_save])

Save dict to json file

load_dict(→ Dict)

Load dict from json file

separate_dicts(dictionary, list_tags)

Separate a dict into two, the second one containing the given tags.

get_attributes_dataframe(dataframe)

Get attributes field in .attr of dataframe

get_window_dataset(dataset)

Get window in dataset

get_overlaps_dataset(dataset)

Get overlaps in dataset

get_profile_rasterio(dataset)

Get profile in dataset

get_attributes(dataset)

Get attributes in dataset

get_profile_for_tag_dataset(→ Dict)

Get profile according to layer to save.

generate_rasterio_window(→ rasterio.windows.Window)

Generate rasterio window to use.

Attributes

CARS_DS_TYPE_ARRAY

CARS_DS_TYPE_POINTS

CARS_DS_TYPE_DICT

TILES_INFO_FILE

OVERLAP_FILE

GRID_FILE

PROFILE_FILE

ATTRIBUTE_FILE

DATASET_FILE

DATAFRAME_FILE

PROFILE

WINDOW

OVERLAPS

ATTRIBUTES

SAVING_INFO

cars.data_structures.cars_dataset.CARS_DS_TYPE_ARRAY = 'arrays'
cars.data_structures.cars_dataset.CARS_DS_TYPE_POINTS = 'points'
cars.data_structures.cars_dataset.CARS_DS_TYPE_DICT = 'dict'
cars.data_structures.cars_dataset.TILES_INFO_FILE = 'tiles_info.json'
cars.data_structures.cars_dataset.OVERLAP_FILE = 'overlaps.npy'
cars.data_structures.cars_dataset.GRID_FILE = 'grid.npy'
cars.data_structures.cars_dataset.PROFILE_FILE = 'profile.json'
cars.data_structures.cars_dataset.ATTRIBUTE_FILE = 'attributes.json'
cars.data_structures.cars_dataset.DATASET_FILE = 'dataset'
cars.data_structures.cars_dataset.DATAFRAME_FILE = 'dataframe.csv'
cars.data_structures.cars_dataset.PROFILE = 'profile'
cars.data_structures.cars_dataset.WINDOW = 'window'
cars.data_structures.cars_dataset.OVERLAPS = 'overlaps'
cars.data_structures.cars_dataset.ATTRIBUTES = 'attributes'
cars.data_structures.cars_dataset.SAVING_INFO = 'saving_info'
class cars.data_structures.cars_dataset.CarsDataset(dataset_type, load_from_disk=None)

CarsDataset.

Internal CARS structure for organizing tiles (xr.Datasets or pd.DataFrames).

property shape

Return the shape of tiling grid (nb_row, nb_col)

Returns

shape of grid

property tiling_grid

Tiling grid, containing pixel windows of tiles

Returns

tiling grid, of shape [N, M, 4], containing [row_min, row_max, col_min, col_max]

Return type

np.ndarray

__repr__()

Repr function

Returns

printable self CarsDataset

__str__()

Str function

Returns

printable self CarsDataset

custom_print()

Return string of self

Returns

printable self

__getitem__(key)

Get item : return the [row, col] dataset

Parameters

key – tuple index

Returns

tile

Return type

xr.Dataset or pd.DataFrame

__setitem__(key, newvalue)

Set new tile

Parameters
  • key (tuple(int, int)) – tuple of row and col indexes

  • newvalue – tile to set

load_single_tile(tile_path_name: str)

Load a single tile

Parameters

tile_path_name (str) – Path of tile to load

Returns

single tile

Return type

xr.Dataset or pd.DataFrame

save_single_tile(tile, tile_path_name: str)

Save xarray Dataset or pandas DataFrame to file

Parameters
  • tile (xr.Dataset or pd.DataFrame) – tile to save

  • tile_path_name – Path of file to save in

run_save(future_result, file_name: str, **kwargs)

Save future result when arrived

Parameters
  • future_result – xarray.Dataset received

  • file_name – filename to save data to

get_window_as_dict(row, col, from_terrain=False, resolution=1)

Get window in pixels for rasterio. Set from_terrain if tiling grid was defined in geographic coordinates.

Parameters
  • row (int) – row

  • col (int) – col

  • from_terrain (bool) – true if in terrain coordinates

  • resolution (float) – resolution

Returns

New window: { “row_min”: row_min, “row_max”: row_max, “col_min”: col_min, “col_max”: col_max }

Return type

Dict

create_grid(nb_col: int, nb_row: int, row_split: int, col_split: int, row_overlap: int, col_overlap: int)

Generate grid of positions by splitting [0, nb_row]x[0, nb_col] in splits of row_split x col_split size

Parameters
  • nb_col (int) – number of columns

  • nb_row (int) – number of lines

  • col_split (int) – width of splits

  • row_split (int) – height of splits

  • col_overlap (int) – overlap to apply on rows

  • row_overlap (int) – overlap to apply on cols
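
The splitting described for create_grid can be illustrated with a small standalone sketch. It is not the CARS implementation: the helper name sketch_create_grid is hypothetical, overlaps are left out, nested lists stand in for the np.ndarray, and clipping the last tile to the image bounds is an assumption.

```python
import math


def sketch_create_grid(nb_col, nb_row, row_split, col_split):
    """Hypothetical helper: tile [0, nb_row] x [0, nb_col] without overlaps.

    Returns a nested list of shape [N][M][4] whose last axis holds
    [row_min, row_max, col_min, col_max]. Edge tiles are clipped to the
    image bounds, which is an assumption of this sketch.
    """
    n_tiles_row = math.ceil(nb_row / row_split)
    n_tiles_col = math.ceil(nb_col / col_split)
    grid = []
    for i in range(n_tiles_row):
        row_min = i * row_split
        row_max = min((i + 1) * row_split, nb_row)
        grid_row = []
        for j in range(n_tiles_col):
            col_min = j * col_split
            col_max = min((j + 1) * col_split, nb_col)
            grid_row.append([row_min, row_max, col_min, col_max])
        grid.append(grid_row)
    return grid


grid = sketch_create_grid(nb_col=5, nb_row=4, row_split=2, col_split=3)
print(len(grid), len(grid[0]))  # number of tile rows, number of tile columns
print(grid[1][1])               # window of the last tile
```

Each entry follows the [row_min, row_max, col_min, col_max] layout documented for the tiling_grid property.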

generate_none_tiles()

Generate the structure of data tiles, with Nones, according to grid shape.

create_empty_copy(cars_ds)

Copy attributes, grid, overlaps, and create Nones.

Parameters

cars_ds (CarsDataset) – CarsDataset to copy

generate_descriptor(future_result, file_name, tag=None, dtype=None, nodata=None)

Generate the rasterio descriptor for the given future result

Only works with a pixel-based tiling grid

Parameters
  • future_result (xr.Dataset) – Future result

  • file_name (str) – file name to save futures to

  • tag (str) – tag to save

  • dtype (str) – dtype

  • nodata (float) – no data value

save_cars_dataset(directory)

Save whole CarsDataset to given directory, including tiling grids, attributes, overlaps, and all the xr.Dataset or pd.DataFrames.

Parameters

directory (str) – Path where to save self CarsDataset

load_cars_dataset_from_disk(directory)

Load whole CarsDataset from given directory

Parameters

directory (str) – Path where the CarsDataset to load is saved

cars.data_structures.cars_dataset.run_save_arrays(future_result, file_name, tag=None, descriptor=None)

Save future result when it arrives

Parameters
  • future_result (xarray.Dataset) – xarray.Dataset received

  • file_name (str) – filename to save data to

  • tag (str) – dataset tag to rasterize

  • descriptor – rasterio descriptor

cars.data_structures.cars_dataset.run_save_points(future_result, file_name, overwrite=False)

Save future result when arrived

Parameters
  • future_result (pandas DataFrame) – pandas DataFrame received

  • file_name (str) – filename to save data to

  • overwrite (bool) – overwrite file

cars.data_structures.cars_dataset.load_single_tile_array(tile_path_name: str) xarray.Dataset

Load an xarray tile

Parameters

tile_path_name (str) – Path of tile to load

Returns

tile dataset

Return type

xr.Dataset

cars.data_structures.cars_dataset.load_single_tile_points(tile_path_name: str)

Load a pandas DataFrame

Parameters

tile_path_name (str) – Path of tile to load

Returns

Tile dataframe

Return type

pandas DataFrame

cars.data_structures.cars_dataset.save_single_tile_array(dataset: xarray.Dataset, tile_path_name: str)

Save xarray to directory, saving the data in a different file than the attributes (saved in a .json next to it).

Parameters
  • dataset (xr.Dataset) – dataset to save

  • tile_path_name (str) – Path of file to save in

cars.data_structures.cars_dataset.save_single_tile_points(dataframe, tile_path_name: str)

Save DataFrame to directory, saving the data in a different file than the attributes (saved in a .json next to it).

Parameters
  • dataframe (pd.DataFrame) – dataframe to save

  • tile_path_name (str) – Path of file to save in

cars.data_structures.cars_dataset.fill_dataset(dataset, saving_info=None, window=None, profile=None, attributes=None, overlaps=None)

From a full xarray dataset, fill info properly. The user can fill in saving information (containing the CarsDataset id), the window of the current tile and its overlaps, the rasterio profile of the full data, and the attributes associated with the data.

Parameters
  • dataset (xarray_dataset) – dataset to fill

  • saving_info (dict) – created by Orchestrator.get_saving_infos

  • window (dict) –

  • profile (dict) –

  • attributes (dict) –

cars.data_structures.cars_dataset.fill_dataframe(dataframe, saving_info=None, attributes=None)

From a full pandas dataframe, fill info properly. The user can fill in saving information (containing the CarsDataset id) and the attributes associated with the data.

Parameters
  • dataframe (pandas dataframe) – dataframe to fill

  • saving_info (dict) – created by Orchestrator.get_saving_infos

  • attributes (dict) –

cars.data_structures.cars_dataset.fill_dict(data_dict, saving_info=None, attributes=None)

From a full dict, fill info properly. The user can fill in saving information (containing the CarsDataset id) and the attributes associated with the data.

Parameters
  • data_dict (Dict) – dictionary to fill

  • saving_info (dict) – created by Orchestrator.get_saving_infos

  • attributes (dict) –

cars.data_structures.cars_dataset.save_dataframe(dataframe, file_name, overwrite=True)

Save DataFrame to csv format. The content of the dataframe is merged with the content of the existing saved DataFrame if overwrite is False.

Parameters
  • file_name (str) – file name to save data to

  • overwrite (bool) – overwrite file if exists

cars.data_structures.cars_dataset.save_dataset(dataset, file_name, tag, use_windows_and_overlaps=False, descriptor=None)

Reconstruct and save data. To save the dataset properly to the corresponding tiff file, the dataset must have been filled with saving info, profile, window, overlaps (if not 0), and the rasterio descriptor if already created. See fill_dataset.

Parameters
  • dataset (xr.Dataset) – dataset to save

  • file_name (str) – file name to save data to

  • tag (str) – tag to reconstruct

  • use_windows_and_overlaps (bool) – use saved window and overlaps

  • descriptor (rasterio dataset) – descriptor to use with rasterio

cars.data_structures.cars_dataset.create_tile_path(col: int, row: int, directory: str) str

Create path of tile, according to its position in CarsDataset grid

Parameters
  • col (int) – column number

  • row (int) – row number

  • directory (str) – path where to save tile

Returns

full path

Return type

str

cars.data_structures.cars_dataset.save_numpy_array(array: numpy.ndarray, file_name: str)

Save numpy array to file

Parameters
  • array (np.ndarray) – array to save

  • file_name (str) – file to save the array to

cars.data_structures.cars_dataset.load_numpy_array(file_name: str) numpy.ndarray

Load numpy array from file

Parameters

file_name (str) – file to load the array from

Returns

array

Return type

np.ndarray

cars.data_structures.cars_dataset.create_none(nb_row: int, nb_col: int)

Create a grid filled with None. The created grid is a 2D list, e.g. [[None, None], [None, None]]

Parameters
  • nb_row – number of rows

  • nb_col – number of cols

Returns

Grid filled with None

Return type

list of list
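
A minimal sketch of such a grid, using a hypothetical helper name (sketch_create_none) rather than the CARS function itself. Building each row with its own comprehension matters: the shortcut [[None] * nb_col] * nb_row would make every row point to the same list object.

```python
def sketch_create_none(nb_row, nb_col):
    """Hypothetical stand-in: 2D list of None with independent rows."""
    return [[None for _ in range(nb_col)] for _ in range(nb_row)]


tiles = sketch_create_none(2, 3)
tiles[0][1] = "tile"  # only this cell changes, the other rows are untouched
print(tiles)
```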

cars.data_structures.cars_dataset.overlap_array_to_dict(overlap)

Convert matrix of overlaps, to dict format used in CarsDatasets. Input is : [o_up, o_down, o_left, o_right]. Output is : {“up”: o_up, “down”: o_down, “left”: o_left, “right”: o_right}

Parameters

overlap (List) – overlaps

Returns

New overlaps

Return type

Dict
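
The mapping described above can be sketched as follows. The function name sketch_overlap_array_to_dict is hypothetical and only mirrors the documented input and output formats.

```python
def sketch_overlap_array_to_dict(overlap):
    """Hypothetical stand-in: [o_up, o_down, o_left, o_right] -> dict."""
    o_up, o_down, o_left, o_right = overlap
    return {"up": o_up, "down": o_down, "left": o_left, "right": o_right}


overlaps = sketch_overlap_array_to_dict([1, 2, 3, 4])
print(overlaps)
```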

cars.data_structures.cars_dataset.window_array_to_dict(window, overlap=None)

Convert matrix of windows, to dict format used in CarsDatasets. Use overlaps if you want to get the window with overlaps. Inputs are:

  • window : [row_min, row_max, col_min, col_max], with pixel format

  • overlap (optional): [o_row_min, o_row_max, o_col_min, o_col_max]

Output is:

{ “row_min” : row_min - o_row_min, “row_max” : row_max + o_row_max, “col_min” : col_min - o_col_min, “col_max” : col_max + o_col_max }

Parameters
  • window (List) – window

  • overlap (List) – overlaps

Returns

New window

Return type

Dict
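
A small sketch of this conversion, under the assumption that overlaps extend the window outward on all four sides (so col_max is increased, by symmetry with row_max). The helper name sketch_window_array_to_dict is hypothetical.

```python
def sketch_window_array_to_dict(window, overlap=None):
    """Hypothetical stand-in: window list -> dict, optionally grown by overlaps."""
    row_min, row_max, col_min, col_max = (int(v) for v in window)
    if overlap is not None:
        o_row_min, o_row_max, o_col_min, o_col_max = (int(v) for v in overlap)
        # Assumption: every overlap pushes its border away from the tile centre.
        row_min -= o_row_min
        row_max += o_row_max
        col_min -= o_col_min
        col_max += o_col_max
    return {
        "row_min": row_min,
        "row_max": row_max,
        "col_min": col_min,
        "col_max": col_max,
    }


plain = sketch_window_array_to_dict([10, 20, 30, 40])
extended = sketch_window_array_to_dict([10, 20, 30, 40], overlap=[1, 2, 3, 4])
print(plain)
print(extended)
```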

cars.data_structures.cars_dataset.dict_profile_to_rio_profile(dict_profile: Dict) Dict

Transform a serializable Dict, obtained from a rasterio Profile, back into a rasterio profile.

Parameters

dict_profile (Dict) – rasterio Profile transformed into serializable Dict

Returns

Profile

Return type

Rasterio Profile

cars.data_structures.cars_dataset.rio_profile_to_dict_profile(in_profile: Dict) Dict

Transform a rasterio profile into a serializable Dict.

Parameters

in_profile (Dict) – rasterio Profile to transform into a serializable Dict

Returns

Profile

Return type

Dict

cars.data_structures.cars_dataset.save_dict(dictionary, file_path: str, safe_save=False)

Save dict to json file

Parameters
  • dictionary (Dict) – dictionary to save

  • file_path (str) – file path to use

  • safe_save (bool) – if True, be robust to types

cars.data_structures.cars_dataset.load_dict(file_path: str) Dict

Load dict from json file

Parameters

file_path (str) – file path to use
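
The save/load pair can be pictured as a plain JSON round trip. The sketch below uses only the standard library and hypothetical helper names. How safe_save is actually handled is not documented here, so stringifying values that JSON cannot encode is purely an assumption.

```python
import json
import os
import tempfile


def sketch_save_dict(dictionary, file_path, safe_save=False):
    """Hypothetical stand-in for a JSON dump of a dict."""
    # Assumption: "robust to types" is modelled by converting values that
    # the json module cannot encode into strings; the real behaviour may differ.
    default = str if safe_save else None
    with open(file_path, "w", encoding="utf-8") as handle:
        json.dump(dictionary, handle, indent=2, default=default)


def sketch_load_dict(file_path):
    """Hypothetical stand-in for loading a dict from a JSON file."""
    with open(file_path, "r", encoding="utf-8") as handle:
        return json.load(handle)


with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "attributes.json")
    sketch_save_dict({"epsg": 4326, "shape": (2, 3)}, path)
    loaded = sketch_load_dict(path)

print(loaded)  # tuples come back as lists after a JSON round trip
```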

cars.data_structures.cars_dataset.separate_dicts(dictionary, list_tags)

Separate a dict into two, the second one containing the given tags.

For example, {key1: val1, key2: val2, key3: val3} with list_tags = [key2] will be split into: {key1: val1, key3: val3} and {key2: val2}
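
The example above can be reproduced with a short sketch. The helper name sketch_separate_dicts is hypothetical and the code only mirrors the documented behaviour.

```python
def sketch_separate_dicts(dictionary, list_tags):
    """Hypothetical stand-in: split a dict into (remaining, tagged) parts."""
    tags = set(list_tags)
    remaining = {k: v for k, v in dictionary.items() if k not in tags}
    tagged = {k: v for k, v in dictionary.items() if k in tags}
    return remaining, tagged


first, second = sketch_separate_dicts(
    {"key1": "val1", "key2": "val2", "key3": "val3"}, ["key2"]
)
print(first)
print(second)
```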

cars.data_structures.cars_dataset.get_attributes_dataframe(dataframe)

Get attributes field in .attr of dataframe

Parameters

dataframe (pandas dataframe) – dataframe

cars.data_structures.cars_dataset.get_window_dataset(dataset)

Get window in dataset

Parameters

dataset (xr.Dataset) – dataset

cars.data_structures.cars_dataset.get_overlaps_dataset(dataset)

Get overlaps in dataset

Parameters

dataset (xr.Dataset) – dataset

cars.data_structures.cars_dataset.get_profile_rasterio(dataset)

Get profile in dataset

Parameters

dataset (xr.Dataset) – dataset

cars.data_structures.cars_dataset.get_attributes(dataset)

Get attributes in dataset

Parameters

dataset (xr.Dataset) – dataset

cars.data_structures.cars_dataset.get_profile_for_tag_dataset(dataset, tag: str) Dict

Get profile according to layer to save. This function modifies the current rasterio dataset to fix the number of bands of the data associated with the given tag.

Parameters

tag (str) – tag to use

Returns

Profile

Return type

Rasterio Profile

cars.data_structures.cars_dataset.generate_rasterio_window(window: Dict) rasterio.windows.Window

Generate rasterio window to use.

Parameters

window (dict) – window to convert, containing ‘row_min’, ‘row_max’, ‘col_min’, ‘col_max’

Returns

rasterio window

Return type

rio.windows.Window
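
A rasterio Window is built from a column offset, a row offset, a width and a height. The sketch below only computes those four values from a window dict, without importing rasterio. The helper name sketch_window_offsets is hypothetical, and treating row_max and col_max as exclusive bounds is an assumption that may not match the CARS convention.

```python
def sketch_window_offsets(window):
    """Hypothetical helper: dict window -> (col_off, row_off, width, height)."""
    row_off = window["row_min"]
    col_off = window["col_min"]
    # Assumption: the max bounds are exclusive, so sizes are plain differences.
    height = window["row_max"] - window["row_min"]
    width = window["col_max"] - window["col_min"]
    return col_off, row_off, width, height


args = sketch_window_offsets(
    {"row_min": 10, "row_max": 30, "col_min": 5, "col_max": 45}
)
print(args)
```

With rasterio installed, these values could be passed as rasterio.windows.Window(col_off, row_off, width, height).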