Data
sgptools.utils.data
get_dataset(dataset_path=None, num_train=1000, num_test=2500, num_candidates=150, **kwargs)
Generates or loads a dataset and preprocesses it for sensor placement (SP) and informative path planning (IPP). The train and test sets are generated using k-means.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset_path | str | Path to a tif dataset file. If None, the method will generate synthetic data. | None |
num_train | int | Number of training samples to generate. | 1000 |
num_test | int | Number of testing samples to generate. | 2500 |
num_candidates | int | Number of candidate locations to generate. | 150 |
Returns:
Name | Type | Description |
---|---|---|
X_train | ndarray | (n, d); Training set inputs |
y_train | ndarray | (n, 1); Training set labels |
X_test | ndarray | (n, d); Testing set inputs |
y_test | ndarray | (n, 1); Testing set labels |
candidates | ndarray | (n, d); Candidate sensor placement locations |
X | ndarray | (n, d); Full dataset inputs |
y | ndarray | (n, 1); Full dataset labels |
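The description above says k-means is used to generate the train and test sets but does not spell out how. One plausible reading, sketched here with NumPy only, is to cluster the inputs and snap each centroid to its nearest data point so that a label is available for it. The helper `kmeans_subset`, its arguments, and the random stand-in data are all hypothetical and are not taken from `sgptools/utils/data.py`.

```python
import numpy as np

def kmeans_subset(X, y, k, n_iter=10, seed=0):
    """Pick k representative samples (hypothetical helper, not sgptools API).

    Runs a few Lloyd iterations, then maps each centroid to its nearest
    data point so the returned inputs have matching labels.
    """
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct data points (fancy indexing copies)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign every point to its nearest centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=-1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # empty clusters keep their old centroid
                centroids[j] = members.mean(axis=0)
    # Snap each centroid onto the closest point of the dataset
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=-1)
    idx = dists.argmin(axis=0)
    return X[idx], y[idx]

# Stand-in data: 500 random 2-D locations with a smooth label
data_rng = np.random.default_rng(1)
X = data_rng.uniform(size=(500, 2))
y = np.sin(3.0 * X[:, :1]) + np.cos(3.0 * X[:, 1:])
X_train, y_train = kmeans_subset(X, y, k=20)
```

Compared with uniform random sampling, a subset chosen this way tends to spread over the input space, which is a common reason for using k-means in this role.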
Source code in sgptools/utils/data.py
point_pos(point, d, theta)
Generate a point at a distance d from a point at angle theta.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
point | ndarray | (N, 2); array of points | required |
d | float | distance | required |
theta | float | angle in radians | required |
Returns:
Name | Type | Description |
---|---|---|
X | ndarray | (N,); array of x-coordinates |
Y | ndarray | (N,); array of y-coordinates |
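The geometry described for `point_pos` can be illustrated with a short NumPy sketch. It assumes the usual polar offset (x + d·cos θ, y + d·sin θ) with θ measured counter-clockwise from the x-axis; the name `point_pos_sketch` is hypothetical and the library's own implementation may differ in convention.

```python
import numpy as np

def point_pos_sketch(point, d, theta):
    """Offset each (x, y) row of `point` by distance d at angle theta (radians).

    Assumes theta is measured counter-clockwise from the positive x-axis.
    """
    point = np.asarray(point, dtype=float)
    X = point[:, 0] + d * np.cos(theta)
    Y = point[:, 1] + d * np.sin(theta)
    return X, Y

pts = np.array([[0.0, 0.0], [1.0, 2.0]])
# Move both points 2 units straight "up" (theta = pi / 2)
X_new, Y_new = point_pos_sketch(pts, d=2.0, theta=np.pi / 2)
```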
prep_synthetic_dataset(shape=(50, 50), min_height=0.0, max_height=30.0, roughness=0.5, **kwargs)
Generates a grid of synthetic elevation data (50x50 by default) using the diamond-square algorithm.
Refer to the following repo for more details
Parameters:
Name | Type | Description | Default |
---|---|---|---|
shape | tuple | (x, y); Grid size along the x and y axis | (50, 50) |
min_height | float | Minimum allowed height in the sampled data | 0.0 |
max_height | float | Maximum allowed height in the sampled data | 30.0 |
roughness | float | Roughness of the sampled data | 0.5 |
Returns:
Name | Type | Description |
---|---|---|
X | ndarray | (n, d); Dataset input features |
y | ndarray | (n, 1); Dataset labels |
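To make the return shapes concrete, the sketch below flattens a 2-D height grid into `(n, 2)` inputs and `(n, 1)` labels. Uniform random heights stand in for the diamond-square output, and `grid_to_dataset` is a hypothetical helper; using integer grid indices as input features is an assumption, not something the documentation states.

```python
import numpy as np

def grid_to_dataset(heights):
    """Flatten a 2-D height grid into (n, 2) inputs and (n, 1) labels."""
    nx, ny = heights.shape
    # indexing="ij" keeps xs[i, j] == i and ys[i, j] == j, matching heights[i, j]
    xs, ys = np.meshgrid(np.arange(nx), np.arange(ny), indexing="ij")
    X = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(float)
    y = heights.reshape(-1, 1)
    return X, y

grid_rng = np.random.default_rng(0)
# Stand-in for diamond-square terrain within [min_height, max_height] = [0, 30]
heights = grid_rng.uniform(0.0, 30.0, size=(50, 50))
X_syn, y_syn = grid_to_dataset(heights)
```

With the default `shape=(50, 50)` this yields n = 2500 samples, one per grid cell.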
prep_tif_dataset(dataset_path)
Load and preprocess a dataset from a GeoTIFF file (.tif file). The input features are set to the x and y pixel block coordinates and the labels are read from the file. The method also removes all invalid points.
Large tif files need to be downsampled using the following command:
gdalwarp -tr 50 50 <input>.tif <output>.tif
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset_path | str | Path to the GeoTIFF dataset file. | required |
Returns:
Name | Type | Description |
---|---|---|
X | ndarray | (n, d); Dataset input features |
y | ndarray | (n, 1); Dataset labels |
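The preprocessing described for `prep_tif_dataset` (pixel coordinates as inputs, raster values as labels, invalid points dropped) can be sketched without any GeoTIFF reader. The example assumes invalid pixels are already encoded as NaN or infinity; real files often use a nodata value instead, and `raster_to_dataset` is a hypothetical helper rather than the library's code.

```python
import numpy as np

def raster_to_dataset(raster):
    """Turn a 2-D raster into (n, 2) pixel coordinates and (n, 1) labels.

    Pixels that are NaN or infinite are treated as invalid and dropped.
    """
    rows, cols = np.indices(raster.shape)
    valid = np.isfinite(raster)
    X = np.stack([rows[valid], cols[valid]], axis=1).astype(float)
    y = raster[valid].reshape(-1, 1)
    return X, y

# Tiny stand-in raster with two invalid pixels
raster = np.array([[1.0, np.nan, 3.0],
                   [4.0, 5.0, np.nan]])
X_tif, y_tif = raster_to_dataset(raster)
```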
remove_circle_patches(X, Y, circle_patches)
Remove points inside circle patches.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | ndarray | (N,); array of x-coordinates | required |
Y | ndarray | (N,); array of y-coordinates | required |
circle_patches | list of matplotlib circle patches | Circle patches to remove from the X, Y points | required |
Returns:
Name | Type | Description |
---|---|---|
X | ndarray | (N,); array of x-coordinates |
Y | ndarray | (N,); array of y-coordinates |
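The filtering idea behind `remove_circle_patches` is shown below in a dependency-light form. Plain `(cx, cy, r)` tuples stand in for matplotlib circle patches, and points exactly on a circle's boundary are treated as inside; both choices, and the name `remove_circles_sketch`, are assumptions made for illustration.

```python
import numpy as np

def remove_circles_sketch(X, Y, circles):
    """Drop points that fall inside any circle.

    `circles` is a list of (cx, cy, r) tuples, a stand-in for
    matplotlib circle patches.
    """
    keep = np.ones(X.shape, dtype=bool)
    for cx, cy, r in circles:
        # Keep only points strictly outside this circle
        keep &= (X - cx) ** 2 + (Y - cy) ** 2 > r ** 2
    return X[keep], Y[keep]

X_pts = np.array([0.0, 1.0, 3.0, 5.0])
Y_pts = np.array([0.0, 1.0, 3.0, 5.0])
# One circle of radius 2 centred on the origin removes the first two points
X_out, Y_out = remove_circles_sketch(X_pts, Y_pts, [(0.0, 0.0, 2.0)])
```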
remove_polygons(X, Y, polygons)
Remove points inside polygons.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | ndarray | (N,); array of x-coordinates | required |
Y | ndarray | (N,); array of y-coordinates | required |
polygons | list of matplotlib path polygons | Polygons to remove from the X, Y points | required |
Returns:
Name | Type | Description |
---|---|---|
X | ndarray | (N,); array of x-coordinates |
Y | ndarray | (N,); array of y-coordinates |
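As a rough illustration of what removing points inside polygons involves, the sketch below uses an even-odd ray-casting test written in NumPy. It takes polygons as plain lists of `(x, y)` vertices rather than matplotlib paths, so it is a stand-in for whatever containment test the library actually uses; `points_in_polygon` and `remove_polygons_sketch` are hypothetical names.

```python
import numpy as np

def points_in_polygon(x, y, vertices):
    """Even-odd ray casting: True where (x, y) lies inside the polygon."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    inside = np.zeros(x.shape, dtype=bool)
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        if y0 == y1:
            continue  # a horizontal edge never crosses a horizontal ray
        # Does the edge straddle the horizontal line through each point?
        straddles = (y0 > y) != (y1 > y)
        # x-coordinate where the edge meets that horizontal line
        x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
        # Toggle for every crossing to the right of the point
        inside ^= straddles & (x < x_cross)
    return inside

def remove_polygons_sketch(X, Y, polygons):
    """Drop points that fall inside any polygon (list of vertex lists)."""
    keep = np.ones(X.shape, dtype=bool)
    for vertices in polygons:
        keep &= ~points_in_polygon(X, Y, vertices)
    return X[keep], Y[keep]

square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
X_in = np.array([1.0, 3.0, -1.0])
Y_in = np.array([1.0, 1.0, 1.0])
# Only the first point lies inside the square, so it is removed
X_kept, Y_kept = remove_polygons_sketch(X_in, Y_in, [square])
```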