libcity.data.dataset.dataset_subclass
- libcity.data.dataset.dataset_subclass.acfm_dataset
- libcity.data.dataset.dataset_subclass.astgcn_dataset
- libcity.data.dataset.dataset_subclass.ccrnn_dataset
- libcity.data.dataset.dataset_subclass.chebconv_dataset
- libcity.data.dataset.dataset_subclass.convgcn_dataset
- libcity.data.dataset.dataset_subclass.crann_dataset
- libcity.data.dataset.dataset_subclass.cstn_dataset
- libcity.data.dataset.dataset_subclass.dmvstnet_dataset
- libcity.data.dataset.dataset_subclass.geosan_dataset
- libcity.data.dataset.dataset_subclass.gman_dataset
- libcity.data.dataset.dataset_subclass.gsnet_dataset
- libcity.data.dataset.dataset_subclass.gts_dataset
- libcity.data.dataset.dataset_subclass.hgcn_dataset
- libcity.data.dataset.dataset_subclass.line_dataset
- libcity.data.dataset.dataset_subclass.multi_stgcnet_dataset
- libcity.data.dataset.dataset_subclass.pbs_trajectory_dataset
- libcity.data.dataset.dataset_subclass.reslstm_dataset
- libcity.data.dataset.dataset_subclass.staggcn_dataset
- libcity.data.dataset.dataset_subclass.stdn_dataset
- libcity.data.dataset.dataset_subclass.stg2seq_dataset
- libcity.data.dataset.dataset_subclass.stresnet_dataset
- libcity.data.dataset.dataset_subclass.tgclstm_dataset
-
class libcity.data.dataset.dataset_subclass.ACFMDataset(config)
Bases: libcity.data.dataset.traffic_state_grid_dataset.TrafficStateGridDataset, libcity.data.dataset.traffic_state_cpt_dataset.TrafficStateCPTDataset
-
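_get_external_array(timestamp_list, ext_data=None, previous_ext=False, ext_time=True)
Get the external features corresponding to an array of timestamps.
- Parameters
timestamp_list (list) – sequence of timestamps
ext_data – external data
previous_ext – whether to use external data from a past time period. Real external data is usually unavailable for the predicted period Y, so the data from the previous time step is used; for multi-step prediction, data from the corresponding number of steps earlier is used.
ext_time – whether to load time-of-day information. If False, only the day of the week is considered; if True, hour information is added.
- Returns
External data of shape (len(timestamp_list), ext_dim)
- Return type
numpy.ndarray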
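As a rough illustration of the shape contract above, the following sketch builds one feature row per timestamp: a one-hot day of the week, plus a one-hot hour when ext_time is true. The function name `external_time_features`, the one-hot encoding, and the use of `datetime` objects are assumptions made for illustration. The actual method may encode time differently, and it also merges the `.ext` data and handles previous_ext, both of which are omitted here.

```python
from datetime import datetime


def external_time_features(timestamp_list, ext_time=True):
    """Hypothetical helper: return a (len(timestamp_list), ext_dim) nested list."""
    rows = []
    for ts in timestamp_list:
        # Day of week as a 7-dim one-hot vector (Monday = index 0).
        row = [0] * 7
        row[ts.weekday()] = 1
        if ext_time:
            # Assumed encoding: append the hour as a 24-dim one-hot vector.
            hour = [0] * 24
            hour[ts.hour] = 1
            row.extend(hour)
        rows.append(row)
    return rows


features = external_time_features([datetime(2024, 1, 1, 8), datetime(2024, 1, 6, 23)])
print(len(features), len(features[0]))  # number of timestamps, ext_dim
```

With this encoding ext_dim is 31 when ext_time is true and 7 otherwise.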
-
_load_ext_data(ts_x, ts_y)
Load the external data (.ext) for the given timestamps.
- Parameters
ts_x – timestamps of the input data X, shape: (num_samples, T_c+T_p+T_t)
ts_y – timestamps of the output data Y, shape: (num_samples, )
- Returns
- tuple contains:
ext_x(numpy.ndarray): external data for the input timestamps, shape: (num_samples, T_c+T_p+T_t, ext_dim)
ext_y(numpy.ndarray): external data for the output timestamps, shape: (num_samples, ext_dim)
- Return type
tuple
-
-
class libcity.data.dataset.dataset_subclass.ASTGCNDataset(config)
Bases: libcity.data.dataset.traffic_state_point_dataset.TrafficStatePointDataset
-
_generate_input_data(df)
Slice the input according to the global parameters len_closeness/len_period/len_trend to produce the input the model requires.
- Parameters
df (np.ndarray) – input data, shape: (len_time, …, feature_dim)
- Returns
- tuple contains:
sources(np.ndarray): model input data, shape: (num_samples, Tw+Td+Th, …, feature_dim)
targets(np.ndarray): model output data, shape: (num_samples, Tp, …, feature_dim)
- Return type
tuple
-
_get_sample_indices(data_sequence, label_start_idx)
Using the global parameters len_closeness/len_period/len_trend, find the input data for the prediction target segment [label_start_idx: label_start_idx+output_window).
- Parameters
data_sequence (np.ndarray) – input data, shape: (len_time, …, feature_dim)
label_start_idx (int) – index of the first time slice of the prediction target
- Returns
- tuple contains:
trend_sample: input data 1, (len_trend * self.output_window, …, feature_dim)
period_sample: input data 2, (len_period * self.output_window, …, feature_dim)
closeness_sample: input data 3, (len_closeness * self.output_window, …, feature_dim)
target: output data, (self.output_window, …, feature_dim)
- Return type
tuple
-
_search_data(sequence_length, label_start_idx, num_for_predict, num_of_depend, units)
Find the index positions of the data according to the global parameters len_closeness/len_period/len_trend.
- Parameters
sequence_length (int) – total length of the historical data
label_start_idx (int) – index of the time slice where prediction starts
num_for_predict (int) – length of the predicted time-slice sequence
num_of_depend (int) – len_trend/len_period/len_closeness
units (int) – length of one trend/period/closeness unit (in hours)
- Returns
List of start-end index intervals, list[(start_idx, end_idx)]
- Return type
list
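The index search described above can be sketched as follows. This is a minimal standalone version, not the library's implementation: it takes `points_per_hour` as an explicit argument (an assumption; the documented method has no such parameter and presumably reads it from the dataset configuration), and it returns None when there is not enough history.

```python
def search_data(sequence_length, label_start_idx, num_for_predict,
                num_of_depend, units, points_per_hour):
    """Sketch: list the (start_idx, end_idx) windows that feed one prediction."""
    if label_start_idx + num_for_predict > sequence_length:
        return None  # the target segment would run past the end of the data
    windows = []
    for i in range(1, num_of_depend + 1):
        # Step back i whole units (measured in hours) from the target start.
        start_idx = label_start_idx - points_per_hour * units * i
        end_idx = start_idx + num_for_predict
        if start_idx < 0:
            return None  # not enough history for this window
        windows.append((start_idx, end_idx))
    return windows[::-1]  # oldest window first


# Two closeness windows of 3 slices each, with 12 slices per hour.
print(search_data(100, 50, 3, 2, 1, 12))  # [(26, 29), (38, 41)]
```

Under this scheme a caller would vary only `units` and `num_of_depend` to obtain the trend, period, and closeness windows; for instance, a unit of 1 hour for closeness and 24 hours for period would be one plausible choice.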
-
-
class libcity.data.dataset.dataset_subclass.CCRNNDataset(config)
Bases: libcity.data.dataset.traffic_state_point_dataset.TrafficStatePointDataset
-
_generate_data()
Load the data files (.dyna/.grid/.od/.gridod) and the external data (.ext), merge the two, and return the result as X, y.
- Returns
- tuple contains:
x(np.ndarray): model input data, (num_samples, input_length, …, feature_dim)
y(np.ndarray): model output data, (num_samples, output_length, …, feature_dim)
- Return type
tuple
-
_generate_train_val_test()
Load the dataset, split it into training, validation, and test sets, and cache the result.
- Returns
- tuple contains:
x_train: (num_samples, input_length, …, feature_dim)
y_train: (num_samples, output_length, …, feature_dim)
x_val: (num_samples, input_length, …, feature_dim)
y_val: (num_samples, output_length, …, feature_dim)
x_test: (num_samples, input_length, …, feature_dim)
y_test: (num_samples, output_length, …, feature_dim)
- Return type
tuple
-
_load_cache_train_val_test()
Load the previously cached training, validation, and test sets.
- Returns
- tuple contains:
x_train: (num_samples, input_length, …, feature_dim)
y_train: (num_samples, output_length, …, feature_dim)
x_val: (num_samples, input_length, …, feature_dim)
y_val: (num_samples, output_length, …, feature_dim)
x_test: (num_samples, input_length, …, feature_dim)
y_test: (num_samples, output_length, …, feature_dim)
- Return type
tuple
-
_load_rel()
Build the adjacency matrix from the grid structure; each cell is adjacent to its 8 surrounding cells.
- Returns
self.adj_mx, an N*N adjacency matrix
- Return type
np.ndarray
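The 8-neighbour rule can be sketched as below. The function name `grid_adjacency`, the parameters `len_row` and `len_column`, and the row-major numbering of cells are assumptions for illustration. The sketch returns nested lists to stay dependency-free, whereas the documented method returns an np.ndarray stored in self.adj_mx.

```python
def grid_adjacency(len_row, len_column):
    """Sketch: N*N adjacency for a grid, N = len_row * len_column."""
    n = len_row * len_column
    adj = [[0] * n for _ in range(n)]
    # Offsets of the 8 surrounding cells (the cell itself is excluded).
    offsets = [(di, dj) for di in (-1, 0, 1) for dj in (-1, 0, 1)
               if (di, dj) != (0, 0)]
    for i in range(len_row):
        for j in range(len_column):
            idx = i * len_column + j  # assumed row-major cell index
            for di, dj in offsets:
                ni, nj = i + di, j + dj
                # Skip neighbours that fall outside the grid.
                if 0 <= ni < len_row and 0 <= nj < len_column:
                    adj[idx][ni * len_column + nj] = 1
    return adj


adj = grid_adjacency(3, 3)
print(sum(adj[4]), sum(adj[0]))  # neighbours of the centre cell and of a corner cell
```

Interior cells end up with 8 neighbours, edge cells with 5, and corner cells with 3.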
-
_split_train_val_test(x, y, df=None)
Split the data into training, validation, and test sets, and cache the result.
- Parameters
x (np.ndarray) – input data (num_samples, input_length, …, feature_dim)
y (np.ndarray) – output data (num_samples, output_length, …, feature_dim)
- Returns
- tuple contains:
x_train: (num_samples, input_length, …, feature_dim)
y_train: (num_samples, output_length, …, feature_dim)
x_val: (num_samples, input_length, …, feature_dim)
y_val: (num_samples, output_length, …, feature_dim)
x_test: (num_samples, input_length, …, feature_dim)
y_test: (num_samples, output_length, …, feature_dim)
- Return type
tuple
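A chronological split of this kind might look like the sketch below. The ratio parameters `train_rate` and `eval_rate` and their defaults are assumptions, the caching step is omitted, and the function works on any sliceable sequence (lists here, arrays in practice).

```python
def split_train_val_test(x, y, train_rate=0.7, eval_rate=0.1):
    """Sketch: split samples in order into train / val / test partitions."""
    num_samples = len(x)
    test_rate = 1 - train_rate - eval_rate
    num_test = round(num_samples * test_rate)
    num_train = round(num_samples * train_rate)
    num_val = num_samples - num_train - num_test
    val_end = num_train + num_val
    # Slice along the sample axis only, so all other dimensions are untouched.
    x_train, y_train = x[:num_train], y[:num_train]
    x_val, y_val = x[num_train:val_end], y[num_train:val_end]
    x_test, y_test = x[val_end:], y[val_end:]
    return x_train, y_train, x_val, y_val, x_test, y_test


parts = split_train_val_test(list(range(10)), list(range(10)))
print([len(p) for p in parts])  # [7, 7, 1, 1, 2, 2]
```

Keeping the samples in time order means the test partition is always the most recent data, which avoids leaking future observations into training.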
-
-
class libcity.data.dataset.dataset_subclass.CONVGCNDataset(config)
Bases: libcity.data.dataset.traffic_state_point_dataset.TrafficStatePointDataset
-
class libcity.data.dataset.dataset_subclass.CRANNDataset(config)
Bases: libcity.data.dataset.traffic_state_point_dataset.TrafficStatePointDataset
-
class libcity.data.dataset.dataset_subclass.CSTNDataset(config)
Bases: libcity.data.dataset.traffic_state_grid_od_dataset.TrafficStateGridOdDataset
-
_generate_data()
Load the data file (.gridod) and the external data (.ext), and return them as X, W, y.
- Returns
- tuple contains:
X(np.ndarray): model input data, (num_samples, input_length, …, feature_dim)
W(np.ndarray): model external data, (num_samples, input_length, ext_dim)
y(np.ndarray): model output data, (num_samples, output_length, …, feature_dim)
- Return type
tuple
-
-
class libcity.data.dataset.dataset_subclass.ChebConvDataset(config)
Bases: libcity.data.dataset.abstract_dataset.AbstractDataset
-
_get_scalar(scaler_type, data)
Select the data normalization method according to the global parameter `scaler_type`.
- Parameters
data – training data X
- Returns
The scaler (normalization) object
- Return type
-
-
class libcity.data.dataset.dataset_subclass.DMVSTNetDataset(config)
Bases: libcity.data.dataset.traffic_state_grid_dataset.TrafficStateGridDataset
-
class libcity.data.dataset.dataset_subclass.GMANDataset(config)
Bases: libcity.data.dataset.traffic_state_point_dataset.TrafficStatePointDataset
-
class libcity.data.dataset.dataset_subclass.GSNetDataset(config)
Bases: libcity.data.dataset.traffic_state_cpt_dataset.TrafficStateCPTDataset
-
class libcity.data.dataset.dataset_subclass.GTSDataset(config)
Bases: libcity.data.dataset.traffic_state_point_dataset.TrafficStatePointDataset
-
class libcity.data.dataset.dataset_subclass.GeoSANDataset(config)
Bases: libcity.data.dataset.abstract_dataset.AbstractDataset
-
static collect_fn_quadkey(batch, data_source, sampler, region_processer, loc2quadkey=None, k=5, with_trg_quadkey=True)
-
static
-
class libcity.data.dataset.dataset_subclass.HGCNDataset(config)
Bases: libcity.data.dataset.traffic_state_point_dataset.TrafficStatePointDataset
-
class libcity.data.dataset.dataset_subclass.LINEDataset(config)
Bases: libcity.data.dataset.abstract_dataset.AbstractDataset
-
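_generate_data()
LINE is trained in a Skip-Gram-like fashion, similar to Word2Vec (Skip-Gram), with each word pair treated as analogous to an edge in the graph. LINE also applies two optimizations. First, edges are sampled with probability proportional to their weights. Second, it uses negative sampling similar to Word2Vec: whenever an edge is sampled, several "negative" edges are generated from that edge's source node to target nodes drawn with probability proportional to degree^0.75. Finally, to draw samples from the target distribution using Python's uniform random numbers, the O(1) alias sampling method is used.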
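The alias method named in this description can be sketched as follows. This is a generic textbook version of the technique, not the code used by LINEDataset; the function names `build_alias_table` and `alias_draw` are hypothetical. Building the table takes O(n), after which each draw costs O(1): one uniform index plus one uniform coin flip.

```python
import random


def build_alias_table(weights):
    """Sketch: build (prob, alias) tables for O(1) weighted sampling."""
    n = len(weights)
    total = float(sum(weights))
    scaled = [w * n / total for w in weights]  # mean of scaled is 1.0
    prob, alias = [0.0] * n, [0] * n
    small = [i for i, p in enumerate(scaled) if p < 1.0]
    large = [i for i, p in enumerate(scaled) if p >= 1.0]
    while small and large:
        s, l = small.pop(), large.pop()
        prob[s], alias[s] = scaled[s], l
        # The large entry donates mass to fill the small entry's bucket.
        scaled[l] -= 1.0 - scaled[s]
        (small if scaled[l] < 1.0 else large).append(l)
    for i in small + large:  # leftovers are full buckets (up to rounding)
        prob[i], alias[i] = 1.0, i
    return prob, alias


def alias_draw(prob, alias, rng=random):
    """Draw one index: pick a bucket uniformly, then keep it or take its alias."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]


# Negative-sampling distribution proportional to degree^0.75 (example degrees).
degrees = [1, 4, 9, 16]
prob, alias = build_alias_table([d ** 0.75 for d in degrees])
print(alias_draw(prob, alias))
```

The same table builder serves both sampling steps in the description: pass edge weights to sample edges, or degree^0.75 values to sample negative target nodes.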
-
-
class libcity.data.dataset.dataset_subclass.MultiSTGCnetDataset(config)
Bases: libcity.data.dataset.traffic_state_point_dataset.TrafficStatePointDataset, libcity.data.dataset.traffic_state_cpt_dataset.TrafficStateCPTDataset
-
class libcity.data.dataset.dataset_subclass.PBSTrajectoryDataset(config)
Bases: libcity.data.dataset.abstract_dataset.AbstractDataset
Popularity-based negative sampling: weighted random sampling based on np.random.choice.
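Popularity-based negative sampling can be illustrated with the sketch below. Note the substitution: the class description names np.random.choice, while this sketch uses the standard library's `random.choices` to stay dependency-free. The function name `sample_negatives` and the `popularity` mapping (location id to visit count) are hypothetical, and the real dataset may select candidates differently.

```python
import random


def sample_negatives(popularity, positive, k, rng=random):
    """Sketch: draw k negative location ids, weighted by popularity."""
    # Exclude the true (positive) location from the candidate pool.
    candidates = [loc for loc in popularity if loc != positive]
    weights = [popularity[loc] for loc in candidates]
    # random.choices samples with replacement, proportionally to weights.
    return rng.choices(candidates, weights=weights, k=k)


popularity = {"loc_a": 10, "loc_b": 5, "loc_c": 1}
print(sample_negatives(popularity, positive="loc_a", k=4))
```

Frequently visited locations are drawn more often, so the negatives are harder to tell apart from real check-ins than uniformly sampled ones.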
-
cutter_filter()
Storage format of the cut trajectories (dict):
{
    uid: [
        [checkin_record, checkin_record, …],
        [checkin_record, checkin_record, …]
    ]
}
-
encode_traj(data)
Encode the cut trajectories.
- Parameters
data (dict) – the key is uid, the value is the uid's trajectories. For example:
{
    uid: [trajectory1, trajectory2]
}
trajectory1 = [dyna_id, dyna_id, …]
- Returns
For example:
{
    data_feature: {…},
    pad_item: {…},
    encoded_data: {uid: encoded_trajectories}
}
- Return type
dict
-
-
class libcity.data.dataset.dataset_subclass.RESLSTMDataset(config)
Bases: libcity.data.dataset.traffic_state_point_dataset.TrafficStatePointDataset
-
class libcity.data.dataset.dataset_subclass.STAGGCNDataset(config)
Bases: libcity.data.dataset.traffic_state_point_dataset.TrafficStatePointDataset
-
class libcity.data.dataset.dataset_subclass.STDNDataset(config)
Bases: libcity.data.dataset.traffic_state_datatset.TrafficStateDataset
-
_split_train_val_test_stdn(x, y, flatten_att_nbhd_inputs, flatten_att_flow_inputs, att_lstm_inputs, nbhd_inputs, flow_inputs, lstm_inputs)
Split the data into training, validation, and test sets, and cache the result.
- Parameters
x (np.ndarray) – input data (num_samples, input_length, …, feature_dim)
y (np.ndarray) – output data (num_samples, output_length, …, feature_dim)
- Returns
- tuple contains:
x_train: (num_samples, input_length, …, feature_dim)
y_train: (num_samples, output_length, …, feature_dim)
x_val: (num_samples, input_length, …, feature_dim)
y_val: (num_samples, output_length, …, feature_dim)
x_test: (num_samples, input_length, …, feature_dim)
y_test: (num_samples, output_length, …, feature_dim)
- Return type
tuple
-
-
class libcity.data.dataset.dataset_subclass.STG2SeqDataset(config)
Bases: libcity.data.dataset.traffic_state_point_dataset.TrafficStatePointDataset
-
_generate_data()
Load the data files (.dyna/.grid/.od/.gridod) and the external data (.ext), merge the two, and return the result as X, y.
- Returns
- tuple contains:
x(np.ndarray): model input data, (num_samples, input_length, …, feature_dim)
y(np.ndarray): model output data, (num_samples, output_length, …, feature_dim)
- Return type
tuple
-
_generate_train_val_test()
Load the dataset, split it into training, validation, and test sets, and cache the result.
- Returns
- tuple contains:
x_train: (num_samples, input_length, …, feature_dim)
y_train: (num_samples, output_length, …, feature_dim)
x_val: (num_samples, input_length, …, feature_dim)
y_val: (num_samples, output_length, …, feature_dim)
x_test: (num_samples, input_length, …, feature_dim)
y_test: (num_samples, output_length, …, feature_dim)
- Return type
tuple
-
_load_cache_train_val_test()
Load the previously cached training, validation, and test sets.
- Returns
- tuple contains:
x_train: (num_samples, input_length, …, feature_dim)
y_train: (num_samples, output_length, …, feature_dim)
x_val: (num_samples, input_length, …, feature_dim)
y_val: (num_samples, output_length, …, feature_dim)
x_test: (num_samples, input_length, …, feature_dim)
y_test: (num_samples, output_length, …, feature_dim)
- Return type
tuple
-
_load_rel()
Build the adjacency matrix from the grid structure; each cell is adjacent to its 8 surrounding cells.
- Returns
self.adj_mx, an N*N adjacency matrix
- Return type
np.ndarray
-
_split_train_val_test(x, y, df=None)
Split the data into training, validation, and test sets, and cache the result.
- Parameters
x (np.ndarray) – input data (num_samples, input_length, …, feature_dim)
y (np.ndarray) – output data (num_samples, output_length, …, feature_dim)
- Returns
- tuple contains:
x_train: (num_samples, input_length, …, feature_dim)
y_train: (num_samples, output_length, …, feature_dim)
x_val: (num_samples, input_length, …, feature_dim)
y_val: (num_samples, output_length, …, feature_dim)
x_test: (num_samples, input_length, …, feature_dim)
y_test: (num_samples, output_length, …, feature_dim)
- Return type
tuple
-
-
class libcity.data.dataset.dataset_subclass.STResNetDataset(config)
Bases: libcity.data.dataset.traffic_state_grid_dataset.TrafficStateGridDataset, libcity.data.dataset.traffic_state_cpt_dataset.TrafficStateCPTDataset
For external data, the original STResNet source code uses only ext_y and never uses ext_x.
-
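_get_external_array(timestamp_list, ext_data=None, previous_ext=False, ext_time=True)
Get the external features corresponding to an array of timestamps.
- Parameters
timestamp_list – sequence of timestamps
ext_data – external data
previous_ext – whether to use external data from a past time period. Real external data is usually unavailable for the predicted period Y, so the data from the previous time step is used; for multi-step prediction, data from the corresponding number of steps earlier is used.
- Returns
External data of shape (len(timestamp_list), ext_dim)
- Return type
np.ndarray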
-
_load_ext_data(ts_x, ts_y)
Load the external data (.ext) for the given timestamps.
- Parameters
ts_x – timestamps of the input data X, shape: (num_samples, T_c+T_p+T_t)
ts_y – timestamps of the output data Y, shape: (num_samples, )
- Returns
- tuple contains:
ext_x(np.ndarray): external data for the input timestamps, shape: (num_samples, T_c+T_p+T_t, ext_dim)
ext_y(np.ndarray): external data for the output timestamps, shape: (num_samples, ext_dim)
- Return type
tuple
-
-
class libcity.data.dataset.dataset_subclass.TGCLSTMDataset(config)
Bases: libcity.data.dataset.traffic_state_point_dataset.TrafficStatePointDataset