libcity.data.dataset.traffic_state_cpt_dataset
-
class
libcity.data.dataset.traffic_state_cpt_dataset.
TrafficStateCPTDataset
(config) Bases:
libcity.data.dataset.traffic_state_datatset.TrafficStateDataset
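Another base class for traffic state prediction datasets. Some traffic prediction models make predictions by modeling closeness, period, and trend. By default, the data selected by len_closeness/len_period/len_trend is used to predict the data at the current time step, i.e. one X and one y (usually single-step prediction). The raw timestamps of the data must not be empty. External data is usually modeled separately, so each sample is [X, y, X_ext (optional), y_ext (optional)]. By default, `train_rate` and `eval_rate` are used to split the training, validation, and test sets directly along the sample (num_samples) dimension.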
-
_add_external_information
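(df, ext_data=None) Combines the external data and the raw traffic state data into a high-dimensional array. Subclasses normally implement this method to specify how external data and traffic state data are fused. Because methods based on len_closeness/len_period/len_trend generally process external data separately, this method does not need to be implemented here.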
- Parameters
df (np.ndarray) – multi-dimensional array of traffic state data
ext_data (np.ndarray) – external data
- Returns
The fused external data and traffic state data
- Return type
np.ndarray
-
_generate_data
() Loads the data files (.dyna/.grid/.od/.gridod) and the external data (.ext).
- Returns
- tuple contains:
x(np.ndarray): model input data, (num_samples, T_c+T_p+T_t, …, feature_dim)
y(np.ndarray): model output data, (num_samples, 1, …, feature_dim)
ext_x(np.ndarray): external data for the model input, (num_samples, T_c+T_p+T_t, ext_dim)
ext_y(np.ndarray): external data for the model output, (num_samples, ext_dim)
- Return type
tuple
-
_generate_input_data
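(df) Slices the data according to the global parameters len_closeness/len_period/len_trend to produce the input the model needs. interval_period is the length of one period, usually one day, given in days. interval_trend is the length of one trend, usually one week, given in days. The pad_** parameters specify how far to extend forward or backward. The three input segments are used together to predict the output (single-step prediction).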
- Parameters
df (np.ndarray) – data array, shape: (len_time, …, feature_dim)
- Returns
- tuple contains:
x(np.ndarray): model input data, (num_samples, T_c+T_p+T_t, …, feature_dim)
y(np.ndarray): model output data, (num_samples, 1, …, feature_dim)
ts_x: time slices corresponding to the input data, (num_samples, T_c+T_p+T_t)
ts_y: time slices corresponding to the output data, (num_samples, )
- Return type
tuple
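The slicing described for _generate_input_data can be illustrated with a simplified sketch. It is not LibCity's implementation: it works on integer time indices rather than timestamps, ignores the pad_** parameters, and assumes the segments are ordered closeness, period, trend to match the T_c+T_p+T_t notation. The helper name `make_cpt_samples` and the argument `steps_per_day` are hypothetical and exist only for this illustration.

```python
import numpy as np


def make_cpt_samples(data, len_closeness, len_period, len_trend,
                     interval_period=1, interval_trend=7, steps_per_day=24):
    """Build (x, y) pairs from closeness/period/trend slices (hypothetical helper).

    data: array of shape (len_time, ..., feature_dim).
    interval_period and interval_trend are given in days.
    steps_per_day is an assumed number of time slices per day.
    """
    period_step = interval_period * steps_per_day
    trend_step = interval_trend * steps_per_day
    # Look-back offsets (in time steps) from the target index, oldest first
    # within each segment. Segment order is an assumption.
    offsets = (
        list(range(len_closeness, 0, -1))
        + [period_step * k for k in range(len_period, 0, -1)]
        + [trend_step * k for k in range(len_trend, 0, -1)]
    )
    if not offsets:
        raise ValueError("at least one of the three lengths must be positive")
    first_target = max(offsets)
    if data.shape[0] <= first_target:
        raise ValueError("not enough time steps to build a single sample")

    xs, ys = [], []
    for t in range(first_target, data.shape[0]):
        # One input sample stacks all look-back slices along a new time axis.
        xs.append(np.stack([data[t - o] for o in offsets]))
        # The target is the single slice at time t (single-step prediction).
        ys.append(data[t][np.newaxis])
    return np.stack(xs), np.stack(ys)
```

With `len_closeness=2, len_period=1, len_trend=1`, each input sample has four time steps and each target has one, which matches the (num_samples, T_c+T_p+T_t, …, feature_dim) and (num_samples, 1, …, feature_dim) shapes listed above.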
-
_generate_train_val_test
() Loads the dataset, splits it into training, validation, and test sets, and caches the result.
- Returns
- tuple contains:
x_train: (num_samples, T_c+T_p+T_t, …, feature_dim)
y_train: (num_samples, 1, …, feature_dim)
x_val: (num_samples, T_c+T_p+T_t, …, feature_dim)
y_val: (num_samples, 1, …, feature_dim)
x_test: (num_samples, T_c+T_p+T_t, …, feature_dim)
y_test: (num_samples, 1, …, feature_dim)
ext_x_train: (num_samples, T_c+T_p+T_t, ext_dim)
ext_y_train: (num_samples, ext_dim)
ext_x_val: (num_samples, T_c+T_p+T_t, ext_dim)
ext_y_val: (num_samples, ext_dim)
ext_x_test: (num_samples, T_c+T_p+T_t, ext_dim)
ext_y_test: (num_samples, ext_dim)
- Return type
tuple
-
_get_external_array
(timestamp_list, ext_data=None, previous_ext=False) Retrieves the external features for the times given in the timestamp array.
- Parameters
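timestamp_list (np.ndarray) – sequence of timestamps
ext_data – external data
previous_ext – whether to use external data from a past time period. Real external data is usually not available for the predicted period Y, so the data from the previous time step is used instead; for multi-step prediction, data from several steps earlier is used.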
- Returns
External data shape is (len(timestamp_list), ext_dim)
- Return type
np.ndarray
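A minimal sketch of this kind of timestamp lookup, including the previous_ext shift, might look as follows. It is an illustration under stated assumptions rather than the library's code: the helper `lookup_external`, the arguments `ext_times` and `ext_values`, and the `time_interval` default are all hypothetical, and external records are assumed to be sorted by time.

```python
import numpy as np


def lookup_external(timestamp_list, ext_times, ext_values,
                    previous_ext=False, time_interval=np.timedelta64(1, "h")):
    """Return external features aligned to timestamp_list (hypothetical helper).

    ext_times: sorted 1-D array of np.datetime64 values, one per external record.
    ext_values: array of shape (len(ext_times), ext_dim).
    """
    query = np.asarray(timestamp_list)
    if previous_ext:
        # Real external data is usually unavailable for the predicted step,
        # so look one interval back instead.
        query = query - time_interval
    # Binary search for each queried timestamp in the sorted record times.
    idx = np.clip(np.searchsorted(ext_times, query), 0, len(ext_times) - 1)
    if not np.array_equal(ext_times[idx], query):
        raise KeyError("some timestamps have no external record")
    return ext_values[idx]
```

For a 1-D `timestamp_list` the result has shape (len(timestamp_list), ext_dim), as stated in the Returns entry above.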
-
_load_cache_train_val_test
() Loads the previously cached training, validation, and test sets.
- Returns
- tuple contains:
x_train: (num_samples, T_c+T_p+T_t, …, feature_dim)
y_train: (num_samples, 1, …, feature_dim)
x_val: (num_samples, T_c+T_p+T_t, …, feature_dim)
y_val: (num_samples, 1, …, feature_dim)
x_test: (num_samples, T_c+T_p+T_t, …, feature_dim)
y_test: (num_samples, 1, …, feature_dim)
ext_x_train: (num_samples, T_c+T_p+T_t, ext_dim)
ext_y_train: (num_samples, ext_dim)
ext_x_val: (num_samples, T_c+T_p+T_t, ext_dim)
ext_y_val: (num_samples, ext_dim)
ext_x_test: (num_samples, T_c+T_p+T_t, ext_dim)
ext_y_test: (num_samples, ext_dim)
- Return type
tuple
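One common way to cache and reload a set of named arrays like these is NumPy's compressed `.npz` format. The sketch below only illustrates that round trip; the file name, the reduced set of keys, and the array sizes are assumptions, and it does not claim to reproduce the cache layout this class actually writes.

```python
import os
import tempfile

import numpy as np

# Toy splits with the documented axis layout (sizes are arbitrary).
splits = {
    "x_train": np.zeros((7, 4, 2)),
    "y_train": np.zeros((7, 1, 2)),
    "x_test": np.ones((2, 4, 2)),
    "y_test": np.ones((2, 1, 2)),
}

with tempfile.TemporaryDirectory() as tmp_dir:
    cache_path = os.path.join(tmp_dir, "cpt_cache.npz")  # hypothetical file name
    # Write every array under its own key in one compressed archive.
    np.savez_compressed(cache_path, **splits)
    # Read the arrays back into memory before the temporary directory is removed.
    with np.load(cache_path) as cached:
        restored = {name: cached[name] for name in cached.files}
```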
-
_load_data
() Loads the data files (.dyna/.grid/.od/.gridod).
- Returns
- tuple contains:
x: (num_samples, T_c+T_p+T_t, …, feature_dim)
y: (num_samples, 1, …, feature_dim)
ts_x: (num_samples, T_c+T_p+T_t)
ts_y: (num_samples, )
- Return type
tuple
-
_load_ext_data
(ts_x, ts_y) Loads the external data (.ext) for the corresponding times.
- Parameters
ts_x – timestamps corresponding to the input data X, shape: (num_samples, T_c+T_p+T_t)
ts_y – timestamps corresponding to the output data Y, shape: (num_samples, )
- Returns
- tuple contains:
ext_x(np.ndarray): external data for the corresponding times, shape: (num_samples, T_c+T_p+T_t, ext_dim)
ext_y(np.ndarray): external data for the corresponding times, shape: (num_samples, ext_dim)
- Return type
tuple
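The shape relationship between ts_x/ts_y and ext_x/ext_y can be seen with a small sketch. It assumes, purely for illustration, that the time slices are already integer row indices into an external feature table; `ext_table`, `num_slots`, and the index values are hypothetical.

```python
import numpy as np

num_slots, ext_dim = 48, 3
# Hypothetical external feature table: one row of ext_dim features per time slot.
ext_table = np.random.default_rng(0).random((num_slots, ext_dim))

ts_x = np.array([[10, 11, 12], [11, 12, 13]])  # (num_samples, T_c+T_p+T_t)
ts_y = np.array([13, 14])                      # (num_samples,)

# Fancy indexing keeps the index shape and appends the feature axis.
ext_x = ext_table[ts_x]  # (num_samples, T_c+T_p+T_t, ext_dim)
ext_y = ext_table[ts_y]  # (num_samples, ext_dim)
```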
-
_split_train_val_test
(x, y, ext_x=None, ext_y=None) Splits the data into training, validation, and test sets, and caches the result.
- Parameters
x (np.ndarray) – input data (num_samples, T_c+T_p+T_t, …, feature_dim)
y (np.ndarray) – output data (num_samples, 1, …, feature_dim)
ext_x (np.ndarray) – external data for the input (num_samples, T_c+T_p+T_t, ext_dim)
ext_y (np.ndarray) – external data for the output (num_samples, ext_dim)
- Returns
- tuple contains:
x_train: (num_samples, T_c+T_p+T_t, …, feature_dim)
y_train: (num_samples, 1, …, feature_dim)
x_val: (num_samples, T_c+T_p+T_t, …, feature_dim)
y_val: (num_samples, 1, …, feature_dim)
x_test: (num_samples, T_c+T_p+T_t, …, feature_dim)
y_test: (num_samples, 1, …, feature_dim)
ext_x_train: (num_samples, T_c+T_p+T_t, ext_dim)
ext_y_train: (num_samples, ext_dim)
ext_x_val: (num_samples, T_c+T_p+T_t, ext_dim)
ext_y_val: (num_samples, ext_dim)
ext_x_test: (num_samples, T_c+T_p+T_t, ext_dim)
ext_y_test: (num_samples, ext_dim)
- Return type
tuple
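A split along the num_samples dimension driven by `train_rate` and `eval_rate` can be sketched as below. This is a minimal illustration, not the method's actual code: the helper `split_by_rate`, the default rates, the rounding rule, and the train, validation, test ordering of the slices are assumptions.

```python
import numpy as np


def split_by_rate(arrays, train_rate=0.7, eval_rate=0.1):
    """Split each array along axis 0 into (train, val, test) (hypothetical helper).

    All arrays must share the same first dimension (num_samples).
    """
    num_samples = arrays[0].shape[0]
    num_test = round(num_samples * (1 - train_rate - eval_rate))
    num_train = round(num_samples * train_rate)
    # Whatever remains after train and test goes to validation.
    num_val = num_samples - num_train - num_test
    result = []
    for arr in arrays:
        train = arr[:num_train]
        val = arr[num_train:num_train + num_val]
        test = arr[num_train + num_val:]
        result.append((train, val, test))
    return result
```

Passing `[x, y, ext_x, ext_y]` yields the twelve arrays listed above, grouped per input array instead of flattened into a single tuple.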
-