Openpcdet训练自己的数据集
一. Openpcdet的安装以及使用
* Openpcdet详细内容请看以下链接:
GitHub - open-mmlab/OpenPCDet: OpenPCDet Toolbox for LiDAR-based 3D Object Detection.
1.首先gitclone原文代码
2. 这里我建议自己按照作者github上的docs/install文件夹下指示一步步安装,(之前根据csdn上教程一直有报错),然后下载spconv,以及cumm, github链接如下:
GitHub - traveller59/spconv: Spatial Sparse Convolution Library
GitHub - FindDefinition/cumm: CUda Matrix Multiply library.
3. 打开spconv中的readme,并且严格按照readme步骤安装,一般需要编译一段时间。
4. 打开cumm中readme,严格按照上面指示安装。
5. 安装完成之后按照测试数据跑通检验一下。
二. Openpcdet训练自己的数据集
* 本人移植其他的数据集,由于我有自己的image数据,已经按照kitti数据集的格式转换为velodyne, calib, label, image四个文件,并且实现了评估,以及最终的检测结果,所以可能和其他博主不一样。
* 如果你只有velodyne,label,或者数据集格式还不知道如何转换,文件建议参考以下这几个博主的链接:
Training using our own dataset · Issue #771 · open-mmlab/OpenPCDet · GitHub
3D目标检测(4):OpenPCDet训练篇--自定义数据集 - 知乎
这里首先总结以下主要涉及到以下三个文件的修改
* pcdet/datasets/custom/custom_dataset.py
* tools/cfgs/custom_models/pointpillar.yaml (也可以是其他模型)
* tools/cfgs/dataset_configs/custom_dataset.yaml
* demo.py
1.pcdet/datasets/custom/custom_dataset.py
其实custom_dataset.py只需要大家去模仿kitti_dataset.py去删改就可以了,而且大部分内容不需要用户修改,这里我修改了:
1)get_lidar函数
* 获取激光雷达数据,其他的get_image也类似
2) __getitem__函数
* 这个函数最重要,是获取数据字典并更新的关键
* 如果有些字典不需要可以删改,如calib,image等
3)get_infos函数
* 生成字典信息infos
infos={'image':xxx,
'calib': xxx,
'annos': xxx}
annos = {'name': xxx,
'truncated': xxx,
'alpha':xxx,
.............}
其中annos就是解析你的label文件生成的字典, 如类别名,是否被遮挡,bbox的角度
同理有些字典信息不需要可以增删
3) create_custom_infos函数
这个函数主要用来生成你的数据字典,一般以.pkl后缀,如果你不需要评估,可以将其中的评估部分删除,原理也很简单。
4) main函数中的类别信息
修改后的代码如下:
-
import copy
-
import pickle
-
import os
-
from skimage import io
-
import numpy as np
-
-
from ..kitti import kitti_utils
-
from ...ops.roiaware_pool3d import roiaware_pool3d_utils
-
from ...utils import box_utils, common_utils, calibration_kitti, object3d_custom
-
from ..dataset import DatasetTemplate
-
-
-
class CustomDataset(DatasetTemplate):
-
def __init__(self, dataset_cfg, class_names, training=True, root_path=None, logger=None, ext='.bin'):
-
"""
-
Args:
-
root_path:
-
dataset_cfg:
-
class_names:
-
training:
-
logger:
-
"""
-
super().__init__(
-
dataset_cfg=dataset_cfg, class_names=class_names, training=training, root_path=root_path, logger=logger
-
)
-
self.split = self.dataset_cfg.DATA_SPLIT[self.mode]
-
self.root_split_path = self.root_path / ('training' if self.split != 'test' else 'testing')
-
-
split_dir = os.path.join(self.root_path, 'ImageSets', (self.split '.txt')) # custom/ImagSets/xxx.txt
-
self.sample_id_list = [x.strip() for x in open(split_dir).readlines()] if os.path.exists(split_dir) else None # xxx.txt内的内容
-
-
self.custom_infos = []
-
self.include_data(self.mode) # train/val
-
self.map_class_to_kitti = self.dataset_cfg.MAP_CLASS_TO_KITTI
-
self.ext = ext
-
-
-
def include_data(self, mode):
-
self.logger.info('Loading Custom dataset.')
-
custom_infos = []
-
-
for info_path in self.dataset_cfg.INFO_PATH[mode]:
-
info_path = self.root_path / info_path
-
if not info_path.exists():
-
continue
-
with open(info_path, 'rb') as f:
-
infos = pickle.load(f)
-
-
def get_label(self, idx):
-
label_file = self.root_split_path / 'label_2' / ('%s.txt' % idx)
-
assert label_file.exists()
-
return object3d_custom.get_objects_from_label(label_file)
-
-
def get_lidar(self, idx, getitem=True):
-
if getitem == True:
-
lidar_file = self.root_split_path '/velodyne/' ('%s.bin' % idx)
-
else:
-
lidar_file = self.root_split_path / 'velodyne' / ('%s.bin' % idx)
-
return np.fromfile(str(lidar_file), dtype=np.float32).reshape(-1, 4)
-
-
def get_image(self, idx):
-
"""
-
Loads image for a sample
-
Args:
-
idx: int, Sample index
-
Returns:
-
image: (H, W, 3), RGB Image
-
"""
-
img_file = self.root_split_path / 'image_2' / ('%s.png' % idx)
-
assert img_file.exists()
-
image = io.imread(img_file)
-
image = image.astype(np.float32)
-
image /= 255.0
-
return image
-
-
def get_image_shape(self, idx):
-
img_file = self.root_split_path / 'image_2' / ('%s.png' % idx)
-
assert img_file.exists()
-
return np.array(io.imread(img_file).shape[:2], dtype=np.int32)
-
-
def get_fov_flag(self, pts_rect, img_shape, calib):
-
"""
-
Args:
-
pts_rect:
-
img_shape:
-
calib:
-
-
Returns:
-
-
"""
-
pts_img, pts_rect_depth = calib.rect_to_img(pts_rect)
-
val_flag_1 = np.logical_and(pts_img[:, 0] >= 0, pts_img[:, 0] < img_shape[1])
-
val_flag_2 = np.logical_and(pts_img[:, 1] >= 0, pts_img[:, 1] < img_shape[0])
-
val_flag_merge = np.logical_and(val_flag_1, val_flag_2)
-
pts_valid_flag = np.logical_and(val_flag_merge, pts_rect_depth >= 0)
-
-
return pts_valid_flag
-
-
def set_split(self, split):
-
super().__init__(
-
dataset_cfg=self.dataset_cfg, class_names=self.class_names, training=self.training,
-
root_path=self.root_path, logger=self.logger
-
)
-
self.split = split
-
-
split_dir = self.root_path / 'ImageSets' / (self.split '.txt')
-
self.sample_id_list = [x.strip() for x in open(split_dir).readlines()] if split_dir.exists() else None
-
custom_infos.extend(infos)
-
-
self.custom_infos.extend(custom_infos)
-
self.logger.info('Total samples for CUSTOM dataset: %d' % (len(custom_infos)))
-
-
-
def __len__(self):
-
if self._merge_all_iters_to_one_epoch:
-
return len(self.sample_id_list) * self.total_epochs
-
-
return len(self.custom_infos)
-
-
def __getitem__(self, index):
-
if self._merge_all_iters_to_one_epoch:
-
index = index % len(self.custom_infos)
-
-
info = copy.deepcopy(self.custom_infos[index])
-
-
sample_idx = info['point_cloud']['lidar_idx']
-
img_shape = info['image']['image_shape']
-
calib = self.get_calib(sample_idx)
-
get_item_list = self.dataset_cfg.get('GET_ITEM_LIST', ['points'])
-
input_dict = {
-
'frame_id': self.sample_id_list[index],
-
'calib': calib,
-
}
-
-
# 如果annos标签存在info的字典里
-
if 'annos' in info:
-
annos = info['annos']
-
annos = common_utils.drop_info_with_name(annos, name='DontCare')
-
loc, dims, rots = annos['location'], annos['dimensions'], annos['rotation_y']
-
gt_names = annos['name']
-
gt_boxes_camera = np.concatenate([loc, dims, rots[..., np.newaxis]], axis=1).astype(np.float32)
-
gt_boxes_lidar = box_utils.boxes3d_kitti_camera_to_lidar(gt_boxes_camera, calib)
-
-
# 更新gtbox
-
input_dict.update({
-
'gt_names': gt_names,
-
'gt_boxes': gt_boxes_lidar
-
})
-
if "gt_boxes2d" in get_item_list:
-
input_dict['gt_boxes2d'] = annos["bbox"]
-
-
# 获取fov视角的points
-
if "points" in get_item_list:
-
points = self.get_lidar(sample_idx, False)
-
if self.dataset_cfg.FOV_POINTS_ONLY:
-
pts_rect = calib.lidar_to_rect(points[:, 0:3])
-
fov_flag = self.get_fov_flag(pts_rect, img_shape, calib)
-
points = points[fov_flag]
-
input_dict['points'] = points
-
-
input_dict['calib'] = calib
-
data_dict = self.prepare_data(data_dict=input_dict)
-
-
data_dict['image_shape'] = img_shape
-
return data_dict
-
-
def evaluation(self, det_annos, class_names, **kwargs):
-
if 'annos' not in self.custom_infos[0].keys():
-
return 'No ground-truth boxes for evaluation', {}
-
-
def kitti_eval(eval_det_annos, eval_gt_annos, map_name_to_kitti):
-
from ..kitti.kitti_object_eval_python import eval as kitti_eval
-
from ..kitti import kitti_utils
-
-
kitti_utils.transform_annotations_to_kitti_format(eval_det_annos, map_name_to_kitti=map_name_to_kitti)
-
kitti_utils.transform_annotations_to_kitti_format(
-
eval_gt_annos, map_name_to_kitti=map_name_to_kitti,
-
info_with_fakelidar=self.dataset_cfg.get('INFO_WITH_FAKELIDAR', False)
-
)
-
kitti_class_names = [map_name_to_kitti[x] for x in class_names]
-
ap_result_str, ap_dict = kitti_eval.get_official_eval_result(
-
gt_annos=eval_gt_annos, dt_annos=eval_det_annos, current_classes=kitti_class_names
-
)
-
return ap_result_str, ap_dict
-
-
eval_det_annos = copy.deepcopy(det_annos)
-
eval_gt_annos = [copy.deepcopy(info['annos']) for info in self.custom_infos]
-
-
if kwargs['eval_metric'] == 'kitti':
-
ap_result_str, ap_dict = kitti_eval(eval_det_annos, eval_gt_annos, self.map_class_to_kitti)
-
else:
-
raise NotImplementedError
-
-
return ap_result_str, ap_dict
-
-
def get_calib(self, idx):
-
calib_file = self.root_split_path / 'calib' / ('%s.txt' % idx)
-
assert calib_file.exists()
-
return calibration_kitti.Calibration(calib_file)
-
-
def get_infos(self, num_workers=4, has_label=True, count_inside_pts=True, sample_id_list=None):
-
import concurrent.futures as futures
-
-
def process_single_scene(sample_idx):
-
-
# 生成point_cloud字典
-
print('%s sample_idx: %s' % (self.split, sample_idx))
-
info = {}
-
pc_info = {'num_features': 4, 'lidar_idx': sample_idx}
-
info['point_cloud'] = pc_info
-
-
# 生成image字典
-
image_info = {'image_idx': sample_idx, 'image_shape': self.get_image_shape(sample_idx)}
-
info['image'] = image_info
-
-
# 生成calib字典
-
calib = self.get_calib(sample_idx)
-
P2 = np.concatenate([calib.P2, np.array([[0., 0., 0., 1.]])], axis=0)
-
R0_4x4 = np.zeros([4, 4], dtype=calib.R0.dtype)
-
R0_4x4[3, 3] = 1.
-
R0_4x4[:3, :3] = calib.R0
-
V2C_4x4 = np.concatenate([calib.V2C, np.array([[0., 0., 0., 1.]])], axis=0)
-
calib_info = {'P2': P2, 'R0_rect': R0_4x4, 'Tr_velo_to_cam': V2C_4x4}
-
info['calib'] = calib_info
-
-
if has_label:
-
# 生成annos字典
-
obj_list = self.get_label(sample_idx)
-
annotations = {}
-
annotations['name'] = np.array([obj.cls_type for obj in obj_list])
-
annotations['truncated'] = np.array([obj.truncation for obj in obj_list])
-
annotations['occluded'] = np.array([obj.occlusion for obj in obj_list])
-
annotations['alpha'] = np.array([obj.alpha for obj in obj_list])
-
annotations['bbox'] = np.concatenate([obj.box2d.reshape(1, 4) for obj in obj_list], axis=0)
-
annotations['dimensions'] = np.array([[obj.l, obj.h, obj.w] for obj in obj_list]) # lhw(camera) format
-
annotations['location'] = np.concatenate([obj.loc.reshape(1, 3) for obj in obj_list], axis=0)
-
annotations['rotation_y'] = np.array([obj.ry for obj in obj_list])
-
annotations['score'] = np.array([obj.score for obj in obj_list])
-
annotations['difficulty'] = np.array([obj.level for obj in obj_list], np.int32)
-
-
num_objects = len([obj.cls_type for obj in obj_list if obj.cls_type != 'DontCare'])
-
num_gt = len(annotations['name'])
-
index = list(range(num_objects)) [-1] * (num_gt - num_objects)
-
annotations['index'] = np.array(index, dtype=np.int32)
-
-
loc = annotations['location'][:num_objects]
-
dims = annotations['dimensions'][:num_objects]
-
rots = annotations['rotation_y'][:num_objects]
-
loc_lidar = calib.rect_to_lidar(loc)
-
l, h, w = dims[:, 0:1], dims[:, 1:2], dims[:, 2:3]
-
loc_lidar[:, 2] = h[:, 0] / 2
-
gt_boxes_lidar = np.concatenate([loc_lidar, l, w, h, -(np.pi / 2 rots[..., np.newaxis])], axis=1)
-
annotations['gt_boxes_lidar'] = gt_boxes_lidar
-
-
info['annos'] = annotations
-
-
if count_inside_pts:
-
points = self.get_lidar(sample_idx, False)
-
calib = self.get_calib(sample_idx)
-
pts_rect = calib.lidar_to_rect(points[:, 0:3])
-
-
fov_flag = self.get_fov_flag(pts_rect, info['image']['image_shape'], calib)
-
pts_fov = points[fov_flag]
-
corners_lidar = box_utils.boxes_to_corners_3d(gt_boxes_lidar)
-
num_points_in_gt = -np.ones(num_gt, dtype=np.int32)
-
-
for k in range(num_objects):
-
flag = box_utils.in_hull(pts_fov[:, 0:3], corners_lidar[k])
-
num_points_in_gt[k] = flag.sum()
-
annotations['num_points_in_gt'] = num_points_in_gt
-
-
return info
-
-
sample_id_list = sample_id_list if sample_id_list is not None else self.sample_id_list
-
with futures.ThreadPoolExecutor(num_workers) as executor:
-
infos = executor.map(process_single_scene, sample_id_list)
-
return list(infos)
-
-
def create_groundtruth_database(self, info_path=None, used_classes=None, split='train'):
-
import torch
-
-
database_save_path = Path(self.root_path) / ('gt_database' if split == 'train' else ('gt_database_%s' % split))
-
db_info_save_path = Path(self.root_path) / ('custom_dbinfos_%s.pkl' % split)
-
-
database_save_path.mkdir(parents=True, exist_ok=True)
-
all_db_infos = {}
-
-
with open(info_path, 'rb') as f:
-
infos = pickle.load(f)
-
-
for k in range(len(infos)):
-
print('gt_database sample: %d/%d' % (k 1, len(infos)))
-
info = infos[k]
-
sample_idx = info['point_cloud']['lidar_idx']
-
points = self.get_lidar(sample_idx, False)
-
annos = info['annos']
-
names = annos['name']
-
difficulty = annos['difficulty']
-
bbox = annos['bbox']
-
gt_boxes = annos['gt_boxes_lidar']
-
-
num_obj = gt_boxes.shape[0]
-
point_indices = roiaware_pool3d_utils.points_in_boxes_cpu(
-
torch.from_numpy(points[:, 0:3]), torch.from_numpy(gt_boxes)
-
).numpy() # (nboxes, npoints)
-
-
for i in range(num_obj):
-
filename = '%s_%s_%d.bin' % (sample_idx, names[i], i)
-
filepath = database_save_path / filename
-
gt_points = points[point_indices[i] > 0]
-
-
gt_points[:, :3] -= gt_boxes[i, :3]
-
with open(filepath, 'w') as f:
-
gt_points.tofile(f)
-
-
if (used_classes is None) or names[i] in used_classes:
-
db_path = str(filepath.relative_to(self.root_path)) # gt_database/xxxxx.bin
-
db_info = {'name': names[i], 'path': db_path, 'image_idx': sample_idx, 'gt_idx': i,
-
'box3d_lidar': gt_boxes[i], 'num_points_in_gt': gt_points.shape[0],
-
'difficulty': difficulty[i], 'bbox': bbox[i], 'score': annos['score'][i]}
-
if names[i] in all_db_infos:
-
all_db_infos[names[i]].append(db_info)
-
else:
-
all_db_infos[names[i]] = [db_info]
-
-
# Output the num of all classes in database
-
for k, v in all_db_infos.items():
-
print('Database %s: %d' % (k, len(v)))
-
-
with open(db_info_save_path, 'wb') as f:
-
pickle.dump(all_db_infos, f)
-
-
-
def create_label_file_with_name_and_box(class_names, gt_names, gt_boxes, save_label_path):
-
with open(save_label_path, 'w') as f:
-
for idx in range(gt_boxes.shape[0]):
-
boxes = gt_boxes[idx]
-
name = gt_names[idx]
-
if name not in class_names:
-
continue
-
line = "{x} {y} {z} {l} {w} {h} {angle} {name}\n".format(
-
x=boxes[0], y=boxes[1], z=(boxes[2]), l=boxes[3],
-
w=boxes[4], h=boxes[5], angle=boxes[6], name=name
-
)
-
f.write(line)
-
-
def generate_prediction_dicts(batch_dict, pred_dicts, class_names, output_path=None):
-
"""
-
Args:
-
batch_dict:
-
frame_id:
-
pred_dicts: list of pred_dicts
-
pred_boxes: (N, 7), Tensor
-
pred_scores: (N), Tensor
-
pred_labels: (N), Tensor
-
class_names:
-
output_path:
-
-
Returns:
-
-
"""
-
def get_template_prediction(num_samples):
-
ret_dict = {
-
'name': np.zeros(num_samples), 'truncated': np.zeros(num_samples),
-
'occluded': np.zeros(num_samples), 'alpha': np.zeros(num_samples),
-
'bbox': np.zeros([num_samples, 4]), 'dimensions': np.zeros([num_samples, 3]),
-
'location': np.zeros([num_samples, 3]), 'rotation_y': np.zeros(num_samples),
-
'score': np.zeros(num_samples), 'boxes_lidar': np.zeros([num_samples, 7])
-
}
-
return ret_dict
-
-
def generate_single_sample_dict(batch_index, box_dict):
-
pred_scores = box_dict['pred_scores'].cpu().numpy()
-
pred_boxes = box_dict['pred_boxes'].cpu().numpy()
-
pred_labels = box_dict['pred_labels'].cpu().numpy()
-
pred_dict = get_template_prediction(pred_scores.shape[0])
-
if pred_scores.shape[0] == 0:
-
return pred_dict
-
-
calib = batch_dict['calib'][batch_index]
-
image_shape = batch_dict['image_shape'][batch_index].cpu().numpy()
-
pred_boxes_camera = box_utils.boxes3d_lidar_to_kitti_camera(pred_boxes, calib)
-
pred_boxes_img = box_utils.boxes3d_kitti_camera_to_imageboxes(
-
pred_boxes_camera, calib, image_shape=image_shape
-
)
-
-
pred_dict['name'] = np.array(class_names)[pred_labels - 1]
-
pred_dict['alpha'] = -np.arctan2(-pred_boxes[:, 1], pred_boxes[:, 0]) pred_boxes_camera[:, 6]
-
pred_dict['bbox'] = pred_boxes_img
-
pred_dict['dimensions'] = pred_boxes_camera[:, 3:6]
-
pred_dict['location'] = pred_boxes_camera[:, 0:3]
-
pred_dict['rotation_y'] = pred_boxes_camera[:, 6]
-
pred_dict['score'] = pred_scores
-
pred_dict['boxes_lidar'] = pred_boxes
-
-
return pred_dict
-
-
annos = []
-
for index, box_dict in enumerate(pred_dicts):
-
frame_id = batch_dict['frame_id'][index]
-
-
single_pred_dict = generate_single_sample_dict(index, box_dict)
-
single_pred_dict['frame_id'] = frame_id
-
annos.append(single_pred_dict)
-
-
if output_path is not None:
-
cur_det_file = output_path / ('%s.txt' % frame_id)
-
with open(cur_det_file, 'w') as f:
-
bbox = single_pred_dict['bbox']
-
loc = single_pred_dict['location']
-
dims = single_pred_dict['dimensions'] # lhw -> hwl
-
-
for idx in range(len(bbox)):
-
print('%s -1 -1 %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f'
-
% (single_pred_dict['name'][idx], single_pred_dict['alpha'][idx],
-
bbox[idx][0], bbox[idx][1], bbox[idx][2], bbox[idx][3],
-
dims[idx][1], dims[idx][2], dims[idx][0], loc[idx][0],
-
loc[idx][1], loc[idx][2], single_pred_dict['rotation_y'][idx],
-
single_pred_dict['score'][idx]), file=f)
-
-
return annos
-
-
-
def create_custom_infos(dataset_cfg, class_names, data_path, save_path, workers=4):
-
dataset = CustomDataset(
-
dataset_cfg=dataset_cfg, class_names=class_names, root_path=data_path,
-
training=False, logger=common_utils.create_logger()
-
)
-
train_split, val_split = 'train', 'val'
-
num_features = len(dataset_cfg.POINT_FEATURE_ENCODING.src_feature_list)
-
-
train_filename = save_path / ('custom_infos_%s.pkl' % train_split)
-
val_filename = save_path / ('custom_infos_%s.pkl' % val_split)
-
-
print('------------------------Start to generate data infos------------------------')
-
-
dataset.set_split(train_split)
-
custom_infos_train = dataset.get_infos(
-
num_workers=workers, has_label=True, count_inside_pts=True
-
)
-
with open(train_filename, 'wb') as f:
-
pickle.dump(custom_infos_train, f)
-
print('Custom info train file is saved to %s' % train_filename)
-
-
dataset.set_split(val_split)
-
custom_infos_val = dataset.get_infos(
-
num_workers=workers, has_label=True, count_inside_pts=True
-
)
-
with open(val_filename, 'wb') as f:
-
pickle.dump(custom_infos_val, f)
-
print('Custom info train file is saved to %s' % val_filename)
-
-
print('------------------------Start create groundtruth database for data augmentation------------------------')
-
dataset.set_split(train_split)
-
dataset.create_groundtruth_database(train_filename, split=train_split)
-
print('------------------------Data preparation done------------------------')
-
-
-
if __name__ == '__main__':
-
import sys
-
-
if sys.argv.__len__() > 1 and sys.argv[1] == 'create_custom_infos':
-
import yaml
-
from pathlib import Path
-
from easydict import EasyDict
-
-
dataset_cfg = EasyDict(yaml.safe_load(open(sys.argv[2])))
-
ROOT_DIR = (Path(__file__).resolve().parent / '../../../').resolve()
-
create_custom_infos(
-
dataset_cfg=dataset_cfg,
-
class_names=['Car', 'Pedestrian', 'Van'],
-
data_path=ROOT_DIR / 'data' / 'custom',
-
save_path=ROOT_DIR / 'data' / 'custom',
-
)
2. tools/cfgs/custom_models/pointpillar.yaml
这个函数主要是网络模型参数的配置
我主要修改了以下几个点:
1) CLASS_NAMES(替换成你自己的类别信息)
2) _BASE_CONFIFG(custom_dataset.yaml的路径,建议用详细的绝对路径)
3) POINT_CLOUD_RANGE和VOXEL_SIZE
这两者很重要,直接影响后面模型的传播,如果设置不对很容易报错
官方建议 Voxel设置:X,Y方向个数是16的倍数。Z方向为40。
之前尝试设置了一些还是不行,这个我也没太明白到底怎么回事,索性我就不修改
4) ANCHOR_GENERATOR_CONFIG
我修改了自己的类别属性以及feature_map_stride,去除了gt_sampling
完整的代码如下:
-
CLASS_NAMES: ['Car', 'Pedestrian', 'Van']
-
-
DATA_CONFIG:
-
_BASE_CONFIG_: /home/gmm/下载/OpenPCDet/tools/cfgs/dataset_configs/custom_dataset.yaml
-
POINT_CLOUD_RANGE: [0, -39.68, -3, 69.12, 39.68, 1]
-
DATA_PROCESSOR:
-
- NAME: mask_points_and_boxes_outside_range
-
REMOVE_OUTSIDE_BOXES: True
-
-
- NAME: shuffle_points
-
SHUFFLE_ENABLED: {
-
'train': True,
-
'test': False
-
}
-
-
- NAME: transform_points_to_voxels
-
VOXEL_SIZE: [0.16, 0.16, 4]
-
MAX_POINTS_PER_VOXEL: 32
-
MAX_NUMBER_OF_VOXELS: {
-
'train': 16000,
-
'test': 40000
-
}
-
DATA_AUGMENTOR:
-
DISABLE_AUG_LIST: ['placeholder']
-
AUG_CONFIG_LIST:
-
# - NAME: gt_sampling
-
# USE_ROAD_PLANE: True
-
# DB_INFO_PATH:
-
# - custom_dbinfos_train.pkl
-
# PREPARE: {
-
# filter_by_min_points: ['Car:5', 'Pedestrian:5', 'Van:5']
-
# }
-
#
-
# SAMPLE_GROUPS: ['Car:15', 'Pedestrian:15', 'Van:15']
-
# NUM_POINT_FEATURES: 4
-
# DATABASE_WITH_FAKELIDAR: False
-
# REMOVE_EXTRA_WIDTH: [0.0, 0.0, 0.0]
-
# LIMIT_WHOLE_SCENE: False
-
-
- NAME: random_world_flip
-
ALONG_AXIS_LIST: ['x']
-
-
- NAME: random_world_rotation
-
WORLD_ROT_ANGLE: [-0.78539816, 0.78539816]
-
-
- NAME: random_world_scaling
-
WORLD_SCALE_RANGE: [0.95, 1.05]
-
-
MODEL:
-
NAME: PointPillar
-
-
VFE:
-
NAME: PillarVFE
-
WITH_DISTANCE: False
-
USE_ABSLOTE_XYZ: True
-
USE_NORM: True
-
NUM_FILTERS: [64]
-
-
MAP_TO_BEV:
-
NAME: PointPillarScatter
-
NUM_BEV_FEATURES: 64
-
-
BACKBONE_2D:
-
NAME: BaseBEVBackbone
-
LAYER_NUMS: [3, 5, 5]
-
LAYER_STRIDES: [2, 2, 2]
-
NUM_FILTERS: [64, 128, 256]
-
UPSAMPLE_STRIDES: [1, 2, 4]
-
NUM_UPSAMPLE_FILTERS: [128, 128, 128]
-
-
DENSE_HEAD:
-
NAME: AnchorHeadSingle
-
CLASS_AGNOSTIC: False
-
-
USE_DIRECTION_CLASSIFIER: True
-
DIR_OFFSET: 0.78539
-
DIR_LIMIT_OFFSET: 0.0
-
NUM_DIR_BINS: 2
-
-
ANCHOR_GENERATOR_CONFIG: [
-
{
-
'class_name': 'Car',
-
'anchor_sizes': [[1.8, 4.7, 1.8]],
-
'anchor_rotations': [0, 1.57],
-
'anchor_bottom_heights': [0],
-
'align_center': False,
-
'feature_map_stride': 2,
-
'matched_threshold': 0.55,
-
'unmatched_threshold': 0.45
-
},
-
{
-
'class_name': 'Pedestrian',
-
'anchor_sizes': [[0.77, 0.92, 1.83]],
-
'anchor_rotations': [0, 1.57],
-
'anchor_bottom_heights': [0],
-
'align_center': False,
-
'feature_map_stride': 2,
-
'matched_threshold': 0.5,
-
'unmatched_threshold': 0.45
-
},
-
{
-
'class_name': 'Van',
-
'anchor_sizes': [[2.5, 5.7, 1.9]],
-
'anchor_rotations': [0, 1.57],
-
'anchor_bottom_heights': [0],
-
'align_center': False,
-
'feature_map_stride': 2,
-
'matched_threshold': 0.5,
-
'unmatched_threshold': 0.45
-
},
-
]
-
-
TARGET_ASSIGNER_CONFIG:
-
NAME: AxisAlignedTargetAssigner
-
POS_FRACTION: -1.0
-
SAMPLE_SIZE: 512
-
NORM_BY_NUM_EXAMPLES: False
-
MATCH_HEIGHT: False
-
BOX_CODER: ResidualCoder
-
-
LOSS_CONFIG:
-
LOSS_WEIGHTS: {
-
'cls_weight': 1.0,
-
'loc_weight': 2.0,
-
'dir_weight': 0.2,
-
'code_weights': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
-
}
-
-
POST_PROCESSING:
-
RECALL_THRESH_LIST: [0.3, 0.5, 0.7]
-
SCORE_THRESH: 0.1
-
OUTPUT_RAW_SCORE: False
-
-
EVAL_METRIC: kitti
-
-
NMS_CONFIG:
-
MULTI_CLASSES_NMS: False
-
NMS_TYPE: nms_gpu
-
NMS_THRESH: 0.01
-
NMS_PRE_MAXSIZE: 4096
-
NMS_POST_MAXSIZE: 500
-
-
-
OPTIMIZATION:
-
BATCH_SIZE_PER_GPU: 4
-
NUM_EPOCHS: 80
-
-
OPTIMIZER: adam_onecycle
-
LR: 0.003
-
WEIGHT_DECAY: 0.01
-
MOMENTUM: 0.9
-
-
MOMS: [0.95, 0.85]
-
PCT_START: 0.4
-
DIV_FACTOR: 10
-
DECAY_STEP_LIST: [35, 45]
-
LR_DECAY: 0.1
-
LR_CLIP: 0.0000001
-
-
LR_WARMUP: False
-
WARMUP_EPOCH: 1
-
-
GRAD_NORM_CLIP: 10
3. tools/cfgs/dataset_configs/custom_dataset.yaml
修改了DATA_PATH, POINT_CLOUD_RANGE和MAP_CLASS_TO_KITTI还有其他的一些类别属性。
修改后的代码如下:
-
DATASET: 'CustomDataset'
-
DATA_PATH: '/home/gmm/下载/OpenPCDet/data/custom'
-
-
POINT_CLOUD_RANGE: [0, -40, -3, 70.4, 40, 1]
-
-
DATA_SPLIT: {
-
'train': train,
-
'test': val
-
}
-
-
INFO_PATH: {
-
'train': [custom_infos_train.pkl],
-
'test': [custom_infos_val.pkl],
-
}
-
-
GET_ITEM_LIST: ["points"]
-
FOV_POINTS_ONLY: True
-
-
MAP_CLASS_TO_KITTI: {
-
'Car': 'Car',
-
'Pedestrian': 'Pedestrian',
-
'Van': 'Cyclist',
-
}
-
-
-
DATA_AUGMENTOR:
-
DISABLE_AUG_LIST: ['placeholder']
-
AUG_CONFIG_LIST:
-
- NAME: gt_sampling
-
USE_ROAD_PLANE: False
-
DB_INFO_PATH:
-
- custom_dbinfos_train.pkl
-
PREPARE: {
-
filter_by_min_points: ['Car:5', 'Pedestrian:5', 'Van:5'],
-
}
-
-
SAMPLE_GROUPS: ['Car:20', 'Pedestrian:15', 'Van:20']
-
NUM_POINT_FEATURES: 4
-
DATABASE_WITH_FAKELIDAR: False
-
REMOVE_EXTRA_WIDTH: [0.0, 0.0, 0.0]
-
LIMIT_WHOLE_SCENE: True
-
-
- NAME: random_world_flip
-
ALONG_AXIS_LIST: ['x']
-
-
- NAME: random_world_rotation
-
WORLD_ROT_ANGLE: [-0.78539816, 0.78539816]
-
-
- NAME: random_world_scaling
-
WORLD_SCALE_RANGE: [0.95, 1.05]
-
-
-
POINT_FEATURE_ENCODING: {
-
encoding_type: absolute_coordinates_encoding,
-
used_feature_list: ['x', 'y', 'z', 'intensity'],
-
src_feature_list: ['x', 'y', 'z', 'intensity'],
-
}
-
-
DATA_PROCESSOR:
-
- NAME: mask_points_and_boxes_outside_range
-
REMOVE_OUTSIDE_BOXES: True
-
-
- NAME: shuffle_points
-
SHUFFLE_ENABLED: {
-
'train': True,
-
'test': False
-
}
-
-
- NAME: transform_points_to_voxels
-
VOXEL_SIZE: [0.05, 0.05, 0.1]
-
MAX_POINTS_PER_VOXEL: 5
-
MAX_NUMBER_OF_VOXELS: {
-
'train': 16000,
-
'test': 40000
-
}
4. demo.py
之前训练之后检测框并没有出来,后来我才发现可能是自己的数据集太少,出来的检测框精度太低,于是我在V.draw_scenes部分作了一点修改,并在之前加入一个mask限制条件,结果果然出来检测框了。
demo.py修改部分的代码:
-
with torch.no_grad():
-
for idx, data_dict in enumerate(demo_dataset):
-
logger.info(f'Visualized sample index: \t{idx 1}')
-
data_dict = demo_dataset.collate_batch([data_dict])
-
load_data_to_gpu(data_dict)
-
pred_dicts, _ = model.forward(data_dict)
-
-
scores = pred_dicts[0]['pred_scores'].detach().cpu().numpy()
-
mask = scores > 0.3
-
-
V.draw_scenes(
-
points=data_dict['points'][:, 1:], ref_boxes=pred_dicts[0]['pred_boxes'][mask],
-
ref_scores=pred_dicts[0]['pred_scores'], ref_labels=pred_dicts[0]['pred_labels'],
-
)
-
-
if not OPEN3D_FLAG:
-
mlab.show(stop=True)
三. 运行过程
1. 生成数据字典
python -m pcdet.datasets.custom.custom_dataset create_custom_infos tools/cfgs/dataset_configs/custom_dataset.yaml
2. 训练
这里我偷懒只训练10轮,自己可以自定义
python tools/train.py --cfg_file tools/cfgs/custom_models/pointpillar.yaml --batch_size=1 --epochs=10
这里有个警告不知道怎么回事,暂时忽略[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
3. 评估
由于数据集样本设置比较少,而且训练次数比较少,可以看出评估结果较差
4. 结果
还好能有显示,如果没有出现检测框可以把demo.py的score调低
这篇好文章是转载于:学新通技术网
- 版权申明: 本站部分内容来自互联网,仅供学习及演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系,请提供相关证据及您的身份证明,我们将在收到邮件后48小时内删除。
- 本站站名: 学新通技术网
- 本文地址: /boutique/detail/tanhgfkheg
-
photoshop保存的图片太大微信发不了怎么办
PHP中文网 06-15 -
《学习通》视频自动暂停处理方法
HelloWorld317 07-05 -
Android 11 保存文件到外部存储,并分享文件
Luke 10-12 -
word里面弄一个表格后上面的标题会跑到下面怎么办
PHP中文网 06-20 -
photoshop扩展功能面板显示灰色怎么办
PHP中文网 06-14 -
微信公众号没有声音提示怎么办
PHP中文网 03-31 -
excel下划线不显示怎么办
PHP中文网 06-23 -
excel打印预览压线压字怎么办
PHP中文网 06-22 -
怎样阻止微信小程序自动打开
PHP中文网 06-13 -
TikTok加速器哪个好免费的TK加速器推荐
TK小达人 10-01