Add at new repo again
@@ -0,0 +1,196 @@
|
||||
|
||||
# Benchmarks
|
||||
|
||||
Here we compare the training speed of Mask R-CNN in detectron2
|
||||
with some other popular open source Mask R-CNN implementations.
|
||||
|
||||
|
||||
### Settings
|
||||
|
||||
* Hardware: 8 NVIDIA V100s with NVLink.
|
||||
* Software: Python 3.7, CUDA 10.1, cuDNN 7.6.5, PyTorch 1.5,
|
||||
TensorFlow 1.15.0rc2, Keras 2.2.5, MxNet 1.6.0b20190820.
|
||||
* Model: an end-to-end R-50-FPN Mask-RCNN model, using the same hyperparameters as the
|
||||
[Detectron baseline config](https://github.com/facebookresearch/Detectron/blob/master/configs/12_2017_baselines/e2e_mask_rcnn_R-50-FPN_1x.yaml)
|
||||
(it does not have scale augmentation).
|
||||
* Metrics: We use the average throughput in iterations 100-500 to skip GPU warmup time.
|
||||
Note that for R-CNN-style models, the throughput of a model typically changes during training, because
|
||||
it depends on the predictions of the model. Therefore this metric is not directly comparable with
|
||||
"train speed" in model zoo, which is the average speed of the entire training run.
|
||||
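The windowed metric described above can be sketched as follows. This is a minimal illustration, not the script used for these benchmarks: the function name `window_throughput`, the half-open window `[100, 500)`, the batch size of 16, and the synthetic timings are all assumptions made for the example.

```python
def window_throughput(iter_times, batch_size, start=100, end=500):
    """Average images/second over iterations [start, end).

    iter_times: per-iteration wall-clock durations in seconds (hypothetical log).
    batch_size: total images processed per iteration across all GPUs.
    """
    window = iter_times[start:end]
    if not window:
        raise ValueError("no iterations fall inside the requested window")
    return batch_size * len(window) / sum(window)


# Synthetic timings: 100 slow warmup iterations, then 500 steady ones.
times = [1.0] * 100 + [0.25] * 500

print(window_throughput(times, batch_size=16))  # 64.0

# Averaging over the whole run mixes in the warmup iterations and gives a
# lower number, which is why the two metrics are not directly comparable.
full_run = 16 * len(times) / sum(times)
print(full_run < window_throughput(times, batch_size=16))  # True
```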
|
||||
|
||||
### Main Results
|
||||
|
||||
```eval_rst
|
||||
+-------------------------------+--------------------+
|
||||
| Implementation | Throughput (img/s) |
|
||||
+===============================+====================+
|
||||
| |D2| |PT| | 62 |
|
||||
+-------------------------------+--------------------+
|
||||
| mmdetection_ |PT| | 53 |
|
||||
+-------------------------------+--------------------+
|
||||
| maskrcnn-benchmark_ |PT| | 53 |
|
||||
+-------------------------------+--------------------+
|
||||
| tensorpack_ |TF| | 50 |
|
||||
+-------------------------------+--------------------+
|
||||
| simpledet_ |mxnet| | 39 |
|
||||
+-------------------------------+--------------------+
|
||||
| Detectron_ |C2| | 19 |
|
||||
+-------------------------------+--------------------+
|
||||
| `matterport/Mask_RCNN`__ |TF| | 14 |
|
||||
+-------------------------------+--------------------+
|
||||
|
||||
.. _maskrcnn-benchmark: https://github.com/facebookresearch/maskrcnn-benchmark/
|
||||
.. _tensorpack: https://github.com/tensorpack/tensorpack/tree/master/examples/FasterRCNN
|
||||
.. _mmdetection: https://github.com/open-mmlab/mmdetection/
|
||||
.. _simpledet: https://github.com/TuSimple/simpledet/
|
||||
.. _Detectron: https://github.com/facebookresearch/Detectron
|
||||
__ https://github.com/matterport/Mask_RCNN/
|
||||
|
||||
.. |D2| image:: https://github.com/facebookresearch/detectron2/raw/master/.github/Detectron2-Logo-Horz.svg?sanitize=true
|
||||
:height: 15pt
|
||||
:target: https://github.com/facebookresearch/detectron2/
|
||||
.. |PT| image:: https://pytorch.org/assets/images/logo-icon.svg
|
||||
:width: 15pt
|
||||
:height: 15pt
|
||||
:target: https://pytorch.org
|
||||
.. |TF| image:: https://static.nvidiagrid.net/ngc/containers/tensorflow.png
|
||||
:width: 15pt
|
||||
:height: 15pt
|
||||
:target: https://tensorflow.org
|
||||
.. |mxnet| image:: https://github.com/dmlc/web-data/raw/master/mxnet/image/mxnet_favicon.png
|
||||
:width: 15pt
|
||||
:height: 15pt
|
||||
:target: https://mxnet.apache.org/
|
||||
.. |C2| image:: https://caffe2.ai/static/logo.svg
|
||||
:width: 15pt
|
||||
:height: 15pt
|
||||
:target: https://caffe2.ai
|
||||
```
|
||||
|
||||
|
||||
Details for each implementation:
|
||||
|
||||
* __Detectron2__: with release v0.1.2, run:
|
||||
```
|
||||
python tools/train_net.py --config-file configs/Detectron1-Comparisons/mask_rcnn_R_50_FPN_noaug_1x.yaml --num-gpus 8
|
||||
```
|
||||
|
||||
* __mmdetection__: at commit `b0d845f`, run
|
||||
```
|
||||
./tools/dist_train.sh configs/mask_rcnn/mask_rcnn_r50_caffe_fpn_1x_coco.py 8
|
||||
```
|
||||
|
||||
* __maskrcnn-benchmark__: use commit `0ce8f6f` with `sed -i 's/torch.uint8/torch.bool/g' **/*.py; sed -i 's/AT_CHECK/TORCH_CHECK/g' **/*.cu`
|
||||
to make it compatible with PyTorch 1.5. Then, run training with
|
||||
```
|
||||
python -m torch.distributed.launch --nproc_per_node=8 tools/train_net.py --config-file configs/e2e_mask_rcnn_R_50_FPN_1x.yaml
|
||||
```
|
||||
The speed we observed is faster than the one reported in its model zoo, likely due to different software versions.
|
||||
|
||||
* __tensorpack__: at commit `caafda`, `export TF_CUDNN_USE_AUTOTUNE=0`, then run
|
||||
```
|
||||
mpirun -np 8 ./train.py --config DATA.BASEDIR=/data/coco TRAINER=horovod BACKBONE.STRIDE_1X1=True TRAIN.STEPS_PER_EPOCH=50 --load ImageNet-R50-AlignPadding.npz
|
||||
```
|
||||
|
||||
* __SimpleDet__: at commit `9187a1`, run
|
||||
```
|
||||
python detection_train.py --config config/mask_r50v1_fpn_1x.py
|
||||
```
|
||||
|
||||
* __Detectron__: run
|
||||
```
|
||||
python tools/train_net.py --cfg configs/12_2017_baselines/e2e_mask_rcnn_R-50-FPN_1x.yaml
|
||||
```
|
||||
Note that many of its ops run on CPUs, so its performance is limited.
|
||||
|
||||
* __matterport/Mask_RCNN__: at commit `3deaec`, apply the following diff, `export TF_CUDNN_USE_AUTOTUNE=0`, then run
|
||||
```
|
||||
python coco.py train --dataset=/data/coco/ --model=imagenet
|
||||
```
|
||||
Note that many small details in this implementation might be different
|
||||
from Detectron's standards.
|
||||
|
||||
<details>
|
||||
<summary>
|
||||
(diff to make it use the same hyperparameters - click to expand)
|
||||
</summary>
|
||||
|
||||
```diff
|
||||
diff --git i/mrcnn/model.py w/mrcnn/model.py
|
||||
index 62cb2b0..61d7779 100644
|
||||
--- i/mrcnn/model.py
|
||||
+++ w/mrcnn/model.py
|
||||
@@ -2367,8 +2367,8 @@ class MaskRCNN():
|
||||
epochs=epochs,
|
||||
steps_per_epoch=self.config.STEPS_PER_EPOCH,
|
||||
callbacks=callbacks,
|
||||
- validation_data=val_generator,
|
||||
- validation_steps=self.config.VALIDATION_STEPS,
|
||||
+ #validation_data=val_generator,
|
||||
+ #validation_steps=self.config.VALIDATION_STEPS,
|
||||
max_queue_size=100,
|
||||
workers=workers,
|
||||
use_multiprocessing=True,
|
||||
diff --git i/mrcnn/parallel_model.py w/mrcnn/parallel_model.py
|
||||
index d2bf53b..060172a 100644
|
||||
--- i/mrcnn/parallel_model.py
|
||||
+++ w/mrcnn/parallel_model.py
|
||||
@@ -32,6 +32,7 @@ class ParallelModel(KM.Model):
|
||||
keras_model: The Keras model to parallelize
|
||||
gpu_count: Number of GPUs. Must be > 1
|
||||
"""
|
||||
+ super().__init__()
|
||||
self.inner_model = keras_model
|
||||
self.gpu_count = gpu_count
|
||||
merged_outputs = self.make_parallel()
|
||||
diff --git i/samples/coco/coco.py w/samples/coco/coco.py
|
||||
index 5d172b5..239ed75 100644
|
||||
--- i/samples/coco/coco.py
|
||||
+++ w/samples/coco/coco.py
|
||||
@@ -81,7 +81,10 @@ class CocoConfig(Config):
|
||||
IMAGES_PER_GPU = 2
|
||||
|
||||
# Uncomment to train on 8 GPUs (default is 1)
|
||||
- # GPU_COUNT = 8
|
||||
+ GPU_COUNT = 8
|
||||
+ BACKBONE = "resnet50"
|
||||
+ STEPS_PER_EPOCH = 50
|
||||
+ TRAIN_ROIS_PER_IMAGE = 512
|
||||
|
||||
# Number of classes (including background)
|
||||
NUM_CLASSES = 1 + 80 # COCO has 80 classes
|
||||
@@ -496,29 +499,10 @@ if __name__ == '__main__':
|
||||
# *** This training schedule is an example. Update to your needs ***
|
||||
|
||||
# Training - Stage 1
|
||||
- print("Training network heads")
|
||||
model.train(dataset_train, dataset_val,
|
||||
learning_rate=config.LEARNING_RATE,
|
||||
epochs=40,
|
||||
- layers='heads',
|
||||
- augmentation=augmentation)
|
||||
-
|
||||
- # Training - Stage 2
|
||||
- # Finetune layers from ResNet stage 4 and up
|
||||
- print("Fine tune Resnet stage 4 and up")
|
||||
- model.train(dataset_train, dataset_val,
|
||||
- learning_rate=config.LEARNING_RATE,
|
||||
- epochs=120,
|
||||
- layers='4+',
|
||||
- augmentation=augmentation)
|
||||
-
|
||||
- # Training - Stage 3
|
||||
- # Fine tune all layers
|
||||
- print("Fine tune all layers")
|
||||
- model.train(dataset_train, dataset_val,
|
||||
- learning_rate=config.LEARNING_RATE / 10,
|
||||
- epochs=160,
|
||||
- layers='all',
|
||||
+ layers='3+',
|
||||
augmentation=augmentation)
|
||||
|
||||
elif args.command == "evaluate":
|
||||
```
|
||||
|
||||
</details>
|
||||
@@ -0,0 +1,26 @@
|
||||
# Change Log
|
||||
|
||||
### Releases
|
||||
See release log at
|
||||
[https://github.com/facebookresearch/detectron2/releases](https://github.com/facebookresearch/detectron2/releases).
|
||||
|
||||
### Notable Backward Incompatible Changes:
|
||||
|
||||
* 03/30/2020: Custom box head's `output_size` changed to `output_shape`.
|
||||
* 02/14/2020,02/18/2020: Mask head and keypoint head now include logic for losses & inference. Custom heads
|
||||
should override the feature computation in the `layers()` method.
|
||||
* 11/11/2019: `detectron2.data.detection_utils.read_image` transposes images with exif information.
|
||||
|
||||
### Config Version Change Log
|
||||
|
||||
* v1: Rename `RPN_HEAD.NAME` to `RPN.HEAD_NAME`.
|
||||
* v2: A batch of renames of many configurations before release.
|
||||
|
||||
### Silent Regressions in Historical Versions:
|
||||
|
||||
We list a few silent regressions, since they may produce incorrect results without any error and are hard to debug.
|
||||
|
||||
* 04/01/2020 - 05/11/2020: Bad accuracy if `TRAIN_ON_PRED_BOXES` is set to True.
|
||||
* 03/30/2020 - 04/01/2020: ResNets are not correctly built.
|
||||
* 12/19/2019 - 12/26/2019: Using aspect ratio grouping causes a drop in accuracy.
|
||||
* release - 11/9/2019: Test time augmentation does not predict the last category.
|
||||
@@ -0,0 +1,83 @@
|
||||
# Compatibility with Other Libraries
|
||||
|
||||
## Compatibility with Detectron (and maskrcnn-benchmark)
|
||||
|
||||
Detectron2 addresses some legacy issues left in Detectron. As a result, their models
|
||||
are not compatible:
|
||||
running inference with the same model weights will produce different results in the two code bases.
|
||||
|
||||
The major differences regarding inference are:
|
||||
|
||||
- The height and width of a box with corners (x1, y1) and (x2, y2) are now computed more naturally as
|
||||
width = x2 - x1 and height = y2 - y1;
|
||||
In Detectron, a "+ 1" was added to both height and width.
|
||||
|
||||
Note that the relevant ops in Caffe2 have [adopted this change of convention](https://github.com/pytorch/pytorch/pull/20550)
|
||||
with an extra option.
|
||||
So it is still possible to run inference with a Detectron2-trained model in Caffe2.
|
||||
|
||||
The change in height/width calculations most notably changes:
|
||||
- encoding/decoding in bounding box regression.
|
||||
- non-maximum suppression. The effect here is negligible, though.
|
||||
|
||||
- RPN now uses simpler anchors with fewer quantization artifacts.
|
||||
|
||||
In Detectron, the anchors were quantized and
|
||||
[do not have accurate areas](https://github.com/facebookresearch/Detectron/issues/227).
|
||||
In Detectron2, the anchors are center-aligned to feature grid points and not quantized.
|
||||
|
||||
- Classification layers have a different ordering of class labels.
|
||||
|
||||
This involves any trainable parameter with shape (..., num_categories + 1, ...).
|
||||
In Detectron2, integer labels [0, K-1] correspond to the K = num_categories object categories
|
||||
and the label "K" corresponds to the special "background" category.
|
||||
In Detectron, label "0" means background, and labels [1, K] correspond to the K categories.
|
||||
|
||||
- ROIAlign is implemented differently. The new implementation is [available in Caffe2](https://github.com/pytorch/pytorch/pull/23706).
|
||||
|
||||
1. All the ROIs are shifted by half a pixel compared to Detectron in order to create better image-feature-map alignment.
|
||||
See `layers/roi_align.py` for details.
|
||||
To enable the old behavior, use `ROIAlign(aligned=False)`, or `POOLER_TYPE=ROIAlign` instead of
|
||||
`ROIAlignV2` (the default).
|
||||
|
||||
1. The ROIs are not required to have a minimum size of 1.
|
||||
This will lead to tiny differences in the output, but should be negligible.
|
||||
|
||||
- Mask inference function is different.
|
||||
|
||||
In Detectron2, the "paste_mask" function is different and should be more accurate than in Detectron. This change
|
||||
can improve mask AP on COCO by ~0.5% absolute.
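Two of the conventions listed above, the box size computation and the class label ordering, can be illustrated with a short sketch. It is not code from either library: the helper names `box_size` and `move_background_to_last` are hypothetical, and the per-class rows are plain lists standing in for the class axis of a real parameter tensor.

```python
def box_size(box, legacy_plus_one=False):
    """Return (width, height) of a box given as (x1, y1, x2, y2).

    legacy_plus_one=True mimics the Detectron convention of adding 1 to
    both width and height; the default follows the Detectron2 convention.
    """
    x1, y1, x2, y2 = box
    offset = 1.0 if legacy_plus_one else 0.0
    return (x2 - x1 + offset, y2 - y1 + offset)


def move_background_to_last(per_class_rows):
    """Reorder rows along the class axis from Detectron order to Detectron2 order.

    Detectron:  index 0 is background, indices 1..K are the K categories.
    Detectron2: indices 0..K-1 are the K categories, index K is background.
    """
    return per_class_rows[1:] + per_class_rows[:1]


box = (10.0, 20.0, 30.0, 60.0)
print(box_size(box))                        # (20.0, 40.0)
print(box_size(box, legacy_plus_one=True))  # (21.0, 41.0)

# Hypothetical rows for K = 3 categories plus background.
detectron_rows = ["background", "cat_1", "cat_2", "cat_3"]
print(move_background_to_last(detectron_rows))
# ['cat_1', 'cat_2', 'cat_3', 'background']
```

A real weight conversion would have to apply this reordering along the class axis of every affected parameter, and it would not by itself make the models equivalent, since the other differences listed above still apply.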
|
||||
|
||||
There are some other differences in training as well, but they won't affect
|
||||
model-level compatibility. The major ones are:
|
||||
|
||||
- We fixed a [bug](https://github.com/facebookresearch/Detectron/issues/459) in
|
||||
Detectron, by making `RPN.POST_NMS_TOPK_TRAIN` per-image, rather than per-batch.
|
||||
The fix may lead to a small accuracy drop for a few models (e.g. keypoint
|
||||
detection) and will require some parameter tuning to match the Detectron results.
|
||||
- For simplicity, we change the default loss in bounding box regression to L1 loss, instead of smooth L1 loss.
|
||||
We have observed that this tends to slightly decrease box AP50 while improving box AP for higher
|
||||
overlap thresholds (and leading to a slight overall improvement in box AP).
|
||||
- We interpret the coordinates in COCO bounding box and segmentation annotations
|
||||
as coordinates in range `[0, width]` or `[0, height]`. The coordinates in
|
||||
COCO keypoint annotations are interpreted as pixel indices in range `[0, width - 1]` or `[0, height - 1]`.
|
||||
Note that this affects how flip augmentation is implemented.
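The effect of the two coordinate interpretations on horizontal flipping can be sketched as below. This shows only the arithmetic, not detectron2's actual augmentation code: the function names are hypothetical, and the relation "pixel index i has its center at i + 0.5" is a common convention assumed here for the consistency check.

```python
def hflip_continuous(x, width):
    """Flip a continuous coordinate in [0, width], e.g. a box or polygon x."""
    return width - x


def hflip_pixel_index(i, width):
    """Flip a pixel index in [0, width - 1], e.g. a raw COCO keypoint x."""
    return width - 1 - i


width = 100

# The left image edge maps to the right image edge.
print(hflip_continuous(0.0, width))  # 100.0

# The leftmost pixel maps to the rightmost pixel.
print(hflip_pixel_index(0, width))   # 99

# Assuming pixel i is centered at i + 0.5, both rules agree:
# flipping the center of pixel i gives the center of the flipped pixel.
i = 7
print(hflip_continuous(i + 0.5, width) == hflip_pixel_index(i, width) + 0.5)  # True
```

Using the continuous rule on a pixel index (or the reverse) shifts the flipped point by one pixel, which is why the interpretation has to be fixed before implementing the flip.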
|
||||
|
||||
|
||||
We will later share more details and rationale behind the above mentioned issues
|
||||
about pixels, coordinates, and "+1"s.
|
||||
|
||||
|
||||
## Compatibility with Caffe2
|
||||
|
||||
As mentioned above, despite the incompatibilities with Detectron, the relevant
|
||||
ops have been implemented in Caffe2.
|
||||
Therefore, models trained with detectron2 can be converted to Caffe2.
|
||||
See [Deployment](../tutorials/deployment.md) for the tutorial.
|
||||
|
||||
## Compatibility with TensorFlow
|
||||
|
||||
Most ops are available in TensorFlow, although some tiny differences in
|
||||
the implementation of resize / ROIAlign / padding need to be addressed.
|
||||
A working conversion script is provided by [tensorpack FasterRCNN](https://github.com/tensorpack/tensorpack/tree/master/examples/FasterRCNN/convert_d2)
|
||||
to run a standard detectron2 model in TensorFlow.
|
||||
@@ -0,0 +1,49 @@
|
||||
# Contributing to detectron2
|
||||
|
||||
## Issues
|
||||
We use GitHub issues to track public bugs and questions.
|
||||
Please make sure to follow one of the
|
||||
[issue templates](https://github.com/facebookresearch/detectron2/issues/new/choose)
|
||||
when reporting any issues.
|
||||
|
||||
Facebook has a [bounty program](https://www.facebook.com/whitehat/) for the safe
|
||||
disclosure of security bugs. In those cases, please go through the process
|
||||
outlined on that page and do not file a public issue.
|
||||
|
||||
## Pull Requests
|
||||
We actively welcome your pull requests.
|
||||
|
||||
However, if you're adding any significant features (e.g. > 50 lines), please
|
||||
make sure to have a corresponding issue to discuss your motivation and proposals,
|
||||
before sending a PR. We do not always accept new features, and we take the following
|
||||
factors into consideration:
|
||||
|
||||
1. Whether the same feature can be achieved without modifying detectron2.
|
||||
Detectron2 is designed so that you can implement many extensions from the outside, e.g.
|
||||
those in [projects](https://github.com/facebookresearch/detectron2/tree/master/projects).
|
||||
If some part is not extensible enough, you can also open an issue so that it can be made more extensible.
|
||||
2. Whether the feature is potentially useful to a large audience, or only to a small portion of users.
|
||||
3. Whether the proposed solution has a good design / interface.
|
||||
4. Whether the proposed solution adds extra mental/practical overhead to users who don't
|
||||
need such a feature.
|
||||
5. Whether the proposed solution breaks existing APIs.
|
||||
|
||||
When sending a PR, please do:
|
||||
|
||||
1. If a PR contains multiple orthogonal changes, split it into several PRs.
|
||||
2. If you've added code that should be tested, add tests.
|
||||
3. For PRs that need experiments (e.g. adding a new model or new methods),
|
||||
you don't need to update model zoo, but do provide experiment results in the description of the PR.
|
||||
4. If APIs are changed, update the documentation.
|
||||
5. Make sure your code lints with `./dev/linter.sh`.
|
||||
|
||||
|
||||
## Contributor License Agreement ("CLA")
|
||||
In order to accept your pull request, we need you to submit a CLA. You only need
|
||||
to do this once to work on any of Facebook's open source projects.
|
||||
|
||||
Complete your CLA here: <https://code.facebook.com/cla>
|
||||
|
||||
## License
|
||||
By contributing to detectron2, you agree that your contributions will be licensed
|
||||
under the LICENSE file in the root directory of this source tree.
|
||||
@@ -0,0 +1,10 @@
|
||||
Notes
|
||||
======================================
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
benchmarks
|
||||
compatibility
|
||||
contributing
|
||||
changelog
|
||||