# Read the docs:

The latest documentation built from this directory is available at [detectron2.readthedocs.io](https://detectron2.readthedocs.io/).

The documents in this directory are not meant to be read on GitHub.
|
||||
# Setup Builtin Datasets
|
||||
|
||||
Detectron2 has builtin support for a few datasets.
|
||||
The datasets are assumed to exist in a directory specified by the environment variable
|
||||
`DETECTRON2_DATASETS`.
|
||||
Under this directory, detectron2 expects to find datasets in the structure described below.
|
||||
|
||||
You can set the location for builtin datasets by `export DETECTRON2_DATASETS=/path/to/datasets`.
|
||||
If left unset, the default is `./datasets` relative to your current working directory.
|
||||
|
||||
The [model zoo](https://github.com/facebookresearch/detectron2/blob/master/MODEL_ZOO.md)
|
||||
contains configs and models that use these builtin datasets.
|
||||
|
||||
## Expected dataset structure for COCO instance/keypoint detection:
|
||||
|
||||
```
|
||||
coco/
|
||||
annotations/
|
||||
instances_{train,val}2017.json
|
||||
person_keypoints_{train,val}2017.json
|
||||
{train,val}2017/
|
||||
# image files that are mentioned in the corresponding json
|
||||
```
|
||||
|
||||
You can use the 2014 version of the dataset as well.
|
||||
|
||||
Some of the builtin tests (`dev/run_*_tests.sh`) use a tiny version of the COCO dataset,
|
||||
which you can download with `./prepare_for_tests.sh`.
|
||||
|
||||
## Expected dataset structure for PanopticFPN:
|
||||
|
||||
```
|
||||
coco/
|
||||
annotations/
|
||||
panoptic_{train,val}2017.json
|
||||
panoptic_{train,val}2017/ # png annotations
|
||||
panoptic_stuff_{train,val}2017/ # generated by the script mentioned below
|
||||
```
|
||||
|
||||
Install panopticapi by:
|
||||
```
|
||||
pip install git+https://github.com/cocodataset/panopticapi.git
|
||||
```
|
||||
Then run `python prepare_panoptic_fpn.py` to extract semantic annotations from the panoptic annotations.
|
||||
|
||||
## Expected dataset structure for LVIS instance segmentation:
|
||||
```
|
||||
coco/
|
||||
{train,val,test}2017/
|
||||
lvis/
|
||||
lvis_v0.5_{train,val}.json
|
||||
lvis_v0.5_image_info_test.json
|
||||
```
|
||||
|
||||
Install lvis-api by:
|
||||
```
|
||||
pip install git+https://github.com/lvis-dataset/lvis-api.git
|
||||
```
|
||||
|
||||
Run `python prepare_cocofied_lvis.py` to prepare "cocofied" LVIS annotations for evaluation of models trained on the COCO dataset.
|
||||
|
||||
## Expected dataset structure for cityscapes:
|
||||
```
|
||||
cityscapes/
|
||||
gtFine/
|
||||
train/
|
||||
aachen/
|
||||
color.png, instanceIds.png, labelIds.png, polygons.json,
|
||||
labelTrainIds.png
|
||||
...
|
||||
val/
|
||||
test/
|
||||
leftImg8bit/
|
||||
train/
|
||||
val/
|
||||
test/
|
||||
```
|
||||
Install cityscapes scripts by:
|
||||
```
|
||||
pip install git+https://github.com/mcordts/cityscapesScripts.git
|
||||
```
|
||||
|
||||
Note: the labelTrainIds.png files are created by cityscapesScripts with:
|
||||
```
|
||||
CITYSCAPES_DATASET=$DETECTRON2_DATASETS/cityscapes python cityscapesscripts/preparation/createTrainIdLabelImgs.py
|
||||
```
|
||||
They are not needed for instance segmentation.
|
||||
|
||||
## Expected dataset structure for Pascal VOC:
|
||||
```
|
||||
VOC20{07,12}/
|
||||
Annotations/
|
||||
ImageSets/
|
||||
Main/
|
||||
trainval.txt
|
||||
test.txt
|
||||
# train.txt or val.txt, if you use these splits
|
||||
JPEGImages/
|
||||
```
|
|
||||
# Configs
|
||||
|
||||
Detectron2 provides a key-value based config system that can be
|
||||
used to obtain standard, common behaviors.
|
||||
|
||||
Detectron2's config system uses YAML and [yacs](https://github.com/rbgirshick/yacs).
|
||||
In addition to the [basic operations](../modules/config.html#detectron2.config.CfgNode)
|
||||
that access and update a config, we provide the following extra functionalities:
|
||||
|
||||
1. The config can have `_BASE_: base.yaml` field, which will load a base config first.
|
||||
Values in the base config will be overwritten in sub-configs, if there are any conflicts.
|
||||
We provide several base configs for standard model architectures; a sketch of the `_BASE_` mechanism follows this list.
|
||||
2. We provide config versioning, for backward compatibility.
|
||||
If your config file is versioned with a config line like `VERSION: 2`,
|
||||
detectron2 will still recognize it even if we change some keys in the future.
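
Below is a minimal sketch of how the `_BASE_` mechanism resolves values; the file names and config values are made up for illustration:

```python
from detectron2.config import get_cfg

# Two hypothetical config files written just for this example:
with open("base.yaml", "w") as f:
    f.write("VERSION: 2\nMODEL:\n  MASK_ON: False\nSOLVER:\n  BASE_LR: 0.02\n")
with open("child.yaml", "w") as f:
    f.write("_BASE_: base.yaml\nSOLVER:\n  BASE_LR: 0.01\n")  # overrides the base value

cfg = get_cfg()
cfg.merge_from_file("child.yaml")
print(cfg.SOLVER.BASE_LR)  # 0.01, from the child config
print(cfg.MODEL.MASK_ON)   # False, inherited from base.yaml
```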
|
||||
|
||||
"Config" is a very limited abstraction.
|
||||
We do not expect all features in detectron2 to be available through configs.
|
||||
If you need something that's not available in the config space,
|
||||
please write code using detectron2's API.
|
||||
|
||||
### Basic Usage
|
||||
|
||||
Some basic usage of the `CfgNode` object is shown here. See more in [documentation](../modules/config.html#detectron2.config.CfgNode).
|
||||
```python
|
||||
from detectron2.config import get_cfg
|
||||
cfg = get_cfg() # obtain detectron2's default config
|
||||
cfg.xxx = yyy # add new configs for your own custom components
|
||||
cfg.merge_from_file("my_cfg.yaml") # load values from a file
|
||||
|
||||
cfg.merge_from_list(["MODEL.WEIGHTS", "weights.pth"]) # can also load values from a list of str
|
||||
print(cfg.dump()) # print formatted configs
|
||||
```
|
||||
|
||||
Many builtin tools in detectron2 accept command-line config overrides:
key-value pairs provided on the command line will overwrite the existing values in the config file.
|
||||
For example, [demo.py](../../demo/demo.py) can be used with
|
||||
```
|
||||
./demo.py --config-file config.yaml [--other-options] \
|
||||
--opts MODEL.WEIGHTS /path/to/weights INPUT.MIN_SIZE_TEST 1000
|
||||
```
|
||||
|
||||
To see a list of available configs in detectron2 and what they mean,
|
||||
check [Config References](../modules/config.html#config-references).
|
||||
|
||||
|
||||
### Best Practice with Configs
|
||||
|
||||
1. Treat the configs you write as "code": avoid copying them or duplicating them; use `_BASE_`
|
||||
to share common parts between configs.
|
||||
|
||||
2. Keep the configs you write simple: don't include keys that do not affect the experimental setting.
|
||||
|
||||
3. Keep a version number in your configs (or the base config), e.g., `VERSION: 2`,
|
||||
for backward compatibility.
|
||||
We print a warning when reading a config without a version number.
|
||||
The official configs do not include a version number because they are meant to
|
||||
be always up-to-date.
|
|
||||
|
||||
# Use Custom Dataloaders
|
||||
|
||||
## How the Existing Dataloader Works
|
||||
|
||||
Detectron2 contains a builtin data loading pipeline.
|
||||
It's good to understand how it works, in case you need to write a custom one.
|
||||
|
||||
Detectron2 provides two functions
|
||||
[build_detection_{train,test}_loader](../modules/data.html#detectron2.data.build_detection_train_loader)
|
||||
that create a default data loader from a given config.
|
||||
Here is how `build_detection_{train,test}_loader` works:
|
||||
|
||||
1. It takes the name of a registered dataset (e.g., "coco_2017_train") and loads a `list[dict]` representing the dataset items
|
||||
in a lightweight, canonical format. These dataset items are not yet ready to be used by the model (e.g., images are
|
||||
not loaded into memory, random augmentations have not been applied, etc.).
|
||||
Details about the dataset format and dataset registration can be found in
|
||||
[datasets](./datasets.md).
|
||||
2. Each dict in this list is mapped by a function ("mapper"):
|
||||
* Users can customize this mapping function by specifying the "mapper" argument in
|
||||
`build_detection_{train,test}_loader`. The default mapper is [DatasetMapper](../modules/data.html#detectron2.data.DatasetMapper).
|
||||
* The output format of such function can be arbitrary, as long as it is accepted by the consumer of this data loader (usually the model).
|
||||
The outputs of the default mapper, after batching, follow the default model input format documented in
|
||||
[Use Models](./models.html#model-input-format).
|
||||
* The role of the mapper is to transform the lightweight, canonical representation of a dataset item into a format
|
||||
that is ready for the model to consume (including, e.g., read images, perform random data augmentation and convert to torch Tensors).
|
||||
If you would like to perform custom transformations to data, you often want a custom mapper.
|
||||
3. The outputs of the mapper are batched (simply into a list).
|
||||
4. This batched data is the output of the data loader. Typically, it's also the input of
|
||||
`model.forward()`.
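
As a minimal sketch of this default pipeline (assuming `my_cfg.yaml` is a config file whose `DATASETS.TRAIN` names an already-registered dataset such as "coco_2017_train"):

```python
from detectron2.config import get_cfg
from detectron2.data import build_detection_train_loader

cfg = get_cfg()
cfg.merge_from_file("my_cfg.yaml")  # hypothetical config file

# Steps 1-3 above, using the default DatasetMapper:
data_loader = build_detection_train_loader(cfg)

for batch in data_loader:  # step 4: each batch is a list[dict]
    print(len(batch), sorted(batch[0].keys()))
    break
```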
|
||||
|
||||
|
||||
## Write a Custom Dataloader
|
||||
|
||||
Using a different "mapper" with `build_detection_{train,test}_loader(mapper=)` works for most use cases
|
||||
of custom data loading.
|
||||
For example, if you want to resize all images to a fixed size for Mask R-CNN training, write this:
|
||||
|
||||
```python
|
||||
import copy
import torch

from detectron2.data import build_detection_train_loader
|
||||
from detectron2.data import transforms as T
|
||||
from detectron2.data import detection_utils as utils
|
||||
|
||||
def mapper(dataset_dict):
|
||||
# Implement a mapper, similar to the default DatasetMapper, but with your own customizations
|
||||
dataset_dict = copy.deepcopy(dataset_dict) # it will be modified by code below
|
||||
image = utils.read_image(dataset_dict["file_name"], format="BGR")
|
||||
image, transforms = T.apply_transform_gens([T.Resize((800, 800))], image)
|
||||
dataset_dict["image"] = torch.as_tensor(image.transpose(2, 0, 1).astype("float32"))
|
||||
|
||||
annos = [
|
||||
utils.transform_instance_annotations(obj, transforms, image.shape[:2])
|
||||
for obj in dataset_dict.pop("annotations")
|
||||
if obj.get("iscrowd", 0) == 0
|
||||
]
|
||||
instances = utils.annotations_to_instances(annos, image.shape[:2])
|
||||
dataset_dict["instances"] = utils.filter_empty_instances(instances)
|
||||
return dataset_dict
|
||||
|
||||
data_loader = build_detection_train_loader(cfg, mapper=mapper)
|
||||
# use this dataloader instead of the default
|
||||
```
|
||||
Refer to [API documentation of detectron2.data](../modules/data) for details.
|
||||
|
||||
If you want to change not only the mapper (e.g., to write different sampling or batching logic),
|
||||
you can write your own data loader. The data loader is simply a
|
||||
python iterator that produces [the format](./models.md) your model accepts.
|
||||
You can implement it using any tools you like.
|
||||
|
||||
## Use a Custom Dataloader
|
||||
|
||||
If you use [DefaultTrainer](../modules/engine.html#detectron2.engine.defaults.DefaultTrainer),
|
||||
you can overwrite its `build_{train,test}_loader` method to use your own dataloader.
|
||||
See the [densepose dataloader](../../projects/DensePose/train_net.py)
|
||||
for an example.
|
||||
|
||||
If you write your own training loop, you can plug in your data loader easily.
|
|
||||
# Use Custom Datasets
|
||||
|
||||
Datasets that have builtin support in detectron2 are listed in [datasets](../../datasets).
|
||||
If you want to use a custom dataset while also reusing detectron2's data loaders,
|
||||
you will need to
|
||||
|
||||
1. __Register__ your dataset (i.e., tell detectron2 how to obtain your dataset).
|
||||
2. Optionally, __register metadata__ for your dataset.
|
||||
|
||||
Next, we explain the above two concepts in detail.
|
||||
|
||||
The [Colab tutorial](https://colab.research.google.com/drive/16jcaJoc6bCFAQ96jDe2HwtXj7BMD_-m5)
|
||||
has a live example of how to register and train on a dataset of custom formats.
|
||||
|
||||
### Register a Dataset
|
||||
|
||||
To let detectron2 know how to obtain a dataset named "my_dataset", you will implement
|
||||
a function that returns the items in your dataset and then tell detectron2 about this
|
||||
function:
|
||||
```python
|
||||
def my_dataset_function():
|
||||
  data = ...  # load and parse your data here
  return data  # a list[dict] in the format described below
|
||||
|
||||
from detectron2.data import DatasetCatalog
|
||||
DatasetCatalog.register("my_dataset", my_dataset_function)
|
||||
```
|
||||
|
||||
Here, the snippet associates a dataset "my_dataset" with a function that returns the data.
|
||||
The registration stays effective until the process exits.
|
||||
|
||||
The function can process data from its original format into one of the following:
|
||||
1. Detectron2's standard dataset dict, described below. This will work with many other builtin
|
||||
features in detectron2, so it's recommended to use it when it's sufficient for your task.
|
||||
2. Your custom dataset dict. You can also return arbitrary dicts in your own format,
|
||||
such as adding extra keys for new tasks.
|
||||
Then you will need to handle them properly downstream as well.
|
||||
See below for more details.
|
||||
|
||||
#### Standard Dataset Dicts
|
||||
|
||||
For standard tasks
|
||||
(instance detection, instance/semantic/panoptic segmentation, keypoint detection),
|
||||
we load the original dataset into `list[dict]` with a specification similar to COCO's json annotations.
|
||||
This is our standard representation for a dataset.
|
||||
|
||||
Each dict contains information about one image.
|
||||
The dict may have the following fields,
|
||||
and the required fields vary based on what the dataloader or the task needs (see more below).
|
||||
|
||||
+ `file_name`: the full path to the image file. Rotation and flipping are applied if the image has such EXIF information.
|
||||
+ `height`, `width`: integers. The shape of the image.
|
||||
+ `image_id` (str or int): a unique id that identifies this image. Used
|
||||
during evaluation to identify the images, but a dataset may use it for different purposes.
|
||||
+ `annotations` (list[dict]): each dict corresponds to annotations of one instance
|
||||
in this image. Required by instance detection/segmentation or keypoint detection tasks.
|
||||
|
||||
Images with empty `annotations` will by default be removed from training,
|
||||
but can be included using `DATALOADER.FILTER_EMPTY_ANNOTATIONS`.
|
||||
|
||||
Each dict contains the following keys, of which `bbox`, `bbox_mode`, and `category_id` are required:
|
||||
+ `bbox` (list[float]): list of 4 numbers representing the bounding box of the instance.
|
||||
+ `bbox_mode` (int): the format of bbox.
|
||||
It must be a member of
|
||||
[structures.BoxMode](../modules/structures.html#detectron2.structures.BoxMode).
|
||||
Currently supports: `BoxMode.XYXY_ABS`, `BoxMode.XYWH_ABS`.
|
||||
+ `category_id` (int): an integer in the range [0, num_categories) representing the category label.
|
||||
The value num_categories is reserved to represent the "background" category, if applicable.
|
||||
+ `segmentation` (list[list[float]] or dict): the segmentation mask of the instance.
|
||||
+ If `list[list[float]]`, it represents a list of polygons, one for each connected component
|
||||
of the object. Each `list[float]` is one simple polygon in the format of `[x1, y1, ..., xn, yn]`.
|
||||
The Xs and Ys are either relative coordinates in [0, 1] or absolute coordinates,
depending on whether "bbox_mode" is relative.
|
||||
+ If `dict`, it represents the per-pixel segmentation mask in COCO's RLE format. The dict should have
|
||||
keys "size" and "counts". You can convert a uint8 segmentation mask of 0s and 1s into
|
||||
RLE format by `pycocotools.mask.encode(np.asarray(mask, order="F"))`.
|
||||
+ `keypoints` (list[float]): in the format of [x1, y1, v1,..., xn, yn, vn].
|
||||
v[i] means the [visibility](http://cocodataset.org/#format-data) of this keypoint.
|
||||
`n` must be equal to the number of keypoint categories.
|
||||
The Xs and Ys are either relative coordinates in [0, 1] or absolute coordinates,
depending on whether "bbox_mode" is relative.
|
||||
|
||||
Note that the coordinate annotations in COCO format are integers in range [0, H-1 or W-1].
|
||||
By default, detectron2 adds 0.5 to absolute keypoint coordinates to convert them from discrete
|
||||
pixel indices to floating point coordinates.
|
||||
+ `iscrowd`: 0 (default) or 1. Whether this instance is labeled as COCO's "crowd
|
||||
region". Don't include this field if you don't know what it means.
|
||||
+ `sem_seg_file_name`: the full path to the ground truth semantic segmentation file.
|
||||
Required by semantic segmentation task.
|
||||
It should be an image whose pixel values are integer labels.
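
To make the fields above concrete, here is a sketch of one standard dataset dict for instance segmentation (all paths and values are hypothetical):

```python
from detectron2.structures import BoxMode

example_dict = {
    "file_name": "/path/to/images/0001.jpg",
    "height": 480,
    "width": 640,
    "image_id": 1,
    "annotations": [
        {
            "bbox": [100.0, 120.0, 200.0, 260.0],
            "bbox_mode": BoxMode.XYXY_ABS,
            "category_id": 0,
            # one simple polygon per connected component: [x1, y1, x2, y2, ...]
            "segmentation": [[110.0, 130.0, 190.0, 130.0, 190.0, 250.0, 110.0, 250.0]],
            "iscrowd": 0,
        }
    ],
}
```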
|
||||
|
||||
|
||||
Fast R-CNN (with precomputed proposals) is rarely used today.
|
||||
To train a Fast R-CNN, the following extra keys are needed:
|
||||
|
||||
+ `proposal_boxes` (array): 2D numpy array with shape (K, 4) representing K precomputed proposal boxes for this image.
|
||||
+ `proposal_objectness_logits` (array): numpy array with shape (K, ), which corresponds to the objectness
|
||||
logits of proposals in 'proposal_boxes'.
|
||||
+ `proposal_bbox_mode` (int): the format of the precomputed proposal bbox.
|
||||
It must be a member of
|
||||
[structures.BoxMode](../modules/structures.html#detectron2.structures.BoxMode).
|
||||
Default is `BoxMode.XYXY_ABS`.
|
||||
|
||||
#### Custom Dataset Dicts for New Tasks
|
||||
|
||||
In the `list[dict]` that your dataset function returns, the dictionary can also have arbitrary custom data.
|
||||
This will be useful for a new task that needs extra information not supported
|
||||
by the standard dataset dicts. In this case, you need to make sure the downstream code can handle your data
|
||||
correctly. Usually this requires writing a new `mapper` for the dataloader (see [Use Custom Dataloaders](./data_loading.md)).
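
As a sketch, a custom dict may simply carry an extra key that your own mapper knows how to load (`"density_map_file"` below is a made-up field used only for illustration):

```python
custom_dict = {
    "file_name": "/path/to/images/0001.jpg",
    "height": 480,
    "width": 640,
    "image_id": 1,
    "density_map_file": "/path/to/density/0001.npy",  # extra, task-specific data
}
```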
|
||||
|
||||
When designing a custom format, note that all dicts are stored in memory
|
||||
(sometimes serialized and with multiple copies).
|
||||
To save memory, each dict is meant to contain small but sufficient information
|
||||
about each sample, such as file names and annotations.
|
||||
Loading full samples typically happens in the data loader.
|
||||
|
||||
For attributes shared among the entire dataset, use `Metadata` (see below).
|
||||
To avoid extra memory use, do not save such information repeatedly for each sample.
|
||||
|
||||
### "Metadata" for Datasets
|
||||
|
||||
Each dataset is associated with some metadata, accessible through
|
||||
`MetadataCatalog.get(dataset_name).some_metadata`.
|
||||
Metadata is a key-value mapping that contains information that's shared among
|
||||
the entire dataset, and usually is used to interpret what's in the dataset, e.g.,
|
||||
names of classes, colors of classes, root of files, etc.
|
||||
This information will be useful for augmentation, evaluation, visualization, logging, etc.
|
||||
The structure of metadata depends on what is needed by the corresponding downstream code.
|
||||
|
||||
If you register a new dataset through `DatasetCatalog.register`,
|
||||
you may also want to add its corresponding metadata through
|
||||
`MetadataCatalog.get(dataset_name).some_key = some_value`, to enable any features that need the metadata.
|
||||
You can do it like this (using the metadata key "thing_classes" as an example):
|
||||
|
||||
```python
|
||||
from detectron2.data import MetadataCatalog
|
||||
MetadataCatalog.get("my_dataset").thing_classes = ["person", "dog"]
|
||||
```
|
||||
|
||||
Here is a list of metadata keys that are used by builtin features in detectron2.
|
||||
If you add your own dataset without these metadata, some features may be
|
||||
unavailable to you:
|
||||
|
||||
* `thing_classes` (list[str]): Used by all instance detection/segmentation tasks.
|
||||
A list of names for each instance/thing category.
|
||||
If you load a COCO format dataset, it will be automatically set by the function `load_coco_json`.
|
||||
|
||||
* `thing_colors` (list[tuple(r, g, b)]): Pre-defined color (in [0, 255]) for each thing category.
|
||||
Used for visualization. If not given, random colors are used.
|
||||
|
||||
* `stuff_classes` (list[str]): Used by semantic and panoptic segmentation tasks.
|
||||
A list of names for each stuff category.
|
||||
|
||||
* `stuff_colors` (list[tuple(r, g, b)]): Pre-defined color (in [0, 255]) for each stuff category.
|
||||
Used for visualization. If not given, random colors are used.
|
||||
|
||||
* `keypoint_names` (list[str]): Used by keypoint localization. A list of names for each keypoint.
|
||||
|
||||
* `keypoint_flip_map` (list[tuple[str]]): Used by the keypoint localization task. A list of pairs of names,
|
||||
where each pair are the two keypoints that should be flipped if the image is
|
||||
flipped horizontally during augmentation.
|
||||
* `keypoint_connection_rules`: list[tuple(str, str, (r, g, b))]. Each tuple specifies a pair of keypoints
|
||||
that are connected and the color to use for the line between them when visualized.
|
||||
|
||||
Some additional metadata that are specific to the evaluation of certain datasets (e.g. COCO):
|
||||
|
||||
* `thing_dataset_id_to_contiguous_id` (dict[int->int]): Used by all instance detection/segmentation tasks in the COCO format.
|
||||
A mapping from instance class ids in the dataset to contiguous ids in range [0, #class).
|
||||
Will be automatically set by the function `load_coco_json`.
|
||||
|
||||
* `stuff_dataset_id_to_contiguous_id` (dict[int->int]): Used when generating prediction json files for
|
||||
semantic/panoptic segmentation.
|
||||
A mapping from semantic segmentation class ids in the dataset
|
||||
to contiguous ids in [0, num_categories). It is useful for evaluation only.
|
||||
|
||||
* `json_file`: The COCO annotation json file. Used by COCO evaluation for COCO-format datasets.
|
||||
* `panoptic_root`, `panoptic_json`: Used by panoptic evaluation.
|
||||
* `evaluator_type`: Used by the builtin main training script to select
|
||||
evaluator. Don't use it in a new training script.
|
||||
You can just provide the [DatasetEvaluator](../modules/evaluation.html#detectron2.evaluation.DatasetEvaluator)
|
||||
for your dataset directly in your main script.
|
||||
|
||||
NOTE: For background on the concept of "thing" and "stuff", see
|
||||
[On Seeing Stuff: The Perception of Materials by Humans and Machines](http://persci.mit.edu/pub_pdfs/adelson_spie_01.pdf).
|
||||
In detectron2, the term "thing" is used for instance-level tasks,
|
||||
and "stuff" is used for semantic segmentation tasks.
|
||||
Both are used in panoptic segmentation.
|
||||
|
||||
### Register a COCO Format Dataset
|
||||
|
||||
If your dataset is already a json file in the COCO format,
|
||||
the dataset and its associated metadata can be registered easily with:
|
||||
```python
|
||||
from detectron2.data.datasets import register_coco_instances
|
||||
register_coco_instances("my_dataset", {}, "json_annotation.json", "path/to/image/dir")
|
||||
```
|
||||
|
||||
If your dataset is in COCO format but with extra custom per-instance annotations,
|
||||
the [load_coco_json](../modules/data.html#detectron2.data.datasets.load_coco_json)
|
||||
function might be useful.
|
||||
|
||||
### Update the Config for New Datasets
|
||||
|
||||
Once you've registered the dataset, you can use the name of the dataset (e.g., "my_dataset" in
|
||||
example above) in `cfg.DATASETS.{TRAIN,TEST}`.
|
||||
There are other configs you might want to change to train or evaluate on new datasets:
|
||||
|
||||
* `MODEL.ROI_HEADS.NUM_CLASSES` and `MODEL.RETINANET.NUM_CLASSES` are the number of thing classes
|
||||
for R-CNN and RetinaNet models, respectively.
|
||||
* `MODEL.ROI_KEYPOINT_HEAD.NUM_KEYPOINTS` sets the number of keypoints for Keypoint R-CNN.
|
||||
You'll also need to set [Keypoint OKS](http://cocodataset.org/#keypoints-eval)
|
||||
with `TEST.KEYPOINT_OKS_SIGMAS` for evaluation.
|
||||
* `MODEL.SEM_SEG_HEAD.NUM_CLASSES` sets the number of stuff classes for Semantic FPN & Panoptic FPN.
|
||||
* If you're training Fast R-CNN (with precomputed proposals), `DATASETS.PROPOSAL_FILES_{TRAIN,TEST}`
|
||||
need to match the datasets. The format of proposal files are documented
|
||||
[here](../modules/data.html#detectron2.data.load_proposals_into_dataset).
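
For example, a minimal sketch of pointing a config at the dataset registered above ("my_dataset" and the class count are assumptions carried over from the earlier examples):

```python
from detectron2.config import get_cfg

cfg = get_cfg()
cfg.DATASETS.TRAIN = ("my_dataset",)
cfg.DATASETS.TEST = ()
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 2  # e.g. "person" and "dog"
```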
|
||||
|
||||
New models
|
||||
(e.g. [TensorMask](../../projects/TensorMask),
|
||||
[PointRend](../../projects/PointRend))
|
||||
often have similar configs of their own that need to be changed as well.
|
|
||||
# Deployment
|
||||
|
||||
## Caffe2 Deployment
|
||||
We currently support converting a detectron2 model to Caffe2 format through ONNX.
|
||||
The converted Caffe2 model is able to run without detectron2 dependency in either Python or C++.
|
||||
It has a runtime optimized for CPU & mobile inference, but not for GPU inference.
|
||||
|
||||
Caffe2 conversion requires PyTorch ≥ 1.4 and ONNX ≥ 1.6.
|
||||
|
||||
### Coverage
|
||||
|
||||
It supports 3 most common meta architectures: `GeneralizedRCNN`, `RetinaNet`, `PanopticFPN`,
|
||||
and most official models under these 3 meta architectures.
|
||||
|
||||
Users' custom extensions under these architectures (added through registration) are supported
|
||||
as long as they do not contain control flow or operators not available in Caffe2 (e.g. deformable convolution).
|
||||
For example, custom backbones and heads are often supported out of the box.
|
||||
|
||||
### Usage
|
||||
|
||||
The conversion APIs are documented at [the API documentation](../modules/export).
|
||||
We provide a tool, `caffe2_converter.py`, as an example that uses
|
||||
these APIs to convert a standard model.
|
||||
|
||||
To convert an official Mask R-CNN trained on COCO, first
|
||||
[prepare the COCO dataset](../../datasets/), then pick the model from [Model Zoo](../../MODEL_ZOO.md), and run:
|
||||
```
|
||||
cd tools/deploy/ && ./caffe2_converter.py --config-file ../../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml \
|
||||
--output ./caffe2_model --run-eval \
|
||||
MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl \
|
||||
MODEL.DEVICE cpu
|
||||
```
|
||||
|
||||
Note that:
|
||||
1. The conversion needs valid sample inputs & weights to trace the model. That's why the script requires the dataset.
|
||||
You can modify the script to obtain sample inputs in other ways.
|
||||
2. With the `--run-eval` flag, it will evaluate the converted model to verify its accuracy.
|
||||
The accuracy is typically slightly different (within 0.1 AP) from PyTorch, due to
numerical precision differences between the two implementations.
|
||||
It's recommended to always verify the accuracy in case your custom model is not supported by the
|
||||
conversion.
|
||||
|
||||
The converted model is available at the specified `caffe2_model/` directory. Two files `model.pb`
|
||||
and `model_init.pb` that contain network structure and network parameters are necessary for deployment.
|
||||
These files can then be loaded in C++ or Python using Caffe2's APIs.
|
||||
|
||||
The script generates a `model.svg` file that contains a visualization of the network.
|
||||
You can also load `model.pb` into tools such as [netron](https://github.com/lutzroeder/netron) to visualize it.
|
||||
|
||||
### Use the model in C++/Python
|
||||
|
||||
The model can be loaded in C++. An example [caffe2_mask_rcnn.cpp](../../tools/deploy/) is given,
|
||||
which performs CPU/GPU inference using `COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x`.
|
||||
|
||||
The C++ example needs to be built with:
|
||||
* PyTorch with caffe2 inside
|
||||
* gflags, glog, opencv
|
||||
* protobuf headers that match the version of your caffe2
|
||||
* MKL headers if caffe2 is built with MKL
|
||||
|
||||
The following can compile the example inside [official detectron2 docker](../../docker/):
|
||||
```
|
||||
sudo apt update && sudo apt install libgflags-dev libgoogle-glog-dev libopencv-dev
|
||||
pip install mkl-include
|
||||
wget https://github.com/protocolbuffers/protobuf/releases/download/v3.6.1/protobuf-cpp-3.6.1.tar.gz
|
||||
tar xf protobuf-cpp-3.6.1.tar.gz
|
||||
export CPATH=$(readlink -f ./protobuf-3.6.1/src/):$HOME/.local/include
|
||||
export CMAKE_PREFIX_PATH=$HOME/.local/lib/python3.6/site-packages/torch/
|
||||
mkdir build && cd build
|
||||
cmake -DTORCH_CUDA_ARCH_LIST=$TORCH_CUDA_ARCH_LIST .. && make
|
||||
|
||||
# To run:
|
||||
./caffe2_mask_rcnn --predict_net=./model.pb --init_net=./model_init.pb --input=input.jpg
|
||||
```
|
||||
|
||||
Note that:
|
||||
|
||||
* All converted models (the .pb files) take two input tensors:
|
||||
"data" is an NCHW image, and "im_info" is an Nx3 tensor consisting of (height, width, 1.0) for
|
||||
each image (the shape of "data" might be larger than that in "im_info" due to padding).
|
||||
|
||||
* The converted models do not contain post-processing operations that
|
||||
transform raw layer outputs into formatted predictions.
|
||||
The example only produces raw outputs (28x28 masks) from the final
|
||||
layers that are not post-processed, because in actual deployment, an application often needs
|
||||
its own lightweight post-processing (e.g., full-image masks for every detected object are often not necessary).
|
||||
|
||||
We also provide a python wrapper around the converted model, in the
|
||||
[Caffe2Model.\_\_call\_\_](../modules/export.html#detectron2.export.Caffe2Model.__call__) method.
|
||||
This method has an interface that's identical to the [pytorch versions of models](./models.md),
|
||||
and it internally applies pre/post-processing code to match the formats.
|
||||
They can serve as a reference for pre/post-processing in actual deployment.
|
|
||||
|
||||
# Evaluation
|
||||
|
||||
Evaluation is a process that takes a number of input/output pairs and aggregates them.
|
||||
You can always [use the model](./models.md) directly and just parse its inputs/outputs manually to perform
|
||||
evaluation.
|
||||
Alternatively, evaluation is implemented in detectron2 using the [DatasetEvaluator](../modules/evaluation.html#detectron2.evaluation.DatasetEvaluator)
|
||||
interface.
|
||||
|
||||
Detectron2 includes a few `DatasetEvaluator`s that compute metrics using standard dataset-specific
|
||||
APIs (e.g., COCO, LVIS).
|
||||
You can also implement your own `DatasetEvaluator` that performs some other jobs
|
||||
using the inputs/outputs pairs.
|
||||
For example, to count how many instances are detected on the validation set:
|
||||
|
||||
```python
from detectron2.evaluation import DatasetEvaluator

class Counter(DatasetEvaluator):
|
||||
def reset(self):
|
||||
self.count = 0
|
||||
def process(self, inputs, outputs):
|
||||
for output in outputs:
|
||||
self.count += len(output["instances"])
|
||||
def evaluate(self):
|
||||
# save self.count somewhere, or print it, or return it.
|
||||
return {"count": self.count}
|
||||
```
|
||||
|
||||
Once you have some `DatasetEvaluator`, you can run it with
|
||||
[inference_on_dataset](../modules/evaluation.html#detectron2.evaluation.inference_on_dataset).
|
||||
For example,
|
||||
|
||||
```python
|
||||
val_results = inference_on_dataset(
|
||||
model,
|
||||
val_data_loader,
|
||||
DatasetEvaluators([COCOEvaluator(...), Counter()]))
|
||||
```
|
||||
Compared to running the evaluation manually using the model, the benefit of this function is that
|
||||
you can merge evaluators together using [DatasetEvaluators](../modules/evaluation.html#detectron2.evaluation.DatasetEvaluators).
|
||||
In this way you can run all evaluations without having to go through the dataset multiple times.
|
||||
|
||||
The `inference_on_dataset` function also provides accurate speed benchmarks for the
|
||||
given model and dataset.
|
|
||||
# Extend Detectron2's Defaults
|
||||
|
||||
__Research is about doing things in new ways__.
|
||||
This brings a tension in how to create abstractions in code,
|
||||
which is a challenge for any research engineering project of a significant size:
|
||||
|
||||
1. On one hand, it needs to have very thin abstractions to allow for the possibility of doing
|
||||
everything in new ways. It should be reasonably easy to break existing
|
||||
abstractions and replace them with new ones.
|
||||
|
||||
2. On the other hand, such a project also needs reasonably high-level
|
||||
abstractions, so that users can easily do things in standard ways,
|
||||
without worrying too much about the details that only certain researchers care about.
|
||||
|
||||
In detectron2, there are two types of interfaces that address this tension together:
|
||||
|
||||
1. Functions and classes that take a config (`cfg`) argument
|
||||
(sometimes with only a few extra arguments).
|
||||
|
||||
Such functions and classes implement
|
||||
the "standard default" behavior: it will read what it needs from the
|
||||
config and do the "standard" thing.
|
||||
Users only need to load a given config and pass it around, without having to worry about
|
||||
which arguments are used and what they all mean.
|
||||
|
||||
2. Functions and classes that have well-defined explicit arguments.
|
||||
|
||||
Each of these is a small building block of the entire system.
|
||||
They require users' expertise to understand what each argument should be,
|
||||
and require more effort to stitch together into a larger system.
|
||||
But they can be stitched together in more flexible ways.
|
||||
|
||||
When you need to implement something not supported by the "standard defaults"
|
||||
included in detectron2, these well-defined components can be reused.
|
||||
|
||||
3. (experimental) A few classes are implemented with the
|
||||
[@configurable](../../modules/config.html#detectron2.config.configurable)
|
||||
decorator - they can be called with either a config, or with explicit arguments.
|
||||
Their explicit argument interfaces are currently __experimental__ and subject to change.
|
||||
|
||||
|
||||
If you only need the standard behavior, the [Beginner's Tutorial](./getting_started.md)
|
||||
should suffice. If you need to extend detectron2 to your own needs,
|
||||
see the following tutorials for more details:
|
||||
|
||||
* Detectron2 includes a few standard datasets. To use custom ones, see
|
||||
[Use Custom Datasets](./datasets.md).
|
||||
* Detectron2 contains the standard logic that creates a data loader for training/testing from a
|
||||
dataset, but you can write your own as well. See [Use Custom Data Loaders](./data_loading.md).
|
||||
* Detectron2 implements many standard detection models, and provides ways for you
|
||||
to overwrite their behaviors. See [Use Models](./models.md) and [Write Models](./write-models.md).
|
||||
* Detectron2 provides a default training loop that is good for common training tasks.
|
||||
You can customize it with hooks, or write your own loop instead. See [training](./training.md).
|
|
||||
## Getting Started with Detectron2
|
||||
|
||||
This document provides a brief intro of the usage of builtin command-line tools in detectron2.
|
||||
|
||||
For a tutorial that involves actual coding with the API,
|
||||
see our [Colab Notebook](https://colab.research.google.com/drive/16jcaJoc6bCFAQ96jDe2HwtXj7BMD_-m5)
|
||||
which covers how to run inference with an
|
||||
existing model, and how to train a builtin model on a custom dataset.
|
||||
|
||||
For more advanced tutorials, refer to our [documentation](https://detectron2.readthedocs.io/tutorials/extend.html).
|
||||
|
||||
|
||||
### Inference Demo with Pre-trained Models
|
||||
|
||||
1. Pick a model and its config file from
|
||||
[model zoo](MODEL_ZOO.md),
|
||||
for example, `mask_rcnn_R_50_FPN_3x.yaml`.
|
||||
2. We provide `demo.py` that is able to run builtin standard models. Run it with:
|
||||
```
|
||||
cd demo/
|
||||
python demo.py --config-file ../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml \
|
||||
--input input1.jpg input2.jpg \
|
||||
[--other-options]
|
||||
--opts MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl
|
||||
```
|
||||
The configs are made for training; therefore we need to specify `MODEL.WEIGHTS` to a model from the model zoo for evaluation.
|
||||
This command will run the inference and show visualizations in an OpenCV window.
|
||||
|
||||
For details of the command line arguments, see `demo.py -h` or look at its source code
|
||||
to understand its behavior. Some common arguments are:
|
||||
* To run __on your webcam__, replace `--input files` with `--webcam`.
|
||||
* To run __on a video__, replace `--input files` with `--video-input video.mp4`.
|
||||
* To run __on cpu__, add `MODEL.DEVICE cpu` after `--opts`.
|
||||
* To save outputs to a directory (for images) or a file (for webcam or video), use `--output`.
|
||||
|
||||
|
||||
### Training & Evaluation in Command Line
|
||||
|
||||
We provide a script in "tools/{,plain_}train_net.py" that is made to train
|
||||
all the configs provided in detectron2.
|
||||
You may want to use it as a reference to write your own training script.
|
||||
|
||||
To train a model with "train_net.py", first
|
||||
set up the corresponding datasets following
|
||||
[datasets/README.md](./datasets/README.md),
|
||||
then run:
|
||||
```
|
||||
cd tools/
|
||||
./train_net.py --num-gpus 8 \
|
||||
--config-file ../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_1x.yaml
|
||||
```
|
||||
|
||||
The configs are made for 8-GPU training.
|
||||
To train on 1 GPU, you may need to [change some parameters](https://arxiv.org/abs/1706.02677), e.g.:
|
||||
```
|
||||
./train_net.py \
|
||||
--config-file ../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_1x.yaml \
|
||||
--num-gpus 1 SOLVER.IMS_PER_BATCH 2 SOLVER.BASE_LR 0.0025
|
||||
```
|
||||
|
||||
For most models, CPU training is not supported.
|
||||
|
||||
To evaluate a model's performance, use
|
||||
```
|
||||
./train_net.py \
|
||||
--config-file ../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_1x.yaml \
|
||||
--eval-only MODEL.WEIGHTS /path/to/checkpoint_file
|
||||
```
|
||||
For more options, see `./train_net.py -h`.
|
||||
|
||||
### Use Detectron2 APIs in Your Code
|
||||
|
||||
See our [Colab Notebook](https://colab.research.google.com/drive/16jcaJoc6bCFAQ96jDe2HwtXj7BMD_-m5)
|
||||
to learn how to use detectron2 APIs to:
|
||||
1. run inference with an existing model
|
||||
2. train a builtin model on a custom dataset
|
||||
|
||||
See [detectron2/projects](https://github.com/facebookresearch/detectron2/tree/master/projects)
|
||||
for more ways to build your project on detectron2.
|
|
||||
Tutorials
|
||||
======================================
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
install
|
||||
getting_started
|
||||
builtin_datasets
|
||||
extend
|
||||
datasets
|
||||
data_loading
|
||||
models
|
||||
write-models
|
||||
training
|
||||
evaluation
|
||||
configs
|
||||
deployment
|
|
||||
## Installation
|
||||
|
||||
Our [Colab Notebook](https://colab.research.google.com/drive/16jcaJoc6bCFAQ96jDe2HwtXj7BMD_-m5)
|
||||
has step-by-step instructions that install detectron2.
|
||||
The [Dockerfile](docker)
|
||||
also installs detectron2 with a few simple commands.
|
||||
|
||||
### Requirements
|
||||
- Linux or macOS with Python ≥ 3.6
|
||||
- PyTorch ≥ 1.4
|
||||
- [torchvision](https://github.com/pytorch/vision/) that matches the PyTorch installation.
|
||||
You can install them together at [pytorch.org](https://pytorch.org) to make sure of this.
|
||||
- OpenCV, optional, needed by demo and visualization
|
||||
- pycocotools: `pip install cython; pip install -U 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'`
|
||||
|
||||
|
||||
### Build Detectron2 from Source
|
||||
|
||||
gcc & g++ ≥ 5 are required. [ninja](https://ninja-build.org/) is recommended for faster build.
|
||||
After having them, run:
|
||||
```
|
||||
python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'
|
||||
# (add --user if you don't have permission)
|
||||
|
||||
# Or, to install it from a local clone:
|
||||
git clone https://github.com/facebookresearch/detectron2.git
|
||||
python -m pip install -e detectron2
|
||||
|
||||
# Or if you are on macOS
|
||||
# CC=clang CXX=clang++ python -m pip install -e .
|
||||
```
|
||||
|
||||
To __rebuild__ detectron2 that's built from a local clone, use `rm -rf build/ **/*.so` to clean the
|
||||
old build first. You often need to rebuild detectron2 after reinstalling PyTorch.
|
||||
|
||||
### Install Pre-Built Detectron2 (Linux only)
|
||||
```
|
||||
# for CUDA 10.1:
|
||||
python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu101/index.html
|
||||
```
|
||||
You can replace cu101 with "cu{100,92}" or "cpu".
|
||||
|
||||
Note that:
|
||||
1. Such an installation has to be used with a certain version of the official PyTorch release.
|
||||
See [releases](https://github.com/facebookresearch/detectron2/releases) for requirements.
|
||||
It will not work with a different version of PyTorch or a non-official build of PyTorch.
|
||||
2. Such an installation is out-of-date w.r.t. the master branch of detectron2. It may not be
|
||||
compatible with the master branch of a research project that uses detectron2 (e.g. those in
|
||||
[projects](projects) or [meshrcnn](https://github.com/facebookresearch/meshrcnn/)).
|
||||
|
||||
### Common Installation Issues
|
||||
|
||||
If you meet issues using the pre-built detectron2, please uninstall it and try building it from source.
|
||||
|
||||
Click each issue for its solutions:
|
||||
|
||||
<details>
|
||||
<summary>
|
||||
Undefined torch/aten/caffe2 symbols, or segmentation fault immediately when running the library.
|
||||
</summary>
|
||||
<br/>
|
||||
|
||||
This usually happens when detectron2 or torchvision is not
|
||||
compiled with the version of PyTorch you're running.
|
||||
|
||||
Pre-built torchvision or detectron2 has to work with the corresponding official release of pytorch.
|
||||
If the error comes from a pre-built torchvision, uninstall torchvision and pytorch and reinstall them
following [pytorch.org](http://pytorch.org) so that the versions match.
|
||||
|
||||
If the error comes from a pre-built detectron2, check [release notes](https://github.com/facebookresearch/detectron2/releases)
|
||||
to see the corresponding pytorch version required for each pre-built detectron2.
|
||||
|
||||
If the error comes from detectron2 or torchvision that you built manually from source,
|
||||
remove files you built (`build/`, `**/*.so`) and rebuild it so it can pick up the version of pytorch currently in your environment.
|
||||
|
||||
If you cannot resolve this problem, please include the output of `gdb -ex "r" -ex "bt" -ex "quit" --args python -m detectron2.utils.collect_env`
|
||||
in your issue.
|
||||
</details>
|
||||
|
||||
<details>
|
||||
<summary>
|
||||
Undefined C++ symbols (e.g. `GLIBCXX`) or C++ symbols not found.
|
||||
</summary>
|
||||
<br/>
|
||||
Usually it's because the library is compiled with a newer C++ compiler but run with an old C++ runtime.
|
||||
|
||||
This often happens with old anaconda.
|
||||
Try `conda update libgcc`. Then rebuild detectron2.
|
||||
|
||||
The fundamental solution is to run the code with proper C++ runtime.
|
||||
One way is to use `LD_PRELOAD=/path/to/libstdc++.so`.
|
||||
|
||||
</details>
|
||||
|
||||
<details>
|
||||
<summary>
|
||||
"Not compiled with GPU support" or "Detectron2 CUDA Compiler: not available".
|
||||
</summary>
|
||||
<br/>
|
||||
CUDA is not found when building detectron2.
|
||||
You should make sure
|
||||
|
||||
```
|
||||
python -c 'import torch; from torch.utils.cpp_extension import CUDA_HOME; print(torch.cuda.is_available(), CUDA_HOME)'
|
||||
```
|
||||
|
||||
prints valid outputs at the time you build detectron2.
|
||||
|
||||
Most models can run inference (but not training) without GPU support. To use CPUs, set `MODEL.DEVICE='cpu'` in the config.
|
||||
</details>
|
||||
|
||||
<details>
|
||||
<summary>
|
||||
"invalid device function" or "no kernel image is available for execution".
|
||||
</summary>
|
||||
<br/>
|
||||
Two possibilities:
|
||||
|
||||
* You build detectron2 with one version of CUDA but run it with a different version.
|
||||
|
||||
To check whether it is the case,
|
||||
use `python -m detectron2.utils.collect_env` to find out inconsistent CUDA versions.
|
||||
In the output of this command, you should expect "Detectron2 CUDA Compiler", "CUDA_HOME", "PyTorch built with - CUDA"
|
||||
to contain cuda libraries of the same version.
|
||||
|
||||
When they are inconsistent,
|
||||
you need to either install a different build of PyTorch (or build by yourself)
|
||||
to match your local CUDA installation, or install a different version of CUDA to match PyTorch.
|
||||
|
||||
* Detectron2 or PyTorch/torchvision is not built for the correct GPU architecture (compute capability).
|
||||
|
||||
The GPU architecture for PyTorch/detectron2/torchvision is available in the "architecture flags" in
|
||||
`python -m detectron2.utils.collect_env`.
|
||||
|
||||
The GPU architecture flags of detectron2/torchvision by default matches the GPU model detected
|
||||
during compilation. This means the compiled code may not work on a different GPU model.
|
||||
To overwrite the GPU architecture for detectron2/torchvision, use `TORCH_CUDA_ARCH_LIST` environment variable during compilation.
|
||||
|
||||
For example, `export TORCH_CUDA_ARCH_LIST=6.0,7.0` makes it compile for both P100s and V100s.
|
||||
Visit [developer.nvidia.com/cuda-gpus](https://developer.nvidia.com/cuda-gpus) to find out
|
||||
the correct compute capability of your device.
|
||||
|
||||
</details>
|
||||
|
||||
<details>
|
||||
<summary>
|
||||
Undefined CUDA symbols; cannot open libcudart.so; other nvcc failures.
|
||||
</summary>
|
||||
<br/>
|
||||
The version of NVCC you use to build detectron2 or torchvision does
|
||||
not match the version of CUDA you are running with.
|
||||
This often happens when using anaconda's CUDA runtime.
|
||||
|
||||
Use `python -m detectron2.utils.collect_env` to find out inconsistent CUDA versions.
|
||||
In the output of this command, you should expect "Detectron2 CUDA Compiler", "CUDA_HOME", "PyTorch built with - CUDA"
|
||||
to contain cuda libraries of the same version.
|
||||
|
||||
When they are inconsistent,
|
||||
you need to either install a different build of PyTorch (or build by yourself)
|
||||
to match your local CUDA installation, or install a different version of CUDA to match PyTorch.
|
||||
</details>
|
||||
|
||||
|
||||
<details>
|
||||
<summary>
|
||||
"ImportError: cannot import name '_C'".
|
||||
</summary>
|
||||
<br/>
|
||||
Please build and install detectron2 following the instructions above.
|
||||
|
||||
If you are running code from detectron2's root directory, `cd` to a different one.
|
||||
Otherwise you may not import the code that you installed.
|
||||
</details>
|
||||
|
||||
<details>
|
||||
<summary>
|
||||
ONNX conversion segfault after some "TraceWarning".
|
||||
</summary>
|
||||
<br/>
|
||||
The ONNX package is compiled with a compiler that is too old.
|
||||
|
||||
Please build and install ONNX from its source code using a compiler
|
||||
whose version is closer to what's used by PyTorch (available in `torch.__config__.show()`).
|
||||
</details>
|
|
||||
# Use Models
|
||||
|
||||
Models (and their sub-models) in detectron2 are built by
|
||||
functions such as `build_model`, `build_backbone`, `build_roi_heads`:
|
||||
```python
|
||||
from detectron2.modeling import build_model
|
||||
model = build_model(cfg) # returns a torch.nn.Module
|
||||
```
|
||||
|
||||
`build_model` only builds the model structure and fills it with random parameters.
|
||||
See below for how to load an existing checkpoint to the model,
|
||||
and how to use the `model` object.
|
||||
|
||||
### Load/Save a Checkpoint
|
||||
```python
|
||||
from detectron2.checkpoint import DetectionCheckpointer
|
||||
DetectionCheckpointer(model).load(file_path) # load a file to model
|
||||
|
||||
checkpointer = DetectionCheckpointer(model, save_dir="output")
|
||||
checkpointer.save("model_999") # save to output/model_999.pth
|
||||
```
|
||||
|
||||
Detectron2's checkpointer recognizes models in pytorch's `.pth` format, as well as the `.pkl` files
|
||||
in our model zoo.
|
||||
See [API doc](../modules/checkpoint.html#detectron2.checkpoint.DetectionCheckpointer)
|
||||
for more details about its usage.
|
||||
|
||||
The model files can be arbitrarily manipulated using `torch.{load,save}` for `.pth` files or
|
||||
`pickle.{dump,load}` for `.pkl` files.
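
For example, a small sketch of inspecting checkpoint files directly (the file names below are assumptions; model-zoo `.pkl` files typically store a "model" dict of numpy arrays):

```python
import pickle
import torch

ckpt = torch.load("output/model_999.pth", map_location="cpu")
print(ckpt.keys())  # typically includes a "model" state dict

with open("model_final.pkl", "rb") as f:  # a model-zoo checkpoint downloaded locally
    zoo_ckpt = pickle.load(f, encoding="latin1")
print(type(zoo_ckpt), list(zoo_ckpt.keys())[:5])
```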
|
||||
|
||||
### Use a Model
|
||||
|
||||
A model can be called by `outputs = model(inputs)`, where `inputs` is a `list[dict]`.
|
||||
Each dict corresponds to one image and the required keys
|
||||
depend on the type of model, and whether the model is in training or evaluation mode.
|
||||
For example, in order to do inference,
|
||||
all existing models expect the "image" key, and optionally "height" and "width".
|
||||
The detailed format of inputs and outputs of existing models are explained below.
|
||||
|
||||
When in training mode, all models are required to be used under an `EventStorage`.
|
||||
The training statistics will be put into the storage:
|
||||
```python
|
||||
from detectron2.utils.events import EventStorage
|
||||
with EventStorage() as storage:
|
||||
losses = model(inputs)
|
||||
```
|
||||
|
||||
If you only want to do simple inference using an existing model,
|
||||
[DefaultPredictor](../modules/engine.html#detectron2.engine.defaults.DefaultPredictor)
|
||||
is a wrapper around the model that provides such basic functionality.
It includes default behavior such as model loading and preprocessing,
and it operates on a single image rather than on batches.
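
A minimal inference sketch with `DefaultPredictor` (the config and weights below are the Mask R-CNN entries used elsewhere in these docs; the input image path is an assumption):

```python
import cv2
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
cfg.merge_from_file("configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
cfg.MODEL.WEIGHTS = "detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl"
# cfg.MODEL.DEVICE = "cpu"  # uncomment to run without a GPU

predictor = DefaultPredictor(cfg)
image = cv2.imread("input.jpg")   # BGR, matching the default cfg.INPUT.FORMAT
outputs = predictor(image)        # a dict with an "instances" field
print(outputs["instances"].pred_classes)
```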
|
||||
|
||||
### Model Input Format
|
||||
|
||||
Users can implement custom models that support any arbitrary input format.
|
||||
Here we describe the standard input format that all builtin models support in detectron2.
|
||||
They all take a `list[dict]` as the inputs. Each dict
|
||||
corresponds to information about one image.
|
||||
|
||||
The dict may contain the following keys:
|
||||
|
||||
* "image": `Tensor` in (C, H, W) format. The meaning of channels are defined by `cfg.INPUT.FORMAT`.
|
||||
Image normalization, if any, will be performed inside the model using
|
||||
`cfg.MODEL.PIXEL_{MEAN,STD}`.
|
||||
* "instances": an [Instances](../modules/structures.html#detectron2.structures.Instances)
|
||||
object, with the following fields:
|
||||
+ "gt_boxes": a [Boxes](../modules/structures.html#detectron2.structures.Boxes) object storing N boxes, one for each instance.
|
||||
+ "gt_classes": `Tensor` of long type, a vector of N labels, in range [0, num_categories).
|
||||
+ "gt_masks": a [PolygonMasks](../modules/structures.html#detectron2.structures.PolygonMasks)
|
||||
or [BitMasks](../modules/structures.html#detectron2.structures.BitMasks) object storing N masks, one for each instance.
|
||||
+ "gt_keypoints": a [Keypoints](../modules/structures.html#detectron2.structures.Keypoints)
|
||||
object storing N keypoint sets, one for each instance.
|
||||
* "proposals": an [Instances](../modules/structures.html#detectron2.structures.Instances)
|
||||
object used only in Fast R-CNN style models, with the following fields:
|
||||
+ "proposal_boxes": a [Boxes](../modules/structures.html#detectron2.structures.Boxes) object storing P proposal boxes.
|
||||
+ "objectness_logits": `Tensor`, a vector of P scores, one for each proposal.
|
||||
* "height", "width": the **desired** output height and width, which is not necessarily the same
|
||||
as the height or width of the `image` input field.
|
||||
For example, the `image` input field might be a resized image,
|
||||
but you may want the outputs to be in **original** resolution.
|
||||
|
||||
If provided, the model will produce output in this resolution,
|
||||
rather than in the resolution of the `image` as input into the model. This is more efficient and accurate.
|
||||
* "sem_seg": `Tensor[int]` in (H, W) format. The semantic segmentation ground truth.
|
||||
Values represent category labels starting from 0.
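
A sketch of calling a builtin model on manually constructed inputs; the config below is the default one with random weights, used only to illustrate the input format:

```python
import torch
from detectron2.config import get_cfg
from detectron2.modeling import build_model

cfg = get_cfg()
cfg.MODEL.DEVICE = "cpu"   # keep the sketch runnable without a GPU
model = build_model(cfg)
model.eval()

height, width = 480, 640
inputs = [{
    "image": torch.rand(3, height, width) * 255,  # (C, H, W); channel meaning given by cfg.INPUT.FORMAT
    "height": height,   # desired output resolution
    "width": width,
}]
with torch.no_grad():
    outputs = model(inputs)   # list[dict], one dict per input image
```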
|
||||
|
||||
|
||||
#### How it connects to data loader:
|
||||
|
||||
The output of the default [DatasetMapper](../modules/data.html#detectron2.data.DatasetMapper) is a dict
|
||||
that follows the above format.
|
||||
After the data loader performs batching, it becomes `list[dict]` which the builtin models support.
|
||||
|
||||
|
||||
### Model Output Format
|
||||
|
||||
When in training mode, the builtin models output a `dict[str->ScalarTensor]` with all the losses.
|
||||
|
||||
When in inference mode, the builtin models output a `list[dict]`, one dict for each image.
|
||||
Based on the tasks the model is doing, each dict may contain the following fields:
|
||||
|
||||
* "instances": [Instances](../modules/structures.html#detectron2.structures.Instances)
|
||||
object with the following fields:
|
||||
* "pred_boxes": [Boxes](../modules/structures.html#detectron2.structures.Boxes) object storing N boxes, one for each detected instance.
|
||||
* "scores": `Tensor`, a vector of N scores.
|
||||
* "pred_classes": `Tensor`, a vector of N labels in range [0, num_categories).
|
||||
+ "pred_masks": a `Tensor` of shape (N, H, W), masks for each detected instance.
|
||||
+ "pred_keypoints": a `Tensor` of shape (N, num_keypoint, 3).
|
||||
Each row in the last dimension is (x, y, score). Scores are larger than 0.
|
||||
* "sem_seg": `Tensor` of (num_categories, H, W), the semantic segmentation prediction.
|
||||
* "proposals": [Instances](../modules/structures.html#detectron2.structures.Instances)
|
||||
object with the following fields:
|
||||
* "proposal_boxes": [Boxes](../modules/structures.html#detectron2.structures.Boxes)
|
||||
object storing N boxes.
|
||||
* "objectness_logits": a torch vector of N scores.
|
||||
* "panoptic_seg": A tuple of `(Tensor, list[dict])`. The tensor has shape (H, W), where each element
|
||||
represents the segment id of the pixel. Each dict describes one segment id and has the following fields:
|
||||
* "id": the segment id
|
||||
* "isthing": whether the segment is a thing or stuff
|
||||
* "category_id": the category id of this segment. It represents the thing
|
||||
class id when `isthing==True`, and the stuff class id otherwise.
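
Continuing the input-format sketch above (so `outputs` is the list returned by `model(inputs)` in inference mode), the "instances" field can be read like this:

```python
pred = outputs[0]["instances"].to("cpu")
boxes = pred.pred_boxes.tensor.numpy()   # (N, 4), XYXY in absolute pixel coordinates
scores = pred.scores.numpy()             # (N,)
classes = pred.pred_classes.numpy()      # (N,), in [0, num_categories)
for box, score, cls in zip(boxes, scores, classes):
    print(int(cls), float(score), box.tolist())
```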
|
||||
|
||||
|
||||
### Partially execute a model:
|
||||
|
||||
Sometimes you may want to obtain an intermediate tensor inside a model.
|
||||
Since there are typically hundreds of intermediate tensors, there isn't an API that provides you
|
||||
the intermediate result you need.
|
||||
You have the following options:
|
||||
|
||||
1. Write a (sub)model. Following the [tutorial](./write-models.md), you can
|
||||
rewrite a model component (e.g. a head of a model), such that it
|
||||
does the same thing as the existing component, but returns the output
|
||||
you need.
|
||||
2. Partially execute a model. You can create the model as usual,
|
||||
but use custom code to execute it instead of its `forward()`. For example,
|
||||
the following code obtains mask features before mask head.
|
||||
|
||||
```python
|
||||
from detectron2.modeling import build_model
from detectron2.structures import ImageList

images = ImageList.from_tensors(...)  # preprocessed input tensor
|
||||
model = build_model(cfg)
|
||||
features = model.backbone(images.tensor)
|
||||
proposals, _ = model.proposal_generator(images, features)
|
||||
instances = model.roi_heads._forward_box(features, proposals)
|
||||
mask_features = [features[f] for f in model.roi_heads.in_features]
|
||||
mask_features = model.roi_heads.mask_pooler(mask_features, [x.pred_boxes for x in instances])
|
||||
```
|
||||
|
||||
Note that both options require you to read the existing forward code to understand
|
||||
how to write code to obtain the outputs you need.
|
|
||||
# Training
|
||||
|
||||
From the previous tutorials, you may now have a custom model and data loader.
|
||||
|
||||
You are free to create your own optimizer, and write the training logic: it's
|
||||
usually easy with PyTorch, and allows researchers to see the entire training
|
||||
logic more clearly and have full control.
|
||||
One such example is provided in [tools/plain_train_net.py](../../tools/plain_train_net.py).
|
||||
|
||||
We also provide a standardized "trainer" abstraction with a
|
||||
[minimal hook system](../modules/engine.html#detectron2.engine.HookBase)
|
||||
that helps simplify the standard types of training.
|
||||
|
||||
You can use
|
||||
[SimpleTrainer().train()](../modules/engine.html#detectron2.engine.SimpleTrainer)
|
||||
which provides minimal abstraction for single-cost single-optimizer single-data-source training.
|
||||
The builtin `train_net.py` script uses
|
||||
[DefaultTrainer().train()](../modules/engine.html#detectron2.engine.defaults.DefaultTrainer),
|
||||
which includes more standard default behaviors that one might want to opt in to,
|
||||
including default configurations for learning rate schedule,
|
||||
logging, evaluation, checkpointing etc.
|
||||
This also means that it's less likely to support some non-standard behavior
|
||||
you might want during research.
|
||||
|
||||
To customize the training loops, you can:
|
||||
|
||||
1. If your customization is similar to what `DefaultTrainer` is already doing,
|
||||
you can change behavior of `DefaultTrainer` by overwriting [its methods](../modules/engine.html#detectron2.engine.defaults.DefaultTrainer)
|
||||
in a subclass, like what [tools/train_net.py](../../tools/train_net.py) does.
|
||||
2. If you need something very novel, you can start from [tools/plain_train_net.py](../../tools/plain_train_net.py) to implement them yourself.
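
For example, a sketch of the first option, overriding one method of `DefaultTrainer` (the choice of `COCOEvaluator` here is just an example for a COCO-format dataset):

```python
import os
from detectron2.engine import DefaultTrainer
from detectron2.evaluation import COCOEvaluator

class Trainer(DefaultTrainer):
    @classmethod
    def build_evaluator(cls, cfg, dataset_name):
        return COCOEvaluator(dataset_name, cfg, distributed=False,
                             output_dir=os.path.join(cfg.OUTPUT_DIR, "inference"))

# trainer = Trainer(cfg)
# trainer.resume_or_load(resume=False)
# trainer.train()
```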
|
||||
|
||||
### Logging of Metrics
|
||||
|
||||
During training, metrics are saved to a centralized [EventStorage](../modules/utils.html#detectron2.utils.events.EventStorage).
|
||||
You can use the following code to access it and log metrics to it:
|
||||
```
|
||||
from detectron2.utils.events import get_event_storage
|
||||
|
||||
# inside the model:
|
||||
if self.training:
|
||||
value = ...  # compute the value from inputs
|
||||
storage = get_event_storage()
|
||||
storage.put_scalar("some_accuracy", value)
|
||||
```
|
||||
|
||||
Refer to its documentation for more details.
|
||||
|
||||
Metrics are then saved to various destinations with [EventWriter](../modules/utils.html#module-detectron2.utils.events).
|
||||
DefaultTrainer enables a few `EventWriter` with default configurations.
|
||||
See above for how to customize them.
|
|
||||
# Write Models
|
||||
|
||||
If you are trying to do something completely new, you may wish to implement
|
||||
a model entirely from scratch within detectron2. However, in many situations you may
|
||||
be interested in modifying or extending some components of an existing model.
|
||||
Therefore, we also provide a registration mechanism that lets you override the
|
||||
behavior of certain internal components of standard models.
|
||||
|
||||
For example, to add a new backbone, import this code in your code:
|
||||
```python
|
||||
from torch import nn

from detectron2.modeling import BACKBONE_REGISTRY, Backbone, ShapeSpec
|
||||
|
||||
@BACKBONE_REGISTRY.register()
|
||||
class ToyBackBone(Backbone):
|
||||
  def __init__(self, cfg, input_shape):
    super().__init__()
    # create your own backbone
    self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=16, padding=3)
|
||||
|
||||
def forward(self, image):
|
||||
return {"conv1": self.conv1(image)}
|
||||
|
||||
def output_shape(self):
|
||||
return {"conv1": ShapeSpec(channels=64, stride=16)}
|
||||
```
|
||||
Then, you can use `cfg.MODEL.BACKBONE.NAME = 'ToyBackBone'` in your config object.
|
||||
`build_model(cfg)` will then call your `ToyBackBone` instead.
|
||||
|
||||
As another example, to add new abilities to the ROI heads in the Generalized R-CNN meta-architecture,
|
||||
you can implement a new
|
||||
[ROIHeads](../modules/modeling.html#detectron2.modeling.ROIHeads) subclass and put it in the `ROI_HEADS_REGISTRY`.
|
||||
See [densepose in detectron2](../../projects/DensePose)
|
||||
and [meshrcnn](https://github.com/facebookresearch/meshrcnn)
|
||||
for examples that implement new ROIHeads to perform new tasks.
|
||||
And [projects/](../../projects/)
|
||||
contains more examples that implement different architectures.
|
||||
|
||||
A complete list of registries can be found in [API documentation](../modules/modeling.html#model-registries).
|
||||
You can register components in these registries to customize different parts of a model, or the
|
||||
entire model.
|