We've explored a rich landscape of augmentation techniques: geometric transforms, photometric adjustments, mixing strategies, learned policies, and test-time enhancement. But knowing individual techniques isn't enough—what practitioners need are strategies: coherent approaches that combine techniques appropriately for specific goals.
This page synthesizes everything we've learned into actionable guidelines. We'll provide complete augmentation recipes for common scenarios, discuss how to adapt strategies to computational constraints and data characteristics, and develop a systematic framework for augmentation design.
By the end of this page, you will be able to apply complete augmentation recipes for major vision tasks, adapt strategies based on data size, model capacity, and compute budget, diagnose augmentation failures and tune strategies accordingly, and design new augmentation strategies for novel domains.
Let's start with battle-tested augmentation recipes for major vision tasks. These represent current best practices used by leading research labs and production systems.
This classification recipe follows the training setups used by EfficientNet, ViT, and ConvNeXt, which achieve state-of-the-art accuracy on ImageNet-1K.
```python
from torchvision import transforms
from timm.data.auto_augment import rand_augment_transform
from timm.data.mixup import Mixup
from timm.data.random_erasing import RandomErasing


def create_imagenet_train_transform(
    img_size: int = 224,
    color_jitter: float = 0.4,
    aa_policy: str = 'rand-m9-mstd0.5',  # RandAugment
    reprob: float = 0.25,  # Random erasing probability
    remode: str = 'pixel',  # Erasing fill mode
):
    """
    Standard ImageNet training transform.

    Components:
    1. Random Resized Crop (scale 0.08-1.0, ratio 3/4-4/3)
    2. Horizontal Flip (50%)
    3. RandAugment (N=2, M=9) or TrivialAugment
    4. Random Erasing (25%, pixel fill)
    5. Normalization (ImageNet stats)

    Mixup/CutMix applied at batch level during training.
    """
    # Primary transforms
    primary_tfms = [
        transforms.RandomResizedCrop(
            img_size,
            scale=(0.08, 1.0),
            ratio=(3/4, 4/3),
            interpolation=transforms.InterpolationMode.BICUBIC,
        ),
        transforms.RandomHorizontalFlip(0.5),
    ]

    # Auto augment
    aa_transform = rand_augment_transform(
        config_str=aa_policy,
        hparams={'translate_const': int(img_size * 0.45)}
    )

    # Color jitter (applied independently of RandAugment)
    color_tfm = transforms.ColorJitter(
        brightness=color_jitter,
        contrast=color_jitter,
        saturation=color_jitter,
    )

    # Final transforms
    final_tfms = [
        transforms.ToTensor(),
        transforms.Normalize(
            mean=[0.485, 0.456, 0.406],
            std=[0.229, 0.224, 0.225]
        ),
    ]

    # Random erasing (applied after normalization)
    if reprob > 0:
        final_tfms.append(RandomErasing(
            probability=reprob,
            mode=remode,
            device='cpu'
        ))

    return transforms.Compose(
        primary_tfms + [aa_transform, color_tfm] + final_tfms
    )


def create_imagenet_val_transform(img_size: int = 224, crop_pct: float = 0.875):
    """
    Standard ImageNet validation transform.

    Resize to img_size/crop_pct, then center crop to img_size.
    No augmentation.
    """
    resize_size = int(img_size / crop_pct)
    return transforms.Compose([
        transforms.Resize(
            resize_size,
            interpolation=transforms.InterpolationMode.BICUBIC
        ),
        transforms.CenterCrop(img_size),
        transforms.ToTensor(),
        transforms.Normalize(
            mean=[0.485, 0.456, 0.406],
            std=[0.229, 0.224, 0.225]
        ),
    ])


def create_mixup_cutmix(
    mixup_alpha: float = 0.8,
    cutmix_alpha: float = 1.0,
    prob: float = 1.0,
    switch_prob: float = 0.5,
    num_classes: int = 1000
):
    """
    Create Mixup/CutMix batch transform.
    Applied at training loop level, not in data transform.
    """
    return Mixup(
        mixup_alpha=mixup_alpha,
        cutmix_alpha=cutmix_alpha,
        prob=prob,
        switch_prob=switch_prob,
        mode='batch',
        label_smoothing=0.1,
        num_classes=num_classes,
    )
```
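The Mixup object returns soft targets, so the training loop needs a soft-target loss. A minimal sketch of that wiring, with `train_loader`, `model`, and `optimizer` assumed to exist:

```python
from timm.loss import SoftTargetCrossEntropy

# mixup_fn comes from create_mixup_cutmix() above.
mixup_fn = create_mixup_cutmix(num_classes=1000)
criterion = SoftTargetCrossEntropy()  # expects the soft targets Mixup produces

for images, targets in train_loader:
    images, targets = images.cuda(), targets.cuda()
    images, targets = mixup_fn(images, targets)  # mix inputs, soften labels
    loss = criterion(model(images), targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```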
Object detection requires coordinated transformation of images and bounding boxes. Modern detectors use aggressive augmentation.
```python
import albumentations as A
from albumentations.pytorch import ToTensorV2
import numpy as np


def create_detection_train_transform(
    img_size: int = 640,
    mosaic_prob: float = 1.0,
    mixup_prob: float = 0.15,
):
    """
    Modern object detection training transform.

    Components:
    1. Mosaic augmentation (4 images combined) - applied externally
    2. Random perspective/affine
    3. HSV color augmentation
    4. Horizontal flip
    5. Scale jitter

    Note: Mosaic and Mixup are applied at dataset level, not here.
    This transform handles single-image augmentations.
    """
    return A.Compose([
        # Geometric
        A.LongestMaxSize(max_size=img_size),
        A.PadIfNeeded(
            min_height=img_size,
            min_width=img_size,
            border_mode=0,
            value=(114, 114, 114)
        ),
        A.HorizontalFlip(p=0.5),
        A.ShiftScaleRotate(
            shift_limit=0.1,
            scale_limit=0.2,
            rotate_limit=10,
            border_mode=0,
            p=0.5
        ),
        A.Perspective(scale=(0.01, 0.05), p=0.3),

        # Photometric
        A.OneOf([
            A.HueSaturationValue(
                hue_shift_limit=20,
                sat_shift_limit=30,
                val_shift_limit=20,
                p=1.0
            ),
            A.RandomBrightnessContrast(
                brightness_limit=0.2,
                contrast_limit=0.2,
                p=1.0
            ),
        ], p=0.8),
        A.GaussNoise(var_limit=(10, 50), p=0.2),
        A.GaussianBlur(blur_limit=(3, 7), p=0.2),

        # Final
        A.Normalize(
            mean=[0.485, 0.456, 0.406],
            std=[0.229, 0.224, 0.225]
        ),
        ToTensorV2(),
    ], bbox_params=A.BboxParams(
        format='yolo',
        min_visibility=0.3,
        label_fields=['labels']
    ))


class MosaicDataset:
    """
    Dataset wrapper that applies Mosaic augmentation.

    Combines 4 random images into one, simulating crowded
    scenes and varied object scales.
    """

    def __init__(
        self,
        dataset,
        img_size: int = 640,
        mosaic_prob: float = 1.0
    ):
        self.dataset = dataset
        self.img_size = img_size
        self.mosaic_prob = mosaic_prob

    def __len__(self):
        return len(self.dataset)

    def __getitem__(self, idx):
        if np.random.random() > self.mosaic_prob:
            return self.dataset[idx]

        # Sample 3 additional random images
        indices = [idx] + np.random.choice(
            len(self.dataset), 3, replace=False
        ).tolist()

        # Load images and annotations
        images = []
        all_boxes = []
        all_labels = []
        for i in indices:
            img, boxes, labels = self.dataset.load_raw(i)
            images.append(img)
            all_boxes.append(boxes)
            all_labels.append(labels)

        # Combine into mosaic
        return self._create_mosaic(images, all_boxes, all_labels)

    def _create_mosaic(self, images, boxes_list, labels_list):
        """Create 4-image mosaic. See earlier implementation."""
        pass  # Implementation as shown in page-1
```
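The `_create_mosaic` stub above defers to an earlier page. For reference, a simplified version might look like the sketch below: fixed quadrant placement with absolute-pixel boxes, unlike the random-center variant used by YOLO-style trainers (OpenCV's `cv2.resize` is assumed for resizing):

```python
import cv2
import numpy as np


def create_mosaic_simple(images, boxes_list, labels_list, img_size=640):
    """Place 4 images, each resized to img_size, into the quadrants of a
    2*img_size canvas, and shift their (x1, y1, x2, y2) pixel boxes."""
    canvas = np.full((2 * img_size, 2 * img_size, 3), 114, dtype=np.uint8)
    offsets = [(0, 0), (img_size, 0), (0, img_size), (img_size, img_size)]
    out_boxes, out_labels = [], []

    for (ox, oy), img, boxes, labels in zip(
            offsets, images, boxes_list, labels_list):
        h, w = img.shape[:2]
        canvas[oy:oy + img_size, ox:ox + img_size] = cv2.resize(
            img, (img_size, img_size))
        sx, sy = img_size / w, img_size / h  # box rescale factors
        for (x1, y1, x2, y2), label in zip(boxes, labels):
            out_boxes.append([x1 * sx + ox, y1 * sy + oy,
                              x2 * sx + ox, y2 * sy + oy])
            out_labels.append(label)

    return canvas, np.array(out_boxes), np.array(out_labels)
```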
Segmentation benefits from strong geometric augmentation but must transform masks identically.
```python
import albumentations as A
from albumentations.pytorch import ToTensorV2


def create_segmentation_train_transform(
    img_size: int = 512,
    scale_range: tuple = (0.5, 2.0),
):
    """
    Semantic segmentation training transform.

    Key principles:
    1. Same geometric transforms to image and mask
    2. Nearest-neighbor interpolation for mask (preserve label values)
    3. Strong scale augmentation for multi-scale objects
    4. Moderate photometric augmentation
    """
    return A.Compose([
        # Random scale + crop (most important for segmentation)
        A.RandomScale(
            scale_limit=(scale_range[0] - 1, scale_range[1] - 1), p=1.0
        ),
        A.RandomCrop(height=img_size, width=img_size, p=1.0),

        # Geometric augmentation
        A.HorizontalFlip(p=0.5),
        A.VerticalFlip(p=0.1),  # Only if domain-appropriate
        A.RandomRotate90(p=0.5),
        A.ShiftScaleRotate(
            shift_limit=0.0625,
            scale_limit=0.1,
            rotate_limit=15,
            p=0.5
        ),

        # Photometric (image only, mask unchanged)
        A.OneOf([
            A.GaussNoise(var_limit=(10, 50)),
            A.GaussianBlur(blur_limit=3),
            A.MotionBlur(blur_limit=3),
        ], p=0.3),
        A.ColorJitter(
            brightness=0.2,
            contrast=0.2,
            saturation=0.2,
            hue=0.1,
            p=0.5
        ),

        A.Normalize(
            mean=[0.485, 0.456, 0.406],
            std=[0.229, 0.224, 0.225]
        ),
        ToTensorV2(),
    ])


def create_segmentation_val_transform(img_size: int = 512):
    """
    Minimal validation transform for segmentation.

    No augmentation, just resize and normalize.
    For evaluation, often use sliding window or full image.
    """
    return A.Compose([
        A.Resize(img_size, img_size),
        A.Normalize(
            mean=[0.485, 0.456, 0.406],
            std=[0.229, 0.224, 0.225]
        ),
        ToTensorV2(),
    ])
```

| Task | Geometric | Photometric | Mixing | Erasing | TTA |
|---|---|---|---|---|---|
| Classification | RRC + Flip | Strong (RandAug) | Mixup + CutMix | RE 25% | Flip + MultiCrop |
| Detection | Mosaic + Flip + Affine | Moderate HSV | Mixup 15% | None | Flip + MultiScale |
| Segmentation | Scale + Flip + Rotate | Moderate | Copy-Paste | None | Flip + MultiScale |
| Self-Supervised | Strong crop + Flip | Very strong | None | Optional | Not applicable |
| Fine-tuning | Light crop + Flip | Light | Light Mixup | None | Flip |
Optimal augmentation intensity depends critically on dataset size. Too weak augmentation on small datasets leads to overfitting; too strong on large datasets wastes compute and may hurt convergence.
- Very Small Data (<1K samples): maximum augmentation plus heavy regularization
- Small Data (1K-10K samples): strong augmentation
- Medium Data (10K-100K samples): moderate augmentation
- Large Data (100K-1M samples): light augmentation
- Very Large Data (>1M samples): minimal augmentation

The helper below encodes these tiers as concrete defaults:
```python
from dataclasses import dataclass


@dataclass
class AugmentationConfig:
    """Complete augmentation configuration."""
    # Geometric
    crop_scale_min: float = 0.08
    flip_prob: float = 0.5
    rotation_range: int = 0

    # RandAugment
    randaug_n: int = 2
    randaug_m: int = 9

    # Mixing
    mixup_alpha: float = 0.0
    cutmix_alpha: float = 0.0
    mix_prob: float = 0.0

    # Erasing
    erase_prob: float = 0.0

    # Regularization
    label_smoothing: float = 0.0
    drop_path: float = 0.0
    weight_decay: float = 0.0001


def get_augmentation_config(
    dataset_size: int,
    model_size: str = 'medium',  # 'small', 'medium', 'large'
    task: str = 'classification'
) -> AugmentationConfig:
    """
    Get recommended augmentation config based on dataset and model size.
    Heuristics derived from empirical studies on various scales.
    """
    # Base configurations by dataset size
    if dataset_size < 1000:
        # Very small: maximum augmentation
        config = AugmentationConfig(
            crop_scale_min=0.05, rotation_range=30,
            randaug_n=3, randaug_m=20,
            mixup_alpha=1.0, cutmix_alpha=1.0, mix_prob=1.0,
            erase_prob=0.4, label_smoothing=0.2,
            drop_path=0.3, weight_decay=0.1,
        )
    elif dataset_size < 10000:
        # Small: strong augmentation
        config = AugmentationConfig(
            crop_scale_min=0.08, rotation_range=15,
            randaug_n=2, randaug_m=15,
            mixup_alpha=0.8, cutmix_alpha=1.0, mix_prob=0.8,
            erase_prob=0.35, label_smoothing=0.1,
            drop_path=0.2, weight_decay=0.05,
        )
    elif dataset_size < 100000:
        # Medium: moderate augmentation
        config = AugmentationConfig(
            crop_scale_min=0.08, rotation_range=10,
            randaug_n=2, randaug_m=10,
            mixup_alpha=0.4, cutmix_alpha=1.0, mix_prob=0.5,
            erase_prob=0.25, label_smoothing=0.1,
            drop_path=0.1, weight_decay=0.01,
        )
    elif dataset_size < 1000000:
        # Large: light augmentation
        config = AugmentationConfig(
            crop_scale_min=0.08, rotation_range=0,
            randaug_n=2, randaug_m=9,
            mixup_alpha=0.2, cutmix_alpha=0.5, mix_prob=0.3,
            erase_prob=0.1, label_smoothing=0.1,
            drop_path=0.05, weight_decay=0.0001,
        )
    else:
        # Very large: minimal augmentation
        config = AugmentationConfig(
            crop_scale_min=0.08, rotation_range=0,
            randaug_n=2, randaug_m=7,
            mixup_alpha=0.0, cutmix_alpha=0.0, mix_prob=0.0,
            erase_prob=0.0, label_smoothing=0.0,
            drop_path=0.0, weight_decay=0.0001,
        )

    # Adjust for model size
    if model_size == 'large':
        # Larger models need more regularization
        config.randaug_m = min(30, config.randaug_m + 5)
        config.drop_path = min(0.5, config.drop_path + 0.1)
        config.weight_decay *= 2
    elif model_size == 'small':
        # Smaller models need less regularization
        config.randaug_m = max(5, config.randaug_m - 3)
        config.drop_path = max(0, config.drop_path - 0.05)

    return config
```

A useful heuristic: if your model can memorize the training set (train accuracy = 100%, val accuracy much lower), you need more augmentation. If training accuracy plateaus well below 100%, you may have too much augmentation or need more capacity.
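For example, a quick (hypothetical) call for a 5,000-image dataset and a large backbone:

```python
config = get_augmentation_config(dataset_size=5000, model_size='large')
# Small-data tier (randaug_m=15, mixup_alpha=0.8, drop_path=0.2), then
# bumped for the large model: randaug_m=20, drop_path≈0.3, weight_decay=0.1.
print(config)
```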
Different domains have unique characteristics requiring specialized augmentation approaches. Understanding domain constraints is essential for effective augmentation.
Medical imaging is a good example: diagnostic information must be preserved, and the valid orientations and intensity ranges depend on the modality.
```python
import albumentations as A
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates


class StainNormalization(A.ImageOnlyTransform):
    """
    Color normalization using Macenko method for histopathology.

    Normalizes stain colors to a reference image, reducing
    inter-scanner and inter-lab variability.
    """

    def __init__(self, reference_img=None, always_apply=False, p=0.5):
        super().__init__(always_apply, p)
        self.reference_img = reference_img
        # Pre-compute reference stain matrix if provided
        if reference_img is not None:
            self.ref_stain_matrix = self._compute_stain_matrix(reference_img)

    def _compute_stain_matrix(self, img):
        """Compute stain matrix using SVD."""
        # Simplified implementation - full version uses Macenko method
        pass

    def apply(self, img, **params):
        """Normalize stains to reference."""
        # Implementation would apply color deconvolution and renormalization
        return img


def create_medical_train_transform(
    img_size: int = 512,
    modality: str = 'histopathology',  # 'histopathology', 'xray', 'ct', 'mri'
):
    """
    Medical imaging augmentation strategy.

    Carefully designed to preserve diagnostic information
    while adding realistic variability.
    """
    if modality == 'histopathology':
        # Histopathology: all orientations valid, color important
        return A.Compose([
            # Geometric - any orientation
            A.RandomResizedCrop(img_size, img_size, scale=(0.5, 1.0)),
            A.HorizontalFlip(p=0.5),
            A.VerticalFlip(p=0.5),
            A.RandomRotate90(p=0.5),
            A.Rotate(limit=180, border_mode=0, p=0.5),

            # Elastic deformation - tissue variability
            A.ElasticTransform(
                alpha=120, sigma=120 * 0.05, alpha_affine=120 * 0.03, p=0.3
            ),

            # Color - stain variation
            A.HueSaturationValue(
                hue_shift_limit=10, sat_shift_limit=15,
                val_shift_limit=10, p=0.5
            ),
            A.RandomBrightnessContrast(0.1, 0.1, p=0.5),
            A.Normalize(),
        ])

    elif modality == 'xray':
        # X-ray: orientation matters, grayscale
        return A.Compose([
            # Geometric - limited rotation
            A.RandomResizedCrop(img_size, img_size, scale=(0.8, 1.0)),
            A.HorizontalFlip(p=0.5),  # Only if L/R symmetric
            A.Rotate(limit=10, border_mode=0, p=0.3),
            A.ShiftScaleRotate(
                shift_limit=0.05, scale_limit=0.1, rotate_limit=0, p=0.3
            ),

            # Intensity - exposure variation
            A.RandomBrightnessContrast(0.1, 0.15, p=0.5),
            A.CLAHE(clip_limit=2.0, p=0.3),
            A.GaussNoise(var_limit=(5, 20), p=0.2),
            A.Normalize(),
        ])

    elif modality in ['ct', 'mri']:
        # CT/MRI: volumetric (2D slice), intensity is calibrated
        return A.Compose([
            A.RandomResizedCrop(img_size, img_size, scale=(0.8, 1.0)),
            A.HorizontalFlip(p=0.5),
            A.Rotate(limit=15, border_mode=0, p=0.3),
            A.ElasticTransform(alpha=50, sigma=50 * 0.05, p=0.2),

            # Be careful with intensity for windowed images
            A.RandomBrightnessContrast(0.05, 0.05, p=0.3),
            A.GaussNoise(var_limit=(3, 10), p=0.2),
            A.Normalize(),
        ])
```
| Domain | Flip | Rotation | Color | Special |
|---|---|---|---|---|
| Natural Images | H: yes, V: no | Small (±15°) | Strong | RandAugment |
| Medical/Histopath | Both | Any (90°) | Conservative | Stain normalization |
| Medical/X-ray | H only (if symmetric) | Small (±10°) | Intensity only | CLAHE |
| Satellite | Both + 90° | Any | Band-wise | Weather simulation |
| Documents/OCR | None | Very small (±3°) | BG variation | Elastic, perspective |
| Autonomous Driving | H (with labels) | None | Strong | Weather, lighting simulation |
| Face Recognition | H: yes | Small | Moderate | 3D face transform |
Augmentation can cause subtle problems that manifest as poor performance. Learning to diagnose these issues is crucial for effective tuning.
- Symptom: training accuracy never reaches expected levels. Likely cause: augmentation is too strong and the model is underfitting; reduce intensity or drop the harshest transforms.
- Symptom: validation accuracy much lower than training. Likely cause: the model is overfitting; increase augmentation strength or add mixing-based regularization.
- Symptom: model performs poorly on specific input types. Likely cause: the augmentation distribution doesn't cover (or actively distorts) those inputs; check that transforms match deployment conditions.
- Symptom: predictions are poorly calibrated. Likely cause: label-altering augmentations (Mixup, label smoothing) reshape the confidence distribution; revisit their strengths or recalibrate post-hoc (e.g., temperature scaling).
```python
import torch
import numpy as np
import matplotlib.pyplot as plt
from typing import List, Dict


class AugmentationDiagnostics:
    """Tools for diagnosing augmentation effectiveness."""

    def __init__(self, model, train_loader, val_loader, device='cuda'):
        self.model = model
        self.train_loader = train_loader
        self.val_loader = val_loader
        self.device = device

    def _accuracy(self, loader) -> float:
        """Top-1 accuracy of the model over a loader."""
        self.model.eval()
        correct, total = 0, 0
        with torch.no_grad():
            for images, labels in loader:
                images = images.to(self.device)
                labels = labels.to(self.device)
                _, predicted = self.model(images).max(1)
                correct += predicted.eq(labels).sum().item()
                total += labels.size(0)
        return correct / total

    def compute_augmentation_sensitivity(
        self,
        sample_images: torch.Tensor,
        transforms: List,
        n_trials: int = 10
    ) -> Dict:
        """
        Measure how sensitive model predictions are to augmentations.

        High sensitivity may indicate:
        - Model hasn't learned invariances
        - Augmentations are too aggressive
        - Need for test-time augmentation
        """
        self.model.eval()
        sensitivities = []

        with torch.no_grad():
            for img in sample_images:
                predictions = []
                for _ in range(n_trials):
                    # Apply the random augmentations in sequence
                    aug_img = img.unsqueeze(0)
                    for transform in transforms:
                        aug_img = transform(aug_img)
                    pred = self.model(aug_img.to(self.device))
                    predictions.append(pred.cpu())

                # Measure prediction variance across trials
                preds = torch.stack(predictions).squeeze()
                sensitivities.append(preds.std(dim=0).mean().item())

        return {
            'mean_sensitivity': np.mean(sensitivities),
            'std_sensitivity': np.std(sensitivities),
            'per_sample': sensitivities,
        }

    def detect_underfitting(self, target_train_acc: float = 0.95) -> Dict:
        """Check if augmentation is preventing the model from
        fitting the training data."""
        train_acc = self._accuracy(self.train_loader)
        return {
            'train_accuracy': train_acc,
            'is_underfitting': train_acc < target_train_acc,
            'recommendation': (
                'Reduce augmentation strength'
                if train_acc < target_train_acc
                else 'Training accuracy acceptable'
            )
        }

    def detect_overfitting(self, max_gap: float = 0.05) -> Dict:
        """Check the train-validation accuracy gap for overfitting.
        max_gap is the largest acceptable train-val gap."""
        train_acc = self._accuracy(self.train_loader)
        val_acc = self._accuracy(self.val_loader)
        gap = train_acc - val_acc
        return {
            'train_accuracy': train_acc,
            'val_accuracy': val_acc,
            'gap': gap,
            'is_overfitting': gap > max_gap,
            'recommendation': (
                'Increase augmentation strength'
                if gap > max_gap
                else 'Regularization appears adequate'
            )
        }

    def visualize_augmented_samples(
        self,
        images: torch.Tensor,
        transform,
        n_samples: int = 8
    ):
        """
        Visualize augmented samples for sanity checking.

        Helps identify if augmentations are too aggressive
        or semantically inappropriate.
        """
        fig, axes = plt.subplots(2, n_samples, figsize=(n_samples * 2, 4))

        for i in range(min(n_samples, len(images))):
            # Original
            orig = images[i].permute(1, 2, 0).cpu().numpy()
            orig = (orig - orig.min()) / (orig.max() - orig.min())
            axes[0, i].imshow(orig)
            axes[0, i].set_title('Original')
            axes[0, i].axis('off')

            # Augmented
            aug = transform(images[i].unsqueeze(0)).squeeze()
            aug = aug.permute(1, 2, 0).cpu().numpy()
            aug = (aug - aug.min()) / (aug.max() - aug.min())
            axes[1, i].imshow(aug)
            axes[1, i].set_title('Augmented')
            axes[1, i].axis('off')

        plt.tight_layout()
        return fig
```

Before training, always visualize augmented samples. This catches common issues: color normalization applied before augmentations (wrong range), transforms that destroy semantic content, or probability settings that rarely apply augmentations.
Augmentation adds computational overhead, and efficient implementation is essential for large-scale training. The two main placements are CPU augmentation (the traditional approach: per-sample transforms in DataLoader workers, which can bottleneck fast accelerators) and GPU augmentation (the modern approach: batched transforms on the device, as with Kornia below). The code also shows a hybrid split and a caching strategy for small datasets.
```python
import torch
import torch.nn as nn
import kornia.augmentation as K
from torchvision.transforms import v2 as T


class GPUAugmentation(nn.Module):
    """
    GPU-accelerated augmentation pipeline using Kornia.

    Applies augmentations on GPU as part of the forward pass,
    benefiting from batch parallelism and GPU acceleration.
    """

    def __init__(self, img_size: int = 224):
        super().__init__()
        # Kornia augmentations run on GPU
        self.augment = nn.Sequential(
            K.RandomResizedCrop(
                size=(img_size, img_size),
                scale=(0.08, 1.0),
                ratio=(3/4, 4/3),
            ),
            K.RandomHorizontalFlip(p=0.5),
            K.ColorJitter(
                brightness=0.4, contrast=0.4,
                saturation=0.4, hue=0.1, p=0.8
            ),
            K.RandomGrayscale(p=0.2),
            K.RandomGaussianBlur(
                kernel_size=(3, 3), sigma=(0.1, 2.0), p=0.1
            ),
            K.Normalize(
                mean=torch.tensor([0.485, 0.456, 0.406]),
                std=torch.tensor([0.229, 0.224, 0.225])
            ),
        )

    @torch.no_grad()
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        """Apply augmentations on GPU."""
        return self.augment(x)


class HybridAugmentation:
    """
    Hybrid CPU+GPU augmentation strategy.

    - Expensive operations (crop, resize) on CPU during loading
    - Fast operations (flip, color) on GPU in batch

    Optimizes for overall throughput.
    """

    def __init__(self, img_size: int = 224):
        # CPU: geometry changes (expensive)
        self.cpu_transform = T.Compose([
            T.RandomResizedCrop(img_size, scale=(0.08, 1.0), antialias=True),
            T.ToImage(),
            T.ToDtype(torch.float32, scale=True),
        ])

        # GPU: fast batch operations
        self.gpu_transform = nn.Sequential(
            K.RandomHorizontalFlip(p=0.5),
            K.ColorJitter(0.4, 0.4, 0.4, 0.1, p=0.8),
            K.Normalize(
                mean=torch.tensor([0.485, 0.456, 0.406]),
                std=torch.tensor([0.229, 0.224, 0.225])
            ),
        )

    def apply_cpu(self, img):
        """Called in DataLoader worker."""
        return self.cpu_transform(img)

    @torch.no_grad()
    def apply_gpu(self, batch: torch.Tensor) -> torch.Tensor:
        """Called after batch is on GPU."""
        return self.gpu_transform(batch)


class CachedAugmentation:
    """
    Cache augmented versions for small datasets.

    Pre-generates N augmented versions of each training sample,
    trading memory for compute during training.
    """

    def __init__(self, dataset, transform, n_cached_versions: int = 10):
        self.dataset = dataset
        self.n_cached = n_cached_versions
        self.cache = {}

        print(f"Pre-generating {n_cached_versions} augmented versions...")
        for idx in range(len(dataset)):
            img, label = dataset[idx]
            self.cache[idx] = [
                (transform(img), label)
                for _ in range(n_cached_versions)
            ]
        print("Caching complete.")

    def __len__(self):
        return len(self.dataset) * self.n_cached

    def __getitem__(self, idx):
        sample_idx = idx // self.n_cached
        version_idx = idx % self.n_cached
        return self.cache[sample_idx][version_idx]
```
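How the hybrid pipeline above might be wired into a training loop; the dataset, batch size, and worker count are placeholders, and we assume the dataset applies its `transform` attribute per sample:

```python
hybrid = HybridAugmentation(img_size=224)
dataset.transform = hybrid.apply_cpu  # per-sample, in DataLoader workers
loader = torch.utils.data.DataLoader(
    dataset, batch_size=256, num_workers=8, pin_memory=True
)

for batch, labels in loader:
    # per-batch flip/color/normalize on the GPU
    batch = hybrid.apply_gpu(batch.cuda(non_blocking=True))
    ...
```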
The augmentation landscape continues to evolve. Several recent techniques show promise for specific applications.

Copy-Paste (Ghiasi et al., 2021) cuts objects from one image and pastes them onto another, with full instance mask handling:
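A minimal sketch of the core compositing step, assuming same-sized uint8 images and per-instance boolean masks (the helper name and simplifications are ours; the paper's version also jitters scale and blends mask edges):

```python
import numpy as np


def simple_copy_paste(src_img, src_masks, dst_img):
    """Paste each instance from src_img onto dst_img using binary masks.

    src_img, dst_img: HxWx3 uint8 arrays of equal size.
    src_masks: iterable of HxW boolean arrays, one per pasted instance.
    A real pipeline would also carry over the pasted boxes/labels and
    remove occluded regions from the destination's own instance masks.
    """
    out = dst_img.copy()
    for mask in src_masks:
        out[mask] = src_img[mask]
    return out
```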
Particularly effective for instance segmentation and detection, especially with rare objects.
GridMask removes grid-patterned regions, forcing the model to recognize objects from fragments. Modern implementations schedule GridMask intensity, as sketched below:
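A simplified sketch of both pieces; the published method also randomizes grid offsets and rotations, and the function names and ramp target here are illustrative:

```python
import numpy as np


def grid_mask(img, d: int = 32, ratio: float = 0.5):
    """Zero out a regular grid of square patches.

    img: HxWxC array; d: grid period in pixels; ratio: fraction of
    each period that is masked along each axis.
    """
    h, w = img.shape[:2]
    mask = np.ones((h, w), dtype=img.dtype)
    k = max(1, int(d * ratio))  # masked length per period
    for y in range(0, h, d):
        for x in range(0, w, d):
            mask[y:y + k, x:x + k] = 0
    return img * mask[..., None]


def grid_mask_prob(epoch: int, total_epochs: int, max_prob: float = 0.7):
    """Linearly ramp the probability of applying GridMask over training."""
    return max_prob * min(1.0, epoch / total_epochs)
```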
AugMax (Wang et al., 2021) selects augmentations adversarially to maximize loss:
$$T^* = \arg\max_{T \in \mathcal{T}} L(f_\theta(T(x)), y)$$
The model trains on worst-case augmentations, improving robustness. Computationally expensive but effective for safety-critical applications.
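A greedy approximation of this selection is sketched below: evaluate a few candidate transforms and keep the most damaging one. AugMax proper instead optimizes adversarial mixing weights by gradient ascent, so this only captures the worst-case idea:

```python
import torch


def worst_case_augment(model, criterion, x, y, candidate_transforms, k=4):
    """Return the augmented batch (among k candidates) with highest loss."""
    model.eval()
    worst_x, worst_loss = x, float('-inf')
    with torch.no_grad():
        for t in candidate_transforms[:k]:
            xt = t(x)
            loss = criterion(model(xt), y).item()
            if loss > worst_loss:
                worst_x, worst_loss = xt, loss
    model.train()
    return worst_x  # train on this batch instead of x
```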
Using generative models (GANs, diffusion) to synthesize training data:
Benefits: effectively unlimited samples, controllable coverage of rare classes and conditions, and a privacy-friendly alternative to sensitive real data.
Challenges: distribution shift between synthetic and real images, generation cost, and the risk of amplifying the generator's own biases.
Current practice: synthetic samples are mixed with real data rather than replacing it, with the largest gains typically on rare categories and long-tail classes.
With vision-language models, augmentations can be guided by text:
"Make this street scene rainy" "Add pedestrians to this parking lot" "Change season from summer to winter"
Stable Diffusion and similar models enable highly targeted augmentations that were previously impossible.
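A sketch of such a text-guided augmentation with the diffusers library; the checkpoint name, file paths, prompt, and strength are illustrative, and a low strength is used so the scene layout (and hence any labels) stays approximately valid:

```python
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

init = Image.open("street_scene.jpg").convert("RGB").resize((512, 512))
# Low strength keeps geometry close to the original image.
rainy = pipe(
    prompt="the same street scene in heavy rain, wet asphalt",
    image=init, strength=0.35, guidance_scale=7.5
).images[0]
rainy.save("street_scene_rainy.jpg")
```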
The trend is toward learned, adaptive, and generative augmentation. Future systems will likely discover domain-appropriate augmentations automatically, adapt augmentation intensity dynamically as training progresses, generate synthetic data when real data is insufficient, and validate augmentation effectiveness inside the training loop.
We've synthesized the complete augmentation landscape into actionable strategies, providing a comprehensive framework for augmentation decisions in any deep learning project.
Module Complete:
You've now mastered data augmentation in deep learning: individual transformations, mixing strategies, learned policies, test-time enhancement, and comprehensive strategy design.
Data augmentation remains one of the most impactful and cost-effective techniques for improving deep learning performance. The strategies you've learned will serve you across diverse applications and continue to evolve as the field advances.
Congratulations! You have completed the Data Augmentation module. You now possess comprehensive knowledge of augmentation theory, techniques, and strategies—essential tools for training robust, generalizable deep learning models.