Post

Fixing Unstable Y-axis Range in Matplotlib Multi-scale Data Visualization

Stabilizing the y-axis range when comparing multi-scale detection results in Matplotlib bar charts.

Fixing Unstable Y-axis Range in Matplotlib Multi-scale Data Visualization

Project Structure

1
2
3
4
5
6
7
8
9
10
11
32shot_fewshot/
├── 32shot_fewshot_origin_circle_scale_3.0/
│   ├── output/
│   │   ├── detection_results_2024_11_06_08_05_57.csv
│   │   ├── detection_results_2024_11_07_06_49_11.csv
│   │   └── ...
│   ├── train/
│   ├── val/
│   └── model_checkpoints/
├── 32shot_fewshot_origin_circle_scale_3.1/
└── ...

The Challenge

When dealing with multiple scale experiments:

  1. Each scale folder contains multiple CSV files with detection results
  2. Need to compare performance across different scales
  3. Visualization issues with y-axis stability
  4. Data type inconsistencies affecting plot reliability

Solution Implementation

1. Data Loading and Processing

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
import os
import glob
import re

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt


def load_scale_results(base_path):
    """
    Load first detection results from each scale folder
    """
    scale_results = []
    # glob order is not deterministic, so sort for reproducibility
    scale_folders = sorted(glob.glob(os.path.join(base_path, "*scale*")))
    
    for folder in scale_folders:
        match = re.search(r'scale_(\d+\.\d+)', folder)
        if match is None:
            continue
        scale = float(match.group(1))
        csv_files = sorted(glob.glob(os.path.join(folder, "output", "*.csv")))
        
        if csv_files:
            df = pd.read_csv(csv_files[0])
            scale_results.append({
                'scale': scale,
                'data': df
            })
    
    return sorted(scale_results, key=lambda x: x['scale'])

2. Visualization with Stable Y-axis

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
def plot_scale_comparison(scale_results, methods):
    plt.figure(figsize=(15, 8))
    
    # Set a fixed, explicit y-axis range (works before or after plotting)
    plt.ylim(0, 100)
    
    colors = plt.cm.rainbow(np.linspace(0, 1, len(scale_results)))
    x = np.arange(len(methods))
    width = 0.8 / len(scale_results)
    
    for idx, data in enumerate(scale_results):
        scale = data['scale']
        # Ensure numeric conversion; non-numeric cells become NaN instead of crashing
        accuracies = pd.to_numeric(data['data']['acc'][1:9], errors="coerce").fillna(0)
        
        x_pos = x - (0.4 - width/2) + (idx * width)
        plt.bar(x_pos, accuracies, width, 
               label=f'Scale {scale}', 
               color=colors[idx], 
               alpha=0.7)
    
    plt.xlabel('Methods', fontsize=12)
    plt.ylabel('Accuracy (%)', fontsize=12)
    plt.title('Model Performance Across Different Scales')
    plt.xticks(x, methods, rotation=45)
    plt.legend(bbox_to_anchor=(1.05, 1))
    plt.grid(axis='y', linestyle='--', alpha=0.7)
    plt.tight_layout()

Key Learnings

  1. Data Type Consistency
    • Always convert string data to numeric using pd.to_numeric()
    • Handle missing values appropriately
  2. Plot Stability
    • Set a fixed, explicit y-axis range that is identical across charts
    • Use consistent color schemes
    • Proper layout management
  3. Best Practices
    • Use a fixed, explicit y-axis range (it works whether plt.ylim() is called before or after plotting); the key is that the range is explicit and identical across every chart
    • Ensure data types are numeric before visualization
    • Apply tight_layout() for better spacing

A note on the object-oriented API

The pyplot calls above are fine, but for anything beyond a quick script the object-oriented Matplotlib API is the cleaner general style. Create the figure and axes explicitly, then operate on the ax object — the fixed y-axis range stays just as explicit:

1
2
3
4
5
6
fig, ax = plt.subplots(figsize=(15, 8))
ax.set_ylim(0, 100)  # fixed, explicit range — identical across charts

ax.bar(x_pos, accuracies, width, label=f'Scale {scale}', alpha=0.7)
ax.set_xlabel('Methods', fontsize=12)
ax.set_ylabel('Accuracy (%)', fontsize=12)
This post is licensed under CC BY 4.0 by the author.