# Two-Step ICC Printer Profiling — Technique Explained

This document explains the two-pass (two-step) ICC printer profiling technique
implemented in [wsprofiler](https://github.com/crenedecotret/wsprofiler). It is
written for developers who want to port the approach to other software.

> The core insight: instead of generating the second chart with Argyll's
> `targen -c precond.icc` (which merely densifies sampling near the
> intermediate profile's gamut surface), **wsprofiler uses a greedy,
> coverage-maximising patch selection algorithm** that scores ~10 k candidate
> RGB values on four independent criteria and picks the best ones.

---

## 1. High-Level Flow

```
  ┌──────────────────────────────────────────────────────────────────┐
  │                    TWO-STEP WIZARD FLOW                          │
  ├──────────────────────────────────────────────────────────────────┤
  │                                                                  │
  │  STEP 1: Generate & measure a full-coverage chart                │
  │    targen → printtarg → [print & measure] → chart1.ti3           │
  │                                                                  │
  │  STEP 2: Build intermediate profile                              │
  │    colprof chart1.ti3 → precond.icc                              │
  │                                                                  │
  │  STEP 3: Generate refinement chart (THE CORE ALGORITHM)          │
  │    predict ~10k candidates via xicclu → score → greedily select  │
  │    → write chart2.ti1 → printtarg → [print & measure] → chart2   │
  │                                                                  │
  │  STEP 4: Merge & build final profile                             │
  │    combine chart1.ti3 + chart2.ti3 → combined.ti3                │
  │    colprof combined.ti3 → final.icc                              │
  │                                                                  │
  └──────────────────────────────────────────────────────────────────┘
```

### ArgyllCMS binaries used

| Binary | Role |
|--------|------|
| `targen` | Generate first chart definition (`.ti1`) |
| `printtarg` | Lay out the `.ti1` as a printable chart image plus a `.ti2` layout file (page layout, registration marks) |
| `chartread` | Drive spectrophotometer, produce `.ti3` measurement file |
| `colprof` | Build ICC profile from `.ti3` measurements |
| `xicclu` | Forward-transform device RGB → Lab through a temporary ICC profile |

---

## 2. Step 1 — First Chart (Full Coverage)

A standard Argyll `targen` invocation creates a full-coverage RGB chart:

```
targen -v -d2 -G -f<total> -e<white> -B<black> -g<gray> <name>
```

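- `-d2`: Print RGB device (colorant combination 2)
- `-G`: generate good optimised points rather than fast ones
- `-f`: number of full-spread patches, bringing the chart to this total (set to patches-per-page, e.g. 672 for i1Pro + A3)
- `-e` / `-B` / `-g`: extra white patches, black patches, and gray-axis steps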

Then `printtarg` lays it out for the selected paper size and instrument:

```
printtarg -v -i<instr> -t300 [-h] -p<paper> <name>
```

The user prints the chart image that `printtarg` writes alongside the `.ti2`
layout file and measures it with `chartread`, producing a `.ti3` measurement
file (`chart1.ti3`).
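
A minimal sketch of how the two command lines above might be assembled from Python. The helper names and the default white/black/gray counts are assumptions made for illustration; only the flags themselves come from the commands shown in this section.

```python
import subprocess

def targen_cmd(name, total, white=4, black=4, gray=32):
    """Build the targen argument list (white/black/gray defaults are hypothetical)."""
    return ["targen", "-v", "-d2", "-G",
            f"-f{total}", f"-e{white}", f"-B{black}", f"-g{gray}", name]

def printtarg_cmd(name, instr="i1", paper="A3", hexagon=False):
    """Build the printtarg argument list; -h is added only when requested."""
    cmd = ["printtarg", "-v", f"-i{instr}", "-t300"]
    if hexagon:
        cmd.append("-h")
    cmd += [f"-p{paper}", name]
    return cmd

def run(cmd):
    """Run one Argyll tool and raise if it exits with an error."""
    subprocess.run(cmd, check=True)
```

Calling `run(targen_cmd("chart1", 672))` followed by `run(printtarg_cmd("chart1"))` would reproduce Step 1, provided the Argyll binaries are on PATH.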

---

## 3. Step 2 — Intermediate Profile

```
colprof -v -ql -cmt -dpp -D "<desc>" <name>
```

- `-ql`: low quality, which is fast and sufficient as a predictor for pass 2
- `-cmt`: source viewing condition `mt` (monitor in a typical work environment), used for gamut mapping
- `-dpp`: destination viewing condition `pp` (practical reflection print, ISO 3664 P2)

The resulting ICC is renamed to `_precond_tmp.icc` (shown as `precond.icc` in
the commands in this document). It is used only to predict RGB→Lab and never
reaches the user.

---

## 4. Step 3 — Refinement Patch Generation (Core Algorithm)

This is **the key innovation**. Instead of `targen -c precond.icc`, wsprofiler
runs a Python-based greedy selector.

### 4a. Build a ~10 k Candidate Pool

Six sources feed into one large pool, deduplicated by 8-bit RGB key
`(round(r*255), round(g*255), round(b*255))`.

| Source | Generation | Default count |
|--------|-----------|---------------|
| **Uniform grid** | Full RGB cube at `grid=17` (17³ = 4913) | ~4900 |
| **Halton sequence** | Low-discrepancy quasi-random in [0,1]³ (bases 2,3,5) | 4096 |
| **Neutral axis + perturbations** | 65 gray steps × 7 chromatic perturbations (±0.01–0.02 in R/G/B) | 455 |
| **Edge/axis curves** | 10 axes (R, G, B, C, M, Y, KW, RG, GB, BR) × 33 steps | 330 |
| **Hue sweeps** | 5 luma levels × 24 hues × 2 chroma fractions (HSV-like) | 240 |
| **Perceptual anchors** | 31 hand-picked memory colours (skin, foliage, sky, ink limits, etc.) | 31 |
| **Total (approx)** | | **~10 000** |
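
As an illustration of the pool construction, here is a minimal sketch of the uniform-grid source and the 8-bit deduplication step. The function names are hypothetical, and keeping the first occurrence of each key is an assumption; the original may resolve collisions differently.

```python
import itertools
import numpy as np

def uniform_grid(grid=17):
    """All grid**3 RGB triples with evenly spaced channel values in 0..1."""
    steps = np.linspace(0.0, 1.0, grid)
    return np.array(list(itertools.product(steps, repeat=3)))

def dedupe_rgb(rgb):
    """Drop rows that collide on the 8-bit key, keeping first occurrences in order."""
    rgb = np.asarray(rgb, dtype=float)
    keys = np.round(rgb * 255).astype(int)
    _, first = np.unique(keys, axis=0, return_index=True)
    return rgb[np.sort(first)]
```

The other five sources would be concatenated with the grid before calling `dedupe_rgb`, so a Halton point that rounds onto a grid node is dropped.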

**Key detail — Neutral perturbations**: the 7 perturbations are tiny chromatic
offsets applied to each gray step (e.g. `+0.01R, -0.01G`). This samples the
immediate neighbourhood of the gray axis, so the printed neutral is still
bracketed by measurements when small printer non-linearities shift it away
from R=G=B.

**Key detail — Halton sequence**: a deterministic low-discrepancy sequence
that fills RGB space more evenly than pure random sampling, ensuring no large
gaps remain even after the uniform grid.
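
A Halton sequence in bases 2, 3 and 5 can be generated with the radical-inverse construction. This is a sketch only: starting at index 1 (which skips the all-zero point) is an assumption, and the function names are hypothetical.

```python
import numpy as np

def radical_inverse(index, base):
    """Mirror the base-`base` digits of `index` around the radix point."""
    result, frac = 0.0, 1.0 / base
    while index > 0:
        result += (index % base) * frac
        index //= base
        frac /= base
    return result

def halton_rgb(n, bases=(2, 3, 5), start=1):
    """Return n quasi-random RGB triples in [0, 1), one base per channel."""
    return np.array([[radical_inverse(i, b) for b in bases]
                     for i in range(start, start + n)])
```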

### 4b. Project Candidates to Lab

The entire candidate pool is projected to CIELAB (D50) using `xicclu` running
against the intermediate profile:

```
xicclu -ir -pl -s1 precond.icc
```

- `-ir`: relative colorimetric intent (the forward direction, device → PCS, is `xicclu`'s default)
- `-pl`: Lab PCS output
- `-s1`: device values scaled to 0..1

This is done in **one subprocess call per batch** (passing ~10 k RGB lines via
stdin) for efficiency. The predictor is also used to project the Pass-1
patches into the same Lab space: the selector reads only the **target** RGB
values from the Pass-1 `.ti3`, not its measured values, and predicts Lab from
them. This keeps candidates and Pass-1 patches in one consistent,
device-independent space that reflects the printer's actual behaviour as
captured in the intermediate profile.
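
The batch call can be sketched as follows. The parser assumes each result line ends with `-> L a b [Lab]`, which is an assumption about `xicclu`'s text output that should be checked against your Argyll version; the function names are hypothetical.

```python
import re
import subprocess
import numpy as np

_FLOAT = re.compile(r"[-+]?\d*\.?\d+(?:[eE][-+]?\d+)?")

def format_rgb_lines(rgb):
    """One 'R G B' line per candidate, device values in 0..1."""
    return "".join(f"{r:.6f} {g:.6f} {b:.6f}\n" for r, g, b in rgb)

def parse_xicclu_output(text):
    """Take the first three numbers after the last '->' on each result line."""
    rows = []
    for line in text.splitlines():
        if "->" not in line:
            continue  # skip banners and blank lines
        values = _FLOAT.findall(line.rsplit("->", 1)[1])
        if len(values) >= 3:
            rows.append([float(v) for v in values[:3]])
    return np.array(rows)

def predict_lab(rgb, profile_path, xicclu="xicclu"):
    """Send the whole batch through a single xicclu process via stdin."""
    proc = subprocess.run([xicclu, "-ir", "-pl", "-s1", profile_path],
                          input=format_rgb_lines(rgb),
                          capture_output=True, text=True, check=True)
    return parse_xicclu_output(proc.stdout)
```

A caller should verify that the returned array has as many rows as the input, since a parsing mismatch would otherwise silently misalign candidates and Lab values.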

### 4c. Load Pass-1 Lab

Read the Pass-1 `.ti3` file, extract the **target RGB** for each patch
(columns `RGB_R`, `RGB_G`, `RGB_B` — stored as 0..100 values in the file),
normalise to 0..1, and run through `xicclu` to get predicted Lab.

Now we have `pass1_lab`: an N×3 array of Lab values representing where the
first chart's patches sit in colour space according to the intermediate
profile.
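
Extracting the target RGB columns needs only a small amount of CGATS handling. The sketch below is not the wsprofiler parser: it reads the first table only and assumes whitespace-separated fields with no spaces inside quoted values.

```python
import numpy as np

def read_ti3_rgb(text):
    """Return an N x 3 array of target RGB (0..1) from the first CGATS table."""
    fields, rows, section = [], [], None
    for raw in text.splitlines():
        line = raw.strip()
        if line == "BEGIN_DATA_FORMAT":
            section = "format"
        elif line == "END_DATA_FORMAT":
            section = None
        elif line == "BEGIN_DATA":
            section = "data"
        elif line == "END_DATA":
            break  # ignore any later tables
        elif section == "format":
            fields.extend(line.split())
        elif section == "data" and line:
            rows.append(line.split())
    idx = [fields.index(k) for k in ("RGB_R", "RGB_G", "RGB_B")]
    # The file stores device values as 0..100; normalise to 0..1.
    return np.array([[float(r[i]) for i in idx] for r in rows]) / 100.0
```

The resulting array can be passed straight to the Lab predictor to obtain `pass1_lab`.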

### 4d. Greedy Selection Loop

#### Forced anchors

Five RGB points are **selected first**, each subject to the minimum-ΔE gate:

```python
(1.0, 1.0, 1.0)   # white
(0.0, 0.0, 0.0)   # black
(0.25, 0.25, 0.25)
(0.50, 0.50, 0.50)
(0.75, 0.75, 0.75)
```

These guarantee that the extremes and the mid-grays are present for calibration.

#### Scoring formula

Each candidate receives a score, and the highest-scoring *eligible* candidate
is picked in each iteration. Once a candidate is picked, every remaining
candidate within `min_dE` of it (ΔE76, default 2.5) becomes ineligible for
subsequent iterations.

```
score = 1.0 × novelty + 0.5 × region_undercoverage + 0.3 × neutrality + 0.3 × luma_balance
```

| Term | Formula | Purpose |
|------|---------|---------|
| **Novelty** | `cap × tanh(raw_dE / cap)` where `raw_dE` = ΔE76 to nearest of (Pass-1 ∪ already-selected), cap default = 30.0 | Prefers perceptually new colours; saturates at ~30 ΔE to avoid chasing outliers |
| **Region undercoverage** | `1 - occ/max(occ)` where `occ` = count in that 8×8×8 Lab voxel | Spreads patches across colour space; heavily penalises well-covered Lab regions |
| **Neutrality** | `exp(-chroma / 10.0)` where chroma = sqrt(a*² + b*²) | Biases toward near-neutral colours (critical for gray balance) |
| **Luma balance** | 1.0 for the most under-represented luminance tertile (shadow < 33, mid 33–66, highlight > 66) | Ensures dark, mid, and light regions are all well-sampled |
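
Three of the four terms can be sketched directly from the table. Assumptions in this sketch: the luma term is 0.0 outside the most under-represented tertile, tertile boundaries are half-open at 33 and 66, and the function names are hypothetical. The region term depends on the voxel grid and is not included here.

```python
import numpy as np

def novelty(cand_lab, ref_lab, cap=30.0):
    """cap * tanh(dE / cap), with dE the distance to the nearest reference point.

    Builds an (n_cand, n_ref, 3) array, so process large pools in chunks.
    """
    diff = cand_lab[:, None, :] - ref_lab[None, :, :]
    raw_de = np.sqrt((diff ** 2).sum(axis=2)).min(axis=1)
    return cap * np.tanh(raw_de / cap)

def neutrality(cand_lab):
    """exp(-chroma / 10): 1.0 on the neutral axis, decaying with chroma."""
    chroma = np.hypot(cand_lab[:, 1], cand_lab[:, 2])
    return np.exp(-chroma / 10.0)

def luma_balance(cand_lab, ref_lab):
    """1.0 for candidates in the L* tertile holding the fewest reference points."""
    def tertile(lightness):
        return np.digitize(lightness, [33.0, 66.0])  # 0 shadow, 1 mid, 2 highlight
    counts = np.bincount(tertile(ref_lab[:, 0]), minlength=3)
    return (tertile(cand_lab[:, 0]) == counts.argmin()).astype(float)
```

Here `ref_lab` stands for the union of the Pass-1 Lab points and the patches selected so far. Note that novelty can reach the cap (30 by default) while the other terms stay within 0..1.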

#### Region counting (8×8×8 Lab voxel grid)

```
L* index = clamp(floor(L* / 100 * 8), 0, 7)
a* index = clamp(floor((a* + 128) / 256 * 8), 0, 7)
b* index = clamp(floor((b* + 128) / 256 * 8), 0, 7)
```

The 8×8×8 grid divides CIELAB into 512 equal-sized cells. Each Lab point
falls into exactly one cell; `region_counts` tracks how many (Pass-1 +
already-selected) points occupy each cell. The scoring uses this to favour
cells with few or no occupants.
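
The voxel indexing and the undercoverage term might look like this in NumPy. The sketch assumes `max(occ)` means the largest count anywhere in the grid, which is one reading of the formula; the function names are hypothetical.

```python
import numpy as np

def voxel_index(lab, n=8):
    """Map N x 3 Lab values to integer (L, a, b) cell indices in an n^3 grid."""
    lab = np.asarray(lab, dtype=float)
    scaled = np.stack([lab[:, 0] / 100.0,
                       (lab[:, 1] + 128.0) / 256.0,
                       (lab[:, 2] + 128.0) / 256.0], axis=1) * n
    return np.clip(np.floor(scaled).astype(int), 0, n - 1)

def region_undercoverage(cand_lab, occupied_lab, n=8):
    """1 - occ / max(occ) per candidate; 1.0 means the candidate's cell is empty."""
    counts = np.zeros((n, n, n), dtype=int)
    oi = voxel_index(occupied_lab, n)
    np.add.at(counts, (oi[:, 0], oi[:, 1], oi[:, 2]), 1)  # handles repeated cells
    ci = voxel_index(cand_lab, n)
    occ = counts[ci[:, 0], ci[:, 1], ci[:, 2]]
    return 1.0 - occ / max(counts.max(), 1)
```

In the selection loop the counts would be updated incrementally after each pick rather than rebuilt from scratch.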

#### Eligibility gate

A candidate is **eligible** if its ΔE76 distance to every *already-selected*
patch is ≥ `min_dE` (default 2.5). This prevents clumping. The gate uses only
the selected set (not Pass-1), since Pass-1 has already been accounted for in
the novelty score.

#### Termination

The loop runs until either:
- `target_n` patches are selected (typically patches-per-page, so the second
  chart fills the same number of pages as the first), or
- No eligible candidates remain (all remaining candidates are within
  `min_dE` of an already-selected patch).
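
Putting the forced anchors, the eligibility gate and the two termination conditions together gives a loop along these lines. This is a sketch under assumptions: `score_fn(cand_lab, selected)` is a hypothetical callback returning one score per candidate, and `forced` holds the pool indices of the anchor patches.

```python
import numpy as np

def delta_e76(lab, point):
    """Euclidean Lab distance from every row of `lab` to one Lab `point`."""
    return np.sqrt(((lab - point) ** 2).sum(axis=1))

def greedy_select(cand_lab, score_fn, target_n, min_de=2.5, forced=()):
    """Return candidate indices in pick order."""
    cand_lab = np.asarray(cand_lab, dtype=float)
    eligible = np.ones(len(cand_lab), dtype=bool)
    selected = []

    def pick(i):
        selected.append(i)
        eligible[i] = False
        # Mask everything closer than min_de to the new pick.
        eligible[delta_e76(cand_lab, cand_lab[i]) < min_de] = False

    for i in forced:  # anchors go first, but still respect the gate
        if len(selected) < target_n and eligible[i]:
            pick(i)

    # Stop when enough patches are chosen or nothing eligible remains.
    while len(selected) < target_n and eligible.any():
        scores = np.where(eligible, score_fn(cand_lab, selected), -np.inf)
        pick(int(np.argmax(scores)))
    return selected
```

A real `score_fn` would combine the four weighted terms and recompute novelty and region counts against Pass-1 plus the current selection on every call.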

### 4e. Write `.ti1` Output

The output is a CGATS `.ti1` file with up to three tables:

**Table 1 — Main patches** (one row per selected candidate):
```
COLOR_REP "iRGB"
ACCURATE_EXPECTED_VALUES "true"
NUMBER_OF_FIELDS 7
BEGIN_DATA_FORMAT
SAMPLE_ID RGB_R RGB_G RGB_B XYZ_X XYZ_Y XYZ_Z
END_DATA_FORMAT
```

Each row contains the RGB target values (0..100 scale) and the **expected XYZ**
(converted from the predicted Lab using the D50 white point). Setting
`ACCURATE_EXPECTED_VALUES "true"` tells `printtarg` to use the provided XYZ
for its layout calibration rather than recalculating from the RGB.
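
The Lab to XYZ conversion is the standard CIELAB inverse. The sketch below assumes the ICC D50 white point scaled so that Y = 100 and a four-decimal row format; the function names and the number formatting are illustrative, not taken from wsprofiler.

```python
D50 = (96.42, 100.0, 82.49)  # assumed: ICC D50 white, Y scaled to 100

def lab_to_xyz_d50(L, a, b):
    """Invert CIELAB to XYZ relative to the D50 white point."""
    fy = (L + 16.0) / 116.0
    fx = fy + a / 500.0
    fz = fy - b / 200.0

    def f_inv(t):
        delta = 6.0 / 29.0
        return t ** 3 if t > delta else 3.0 * delta ** 2 * (t - 4.0 / 29.0)

    return tuple(w * f_inv(t) for w, t in zip(D50, (fx, fy, fz)))

def ti1_row(sample_id, rgb, lab):
    """Format one data row: RGB on the 0..100 scale, then expected XYZ."""
    r, g, b = (100.0 * c for c in rgb)
    x, y, z = lab_to_xyz_d50(*lab)
    return f"{sample_id} {r:.4f} {g:.4f} {b:.4f} {x:.4f} {y:.4f} {z:.4f}"
```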

**Table 2 — Density extremes** (8 patches, fixed):
```
C M Y values (roughly CMY overprints) for printtarg's density calibration.
```

**Table 3 — Device combination values** (9 patches):
```
White, CMY, RGB, black, mid-gray — used by printtarg for ink-limit calcs.
```

> If no predictor is available, the file uses `COLOR_REP "RGB"` with only
> 4 fields (SAMPLE_ID, RGB_R, RGB_G, RGB_B) and omits tables 2 and 3.

### 4f. Run `printtarg`

Same command as Step 1, producing the second `.ti2` chart. The user prints
and measures it.

---

## 5. Step 4 — TI3 Combiner

The combiner merges the two `.ti3` measurement files into a single file
that `colprof` can consume for final profile creation.

### Algorithm

1. **Validate compatibility** — parse both files with a structured CGATS
   parser and check that `DATA_FORMAT` matches. If Pass-1 used
   `SAMPLE_ID RGB_R RGB_G RGB_B XYZ_X XYZ_Y XYZ_Z`, Pass-2 must too.

2. **Read raw text** of file A (preserving exact header formatting,
   quoting, keyword ordering).

3. **Extract data rows** from file B — everything between `BEGIN_DATA`
   and `END_DATA`.

4. **Rebuild file A** with two modifications:
   - `NUMBER_OF_SETS` ← count(A) + count(B)
   - B's data rows inserted just before `END_DATA`

All other keywords (`DESCRIPTOR`, `CREATED`, `COLOR_REP`, `DATA_FORMAT`,
`APPROX_WHITE_POINT`, measurement condition comments, etc.) are preserved
verbatim from file A.
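
The four steps above can be sketched in plain text processing. This is an illustration, not the wsprofiler combiner: it handles only the first table of each file, compares `DATA_FORMAT` field names directly instead of using a structured parser, and normalises line endings to `\n`.

```python
import re

def _span(stripped, begin_kw, end_kw):
    """Line indices of a BEGIN_x / END_x pair."""
    begin = stripped.index(begin_kw)
    return begin, stripped.index(end_kw, begin)

def combine_ti3(text_a, text_b):
    """Append B's data rows to A and update A's NUMBER_OF_SETS."""
    a, b = text_a.splitlines(), text_b.splitlines()
    sa, sb = [ln.strip() for ln in a], [ln.strip() for ln in b]

    # Step 1: the field lists must match.
    fa0, fa1 = _span(sa, "BEGIN_DATA_FORMAT", "END_DATA_FORMAT")
    fb0, fb1 = _span(sb, "BEGIN_DATA_FORMAT", "END_DATA_FORMAT")
    if " ".join(sa[fa0 + 1:fa1]).split() != " ".join(sb[fb0 + 1:fb1]).split():
        raise ValueError("DATA_FORMAT mismatch between the two files")

    # Steps 2 and 3: locate A's data block and extract B's rows.
    a0, a1 = _span(sa, "BEGIN_DATA", "END_DATA")
    b0, b1 = _span(sb, "BEGIN_DATA", "END_DATA")
    rows_a = [ln for ln in a[a0 + 1:a1] if ln.strip()]
    rows_b = [ln for ln in b[b0 + 1:b1] if ln.strip()]

    # Step 4: rewrite NUMBER_OF_SETS in A's header, insert B's rows before END_DATA.
    total = len(rows_a) + len(rows_b)
    head = [re.sub(r"^(\s*NUMBER_OF_SETS\s+)\d+", rf"\g<1>{total}", ln)
            for ln in a[:a0 + 1]]
    return "\n".join(head + a[a0 + 1:a1] + rows_b + a[a1:]) + "\n"
```

`SAMPLE_ID` values from file B are copied unchanged, so IDs may repeat in the combined file.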

### Why not just use Argyll's `iccgamut` or merge tools?

The combiner is intentionally simple (~90 lines). It concatenates measured
data from two charts without any interpolation, deduplication, or weighted
averaging. The reason is architectural: each chart has patches at different
RGB locations, and `colprof` expects to see all measurements as independent
samples. The duplicate-neutral-axis patches (present in both passes) are
harmless — `colprof` averages them naturally during profile fitting.

---

## 6. Why This Approach Works Better Than `targen -c`

Argyll's `targen -c precond.icc` generates patches by:
1. Building a uniform grid in the **intermediate profile's PCS space** (Lab)
2. Inverting through the profile to find corresponding RGB values

This tends to concentrate patches near the gamut **surface** and gives fewer
samples to the interior and the neutral axis. It also produces many patches
that are perceptually very close to Pass-1 patches (low novelty).

The wsprofiler approach explicitly addresses these weaknesses:
- Each patch is scored for **novelty** relative to Pass-1, so redundant
  patches earn a novelty term near zero
- The **region-coverage** term forces exploration of under-sampled Lab voxels
- The **neutrality bonus** and **luma balance** terms ensure gray-axis and
  tonal-range coverage
- The **min_dE gate** prevents clumping among the Pass-2 patches themselves

In practice, this produces a Pass-2 chart that fills perceptual gaps left by
Pass-1, giving the final `colprof` a more uniformly sampled measurement set
and a better profile overall.

---

## 7. Porting Guide

### What you need to reimplement

| Component | Complexity | Notes |
|-----------|-----------|-------|
| Candidate pool builder | Low | 6 straightforward sampling strategies |
| Lab predictor | Medium | Wrap any ICC forward-transform engine (xicclu, LittleCMS, etc.) |
| Scoring + greedy selection | Medium | Straightforward numpy — adapt to your language's array library |
| ΔE76 computation | Low | `sqrt((L₁-L₂)² + (a₁-a₂)² + (b₁-b₂)²)` |
| TI1 writer | Medium | CGATS format is well-documented in Argyll source |
| TI3 combiner | Low | Text manipulation with CGATS awareness |
| Orchestration (wizard) | Medium | Sequential subprocess management |

### Key parameters (tune for your use case)

| Parameter | Default | Effect |
|-----------|---------|--------|
| `grid` | 17 | Uniform grid side length (17³ = 4913) |
| `halton_n` | 4096 | Halton sample count |
| `min_dE` | 2.5 | Minimum ΔE76 separation |
| `novelty_soft_cap` | 30.0 | Novelty saturation threshold |
| Scoring weights | 1.0, 0.5, 0.3, 0.3 | novelty, coverage, neutrality, luma |
| `target_n` | patches-per-page | Number of Pass-2 patches to select |

### Dependencies

- **ArgyllCMS binaries** (`targen`, `printtarg`, `chartread`, `colprof`,
  `xicclu`) on PATH
- **NumPy** (or equivalent array library for the scoring math) — the scoring
  is written to be fully vectorised

### Testing without hardware

To test the algorithm without a spectrophotometer:
1. Use the sample `.ti3` file in `assets/sample/`
2. Build an intermediate profile with `colprof`
3. Run `generate_pass2_ti1()` — it will produce a `.ti1` file
4. Verify with `printtarg` that it renders correctly (no measurement needed)

---

## 8. Key Files in the wsprofiler Source

| File | Purpose |
|------|---------|
| `wsprofiler/ui/two_step_dialog.py` | Dialog orchestration (6 wizard pages, subprocess management) |
| `wsprofiler/profiling/pass2_generator.py` | **Core algorithm** — candidate pool, scoring, greedy selection, TI1 writer |
| `wsprofiler/ti/ti3_combiner.py` | TI3 measurement merger |
| `wsprofiler/ti/ti3.py` | TI3 parser (used to load Pass-1 patches) |
| `wsprofiler/ti/ti2.py` | TI2 parser (chart definition files) |
| `wsprofiler/ti/cgats.py` | Generic CGATS file parser |
| `wsprofiler/argyll/chartread.py` | Interactive `chartread` session manager |
| `wsprofiler/tests/test_pass2_generator.py` | Unit tests with stub predictor |

---

## 9. Licence

wsprofiler is open source. See the project repository for the full licence
terms. The algorithm described here is free to reimplement.
