Pairwise Comparison
Two items side by side. Participants pick a winner — or call it a tie. Candor handles pair generation, position counterbalancing, and statistical analysis so you don't have to.
Candor does this so you don't have to
Pair generation
N items produce N*(N-1)/2 unique pairs. 5 items = 10 pairs, 10 items = 45 pairs. The pair count grows quadratically with the number of items, and Candor generates every pair for you.
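To make the pair count concrete, here is a minimal sketch of unordered pair enumeration using Python's standard library. It illustrates the N*(N-1)/2 formula only; the function name `generate_pairs` and the item IDs are hypothetical and are not taken from Candor's API.

```python
from itertools import combinations


def generate_pairs(item_ids):
    """Return every unordered pair of distinct items exactly once."""
    return list(combinations(item_ids, 2))


# Hypothetical item IDs, used only for illustration.
items = ["item_a", "item_b", "item_c", "item_d", "item_e"]
pairs = generate_pairs(items)
print(len(pairs))  # 10, matching 5 * 4 / 2
```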
Display randomization
Each pair is randomly assigned AB or BA order with a 50/50 split. Because each item is equally likely to appear in either position, position bias averages out across participants instead of favoring one item.
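The 50/50 order assignment can be sketched as a coin flip per presentation. This is an illustrative sketch, not Candor's implementation: the function name `assign_display_order` and the `left`, `right`, and `order` fields are assumptions made for the example.

```python
import random


def assign_display_order(pair, rng=random):
    """Randomly show a pair as AB or BA with equal probability."""
    a, b = pair
    if rng.random() < 0.5:
        return {"left": a, "right": b, "order": "AB"}
    return {"left": b, "right": a, "order": "BA"}


# Hypothetical item IDs; pass a seeded random.Random for reproducible runs.
trial = assign_display_order(("item_a", "item_c"), random.Random(42))
print(trial["order"], trial["left"], trial["right"])
```

Keeping the assigned order on the trial record is what lets a positional answer be translated back to an item ID later.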
Multiple participants per pair
Each pair is evaluated by multiple participants — not just one. Overlapping judgments let Candor measure inter-rater agreement and produce statistically reliable rankings.
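As one simple illustration of why overlapping judgments matter, the sketch below computes raw percent agreement for a single pair: the share of judgments that match the most common choice. The source does not say which agreement statistic Candor uses, so treat this as an assumed stand-in; `pair_agreement` and the vote values are hypothetical.

```python
from collections import Counter


def pair_agreement(votes):
    """Fraction of judgments that match the most common choice for one pair."""
    if not votes:
        raise ValueError("need at least one judgment")
    counts = Counter(votes)
    top_count = counts.most_common(1)[0][1]
    return top_count / len(votes)


# Four hypothetical participants judged the same pair.
votes = ["item_a", "item_a", "item_b", "item_a"]
print(pair_agreement(votes))  # 0.75
```

With only one judgment per pair this value is always 1.0, which is why a single rater cannot reveal disagreement.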
Response de-mapping
Responses are automatically mapped back to original item IDs regardless of display order. You always see true, normalized item preferences.
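A minimal sketch of de-mapping, assuming each trial records which item was shown on the left and which on the right. The function name `demap_response`, the `left`/`right` fields, and the returned field names are illustrative assumptions, not Candor's documented schema.

```python
def demap_response(trial, choice):
    """Translate a positional choice ('left', 'right', 'tie') into an item ID."""
    if choice == "tie":
        return {"original_id": None, "tie": True}
    if choice not in ("left", "right"):
        raise ValueError(f"unexpected choice: {choice!r}")
    return {"original_id": trial[choice], "tie": False}


# Hypothetical trial shown in BA order: item_c appeared on the left.
trial = {"left": "item_c", "right": "item_a", "order": "BA"}
print(demap_response(trial, "left"))
```

The participant only ever reports a position; the item ID comes from the stored trial record, so the result is the same whichever order was displayed.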
{ original_id: "item_c" }
Audio Quality
Ranking TTS or audio generation models by perceived quality and naturalness.
Visual Design
Comparing UI designs, logos, or creative assets to identify the most effective variant.
Copy Testing
A/B testing marketing copy or technical documentation for clarity and engagement.
RLHF Data
Reinforcement Learning from Human Feedback: collecting preference data for LLM tuning.
Model Benchmarking
Human-in-the-loop selection and benchmarking of generative model outputs.