A/B testing

Test different versions of your onboarding flow to find what converts best.

Overview

FlwKit experiments let you compare onboarding variants on a flow, measure conversion to paywall, and promote winners to live without an App Store release.

Experiments are managed from two places:

  • The app-level Experiments tab in the dashboard sidebar, for a cross-flow overview.
  • The flow-level Experiments workspace, for creating, editing, and analyzing one flow’s tests.

How it works

  1. Open a flow and create an experiment.
  2. FlwKit creates Control and Variant B from the flow.
  3. Assign traffic split (must total 100%).
  4. Edit variant screens independently.
  5. Start the experiment.
  6. The backend assigns variants deterministically during the normal SDK flow fetch.
  7. Analytics events include experiment and variant context.
  8. Review results, then promote a winner.
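As an illustration of what deterministic assignment in step 6 can look like, here is a minimal sketch. It is not FlwKit's backend code: the Variant type, the fnv1a64 and assignVariant functions, the salt format, and the choice of FNV-1a hashing are all assumptions made for this example. The idea is that hashing a salt plus a stable identifier yields the same bucket every time, so the same device keeps the same variant.

```swift
// Hypothetical variant description; not a FlwKit type.
struct Variant {
    let name: String
    let trafficPercent: Double  // share of traffic, 0...100
}

// FNV-1a 64-bit hash. It is stable across launches, unlike Swift's
// built-in Hasher, which is randomly seeded per process.
func fnv1a64(_ text: String) -> UInt64 {
    var hash: UInt64 = 0xcbf29ce484222325
    for byte in text.utf8 {
        hash ^= UInt64(byte)
        hash = hash &* 0x100000001b3  // wrapping multiply by the FNV prime
    }
    return hash
}

// Maps an identifier to a bucket in 0.00..<100.00, then walks the
// cumulative traffic split. Same inputs always return the same variant.
func assignVariant(identifier: String, salt: String, variants: [Variant]) -> Variant? {
    guard !variants.isEmpty else { return nil }
    let bucket = Double(fnv1a64(salt + ":" + identifier) % 10_000) / 100.0
    var cumulative = 0.0
    for variant in variants {
        cumulative += variant.trafficPercent
        if bucket < cumulative { return variant }
    }
    return variants.last  // fallback when splits sum to slightly under 100
}

let variants = [
    Variant(name: "Control", trafficPercent: 50),
    Variant(name: "Variant B", trafficPercent: 50),
]
let first = assignVariant(identifier: "device-123", salt: "salt-v1", variants: variants)
let second = assignVariant(identifier: "device-123", salt: "salt-v1", variants: variants)
```

Both calls return the same variant because nothing in the inputs changed. In a scheme like this, changing the salt is what would reshuffle users across buckets.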

App-level experiments

Use Sidebar → Experiments to see an app-wide overview:

  • experiment counts per flow
  • running/draft/completed status snapshots
  • quick actions to open a flow’s experiments or start a new one

Flow-level experiments

Use Flow → Experiments for the full workflow:

  • create and configure variants
  • edit screens per variant
  • manage status (draft, running, paused, completed)
  • inspect results and promote winners

Creating an experiment

Step 1 — Name your experiment

Set:

  • Experiment name (required)
  • Hypothesis (optional)

Step 2 — Configure variants

  • Control starts from current flow content.
  • Variant B starts as a copy.
  • You can add more variants, up to 4 in total.
  • Each variant has independent flow version content.

Step 3 — Set traffic splits

Traffic splits across all variants must total 100%. The server validates the sum with a tolerance of 1.0.

Recommended defaults:

  • 50/50 for standard tests
  • Uneven splits for higher-risk variants
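If you want to check a split client-side before submitting it, a validator along these lines may help. This is a sketch under stated assumptions: isValidTrafficSplit is a hypothetical helper, not part of the FlwKit SDK, and the tolerance is assumed to be expressed in percentage points.

```swift
// Hypothetical helper; not a FlwKit API.
// Returns true when every share is a valid percentage and the total
// is within `tolerance` of 100 (tolerance assumed to be in percentage points).
func isValidTrafficSplit(_ percents: [Double], tolerance: Double = 1.0) -> Bool {
    guard !percents.isEmpty,
          percents.allSatisfy({ $0 >= 0 && $0 <= 100 }) else { return false }
    let total = percents.reduce(0, +)
    return abs(total - 100) <= tolerance
}

let evenSplit = isValidTrafficSplit([50, 50])
let threeWay = isValidTrafficSplit([33.3, 33.3, 33.3])  // totals 99.9
let overAllocated = isValidTrafficSplit([60, 60])       // totals 120
```

A tolerance makes three-way splits such as 33.3/33.3/33.3 pass even though they do not sum to exactly 100.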

Running an experiment

  • Start validates and moves draft → running.
  • Pause stops new assignment while preserving existing assignment data.
  • Resume continues assignment for paused tests.
  • Complete ends the test without promoting.
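The status changes above can be read as a small state machine. The sketch below models them; the type and function names are illustrative, not FlwKit types, and it assumes Complete is allowed from both running and paused, which the list above does not state explicitly.

```swift
// Illustrative model of the experiment lifecycle; not FlwKit types.
enum ExperimentStatus { case draft, running, paused, completed }
enum ExperimentAction { case start, pause, resume, complete }

// Returns the next status, or nil when the action is not allowed
// from the current status.
func nextStatus(from status: ExperimentStatus,
                applying action: ExperimentAction) -> ExperimentStatus? {
    switch (status, action) {
    case (.draft, .start):
        return .running
    case (.running, .pause):
        return .paused
    case (.paused, .resume):
        return .running
    case (.running, .complete), (.paused, .complete):
        return .completed  // assumption: both running and paused tests can be completed
    default:
        return nil
    }
}

let afterStart = nextStatus(from: .draft, applying: .start)
let invalidResume = nextStatus(from: .draft, applying: .resume)
```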

Reading results

Sessions and conversion rate

The primary KPI is paywall conversion: the number of sessions that reached the paywall divided by total sessions (paywallReached / sessions).

Win probability

Bayesian win probability estimates the chance a variant truly beats control.
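To build intuition for this number, here is one common way to compute such a probability. It is a sketch, not necessarily the model FlwKit uses: it assumes independent uniform Beta(1, 1) priors on each conversion rate and evaluates the closed-form sum for P(variant > control). The function names are hypothetical.

```swift
import Foundation

// Log of the Beta function via lgamma, to avoid overflow for large counts.
func logBeta(_ a: Double, _ b: Double) -> Double {
    return lgamma(a) + lgamma(b) - lgamma(a + b)
}

// Probability that the variant's true conversion rate exceeds control's,
// assuming Beta(1, 1) priors. Expects conversions <= sessions for both arms.
func winProbability(controlConversions: Int, controlSessions: Int,
                    variantConversions: Int, variantSessions: Int) -> Double {
    let alphaA = Double(controlConversions + 1)
    let betaA = Double(controlSessions - controlConversions + 1)
    let alphaB = variantConversions + 1  // kept as Int: it bounds the sum
    let betaB = Double(variantSessions - variantConversions + 1)
    var total = 0.0
    for i in 0..<alphaB {
        let k = Double(i)
        // Each term is computed in log space, then exponentiated.
        total += exp(logBeta(alphaA + k, betaA + betaB)
                     - log(betaB + k)
                     - logBeta(1 + k, betaB)
                     - logBeta(alphaA, betaA))
    }
    return total
}

let tied = winProbability(controlConversions: 10, controlSessions: 100,
                          variantConversions: 10, variantSessions: 100)
let strong = winProbability(controlConversions: 10, controlSessions: 100,
                            variantConversions: 30, variantSessions: 100)
```

With identical data on both arms, the result is one half by symmetry. A variant that converts far more often than control yields a value close to 1.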

Confidence interval

FlwKit returns 95% Wilson confidence intervals for conversion rates.
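For reference, the standard Wilson score interval can be computed as in the sketch below. The wilsonInterval function is a hypothetical helper written for this example, not an SDK call, and z = 1.96 is the usual value for a 95% interval.

```swift
// Hypothetical helper implementing the standard Wilson score interval.
// Returns nil when the inputs cannot describe a valid conversion rate.
func wilsonInterval(conversions: Int, sessions: Int,
                    z: Double = 1.96) -> (lower: Double, upper: Double)? {
    guard sessions > 0, conversions >= 0, conversions <= sessions else { return nil }
    let n = Double(sessions)
    let p = Double(conversions) / n  // observed conversion rate
    let z2 = z * z
    let denominator = 1 + z2 / n
    let center = (p + z2 / (2 * n)) / denominator
    let margin = (z / denominator) * (p * (1 - p) / n + z2 / (4 * n * n)).squareRoot()
    // Clamp to [0, 1] to absorb floating-point drift at the edges.
    return (lower: max(0, center - margin), upper: min(1, center + margin))
}

// 50 paywall views out of 200 sessions: observed rate 0.25.
let interval = wilsonInterval(conversions: 50, sessions: 200)
```

Unlike the simple normal approximation, the Wilson interval stays within 0 and 1 and behaves sensibly at small sample sizes, which matters early in a test.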

How long to run an experiment

Aim for:

  • ~200 sessions per variant minimum
  • at least one full weekday/weekend cycle

Promoting a winner

Promoting a variant:

  1. Copies the winner's screens to the base flow
  2. Marks the experiment as completed
  3. Serves the winner's content on new SDK fetches (remote config behavior)

Resetting assignments

Reset assignments:

  • generates a new assignment salt
  • invalidates cached assignments
  • preserves historical analytics

Identifying users

Device ID (default)

FlwKit persists a device ID and sends it in the X-FlwKit-Device-ID header.

User ID (optional)

For logged-in products, use your stable user ID to improve cross-device consistency:

FlwKit.identify(userId: currentUser.id)

Experiment limits

  • Free plan: 1 experiment per flow
  • Indie plan: unlimited experiments

FAQ

Can I run multiple experiments on the same flow?

Only one running experiment is allowed per flow at a time.

What happens when I promote?

The experiment is marked completed, and the winning screens immediately become the base flow content for future fetches.

Can I change traffic splits mid-run?

Yes. Changing splits on a running experiment requires force: true and triggers an assignment reset, so users may be reassigned to a different variant.

Does the SDK add extra latency?

No. Assignment happens during the normal flow fetch, so the SDK makes no additional request.

I edited one variant and another changed too. Why?

This should not happen, because variants are version-isolated. Editing a variant opens the flow editor with a variant-specific versionId, which keeps changes scoped to that variant.

Troubleshooting

Results not appearing

Check:

  • experiment status is running
  • SDK is sending device ID
  • events include experiment/variant context
  • ingestion delay has passed

Variant traffic looks uneven

Small samples can look uneven. Let traffic accumulate before judging distribution.

Win probability stays low

Gather more sessions and confirm each variant has a comparable path to paywall.