Problem
I want to bucket or classify each row — small / medium / bulk order,
northern / southern, paid / unpaid — and put the label in a derived
column. case does this with less syntax and more safety than a
nested if/else.
Setup
val orders = [
  { id = 1, quantity = 12, region = "north", status = SOME "fulfilled" },
  { id = 2, quantity = 30, region = "south", status = SOME "fulfilled" },
  { id = 3, quantity = 8, region = "north", status = SOME "cancelled" },
  { id = 4, quantity = 50, region = "west", status = SOME "fulfilled" },
  { id = 5, quantity = 6, region = "east", status = NONE },
  { id = 6, quantity = 20, region = "south", status = SOME "fulfilled" }
];
Example
A size-band function applied in yield to derive a new column:
fun sizeBand q =
  if q >= 40 then "bulk"
  else if q >= 15 then "medium"
  else "small";

from ord in orders
  yield { ord.id, ord.quantity, band = sizeBand ord.quantity };
val it =
[{band="small",id=1,quantity=12},{band="medium",id=2,quantity=30},
{band="small",id=3,quantity=8},{band="bulk",id=4,quantity=50},
{band="small",id=5,quantity=6},{band="medium",id=6,quantity=20}]
: {band:string, id:int, quantity:int} list
What's happening
sizeBand is a plain function of type int -> string. The type checker
pins that down from the body alone: q >= 40 compares q with an
integer literal, so q is int, and every branch returns a string, so
the result type is string. The function is called once per row in the
yield step, and the new band field joins whatever else you project.
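Because sizeBand is an ordinary function, its thresholds can be checked on their own before it goes into a query. The calls below are a small sketch that assumes sizeBand from the Example is already defined in the session; the expected results in the comments follow directly from the two >= tests.

```sml
(* Assumes sizeBand from the Example has been evaluated in this session. *)
sizeBand 40;  (* "bulk": the first test, q >= 40, succeeds *)
sizeBand 39;  (* "medium": fails q >= 40, passes q >= 15 *)
sizeBand 15;  (* "medium": 15 is the inclusive lower bound *)
sizeBand 14;  (* "small": falls through to the final else *)
```

Both boundaries are inclusive, so an order of exactly 15 or exactly 40 lands in the higher band.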
For a multi-axis classifier, reach for case. It pattern-matches
on tuples, options, or any algebraic shape, and the compiler warns
when you've missed a case: exhaustiveness checking you can't get
from a chain of ifs. The wildcard _ matches anything, so a catch-all
branch at the end is the Morel spelling of SQL's ELSE.
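A single-axis version of the same idea, sketched here for the northern / southern split from the Problem statement. The function name hemisphere, the output field zone, and the "other" label are hypothetical and introduced only for illustration; the query assumes orders from Setup is in scope.

```sml
(* hemisphere is a hypothetical helper, not part of the recipe. *)
fun hemisphere region =
  case region of
    "north" => "northern"
  | "south" => "southern"
  | _ => "other";  (* catch-all branch, playing the role of SQL's ELSE *)

(* Assumes orders from Setup has been evaluated. *)
from ord in orders
  yield { ord.id, zone = hemisphere ord.region };
```

With the Setup data, orders 4 and 5 (west and east) match neither literal and fall through to the wildcard branch.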
One 0.8 rough edge to avoid: the partial record pattern fun f { quantity, region, ... } = … silently binds nothing and produces
wrong answers at runtime. Until 0.9 lands, take a tuple of the fields
you care about instead — fun describe (region, bulk) = … — and
build that tuple in the yield. The type signature ends up clearer
too.
Variations
Classify on two axes with case and a tuple pattern:
fun describe (region, bulk) =
  case (region, bulk) of
    ("north", true) => "large northern order"
  | ("north", false) => "small northern order"
  | (_, true) => "large order"
  | (_, false) => "small order";

from ord in orders
  yield { ord.id, label = describe (ord.region, ord.quantity >= 20) };
Destructure an option inline. status is string option, so the
match distinguishes fulfilled, cancelled, anything else, and
NONE — four exhaustive branches:
from ord in orders
  yield { ord.id,
          state = case ord.status of
              SOME "fulfilled" => "ok"
            | SOME "cancelled" => "cancelled"
            | SOME other => other
            | NONE => "unknown" };
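If the same option match is needed in more than one query, it can be lifted into a named function. This is only a sketch: stateOf is a hypothetical name, the branches are copied unchanged from the inline version, and orders from Setup is assumed to be in scope.

```sml
(* stateOf is a hypothetical name; it takes a string option and returns a string. *)
fun stateOf status =
  case status of
    SOME "fulfilled" => "ok"
  | SOME "cancelled" => "cancelled"
  | SOME other => other      (* any other status text passes through as is *)
  | NONE => "unknown";       (* missing status *)

(* Assumes orders from Setup has been evaluated. *)
from ord in orders
  yield { ord.id, state = stateOf ord.status };
```

The yield step then reads as a plain projection, and the classification rules live in one place.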
See also
- Recipe 08 (Select and rename columns): where derived columns first show up in yield.
- Recipe 13 (Handle missing values): more on the option type and its patterns.
- Recipe 17 (Define a metric once): extract the classifier into a named function and reuse it.