Scaling and Optimizing Data Pipelines with Polars
Liam Brannigan
Data Scientist & Polars Contributor
Chicago Finance Department
Generated: 2026-02-01
WARD;TYPE;STATUS;REQUEST_COST
11;Pothole in Street Complaint;Completed;125
42;Street Light Out;Open;88
27;Alley Light Out;Completed;61
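import polars as pl

# No separator given: Polars defaults to "," so each row lands in a single column.
vendor_requests = pl.read_csv(
    "ward_service_requests.csv",
    skip_rows=2,
)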
vendor_requests = pl.read_csv(
    "ward_service_requests.csv",
    skip_rows=2,
    separator=";",
)
vendor_requests.head(3)
shape: (3, 4)
| WARD | TYPE | STATUS | REQUEST_COST |
| --- | --- | --- | --- |
| i64 | str | str | i64 |
| 11 | Pothole in Street Complaint | Completed | 125 |
| 42 | Street Light Out | Open | 88 |
| 27 | Alley Light Out | Completed | 61 |
WARD;TYPE;STATUS;REQUEST_COST
11;Pothole in Street Complaint;Completed;125
42;Street Light Out;Open;88
27;Alley Light Out;Completed;61
...
3;Rodent Baiting Service Request;Completed;61.5
vendor_requests = pl.read_csv(
    "ward_service_requests.csv",
    separator=";",
    skip_rows=2,
    infer_schema_length=200,
)
vendor_requests.schema
Schema({'WARD': Int64, 'TYPE': String, 'STATUS': String, 'REQUEST_COST': Float64})
vendor_requests = pl.read_csv(
    "ward_service_requests.csv",
    separator=";",
    skip_rows=2,
    schema={
        "WARD": pl.Int64,
        "TYPE": pl.String,
        "STATUS": pl.String,
        "REQUEST_COST": pl.Float64,
    },
)
vendor_requests = pl.read_csv(
    "ward_service_requests.csv",
    separator=";",
    skip_rows=2,
    schema_overrides={"REQUEST_COST": pl.Float64},
)
vendor_requests.schema
Schema({'WARD': Int64, 'TYPE': String, 'STATUS': String, 'REQUEST_COST': Float64})
WARD;TYPE;STATUS;REQUEST_COST
11;Pothole in Street Complaint;Completed;125.0
unknown;Street Light Out;Open;88.5
27;Alley Light Out;Completed;61.0
pl.read_csv(
    "ward_service_requests.csv",
    separator=";",
    skip_rows=2,
    infer_schema_length=200,
    ignore_errors=True,
)
pl.read_csv(
    "ward_service_requests.csv",
    separator=";",
    skip_rows=2,
    infer_schema_length=200,
    null_values={"WARD": "unknown"},
)
ignore_errors