Migrate from python-calamine (and calamine)
calamine is the well-regarded Rust XLSX/XLS reader; python-calamine is its Python binding. Both focus on spreadsheets only.
Office Oxide is 2.8× faster than python-calamine on XLSX (5.0 ms vs 13.9 ms mean across 1,802 files) with the highest pass rate (97.8% vs 96.6%). It also adds full DOCX, PPTX, and legacy DOC/PPT support — formats calamine doesn’t read at all.
When to migrate
Switch if any of these apply:
- You also need .docx/.pptx/.doc/.ppt (calamine is XLSX/XLS only)
- You want a wider feature set: Markdown / HTML output, structured IR, templating via EditableDocument
- Pass rate matters more than the marginal performance edge calamine offers in some scenarios
- You’re on the Python binding and want fewer cross-FFI conversions
Stay on calamine if:
- You only ever read .xlsx and .xls
- You depend on calamine-specific APIs (Reader::with_header_row, worksheet_range_at, etc.)
- You need formula expressions (calamine surfaces them; the Office Oxide IR doesn’t)
Install (Python)
pip uninstall python-calamine
pip install office-oxide
Install (Rust)
# Cargo.toml
[dependencies]
# Replace:
# calamine = "0.30"
office_oxide = "0.1.0"
Side-by-side cheat sheet — Python
Open a workbook
python-calamine
from python_calamine import CalamineWorkbook
wb = CalamineWorkbook.from_path("budget.xlsx")
office_oxide
from office_oxide import Document
with Document.open("budget.xlsx") as doc:
    ...
Iterate sheets
python-calamine
for name in wb.sheet_names:
    sheet = wb.get_sheet_by_name(name)
    for row in sheet.to_python():
        print(row)
office_oxide
with Document.open("budget.xlsx") as doc:
    ir = doc.to_ir()
    for section in ir["sections"]:
        print(f"# {section.get('title')}")
        for el in section["elements"]:
            if el["kind"] == "Table":
                for row in el["rows"]:
                    print(row)
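If you relied on calamine's header-row handling, a similar effect can be had by post-processing the rows of an IR table. The sketch below is only an illustration: `table_to_records` is a hypothetical helper, not part of office_oxide, and it assumes each IR row is a plain list of cell strings. It runs on hand-built sample data rather than a real workbook.

```python
from typing import Any


def table_to_records(rows: list[list[Any]]) -> list[dict[str, Any]]:
    """Treat the first row as column headers and map each later row to a dict."""
    if not rows:
        return []
    header = [str(h) for h in rows[0]]
    records = []
    for row in rows[1:]:
        # Pad short rows with empty strings so every record has every header key.
        # Cells beyond the header width are dropped by zip().
        padded = list(row) + [""] * (len(header) - len(row))
        records.append(dict(zip(header, padded)))
    return records


# Hand-built stand-in for el["rows"]; real data would come from doc.to_ir().
sample_rows = [["Item", "Cost"], ["Rent", "1200"], ["Power"]]
records = table_to_records(sample_rows)
print(records)
```

In a real migration you would pass `el["rows"]` from the loop above instead of `sample_rows`.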
Read a single sheet to rows
python-calamine
sheet = wb.get_sheet_by_name("Q4")
rows = sheet.to_python()
office_oxide
with Document.open("budget.xlsx") as doc:
    table = next(
        el for section in doc.to_ir()["sections"]
        if section.get("title") == "Q4"
        for el in section["elements"] if el["kind"] == "Table"
    )
    rows = table["rows"]
For a more direct path:
with Document.open("budget.xlsx") as doc:
    sheet = doc.as_xlsx().sheet("Q4")
    rows = sheet.rows()  # list[list[str]]
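Note that the IR-based lookup with `next(...)` raises `StopIteration` when no section is titled "Q4". A more defensive variant might look like the sketch below. `find_table_rows` is a hypothetical helper written against the IR shape shown in this guide (`sections`, `title`, `elements`, `kind`, `rows`); the `"Paragraph"` kind in the sample data is made up for illustration.

```python
from typing import Any, Optional


def find_table_rows(ir: dict[str, Any], title: str) -> Optional[list]:
    """Return the rows of the first table in the section named `title`, or None."""
    for section in ir.get("sections", []):
        if section.get("title") != title:
            continue
        for el in section.get("elements", []):
            if el.get("kind") == "Table":
                return el["rows"]
    return None


# Hand-built stand-in for doc.to_ir(); the "Paragraph" kind is an assumption.
sample_ir = {
    "sections": [
        {"title": "Q3", "elements": [{"kind": "Table", "rows": [["a", "1"]]}]},
        {
            "title": "Q4",
            "elements": [
                {"kind": "Paragraph"},
                {"kind": "Table", "rows": [["b", "2"]]},
            ],
        },
    ]
}

print(find_table_rows(sample_ir, "Q4"))
print(find_table_rows(sample_ir, "Q1"))
```

Returning `None` lets the caller decide whether a missing sheet is an error or simply something to skip.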
Sheet names
python-calamine
print(wb.sheet_names)
office_oxide
with Document.open("budget.xlsx") as doc:
    print([s.name() for s in doc.as_xlsx().sheets()])
Side-by-side cheat sheet — Rust
Open and iterate
calamine
use calamine::{open_workbook, Xlsx, Reader};
let mut wb: Xlsx<_> = open_workbook("budget.xlsx")?;
for sheet_name in wb.sheet_names() {
    if let Ok(range) = wb.worksheet_range(&sheet_name) {
        for row in range.rows() {
            println!("{row:?}");
        }
    }
}
office_oxide
use office_oxide::Document;
let doc = Document::open("budget.xlsx")?;
if let Some(xlsx) = doc.as_xlsx() {
    for sheet in xlsx.sheets() {
        for cell in sheet.cells() {
            println!("{}: {:?}", cell.address(), cell.value());
        }
    }
}
Format-agnostic IR (no calamine equivalent)
let doc = Document::open("budget.xlsx")?;
let ir = doc.to_ir();
serde_json::to_writer(std::io::stdout(), &ir)?;
This is the same shape you’d get from a .docx or .pptx — useful when downstream consumers should not care about the source format.
XLSX writes
calamine is read-only. Office Oxide writes XLSX cells via EditableDocument:
from office_oxide import EditableDocument
with EditableDocument.open("budget.xlsx") as ed:
    ed.set_cell(0, "B5", 42_000)
    ed.save("budget.xlsx")
For full XLSX construction, drop into xlsx::create::XlsxBuilder or use umya-spreadsheet / rust_xlsxwriter.
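The `set_cell` call above takes an A1-style address such as "B5", whereas calamine code usually works with zero-based (row, column) positions. If your existing code tracks positions that way, a small conversion helper may ease the transition. `to_a1` below is a hypothetical utility, not part of office_oxide, and uses only the standard library.

```python
def to_a1(row: int, col: int) -> str:
    """Convert zero-based (row, col) indices to an A1-style address."""
    if row < 0 or col < 0:
        raise ValueError("row and col must be non-negative")
    letters = ""
    n = col + 1  # column letters form a bijective base-26 numbering starting at 1
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return f"{letters}{row + 1}"


print(to_a1(4, 1))
print(to_a1(0, 26))
```

With this in place, a position such as `(4, 1)` can be passed to the write call as `to_a1(4, 1)`.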
Performance
| Library | XLSX Mean | p99 | Pass Rate |
|---|---|---|---|
| office_oxide | 5.0 ms | 40 ms | 97.8% |
| python-calamine | 13.9 ms | 183 ms | 96.6% |
| openpyxl | 94.5 ms | 698 ms | 96.2% |

| Library | XLS Mean | p99 | Pass Rate |
|---|---|---|---|
| office_oxide | 2.8 ms | 75 ms | 99.2% |
| python-calamine | 9.0 ms | 96 ms | 90.7% |
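The speedup ratios follow directly from the mean latencies in the tables above. As a quick arithmetic check, the snippet below divides each library's mean by the office_oxide mean; the numbers are copied from the tables, and `speedup` is just an illustrative helper.

```python
# Mean latencies in milliseconds, copied from the tables above.
xlsx_mean_ms = {"office_oxide": 5.0, "python-calamine": 13.9, "openpyxl": 94.5}
xls_mean_ms = {"office_oxide": 2.8, "python-calamine": 9.0}


def speedup(means: dict[str, float], baseline: str = "office_oxide") -> dict[str, float]:
    """Return how many times slower each library is than the baseline, to 1 decimal."""
    base = means[baseline]
    return {
        name: round(ms / base, 1)
        for name, ms in means.items()
        if name != baseline
    }


xlsx_speedup = speedup(xlsx_mean_ms)
xls_speedup = speedup(xls_mean_ms)
print("XLSX:", xlsx_speedup)
print("XLS:", xls_speedup)
```

On these figures the XLSX gap to python-calamine is about 2.8×, and the XLS gap is about 3.2×.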
What’s different
calamine returns typed Data enums per cell (Int, Float, String, Bool, DateTime, Empty, Error). The Office Oxide IR collapses to strings; for typed cell access, use the format-specific accessor:
with Document.open("budget.xlsx") as doc:
    for sheet in doc.as_xlsx().sheets():
        for cell in sheet.cells():
            print(cell.value(), cell.value_type())  # value_type: "string" | "number" | "boolean" | "empty"
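If you prefer to stay on the format-agnostic IR and still want typed values, one option is to coerce the strings yourself. The sketch below is a best-effort heuristic, not office_oxide behaviour: `coerce_cell` is a hypothetical helper, and it assumes booleans appear as "TRUE"/"FALSE" and numbers in a form Python's `int()` or `float()` can parse. It cannot recover dates or distinguish a numeric-looking text cell from a real number, so the typed accessor above remains the more reliable route.

```python
import math
from typing import Union

Cell = Union[None, bool, int, float, str]


def coerce_cell(text: str) -> Cell:
    """Best-effort conversion of an IR cell string to a Python value."""
    stripped = text.strip()
    if stripped == "":
        return None  # treat blank cells as empty
    upper = stripped.upper()
    if upper in ("TRUE", "FALSE"):  # assumed boolean spelling
        return upper == "TRUE"
    try:
        return int(stripped)
    except ValueError:
        pass
    try:
        number = float(stripped)
    except ValueError:
        return text  # not numeric: keep the original string
    # float() also accepts "nan" and "inf"; keep those as text.
    return number if math.isfinite(number) else text


for raw in ["42", "3.5", "TRUE", "", "Q4 total"]:
    print(repr(raw), "->", repr(coerce_cell(raw)))
```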
See also
- Migrate from openpyxl: for the openpyxl audience
- Migrate from xlrd: for legacy .xls users
- Performance benchmarks