Office Oxide CLI — Quick Start
office-oxide is a command-line tool for fast, local processing of Office documents. It is built on the same Rust core that powers the library, with no cloud services and no external dependencies.
Install
Cargo (any platform):
cargo install office_oxide_cli
cargo-binstall (pre-built binary):
cargo binstall office_oxide_cli
From source:
git clone https://github.com/yfedoseev/office_oxide
cd office_oxide
cargo install --path crates/office_oxide_cli
The installed binary is office-oxide.
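After installing, it can help to confirm that the shell can actually find the binary. The sketch below assumes only standard Cargo behavior (binaries are placed in `$CARGO_HOME/bin`, which defaults to `~/.cargo/bin`); the function name `check_office_oxide` is hypothetical and not part of the tool.

```shell
# Report where the office-oxide binary lives, or hint at the usual PATH fix.
check_office_oxide() {
  local bin
  if bin=$(command -v office-oxide); then
    echo "office-oxide found at: $bin"
  else
    # cargo install places binaries in $CARGO_HOME/bin (default: ~/.cargo/bin).
    echo "office-oxide not found; check that ${CARGO_HOME:-$HOME/.cargo}/bin is on PATH" >&2
    return 1
  fi
}
```

Call `check_office_oxide` once in a new terminal session; a non-zero return usually means the Cargo bin directory is missing from PATH.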
Quick Start
# Extract plain text
office-oxide text report.docx
# Convert to Markdown
office-oxide markdown data.xlsx -o data.md
# Convert to HTML
office-oxide html slides.pptx -o slides.html
# Dump the format-agnostic IR as JSON
office-oxide ir document.docx -o document.ir.json
# Convert legacy DOC → modern DOCX
office-oxide convert old.doc modern.docx
Run office-oxide --help for the full flag list, or office-oxide <command> --help for any specific command.
Commands
| Command | Description |
|---|---|
| text | Extract plain UTF-8 text |
| markdown | Convert to GitHub-flavored Markdown |
| html | Convert to semantic HTML |
| ir | Dump the format-agnostic IR as JSON |
| convert | Convert between formats (legacy → OOXML, OOXML → OOXML) |
| info | Show format, page/sheet/slide counts, and metadata |
All commands accept any of the six supported formats: .docx, .xlsx, .pptx, .doc, .xls, .ppt.
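Because every command takes the same six formats, a mixed directory can be processed in one loop. The following is a minimal sketch rather than a built-in feature: the function name `markdown_all` and the `<name>.md` output naming are assumptions made for illustration, and only the `markdown` command and `-o` flag shown above are used.

```shell
# Convert every supported document in a directory to Markdown.
# Usage: markdown_all <input-dir> <output-dir>
markdown_all() {
  local src=$1 dst=$2 f lower name
  mkdir -p "$dst" || return 1
  for f in "$src"/*; do
    [ -f "$f" ] || continue
    # Lower-case the path for matching only, so REPORT.DOCX is picked up too.
    lower=$(printf '%s' "$f" | tr '[:upper:]' '[:lower:]')
    case "$lower" in
      *.docx|*.xlsx|*.pptx|*.doc|*.xls|*.ppt)
        name=$(basename "$f")
        # Keep the original extension in the output name so that
        # report.doc and report.docx do not overwrite each other.
        office-oxide markdown "$f" -o "$dst/$name.md" \
          || echo "failed: $f" >&2
        ;;
    esac
  done
}
```

For example, `markdown_all ./inbox ./inbox-md` would write `./inbox-md/report.docx.md` for `./inbox/report.docx` and skip files with other extensions.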
Global options
-o, --output <PATH> Output file (defaults to stdout for text outputs)
-v, --verbose Show timing information
-q, --quiet Suppress non-essential output
--json Wrap output in a JSON envelope
Examples
Extract text from a spreadsheet:
office-oxide text quarterly.xlsx
Migrate a corpus of legacy .doc files in parallel:
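# Create the output directory first; NUL-delimited names survive unusual filenames
mkdir -p modern
find legacy/ -iname '*.doc' -print0 | \
parallel -0 'office-oxide convert {} modern/{/.}.docx'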
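Where GNU parallel is not installed, the same migration can be approximated with `find` and `xargs`. This is a sketch under a few assumptions: the function name `migrate_docs` and the default of 4 jobs are hypothetical, and `-print0`, `-0`, and `-P` are common GNU/BSD extensions rather than strict POSIX.

```shell
# Convert every .doc under <src> to .docx in <dst>, up to <jobs> at a time.
# Usage: migrate_docs <src-dir> <dst-dir> [jobs]
migrate_docs() {
  local src=$1 dst=$2 jobs=${3:-4}
  mkdir -p "$dst" || return 1
  # NUL-delimited names keep paths with spaces or newlines intact.
  find "$src" -type f -iname '*.doc' -print0 |
    xargs -0 -P "$jobs" -I{} sh -c '
      in=$1
      out_dir=$2
      name=$(basename "$in")
      # ${name%.*} drops the final extension: old.doc -> old
      office-oxide convert "$in" "$out_dir/${name%.*}.docx"
    ' sh {} "$dst"
}
```

As with the parallel version, the output directory is flat, so two inputs with the same base name in different subdirectories would write to the same target.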
Convert a deck for an LLM pipeline:
office-oxide markdown deck.pptx -o deck.md
Inspect a file:
office-oxide info mystery.bin
# format: xlsx, sheets: 4, named_ranges: 12, ...
Pipe through jq:
office-oxide ir report.docx | jq '.sections[].title'
Stdin / stdout
text, markdown, html, and ir write to stdout by default — handy for pipelines:
office-oxide text report.docx | grep -i "executive summary"
When --output is given, the result is written to that file instead.
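Since `text` writes to stdout, the grep pipeline extends naturally to a search across many files. The helper below is an illustrative sketch; `grep_docs` is a hypothetical name, and it relies only on the `text` command described here.

```shell
# Print the name of each document whose extracted text contains <pattern>.
# Usage: grep_docs <pattern> <file>...
grep_docs() {
  local pattern=$1 f
  shift
  for f in "$@"; do
    # text writes to stdout, so it can feed grep directly;
    # -q only tests for a match, -i ignores case.
    if office-oxide text "$f" | grep -i -q -- "$pattern"; then
      printf '%s\n' "$f"
    fi
  done
}
```

A call such as `grep_docs "executive summary" reports/*.docx` would list only the files that mention the phrase.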
See also
- Rust crate — the same engine as a library
- MCP server — give AI assistants the same toolkit
- Performance benchmarks