summaryrefslogtreecommitdiff
path: root/README.md
blob: d2e7375b3645f700704186dddbc9065729e93595 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
# kjandoc

> i needed something to combine multiple `.pptx` files, i couldn't believe `pandoc` can't do that, so here's a creative attempt using OOXML manipulation to do it.

## what it does
- merges multiple .pptx files into one
- preserves full editability, with 99% fidelity to the original formatting (some minor quirks may occur)
- copies slide masters, layouts, themes, notes, and embedded media
- `pandoc`-style usage: `kjandoc input1.pptx input2.pptx -o combined.pptx`

## why this exists
`pandoc` is great, but it can't concatenate `.pptx` files.

this works directly at the OOXML/ZIP level: it reads each `.pptx` as a ZIP archive, rewires all internal XML relationships, and writes a new near full Microsoft-compliant `.pptx`.

a final LibreOffice normalization pass cleans up any lingering structural quirks to prevent PowerPoint repair prompts (not guaranteed though).

## usage
```bash
# pandoc-style usage
./kjandoc input1.pptx input2.pptx -o combined.pptx

# merge more than two
./kjandoc a.pptx b.pptx c.pptx -o combined.pptx
```

## deps
- python3
- libreoffice (for the normalization pass)
- python deps in requirements.txt (`lxml`)

## notes
- output slides are fully editable
- masters and layouts from all source files are carried over
- duplicate media files are deduplicated automatically