author: Kyle Javier [kj_sh604], 2026-02-28 15:02:16 -0500
committer: GitHub, 2026-02-28 15:02:16 -0500
commit: 8f9756189c777074b88de39c2de1e2f7153352c2
tree: 9810f206bc9b97f1429bc7f207caf80f6ead7986 /README.md
parent: fafc3e29832779b5ccbea8fd21dc9fd5af67de38
parent: 47a9736a1dfa8bfd4c5e5edd111e6ad28536066f
[merge] pull request #2 from kj-sh604/feat/use-dash-st-rewrite
# feat: use `feat/same-template-concat` branch `-st` implementation as main `kjandoc` binary

this pull request updates the project to improve the quality and fidelity of merged `.pptx` files, and simplifies dependencies. the most significant changes are a rewrite of the merging approach to preserve editability and formatting, and the removal of several python dependencies that are no longer needed (as seen in the `feat/same-template-concat` that I still have up)

## enhancements to merging functionality:

* the merging process now operates directly on the ooxml/zip structure of `.pptx` files, preserving full editability and achieving near-complete fidelity to the original formatting. slide masters, layouts, themes, notes, and embedded media are all copied, and duplicate media files are deduplicated.
* a final libreoffice normalization step is used to clean up structural issues.

## dependency updates:

* removed unnecessary dependencies from `src/requirements.txt`, including `pillow`, `python-pptx`, `typing_extensions`, and `xlsxwriter`, leaving only `lxml` as a required python package.
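
the media deduplication mentioned above could look roughly like the sketch below. this is not the `kjandoc` implementation, just one plausible way to do it with the standard library: hash every `ppt/media/*` entry and keep one copy per distinct content. the function name `collect_unique_media` and the `mediaN` naming scheme are assumptions made for illustration.

```python
import hashlib
import posixpath
import zipfile


def collect_unique_media(sources):
    """Assign one canonical output name per distinct media payload.

    `sources` is a list of paths or file-like objects, each a .pptx (ZIP) file.
    Returns (mapping, blobs):
      mapping[(source_index, original_name)] -> canonical name in the output
      blobs[canonical_name] -> bytes that need to be written only once
    """
    by_digest = {}  # sha256 hex digest -> canonical name
    mapping = {}
    blobs = {}
    for index, source in enumerate(sources):
        with zipfile.ZipFile(source) as archive:
            for name in sorted(archive.namelist()):
                # Only media parts are considered; skip directory entries.
                if not name.startswith("ppt/media/") or name.endswith("/"):
                    continue
                data = archive.read(name)
                digest = hashlib.sha256(data).hexdigest()
                if digest not in by_digest:
                    # Hypothetical naming scheme: keep the original extension
                    # and number the files in the order they are first seen.
                    ext = posixpath.splitext(name)[1]
                    canonical = f"ppt/media/media{len(by_digest) + 1}{ext}"
                    by_digest[digest] = canonical
                    blobs[canonical] = data
                mapping[(index, name)] = by_digest[digest]
    return mapping, blobs
```

a real merge would still have to rewrite each slide's relationship targets to point at the canonical names; the sketch only decides which files are duplicates.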
Diffstat (limited to 'README.md')
-rw-r--r-- README.md | 23
1 file changed, 12 insertions, 11 deletions
diff --git a/README.md b/README.md
index f214f6e..bb8b4a7 100644
--- a/README.md
+++ b/README.md
@@ -10,31 +10,32 @@ https://github.com/user-attachments/assets/c7fe58c1-ff76-41bf-977b-870247a6a3e2
## what it does
- merges multiple .pptx files into one
-- preserves visual formatting by rendering slides and rebuilding a new deck
+- preserves full editability, with 99% fidelity to the original formatting (some minor quirks may occur)
+- copies slide masters, layouts, themes, notes, and embedded media
- `pandoc`-style usage: `kjandoc input1.pptx input2.pptx -o combined.pptx`
## why this exists
-`pandoc` is great, but it can't concatenate `.pptx` files.
+`pandoc` is great, but it can't concatenate `.pptx` files.
-this uses a headless libreoffice + pdf -> png rendering to get a merge with most formatting preserved.
+this works directly at the OOXML/ZIP level: it reads each `.pptx` as a ZIP archive, rewires all internal XML relationships, and writes a new, almost fully Microsoft-compliant `.pptx`.
-the tradeoff is the output slides are images (not editable shapes).
+a final LibreOffice normalization pass cleans up any lingering structural quirks to prevent PowerPoint repair prompts (not guaranteed though).
## usage
```bash
# pandoc-style usage
./kjandoc input1.pptx input2.pptx -o combined.pptx
-# tweak quality
-./kjandoc input1.pptx input2.pptx -o combined.pptx --dpi 150
+# merge more than two
+./kjandoc a.pptx b.pptx c.pptx -o combined.pptx
```
## deps
- python3
-- libreoffice
-- poppler (pdftoppm)
-- python deps in requirements.txt
+- libreoffice (for the normalization pass)
+- python deps in requirements.txt (`lxml`)
## notes
-- output size is larger (images)
-- visuals stay intact for the most part
+- output slides are fully editable
+- masters and layouts from all source files are carried over
+- duplicate media files are deduplicated automatically
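
the LibreOffice normalization pass described in the updated README could be driven from python along these lines. the diff does not show how `kjandoc` actually invokes LibreOffice, so treat this as a sketch under assumptions: it re-saves the merged file through `soffice --headless --convert-to pptx`, and the helper names `build_normalize_command` and `normalize` are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_normalize_command(merged, out_dir, soffice="soffice"):
    """Build the headless LibreOffice command that re-saves a .pptx."""
    return [
        soffice,
        "--headless",
        "--convert-to", "pptx",
        "--outdir", str(out_dir),
        str(merged),
    ]


def normalize(merged, out_dir, soffice="soffice"):
    """Re-save `merged` into `out_dir` and return the expected output path.

    Assumption: `out_dir` differs from the directory holding `merged`,
    so the input file is not overwritten by the conversion.
    """
    if shutil.which(soffice) is None:
        raise FileNotFoundError(f"{soffice} not found on PATH")
    subprocess.run(build_normalize_command(merged, out_dir, soffice), check=True)
    return Path(out_dir) / (Path(merged).stem + ".pptx")
```

keeping the command construction separate from the `subprocess.run` call makes it possible to check the arguments without LibreOffice installed.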