pptx
PPTX creation, editing, and analysis
Overview
A user may ask you to create, edit, or analyze the contents of a .pptx file. A .pptx file is essentially a ZIP archive containing XML files and other resources that you can read or edit.
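Because a .pptx file is a ZIP archive, its parts can be listed with standard tooling. The following is a minimal sketch using Python's standard zipfile module; the helper name list_parts and its prefix parameter are illustrative, not part of any PPTX library.

```python
import zipfile


def list_parts(pptx, prefix=""):
    """Return the sorted names of archive members that start with `prefix`.

    `pptx` may be a path or a binary file-like object (hypothetical helper).
    """
    with zipfile.ZipFile(pptx) as z:
        return sorted(n for n in z.namelist() if n.startswith(prefix))
```

For example, list_parts("path-to-file.pptx", "ppt/slides/") would return only the slide XML parts and their relationship files.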
Reading and analyzing content
Text extraction
If you just need to read the text contents of a presentation, you should convert the document to markdown:
# Convert document to markdown
python -m markitdown path-to-file.pptx
Raw XML access
You need raw XML access for: comments, speaker notes, slide layouts, animations, design elements, and complex formatting.
Key file structures
- ppt/presentation.xml: Main presentation metadata and slide references
- ppt/slides/slide{N}.xml: Individual slide contents (slide1.xml, slide2.xml, etc.)
- ppt/notesSlides/notesSlide{N}.xml: Speaker notes for each slide
- ppt/comments/modernComment_*.xml: Comments for specific slides
- ppt/slideLayouts/: Layout templates for slides
- ppt/slideMasters/: Master slide templates
- ppt/theme/: Theme and styling information
- ppt/media/: Images and other media files
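The slide parts listed above can be read directly with Python's standard library. The sketch below collects the text runs (the DrawingML a:t elements) from each ppt/slides/slide{N}.xml part. The helper name slide_texts is hypothetical. It keys results by the number in the file name, which is an assumption of convenience: the on-screen slide order is defined in ppt/presentation.xml and may differ.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# DrawingML namespace used by text runs (<a:t>) in slide XML.
A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"
SLIDE_RE = re.compile(r"ppt/slides/slide(\d+)\.xml")


def slide_texts(pptx):
    """Map slide file number -> list of text run strings (hypothetical helper)."""
    with zipfile.ZipFile(pptx) as z:
        # Sort numerically so slide10 comes after slide2.
        slides = sorted(
            (int(m.group(1)), m.group(0))
            for m in map(SLIDE_RE.fullmatch, z.namelist())
            if m
        )
        return {
            num: [t.text or "" for t in ET.fromstring(z.read(name)).iter(f"{{{A_NS}}}t")]
            for num, name in slides
        }
```

The same pattern should work for ppt/notesSlides/notesSlide{N}.xml by changing the regular expression, since speaker notes also store their text in a:t elements.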
Creating a new PowerPoint presentation
Design Principles
CRITICAL: Before creating any presentation, analyze the content and choose appropriate design elements:
- Consider the subject matter: What is this presentation about? What tone, industry, or mood does it suggest?
- Check for branding: If the user mentions a company/organization, consider their brand colors and identity
- Match palette to content: Select colors that reflect the subject
- State your approach: Explain your design choices before writing code
Requirements:
- Use web-safe fonts only: Arial, Helvetica, Times New Roman, Georgia, Courier New, Verdana, Tahoma, Trebuchet MS, Impact
- Create clear visual hierarchy through size, weight, and color
- Ensure readability: strong contrast, appropriately sized text, clean alignment
- Be consistent: repeat patterns, spacing, and visual language across slides
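One way to make "strong contrast" checkable is to compute a contrast ratio for each text/background color pair. The sketch below assumes the WCAG 2.x relative-luminance formula as the metric; the source does not prescribe a specific one, and the function names are illustrative.

```python
def _linear(channel_8bit):
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def luminance(hex_color):
    """Relative luminance of a color given as 'RRGGBB' or '#RRGGBB'."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(fg, bg):
    """Contrast ratio between two colors, from 1 (identical) to 21 (black on white)."""
    hi, lo = sorted((luminance(fg), luminance(bg)), reverse=True)
    return (hi + 0.05) / (lo + 0.05)
```

Under WCAG AA, a ratio of at least 4.5 is the threshold for normal text and 3 for large text; treating those thresholds as the bar for slides is a judgment call, not a requirement stated here.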
Converting Slides to Images
To visually analyze PowerPoint slides, convert them to images using a two-step process:
- Convert PPTX to PDF:
  soffice --headless --convert-to pdf template.pptx
- Convert PDF pages to JPEG images:
  pdftoppm -jpeg -r 150 template.pdf slide
  This creates files like slide-1.jpg, slide-2.jpg, etc.
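The two conversion steps above can be chained from a script. This is a minimal sketch that assumes soffice and pdftoppm are on PATH; the function names are hypothetical. It adds LibreOffice's --outdir option, which is not in the commands above, so that the PDF lands next to the source file.

```python
import subprocess
from pathlib import Path


def conversion_commands(pptx, prefix="slide", dpi=150):
    """Build the two argv lists: PPTX -> PDF, then PDF -> JPEG pages."""
    src = Path(pptx)
    pdf = src.with_suffix(".pdf")
    return [
        # --outdir is an addition (assumption) so the PDF is written beside the input.
        ["soffice", "--headless", "--convert-to", "pdf", "--outdir", str(src.parent), str(src)],
        ["pdftoppm", "-jpeg", "-r", str(dpi), str(pdf), str(src.parent / prefix)],
    ]


def render_slides(pptx, prefix="slide", dpi=150):
    """Run both steps and return the generated image paths, sorted by name."""
    for cmd in conversion_commands(pptx, prefix, dpi):
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return sorted(Path(pptx).parent.glob(f"{prefix}-*.jpg"))
```

Separating command construction from execution keeps the argument lists easy to inspect before anything is run.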
Code Style Guidelines
IMPORTANT: When generating code for PPTX operations:
- Write concise code
- Avoid verbose variable names and redundant operations
- Avoid unnecessary print statements
Dependencies
Required dependencies (should already be installed):
- markitdown: pip install "markitdown[pptx]" (for text extraction from presentations)
- pptxgenjs: npm install -g pptxgenjs (for creating presentations)
- LibreOffice: sudo apt-get install libreoffice (for PDF conversion)
- Poppler: sudo apt-get install poppler-utils (for pdftoppm to convert PDF to images)
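Since the dependencies should already be installed, a quick check before running a conversion can save a confusing failure later. The following is a minimal sketch using the standard shutil.which; the helper name missing_tools is hypothetical, and it only checks command-line tools on PATH, not Python or npm packages.

```python
import shutil


def missing_tools(tools=("soffice", "pdftoppm")):
    """Return the subset of `tools` that cannot be found on PATH."""
    return [t for t in tools if shutil.which(t) is None]
```

An empty list means both command-line tools used for the image conversion were found.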