Python Office Automation with PPT — How to use python-pptx part1
1. Introduction
Python is a strong choice for office automation. This article starts a new series on another common target of office automation: PowerPoint (PPT) presentations.
2. Preparation
A widely used library for working with PPT files in Python is python-pptx. Before starting, install it in a virtual environment:
python
# Install dependency
pip3 install python-pptx
3. PPT Structure
First, we need to understand the page structure of a PPT document:
- A PPT document corresponds to a Presentation object
- A Presentation contains multiple Slide objects, each representing a slide
- The content of each slide is composed of various Shapes
Second, the content of a slide is made up of shapes of various kinds: text boxes, images, placeholders, tables, regular shapes, and so on. In the source code, these kinds are all defined in the MSO_SHAPE_TYPE enumeration.
Finally, we need to understand layout templates in PPT. The Presentation object's slide_layouts property gives access to the slide layouts of the template in use; the default template that ships with python-pptx has 11 of them:
python
# Use Presentation to get one of the PPT's 11 built-in layout styles
# Layout index starts from 0
slide_layout = presentation.slide_layouts[slide_style_index]
In index order, they are:
- Title Slide
- Title and Content
- Section Header
- Two Content
- Comparison
- Title Only
- Blank
- Content with Caption
- Picture with Caption
- Title and Vertical Text
- Vertical Title and Text
You can also view the corresponding layouts in Microsoft PowerPoint or WPS.
Besides the built-in layouts, you can also design your own masters with placeholders to meet the needs of a specific scenario.
4. Slide Management
A PPT file consists of one or multiple slides.
4.1 Adding a Slide
Simply follow these 3 steps:
- Instantiate a Presentation object
- Get a layout style from the built-in templates
- Add a slide using the layout style
python
from pptx import Presentation


def add_slide(presentation, slide_style_index):
    """
    Add a slide to the PPT document using a built-in layout
    :param presentation: Document object
    :param slide_style_index: Layout index
    :return: The newly added slide
    """
    # PPT layout styles
    # 11 built-in layout styles
    # 0: Title Slide
    # 1: Title and Content
    # 2: Section Header
    # 3: Two Content
    # 4: Comparison
    # 5: Title Only
    # 6: Blank
    # 7: Content with Caption
    # 8: Picture with Caption
    # 9: Title and Vertical Text
    # 10: Vertical Title and Text
    slide_layout = presentation.slide_layouts[slide_style_index]
    # Add a slide using the layout style
    slide = presentation.slides.add_slide(slide_layout)
    return slide


# 1.1 Add slides
presentation = Presentation()
slide1 = add_slide(presentation, 0)
slide2 = add_slide(presentation, 1)
slide3 = add_slide(presentation, 2)
slide4 = add_slide(presentation, 3)
4.2 Getting Existing Slides
The Presentation object's slides property returns a list-like collection of all slide objects in the current PPT document; it supports len() and access by index.
python
def get_slides(presentation):
    """
    Get all slides
    :param presentation: Document object
    :return: The slides collection and the number of slides
    """
    # All slides
    slides = presentation.slides
    # Number of slides
    slide_num = len(slides)
    return slides, slide_num


def get_slide(presentation, slide_index):
    """
    Get a specific slide by index
    :param presentation: Document object
    :param slide_index: Page index, starting from 0
    :return: The slide at that index
    """
    slides, slide_num = get_slides(presentation=presentation)
    return slides[slide_index]


# 1.2.1 Get all slides
slides, slide_num = get_slides(presentation)
print('Existing slides:', slides)
print('Number of slides:', slide_num)

# 1.2.2 Get a specific slide
slide = get_slide(presentation, 1)
print(slide.shapes)
4.3 Deleting a Slide
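This is also simple, but note that python-pptx does not document a public method for deleting a slide. The workaround below looks up the slide's entry by index in the private _sldIdLst element and removes it. Because it relies on a private attribute, it may break in future versions of the library: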
python
def del_slide(presentation, slide_index=0):
    """
    Delete a specific slide
    :param presentation: Document object
    :param slide_index: Index, starting from 0
    :return:
    """
    # List of all slide entries (private attribute)
    slides = list(presentation.slides._sldIdLst)
    # Delete a specific slide by index
    presentation.slides._sldIdLst.remove(slides[slide_index])


# 1.3 Delete a specific slide in PPT document by index
# Example: Delete the 4th slide
del_slide(presentation, 3)
5. Text and Paragraphs
First, we need a Slide object to work on, which can be an existing slide or a newly created one.
Then, use the slide's shapes property to get the collection of all shapes on that slide.
Finally, call the following method on the shapes collection to add a text box; it returns a text box object:
python
add_textbox(left, top, width, height)
Function parameters:
- left: Left margin
- top: Top margin
- width: Text box width
- height: Text box height
This introduces another concept: the text frame (TextFrame).
PS: The text frame makes it easy to add paragraphs and set styles inside a text box. It is obtained through the text box object's text_frame property.
python
from pptx.util import Cm, Inches


def insert_textbox(slide, left, top, width, height, unit=Inches):
    """
    Add a text box to a slide
    :param unit: Length unit, Inches by default
    :param slide: Slide object
    :param left: Left margin
    :param top: Top margin
    :param width: Width
    :param height: Height
    :return: The text box and its text frame
    """
    # Text box
    textbox = slide.shapes.add_textbox(left=unit(left),
                                       top=unit(top),
                                       width=unit(width),
                                       height=unit(height))
    # Text frame of the text box
    tf = textbox.text_frame
    return textbox, tf
For convenience, I've wrapped the action of inserting a text box into a slide in the function above. The length unit defaults to Inches, but another unit such as centimeters can be passed instead (for example unit=Cm).
Next, let’s look at common operations for text boxes and paragraphs:
5.1 Insert Text Box and Set Default Paragraph Content
When a text box is inserted, its text frame already contains one default paragraph, whose content can be set directly.
python
# 2. Insert a text box into the slide; returns the text box object and its text frame
textbox, tf = insert_textbox(slide, 8, 2, 10, 4, unit=Cm)

# 2.1 Default paragraph
paragraph_default = tf.paragraphs[0]
paragraph_default.text = "Set default paragraph content"
5.2 Add New Paragraph in Text Box
Examining the source code shows that the text box's text_frame property returns a TextFrame object, so we can call the TextFrame class's add_paragraph() method to add a new paragraph.
python
# 2.2 Add a new paragraph
paragraph_new = tf.add_paragraph()

# 2.3 Set paragraph content
paragraph_new.text = "Welcome to follow the official account: AirPython\nWeekly sharing of Python original technical content!"
5.3 Set Paragraph and Text Styles
As with Word documents, python-pptx can also set paragraph styles in PPT documents.
Alignment: Alignment applies to a whole paragraph; just set the paragraph object's alignment property.
python
from pptx.enum.text import PP_ALIGN


def set_parg_font_style(paragraph, font_name=None, font_color=None, font_size=-1,
                        font_bold=False, font_italic=False,
                        paragraph_alignment=PP_ALIGN.CENTER):
    """
    Set text style in paragraph, including: font name, color, size, bold, italic
    :param paragraph_alignment: Paragraph alignment
    :param paragraph: Paragraph object
    :param font_name: Font name
    :param font_color: Font color as an (R, G, B) sequence
    :param font_size: Font size in points, -1 to leave unchanged
    :param font_bold: Whether the text is bold
    :param font_italic: Whether the text is italic
    :return: The paragraph's font object
    """
    # Alignment
    # Note: Alignment is for paragraphs
    paragraph.alignment = paragraph_alignment
    # Get font object in paragraph
    font = paragraph.font
    # Set font style (set_font_style is defined in the next block)
    set_font_style(font, font_name, font_color, font_size, font_bold, font_italic)
    return font
Paragraph Text Attributes: Use the paragraph object’s font property to get the font object, then set font name, size, color, italic, bold.
python
from pptx.dml.color import RGBColor
from pptx.util import Pt


def set_font_style(font, font_name=None, font_color=None, font_size=-1, font_bold=False, font_italic=False):
    """
    Set font style
    :param font: Font object
    :param font_name: Font name
    :param font_color: Font color as an (R, G, B) sequence
    :param font_size: Font size in points, -1 to leave unchanged
    :param font_bold: Whether the text is bold
    :param font_italic: Whether the text is italic
    :return:
    """
    # Font name
    if font_name:
        font.name = font_name
    # Font color
    if font_color and len(font_color) == 3:
        font.color.rgb = RGBColor(font_color[0], font_color[1], font_color[2])
    # Font size
    if font_size != -1:
        font.size = Pt(font_size)
    # Bold, default not bold
    font.bold = font_bold
    # Italic, default not italic
    font.italic = font_italic
5.4 Set Text Box Background Color
Setting text box background color only requires 2 steps:
- Set shape fill type to solid
- Set text box background color
python
from pptx.dml.color import RGBColor


def set_widget_bg(widget, bg_rgb_color=None):
    """
    Set background color for [textbox/cell/shape]
    :param widget: Textbox, cell, shape
    :param bg_rgb_color: Background color value as an (R, G, B) sequence
    :return:
    """
    if bg_rgb_color and len(bg_rgb_color) == 3:
        # 1. Set shape fill type to solid
        widget.fill.solid()
        # 2. Set text box background color
        widget.fill.fore_color.rgb = RGBColor(bg_rgb_color[0], bg_rgb_color[1], bg_rgb_color[2])


# 4. Set text box background color
set_widget_bg(textbox, [0, 255, 0])
Note: This method also applies to setting background colors for table cells and regular shapes.
5.5 Text Box Automatic Line Wrapping
When a text box contains very long text that can’t display completely in a single line, we just need to set the text shape’s word_wrap value to True to enable automatic line wrapping.
python
# 5. Enable automatic line wrapping for the text box
tf.word_wrap = True