Python Office Automation with Excel — How to use xlsxwriter

1. Introduction

Previously, we discussed the two most common approaches for handling Excel files in Python: xlrd/xlwt and openpyxl.

To summarize:

  • The xlrd/xlwt combination allows reading data with xlrd and writing data with xlwt, but has the drawback of not supporting the .xlsx format.
  • openpyxl supports both reading and writing Excel documents, but does not support the older .xls format.

This article continues the discussion by exploring other methods for manipulating Excel documents with Python.

2. xlsxwriter

xlsxwriter is primarily used for writing data and charts into Excel files. It is optimized to write data quickly while using relatively little memory.

Its main limitations are:

  • It cannot read or modify existing Excel files.
  • If you need to read or modify existing files, you must use it in combination with other libraries (e.g., xlrd).

First, install the xlsxwriter package:

bash

# Install the dependency package
pip3 install xlsxwriter

xlsxwriter provides the Workbook(filename) method to create a new workbook object.

Use the workbook object’s add_worksheet(sheet_name) function to create sheets within the workbook.

python

import xlsxwriter

def create_workbook_and_worksheet(filename, worksheet_names):
    """
    Create a workbook and multiple worksheets
    :param filename: Output filename
    :param worksheet_names: List of sheet names
    :return: Tuple (workbook object, list of worksheet objects)
    """
    wb = xlsxwriter.Workbook(filename)
    sheets = []

    # Add worksheets
    for worksheet_name in worksheet_names:
        sheets.append(wb.add_worksheet(worksheet_name))

    return wb, sheets

You can then write data to cells in any sheet.

To customize cell styles (e.g., font size, font family, color, background, bold), use the workbook object’s add_format() method to create a format object.

python

def create_format_styles(wb, format_styles_dict):
    """
    Create a format style for cells, including font size, font, color, background, bold, etc.
    :param wb: Workbook object
    :param format_styles_dict: Dictionary of style properties
    :return: Format object
    """
    return wb.add_format(format_styles_dict)

# Example: Define a style dictionary for a title cell
self.title_style = {'bold': True, 'bg_color': '#B0C4DE', 'font_size': 10, 'font_name': 'Microsoft Yahei'}

# Create the title font style format object
title_font_style = create_format_styles(self.wb, self.title_style)

The worksheet object’s write() function is used to write data to a cell. Parameters include row index, column index, value, and format style.

Important Note: By default, xlsxwriter uses 0-based indexing for both rows and columns (i.e., 0 represents the first row/column).

Here’s how to write data while applying a cell style:

python

def write_to_cell(sheet, row_index, column_index, value, format_styles=None):
    """
    Write data to a specific cell
    :param sheet: Worksheet object
    :param row_index: Row index (1-based: 1 = first row)
    :param column_index: Column index (1-based: 1 = first column)
    :param value: Value to write
    :param format_styles: Format object for styling (optional)
    :return: None
    """
    if row_index < 1 or column_index < 1:
        print('Invalid parameters! Write operation failed.')
    else:
        # Note: xlsxwriter uses 0-based indexing internally
        sheet.write(row_index - 1, column_index - 1, value, format_styles)

# Example: Write data to the worksheet
# First row (with title style)
write_to_cell(self.current_sheet, 1, 1, "Name", title_font_style)
write_to_cell(self.current_sheet, 1, 2, "Age", title_font_style)
# Second row (without specific style)
write_to_cell(self.current_sheet, 2, 1, 'xingag')
write_to_cell(self.current_sheet, 2, 2, 23)

xlsxwriter also supports inserting images into cells, including both local and web images.

The method used is: insert_image()

Parameters include: row index (0-based), column index, image file path, and optional parameters (like position, scaling, URL hyperlink, image_data for byte stream, etc.).

Let’s take inserting a web image as an example.

First, define an options dictionary for the image display, specifying scaling and URL hyperlink.

python

def create_image_options(x_offset=0, y_offset=0, x_scale=1, y_scale=1, url=None, tip=None, image_data=None, positioning=None):
    """
    Configure parameters for inserting an image
    Includes: offset, scaling, URL, hyperlink, tooltip, etc.
    :param x_offset: X offset within the cell
    :param y_offset: Y offset within the cell
    :param x_scale: X scale factor
    :param y_scale: Y scale factor
    :param url: URL for hyperlink
    :param tip: Tooltip text
    :param image_data: Image byte stream
    :param positioning: Positioning method
    :return: Dictionary of image options
    """
    image_options = {
        'x_offset': x_offset,
        'y_offset': y_offset,
        'x_scale': x_scale,
        'y_scale': y_scale,
        'url': url,
        'tip': tip,
        'image_data': image_data,
        'positioning': positioning,
    }
    # Remove keys with None values
    return {k: v for k, v in image_options.items() if v is not None}

# Example: Create options for a scaled image with a hyperlink
image_options = create_image_options(x_scale=0.5, y_scale=0.5, url='https://www.jianshu.com/u/f3b476549169')

Next, convert the web image into a byte stream.

python

from io import BytesIO
from urllib.request import urlopen
import ssl

def get_image_data_from_network(url):
    """
    Fetch image data from a URL as a byte stream
    :param url: Image URL
    :return: BytesIO object containing image data
    """
    # Bypass SSL verification (use with caution)
    ssl._create_default_https_context = ssl._create_unverified_context
    # Get the image byte stream from the URL
    image_data = BytesIO(urlopen(url).read())
    return image_data

Finally, insert the image into the cell.

python

def insert_network_image(sheet, row_index, column_index, image_url, filepath, image_options=None):
    """
    Insert a web image into a cell
    :param sheet: Worksheet object
    :param row_index: Row index (1-based)
    :param column_index: Column index (1-based)
    :param image_url: URL of the image to download
    :param filepath: Filename/path to associate with the image in Excel
    :param image_options: Dictionary of image options (optional)
    :return: None or error message
    """
    if row_index < 1 or column_index < 1:
        return "Invalid parameters! Image insertion failed."

    # Get the image byte stream
    image_data = get_image_data_from_network(image_url)

    # Prepare the options dictionary for insert_image
    options = image_options.copy() if image_options else {}
    options['image_data'] = image_data

    # Insert the image (convert to 0-based indexing)
    sheet.insert_image(row_index - 1, column_index - 1, filepath, options)

# Example usage
insert_network_image(self.current_sheet, 1, 1, url, '1.png', image_options)

Use the set_column() method to set column width.

Similar to openpyxl, there are two ways to specify columns: by string index (e.g., ‘A’) or by column number index.

python

def set_column_width(sheet, index_start, index_end, width):
    """
    Set the width for a range of columns
    :param sheet: Worksheet object
    :param index_start: Start column index (1-based)
    :param index_end: End column index (1-based)
    :param width: Width to set
    :return: None
    """
    # Method 1: Using string notation (e.g., 'A:C')
    # sheet.set_column('A:C', width)

    # Method 2: Using numeric indices (0-based)
    sheet.set_column(index_start - 1, index_end - 1, width)

# Example: Set width for columns 1 to 3 to 100
set_column_width(self.current_sheet, 1, 3, 100)

Use the set_row() method to set row height, passing the row index and height.

python

def set_row_height(sheet, row_index, height):
    """
    Set the height for a specific row
    :param sheet: Worksheet object
    :param row_index: Row index (1-based)
    :param height: Height to set
    :return: None
    """
    sheet.set_row(row_index - 1, height)

# Example: Set row heights
set_row_height(self.current_sheet, 1, 50)  # Set height of row 1 to 50
set_row_height(self.current_sheet, 2, 100) # Set height of row 2 to 100

After writing all data, close the workbook. This action automatically saves the file to disk.

python

def teardown(self):
    # Write the file and close it
    self.wb.close()

xlsxwriter also supports inserting various types of charts (e.g., bar charts, column charts, radar charts). Due to space limitations, this topic won’t be covered in detail here.

3. Other Approaches

Another commonly used method is xlwings.

xlwings is an open-source and free library that supports reading, writing, and modifying Excel files.

It is very powerful and can seamlessly integrate with Matplotlib, NumPy, and Pandas, supporting read/write operations for NumPy and Pandas data types. Additionally, xlwings can directly call VBA programs embedded in Excel files.

Important Note: xlwings depends on Microsoft Excel software. Users of WPS Office are advised to use openpyxl instead.

Another powerful method for manipulating Excel is Pywin32.

Pywin32 essentially calls the Windows system APIs to operate on Excel files.

Its advantages include the ability to handle complex charts and data tables.

Its disadvantages are also significant, including slower performance, high CPU usage, and the fact that it only supports Windows systems.