Python Office Automation with Excel — How to use xlsxwriter
1. Introduction
Previously, we discussed the two most common approaches for handling Excel files in Python: xlrd/xlwt and openpyxl.
To summarize:
- The xlrd/xlwt combination allows reading data with xlrd and writing data with xlwt, but has the drawback of not supporting the .xlsx format.
- openpyxl supports both reading and writing Excel documents, but does not support the older .xls format.
This article continues the discussion by exploring other methods for manipulating Excel documents with Python.
2. xlsxwriter
xlsxwriter is primarily used for writing data and charts into Excel files. It is optimized to write data quickly while using relatively little memory.
Its main limitations are:
- It cannot read or modify existing Excel files.
- If you need to read or modify existing files, you must use it in combination with other libraries (e.g., xlrd).
First, install the xlsxwriter package:
bash
# Install the dependency package pip3 install xlsxwriter
xlsxwriter provides the Workbook(filename)
method to create a new workbook object.
Use the workbook object’s add_worksheet(sheet_name)
function to create sheets within the workbook.
python
import xlsxwriter def create_workbook_and_worksheet(filename, worksheet_names): """ Create a workbook and multiple worksheets :param filename: Output filename :param worksheet_names: List of sheet names :return: Tuple (workbook object, list of worksheet objects) """ wb = xlsxwriter.Workbook(filename) sheets = [] # Add worksheets for worksheet_name in worksheet_names: sheets.append(wb.add_worksheet(worksheet_name)) return wb, sheets
You can then write data to cells in any sheet.
To customize cell styles (e.g., font size, font family, color, background, bold), use the workbook object’s add_format()
method to create a format object.
python
def create_format_styles(wb, format_styles_dict): """ Create a format style for cells, including font size, font, color, background, bold, etc. :param wb: Workbook object :param format_styles_dict: Dictionary of style properties :return: Format object """ return wb.add_format(format_styles_dict) # Example: Define a style dictionary for a title cell self.title_style = {'bold': True, 'bg_color': '#B0C4DE', 'font_size': 10, 'font_name': 'Microsoft Yahei'} # Create the title font style format object title_font_style = create_format_styles(self.wb, self.title_style)
The worksheet object’s write()
function is used to write data to a cell. Parameters include row index, column index, value, and format style.
Important Note: By default, xlsxwriter uses 0-based indexing for both rows and columns (i.e., 0 represents the first row/column).
Here’s how to write data while applying a cell style:
python
def write_to_cell(sheet, row_index, column_index, value, format_styles=None): """ Write data to a specific cell :param sheet: Worksheet object :param row_index: Row index (1-based: 1 = first row) :param column_index: Column index (1-based: 1 = first column) :param value: Value to write :param format_styles: Format object for styling (optional) :return: None """ if row_index < 1 or column_index < 1: print('Invalid parameters! Write operation failed.') else: # Note: xlsxwriter uses 0-based indexing internally sheet.write(row_index - 1, column_index - 1, value, format_styles) # Example: Write data to the worksheet # First row (with title style) write_to_cell(self.current_sheet, 1, 1, "Name", title_font_style) write_to_cell(self.current_sheet, 1, 2, "Age", title_font_style) # Second row (without specific style) write_to_cell(self.current_sheet, 2, 1, 'xingag') write_to_cell(self.current_sheet, 2, 2, 23)
xlsxwriter also supports inserting images into cells, including both local and web images.
The method used is: insert_image()
Parameters include: row index (0-based), column index, image file path, and optional parameters (like position, scaling, URL hyperlink, image_data
for byte stream, etc.).
Let’s take inserting a web image as an example.
First, define an options dictionary for the image display, specifying scaling and URL hyperlink.
python
def create_image_options(x_offset=0, y_offset=0, x_scale=1, y_scale=1, url=None, tip=None, image_data=None, positioning=None): """ Configure parameters for inserting an image Includes: offset, scaling, URL, hyperlink, tooltip, etc. :param x_offset: X offset within the cell :param y_offset: Y offset within the cell :param x_scale: X scale factor :param y_scale: Y scale factor :param url: URL for hyperlink :param tip: Tooltip text :param image_data: Image byte stream :param positioning: Positioning method :return: Dictionary of image options """ image_options = { 'x_offset': x_offset, 'y_offset': y_offset, 'x_scale': x_scale, 'y_scale': y_scale, 'url': url, 'tip': tip, 'image_data': image_data, 'positioning': positioning, } # Remove keys with None values return {k: v for k, v in image_options.items() if v is not None} # Example: Create options for a scaled image with a hyperlink image_options = create_image_options(x_scale=0.5, y_scale=0.5, url='https://www.jianshu.com/u/f3b476549169')
Next, convert the web image into a byte stream.
python
from io import BytesIO from urllib.request import urlopen import ssl def get_image_data_from_network(url): """ Fetch image data from a URL as a byte stream :param url: Image URL :return: BytesIO object containing image data """ # Bypass SSL verification (use with caution) ssl._create_default_https_context = ssl._create_unverified_context # Get the image byte stream from the URL image_data = BytesIO(urlopen(url).read()) return image_data
Finally, insert the image into the cell.
python
def insert_network_image(sheet, row_index, column_index, image_url, filepath, image_options=None): """ Insert a web image into a cell :param sheet: Worksheet object :param row_index: Row index (1-based) :param column_index: Column index (1-based) :param image_url: URL of the image to download :param filepath: Filename/path to associate with the image in Excel :param image_options: Dictionary of image options (optional) :return: None or error message """ if row_index < 1 or column_index < 1: return "Invalid parameters! Image insertion failed." # Get the image byte stream image_data = get_image_data_from_network(image_url) # Prepare the options dictionary for insert_image options = image_options.copy() if image_options else {} options['image_data'] = image_data # Insert the image (convert to 0-based indexing) sheet.insert_image(row_index - 1, column_index - 1, filepath, options) # Example usage insert_network_image(self.current_sheet, 1, 1, url, '1.png', image_options)
Use the set_column()
method to set column width.
Similar to openpyxl, there are two ways to specify columns: by string index (e.g., ‘A’) or by column number index.
python
def set_column_width(sheet, index_start, index_end, width): """ Set the width for a range of columns :param sheet: Worksheet object :param index_start: Start column index (1-based) :param index_end: End column index (1-based) :param width: Width to set :return: None """ # Method 1: Using string notation (e.g., 'A:C') # sheet.set_column('A:C', width) # Method 2: Using numeric indices (0-based) sheet.set_column(index_start - 1, index_end - 1, width) # Example: Set width for columns 1 to 3 to 100 set_column_width(self.current_sheet, 1, 3, 100)
Use the set_row()
method to set row height, passing the row index and height.
python
def set_row_height(sheet, row_index, height): """ Set the height for a specific row :param sheet: Worksheet object :param row_index: Row index (1-based) :param height: Height to set :return: None """ sheet.set_row(row_index - 1, height) # Example: Set row heights set_row_height(self.current_sheet, 1, 50) # Set height of row 1 to 50 set_row_height(self.current_sheet, 2, 100) # Set height of row 2 to 100
After writing all data, close the workbook. This action automatically saves the file to disk.
python
def teardown(self): # Write the file and close it self.wb.close()
xlsxwriter also supports inserting various types of charts (e.g., bar charts, column charts, radar charts). Due to space limitations, this topic won’t be covered in detail here.
3. Other Approaches
Another commonly used method is xlwings.
xlwings is an open-source and free library that supports reading, writing, and modifying Excel files.
It is very powerful and can seamlessly integrate with Matplotlib, NumPy, and Pandas, supporting read/write operations for NumPy and Pandas data types. Additionally, xlwings can directly call VBA programs embedded in Excel files.
Important Note: xlwings depends on Microsoft Excel software. Users of WPS Office are advised to use openpyxl instead.
Another powerful method for manipulating Excel is Pywin32.
Pywin32 essentially calls the Windows system APIs to operate on Excel files.
Its advantages include the ability to handle complex charts and data tables.
Its disadvantages are also significant, including slower performance, high CPU usage, and the fact that it only supports Windows systems.
Related articles