Excel Batch Processor

Overview

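Day-to-day work often brings up tasks like these:

Extract data from specific cells across 50 Excel files
Write data into multiple Excel templates in bulk
Merge multiple Excel files into a single summary sheet

This tool wraps these common batch operations so that they work out of the box.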

Usage

Install dependencies

pip install xlwings pandas

Basic usage

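from excel_batch import ExcelBatch

# Batch read: collect A1:D100 of Sheet1 from every matching workbook
eb = ExcelBatch("./data/*.xlsx")
df = eb.read_all(sheet="Sheet1", range="A1:D100")

# Batch write: put the combined data into the "Summary" sheet of every
# matching workbook (the sheet must already exist in each file)
eb.write_all(df, sheet="Summary", start_cell="A1")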

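Parameters

Parameter	Type	Description	Default
pattern	str	Glob pattern for matching files	Required
sheet	str	Worksheet name	"Sheet1"
range	str	Cell range	None (entire used range)
header	bool	Whether the first row is a header	True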
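
The pattern parameter decides which workbooks are processed, so it can help to preview the match before Excel is started. The sketch below is one possible way to do that with the standard library only; find_workbooks is a hypothetical helper, not part of ExcelBatch. It also skips the `~$` lock files that Excel creates next to a workbook while it is open, since those match `*.xlsx` but are not real workbooks.

```python
import glob
from pathlib import Path
from typing import List


def find_workbooks(pattern: str) -> List[Path]:
    """Return the files matching a glob pattern, sorted, without Excel lock files.

    Hypothetical helper for illustration; ExcelBatch does its own matching.
    """
    candidates = (Path(p) for p in glob.glob(pattern))
    return sorted(
        p for p in candidates
        if p.is_file() and not p.name.startswith("~$")  # "~$" marks a lock file
    )


if __name__ == "__main__":
    # Preview what a batch run over ./data would touch.
    for path in find_workbooks("./data/*.xlsx"):
        print(path)
```

Sorting the result makes the processing order reproducible between runs, which keeps the row order of the merged output stable.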

Implementation

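import xlwings as xw
import pandas as pd
from pathlib import Path
from typing import Optional

class ExcelBatch:
    """Batch-processing helper for Excel workbooks."""
    
    def __init__(self, pattern: str):
        # Path.glob() only accepts relative patterns; they are resolved
        # against the current working directory
        self.files = sorted(
            f for f in Path(".").glob(pattern)
            if not f.name.startswith("~$")  # skip lock files of open workbooks
        )
        if not self.files:
            raise FileNotFoundError(f"No files matched: {pattern}")
        print(f"Found {len(self.files)} file(s)")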
    
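    def read_all(
        self, 
        sheet: str = "Sheet1", 
        range: Optional[str] = None,
        header: bool = True
    ) -> pd.DataFrame:
        """Read the same sheet/range from every file into one DataFrame."""
        dfs = []
        # Start Excel once and reuse it for all files
        with xw.App(visible=False) as app:
            for f in self.files:
                try:
                    wb = app.books.open(f.resolve())
                    try:
                        ws = wb.sheets[sheet]
                        target = ws.range(range) if range else ws.used_range
                        # index=False: by default xlwings turns the first column
                        # into the index, which ignore_index=True would discard
                        data = target.options(
                            pd.DataFrame, header=header, index=False
                        ).value
                    finally:
                        wb.close()
                    data["_source_file"] = f.name
                    dfs.append(data)
                except Exception as e:
                    print(f"Failed to read {f.name}: {e}")
        
        if not dfs:
            raise ValueError("None of the matched files could be read")
        return pd.concat(dfs, ignore_index=True)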
    
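    def write_all(
        self,
        data: pd.DataFrame,
        sheet: str = "Sheet1",
        start_cell: str = "A1"
    ):
        """Write the same DataFrame into every file, starting at start_cell."""
        # Start Excel once and reuse it for all files
        with xw.App(visible=False) as app:
            for f in self.files:
                try:
                    wb = app.books.open(f.resolve())
                    try:
                        ws = wb.sheets[sheet]  # the sheet must already exist
                        # index=False keeps the DataFrame index out of the sheet
                        ws.range(start_cell).options(index=False).value = data
                        wb.save()
                    finally:
                        wb.close()
                    print(f"Wrote {f.name}")
                except Exception as e:
                    print(f"Failed to write {f.name}: {e}")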
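
The merge step in read_all can be tried out without Excel. The following is a minimal sketch that assumes two small in-memory DataFrames standing in for the ranges read from two workbooks; the file names, column names, and the combine helper are made up for illustration. It shows how the `_source_file` column keeps each row traceable to its workbook after concatenation.

```python
from typing import Dict

import pandas as pd


def combine(frames: Dict[str, pd.DataFrame]) -> pd.DataFrame:
    """Tag each frame with its source file name and stack them row-wise."""
    tagged = [df.assign(_source_file=name) for name, df in frames.items()]
    if not tagged:
        # pd.concat raises on an empty list, so return an empty frame instead
        return pd.DataFrame()
    return pd.concat(tagged, ignore_index=True)


# Hypothetical stand-ins for data read from two workbooks
frames = {
    "north.xlsx": pd.DataFrame({"item": ["a", "b"], "qty": [1, 2]}),
    "south.xlsx": pd.DataFrame({"item": ["c"], "qty": [5]}),
}

summary = combine(frames)
# Per-workbook totals, possible only because the source was recorded
per_file = summary.groupby("_source_file")["qty"].sum()
print(summary)
print(per_file)
```

Whether an empty input should yield an empty DataFrame or raise an error is a design choice; the sketch picks the former so that the caller can check `summary.empty`.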

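Notes

Performance: starting Excel is slow, so with many files keep visible=False and reuse one xw.App instance rather than launching Excel for each file
Paths: non-ASCII (for example Chinese) paths can cause problems; ASCII-only paths are the safer choice
Requirements: xlwings drives a locally installed copy of Excel; for a pure-Python alternative, consider openpyxl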
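
For machines without Excel, reading can be approximated with openpyxl. The sketch below is an assumption about how such a fallback might look, not part of ExcelBatch: read_sheet_rows is a hypothetical function that treats the first row as the header and returns plain dictionaries instead of a DataFrame. It requires `pip install openpyxl`.

```python
from pathlib import Path
from typing import Any, Dict, List

from openpyxl import load_workbook


def read_sheet_rows(path: Path, sheet: str = "Sheet1") -> List[Dict[str, Any]]:
    """Read one worksheet into a list of dicts keyed by the header row."""
    # read_only streams the file; data_only returns cached formula results
    wb = load_workbook(str(path), read_only=True, data_only=True)
    try:
        rows = wb[sheet].iter_rows(values_only=True)
        header = next(rows, None)
        if header is None:  # empty sheet
            return []
        return [
            dict(zip(header, row), _source_file=Path(path).name)
            for row in rows
        ]
    finally:
        wb.close()  # read-only workbooks keep the file open until closed
```

With data_only=True, formula cells yield the value Excel last cached in the file; a workbook that was never opened and saved by Excel has no cached values, so such cells come back as None.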

Extension ideas

Add parallel processing (multiprocessing)
Support the .xls format (requires pywin32)
Add a progress bar (tqdm)
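
As a starting point for the parallel-processing idea, the file list could be split into groups and each group handed to a separate process. The sketch below is only one possible layout: chunk_files and run_parallel are hypothetical names, and the worker is assumed to be a top-level function that takes a list of paths, starts its own Excel instance, and returns a picklable result.

```python
from multiprocessing import Pool
from pathlib import Path
from typing import Any, Callable, List, Sequence


def chunk_files(files: Sequence[Path], n_chunks: int) -> List[List[Path]]:
    """Split files round-robin into at most n_chunks non-empty groups."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    chunks = [list(files[i::n_chunks]) for i in range(n_chunks)]
    return [c for c in chunks if c]  # drop groups left empty by short inputs


def run_parallel(
    files: Sequence[Path],
    worker: Callable[[List[Path]], Any],
    processes: int = 4,
) -> List[Any]:
    """Run worker once per group of files, one process per group."""
    chunks = chunk_files(files, processes)
    if not chunks:
        return []
    with Pool(processes=len(chunks)) as pool:
        # worker must be defined at module level so it can be pickled
        return pool.map(worker, chunks)
```

On Windows, where xlwings is most commonly used, the call to run_parallel has to sit under an `if __name__ == "__main__":` guard because child processes re-import the main module.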