Example output
To extract data from multiple Excel files and consolidate it into a single CSV file with Python, we can use the pandas library, which is well suited to data manipulation and analysis. Here's a custom Python script for you:
import os

import pandas as pd


def extract_and_consolidate_excel(directory_path, output_csv):
    """
    Extract data from multiple Excel files in a directory and consolidate into a single CSV file.

    Args:
    - directory_path (str): The path to the directory containing the Excel files.
    - output_csv (str): The filename for the output CSV file.

    Returns:
    - None
    """
    # Collect one DataFrame per Excel file; concatenating once at the end
    # avoids re-copying the accumulated data on every iteration
    frames = []

    # Iterate over the files in the directory in a stable (sorted) order
    for filename in sorted(os.listdir(directory_path)):
        if filename.lower().endswith(('.xlsx', '.xls')):
            # Read the first sheet of the Excel file into a DataFrame
            excel_data = pd.read_excel(os.path.join(directory_path, filename))
            frames.append(excel_data)

    # Stop early if there is nothing to consolidate
    if not frames:
        print("No Excel files found in:", directory_path)
        return

    # Combine all DataFrames and write the result to a CSV file
    consolidated_data = pd.concat(frames, ignore_index=True)
    consolidated_data.to_csv(output_csv, index=False)
    print("Data extraction and consolidation complete. CSV file saved as:", output_csv)


# Example usage:
directory_path = "/path/to/excel/files/directory"
output_csv = "consolidated_data.csv"
extract_and_consolidate_excel(directory_path, output_csv)
----------------------------------------------------------------------------------------------------------------------
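To use this script, replace "/path/to/excel/files/directory" with the path to the directory containing your Excel files, and "consolidated_data.csv" with the desired filename for the output CSV file. Then run the script, and it will extract data from all Excel files in the directory and consolidate it into a single CSV file. Note that pandas needs the openpyxl package to read .xlsx files (xlrd for legacy .xls files), and that pd.read_excel reads only the first sheet of each workbook by default.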
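If the directory is one that people actively work in, the file listing may need to be a little stricter than a plain extension check. Excel creates temporary lock files whose names start with "~$" while a workbook is open, and extensions are sometimes upper-case. Here is a minimal sketch of a stricter listing step, using only the standard library. The helper name find_excel_files is hypothetical and is not part of the script above.

```python
from pathlib import Path


def find_excel_files(directory_path):
    """Return the Excel workbooks in directory_path, sorted by name.

    Hypothetical helper for illustration; not part of the original script.
    """
    excel_suffixes = {".xlsx", ".xls"}
    matches = []
    for path in Path(directory_path).iterdir():
        # Skip subdirectories and anything that is not an Excel workbook
        if not path.is_file() or path.suffix.lower() not in excel_suffixes:
            continue
        # Skip the "~$..." lock files Excel creates while a workbook is open
        if path.name.startswith("~$"):
            continue
        matches.append(path)
    # Sort so the files are always processed in the same order
    return sorted(matches)
```

The script could then loop over find_excel_files(directory_path) and pass each returned path straight to pd.read_excel, which accepts path objects as well as strings.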