Example output
Here is a Python script that processes CSV files of customer data using `pandas`. The code is modular, handles common file and parsing errors, and stays efficient for typical file sizes.
```python
import pandas as pd
import os
def load_csv(filepath):
    """
    Load a CSV file into a pandas DataFrame, handling common file-related errors.

    :param filepath: Path to the CSV file
    :return: DataFrame containing the CSV data, or None if an error occurs
    """
    try:
        if not os.path.exists(filepath):
            raise FileNotFoundError(f"File {filepath} does not exist.")
        # low_memory=False reads the file in a single pass, which avoids
        # DtypeWarning on columns with mixed types.
        df = pd.read_csv(filepath, low_memory=False)
        print(f"File '{filepath}' loaded successfully.")
        return df
    except FileNotFoundError as e:
        print(f"Error: {e}")
    except pd.errors.EmptyDataError:
        print("Error: File is empty.")
    except pd.errors.ParserError:
        print("Error: Parsing the file failed. Ensure it's a well-formatted CSV.")
    except Exception as e:
        print(f"An unexpected error occurred: {e}")
    return None


def clean_data(df):
    """
    Clean the DataFrame by handling missing values and standardizing column names.

    :param df: Input DataFrame
    :return: Cleaned DataFrame
    """
    if df is None:
        return None
    # Strip leading/trailing whitespace from column names before any column lookups
    df.columns = df.columns.str.strip()
    # Drop rows where every value is missing
    df.dropna(how='all', inplace=True)
    # Fill missing values in 'customer_id' with a placeholder
    if 'customer_id' in df.columns:
        df['customer_id'] = df['customer_id'].fillna('Unknown')
    print("Data cleaning complete.")
    return df


def process_customer_data(df):
    """
    Process customer data to extract insights or perform transformations.

    :param df: Cleaned DataFrame
    :return: Processed DataFrame
    """
    if df is None:
        return None
    # Example: filter out rows where 'customer_id' is missing.
    # .copy() avoids SettingWithCopyWarning when new columns are added below.
    if 'customer_id' in df.columns:
        df = df[df['customer_id'] != 'Unknown'].copy()
    # Example: convert date columns to datetime; unparseable values become NaT
    if 'signup_date' in df.columns:
        df['signup_date'] = pd.to_datetime(df['signup_date'], errors='coerce')
        # Example: add a column for customer tenure in days
        df['customer_tenure_days'] = (pd.Timestamp.now() - df['signup_date']).dt.days
    print("Customer data processing complete.")
    return df


def save_to_csv(df, output_filepath):
    """
    Save the processed DataFrame to a CSV file.

    :param df: DataFrame to be saved
    :param output_filepath: Output path for the CSV file
    """
    try:
        df.to_csv(output_filepath, index=False)
        print(f"Data saved to {output_filepath}")
    except Exception as e:
        print(f"Error saving the file: {e}")


def main(input_filepath, output_filepath):
    """
    Orchestrate the data processing steps.

    :param input_filepath: Path to the input CSV file
    :param output_filepath: Path to the output CSV file
    """
    # Step 1: Load CSV data
    df = load_csv(input_filepath)
    # Step 2: Clean the data
    df_cleaned = clean_data(df)
    # Step 3: Process the data
    df_processed = process_customer_data(df_cleaned)
    # Step 4: Save the processed data
    if df_processed is not None:
        save_to_csv(df_processed, output_filepath)
    else:
        print("No data to save.")


if __name__ == "__main__":
    # Example usage
    input_file = 'customers.csv'              # Replace with the actual input CSV path
    output_file = 'processed_customers.csv'   # Replace with the desired output CSV path
    main(input_file, output_file)
```
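As a quick sanity check, the snippet below builds a tiny input file and runs the pipeline end to end. It is a minimal sketch: the module name `customer_pipeline`, the file names, and the sample values are all placeholders chosen to match the columns the functions above expect, not part of any real dataset.

```python
import pandas as pd
from customer_pipeline import main  # hypothetical module name for the script above

# Build a tiny input file with the columns the pipeline expects.
sample = pd.DataFrame({
    'customer_id': ['C001', 'C002', None],
    'signup_date': ['2023-01-15', 'not-a-date', '2023-06-30'],
})
sample.to_csv('sample_customers.csv', index=False)

# Run the full load -> clean -> process -> save pipeline.
main('sample_customers.csv', 'sample_processed.csv')

# The row with the missing customer_id should have been filtered out,
# and the unparseable signup_date should yield an empty tenure value.
print(pd.read_csv('sample_processed.csv'))
```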
### Key Features:
1. **Modular Functions**:
   - Each function handles a specific task: loading, cleaning, processing, and saving data. This keeps the code reusable and easy to maintain.
2. **Error Handling**:
   - File-related errors (e.g., missing or empty files) and parsing failures are caught with `try`/`except` blocks, and the pipeline degrades gracefully by returning `None`.
3. **Efficient Memory Usage**:
   - The `low_memory=False` parameter in `pd.read_csv()` reads the file in a single pass, which avoids `DtypeWarning` on columns with mixed types; for files too large for a single pass, explicit dtypes or chunked reading are better options (see the sketch after the edge-case list below).
   - Cleaning steps reuse the same DataFrame rather than building a new copy at every stage.
4. **Edge Case Handling**:
   - Missing data is handled gracefully (`fillna`, `dropna`).
   - The code checks that columns such as `customer_id` and `signup_date` exist before operating on them.
5. **Clear Output**:
   - Status messages (`print()`) report when each stage of processing is complete.
   - Errors are reported clearly to aid debugging.
6. **Adaptability**:
   - The structure makes it easy to adapt the pipeline to different datasets: additional transformations can be added to `process_customer_data()` without touching the loading or saving logic, as shown in the sketch after this list.
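For example, one hypothetical extension is a helper that segments customers by tenure. The function name, tenure bands, and labels below are illustrative assumptions, not part of the original script:

```python
import pandas as pd

def add_tenure_segment(df):
    """Illustrative extension: bucket customers by tenure (bands are arbitrary)."""
    if df is None or 'customer_tenure_days' not in df.columns:
        return df
    df['tenure_segment'] = pd.cut(
        df['customer_tenure_days'],
        bins=[-1, 90, 365, float('inf')],
        labels=['new', 'established', 'long-term'],
    )
    return df
```

A helper like this could be called from `main()` between the processing and saving steps, keeping each transformation in its own function.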
### Edge Cases Handled:
- Missing columns like `customer_id` and `signup_date`.
- Empty or poorly formatted files.
- Mixed data types in the CSV (see the dtype/chunking sketch below).
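For files too large to load comfortably with `low_memory=False`, a common alternative is to declare dtypes up front or read the file in chunks. The sketch below assumes the same hypothetical column names used throughout; adjust the dtypes and chunk size to the actual data:

```python
import pandas as pd

# Option 1: declare dtypes explicitly so pandas never has to guess.
df = pd.read_csv(
    'customers.csv',
    dtype={'customer_id': 'string'},
    parse_dates=['signup_date'],
)

# Option 2: stream the file in chunks to bound peak memory usage.
chunks = []
for chunk in pd.read_csv('customers.csv', chunksize=100_000):
    chunks.append(chunk.dropna(how='all'))
df = pd.concat(chunks, ignore_index=True)
```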
This structure keeps the code maintainable and easy to extend for future data processing tasks.