Prompt Details
Model
Chat - (gpt-5.4)
Token size
212
Example input
[Describe your dataset]: E-commerce Sales Data
[Column to predict]: Total_Revenue
[E.g., Random Forest or XGBoost]: Random Forest Regressor
Example output
"""
Dataset: E-commerce Sales Data
Target Variable: Total_Revenue
Algorithm Preference: Random Forest Regressor
"""
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error, r2_score


class EcommerceRevenuePipeline:
    """End-to-end ML pipeline for predicting E-commerce revenue."""

    def __init__(self, filepath):
        self.filepath = filepath
        # Scale the features, then fit a random forest with a fixed seed
        self.model = Pipeline([
            ('scaler', StandardScaler()),
            ('rf', RandomForestRegressor(n_estimators=100, random_state=42))
        ])

    def load_and_clean(self):
        try:
            df = pd.read_csv(self.filepath)
        except FileNotFoundError:
            print("Error: Dataset file not found. Please check the path.")
            raise  # re-raise so callers never receive None in place of a DataFrame
        # Fill missing numeric values with each column's median
        return df.fillna(df.median(numeric_only=True))

    def feature_engineering(self, df):
        # Derived feature 1: Average order value (zero orders become NaN, not inf).
        # Useful for analysis, but computed from the target, so it is excluded
        # from the model inputs in train_evaluate.
        df['Avg_Order_Value'] = df['Total_Revenue'] / df['Total_Orders'].replace(0, np.nan)
        # Derived feature 2: Customer tenure in months (approximated as 30-day months)
        df['Tenure_Months'] = (pd.Timestamp.today() - pd.to_datetime(df['Signup_Date'])).dt.days / 30
        return df

    def train_evaluate(self, df):
        # Drop the target, the raw date, and the target-derived feature (leakage)
        X = df.drop(['Total_Revenue', 'Signup_Date', 'Avg_Order_Value'], axis=1)
        # StandardScaler accepts numeric input only, so keep numeric columns
        X = X.select_dtypes(include='number')
        y = df['Total_Revenue']
        X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
        self.model.fit(X_train, y_train)
        predictions = self.model.predict(X_test)
        print(f"RMSE: {np.sqrt(mean_squared_error(y_test, predictions)):.2f}")
        print(f"R2 Score: {r2_score(y_test, predictions):.2f}")
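A note on the Avg_Order_Value feature in the example output: it is computed as Total_Revenue / Total_Orders, so using it as a model input alongside Total_Orders hands the model the target in disguise. The following is a minimal sketch of that effect on synthetic data, not part of the listing's output. The variable names, the synthetic data, and the holdout_r2 helper are all assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 400

# Synthetic stand-in data (hypothetical; not taken from the listing)
total_orders = rng.integers(1, 50, size=n).astype(float)  # always >= 1
unit_price = rng.uniform(10, 100, size=n)                 # hidden from the model
total_revenue = total_orders * unit_price                 # prediction target

# Feature derived from the target, mirroring the example output
avg_order_value = total_revenue / total_orders


def holdout_r2(X, y):
    """Fit a random forest on an 80/20 split and return the test-set R2."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )
    model = RandomForestRegressor(n_estimators=50, random_state=42)
    model.fit(X_train, y_train)
    return r2_score(y_test, model.predict(X_test))


X_clean = total_orders.reshape(-1, 1)                       # no target-derived column
X_leaky = np.column_stack([total_orders, avg_order_value])  # includes the leak

r2_clean = holdout_r2(X_clean, total_revenue)
r2_leaky = holdout_r2(X_leaky, total_revenue)

print(f"R2 without target-derived feature: {r2_clean:.2f}")
print(f"R2 with target-derived feature:    {r2_leaky:.2f}")
```

If the second score comes out much higher than the first, that gap reflects the leak, not a better model: at prediction time Total_Revenue is unknown, so Avg_Order_Value could not be computed.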
Stop hand-coding data science boilerplate. This advanced prompt has the model act as a senior data scientist and generate production-ready Python code for end-to-end machine learning pipelines. It covers data cleaning, automated EDA, feature engineering, model training, and evaluation. Well suited to data engineers and analysts who need optimized, object-oriented code built on pandas and scikit-learn.
Added 4 days ago
