Welcome to the first part of my new series on making your Power BI reports more dynamic! We’re starting with the basics: How to Create a Dataset for Power BI Using Chat GPT. For the sake of this series, I’m creating my own dataset using Chat GPT, and I want to share my knowledge with you. Follow along using the Excel dataset provided or create your own. I’ll provide a link to download the Power BI Desktop file for free at the end.
Why Create a Dataset for Power BI Using Chat GPT?
Creating a dataset from scratch can be daunting. Yes, there are many website such as kaggle that have thousands of datasets. However, sometimes its not in the format that I want or simply a different industry with different KPIs. I’ve found that Chat GPT makes this process much easier. Here’s why:
- Custom Specifications: You can define your dataset’s structure, including timeframes, customer IDs, and product IDs.
- Realism and Variability: Add seasonal patterns, fluctuations, and realistic business dynamics.
- Efficiency: Generate data quickly without the hassle of manual entry.
This method saves time and gives you a robust dataset ready for Power BI analysis. I recommend to use Chat GPT 4 or 4o for this process as the results are more accurate and realistic!
Creating Your Dataset with Chat GPT
For this series, I’m using Chat GPT to create my own dataset. I want to share my knowledge so you can do the same. Here’s how to set it up: To generate a realistic dataset, use this prompt in Chat GPT. It helps you define the structure and characteristics of your dataset:
Prompt:
“Generate a comprehensive dataset for a manufacturing business in the electronics, furniture, and appliances industry. The dataset should cover a four-year period from 2021 to 2024 and exhibit realistic business dynamics, including seasonality and variability in actuals, forecasts, and budgets.
- Define the Timeframe and Structure:
- Cover a four-year period from January 1, 2021, to December 31, 2024.
- Include daily transactions, with a total of 5,000 rows.
- Set Up Fact and Dimension Tables:
- Fact Table (Fact_Sales):
- Transaction_ID: Unique ID for each transaction.
- Date: Date of each transaction, with multiple entries per day.
- Product_ID: Unique ID for each product (10 products in total).
- Customer_ID: Unique ID for each customer (20 customers in total).
- Country_ID: Unique ID for each country (10 European countries).
- Quantity_Sold: Number of items sold, showing monthly variability.
- Sales_Amount: Revenue generated, with realistic fluctuations.
- Gross_Sales: Total sales amount before discounts.
- Discounts: Discounts applied, showing variability.
- Net_Sales: Gross sales minus discounts.
- Total_Costs: Costs related to sales, with monthly and yearly variations.
- Dimension Tables:
- Dim_Product:
- Product_ID: Unique ID for each product.
- Product_Name: Name of the product.
- Category: Product category (Electronics, Furniture, Appliances).
- Dim_Customer:
- Customer_ID: Unique ID for each customer.
- Customer_Name: Name of the customer.
- Dim_Country:
- Country_ID: Unique ID for each country.
- Country_Name: Name of the country (e.g., Germany, France).
- Region: Region associated with the country (e.g., North Europe).
- Dim_Product:
- Fact Table (Fact_Sales):
- Add Forecast Table: Forecast_Sales
- Date: Aggregated to the first day of each month (2021-2024).
- Product_ID: Unique identifier for each product (10 products).
- Customer_ID: Unique identifier for each customer (20 customers).
- Country_ID: Unique identifier for each country (10 European countries).
- Quantity_Sold: Forecasted number of items sold, showing variability and trends.
- Sales_Amount: Forecasted revenue with realistic monthly variations.
- Gross_Sales: Forecasted total sales amount before discounts.
- Discounts: Expected discounts applied, reflecting market conditions.
- Net_Sales: Forecasted gross sales minus discounts.
- Total_Costs: Forecasted costs associated with the sales.
- Add Budget Table: Budget_Sales
- Date: Aggregated to the first day of each month (2021-2024).
- Product_ID: Unique identifier for each product (10 products).
- Customer_ID: Unique identifier for each customer (20 customers).
- Country_ID: Unique identifier for each country (10 European countries).
- Quantity_Sold: Budgeted number of items to be sold, reflecting planned variability.
- Sales_Amount: Budgeted revenue, with variations across different months and years.
- Gross_Sales: Budgeted total sales before discounts.
- Discounts: Budgeted discounts to be applied.
- Net_Sales: Budgeted gross sales minus discounts.
- Total_Costs: Budgeted costs related to sales, planned with realistic variability.”
—————————
Prompt end!
- Date table: (separate)
I wrote a detailed article, how to create a dynamic date table using Dax. In there, I shared step by step guidelines as well the Dax code that you need.
Make sure to specify the timeframe, daily transactions, and variability to capture realistic business dynamics. This approach ensures your dataset reflects real-world conditions and is ready for Power BI analysis.
Be careful – Do Data Quality Checks!
The great advantage is that you have total control of the data that you get and the prompt that you are writing. But be careful. Always do data quality checks as some results can be unrealistic or simply not helpful. Check if the dimension tables accurately represent that fact tables! This is crucial for creating a perfect star schema power bi data model!
Conclusion
Figuring out how to create a dataset for Power BI using Chat GPT is like having a secret weapon for your projects. It’s a cool way to make realistic datasets without all the hassle. By following these steps, you’ll have a solid start for creating awesome Power BI reports that actually make sense. This approach is a total time-saver and makes your data analysis way more accurate.
Stay Tuned
This is just the beginning! Stay tuned for more posts in my series on making your Power BI reports more dynamic. In the next post, I’ll cover how to combine actuals with budget or forecast data. Impress your boss with some new skills! So follow along 🙂
Free Power BI File
Download my Power BI Desktop file, including all measures, data models, and settings, for free using the link below!
Download the Excel File as an input for your own data model:
manufacturing_portfolio_dataset.xlsx
If you like this type of series, I would be very happy if you give me a comment, share this post or follow me on LinkedIn.
Leave a Reply