In this article, I will demonstrate an approach that may be used in Power BI to analyse the dates such as Created On, First Donation and Last Donation dates for Donors or Supporters. This approach may also be considered for other scenarios, such as First Sale and Last Sale dates for Customers, First Membership and Last Membership dates for Members, or First Registration and Last Registration dates for Course Attendees etc. This approach may be adapted to use source dates from a dimension table such as Customer or a fact table such as Sales.
I will focus on an example where the dates to be analysed do not exist in any fact table such as Donations but are in the Supporters dimension table as separate columns, i.e., Created On, First Donation Date and Last Donation Date.
For the rest of this article, I will refer to Donors and Supporters simply as Supporters.
You can click on any of the images in this article to enlarge them.
The solution demonstrated in this article involves adding a fact table named ‘Supporter Dates’. This table can be created from the various date columns in the Supporter table and will contain rows for each Supporter for Created On, First Donation Date and Last Donation Date.
CustomerKey is the foreign key for the related Supporter Dimension. There is a relationship from Supporter to ‘Supporter Dates’ based on this key. In this case, the key value is a GUID rather than an integer. Additional steps can be taken to convert key values from GUIDs to integers. However, those steps are not covered in this article.
The values for Date Type are ‘Created On’, ‘First Donation Date’ and ‘Last Donation Date’.
The DateKey values are the Date values formatted as DateKey values, e.g., instead of a Date such as 07/06/2013, a DateKey such as 20130607 is stored in the ‘Supporter Dates’ table. The DateKey is the foreign key for the related ‘Date’ dimension. There is a relationship from ‘Date’ to ‘Supporter Dates’ based on this key.
The following image shows the Star Schema diagram and the filtering relationships from the Supporter and Date dimension tables to the ‘Supporter Dates’ fact table.
The ‘Supporter Dates’ table may be created using either Power Query or DAX. However, the best practice approach is to create the table as close to the data source as possible. Therefore, if the table is not available in the data source, using Power Query is generally better than using DAX. However, in this article, I demonstrate both approaches.
The following is an example of how the ‘Supporter Dates’ table can be created using Power Query.
Here the Source is the Supporter table, the relevant columns to be retained have been selected from the Source, the Created On, First Donation Date and Last Donation Date columns have then been unpivoted and finally, the DateKey column, created by the previous step, has been transformed from a Date column to a DateKey column using a custom function named DateToDateKey().
It is important to be aware that the Unpivot step excludes any Created On, First Donation Date, and Last Donation Date values that are blank. Therefore, there is no need to add additional steps to exclude blank values from the result.
The following is an example of how the ‘Supporter Dates’ table can be created using DAX.
You will notice that more steps than those needed when using Power Query are required to achieve the same result.
This calculation includes steps that create three temporary tables. These temporary tables are assigned to corresponding variables named _CREATED_ON_DATES, _FIRST_DONATION_DATES and _LAST_DONATION_DATES. Each of these tables contains the same three columns, i.e. CustomerKey, Date Type and DateKey.
These three temporary tables are then combined into one table using UNION(). UNION requires each of the source tables to have the same number of columns. These columns must be in the same position in each table because they are combined based on their position; they are not combined based on their names.
SUMMARISECOLUMNS() is used to summarise the Supporter table by CustomerKey and to add two additional columns named Date Type and DateKey.
FILTER() is used to exclude blank date values. Not all Supporters have a First Donation Date or a Last Donation Date. However, all Supporters have a Created On date. However, I have included FILTER for Created On Date to resolve a Circular Dependency.
The calculated values for the Date Type column are returned by three measures named [Date Type (Created On)], [Date Type (First Donation Date)] and [Date Type (Last Donation Date)]. These measures simply return a single text value, i.e. ‘Created On’, ‘First Donation Date’ and ‘Last Donation Date’. These Measures act as parameters for this and other calculations in the data model.
The calculated values for the DateKey column are retrieved from the Supporter[Created On], Supporter[First Donation Date] and Supporter[Last Donation Date] columns. Then, the retrieved values are transformed from Date values into DateKey values using FORMAT().
The most im portant aspects to understand about this example of using SUMMARIZECOLUMNS() are that … SUMMARIZECOLUMNS():
(a) Creates a table of unique Supporter[CustomerKey] values.
(b) Adds a new column named Date Type.
(c) Adds a new column named DateKey.
(d) Invokes a Filter Context rather than a Row Context when adding the new columns. This is the most important point to understand. I will explain this further below.
In the case of the calculation for Date Type, it does not matter that the calculation is being evaluated in a Filter Context because the corresponding measure always returns a single value that is not dependent on the Filter Context. These values are ‘Created On’, ‘First Donation Date’ or ‘Last Donation Date’.
However, in the case of the calculation for ‘DateKey’ the Filter Context does matter! The calculation of the DateKey is dependent on the Filter Context because they refer to columns values in the data model that are retrieved using the Filter Context.
When using SUMMARISECOLUMNS to add new columns, the Filter Context is created from the values of the Group By Columns. In this example, there is only one Group By column, the Supporter[CustomerKey] column. Therefore, retrieving the Supporter[Created On], Supporter[First Donation Date] and Supporter[Last Donation Date] for each iterated Supporter[CustomerKey] will return the matching rows for all Supporters with a matching Supporter[CustomerKey].
In this example, there will only ever be one matching Supporter in the Supporter table for each iterated Supporter[CustomerKey].
The Filter Context returns a table containing one or more rows and columns. This table can be evaluated using functions such as AVERAGE(), COUNT(), DISTINCTCOUNT(), MIN(), MAX() and SUM() and any of their variants such as SUMX() etc. Each of these functions interates each row in the table, performs an evaluation on one or more columns in each iterated row and then returns a single value result.
In this example, the the Filter Context for each iterated Suporter[CustomerKey] returns a single-row, single-column table containing a single unique value for either the Supporter[Created On], Supporter[First Donation Date] or Supporter[Last Donation Date] value. Either VALUES() or DISTINCT() may be used to retrieve a single unique value or multiple unique values from a table. In this example, the Filter Context returns a table containing a single value. However, unlike DISTINCT, VALUES may add a blank value to the list of retrieved values. This causes multiple values to be returned, and the inclusion of the added blank value causes a Circular Dependency. DISTINCT returns the single value. This value is assigned to the DateKey column after first being transformed from a Date value to a DateKey value using FORMAT.
The Data Cards and Column Chart in the image below demonstrate an analysis of the number of Supporters by Date Type and Month for the selected Year. The Column Chart is configured to display Small Multiples using the Date Type.
This analysis uses a measure named ‘Count Supporters (Supporter Dates)’, which counts the number of distinct Supporters in the ‘Supporter Dates’ table.
In conclusion, this article demonstrates two methods for creating a Fact table from columns in a Dimension table. This article also demonstrates some of the important considerations when using SUMMARIZECOLUMNS, especially with respect to Filter Context when SUMMARIZECOLUMNS is used to add additional columns.
The scenario demonstrated in this article is very simple, but SUMMARIZECOLUMNS can also be used to summarise information from other tables, both Dimension tables and Fact tables.
Finally, if you want a deep dive and want to avoid some pitfalls, I suggest you study these two articles and practice using SUMMARIZE and SUMMARIZECOLUMNS.