Safe and predictable MDX queries for PowerPivot data loads

I have spent most of the last month digging deep into PowerQuery and PowerPivot development. I must say that I am impressed, and there is still a lot more for me to learn. However, I am not as impressed with the documentation available online, as there have been some issues I could not find a useful answer to.

Today I would like to share one quick tip related to querying an OLAP database from PowerQuery or directly from PowerPivot. Imagine that you place a really basic MDX query like this one in the data load section of PowerQuery or PowerPivot:

Query 1:

SELECT
[Measures].[Internet Order Count] ON 0,
NONEMPTY([Customer].[Customer].[Customer],[Measures].[Customer Count]) ON 1
FROM
(SELECT
[Sales Territory].[Sales Territory].[Group].&[Europe] ON 0
FROM
(SELECT [Reseller].[Business Type].&[Specialty Bike Shop] ON 0
FROM
[Adventure Works])
)
WHERE [Product].[Category].&[40]

The results are:

[screenshot of the Query 1 results]

It can easily happen that if your WHERE condition returns no rows (in this case because of the non-existing Product Category, or because your user does not have sufficient security rights under the OLAP role), the PowerPivot model becomes corrupted and loses a relationship related to one of the (not) returned columns. You can rewrite Query 1 in a much better way to ensure that you always get the needed columns, no matter how many rows are retrieved.

Query 2:

WITH
MEMBER [iCustomerName] AS [Customer].[Customer].CURRENTMEMBER.NAME
MEMBER [iCustomerKey] AS [Customer].[Customer].CURRENTMEMBER.Properties( "Key(0)" )

SELECT
{[iCustomerKey],[iCustomerName],[Measures].[Internet Order Count]} ON 0,
NONEMPTY([Customer].[Customer].[Customer],[Measures].[Customer Count]) ON 1
FROM
(SELECT
[Sales Territory].[Sales Territory].[Group].&[Europe] ON 0
FROM
(SELECT [Reseller].[Business Type].&[Specialty Bike Shop] ON 0
FROM
[Adventure Works])
)
WHERE [Product].[Category].&[40]

The results are:

[screenshot of the Query 2 results]

As you can see from the next picture, for Query 1 we have no Customer Name column in the model, while for Query 2 we are just fine. I must admit that Customer Name is not a great example, since we get the Customer Key back anyway, but I guess you can see the point I am heading towards. To stay on the safe side, write bullet-proof MDX queries!

[screenshot of the resulting PowerPivot model columns]

Some thoughts on running PowerQuery in 32bit Excel 2013/2016 while connecting to XLS files

I really do like PowerQuery for its ease of use, its power and the self-explanatory ETL code in M. However I must say that, although Microsoft presents it as a Self-Service BI tool, things can get quite complex, especially if your clients have 32-bit Excel installed on their machines. Quite often issues come up when the clients try to connect via PowerQuery to huge local Excel data source files and at some point hit an out-of-memory error. In 32-bit Excel, you need to watch the allocated virtual memory in the Process Explorer application, which can be downloaded from here. Catch the Excel process and you can easily find this value in its properties; there is no sense in watching Excel memory usage in the Windows built-in Task Manager. Quite often you can see that the Excel committed memory charge is around 500 MB, yet you are already receiving the out-of-memory message, because the allocated virtual memory at this point can be around 1.95 GB, which I consider the point where you can be certain that Excel will crash (the point of no return). You might be lucky and the virtual memory allocation might rise up to around 2.1 GB, but you are really surfing the threshold here, and very likely you will not be able to save the file at this point either. So the recommendations based on my personal experience are:

  • Whenever possible, load data from .csv instead of .xls files when you need to load local data
  • Load only the columns you are sure you will need in your data model; there is no sense in loading, for example, DWH system columns like GUIDs, PKs, system dates etc.
  • Set the PowerQuery current workbook settings in a lightweight manner so that it does not automatically detect column data types, does not automatically create relationships between tables, does not update relationships when refreshing queries loaded to the data model and, when possible, ignores the privacy levels
  • Track the allocated virtual memory in the Process Explorer application while adding each query. You might find a query loading just a few rows that causes high virtual memory allocation for no obvious reason, and that is the perfect time to start tuning the query steps one after another. ( Unfortunately I have not yet found a way to do more accurate query performance tracking when the queries load data from a local Excel file source; when loading from other data sources, you have a pretty decent option of turning PowerQuery tracing on and then loading the trace log text file into another PowerQuery query with some basic transformations to get the duration of a specific query. )
  • When not needed, disable loading query results to a worksheet
  • Avoid chaotic and memory-consuming steps like adding columns, changing their data types and then removing them at the end of the query flow
  • Disable COM Excel add-ins because of their memory consumption
  • When nothing else seems to work, you can try downloading and installing the Large Address Aware capability change for Excel from here; this KB released in May 2016 raises the 32-bit Excel memory limit from 2 to 3 GB

And that’s all I can think of at the moment. Don’t forget that you should reserve at least 500 MB for PowerPivot if you plan to load the data into the data model. If I come across any more recommendations, I will share them here, but combining these steps has served me pretty well.

sp_testlinkedserver in a try…catch block

Sometimes you definitely need to go with quick workarounds. I am pretty sure I am not the only BI developer working from time to time with legacy and somewhat wacky old code used for production purposes. This time I came across a legacy scheduled stored procedure filling a dataset for SSRS reporting; it calls OPENROWSET to run an MDX query against an OLAP cube, but the linked server was failing from time to time because of weak connections. Whenever the linked server call failed, there would simply be no reporting, as the MDX results were later used in an INNER JOIN 🙂 . I kind of wonder what the people writing this code were thinking back then. Anyway, I needed this SP to stop failing, to return results even if the linked server connection failed, and to be informed that the linked server call had failed so I could react and persist the results from the last successful run.

The solution is quite easy and so far seems bulletproof. Let's use this sample MDX code wrapped in an OPENROWSET call as an example:

SELECT a.*
FROM OpenRowset( 'MSOLAP','DATASOURCE=myOlapServer; Initial Catalog=FoodMart;',
'SELECT Measures.members ON ROWS,
[Product Category].members ON COLUMNS
FROM [Sales]')
as a
GO

So the trick is to call sp_testlinkedserver, which tests whether we are able to connect to the specified linked server, right before this chunk of code, and to run both together inside the TRY block. We also want to set a variable @err so we know that an error happened. The code could look something like this:


DECLARE @err BIT = 0;

BEGIN TRY
EXEC sp_testlinkedserver N'MSOLAP';

SELECT a.*
INTO #results
FROM OpenRowset( 'MSOLAP','DATASOURCE=myOlapServer; Initial Catalog=FoodMart;',
'SELECT Measures.members ON ROWS,
[Product Category].members ON COLUMNS
FROM [Sales]' )
as a;

END TRY

BEGIN CATCH
SET @err = 1;
END CATCH

IF @err = 1
BEGIN
--FOR EXAMPLE USE AND PERSIST RESULTS FROM THE LAST TIME THE SP WAS SUCCESSFULLY EXECUTED..
END
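
If you need something concrete for that last branch, here is a minimal sketch of the persisting idea; the permanent table dbo.LastGoodMdxResult is a name I made up for illustration and is assumed to match the columns of the MDX result:

IF @err = 0
BEGIN
    -- the linked server call succeeded, refresh the persisted copy
    TRUNCATE TABLE dbo.LastGoodMdxResult;

    INSERT INTO dbo.LastGoodMdxResult
    SELECT * FROM #results;
END

-- report from the persisted table in both cases, so the SSRS dataset
-- always gets rows even when the linked server call failed
SELECT * FROM dbo.LastGoodMdxResult;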

Disclaimer: This is just a quick-fix tutorial; I definitely agree this is not the best example of using
the try…catch block. Details on sp_testlinkedserver can easily be found here.

Build your own Data quality framework part 3/3

And here we are, at the final part of this tutorial. This third part is mainly about the Data quality issue fixing and the possible reporting layers. I have been working with this framework in an environment where we had SSRS reports embedded in a .NET web application. The links in the SSRS subreports for each Data quality rule would navigate you to the specific business objects in the application, where you would fix the errors. That's the part where I find this solution really straightforward and powerful. If you are in a completely different environment, you can write your own simple web page for the error fixing module, or you can come up with any other UI you can think of. The same goes for the reporting module. I have used a simple SSAS Tabular model with Excel sheets pushed onto the internal SharePoint site, but you can really go in any direction you can think of.

So once you have the DQ schema tables filled with rules and their results, you can easily create an SSRS dashboard report ( its dataset joins the dql.ValidationResult and dql.ValidationRun tables ) looking something like this, for example:

[mock-up of the Data quality dashboard report]

The datasets for this report are a piece of cake to come up with when you think about the data model we have from the second part of this tutorial. You can use for example:


DECLARE @CurrValidationRun_ID INT;
SELECT @CurrValidationRun_ID = ISNULL(MAX(ValidationRun_ID),0) FROM dql.ValidationRun WHERE ValidationRunEnd IS NOT NULL;
DECLARE @PrevValidationRun_ID INT = @CurrValidationRun_ID - 1;

SELECT
VR.ValidationRule_ID,
VR.ValidationRuleName,
(SELECT VRP.ValidationErrorCount FROM dql.ValidationResult VRP WHERE VRP.ValidationRun_ID = @PrevValidationRun_ID AND VRP.ValidationRule_ID = VR.ValidationRule_ID) AS PrevErrorCount,
(SELECT VRC.ValidationErrorCount FROM dql.ValidationResult VRC WHERE VRC.ValidationRun_ID = @CurrValidationRun_ID AND VRC.ValidationRule_ID = VR.ValidationRule_ID) AS CurrentErrorCount
FROM dql.ValidationRule VR;

or you can use CROSS APPLY or whatever technique you are most comfortable with. The Data quality rule column in the table should be a link to a subreport containing the specific rule errors, with links to the CRM application business objects.
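
For completeness, here is a minimal sketch of the APPLY variant of the same dataset (I use OUTER APPLY rather than CROSS APPLY so that rules without results still show up; it reuses the variables declared above):

SELECT
    VR.ValidationRule_ID,
    VR.ValidationRuleName,
    PrevRun.ValidationErrorCount AS PrevErrorCount,
    CurrRun.ValidationErrorCount AS CurrentErrorCount
FROM dql.ValidationRule VR
OUTER APPLY (SELECT VRP.ValidationErrorCount
             FROM dql.ValidationResult VRP
             WHERE VRP.ValidationRun_ID = @PrevValidationRun_ID
               AND VRP.ValidationRule_ID = VR.ValidationRule_ID) PrevRun
OUTER APPLY (SELECT VRC.ValidationErrorCount
             FROM dql.ValidationResult VRC
             WHERE VRC.ValidationRun_ID = @CurrValidationRun_ID
               AND VRC.ValidationRule_ID = VR.ValidationRule_ID) CurrRun;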

When it comes to the drill-down into the subreports, you need to query the dql.ValidationResultItem table, but you are facing one issue here. As you have seen, I have used 6 ResultCol columns for the results, and in the Contact with multiple invoicing addresses example you have 6 columns filled with data. But what if you had used fewer than 6 columns in some rules? How can you prepare the report column layout in the specific rule results subreport? The same problem exists with the SSRS matrix / table column headers. I leave this part up to your imagination.

You can go with a simplistic approach: show all 6 columns in the table report layout, base the column visibility and column header values on some Data quality rule-specific configuration, and solve this issue with a lookup expression. You could also choose a more technical approach and build dynamic SQL statements using the Data quality query metadata to prepare a dataset for a matrix-style report with some pivoting / unpivoting. I will not go into more detail here, as everyone works in a different environment and this is really the fun part where you can spend some time evaluating the best available options. I leave the reporting for Management up to you as well. It is really the fun part where the sky is the limit and the solution is totally based on your preferences.

Conclusion:

This lightweight Data quality framework can be considered pretty easy to build, yet very powerful. You can further expand the functionality in many ways. You can even make this framework call available datasets used, for example, for address cleansing and have some auto-corrections done. You should not forget about row-level security, so the users cannot touch each other's business entities. But the main thought behind this is to provide the users with an easy-to-use interface through which they can easily fix the Data quality issues they are responsible for. Just remember to keep it simple and well performing!

Build your own Data quality framework part 2/3

The database back-end part of this simple framework will be made of these parts:

schema dql – containing all the Data quality objects mentioned below

table dql.ValidationRule – containing a list of Data quality rules with their meta information

table dql.ValidationRun – containing information about the validation runs

table dql.ValidationResult – containing aggregated information with the results needed for your reporting

table dql.ValidationResultItem – containing the lowest level info related to the business object specific errors, i.e. a list of the Contacts having more than 1 invoicing address, filled in SQL_VARIANT datatype columns

stored procedure dql.ValidationRunner – responsible for the Data quality rules evaluation, maintenance of data and aggregations for reporting

stored procedure dql.ValidationRuleAPI – responsible for the Data quality rules management

SQL Agent job DataQuality – responsible for scheduled execution of the stored procedure dql.ValidationRunner
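
To make the structure more tangible, here is a minimal DDL sketch of the schema; the column names come from the rest of this series, while the data types, constraints and the ValidationRunStart column are my assumptions:

CREATE SCHEMA dql;
GO

CREATE TABLE dql.ValidationRule (
    ValidationRule_ID         INT IDENTITY(1,1) PRIMARY KEY,
    ValidationRuleName        NVARCHAR(200)  NOT NULL,
    ValidationRuleDescription NVARCHAR(1000) NULL,
    ValidationRuleQuery       NVARCHAR(MAX)  NOT NULL,
    ValidationRuleColumns     NVARCHAR(1000) NOT NULL,
    IsActive                  BIT NOT NULL DEFAULT 1
);

CREATE TABLE dql.ValidationRun (
    ValidationRun_ID   INT IDENTITY(1,1) PRIMARY KEY,
    ValidationRunStart DATETIME NOT NULL DEFAULT GETDATE(),   -- assumption, not mentioned in the text
    ValidationRunEnd   DATETIME NULL
);

CREATE TABLE dql.ValidationResult (
    ValidationResult_ID  INT IDENTITY(1,1) PRIMARY KEY,
    ValidationRun_ID     INT NOT NULL REFERENCES dql.ValidationRun (ValidationRun_ID),
    ValidationRule_ID    INT NOT NULL REFERENCES dql.ValidationRule (ValidationRule_ID),
    ValidationErrorCount INT NOT NULL
);

CREATE TABLE dql.ValidationResultItem (
    ValidationResultItem_ID INT IDENTITY(1,1) PRIMARY KEY,
    ValidationRun_ID  INT NOT NULL REFERENCES dql.ValidationRun (ValidationRun_ID),
    ValidationRule_ID INT NOT NULL REFERENCES dql.ValidationRule (ValidationRule_ID),
    ResultCol1 SQL_VARIANT NULL,
    ResultCol2 SQL_VARIANT NULL,
    ResultCol3 SQL_VARIANT NULL,
    ResultCol4 SQL_VARIANT NULL,
    ResultCol5 SQL_VARIANT NULL,
    ResultCol6 SQL_VARIANT NULL
);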

Here is a relational schema to give you a better idea:

[relational schema diagram of the dql tables]

And how do we bring the Data Quality evaluation to life?  First we need to fill the Data quality rules and their meta information into the dql.ValidationRule table.

INSERT INTO dql.ValidationRule
( ValidationRuleName
, ValidationRuleDescription
, ValidationRuleQuery
, ValidationRuleColumns
, IsActive )

SELECT
'Contacts with multiple invoicing Address',
'List of all contacts with their owners having multiple invoicing addresses in the CRM',
'SELECT Contact_ID, ContactName, ContactOwner_ID, ContactOwnerName, AddressType, COUNT(*) AS [InvoicingAddressCount]
FROM dbo.Contact WITH (NOLOCK)
WHERE AddressType = ''Invoicing''
GROUP BY Contact_ID, ContactName, ContactOwner_ID, ContactOwnerName, AddressType
HAVING COUNT(*) > 1',
'Contact_ID,ContactName,ContactOwner_ID,ContactOwnerName,AddressType,InvoicingAddressCount',
1 UNION ...

At this phase you can build your simple stored procedure API. Just prepare a stored procedure called dql.ValidationRuleAPI that can either insert, update or delete a Data quality rule based on the provided parameters.
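
A minimal sketch of such an API procedure could look like the following; the @Action convention and the parameter defaults are my assumptions:

CREATE PROCEDURE dql.ValidationRuleAPI
    @Action                    VARCHAR(10),           -- 'INSERT', 'UPDATE' or 'DELETE'
    @ValidationRule_ID         INT            = NULL,
    @ValidationRuleName        NVARCHAR(200)  = NULL,
    @ValidationRuleDescription NVARCHAR(1000) = NULL,
    @ValidationRuleQuery       NVARCHAR(MAX)  = NULL,
    @ValidationRuleColumns     NVARCHAR(1000) = NULL,
    @IsActive                  BIT            = 1
AS
BEGIN
    SET NOCOUNT ON;

    IF @Action = 'INSERT'
    BEGIN
        INSERT INTO dql.ValidationRule
            ( ValidationRuleName, ValidationRuleDescription, ValidationRuleQuery, ValidationRuleColumns, IsActive )
        VALUES
            ( @ValidationRuleName, @ValidationRuleDescription, @ValidationRuleQuery, @ValidationRuleColumns, @IsActive );
    END

    IF @Action = 'UPDATE'
    BEGIN
        UPDATE dql.ValidationRule
        SET ValidationRuleName        = @ValidationRuleName,
            ValidationRuleDescription = @ValidationRuleDescription,
            ValidationRuleQuery       = @ValidationRuleQuery,
            ValidationRuleColumns     = @ValidationRuleColumns,
            IsActive                  = @IsActive
        WHERE ValidationRule_ID = @ValidationRule_ID;
    END

    IF @Action = 'DELETE'
    BEGIN
        DELETE FROM dql.ValidationRule
        WHERE ValidationRule_ID = @ValidationRule_ID;
    END
END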

Next you need to create the stored procedure dql.ValidationRunner, which will take care of the main part. It has to create a row in the dql.ValidationRun table to mark that the validation run has started. Then it needs to iterate through the dql.ValidationRule table, take the value from the ValidationRuleQuery column, build a dynamic SQL statement and fill the dql.ValidationResultItem result table with the results of each Data quality query. ( No need for cursors here; you can refer to this link for a much more lightweight solution using a temp table. )


DECLARE @statement NVARCHAR(MAX);
SET @statement = N'YOUR DATA QUALITY QUERY SELECT.. GOES HERE';

INSERT INTO dql.ValidationResultItem(
ValidationRun_ID ,
ValidationRule_ID ,
ResultCol1 ,
ResultCol2 ,
ResultCol3 ,
ResultCol4 ,
ResultCol5 ,
ResultCol6 )

EXEC sp_executesql @statement;

Note: In case you have a different result column count for each query, you need to generate the INSERT INTO… statement in a dynamic SQL manner as well. You can use some simple logic based on the column count metadata for the specific rule. The result would look like EXEC sp_executesql @statement, where @statement is set to something like: 'INSERT INTO dql.ValidationResultItem (…) SELECT DATA QUALITY QUERY SELECT GOES HERE'…
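
Here is a minimal sketch of that dynamic approach; how @colCount gets derived from the ValidationRuleColumns metadata is left out, and the rule query placeholder is of course illustrative:

DECLARE @ruleQuery NVARCHAR(MAX) = N'YOUR DATA QUALITY QUERY SELECT GOES HERE';
DECLARE @colCount INT = 4;            -- in reality derived from dql.ValidationRule.ValidationRuleColumns
DECLARE @colList NVARCHAR(200) = N'';
DECLARE @i INT = 1;

-- build 'ResultCol1, ResultCol2, ...' up to the rule's column count
WHILE @i <= @colCount
BEGIN
    SET @colList += CASE WHEN @i > 1 THEN N', ' ELSE N'' END
                  + N'ResultCol' + CAST(@i AS NVARCHAR(10));
    SET @i += 1;
END

-- the rule query is expected to return ValidationRun_ID, ValidationRule_ID
-- and then exactly @colCount result columns
DECLARE @statement NVARCHAR(MAX) =
      N'INSERT INTO dql.ValidationResultItem (ValidationRun_ID, ValidationRule_ID, ' + @colList + N') '
    + @ruleQuery;

EXEC sp_executesql @statement;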

After each rule is finished, this SP also needs to write the number of found results for the rule into the dql.ValidationResult table. Once all the rules are finished and all the results are inserted into the result table, this SP needs to mark the end of the validation run back in the dql.ValidationRun table and do some maintenance so that the result table does not grow out of control. This table could easily become really huge, considering the job is scheduled once an hour, evaluating on average 50 rules with approximately 100 results per rule. Avoid using a DELETE statement here at any cost; use TRUNCATE. I have spilled the results of the last 2 runs into a temp table, truncated the result item table and inserted the spilled results back. I am pretty sure you can come up with a more elegant solution.
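
For illustration, here is a minimal sketch of that spill / truncate / re-insert step; it assumes run IDs are contiguous:

DECLARE @CurrentRun_ID INT = (SELECT MAX(ValidationRun_ID) FROM dql.ValidationRun);   -- the run that has just finished

-- spill the items of the last 2 runs
SELECT ValidationRun_ID, ValidationRule_ID,
       ResultCol1, ResultCol2, ResultCol3, ResultCol4, ResultCol5, ResultCol6
INTO #LastTwoRuns
FROM dql.ValidationResultItem
WHERE ValidationRun_ID >= @CurrentRun_ID - 1;

-- TRUNCATE instead of DELETE, as mentioned above
TRUNCATE TABLE dql.ValidationResultItem;

-- put the spilled results back
INSERT INTO dql.ValidationResultItem
    ( ValidationRun_ID, ValidationRule_ID,
      ResultCol1, ResultCol2, ResultCol3, ResultCol4, ResultCol5, ResultCol6 )
SELECT ValidationRun_ID, ValidationRule_ID,
       ResultCol1, ResultCol2, ResultCol3, ResultCol4, ResultCol5, ResultCol6
FROM #LastTwoRuns;

DROP TABLE #LastTwoRuns;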

Here is a simple UML diagram explaining the dql.ValidationRunner stored procedure workflow.

[UML diagram of the dql.ValidationRunner stored procedure workflow]

One important note: DO make sure all the Data quality queries have NOLOCK hints. You don't want to block your production CRM application with your queries. I also strongly advise running the framework on a different server than your production CRM database.

Build your own Data quality framework part 1/3

This time I would like to talk about building your own simple Data quality framework using T-SQL. The framework will be implemented as a SQL Agent job evaluating the Data quality queries against your client's CRM database and persisting the results of the last 2 runs in a separate DB schema, with some front end for Data quality issue fixing and reporting. The solution is also generic ( not dependent on column names ), so you can combine all kinds of Data quality rules the business can think of.

This tutorial will have 3 parts. In the first part, I would like to give a high-level overview of what we are building and how this Data quality tool can work. The second part will focus on the DB back-end steps needed, and the third part will focus on the Data quality fixing and reporting part.

Let me start by saying that Data quality is absolutely critical when it comes to a group of people entering data into a system. I have a lot of experience from an environment where an internal CRM / ERP system was being developed. Issues in Data quality would really be business-stoppers there ( and probably anywhere else ). Imagine that you would like to invoice your client, but someone from the back-office team has entered an incomplete address, or you have multiple addresses assigned to the client and you don't know which one should be used for invoicing. And trust me, managers who cannot invoice their clients are not the right people to mess with. I am pretty sure you can come up with plenty of issues like this.

With the solution I am about to present, you don't have to buy a pricey 3rd party application to solve these kinds of issues. So let's go through the possible scenarios.

1. The Business Analyst knows exactly what is needed to ensure the required data quality on the client's side and manages these rules, written in SQL, through an API he is provided with. This API could be implemented as a simple stored procedure used for insert / delete / update based on the provided parameters.

2. The CRM Manager comes up with the rules needed to reach the required data quality in the client application. Let's suppose this person does not know how to write SQL code, so these rules are handed over to the Business Analyst as verbally described definitions ( e.g. 1 contact can have multiple addresses, but only one address can be used for invoicing ) and the Business Analyst translates these rules into SQL, for example like this:


SELECT Contact_ID, ContactName, ContactOwner_ID, ContactOwnerName, AddressType, COUNT(*) AS [InvoicingAddressCount]
FROM dbo.Contact WITH (NOLOCK)
WHERE AddressType = 'Invoicing'
GROUP BY Contact_ID, ContactName, ContactOwner_ID, ContactOwnerName, AddressType
HAVING COUNT(*) > 1

and passes them into the API.

3. You could also have the DB developer come up with these rules and implement them. ( Definitely not my preferred scenario. )

[overview diagram of the Data quality framework]

4. Once the rules are in place and the DB job is scheduled and evaluating them against the CRM database, there is a need for 2 front ends on the other side. One front end will be used for Data quality issue fixing by the CRM users, and the second front end will be used for reporting to Management. Seeing how well your employees manage the Data quality of their business objects is an interesting KPI.

Next time, I will talk more about the DB project implementation.

Excel VBA for pivot table report safe delivery

As mentioned in the previous Excel VBA post, I would like to share my thoughts on the safe delivery of Excel pivot table reports. The best I can recommend right now is an Excel file with an action that, on the file open event, hides all the rows in the pivot tables, so the user cannot see the data that was pulled and saved into the sheet by someone else last time. The user has to refresh the data, and that's where the provided credentials come into play, right? The next thing I always do is put a password on the VBA code in the file. You have a tutorial for this step here. That way you ensure the user cannot see the macro VBA code and cannot do anything other than execute it.


Private Sub Auto_Open()

Dim pf As PivotField
Dim PT As PivotTable
Dim WS As Worksheet
Dim aLock As Integer

On Error Resume Next

For Each WS In ActiveWorkbook.Worksheets
For Each PT In WS.PivotTables
With PT
.ManualUpdate = True
.EnableDrilldown = False
.EnableFieldList = False
.EnableFieldDialog = False
.EnableWriteback = False

For Each pf In .PivotFields
With pf
.DragToPage = False
.DragToRow = False
.DragToColumn = False
.DragToData = False
.DragToHide = False

End With
Next pf

End With

Next PT
Next WS

End Sub

 

Excel VBA macro for report sheet injection

Today I would like to talk a bit about Excel VBA macros. Recently I needed to come up with a macro for an Excel 2016 report sheet that would inject this report sheet into every Excel file in a directory. This report sheet obviously contains data connections, which are also distributed to the end users. For distribution purposes, the macro replaces the report sheet injected last time with the current version. ( Using this, you can easily develop and distribute new versions of the report sheet to your users. ) Sounds like some pretty useful stuff, doesn't it?

I need to mention that I am not a dedicated VBA developer, so you might find the attached code a little bit rough, but it works perfectly. The code was also checked by the Office Code Compatibility Checker tool ( available here ). This is a pretty cool tool from Microsoft for ensuring your VBA code is compatible between MS Office versions and across the classic 32/64-bit issue. You just launch this tool and it automatically rewrites and comments your code.

Now let's get to the data connections, as there are definitely more options. I have been experimenting with these two: you can build the .odc connection in the workbook from the .odc connection on the target machine, or you can build the connection string on your own from the VBA code. I chose the second option in the final version, but I leave this up to your consideration and whatever better suits your environment. You can also see that there is Auto_Open VBA macro code being executed for every file injection. I will focus on this next time; you can just skip this step for now.

I think that I will do more posts on VBA macros in Excel in the future. I would also like to share my experience with the way you can deliver and secure data delivered to your Excel report users using VBA.

Sub Distribute_Sheet()
Application.DisplayAlerts = False
Dim MyFolder As String
Dim SrcWks As Worksheet
Dim Wkb As Workbook
Dim file As Variant
Dim delfile As Object
Set SrcWks = ThisWorkbook.Worksheets("_yourFileName_")
MyFolder = ThisWorkbook.Path & "\"

  file = Dir(MyFolder)
  While (file <> "")
   If InStr(file, "_yourFileName_") = 0 And InStr(file, ".xlsx") > 0 Then

      Set Wkb = Workbooks.Open(Filename:=ThisWorkbook.Path & "\" & file, ReadOnly:=False)

      Application.Run "Auto_Open"

      MsgBox ("Injecting _yourWorksheetName_ sheet with ext.data connection into: " & MyFolder & file)
        For Each delfile In Wkb.Sheets
        If delfile.Name = "_yourWorksheetName_" Then delfile.Delete
        Next delfile

        Dim strConnection As String
        Dim WBC As WorkbookConnection
        For Each WBC In ActiveWorkbook.Connections
          strConnection = WBC.Name
          If InStr(WBC.OLEDBConnection.Connection, "_yourConnOLAPName_") = 0 Then
          With ActiveWorkbook.Connections(strConnection). _
            OLEDBConnection
            .CommandText = .CommandText
            .CommandType = .CommandType
            .Connection = .Connection & ";CustomData=""_yourConnOLAPName_"""
            .RefreshOnFileOpen = True
            .SavePassword = False
            .SourceConnectionFile = ""
            .MaxDrillthroughRecords = 1
            .ServerCredentialsMethod = .ServerCredentialsMethod
            .AlwaysUseConnectionFile = False
            .RetrieveInOfficeUILang = False
          End With
          With ActiveWorkbook.Connections(strConnection)
            .Name = .Name
            .Description = ""
          End With
          ActiveWorkbook.Connections(strConnection).Refresh
          End If

         Next WBC

      SrcWks.Copy Before:=Wkb.Sheets(1)

      Wkb.ShowPivotTableFieldList = False

      Wkb.Close SaveChanges:=True

   End If
   file = Dir
 Wend

Application.DisplayAlerts = True

End Sub

SSRS reports personalisation

I would like to talk here about SSRS report personalisation. The point is that there are situations when you feel you could do a little more to ease the use of some specific reports.

For example, you have 20 additional columns and you can show or hide these columns based on what the user chose to see last time. Obviously this solution must also contain a parameter listing all the possible columns, called ShowFields. Now, we know that SSRS 2008 lets you receive parameterized reports based on a subscription to reports that run on demand. When you subscribe to a report that runs on a snapshot basis, your subscription must use the parameter values defined in the snapshot. But what if you cannot use subscriptions?

I came across 2 scenarios where report subscription was not possible. One was a browser-based CRM / ERP type application which used embedded SSRS reports running in an iframe in the app, and the second was a custom-built container-browser for an SSRS reports project. Personally I don't feel these options are that rare, as both were used at my last employer, and the second one, the custom-built container for SSRS reports, was a front-end to a pretty robust financial Data Warehouse solution.

So what will be the steps in this process?

1. If the user runs the report for the first time, the ShowFields multi-valued report parameter shows all column names for this report. The user may change the values chosen in the ShowFields parameter, and they are sent, together with the SSRS global variables holding the Windows login of the user and the report name, to the stored procedure ReportColumnParameterCache_DeleteInsert called at the end of the stored procedure that fills this report's dataset

2. This SP deletes the cache from last time and inserts the current values

3. The next time the user runs the report, the ShowFields multi-valued parameter default values are filled from a UDF retrieving the cached results for the current user and the current report

4. The process repeats, except that now the parameter defaults to the fields chosen by the user last time

Let me guide you through the steps needed to accomplish this report personalization. Let's start with some DB back-end stuff. I remember when I was developing this solution, I prepared a report-level configuration table rpt.Report holding Report_ID INT set as an identity column, ReportName, ReportDescription, an IsActive flag and an IsVisible flag ( for report listing purposes in the application: this way you could really quickly state that a report is under construction, is obsolete etc. ). The sky is the limit here. The second configuration table rpt.ReportColumn is used for the report column configuration, holding ReportColumn_ID INT as identity, Report_ID from the rpt.Report table, ReportColumnName, an IsActive flag, an IsVisible flag for country-specific purposes ( when a column might never be needed for users from some country in a multi-country environment ) and an IsAdditional flag determining whether this column is an additional or a non-additional column. The next step might be a little bit time consuming, as you must fill these tables with values, but you can come up with some clever string concatenation generating the insert statements in Excel.

Now you need to create a table holding the report parameter cache. It must be a quick and narrow one for solid performance. I called mine rpt.ReportColumnParameterCache and it had just ReportColumnParameterCache_ID as an identity column, ReportColumn_ID, and the Windows user login in a ReportUser NVARCHAR(100) column. MSDN describes UserID as "The ID of the user running the report. If you are using Windows Authentication, this value is the domain account of the current user. The value is determined by the Reporting Services security extension, which can use Windows Authentication or custom authentication."

[diagram of the rpt configuration tables]
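
If it helps, here is a minimal DDL sketch of these three configuration tables as described above; the data types not mentioned in the text are my assumptions:

CREATE SCHEMA rpt;
GO

CREATE TABLE rpt.Report (
    Report_ID         INT IDENTITY(1,1) PRIMARY KEY,
    ReportName        NVARCHAR(200)  NOT NULL,
    ReportDescription NVARCHAR(1000) NULL,
    IsActive          BIT NOT NULL DEFAULT 1,
    IsVisible         BIT NOT NULL DEFAULT 1
);

CREATE TABLE rpt.ReportColumn (
    ReportColumn_ID  INT IDENTITY(1,1) PRIMARY KEY,
    Report_ID        INT NOT NULL REFERENCES rpt.Report (Report_ID),
    ReportColumnName NVARCHAR(200) NOT NULL,
    IsActive         BIT NOT NULL DEFAULT 1,
    IsVisible        BIT NOT NULL DEFAULT 1,
    IsAdditional     BIT NOT NULL DEFAULT 0
);

CREATE TABLE rpt.ReportColumnParameterCache (
    ReportColumnParameterCache_ID INT IDENTITY(1,1) PRIMARY KEY,
    ReportColumn_ID INT NOT NULL REFERENCES rpt.ReportColumn (ReportColumn_ID),
    ReportUser      NVARCHAR(100) NOT NULL
);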

And what are the next steps? We have to create a stored procedure ( let's name it rpt.ReportColumnParameterCache_DeleteInsert ) which has input parameters for the SSRS report parameters UserID, ReportName (global variables) and ColumnNames ( ColumnNames would look like Column1,Column2,Column3, coming from the ShowFields report multi-valued parameter explained in the next paragraph ). First, this SP deletes your login-related cache for this report from the last time. I remember doing some basic measurements using time statistics, and the scenario with deleting and inserting the cache rows was a little bit faster in my environment than any other option available. In the next step, this SP translates the ReportName to the Report_ID from rpt.Report and then the ColumnNames into ReportColumn_IDs using a table-valued UDF and an inner join to rpt.ReportColumn based on ReportColumnName and ReportName. Here are some examples of how to implement a custom split-string function, or in case you are lucky and already using MSSQL 2016, you can go with the built-in STRING_SPLIT function. In the last step, this SP inserts these values into the rpt.ReportColumnParameterCache table. This SP has to be located at the end of the stored procedure used for retrieving the data for the report's main dataset.
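
A minimal sketch of that procedure could look like this, assuming SQL Server 2016's STRING_SPLIT is available (on older versions, swap in your own split-string function):

CREATE PROCEDURE rpt.ReportColumnParameterCache_DeleteInsert
    @UserID      NVARCHAR(100),
    @ReportName  NVARCHAR(200),
    @ColumnNames NVARCHAR(MAX)      -- e.g. 'Column1,Column2,Column3' from ShowFields
AS
BEGIN
    SET NOCOUNT ON;

    -- delete this user's cache for this report from the last run
    DELETE RCPC
    FROM rpt.ReportColumnParameterCache RCPC
    JOIN rpt.ReportColumn RC ON RC.ReportColumn_ID = RCPC.ReportColumn_ID
    JOIN rpt.Report R        ON R.Report_ID = RC.Report_ID
    WHERE RCPC.ReportUser = @UserID
      AND R.ReportName = @ReportName;

    -- insert the currently chosen columns
    INSERT INTO rpt.ReportColumnParameterCache ( ReportColumn_ID, ReportUser )
    SELECT RC.ReportColumn_ID, @UserID
    FROM STRING_SPLIT(@ColumnNames, ',') S
    JOIN rpt.ReportColumn RC ON RC.ReportColumnName = LTRIM(RTRIM(S.value))
    JOIN rpt.Report R        ON R.Report_ID = RC.Report_ID
    WHERE R.ReportName = @ReportName;
END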

On the SSRS side, we need to create a report parameter holding multiple string values, called ShowFields. This parameter should have its available values filled from a query showing the available columns for this report. The default values would be the values you have stored in your rpt.ReportColumnParameterCache table for your Windows login and this report, so in this case it would be the result of a function containing a select where the ReportUser value equals your Windows login (Globals!UserID) and the ReportName (Globals!ReportName) equals the value in the rpt.Report table.
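
For the default values, a minimal sketch of such a function might look like this; the function name rpt.GetCachedReportColumns is just my placeholder:

CREATE FUNCTION rpt.GetCachedReportColumns
(
    @UserID     NVARCHAR(100),
    @ReportName NVARCHAR(200)
)
RETURNS TABLE
AS
RETURN
(
    -- the column names cached for this user and this report during the last run
    SELECT RC.ReportColumnName
    FROM rpt.ReportColumnParameterCache RCPC
    JOIN rpt.ReportColumn RC ON RC.ReportColumn_ID = RCPC.ReportColumn_ID
    JOIN rpt.Report R        ON R.Report_ID = RC.Report_ID
    WHERE RCPC.ReportUser = @UserID
      AND R.ReportName = @ReportName
);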

The last step is to set the affected columns' visibility in the SSRS report table / matrix based on an expression saying something like:

=IIF(InStr(Join(Parameters!ShowFields.Value, ", "), "Column1") <> 0 AND InStr(Join(Parameters!ShowFields.Value, ", "), "None") = 0, False, True)

Rebuild / reorganize those indexes

Lately I brought back to life a script that I would like to briefly talk about. It is made of dynamic T-SQL code which I find pretty neat and really useful when it precedes the OLAP processing tasks ( ideally together in an SSIS package with a consecutive statistics update task using sp_updatestats ).

I am not going to share the whole thing, but I will give some hints how to build this code for yourself.

In the declaration part, you set the variables with the fragmentation thresholds, which determine whether an index is to be rebuilt or just reorganized, and the minimal index page count. Just a hint here: we currently have this set up as a 14 percent threshold to reorganize, 35 percent to rebuild and 500 as the minimal page count.

The next step is to create a temp table which will hold the index names and the dynamically generated rebuild / reorganize statements. After this table is filled, you iterate through it and execute the statement for each row; a minimal sketch of that loop follows at the end of this post.

The main magic, however, happens in this query:

SELECT
b.name as key_name
,a.avg_fragmentation_in_percent
,a.page_count, c.name as Tab_name
,d.name as Schema_name
,a.partition_number as Partition_Number
,'ALTER INDEX [' + b.name + '] ON [MyDB].[' + d.name + '].[' + c.name + '] '
+ CASE
WHEN a.avg_fragmentation_in_percent BETWEEN @reorg_pct AND @rebuild_pct
THEN 'REORGANIZE'
WHEN a.avg_fragmentation_in_percent > @rebuild_pct
THEN 'REBUILD'
END
+ CASE
WHEN a.partition_number > 1
THEN ' PARTITION = ' + CAST(a.partition_number AS NVARCHAR)
ELSE ''
END
AS index_maintenance
FROM sys.dm_db_index_physical_stats (DB_ID(), NULL, NULL, NULL, NULL) a
JOIN sys.indexes b ON b.object_id = a.object_id AND b.index_id = a.index_id
JOIN sys.objects c ON c.object_id = b.object_id
JOIN sys.schemas d ON d.schema_id = c.schema_id
WHERE a.index_id > 0 AND c.name NOT LIKE 'sys%' AND a.page_count > @pagecnt
AND a.avg_fragmentation_in_percent > @reorg_pct
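
To round the script off, here is a minimal sketch of the temp-table iteration described above; filling the worktable is left as a comment, since it simply wraps the query from this post:

-- worktable holding one generated statement per index / partition
CREATE TABLE #index_maintenance
(
    id                INT IDENTITY(1,1) PRIMARY KEY,
    index_maintenance NVARCHAR(MAX) NOT NULL
);

-- fill the worktable by prefixing the SELECT from this post, e.g.:
-- INSERT INTO #index_maintenance (index_maintenance)
-- SELECT index_maintenance FROM ( ...the query above... ) AS q;

DECLARE @id INT, @sql NVARCHAR(MAX);

-- iterate without a cursor: take the lowest remaining id, run it, remove it
WHILE EXISTS (SELECT 1 FROM #index_maintenance)
BEGIN
    SELECT TOP (1) @id = id, @sql = index_maintenance
    FROM #index_maintenance
    ORDER BY id;

    EXEC sp_executesql @sql;

    DELETE FROM #index_maintenance WHERE id = @id;
END

DROP TABLE #index_maintenance;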