Pre-Register for 2020

Summit Blog

Official Bloggers Shabnam Watson Another Look at PASS Summit 2019 Sessions Using Power BI Desktop and “Get Data From Web”

Another Look at PASS Summit 2019 Sessions Using Power BI Desktop and “Get Data From Web”

28 Oct 2019 | Posted by Shabnam Watson

Last week I published a post on how I used Power BI Desktop to import PASS Summit 2019 session schedule and build a few reports for myself to help me sort my sessions of interest by day/time/room and decide which ones to attend.

In that post, I had initially tried to import the schedule with “Get Data From Web” within Power BI Desktop but was not able to do so because I ran into the following error and did not know how to get around it:

Unable to Connect: We encountered an error while trying to connect. Details:”The server committed a protocol violation. Section=ResponseHeader Detail=CR must be followed by LF”

It looks like something on this page is not following the expected HTML pattern so “Get Data From Web” is not able to pull its data directly.

Thankfully, Reza Rad showed me an alternative way to get data from a web site at SQL Saturday Atlanta. I was able to use this method that I will explain later in this post to pull the sessions schedule directly from PASS’s website and bypass the manual export to PDF/Excel and import into Power B Desktop steps. This makes it a lot easier to keep the report up to date with changes in the schedule as they can sometimes happen at the last minute.

You can find the new pbix file here: “PASS Summit 2019 Sessions From Web Live.pbix“.

Sessions Summary

In this Power BI Desktop report, in addition to the session categories I had used previously, I added some Word Cloud and Q&A visuals.

The Q&A feature is pretty cool. You can ask questions such as “sessions on Power BI” or any other topic and get a list of the sessions. This works out of the box without any more configurations. You can also ask questions such as which speaker has the most session count? (Hint: The answer is Hamish Watson as seen in the Speaker Word Cloud visual. Check out all the Speaker Idol sessions).

The following shows the “Power BI” report section which is one of the session categories I am interested in.

One of my favorite reports is the “Room Schedule” report. It looks like if you don’t like walking and want to stay at one room/building, any of the TCC rooms will have plenty of Power BI/Azure/AI sessions.

Get Data From Web and PASS 2019 schedule

If you try “Get Data From Web” to pull data from PASS Summit 2019 schedule page, you get an error. This method works for a lot of pages but for some reason it does not work for this page.

You get this error:

Unable to Connect: We encountered an error while trying to connect. Details:”The server committed a protocol violation. Section=ResponseHeader Detail=CR must be followed by LF”

One way to get around this, is to start with “Edit Queries” from scratch and use the M language function Web.Page(Web.Contents()) to pull the data. Instead of doing this, you can connect to a different web page (Rotten Tomatoes Top Movies) that Power BI is able to read from and edit the web address later in the Query Editor. That’s what I did.

Start with Get Data From Web:

Type in the web address of a page that has any kind of a list in it, for example: https://www.rottentomatoes.com/top/

Once the page contents show in the Navigator window, choose any Table from the page and click on Transform Data.

When the Query Editor opens, under Applied Steps, remove “Changed Type” and “Navigation” steps. Click on the “Source” step:

This step shows the M function Web.Page(Web.Contents()) that is used to pull the page data:

= Web.Page(Web.Contents(“https://www.rottentomatoes.com/top/”))

Change the web address to https://www.pass.org/summit/2019/Learn/Schedule.aspx

Once the contents load, you can see that each day of conference has its own Table. Click on each Table to add it to the model.

You can download the pbix file shown above from here.

Once you have this file, it is easy to do some clean up and end up at the pbix file that I used to look at the sessions.

Hint: You can get one of the days cleaned up to the form you want, duplicate it and change the navigation step to a different day by changing the number in “Source{xx}[Data]” as highlighted in the picture above,

The final pbix file is available here:

PASS Summit 2019 Sessions From Web Live.pbix

See original blog post here.

Meet more Summit bloggers

See All

cage-aids
cage-aids
cage-aids
cage-aids