
Web scraping GitHub with Power BI




Have you tried the GitHub connector in Power BI? You’ll find the repo’s change history and statistics, but not much more. What if you need raw code or binary data for a report? Read on.


Goal: show Azure Icons and metadata based on a GitHub repo.





Ben Coleman has a superb write-up and repo of Azure Icons (past and present). He also has some tools and artifacts for web scraping.


Ben also has a well-written web gallery for viewing, searching and downloading Azure Icons.

In this post I’ll walk you through creating a Power BI report from a GitHub repo of images, using the “Add Table using Examples” feature of Power BI’s Web connector:



  1. Locate the repo.

  2. In Power BI Desktop, connect to the repo’s web page using the Web connector.

  3. Click “Add Table using Examples”.

  4. Scrape the URLs of the images.

  5. Create a calculated column that links to GitHub’s “raw” URL.

  6. Use a table, image grid or other Power BI visual to render the images.

  7. Publish the report and schedule refreshes if desired.
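
Step 5 works because GitHub serves a file’s contents from raw.githubusercontent.com under the same path with the /blob segment removed. A minimal sketch of that mapping in M, assuming a hypothetical helper named BlobToRaw and a made-up file name (neither appears in the report):

```powerquery
let
    // Hypothetical helper (not part of the original query): maps a
    // repo-relative "blob" path to its raw.githubusercontent.com address.
    BlobToRaw = (blobPath as text) as text =>
        Text.Replace(
            blobPath,
            "/benc-uk/icon-collection/blob",
            "https://raw.githubusercontent.com/benc-uk/icon-collection"
        ),
    // Made-up file name, used only to illustrate the mapping.
    Example = BlobToRaw("/benc-uk/icon-collection/blob/master/azure-icons/Example-Icon.svg")
in
    Example
```

Pasted into a blank query, this should return the sample path rewritten onto the raw host, with master/azure-icons/Example-Icon.svg following the repo name directly.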


Here's the M code. See AzureIcons2021.pbix on GitHub:


let
    // Load the rendered HTML of the repo's folder listing
    Source = Web.BrowserContents("https://github.com/benc-uk/icon-collection/tree/master/azure-icons"),
    // Extract each file's name and its relative link (href) from the listing rows
    #"Extracted Table From Html" = Html.Table(Source, {{"Icon Name", ".Link\-\-primary.js-navigation-open"}, {"URL", "[data-pjax=""\#repo-content-pjax-container""]", each [Attributes][href]?}}, [RowSelector=".Box-row + *"]),
    // Build the raw-content URL by swapping the "/blob" path prefix for the raw.githubusercontent.com host
    #"Inserted Replaced Text" = Table.AddColumn(#"Extracted Table From Html", "Icon URL", each Text.Replace([URL], "/benc-uk/icon-collection/blob", "https://raw.githubusercontent.com/benc-uk/icon-collection"), type text),
    // Turn the scraped relative link into an absolute github.com URL
    #"Added Prefix" = Table.TransformColumns(#"Inserted Replaced Text", {{"URL", each "https://github.com" & _, type text}}),
    #"Renamed Columns" = Table.RenameColumns(#"Added Prefix",{{"Icon Name", "File Name"}}),
    // Derive a display name: copy the file name, then replace hyphens with spaces and drop ".svg"
    #"Duplicated Column" = Table.DuplicateColumn(#"Renamed Columns", "File Name", "File Name - Copy"),
    #"Renamed Columns1" = Table.RenameColumns(#"Duplicated Column",{{"File Name - Copy", "Icon"}}),
    #"Replaced Value" = Table.ReplaceValue(#"Renamed Columns1","-"," ",Replacer.ReplaceText,{"Icon"}),
    #"Replaced Value1" = Table.ReplaceValue(#"Replaced Value",".svg","",Replacer.ReplaceText,{"Icon"})
in
    #"Replaced Value1"
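
The last four steps of the query derive a readable icon name from the file name. As a sketch only, the same result could be reached with a single added column. The sample table and its file names below are made up for illustration and stand in for the scraped data:

```powerquery
let
    // Made-up file names standing in for the scraped "File Name" column.
    Sample = #table({"File Name"}, {{"Example-Icon.svg"}, {"Another-Example-Icon.svg"}}),
    // Drop the ".svg" extension, then turn hyphens into spaces.
    WithIcon = Table.AddColumn(
        Sample,
        "Icon",
        each Text.Replace(Text.Replace([File Name], ".svg", ""), "-", " "),
        type text
    )
in
    WithIcon
```

For the two sample rows, the Icon column should read "Example Icon" and "Another Example Icon". The duplicate-and-replace steps in the report produce the same values; this form just keeps the cleanup in one place.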





 
 
 

©2021 by Snoozy Data. Proudly created with Wix.com
