Microsoft Sentinel – Incident Enrichment with urlscan.io

Getting a SOC analyst the data they need during an investigation is critical to driving down security incident response time. Microsoft Sentinel provides a fantastic place to do incident investigation and response, and third-party services can be woven into the response lifecycle to give the analyst contextual enrichment data. Many of the solutions on the Content Hub or available via the community can already pull in data from external sources to enrich alerts and incidents, but there are other very powerful third-party services that can be tapped for insights as well.

This blog is going to walk through:

  • Signing up for urlscan.io
  • Working with Postman to understand the urlscan.io API
  • Utilizing the Postman Collection to create a custom Logic App Connector
  • Building out a logical flow for utilizing the API for incident enrichment
  • Building a Logic App to enrich a Sentinel Incident

icons

What is urlscan.io?

Per the company’s website (located here: About – urlscan.io):

urlscan.io is a free service to scan and analyse websites. When a URL is submitted to urlscan.io, an automated process will browse to the URL like a regular user and record the activity that this page navigation creates. This includes the domains and IPs contacted, the resources (JavaScript, CSS, etc) requested from those domains, as well as additional information about the page itself. urlscan.io will take a screenshot of the page, record the DOM content, JavaScript global variables, cookies created by the page, and a myriad of other observations. If the site is targeting the users of one of the more than 400 brands tracked by urlscan.io, it will be highlighted as potentially malicious in the scan results.

urlscan.io provides 3 main APIs for free. There are commercial options that can be explored on the company’s website for higher quotas and additional functionality.

APIs:

    1. Search: searches urlscan.io to see if the submitted website has already been scanned and then returns those results
      1. Input: URL query string
        1. method: GET
        2. q=domain:whateverdomain.com (domain for which to search)
        3. size=number (number of results to return)
        4. search_after=number (batch retrieval – not used in this post)
      2. Response
        1. JSON response with an array of results (up to ‘size’) sorted most recent to oldest. Response includes URL to a site screenshot as well as a URL to the full JSON result for each result.
    2. Submission: submits a URL to be scanned by urlscan.io. Returns a UUID that can be used to retrieve those specific scan results
      1. Input: POST body
        1. url: url to scan
        2. visibility: defaults to value in account settings if not specified
        3. tags: array of tags that can be passed (optional)
      2. Response
        1. JSON response with success/fail messages, information about the API, and most importantly the uuid field which can be used with the result API to retrieve the specific results of this scan
    3. Result: accepts the UUID from a previous submission and returns that scan’s specific results
      1. Input:
        1. use the uuid value from the Submission API within the URL path
      2. Response:
        1. The full result set including URLs, certificates, hosting, etc. Of highest interest, URLs to:
          1. Screenshot of the page
          2. Full result set
          3. Initial DOM code
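The three calls above can be sketched in Python as plain request builders. This is a minimal sketch, not the official client: the base URL and the `/search/`, `/scan/`, and `/result/{uuid}/` paths are assumptions based on the public API documentation, and the function names are hypothetical. Nothing is sent over the network here; each function only constructs the request so the shape of the call is visible.

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL for the urlscan.io v1 API.
BASE_URL = "https://urlscan.io/api/v1"


def build_search_request(api_key, domain, size=1):
    """Search API: GET with the query passed in the URL query string."""
    query = urllib.parse.urlencode({"q": f"domain:{domain}", "size": size})
    return urllib.request.Request(
        f"{BASE_URL}/search/?{query}",
        headers={"API-Key": api_key},
        method="GET",
    )


def build_submission_request(api_key, url, visibility=None, tags=None):
    """Submission API: POST with a JSON body; visibility and tags are optional."""
    body = {"url": url}
    if visibility is not None:
        body["visibility"] = visibility
    if tags:
        body["tags"] = list(tags)
    return urllib.request.Request(
        f"{BASE_URL}/scan/",
        data=json.dumps(body).encode("utf-8"),
        headers={"API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def build_result_request(api_key, uuid):
    """Result API: GET with the uuid from a submission placed in the URL path."""
    safe_uuid = urllib.parse.quote(uuid, safe="")
    return urllib.request.Request(
        f"{BASE_URL}/result/{safe_uuid}/",
        headers={"API-Key": api_key},
        method="GET",
    )
```

Passing any of these objects to `urllib.request.urlopen` would perform the actual call, which counts against your quota.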

Straightforward – now to sign up!

urlscan.io Account

Signing up is easy and free. Again, there are commercial options, so make sure to explore what makes sense for you and your organization. For demo purposes, I am sticking with the free tier.

urlscan-signup

Username & Password…done. Once signed up, you can look at your account and see your quotas:

urlscan-account-info

(I recommend enabling 2FA.) A noteworthy item here is the visibility of your scans. You can set the default for your scans to Public (everyone can see), Unlisted (only vetted security researchers with Pro licenses can see), or Private (visible only to you). Set this to your preference. More information here: API Documentation – urlscan.io

Now that you have an account, the next step is to generate and save off your API key:

urlscan-api-key

Done! We’re now ready to play with the API.

Postman

When I start playing with an API, my tool of choice is Postman. Postman is available here: Download Postman | Get Started for Free

There are many benefits to using Postman, but a huge one for this scenario is that the Postman collection can be directly imported into Azure Logic Apps Custom Connectors to do a bunch of the heavy lifting for utilizing the API. I created a very simple collection with the 3 main API calls for urlscan.io.

Download here: sentinel/urlscan_io at main · scomurr/sentinel (github.com)

Import the collection into Postman and then set your API key. Set it at the top level so that it is inherited for each of the calls:

postman-set-api-key

Now that the API-Key header is set, let’s play with the calls.

Search:

urlscan-search-noresults

Notice that when I specify a domain that doesn’t exist, the Search API returns no results. The same will be true if the site has never been scanned.

Here, I’m specifying a domain that does exist and has been scanned:

postman-search-microsoft

Excellent! The API key is working and I have results. By specifying size=1, I am returning only the most recent scan. Logic can be wrapped around the timestamp to ensure the scan is fresh.
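The freshness check mentioned above could look something like the following. This is a sketch under assumptions: it presumes the Search response has a `results` array sorted newest first and that each entry carries an ISO 8601 timestamp at `task.time`; the function names and the 24-hour default are illustrative.

```python
from datetime import datetime, timedelta, timezone


def parse_scan_time(value):
    """Parse an ISO 8601 timestamp such as '2022-09-26T17:58:14.152Z'."""
    # Older Python versions do not accept a trailing 'Z', so swap it for an offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def is_fresh(search_response, now=None, max_age=timedelta(hours=24)):
    """Return True if the most recent search result is younger than max_age."""
    results = search_response.get("results") or []
    if not results:
        # Never scanned (or unknown domain): nothing to reuse.
        return False
    # Assumed field: results[0].task.time holds the scan timestamp.
    scanned_at = parse_scan_time(results[0]["task"]["time"])
    now = now or datetime.now(timezone.utc)
    return now - scanned_at < max_age
```

Passing `now` explicitly keeps the function easy to test; in normal use it can be omitted.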

Moving on to the Submission API, I am going to submit my test domain (jhgfdsa.com) to the API to get a fresh scan.

postman-urlscan-submission

Note the uuid. A key thing to note here is that the scan will not be immediately available. It can take a few minutes depending on the complexity of the scan, how busy the urlscan.io services are, how busy the scanned site is, etc. This will have to be factored in when it comes to the automation logic.
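Because the scan is not available immediately, any automation has to wait and retry. A minimal polling sketch is shown here. The `fetch_result` callable is hypothetical: it is assumed to return the parsed result once the scan is ready and `None` while it is still pending. The defaults of four attempts and a 60-second delay mirror the playbook described later in this post.

```python
import time


def poll_for_result(fetch_result, uuid, attempts=4, delay_seconds=60, sleep=time.sleep):
    """Call fetch_result(uuid) after each delay until it returns a result.

    Returns the result, or None if every attempt came back empty.
    """
    for _ in range(attempts):
        # Wait first: a freshly submitted scan is never ready right away.
        sleep(delay_seconds)
        result = fetch_result(uuid)
        if result is not None:
            return result
    return None
```

The `sleep` parameter exists only so the delay can be replaced in tests; production code can leave it at the default.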

After waiting a few minutes, I can take the uuid value over to the Result API and get my scan results:

postman-urlscan-result

Results! We now have a fully functional Postman collection that allows for easy access to the urlscan.io APIs.

Keep Postman open (if you’re following along) – we need the results from each of the API calls for configuring the response options in the next step.

Logic App Custom Connector

The next step is to bring the collection (export yours, use the one I provided, or create your own) into Azure and create a Logic Apps Custom Connector. This will allow each of the 3 API calls to be used within a Logic App in response to a Sentinel incident.

Log into Azure and navigate to “Logic Apps Custom Connector” and then click ‘Create’.

 create-logicapps-customconnector

NOTE: the custom connector needs to be in the same region as your logic app. If it is not, the connector will not be available for use within your Logic App.

Now, review and create the custom connector. Once it is created, we can configure by navigating to the resource and then hitting edit.

customconnector-edit

Now, hit the Import button, browse to the Postman collection JSON file, and then hit ‘Update connector’.

customconnector-import-postman

NOTE: the name in the dialog box may or may not switch to match the name of the JSON file. This can be a little misleading as you may think that the browse to the file was not successful.

Once the collection is imported, hit ‘Security’ at the bottom of the screen to move to the next page. The urlscan.io API calls require an API-Key header that carries the API key itself. Configure the Security options as such:

logicapps-custom-connector-security

The parameter name needs to be exactly API-Key to match the API’s requirements.

Now, at the bottom move on to ‘Definition’. If the import was successful, you will see the 3 API calls on the left:

customconnector-definitions

At this point, we need to configure the response options for each of our API calls. For each of the calls, move down the screen to the Response section and hit the default response:

customconnector-default-response

Now, hit ‘Import from sample’. This will cause the option to import to fly out from the right.

customconnector-response-flyout

Navigate back to Postman, and for the Submit API copy out the response from the previous call and then paste it into the Body section of the flyout in the screen above. After pasting, hit import.

postman-response-copy

After Import, you should see the elements from the body as payload responses:

customconnector-response-options

Hit ‘Update connector’ up top. Once the update completes, repeat the same steps for the Search and ScanResults actions on the left-hand side. (Search may throw an error here; check the next note.)

NOTE: for the result set for Search, the “sort” element in the response JSON payload looks like this:

"sort": [
     1664215094152,
     "c4d660e5-e55e-456d-9d85-cbc3ea140767"
],

This causes an error because the Azure Portal UI sees the first element in the sort array as an integer and the second element as a string, which triggers a type mismatch. To move past this, simply wrap the integer in double quotes ("1664215094152") so that both elements are treated as strings.
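If you would rather not edit the sample by hand, the same fix can be applied to the copied Postman response before pasting it into the flyout. The helper below is a hypothetical convenience and not part of the connector; it walks the JSON and converts every element of any `sort` array to a string.

```python
import json


def stringify_sort_arrays(node):
    """Return a copy of node with every 'sort' list converted to strings."""
    if isinstance(node, dict):
        fixed = {}
        for key, value in node.items():
            if key == "sort" and isinstance(value, list):
                # Mixed integer/string arrays trip up the sample importer.
                fixed[key] = [str(item) for item in value]
            else:
                fixed[key] = stringify_sort_arrays(value)
        return fixed
    if isinstance(node, list):
        return [stringify_sort_arrays(item) for item in node]
    return node


def fix_sample(sample_text):
    """Take the raw JSON text copied from Postman and return importable JSON text."""
    return json.dumps(stringify_sort_arrays(json.loads(sample_text)), indent=2)
```

This only changes the sample used to generate the response schema; the live API still returns the integer.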

One more tweak is needed to make the imported Postman collection work: the Result API definition has to be adjusted. Go to Definition and then open the Swagger Editor. Scroll down to ‘/api/v1/result’ around line 120. There are two changes. First, set the path to ‘/api/v1/result/{uuid}:’. Second, add a parameter (line 125/126 or so) so that the swagger file looks like this:

swagger-update

The parameter is:

- {name: uuid, in: path, type: string, required: true}

This parameterizes the last element of the path for the Result API. Hit ‘Update connector’ and the custom connector is complete! Now, it’s a matter of mapping out the logic for responding to a Microsoft Sentinel incident with URL entities.

Download the custom connector here: sentinel/urlscan_io at main · scomurr/sentinel (github.com)

Response Logic

Looking at the API best practices documented on urlscan.io’s website, the goal is to avoid burning my quota and only call the Submit/Scan and Scan Results APIs when no result is returned or the result is stale. For demo purposes, I am only looking to bring in the latest result as long as the result is < 24 hours old. Here’s a high level diagram of the logic I want to attempt in the Logic App:

logicapp-flow

The goal is to trigger the playbook manually from Sentinel; however, once the logic is baked and a comfort level is established, it could be set to run automatically and enrich incidents.

Logic App

In order to create the Logic App for incident response, I am going to navigate to Sentinel –> Automation

sentinel-create-logicapp

This launches the ‘Create playbook’ screen:

create-playbook

Move through to Review and Create. We now have a logic app with the Microsoft Sentinel incident trigger ready to go!

I am not going to walk through the creation and logic of the app; however, the gist of it matches the flow diagram above. Once manually triggered,

  1. the search API will be called
    1. If there is a result, the age of the result will be checked
    2. If the age passes, the result of the search API will be used
  2. otherwise,
    1. Scan will be called
    2. A loop w/ a 1 minute delay between iterations will be used
    3. Each iteration of the loop will call the Results API
    4. Once results are returned, those will be sent to the incident

Note, inside the Logic App Designer, I am able to use the Actions from the Custom Connector:

customconnector-in-logicapp

Completed playbook:

completed-playbook

There’s a lot of additional detail in the playbook that’s hard to capture in the screenshot. Download the sample playbook ARM template here: sentinel/urlscan_io at main · scomurr/sentinel (github.com)

NOTE: The ARM template will have to be updated to match the subscription and resource group identifiers for the target environment.

NOTE: Timestamp logic was not included in this iteration of the logic app. I opted to omit it for the purposes of simplicity and demoing the capabilities of the Custom Connector and the urlscan.io APIs.

Logic App Permission on Log Analytics Workspace

One last step – granting the Logic App the ability to actually update comments within a Sentinel incident. The easiest way to grant sufficient permissions to the app is within the IAM space for the Log Analytics workspace.

Error: Attempting to execute prior to granting appropriate permissions

"StatusCode": "Forbidden",
"ReasonPhrase": "Forbidden",
"Content": "{\"error\":{\"code\":\"AuthorizationFailed\",\"message\":\"The client '45f08354-bdb4-4dd7-8547-478cedb154b8' with object id '45f08354-bdb4-4dd7-8547-478cedb154b8' does not have authorization to perform action 'Microsoft.SecurityInsights/incidents/comments/write' over scope '/subscriptions/<subid>/resourceGroups/sentinel-rg/providers/Microsoft.OperationalInsights/workspaces/sentinel-laws/providers/Microsoft.SecurityInsights/incidents/78e6a70b-4884-4f81-bd4e-32bd02496db0/comments/c36bd762-9874-4609-b31c-3e2e17873c7c' or the scope is invalid. If access was recently granted, please refresh your credentials.\"}}",

To fix this, navigate to Log Analytics Workspaces, select the sentinel workspace in question, navigate to Access control (IAM), and then hit Add

update-sentinel-perms

Hit ‘Add role assignment’ and then select “Contributor” and hit Next

contributor-perms

Managed identity –> Select members –> Logic app –> select the Logic App that requires the permissions:

adding-sentinel-perms

Select and then Review + assign. Done!

Now, to test the enrichment via an incident within Sentinel.

Incident Enrichment

Within Sentinel, I have an incident that came over from Defender for Endpoint. Within the incident, there are several entities, but the one I am concerned with for the purposes of this blog post is the URL:

url-entity

Since I have a URL, I can call my enrichment playbook!

run-playbook

And then select the new playbook to run:

playbook-hit-run

If all works out, the playbook will make calls to urlscan.io and update the incident with comments containing the screenshot and report URLs for the scan.
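To give a feel for what ends up in the comment, here is a rough sketch of turning one Search result entry into comment text. The field names `page.url`, `screenshot`, and `result` are assumptions about the response shape, and the function name is hypothetical; the actual playbook builds the comment with Logic App actions instead.

```python
def build_comment(result):
    """Format a plain-text incident comment from one search result entry."""
    lines = [
        # Assumed fields: page.url, screenshot, and result.
        f"urlscan.io result for {result['page']['url']}",
        f"Screenshot: {result['screenshot']}",
        f"Report: {result['result']}",
    ]
    return "\n".join(lines)
```

Only the two URLs are surfaced here because those are what the analyst clicks first; the full result contains far more data that could be added.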

playbook-triggered

Checking the incident after a few seconds and:

search-results

With functional URLs in the comment! Very exciting. This validates that the playbook works with a result that can be retrieved via the Search API. Now, I need to test with a URL that has not been scanned. At the time of this writing, https://scomurr.com had not been scanned.

I generated an alert in Defender for Endpoint by looking for scomurr.com traffic, and the alert populated into Sentinel via the MDE integration. Launching the playbook…

scomurr-launch-playbook

Waiting just a bit, I checked the execution in Logic Apps:

delay

Which is perfect – the playbook is designed to wait 60 seconds after submitting the URL before calling the Result API to ensure the results are there. The playbook will run the loop four times before timing out and failing. After another 20 seconds, the comments with the results from the API were posted to Sentinel:

submit-result

Done – I now have a Logic App Custom Connector and a Logic App for reaching out to urlscan.io and pulling in results from the API. The results can be posted directly to a comment within a Sentinel incident or alert in order to enrich the data available to an analyst. There is a myriad of great information available via these API calls and I am just using a few tiny elements.

Automation is power. Utilizing Logic Apps to enrich alerts and incidents within Sentinel can help an analyst respond even faster. Happy automating!
