Scripting URL normalization and resolution

From time to time most network and / or security engineers will need to normalize the output of a set of URLs to either IP literals or a formatted list of DNS names. This can be particularly useful for feeding intelligence feeds or creating block/allow lists.

There are probably 10,000 other scripts to do this, but this one is mine.

Potential use cases:

Benefits / Features

Via the github repo:


Simple DNS Resolver Script for URLs

This script is marginally useful for creating files suitable for importing into tools that require one address or URL per line such as pihole

Description

This Python script reads a list of URLs from an input file, extracts the domain names, and resolves their A (IPv4) and AAAA (IPv6) records. The results are printed to the console and saved to an output file.

Prerequisites

Installation

For simplicity, and because I am a terrible developer, no additional libraries are required as the script uses built-in Python capabilities / modules.

Usage

Run the script with the following options:

python urlresolver.py [-f INPUT_FILE] [-o OUTPUT_FILE]

or

chmod +x urlresolver.py

and then

./urlresolver.py [-f INPUT_FILE] [-o OUTPUT_FILE]

Options:

Examples

python urlresolver.py -f someurls.txt -o someresults.txt

This will read domain names from someurls.txt, resolve their A and AAAA records, and save the results in someresults.txt.

urlresolver.py -r -j -f someurls.txt -o junos.txt -z someurls

This will read domain names from someurls.txt, resolve their A and AAAA records, and save the results in a junos prefix-list formatted list named junos.txt where the prefix-list name is someurls.

Input File Format

The input file should contain one URL per line, e.g.,

https://example.com
http://sub.domain.com/

Output File Format

The output file will contain resolved IP addresses, with the original URL commented above each entry, e.g.,

# https://example.com
192.0.2.1
3ffe:db8::1
# ----------------------------------------

Or, depending on the -n flag, it will return a list of commented, normalized URLs:

# https://7e14d.v.fwmrm.net
7e14d.v.fwmrm.net
# ----------------------------------------
# 
# ----------------------------------------
# https://a-fds.youborafds01.com
a-fds.youborafds01.com
# ----------------------------------------
# 

Notes

Caveats

The script can’t handle stripping really, really long URLs, so anything egregiously long will need to be cleaned up manually.