How to use the Mozscape API with Python for guest blogging

In the last step of the guest blogging guide (Step 3: Clean up the list of potential publications by evaluating their domain metrics), we created a Google Sheet listing the URLs of the potential guest blogging publications.

Unfortunately, this leaves us with a lot of data to fill in for an initial evaluation.

Here’s how to automate this step using Moz’s API and some Python

In this example, we will use the Mozscape API, Moz’s free API that provides metrics such as Domain Authority, to pull the domain metrics for 100 website URLs stored in a CSV file exported from Google Sheets.

To get started, export the spreadsheet in CSV format and rename it to sheet1.csv for the example. Then head over to GitHub to grab a copy of my code or write your own using the walkthrough below.

I will break the Python script down into its most important parts:

First, we need to import the necessary libraries and create a “client” object with our API credentials.

from time import sleep
from mozscape import Mozscape
import csv
client = Mozscape('*******', '*******')
# Now I'm ready to make API calls!

Next, we will create a function called “get_MozscapeData” that takes a list of URLs and returns the number of external links, page authority, and domain authority for each one of them. We will also pause for 11 seconds between API calls to make sure we don’t get throttled.

You can read more about which Moz data is available in the free tier on the documentation page and see more Python examples on their GitHub.

def get_MozscapeData(url_list):
    # pausing for 11 seconds before each call so we don't get throttled
    sleep(11)
    authorities = client.urlMetrics(
        url_list,
        Mozscape.UMCols.pageAuthority | Mozscape.UMCols.equityExternalLinks | Mozscape.UMCols.domainAuthority)
    return authorities
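
The pause can also be kept separate from the API call, which makes the delay easy to change or to skip while testing. The following is only a minimal sketch and not part of the original script: `throttled` and its `fetch`, `delay_seconds`, and `pause` parameters are hypothetical names, and `fetch` stands in for any function that takes a list of URLs, such as a wrapper around `client.urlMetrics`.

```python
from time import sleep


def throttled(fetch, delay_seconds=11, pause=sleep):
    """Wrap `fetch` so that every call waits `delay_seconds` first.

    `fetch` is any callable that takes a list of URLs (a hypothetical
    stand-in for the Mozscape call); `pause` defaults to time.sleep.
    """
    def wrapper(url_list):
        pause(delay_seconds)  # wait before each request to avoid throttling
        return fetch(url_list)
    return wrapper


# Demo with a fake fetch function and no real waiting
fake_fetch = lambda urls: [{"url": u} for u in urls]
polite_fetch = throttled(fake_fetch, delay_seconds=0)
print(polite_fetch(["www.example.com"]))
```

With the real client, `fetch` would be a function that calls the API, and the default delay of 11 seconds would match the pause described above.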

The next part of the code takes care of reading the URLs from our CSV file and writing the results in a new file called “Mozscape.csv.”

# counting the rows first so we know when the last batch has been reached
with open('sheet1.csv', 'r') as countFile:
    row_count = sum(1 for row in countFile)
print("row_count in file: ", row_count)
# opening the file again to read the URLs from the start
myFile = open('sheet1.csv', 'r')
reader = csv.reader(myFile)
outputFile = open('Mozscape.csv', 'w', newline='')
outputWriter = csv.writer(outputFile)
outputWriter.writerow(["URL","External Links","Page Authority","Domain Authority"])

In the last part of the code, we make sure that we pass valid URLs to the “get_MozscapeData” function above and form batch requests. By passing 10 URLs at a time to the API in the form of the “batch_urls” list, we will get through our list of 100 websites in about 2 minutes (10 requests with an 11-second pause before each one).

Because this is a large code section, you will find more comments inline in the code (marked with #).

# initializing the URL list for the batch requests
batch_urls = []
# cleaning up the URLs and creating batches of 10 URLs at a time
which_row = 0
for row in reader:
    which_row = which_row + 1
    url = row[0]  # The URLs are in the 1st column
    if url == "URL":
        # skipping the header row
        continue
    try:
        # removing the protocol
        if "http://" in url:
            url = url.replace("http://", "")
        if "https://" in url:
            url = url.replace("https://", "")
        # removing any existing www subdomain...
        if "www." in url:
            url = url.replace("www.", "")
        # ...and adding it back, so that all requests go with www
        url = "www." + url
        batch_urls.append(url)
        # sending the batch once it holds 10 URLs or the last row is reached
        if (len(batch_urls) == 10) or (which_row == row_count):
            print("New batch...")
            print(batch_urls)
            print("calling Moz...")
            MozDatas = get_MozscapeData(batch_urls)
            print("Moz Datas...")
            print(MozDatas)
            count = 0
            for data in MozDatas:
                url = batch_urls[count]
                external_links = data["ueid"]
                page_authority = data["upa"]
                domain_authority = data["pda"]
                print(url, external_links, page_authority, domain_authority)
                outputWriter.writerow([url, external_links, page_authority, domain_authority])
                count += 1
            # initialize new batch
            print("initializing new batch...")
            batch_urls = []
    except Exception:
        # the request failed: mark every URL of this batch and start a new one
        for failed_url in batch_urls:
            outputWriter.writerow([failed_url, "Error", "Error", "Error"])
            print('ERROR WITH URL: ', failed_url)
        batch_urls = []
# closing both files so that all rows are written to disk
myFile.close()
outputFile.close()
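
The URL clean-up and batching steps described above can also be written as two small helpers, which makes them easier to test without calling the API. This is a sketch under a few assumptions: `normalize_url` and `make_batches` are hypothetical names that do not appear in the original script, the example URLs are made up, and the helper only strips a protocol or "www." at the start of the URL rather than anywhere in the string.

```python
def normalize_url(url):
    """Strip the protocol and any leading 'www.', then add 'www.' back."""
    url = url.strip()
    for scheme in ("http://", "https://"):
        if url.startswith(scheme):
            url = url[len(scheme):]
    if url.startswith("www."):
        url = url[len("www."):]
    return "www." + url


def make_batches(urls, batch_size=10):
    """Split a list of URLs into consecutive batches of at most batch_size."""
    return [urls[i:i + batch_size] for i in range(0, len(urls), batch_size)]


# 100 made-up URLs, in batches of 10, with an 11-second pause per request
urls = [normalize_url("https://example%d.com" % i) for i in range(100)]
batches = make_batches(urls)
print(len(batches), "batches, roughly", len(batches) * 11, "seconds of waiting")
```

Each batch could then be passed to “get_MozscapeData” in turn, and the last batch is simply shorter when the number of URLs is not a multiple of 10.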

At this point, you will have a CSV file, Mozscape.csv, with one row per URL and columns for external links, Page Authority, and Domain Authority.

When filtering websites by domain authority, a good rule is to try to guest blog for those that have a higher domain authority than your own website. Based on this, we can sort the list by Domain Authority and filter out low-authority websites.

If done well, this can help us focus our final manual review on around 20 websites in the sweet spot between 30 and 60 DA. Well-established websites tend to be tough to get a reply from and often have rigorous copyright and linking guidelines, so I would avoid them. Growing websites are more eager for good-quality content, and working with them feels more like building a relationship with the team behind them than having them do you a favor.
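
The filtering step can be done in a spreadsheet, but it can also be scripted. The sketch below assumes the column layout written by the script above; `filter_by_domain_authority` is a hypothetical helper name, the 30 to 60 range is the sweet spot mentioned above, and the sample rows are made up for illustration.

```python
import csv
import io


def filter_by_domain_authority(rows, low=30, high=60):
    """Keep rows whose Domain Authority is a number within [low, high],
    sorted from highest to lowest."""
    kept = []
    for row in rows:
        try:
            da = float(row["Domain Authority"])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows marked "Error" or missing the column
        if low <= da <= high:
            kept.append(row)
    return sorted(kept, key=lambda r: float(r["Domain Authority"]), reverse=True)


# Made-up sample in the same column layout as Mozscape.csv
sample = io.StringIO(
    "URL,External Links,Page Authority,Domain Authority\n"
    "www.example-a.com,120,35,42\n"
    "www.example-b.com,15,20,18\n"
    "www.example-c.com,Error,Error,Error\n"
    "www.example-d.com,900,60,75\n"
    "www.example-e.com,300,48,55\n"
)
shortlist = filter_by_domain_authority(csv.DictReader(sample))
for row in shortlist:
    print(row["URL"], row["Domain Authority"])
```

To run it on real data, the `io.StringIO` sample would be replaced with the opened Mozscape.csv file, and `low` could be set to your own website's Domain Authority.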

Last but not least, I would still manually review a. their publishing guidelines, b. the overall quality of the publication, c. the link profile and spam score, and d. some of the existing guest content, to confirm that followed links are allowed.