This blog entry is a follow-up to my previous post on Hackernoon, which outlines the process I use for identifying quality publications that accept contributors.

In the last step of the guest blogging guide (Step 3: Clean up the list of potential publications by evaluating their domain metrics), we created a Google Sheet where we listed the URLs of the potential guest blogging publications.

Unfortunately, this leaves us with a lot of data to fill in for an initial evaluation.  

Here's How to Automate This Step Using Moz's API and Some Python

In this example, we will pull the domain metrics for 100 website URLs listed in a .csv file (exported from Google Sheets) using the Mozscape API, Moz's free API that provides metrics such as Domain Authority.

To get started, export the spreadsheet in CSV format and rename it to sheet1.csv for the sake of the example. Then head over to GitHub to grab a copy of my code, or write your own using the walkthrough below.
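For reference, here is roughly what the exported sheet1.csv is assumed to look like: the URLs sit in the first column, under a header cell named "URL" (the script below relies on both of these details), and the domains shown here are just placeholders.

URL
https://www.example.com
http://blog.example.org
https://another-example.net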

I will break the Python script down into its most important parts:

First, we need to import the necessary libraries, and create a "client" object with our API credentials.

from time import sleep
from mozscape import Mozscape
import csv
# replace the asterisks with your own Mozscape access ID and secret key
client = Mozscape('*******', '*******')
# Now I'm ready to make API calls!
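If the import above fails because the client library isn't installed, it should be available on PyPI under the same name we import (double-check the package name if pip complains), or you can grab mozscape.py directly from Moz's GitHub examples mentioned below:

pip install mozscape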

Next, we will create a function called "get_MozscapeData" that takes a list of URLs and returns the number of external links, the page authority, and the domain authority for each of them. It also pauses for 11 seconds after each API call so that we don't get throttled.

You can read more about which Moz data is accessible in the free tier on the documentation page, and see more Python examples on their GitHub.

def get_MozscapeData(url_list):
    # request page authority, external equity links and domain authority
    # for every URL in the batch, e.g. ['www.example.com', ...]
    authorities = client.urlMetrics(
        url_list,
        Mozscape.UMCols.pageAuthority | Mozscape.UMCols.equityExternalLinks | Mozscape.UMCols.domainAuthority
    )
    # wait 11 seconds before the next call so we don't get throttled on the free tier
    sleep(11)
    return authorities
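As a quick sanity check, you can call the function on a couple of test URLs before wiring up the CSV handling. Each entry of the returned list is a dictionary keyed by Moz's short column codes, the same ones we use further down ("ueid" for external equity links, "upa" for page authority, "pda" for domain authority); the two test domains here are just placeholders.

# hypothetical quick test, not part of the final script
test_metrics = get_MozscapeData(['moz.com', 'example.com'])
for entry in test_metrics:
    print(entry.get('ueid'), entry.get('upa'), entry.get('pda'))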

The next part of the code takes care of reading the URLs from our CSV file and writing the results to a new file called "Mozscape.csv".

# count the rows first, so we know when we reach the last (possibly partial) batch
with open('sheet1.csv', 'r') as countFile:
    row_count = sum(1 for row in countFile)
print("row_count in file: ", row_count)

# open the input file for reading and the output file for the results
myFile = open('sheet1.csv', 'r')
reader = csv.reader(myFile)
outputFile = open('Mozscape.csv', 'w', newline='')
outputWriter = csv.writer(outputFile)
outputWriter.writerow(["URL", "External Links", "Page Authority", "Domain Authority"])

In the last part of the code, we make sure that we pass valid URLs to the "get_MozscapeData" function above, and we also form batch requests. By passing 10 URLs at a time to the API in the form of the "batch_urls" list, we will get through our list of 100 websites in about two minutes, even though we wait 11 seconds between requests.

Because this is a larger code section, you will find more comments inline in the code (marked with #).

# initializing the URL list for the batch requests
batch_urls = []
# cleaning up the URLs and creating batches of 10 URLs at a time
which_row = 0
for row in reader:
    which_row += 1
    url = row[0]  # the URLs are in the 1st column
    if url == "URL":
        pass  # skip the header row
    else:
        try:
            # strip the protocol so only the bare domain is sent to the API
            if "http://" in url:
                url = url.replace("http://", "")
            if "https://" in url:
                url = url.replace("https://", "")
            # making sure that all requests go without a www subdomain
            if "www." in url:
                url = url.replace("www.", "")
            """
            # making sure that all requests go with www
            if "www." in url:
                pass
            else:
                url = "www." + url
            """
            batch_urls.append(url)
            if (len(batch_urls) == 10) or (which_row == row_count):
                print ("New batch...")
                print (batch_urls)
                print ("calling Moz...")
                MozDatas = get_MozscapeData(batch_urls)
                print ("Moz Datas...")
                print (MozDatas)
                count = 0
                for data in MozDatas:
                    url = batch_urls[count]
                    external_links = data["ueid"]
                    page_authority = data["upa"]
                    domain_authority = data["pda"]
                    print(url, external_links, page_authority, domain_authority)
                    outputWriter.writerow([url, external_links, page_authority, domain_authority])
                    count += 1
                #initialize new batch
                print ("initializing new batch...")
                batch_urls = []
                sleep(1)
        except Exception:
            # keep the error row consistent with the four-column header
            outputWriter.writerow([url, "Error", "Error", "Error"])
            print('ERROR WITH URL: ', url)
myFile.close()
outputFile.close()
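To run the script, save it in the same folder as sheet1.csv (I'm using moz_metrics.py below purely as an example name) and execute it from the command line:

python moz_metrics.py

While it runs, each batch and the metrics returned for it are printed to the terminal, and the results accumulate in Mozscape.csv.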

At this point you will have a CSV file that looks like this:

[Image: the resulting CSV file with the domain metrics]

A good rule when filtering websites by domain authority is to guest blog on those that have a higher domain authority than your own website. Based on this, we can sort the list by Domain Authority and filter out low-authority websites.

If done well, this can help us focus our final manual review on around 20 websites in the sweet spot between 30 and 60 DA. Well-established websites tend to be tough to get a reply from, and often have very strict copyright and linking guidelines, so I would avoid them; growing websites are more eager for good-quality content, and working with them feels more like building a relationship with the team behind them than asking for a favor.
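If you prefer to do the sorting and filtering in code instead of back in Google Sheets, here is a minimal sketch using pandas (assuming pandas is installed; the 30 to 60 DA range is the same sweet spot mentioned above):

import pandas as pd

# load the metrics we just exported
df = pd.read_csv('Mozscape.csv')

# drop the rows where the API call failed, then make sure DA is numeric
df = df[df['Domain Authority'] != 'Error']
df['Domain Authority'] = df['Domain Authority'].astype(float)

# keep the 30-60 DA sweet spot and sort from highest to lowest
shortlist = df[df['Domain Authority'].between(30, 60)]
shortlist = shortlist.sort_values('Domain Authority', ascending=False)
shortlist.to_csv('shortlist.csv', index=False)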

Last but not least, I would still conduct a manual review of a. their publishing guidelines, b. the overall quality of the publication, c. their link profile and spam score, and d. some of the existing guest content, to confirm that followed links are allowed.