Status Checker Workflow with n8n and Apify: A Step-by-Step Guide

By Antonio Blago, February 9, 2025

Overview

The Website Status Code Crawler from Apify checks the HTTP status codes of a website. It crawls the specified URLs and records the status code each server returns, which makes it especially useful for finding faulty pages such as 404 (Not Found) or 500 (Internal Server Error).

Features

Crawls websites and returns the HTTP status code.

Supports custom URLs: Define your start URLs and the maximum crawl depth.

Determines status codes for each URL: The collected data includes the status code of each visited URL.

Detects error pages: You can identify error pages like 404, 500, etc. with minimal effort.

Using the Status Code Crawler

To use the Website Status Code Crawler, follow these simple steps:

1. Set up Apify Actor

Visit Apify and log in.

Go to the Apify Store and find the Website Status Code Crawler.

Click on "Try for free" to run the Actor.

2. Define input parameters

You can control the crawler with the following inputs:

{
  "url": "https://example.com",
  "max_urls": 10,
  "follow_links": true,
  "mode": "auto"
}

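start_urls: A list of URLs the crawler should visit. (The sample input above uses different field names, such as url and max_urls; check the Actor's input schema for the names it currently accepts.)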

max_depth: Currently only a placeholder with no effect; all URLs of a domain are crawled regardless of its value.

The crawler respects robots.txt and applies a crawl delay of 1 second per URL.

3. Run Apify Actor

Click the "Run" button in Apify to start the Actor.

The output will show you the HTTP status code for each crawled URL.
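Besides the "Run" button, an Actor run can also be started over the Apify REST API. The following is a minimal sketch using only the Python standard library; it builds the request for the run-sync-get-dataset-items endpoint without sending it. The Actor ID `your-username~status-code-crawler` and the token value are placeholders (assumptions, not taken from the Actor's page), and the input fields are copied from the sample input in step 2.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request that runs an Actor synchronously
    and returns its dataset items in the response body."""
    quoted_id = urllib.parse.quote(actor_id, safe="~")
    url = f"{API_BASE}/acts/{quoted_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Input copied from the sample in step 2; Actor ID and token are placeholders.
request = build_run_request(
    actor_id="your-username~status-code-crawler",  # hypothetical Actor ID
    token="YOUR_APIFY_TOKEN",  # placeholder, not a real token
    run_input={"url": "https://example.com", "max_urls": 10, "follow_links": True, "mode": "auto"},
)

# To actually start the run, uncomment the following lines:
# with urllib.request.urlopen(request) as response:
#     items = json.load(response)
```

Keeping the request construction separate from the network call makes it easy to inspect the URL and body before spending any Apify credits.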

4. Export data

After execution is complete, you can download the extracted data (URLs and status codes) as JSON or CSV. This gives you an overview of all pages and their status.
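As a rough illustration of working with the export, the sketch below reads a CSV export with Python's csv module. It assumes the file has `url` and `status` columns, mirroring the fields of the JSON output; the actual column names in your export may differ. The inline sample stands in for a downloaded file.

```python
import csv
import io

# Stand-in for a downloaded export; the column names are an assumption.
SAMPLE_CSV = """url,status
https://example.com,200
https://example.com/missing-page,404
"""


def read_status_rows(csv_text: str) -> list:
    """Parse CSV text into dicts, converting the status to an integer."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{"url": row["url"], "status": int(row["status"])} for row in reader]


rows = read_status_rows(SAMPLE_CSV)
for row in rows:
    print(row["status"], row["url"])
```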

Example Output

Details

The output of the crawler looks like this:

{
  "details": [
    { "url": "https://example.com", "status": 200 },
    { "url": "https://example.com/missing-page", "status": 404 }
  ]
}

url: The visited URL.

status: The HTTP status code of the URL. A value of 200 means the page loaded successfully, while values like 404 and 500 indicate errors.

Status Code Summary

There is also an aggregated summary that counts how often each status code occurred:

{
  "overview": [
    { "Status Code": 200, "Count": 10 },
    { "Status Code": 404, "Count": 2 }
  ]
}
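The same kind of summary can be reproduced from the detail records. This is a minimal sketch, assuming the `details` and `overview` structures from the example output; `summarize` is a helper name chosen for this example and is not part of the Actor, and the sample records are illustrative.

```python
from collections import Counter


def summarize(details: list) -> list:
    """Count how often each status code occurs, sorted by status code."""
    counts = Counter(item["status"] for item in details)
    return [{"Status Code": code, "Count": count} for code, count in sorted(counts.items())]


# Illustrative sample records in the shape of the "details" output.
details = [
    {"url": "https://example.com", "status": 200},
    {"url": "https://example.com/about", "status": 200},
    {"url": "https://example.com/missing-page", "status": 404},
]
overview = summarize(details)
print(overview)
```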

5. Detect error pages

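An important use case for this crawler is detecting faulty web pages. Pages with a status code of 400 or higher (e.g., 404 or 500) signal errors and can be processed further for analysis. Codes in the 3xx range indicate redirects rather than errors, though they may still be worth reviewing.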
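A small sketch of this filtering step, assuming the `details` structure from the example output. It treats 4xx and 5xx codes as errors, following the HTTP convention that 2xx means success and 3xx means redirection; `find_error_pages` is a hypothetical helper name.

```python
def find_error_pages(details: list) -> list:
    """Return entries whose status is a client (4xx) or server (5xx) error."""
    return [item for item in details if 400 <= item["status"] <= 599]


details = [
    {"url": "https://example.com", "status": 200},
    {"url": "https://example.com/missing-page", "status": 404},
]
errors = find_error_pages(details)
for page in errors:
    print(f'{page["status"]}  {page["url"]}')
```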

6. Further use cases

Integration with n8n: You can integrate the Apify Actor into n8n to create an automated workflow that regularly checks for faulty pages and sends a notification to Slack or Telegram in case of errors.

Download the JSON import file here to import it into n8n

Website monitoring: The crawler is especially useful for monitoring websites that need to be checked regularly to ensure that no faulty pages are returned.
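In n8n itself the notification would typically be handled by workflow nodes; the Python sketch below only illustrates the logic of that step. It builds a message body in the `{"text": ...}` shape that Slack incoming webhooks accept. The function name and message wording are assumptions, and nothing is sent.

```python
import json
from typing import Optional


def build_slack_payload(errors: list) -> Optional[dict]:
    """Return a Slack webhook payload listing faulty pages, or None if there are none."""
    if not errors:
        return None  # nothing to report, so no notification is needed
    lines = [f'{page["status"]}: {page["url"]}' for page in errors]
    return {"text": f"{len(errors)} faulty page(s) found:\n" + "\n".join(lines)}


payload = build_slack_payload([{"url": "https://example.com/missing-page", "status": 404}])
print(json.dumps(payload, indent=2))
```

Returning None for an empty error list lets a scheduled check stay silent when every page is healthy.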

Conclusion

The Website Status Code Crawler from Apify is an easy-to-use tool that helps you monitor the status of your websites. Whether you want to identify error pages or perform a complete crawl, this Actor offers a fast and efficient solution.

You can further customize and extend the tool to meet your requirements, or integrate it into workflow tools like n8n to automate regular checks.
