Cheaper Alternative to the Keyword Planner: Retrieve 130k Keywords with Python

By Antonio Blago, March 23, 2025

SEO, Tutorial

The Google Keyword Planner is handy – but with large keyword lists, automation, or API usage, you quickly hit its limits. If you want to analyze 10,000 or even 130,000 keywords, you need a scalable, affordable, and flexible solution.

This is exactly where DataForSEO comes into play – combined with Python, you can build your own keyword tool.

In this tutorial, I'll show you step by step how to do this – including batching, cost control, and Excel export.

Why an Alternative to the Keyword Planner?

The Google Keyword Planner is solid, but:

  • heavily limited in terms of keyword volume
  • not very flexible for automated workflows
  • API access only with an active Google Ads account

So if you want to process large keyword lists (e.g., from BigQuery, SEO tools, or your own data sources), you need a scalable solution – like DataForSEO + Python.

Prerequisites

Install Python Packages

pip install pandas requests openpyxl

DataForSEO Access

Register at DataForSEO and get your credentials (login & password). This allows you to use their powerful API, which is specifically designed for large queries – the perfect Keyword Planner alternative for developers and SEOs.

You need these for authentication via HTTP Basic Auth.
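
Building the Basic Auth header yourself is a one-liner – a minimal sketch with placeholder credentials:

```python
import base64

# Placeholder credentials – replace with your real DataForSEO login & password
login, password = "login", "password"
token = base64.b64encode(f"{login}:{password}".encode()).decode()
headers = {"Authorization": f"Basic {token}"}
```

The script below reads this pre-encoded token from an environment variable (d4seo) instead of hard-coding the credentials.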

Setup Logic

  • Load keywords from Excel file
  • Send to the API in batches of 1,000
  • Retrieve results including CPC, search volume, and monthly trends
  • Save progress, API costs, and results (Pickle + Excel export)
  • Seamlessly resume after interruption

Important Features in the Code

Batching:

Keywords are sent to the API in groups of 1,000 to comply with limits.
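
The chunking itself is plain list slicing – the same helper the script below uses:

```python
def batch_keywords(keywords: list, size: int = 1000) -> list:
    # Split the keyword list into consecutive chunks of at most `size`
    return [keywords[i:i + size] for i in range(0, len(keywords), size)]

batches = batch_keywords([f"keyword {i}" for i in range(2500)])
# → 3 batches of 1000, 1000 and 500 keywords
```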

Pickle & Progress Tracking:

If the script crashes, you can seamlessly continue – including costs, progress, and results.
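
The checkpoint is a single pickle holding the collected data plus the last finished batch index. A stripped-down sketch of the mechanism (the full version in the script also stores monthly data and API costs):

```python
import os
import pandas as pd

CHECKPOINT = "backup_results.pkl"

def save_checkpoint(results_df: pd.DataFrame, progress_index: int) -> None:
    # Persist data and position together, so a crash loses at most one batch
    pd.to_pickle({"results": results_df, "progress_index": progress_index}, CHECKPOINT)

def load_checkpoint() -> tuple[pd.DataFrame, int]:
    # Resume where we left off, or start fresh if no backup exists
    if os.path.exists(CHECKPOINT):
        data = pd.read_pickle(CHECKPOINT)
        return data["results"], data["progress_index"]
    return pd.DataFrame(), 0
```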

API Cost Overview:

After each batch, costs are logged – and at the end exported as a table to an Excel file.

Keyword Filtering:

  • Max. 80 characters
  • Max. 10 words
  • Cleansing of characters like ™, ©, ®, etc.
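
Condensed, these filter rules amount to:

```python
def passes_filters(kw: str) -> bool:
    # Strip protected symbols, then apply the DataForSEO limits
    for ch in "™©®✓•→←≠∞¿¡§¶…":
        kw = kw.replace(ch, "")
    kw = kw.strip().lower()
    return bool(kw) and len(kw) <= 80 and len(kw.split()) <= 10
```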

projekt/
│
├── keywords.xlsx         ← Your keyword list
├── get_sv_data4seo.py    ← Main script
├── backup_results.pkl    ← Automatic interim storage
├── final_results.xlsx    ← Exported result (3 sheets)
├── invalid_keywords.txt  ← Filtered / unusable keywords
└── progress.txt          ← Optional progress tracking

Advantages Over the Keyword Planner

Google Keyword Planner           | DataForSEO + Python (this tutorial)
---------------------------------|------------------------------------
Limit for bulk queries           | Any number of batches possible
Inflexible web interface         | Fully automated via code
Hardly any clustering or export  | Custom analyses & filters possible
API only with Google Ads account | API directly & transparently usable

👉 So you get a real alternative to the Keyword Planner if you need deeper, faster, or more extensive keyword data.

Script:

There are functions in the script (such as the advanced cleaning method for keywords) that I cannot publish; if you want access or need help with setup, book a support appointment here.

import os
import time
import json
import base64
import pandas as pd
import requests
import numpy as np
from dotenv import load_dotenv
load_dotenv()
## import word_frequency as wf  # advanced cleaning (not published)
import re
import unicodedata

# --- Configuration ---
PICKLE_BACKUP = "backup_results.pkl"
PROGRESS_FILE = "progress.txt"
BATCH_SIZE = 1000
LOCATION_CODE = 2276  # Germany
LANGUAGE_CODE = "de"
MAX_RETRIES = 3
RETRY_DELAY = 3  # seconds

USERNAME = "login"
PASSWORD = "password"
AUTH_HEADER = os.getenv("d4seo")  # alternatively: base64.b64encode(f"{USERNAME}:{PASSWORD}".encode()).decode()

API_URL = "https://api.dataforseo.com/v3/keywords_data/google_ads/search_volume/live"

def return_path():
    EXCEL_FILE = "keywords.xlsx"  # must contain a column "keyword"
    return EXCEL_FILE

def clean_keywords(keywords):
    # Placeholder – the advanced cleaning (wf.basic_cleaning) is not published
    cleaned = keywords  ## wf.basic_cleaning(keywords)
    return cleaned

FORBIDDEN_CHARS = set("™©®✓•→←≠∞¿¡§¶…")

def is_valid_keyword(kw: str) -> bool:
    try:
        kw = kw.strip()
        if not kw:
            return False

        # Remove forbidden characters
        for char in FORBIDDEN_CHARS:
            kw = kw.replace(char, "")

        kw_norm = unicodedata.normalize("NFKC", kw)

        # Remove control characters
        kw_cleaned = ''.join(c for c in kw_norm if unicodedata.category(c)[0] != 'C')

        # Check suspicious characters via their Unicode name
        for char in kw_cleaned:
            try:
                name = unicodedata.name(char)
                if not any(part in name for part in ["LATIN", "DIGIT", "SPACE", "DASH", "HYPHEN"]):
                    return False
            except ValueError:
                return False

        # Filter broken encodings
        if re.search(r"[ãåÓ€]", kw_cleaned):
            return False

        # Ratio of valid characters
        valid_chars = re.findall(r"[a-zA-Z0-9äöüßéèêáàâíìîóòôúùûčšžăîâăëç\- ]", kw_cleaned.lower())
        ratio = len(valid_chars) / max(len(kw_cleaned), 1)
        if ratio < 0.7:
            return False

        # At least 3 letters
        if len(re.findall(r"[a-zA-Z]", kw_cleaned)) < 3:
            return False

        return True

    except Exception:
        return False

# ----------------- Helper functions -----------------
def load_keywords_from_excel(filepath: str) -> list:
    df = pd.read_excel(filepath)
    return df["keyword"].dropna().astype(str).tolist()

def batch_keywords(keywords: list, size: int) -> list:
    return [keywords[i:i + size] for i in range(0, len(keywords), size)]

def send_batch_request(keywords_batch: list) -> dict:
    payload = [{
        "keywords": keywords_batch,
        "location_code": LOCATION_CODE,
        "language_code": LANGUAGE_CODE,
        "sort_by": "relevance"
    }]
    headers = {
        'Authorization': f'Basic {AUTH_HEADER}',
        'Content-Type': 'application/json'
    }
    response = requests.post(API_URL, headers=headers, data=json.dumps(payload))
    if response.status_code == 200:
        return response.json()
    else:
        raise Exception(f"HTTP {response.status_code}: {response.text}")

def parse_response(response_json: dict) -> tuple[pd.DataFrame, pd.DataFrame]:
    results = []
    monthly_results = []
    error_keywords = []

    tasks = response_json.get("tasks", [])
    for task in tasks:
        items = task.get("result") or []  # "result" can be None on task errors
        for item in items:
            try:
                keyword = item.get("keyword")
                results.append({
                    "keyword": keyword,
                    "search_volume": item.get("search_volume"),
                    "competition": item.get("competition"),
                    "cpc": item.get("cpc"),
                    "currency": item.get("currency")
                })

                monthly_searches = item.get("monthly_searches")
                if isinstance(monthly_searches, list):
                    for ms in monthly_searches:
                        monthly_results.append({
                            "keyword": keyword,
                            "year": ms.get("year"),
                            "month": ms.get("month"),
                            "search_volume": ms.get("search_volume")
                        })

            except Exception as e:
                error_keywords.append((item.get("keyword"), str(e)))
                print(f"Error while processing keyword '{item.get('keyword')}': {e}")

    return pd.DataFrame(results), pd.DataFrame(monthly_results)

def save_progress(index: int):
    with open(PROGRESS_FILE, "w") as f:
        f.write(str(index))

def load_progress() -> int:
    if os.path.exists(PROGRESS_FILE):
        with open(PROGRESS_FILE, "r") as f:
            return int(f.read())
    return 0

def load_existing_results() -> tuple:
    if os.path.exists(PICKLE_BACKUP):
        data = pd.read_pickle(PICKLE_BACKUP)
        return (
            data.get("results", pd.DataFrame()),
            data.get("monthly", pd.DataFrame()),
            data.get("api_costs", pd.DataFrame()),
            data.get("status", {}).get("progress_index", 0),
            data.get("status", {}).get("total_cost", 0.0)
        )
    else:
        return pd.DataFrame(), pd.DataFrame(), pd.DataFrame(), 0, 0.0

def save_all_to_pickle(results_df, monthly_df, api_costs_df, progress_index, total_api_cost):
    data = {
        "results": results_df,
        "monthly": monthly_df,
        "api_costs": api_costs_df,
        "status": {
            "progress_index": progress_index,
            "total_cost": total_api_cost
        }
    }
    pd.to_pickle(data, PICKLE_BACKUP)

# ----------------- Filter keywords -----------------
def filter_keywords(keywords: list[str], log_file: str = "invalid_keywords.txt") -> list[str]:
    '''
    DataForSEO limits: max. 80 characters, max. 10 words per keyword.
    Also converts to lowercase and drops keywords with invalid characters.
    :param keywords: raw keyword list
    :param log_file: file for the rejected keywords
    :return: list of cleaned, valid keywords
    '''
    filtered = []
    invalid = []

    for kw in keywords:
        kw_clean = kw.strip().lower()
        if (
            len(kw_clean) <= 80
            and len(kw_clean.split()) <= 10
            and is_valid_keyword(kw_clean)
        ):
            filtered.append(kw_clean)
        else:
            invalid.append(kw.strip())

    if invalid:
        with open(log_file, "w", encoding="utf-8") as f:
            f.write("Invalid keywords (too long, too many words or invalid characters):\n\n")
            for kw in invalid:
                f.write(f"{kw}\n")
        print(f"{len(invalid)} invalid keywords were ignored and saved in '{log_file}'.")

    return filtered

def clean_keyword_simple(kw: str) -> str:
    """
    Removes protected characters such as ™, ©, ® etc. from the keyword.
    Returns the cleaned keyword.
    """
    kw_cleaned = kw.strip()
    for char in FORBIDDEN_CHARS:
        kw_cleaned = kw_cleaned.replace(char, "")

    return kw_cleaned

def main():
    print("Loading keywords from Excel file...")
    excel_file = return_path()
    all_keywords = load_keywords_from_excel(excel_file)
    all_keywords = [clean_keyword_simple(kw) for kw in all_keywords]  ## remove trademark/copyright characters
    all_keywords = filter_keywords(all_keywords)  ## filter invalid or overly long keywords
    keyword_batches = batch_keywords(all_keywords, BATCH_SIZE)

    results_df, monthly_df, api_costs_df, start_index, total_api_cost = load_existing_results()
    # Keep previously logged batch costs when resuming
    api_costs_list = api_costs_df.to_dict("records") if not api_costs_df.empty else []

    print(f"{len(all_keywords)} keywords loaded. Starting from batch {start_index + 1} of {len(keyword_batches)}.")

    for idx in range(start_index, len(keyword_batches)):
        batch = keyword_batches[idx]
        print(f"\nProcessing batch {idx + 1}/{len(keyword_batches)} ({len(batch)} keywords)...")

        success = False
        for attempt in range(1, MAX_RETRIES + 1):
            try:
                ## Clean keywords (placeholder for the advanced cleaning)
                cleaned_batch = clean_keywords(batch)

                response_json = send_batch_request(cleaned_batch)
                time.sleep(1)

                # Extract API costs
                batch_cost = response_json.get("cost", 0.0)
                total_api_cost += batch_cost
                api_costs_list.append({
                    "batch_index": idx + 1,
                    "keyword_count": len(batch),
                    "cost_usd": batch_cost
                })
                print(f"API cost for this batch: {batch_cost:.3f} USD")

                df_keywords, df_monthly = parse_response(response_json)
                results_df = pd.concat([results_df, df_keywords], ignore_index=True)
                monthly_df = pd.concat([monthly_df, df_monthly], ignore_index=True)

                save_all_to_pickle(results_df, monthly_df, pd.DataFrame(api_costs_list), idx + 1, total_api_cost)

                print(f"Progress saved – batch {idx + 1}, total cost: {total_api_cost:.3f} USD")

                save_progress(idx + 1)
                success = True
                break
            except Exception as e:
                print(f"Error in batch {idx + 1} (attempt {attempt}): {e}")
                if attempt < MAX_RETRIES:
                    print(f"Waiting {RETRY_DELAY} seconds before retrying...")
                    time.sleep(RETRY_DELAY)
                else:
                    print("Maximum number of attempts reached. Script will be aborted.")
                    return

        if success:
            time.sleep(1.5)  # API rate limiting

    print(f"\nTotal API cost: {total_api_cost:.3f} USD")

    # API cost table with a total row
    api_costs_df = pd.DataFrame(api_costs_list)
    api_costs_df.loc[len(api_costs_df.index)] = {
        "batch_index": "Total",
        "keyword_count": api_costs_df["keyword_count"].sum(),
        "cost_usd": total_api_cost
    }

    with pd.ExcelWriter("final_results.xlsx") as writer:
        results_df.to_excel(writer, sheet_name="Keyword-Daten", index=False)
        monthly_df.to_excel(writer, sheet_name="Monatliche Suche", index=False)
        api_costs_df.to_excel(writer, sheet_name="API-Kosten", index=False)

    print("\nExport completed: final_results.xlsx")

if __name__ == "__main__":
    main()

Exported Result

At the end of this tutorial, you will have an Excel file with:

  • Keyword overview: search volume, competition, CPC
  • Time series: monthly volume development
  • API cost overview per batch

You can use this, for example, to create clusters, analyze seasonal trends, or build your own keyword tool.
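
A seasonal analysis, for example, needs only a groupby on the monthly sheet. A minimal sketch with made-up sample data in the same shape as the "Monatliche Suche" sheet (in practice you would load it via pd.read_excel("final_results.xlsx", sheet_name="Monatliche Suche")):

```python
import pandas as pd

# Hypothetical sample data in the shape of the "Monatliche Suche" sheet
monthly_df = pd.DataFrame({
    "keyword": ["gartenmöbel"] * 6,
    "year": [2024] * 6,
    "month": [1, 2, 3, 4, 5, 6],
    "search_volume": [1000, 1200, 2400, 3600, 4000, 3800],
})

# Average volume per calendar month exposes seasonal peaks
seasonal = monthly_df.groupby("month")["search_volume"].mean()
peak_month = int(seasonal.idxmax())  # → 5 for this sample
```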

Conclusion

This setup is ideal for:

  • data-driven SEO analyses
  • performance marketing with large keyword sets
  • reporting, forecasting & clustering

If you are looking for an alternative to the Keyword Planner that you can fully control and scale – then Python + DataForSEO is exactly what you need.
