Automated Real Estate Listing Extractor

Advanced

This is a 7-node automation workflow in the Market Research category. It mainly uses the Code, Google Sheets, Schedule Trigger, and Scrapeless nodes, among others, to automate the collection of real estate listings with Scrapeless and Google Sheets.

Prerequisites
  • Google Sheets API credentials
  • Scrapeless API credentials
Export workflow
Copy the following JSON configuration and import it into n8n
{
  "id": "EgeVsV76EKfXbkcW",
  "meta": {
    "instanceId": "7d291de9dc3bbf0106d65e069919a3de2507e3365a7b25788a79a3562af9bfc5"
  },
  "name": "Automated Real Estate Listing Extractor",
  "tags": [],
  "nodes": [
    {
      "id": "337aabda-3017-4057-8383-6855837d5e9a",
      "name": "Wöchentlicher Markt-Trigger",
      "type": "n8n-nodes-base.scheduleTrigger",
      "position": [
        60,
        780
      ],
      "parameters": {
        "rule": {
          "interval": [
            {
              "field": "weeks",
              "triggerAtDay": [
                1
              ],
              "triggerAtHour": 9
            }
          ]
        }
      },
      "typeVersion": 1.2
    },
    {
      "id": "2be97af8-6121-4cbc-9239-1901d947d8e2",
      "name": "Notiz3",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        0,
        0
      ],
      "parameters": {
        "color": 6,
        "width": 620,
        "height": 1160,
        "content": "## 🔹 **SECTION 1: 🔁 Schedule Trigger — Automate Workflow**\n\n### 🧩 1. 📅 Schedule Trigger\n\n**Node Name:** `Schedule Trigger`  \n**What it does:**  \nAutomatically triggers the workflow every 6 hours, no manual intervention needed. Keeps your data fresh and updated regularly.\n\n🧠 **Beginner Benefit:**  \n\n> Set it once and forget it — your workflow runs automatically on schedule without any extra effort.\n\n---\n\n## 🔹 **SECTION 2: 🌐 Scrapeless Crawler — Fetch Webpage Data**\n\n### 🧩 2. 🕷️ Scrapeless Crawler\n\n**Node Name:** `Scrapeless Crawler`  \n**What it does:**  \nSends a request to Scrapeless API to crawl the target real estate webpage. Returns the page content in Markdown format for easy parsing later.\n\n**Example URL:**  \nhttps://www.loopnet.com/search/commercial-real-estate/los-angeles-ca/for-lease/\n\n🧠 **Beginner Benefit:**  \n\n> Leverage powerful scraping as a service — no need to write complicated crawler code yourself.\n\n---\n"
      },
      "typeVersion": 1
    },
    {
      "id": "ce4de51e-920e-4e72-9aee-13f2180952fc",
      "name": "Notiz",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        660,
        -380
      ],
      "parameters": {
        "color": 5,
        "width": 700,
        "height": 1540,
        "content": "\n\n## 🔹 **SECTION 4: 🕵️ Parse Listings — Extract Property Data**\n\n### 🧩 3. 🔍 Parse Listings (Code Node)\n\n\n**Node Name:** `Parse Listings`\n**What it does:**\nHandles the entire extraction and cleaning process in a single code node to simplify the workflow and improve performance.\n\n\n### ✅ **Step 1: Extract Markdown Text**\n\n* Extracts the core Markdown-formatted text from the complex HTML response returned by Scrapeless.\n* Automatically removes unwanted HTML tags, scripts, and ads, keeping only the meaningful page content.\n\n---\n\n### ✅ **Step 2: Parse Key Information**\n\n* Uses regex and string manipulation to extract critical fields from the Markdown text, including:\n\n  * 🏢 **Property Title**\n  * 🔗 **Link**\n  * 📐 **Size**\n  * 🏗️ **Year Built**\n\n* Outputs clean, structured **JSON objects** that are easy to pass to downstream nodes.\n\n---\n\n### ✅ **Step 3: Clean & Format Data**\n\n* Filters out unnecessary fields, keeping only the relevant ones:\n\n  * `title`\n  * `link`\n  * `size`\n  * `yearBuilt`\n\n* Formats the output to be clean and ready for export to Google Sheets, Notion, Slack, databases, or other platforms.\n\n---\n\n### 🧠 **Beginner Benefit:**\n\n> Extracts text, parses listings, and cleans data in one step, saving time and reducing node complexity. Produces structured, ready-to-use data for your business needs.\n\n"
      },
      "typeVersion": 1
    },
    {
      "id": "3cb24f03-3bc2-4ca9-9234-9dc2ab9c36a2",
      "name": "Crawl",
      "type": "n8n-nodes-scrapeless.scrapeless",
      "position": [
        360,
        780
      ],
      "parameters": {
        "url": "https://www.loopnet.com/search/commercial-real-estate/los-angeles-ca/for-lease/",
        "resource": "crawler",
        "operation": "crawl",
        "limitCrawlPages": 2
      },
      "credentials": {
        "scrapelessApi": {
          "id": "B73pdQXNjpqNbIhs",
          "name": "Scrapeless account"
        }
      },
      "typeVersion": 1
    },
    {
      "id": "64cbc99c-e071-4c0d-8758-a0e5167f4c88",
      "name": "Zeile in Tabelle anhängen oder aktualisieren",
      "type": "n8n-nodes-base.googleSheets",
      "position": [
        1580,
        780
      ],
      "parameters": {
        "columns": {
          "value": {
            "Link": "={{ $json.link }}",
            "Size": "={{ $json.size }}",
            "Image": "={{ $json.image }}",
            "Title": "={{ $json.title }}",
            "YearBuilt": "={{ $json.yearBuilt }}"
          },
          "schema": [
            {
              "id": "Title",
              "type": "string",
              "display": true,
              "removed": false,
              "required": false,
              "displayName": "Title",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Link",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "Link",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Size",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "Size",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "YearBuilt",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "YearBuilt",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Image",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "Image",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            }
          ],
          "mappingMode": "defineBelow",
          "matchingColumns": [
            "Title"
          ],
          "attemptToConvertTypes": false,
          "convertFieldsToString": false
        },
        "options": {},
        "operation": "appendOrUpdate",
        "sheetName": {
          "__rl": true,
          "mode": "list",
          "value": "gid=0",
          "cachedResultUrl": "https://docs.google.com/spreadsheets/d/1of_9PIseDnbwGYiJ5SLx3bSU5m8TTXpBN6haS2f7EBY/edit#gid=0",
          "cachedResultName": "Sheet1"
        },
        "documentId": {
          "__rl": true,
          "mode": "list",
          "value": "1of_9PIseDnbwGYiJ5SLx3bSU5m8TTXpBN6haS2f7EBY",
          "cachedResultUrl": "https://docs.google.com/spreadsheets/d/1of_9PIseDnbwGYiJ5SLx3bSU5m8TTXpBN6haS2f7EBY/edit?usp=drivesdk",
          "cachedResultName": "Real Estate Market Report"
        }
      },
      "typeVersion": 4.6
    },
    {
      "id": "0458cbbb-e60b-461d-aed0-562d5067946e",
      "name": "Notiz1",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        1420,
        -60
      ],
      "parameters": {
        "width": 580,
        "height": 1220,
        "content": "\n\n## 🔹 **SECTION 6: 📊 Append to Google Sheets — Save Data**\n\n### 🧩 6. 📈 Append to Google Sheets\n\n**Node Name:** `Google Sheets Append`  \n**What it does:**  \nAppends the parsed and cleaned property data into a Google Sheets spreadsheet for easy review and analysis.\n\n🧠 **Beginner Benefit:**  \n\n> Automatically keeps your spreadsheet up-to-date with fresh listings — no copy/paste required.\n\n"
      },
      "typeVersion": 1
    },
    {
      "id": "f0d425e4-af8d-4c6f-bced-625ba3b094f0",
      "name": "Parse Listings",
      "type": "n8n-nodes-base.code",
      "position": [
        860,
        780
      ],
      "parameters": {
        "jsCode": "const markdownData = [];\n$input.all().forEach((item) => {\n\titem.json.forEach((c) => {\n\t\tmarkdownData.push(c.markdown);\n\t});\n});\n\nconst results = [];\n\nfunction dataExtact(md) {\n\tconst re = /\\[More details for ([^\\]]+)\\]\\((https:\\/\\/www\\.loopnet\\.com\\/Listing\\/[^\\)]+)\\)/g;\n\n\tlet match;\n\n\twhile ((match = re.exec(md))) {\n\t\tconst title = match[1].trim();\n\t\tconst link = match[2].trim()?.split(' ')[0];\n\n\t\t// Extract a snippet of context around the match\n\t\tconst context = md.slice(match.index, match.index + 500);\n\n\t\t// Extract size range, e.g. \"10,000 - 20,000 SF\"\n\t\tconst sizeMatch = context.match(/([\\d,]+)\\s*-\\s*([\\d,]+)\\s*SF/);\n\t\tconst sizeRange = sizeMatch ? `${sizeMatch[1]} - ${sizeMatch[2]} SF` : null;\n\n\t\t// Extract year built, e.g. \"Built in 1988\"\n\t\tconst yearMatch = context.match(/Built in\\s*(\\d{4})/i);\n\t\tconst yearBuilt = yearMatch ? yearMatch[1] : null;\n\n\t\t// Extract image URL\n\t\tconst imageMatch = context.match(/!\\[[^\\]]*\\]\\((https:\\/\\/images1\\.loopnet\\.com[^\\)]+)\\)/);\n\t\tconst image = imageMatch ? imageMatch[1] : null;\n\n\t\tresults.push({\n\t\t\tjson: {\n\t\t\t\ttitle,\n\t\t\t\tlink,\n\t\t\t\tsize: sizeRange,\n\t\t\t\tyearBuilt,\n\t\t\t\timage,\n\t\t\t},\n\t\t});\n\t}\n\n\t// Return original markdown if no matches found (for debugging)\n\tif (results.length === 0) {\n\t\treturn [\n\t\t\t{\n\t\t\t\tjson: {\n\t\t\t\t\terror: 'No listings matched',\n\t\t\t\t\traw: md,\n\t\t\t\t},\n\t\t\t},\n\t\t];\n\t}\n}\n\nmarkdownData.forEach((item) => {\n\tdataExtact(item);\n});\n\nreturn results;\n"
      },
      "typeVersion": 2
    }
  ],
  "active": false,
  "pinData": {},
  "settings": {
    "executionOrder": "v1"
  },
  "versionId": "3bbe4fe1-455d-4486-af39-d0980957100e",
  "connections": {
    "3cb24f03-3bc2-4ca9-9234-9dc2ab9c36a2": {
      "main": [
        [
          {
            "node": "f0d425e4-af8d-4c6f-bced-625ba3b094f0",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "f0d425e4-af8d-4c6f-bced-625ba3b094f0": {
      "main": [
        [
          {
            "node": "64cbc99c-e071-4c0d-8758-a0e5167f4c88",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "337aabda-3017-4057-8383-6855837d5e9a": {
      "main": [
        [
          {
            "node": "3cb24f03-3bc2-4ca9-9234-9dc2ab9c36a2",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}
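The extraction logic of the `Parse Listings` node can be tried outside n8n before the workflow is scheduled. The following is a minimal sketch, assuming the crawler output contains Markdown links of the form `[More details for ...](https://www.loopnet.com/Listing/...)`. The `sampleMarkdown` string and the `parseListings` helper are hypothetical; they only exercise the regular expressions used in the node.

```javascript
// Hypothetical sample of crawler Markdown; real LoopNet output may differ.
const sampleMarkdown = [
  '[More details for 123 Main St](https://www.loopnet.com/Listing/123-Main-St-Los-Angeles-CA/1111/)',
  '10,000 - 20,000 SF Office. Built in 1988',
  '![photo](https://images1.loopnet.com/i2/example/photo.jpg)',
].join('\n');

// Standalone version of the node's extraction loop (no n8n globals needed).
function parseListings(md) {
  const re = /\[More details for ([^\]]+)\]\((https:\/\/www\.loopnet\.com\/Listing\/[^)]+)\)/g;
  const listings = [];
  let match;
  while ((match = re.exec(md)) !== null) {
    // Look for size, year, and image in the 500 characters after the link.
    const context = md.slice(match.index, match.index + 500);
    const size = context.match(/([\d,]+)\s*-\s*([\d,]+)\s*SF/);
    const year = context.match(/Built in\s*(\d{4})/i);
    const image = context.match(/!\[[^\]]*\]\((https:\/\/images1\.loopnet\.com[^)]+)\)/);
    listings.push({
      title: match[1].trim(),
      link: match[2].trim().split(' ')[0],
      size: size ? `${size[1]} - ${size[2]} SF` : null,
      yearBuilt: year ? year[1] : null,
      image: image ? image[1] : null,
    });
  }
  return listings;
}

const parsed = parseListings(sampleMarkdown);
console.log(parsed);
```

If the site changes its Markdown layout, this kind of offline check shows quickly whether the patterns still match, without spending crawler requests.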
Frequently asked questions

How do I use this workflow?

Copy the JSON code above, create a new workflow in your n8n instance, and select "Import from JSON". Paste the configuration and adjust the credentials as needed.
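Before importing, it can help to see which nodes reference credentials that will need to be re-linked in the target instance. The sketch below assumes the export has the shape shown above: a `nodes` array whose entries may carry a `credentials` object. The `listCredentials` helper and the trimmed `workflow` object are illustrative only and are not part of n8n.

```javascript
// Trimmed copy of the export: only the fields this check reads.
const workflow = {
  nodes: [
    {
      name: 'Crawl',
      type: 'n8n-nodes-scrapeless.scrapeless',
      credentials: {
        scrapelessApi: { id: 'B73pdQXNjpqNbIhs', name: 'Scrapeless account' },
      },
    },
    { name: 'Parse Listings', type: 'n8n-nodes-base.code' },
    {
      name: 'Zeile in Tabelle anhängen oder aktualisieren',
      type: 'n8n-nodes-base.googleSheets',
    },
  ],
};

// Hypothetical helper: returns one entry per credential reference in the export.
function listCredentials(wf) {
  return wf.nodes.flatMap((node) =>
    Object.entries(node.credentials ?? {}).map(([credentialType, ref]) => ({
      node: node.name,
      credentialType,
      credentialName: ref.name,
    }))
  );
}

const credentialRefs = listCredentials(workflow);
console.log(credentialRefs);
```

In this export only the Scrapeless node carries a credential reference. The Google Sheets node has none in the JSON, so its credentials still have to be selected by hand after the import.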

Which scenarios is this workflow suited for?

Advanced - Market Research

Does it cost anything?

This workflow is completely free. Note, however, that third-party services used in the workflow (such as the Scrapeless API) may incur charges.

Workflow information
Difficulty: Advanced
Number of nodes: 7
Categories: 1
Node types: 5
Difficulty description

For experienced users; moderately complex workflows with 6-15 nodes
