Generate a Sitemap and Visual Tree with Firecrawl and Google Sheets

Rating: 4.5 (10 reviews)
Author: Growth AI

Intermediate

This is an automation workflow in the Market Research and Multimodal AI categories. It contains 8 nodes, mainly If, Code, GoogleDrive, GoogleSheets, and RespondToWebhook, and uses Firecrawl and Google Sheets to generate a sitemap and a visual tree map of a website.

Prerequisites

•Google Drive API credentials
•Google Sheets API credentials
•Firecrawl API credentials
•HTTP Webhook endpoint (generated automatically by n8n)

Categories

•Market Research
•Multimodal AI

Nodes used (8)

•When chat message received
•Map a website and get urls
•Firecrawl OK
•Copy template
•Sorting URL into table
•Data mapping
•Final answer
•Bad URL
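The nodes form a linear pipeline with a single error branch: the If node "Firecrawl OK" checks the `success` flag of the Firecrawl map result and either continues to "Copy template" or answers through "Bad URL". The following is a minimal sketch of that decision in plain JavaScript. The response shape (`success`, `links`) is inferred from the Code node in the export, and `routeMapResult` is a hypothetical helper for illustration, not part of n8n or Firecrawl.

```javascript
// Sketch of the branch taken after the "Map a website and get urls" step.
// routeMapResult is a hypothetical helper; it only mirrors the If node logic.
function routeMapResult(mapResult, chatInput) {
  // The If node uses a strict boolean check on $json.success.
  if (mapResult && mapResult.success === true) {
    return {
      branch: "Copy template",
      linkCount: (mapResult.links ?? []).length,
    };
  }
  // Same wording as the "Bad URL" response body in the export.
  return {
    branch: "Bad URL",
    text: `L'url ${chatInput} n'est pas une url correcte ou elle n'est pas prise en compte par ce service`,
  };
}

console.log(routeMapResult({ success: true, links: ["https://example.com/"] }, "https://example.com"));
console.log(routeMapResult({ success: false }, "not-a-url"));
```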

Export workflow

Import the following JSON configuration into n8n to use this workflow:

{
  "meta": {
    "instanceId": "393ca9e36a1f81b0f643c72792946a5fe5e49eb4864181ba4032e5a408278263",
    "templateCredsSetupCompleted": true
  },
  "nodes": [
    {
      "id": "f6f18549-b9e2-4ea5-b0ad-9a4a4df3bff1",
      "name": "When chat message received",
      "type": "@n8n/n8n-nodes-langchain.chatTrigger",
      "position": [
        0,
        0
      ],
      "webhookId": "9a4aeebc-9dd5-4248-8349-ebaf7e9bd7ce",
      "parameters": {
        "mode": "webhook",
        "public": true,
        "options": {
          "responseMode": "responseNode"
        }
      },
      "typeVersion": 1.1
    },
    {
      "id": "3ab02f4d-4593-4d32-8007-f657e7706f84",
      "name": "Firecrawl OK",
      "type": "n8n-nodes-base.if",
      "position": [
        480,
        0
      ],
      "parameters": {
        "options": {},
        "conditions": {
          "options": {
            "version": 2,
            "leftValue": "",
            "caseSensitive": true,
            "typeValidation": "strict"
          },
          "combinator": "and",
          "conditions": [
            {
              "id": "d1e1025f-704e-4392-bf2b-5be624a9c3a2",
              "operator": {
                "type": "boolean",
                "operation": "true",
                "singleValue": true
              },
              "leftValue": "={{ $json.success }}",
              "rightValue": "true"
            }
          ]
        }
      },
      "typeVersion": 2.2
    },
    {
      "id": "1f44edba-d802-4e48-a193-9aa073971724",
      "name": "Copy template",
      "type": "n8n-nodes-base.googleDrive",
      "position": [
        768,
        0
      ],
      "parameters": {
        "name": "={{ $('When chat message received').item.json.chatInput }} - n8n - Arborescence",
        "fileId": {
          "__rl": true,
          "mode": "id",
          "value": "12lV4HwgudgzPPGXKNesIEExbFg09Tuu9gyC_jSS1HjI"
        },
        "options": {},
        "operation": "copy"
      },
      "credentials": {
        "googleDriveOAuth2Api": {
          "id": "3TalAPza9NdMx3yx",
          "name": "Google Drive account"
        }
      },
      "executeOnce": true,
      "typeVersion": 3
    },
    {
      "id": "9e188a8a-4faa-488d-ba3e-10e25fb94c05",
      "name": "Data mapping",
      "type": "n8n-nodes-base.googleSheets",
      "position": [
        1408,
        0
      ],
      "parameters": {
        "columns": {
          "value": {},
          "schema": [
            {
              "id": "Niv 0",
              "type": "string",
              "display": true,
              "removed": false,
              "required": false,
              "displayName": "Niv 0",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Niv 1",
              "type": "string",
              "display": true,
              "removed": false,
              "required": false,
              "displayName": "Niv 1",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Niv 2",
              "type": "string",
              "display": true,
              "removed": false,
              "required": false,
              "displayName": "Niv 2",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Niv 3",
              "type": "string",
              "display": true,
              "removed": false,
              "required": false,
              "displayName": "Niv 3",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Niv 4",
              "type": "string",
              "display": true,
              "removed": false,
              "required": false,
              "displayName": "Niv 4",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Niv 5",
              "type": "string",
              "display": true,
              "removed": false,
              "required": false,
              "displayName": "Niv 5",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "error",
              "type": "string",
              "display": true,
              "removed": false,
              "required": false,
              "displayName": "error",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "message",
              "type": "string",
              "display": true,
              "removed": false,
              "required": false,
              "displayName": "message",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            }
          ],
          "mappingMode": "autoMapInputData",
          "matchingColumns": [],
          "attemptToConvertTypes": false,
          "convertFieldsToString": false
        },
        "options": {},
        "operation": "append",
        "sheetName": {
          "__rl": true,
          "mode": "name",
          "value": "FR"
        },
        "documentId": {
          "__rl": true,
          "mode": "id",
          "value": "={{ $('Copy template').item.json.id }}"
        }
      },
      "credentials": {
        "googleSheetsOAuth2Api": {
          "id": "wBRLUCktxqXE6DVJ",
          "name": "Google Sheets account"
        }
      },
      "typeVersion": 4.5
    },
    {
      "id": "90c7df17-d3ad-434b-b6ec-23c7e64888de",
      "name": "Sorting URL into table",
      "type": "n8n-nodes-base.code",
      "position": [
        1120,
        0
      ],
      "parameters": {
        "jsCode": "/**\n * Function to process the URLs collected by Firecrawl and generate a site tree,\n * handling the different domains and subdomains separately\n * \n * @param {Object} inputData - Raw data from the Firecrawl call\n * @returns {Array} - Array of objects with the columns for Google Sheets\n */\nfunction createSiteHierarchy(inputData) {\n  // Check that the input data is valid\n  if (!inputData || !inputData.success || !Array.isArray(inputData.links) || inputData.links.length === 0) {\n    throw new Error(\"Données d'entrée invalides ou vides\");\n  }\n\n  // Normalize all URLs (convert http to https)\n  const urls = inputData.links.map(url => {\n    if (url.startsWith('http://')) {\n      return 'https://' + url.substring(7);\n    }\n    return url;\n  });\n\n  // Extract the different domains/subdomains present in the URLs\n  const domainPattern = /^https?:\\/\\/([^\\/]+)/;\n  const domains = {};\n  \n  // Group the URLs by domain/subdomain\n  for (const url of urls) {\n    const match = url.match(domainPattern);\n    if (!match) continue;\n    \n    const fullDomain = match[1]; // e.g. www.zest.fr, wiki.zest.fr\n    \n    // Extract the subdomain and the base domain\n    const domainParts = fullDomain.split('.');\n    const isSubdomain = domainParts.length > 2;\n    \n    // Determine the main domain\n    let mainDomain;\n    if (isSubdomain) {\n      // For subdomains such as wiki.zest.fr\n      mainDomain = domainParts.slice(domainParts.length - 2).join('.');\n    } else {\n      // For main domains such as zest.fr\n      mainDomain = fullDomain;\n    }\n    \n    // Record this URL in its domain group\n    if (!domains[fullDomain]) {\n      domains[fullDomain] = {\n        mainDomain: mainDomain,\n        fullDomain: fullDomain,\n        baseUrl: `https://${fullDomain}`,\n        urls: []\n      };\n    }\n    \n    domains[fullDomain].urls.push(url);\n  }\n  \n  // Process each domain/subdomain separately\n  const results = [];\n  \n  // Function to format the display text of a URL\n  function formatDisplayText(segment) {\n    if (!segment) return \"HOME PAGE\";\n    // Decode URL-encoded characters (such as %20, %C3%A9, etc.)\n    try {\n      const decoded = decodeURIComponent(segment);\n      return decoded.toUpperCase().replace(/-/g, ' ');\n    } catch (e) {\n      // If decoding fails, use the segment as is\n      return segment.toUpperCase().replace(/-/g, ' ');\n    }\n  }\n  \n  // Function to extract the relative path of a URL\n  function getPathFromUrl(url, baseUrl) {\n    // Remove the domain\n    let path = url.replace(baseUrl, '');\n    \n    // Remove leading and trailing slashes\n    if (path.startsWith('/')) path = path.substring(1);\n    if (path.endsWith('/')) path = path.substring(0, path.length - 1);\n    \n    return path;\n  }\n  \n  // Function to build the tree for a specific domain\n  function processUrlsForDomain(domainInfo) {\n    // Create a tree structure for this domain\n    const tree = {};\n    \n    // Add the home page (level 0)\n    tree[domainInfo.baseUrl] = {\n      url: domainInfo.baseUrl,\n      level: 0,\n      segments: [],\n      displayText: domainInfo.fullDomain.toUpperCase(),\n      children: {}\n    };\n    \n    // Sort the URLs by path length (shortest to longest)\n    domainInfo.urls.sort((a, b) => {\n      const pathA = getPathFromUrl(a, domainInfo.baseUrl);\n      const pathB = getPathFromUrl(b, domainInfo.baseUrl);\n      \n      const segmentsA = pathA ? pathA.split('/') : [];\n      const segmentsB = pathB ? pathB.split('/') : [];\n      \n      // First compare the number of segments\n      if (segmentsA.length !== segmentsB.length) {\n        return segmentsA.length - segmentsB.length;\n      }\n      \n      // With the same number of segments, compare alphabetically\n      return pathA.localeCompare(pathB);\n    });\n    \n    // Build the tree\n    for (const url of domainInfo.urls) {\n      // Skip the root URL, which was already added\n      if (url === domainInfo.baseUrl || url === domainInfo.baseUrl + '/') continue;\n      \n      const path = getPathFromUrl(url, domainInfo.baseUrl);\n      const segments = path ? path.split('/') : [];\n      \n      // Determine the level (capped at 5)\n      const level = Math.min(segments.length, 5);\n      \n      if (level === 0) continue; // Skip duplicates of the root URL\n      \n      // Build the full path segment by segment\n      let currentNode = tree[domainInfo.baseUrl];\n      let parentPath = domainInfo.baseUrl;\n      \n      for (let i = 0; i < level; i++) {\n        const segment = segments[i];\n        const currentPath = parentPath + '/' + segment;\n        \n        // Create the node if it does not exist\n        if (!currentNode.children[segment]) {\n          currentNode.children[segment] = {\n            url: currentPath,\n            level: i + 1,\n            segments: segments.slice(0, i + 1),\n            displayText: formatDisplayText(segment),\n            children: {}\n          };\n        }\n        \n        // Move on to the child node\n        currentNode = currentNode.children[segment];\n        parentPath = currentPath;\n      }\n    }\n    \n    // Convert the tree into rows\n    const domainRows = [];\n    \n    // Recursive function to walk the tree\n    function traverseTree(node) {\n      // Create a new row\n      const row = {\n        \"Niv 0\": \"\",\n        \"Niv 1\": \"\",\n        \"Niv 2\": \"\",\n        \"Niv 3\": \"\",\n        \"Niv 4\": \"\",\n        \"Niv 5\": \"\",\n        \"URL\": node.url // URL column holding the URL as plain text\n      };\n      \n      // Set the value at the appropriate level\n      if (node.level <= 5) {\n        row[`Niv ${node.level}`] = `=HYPERLINK(\"${node.url}\";\"${node.displayText}\")`;\n      }\n      \n      // Add the row to the result\n      domainRows.push(row);\n      \n      // Process the children in alphabetical order\n      const children = Object.values(node.children);\n      children.sort((a, b) => a.displayText.localeCompare(b.displayText));\n      \n      for (const child of children) {\n        traverseTree(child);\n      }\n    }\n    \n    // Start the walk at the root node\n    traverseTree(tree[domainInfo.baseUrl]);\n    \n    return domainRows;\n  }\n  \n  // Sort the domains: the main domain (no subdomain) first, then the subdomains\n  const sortedDomains = Object.values(domains).sort((a, b) => {\n    // If a domain is exactly the main domain, it comes first\n    const aParts = a.fullDomain.split('.');\n    const bParts = b.fullDomain.split('.');\n    \n    // Special case for www: treat it as the main domain\n    const aIsWWW = aParts.length > 2 && aParts[0] === 'www';\n    const bIsWWW = bParts.length > 2 && bParts[0] === 'www';\n    \n    if (aIsWWW && !bIsWWW) return -1;\n    if (!aIsWWW && bIsWWW) return 1;\n    \n    // Then compare by number of parts\n    if (aParts.length !== bParts.length) {\n      return aParts.length - bParts.length;\n    }\n    \n    // Finally, compare alphabetically\n    return a.fullDomain.localeCompare(b.fullDomain);\n  });\n  \n  // Process each domain and add the results\n  for (const domainInfo of sortedDomains) {\n    const domainRows = processUrlsForDomain(domainInfo);\n    results.push(...domainRows);\n  }\n  \n  return results;\n}\n\n/**\n * Main function to process the n8n input\n */\nfunction processInput() {\n  try {\n    // Fetch the data from the \"Map a website and get urls\" node using the $() method\n    // This method was confirmed to work by our tests\n    const firecrawlData = $('Map a website and get urls').item.json;\n    \n    // Check the data structure\n    if (!firecrawlData || !firecrawlData.success || !Array.isArray(firecrawlData.links)) {\n      throw new Error(\"Données d'entrée non valides ou structure incorrecte\");\n    }\n    \n    // Process the URLs to build the tree\n    const siteHierarchy = createSiteHierarchy(firecrawlData);\n    \n    // Create a new item for each row of the tree\n    // This is the format Google Sheets expects in n8n\n    return siteHierarchy.map(row => {\n      return {\n        json: row\n      };\n    });\n    \n  } catch (error) {\n    console.error(\"Erreur lors du traitement:\", error.message);\n    // Return an error message formatted for n8n\n    return [{\n      json: {\n        error: true,\n        message: error.message,\n        details: error.stack\n      }\n    }];\n  }\n}\n\n// Run the processing\nreturn processInput();"
      },
      "typeVersion": 2
    },
    {
      "id": "85d435ab-89fb-4f30-a5ff-66239dd02bfc",
      "name": "Final answer",
      "type": "n8n-nodes-base.respondToWebhook",
      "position": [
        1648,
        0
      ],
      "parameters": {
        "options": {},
        "respondWith": "json",
        "responseBody": "={\n  \"text\": \"Cliquez [ici](https://docs.google.com/spreadsheets/d/{{ $('Copy template').item.json.id }}) afin d'accéder à votre arborescence\"\n}"
      },
      "typeVersion": 1.1
    },
    {
      "id": "9d187a39-57fd-43f3-9426-6ba0f13b4a6b",
      "name": "Bad URL",
      "type": "n8n-nodes-base.respondToWebhook",
      "position": [
        672,
        208
      ],
      "parameters": {
        "options": {},
        "respondWith": "json",
        "responseBody": "={\n  \"text\": \"L'url {{ $('When chat message received').item.json.chatInput }} n'est pas une url correcte ou elle n'est pas prise en compte par ce service\"\n}"
      },
      "typeVersion": 1.1
    },
    {
      "id": "b58250bb-3f3e-4a29-a8c1-215f23503a79",
      "name": "Map a website and get urls",
      "type": "@mendable/n8n-nodes-firecrawl.firecrawl",
      "position": [
        272,
        0
      ],
      "parameters": {
        "url": "={{ $json.chatInput }}",
        "operation": "map",
        "sitemapOnly": true,
        "ignoreSitemap": false,
        "requestOptions": {}
      },
      "credentials": {
        "firecrawlApi": {
          "id": "E34WDB80ik5VHjiI",
          "name": "Firecrawl account"
        }
      },
      "typeVersion": 1
    }
  ],
  "pinData": {},
  "connections": {
    "Data mapping": {
      "main": [
        [
          {
            "node": "Final answer",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Firecrawl OK": {
      "main": [
        [
          {
            "node": "Copy template",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "Bad URL",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Copy template": {
      "main": [
        [
          {
            "node": "Sorting URL into table",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Sorting URL into table": {
      "main": [
        [
          {
            "node": "Data mapping",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Map a website and get urls": {
      "main": [
        [
          {
            "node": "Firecrawl OK",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "When chat message received": {
      "main": [
        [
          {
            "node": "Map a website and get urls",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}
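
The Code node in the export turns the flat list of links returned by Firecrawl into one sheet row per page, with the page label placed in the column that matches its depth ("Niv 0" to "Niv 5"). The reduced sketch below illustrates that idea outside n8n. It is not the author's code: it assumes all links share one host, writes plain labels instead of HYPERLINK formulas, and `urlsToRows` is a hypothetical name chosen for this example.

```javascript
// Simplified sketch: flat URL list -> rows with one column per depth level.
// Assumes every link is a valid absolute URL on the same host.
function urlsToRows(links, maxLevel = 5) {
  const host = new URL(links[0]).host;
  const base = `https://${host}`;

  // Collect every path prefix so intermediate folders get their own row.
  const prefixes = new Map(); // key: "a/b", value: ["a", "b"]
  for (const link of links) {
    const segments = new URL(link).pathname
      .split("/")
      .filter(Boolean)
      .slice(0, maxLevel); // depth is capped, as in the Code node
    for (let depth = 1; depth <= segments.length; depth++) {
      const prefix = segments.slice(0, depth);
      prefixes.set(prefix.join("/"), prefix);
    }
  }

  // Compare segment by segment so a parent sorts directly before its children.
  const bySegments = (a, b) => {
    const n = Math.min(a.length, b.length);
    for (let i = 0; i < n; i++) {
      if (a[i] !== b[i]) return a[i] < b[i] ? -1 : 1;
    }
    return a.length - b.length;
  };

  // A row with every level column present but empty.
  const emptyRow = () =>
    Object.fromEntries(
      Array.from({ length: maxLevel + 1 }, (_, i) => [`Niv ${i}`, ""])
    );

  // Level 0 is the site root, labelled with the host name.
  const rows = [{ ...emptyRow(), "Niv 0": host.toUpperCase(), URL: base }];
  for (const segments of [...prefixes.values()].sort(bySegments)) {
    const last = segments[segments.length - 1];
    const label = decodeURIComponent(last).toUpperCase().replace(/-/g, " ");
    rows.push({
      ...emptyRow(),
      [`Niv ${segments.length}`]: label,
      URL: `${base}/${segments.join("/")}`,
    });
  }
  return rows;
}

const rows = urlsToRows([
  "https://example.com/",
  "https://example.com/blog/first-post",
  "https://example.com/about",
]);
console.log(rows.map((row) => row.URL));
```

With the three sample links, the sketch yields four rows: the root, "ABOUT" and "BLOG" at level 1, and "FIRST POST" at level 2. The intermediate "blog" row appears even though it was not in the input, which mirrors how the Code node creates missing parent nodes while building its tree.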

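Frequently asked questions

How do I use this workflow?

Copy the JSON configuration above, create a new workflow in your n8n instance, select "Import from JSON", paste the configuration, and update the credentials as needed.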
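
Before importing, it can help to check that the export is internally consistent. The sketch below assumes that n8n keys the `connections` object by source node name and refers to targets by name in the `node` field, as the name-based entries in the export suggest. `findDanglingConnections` is a hypothetical helper written for this example, not an n8n API.

```javascript
// Sketch: report connections that point at node names missing from "nodes".
// Only the "main" connection type is checked here.
function findDanglingConnections(workflow) {
  const names = new Set(workflow.nodes.map((node) => node.name));
  const problems = [];
  for (const [source, outputs] of Object.entries(workflow.connections)) {
    if (!names.has(source)) {
      problems.push(`unknown source: ${source}`);
    }
    for (const branch of outputs.main ?? []) {
      for (const target of branch ?? []) {
        if (!names.has(target.node)) {
          problems.push(`unknown target: ${source} -> ${target.node}`);
        }
      }
    }
  }
  return problems;
}

// Tiny sample in the same shape as the export; "Bad URL" is deliberately
// left out of the nodes array to trigger a report.
const sample = {
  nodes: [{ name: "Firecrawl OK" }, { name: "Copy template" }],
  connections: {
    "Firecrawl OK": {
      main: [
        [{ node: "Copy template", type: "main", index: 0 }],
        [{ node: "Bad URL", type: "main", index: 0 }],
      ],
    },
  },
};
console.log(findDanglingConnections(sample));
```

To check a real export, the same function could be run on the parsed JSON, for example on the result of `JSON.parse` applied to the pasted text. An empty array means every connection resolves to a known node.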

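What is this workflow suited for?

Intermediate - Market Research, Multimodal AI

Is it paid?

This workflow is completely free. However, third-party services used by the workflow (such as the Firecrawl API) may incur separate charges.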
