Automated Web Scraper: Monitoring Niche Jobs/Products with Telegram Alerts

Name: Automated Web Scraper: Monitoring Niche Jobs/Products with Telegram Alerts
Rating: 4.5 (10 reviews)
Author: Piotr Sobolewski

Intermediate

This is an automation workflow in the Market Research and AI Summarization categories. It contains 6 nodes and mainly uses If, Cron, Function, Telegram, and HtmlExtract nodes, plus an HTTP Request node that fetches the page. Once an hour it downloads a web page, extracts job or product entries, and sends a Telegram alert.

Prerequisites

•Telegram Bot Token
•Authentication credentials may be required for the target site or API
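
Besides the bot token, the Telegram node in this workflow needs a chat ID. The setup note in the export suggests sending the bot a message and then opening the `getUpdates` URL. As a minimal sketch, assuming you have pasted the JSON body of that response into a variable, the hypothetical helper `extractChatIds` below collects the chat IDs it contains. The sample response is made-up data that follows the shape of the Telegram Bot API.

```javascript
// Sketch: collect the distinct chat IDs found in a Telegram getUpdates response.
// `response` is the already-parsed JSON body; `extractChatIds` is a hypothetical helper.
function extractChatIds(response) {
  if (!response || response.ok !== true || !Array.isArray(response.result)) {
    return [];
  }
  const ids = new Set();
  for (const update of response.result) {
    // Private and group messages arrive under `message`;
    // channel posts arrive under `channel_post`.
    const chat = (update.message || update.channel_post || {}).chat;
    if (chat && chat.id !== undefined) {
      ids.add(chat.id);
    }
  }
  return [...ids];
}

// Made-up sample data for illustration only.
const sampleResponse = {
  ok: true,
  result: [
    { update_id: 1, message: { message_id: 10, chat: { id: 123456789, type: "private" }, text: "hi" } },
    { update_id: 2, message: { message_id: 11, chat: { id: 123456789, type: "private" }, text: "again" } },
  ],
};

console.log(extractChatIds(sampleResponse)); // [ 123456789 ]
```

If the result is empty, the bot has probably not received a message yet, or the updates were already consumed elsewhere.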

Nodes used (6)

Category

Market Research

AI Summarization

Workflow preview

Hourly Monitor Trigger

Fetch Webpage Content

Extract Job/Product Info

If Items Found

Format Notification Message

Send Telegram Alert

Export workflow

Copy the following JSON configuration into n8n to import and use this workflow

{
  "nodes": [
    {
      "name": "Hourly Monitor Trigger",
      "type": "n8n-nodes-base.cron",
      "notes": {
        "text": "### 1. Hourly Monitor Trigger\n\nThis `Cron` node will trigger the workflow automatically every **hour**.\n\n**To change the schedule:** Adjust the 'Mode' or set specific 'Hour' and 'Minute' values to match how often you want to check the website (e.g., every 4 hours, daily).",
        "position": "right"
      },
      "position": [
        240,
        300
      ],
      "parameters": {
        "mode": "everyHour",
        "options": {}
      },
      "typeVersion": 1,
      "id": "Activador-de-Monitoreo-por-Hora-0"
    },
    {
      "name": "Fetch Webpage Content",
      "type": "n8n-nodes-base.httpRequest",
      "notes": {
        "text": "### 2. Fetch Webpage Content\n\nThis `HTTP Request` node downloads the entire HTML content of the target webpage.\n\n**Setup:**\n1.  **URL:** **IMPORTANT:** Change `https://www.n8n.io/blog/` to the exact URL of the job board, product page, or any webpage you want to monitor.\n2.  **Response Format:** Ensure this is set to `string` (for HTML content).\n\n**Considerations:**\n* If the website requires login, you might need to add authentication headers or cookies (more advanced).\n* If the content loads dynamically with JavaScript after the initial page load, this method might not capture it. You'd need more advanced tools (like Puppeteer/Playwright in a `Code` node).",
        "position": "right"
      },
      "position": [
        460,
        300
      ],
      "parameters": {
        "url": "https://www.n8n.io/blog/",
        "options": {},
        "responseFormat": "string"
      },
      "typeVersion": 3,
      "id": "Obtener-Contenido-de-la-P-gina-Web-1"
    },
    {
      "name": "Extract Job/Product Info",
      "type": "n8n-nodes-base.htmlExtract",
      "notes": {
        "text": "### 3. Extract Specific Data (`HTML Extract` - Key Node!)\n\nThis `HTML Extract` node is the core of the web scraping. It parses the HTML and pulls out specific data points based on CSS Selectors.\n\n**Setup (CRITICAL!):**\n1.  **HTML:** This field is already set to `{{ $node[\"Fetch Webpage Content\"].json.data }}`, taking the HTML from the previous node.\n2.  **Extract Operations:**\n    * **Change or Add Operations:** You'll need to define exactly *what* to extract.\n    * **Selector:** This is the most important part. You need to find the correct CSS selector for the data you want.\n        * **How to find:** Open the target webpage in your browser (Chrome/Firefox). Right-click on the specific text/element (e.g., a job title, a product price) and choose 'Inspect' or 'Inspect Element'. In the developer tools panel, right-click on the highlighted HTML code, then select 'Copy' -> **'Copy selector'**. Paste this into the 'Selector' field. The field expects a CSS selector, so 'Copy XPath' will not work here.\n    * **Attribute:** Usually `textContent` for visible text, or `href` for links, `src` for image URLs, etc.\n    * **Property Name:** Give it a meaningful name (e.g., `JobTitle`, `JobLink`, `ProductName`, `StockStatus`).\n\n**Example (from n8n blog):**\n* `h3.BlogItem_title__d78Xb` for blog post titles (`textContent`)\n* `a.BlogItem_blogItem__a_H6E` for blog post links (`href`)\n\n**Test this node carefully!** Run the workflow up to this point and inspect its output to ensure it extracts what you expect.",
        "position": "right"
      },
      "position": [
        700,
        300
      ],
      "parameters": {
        "html": "={{ $node[\"Fetch Webpage Content\"].json.data }}",
        "extractOperations": [
          {
            "options": {},
            "selector": "h3.BlogItem_title__d78Xb",
            "attribute": "textContent",
            "operation": "extract",
            "propertyName": "JobTitle"
          },
          {
            "options": {},
            "selector": "a.BlogItem_blogItem__a_H6E",
            "attribute": "href",
            "operation": "extract",
            "propertyName": "JobLink"
          }
        ]
      },
      "typeVersion": 1,
      "id": "Extraer-Informaci-n-de-Puestos-Productos-2"
    },
    {
      "name": "If Items Found",
      "type": "n8n-nodes-base.if",
      "notes": {
        "text": "### 4. If Items Found (Conditional Check)\n\nThis `If` node checks if the 'Extract Job/Product Info' node actually found any items. If it did, the workflow continues down the 'True' path to send a notification.\n\n**No configuration needed**; it checks if the array of extracted items is not empty.",
        "position": "right"
      },
      "position": [
        940,
        300
      ],
      "parameters": {
        "conditions": [
          {
            "value1": "={{ $json.length }}",
            "value2": "0",
            "operation": "notEqual"
          }
        ]
      },
      "typeVersion": 1,
      "id": "Si-se-Encuentran-Elementos-3"
    },
    {
      "name": "Format Notification Message",
      "type": "n8n-nodes-base.function",
      "notes": {
        "text": "### 5. Format Notification Message\n\nThis `Function` node takes the extracted data and formats it into a human-readable message for your Telegram alert.\n\n**Customization:**\n* **Adjust `item.json.JobTitle`, `item.json.JobLink`, etc.:** Make sure these match the 'Property Name' you defined in the 'Extract Job/Product Info' node.\n* You can add more details or change the formatting here.\n\n**No configuration needed if your property names match the example.**",
        "position": "right"
      },
      "position": [
        1180,
        220
      ],
      "parameters": {
        "options": {},
        "function": "let summary = \"\";\n\nif (items.length > 0) {\n  // Telegram's legacy Markdown parse mode marks bold with single asterisks.\n  summary = `*Found ${items.length} new/updated items!*\\n\\n`;\n  for (const item of items) {\n    // Assuming you extracted 'JobTitle' and 'JobLink' from HTML Extract\n    const title = item.json.JobTitle || item.json.ProductName || 'N/A';\n    const link = item.json.JobLink || 'No link';\n    const otherInfo = item.json.StockStatus ? ` (Status: ${item.json.StockStatus})` : '';\n    summary += `- *${title}*${otherInfo}\\n  Link: ${link}\\n\\n`;\n  }\n} else {\n  summary = \"No new items found during this check.\";\n}\n\nreturn [{ json: { notificationMessage: summary } }];"
      },
      "typeVersion": 1,
      "id": "Formatear-Mensaje-de-Notificaci-n-4"
    },
    {
      "name": "Send Telegram Alert",
      "type": "n8n-nodes-base.telegram",
      "notes": {
        "text": "### 6. Send Telegram Alert\n\nThis `Telegram` node sends the formatted notification message to your Telegram chat.\n\n**Setup:**\n1.  **Telegram Credential:** Click 'Credentials' and select 'New Credential'. Choose 'Telegram API'.\n    * You'll need a **Bot Token** from BotFather on Telegram (search for '@BotFather' in Telegram, type `/newbot`, follow instructions).\n2.  **Chat ID:** **IMPORTANT: You need your specific Telegram Chat ID.**\n    * **How to get it:** Send a message to your new bot. Then, open this URL in your browser: `https://api.telegram.org/bot<YOUR_BOT_TOKEN>/getUpdates` (replace `<YOUR_BOT_TOKEN>` with your bot's token). Look for the `\"chat\": {\"id\": ...}` field; that's your Chat ID.\n    * Paste this ID into the 'Chat ID' field.\n3.  **Text:** This pulls the message from the 'Format Notification Message' node.\n4.  **Parse Mode:** Set to `Markdown` so that bold text (`*bold*` in Telegram's legacy Markdown) and links render.\n\n**Test this node by running the workflow (from the 'Hourly Monitor Trigger') and checking your Telegram!**",
        "position": "right"
      },
      "position": [
        1420,
        220
      ],
      "parameters": {
        "text": "={{ $json.notificationMessage }}",
        "chatId": "YOUR_TELEGRAM_CHAT_ID",
        "options": {},
        "parseMode": "Markdown"
      },
      "credentials": {
        "telegramApi": {
          "id": "YOUR_TELEGRAM_CREDENTIAL_ID",
          "resolve": false
        }
      },
      "typeVersion": 1,
      "id": "Enviar-Alerta-Telegram-5"
    }
  ],
  "pinData": {},
  "version": 1,
  "connections": {
    "Si-se-Encuentran-Elementos-3": {
      "main": [
        [
          {
            "node": "Formatear-Mensaje-de-Notificaci-n-4",
            "type": "main"
          }
        ],
        []
      ]
    },
    "Obtener-Contenido-de-la-P-gina-Web-1": {
      "main": [
        [
          {
            "node": "Extraer-Informaci-n-de-Puestos-Productos-2",
            "type": "main"
          }
        ]
      ]
    },
    "Activador-de-Monitoreo-por-Hora-0": {
      "main": [
        [
          {
            "node": "Obtener-Contenido-de-la-P-gina-Web-1",
            "type": "main"
          }
        ]
      ]
    },
    "Extraer-Informaci-n-de-Puestos-Productos-2": {
      "main": [
        [
          {
            "node": "Si-se-Encuentran-Elementos-3",
            "type": "main"
          }
        ]
      ]
    },
    "Formatear-Mensaje-de-Notificaci-n-4": {
      "main": [
        [
          {
            "node": "Enviar-Alerta-Telegram-5",
            "type": "main"
          }
        ]
      ]
    }
  }
}
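
The check-then-format logic of the last three nodes can be tried outside n8n. The following is a minimal sketch under stated assumptions, not the workflow's own implementation: `hasExtractedItems` and `formatNotification` are hypothetical names, the items mimic n8n's `{ json: {...} }` shape, and the property names (`JobTitle`, `JobLink`, `ProductName`, `StockStatus`) are the ones used in the export. In n8n expressions `$json` is the current item's data object, so a test on `$json.length` may not behave as an emptiness check; the sketch inspects the extracted fields instead. It uses single asterisks for bold, which is what Telegram's legacy `Markdown` parse mode expects.

```javascript
// Sketch of the "if items found" check and the message formatting, outside n8n.

// True when at least one item carries a non-empty title or link.
function hasExtractedItems(items) {
  return items.some(({ json }) => {
    const values = [json.JobTitle, json.ProductName, json.JobLink];
    return values.some((v) => typeof v === "string" && v.trim() !== "");
  });
}

// Builds the alert text using Telegram's legacy Markdown (*bold*).
function formatNotification(items) {
  if (!hasExtractedItems(items)) {
    return "No new items found during this check.";
  }
  const lines = [`*Found ${items.length} new/updated items!*`, ""];
  for (const { json } of items) {
    const title = json.JobTitle || json.ProductName || "N/A";
    const link = json.JobLink || "No link";
    const status = json.StockStatus ? ` (Status: ${json.StockStatus})` : "";
    lines.push(`- *${title}*${status}`, `  Link: ${link}`, "");
  }
  // Drop the trailing blank line so the message ends cleanly.
  return lines.join("\n").trimEnd();
}

// Made-up sample item for illustration only.
const sampleItems = [
  { json: { JobTitle: "Backend Engineer", JobLink: "https://example.com/jobs/1" } },
];

console.log(formatNotification(sampleItems));
```

Titles or links that themselves contain `*` or `_` may need escaping before they are sent with a Markdown parse mode; the sketch does not handle that.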

Frequently asked questions

How do I use this workflow?

Copy the JSON configuration above, create a new workflow in your n8n instance, choose "Import from JSON", and paste the configuration. Then adjust the credential settings as needed, including the `YOUR_TELEGRAM_CHAT_ID` and `YOUR_TELEGRAM_CREDENTIAL_ID` placeholders.
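
Before importing, it can help to list which values in the JSON are still placeholders. The sketch below is one way to do that, assuming placeholders follow the `YOUR_...` naming seen in the export; `findPlaceholders` is a hypothetical helper and the sample is a cut-down, made-up workflow rather than the full export.

```javascript
// Sketch: report string values in node parameters and credentials
// that still look like YOUR_... placeholders.
function findPlaceholders(workflowJson) {
  const workflow = JSON.parse(workflowJson);
  const findings = [];

  // Walk nested objects and arrays, recording the dotted path to each match.
  const visit = (value, path, nodeName) => {
    if (typeof value === "string") {
      if (/^YOUR_[A-Z0-9_]+$/.test(value)) {
        findings.push({ node: nodeName, path, value });
      }
    } else if (value && typeof value === "object") {
      for (const [key, child] of Object.entries(value)) {
        visit(child, `${path}.${key}`, nodeName);
      }
    }
  };

  for (const node of workflow.nodes || []) {
    visit(node.parameters, "parameters", node.name);
    visit(node.credentials, "credentials", node.name);
  }
  return findings;
}

// Cut-down, made-up sample for illustration only.
const sample = JSON.stringify({
  nodes: [
    {
      name: "Send Telegram Alert",
      parameters: { chatId: "YOUR_TELEGRAM_CHAT_ID", text: "={{ $json.notificationMessage }}" },
      credentials: { telegramApi: { id: "YOUR_TELEGRAM_CREDENTIAL_ID" } },
    },
  ],
});

console.log(findPlaceholders(sample));
```

Each entry names the node and the path of a value to replace, which makes it harder to miss a placeholder after import.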

Which scenarios is this workflow suited for?

Intermediate-level users who want to monitor a job board or product page and receive Telegram alerts. Categories: Market Research, AI Summarization.

Does it cost anything?

This workflow is completely free; you can import and use it directly. Note, however, that third-party services a workflow depends on may charge fees, which you pay yourself.