Real-time scraping of Y Combinator startups with Apify and Google Sheets

Intermediate

This is a Lead Generation and Multimodal AI automation workflow containing 9 nodes. It mainly uses Google Sheets, Apify, and Manual Trigger nodes. It automates scraping Y Combinator startups with Apify and logging the results to Google Sheets.

Prerequisites
  • Google Sheets API credentials
  • Apify API credentials
Export workflow
Copy the following JSON configuration into n8n to import and use this workflow
{
  "id": "f0l6j5GkLScFOfqK",
  "meta": {
    "instanceId": "1a54c41d9050a8f1fa6f74ca858828ad9fb97b9fafa3e9760e576171c531a787",
    "templateCredsSetupCompleted": true
  },
  "name": "Live-Automate Scraping Y Combinator Startups with Apify & Google Sheets",
  "tags": [],
  "nodes": [
    {
      "id": "4d88b9f9-6909-47c8-91a5-c27ebc97de49",
      "name": "Ejecutar un Actor",
      "type": "@apify/n8n-nodes-apify.apify",
      "position": [
        1632,
        1632
      ],
      "parameters": {
        "actorId": {
          "__rl": true,
          "mode": "list",
          "value": "XXsXDaNQLjoF4lgmU",
          "cachedResultUrl": "https://console.apify.com/actors/XXsXDaNQLjoF4lgmU/input",
          "cachedResultName": "Y Combinator Directory Scraper | Fast & Reliable | $4.5 / 1K (fatihtahta/y-combinator-directory-scraper)"
        },
        "customBody": "{\n  \"maxCompanies\": 5,\n  \"startUrls\": \"{https://www.ycombinator.com/companies?industry=Fintech&regions=America%20%2F%20Canada&team_size=%5B%221%22%2C%2225%22%5D}\",\n  \"proxyConfiguration\": {\n    \"useApifyProxy\": true\n  }\n}"
      },
      "credentials": {
        "apifyApi": {
          "id": "8decwrzbYTySCGCT",
          "name": "Apify account 4"
        }
      },
      "typeVersion": 1
    },
    {
      "id": "e524c759-a193-42b6-9553-683656413431",
      "name": "Obtener elementos del dataset",
      "type": "@apify/n8n-nodes-apify.apify",
      "position": [
        2432,
        1968
      ],
      "parameters": {
        "resource": "Datasets",
        "datasetId": "={{ $json.defaultDatasetId }}"
      },
      "credentials": {
        "apifyApi": {
          "id": "8decwrzbYTySCGCT",
          "name": "Apify account 4"
        }
      },
      "typeVersion": 1
    },
    {
      "id": "4eea9bab-911c-4480-9073-831b8ac46571",
      "name": "Nota adhesiva",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        608,
        1744
      ],
      "parameters": {
        "width": 528,
        "height": 336,
        "content": "### **Step 1 – Manual Trigger**\n\n- The workflow begins with a **Manual Trigger node**, allowing you to start the process on demand.  \n- This approach ensures full control over when company data from **Y Combinator** is scraped and logged.  \n"
      },
      "typeVersion": 1
    },
    {
      "id": "b5814a97-7dd1-4488-8af3-6bf0af555d51",
      "name": "Iniciar flujo de trabajo",
      "type": "n8n-nodes-base.manualTrigger",
      "position": [
        816,
        1936
      ],
      "parameters": {},
      "typeVersion": 1
    },
    {
      "id": "3eacc0a3-ca74-4405-ad0e-a25b9b4b964e",
      "name": "Nota adhesiva1",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        1392,
        1424
      ],
      "parameters": {
        "color": 3,
        "width": 592,
        "height": 368,
        "content": "### **Step 2 – Apify Actor (Scrape Company Data)**\n\n- This step uses an **Apify Actor node** to scrape details of companies listed on **Y Combinator**.  \n- You need to provide the **URL of the Y Combinator search page** with your desired filters applied (e.g., industry, location, funding stage).  \n- The actor then extracts structured company data, including names, descriptions, websites, and other available details, preparing it for downstream logging and processing.\n"
      },
      "typeVersion": 1
    },
    {
      "id": "d67e5ff1-ff84-4196-9a76-cc59215e4061",
      "name": "Nota adhesiva2",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        2176,
        1760
      ],
      "parameters": {
        "color": 4,
        "width": 592,
        "height": 368,
        "content": "### **Step 3 – Apify Get Dataset Items**\n\n- This step uses the **Apify Get Dataset Items node** to fetch the actual company data generated by the Apify Actor in the previous step.  \n- The node requires the **Dataset ID** returned by the Apify Actor to retrieve structured results.  \n- The output includes detailed company information (e.g., name, description, website, location, sector), which is then prepared for logging into Google Sheets.\n"
      },
      "typeVersion": 1
    },
    {
      "id": "04149226-1821-419d-b7c6-f2288de0f4cc",
      "name": "Nota adhesiva3",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        3040,
        1104
      ],
      "parameters": {
        "color": 5,
        "width": 640,
        "height": 720,
        "content": "### **Step 4 – Add or Update Row in Google Sheet**\n\n- This step uses the **Google Sheets (Add or Update Row) node** to log the company data into a connected Google Sheet.  \n- You must **select the target Google Document and specific Sheet** where the data will be stored.  \n- Ensure the following columns are already created in the sheet (**case-sensitive**):  \n  - Company  \n  - Location  \n  - Website  \n  - LinkedIn  \n  - Founded  \n  - Description  \n  - Industry Tags  \n  - Founder 1 Name  \n  - Founder 1 LinkedIn  \n  - Founder 2 Name  \n  - Founder 2 LinkedIn  \n\n- The node will automatically add new rows or update existing entries, keeping the sheet clean and up to date with the latest scraped company details.\n"
      },
      "typeVersion": 1
    },
    {
      "id": "e0cff6ae-ea8b-47c6-8cc1-884459e8224e",
      "name": "Agregar datos a la hoja Google",
      "type": "n8n-nodes-base.googleSheets",
      "position": [
        3312,
        1616
      ],
      "parameters": {
        "columns": {
          "value": {
            "Company": "={{ $json.company_name }}",
            "Founded": "={{ $json.year_founded }}",
            "Website": "={{ $json.website }}",
            "LinkedIn": "={{ $json.company_linkedin }}",
            "Location": "={{ $json.company_location }}",
            "Description": "={{ $json.long_description }}",
            "Industry Tags": "={{ $json['tags/0'] }} {{ $json['tags/1'] }} {{ $json['tags/2'] }} {{ $json['tags/3'] }}",
            "Founder 1 Name": "={{ $json['founders/0/name'] }}",
            "Founder 2 Name": "={{ $json['founders/1/name'] }}",
            "Founder 1 LinkedIn": "={{ $json['founders/0/linkedin'] }}",
            "Founder 2 LinkedIn": "={{ $json['founders/1/linkedin'] }}"
          },
          "schema": [
            {
              "id": "Company",
              "type": "string",
              "display": true,
              "removed": false,
              "required": false,
              "displayName": "Company",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Location",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "Location",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Website",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "Website",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "LinkedIn",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "LinkedIn",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Founded",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "Founded",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Description",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "Description",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Industry Tags",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "Industry Tags",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Founder 1 Name",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "Founder 1 Name",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Founder 1 LinkedIn",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "Founder 1 LinkedIn",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Founder 2 Name",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "Founder 2 Name",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Founder 2 LinkedIn",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "Founder 2 LinkedIn",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            }
          ],
          "mappingMode": "defineBelow",
          "matchingColumns": [
            "Company"
          ],
          "attemptToConvertTypes": false,
          "convertFieldsToString": false
        },
        "options": {},
        "operation": "appendOrUpdate",
        "sheetName": {
          "__rl": true,
          "mode": "list",
          "value": "gid=0",
          "cachedResultUrl": "https://docs.google.com/spreadsheets/d/1AEOYMIRNgxYN3gihT1bIrGswnkCzuWbFljX2ac4XjUU/edit#gid=0",
          "cachedResultName": "Sheet1"
        },
        "documentId": {
          "__rl": true,
          "mode": "list",
          "value": "1AEOYMIRNgxYN3gihT1bIrGswnkCzuWbFljX2ac4XjUU",
          "cachedResultUrl": "https://docs.google.com/spreadsheets/d/1AEOYMIRNgxYN3gihT1bIrGswnkCzuWbFljX2ac4XjUU/edit?usp=drivesdk",
          "cachedResultName": "YCom Apify Scrapped "
        }
      },
      "credentials": {
        "googleSheetsOAuth2Api": {
          "id": "dZG6jp43p2oX45HG",
          "name": "Google Sheets account 4-Smit"
        }
      },
      "typeVersion": 4.7
    },
    {
      "id": "c8f614e2-2aa5-4f4a-8be9-090fb24bf616",
      "name": "Nota adhesiva4",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        368,
        944
      ],
      "parameters": {
        "color": 3,
        "width": 768,
        "height": 672,
        "content": "### **Step 0 – Prerequisites**\n\nBefore running the workflow, ensure the following configurations are complete:\n\n- **Apify Setup:**\n  - Connect your Apify account in n8n.  \n  - Select the **Y Combinator Directory Scraper** actor.  \n  - Paste the Y Combinator search URL (with filters applied) into the `searchUrls` parameter.  \n  - Adjust the `maxCompanies` parameter to control the number of companies scraped per run.  \n\n- **Google Sheets Setup:**\n  - Connect your Google account using **OAuth2 credentials** with both **Google Sheets** and **Google Drive** features enabled.  \n  - Ensure the target Google Sheet is created in advance with the following column headers (**case-sensitive**):  \n    - Company  \n    - Location  \n    - Website  \n    - LinkedIn  \n    - Founded  \n    - Description  \n    - Industry Tags  \n    - Founder 1 Name  \n    - Founder 1 LinkedIn  \n    - Founder 2 Name  \n    - Founder 2 LinkedIn  \n\n- **n8n Configuration:**\n  - Confirm that both Apify and Google integrations are properly authenticated and available in your workflow.\n"
      },
      "typeVersion": 1
    }
  ],
  "active": false,
  "pinData": {},
  "settings": {
    "executionOrder": "v1"
  },
  "versionId": "36ae4ec1-b59a-49a4-b4e6-0f80bd2111f3",
  "connections": {
    "4d88b9f9-6909-47c8-91a5-c27ebc97de49": {
      "main": [
        [
          {
            "node": "e524c759-a193-42b6-9553-683656413431",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "b5814a97-7dd1-4488-8af3-6bf0af555d51": {
      "main": [
        [
          {
            "node": "4d88b9f9-6909-47c8-91a5-c27ebc97de49",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "e524c759-a193-42b6-9553-683656413431": {
      "main": [
        [
          {
            "node": "e0cff6ae-ea8b-47c6-8cc1-884459e8224e",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}
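The Google Sheets node in the configuration above reads flattened dataset keys such as `founders/0/name` and `tags/0`, and uses `Company` as the matching column for its `appendOrUpdate` operation. The following Python sketch mimics that data path outside n8n to make the behaviour easier to follow. The function names `to_row` and `append_or_update` are hypothetical, the in-memory list stands in for the sheet, and replacing only the first matching row is a simplifying assumption rather than a statement about how the n8n node is implemented.

```python
from typing import Dict, List

# Sheet column -> flattened dataset key, copied from the Google Sheets node mapping.
SIMPLE_COLUMNS = {
    "Company": "company_name",
    "Location": "company_location",
    "Website": "website",
    "LinkedIn": "company_linkedin",
    "Founded": "year_founded",
    "Description": "long_description",
    "Founder 1 Name": "founders/0/name",
    "Founder 1 LinkedIn": "founders/0/linkedin",
    "Founder 2 Name": "founders/1/name",
    "Founder 2 LinkedIn": "founders/1/linkedin",
}


def to_row(item: Dict[str, object]) -> Dict[str, str]:
    """Build one sheet row from a flattened dataset item; missing keys become ''."""
    row = {col: str(item.get(key, "")) for col, key in SIMPLE_COLUMNS.items()}
    # The node concatenates up to four tags (tags/0 .. tags/3) separated by spaces.
    tags = [str(item[f"tags/{i}"]) for i in range(4) if f"tags/{i}" in item]
    row["Industry Tags"] = " ".join(tags)
    return row


def append_or_update(
    sheet: List[Dict[str, str]], row: Dict[str, str], match: str = "Company"
) -> None:
    """Replace the first row whose `match` column equals the new row's, else append."""
    for i, existing in enumerate(sheet):
        if existing.get(match) == row[match]:
            sheet[i] = row
            return
    sheet.append(row)
```

Calling `append_or_update(sheet, to_row(item))` twice for the same company name leaves a single row holding the latest values, which is why re-running the workflow should not duplicate companies as long as the `Company` header matches exactly.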
Frequently asked questions

How do I use this workflow?

Copy the JSON configuration above, create a new workflow in your n8n instance, select "Import from JSON", paste the configuration, and then adjust the credential settings as needed.
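Besides credentials, the Y Combinator search URL inside the actor's `customBody` usually needs changing after import. The sketch below shows one way to build such a URL with percent-encoded filters. The helper name `build_search_url` is hypothetical, and the query parameter names (`industry`, `regions`, `team_size`) are taken from the example URL in the exported configuration, not from any official documentation.

```python
import json
from typing import List
from urllib.parse import quote, urlencode

BASE_URL = "https://www.ycombinator.com/companies"


def build_search_url(industry: str, region: str, team_size: List[str]) -> str:
    """Return a directory search URL with the given filters percent-encoded."""
    query = {
        "industry": industry,
        "regions": region,
        # The example URL encodes team size as a compact JSON array, e.g. ["1","25"].
        "team_size": json.dumps(team_size, separators=(",", ":")),
    }
    # quote_via=quote encodes spaces as %20 (not "+") and "/" as %2F.
    return f"{BASE_URL}?{urlencode(query, quote_via=quote)}"


print(build_search_url("Fintech", "America / Canada", ["1", "25"]))
```

With these arguments the function reproduces the URL used in the exported workflow. Check the actor's own input schema for how the URL should be wrapped before pasting it into the request body.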

In which scenarios is this workflow suitable?

Intermediate - Lead Generation, Multimodal AI

Is it paid?

This workflow is completely free; you can import and use it directly. Note, however, that third-party services used in the workflow (such as Apify) may require payment on your own account.
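As a rough guide to third-party cost, the actor's listing name in the exported configuration includes "$4.5 / 1K". The sketch below assumes this means USD 4.50 per 1,000 scraped companies, which is not confirmed by this page, and ignores any other platform charges. The function name `estimate_actor_cost` is hypothetical.

```python
from decimal import Decimal

# Assumption: "$4.5 / 1K" in the actor's listing name means USD 4.50 per 1,000 results.
PRICE_PER_1K_USD = Decimal("4.5")


def estimate_actor_cost(max_companies: int) -> Decimal:
    """Estimate the actor charge in USD for one run capped at `max_companies`."""
    return PRICE_PER_1K_USD * max_companies / 1000


# maxCompanies is 5 in the exported configuration.
print(estimate_actor_cost(5))
```

Under that assumption, a run with the default `maxCompanies` of 5 costs a small fraction of a dollar. Confirm current pricing on the actor's page before scaling up.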

Workflow information
Difficulty level: Intermediate
Number of nodes: 9
Categories: 2
Node types: 4
Difficulty description

Suitable for users with intermediate experience; medium-complexity workflows with 6-15 nodes
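The node and node-type counts can be checked against an exported workflow. This is a minimal sketch: `summarize` is a hypothetical helper, and `sample` is a trimmed stand-in that keeps only the `type` field of each of the nine nodes in the configuration above.

```python
from collections import Counter
from typing import Dict, Tuple


def summarize(workflow: Dict) -> Tuple[int, Counter]:
    """Return the node count and a per-type tally for an n8n workflow export."""
    types = Counter(node["type"] for node in workflow.get("nodes", []))
    return sum(types.values()), types


# Trimmed stand-in for the export: only the "type" field of each node is kept.
sample = {
    "nodes": [
        {"type": "@apify/n8n-nodes-apify.apify"},
        {"type": "@apify/n8n-nodes-apify.apify"},
        {"type": "n8n-nodes-base.manualTrigger"},
        {"type": "n8n-nodes-base.googleSheets"},
    ]
    + [{"type": "n8n-nodes-base.stickyNote"}] * 5
}

total, by_type = summarize(sample)
print(total, len(by_type))  # prints "9 4"
```

Five of the nine nodes are sticky notes, so the executable path is only four nodes long: the manual trigger, two Apify nodes, and the Google Sheets node.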

Author
Intuz (@intuz)

Workflow automation can help automate your routine activities and save money as well as hours of time. As a boutique tech consulting company, Intuz helps businesses with custom AI/ML, AI workflow automations, and software development. Automate your business workflow for: Sales, Marketing, Accounting, Finance, Operations, E-Commerce, Customer Support, Admin & Backoffice, Logistics & Supply Chain.
