Automate News Article Scraping with ScrapegraphAI and Store in Google Sheets
This is a Market Research / AI Summarization automation workflow containing 8 nodes. It mainly uses nodes such as Code, GoogleSheets, ScheduleTrigger, and ScrapegraphAi, and relies on ScrapegraphAI to automatically extract news articles and store them in Google Sheets.
- Google Sheets API credentials
- ScrapegraphAI API credentials
Nodes used (8)
Category: Market Research, AI Summarization
{
"id": "MIllJmbqayQrZM1F",
"meta": {
"instanceId": "521567c5f495f323b77849c4cfd0c9f4f2396c986e324e0e66c8425b6f124744",
"templateCredsSetupCompleted": true
},
"name": "Automate News Article Scraping with ScrapegraphAI and Store in Google Sheets",
"tags": [],
"nodes": [
{
"id": "37df323b-5c75-495f-ba19-b8642c02d96f",
"name": "Déclencheur de Collecte Automatisée d'Actualités",
"type": "n8n-nodes-base.scheduleTrigger",
"position": [
700,
820
],
"parameters": {
"rule": {
"interval": [
{}
]
}
},
"typeVersion": 1.2
},
{
"id": "efd61ca5-e248-4027-b705-6d9c5dabe820",
"name": "Récupérateur d'Articles d'Actualité par IA",
"type": "n8n-nodes-scrapegraphai.scrapegraphAi",
"position": [
1380,
820
],
"parameters": {
"userPrompt": "Extract all the articles from this site. Use the following schema for response { \"request_id\": \"5a9de102-8a43-4e89-8aae-397c9ca80a9b\", \"status\": \"completed\", \"website_url\": \"https://www.bbc.com/\", \"user_prompt\": \"Extract all the articles from this site.\", \"title\": \"'My friend died right in front of me' - Student describes moment air force jet crashed into school\", \"url\": \"https://www.bbc.com/news/articles/cglzw8y5wy5o\", \"category\": \"Asia\" }",
"websiteUrl": "https://www.bbc.com/"
},
"credentials": {
"scrapegraphAIApi": {
"id": "",
"name": ""
}
},
"typeVersion": 1
},
{
"id": "976d9123-7585-4700-9972-5b2838571a44",
"name": "Stockage d'Actualités dans Google Sheets",
"type": "n8n-nodes-base.googleSheets",
"position": [
2980,
820
],
"parameters": {
"columns": {
"value": {},
"schema": [
{
"id": "title",
"type": "string",
"display": true,
"removed": false,
"required": false,
"displayName": "title",
"defaultMatch": false,
"canBeUsedToMatch": true
},
{
"id": "url",
"type": "string",
"display": true,
"removed": false,
"required": false,
"displayName": "url",
"defaultMatch": false,
"canBeUsedToMatch": true
},
{
"id": "category",
"type": "string",
"display": true,
"removed": false,
"required": false,
"displayName": "category",
"defaultMatch": false,
"canBeUsedToMatch": true
}
],
"mappingMode": "autoMapInputData",
"matchingColumns": []
},
"options": {},
"operation": "append",
"sheetName": {
"__rl": true,
"mode": "name",
"value": "Sheet1"
},
"documentId": {
"__rl": true,
"mode": "url",
"value": ""
}
},
"credentials": {
"googleSheetsOAuth2Api": {
"id": "",
"name": ""
}
},
"typeVersion": 4.5
},
{
"id": "6d11ae64-e2f8-47ed-854a-c749881ce72c",
"name": "Mise en Forme et Traitement des Données d'Actualité",
"type": "n8n-nodes-base.code",
"notes": "Hey this is where \nyou \nformat results ",
"position": [
2140,
820
],
"parameters": {
"jsCode": "// Get the input data\nconst inputData = $input.all()[0].json;\n\n// Extract articles array\nconst articles = inputData.result.articles;\n\n// Map each article and return only title, url, category\nreturn articles.map(article => ({\n json: {\n title: article.title,\n url: article.url,\n category: article.category\n }\n}));"
},
"notesInFlow": true,
"typeVersion": 2
},
{
"id": "ca78baaf-0480-490d-aa9a-3663ca93f5d0",
"name": "Note adhésive1",
"type": "n8n-nodes-base.stickyNote",
"position": [
1180,
460
],
"parameters": {
"color": 5,
"width": 574.9363634768473,
"height": 530.4701664623029,
"content": "# Step 2: AI-Powered News Article Scraper 🤖\n\nThis is the core node which uses ScrapeGraphAI to intelligently extract news articles from any website.\n\n## How to Use\n- Configure the target news website URL\n- Use natural language to describe what data to extract\n- The AI will automatically parse and structure the results\n\n## Configuration\n- **Website URL**: Target news website (e.g., BBC, CNN, Reuters)\n- **User Prompt**: Natural language instructions for data extraction\n- **API Credentials**: ScrapeGraphAI API key required\n\n## Example\n- **Website**: BBC News homepage\n- **Instruction**: \"Extract all article titles, URLs, and categories\"\n\n⚠️ **Note**: This is a community node requiring self-hosting"
},
"typeVersion": 1
},
{
"id": "51a1337b-6a50-43a5-8d6f-8345bc771c7b",
"name": "Note adhésive2",
"type": "n8n-nodes-base.stickyNote",
"position": [
1920,
460
],
"parameters": {
"color": 5,
"width": 574.9363634768473,
"height": 530.4701664623029,
"content": "# Step 3: News Data Formatting and Processing 🧱\n\nThis node transforms and structures the scraped news data for optimal Google Sheets compatibility.\n\n## What it does\n- Extracts articles array from ScrapeGraphAI response\n- Maps each article to standardized format\n- Ensures data consistency and structure\n- Prepares clean data for spreadsheet storage\n\n## Data Structure\n- **title**: Article headline and title\n- **url**: Direct link to the article\n- **category**: Article category or section\n\n## Customization\n- Modify the JavaScript code to extract additional fields\n- Add data validation and cleaning logic\n- Implement error handling for malformed data"
},
"typeVersion": 1
},
{
"id": "2e8cde8e-f534-4f37-a1f9-bcf0fe0b09f9",
"name": "Note adhésive3",
"type": "n8n-nodes-base.stickyNote",
"position": [
460,
460
],
"parameters": {
"color": 5,
"width": 574.9363634768473,
"height": 530.4701664623029,
"content": "# Step 1: Automated News Collection Trigger ⏱️\n\nThis trigger automatically invokes the workflow at specified intervals to collect fresh news content.\n\n## Configuration Options\n- **Frequency**: Daily, hourly, or custom intervals\n- **Time Zone**: Configure for your business hours\n- **Execution Time**: Choose optimal times for news collection\n\n## Best Practices\n- Set appropriate intervals to respect rate limits\n- Consider news website update frequencies\n- Monitor execution logs for any issues\n- Adjust frequency based on your monitoring needs"
},
"typeVersion": 1
},
{
"id": "5606537c-a531-490a-b4ff-6d0dc5e642b4",
"name": "Note adhésive",
"type": "n8n-nodes-base.stickyNote",
"position": [
2680,
460
],
"parameters": {
"color": 5,
"width": 574.9363634768473,
"height": 530.4701664623029,
"content": "# Step 4: Google Sheets News Storage 📊\n\nThis node securely stores the processed news article data in your Google Sheets for analysis and tracking.\n\n## What it does\n- Connects to your Google Sheets account via OAuth2\n- Appends new article data as rows\n- Maintains historical data for trend analysis\n- Provides structured data for business intelligence\n\n## Configuration\n- **Spreadsheet**: Select or create target Google Sheets document\n- **Sheet Name**: Configure worksheet (default: Sheet1)\n- **Operation**: Append mode for continuous data collection\n- **Column Mapping**: Automatic mapping of title, url, category fields\n\n## Data Management\n- Each execution adds new article entries\n- Historical data preserved for analysis\n- Easy export and sharing capabilities\n- Built-in Google Sheets analytics and filtering"
},
"typeVersion": 1
}
],
"active": false,
"pinData": {},
"settings": {
"executionOrder": "v1"
},
"versionId": "c2fee060-f99e-48aa-a280-ac5492715fd9",
"connections": {
"efd61ca5-e248-4027-b705-6d9c5dabe820": {
"main": [
[
{
"node": "6d11ae64-e2f8-47ed-854a-c749881ce72c",
"type": "main",
"index": 0
}
]
]
},
"37df323b-5c75-495f-ba19-b8642c02d96f": {
"main": [
[
{
"node": "efd61ca5-e248-4027-b705-6d9c5dabe820",
"type": "main",
"index": 0
}
]
]
},
"6d11ae64-e2f8-47ed-854a-c749881ce72c": {
"main": [
[
{
"node": "976d9123-7585-4700-9972-5b2838571a44",
"type": "main",
"index": 0
}
]
]
}
}
}
How to use this workflow?
Copy the JSON configuration above, create a new workflow in your n8n instance, choose "Import from JSON", paste the configuration, and adjust the credential settings to match your own accounts.
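If you prefer to script the import instead of pasting the JSON manually, a minimal sketch using the n8n public REST API might look like the following. It assumes the API is enabled on your instance, that `N8N_BASE_URL` and `N8N_API_KEY` are set in your environment, and that the configuration above is saved as `news-scraper-workflow.json` (a hypothetical file name); the exact endpoint and accepted fields can vary between n8n versions.

```javascript
// Hypothetical import script: reads the exported JSON and creates the
// workflow through the n8n public REST API (POST /api/v1/workflows).
const fs = require("fs");

const template = JSON.parse(
  fs.readFileSync("news-scraper-workflow.json", "utf8")
);

// The create endpoint typically accepts only a subset of the exported fields.
const payload = {
  name: template.name,
  nodes: template.nodes,
  connections: template.connections,
  settings: template.settings,
};

fetch(`${process.env.N8N_BASE_URL}/api/v1/workflows`, {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "X-N8N-API-KEY": process.env.N8N_API_KEY,
  },
  body: JSON.stringify(payload),
})
  .then((res) => res.json())
  .then((wf) => console.log(`Created workflow ${wf.id}`))
  .catch((err) => console.error("Import failed:", err));
```

After importing, you still need to open the ScrapegraphAI and Google Sheets nodes and attach your own credentials before activating the workflow.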
What scenarios is this workflow suited for?
Intermediate - Market Research, AI Summarization
Is it paid?
The workflow itself is completely free to use. Note, however, that the third-party services it relies on (such as the ScrapegraphAI API) may require a paid plan on your side.
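The "News Data Formatting and Processing" Code node in the configuration maps the ScrapeGraphAI response onto the title, url, and category columns expected by the Google Sheets node. Below is a slightly more defensive variant in the spirit of the Step 3 sticky note's customization advice (guarding against a missing or malformed `result.articles` array). The field names match the schema in the configuration; the fallback behavior and the "Uncategorized" placeholder are illustrative choices, not part of the original workflow.

```javascript
// Sketch of a more defensive version of the formatting Code node.
// Same output shape as the original: one item per article with
// title, url, and category, ready for the Google Sheets "append" step.
const inputData = $input.all()[0].json;

// ScrapeGraphAI returns the article list under result.articles in this
// workflow; fall back to an empty array instead of throwing if it is missing.
const articles = Array.isArray(inputData?.result?.articles)
  ? inputData.result.articles
  : [];

return articles
  // Drop entries that have no usable title or link.
  .filter((article) => article && article.title && article.url)
  .map((article) => ({
    json: {
      title: String(article.title).trim(),
      url: String(article.url).trim(),
      // Placeholder category is an illustrative default, not from the source.
      category: article.category
        ? String(article.category).trim()
        : "Uncategorized",
    },
  }));
```

You can extend the mapping with additional fields (for example a publication date) as long as you add matching columns to the Google Sheets node's schema.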
Created by vinci-king-01 (@vinci-king-01)