
Daily RAG Research Paper Hub with arXiv, Gemini AI, and Notion

Advanced

This automation workflow belongs to the Content Creation and Multimodal AI categories and contains 22 nodes. It mainly uses the If, Code, Gmail, Notion, and Switch nodes.
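As a rough illustration of the date-window step that the exported workflow performs before calling arXiv, the sketch below builds a one-day `submittedDate` range and the matching query URL. The function name `buildArxivQuery` and its `daysBack` and `term` parameters are hypothetical and do not appear in the workflow. Computing the day in UTC is also an assumption made here (the arXiv time format is given in GMT); the exported Code node uses local-time date methods instead.

```javascript
// Minimal sketch, not the workflow's own code.
// buildArxivQuery, daysBack and term are hypothetical names used for illustration.
function buildArxivQuery(now, daysBack = 1, term = 'RAG') {
  // Date.UTC normalizes out-of-range days, so month and year rollovers are handled.
  const day = new Date(Date.UTC(
    now.getUTCFullYear(),
    now.getUTCMonth(),
    now.getUTCDate() - daysBack
  ));

  const y = day.getUTCFullYear();
  const m = String(day.getUTCMonth() + 1).padStart(2, '0');
  const d = String(day.getUTCDate()).padStart(2, '0');

  // arXiv expects YYYYMMDDTTTT, with TTTT in 24-hour time to the minute.
  const from = `${y}${m}${d}0000`;
  const to = `${y}${m}${d}2359`;

  const url = 'https://export.arxiv.org/api/query'
    + `?search_query=all:${term}+AND+submittedDate:[${from}+TO+${to}]`;

  return { from, to, url };
}

const example = buildArxivQuery(new Date('2025-03-01T06:00:00Z'));
console.log(example.from); // 202502280000
console.log(example.to);   // 202502282359
```

Passing the reference time as an argument keeps the helper deterministic, which makes it easy to check month boundaries such as the one in the example.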

Prerequisites
  • Google account and Gmail API credentials
  • Notion API key
  • Credentials for the target API may be required
  • Google Gemini API key
Workflow export
Copy the following JSON configuration and import it into n8n to use this workflow.
{
  "meta": {
    "instanceId": "a6011e4876c6b1225fa48dae1dbfa92e1932a633b3186bbb7bfd5c9e6ad2d878"
  },
  "nodes": [
    {
      "id": "7e9f18f1-edfe-4af6-835b-12fe16a99034",
      "name": "Basic LLM Chain",
      "type": "@n8n/n8n-nodes-langchain.chainLlm",
      "position": [
        272,
        0
      ],
      "parameters": {
        "text": "={{ $json.data }}",
        "batching": {},
        "messages": {
          "messageValues": [
            {
              "message": "You are a paper content analysis assistant. You can analyze and inspect JSON data, accurately identify the content in the `summary` field, make judgments, and enrich the data items. The main tasks are as follows:\n\n1. RAG Relevance and Labeling:\n   - Analyze the `summary` field to determine whether the content is related to RAG (Retrieval-Augmented Generation) and assign labels.\n   - For each data item, add three new fields:\n     - `RAG_TF`: \"T\" if related, \"F\" if not\n     - `RAG_REASON`: if not related, provide the reason in English; otherwise, leave empty\n     - `RAG_Category`: if related, assign a category label based on the `summary` content (e.g., Framework / Application / …); otherwise, leave empty\n\n2. RAG Method Extraction:\n   - Analyze the `summary` and extract the RAG method proposed in the paper.\n   - Store it in the new field `RAG_NAME`.\n\n3. External Link Extraction:\n   - Analyze the `summary` content for `github` or `huggingface` links.\n   - If present, extract the URLs and populate the existing `github` and `huggingface` fields.\n   - If not present, leave them unchanged.\n\nOutput Format: standard JSON\n\nExample:\n\nGiven a data item with the following `summary`:\n\n\"summary\":\"Processing long contexts presents a significant challenge for large language models (LLMs). While recent advancements allow LLMs to handle much longer contexts than before (e.g., 32K or 128K tokens), it is computationally expensive and can still be insufficient for many applications. Retrieval-Augmented Generation (RAG) is considered a promising strategy to address this problem. However, conventional RAG methods face inherent limitations because of two underlying requirements: 1) explicitly stated queries, and 2) well-structured knowledge. These conditions, however, do not hold in general long-context processing tasks. In this work, we propose MemoRAG, a novel RAG framework empowered by global memory-augmented retrieval. 
MemoRAG features a dual-system architecture. First, it employs a light but long-range system to create a global memory of the long context. Once a task is presented, it generates draft answer\n"
            }
          ]
        },
        "promptType": "define"
      },
      "typeVersion": 1.7
    },
    {
      "id": "92d37dc1-aaaf-47ec-987a-e6d23c93e055",
      "name": "Google Gemini Chat Model",
      "type": "@n8n/n8n-nodes-langchain.lmChatGoogleGemini",
      "position": [
        272,
        144
      ],
      "parameters": {
        "options": {},
        "modelName": "=models/gemini-2.5-flash"
      },
      "credentials": {
        "googlePalmApi": {
          "id": "ra9slZSGvLJTHQw1",
          "name": "Google Gemini(PaLM) Api account"
        }
      },
      "typeVersion": 1
    },
    {
      "id": "aaa67776-c308-443e-98f6-e1fe7035cbb5",
      "name": "제출일:T-1",
      "type": "n8n-nodes-base.code",
      "position": [
        -1664,
        320
      ],
      "parameters": {
        "jsCode": "// Function 节点代码\nconst now = new Date();\nconst yesterday = new Date(now);\nyesterday.setDate(now.getDate() - 2);\n\nconst y = yesterday.getFullYear();\nconst m = String(yesterday.getMonth() + 1).padStart(2, '0');\nconst d = String(yesterday.getDate()).padStart(2, '0');\n\nreturn [\n  {\n    json: {\n      from: `${y}${m}${d}0000`,\n      to: `${y}${m}${d}2359`\n    }\n  }\n];\n"
      },
      "typeVersion": 2
    },
    {
      "id": "c3685631-8bbd-409a-978a-fbb3e9847115",
      "name": "If",
      "type": "n8n-nodes-base.if",
      "position": [
        -160,
        16
      ],
      "parameters": {
        "options": {},
        "conditions": {
          "options": {
            "version": 2,
            "leftValue": "",
            "caseSensitive": true,
            "typeValidation": "strict"
          },
          "combinator": "and",
          "conditions": [
            {
              "id": "de0a5a7e-67dd-4dd0-8ccc-3406e17bd09c",
              "operator": {
                "type": "number",
                "operation": "notEquals"
              },
              "leftValue": "={{ $json.paperCount }}",
              "rightValue": 0
            }
          ]
        }
      },
      "typeVersion": 2.2
    },
    {
      "id": "4dd24343-1872-472d-8d7d-4cd28a9dbabe",
      "name": "Schedule Trigger",
      "type": "n8n-nodes-base.scheduleTrigger",
      "position": [
        -1856,
        320
      ],
      "parameters": {
        "rule": {
          "interval": [
            {
              "triggerAtHour": 6
            }
          ]
        }
      },
      "typeVersion": 1.2
    },
    {
      "id": "a38b1b58-a6f6-4c6b-ba6e-f153980a220d",
      "name": "FEISHU",
      "type": "n8n-nodes-base.switch",
      "position": [
        576,
        720
      ],
      "parameters": {
        "rules": {
          "values": [
            {
              "conditions": {
                "options": {
                  "version": 2,
                  "leftValue": "",
                  "caseSensitive": true,
                  "typeValidation": "strict"
                },
                "combinator": "and",
                "conditions": [
                  {
                    "id": "7b804f5e-6702-4d4a-99b9-3f06f8eb20d4",
                    "operator": {
                      "type": "string",
                      "operation": "equals"
                    },
                    "leftValue": "={{ $json.type }}",
                    "rightValue": "feishu"
                  }
                ]
              }
            }
          ]
        },
        "options": {}
      },
      "typeVersion": 3.2
    },
    {
      "id": "ac6b1c0d-b18e-4b42-b49e-8cb4daf0d384",
      "name": "FEISHU POST",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        800,
        720
      ],
      "parameters": {
        "url": "=",
        "method": "POST",
        "options": {},
        "sendBody": true,
        "bodyParameters": {
          "parameters": [
            {
              "name": "msg_type",
              "value": "={{ $json.msg_type }}"
            },
            {
              "name": "content",
              "value": "={{ $json.content }}"
            }
          ]
        }
      },
      "typeVersion": 4.2
    },
    {
      "id": "9151ab18-379f-4d3b-8ca2-cf65c547e78d",
      "name": "gmail",
      "type": "n8n-nodes-base.switch",
      "position": [
        576,
        544
      ],
      "parameters": {
        "rules": {
          "values": [
            {
              "conditions": {
                "options": {
                  "version": 2,
                  "leftValue": "",
                  "caseSensitive": true,
                  "typeValidation": "strict"
                },
                "combinator": "and",
                "conditions": [
                  {
                    "id": "3222832c-bbf2-46a2-abd8-2bb14095b7bf",
                    "operator": {
                      "type": "string",
                      "operation": "equals"
                    },
                    "leftValue": "={{ $json.type }}",
                    "rightValue": "gmail"
                  }
                ]
              }
            }
          ]
        },
        "options": {}
      },
      "typeVersion": 3.2
    },
    {
      "id": "869f80ec-c14c-4d1e-ae11-bb6eb4c99e5d",
      "name": "메시지 전송",
      "type": "n8n-nodes-base.gmail",
      "position": [
        800,
        544
      ],
      "webhookId": "cb0a1f30-59e0-4505-af24-db689d9c1f23",
      "parameters": {
        "sendTo": "xing.adam@gmail.com",
        "message": "={{ $json.message }}",
        "options": {},
        "subject": "={{ $json.subject }}"
      },
      "credentials": {
        "gmailOAuth2": {
          "id": "WoyY5hj4D93bD2Fp",
          "name": "Gmail account"
        }
      },
      "typeVersion": 2.1
    },
    {
      "id": "3df82b76-e9c8-4b0b-a552-428f2fc12c97",
      "name": "모델에 메시지 전송",
      "type": "@n8n/n8n-nodes-langchain.googleGemini",
      "position": [
        -1040,
        320
      ],
      "parameters": {
        "modelId": {
          "__rl": true,
          "mode": "list",
          "value": "models/gemini-2.5-flash-lite",
          "cachedResultName": "models/gemini-2.5-flash-lite"
        },
        "options": {},
        "messages": {
          "values": [
            {
              "role": "model",
              "content": "You are a daily paper content summarization assistant capable of analyzing XML data. Your main tasks are as follows:\n\n1. Set the daily title field `Title`: {yyyy-mm-dd} paper summary\n2. Set the daily date field `Date`: yyyy-mm-dd\n3. Identify the `<opensearch:totalResults>` tag in the XML and set its numeric value to the field `Number of papers`.\n4. Provide a brief summary of all papers for the day, covering all topics. Set the Chinese summary as `SUMMARY_CN` and the English summary as `SUMMARY_EN`. Ensure that both summaries reflect the comprehensive summary of all papers for the day.\n5. Output format: standard JSON. If there are no papers for the day, set `Number of papers` to 0, but still include the `SUMMARY_CN` and `SUMMARY_EN` fields with empty content.\n\nExample: If there are papers:\n{\n  \"Number of papers\":\"2025-09-13 paper summary\",\n  \"Date\":2025-09-13,\n  \"Number of papers\": 2,\n  \"SUMMARY_CN\": \"Today's papers cover the Knowledge Graph (KG) for climate knowledge and the Approximate Graph Propagation (AGP) framework. The first paper introduces a KG based on climate publications to improve access and utilization of climate science literature. The second paper focuses on the AGP framework, proposing a new algorithm AGP-Static++ and enhancing dynamic graph support for better query and update efficiency.\",\n  \"SUMMARY_EN\": \"Today's papers cover the Knowledge Graph (KG) for climate knowledge and the Approximate Graph Propagation (AGP) framework. The first paper introduces a domain-specific KG built from climate publications aimed at improving access and use of climate science literature. 
The second paper focuses on the AGP framework, proposing a new algorithm, AGP-Static++, and improving dynamic graph support, enhancing query and update efficiency.\"\n}\n\nIf the number of papers is 0, maintain the JSON structure:\n{\n  \"Number of papers\":\"2025-09-13 paper summary\",\n  \"Date\":2025-09-13,\n  \"Number of papers\": 0,\n  \"SUMMARY_CN\": \"\",\n  \"SUMMARY_EN\": \"\"\n}"
            },
            {
              "content": "={{ $json.data }}"
            }
          ]
        },
        "simplify": false
      },
      "credentials": {
        "googlePalmApi": {
          "id": "ra9slZSGvLJTHQw1",
          "name": "Google Gemini(PaLM) Api account"
        }
      },
      "typeVersion": 1
    },
    {
      "id": "024c6399-857e-45a3-a15d-8b733e16da67",
      "name": "RAG 일일 논문 요약",
      "type": "n8n-nodes-base.notion",
      "position": [
        800,
        320
      ],
      "parameters": {
        "title": "={{ $json.title }}",
        "simple": false,
        "options": {},
        "resource": "databasePage",
        "databaseId": {
          "__rl": true,
          "mode": "list",
          "value": "26fa136d-cee4-8092-8b85-cf9e9cbc424f",
          "cachedResultUrl": "https://www.notion.so/26fa136dcee480928b85cf9e9cbc424f",
          "cachedResultName": "RAG Daily Paper Summary"
        },
        "propertiesUi": {
          "propertyValues": [
            {
              "key": "DATE|date",
              "date": "={{ $json.date }}"
            },
            {
              "key": "Number of papers|number",
              "numberValue": "={{ $json.paperCount }}"
            },
            {
              "key": "SUMMARY_EN|rich_text",
              "textContent": "={{ $json.summaryEN }}"
            },
            {
              "key": "SUMMARY_CN|rich_text",
              "textContent": "={{ $json.summaryCN }}"
            }
          ]
        }
      },
      "credentials": {
        "notionApi": {
          "id": "BNsFk38kgqvRDJpX",
          "name": "Notion account"
        }
      },
      "typeVersion": 2.2
    },
    {
      "id": "3282f989-a9a4-4d4f-aaf0-097fc0d72e0d",
      "name": "JSON FORMAT",
      "type": "n8n-nodes-base.code",
      "position": [
        -688,
        320
      ],
      "parameters": {
        "jsCode": "const items = $input.all();\nconst response = items[0].json;\n\ntry {\n  // Extract text content from Gemini API response\n  // Note: response is directly an object, not an array\n  const text = response.candidates[0].content.parts[0].text;\n  \n  // Extract JSON content\n  const jsonMatch = text.match(/```json\\n([\\s\\S]*?)\\n```/);\n  const jsonStr = jsonMatch[1];\n  \n  // Parse JSON\n  const data = JSON.parse(jsonStr);\n  \n  // Manually handle duplicate keys - extract from original string\n  const titleMatch = jsonStr.match(/\"Number of papers\":\\s*\"([^\"]+)\"/);\n  const countMatch = jsonStr.match(/\"Number of papers\":\\s*(\\d+)/);\n  \n  // Construct result\n  items[0].json = {\n    title: titleMatch ? titleMatch[1] : '',\n    date: data.Date || '',\n    paperCount: countMatch ? parseInt(countMatch[1]) : 0,\n    summaryCN: data.SUMMARY_CN || '',\n    summaryEN: data.SUMMARY_EN || ''\n  };\n  \n} catch (error) {\n  items[0].json = {\n    error: error.message,\n    originalData: response\n  };\n}\n\nreturn items;\n"
      },
      "typeVersion": 2
    },
    {
      "id": "f1a331fa-d830-4656-b108-7e18e7430b04",
      "name": "Sticky Note3",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -1984,
        544
      ],
      "parameters": {
        "width": 736,
        "height": 768,
        "content": "## 1. Data Retrieval\n### arXiv API\n\nThe arXiv provides a public API that allows users to query research papers by topic or by predefined categories.\n\n[arXiv API User Manual](https://info.arxiv.org/help/api/user-manual.html#arxiv-api-users-manual)\n\n**Key Notes:**\n\n1. **Response Format**: The API returns data as a typical *Atom Response*.\n2. **Timezone & Update Frequency**:  \n   - The arXiv submission process operates on a 24-hour cycle.  \n   - Newly submitted articles become available in the API only at midnight *after* they have been processed.  \n   - Feeds are updated daily at midnight Eastern Standard Time (EST).  \n   - Therefore, a single request per day is sufficient.  \n3. **Request Limits**:  \n   - The maximum number of results per call (`max_results`) is **30,000**,  \n   - Results must be retrieved in slices of at most **2,000** at a time, using the `max_results` and `start` query parameters.  \n4. **Time Format**:  \n   - The expected format is `[YYYYMMDDTTTT+TO+YYYYMMDDTTTT]`,  \n   - `TTTT` is provided in 24-hour time to the minute, in GMT.\n\n### Scheduled Task\n\n- **Execution Frequency**: Daily  \n- **Execution Time**: 6:00 AM  \n- **Time Parameter Handling (JS)**:  \n  According to arXiv’s update rules, the scheduled task should query the **previous day’s (T-1)** `submittedDate` data.\n\n"
      },
      "typeVersion": 1
    },
    {
      "id": "ae855e91-2363-4b97-8933-761934b269fe",
      "name": "arXiv API",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        -1440,
        320
      ],
      "parameters": {
        "url": "=https://export.arxiv.org/api/query?search_query=all:RAG+AND+submittedDate:[{{$json[\"from\"]}}+TO+{{$json[\"to\"]}}]",
        "options": {},
        "sendQuery": true,
        "queryParameters": {
          "parameters": [
            {
              "name": "={{ $json.from }}"
            },
            {
              "name": "={{ $json.to }}"
            }
          ]
        }
      },
      "typeVersion": 4.2
    },
    {
      "id": "6f3df3be-a376-42e9-b0be-32c4fba5a8e2",
      "name": "메시지 구성",
      "type": "n8n-nodes-base.code",
      "position": [
        -128,
        528
      ],
      "parameters": {
        "jsCode": "// Get current date\nconst now = new Date();\nconst year = now.getFullYear();\nconst month = String(now.getMonth() + 1).padStart(2, '0');\nconst day = String(now.getDate()).padStart(2, '0');\nconst date = `${year}-${month}-${day}`;\n\n// Get input data\nconst inputData = $input.first().json;\n\n// Generate message content\nconst messageContent = inputData.SUMMARY_CN;\n\n// Gmail message body\nconst gmailMessage = {\n    subject: inputData.title || `Daily Paper Summary - ${date}`,\n    message: `<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">\n<html xmlns=\"http://www.w3.org/1999/xhtml\" lang=\"en\">\n<head>\n    <meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\" />\n    <meta name=\"viewport\" content=\"width=device-width, initial-scale=1.0\" />\n    <title> RAG Daily Paper Summary - ${date}</title>\n    <style type=\"text/css\">\n        /* Gmail safe styles */\n        body {\n            font-family: Arial, sans-serif;\n            line-height: 1.4;\n            margin: 0;\n            padding: 0;\n            background-color: #f9f9f9;\n            color: #333333;\n        }\n        \n        table {\n            border-collapse: collapse;\n            mso-table-lspace: 0pt;\n            mso-table-rspace: 0pt;\n        }\n        \n        .email-wrapper {\n            width: 100%;\n            background-color: #f9f9f9;\n            padding: 40px 20px;\n        }\n        \n        .email-container {\n            width: 100%;\n            max-width: 600px;\n            margin: 0 auto;\n            background-color: #ffffff;\n            border-radius: 8px;\n            box-shadow: 0 2px 12px rgba(0, 0, 0, 0.1);\n        }\n        \n        .header {\n            background-color: #2563eb;\n            padding: 24px;\n            text-align: center;\n            border-radius: 8px 8px 0 0;\n        }\n        \n        .header h1 {\n            
margin: 0 0 8px 0;\n            font-size: 24px;\n            font-weight: 600;\n            color: #ffffff;\n        }\n        \n        .date {\n            font-size: 14px;\n            color: #ffffff;\n            opacity: 0.9;\n        }\n        \n        .stats {\n            background-color: #f1f5f9;\n            padding: 16px 24px;\n            font-size: 14px;\n            color: #64748b;\n        }\n        \n        .content {\n            padding: 32px 24px 40px 24px;\n        }\n        \n        .section {\n            margin-bottom: 24px;\n        }\n        \n        .section-title {\n            font-size: 16px;\n            font-weight: 600;\n            color: #1e293b;\n            margin-bottom: 12px;\n            padding-bottom: 8px;\n            border-bottom: 1px solid #e2e8f0;\n        }\n        \n        .flag {\n            display: inline-block;\n            width: 20px;\n            height: 14px;\n            margin-right: 8px;\n            border-radius: 2px;\n            vertical-align: middle;\n        }\n        \n        .flag-cn {\n            background-color: #de2910;\n        }\n        \n        .flag-en {\n            background-color: #012169;\n        }\n        \n        .summary {\n            font-size: 14px;\n            line-height: 1.6;\n            color: #475569;\n            padding: 16px;\n            background-color: #f8fafc;\n            border-radius: 6px;\n            border-left: 3px solid #2563eb;\n        }\n        \n        .divider {\n            height: 1px;\n            background-color: #e2e8f0;\n            margin: 20px 0;\n            border: none;\n        }\n        \n        /* Mobile responsive */\n        @media screen and (max-width: 600px) {\n            .email-wrapper {\n                padding: 20px 10px !important;\n            }\n            \n            .header, .stats {\n                padding: 20px 16px !important;\n            }\n            \n            .content {\n            
    padding: 24px 16px 32px 16px !important;\n            }\n            \n            .email-container {\n                border-radius: 0;\n            }\n        }\n        \n        /* Gmail specific fixes */\n        .gmail-fix {\n            display: none;\n        }\n        \n        /* Outlook specific fixes */\n        .ExternalClass {\n            width: 100%;\n        }\n        \n        .ExternalClass,\n        .ExternalClass p,\n        .ExternalClass span,\n        .ExternalClass font,\n        .ExternalClass td,\n        .ExternalClass div {\n            line-height: 100%;\n        }\n    </style>\n    <!--[if mso]>\n    <style type=\"text/css\">\n        .email-container {\n            width: 600px !important;\n        }\n    </style>\n    <![endif]-->\n</head>\n<body>\n    <table role=\"presentation\" class=\"email-wrapper\" cellpadding=\"0\" cellspacing=\"0\" border=\"0\">\n        <tr>\n            <td align=\"center\">\n                <table role=\"presentation\" class=\"email-container\" cellpadding=\"0\" cellspacing=\"0\" border=\"0\">\n                    <!-- Header -->\n                    <tr>\n                        <td class=\"header\">\n                            <h1>RAG Daily Papers</h1>\n                            <div class=\"date\">${inputData.Date || date}</div>\n                        </td>\n                    </tr>\n                    \n                    <!-- Stats -->\n                    <tr>\n                        <td class=\"stats\">\n                            <strong>${inputData[\"Number of papers\"] || inputData.paperCount || 0} papers</strong> reviewed today\n                        </td>\n                    </tr>\n                    \n                    <!-- Content -->\n                    <tr>\n                        <td class=\"content\">\n                            <!-- Chinese Section -->\n                            <div class=\"section\">\n                                <h2 
class=\"section-title\">\n                                  🇨🇳 Chinese\n                                </h2>\n                                <div class=\"summary\">\n                                    ${inputData.SUMMARY_CN || inputData.summaryCN || 'No Chinese summary available'}\n                                </div>\n                            </div>\n                            \n                            <!-- Divider -->\n                            <hr class=\"divider\">\n                            \n                            <!-- English Section -->\n                            <div class=\"section\">\n                                <h2 class=\"section-title\">\n                                    🇺🇸 English\n                                </h2>\n                                <div class=\"summary\">\n                                    ${inputData.SUMMARY_EN || inputData.summaryEN || 'No English summary available'}\n                                </div>\n                            </div>\n                        </td>\n                    </tr>\n                </table>\n            </td>\n        </tr>\n    </table>\n</body>\n</html>`\n};\n\n// Feishu message body\nconst feishuMessage = {\n    msg_type: \"text\",\n    content: {\n        text: `Today ${$input.first().json.date} ${$input.first().json.paperCount}  papers. ${$input.first().json.summaryEN} ${$input.first().json.summaryCN}`\n    }\n};\n\n// n8n output format\nreturn [\n    { json: { type: \"gmail\", ...gmailMessage } },\n    { json: { type: \"feishu\", ...feishuMessage } }\n];\n"
      },
      "typeVersion": 2
    },
    {
      "id": "2582c7df-9b15-4473-bc47-91cf6f7304e0",
      "name": "Sticky Note",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -176,
        896
      ],
      "parameters": {
        "width": 1152,
        "height": 576,
        "content": "## 5. Message Push\n\nSet up two channels for message delivery: **EMAIL** and **IM**, and define the message format and content.\n\n### Email: Gmail\n\n**GMAIL OAuth 2.0 – Official Documentation**  \n[Configure your OAuth consent screen](https://docs.n8n.io/integrations/builtin/credentials/google/oauth-single-service/?utm_source=n8n_app&utm_medium=credential_settings&utm_campaign=create_new_credentials_modal#configure-your-oauth-consent-screen)\n\n**Steps:**\n- Enable Gmail API  \n- Create OAuth consent screen  \n- Create OAuth client credentials  \n- Audience: Add **Test users** under Testing status  \n\n**Message format**: HTML  \n(Model: OpenAI GPT — used to design an HTML email template)\n\n### IM: Feishu (LARK)\n\n**Bots in groups**  \n[Use bots in groups](https://www.larksuite.com/hc/en-US/articles/360048487736-use-bots-in-groups)\n"
      },
      "typeVersion": 1
    },
    {
      "id": "f7ba78f8-19cb-492c-840c-3570d2865fb1",
      "name": "RAG 일일 논문",
      "type": "n8n-nodes-base.notion",
      "position": [
        800,
        0
      ],
      "parameters": {
        "title": "={{ $json.title }}",
        "simple": false,
        "blockUi": {
          "blockValues": [
            {
              "textContent": "={{ $json.summary }}"
            }
          ]
        },
        "options": {},
        "resource": "databasePage",
        "databaseId": {
          "__rl": true,
          "mode": "list",
          "value": "26ba136d-cee4-8029-ad3d-e0e8ac64993f",
          "cachedResultUrl": "https://www.notion.so/26ba136dcee48029ad3de0e8ac64993f",
          "cachedResultName": "RAG DAILY"
        },
        "propertiesUi": {
          "propertyValues": [
            {
              "key": "published|date",
              "date": "={{ $json.published }}"
            },
            {
              "key": "summary|rich_text",
              "textContent": "={{ $json.summary }}"
            },
            {
              "key": "id|rich_text",
              "textContent": "={{ $json.id }}"
            },
            {
              "key": "html_url|url",
              "urlValue": "={{ $json.html_url }}"
            },
            {
              "key": "pdf_url|url",
              "urlValue": "={{ $json.pdf_url }}"
            },
            {
              "key": "primary_category|rich_text",
              "textContent": "={{ $json.primary_category }}"
            },
            {
              "key": "github|url",
              "urlValue": "={{ $json.github }}",
              "ignoreIfEmpty": true
            },
            {
              "key": "huggingface|url",
              "urlValue": "={{ $json.huggingface }}",
              "ignoreIfEmpty": true
            },
            {
              "key": "RAG_TF|rich_text",
              "textContent": "={{ $json.RAG_TF }}"
            },
            {
              "key": "RAG_REASON|rich_text",
              "textContent": "={{ $json.RAG_REASON }}"
            },
            {
              "key": "RAG_Category|rich_text",
              "textContent": "={{ $json.RAG_Category }}"
            },
            {
              "key": "RAG_NAME|rich_text",
              "textContent": "={{ $json.RAG_NAME }}"
            },
            {
              "key": "updated|date",
              "date": "={{ $json.updated }}"
            },
            {
              "key": "author|multi_select",
              "multiSelectValue": "={{ $json.authors }}"
            },
            {
              "key": "category|multi_select",
              "multiSelectValue": "={{ $json.categories }}"
            }
          ]
        }
      },
      "credentials": {
        "notionApi": {
          "id": "BNsFk38kgqvRDJpX",
          "name": "Notion account"
        }
      },
      "typeVersion": 2.2
    },
    {
      "id": "5d897d4d-968b-4336-bbee-d1d3b4dcae06",
      "name": "데이터 추출",
      "type": "n8n-nodes-base.code",
      "position": [
        112,
        0
      ],
      "parameters": {
        "jsCode": "// Get input data\nconst xmlData = $('arXiv API').first().json.data\n\nif (!xmlData) {\n    return [{\n        json: {\n            error: \"XML data not found. Please ensure the input contains XML content\",\n            message: \"Check the field names in the input data\",\n            success: false\n        }\n    }];\n}\n\n// Function to format date-time\nfunction formatDateTime(isoString) {\n    if (!isoString) return '';\n    \n    try {\n        const date = new Date(isoString);\n        if (isNaN(date.getTime())) return '';\n        \n        const year = date.getFullYear();\n        const month = String(date.getMonth() + 1).padStart(2, '0');\n        const day = String(date.getDate()).padStart(2, '0');\n        const hours = String(date.getUTCHours()).padStart(2, '0');\n        const minutes = String(date.getUTCMinutes()).padStart(2, '0');\n        const seconds = String(date.getUTCSeconds()).padStart(2, '0');\n        \n        return `${year}-${month}-${day} ${hours}:${minutes}:${seconds}`;\n    } catch (error) {\n        return '';\n    }\n}\n\n// General function to extract tag content\nfunction extractTagContent(xml, tagName) {\n    const regex = new RegExp(`<${tagName}[^>]*>([\\\\s\\\\S]*?)<\\\\/${tagName}>`, 'i');\n    const match = xml.match(regex);\n    return match ? 
match[1].trim().replace(/\\s+/g, ' ') : '';\n}\n\n// Extract links\nfunction extractLink(entryXml, linkType) {\n    // Fixed link extraction to fit actual XML format\n    // Format: <link href=\"...\" rel=\"...\" type=\"...\"/>\n    const patterns = [\n        new RegExp(`<link[^>]*href=\"([^\"]*)\"[^>]*type=\"${linkType}\"`, 'i'),\n        new RegExp(`<link[^>]*type=\"${linkType}\"[^>]*href=\"([^\"]*)\"`, 'i')\n    ];\n    \n    for (const pattern of patterns) {\n        const match = entryXml.match(pattern);\n        if (match && match[1]) {\n            return match[1];\n        }\n    }\n    return '';\n}\n\n// Fixed author extraction function - returns array\nfunction extractAuthors(entryXml) {\n    const authorBlocks = entryXml.match(/<author[^>]*>([\\s\\S]*?)<\\/author>/gi) || [];\n    const authors = [];\n    \n    for (const block of authorBlocks) {\n        const nameMatch = block.match(/<name[^>]*>(.*?)<\\/name>/i);\n        if (nameMatch && nameMatch[1].trim()) {\n            authors.push(nameMatch[1].trim());\n        }\n    }\n    \n    return authors; // Return array instead of string\n}\n\n// Extract categories\nfunction extractCategories(entryXml) {\n    const categories = [];\n    const regex = /<category[^>]*term=\"([^\"]*)\"/gi;\n    let match;\n    \n    while ((match = regex.exec(entryXml)) !== null) {\n        if (match[1]) {\n            categories.push(match[1]);\n        }\n    }\n    \n    return categories;\n}\n\n// Extract primary category\nfunction extractPrimaryCategory(entryXml) {\n    // Handle namespace-prefixed primary category extraction\n    const patterns = [\n        /primary_category[^>]*term=\"([^\"]*)\"/i,\n        /arxiv:primary_category[^>]*term=\"([^\"]*)\"/i\n    ];\n    \n    for (const pattern of patterns) {\n        const match = entryXml.match(pattern);\n        if (match && match[1]) {\n            return match[1];\n        }\n    }\n    return '';\n}\n\n// New: extract arxiv comment\nfunction 
extractArxivComment(entryXml) {\n    const commentMatch = entryXml.match(/<arxiv:comment[^>]*>(.*?)<\\/arxiv:comment>/i);\n    return commentMatch ? commentMatch[1].trim() : '';\n}\n\ntry {\n    // Extract all entry blocks\n    const entryRegex = /<entry[^>]*>([\\s\\S]*?)<\\/entry>/gi;\n    const entries = [];\n    let match;\n    \n    while ((match = entryRegex.exec(xmlData)) !== null) {\n        entries.push(match[1]);\n    }\n    \n    if (entries.length === 0) {\n        return [{\n            json: {\n                error: \"No <entry> elements found\",\n                message: \"Please check if the XML data format is correct\",\n                success: false\n            }\n        }];\n    }\n\n    // Process each entry\n    const processedData = [];\n    let processedCount = 0;\n\n    for (let i = 0; i < entries.length; i++) {\n        const entryXml = entries[i];\n        \n        try {\n            const item = {\n                id: extractTagContent(entryXml, 'id'),\n                updated: formatDateTime(extractTagContent(entryXml, 'updated')),\n                published: formatDateTime(extractTagContent(entryXml, 'published')),\n                title: extractTagContent(entryXml, 'title'),\n                summary: extractTagContent(entryXml, 'summary'),\n                authors: extractAuthors(entryXml), // field name changed to authors, returns array\n                html_url: extractLink(entryXml, 'text/html'),\n                pdf_url: extractLink(entryXml, 'application/pdf'),\n                primary_category: extractPrimaryCategory(entryXml),\n                categories: extractCategories(entryXml), // field name changed to categories\n                arxiv_comment: extractArxivComment(entryXml), // new arxiv comment\n                github: '',\n                huggingface: ''\n            };\n\n            // Validate required fields\n            if (item.id && item.title) {\n                processedData.push(item);\n                
processedCount++;\n            }\n            \n        } catch (error) {\n            console.log(`Error processing entry ${i+1}: ${error.message}`);\n            // Continue processing next entry\n        }\n    }\n\n    // Return processed results\n    return [{\n        json: {\n            success: true,\n            message: `Successfully processed ${processedCount} entries`,\n            data: processedData,\n            processing_time: new Date().toISOString()\n        }\n    }];\n\n} catch (error) {\n    // Error handling\n    return [{\n        json: {\n            error: \"An error occurred during processing\",\n            message: error.message,\n            success: false\n        }\n    }];\n}\n"
      },
      "typeVersion": 2
    },
    {
      "id": "ae2d8994-7a52-4f7b-81fd-61c0538ba380",
      "name": "JSON Format",
      "type": "n8n-nodes-base.code",
      "position": [
        592,
        0
      ],
      "parameters": {
        "jsCode": "// Get input data\nconst xmlData = $('arXiv API').first().json.data\n\nif (!xmlData) {\n    return [{\n        json: {\n            error: \"XML data not found. Please ensure the input contains XML content\",\n            message: \"Check the field names in the input data\",\n            success: false\n        }\n    }];\n}\n\n// Format an ISO timestamp as yyyy-mm-dd hh:mm:ss, using UTC for every component\nfunction formatDateTime(isoString) {\n    if (!isoString) return '';\n    \n    try {\n        const date = new Date(isoString);\n        if (isNaN(date.getTime())) return '';\n        \n        const year = date.getUTCFullYear();\n        const month = String(date.getUTCMonth() + 1).padStart(2, '0');\n        const day = String(date.getUTCDate()).padStart(2, '0');\n        const hours = String(date.getUTCHours()).padStart(2, '0');\n        const minutes = String(date.getUTCMinutes()).padStart(2, '0');\n        const seconds = String(date.getUTCSeconds()).padStart(2, '0');\n        \n        return `${year}-${month}-${day} ${hours}:${minutes}:${seconds}`;\n    } catch (error) {\n        return '';\n    }\n}\n\n// General function to extract tag content\nfunction extractTagContent(xml, tagName) {\n    const regex = new RegExp(`<${tagName}[^>]*>([\\\\s\\\\S]*?)<\\\\/${tagName}>`, 'i');\n    const match = xml.match(regex);\n    return match ? 
match[1].trim().replace(/\\s+/g, ' ') : '';\n}\n\n// Extract links\nfunction extractLink(entryXml, linkType) {\n    // Fixed link extraction to fit actual XML format\n    // Format: <link href=\"...\" rel=\"...\" type=\"...\"/>\n    const patterns = [\n        new RegExp(`<link[^>]*href=\"([^\"]*)\"[^>]*type=\"${linkType}\"`, 'i'),\n        new RegExp(`<link[^>]*type=\"${linkType}\"[^>]*href=\"([^\"]*)\"`, 'i')\n    ];\n    \n    for (const pattern of patterns) {\n        const match = entryXml.match(pattern);\n        if (match && match[1]) {\n            return match[1];\n        }\n    }\n    return '';\n}\n\n// Fixed author extraction function - returns array\nfunction extractAuthors(entryXml) {\n    const authorBlocks = entryXml.match(/<author[^>]*>([\\s\\S]*?)<\\/author>/gi) || [];\n    const authors = [];\n    \n    for (const block of authorBlocks) {\n        const nameMatch = block.match(/<name[^>]*>(.*?)<\\/name>/i);\n        if (nameMatch && nameMatch[1].trim()) {\n            authors.push(nameMatch[1].trim());\n        }\n    }\n    \n    return authors; // Return array instead of string\n}\n\n// Extract categories\nfunction extractCategories(entryXml) {\n    const categories = [];\n    const regex = /<category[^>]*term=\"([^\"]*)\"/gi;\n    let match;\n    \n    while ((match = regex.exec(entryXml)) !== null) {\n        if (match[1]) {\n            categories.push(match[1]);\n        }\n    }\n    \n    return categories;\n}\n\n// Extract primary category\nfunction extractPrimaryCategory(entryXml) {\n    // Handle namespace-prefixed primary category extraction\n    const patterns = [\n        /primary_category[^>]*term=\"([^\"]*)\"/i,\n        /arxiv:primary_category[^>]*term=\"([^\"]*)\"/i\n    ];\n    \n    for (const pattern of patterns) {\n        const match = entryXml.match(pattern);\n        if (match && match[1]) {\n            return match[1];\n        }\n    }\n    return '';\n}\n\n// New: extract arxiv comment\nfunction 
extractArxivComment(entryXml) {\n    const commentMatch = entryXml.match(/<arxiv:comment[^>]*>(.*?)<\\/arxiv:comment>/i);\n    return commentMatch ? commentMatch[1].trim() : '';\n}\n\ntry {\n    // Extract all entry blocks\n    const entryRegex = /<entry[^>]*>([\\s\\S]*?)<\\/entry>/gi;\n    const entries = [];\n    let match;\n    \n    while ((match = entryRegex.exec(xmlData)) !== null) {\n        entries.push(match[1]);\n    }\n    \n    if (entries.length === 0) {\n        return [{\n            json: {\n                error: \"No <entry> elements found\",\n                message: \"Please check if the XML data format is correct\",\n                success: false\n            }\n        }];\n    }\n\n    // Process each entry\n    const processedData = [];\n    let processedCount = 0;\n\n    for (let i = 0; i < entries.length; i++) {\n        const entryXml = entries[i];\n        \n        try {\n            const item = {\n                id: extractTagContent(entryXml, 'id'),\n                updated: formatDateTime(extractTagContent(entryXml, 'updated')),\n                published: formatDateTime(extractTagContent(entryXml, 'published')),\n                title: extractTagContent(entryXml, 'title'),\n                summary: extractTagContent(entryXml, 'summary'),\n                authors: extractAuthors(entryXml), // field name changed to authors, returns array\n                html_url: extractLink(entryXml, 'text/html'),\n                pdf_url: extractLink(entryXml, 'application/pdf'),\n                primary_category: extractPrimaryCategory(entryXml),\n                categories: extractCategories(entryXml), // field name changed to categories\n                arxiv_comment: extractArxivComment(entryXml), // new arxiv comment\n                github: '',\n                huggingface: ''\n            };\n\n            // Validate required fields\n            if (item.id && item.title) {\n                processedData.push(item);\n                
processedCount++;\n            }\n            \n        } catch (error) {\n            console.log(`Error processing entry ${i+1}: ${error.message}`);\n            // Continue processing next entry\n        }\n    }\n\n    // Return processed results\n    return [{\n        json: {\n            success: true,\n            message: `Successfully processed ${processedCount} entries`,\n            data: processedData,\n            processing_time: new Date().toISOString()\n        }\n    }];\n\n} catch (error) {\n    // Error handling\n    return [{\n        json: {\n            error: \"An error occurred during processing\",\n            message: error.message,\n            success: false\n        }\n    }];\n}\n"
      },
      "typeVersion": 2
    },
    {
      "id": "8fbefc67-e9f7-4597-b935-d5f5895cf93c",
      "name": "Sticky Note1",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -160,
        -224
      ],
      "parameters": {
        "width": 656,
        "height": 192,
        "content": "## 3. Data Processing\n\nAnalyze and summarize paper data using AI, then standardize output as JSON.\n\n### Single Paper Basic Information Analysis and Enhancement  \n### Daily Paper Summary and Multilingual Translation"
      },
      "typeVersion": 1
    },
    {
      "id": "884f2c40-4628-4376-a040-709e2db34c48",
      "name": "Sticky Note2",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        1024,
        16
      ],
      "parameters": {
        "width": 624,
        "height": 368,
        "content": "## 4. Data Storage: Notion Database\n\n- Create a corresponding database in Notion with the same predefined field names.  \n- In Notion, create an integration under **Integrations** and grant access to the database. Obtain the corresponding **Secret Key**.  \n- Use the Notion **\"Create a database page\"** node to configure the field mapping and store the data.  \n\n**Notes**  \n- **\"Create a database page\"** only adds new entries; existing pages are not updated.  \n- The `updated` and `published` timestamps of arXiv papers are in **UTC**.  \n- Notion **single-select** and **multi-select** fields only accept arrays. They do not automatically parse comma-separated strings. You need to format them as proper arrays.  \n- Notion rejects `null` values with a **400 error**.  \n"
      },
      "typeVersion": 1
    },
    {
      "id": "4991129d-9406-4c52-bd8f-87e2721c4a6f",
      "name": "Sticky Note4",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -1088,
        544
      ],
      "parameters": {
        "width": 624,
        "height": 912,
        "content": "## 2. **Data Extraction**\n\n### Data Cleaning Rules (Convert to Standard JSON)\n\n1. **Remove Header**  \n   - Keep only the `<entry></entry>` blocks representing paper items.\n\n2. **Single Item**  \n   - Each `<entry></entry>` represents a single item.\n\n3. **Field Processing Rules**  \n   - `<id></id>` ➡️ `id`  \n     Extract content.  \n     Example: `<id>http://arxiv.org/abs/2409.06062v1</id>` → `http://arxiv.org/abs/2409.06062v1`  \n   - `<updated></updated>` ➡️ `updated`  \n     Convert timestamp to `yyyy-mm-dd hh:mm:ss`  \n   - `<published></published>` ➡️ `published`  \n     Convert timestamp to `yyyy-mm-dd hh:mm:ss`  \n   - `<title></title>` ➡️ `title`  \n     Extract text content  \n   - `<summary></summary>` ➡️ `summary`  \n     Keep text, remove line breaks  \n   - `<author></author>` ➡️ `authors`  \n     Combine all authors into an array  \n     Example: `[ \"Ernest Pusateri\", \"Anmol Walia\" ]` (for Notion multi-select field)  \n   - `<arxiv:comment></arxiv:comment>` ➡️ `arxiv_comment`  \n     Extract text content  \n   - `<link type=\"text/html\">` ➡️ `html_url`  \n     Extract URL  \n   - `<link type=\"application/pdf\">` ➡️ `pdf_url`  \n     Extract URL  \n   - `<arxiv:primary_category term=\"cs.CL\">` ➡️ `primary_category`  \n     Extract `term` value  \n   - `<category>` ➡️ `categories`  \n     Merge all `<category>` values into an array  \n     Example: `[ \"eess.AS\", \"cs.SD\" ]` (for Notion multi-select field)  \n\n4. **Add Empty Fields**  \n   - `github`  \n   - `huggingface`\n"
      },
      "typeVersion": 1
    }
  ],
  "pinData": {},
  "connections": {
    "c3685631-8bbd-409a-978a-fbb3e9847115": {
      "main": [
        [
          {
            "node": "5d897d4d-968b-4336-bbee-d1d3b4dcae06",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "9151ab18-379f-4d3b-8ca2-cf65c547e78d": {
      "main": [
        [
          {
            "node": "869f80ec-c14c-4d1e-ae11-bb6eb4c99e5d",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "a38b1b58-a6f6-4c6b-ba6e-f153980a220d": {
      "main": [
        [
          {
            "node": "ac6b1c0d-b18e-4b42-b49e-8cb4daf0d384",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "ae855e91-2363-4b97-8933-761934b269fe": {
      "main": [
        [
          {
            "node": "3df82b76-e9c8-4b0b-a552-428f2fc12c97",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "3282f989-a9a4-4d4f-aaf0-097fc0d72e0d": {
      "main": [
        [
          {
            "node": "024c6399-857e-45a3-a15d-8b733e16da67",
            "type": "main",
            "index": 0
          },
          {
            "node": "c3685631-8bbd-409a-978a-fbb3e9847115",
            "type": "main",
            "index": 0
          },
          {
            "node": "6f3df3be-a376-42e9-b0be-32c4fba5a8e2",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "ae2d8994-7a52-4f7b-81fd-61c0538ba380": {
      "main": [
        [
          {
            "node": "f7ba78f8-19cb-492c-840c-3570d2865fb1",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "7e9f18f1-edfe-4af6-835b-12fe16a99034": {
      "main": [
        [
          {
            "node": "ae2d8994-7a52-4f7b-81fd-61c0538ba380",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "5d897d4d-968b-4336-bbee-d1d3b4dcae06": {
      "main": [
        [
          {
            "node": "7e9f18f1-edfe-4af6-835b-12fe16a99034",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "3df82b76-e9c8-4b0b-a552-428f2fc12c97": {
      "main": [
        [
          {
            "node": "3282f989-a9a4-4d4f-aaf0-097fc0d72e0d",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "4dd24343-1872-472d-8d7d-4cd28a9dbabe": {
      "main": [
        [
          {
            "node": "aaa67776-c308-443e-98f6-e1fe7035cbb5",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "aaa67776-c308-443e-98f6-e1fe7035cbb5": {
      "main": [
        [
          {
            "node": "ae855e91-2363-4b97-8933-761934b269fe",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "6f3df3be-a376-42e9-b0be-32c4fba5a8e2": {
      "main": [
        [
          {
            "node": "9151ab18-379f-4d3b-8ca2-cf65c547e78d",
            "type": "main",
            "index": 0
          },
          {
            "node": "a38b1b58-a6f6-4c6b-ba6e-f153980a220d",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "92d37dc1-aaaf-47ec-987a-e6d23c93e055": {
      "ai_languageModel": [
        [
          {
            "node": "7e9f18f1-edfe-4af6-835b-12fe16a99034",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    }
  }
}
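The Notion notes in the "Sticky Note2" node above (select fields expect arrays, and `null` values are rejected) can be illustrated with a small JavaScript sketch. This is not part of the exported workflow: `toArray` and `normalizeForNotion` are hypothetical helper names, and the choice of `authors` and `categories` as array fields is an assumption based on the field names the Code node emits.

```javascript
// Hypothetical helpers, not part of the exported workflow.
// Turn an array, a comma-separated string, or null/undefined into a clean array of strings.
function toArray(value) {
  if (value === null || value === undefined) return [];
  const parts = Array.isArray(value) ? value : String(value).split(',');
  return parts
    .filter((part) => part !== null && part !== undefined)
    .map((part) => String(part).trim())
    .filter((part) => part.length > 0);
}

// Prepare one paper item before mapping it to Notion properties:
// array fields become real arrays, and null/undefined scalars become empty strings.
function normalizeForNotion(item, arrayFields = ['authors', 'categories']) {
  const out = {};
  for (const [key, value] of Object.entries(item)) {
    if (arrayFields.includes(key)) {
      out[key] = toArray(value);
    } else {
      out[key] = value === null || value === undefined ? '' : value;
    }
  }
  return out;
}

const example = normalizeForNotion({
  title: 'Example paper',
  authors: 'Ernest Pusateri, Anmol Walia', // comma-separated string
  categories: ['eess.AS', 'cs.SD'],        // already an array
  github: null,                            // would trigger a 400 error if sent as-is
});
console.log(example);
```

In a Code node, a helper like this would typically run just before the Notion node, so that the field mapping only ever sees arrays and non-null values.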
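Frequently Asked Questions

How do I use this workflow?

Copy the JSON configuration above, create a new workflow in your n8n instance, choose "Import from JSON", paste the configuration, and then update the credential settings as needed.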
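Before importing, it can help to confirm that the copied JSON is internally consistent. The sketch below is a hypothetical pre-import check in JavaScript, not an n8n feature: `findDanglingConnections` is an illustrative name, and it assumes the `connections` layout seen in the export above (source key, then output type, then a list of branches, each holding `{ node, type, index }` targets). It accepts either node IDs or node names as references, since exports may use one or the other.

```javascript
// Hypothetical pre-import check; findDanglingConnections is not an n8n API.
// Returns every connection source or target that matches no node id or name.
function findDanglingConnections(workflow) {
  const known = new Set();
  for (const node of workflow.nodes || []) {
    if (node.id) known.add(node.id);
    if (node.name) known.add(node.name);
  }

  const dangling = [];
  for (const [source, outputs] of Object.entries(workflow.connections || {})) {
    if (!known.has(source)) dangling.push(source);
    for (const branches of Object.values(outputs)) {
      for (const branch of branches) {
        for (const target of branch || []) {
          if (!known.has(target.node)) dangling.push(target.node);
        }
      }
    }
  }
  return [...new Set(dangling)];
}

// Tiny stand-in for a pasted export; a real file would be parsed with JSON.parse.
const sample = {
  nodes: [
    { id: 'a', name: 'arXiv API' },
    { id: 'b', name: 'JSON Format' },
  ],
  connections: {
    a: { main: [[{ node: 'b', type: 'main', index: 0 }]] },
    b: { main: [[{ node: 'missing', type: 'main', index: 0 }]] },
  },
};
console.log(findDanglingConnections(sample)); // [ 'missing' ]
```

An empty result means every connection points at a node that exists in the export; any listed value is a reference worth checking by hand after import.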

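What scenarios is this workflow suited to?

Advanced: content creation, multimodal AI

Is it paid?

This workflow is completely free to import and use. However, third-party services used by the workflow (for example, the Google Gemini API) may charge fees that you pay directly.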

Workflow information
Difficulty: Advanced
Node count: 22
Categories: 2
Node types: 11
Difficulty description

A complex workflow with 16+ nodes, intended for advanced users
