RAG with OpenAI Embeddings in BigQuery

Advanced

This is an automation workflow in the Miscellaneous, AI RAG, and Multimodal AI categories, containing 24 nodes. It mainly uses nodes such as Set, HttpRequest, Agent, ChatTrigger, and LmChatOpenAi. It uses BigQuery RAG and OpenAI to answer questions about documents.

Prerequisites
  • May require authentication credentials for the target API
  • OpenAI API key
Export the workflow
Copy the following JSON configuration into n8n to import and use this workflow
{
  "id": "",
  "meta": {
    "instanceId": ""
  },
  "name": "BigQuery RAG With OpenAI Embedding",
  "tags": [],
  "nodes": [
    {
      "id": "69f19613-1e74-43dc-9f2e-c95c2db385e3",
      "name": "AI Agent",
      "type": "@n8n/n8n-nodes-langchain.agent",
      "position": [
        -960,
        64
      ],
      "parameters": {
        "options": {
          "systemMessage": "You are an assistant specialized in answering questions about **n8n**.  \nUse the connected vector store to retrieve relevant documentation and generate clear, structured, and helpful answers.\n\n## Instructions\n\n1. **Document Retrieval**  \n   - Query the connected vector store to gather information from the n8n documentation.  \n   - Base every answer on the retrieved content.  \n\n2. **Answer Formatting**  \n   - Write all answers in **Markdown**.  \n   - Structure responses with clear sections, bullet points, or lists when useful.  \n   - Provide direct excerpts or explanations from the retrieved documents.  \n\n3. **Images**  \n   - Include screenshots from the documentation when they provide value to the user.  \n   - Use Markdown syntax for images.  \n   - Focus on images that illustrate functionality or clarify instructions.  \n\n4. **Code**  \n   - Present code snippets as **Markdown code blocks**.  \n   - Preserve the original content of the snippet.  \n   - Add language hints when available (e.g., ` ```javascript `).  \n\n5. **Completeness**  \n   - Provide accurate, context-aware answers.  \n   - Indicate clearly when the retrieved documents do not contain enough information. "
        }
      },
      "typeVersion": 2.2
    },
    {
      "id": "e198291b-f8b0-455e-8c40-bf5a9cadef70",
      "name": "When chat message received",
      "type": "@n8n/n8n-nodes-langchain.chatTrigger",
      "position": [
        -1408,
        64
      ],
      "webhookId": "dc7d2421-f489-4eea-b3fb-6b69ede9beed",
      "parameters": {
        "options": {}
      },
      "typeVersion": 1.3
    },
    {
      "id": "53c039bc-4788-49f3-afc8-354498da9e44",
      "name": "OpenAI Chat Model",
      "type": "@n8n/n8n-nodes-langchain.lmChatOpenAi",
      "position": [
        -1152,
        400
      ],
      "parameters": {
        "model": {
          "__rl": true,
          "mode": "list",
          "value": "gpt-4.1-mini",
          "cachedResultName": "gpt-4.1-mini"
        },
        "options": {}
      },
      "credentials": {
        "openAiApi": {
          "id": "",
          "name": "OpenAi Account"
        }
      },
      "typeVersion": 1.2
    },
    {
      "id": "e9a99d6d-e191-4ecf-a271-1a9ebb07e02a",
      "name": "Simple Memory",
      "type": "@n8n/n8n-nodes-langchain.memoryBufferWindow",
      "position": [
        -848,
        400
      ],
      "parameters": {},
      "typeVersion": 1.3
    },
    {
      "id": "8acb8e42-c6e3-4254-a53d-077bbf9f4065",
      "name": "When Executed by Another Workflow",
      "type": "n8n-nodes-base.executeWorkflowTrigger",
      "position": [
        -1488,
        1264
      ],
      "parameters": {
        "workflowInputs": {
          "values": [
            {
              "name": "vector_search_question"
            }
          ]
        }
      },
      "typeVersion": 1.1
    },
    {
      "id": "c97a255c-3328-4837-bf45-7a3f5cc7c5b7",
      "name": "BigQuery RAG OpenAI",
      "type": "@n8n/n8n-nodes-langchain.toolWorkflow",
      "position": [
        -512,
        384
      ],
      "parameters": {
        "workflowId": {
          "__rl": true,
          "mode": "list",
          "value": "7tqwzCyxnnu8Svq3",
          "cachedResultName": "BigQuery RAG With OpenAI Embedding"
        },
        "description": "Call this tool to get documents from the vector database",
        "workflowInputs": {
          "value": {
            "vector_search_question": "={{ /*n8n-auto-generated-fromAI-override*/ $fromAI('vector_search_question', `A natural language question used to query the vector database containing the documentation.\n`, 'string') }}"
          },
          "schema": [
            {
              "id": "vector_search_question",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "vector_search_question",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            }
          ],
          "mappingMode": "defineBelow",
          "matchingColumns": [
            "vector_search_question"
          ],
          "attemptToConvertTypes": false,
          "convertFieldsToString": false
        }
      },
      "typeVersion": 2.2
    },
    {
      "id": "dde0a97a-4495-42c2-837d-f5596f2bfbfe",
      "name": "Sticky Note",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -1536,
        -128
      ],
      "parameters": {
        "color": 3,
        "width": 384,
        "height": 432,
        "content": "## When Chat Message Received\n\nWhen a chat message is received, it triggers the workflow.  \nThe message is then forwarded to the **AI Agent**.\n"
      },
      "typeVersion": 1
    },
    {
      "id": "62bfaf5a-5b7f-430d-9bf6-c1d745e6a2c3",
      "name": "Sticky Note 1",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -1120,
        -128
      ],
      "parameters": {
        "color": 7,
        "width": 544,
        "height": 432,
        "content": "## AI Agent\n\nThe AI Agent receives the user input.  \nIt is configured with a system prompt that instructs it to use the connected tool (**BigQuery RAG OpenAI**) to query the vector database.  \nUsing the retrieved documents, it then generates an answer and formats the response in **Markdown**.\n"
      },
      "typeVersion": 1
    },
    {
      "id": "dc49cd0c-3b52-475c-a423-7d8fd3914b4c",
      "name": "Sticky Note 2",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -1248,
        368
      ],
      "parameters": {
        "color": 7,
        "width": 288,
        "height": 480,
        "content": "\n\n\n\n\n\n\n\n\n\n\n\n## OpenAI Chat Model\n\nThe default model is **gpt-4.1-mini**, chosen for its cost efficiency.\n"
      },
      "typeVersion": 1
    },
    {
      "id": "5a6c7c06-2021-4ad3-9e16-482e230b298d",
      "name": "Sticky Note 3",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -928,
        368
      ],
      "parameters": {
        "color": 7,
        "width": 288,
        "height": 480,
        "content": "\n\n\n\n\n\n\n\n\n\n\n\n\n## Simple Memory\n\nProvides the AI Agent with context from the ongoing conversation.  \n\nFor production use, this node can be replaced with a more robust option, such as [Postgres Chat Memory](https://docs.n8n.io/integrations/builtin/cluster-nodes/sub-nodes/n8n-nodes-langchain.memorypostgreschat/).\n"
      },
      "typeVersion": 1
    },
    {
      "id": "7257c13a-e3f1-4602-807a-2ade1b14bcfc",
      "name": "Sticky Note 4",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -608,
        368
      ],
      "parameters": {
        "color": 7,
        "width": 288,
        "height": 480,
        "content": "\n\n\n\n\n\n\n\n\n\n\n\n\n## BigQuery RAG OpenAI\n\nThis tool, used by the AI Agent, triggers a subworkflow that retrieves documents from the BigQuery vector store.  \n\nIt takes **`vector_search_question`** as input — the natural language question formulated by the AI Agent.  \nThis input is queried against the BigQuery vector store to fetch relevant documents, which are then used to generate the final answer.\n"
      },
      "typeVersion": 1
    },
    {
      "id": "3d718a18-8cb5-4a8b-b968-fb58502852bb",
      "name": "Sticky Note 5",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -1568,
        912
      ],
      "parameters": {
        "color": 3,
        "width": 288,
        "height": 512,
        "content": "## When Executed by Another Workflow\n\nThis subworkflow is triggered when the AI Agent calls the **BigQuery RAG OpenAI** tool.  \nIt takes **`vector_search_question`** as input — the natural language query sent to the vector database to retrieve documents and answer the user’s question.\n"
      },
      "typeVersion": 1
    },
    {
      "id": "a61ba929-9efc-404e-8457-39c273aeaeb6",
      "name": "Sticky Note 6",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -1248,
        912
      ],
      "parameters": {
        "color": 7,
        "width": 288,
        "height": 512,
        "content": "## Set Field – Question\n\nThis node assigns the value of **`vector_search_question`** to the field **`question`**.  \n\nIt is not strictly necessary in this version of the workflow, but it can be extended to set additional fields in future versions.\n"
      },
      "typeVersion": 1
    },
    {
      "id": "14298572-7b25-4d0a-8360-8f36c5d3ee0e",
      "name": "Set Field - question",
      "type": "n8n-nodes-base.set",
      "position": [
        -1152,
        1264
      ],
      "parameters": {
        "options": {},
        "assignments": {
          "assignments": [
            {
              "id": "5635b058-5ab3-49fc-be4c-51bfc07630c7",
              "name": "question",
              "type": "string",
              "value": "={{ $json.vector_search_question }}"
            }
          ]
        }
      },
      "typeVersion": 3.4
    },
    {
      "id": "7166cd84-5c7b-4e10-aa40-d919c9f6387d",
      "name": "OpenAI - Create Embedding",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        -832,
        1264
      ],
      "parameters": {
        "url": "https://api.openai.com/v1/embeddings",
        "method": "POST",
        "options": {},
        "sendBody": true,
        "authentication": "predefinedCredentialType",
        "bodyParameters": {
          "parameters": [
            {
              "name": "input",
              "value": "={{ $json.question }}"
            },
            {
              "name": "model",
              "value": "text-embedding-3-large"
            }
          ]
        },
        "nodeCredentialType": "openAiApi"
      },
      "credentials": {
        "openAiApi": {
          "id": "",
          "name": "OpenAi Account"
        }
      },
      "typeVersion": 4.2
    },
    {
      "id": "00247057-0cc0-4216-b299-115d9637c8b7",
      "name": "Sticky Note 7",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -928,
        912
      ],
      "parameters": {
        "color": 7,
        "width": 288,
        "height": 512,
        "content": "## OpenAI – Create Embedding\n\nThis node calls the OpenAI API to generate an **embedding**, which is a vector representation of the input question.  \nThe embedding is then used to query the BigQuery vector store and retrieve relevant documents.  \n\nIn this template, the embedding model used is **`text-embedding-3-large`**."
      },
      "typeVersion": 1
    },
    {
      "id": "c6517110-de28-475c-82eb-d109ed99501c",
      "name": "Sticky Note 8",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -608,
        912
      ],
      "parameters": {
        "color": 7,
        "width": 288,
        "height": 512,
        "content": "## Set Field – Embedding\n\nThis node assigns the **`embedding`** field using the data returned by the previous API call (**OpenAI – Create Embedding**).  \nThe embedding is then passed to the next API call (**BigQuery**) to perform the vector search."
      },
      "typeVersion": 1
    },
    {
      "id": "79a780d1-8646-4927-93be-54168d920f85",
      "name": "BigQuery - Vector Retriever - n8n docs",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        0,
        1264
      ],
      "parameters": {
        "url": "https://bigquery.googleapis.com/bigquery/v2/projects/<YOUR-PROJECT-ID>/queries",
        "method": "POST",
        "options": {},
        "sendBody": true,
        "authentication": "predefinedCredentialType",
        "bodyParameters": {
          "parameters": [
            {
              "name": "query",
              "value": "=WITH query AS (\n  SELECT ARRAY(\n    SELECT CAST(x AS FLOAT64)\n    FROM UNNEST([{{ $json.embedding.join(',') }}]) AS x\n  ) AS embedding\n)\nSELECT\n  t.base.text,\n  t.base.metadata,\n  t.distance   AS cosine_distance\nFROM VECTOR_SEARCH(\n  TABLE `n8n-docs-rag.n8n_docs.n8n_docs_embeddings`,\n  'embedding',\n  TABLE query,\n  top_k => 10,\n  distance_type => 'COSINE',\n  options => '{\"use_brute_force\": true}'\n) AS t\nORDER BY t.distance;"
            },
            {
              "name": "useLegacySql",
              "value": "false"
            }
          ]
        },
        "nodeCredentialType": "googleBigQueryOAuth2Api"
      },
      "credentials": {
        "googleBigQueryOAuth2Api": {
          "id": "",
          "name": "Google BigQuery account"
        }
      },
      "typeVersion": 4.2
    },
    {
      "id": "32f2544c-8e36-41a3-97c4-f6264d418be6",
      "name": "Sticky Note 9",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -288,
        912
      ],
      "parameters": {
        "color": 7,
        "width": 736,
        "height": 512,
        "content": "## BigQuery – Vector Retriever – n8n docs\n\nThis node queries the BigQuery vector table that contains part of the n8n documentation:  \n`n8n-docs-rag.n8n_docs.n8n_docs_embeddings`.\n\nIn the **URL field**, replace **`<YOUR-PROJECT-ID>`** with your own project ID.  \n\nThis is a small table (~40 MB), but keep in mind that BigQuery uses the *requester pays* model.  \nWhen you test the workflow, the query cost is billed to your project.  \nRunning 3–4 queries should be inexpensive, as BigQuery provides **1 TB of free processing per month**, unless the project has already consumed its free quota.  \nMore info here: [BigQuery Pricing](https://cloud.google.com/bigquery/pricing?hl=en)\n"
      },
      "typeVersion": 1
    },
    {
      "id": "bb368d56-96a4-4bf0-b3bc-703354258d1d",
      "name": "Documents Retrieved",
      "type": "n8n-nodes-base.set",
      "position": [
        576,
        1264
      ],
      "parameters": {
        "options": {},
        "assignments": {
          "assignments": [
            {
              "id": "3a887578-7e34-4406-b44a-17695e5b5ab4",
              "name": "documents",
              "type": "string",
              "value": "={{ $json.rows.toJsonString() }}"
            }
          ]
        }
      },
      "typeVersion": 3.4
    },
    {
      "id": "1523c1b9-e81e-4c11-87dc-f67a687965e5",
      "name": "Sticky Note 10",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        480,
        912
      ],
      "parameters": {
        "color": 7,
        "width": 288,
        "height": 512,
        "content": "## Documents Retrieved\n\nThis node stores the **documents retrieved from BigQuery** (10 by default).  \nThe results are then passed to the **AI Agent** to generate the final answer.  \n\nThe number of documents retrieved can be adjusted by changing the value of **`top_k`** in the SQL query.\n"
      },
      "typeVersion": 1
    },
    {
      "id": "3f7c43ba-b4f1-4284-a896-c320769b9a2a",
      "name": "Sticky Note 11",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -288,
        1440
      ],
      "parameters": {
        "color": 5,
        "width": 736,
        "height": 752,
        "content": "## BigQuery – Vector Retriever – n8n docs  \n*Technical details*\n\nThis node runs a SQL query on the BigQuery vector database that stores part of the n8n documentation (nodes and triggers).  \nReplace the sample table with the one you created for your own use case.\n\n```sql\nWITH query AS (\n  SELECT ARRAY(\n    SELECT CAST(x AS FLOAT64)\n    FROM UNNEST([{{ $json.embedding.join(',') }}]) AS x\n  ) AS embedding\n)\nSELECT\n  t.base.text,\n  t.base.metadata,\n  t.distance AS cosine_distance\nFROM VECTOR_SEARCH(\n  TABLE `n8n-docs-rag.n8n_docs.n8n_docs_embeddings`,\n  'embedding',\n  TABLE query,\n  top_k => 10,\n  distance_type => 'COSINE',\n  options => '{\"use_brute_force\": true}'\n) AS t\nORDER BY t.distance;\n```\n* Since this table is small (fewer than ~5,000 rows and under ~10 MB), BigQuery does not use a vector index and automatically falls back to brute-force search.  \nIncluding `options => '{\"use_brute_force\": true}'` explicitly enforces this mode, although BigQuery would default to it in this scenario.\n\n* The field **`distance`** holds the cosine distance: lower values mean the document is more relevant to the query.\n\n* `top_k => 10` retrieves the 10 documents most relevant to the user’s question. These documents are then passed to the AI Agent to generate the final answer."
      },
      "typeVersion": 1
    },
    {
      "id": "5da46a82-d97a-4c8b-8f3a-51286c677a1a",
      "name": "Sticky Note 12",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -2928,
        -128
      ],
      "parameters": {
        "color": 7,
        "width": 1344,
        "height": 1200,
        "content": "# BigQuery RAG with OpenAI Embeddings\n\nThis workflow demonstrates how to use **Retrieval-Augmented Generation (RAG)** with **BigQuery** and **OpenAI**.  \nBy default, you cannot directly use OpenAI Cloud Models within BigQuery.\n\n## Why this workflow?\n\nMany organizations already use BigQuery to store enterprise data, and OpenAI for LLM use cases.  \nWhen it comes to RAG, the common approach is to rely on dedicated vector databases such as **Qdrant**, **Pinecone**, **Weaviate**, or PostgreSQL with **pgvector**.  \nThose are good choices, but in cases where an organization already uses and is familiar with BigQuery, it can be more efficient to leverage its built-in vector capabilities for RAG.\n\nThen comes the question of the LLM. If OpenAI is the chosen provider, teams are often frustrated that it is not directly compatible with BigQuery.  \nThis workflow solves that limitation.\n\n## Prerequisites\n\nTo use this workflow, you will need:\n* A good understanding of BigQuery and its vector capabilities  \n* A BigQuery table containing documents and an embeddings column  \n  * The embeddings column must be of type **FLOAT** and mode **REPEATED** (to store arrays)  \n* A data pipeline that **generates embeddings with the OpenAI API** and stores them in BigQuery\n\nThis template comes with a public table that stores part of the **n8n documentation** (about nodes and triggers), so you can try it out:  \n`n8n-docs-rag.n8n_docs.n8n_docs_embeddings`\n\n## How it works\n\nThe system consists of two workflows:\n* **Main workflow** → Hosts the AI Agent, which connects to a subworkflow for RAG  \n* **Subworkflow** → Queries the BigQuery vector table. The retrieved documents are then used by the AI Agent to generate an answer for the user.\n\n## Try it\n\nYou can test this workflow using the public table [`n8n-docs-rag.n8n_docs.n8n_docs_embeddings`](https://console.cloud.google.com/bigquery?ws=!1m5!1m4!4m3!1sn8n-docs-rag!2sn8n_docs!3sn8n_docs_embeddings).  \n\n⚠️ **Important:** BigQuery uses the *requester pays* model. This table is small (~40 MB), and BigQuery includes 1 TB of free processing each month. Running 3–4 queries for testing should not incur any noticeable cost.\n"
      },
      "typeVersion": 1
    },
    {
      "id": "92d2b05f-e8a3-4c64-ab65-4c64a2bc0297",
      "name": "Set Field - Embedding",
      "type": "n8n-nodes-base.set",
      "position": [
        -512,
        1264
      ],
      "parameters": {
        "options": {},
        "assignments": {
          "assignments": [
            {
              "id": "8abb7285-f4e4-46d7-a660-cd75104d8bbd",
              "name": "embedding",
              "type": "array",
              "value": "={{ $json.data[0].embedding }}"
            }
          ]
        }
      },
      "typeVersion": 3.4
    }
  ],
  "active": false,
  "pinData": {
    "When Executed by Another Workflow": [
      {
        "json": {
          "vector_search_question": "How does the n8n Webhook Trigger work?"
        }
      }
    ]
  },
  "settings": {
    "executionOrder": "v1"
  },
  "versionId": "",
  "connections": {
    "e9a99d6d-e191-4ecf-a271-1a9ebb07e02a": {
      "ai_memory": [
        [
          {
            "node": "69f19613-1e74-43dc-9f2e-c95c2db385e3",
            "type": "ai_memory",
            "index": 0
          }
        ]
      ]
    },
    "53c039bc-4788-49f3-afc8-354498da9e44": {
      "ai_languageModel": [
        [
          {
            "node": "69f19613-1e74-43dc-9f2e-c95c2db385e3",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    },
    "c97a255c-3328-4837-bf45-7a3f5cc7c5b7": {
      "ai_tool": [
        [
          {
            "node": "69f19613-1e74-43dc-9f2e-c95c2db385e3",
            "type": "ai_tool",
            "index": 0
          }
        ]
      ]
    },
    "14298572-7b25-4d0a-8360-8f36c5d3ee0e": {
      "main": [
        [
          {
            "node": "7166cd84-5c7b-4e10-aa40-d919c9f6387d",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "92d2b05f-e8a3-4c64-ab65-4c64a2bc0297": {
      "main": [
        [
          {
            "node": "79a780d1-8646-4927-93be-54168d920f85",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "7166cd84-5c7b-4e10-aa40-d919c9f6387d": {
      "main": [
        [
          {
            "node": "92d2b05f-e8a3-4c64-ab65-4c64a2bc0297",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "e198291b-f8b0-455e-8c40-bf5a9cadef70": {
      "main": [
        [
          {
            "node": "69f19613-1e74-43dc-9f2e-c95c2db385e3",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "8acb8e42-c6e3-4254-a53d-077bbf9f4065": {
      "main": [
        [
          {
            "node": "14298572-7b25-4d0a-8360-8f36c5d3ee0e",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "79a780d1-8646-4927-93be-54168d920f85": {
      "main": [
        [
          {
            "node": "bb368d56-96a4-4bf0-b3bc-703354258d1d",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}
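The subworkflow in the JSON above embeds the question, inlines the embedding into a `VECTOR_SEARCH` query, and passes the returned rows on. As a rough illustration outside n8n, the Python sketch below rebuilds that SQL string and flattens a BigQuery query response. The function names `build_vector_search_sql` and `rows_to_dicts` are hypothetical, the table name and the `top_k` default are copied from the workflow, and the `f`/`v` row layout is assumed to match what the BigQuery REST `queries` endpoint returns. No network call is made.

```python
from __future__ import annotations

from typing import Any

# Sample table from the workflow; replace it with your own table.
TABLE = "n8n-docs-rag.n8n_docs.n8n_docs_embeddings"


def build_vector_search_sql(
    embedding: list[float], table: str = TABLE, top_k: int = 10
) -> str:
    """Inline the embedding as an array literal, like `$json.embedding.join(',')`."""
    if not embedding:
        raise ValueError("embedding must not be empty")
    literal = ",".join(repr(float(x)) for x in embedding)
    return (
        "WITH query AS (\n"
        f"  SELECT ARRAY(SELECT CAST(x AS FLOAT64) FROM UNNEST([{literal}]) AS x) AS embedding\n"
        ")\n"
        "SELECT t.base.text, t.base.metadata, t.distance AS cosine_distance\n"
        "FROM VECTOR_SEARCH(\n"
        f"  TABLE `{table}`,\n"
        "  'embedding',\n"
        "  TABLE query,\n"
        f"  top_k => {int(top_k)},\n"
        "  distance_type => 'COSINE',\n"
        "  options => '{\"use_brute_force\": true}'\n"
        ") AS t\n"
        "ORDER BY t.distance;"
    )


def rows_to_dicts(response: dict[str, Any]) -> list[dict[str, Any]]:
    """Flatten rows shaped like {"f": [{"v": ...}]} using the schema field names."""
    fields = response.get("schema", {}).get("fields", [])
    names = [field["name"] for field in fields]
    return [
        dict(zip(names, (cell["v"] for cell in row["f"])))
        for row in response.get("rows", [])
    ]
```

The string returned by `build_vector_search_sql` corresponds to the `query` body parameter of the BigQuery HTTP Request node, sent together with `useLegacySql` set to `false`.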
Frequently asked questions

How do I use this workflow?

Copy the JSON configuration above, create a new workflow in your n8n instance, choose "Import from JSON", paste the configuration, and adjust the authentication settings as needed.
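Before importing, it can help to list what still needs configuring. The Python sketch below is one possible check, not part of n8n: it scans a workflow dictionary for credentials with an empty `id` and for the `<YOUR-PROJECT-ID>` placeholder in a node URL, both of which appear in the JSON above. The function name `find_setup_todos` is hypothetical.

```python
from __future__ import annotations

from typing import Any

PLACEHOLDER = "<YOUR-PROJECT-ID>"


def find_setup_todos(workflow: dict[str, Any]) -> list[str]:
    """Return human-readable reminders for nodes that still need configuration."""
    todos: list[str] = []
    for node in workflow.get("nodes", []):
        name = node.get("name") or node.get("id", "?")
        # An empty credential id means no credential is selected yet.
        for cred_type, cred in node.get("credentials", {}).items():
            if not cred.get("id"):
                todos.append(f"{name}: pick a credential of type {cred_type}")
        # The BigQuery request URL ships with a project placeholder.
        url = str(node.get("parameters", {}).get("url", ""))
        if PLACEHOLDER in url:
            todos.append(f"{name}: replace {PLACEHOLDER} in the URL")
    return todos
```

To run it on the exported configuration, parse the text with `json.loads` first and pass the resulting dictionary to `find_setup_todos`.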

Which scenarios is this workflow suited for?

Advanced - Miscellaneous, AI RAG, Multimodal AI

Is it paid?

This workflow is completely free and can be used as is. Note that third-party services used in the workflow (such as the OpenAI API) may charge for usage.
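For the BigQuery side, the sticky notes in the workflow quote a table of about 40 MB and 1 TB of free processing per month. The back-of-the-envelope Python sketch below turns those two figures into a rough query budget. It assumes, as a simplification, that each brute-force search is billed for the full table size and that the units are binary (MiB and TiB); actual billed bytes depend on your pricing model and may differ. The function name `queries_within_free_tier` is hypothetical.

```python
MIB_PER_TIB = 1024 * 1024


def queries_within_free_tier(table_mib: float, free_tib: float = 1.0) -> int:
    """Estimate how many full-table scans fit in the monthly free allowance."""
    if table_mib <= 0:
        raise ValueError("table_mib must be positive")
    return int((free_tib * MIB_PER_TIB) // table_mib)


# Figures quoted in the workflow notes: ~40 MB table, 1 TB free per month.
print(queries_within_free_tier(40))  # prints 26214
```

Under these assumptions, the handful of test queries mentioned in the notes uses a negligible share of the allowance.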

Workflow information
Difficulty level: Advanced
Number of nodes: 24
Categories: 3
Node types: 9
Difficulty description

Suited to advanced users: complex workflows containing 16+ nodes

Author
Dataki

@dataki

I am passionate about transforming complex processes into seamless automations with n8n. My expertise spans across creating ETL pipelines, sales automations, and data & AI-driven workflows. As an avid problem solver, I thrive on optimizing workflows to drive efficiency and innovation.
