RAG with OpenAI Embedding in BigQuery
This is an automation workflow in the Miscellaneous, AI RAG, and Multimodal AI categories with 24 nodes. It mainly uses the Set, HttpRequest, Agent, ChatTrigger, and LmChatOpenAi nodes, among others. It answers questions about documents using BigQuery RAG and OpenAI.
- Credentials for the target API may be required
- OpenAI API key
{
"id": "",
"meta": {
"instanceId": ""
},
"name": "BigQuery RAG With OpenAI Embedding",
"tags": [],
"nodes": [
{
"id": "69f19613-1e74-43dc-9f2e-c95c2db385e3",
"name": "KI-Agent",
"type": "@n8n/n8n-nodes-langchain.agent",
"position": [
-960,
64
],
"parameters": {
"options": {
"systemMessage": "You are an assistant specialized in answering questions about **n8n**. \nUse the connected vector store to retrieve relevant documentation and generate clear, structured, and helpful answers.\n\n## Instructions\n\n1. **Document Retrieval** \n - Query the connected vector store to gather information from the n8n documentation. \n - Base every answer on the retrieved content. \n\n2. **Answer Formatting** \n - Write all answers in **Markdown**. \n - Structure responses with clear sections, bullet points, or lists when useful. \n - Provide direct excerpts or explanations from the retrieved documents. \n\n3. **Images** \n - Include screenshots from the documentation when they provide value to the user. \n - Use Markdown syntax for images. \n - Focus on images that illustrate functionality or clarify instructions. \n\n4. **Code** \n - Present code snippets as **Markdown code blocks**. \n - Preserve the original content of the snippet. \n - Add language hints when available (e.g., ` ```javascript `). \n\n5. **Completeness** \n - Provide accurate, context-aware answers. \n - Indicate clearly when the retrieved documents do not contain enough information. "
}
},
"typeVersion": 2.2
},
{
"id": "e198291b-f8b0-455e-8c40-bf5a9cadef70",
"name": "Bei Chatnachrichtenempfang",
"type": "@n8n/n8n-nodes-langchain.chatTrigger",
"position": [
-1408,
64
],
"webhookId": "dc7d2421-f489-4eea-b3fb-6b69ede9beed",
"parameters": {
"options": {}
},
"typeVersion": 1.3
},
{
"id": "53c039bc-4788-49f3-afc8-354498da9e44",
"name": "OpenAI Chat-Modell",
"type": "@n8n/n8n-nodes-langchain.lmChatOpenAi",
"position": [
-1152,
400
],
"parameters": {
"model": {
"__rl": true,
"mode": "list",
"value": "gpt-4.1-mini",
"cachedResultName": "gpt-4.1-mini"
},
"options": {}
},
"credentials": {
"openAiApi": {
"id": "",
"name": "OpenAi Account"
}
},
"typeVersion": 1.2
},
{
"id": "e9a99d6d-e191-4ecf-a271-1a9ebb07e02a",
"name": "Einfacher Speicher",
"type": "@n8n/n8n-nodes-langchain.memoryBufferWindow",
"position": [
-848,
400
],
"parameters": {},
"typeVersion": 1.3
},
{
"id": "8acb8e42-c6e3-4254-a53d-077bbf9f4065",
"name": "Bei Ausführung durch anderen Workflow",
"type": "n8n-nodes-base.executeWorkflowTrigger",
"position": [
-1488,
1264
],
"parameters": {
"workflowInputs": {
"values": [
{
"name": "vector_search_question"
}
]
}
},
"typeVersion": 1.1
},
{
"id": "c97a255c-3328-4837-bf45-7a3f5cc7c5b7",
"name": "BigQuery RAG OpenAI",
"type": "@n8n/n8n-nodes-langchain.toolWorkflow",
"position": [
-512,
384
],
"parameters": {
"workflowId": {
"__rl": true,
"mode": "list",
"value": "7tqwzCyxnnu8Svq3",
"cachedResultName": "BigQuery RAG With OpenAI Embedding"
},
"description": "Call this tool to get documents from the vector databas",
"workflowInputs": {
"value": {
"vector_search_question": "={{ /*n8n-auto-generated-fromAI-override*/ $fromAI('vector_search_question', `A natural language question used to query the vector database containing the documentation.\n`, 'string') }}"
},
"schema": [
{
"id": "vector_search_question",
"type": "string",
"display": true,
"required": false,
"displayName": "vector_search_question",
"defaultMatch": false,
"canBeUsedToMatch": true
}
],
"mappingMode": "defineBelow",
"matchingColumns": [
"vector_search_question"
],
"attemptToConvertTypes": false,
"convertFieldsToString": false
}
},
"typeVersion": 2.2
},
{
"id": "dde0a97a-4495-42c2-837d-f5596f2bfbfe",
"name": "Notizzettel",
"type": "n8n-nodes-base.stickyNote",
"position": [
-1536,
-128
],
"parameters": {
"color": 3,
"width": 384,
"height": 432,
"content": "## When Chat Message Received\n\nWhen a chat message is received, it triggers the workflow. \nThe message is then forwarded to the **AI Agent**.\n"
},
"typeVersion": 1
},
{
"id": "62bfaf5a-5b7f-430d-9bf6-c1d745e6a2c3",
"name": "Notizzettel1",
"type": "n8n-nodes-base.stickyNote",
"position": [
-1120,
-128
],
"parameters": {
"color": 7,
"width": 544,
"height": 432,
"content": "## AI Agent\n\nThe AI Agent receives the user input. \nIt is configured with a system prompt that instructs it to use the connected tool (**BigQuery RAG OpenAI**) to query the vector database. \nUsing the retrieved documents, it then generates an answer and formats the response in **Markdown**.\n"
},
"typeVersion": 1
},
{
"id": "dc49cd0c-3b52-475c-a423-7d8fd3914b4c",
"name": "Notizzettel2",
"type": "n8n-nodes-base.stickyNote",
"position": [
-1248,
368
],
"parameters": {
"color": 7,
"width": 288,
"height": 480,
"content": "\n\n\n\n\n\n\n\n\n\n\n\n## OpenAI Chat Model\n\nThe default model is **gpt-4.1-mini**, chosen for its cost efficiency.\n"
},
"typeVersion": 1
},
{
"id": "5a6c7c06-2021-4ad3-9e16-482e230b298d",
"name": "Notizzettel3",
"type": "n8n-nodes-base.stickyNote",
"position": [
-928,
368
],
"parameters": {
"color": 7,
"width": 288,
"height": 480,
"content": "\n\n\n\n\n\n\n\n\n\n\n\n\n## Simple Memory\n\nProvides the AI Agent with context from the ongoing conversation. \n\nFor production use, this node can be replaced with a more robust option, such as [Postgres Chat Memory](https://docs.n8n.io/integrations/builtin/cluster-nodes/sub-nodes/n8n-nodes-langchain.memorypostgreschat/).\n"
},
"typeVersion": 1
},
{
"id": "7257c13a-e3f1-4602-807a-2ade1b14bcfc",
"name": "Notizzettel4",
"type": "n8n-nodes-base.stickyNote",
"position": [
-608,
368
],
"parameters": {
"color": 7,
"width": 288,
"height": 480,
"content": "\n\n\n\n\n\n\n\n\n\n\n\n\n## BigQuery RAG OpenAI\n\nThis tool, used by the AI Agent, triggers a subworkflow that retrieves documents from the BigQuery vector store. \n\nIt takes **`vector_search_question`** as input — the natural language question formulated by the AI Agent. \nThis input is queried against the BigQuery vector store to fetch relevant documents, which are then used to generate the final answer.\n"
},
"typeVersion": 1
},
{
"id": "3d718a18-8cb5-4a8b-b968-fb58502852bb",
"name": "Notizzettel5",
"type": "n8n-nodes-base.stickyNote",
"position": [
-1568,
912
],
"parameters": {
"color": 3,
"width": 288,
"height": 512,
"content": "## When Executed by Another Workflow\n\nThis subworkflow is triggered when the AI Agent calls the **BigQuery RAG OpenAI** tool. \nIt takes **`vector_search_question`** as input — the natural language query sent to the vector database to retrieve documents and answer the user’s question.\n"
},
"typeVersion": 1
},
{
"id": "a61ba929-9efc-404e-8457-39c273aeaeb6",
"name": "Notizzettel6",
"type": "n8n-nodes-base.stickyNote",
"position": [
-1248,
912
],
"parameters": {
"color": 7,
"width": 288,
"height": 512,
"content": "## Set Field – Question\n\nThis node assigns the value of **`vector_search_question`** to the field **`question`**. \n\nIt is not strictly necessary in this version of the workflow, but it can be extended to set additional fields in future versions.\n"
},
"typeVersion": 1
},
{
"id": "14298572-7b25-4d0a-8360-8f36c5d3ee0e",
"name": "Feld setzen - Frage",
"type": "n8n-nodes-base.set",
"position": [
-1152,
1264
],
"parameters": {
"options": {},
"assignments": {
"assignments": [
{
"id": "5635b058-5ab3-49fc-be4c-51bfc07630c7",
"name": "question",
"type": "string",
"value": "={{ $json.vector_search_question }}"
}
]
}
},
"typeVersion": 3.4
},
{
"id": "7166cd84-5c7b-4e10-aa40-d919c9f6387d",
"name": "OpenAI - Embedding erstellen",
"type": "n8n-nodes-base.httpRequest",
"position": [
-832,
1264
],
"parameters": {
"url": "https://api.openai.com/v1/embeddings",
"method": "POST",
"options": {},
"sendBody": true,
"authentication": "predefinedCredentialType",
"bodyParameters": {
"parameters": [
{
"name": "input",
"value": "={{ $json.question }}"
},
{
"name": "model",
"value": "text-embedding-3-large"
}
]
},
"nodeCredentialType": "openAiApi"
},
"credentials": {
"openAiApi": {
"id": "",
"name": "OpenAi Account"
}
},
"typeVersion": 4.2
},
{
"id": "00247057-0cc0-4216-b299-115d9637c8b7",
"name": "Notizzettel7",
"type": "n8n-nodes-base.stickyNote",
"position": [
-928,
912
],
"parameters": {
"color": 7,
"width": 288,
"height": 512,
"content": "## OpenAI – Create Embedding\n\nThis node calls the OpenAI API to generate an **embedding**, which is a vector representation of the input question. \nThe embedding is then used to query the BigQuery vector store and retrieve relevant documents. \n\nIn this template, the embedding model used is **`text-embedding-3-large`**."
},
"typeVersion": 1
},
{
"id": "c6517110-de28-475c-82eb-d109ed99501c",
"name": "Notizzettel8",
"type": "n8n-nodes-base.stickyNote",
"position": [
-608,
912
],
"parameters": {
"color": 7,
"width": 288,
"height": 512,
"content": "## Set Field – Embedding\n\nThis node assigns the **`embedding`** field using the data returned by the previous API call (**OpenAI – Create Embedding**). \nThe embedding is then passed to the next API call (**BigQuery**) to perform the vector search."
},
"typeVersion": 1
},
{
"id": "79a780d1-8646-4927-93be-54168d920f85",
"name": "BigQuery - Vector Retriever - n8n docs",
"type": "n8n-nodes-base.httpRequest",
"position": [
0,
1264
],
"parameters": {
"url": "https://bigquery.googleapis.com/bigquery/v2/projects/<YOUR-PROJECT-ID>/queries",
"method": "POST",
"options": {},
"sendBody": true,
"authentication": "predefinedCredentialType",
"bodyParameters": {
"parameters": [
{
"name": "query",
"value": "=WITH query AS (\n SELECT ARRAY(\n SELECT CAST(x AS FLOAT64)\n FROM UNNEST([{{ $json.embedding.join(',') }}]) AS x\n ) AS embedding\n)\nSELECT\n t.base.text,\n t.base.metadata,\n t.distance AS cosine_distance\nFROM VECTOR_SEARCH(\n TABLE `n8n-docs-rag.n8n_docs.n8n_docs_embeddings`,\n 'embedding',\n TABLE query,\n top_k => 10,\n distance_type => 'COSINE',\n options => '{\"use_brute_force\": true}'\n) AS t\nORDER BY t.distance;"
},
{
"name": "useLegacySql",
"value": "false"
}
]
},
"nodeCredentialType": "googleBigQueryOAuth2Api"
},
"credentials": {
"googleBigQueryOAuth2Api": {
"id": "",
"name": "Google BigQuery account"
}
},
"typeVersion": 4.2
},
{
"id": "32f2544c-8e36-41a3-97c4-f6264d418be6",
"name": "Notizzettel9",
"type": "n8n-nodes-base.stickyNote",
"position": [
-288,
912
],
"parameters": {
"color": 7,
"width": 736,
"height": 512,
"content": "## BigQuery – Vector Retriever – n8n docs\n\nThis node queries the BigQuery vector table that contains part of the n8n documentation: \n`n8n-docs-rag.n8n_docs.n8n_docs_embeddings`.\n\nIn the **URL field**, replace **`<YOUR-PROJECT-ID>`** with your own project ID. \n\nThis is a small table (~40 MB), but keep in mind that BigQuery uses the *requester pays* model. \nWhen you test the workflow, the query cost is billed to your project. \nRunning 3–4 queries should be inexpensive, as BigQuery provides **1 TB of free processing per month**, unless the project has already consumed its free quota. \nMore info here: [BigQuery Pricing](https://cloud.google.com/bigquery/pricing?hl=en)\n"
},
"typeVersion": 1
},
{
"id": "bb368d56-96a4-4bf0-b3bc-703354258d1d",
"name": "Abgerufene Dokumente",
"type": "n8n-nodes-base.set",
"position": [
576,
1264
],
"parameters": {
"options": {},
"assignments": {
"assignments": [
{
"id": "3a887578-7e34-4406-b44a-17695e5b5ab4",
"name": "documents",
"type": "string",
"value": "={{ $json.rows.toJsonString() }}"
}
]
}
},
"typeVersion": 3.4
},
{
"id": "1523c1b9-e81e-4c11-87dc-f67a687965e5",
"name": "Notizzettel10",
"type": "n8n-nodes-base.stickyNote",
"position": [
480,
912
],
"parameters": {
"color": 7,
"width": 288,
"height": 512,
"content": "## Documents Retrieved\n\nThis node stores the **documents retrieved from BigQuery** (10 by default). \nThe results are then passed to the **AI Agent** to generate the final answer. \n\nThe number of documents retrieved can be adjusted by changing the value of **`top_k`** in the SQL query.\n"
},
"typeVersion": 1
},
{
"id": "3f7c43ba-b4f1-4284-a896-c320769b9a2a",
"name": "Notizzettel11",
"type": "n8n-nodes-base.stickyNote",
"position": [
-288,
1440
],
"parameters": {
"color": 5,
"width": 736,
"height": 752,
"content": "## BigQuery – Vector Retriever – n8n docs \n*Technical details*\n\nThis node runs a SQL query on the BigQuery vector database that stores part of the n8n documentation (nodes and triggers). \nReplace the sample table with the one you created for your own use case.\n\n```sql\nWITH query AS (\n SELECT ARRAY(\n SELECT CAST(x AS FLOAT64)\n FROM UNNEST([{{ $json.embedding.join(',') }}]) AS x\n ) AS embedding\n)\nSELECT\n t.base.text,\n t.base.metadata,\n t.distance AS cosine_distance\nFROM VECTOR_SEARCH(\n TABLE `n8n-docs-rag.n8n_docs.n8n_docs_embeddings`,\n 'embedding',\n TABLE query,\n top_k => 10,\n distance_type => 'COSINE',\n options => '{\"use_brute_force\": true}'\n) AS t\nORDER BY t.distance;\n```\n* Since this table is small (fewer than ~5,000 rows and under ~10 MB), BigQuery does not use a vector index and automatically falls back to brute‐force search. \nIncluding `options => '{\"use_brute_force\": true}'` explicitly enforces this mode, although BigQuery would default to it in this scenario.\n\n* The field **`distance`** represents the cosine similarity score — lower values mean the document is more relevant to the query.\n\n* `top_k => 10` retrieves the 10 documents most relevant to the user’s question. These documents are then passed to the AI Agent to generate the final answer."
},
"typeVersion": 1
},
{
"id": "5da46a82-d97a-4c8b-8f3a-51286c677a1a",
"name": "Notizzettel12",
"type": "n8n-nodes-base.stickyNote",
"position": [
-2928,
-128
],
"parameters": {
"color": 7,
"width": 1344,
"height": 1200,
"content": "# BigQuery RAG with OpenAI Embeddings\n\nThis workflow demonstrates how to use **Retrieval-Augmented Generation (RAG)** with **BigQuery** and **OpenAI**. \nBy default, you cannot directly use OpenAI Cloud Models within BigQuery.\n\n## Why this workflow?\n\nMany organizations already use BigQuery to store enterprise data, and OpenAI for LLM use cases. \nWhen it comes to RAG, the common approach is to rely on dedicated vector databases such as **Qdrant**, **Pinecone**, **Weaviate**, or PostgreSQL with **pgvector**. \nThose are good choices, but in cases where an organization already uses and is familiar with BigQuery, it can be more efficient to leverage its built-in vector capabilities for RAG.\n\nThen comes the question of the LLM. If OpenAI is the chosen provider, teams are often frustrated that it is not directly compatible with BigQuery. \nThis workflow solves that limitation.\n\n## Prerequisites\n\nTo use this workflow, you will need:\n* A good understanding of BigQuery and its vector capabilities \n* A BigQuery table containing documents and an embeddings column \n * The embeddings column must be of type **FLOAT** and mode **REPEATED** (to store arrays) \n* A data pipeline that **generates embeddings with the OpenAI API** and stores them in BigQuery\n\nThis template comes with a public table that stores part of the **n8n documentation** (about nodes and triggers), so you can try it out: \n`n8n-docs-rag.n8n_docs.n8n_docs_embeddings`\n\n## How it works\n\nThe system consists of two workflows:\n* **Main workflow** → Hosts the AI Agent, which connects to a subworkflow for RAG \n* **Subworkflow** → Queries the BigQuery vector table. The retrieved documents are then used by the AI Agent to generate an answer for the user.\n\n## Try it\n\nYou can test this workflow using the public table [`n8n-docs-rag.n8n_docs.n8n_docs_embeddings`](https://console.cloud.google.com/bigquery?ws=!1m5!1m4!4m3!1sn8n-docs-rag!2sn8n_docs!3sn8n_docs_embeddings). 
\n\n⚠️ **Important:** BigQuery uses the *requester pays* model. This table is small (~40 MB), and BigQuery includes 1 TB of free processing each month. Running 3–4 queries for testing should not incur any noticeable cost.\n"
},
"typeVersion": 1
},
{
"id": "92d2b05f-e8a3-4c64-ab65-4c64a2bc0297",
"name": "Feld setzen - Embedding",
"type": "n8n-nodes-base.set",
"position": [
-512,
1264
],
"parameters": {
"options": {},
"assignments": {
"assignments": [
{
"id": "8abb7285-f4e4-46d7-a660-cd75104d8bbd",
"name": "embedding",
"type": "array",
"value": "={{ $json.data[0].embedding }}"
}
]
}
},
"typeVersion": 3.4
}
],
"active": false,
"pinData": {
"When Executed by Another Workflow": [
{
"json": {
"vector_search_question": "How does the n8n Webhook Trigger work?"
}
}
]
},
"settings": {
"executionOrder": "v1"
},
"versionId": "",
"connections": {
"e9a99d6d-e191-4ecf-a271-1a9ebb07e02a": {
"ai_memory": [
[
{
"node": "69f19613-1e74-43dc-9f2e-c95c2db385e3",
"type": "ai_memory",
"index": 0
}
]
]
},
"53c039bc-4788-49f3-afc8-354498da9e44": {
"ai_languageModel": [
[
{
"node": "69f19613-1e74-43dc-9f2e-c95c2db385e3",
"type": "ai_languageModel",
"index": 0
}
]
]
},
"c97a255c-3328-4837-bf45-7a3f5cc7c5b7": {
"ai_tool": [
[
{
"node": "69f19613-1e74-43dc-9f2e-c95c2db385e3",
"type": "ai_tool",
"index": 0
}
]
]
},
"14298572-7b25-4d0a-8360-8f36c5d3ee0e": {
"main": [
[
{
"node": "7166cd84-5c7b-4e10-aa40-d919c9f6387d",
"type": "main",
"index": 0
}
]
]
},
"92d2b05f-e8a3-4c64-ab65-4c64a2bc0297": {
"main": [
[
{
"node": "79a780d1-8646-4927-93be-54168d920f85",
"type": "main",
"index": 0
}
]
]
},
"7166cd84-5c7b-4e10-aa40-d919c9f6387d": {
"main": [
[
{
"node": "92d2b05f-e8a3-4c64-ab65-4c64a2bc0297",
"type": "main",
"index": 0
}
]
]
},
"e198291b-f8b0-455e-8c40-bf5a9cadef70": {
"main": [
[
{
"node": "69f19613-1e74-43dc-9f2e-c95c2db385e3",
"type": "main",
"index": 0
}
]
]
},
"8acb8e42-c6e3-4254-a53d-077bbf9f4065": {
"main": [
[
{
"node": "14298572-7b25-4d0a-8360-8f36c5d3ee0e",
"type": "main",
"index": 0
}
]
]
},
"79a780d1-8646-4927-93be-54168d920f85": {
"main": [
[
{
"node": "bb368d56-96a4-4bf0-b3bc-703354258d1d",
"type": "main",
"index": 0
}
]
]
}
}
}
How do I use this workflow?
Copy the JSON code above, create a new workflow in your n8n instance, and select "Import from JSON". Paste the configuration and adjust the credentials as needed.
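Before importing, it can help to see what the subworkflow in the JSON above actually sends. It boils down to three steps: request an embedding for the question, read `data[0].embedding` from the response, and post a `VECTOR_SEARCH` query to BigQuery. The following is a minimal Python sketch of those payloads outside n8n. The function names are hypothetical and not part of the workflow, no network call is made, and authentication (OpenAI API key, Google OAuth2) is left out.

```python
from typing import Sequence

# Values taken from the workflow JSON above.
EMBEDDING_MODEL = "text-embedding-3-large"
TABLE = "n8n-docs-rag.n8n_docs.n8n_docs_embeddings"


def embedding_request_body(question: str) -> dict:
    """Body for POST https://api.openai.com/v1/embeddings (hypothetical helper)."""
    return {"input": question, "model": EMBEDDING_MODEL}


def extract_embedding(response: dict) -> list:
    """Same path the Set node reads: data[0].embedding."""
    return response["data"][0]["embedding"]


def vector_search_sql(embedding: Sequence[float], top_k: int = 10) -> str:
    """Build the query text; mirrors {{ $json.embedding.join(',') }}."""
    values = ",".join(repr(float(x)) for x in embedding)
    return f"""WITH query AS (
  SELECT ARRAY(
    SELECT CAST(x AS FLOAT64)
    FROM UNNEST([{values}]) AS x
  ) AS embedding
)
SELECT
  t.base.text,
  t.base.metadata,
  t.distance AS cosine_distance
FROM VECTOR_SEARCH(
  TABLE `{TABLE}`,
  'embedding',
  TABLE query,
  top_k => {int(top_k)},
  distance_type => 'COSINE',
  options => '{{"use_brute_force": true}}'
) AS t
ORDER BY t.distance;"""


def bigquery_request_body(embedding: Sequence[float], top_k: int = 10) -> dict:
    """Body for POST .../bigquery/v2/projects/<YOUR-PROJECT-ID>/queries."""
    return {"query": vector_search_sql(embedding, top_k), "useLegacySql": False}


if __name__ == "__main__":
    # Fake two-dimensional embedding, only to show the generated SQL.
    fake_response = {"data": [{"embedding": [0.1, -0.25]}]}
    print(vector_search_sql(extract_embedding(fake_response), top_k=5))
```

Keeping the query builder as a pure function makes it easy to swap in your own table name or `top_k` and inspect the SQL before pasting it into the HTTP Request node.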
Which scenarios is this workflow suited for?
Expert - Miscellaneous, AI RAG, Multimodal AI
Is it paid?
This workflow is completely free. Note, however, that third-party services used in the workflow (such as the OpenAI API) may incur charges.
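For the BigQuery side, the sticky notes in the workflow mention a table of about 40 MB and 1 TB of free processing per month. A rough back-of-the-envelope estimate in Python is sketched here. It assumes that each brute-force query is billed for the whole table, which is an upper bound and not a figure from the workflow, and that the free quota is 1 TiB; actual billing depends on the columns scanned and on current BigQuery pricing.

```python
# Assumptions, not stated precisely in the workflow notes:
# - the free monthly quota is treated as 1 TiB (1024**4 bytes)
# - every query is billed for the full ~40 MB table (upper bound)
FREE_BYTES_PER_MONTH = 1024**4
TABLE_BYTES = 40 * 1024**2


def queries_covered_by_free_tier(
    bytes_per_query: int, free_bytes: int = FREE_BYTES_PER_MONTH
) -> int:
    """Number of whole queries that fit into the monthly free quota."""
    if bytes_per_query <= 0:
        raise ValueError("bytes_per_query must be positive")
    return free_bytes // bytes_per_query


def share_of_free_tier(
    num_queries: int,
    bytes_per_query: int = TABLE_BYTES,
    free_bytes: int = FREE_BYTES_PER_MONTH,
) -> float:
    """Fraction of the free quota consumed by num_queries queries."""
    return num_queries * bytes_per_query / free_bytes


if __name__ == "__main__":
    print(queries_covered_by_free_tier(TABLE_BYTES))  # 26214
    print(f"{share_of_free_tier(4):.4%}")
```

Under these assumptions a handful of test queries uses well under a tenth of a percent of the free quota, provided the project has not already consumed it elsewhere.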
Dataki (@dataki)
I am passionate about transforming complex processes into seamless automations with n8n. My expertise spans creating ETL pipelines, sales automations, and data & AI-driven workflows. As an avid problem solver, I thrive on optimizing workflows to drive efficiency and innovation.