LLM Performance Benchmarking for Legal Documents with Google Sheets and OpenRouter
This is an AI automation workflow with 23 nodes. It mainly uses If, Set, Limit, Merge, Webhook, and other nodes. It reads test cases from Google Sheets, downloads the referenced PDFs from Google Drive, and has an LLM judge (via OpenRouter) mark each recorded answer as Pass or Fail.
- HTTP webhook endpoint (generated automatically by n8n)
- Google Drive API credentials
- Credentials for the target API may be required
- Google Sheets API credentials
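The webhook endpoint listed above receives one test row per request as a JSON body, which the workflow then reads from `body`. As a minimal sketch of that hand-off outside n8n, the snippet below builds (but does not send) such a POST request. The URL, the helper name `build_judge_request`, and the row values are hypothetical placeholders; only the column names come from the workflow's sheet mapping.

```python
import json
import urllib.request

# Hypothetical placeholder: n8n generates the real webhook URL for your instance.
WEBHOOK_URL = "https://n8n.example.com/webhook/your-webhook-id"


def build_judge_request(url: str, row: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying one test row as JSON."""
    body = json.dumps(row).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Made-up example values; the keys follow the columns used in the workflow.
example_row = {
    "ID": "1",
    "AI Platform": "ExampleLLM",
    "Relevant Source Reference": "msa.pdf",
    "URL": "https://drive.google.com/file/d/EXAMPLE_FILE_ID/view",
    "Input": "Identify the indemnity obligations.",
    "Output": "There are no indemnity obligations.",
}

request = build_judge_request(WEBHOOK_URL, example_row)
# Sending is left to the caller, e.g. urllib.request.urlopen(request).
```

Inside n8n the HTTP Request node does this itself; the sketch is only meant for probing the webhook by hand while testing.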
{
"meta": {
"instanceId": "45e293393b5dd8437fb351e5b1ef5511ef67e6e0826a1c10b9b68be850b67593"
},
"nodes": [
{
"id": "17f30fc7-7b73-4588-8d10-27b1a98ea795",
"name": "Bei Klick auf 'Test Workflow'",
"type": "n8n-nodes-base.manualTrigger",
"position": [
-180,
260
],
"parameters": {},
"typeVersion": 1
},
{
"id": "ca19d73c-7c73-4e3d-96cd-842fc0e4f014",
"name": "Webhook",
"type": "n8n-nodes-base.webhook",
"position": [
580,
500
],
"webhookId": "1cbce320-d28e-4e97-8663-bf2c6a36a358",
"parameters": {
"path": "1cbce320-d28e-4e97-8663-bf2c6a36a358",
"options": {},
"httpMethod": "POST",
"responseData": "allEntries",
"responseMode": "lastNode"
},
"typeVersion": 2
},
{
"id": "99b69f4c-e429-40f8-8a22-8ed5fc3c4daa",
"name": "Merge1",
"type": "n8n-nodes-base.merge",
"position": [
2180,
420
],
"parameters": {
"mode": "combine",
"options": {},
"combineBy": "combineByPosition"
},
"typeVersion": 3.1
},
{
"id": "29d94760-0dd8-4c28-b31e-a499962b14df",
"name": "Tests abrufen",
"type": "n8n-nodes-base.googleSheets",
"position": [
-20,
260
],
"parameters": {
"options": {},
"sheetName": {
"__rl": true,
"mode": "list",
"value": "gid=0",
"cachedResultUrl": "https://docs.google.com/spreadsheets/d/10l_gMtPsge00eTTltGrgvAo54qhh3_twEDsETrQLAGU/edit#gid=0",
"cachedResultName": "Tests"
},
"documentId": {
"__rl": true,
"mode": "list",
"value": "10l_gMtPsge00eTTltGrgvAo54qhh3_twEDsETrQLAGU",
"cachedResultUrl": "https://docs.google.com/spreadsheets/d/10l_gMtPsge00eTTltGrgvAo54qhh3_twEDsETrQLAGU/edit?usp=drivesdk",
"cachedResultName": "Info Extraction Tasks (LLM Judge)"
}
},
"credentials": {
"googleSheetsOAuth2Api": {
"id": "04iXS2lwUVyzn6F2",
"name": "Google Sheets account"
}
},
"typeVersion": 4.5
},
{
"id": "0853cbfc-f7fc-4097-88f1-1d08504d93d0",
"name": "Ist PDF?",
"type": "n8n-nodes-base.if",
"position": [
140,
260
],
"parameters": {
"options": {},
"conditions": {
"options": {
"version": 2,
"leftValue": "",
"caseSensitive": true,
"typeValidation": "loose"
},
"combinator": "and",
"conditions": [
{
"id": "1609d1f6-2142-4965-8d6b-01cfa53251c4",
"operator": {
"type": "string",
"operation": "contains"
},
"leftValue": "={{ $json['Relevant Source Reference'] }}",
"rightValue": ".pdf"
},
{
"id": "6b767c0f-071c-4663-a0c5-b4278b413650",
"operator": {
"type": "number",
"operation": "gt"
},
"leftValue": "={{ $json.row_number }}",
"rightValue": 0
}
]
},
"looseTypeValidation": true
},
"typeVersion": 2.2
},
{
"id": "dd3eacbb-d93d-4e2d-83ce-7853e16ab00b",
"name": "Structured Output Parser2",
"type": "@n8n/n8n-nodes-langchain.outputParserStructured",
"position": [
1800,
520
],
"parameters": {
"jsonSchemaExample": "{\n \"reasoning\": \"The Assistant fabricated a $1 million figure and a 12-month provision that are not found in the source. This breaches factual correctness and completeness. The output would mislead business stakeholders if used without correction.\",\n \"decision\": \"Fail\"\n}"
},
"typeVersion": 1.2
},
{
"id": "ccaf5c92-d95a-4cff-aa19-1f7bd7f1aa0c",
"name": "Basic LLM Chain1",
"type": "@n8n/n8n-nodes-langchain.chainLlm",
"onError": "continueErrorOutput",
"position": [
1640,
340
],
"parameters": {
"text": "=INPUT:\n\n{\n \"task\": {{ $('Save Input/Output').item.json['Input '] }},\n \"source\": {{ $json.text }},\n \"output\": {{ $('Save Input/Output').item.json['Output '] }}\n}\n\nOUTPUT:",
"messages": {
"messageValues": [
{
"message": "=You are an evaluator of LLMs in the legal domain.\n\nYou will be given:\n\n- *Source Material*: the underlying document(s) that the AI Assistant was supposed to review.\n- *AI Assistant Output*: the answer generated by the AI Assistant in response to a legal task.\n\nYou must carefully review the source material to determine whether the AI Assistant’s output is accurate, relevant, and complete.\n\n---\n\n## Accuracy Standard\n\nThe AI Assistant’s response must satisfy *all three* of the following requirements:\n\n1. *Factual Correctness* \nThe response must accurately reflect the information in the source material. \nNo hallucinated, fabricated, or incorrect information is allowed. \nIf the answer is missing from the source material, the Assistant must acknowledge this. Inventing information = Fail.\n\n2. *Relevance to the Query* \nThe response must directly answer the specific question asked, without introducing unrelated or off-topic information.\n\n3. *Completeness* \nThe response must contain enough information to fully answer the question based on the source material. 
\nOmitting critical points that are needed to address the query = Fail.\n\n*Key Rule:* \n- If the output *materially fails any one* of the three requirements, the overall result must be marked *Fail*.\n- Minor phrasing or style issues that do not affect the meaning are acceptable.\n\n---\n\n## Common Failure Patterns to Watch For\n\nBe alert to the following known AI weaknesses:\n\n- Incomplete responses when the task is broad or vague.\n- Fabricated answers when the information is missing from the source.\n- Failure to correctly process or cross-reference multiple documents.\n- Reinforcing incorrect assumptions made in the question without verifying them.\n- Technical failures (e.g., missing pages, unreadable scans) that affect the output.\n- Ignoring contradictory information in the source.\n\nIf any of these issues occur and affect the substantive quality of the response, the correct result is *Fail*.\n\n---\n\n## How to Structure Your Evaluation\n\n*Reasoning:* \nBriefly explain why you reached this decision. \nReference the source material where necessary. \nState clearly whether the AI Assistant’s output was factually correct, relevant, and complete.\n\n*Final Decision:* Pass or Fail\n\nThe output that you give should be given in a JSON format, with keys of \"reasoning\" and \"decision\" as shown in the examples below. Return your answer as a raw JSON object under a top-level key called \"output\" — no markdown, no extra text.\n\n---\n\n## Example 1\n\nINPUT:\n\n{\n \"task\": \"Extract the liability cap and time-based provisions from a limitation of liability clause.\",\n \"source\": \"- The liability cap figure is redacted.\\n- There is no 12-month time limit mentioned.\",\n \"output\": \"The liability cap is $1 million with a 12-month limit.\"\n}\n\nOUTPUT:\n\n{\n \"output\": {\n \"reasoning\": \"The Assistant fabricated a $1 million figure and a 12-month provision that are not found in the source. This breaches factual correctness and completeness. 
The output would mislead business stakeholders if used without correction.\",\n \"decision\": \"Fail\"\n }\n}\n\n## Example 2\n\nINPUT:\n\n{\n \"task\": \"Identify LinkedIn’s indemnity obligations under a Master Services Agreement.\",\n \"source\": \"- LinkedIn has no indemnity obligations under the agreement.\\n- The indemnities are provided by the vendor only.\",\n \"output\": \"LinkedIn has no indemnity obligations under this MSA.\"\n}\n\nOUTPUT:\n\n{\n \"output\": {\n \"reasoning\": \"The Assistant correctly identified that LinkedIn has no indemnity obligations, fully answering the query. The response is factually correct, relevant, and complete based on the source material.\",\n \"decision\": \"Pass\"\n }\n}"
}
]
},
"promptType": "define",
"hasOutputParser": true
},
"typeVersion": 1.4
},
{
"id": "ae455c78-78aa-4857-bfea-5aed468ce224",
"name": "Google Drive",
"type": "n8n-nodes-base.googleDrive",
"onError": "continueErrorOutput",
"position": [
1220,
360
],
"parameters": {
"fileId": {
"__rl": true,
"mode": "id",
"value": "={{ $json[\"URL\"].match(/[-\\w]{25,}/)[0] }}"
},
"options": {},
"operation": "download"
},
"credentials": {
"googleDriveOAuth2Api": {
"id": "yej6mV2w6RslwOGo",
"name": "Google Drive account"
}
},
"typeVersion": 3
},
{
"id": "d33dc3e9-1f41-44e6-8a95-83c2ab735061",
"name": "Aus Datei extrahieren",
"type": "n8n-nodes-base.extractFromFile",
"position": [
1420,
340
],
"parameters": {
"options": {},
"operation": "pdf"
},
"typeVersion": 1
},
{
"id": "a1b8590f-6ec8-477d-ad8b-b4ca2ffbdd76",
"name": "Eingabe/Ausgabe speichern",
"type": "n8n-nodes-base.set",
"position": [
940,
480
],
"parameters": {
"mode": "raw",
"options": {},
"jsonOutput": "={{ $json.body }}"
},
"typeVersion": 3.4
},
{
"id": "76f8fed7-50d5-42d9-9dda-67a8d547da90",
"name": "Subworkflow ausführen",
"type": "n8n-nodes-base.httpRequest",
"onError": "continueErrorOutput",
"maxTries": 2,
"position": [
580,
240
],
"parameters": {
"url": "https://webhook-processor-production-48f8.up.railway.app/webhook/1cbce320-d28e-4e97-8663-bf2c6a36a358",
"method": "POST",
"options": {
"batching": {
"batch": {
"batchSize": 1,
"batchInterval": 500
}
}
},
"jsonBody": "={{ $json }}",
"sendBody": true,
"specifyBody": "json"
},
"retryOnFail": false,
"typeVersion": 4.2
},
{
"id": "2644c10f-9b1f-4d91-b49e-f3114b3df205",
"name": "OpenRouter Chat Model",
"type": "@n8n/n8n-nodes-langchain.lmChatOpenRouter",
"position": [
1640,
520
],
"parameters": {
"model": "openai/gpt-4.1",
"options": {}
},
"credentials": {
"openRouterApi": {
"id": "ipzDVYsZqbum9bX4",
"name": "OpenRouter account 2"
}
},
"typeVersion": 1
},
{
"id": "2d326d74-85c9-4363-b4d6-ee0b2a8abeb3",
"name": "Notizzettel3",
"type": "n8n-nodes-base.stickyNote",
"position": [
520,
20
],
"parameters": {
"color": 4,
"height": 700,
"content": "## 2. Execute Subworkflow\nThis node runs immediately (batching requests), but waits for the result before moving to the next step."
},
"typeVersion": 1
},
{
"id": "57ee0b46-fb57-48f3-80fb-51d3e1102bdf",
"name": "Notizzettel6",
"type": "n8n-nodes-base.stickyNote",
"position": [
-120,
460
],
"parameters": {
"width": 460,
"height": 280,
"content": "## Data format\nOur Tests Sheet contains the following columns:\n- ID: A unique identifier for each row\n- Test No.: The test that the LLM was given\n- AI Platform: The LLM that was given the test.\n- Relevant Source: The file name of the source document that was given to the LLM.\n- URL: The Google Drive URL where the file is stored.\n- Input: The input prompt that the LLM was given.\n- Output: The response that the LLM gave."
},
"typeVersion": 1
},
{
"id": "ddfb78af-ff3b-4c83-ac30-b2c36b217eb3",
"name": "Notizzettel7",
"type": "n8n-nodes-base.stickyNote",
"position": [
-40,
20
],
"parameters": {
"color": 6,
"width": 340,
"height": 420,
"content": "## 1. Fetch test cases\nWe start by grabbing our list of test cases stored in a Google Sheet [here](https://docs.google.com/spreadsheets/d/10l_gMtPsge00eTTltGrgvAo54qhh3_twEDsETrQLAGU/edit?usp=sharing).\n\nWe only want the rows that connect to a PDF document, as DOCX downloads will need to be handled separately."
},
"typeVersion": 1
},
{
"id": "9d6ee8f0-0d88-4100-bd2a-78bd4439b448",
"name": "Originaldaten behalten",
"type": "n8n-nodes-base.set",
"position": [
1440,
840
],
"parameters": {
"mode": "raw",
"options": {},
"jsonOutput": "={{ $json.body }}"
},
"typeVersion": 3.4
},
{
"id": "d5f9df9c-e94e-4880-89e4-2792a2757256",
"name": "Notizzettel11",
"type": "n8n-nodes-base.stickyNote",
"position": [
1180,
20
],
"parameters": {
"color": 4,
"width": 360,
"height": 540,
"content": "## 3. Grab the PDF as text\nWe download the PDF from the Google Drive link in the Google Sheet, extracting the file as text for the next step. We filter out any files that do not return data."
},
"typeVersion": 1
},
{
"id": "bab6b7ec-250a-407e-a1b8-c2d1a3e1afec",
"name": "Notizzettel12",
"type": "n8n-nodes-base.stickyNote",
"position": [
840,
20
],
"parameters": {
"color": 6,
"width": 260,
"height": 380,
"content": "## 5. Update results\nWe create a new row in our output sheet, containing our original data together with the judge decision/reasoning."
},
"typeVersion": 1
},
{
"id": "7c0216ab-f5e7-4eb1-a3bd-3c71c72f1c39",
"name": "Limit (zum Testen)",
"type": "n8n-nodes-base.limit",
"disabled": true,
"position": [
360,
240
],
"parameters": {
"maxItems": 3
},
"typeVersion": 1
},
{
"id": "7320f4c7-cccb-465d-a103-eb76cc51feb0",
"name": "Notizzettel13",
"type": "n8n-nodes-base.stickyNote",
"position": [
1600,
20
],
"parameters": {
"color": 4,
"width": 360,
"height": 660,
"content": "## 4. Judge LLM outputs\nOur prompt judges the LLM input/output and decides if the LLM passed the test. We also ask for a reason why the judge made its decision, which we can use to refine our eval later.\n\nWe're using OpenRouter here, which lets us easily tweak which LLM we want to use.\n\nThe output parser makes sure that the output is in JSON format, making the data easy to parse in the next step."
},
"typeVersion": 1
},
{
"id": "b7d997a9-acf6-415d-a4d3-30a4985f53f7",
"name": "Notizzettel",
"type": "n8n-nodes-base.stickyNote",
"position": [
-440,
240
],
"parameters": {
"width": 180,
"height": 200,
"content": "## Start Here\nMake sure to click \"Execute Workflow\" here, rather than underneath, as that will set the webhook in listening mode."
},
"typeVersion": 1
},
{
"id": "de440bcf-2ff2-46fb-958d-158fecc1c451",
"name": "Notizzettel14",
"type": "n8n-nodes-base.stickyNote",
"position": [
2120,
20
],
"parameters": {
"color": 4,
"width": 220,
"height": 600,
"content": "## 5. Combine data and return\nReturn the result of the subworkflow back to our HTTP request.\n\nWe are merging our pass/fail + reason together with the original data that was passed in the body of our HTTP request, so we still have access to the other data here."
},
"typeVersion": 1
},
{
"id": "73f53c55-3374-4759-be24-28f54b795886",
"name": "Ergebnisse aktualisieren",
"type": "n8n-nodes-base.googleSheets",
"position": [
920,
220
],
"parameters": {
"columns": {
"value": {
"ID": "={{ $json['ID'] }}",
"URL": "={{ $json['URL'] }}",
"Input": "={{ $json['Input'] }}",
"Output": "={{ $json['Output'] }}",
"Decision": "={{ $json.output.decision }}",
"Test No.": "={{ $json['Test No'][\"\"] }}",
"Reasoning": "={{ $json.output.reasoning }}",
"AI Platform": "={{ $json['AI Platform'] }}",
"Relevant Source Reference": "={{ $json['Relevant Source Reference'] }}"
},
"schema": [
{
"id": "ID",
"type": "string",
"display": true,
"removed": false,
"required": false,
"displayName": "ID",
"defaultMatch": false,
"canBeUsedToMatch": true
},
{
"id": "Test No.",
"type": "string",
"display": true,
"removed": false,
"required": false,
"displayName": "Test No.",
"defaultMatch": false,
"canBeUsedToMatch": true
},
{
"id": "AI Platform",
"type": "string",
"display": true,
"required": false,
"displayName": "AI Platform",
"defaultMatch": false,
"canBeUsedToMatch": true
},
{
"id": "Relevant Source Reference",
"type": "string",
"display": true,
"removed": false,
"required": false,
"displayName": "Relevant Source Reference",
"defaultMatch": false,
"canBeUsedToMatch": true
},
{
"id": "URL",
"type": "string",
"display": true,
"removed": false,
"required": false,
"displayName": "URL",
"defaultMatch": false,
"canBeUsedToMatch": true
},
{
"id": "Input",
"type": "string",
"display": true,
"required": false,
"displayName": "Input",
"defaultMatch": false,
"canBeUsedToMatch": true
},
{
"id": "Output",
"type": "string",
"display": true,
"required": false,
"displayName": "Output",
"defaultMatch": false,
"canBeUsedToMatch": true
},
{
"id": "Decision",
"type": "string",
"display": true,
"removed": false,
"required": false,
"displayName": "Decision",
"defaultMatch": false,
"canBeUsedToMatch": true
},
{
"id": "Reasoning",
"type": "string",
"display": true,
"required": false,
"displayName": "Reasoning",
"defaultMatch": false,
"canBeUsedToMatch": true
}
],
"mappingMode": "defineBelow",
"matchingColumns": [
"ID"
],
"attemptToConvertTypes": false,
"convertFieldsToString": false
},
"options": {},
"operation": "appendOrUpdate",
"sheetName": {
"__rl": true,
"mode": "list",
"value": 537199982,
"cachedResultUrl": "https://docs.google.com/spreadsheets/d/10l_gMtPsge00eTTltGrgvAo54qhh3_twEDsETrQLAGU/edit#gid=537199982",
"cachedResultName": "Results"
},
"documentId": {
"__rl": true,
"mode": "list",
"value": "10l_gMtPsge00eTTltGrgvAo54qhh3_twEDsETrQLAGU",
"cachedResultUrl": "https://docs.google.com/spreadsheets/d/10l_gMtPsge00eTTltGrgvAo54qhh3_twEDsETrQLAGU/edit?usp=drivesdk",
"cachedResultName": "Info Extraction Tasks (LLM Judge)"
}
},
"credentials": {
"googleSheetsOAuth2Api": {
"id": "04iXS2lwUVyzn6F2",
"name": "Google Sheets account"
}
},
"typeVersion": 4.5
}
],
"pinData": {},
"connections": {
"0853cbfc-f7fc-4097-88f1-1d08504d93d0": {
"main": [
[
{
"node": "7c0216ab-f5e7-4eb1-a3bd-3c71c72f1c39",
"type": "main",
"index": 0
}
]
]
},
"ca19d73c-7c73-4e3d-96cd-842fc0e4f014": {
"main": [
[
{
"node": "9d6ee8f0-0d88-4100-bd2a-78bd4439b448",
"type": "main",
"index": 0
},
{
"node": "a1b8590f-6ec8-477d-ad8b-b4ca2ffbdd76",
"type": "main",
"index": 0
}
]
]
},
"29d94760-0dd8-4c28-b31e-a499962b14df": {
"main": [
[
{
"node": "0853cbfc-f7fc-4097-88f1-1d08504d93d0",
"type": "main",
"index": 0
}
]
]
},
"ae455c78-78aa-4857-bfea-5aed468ce224": {
"main": [
[
{
"node": "d33dc3e9-1f41-44e6-8a95-83c2ab735061",
"type": "main",
"index": 0
}
]
]
},
"ccaf5c92-d95a-4cff-aa19-1f7bd7f1aa0c": {
"main": [
[
{
"node": "99b69f4c-e429-40f8-8a22-8ed5fc3c4daa",
"type": "main",
"index": 0
}
]
]
},
"d33dc3e9-1f41-44e6-8a95-83c2ab735061": {
"main": [
[
{
"node": "ccaf5c92-d95a-4cff-aa19-1f7bd7f1aa0c",
"type": "main",
"index": 0
}
]
]
},
"a1b8590f-6ec8-477d-ad8b-b4ca2ffbdd76": {
"main": [
[
{
"node": "ae455c78-78aa-4857-bfea-5aed468ce224",
"type": "main",
"index": 0
}
]
]
},
"9d6ee8f0-0d88-4100-bd2a-78bd4439b448": {
"main": [
[
{
"node": "99b69f4c-e429-40f8-8a22-8ed5fc3c4daa",
"type": "main",
"index": 1
}
]
]
},
"76f8fed7-50d5-42d9-9dda-67a8d547da90": {
"main": [
[
{
"node": "73f53c55-3374-4759-be24-28f54b795886",
"type": "main",
"index": 0
}
]
]
},
"7c0216ab-f5e7-4eb1-a3bd-3c71c72f1c39": {
"main": [
[
{
"node": "76f8fed7-50d5-42d9-9dda-67a8d547da90",
"type": "main",
"index": 0
}
]
]
},
"2644c10f-9b1f-4d91-b49e-f3114b3df205": {
"ai_languageModel": [
[
{
"node": "ccaf5c92-d95a-4cff-aa19-1f7bd7f1aa0c",
"type": "ai_languageModel",
"index": 0
}
]
]
},
"dd3eacbb-d93d-4e2d-83ce-7853e16ab00b": {
"ai_outputParser": [
[
{
"node": "ccaf5c92-d95a-4cff-aa19-1f7bd7f1aa0c",
"type": "ai_outputParser",
"index": 0
}
]
]
},
"17f30fc7-7b73-4588-8d10-27b1a98ea795": {
"main": [
[
{
"node": "29d94760-0dd8-4c28-b31e-a499962b14df",
"type": "main",
"index": 0
}
]
]
}
}
}
How do I use this workflow?
Copy the JSON above, create a new workflow in your n8n instance, and choose "Import from JSON". Paste the configuration and adjust the credentials as needed.
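When pointing the imported workflow at your own sheet, it can help to check rows against the two conditions the workflow relies on: the "Is PDF?" filter (the `Relevant Source Reference` contains `.pdf` and `row_number` is greater than 0) and the Google Drive node's file ID expression, which takes the first run of 25 or more word characters or hyphens from the `URL` column. The following is a rough sketch of both checks in Python. The function names are hypothetical, and unlike the n8n expression, which would throw when nothing matches, the sketch returns `None`.

```python
import re
from typing import Optional

# Same pattern as the Google Drive node's fileId expression: /[-\w]{25,}/
# re.ASCII keeps \w limited to [A-Za-z0-9_], as in JavaScript.
FILE_ID_PATTERN = re.compile(r"[-\w]{25,}", re.ASCII)


def extract_drive_file_id(url: str) -> Optional[str]:
    """Return the first run of 25+ word characters or hyphens, or None if absent."""
    match = FILE_ID_PATTERN.search(url)
    return match.group(0) if match else None


def is_pdf_row(row: dict) -> bool:
    """Mirror the 'Is PDF?' conditions: reference contains '.pdf' and row_number > 0."""
    reference = str(row.get("Relevant Source Reference", ""))
    try:
        row_number = float(row.get("row_number", 0))
    except (TypeError, ValueError):
        return False
    return ".pdf" in reference and row_number > 0
```

The check is case-sensitive, as in the workflow, so a reference ending in `.PDF` would be skipped; DOCX sources are filtered out on purpose and need separate handling.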
Which scenarios is this workflow suited to?
Expert - Artificial Intelligence
Is it paid?
This workflow is completely free. Note, however, that third-party services used in the workflow (such as OpenRouter) may incur charges.
Adam Janes
@adamjanes
I am a product-minded technologist with hacker DNA building things in AI automation. I have a broad and varied background - having worked in Product, Design, and Sales - combined with deep technical experience as a Senior Developer and Fractional CTO. I am also a best-selling Udemy instructor (with 25K+ students), and founder of WOOFCODE - a free coding camp for fullstack developers. I practice non-violent communication, motivational interviewing, and Tibetan Buddhist meditation.