Invoice Data Extraction from PDF to JSON

Intermediate

This is an automation workflow in the Miscellaneous, AI Summarization, and Multimodal AI categories. It contains 10 nodes and mainly uses the Set, Xml, FormTrigger, ExtractFromFile, and GoogleGemini nodes. It extracts invoice data from a PDF into JSON using Gemini AI and an XML conversion step.
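
As a rough illustration of the first half of the flow, the sketch below mirrors in plain JavaScript what the `Limpio data` node and the prompt of the `Enviar mensaje a modelo` node do in the configuration further down: line breaks in the extracted PDF text are replaced with spaces, and the cleaned text is placed into the prompt together with the XML template. The function names, the shortened template, and the sample invoice text are hypothetical; only the `replace` call and the prompt wording come from the workflow.

```javascript
// Mirrors the "Limpio data" Set node: flatten line breaks in the extracted PDF text.
function cleanText(text) {
  return text.replace(/\n/g, ' ');
}

// Mirrors the prompt template of the "Enviar mensaje a modelo" node.
function buildPrompt(estructuraXML, textLimpio) {
  return `Considera la transcripcion del invoice adjunta, reescribela como un XML siguiendo este esquema:\n\n${estructuraXML}\n\nInvoice:\n\n${textLimpio}`;
}

// Hypothetical sample data, much shorter than a real invoice or the real template.
const sampleText = 'Invoice INV-001\nDate of issue: 2024-01-15\nTotal: 100.00';
const sampleSchema = '<invoice><invoice_number>[invoice_number]</invoice_number></invoice>';

const textLimpio = cleanText(sampleText);
const prompt = buildPrompt(sampleSchema, textLimpio);
console.log(prompt);
```

In the workflow itself both values travel on the same item as `text_limpio` and `estructuraXML`, so the Gemini node can reference them through `$json`.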

Prerequisites
  • No special prerequisites; you can import and use it directly. The Gemini node does need your own Google Gemini (PaLM) API credential.
Export workflow
Copy the following JSON configuration into n8n to import and use this workflow
{
  "meta": {
    "instanceId": "d1451097bf16b4787e3f6ede2b364ece110261879ec2f0efaeba689056c0a1ab"
  },
  "nodes": [
    {
      "id": "3a0d9a6f-6e6e-44a3-9eb0-1755b01fed0c",
      "name": "Al enviar el formulario",
      "type": "n8n-nodes-base.formTrigger",
      "position": [
        672,
        -480
      ],
      "webhookId": "0387941a-9e42-44ab-96ac-dde230418ac3",
      "parameters": {
        "options": {},
        "formTitle": "Test",
        "formFields": {
          "values": [
            {
              "fieldType": "file",
              "fieldLabel": "data"
            }
          ]
        }
      },
      "typeVersion": 2.3
    },
    {
      "id": "d510fda8-ceaa-4d57-8946-39a97b23f3e1",
      "name": "Extraer de archivo",
      "type": "n8n-nodes-base.extractFromFile",
      "position": [
        832,
        -480
      ],
      "parameters": {
        "options": {},
        "operation": "pdf"
      },
      "typeVersion": 1
    },
    {
      "id": "e070def8-b13a-49fa-ae4a-e366d1f474da",
      "name": "Enviar mensaje a modelo",
      "type": "@n8n/n8n-nodes-langchain.googleGemini",
      "position": [
        704,
        -240
      ],
      "parameters": {
        "modelId": {
          "__rl": true,
          "mode": "list",
          "value": "models/gemma-3n-e4b-it",
          "cachedResultName": "models/gemma-3n-e4b-it"
        },
        "options": {},
        "messages": {
          "values": [
            {
              "content": "=Considera la transcripcion del invoice adjunta, reescribela como un XML siguiendo este esquema:\n\n{{ $json.estructuraXML }}\n\nInvoice:\n\n{{ $json.text_limpio }}"
            }
          ]
        }
      },
      "credentials": {
        "googlePalmApi": {
          "id": "d4exk6UjdeHXH93h",
          "name": "Google Gemini(PaLM) Api account 2"
        }
      },
      "typeVersion": 1
    },
    {
      "id": "4e435b5b-95da-4b6a-a888-c2f74cd96cd1",
      "name": "Limpio data",
      "type": "n8n-nodes-base.set",
      "position": [
        1104,
        -480
      ],
      "parameters": {
        "options": {},
        "assignments": {
          "assignments": [
            {
              "id": "ad0e7b3d-4011-4bfb-851e-c049883dc00a",
              "name": "text_limpio",
              "type": "string",
              "value": "={{ $json.text.replace(/\\n/g, ' ') }}"
            },
            {
              "id": "e0b6ea3e-17d6-4c18-a5f5-1b2cf98b4ddb",
              "name": "estructuraXML",
              "type": "string",
              "value": "<invoice>\n    <invoice_number>[invoice_number]</invoice_number>\n    <date_of_issue>[date_of_issue]</date_of_issue>\n    <due_date>[due_date]</due_date>\n\n    <billed_to>\n        <company_name>[billed_to.company_name]</company_name>\n        <contact_name>[billed_to.contact_name]</contact_name>\n        <address>[billed_to.address]</address>\n        <postal_code>[billed_to.postal_code]</postal_code>\n        <city>[billed_to.city]</city>\n        <state>[billed_to.state]</state>\n        <country>[billed_to.country]</country>\n        <rfc>[billed_to.rfc]</rfc>\n    </billed_to>\n\n    <from>\n        <company_name>[from.company_name]</company_name>\n        <address>[from.address]</address>\n        <postal_code>[from.postal_code]</postal_code>\n        <city>[from.city]</city>\n        <state>[from.state]</state>\n        <country>[from.country]</country>\n        <rfc>[from.rfc]</rfc>\n    </from>\n\n    <purchase_order>[purchase_order]</purchase_order>\n\n    <items>\n        <item>\n            <description>[item.description]</description>\n            <unit_cost>[item.unit_cost]</unit_cost>\n            <quantity>[item.quantity]</quantity>\n            <amount>[item.amount]</amount>\n        </item>\n        </items>\n\n    <bank_account_details>\n        <account_holder_name>[bank_account_details.account_holder_name]</account_holder_name>\n        <account_number>[bank_account_details.account_number]</account_number>\n        <routing_number>[bank_account_details.routing_number]</routing_number>\n        <swift_code>[bank_account_details.swift_code]</swift_code>\n        <bank_name>[bank_account_details.bank_name]</bank_name>\n        <currency>[bank_account_details.currency]</currency>\n    </bank_account_details>\n\n    <financials>\n        <subtotal>[subtotal]</subtotal>\n        <tax_rate>[tax_rate]</tax_rate>\n        <tax_amount>[tax_amount]</tax_amount>\n        <shipping_cost>[shipping_cost]</shipping_cost>\n        
<invoice_total>[invoice_total]</invoice_total>\n    </financials>\n</invoice>"
            }
          ]
        }
      },
      "typeVersion": 3.4
    },
    {
      "id": "93fd56a6-33f9-4ac2-88b2-72157beb871f",
      "name": "Limpio XML",
      "type": "n8n-nodes-base.set",
      "position": [
        1040,
        -240
      ],
      "parameters": {
        "options": {},
        "assignments": {
          "assignments": [
            {
              "id": "ddaad091-c54e-44d9-bf05-604e3bf43caa",
              "name": "factura_limpia",
              "type": "string",
              "value": "={{ $json.content.parts[0].text.replace('```xml', '').replace('```', '').replace(/(\\n|\\s{2,})/g, '').replace(/(\\s<)/g, '<').replace(/(>\\s)/g, '>') }}"
            }
          ]
        }
      },
      "typeVersion": 3.4
    },
    {
      "id": "9d96dd97-9048-4a6f-b11c-52c30a6d3fa3",
      "name": "XML a JSON",
      "type": "n8n-nodes-base.xml",
      "position": [
        1200,
        -240
      ],
      "parameters": {
        "options": {
          "trim": false,
          "normalize": false,
          "normalizeTags": false
        },
        "dataPropertyName": "factura_limpia"
      },
      "typeVersion": 1
    },
    {
      "id": "ee4365f4-08b5-42de-afb7-6a187272fabb",
      "name": "Nota adhesiva",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        624,
        -544
      ],
      "parameters": {
        "color": 4,
        "width": 352,
        "height": 240,
        "content": "## PDF to text"
      },
      "typeVersion": 1
    },
    {
      "id": "e6bdaed7-1cee-4412-86c8-c7409ac1231e",
      "name": "Nota adhesiva1",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        976,
        -544
      ],
      "parameters": {
        "color": 2,
        "width": 368,
        "height": 240,
        "content": "## Clean data and XML structure definition"
      },
      "typeVersion": 1
    },
    {
      "id": "26faacbb-3464-46fe-8e1f-cd105942d179",
      "name": "Nota adhesiva2",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        624,
        -304
      ],
      "parameters": {
        "color": 3,
        "width": 352,
        "height": 256,
        "content": "## Generate XML string"
      },
      "typeVersion": 1
    },
    {
      "id": "33493f4d-a615-4a80-8727-7ebba208f215",
      "name": "Nota adhesiva3",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        976,
        -304
      ],
      "parameters": {
        "color": 5,
        "width": 368,
        "height": 256,
        "content": "## String to XML to Json"
      },
      "typeVersion": 1
    }
  ],
  "pinData": {},
  "connections": {
    "93fd56a6-33f9-4ac2-88b2-72157beb871f": {
      "main": [
        [
          {
            "node": "9d96dd97-9048-4a6f-b11c-52c30a6d3fa3",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "4e435b5b-95da-4b6a-a888-c2f74cd96cd1": {
      "main": [
        [
          {
            "node": "e070def8-b13a-49fa-ae4a-e366d1f474da",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "e070def8-b13a-49fa-ae4a-e366d1f474da": {
      "main": [
        [
          {
            "node": "93fd56a6-33f9-4ac2-88b2-72157beb871f",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "d510fda8-ceaa-4d57-8946-39a97b23f3e1": {
      "main": [
        [
          {
            "node": "4e435b5b-95da-4b6a-a888-c2f74cd96cd1",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "3a0d9a6f-6e6e-44a3-9eb0-1755b01fed0c": {
      "main": [
        [
          {
            "node": "d510fda8-ceaa-4d57-8946-39a97b23f3e1",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}
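
The `Limpio XML` node in the configuration above prepares the model's reply for the `XML a JSON` node. The sketch below reproduces its chain of `replace` calls in plain JavaScript so the effect of each step is visible. The function name `cleanModelReply` and the sample reply are hypothetical, and the Markdown fence marker is built with `repeat` only to keep the listing readable; the replace patterns themselves are taken from the workflow.

```javascript
// Three backticks, i.e. the Markdown code-fence marker the model may wrap its answer in.
const FENCE = '`'.repeat(3);

// Mirrors the expression in the "Limpio XML" Set node.
function cleanModelReply(text) {
  return text
    .replace(FENCE + 'xml', '')     // drop the opening fence (first occurrence only)
    .replace(FENCE, '')             // drop the closing fence (first remaining occurrence)
    .replace(/(\n|\s{2,})/g, '')    // delete line breaks and runs of 2+ whitespace characters
    .replace(/(\s<)/g, '<')         // delete a single whitespace character before a tag
    .replace(/(>\s)/g, '>');        // delete a single whitespace character after a tag
}

// Hypothetical model reply: indented XML wrapped in a Markdown fence.
const sampleReply = [
  FENCE + 'xml',
  '<invoice>',
  '    <invoice_number>INV-001</invoice_number>',
  '    <invoice_total>100.00</invoice_total>',
  '</invoice>',
  FENCE,
].join('\n');

const facturaLimpia = cleanModelReply(sampleReply);
console.log(facturaLimpia);
```

One side effect worth knowing: the third `replace` deletes any run of two or more whitespace characters, including inside element values, so a value such as `Acme  Corp` (two spaces) would come out as `AcmeCorp`.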
Frequently asked questions

How do I use this workflow?

Copy the JSON configuration above, create a new workflow in your n8n instance, and select "Import from JSON". Paste the configuration, then adjust the credential settings as needed.
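
Before importing, it can help to confirm that the pasted JSON parses and that its connections point at real nodes. The sketch below assumes n8n's export convention in which `connections` is keyed by node name and each target's `node` field is also a node name. The function name `findDanglingConnections` and the two-node sample are hypothetical.

```javascript
// Returns every connection endpoint that does not match a node name in the workflow.
function findDanglingConnections(workflow) {
  const names = new Set(workflow.nodes.map((node) => node.name));
  const dangling = [];
  for (const [source, outputs] of Object.entries(workflow.connections)) {
    if (!names.has(source)) dangling.push(source);
    for (const branches of Object.values(outputs)) {
      for (const branch of branches) {
        // An output branch with no targets may be null or empty.
        for (const target of branch || []) {
          if (!names.has(target.node)) dangling.push(target.node);
        }
      }
    }
  }
  return dangling;
}

// Hypothetical two-node excerpt; JSON.parse throws if the text is not valid JSON.
const exampleWorkflow = JSON.parse(`{
  "nodes": [
    { "name": "Limpio XML", "type": "n8n-nodes-base.set" },
    { "name": "XML a JSON", "type": "n8n-nodes-base.xml" }
  ],
  "connections": {
    "Limpio XML": { "main": [[{ "node": "XML a JSON", "type": "main", "index": 0 }]] }
  }
}`);

console.log(findDanglingConnections(exampleWorkflow)); // prints []
```

A non-empty result lists the names that would leave nodes unconnected after import.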

Which scenarios is this workflow suited for?

Intermediate - Miscellaneous, AI Summarization, Multimodal AI

Is it paid?

This workflow is completely free; you can import and use it directly. Note, however, that third-party services used in the workflow (such as the Google Gemini API) may require payment on your own account.

Workflow information
Difficulty level: Intermediate
Number of nodes: 10
Categories: 3
Node types: 6
Difficulty description

Suitable for users with intermediate experience; medium-complexity workflows with 6-15 nodes

Author
Mauricio Perera

@rckflr

Automation consultant with over 10 years of experience specializing in AI, no-code, and workflow optimization. I’ve delivered tailored AI and NLP solutions across real estate, healthcare, and more, enhancing efficiency and customer experiences. Proficient in tools like Make, Airtable, and Zapier, I also integrate GPT models to create scalable, innovative automations. Contact me to discuss custom n8n workflows or advanced automations to streamline your processes.
