Lyrics Transcription Template

Advanced

This is a workflow containing 18 nodes. It mainly uses nodes such as If, Code, Wait, FormTrigger, and HttpRequest. It creates .SRT subtitle files and .LRC lyrics files from audio with Whisper AI and GPT-5-nano.

Prerequisites
  • May require authentication credentials for the target API
  • OpenAI API key
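
The OpenAI API key is used by the WhisperTranscribe node, which posts the uploaded audio to the transcription endpoint. As a rough sketch of that request outside n8n, the snippet below builds (without sending) the same multipart form: the URL and the four form fields come from the workflow JSON, while the function name `buildTranscriptionRequest`, the sample bytes, and the key `sk-example` are made up for illustration. It assumes Node 18 or later for the global `FormData` and `Blob`.

```javascript
// Builds (but does not send) the multipart request described by the
// WhisperTranscribe node. Hypothetical helper; requires Node 18+.
function buildTranscriptionRequest(audioBytes, filename, apiKey) {
  const form = new FormData();
  form.append("file", new Blob([audioBytes], { type: "audio/mpeg" }), filename);
  form.append("model", "whisper-1");
  form.append("response_format", "verbose_json");
  form.append("timestamp_granularities[]", "word");

  return {
    url: "https://api.openai.com/v1/audio/transcriptions",
    options: {
      method: "POST",
      // No Content-Type header here: fetch sets it, boundary included.
      headers: { Authorization: `Bearer ${apiKey}` },
      body: form,
    },
  };
}

// Placeholder bytes and key, for illustration only.
const request = buildTranscriptionRequest(
  new Uint8Array([0, 1, 2]),
  "song.mp3",
  "sk-example"
);
console.log(request.url, [...request.options.body.keys()]);
// To actually send it: const res = await fetch(request.url, request.options);
```

With `verbose_json` and word-level granularity, the response is expected to carry a `words` array with `start` and `end` times, which is what the Code nodes in the workflow read.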

Category

-
Export the workflow
Copy the following JSON configuration into n8n to import and use this workflow
{
  "id": "ym5RZpXRcp7ZnW8X",
  "meta": {
    "instanceId": "b1699e1d8ef82aaaaf2eed0ed67f215d7574a625e2d012a1bcd013054b0defdf",
    "templateCredsSetupCompleted": true
  },
  "name": "LyricsTranscribe_TEMPLATE",
  "tags": [
    {
      "id": "5WzUYUnG7iVDJG7q",
      "name": "TEMPLATE",
      "createdAt": "2025-10-13T19:43:42.665Z",
      "updatedAt": "2025-10-13T19:43:42.665Z"
    }
  ],
  "nodes": [
    {
      "id": "d3f1c98a-1ff5-47ae-a68c-432355827779",
      "name": "OpenAI Chat Model",
      "type": "@n8n/n8n-nodes-langchain.lmChatOpenAi",
      "position": [
        -544,
        176
      ],
      "parameters": {
        "model": {
          "__rl": true,
          "mode": "list",
          "value": "gpt-5-nano",
          "cachedResultName": "gpt-5-nano"
        },
        "options": {}
      },
      "credentials": {
        "openAiApi": {
          "id": "SejrVHsogrtvT4yC",
          "name": "TEMPLATE"
        }
      },
      "typeVersion": 1.2
    },
    {
      "id": "805bc1ef-55d7-4f4c-b82f-8921d7645f4e",
      "name": "WhisperTranscribe",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        -704,
        0
      ],
      "parameters": {
        "url": "https://api.openai.com/v1/audio/transcriptions",
        "method": "POST",
        "options": {},
        "sendBody": true,
        "contentType": "multipart-form-data",
        "sendHeaders": true,
        "bodyParameters": {
          "parameters": [
            {
              "name": "file",
              "parameterType": "formBinaryData",
              "inputDataFieldName": "Audio_File"
            },
            {
              "name": "model",
              "value": "whisper-1"
            },
            {
              "name": "response_format",
              "value": "verbose_json"
            },
            {
              "name": "timestamp_granularities[]",
              "value": "word"
            }
          ]
        },
        "headerParameters": {
          "parameters": [
            {
              "name": "Authorization",
              "value": "Bearer YOUR_API_KEY"
            },
            {
              "name": "Content-Type",
              "value": "multipart/form-data"
            }
          ]
        }
      },
      "typeVersion": 4.2
    },
    {
      "id": "a4760f60-c22e-4325-abf1-4e34665b4154",
      "name": "AudioInput",
      "type": "n8n-nodes-base.formTrigger",
      "position": [
        -864,
        0
      ],
      "webhookId": "9ad0442a-b661-487c-8d8a-09a54400de62",
      "parameters": {
        "options": {},
        "formTitle": "Upload audio file (max 25mb)",
        "formFields": {
          "values": [
            {
              "fieldType": "file",
              "fieldLabel": "Audio File",
              "multipleFiles": false,
              "requiredField": true,
              "acceptFileTypes": ".mp3"
            },
            {
              "fieldType": "radio",
              "fieldLabel": "QualityCheck",
              "fieldOptions": {
                "values": [
                  {
                    "option": "YES"
                  },
                  {
                    "option": "NO"
                  }
                ]
              },
              "requiredField": true
            }
          ]
        },
        "responseMode": "lastNode",
        "formDescription": "Here you can upload your audio file and you get subtitles file back."
      },
      "typeVersion": 2.3
    },
    {
      "id": "3b4b5409-5cbf-41ec-9be0-bdf4eee690d0",
      "name": "TimestampMatching",
      "type": "n8n-nodes-base.code",
      "position": [
        -64,
        176
      ],
      "parameters": {
        "jsCode": "const WhisperTranscribe = $('WhisperTranscribe').first().json;\nconst words = WhisperTranscribe[\"words\"];\nconst lyrics = $json[\"text\"];\n\nconst minWordMatchRatio = 0.7; // tolerance (0-1) for fuzzy matching per line\nconst segments = lyrics.split(/\\r?\\n/).filter(l => l.trim().length > 0);\nconst normalize = s => s.toLowerCase().replace(/[^a-z0-9']/g, \" \").trim();\nconst whisperWords = words.map(w => ({\n  ...w,\n  norm: normalize(w.word)\n}));\n\nconst result = [];\n\nlet currentIndex = 0;\nfor (const line of segments) {\n  const lineWords = normalize(line).split(/\\s+/).filter(Boolean);\n\n  // Try to match this line to consecutive words from Whisper\n  let startIndex = -1;\n  let endIndex = -1;\n  let matchCount = 0;\n\n  for (let i = currentIndex; i < whisperWords.length; i++) {\n    if (lineWords.includes(whisperWords[i].norm)) {\n      if (startIndex === -1) startIndex = i;\n      endIndex = i;\n      matchCount++;\n      if (matchCount / lineWords.length >= minWordMatchRatio) break;\n    }\n  }\n\n  if (startIndex !== -1 && endIndex !== -1) {\n    const start = whisperWords[startIndex].start;\n    const end = whisperWords[endIndex].end;\n    result.push({ start, end, text: line });\n    currentIndex = endIndex + 1;\n  } else {\n    // fallback: if not found, approximate based on previous\n    const prevEnd = result.length ? result[result.length - 1].end : words[0].start;\n    const approxEnd = prevEnd + 2.5; // arbitrary 2.5s window\n    result.push({ start: prevEnd, end: approxEnd, text: line });\n  }\n}\n\nreturn [{ json: { timedSegments: result } }];"
      },
      "typeVersion": 2
    },
    {
      "id": "3bdb855d-0dbb-496c-8d44-9050981a344a",
      "name": "SubtitlesPreparation",
      "type": "n8n-nodes-base.code",
      "position": [
        272,
        176
      ],
      "parameters": {
        "jsCode": "const PostProcessedLyrics = $('RoutingQualityCheck').first().json;\nconst plainText = PostProcessedLyrics[\"text\"];\nconst segments = $json[\"timedSegments\"];\n\nfunction toSrtTime(sec) {\n  const h = Math.floor(sec / 3600);\n  const m = Math.floor((sec % 3600) / 60);\n  const s = Math.floor(sec % 60);\n  const ms = Math.floor((sec * 1000) % 1000);\n  return `${h.toString().padStart(2, \"0\")}:${m\n    .toString()\n    .padStart(2, \"0\")}:${s.toString().padStart(2, \"0\")},${ms\n    .toString()\n    .padStart(3, \"0\")}`;\n}\n\nfunction toLrcTime(sec) {\n  const m = Math.floor(sec / 60);\n  const s = Math.floor(sec % 60);\n  const cs = Math.floor((sec % 1) * 100); // centiseconds\n  return `[${m.toString().padStart(2, \"0\")}:${s\n    .toString()\n    .padStart(2, \"0\")}.${cs.toString().padStart(2, \"0\")}]`;\n}\n\n// --- generate SRT ---\nlet srt = \"\";\nsegments.forEach((seg, i) => {\n  const start = toSrtTime(seg.start);\n  const end = toSrtTime(seg.end);\n  srt += `${i + 1}\\n${start} --> ${end}\\n${seg.text.trim()}\\n\\n`;\n});\n\n// --- generate LRC ---\nlet lrc = \"\";\nsegments.forEach(seg => {\n  const time = toLrcTime(seg.start);\n  lrc += `${time}${seg.text.trim()}\\n`;\n});\n\n// --- return both ---\nreturn [\n  {\n    json: {\n      srtContent: srt.trim(),\n      lrcContent: lrc.trim(),\n      segmentCount: segments.length,\n      plainText\n    }\n  }\n];"
      },
      "typeVersion": 2
    },
    {
      "id": "5c72a572-a67b-423c-92f4-fedbb0c0dbd6",
      "name": "QualityCheck",
      "type": "n8n-nodes-base.wait",
      "position": [
        80,
        -16
      ],
      "webhookId": "779090da-886d-406d-8a0e-daa0d91bc74b",
      "parameters": {
        "resume": "form",
        "options": {},
        "formTitle": "Lyrics Review",
        "formFields": {
          "values": [
            {
              "fieldType": "file",
              "fieldLabel": "Corrected lyrics",
              "multipleFiles": false,
              "acceptFileTypes": ".txt"
            }
          ]
        }
      },
      "typeVersion": 1.1
    },
    {
      "id": "b1ac8f7e-bcb1-4732-b549-b910ec605784",
      "name": "RoutingQualityCheck",
      "type": "n8n-nodes-base.if",
      "position": [
        -240,
        0
      ],
      "parameters": {
        "options": {},
        "conditions": {
          "options": {
            "version": 2,
            "leftValue": "",
            "caseSensitive": true,
            "typeValidation": "strict"
          },
          "combinator": "and",
          "conditions": [
            {
              "id": "38864488-adc7-429d-b65f-ed254d2eeacf",
              "operator": {
                "name": "filter.operator.equals",
                "type": "string",
                "operation": "equals"
              },
              "leftValue": "={{ $('AudioInput').item.json.QualityCheck }}",
              "rightValue": "YES"
            }
          ]
        }
      },
      "typeVersion": 2.2
    },
    {
      "id": "404c19e5-9351-4abb-aef3-900dbb7a4587",
      "name": "DiffMatch + SrcPrep",
      "type": "n8n-nodes-base.code",
      "position": [
        272,
        -16
      ],
      "parameters": {
        "jsCode": "// Load data\nconst WhisperTranscribe = $('WhisperTranscribe').first().json;\nconst words = WhisperTranscribe[\"words\"];\n\n// Load the corrected lyrics\nlet lyricsText = items[0].json.lyricsText;\nif (!lyricsText && items[0].binary && items[0].binary.Corrected_lyrics) {\n  lyricsText = Buffer.from(items[0].binary.Corrected_lyrics.data, \"base64\").toString(\"utf8\");\n}\n\n// CLEANUP: remove garbage at the end of the text\nlyricsText = lyricsText.replace(/\\t\\d+\\s*$/, '').trim();\n\n// Text normalization function\nfunction normalizeWord(word) {\n  return word\n    .toLowerCase()\n    .replace(/[.,!?;:\"\"\"''—-]/g, '')\n    .trim();\n}\n\n// Function to tokenize text into words\nfunction tokenize(text) {\n  return text\n    .replace(/\\\\n/g, ' ')\n    .replace(/\\n/g, ' ')\n    .split(/\\s+/)\n    .filter(w => w.length > 0);\n}\n\n// Get the original text from words\nconst originalText = words.map(w => w.word).join(' ');\nconst originalWords = tokenize(originalText);\nconst correctedWords = tokenize(lyricsText);\n\n// Levenshtein distance\nfunction levenshtein(a, b) {\n  const matrix = [];\n  for (let i = 0; i <= b.length; i++) {\n    matrix[i] = [i];\n  }\n  for (let j = 0; j <= a.length; j++) {\n    matrix[0][j] = j;\n  }\n  for (let i = 1; i <= b.length; i++) {\n    for (let j = 1; j <= a.length; j++) {\n      if (b.charAt(i - 1) === a.charAt(j - 1)) {\n        matrix[i][j] = matrix[i - 1][j - 1];\n      } else {\n        matrix[i][j] = Math.min(\n          matrix[i - 1][j - 1] + 1,\n          matrix[i][j - 1] + 1,\n          matrix[i - 1][j] + 1\n        );\n      }\n    }\n  }\n  return matrix[b.length][a.length];\n}\n\n// Alignment\nfunction alignWords(original, corrected, timestamps) {\n  const m = original.length;\n  const n = corrected.length;\n  \n  const dp = Array(m + 1).fill(null).map(() => Array(n + 1).fill(0));\n  const backtrack = Array(m + 1).fill(null).map(() => Array(n + 1).fill(null));\n  \n  for (let i = 0; i <= m; i++) dp[i][0] = i * -1;\n  for (let j = 0; j <= n; j++) dp[0][j] = j * -1;\n  \n  for (let i = 1; i <= m; i++) {\n    for (let j = 1; j <= n; j++) {\n      const origNorm = normalizeWord(original[i - 1]);\n      const corrNorm = normalizeWord(corrected[j - 1]);\n      \n      let matchScore = 0;\n      if (origNorm === corrNorm) {\n        matchScore = 2;\n      } else {\n        const dist = levenshtein(origNorm, corrNorm);\n        const maxLen = Math.max(origNorm.length, corrNorm.length);\n        if (maxLen > 0 && dist / maxLen < 0.3) {\n          matchScore = 1;\n        } else {\n          matchScore = -1;\n        }\n      }\n      \n      const match = dp[i - 1][j - 1] + matchScore;\n      const del = dp[i - 1][j] - 1;\n      const ins = dp[i][j - 1] - 1;\n      \n      dp[i][j] = Math.max(match, del, ins);\n      \n      if (dp[i][j] === match) backtrack[i][j] = 'match';\n      else if (dp[i][j] === del) backtrack[i][j] = 'delete';\n      else backtrack[i][j] = 'insert';\n    }\n  }\n  \n  const alignment = [];\n  let i = m, j = n;\n  \n  while (i > 0 || j > 0) {\n    if (i === 0) {\n      alignment.unshift({\n        correctedWord: corrected[j - 1],\n        originalIndex: null,\n        timestamp: null,\n        type: 'inserted'\n      });\n      j--;\n    } else if (j === 0) {\n      i--;\n    } else {\n      const action = backtrack[i][j];\n      \n      if (action === 'match') {\n        alignment.unshift({\n          correctedWord: corrected[j - 1],\n          originalIndex: i - 1,\n          timestamp: timestamps[i - 1],\n          type: normalizeWord(original[i - 1]) === normalizeWord(corrected[j - 1]) ? 'match' : 'modified'\n        });\n        i--; j--;\n      } else if (action === 'delete') {\n        i--;\n      } else {\n        let interpolatedTimestamp = null;\n        if (i > 0 && i < m) {\n          const prevTimestamp = timestamps[i - 1];\n          const nextTimestamp = timestamps[i];\n          interpolatedTimestamp = {\n            start: prevTimestamp.end,\n            end: nextTimestamp.start\n          };\n        }\n        \n        alignment.unshift({\n          correctedWord: corrected[j - 1],\n          originalIndex: null,\n          timestamp: interpolatedTimestamp,\n          type: 'inserted'\n        });\n        j--;\n      }\n    }\n  }\n  \n  return alignment;\n}\n\nconst alignment = alignWords(originalWords, correctedWords, words);\n\nconst alignedWords = alignment.map((item, index) => {\n  return {\n    word: item.correctedWord,\n    start: item.timestamp?.start ?? null,\n    end: item.timestamp?.end ?? null,\n    type: item.type,\n    originalIndex: item.originalIndex\n  };\n});\n\n// ============================================\n// GENERATE .LRC FILE\n// ============================================\n\nfunction formatLRCTime(seconds) {\n  if (seconds === null || isNaN(seconds)) return '[00:00.00]';\n  const minutes = Math.floor(seconds / 60);\n  const secs = (seconds % 60);\n  const secsStr = secs.toFixed(2).padStart(5, '0');\n  return `[${String(minutes).padStart(2, '0')}:${secsStr}]`;\n}\n\nconst lyricsLines = lyricsText\n  .replace(/\\\\n/g, '\\n')\n  .split('\\n')\n  .map(line => line.trim())\n  .filter(line => line.length > 0);\n\nconst lrcLines = [];\nlet wordIndex = 0;\n\nfor (const line of lyricsLines) {\n  const lineWords = tokenize(line);\n  \n  if (lineWords.length === 0) continue;\n  \n  let lineStart = null;\n  let matchedWords = 0;\n  const startWordIndex = wordIndex;\n  \n  for (let i = wordIndex; i < alignedWords.length && matchedWords < lineWords.length; i++) {\n    const alignedWord = alignedWords[i];\n    const normalizedAligned = normalizeWord(alignedWord.word);\n    const normalizedLine = normalizeWord(lineWords[matchedWords]);\n    \n    if (normalizedAligned === normalizedLine) {\n      if (lineStart === null && alignedWord.start !== null) {\n        lineStart = alignedWord.start;\n      }\n      matchedWords++;\n      wordIndex = i + 1;\n    }\n  }\n  \n  // Fallback for missing timestamps\n  if (lineStart === null) {\n    if (lrcLines.length > 0) {\n      const lastTime = lrcLines[lrcLines.length - 1].time;\n      lineStart = lastTime + 2;\n    } else {\n      lineStart = 0;\n    }\n  }\n  \n  lrcLines.push({\n    time: lineStart,\n    text: line\n  });\n}\n\n// Deduplicate timestamps in LRC\nconst lrcLinesDeduped = [];\nconst usedTimestamps = new Set();\n\nfor (const line of lrcLines) {\n  let adjustedTime = line.time;\n  let offset = 0;\n  \n  while (usedTimestamps.has(adjustedTime.toFixed(2))) {\n    offset += 0.01;\n    adjustedTime = line.time + offset;\n  }\n  \n  usedTimestamps.add(adjustedTime.toFixed(2));\n  lrcLinesDeduped.push({\n    time: adjustedTime,\n    text: line.text\n  });\n}\n\nlrcLinesDeduped.sort((a, b) => a.time - b.time);\nconst lrcContent = lrcLinesDeduped\n  .map(line => `${formatLRCTime(line.time)}${line.text}`)\n  .join('\\n');\n\n// ============================================\n// GENERATE .SRT FILE (IMPROVED)\n// ============================================\n\nfunction formatSRTTime(seconds) {\n  if (seconds === null || isNaN(seconds)) return '00:00:00,000';\n  const hours = Math.floor(seconds / 3600);\n  const minutes = Math.floor((seconds % 3600) / 60);\n  const secs = Math.floor(seconds % 60);\n  const ms = Math.floor((seconds % 1) * 1000);\n  \n  return `${String(hours).padStart(2, '0')}:${String(minutes).padStart(2, '0')}:${String(secs).padStart(2, '0')},${String(ms).padStart(3, '0')}`;\n}\n\nconst srtEntries = [];\nconst MIN_DURATION = 0.8;  // Minimum subtitle duration\nconst MAX_DURATION = 5.0;  // Maximum subtitle duration\nconst CHARS_PER_SECOND = 20; // Reading speed\n\nwordIndex = 0;\n\nfor (let lineIdx = 0; lineIdx < lyricsLines.length; lineIdx++) {\n  const line = lyricsLines[lineIdx];\n  const lineWords = tokenize(line);\n  \n  if (lineWords.length === 0) continue;\n  \n  let lineStart = null;\n  let lineEnd = null;\n  let matchedWords = 0;\n  let firstMatchIdx = null;\n  let lastMatchIdx = null;\n  \n  // Find all words for this line\n  for (let i = wordIndex; i < alignedWords.length && matchedWords < lineWords.length; i++) {\n    const alignedWord = alignedWords[i];\n    const normalizedAligned = normalizeWord(alignedWord.word);\n    const normalizedLine = normalizeWord(lineWords[matchedWords]);\n    \n    if (normalizedAligned === normalizedLine) {\n      if (firstMatchIdx === null) {\n        firstMatchIdx = i;\n      }\n      lastMatchIdx = i;\n      \n      if (lineStart === null && alignedWord.start !== null) {\n        lineStart = alignedWord.start;\n      }\n      if (alignedWord.end !== null) {\n        lineEnd = alignedWord.end;\n      }\n      matchedWords++;\n    }\n  }\n  \n  // Advance wordIndex past the matches\n  if (lastMatchIdx !== null) {\n    wordIndex = lastMatchIdx + 1;\n  }\n  \n  // Validate and adjust times\n  if (lineStart !== null && lineEnd !== null) {\n    // Ensure end > start\n    if (lineEnd <= lineStart) {\n      lineEnd = lineStart + MIN_DURATION;\n    }\n    \n    let duration = lineEnd - lineStart;\n    const textLength = line.length;\n    \n    // Compute the ideal duration from the text length\n    const idealDuration = Math.max(MIN_DURATION, textLength / CHARS_PER_SECOND);\n    \n    // If the duration is too short, extend it\n    if (duration < idealDuration) {\n      lineEnd = lineStart + idealDuration;\n      duration = idealDuration;\n    }\n    \n    // If it is too long, shorten it\n    if (duration > MAX_DURATION) {\n      lineEnd = lineStart + MAX_DURATION;\n      duration = MAX_DURATION;\n    }\n    \n    // Check for overlap with the next subtitle\n    if (lineIdx < lyricsLines.length - 1) {\n      // Find the start of the next line\n      let nextStart = null;\n      const nextLine = lyricsLines[lineIdx + 1];\n      const nextLineWords = tokenize(nextLine);\n      let tempMatched = 0;\n      \n      for (let i = wordIndex; i < alignedWords.length && tempMatched < nextLineWords.length; i++) {\n        const alignedWord = alignedWords[i];\n        const normalizedAligned = normalizeWord(alignedWord.word);\n        const normalizedNext = normalizeWord(nextLineWords[tempMatched]);\n        \n        if (normalizedAligned === normalizedNext) {\n          if (nextStart === null && alignedWord.start !== null) {\n            nextStart = alignedWord.start;\n            break;\n          }\n        }\n      }\n      \n      // If it would overlap the next one, shorten it and leave a small gap\n      if (nextStart !== null && lineEnd > nextStart - 0.1) {\n        lineEnd = Math.max(lineStart + MIN_DURATION, nextStart - 0.1);\n      }\n    }\n    \n    // Final validation\n    if (lineEnd > lineStart && (lineEnd - lineStart) >= 0.1) {\n      srtEntries.push({\n        start: lineStart,\n        end: lineEnd,\n        text: line\n      });\n    }\n  }\n}\n\n// Build the SRT output\nconst srtContent = srtEntries\n  .map((entry, index) => {\n    return `${index + 1}\\n${formatSRTTime(entry.start)} --> ${formatSRTTime(entry.end)}\\n${entry.text}\\n`;\n  })\n  .join('\\n');\n\n// ============================================\n// STATISTICS\n// ============================================\n\nconst stats = {\n  totalOriginal: originalWords.length,\n  totalCorrected: correctedWords.length,\n  matched: alignment.filter(a => a.type === 'match').length,\n  modified: alignment.filter(a => a.type === 'modified').length,\n  inserted: alignment.filter(a => a.type === 'inserted').length,\n  lrcLinesGenerated: lrcLinesDeduped.length,\n  srtEntriesGenerated: srtEntries.length\n};\n\nreturn [{\n  json: {\n    alignedWords: alignedWords,\n    stats: stats,\n    correctedLyrics: lyricsText,\n    lrcContent: lrcContent,\n    srtContent: srtContent\n  }\n}];"
      },
      "typeVersion": 2
    },
    {
      "id": "fcb24b3f-17c1-43a9-9966-bf0e542d1c4c",
      "name": "SRT",
      "type": "n8n-nodes-base.convertToFile",
      "position": [
        464,
        -16
      ],
      "parameters": {
        "options": {},
        "operation": "toText",
        "sourceProperty": "srtContent",
        "binaryPropertyName": "=SrtFile_ {{ $('AudioInput').item.json['Audio File'].filename }}"
      },
      "typeVersion": 1.1
    },
    {
      "id": "aa743fae-91a0-4275-a940-874399341e98",
      "name": "LRC",
      "type": "n8n-nodes-base.convertToFile",
      "position": [
        464,
        176
      ],
      "parameters": {
        "options": {},
        "operation": "toText",
        "sourceProperty": "lrcContent",
        "binaryPropertyName": "=LRC_FILE_ {{ $('AudioInput').item.json['Audio File'].filename }}"
      },
      "typeVersion": 1.1
    },
    {
      "id": "6dad3945-e25a-40b9-bac0-fd2d17a1dba7",
      "name": "TranscribedLyrics",
      "type": "n8n-nodes-base.convertToFile",
      "position": [
        -64,
        -16
      ],
      "parameters": {
        "options": {},
        "operation": "toText",
        "sourceProperty": "text",
        "binaryPropertyName": "=TRANSCRIBED_{{ $('AudioInput').item.json['Audio File'].filename }}"
      },
      "typeVersion": 1.1
    },
    {
      "id": "201a2931-ae46-43bb-85d2-0abc47ca2c98",
      "name": "PostProcessing",
      "type": "@n8n/n8n-nodes-langchain.chainLlm",
      "position": [
        -544,
        0
      ],
      "parameters": {
        "text": "={{ $json.text }}",
        "batching": {},
        "messages": {
          "messageValues": [
            {
              "message": "You are helping with preparing song lyrics for musicians. Take the following transcription and split it into lyric-like lines. Keep lines short (2–8 words), natural for singing/rap phrasing, and do not change the wording."
            }
          ]
        },
        "promptType": "define"
      },
      "typeVersion": 1.7
    },
    {
      "id": "6a1a5512-f2c3-4b98-b89f-4f328e42422c",
      "name": "Sticky Note",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -240,
        -256
      ],
      "parameters": {
        "color": 5,
        "width": 416,
        "height": 192,
        "content": "## QUALITY CONTROL CHECKPOINT\nChoose your path:\n- Auto: Skip to file generation\n- Manual: Download TXT, make corrections, re-upload for timestamp matching"
      },
      "typeVersion": 1
    },
    {
      "id": "a7725722-9e75-439e-b4c8-bb0bb16520bb",
      "name": "Sticky Note1",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -544,
        -256
      ],
      "parameters": {
        "color": 5,
        "width": 224,
        "height": 224,
        "content": "## AI LYRICS SEGMENTATION\nGPT-5-nano formats raw transcription into natural lyric lines (2-8 words per line) while preserving original wording."
      },
      "typeVersion": 1
    },
    {
      "id": "81fb4fa3-41fe-4750-bdaf-b8518e16acdf",
      "name": "Sticky Note2",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -848,
        -256
      ],
      "parameters": {
        "color": 5,
        "height": 224,
        "content": "## AUDIO INPUT & TRANSCRIPTION\nUpload your vocal track (MP3) and let Whisper AI transcribe it with precise timestamps. Works best with clean vocal recordings."
      },
      "typeVersion": 1
    },
    {
      "id": "16538472-625d-425a-b374-2ad750a0ac3e",
      "name": "Sticky Note3",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        512,
        -272
      ],
      "parameters": {
        "color": 5,
        "width": 288,
        "height": 224,
        "content": "## EXPORT READY FILES\nGenerate professional subtitle files:\n- .SRT for YouTube & video platforms\n- .LRC for Musixmatch & streaming services"
      },
      "typeVersion": 1
    },
    {
      "id": "0169bbce-6ab2-4ff6-9799-9204a3ba0400",
      "name": "Sticky Note4",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        208,
        -256
      ],
      "parameters": {
        "color": 5,
        "width": 272,
        "height": 192,
        "content": "## SMART TIMESTAMP ALIGNMENT\nAdvanced diff & matching algorithm aligns your corrections with original Whisper timestamps using Levenshtein distance."
      },
      "typeVersion": 1
    },
    {
      "id": "1bb72b32-f5f0-4a98-801c-ba751f3db0c3",
      "name": "Sticky Note5",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -1328,
        -368
      ],
      "parameters": {
        "color": 5,
        "width": 400,
        "height": 592,
        "content": "## 🎵 SUBTITLE & LYRICS GENERATOR WITH WHISPER AI\n\nTransform vocal tracks into professional subtitle and lyrics files with AI-powered transcription and intelligent segmentation.\n\nKEY FEATURES:\n- Whisper AI transcription with word-level timestamps\n- GPT-5-nano intelligent lyrics segmentation (2-8 words/line)\n- Optional quality check with manual correction workflow\n- Smart timestamp alignment using Levenshtein distance\n- Dual output: .SRT (YouTube/video) + .LRC (streaming/Musixmatch)\n- No disk storage - download files directly\n- Supports multiple languages via ISO codes\n\nPERFECT FOR:\nMusicians • Record Labels • Content Creators • Video Editors\n\n⚡ Upload MP3 → AI processes → Download professional subtitle files"
      },
      "typeVersion": 1
    }
  ],
  "active": false,
  "pinData": {},
  "settings": {
    "executionOrder": "v1"
  },
  "versionId": "25d9495b-ed66-4b9d-b118-4192b37e8f79",
  "connections": {
    "AudioInput": {
      "main": [
        [
          {
            "node": "WhisperTranscribe",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "QualityCheck": {
      "main": [
        [
          {
            "node": "DiffMatch + SrcPrep",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "PostProcessing": {
      "main": [
        [
          {
            "node": "RoutingQualityCheck",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "OpenAI Chat Model": {
      "ai_languageModel": [
        [
          {
            "node": "PostProcessing",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    },
    "TimestampMatching": {
      "main": [
        [
          {
            "node": "SubtitlesPreparation",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "TranscribedLyrics": {
      "main": [
        [
          {
            "node": "QualityCheck",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "WhisperTranscribe": {
      "main": [
        [
          {
            "node": "PostProcessing",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "DiffMatch + SrcPrep": {
      "main": [
        [
          {
            "node": "SRT",
            "type": "main",
            "index": 0
          },
          {
            "node": "LRC",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "RoutingQualityCheck": {
      "main": [
        [
          {
            "node": "TranscribedLyrics",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "TimestampMatching",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "SubtitlesPreparation": {
      "main": [
        [
          {
            "node": "SRT",
            "type": "main",
            "index": 0
          },
          {
            "node": "LRC",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}
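
Both Code nodes in the JSON end by turning timed lyric lines into .SRT and .LRC text. The following is a minimal standalone sketch of that formatting step, not the workflow's exact code: it rounds to whole milliseconds (SRT) or centiseconds (LRC) before splitting into fields, whereas the workflow truncates with `Math.floor`. The helper names `pad`, `buildSrt`, and `buildLrc` and the sample segments are made up for illustration.

```javascript
// Left-pads a number with zeros to a fixed width.
function pad(value, width) {
  return String(value).padStart(width, "0");
}

// SRT timestamp: HH:MM:SS,mmm (computed from integer milliseconds).
function toSrtTime(seconds) {
  const totalMs = Math.round(seconds * 1000);
  const h = Math.floor(totalMs / 3600000);
  const m = Math.floor((totalMs % 3600000) / 60000);
  const s = Math.floor((totalMs % 60000) / 1000);
  const ms = totalMs % 1000;
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
}

// LRC timestamp: [MM:SS.cc] (computed from integer centiseconds).
function toLrcTime(seconds) {
  const totalCs = Math.round(seconds * 100);
  const m = Math.floor(totalCs / 6000);
  const s = Math.floor((totalCs % 6000) / 100);
  const cs = totalCs % 100;
  return `[${pad(m, 2)}:${pad(s, 2)}.${pad(cs, 2)}]`;
}

// One numbered block per segment, blocks separated by a blank line.
function buildSrt(segments) {
  return segments
    .map((seg, i) =>
      `${i + 1}\n${toSrtTime(seg.start)} --> ${toSrtTime(seg.end)}\n${seg.text.trim()}\n`)
    .join("\n");
}

// One line per segment, prefixed with its start time.
function buildLrc(segments) {
  return segments
    .map(seg => `${toLrcTime(seg.start)}${seg.text.trim()}`)
    .join("\n");
}

// Sample data in the shape of the workflow's timedSegments array.
const segments = [
  { start: 1.5, end: 3.25, text: "First line" },
  { start: 62.04, end: 65, text: "Second line" },
];
console.log(buildSrt(segments));
console.log(buildLrc(segments));
```

Working in integer units keeps the fields consistent with each other when a value such as 62.04 is not exactly representable as a floating-point number.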
Frequently asked questions

How do I use this workflow?

Copy the JSON configuration above, create a new workflow in your n8n instance, select "Import from JSON", paste the configuration, and adjust the authentication settings as needed.

Which scenarios is this workflow suited for?

Advanced

Is it paid?

This workflow is completely free and can be used directly. Note that third-party services used in the workflow (such as the OpenAI API) may require payment on your part.
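
One authentication setting to adjust is the Authorization header of the WhisperTranscribe node, which ships with a placeholder value; the OpenAI API expects `Bearer ` followed by the key. A minimal sketch of patching the exported JSON before import is shown below. The helper name `setOpenAiKey`, the trimmed-down workflow object, and the key `sk-example` are assumptions for illustration; an n8n credential may be a safer home for the key than a hard-coded header.

```javascript
// Hypothetical helper: returns a copy of the workflow in which every
// HTTP Request node's Authorization header is set to "Bearer <apiKey>".
function setOpenAiKey(workflow, apiKey) {
  const copy = JSON.parse(JSON.stringify(workflow)); // deep copy, input untouched
  for (const node of copy.nodes) {
    if (node.type !== "n8n-nodes-base.httpRequest") continue;
    const headers = node.parameters?.headerParameters?.parameters ?? [];
    for (const header of headers) {
      if (header.name === "Authorization") header.value = `Bearer ${apiKey}`;
    }
  }
  return copy;
}

// Trimmed-down stand-in for the exported workflow JSON.
const workflow = {
  nodes: [
    {
      name: "WhisperTranscribe",
      type: "n8n-nodes-base.httpRequest",
      parameters: {
        headerParameters: {
          parameters: [{ name: "Authorization", value: "PLACEHOLDER" }],
        },
      },
    },
    { name: "AudioInput", type: "n8n-nodes-base.formTrigger", parameters: {} },
  ],
};

const patched = setOpenAiKey(workflow, "sk-example");
console.log(patched.nodes[0].parameters.headerParameters.parameters[0].value);
```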

Workflow information
Difficulty level
Advanced
Number of nodes: 18
Category: -
Node types: 9
Difficulty description

Suited to advanced users, with complex workflows containing 16+ nodes
