Overlapping SRT cues break playback on YouTube, TikTok, and most media players—causing hidden captions, dual-display glitches, or jarring reading experiences. Before you publish, run a timing sweep so every cue has clear in/out points. The SRT Editor visualizes overlaps directly in the waveform, letting you nudge, merge, or split cues with keyboard shortcuts—no manual millisecond editing required. This guide walks you through diagnosing overlaps, applying fast fixes, and exporting clean, player-ready caption files.
Quick Start
- Open the SRT Editor and upload your SRT file
- Look for red overlap markers in the timeline
- Select the overlapping cue and nudge it ±100ms using keyboard shortcuts
- Merge rapid-fire cues when the speaker doesn't change
- Validate timing and export clean SRT
Why Overlaps Happen
Overlapping cues typically emerge from four common scenarios. Auto-generated captions from YouTube, Rev, or Descript often produce sloppy timing where one cue's end time bleeds into the next cue's start time by a few milliseconds. Manual edits compound the problem—when you extend a cue's duration to match slower speech, you might inadvertently push the end time past the next cue's start without realizing it.
Merge and split operations introduce timing conflicts when you combine two cues but forget to adjust the boundary, or split a single cue without creating proper gaps. Platform-specific auto-caption quirks add another layer: YouTube's auto-sync can generate tight cue boundaries during fast dialogue, while third-party transcription services may use different gap thresholds that don't align with SRT best practices. Each of these situations creates overlaps that break playback, hide captions, or confuse viewers with simultaneous text displays.
Detecting Overlaps in SRT Editor
The SRT Editor displays red markers in the waveform view wherever cues overlap or gaps fall under 75ms. These visual indicators appear directly on the timeline, making it easy to scan your entire caption file without reading every timecode manually. The timeline visualization shows cue boundaries as vertical bars—when two bars touch or overlap, you've found a timing conflict.

The editor flags gaps under 75ms, and 75–100ms is the recommended minimum, because most media players need about that long to clear the previous cue and render the next one cleanly. Gaps under 50ms are the most likely to cause rendering issues on slower devices or in web players with limited processing power. A true overlap occurs when the previous cue's end time is greater than or equal to the next cue's start time; even a 1ms overlap can trigger display bugs. The editor marks overlaps and too-tight gaps in red and leaves acceptable gaps unmarked, so you can focus on actual problems rather than chasing minor timing variations that won't affect playback.
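The overlap and gap rules above can be sketched in a few lines of Python. This is an illustration of the rule, not the SRT Editor's own code: the names `to_ms`, `classify`, and `MIN_GAP_MS` are hypothetical, and the 75ms threshold is taken from the lower bound quoted above.

```python
import re

# SRT timecodes look like 00:00:04,250 (hours:minutes:seconds,milliseconds).
TIMECODE = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")

MIN_GAP_MS = 75  # assumed threshold, taken from the guide's 75ms lower bound


def to_ms(timecode: str) -> int:
    """Convert an SRT timecode string to milliseconds."""
    match = TIMECODE.fullmatch(timecode.strip())
    if match is None:
        raise ValueError(f"not an SRT timecode: {timecode!r}")
    hours, minutes, seconds, millis = (int(part) for part in match.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


def classify(prev_end: str, next_start: str, min_gap_ms: int = MIN_GAP_MS) -> str:
    """Label the boundary between two consecutive cues."""
    gap = to_ms(next_start) - to_ms(prev_end)
    if gap <= 0:
        # End time is greater than or equal to the next start time.
        return "overlap"
    if gap < min_gap_ms:
        return "tight-gap"
    return "ok"
```

For example, `classify("00:00:04,250", "00:00:04,300")` reports a tight gap (50ms), while two cues that share the exact same boundary timecode count as an overlap.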
Fixing Strategies: Nudge, Merge, Split
Three core strategies handle most overlap scenarios: nudging adjusts timing by small increments, merging combines rapid-fire cues into readable blocks, and splitting separates cues when speakers change or scenes cut. Each strategy preserves caption accuracy while ensuring clean playback.
Nudge (±100ms adjustments)
Nudging shifts a cue's start or end time in 100ms steps without retyping timecodes. When you spot an overlap, decide whether to nudge the previous cue's end time earlier or the next cue's start time later. In most cases, nudging the end time of the previous cue creates the cleanest fix because it preserves the next cue's sync with the audio. Use the nudge controls while watching the waveform: if the cue ends before the speaker finishes the word, nudge the end time later; if it runs into the next cue, nudge it earlier.

Keyboard shortcuts make nudging fast: select a cue, then hold Shift and press the left or right arrow key to shift timing in 100ms steps. This workflow lets you fix a dozen overlaps in under a minute without switching between tools or manually calculating new timecodes. Always preserve speaker timing: don't nudge so aggressively that the caption appears before the speaker starts talking or disappears mid-word.
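If you want to reason about what repeated nudges do to the timecodes, the arithmetic can be sketched as follows. This is a hypothetical model, not the editor's implementation: the `Cue` class, `nudge_end_earlier`, and the 75ms target gap are assumptions made for illustration. It pulls the previous cue's end time earlier in 100ms steps and leaves the next cue's start untouched, matching the advice above.

```python
from dataclasses import dataclass

NUDGE_STEP_MS = 100  # one nudge, as described in the guide
MIN_GAP_MS = 75      # assumed target gap between cues


@dataclass
class Cue:
    start_ms: int
    end_ms: int
    text: str


def nudge_end_earlier(prev: Cue, nxt: Cue,
                      step_ms: int = NUDGE_STEP_MS,
                      min_gap_ms: int = MIN_GAP_MS) -> int:
    """Pull prev.end_ms earlier until the gap is large enough.

    Returns the number of nudges applied. The next cue is never moved,
    so it stays in sync with the audio.
    """
    nudges = 0
    while nxt.start_ms - prev.end_ms < min_gap_ms:
        if prev.end_ms - step_ms <= prev.start_ms:
            # Nudging further would erase the cue entirely.
            raise ValueError("cue too short to nudge; consider merging instead")
        prev.end_ms -= step_ms
        nudges += 1
    return nudges
```

With a previous cue ending at 2550ms and the next one starting at 2500ms, two nudges move the end to 2350ms and leave a 150ms gap.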
Merge Cues
Merging combines two or more cues into a single caption block when the speaker delivers rapid-fire dialogue without pauses. This strategy eliminates overlaps caused by tight timing while improving readability—viewers prefer longer, stable captions over text that flashes on screen for half a second. Merge when the same speaker continues without interruption and the combined text stays under two lines (32–42 characters per line is the target).

Line length limits matter because excessively long captions overflow the screen or force players to shrink the font, reducing legibility on mobile devices. When merging, check that the combined duration gives viewers enough time to read comfortably—aim for 160–180 words per minute (WPM) as covered in the Timing Best Practices guide. The editor automatically re-indexes cues after merging, so you don't need to manually renumber the SRT file.
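The merge checks described above (two lines at most, 42 characters per line, and a comfortable reading speed) can be expressed as a small guard function. This is a sketch under stated assumptions: `Cue`, `try_merge`, and the constants are hypothetical names, 42 characters and 180 WPM are the upper ends of the ranges quoted in this guide, and wrapping with Python's `textwrap` is just one way to break lines.

```python
import textwrap
from dataclasses import dataclass
from typing import Optional

MAX_LINE_CHARS = 42  # upper end of the 32–42 character target
MAX_LINES = 2
MAX_WPM = 180        # upper end of the 160–180 WPM target


@dataclass
class Cue:
    start_ms: int
    end_ms: int
    text: str


def try_merge(first: Cue, second: Cue) -> Optional[Cue]:
    """Return the merged cue, or None if it would break a readability limit."""
    # Collapse existing line breaks and extra spaces before re-wrapping.
    text = " ".join(f"{first.text} {second.text}".split())
    lines = textwrap.wrap(text, width=MAX_LINE_CHARS)
    if len(lines) > MAX_LINES:
        return None  # too long for two lines: nudge instead

    start_ms = first.start_ms
    end_ms = max(first.end_ms, second.end_ms)
    minutes = (end_ms - start_ms) / 60_000
    if minutes <= 0 or len(text.split()) / minutes > MAX_WPM:
        return None  # viewers would not have time to read it

    return Cue(start_ms, end_ms, "\n".join(lines))
```

A `None` result is the signal to fall back to nudging rather than forcing the merge.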
Split Cues
Splitting divides a single cue into two when speakers change mid-caption or a scene cut requires a timing break. After splitting, add proper gaps—don't let the new boundary create another overlap. The editor places the split at the current playhead position, so scrub to the exact frame where the speaker changes, then trigger the split. This creates two cues with a small gap between them, which you can adjust with nudge controls if needed.
Re-indexing happens automatically when you split cues—the editor increments all subsequent cue numbers to maintain sequential order. This ensures your exported SRT file passes validation and loads correctly in all players. Use splits sparingly; most overlaps resolve faster with nudging or merging.
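To see what a split plus re-index does to the cue list, here is a minimal sketch. It is not the editor's code: `Cue`, `split_cue`, and the 100ms gap are assumptions for illustration. The gap is taken from the end of the first half so that the second half starts exactly at the playhead, where the new speaker begins.

```python
from dataclasses import dataclass, replace

GAP_MS = 100  # assumed gap inserted at the split point


@dataclass
class Cue:
    index: int
    start_ms: int
    end_ms: int
    text: str


def split_cue(cues: list, position: int, playhead_ms: int,
              first_text: str, second_text: str,
              gap_ms: int = GAP_MS) -> list:
    """Split cues[position] at the playhead and renumber the whole list."""
    cue = cues[position]
    first_end = playhead_ms - gap_ms
    if not (cue.start_ms < first_end and playhead_ms < cue.end_ms):
        raise ValueError("playhead leaves no room for two cues and a gap")

    first = replace(cue, end_ms=first_end, text=first_text)
    second = replace(cue, start_ms=playhead_ms, text=second_text)
    updated = cues[:position] + [first, second] + cues[position + 1:]

    # Re-index so cue numbers stay sequential, starting at 1.
    return [replace(item, index=number)
            for number, item in enumerate(updated, start=1)]
```

The function returns a new list and leaves the input cues unchanged, which makes it easy to compare before and after.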
Keyboard Shortcuts for Fast Fixes
Keyboard shortcuts eliminate the need to click through menus or drag sliders for every timing adjustment. Arrow keys navigate between cues—press up/down to move through the list, left/right to scrub the playhead along the timeline. Shift + Arrow triggers nudging: Shift + Left nudges the selected boundary 100ms earlier, Shift + Right nudges it 100ms later. This two-key combination lets you fix overlaps without taking your hands off the keyboard.
Press M to merge the selected cue with the next one, S to split at the current playhead position, and Space to toggle playback preview. This workflow—detect, select, fix, validate—takes under 30 seconds per cue once you internalize the shortcuts. For a complete shortcut reference, see the Keyboard Shortcuts & Power-User Workflow guide. Keyboard-driven editing scales better than mouse-based workflows when you're processing files with dozens of overlaps.
Validation Workflow
After fixing all visible overlaps, run the overlap detection tool one more time to catch any you missed. The validator scans every consecutive cue pair and flags gaps under 75ms as well as true overlaps, where a cue's end time is greater than or equal to the next cue's start time. If the validator returns zero issues, you're ready to export. If it still shows red markers, jump to those cues and apply nudge or merge fixes until the file is clean.

Preview the file in your browser before exporting—load it into the SRT Editor's preview pane and watch a few minutes of playback to confirm captions appear and disappear smoothly. Export to SRT (or VTT if your platform requires it) and test in the target player—upload to YouTube or your video host and scrub through sections where you made edits. This final check ensures platform-specific rendering doesn't introduce new issues. The editor preserves UTF-8 encoding and original punctuation, so you won't lose special characters or formatting during export.
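If you also want an independent check on the exported file outside the editor, a whole-file sweep can be sketched like this. The function name `find_timing_issues` and the 75ms default are assumptions for illustration, and the sketch assumes cues appear in playback order, so it reports positions in the file rather than reading the cue numbers.

```python
import re

# Matches the timing line of each cue: start --> end.
CUE_TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def _ms(hours: str, minutes: str, seconds: str, millis: str) -> int:
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def find_timing_issues(srt_text: str, min_gap_ms: int = 75) -> list:
    """Return (cue_position, next_cue_position, gap_ms) for each flagged pair.

    A gap of zero or less is a true overlap; a small positive gap is too tight.
    """
    timings = []
    for match in CUE_TIMING.finditer(srt_text):
        parts = match.groups()
        timings.append((_ms(*parts[:4]), _ms(*parts[4:])))

    issues = []
    pairs = zip(timings, timings[1:])
    for position, ((_, prev_end), (next_start, _)) in enumerate(pairs, start=1):
        gap = next_start - prev_end
        if gap < min_gap_ms:
            issues.append((position, position + 1, gap))
    return issues


SAMPLE = """1
00:00:01,000 --> 00:00:02,550
Hello

2
00:00:02,500 --> 00:00:04,000
World

3
00:00:04,050 --> 00:00:05,000
Again
"""

print(find_timing_issues(SAMPLE))  # [(1, 2, -50), (2, 3, 50)]
```

An empty list means every consecutive pair has at least the minimum gap. When reading a real file, open it with `encoding="utf-8"` so special characters survive the round trip.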
Common Mistakes & Fixes
- Nudging start time instead of end time → Adjust the end time of the previous cue to create the gap. Nudging start times can desync captions from audio.
- Merging cues with different speakers → Keep speaker changes as separate cues for clarity. Merging across speakers confuses viewers.
- Leaving gaps under 50ms → Aim for 75–100ms minimum to prevent render issues on slower devices and web players.
- Forgetting to validate after bulk edits → Always run overlap detection before exporting. Bulk operations can introduce new timing conflicts.
FAQs
- How much gap should exist between cues?
- Aim for 75–100ms minimum. Some players need time to clear the previous cue before rendering the next. Gaps under 50ms can cause rendering glitches on mobile devices or web players with limited processing power.
- Do I need to re-number cues after fixing overlaps?
- SRT Editor auto-renumbers cues when you merge or split, so indexes stay sequential. You don't need to manually update cue numbers—the editor handles that automatically during export.
- Can I fix overlaps in auto-generated captions?
- Yes. Import the auto-generated file from YouTube, Rev, or any transcription service, run overlap detection, nudge or merge as needed, then export. The editor preserves your original text while fixing timing issues.
- What if two cues overlap by more than 500ms?
- This usually indicates a timing error rather than a minor overlap. Check the transcript and re-time the cues to match the audio. Large overlaps often mean a cue was placed incorrectly or the timing drifted during manual edits.
- Should I merge all overlaps or keep them separate?
- Merge when the speaker doesn't change and the combined text stays under two lines (32–42 characters per line). If merging creates an excessively long caption, nudge instead to preserve readability. Prioritize viewer comfort over technical correctness.
- How do I prevent overlaps when editing manually?
- Always check the next cue's start time before extending the current cue's end time. Use grid snapping if your editor supports it, and validate timing after every bulk edit. The SRT Editor's waveform view makes it easy to see boundaries before you adjust them.
- Can I export to VTT or other formats?
- Yes. SRT Editor supports SRT and VTT export with UTF-8 encoding preserved. VTT is required for HTML5 video players and some streaming platforms, while SRT remains the standard for YouTube and most social platforms.
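For reference, the structural difference between the two formats is small: a WebVTT file starts with a `WEBVTT` header line, and its timecodes use a period instead of a comma before the milliseconds. The sketch below shows that conversion in Python. It is not the SRT Editor's export code, `srt_to_vtt` is a hypothetical name, and it ignores styling or positioning tags that some SRT files carry.

```python
import re

# Captures the hh:mm:ss part and the milliseconds of an SRT timecode.
SRT_TIMECODE = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt_text: str) -> str:
    """Convert SRT text to a basic WebVTT document."""
    # Drop a leading byte-order mark and normalize Windows line endings.
    body = srt_text.lstrip("\ufeff").replace("\r\n", "\n").strip()

    converted = []
    for line in body.split("\n"):
        if "-->" in line:
            # Only timing lines are rewritten; commas in caption text stay.
            line = SRT_TIMECODE.sub(r"\1.\2", line)
        converted.append(line)

    return "WEBVTT\n\n" + "\n".join(converted) + "\n"
```

Cue numbers are kept as they are, since WebVTT allows an identifier line above each timing line.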