Converting html video tag inserted in a markdown file to html could generate wrong output #10278

Open
PuckCh opened this issue Oct 9, 2024 · 4 comments

Comments

@PuckCh

PuckCh commented Oct 9, 2024

Assuming this sample markdown code:

# Test

Before text

<video width="320" height="240" controls>
  <source src="movie.mp4" type="video/mp4">
  Your browser does not support the video tag.
</video>

After text.

pandoc creates this incorrect HTML output:

<h1 id="test">Test</h1>
<p>Before text</p>
<video width="320" height="240" controls>
<source src="movie.mp4" type="video/mp4">
<p>Your browser does not support the video tag. </video></p>
<p>After text.</p>

The closing </p> is wrongly placed. It should come before the closing </video>.

With this output, the browser never shows the "After text." paragraph, because the <video> tag is not closed properly.
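The mis-nesting can be checked mechanically. The following is a minimal sketch using only Python's standard `html.parser`; the names `NestingChecker` and `nesting_errors` are made up for this illustration and are not part of pandoc. It reports every closing tag that does not match the innermost open element.

```python
from html.parser import HTMLParser

# Void elements never take a closing tag, so they are not tracked.
VOID = {"source", "track", "br", "img", "hr", "input", "meta", "link"}


class NestingChecker(HTMLParser):
    """Hypothetical helper: records closing tags that are out of order."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open elements, innermost last
        self.errors = []  # (closing tag, element that was actually innermost)

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
            return
        self.errors.append((tag, self.stack[-1] if self.stack else None))
        # Recover: if the tag is open further down, close everything above it.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def nesting_errors(html):
    checker = NestingChecker()
    checker.feed(html)
    checker.close()
    return checker.errors


REPORTED = """<video width="320" height="240" controls>
<source src="movie.mp4" type="video/mp4">
<p>Your browser does not support the video tag. </video></p>
<p>After text.</p>
"""

EXPECTED = """<video width="320" height="240" controls>
<source src="movie.mp4" type="video/mp4">
<p>Your browser does not support the video tag.</p>
</video>
<p>After text.</p>
"""

print(nesting_errors(REPORTED))  # </video> arrives while <p> is still open
print(nesting_errors(EXPECTED))
```

For the reported output the checker flags `</video>` (closed while `<p>` was innermost) and the stray `</p>` that follows; the expected output produces no errors.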

Tested with:

  • macOS Sequoia 15.0.1, pandoc 3.5, command `pandoc file.md --output=file.html`
  • https://pandoc.org/try/, from markdown to html4 or html5.

Issue #8629 shows the same kind of error (before the fix for the <track> element).

@PuckCh PuckCh added the bug label Oct 9, 2024
@bpj

bpj commented Oct 9, 2024

Use a fenced raw markup block:

```{=html}
<video>
...
</video>
```

The {=FORMAT} syntax turns it into a raw block, which is inserted verbatim when generating the named output format and ignored in other output formats. The exceptions are Pandoc's Markdown, obviously, and other "sourcey" formats that support an equivalent syntax; there the raw block is kept as a raw block, except that raw blocks with {=markdown} as format are inserted verbatim in Markdown output. It works for inlines too: Inline `<i>raw</i>`{=html} markup.
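If many existing files need this workaround, the wrapping could be automated before the files reach pandoc. The sketch below is only an illustration: `wrap_raw_html` is a hypothetical helper, not a pandoc feature, and it assumes the opening and closing tags each start at the beginning of a line and that the block is not already inside a raw fence.

```python
import re

FENCE = "`" * 3  # three backticks, built this way to keep the example readable


def wrap_raw_html(markdown: str, tag: str = "video") -> str:
    """Wrap each top-level <tag>...</tag> block in a {=html} raw fence.

    Assumption: the opening and closing tags both start a line, as in the
    sample from this issue. Blocks that are already fenced would be wrapped
    a second time, so run this only on unfenced input.
    """
    name = re.escape(tag)
    pattern = re.compile(
        rf"^<{name}\b[^>]*>.*?^</{name}>[ \t]*$",
        re.DOTALL | re.MULTILINE,
    )
    return pattern.sub(
        lambda m: f"{FENCE}{{=html}}\n{m.group(0)}\n{FENCE}", markdown
    )


SAMPLE = """Before text

<video width="320" height="240" controls>
  <source src="movie.mp4" type="video/mp4">
  Your browser does not support the video tag.
</video>

After text.
"""

print(wrap_raw_html(SAMPLE))
```

The text before and after the block is left untouched; only the `<video>` element ends up between the fence lines.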

@PuckCh
Author

PuckCh commented Oct 9, 2024

> Use a fenced raw markup block:

Thanks for pointing it out. It works well for me too.
We will use it as a workaround.

@jgm
Owner

jgm commented Oct 9, 2024

pandoc parses this as:

[ RawBlock
    (Format "html")
    "<video width=\"320\" height=\"240\" controls>"
, RawBlock
    (Format "html")
    "<source src=\"movie.mp4\" type=\"video/mp4\">"
, Para
    [ Str "Your"
    , Space
    , Str "browser"
    , Space
    , Str "does"
    , Space
    , Str "not"
    , Space
    , Str "support"
    , Space
    , Str "the"
    , Space
    , Str "video"
    , Space
    , Str "tag."
    , SoftBreak
    , RawInline (Format "html") "</video>"
    ]
]

The reason is that video is in the list of tags that can be either block or inline, so the closing </video> is accepted as raw inline HTML at the end of the paragraph.

Note that `-f commonmark` or `-f gfm` will produce the result you are after.

@jgm
Owner

jgm commented Oct 9, 2024

We could conceivably add some logic that would keep track of whether an open video tag has been parsed in the inline context, and if not reject the close tag as inline. That would help with cases like this. But it's best to explicitly mark HTML as suggested above.
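The idea can be illustrated with a toy model. This is a sketch only, written in Python over a pre-split token list; it is not pandoc's parser, and `split_inline` is a hypothetical name. A closing tag is accepted as inline only when a matching open tag was seen earlier in the same inline run; otherwise the run ends and the closing tag is handed back to the block level.

```python
from typing import List, Tuple


def split_inline(tokens: List[str], tag: str = "video") -> Tuple[List[str], List[str]]:
    """Return (tokens kept inline, tokens left for the block parser).

    Toy assumption: tags appear as exact tokens "<tag>" and "</tag>"
    with no attributes.
    """
    open_tag, close_tag = f"<{tag}>", f"</{tag}>"
    depth = 0  # number of open tags seen in the inline context
    inline: List[str] = []
    for i, tok in enumerate(tokens):
        if tok == open_tag:
            depth += 1
        elif tok == close_tag:
            if depth == 0:
                # No matching open tag inline: reject the close tag as inline.
                return inline, tokens[i:]
            depth -= 1
        inline.append(tok)
    return inline, []


# The case from this issue: the open tag was parsed as a block earlier,
# so the close tag is not absorbed into the paragraph.
print(split_inline(["Your", "browser", "</video>"]))

# A genuinely inline pair stays inline.
print(split_inline(["a", "<video>", "b", "</video>", "c"]))
```

With this rule the paragraph in the example would end before `</video>`, which could then be emitted as its own raw block after the closing `</p>`.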
