When you try to parse a document with a <!DOCTYPE ...>


`<!DOCTYPE ...>` is not preserved during parsing (rushter/selectolax, closed)

Comments (5)

rushter commented on May 29, 2024

Since the parsing engines behind selectolax follow all the standards, they omit the doctype.
I'm not sure whether we can retrieve the original doctype; I need to check. In HTML5, the doctype is the same for all websites.

You can test it in Chrome/Firefox devtools: `document.documentElement.outerHTML` returns no doctype.


vidhu commented on May 29, 2024

@rushter Thanks for your response. What do you mean by "follow all the standards"? Unless there's a standard which states that the doctype is not part of the document (very unlikely), I think it should be included. This would also be in line with what other parsers do.

A doctype is not part of the `documentElement`; it's part of the document. You can verify this in devtools with `document.doctype`.

The doctype is indeed the same for all HTML5 websites, but not for all websites. My use case requires me to make edits to a given HTML document, so I need to preserve the doctype.

Feel free to close this issue if you feel it's better to file this in the Modest parsing engine project.
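
Until the parser keeps the doctype itself, one possible stopgap for the edit-and-preserve use case is to capture the doctype separately before parsing and prepend it to the serialized output afterwards. Below is a minimal sketch that uses only Python's standard `html.parser` module. The names `DoctypeGrabber` and `extract_doctype` are hypothetical helpers, not part of selectolax, and the `edited` string merely stands in for whatever the parser serializes after editing.

```python
from html.parser import HTMLParser


class DoctypeGrabber(HTMLParser):
    """Hypothetical helper: records the first <!DOCTYPE ...> declaration seen."""

    def __init__(self):
        super().__init__()
        self.doctype = None

    def handle_decl(self, decl):
        # handle_decl receives the declaration text without "<!" and ">".
        if self.doctype is None and decl.lower().startswith("doctype"):
            self.doctype = f"<!{decl}>"


def extract_doctype(html):
    """Return the document's doctype as a string, or None if it has none."""
    grabber = DoctypeGrabber()
    grabber.feed(html)
    grabber.close()
    return grabber.doctype


source = "<!DOCTYPE html><html><head></head><body><p>Hi</p></body></html>"
doctype = extract_doctype(source)

# Assumption: stand-in for the edited, serialized output that lacks a doctype.
edited = "<html><head></head><body><p>Hello</p></body></html>"

# Re-attach the original doctype (if any) in front of the edited markup.
result = (doctype or "") + edited
print(result)
```

This keeps whatever doctype the input had, including older HTML 4 or XHTML doctypes, rather than assuming the HTML5 one.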


rushter commented on May 29, 2024

I will add doctype, but most likely, only next week.


rushter commented on May 29, 2024

I made a simple fix, please test it.


vidhu commented on May 29, 2024

I'll test it within 24 hours and report back. Thank you!

