Improve URL previews by not including the content of media tags in the generated description. (#12887)

code_spécifique_watcha
reivilibre 3 years ago committed by GitHub
parent 9385cd0633
commit 317248d42c
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
  1. 1
      changelog.d/12887.misc
  2. 10
      synapse/rest/media/v1/preview_html.py

@ -0,0 +1 @@
Improve URL previews by not including the content of media tags in the generated description.

@ -246,7 +246,9 @@ def parse_html_description(tree: "etree.Element") -> Optional[str]:
Grabs any text nodes which are inside the <body/> tag, unless they are within
an HTML5 semantic markup tag (<header/>, <nav/>, <aside/>, <footer/>), or
if they are within a <script/> or <style/> tag.
if they are within a <script/>, <svg/> or <style/> tag, or if they are within
a tag whose content is usually only shown to old browsers
(<iframe/>, <video/>, <canvas/>, <picture/>).
This is a very very very coarse approximation to a plain text render of the page.
@ -268,6 +270,12 @@ def parse_html_description(tree: "etree.Element") -> Optional[str]:
"script",
"noscript",
"style",
"svg",
"iframe",
"video",
"canvas",
"img",
"picture",
etree.Comment,
)

Loading…
Cancel
Save