Comments (4)
@appshore I've just published an update with your suggestions. Please check it and let me know if there are any problems. Thanks.
from cl-editor.
@nenadpnc It is working as expected. I'll do more tests over the next weeks and I'll report if needed.
Thank you for the quick update and for this nice Svelte editor.
@appshore Can you show me an example of incorrectly stripped HTML?
As far as I can see, your solution differs in two ways: it doesn't replace the Mso CSS classes (Microsoft Word CSS classes), and it removes the match(/<!--StartFragment-->(.*?)<!--EndFragment-->/) step.
The default behaviour is not to transfer styling from a Word document (or any document) on paste, but to keep the content in text form. Can you check that your solution doesn't break that?
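To make the role of that match concrete, here is a minimal sketch of the fragment-extraction step on its own. The helper name `extractFragment` and both sample strings are made up for illustration and are not part of cl-editor; the regular expressions are the ones quoted in this thread.

```typescript
// Hypothetical helper: isolates the StartFragment/EndFragment step of cleanHtml.
const extractFragment = (input: string): string => {
  const match = input
    .replace(/\r?\n|\r/g, ' ')
    .match(/<!--StartFragment-->(.*?)<!--EndFragment-->/);
  // match is null when the markers are absent, so nothing is kept.
  return match ? match[1] : '';
};

// Illustrative inputs: one with Word-style fragment markers, one without.
const fromWord =
  '<html><body><!--StartFragment--><p class="MsoNormal">Hi</p><!--EndFragment--></body></html>';
const plainHtml = '<h1>Hello World</h1>';

console.log(extractFragment(fromWord));  // <p class="MsoNormal">Hi</p>
console.log(extractFragment(plainHtml)); // (empty string)
```

In this sketch, any pasted HTML that lacks the fragment markers is reduced to an empty string before the remaining clean-up steps run.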
The issue seems to come primarily from the match. Mso classes (in fact, all classes) will be removed with my modifications:
.replace(/class="[^"]*"/gi, '')
.replace(/class='[^']*'/gi, '')
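As a quick check of that claim, the sketch below runs the Mso-only pattern and the generic class patterns side by side. The helper names `stripMsoClasses` and `stripAllClasses` and the sample strings are assumptions made for illustration; only the regular expressions come from this thread.

```typescript
// The Mso-only pattern used by the original cleanHtml.
const stripMsoClasses = (html: string): string =>
  html.replace(/(class=(")?Mso[a-zA-Z]+(")?)/g, ' ');

// The generic patterns proposed above: drop every class attribute.
const stripAllClasses = (html: string): string =>
  html
    .replace(/class="[^"]*"/gi, '')
    .replace(/class='[^']*'/gi, '');

const samples = [
  '<p class="MsoNormal">a</p>', // typical Word class
  '<p class="Mso-xyz">b</p>',   // hyphenated name, as in the test case
  "<p class='custom'>c</p>",    // non-Word class, single quotes
];

for (const sample of samples) {
  console.log(sample, '|', stripMsoClasses(sample), '|', stripAllClasses(sample));
}
```

With these inputs, the Mso-only pattern changes only the first sample: `Mso[a-zA-Z]+` requires a letter right after `Mso`, so `Mso-xyz` does not match. The generic patterns remove the class attribute in all three cases.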
Here is a complete command-line test case (index.ts):
// Original implementation: only keeps what sits between the fragment markers.
export const cleanHtml = (input: string) => {
  // remove line breaks and find relevant html
  const html = input.replace(/\r?\n|\r/g, ' ').match(/<!--StartFragment-->(.*?)<!--EndFragment-->/);
  let output = html && html[1] || '';
  console.log('cleanHtml origin html', html);
  output = output
    // 1. remove Mso classes
    .replace(/(class=(")?Mso[a-zA-Z]+(")?)/g, ' ')
    // 2. strip Word generated HTML comments
    .replace(/<!--(.*?)-->/g, '')
    // 3. remove tags leave content if any
    .replace(new RegExp('<(/)*(meta|link|span|\\?xml:|st1:|o:|font|w:sdt)(.*?)>', 'gi'), '')
    .replace(/<!\[if !supportLists\]>(.*?)<!\[endif\]>/gi, '')
    .replace(/style="[^"]*"/gi, '')
    .replace(/style='[^']*'/gi, '')
    .replace(/&nbsp;/gi, ' ')
    .replace(/>(\s+)</g, '><');
  // 4. Remove everything in between and including tags '<style(.)style(.)>'
  output = removeBadTags(output);
  return output;
};

// Modified implementation: no fragment match, all classes and attributes removed.
export const cleanHtml2 = (input: string) => {
  let output = input
    .replace(/\r?\n|\r/g, ' ')
    .replace(/<!--(.*?)-->/g, '')
    .replace(new RegExp('<(/)*(meta|link|span|\\?xml:|st1:|o:|font|w:sdt)(.*?)>', 'gi'), '')
    .replace(/<!\[if !supportLists\]>(.*?)<!\[endif\]>/gi, '')
    .replace(/style="[^"]*"/gi, '')
    .replace(/style='[^']*'/gi, '')
    .replace(/&nbsp;/gi, ' ')
    .replace(/>(\s+)</g, '><')
    .replace(/class="[^"]*"/gi, '')
    .replace(/class='[^']*'/gi, '')
    // reduce every opening tag to its bare name, dropping any remaining attributes
    .replace(/<[^/].*?>/g, i => i.split(/[ >]/g)[0] + '>')
    .trim();
  // 4. Remove everything in between and including tags '<style(.)style(.)>'
  output = removeBadTags(output);
  return output;
};

export const removeBadTags = (html: string) => {
  ['style', 'script', 'applet', 'embed', 'noframes', 'noscript'].forEach((badTag: string) => {
    html = html.replace(new RegExp(`<${badTag}.*?${badTag}(.*?)>`, 'gi'), '');
  });
  return html;
};

let input = '<h1><div class="Mso-xyz" style="font-size:2em">Hello World</div></h1>';
console.log('input', input);
console.log('cleanHtml origin', cleanHtml(input));
console.log('cleanHtml modified', cleanHtml2(input));
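One possible way to reconcile the two behaviours, sketched here under the assumption that pastes without fragment markers should still be cleaned rather than discarded, is to fall back to the whole input when the match fails. The name `pickPasteSource` is hypothetical, and this is not necessarily the change that was published in cl-editor.

```typescript
// Hypothetical helper: prefer the StartFragment/EndFragment content when the
// markers are present, otherwise keep the flattened input as-is.
const pickPasteSource = (input: string): string => {
  const flat = input.replace(/\r?\n|\r/g, ' ');
  const fragment = flat.match(/<!--StartFragment-->(.*?)<!--EndFragment-->/);
  return fragment ? fragment[1] : flat;
};

// With markers: only the fragment is kept.
console.log(pickPasteSource('<body><!--StartFragment--><b>y</b><!--EndFragment--></body>'));
// Without markers: the input is passed through for further cleaning.
console.log(pickPasteSource('<h1>Hello World</h1>'));
```

The result of such a helper could then be fed into the same chain of replace calls, so the stripping of styles and classes would apply to both kinds of paste.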