I have a web page that gets most of its content via API calls to a cloud database. The HTML page itself is fairly bare-bones, but much more data is injected into it through a number of JS/jQuery calls.

The resulting page represents a "Quote" which I'd like to save back into the cloud database for reference purposes.

I can get the current state of the page and store it in a variable by using the following command:

var AVMI_thisPage = document.getElementsByTagName('html')[0].outerHTML;

I now need to remove any <script> tags from the variable so that any reimport of the HTML back to the cloud database doesn't contain any JS that is likely to mess with the page again when someone opens it for reference.

I should be able to push the string back to the database but I need to get rid of any <script>.

I've tried jQuery, but this seems to strip out the HTML, HEAD, and BODY tags.

To be honest, I wasn't expecting the code below to work anyway... but tried it.

E.g.

var AVMI_thisPage = document.getElementsByTagName('html')[0].outerHTML;
var AVMI_tree = $("<div>" + AVMI_thisPage + "</div>");
AVMI_tree.find('script').remove();
AVMI_thisPage = AVMI_tree.html();

Any ideas?

UPDATED - FINAL CODE (including BASE64 encoding and upload)

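// Base64-encode a string that may contain non-Latin-1 characters:
// percent-encode it as UTF-8, turn each %XX back into a single byte, then btoa.
function b64EncodeUnicode(str) {
    return btoa(encodeURIComponent(str).replace(/%([0-9A-F]{2})/g, function(match, p1) {
        return String.fromCharCode('0x' + p1);
    }));
}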

var htmlPage = $("html");
$("script", htmlPage).remove();
AVMI_thisPage = htmlPage.html();
AVMI_thisPageB64 = b64EncodeUnicode(AVMI_thisPage);

var req = "";
req += "<qdbapi>";
req += "<rid>" + AVMI_quoteRID + "</rid>";
req += "<field fid='171' filename='Hardcopy of Quote.html'>"+ AVMI_thisPageB64 + "</field>";
req += "</qdbapi>";
$.ajax({
    type: "POST",
    contentType: "text/xml",
    dataType: "xml",
    processData: false,
    url: "https://xxxx.xxxxxxxx.com/db/" + AVMI_Q_DBID + "?act=API_UploadFile",
    data: req
})
.then(function() {
    alert("A copy of this quote has been saved into the 'Hardcopy Attachment' field.");
    window.close();
});
  • Would the answer to this other question work? Commented Jan 20, 2017 at 11:21

2 Answers

You can do:

$("script", AVMI_tree).remove();

But note that you're taking the outerHTML of documentElement, which includes HEAD and BODY, and putting it inside a DIV. That isn't valid HTML, so the parser drops the HTML, HEAD, and BODY tags.

You could do:

var htmlPage = $("html");
$("script", htmlPage).remove();
AVMI_thisPage = htmlPage.html();

Note that it makes no difference that you're removing the scripts from the live HTML page rather than from a copy of the DOM: once a script has been loaded and executed by the JavaScript engine, removing its element from the DOM does not unload it. The script stays loaded and active.
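If you would rather leave the live page's DOM untouched, one option is to strip the scripts from a detached deep copy of the root element instead. This is only a minimal sketch: the helper name snapshotWithoutScripts is made up for illustration, and it assumes the argument is the browser's document object.

```javascript
// Hypothetical helper (not from the original code): returns the page's HTML
// with all <script> elements removed, without modifying the live DOM.
function snapshotWithoutScripts(doc) {
    // Deep-copy the <html> element; the copy is detached from the page.
    var clone = doc.documentElement.cloneNode(true);

    // Remove every <script> element from the copy only.
    var scripts = clone.querySelectorAll('script');
    Array.prototype.forEach.call(scripts, function (el) {
        el.parentNode.removeChild(el);
    });

    // outerHTML keeps the <html>, <head>, and <body> tags.
    return clone.outerHTML;
}
```

In the browser it would be called as snapshotWithoutScripts(document), and the returned string could then be encoded and uploaded in the same way as before. Because the copy still starts at the html element, nothing gets wrapped in a DIV, so the HTML, HEAD, and BODY tags survive.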

1 Comment

Thanks, your second block of code helped me to construct the code in the right way. Everything is working now. Many thanks!

I am not going to question the reason you do the 'state' saving like this, however here's how you can achieve what you want:

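var regex = /<script\b[\s\S]*?<\/script\s*>/gi;
var noScript = AVMI_thisPage.replace(regex, '');

You can run it in the console on this page and print noScript to see for yourself.

The regex matches each script element in the stringified page, from its opening <script tag to the nearest closing </script> tag, including any characters or newlines in between, and we replace each match with nothing, using only string operations. The quantifier is lazy (*?) so that content sitting between two separate script blocks is kept; a greedy * would match everything from the first <script to the last </script> on the page. I suspect this is faster than doing DOM operations, let alone doing them with jQuery.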
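To see how much the quantifier matters in a pattern like this, here is a small check on a made-up sample string. The sample and the variable names are illustrative only and are not taken from the page in question.

```javascript
// Made-up sample: two script blocks with real content between them.
var sample = '<p>a</p><script>x()</script><p>b</p><script src="y.js"></script><p>c</p>';

// Greedy quantifier: matches from the first <script to the LAST </script>,
// so the <p>b</p> between the two blocks is removed as well.
var greedy = sample.replace(/<script(.|\n)*<\/script>/g, '');

// Lazy quantifier: each script block is matched and removed on its own.
var lazy = sample.replace(/<script\b[\s\S]*?<\/script\s*>/gi, '');

console.log(greedy); // <p>a</p><p>c</p>
console.log(lazy);   // <p>a</p><p>b</p><p>c</p>
```

Even the lazy version is plain string matching rather than HTML parsing, so text that merely looks like a script tag (inside a comment or an attribute value, say) would also be affected.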
