Recently, I needed a way to pass dynamic content to and from webpages using PhantomJS as part of writing my screen scraper. I needed the scraper to follow dynamic sets of links and scrape the data from each page. Because each webpage's scope is currently sandboxed, the main script and the page cannot share variables directly. With the addition of the new filesystem module in PhantomJS 1.3, it is now possible to pass data from the main scope to an individual page's scope. Any data that you want passed to a particular page should be written as a string of JavaScript source to a JavaScript file. You can then inject that file into the page from page.onLoadFinished so that the data is accessible within the page's scope, and use the return value of page.evaluate to bring data back out. For example:
var page = require('webpage').create(),
    fs = require('fs'),
    data = "var dataObject = { item: 'value' };",
    fullpath = fs.workingDirectory + fs.separator + 'data.js';

// write the data to a JavaScript file
var dataFile = fs.open(fullpath, 'w');
dataFile.write(data);
dataFile.close();

// relay console messages from the page's scope to the main scope
page.onConsoleMessage = function (msg) {
    console.log(msg);
};

// register the handler before opening the page, because page.open is asynchronous
page.onLoadFinished = function (status) {
    if (status !== 'success') {
        console.log('Error in loading the page!');
        phantom.exit(1);
        return;
    }
    // inject the JavaScript data that we created earlier
    page.injectJs(fullpath);
    // put page data in a local variable
    var output = page.evaluate(function () {
        // print the data object from inside the page's scope
        console.log(dataObject.item);
        return dataObject.item;
    });
    // output should be the same value as the page's dataObject.item
    console.log(output);
    phantom.exit();
};

// check that the file was successfully written
if (fs.size(fullpath) > 0) {
    console.log('File written successfully!');
    page.open('http://somesite.org/page.html');
} else {
    console.log('Error in writing the file!');
    phantom.exit(1);
}
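In a real scraper the data usually starts out as an object rather than a hand-written string. Here is a minimal sketch of one way to build the contents of the data file, assuming JSON.stringify is available in your PhantomJS build. The helper name buildDataScript is hypothetical and not part of PhantomJS, and the sample object is made up for illustration.

```javascript
// Hypothetical helper (not part of PhantomJS): builds the source text of a
// script that declares a global variable holding the given data.
function buildDataScript(variableName, value) {
    // only allow simple identifiers so the generated source stays valid
    if (!/^[A-Za-z_$][A-Za-z0-9_$]*$/.test(variableName)) {
        throw new Error('Invalid variable name: ' + variableName);
    }
    // JSON.stringify escapes quotes and newlines in strings;
    // functions and undefined properties are dropped from the output
    return 'var ' + variableName + ' = ' + JSON.stringify(value) + ';';
}

// sample data, made up for illustration
var data = buildDataScript('dataObject', { item: "it's a value", links: ['/a', '/b'] });
console.log(data);
// prints: var dataObject = {"item":"it's a value","links":["/a","/b"]};
```

The resulting string can be written to the data file in place of a hand-written one. Serializing through JSON avoids broken quoting when a value contains quotes or line breaks, at the cost of only supporting plain data.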
For more information about PhantomJS' File System module, please visit: http://code.google.com/p/phantomjs/wiki/Interface#Filesystem_Module
While this may not be the best long-term solution, it does provide a way to get data to and from your pages until official support for passing data to a webpage object becomes available in PhantomJS.