API Reference WebPage
This is a living document. As the codebase is updated, we hope to keep this document updated as well. Unless otherwise stated, this document currently applies to the latest PhantomJS release: PhantomJS 1.8.0
Note: This page serves as a reference. To learn how to use PhantomJS step by step, please refer to the Quick Start guide.
A WebPage object encapsulates a web page. It is usually instantiated using the following pattern:
var page = require('webpage').create();
Note: For backward compatibility with legacy PhantomJS applications, the constructor also remains exposed as a deprecated global WebPage object:
var page = new WebPage();
clipRect
canGoBack
canGoForward
content
cookies
customHeaders
event
focusedFrameName
frameContent
frameName
framePlainText
frameTitle
frameUrl
framesCount
framesName
libraryPath
navigationLocked
offlineStoragePath
offlineStorageQuota
ownsPages
pages
pagesWindowName
paperSize
plainText
scrollPosition
settings
title
url
viewportSize
windowName
zoomFactor
addCookie()
childFramesCount()
childFramesName()
clearCookies()
close()
currentFrameName()
deleteCookie()
evaluateJavaScript()
evaluate()
evaluateAsync()
getPage()
go()
goBack()
goForward()
includeJs()
injectJs()
open()
openUrl()
release()
reload()
render()
renderBase64()
sendEvent()
setContent()
stop()
switchToFocusedFrame()
switchToFrame()
switchToChildFrame()
switchToMainFrame()
switchToParentFrame()
uploadFile()
onAlert
onCallback
onClosing
onConfirm
onConsoleMessage
onError
onFilePicker
onInitialized
onLoadFinished
onLoadStarted
onNavigationRequested
onPageCreated
onPrompt
onResourceRequested
onResourceReceived
onResourceTimeout
onResourceError
onUrlChanged
Internal methods to trigger callbacks:
closing()
initialized()
javaScriptAlertSent()
javaScriptConsoleMessageSent()
loadFinished()
loadStarted()
navigationRequested()
rawPageCreated()
resourceReceived()
resourceRequested()
urlChanged()
This property defines the rectangular area of the web page to be rasterized when WebPage#render is invoked. If no clipping rectangle is set, WebPage#render will process the entire web page.
Example:
page.clipRect = { top: 14, left: 3, width: 400, height: 300 };
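A common use of clipRect is capturing a single element. The sketch below is one possible way to do that, not an official recipe: the helper padClipRect, the URL, the 'h1' selector and the output file name are all assumptions made for illustration. Passing the selector into evaluate requires PhantomJS 1.6 or later.

```javascript
// Hypothetical helper: grow a rectangle by `padding` pixels on every side.
// Note: near the page edge this can yield a negative top or left value.
function padClipRect(rect, padding) {
    return {
        top: rect.top - padding,
        left: rect.left - padding,
        width: rect.width + 2 * padding,
        height: rect.height + 2 * padding
    };
}

// The part below only runs inside PhantomJS.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    page.open('http://example.com/', function (status) {
        if (status === 'success') {
            // Measure the element inside the page and return plain numbers.
            var rect = page.evaluate(function (selector) {
                var el = document.querySelector(selector);
                if (!el) { return null; }
                var r = el.getBoundingClientRect();
                return { top: r.top, left: r.left, width: r.width, height: r.height };
            }, 'h1');
            if (rect) {
                page.clipRect = padClipRect(rect, 10);
                page.render('heading.png');
            }
        }
        phantom.exit();
    });
}
```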
This property stores the content of the web page (main frame), enclosed in an HTML/XML element. Setting the property will effectively reload the web page with the new content.
See also plainText to get the content without any HTML tags.
cookies
[Cookies]
Get or set Cookies visible to the current URL (though, for setting, use of WebPage#addCookie is preferred). This array will be pre-populated by any existing Cookie data visible to this URL that is stored in the CookieJar, if any. See phantom.cookies for more information on the CookieJar.
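As a rough illustration of reading this array, the sketch below looks up a cookie by name. The helper findCookie and the cookie name 'session' are hypothetical, and the code assumes each entry exposes name and value properties, as in the WebPage#addCookie example.

```javascript
// Hypothetical helper: return the first cookie with the given name, or null.
function findCookie(cookies, name) {
    for (var i = 0; i < cookies.length; i++) {
        if (cookies[i].name === name) {
            return cookies[i];
        }
    }
    return null;
}

// The part below only runs inside PhantomJS.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    var session = findCookie(page.cookies, 'session');
    console.log(session ? session.value : 'no such cookie');
    phantom.exit();
}
```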
Introduced: PhantomJS 1.5
This property specifies additional HTTP request headers that will be sent to the server for every request issued (for pages and resources). The default value is an empty object {}. Header names and values are encoded in US-ASCII before being sent. Please note that the 'User-Agent' should be set using WebPage#settings; setting the 'User-Agent' header in this property will overwrite the value set via WebPage#settings.
Example:
// Send two additional headers 'X-Test' and 'DNT'.
page.customHeaders = {
'X-Test': 'foo',
'DNT': '1'
};
Do you only want these customHeaders passed to the initial WebPage#open request? Here's the recommended workaround:
// Send two additional headers 'X-Test' and 'DNT'.
page.customHeaders = {
'X-Test': 'foo',
'DNT': '1'
};
page.onInitialized = function() {
page.customHeaders = {};
};
Introduced: PhantomJS 1.7
This property stores the content of the web page's currently active frame (which may or may not be the main frame), enclosed in an HTML/XML element. Setting the property will effectively reload the web page with the new content.
Introduced: PhantomJS 1.7
Read-only. This property stores the content of the web page's currently active frame (which may or may not be the main frame) as plain text — no element tags!
Introduced: PhantomJS 1.7
Read-only. This property gets the current URL of the web page's currently active frame (which may or may not be the main frame).
This property stores the path used by the WebPage#injectJs function to resolve the script name. Initially it is set to the location of the script invoked by PhantomJS.
This property defines whether navigation away from the page is permitted or not. If it is set to true, then the page is locked to the current URL. Defaults to false.
This property defines the size of the web page when rendered as a PDF.
The given object should be in one of the following two formats:
{ width: '200px', height: '300px', border: '0px' }
{ format: 'A4', orientation: 'portrait', border: '1cm' }
If no paperSize is defined, the size is defined by the web page. Supported dimension units are: 'mm', 'cm', 'in', 'px'. No unit means 'px'. Border is optional and defaults to 0. A non-uniform border can be specified in the form {left: '2cm', top: '2cm', right: '2cm', bottom: '3cm'}. Supported formats are: 'A3', 'A4', 'A5', 'Legal', 'Letter', 'Tabloid'. Orientation ('portrait', 'landscape') is optional and defaults to 'portrait'.
Example:
page.paperSize = { width: '5in', height: '7in', border: '20px' };
A repeating page header and footer can also be added via this property, as in this example:
page.paperSize = {
    format: 'A4',
    // ...
    header: {
        height: "1cm",
        contents: phantom.callback(function(pageNum, numPages) {
            if (pageNum == 1) {
                return "";
            }
            return "<h1>Header <span style='float:right'>" + pageNum + " / " + numPages + "</span></h1>";
        })
    }
};
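To make the two accepted shapes of paperSize concrete, here is a minimal sketch of a helper that builds either one. The helper makePaperSize and its option names are assumptions for illustration, not part of PhantomJS. It only checks the format names listed in this section and fills in the documented defaults.

```javascript
// Formats listed in this section.
var SUPPORTED_FORMATS = ['A3', 'A4', 'A5', 'Legal', 'Letter', 'Tabloid'];

// Hypothetical helper: build a paperSize object in one of the two documented shapes.
function makePaperSize(opts) {
    var border = opts.border || '0px'; // border is optional and defaults to 0
    if (opts.format) {
        if (SUPPORTED_FORMATS.indexOf(opts.format) === -1) {
            throw new Error('Unsupported paper format: ' + opts.format);
        }
        return {
            format: opts.format,
            orientation: opts.orientation || 'portrait',
            border: border
        };
    }
    // No named format: fall back to explicit dimensions.
    return { width: opts.width, height: opts.height, border: border };
}

// The part below only runs inside PhantomJS.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    page.paperSize = makePaperSize({ format: 'A4', border: '1cm' });
    phantom.exit();
}
```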
Read-only. This property stores the content of the web page (main frame) as plain text — no element tags!
See also: content which returns the content with element tags.
This property defines the scroll position of the web page.
Example:
page.scrollPosition = { top: 100, left: 0 };
This property stores various settings of the web page:
- javascriptEnabled defines whether to execute the script in the page or not (defaults to true).
- loadImages defines whether to load the inlined images or not (defaults to true).
- localToRemoteUrlAccessEnabled defines whether local resources (e.g. from file) can access remote URLs or not (defaults to false).
- userAgent defines the user agent sent to the server when the web page requests resources.
- userName sets the user name used for HTTP authentication.
- password sets the password used for HTTP authentication.
- XSSAuditingEnabled defines whether load requests should be monitored for cross-site scripting attempts (defaults to false).
- webSecurityEnabled defines whether web security should be enabled or not (defaults to true).
- resourceTimeout (in milliseconds) defines the timeout after which any resource requested will stop trying and proceed with other parts of the page. The onResourceTimeout callback will be called on timeout.
Note: The settings apply only during the initial call to the WebPage#open function. Subsequent modification of the settings object will not have any impact.
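Because of the note above, settings are best applied before the first call to WebPage#open. The following sketch shows that ordering. The helper applySettings, the user agent string and the URL are made up for illustration.

```javascript
// Hypothetical helper: copy the given overrides onto page.settings.
function applySettings(page, overrides) {
    for (var key in overrides) {
        if (Object.prototype.hasOwnProperty.call(overrides, key)) {
            page.settings[key] = overrides[key];
        }
    }
    return page;
}

// The part below only runs inside PhantomJS.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    // Configure first, then open: later changes are not picked up.
    applySettings(page, { loadImages: false, userAgent: 'ExampleBot/1.0' });
    page.open('http://example.com/', function (status) {
        console.log('Status: ' + status);
        phantom.exit();
    });
}
```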
Introduced: PhantomJS 1.7
Read-only. This property gets the current URL of the web page (main frame).
This property sets the size of the viewport for the layout process. It is useful to set the preferred initial size before loading the page, e.g. to choose between 'landscape' vs 'portrait'.
Because PhantomJS is headless (nothing is shown), viewportSize effectively simulates the size of the window like in a traditional browser.
Example:
page.viewportSize = { width: 480, height: 800 };
This property specifies the scaling factor for the WebPage#render and WebPage#renderBase64 functions. The default is 1, i.e. 100% zoom.
Example:
// Create a thumbnail preview with 25% zoom
page.zoomFactor = 0.25;
page.render('capture.png');
Introduced: PhantomJS 1.7
Adds a Cookie to the page. If the domains do not match, the Cookie will be ignored/rejected. Returns true if successfully added, otherwise false.
Example:
page.addCookie({
'name' : 'Added-Cookie-Name',
'value' : 'Added-Cookie-Value',
'domain': 'Added-Cookie-Domain'
});
Deprecated.
Deprecated.
Introduced: PhantomJS 1.7
Delete all Cookies visible to the current URL.
Introduced: PhantomJS 1.7
Closes the page and releases the memory heap associated with it. Do not use the page instance after calling this.
Due to some technical limitations, the web page object might not be completely garbage collected. This is often encountered when the same object is used over and over again. Calling this function may stop the increasing heap allocation.
Deprecated.
Introduced: PhantomJS 1.7
Delete any Cookies visible to the current URL with a 'name' property matching cookieName. Returns true if successfully deleted, otherwise false.
Example:
page.deleteCookie('Added-Cookie-Name');
Evaluates the given function in the context of the web page. The execution is sandboxed: the web page has no access to the phantom object and cannot probe its own settings.
Example:
var page = require('webpage').create();
page.open('http://m.bing.com', function(status) {
var title = page.evaluate(function() {
return document.title;
});
console.log(title);
phantom.exit();
});
As of PhantomJS 1.6, JSON-serializable arguments can be passed to the function. The following example achieves the same end goal as the previous one, but extracts the text value of a DOM element chosen by a selector that is passed to the evaluate call:
var page = require('webpage').create();
page.open('http://m.bing.com', function(status) {
var title = page.evaluate(function(s) {
return document.querySelector(s).innerText;
}, 'title');
console.log(title);
phantom.exit();
});
Note: The arguments and the return value of the evaluate function must be simple primitive objects. The rule of thumb: if it can be serialized via JSON, then it is fine. Closures, functions, DOM nodes, etc. will not work!
Evaluates the given function in the context of the web page without blocking the current execution. The function returns immediately and there is no return value. This is useful to run some script asynchronously.
Includes external script from the specified url (usually a remote location) on the page and executes the callback upon completion.
Example:
page.includeJs('http://ajax.googleapis.com/ajax/libs/jquery/1.8.2/jquery.min.js', function() {
/* jQuery is loaded, now manipulate the DOM */
});
Injects external script code from the specified file into the page (like WebPage#includeJs, except that the file does not need to be accessible from the hosted page). If the file cannot be found in the current directory, libraryPath is used for additional look up. This function returns true if injection is successful, otherwise it returns false.
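Since injectJs reports failure through its return value rather than by throwing, it is worth checking that value. The sketch below injects several files and collects the ones that could not be loaded. The helper injectAll and the file names are hypothetical.

```javascript
// Hypothetical helper: inject each file and return the names that failed.
function injectAll(page, files) {
    var failed = [];
    for (var i = 0; i < files.length; i++) {
        if (!page.injectJs(files[i])) {
            failed.push(files[i]);
        }
    }
    return failed;
}

// The part below only runs inside PhantomJS.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    var missing = injectAll(page, ['helpers.js', 'polyfills.js']);
    if (missing.length) {
        console.log('Could not inject: ' + missing.join(', '));
    }
    phantom.exit();
}
```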
Opens the url and loads it to the page. Once the page is loaded, the optional callback is invoked via WebPage#onLoadFinished and receives the page status ('success' or 'fail').
Example:
page.open('http://www.google.com/', function(status) {
console.log('Status: ' + status);
// Do other things here...
});
As of PhantomJS 1.2, the open function can be used to request a URL with methods other than GET. This syntax also includes the ability to specify data to be sent with the request. In the following example, we make a request using the POST method, and include some basic data.
Example:
var data = 'user=username&password=password';
page.open('http://www.google.com/', 'POST', data, function(status) {
console.log('Status: ' + status);
// Do other things here...
});
Stability: DEPRECATED - Use WebPage#close
Releases memory heap associated with this page. Do not use the page instance after calling this.
Due to some technical limitations, the web page object might not be completely garbage collected. This is often encountered when the same object is used over and over again. Calling this function may stop the increasing heap allocation.
Renders the web page to an image buffer and saves it as the specified filename. The special files /dev/stdout and /dev/stderr can be used here. The options hash is optional, and may contain these options: format and quality.
// render to file named "test.jpg" with JPEG format
page.render("test.jpg");
// render to file named "test.jpg" with PNG format. format option will override format of file extension.
page.render("test.jpg", { format: "png" });
// render to "test.jpg" with JPEG format and 50 quality
page.render("test.jpg", { quality: 50 });
// render to "test.jpg" with JPEG format and 50 quality
page.render("test.jpg", { format: "jpg", quality: 50 });
// render to stdout with PNG format. PNG is default for stdout.
page.render("/dev/stdout");
// render to stdout with JPEG format.
page.render("/dev/stdout", { format: "jpg" });
// render to stdout with JPEG format and 50 quality.
page.render("/dev/stdout", { format: "jpg", quality: 50 });
Unless the format option is given, the output format is determined automatically by the file extension. Supported formats include:
- PNG
- GIF
- JPEG
- And any other formats available in the QImage class.
Renders the web page to an image buffer and returns the result as a Base64-encoded string representation of that image.
Supported formats include:
- PNG
- GIF
- JPEG
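One typical use of the returned string is embedding the image in a data URI. The sketch below assumes renderBase64 takes the format name as its argument (for example 'PNG'). The helper toDataUri and the URL are made up for illustration.

```javascript
// MIME types for the formats listed above.
var IMAGE_MIME_TYPES = {
    png: 'image/png',
    gif: 'image/gif',
    jpeg: 'image/jpeg',
    jpg: 'image/jpeg'
};

// Hypothetical helper: wrap a Base64 string in a data URI for the given format.
function toDataUri(base64, format) {
    var key = String(format).toLowerCase();
    if (!Object.prototype.hasOwnProperty.call(IMAGE_MIME_TYPES, key)) {
        throw new Error('Unknown image format: ' + format);
    }
    return 'data:' + IMAGE_MIME_TYPES[key] + ';base64,' + base64;
}

// The part below only runs inside PhantomJS.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    page.open('http://example.com/', function (status) {
        if (status === 'success') {
            console.log(toDataUri(page.renderBase64('PNG'), 'PNG'));
        }
        phantom.exit();
    });
}
```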
sendEvent(mouseEventType[, mouseX, mouseY, button='left'])
or sendEvent(keyboardEventType, keyOrKeys, [null, null, modifier])
Sends an event to the web page. 1.7 implementation source.
The events are not like synthetic DOM events. Each event is sent to the web page as if it comes as part of user interaction.
For mouse events, the first argument is the event type. Supported types are 'mouseup', 'mousedown', 'mousemove', 'doubleclick' and 'click'. The next two arguments are optional and represent the mouse position for the event.
The button parameter (defaults to left) specifies the button to push.
For 'mousemove', however, there is no button pressed (i.e. it is not dragging).
For keyboard events, the first argument is the event type. The supported types are: keyup, keypress and keydown. The second parameter is a key (from page.event.key), or a string.
You can also pass a fifth argument, an integer indicating the modifier key.
- 0: No modifier key is pressed
- 0x02000000: A Shift key on the keyboard is pressed
- 0x04000000: A Ctrl key on the keyboard is pressed
- 0x08000000: An Alt key on the keyboard is pressed
- 0x10000000: A Meta key on the keyboard is pressed
- 0x20000000: A keypad button is pressed
The third and fourth arguments are ignored for keyboard events. Pass null for them.
Example:
page.sendEvent('keypress', page.event.key.A, null, null, 0x02000000 | 0x08000000 );
This simulates a Shift+Alt+A keyboard combination.
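The hexadecimal modifier values are easy to mistype, so one option is to name them and combine them with a small helper. This is only a sketch: the MODIFIERS table copies the values listed above, while the helper modifierMask and the click coordinates are assumptions for illustration.

```javascript
// Modifier values as listed in this section.
var MODIFIERS = {
    shift: 0x02000000,
    ctrl: 0x04000000,
    alt: 0x08000000,
    meta: 0x10000000,
    keypad: 0x20000000
};

// Hypothetical helper: OR together the named modifiers into a single integer.
function modifierMask(names) {
    var mask = 0;
    for (var i = 0; i < names.length; i++) {
        if (!Object.prototype.hasOwnProperty.call(MODIFIERS, names[i])) {
            throw new Error('Unknown modifier: ' + names[i]);
        }
        mask |= MODIFIERS[names[i]];
    }
    return mask;
}

// The part below only runs inside PhantomJS.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    // Mouse event: left click at an assumed position (x = 100, y = 200).
    page.sendEvent('click', 100, 200);
    // Keyboard event: Shift+Alt+A, same as the example above.
    page.sendEvent('keypress', page.event.key.A, null, null, modifierMask(['shift', 'alt']));
    phantom.exit();
}
```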
Introduced: PhantomJS 1.8
Sets both the WebPage#content and WebPage#url properties.
The web page will be reloaded with the new content and the current location set to the given url, without any actual HTTP request being made.
Deprecated.
Uploads the specified file (filename) to the form element associated with the selector.
This function is used to automate the upload of a file, which is usually handled with a file dialog in a traditional browser. Since there is no dialog in this headless mode, such an upload mechanism is handled via this special function instead.
Example:
page.uploadFile('input[name=image]', '/path/to/some/photo.jpg');
Introduced: PhantomJS 1.0
This callback is invoked when there is a JavaScript alert on the web page. The only argument passed to the callback is the string for the message. There is no return value expected from the callback handler.
Example:
page.onAlert = function(msg) {
console.log('ALERT: ' + msg);
};
Stability: EXPERIMENTAL
Introduced: PhantomJS 1.6
This callback is invoked when there is a JavaScript window.callPhantom call made on the web page. The only argument passed to the callback is a data object.
Note: window.callPhantom is still an experimental API. In the near future, it will likely be replaced with a message-based solution which will still provide the same functionality.
Although there are many possible use cases for this inversion of control, the primary one so far is to avoid the need for a PhantomJS script to continually poll for some variable on the web page.
Example:
WebPage (client-side)
if (typeof window.callPhantom === 'function') {
window.callPhantom({ hello: 'world' });
}
PhantomJS (server-side)
page.onCallback = function(data) {
console.log('CALLBACK: ' + JSON.stringify(data)); // Prints 'CALLBACK: {"hello":"world"}'
};
Additionally, note that the WebPage#onCallback handler can return a data object that will be carried back as the result of the originating window.callPhantom call, too.
Example:
WebPage (client-side)
if (typeof window.callPhantom === 'function') {
var status = window.callPhantom({ secret: 'ghostly' });
alert(status); // Will either print 'Accepted.' or 'DENIED!'
}
PhantomJS (server-side)
page.onCallback = function(data) {
if (data && data.secret && data.secret === 'ghostly') {
return 'Accepted.';
}
return 'DENIED!';
};
Introduced: PhantomJS 1.7
This callback is invoked when the WebPage object is being closed, either via WebPage#close in the PhantomJS outer space or via window.close in the page's client-side. It is not invoked when child/descendant pages are being closed unless you also hook them up individually. It takes one argument, closingPage, which is a reference to the page that is closing. Once the onClosing handler has finished executing (returned), the WebPage object closingPage will become invalid.
Example:
page.onClosing = function(closingPage) {
console.log('The page is closing! URL: ' + closingPage.url);
};
Introduced: PhantomJS 1.6
This callback is invoked when there is a JavaScript confirm on the web page. The only argument passed to the callback is the string for the message. The return value of the callback handler can be either true or false, which are equivalent to pressing the "OK" or "Cancel" buttons presented in a JavaScript confirm, respectively.
Example:
page.onConfirm = function(msg) {
console.log('CONFIRM: ' + msg);
return true; // `true` === pressing the "OK" button, `false` === pressing the "Cancel" button
};
Introduced: PhantomJS 1.2
This callback is invoked when there is a JavaScript console message on the web page. The callback may accept up to three arguments: the string for the message, the line number, and the source identifier.
By default, console messages from the web page are not displayed. Using this callback is a typical way to redirect them.
Example:
page.onConsoleMessage = function(msg, lineNum, sourceId) {
console.log('CONSOLE: ' + msg + ' (from line #' + lineNum + ' in "' + sourceId + '")');
};
Note: The line number and source identifier are not used yet, at least in PhantomJS <= 1.8.1. You receive undefined values.
Introduced: PhantomJS 1.5
This callback is invoked when there is a JavaScript execution error. It is a good way to catch problems when evaluating a script in the web page context. The arguments passed to the callback are the error message and the stack trace [as an Array].
Example:
page.onError = function(msg, trace) {
var msgStack = ['ERROR: ' + msg];
if (trace && trace.length) {
msgStack.push('TRACE:');
trace.forEach(function(t) {
msgStack.push(' -> ' + t.file + ': ' + t.line + (t.function ? ' (in function "' + t.function + '")' : ''));
});
}
console.error(msgStack.join('\n'));
};
Introduced: PhantomJS 1.3
This callback is invoked after the web page is created but before a URL is loaded. The callback may be used to change global objects.
Example:
page.onInitialized = function() {
page.evaluate(function() {
document.addEventListener('DOMContentLoaded', function() {
console.log('DOM content has loaded.');
}, false);
});
};
Introduced: PhantomJS 1.2
This callback is invoked when the page finishes loading. It may accept a single argument indicating the page's status: 'success' if no network errors occurred, otherwise 'fail'.
Also see WebPage#open for an alternate hook for the onLoadFinished callback.
Example:
page.onLoadFinished = function(status) {
console.log('Status: ' + status);
// Do other things here...
};
Introduced: PhantomJS 1.2
This callback is invoked when the page starts loading. There is no argument passed to the callback.
Example:
page.onLoadStarted = function() {
var currentUrl = page.evaluate(function() {
return window.location.href;
});
console.log('Current page ' + currentUrl + ' will be gone...');
console.log('Now loading a new page...');
};
Introduced: PhantomJS 1.6
By implementing this callback, you will be notified when a navigation event happens and know if it will be blocked (by WebPage#navigationLocked). Takes the following arguments:
- url: The target URL of this navigation event
- type: Possible values include: 'Undefined', 'LinkClicked', 'FormSubmitted', 'BackOrForward', 'Reload', 'FormResubmitted', 'Other'
- willNavigate: true if navigation will happen, false if it is locked (by WebPage#navigationLocked)
- main: true if this event comes from the main frame, false if it comes from an iframe or some other sub-frame.
Example:
page.onNavigationRequested = function(url, type, willNavigate, main) {
console.log('Trying to navigate to: ' + url);
console.log('Caused by: ' + type);
console.log('Will actually navigate: ' + willNavigate);
console.log("Sent from the page's main frame: " + main);
};
Introduced: PhantomJS 1.7
This callback is invoked when a new child window (but not deeper descendant windows) is created by the page, e.g. using window.open. In the PhantomJS outer space, this WebPage object will not yet have called its own WebPage#open method and thus does not yet know its requested URL (WebPage#url). Therefore, the most common purpose for utilizing a WebPage#onPageCreated callback is to decorate the page (e.g. hook up callbacks, etc.).
Example:
page.onPageCreated = function(newPage) {
console.log('A new child page was created! Its requested URL is not yet available, though.');
// Decorate
newPage.onClosing = function(closingPage) {
console.log('A child page is closing: ' + closingPage.url);
};
};
Introduced: PhantomJS 1.6
This callback is invoked when there is a JavaScript prompt on the web page. The arguments passed to the callback are the string for the message (msg) and the default value (defaultVal) for the prompt answer. The return value of the callback handler should be a string.
Example:
page.onPrompt = function(msg, defaultVal) {
if (msg === "What's your name?") {
return 'PhantomJS';
}
return defaultVal;
};
Introduced: PhantomJS 1.2
This callback is invoked when the page requests a resource. The first argument to the callback is the requestData metadata object. The second argument is the networkRequest object itself.
Example:
page.onResourceRequested = function(requestData, networkRequest) {
console.log('Request (#' + requestData.id + '): ' + JSON.stringify(requestData));
};
The requestData metadata object contains these properties:
- id: the number of the requested resource
- method: http method
- url: the URL of the requested resource
- time: Date object containing the date of the request
- headers: list of http headers
The networkRequest object contains these functions:
- abort(): aborts the current network request. Aborting the current network request will invoke the onResourceError callback.
- changeUrl(url): changes the current URL of the network request.
- setHeader(key, value)
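As an illustration of networkRequest.abort(), the sketch below builds a handler that cancels any request whose URL matches one of a set of patterns. The factory makeBlocker and the example patterns are hypothetical. Keep in mind that each aborted request will also trigger onResourceError.

```javascript
// Hypothetical factory: returns an onResourceRequested handler that aborts
// requests whose URL matches any of the given regular expressions.
// The patterns should not use the `g` flag, so that test() stays stateless.
function makeBlocker(patterns) {
    return function (requestData, networkRequest) {
        for (var i = 0; i < patterns.length; i++) {
            if (patterns[i].test(requestData.url)) {
                networkRequest.abort();
                return;
            }
        }
    };
}

// The part below only runs inside PhantomJS.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    // Example patterns (assumptions): skip stylesheets and an analytics host.
    page.onResourceRequested = makeBlocker([/\.css(\?|$)/, /analytics/]);
    page.open('http://example.com/', function (status) {
        console.log('Status: ' + status);
        phantom.exit();
    });
}
```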
Introduced: PhantomJS 1.2
This callback is invoked when a resource requested by the page is received. The only argument to the callback is the response metadata object.
If the resource is large and sent by the server in multiple chunks, onResourceReceived will be invoked for every chunk received by PhantomJS.
Example:
page.onResourceReceived = function(response) {
console.log('Response (#' + response.id + ', stage "' + response.stage + '"): ' + JSON.stringify(response));
};
The response metadata object contains these properties:
- id: the number of the requested resource
- url: the URL of the requested resource
- time: Date object containing the date of the response
- headers: list of http headers
- bodySize: size of the received content decompressed (entire content or chunk content)
- contentType: the content type if specified
- redirectURL: if there is a redirection, the redirected URL
- stage: "start", "end" (FIXME: other value for intermediate chunk?)
- status: http status code. ex: 200
- statusText: http status text. ex: OK
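Because the callback can fire more than once per resource, a handler often filters on the stage property. The sketch below records only finished responses with an HTTP status of 400 or above. The factory makeFailureCollector and the URL are made up for illustration.

```javascript
// Hypothetical factory: returns an onResourceReceived handler that appends
// finished responses with an error status to the given array.
function makeFailureCollector(failures) {
    return function (response) {
        // Ignore every stage except the final one.
        if (response.stage === 'end' && response.status >= 400) {
            failures.push({ url: response.url, status: response.status });
        }
    };
}

// The part below only runs inside PhantomJS.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    var failures = [];
    page.onResourceReceived = makeFailureCollector(failures);
    page.open('http://example.com/', function (status) {
        console.log('Failed resources: ' + JSON.stringify(failures));
        phantom.exit();
    });
}
```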
Introduced: PhantomJS 1.2
This callback is invoked when a resource requested by the page times out according to settings.resourceTimeout. The only argument to the callback is the request metadata object.
Example:
page.onResourceTimeout = function(request) {
console.log('Response (#' + request.id + '): ' + JSON.stringify(request));
};
The request metadata object contains extra error-related properties:
- id: the number of the requested resource
- method: http method
- url: the URL of the requested resource
- time: Date object containing the date of the request
- headers: list of http headers
- errorCode: the error code of the error
- errorString: text message of the error
Introduced: PhantomJS 1.6
This callback is invoked when the URL changes, e.g. as it navigates away from the current URL. The only argument to the callback is the new targetUrl string.
Example:
page.onUrlChanged = function(targetUrl) {
console.log('New URL: ' + targetUrl);
};
To retrieve the old URL, use the onLoadStarted callback.
Introduced: PhantomJS 1.9
This callback is invoked when a web page is unable to load a resource. The only argument to the callback is the resourceError metadata object.
Example:
page.onResourceError = function(resourceError) {
console.log('Unable to load resource (#' + resourceError.id + ' URL: ' + resourceError.url + ')');
console.log('Error code: ' + resourceError.errorCode + '. Description: ' + resourceError.errorString);
};
The resourceError metadata object contains these properties:
- id: the number of the request
- url: the resource url
- errorCode: the error code
- errorString: the error description
These functions call the callbacks. They are used for tests.