How to use Chrome content scripts to automate web page interaction and
nested navigations
I am trying to control Chrome from my C# application.
I am looking to establish a fairly simple API between my C#
application and Chrome:
- Navigate to a URL
- Find HTML element based on some criteria (typically to be done via jQuery)
- Click on the selected element
- Repeat as needed
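As an illustration only, the steps above might map onto a small command handler inside the content script. This is a sketch under assumptions: the command names (`navigate`, `find`, `click`), the message fields (`type`, `url`, `selector`) and the `env` wrapper are hypothetical, and plain `querySelector` stands in for the jQuery lookup.

```javascript
// Hypothetical command handler for the content script (names are assumptions).
// `env` wraps the page objects so the logic can also be exercised with fakes:
//   env.location : object with an `href` property (window.location in a page)
//   env.document : object with querySelector() (the page document in a page)
function handleCommand(cmd, env) {
  switch (cmd.type) {
    case 'navigate':
      // In a real tab, assigning href starts a new page load.
      env.location.href = cmd.url;
      return { ok: true, type: 'navigate' };
    case 'find': {
      const el = env.document.querySelector(cmd.selector);
      return { ok: el !== null, type: 'find' };
    }
    case 'click': {
      const el = env.document.querySelector(cmd.selector);
      if (el === null) {
        return { ok: false, type: 'click', error: 'element not found' };
      }
      el.click();
      return { ok: true, type: 'click' };
    }
    default:
      return { ok: false, type: cmd.type, error: 'unknown command' };
  }
}
```

In a page this would be called along the lines of `handleCommand(JSON.parse(event.data), { location: window.location, document: document })`, with the returned object serialized back to the C# side.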
My C# program will manage multiple Chrome instances doing this kind of work.
The solution that I am trying to implement is to use Chrome extension
'content' scripts.
Here's my current Manifest.json:
{
  "name": "ScraperAPI",
  "manifest_version": 2,
  "version": "0.0.1",
  "content_scripts": [
    {
      "matches": [ "<all_urls>" ],
      "js": [ "AutomationApi.js" ],
      "run_at": "document_start",
      "all_frames": false
    }
  ],
  "permissions": [ "tabs", "http://*/*", "storage" ],
  "web_accessible_resources": [ "jQuery.min.v.2.0.3.map" ]
}
The content script uses a WebSocket to communicate with my C# application.
So far, the WebSocket works very well for communicating API requests (such
as 'navigate') and responses (such as 'document ready').
My extension listens for document ready and opens up a WebSocket to my C#
program. This part works.
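A minimal sketch of that connect-on-ready flow, for readers who want something concrete. It is not the actual AutomationApi.js: the function name `connectWhenReady`, the `documentReady` message shape and the endpoint in the usage note are assumptions made for illustration.

```javascript
// Sketch only: `doc` is the page document and `WebSocketCtor` the browser's
// WebSocket constructor. Both are passed in so the flow can be tried with fakes.
function connectWhenReady(doc, WebSocketCtor, url) {
  function open() {
    const socket = new WebSocketCtor(url);
    socket.onopen = function () {
      // Report to the C# side that this page has finished loading.
      socket.send(JSON.stringify({ type: 'documentReady', href: doc.location.href }));
    };
    return socket;
  }
  if (doc.readyState === 'loading') {
    // With "run_at": "document_start" the DOM is not parsed yet, so wait.
    doc.addEventListener('DOMContentLoaded', open);
  } else {
    open();
  }
}
```

In the content script this would be invoked as something like `connectWhenReady(document, WebSocket, 'ws://localhost:8080/')`, where the URL is a placeholder for wherever the C# program listens.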
My issues are:
[1] How to get the content script to launch automatically when Chrome
comes up? It appears that until I manually enter a URL in the navigation
bar, my extension is not loaded.
How can the content script recognize that it is running for the first time
in the Chrome instance?
[2] How to maintain general state for the content script across page loads?
In particular, the C# program uses the content script to navigate to a
URL, find a particular HTML element, and click on it. After 'document
ready' it then wants to continue browsing and clicking further on the
new page.
Unfortunately (for me), each time a page is loaded, Chrome loads a new
instance of the content script.
==> I don't know how to have the script determine whether it is running
for the first time or not.
The first time through, it has to open a WebSocket to the C# program. On
subsequent loads in the same tab, I want it to recognize that the WebSocket
connection exists and continue using it.
I tried to create a 'window.myApi' object to save data, but each page
seems to get a new window object.
I was going to try 'local storage', but that is shared between all local
instances of all the scripts, so a fresh instance of the content script
still does not know whether it is running for the first time or not.
By the way, when my content script opens a WebSocket to the C# program,
the C# program responds with a unique id (GUID), and all further
communications use this unique id in the messages.
It would be great if a content script could tell whether it has already
been assigned a unique id by the C# program.
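To make the id handshake concrete, here is a sketch of how one content-script instance might remember the id and stamp it on outgoing messages. The field names (`id`, `type`, `payload`), the `assignId` message type and the `createSession` helper are all assumptions; the real wire format is not shown in this post.

```javascript
// Hypothetical session wrapper: remembers the id sent by the C# program and
// adds it to every outgoing message.
function createSession() {
  let id = null; // lives only as long as this content-script instance
  return {
    // Pass every incoming WebSocket message (a JSON string) through here.
    receive(raw) {
      const msg = JSON.parse(raw);
      if (msg.type === 'assignId') {
        id = msg.id;
      }
      return msg;
    },
    // True once the C# side has assigned an id to this instance.
    hasId() {
      return id !== null;
    },
    // Build the JSON string for an outgoing message.
    wrap(type, payload) {
      if (id === null) {
        throw new Error('no id assigned yet');
      }
      return JSON.stringify({ id: id, type: type, payload: payload });
    }
  };
}
```

Because `id` is held in a script variable, it disappears when the next page load creates a fresh content-script instance, which is exactly the problem described in [2].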
Is there an 'uber' window for the entire Chrome instance that I can
latch onto?
Should I be using a different approach (not using content scripts) to
automate Chrome sessions?
Is there an approach to saving state that I missed?
Any help would be greatly appreciated.
-Many thanks in advance, Davud