#Chrobot: Browser Automation Library


stone musk
#

I've been working on a library to control a browser instance via the Chrome DevTools Protocol. I just published the package on hex here:

https://hexdocs.pm/chrobot/index.html
Hope it might be useful to someone! There are some examples right in the readme if you're interested.

I used the JSON specification of the protocol to generate gleam code for it, so you can use the protocol in a typed way (somewhat).
There is also a module with high level functions to do simple things, which I'm still hoping to extend considerably, but for now I need to take a break I think.

It's been working well for me on macOS, I ran the tests on Debian as well, I have no idea if it would work on windows.

An enjoyable first gleaming experience for me overall 🌟

jagged kestrel
#

Oh wow this looks extremely useful

#

So many places this would come in handy

tight harness
#

TIL there is a chrome devtools protocol

#

this is awesome!

jagged kestrel
#

Great docs too!!

stone musk
#

tyty jigglypuff

jagged kestrel
#

@vestal lantern one for the newsletter perhaps

spring flame
#

Whoaa this is really cool

#

@latent knot this might end up being useful for Lustre too!

tall sundial
#

Oh wow wow wow, very cool, thank you! I hope we'll have a ~~puppeteer~~ playwright client library some day as well, for testing in Firefox etc

stone musk
# tall sundial Oh wow wow wow, very cool, thank you! I hope we'll have a ~~puppeteer~~playwrigh...

What I was going for here is something a bit like puppeteer yes, which also uses the chrome devtools protocol.

If you want solid cross browser automation for all browser engines (Firefox, Safari, Chrome, Edge etc.) it would be best to build on the WebDriver spec instead, or just use one of the well engineered & maintained packages for this, like playwright, which works really well in my experience. Not sure if it makes a lot of sense to rebuild all this just for the sake of having it 'in gleam'.

I decided against building a WebDriver client because I thought it would be neat to keep the scope smaller and focus on getting specific things done.
With WebDriver you would also need to run the WebDriver server (like Geckodriver for FF or Chromedriver) in addition to the browser, which I think adds some complexity that I don't need for my use cases, but a client would def be nice to have!

Something interesting though: Firefox specifically does actually support a subset of the Chrome Devtools Protocol:
https://firefox-source-docs.mozilla.org/remote/cdp/
and I think puppeteer allows you to control FF as well through that.
But I can't really see anywhere what specifically is included in this subset. My package uses pipes to communicate with the browser via file descriptors instead of the older method of a websocket connection, and I have no idea if FF supports that, for example.
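For anyone curious what "pipes instead of a websocket" means on the wire, here is a minimal Python sketch of the message framing only. It assumes Chrome was started with `--remote-debugging-pipe`, in which case commands and responses are JSON objects separated by a null byte. The helper names `encode_message` and `decode_messages` are made up for illustration and are not part of chrobot.

```python
from __future__ import annotations

import json


def encode_message(msg_id: int, method: str, params: dict | None = None) -> bytes:
    """Serialize one CDP command and terminate it with a null byte."""
    payload = {"id": msg_id, "method": method, "params": params or {}}
    return json.dumps(payload).encode("utf-8") + b"\0"


def decode_messages(buffer: bytes) -> tuple[list[dict], bytes]:
    """Split a read buffer into complete messages plus the unfinished tail.

    The tail must be kept and prepended to the next chunk read from the pipe,
    because a single read can end in the middle of a message.
    """
    *complete, tail = buffer.split(b"\0")
    return [json.loads(chunk) for chunk in complete if chunk], tail
```

A reader loop would call `decode_messages` on everything read so far and match each response to its command by `id`.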

latent knot
#

do you think you could come up with a way to automatically download a chromium browser if the user doesnt have one available

#

instead of asking users to use npx/puppeteer

#

not sure if it makes a lot of sense to rebuild all this just for the sake of having it 'in gleam'.
my perspective is i'd like lustre_dev_tools to provide an experience that works out of the box without any additional config or tooling. for users that want more configuration or more power im always very happy to push people towards better tools (vite for dev server/bundling, puppeteer for testing, ...) but it would be so cool if i could include an e2e testing solution in the dev tools that's zero-config

latent knot
stone musk
# latent knot do you think you could come up with a way to automatically download a chromium b...

I thought about this, I would love to do this and kind of planned to initially but focused on other things.
What I noticed though: the puppeteer install script (which uses Google's Chrome for Testing distribution) isn't even as good as I thought. It works great on mac, but when I tried it on debian it didn't even attempt to install any dependencies, so it's not zero config at all. There are issues open about it where people post all sorts of unhinged "paste this into your terminal" hacks to get the right dependencies on linux:

https://github.com/GoogleChromeLabs/chrome-for-testing/issues/55
https://github.com/GoogleChromeLabs/chrome-for-testing/issues/121

#

I would also love it to be zero config, which is why I made the default launch command without any parameters, but not sure how to solve this in a cross platform way atm

#

I think I could do as well as the puppeteer install script for mac, but not sure if that's good enough..

latent knot
#

ah yeah thats interesting/unfortunate

stone musk
latent knot
#

i will add it on to my never-ending list of things to investigate

stone musk
#

I could look at how playwright does it, maybe they do a better job

latent knot
#

we were discussing packaging up a binary of bun or deno plus vitest and a jsdom package (not jsdom specifically) and exposing an api to write tests using that at one point

#

this feels a bit more "heavyweight" but also probably much nicer

stone musk
latent knot
#

it is nice when people test their websites against all engines i agree

#

but i am absolutely not planning on writing a tool that will somehow download platform-appropriate binaries for all major browsers and test them 🙂

#

something is better than nothing, and if i can provide a convenient way to write some e2e tests that is better than the state of the world today where no e2e tests can be written

stone musk
latent knot
#

that was before this existed!

stone musk
#

Aaah neat! I was just thinking if you were to package bun, you might as well just use playwright from it which should be able to handle all the installation bits.
But if you wanna try using the chrobot package that's great, I'm open to any contributions regarding the installation etc. from anyone of course! 🌞 would love to make it more straightforward

latent knot
#

if i make any headway i'll let you know!

tall sundial
#

Oohh I was confused I wrote puppeteer when I meant playwright, sorry

#

There are too many e2e testing automation approaches: old selenium webdriver, w3c webdriver, playwright, webdriver BiDi... What I like about playwright is the company behind it (Microsoft), which ships compatible forks of browsers with some protocol bugs fixed. With webdriver and w3c webdriver I was constantly annoyed at small differences of behavior when testing on chrome vs firefox etc

stone musk
# tall sundial There's too many e2e testing automation approaches, old selenium webdriver, w3c ...

yeah playwright is great, they also have this recording feature for vscode where you can click around in a browser session and it will generate a test from it, so useful!
The fact that it's already such a good tool is also why I decided not to target the same kind of niche and to keep the scope smaller.

Just for context: My own use case for this package is I often find myself wanting to do some slow, long running one-off web scraping job (for 'art'/archiving rather than AI / business / whatever) and I would like better tooling for it and to do it on the BEAM because I think it's fun.
My plan is to use chrobot next time I want to do something like that.
And I actually want to use it from Elixir, I think it can be a neat flow, writing the scrape function in a quick unsafe way without having to worry about types etc. letting the process crash if something doesn't line up and just skip to the next item.

#

It might also be neat for generating PDFs, although currently it is unsatisfyingly slow to get the PDF from the browser. Could just be because it's being piped back as a base64 encoded string inside the JSON payload, but I can't see any way around it, also open for suggestions to improve that..
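To put a rough number on the base64 part: the `Page.printToPDF` response carries the PDF as a base64 string in its `data` field, and base64 inflates the payload by about a third. The Python sketch below only illustrates that size overhead and the decode step. It does not show that base64 is the actual bottleneck here, and the helper names are made up.

```python
import base64
import json


def encoded_size(n_bytes: int) -> int:
    """Length of the padded base64 text for n_bytes of binary data."""
    return 4 * ((n_bytes + 2) // 3)


def pdf_from_response(raw: str) -> bytes:
    """Decode the base64 `data` field of a printToPDF-style JSON response."""
    response = json.loads(raw)
    return base64.b64decode(response["result"]["data"])
```

So a 30 MB PDF travels as roughly 40 MB of text that has to be JSON-parsed and then decoded before the bytes can be written to disk.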

latent knot
#

And I actually want to use it from Elixir,
ah interesting, did you already look at hound and the other libraries?

stone musk
#

I've used this twice: https://github.com/elixir-crawly/crawly
I think it's alright but not exactly what I want, and the whole time I was thinking oh I would just like to have a browser instance and write some query selectors / javascript against that and also structure it a bit differently, I prefer to use an sqlite db rather than having the output be json lines and the state in an opaque dets store, have multiple jobs that depend on each other, like one to traverse an index, one to scrape detail pages, stuff like that.
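A minimal sketch of that kind of layout, in Python with the standard `sqlite3` module: one `jobs` table, an `index` job that discovers URLs, and `detail` jobs that get enqueued as a result. The table layout and function names are assumptions for illustration only, not something chrobot or crawly provides.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS jobs (
    id     INTEGER PRIMARY KEY,
    kind   TEXT NOT NULL,                     -- 'index' or 'detail'
    url    TEXT NOT NULL UNIQUE,
    status TEXT NOT NULL DEFAULT 'pending',
    result TEXT
);
"""


def open_store(path=":memory:"):
    """Open (or create) the job store."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def enqueue(conn, kind, url):
    # INSERT OR IGNORE plus the UNIQUE url makes re-running a scrape idempotent.
    conn.execute("INSERT OR IGNORE INTO jobs (kind, url) VALUES (?, ?)", (kind, url))
    conn.commit()


def next_pending(conn, kind):
    """Return (id, url) of the oldest pending job of this kind, or None."""
    return conn.execute(
        "SELECT id, url FROM jobs WHERE kind = ? AND status = 'pending' "
        "ORDER BY id LIMIT 1",
        (kind,),
    ).fetchone()


def finish(conn, job_id, result, found_urls=()):
    """Mark a job done and enqueue any detail pages it discovered."""
    conn.execute(
        "UPDATE jobs SET status = 'done', result = ? WHERE id = ?", (result, job_id)
    )
    for url in found_urls:
        conn.execute(
            "INSERT OR IGNORE INTO jobs (kind, url) VALUES ('detail', ?)", (url,)
        )
    conn.commit()
```

Because the state lives in a plain table, a crashed run can be resumed by simply asking for the next pending job again.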

For browser automation with Elixir I hadn't heard of hound before. I only found this https://github.com/holsee/chroxy which in turn suggests using another library to do the communication via CDP https://github.com/cyrus-and/chrome-remote-interface and I thought I would prefer to have it kind of all in one, batteries included with some high level functions, so you could just launch the browser, open a page and select some elements. This is how I arrived at the chrobot package.

#

And I like the idea of having a "solid" typed library in gleam as the basis for doing some quick and dirty scripting in Elixir.

#

Actually a nice implementation of the CDP in Elixir that I used a bit as a reference is this:
https://github.com/bitcrowd/chromic_pdf
which is only focused on PDF generation though.
But it was really helpful in figuring out how to get the erlang Port to work properly.

stone musk
# stone musk I could look at how playwright does it, maybe they do a better job

@latent knot The playwright approach to installing browsers on Linux is that they check if required libs are on the system and if not, they suggest the packages to install to get them.
https://github.com/microsoft/playwright/tree/main/utils/linux-browser-dependencies#mapping-distribution-libraries-to-package-names
They do that for ubuntu and debian, and they have a script there to generate a map of libs to package names for those and then they put the output in their installation script here 😵‍💫
https://github.com/microsoft/playwright/blob/main/packages/playwright-core/src/server/registry/nativeDeps.ts
I guess I could just copy the list of packages from there and try to keep it up to date.
I don't see a way around installing dependencies manually for linux distros.
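The "check for missing libs, then suggest packages" idea can be sketched in a few lines of Python. This is an assumption-heavy illustration: it parses `ldd`-style output for "not found" entries, and the three map entries are only an example subset for Debian/Ubuntu. A real list would have to be copied from somewhere and maintained per distribution.

```python
# Illustrative subset only, not a complete or authoritative mapping.
LIB_TO_PACKAGE = {
    "libnss3.so": "libnss3",
    "libatk-1.0.so.0": "libatk1.0-0",
    "libgbm.so.1": "libgbm1",
}


def missing_from_ldd(ldd_output):
    """Collect library names that ldd reports as 'not found'."""
    return [
        line.split("=>")[0].strip()
        for line in ldd_output.splitlines()
        if "not found" in line
    ]


def install_hint(missing_libs, lib_to_package=LIB_TO_PACKAGE):
    """Turn missing library names into a suggested command, never running it."""
    packages = sorted({lib_to_package[lib] for lib in missing_libs if lib in lib_to_package})
    unknown = sorted(lib for lib in missing_libs if lib not in lib_to_package)
    lines = []
    if packages:
        lines.append("sudo apt-get install " + " ".join(packages))
    if unknown:
        lines.append("no known package for: " + ", ".join(unknown))
    return "\n".join(lines)
```

The function only returns a hint string, so the library itself never installs anything on the user's system.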

tall sundial
#

noo I think a gleam lib should not even try to install system deps, listing them in readme as requirements seems more appropriate

#

It's cool that Microsoft makes it just work for most people but they have a lot of resources to keep this up to date etc

jagged kestrel
#

Totally

#

I would not want a library to install stuff

stone musk
#

I just published a 2.1.1 which contains rudimentary browser installation functionality and removes the pointers to the puppeteer install script

https://hexdocs.pm/chrobot/browser_install.html

Comes with some caveats though:

  • Linux dependencies are not installed automatically, same as puppeteer, though I added some hints
  • No support for ARM linux, because Google doesn't provide binaries of Chrome for Testing for them unfortunately
  • No idea about Windows (personally will never need it to work there)

nimble girder
#

time to use this to make my own Rabbit R1 competitor

#

i've developed a brand new, game-changing AI architecture that i'm calling the SHAM

#

/j

stone musk
#

my dream is to build an automated AI tool that realistically browses the web like my dad, at a quite relaxed pace, double clicking every link for some reason, downloading all sorts of malware, would put it up on a screen and just see what it gets up to

latent knot
#

This is the funniest teardown of the r1 i've ever read

hot crown
#

nice one. Just made my first integration test, was a breeze.

#

anything to provide text input to a page, other than dispatch_key_event ? that seems cumbersome.

#

no problem to wrap it to pass a String to some input field.
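A sketch of what such a wrapper boils down to at the protocol level, in Python for brevity: one `keyDown` (carrying the text) and one `keyUp` per character, sent as `Input.dispatchKeyEvent` commands. The function name is hypothetical, and whether chrobot's generated `dispatch_key_event` binding takes exactly these fields is assumed from the protocol rather than checked against the package. Special keys such as Enter or Tab are not handled.

```python
def key_events_for(text, first_id=1):
    """Build Input.dispatchKeyEvent commands that type `text` one character at a time."""
    messages = []
    next_id = first_id
    for char in text:
        for event_type in ("keyDown", "keyUp"):
            params = {"type": event_type, "key": char}
            if event_type == "keyDown":
                # Only the keyDown event carries the text to be inserted.
                params["text"] = char
            messages.append(
                {"id": next_id, "method": "Input.dispatchKeyEvent", "params": params}
            )
            next_id += 1
    return messages
```

The protocol also has an `Input.insertText` command that takes a whole string at once, which may be simpler when individual key events do not matter.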

#

Looks like using multiple tabs is possible? Nice!

stone musk
hot crown
#

good to hear!

stone musk
hot crown
#

sounds good, will try soon!

spring flame
stone musk
#

I just published a version 3.0 since I've been made aware I should have put everything under the chrobot namespace, I've restructured the project accordingly now.

I've also recently updated the dependencies and made sure the package works with Gleam 1.4.1 🙂

There are some new features in the chrobot module too:

  • A convenience function for launching the browser with a visible window, which is useful for debugging (launch_window)
  • A polling function for requests to the protocol (poll)
  • Selectors that run on elements, given their RemoteObjectId (select_from and select_all_from)

jagged kestrel
#

Oooh lovely!!

#

Thank you

lament sigil
#

Glad my input was useful 🙂

raven lake
#

this is awesome, nice work

stone musk
#

Specifically: should there perhaps be a clearer naming distinction between functions that are part of the builder, and functions that take in the builder but issue a command, or is it obvious because of the return types already? 🤔

#

Hm maybe keep the locator as just a builder and have a separate module called action with functions that take locators so it would be

import chrobot/locator
import chrobot/action

/// Use a locator to select an input and fill in text 
pub fn wibble() {
  locator.new()
  |> locator.label("First Name")
  |> action.fill("Lucy")
}

lament sigil
#

The most human API I have seen in this area is codeceptjs:

    I.amOnPage("/vatandas");
    I.waitForElement("#LoginForm_username", 10);
    I.fillField("#LoginForm_username", current.login);
    I.fillField("#LoginForm_password", current.password);
    I.click("button[type='submit']");
    I.waitForText("Make Hospital Appointment", 10);
    I.waitForText("You do not have an active appointment", 10);
    I.click({ css: ".hasta-randevu-card" });
    I.click("General Search");
    I.click({
      css: ".ant-form-item:nth-child(3) .ant-select.ant-select-enabled",
    });
    I.type("ant");
    I.click({ css: "span[title='ANTALYA']" });
    I.click({
      css: ".ant-form-item:nth-child(4) .ant-select.ant-select-enabled",
    });
    I.type("murat");

jagged kestrel
#

What do you think of the Ruby one?

  scenario 'valid inputs' do
    visit new_city_path
    fill_in 'Name', with: 'Minneapolis'
    click_on 'Create City'
    visit cities_path
    expect(page).to have_content('Minneapolis')
  end

stone musk
lament sigil
#

@stone musk Hi, do you have any plans to migrate chrobot to make it work with the latest stdlib?

stone musk
#

While working on this now I'm also kind of questioning if it's even necessary to do all this codegen to produce bindings to the devtools API, because probably no one needs that?
I'm thinking about doing a major version bump where I drop all that and just write the bindings I need by hand, it's only a couple that are used, would also make the package much smaller.
Not sure though 🤷‍♂️

lament sigil
#

Making a major bump and only adding something when it is requested sounds like a good start

stone musk
#

(not exaggerating when I say a thousand warnings btw 😅 )

lament sigil
tidal shoal
#

hey @stone musk ! I recently came across an article regarding using playwright from elixir (https://ftes.de/articles/2024-11-14-using-playwright-in-elixir), it basically communicates with playwright via a websocket if I understand correctly. I'm struggling quite a bit to find production-grade browser automation software usable with gleam/elixir so maybe this is something that could work

stone musk
#

Interesting! I don't think I fully understand this, this is running playwright via node?
I think playwright is pretty good, but if I'm gonna run node, might as well just write the tests / stuff in js/ts.
If you wanna use playwright specifically from Gleam I'd just compile to js and use playwright (js) directly.

I'm struggling quite a bit to find production-grade browser automation software usable with gleam/elixir
As far as I know, wallaby is the go-to one on the BEAM: https://hexdocs.pm/wallaby/readme.html runs via selenium/webdriver, similar to playwright

tidal shoal
stone musk
#

I think there is some nice opportunity with this package (chrobot) to just be a slightly lighter alternative: no need for a webdriver in between, just a single browser and a direct connection that cleans itself up. Ideally I would rip out all the generated protocol bindings from the public API (because realistically no one actually needs that) and make the API much less tied to that and more straightforward.
Don't think I can dedicate the time to do that right now but I would be interested to do it, also maybe once gleam_erlang and the stdlib hit 1.0

stone musk