Introduction

Welcome to the documentation for Nolita! Nolita is a framework for building web-enabled agentic applications and workflows.

What do I need to use Nolita?

Before using Nolita you'll want to install Node.js, preferably version 20, and Google Chrome. If you want to work on the source code, you'll want pnpm too.

You don't need to install Nolita itself at all. You can run it directly with npx or import it into a Node.js project with npm i nolita. More information is available on the following pages on usage.

How does it work?

Nolita drives a Puppeteer installation using a local instance of Chrome and parses the accessibility tree, preprocessing ARIA nodes to add accessibility information where necessary.

At the core of the framework is a state machine between Puppeteer and your model that enforces action steps with Zod.
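To make "enforcing action steps" concrete, here is a minimal sketch of the idea, not Nolita's internal code. It assumes a step carries the description and progressAssessment fields listed in the Agent reference, and it uses a hand-written check as a stand-in for a Zod schema. The ActionStep type and parseActionStep helper are hypothetical names for illustration only.

```typescript
// Hypothetical shape of one step, loosely modelled on the actionStep fields
// listed in the Agent reference (description, progressAssessment, command).
type ActionStep = {
  description: string;
  progressAssessment: string;
  command?: unknown;
};

// Hand-written stand-in for a Zod schema: accept the model's reply only if
// it is JSON and has the expected fields with the expected types.
function parseActionStep(raw: string): ActionStep | undefined {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return undefined; // the reply was not JSON at all
  }
  if (typeof value !== "object" || value === null) return undefined;
  const candidate = value as Record<string, unknown>;
  if (typeof candidate.description !== "string") return undefined;
  if (typeof candidate.progressAssessment !== "string") return undefined;
  return {
    description: candidate.description,
    progressAssessment: candidate.progressAssessment,
    command: candidate.command,
  };
}
```

A state machine built this way only advances when the reply parses; anything else can be retried or reported as a failure instead of being executed in the browser.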

In sum: "you tell the AI to do something on the internet and it does it."

In practice, Nolita allows for a large degree of granularity and extensibility, deploying agentic behaviour where necessary and stepping down into manual scripts or lower-level directives when you need to.

What do we mean by 'agentic'?

We consider agentic software to be software that can make decisions about its behaviour. For example, a state machine restricts its inputs and outputs according to its current state, but it cannot reason about which inputs to give itself next, or which outputs to derive, in order to accomplish a higher-level goal.

In contrast, agentic software possesses the ability to evaluate its environment, consider multiple potential actions, and choose the most appropriate one based on predefined goals or learning algorithms. This decision-making capability allows agentic software to adapt and respond to dynamic conditions in a more flexible and intelligent manner than state machines.

How is Nolita different from [agentic framework]?

That said, agents are an emerging art form. Agentic software that uses large language models to drive state machines is inherently probabilistic, and therefore not well suited to production environments.

Nolita was written from the start to integrate with High Dimensional Research's Collective Memory Index. By working in concert, agentic actions are requested only for new and unfamiliar situations; if you imagine a webpage as a graph, then you can probably guess that most pages share structural similarities. We use a combination of accessibility tree preprocessing and graph comparison algorithms to then search for the most appropriate step in the browser state machine to execute, increasing the determinism and speed of the task, and leaving the agent to reason where it needs to.

What can I do with Nolita?

As a few examples, you can use Nolita to

  • gather structured data and chain it into other APIs, like sending an email or building an RSS feed;
  • quickly pipe information from the internet into your shell scripts, even from behind a log-in screen;
  • write and deploy Puppeteer scripts in natural language.

Running tasks

If you want to test Nolita out quickly, you can do so on the command line by running

npx nolita

When running, you must provide a startUrl and objective. Before your first run, you will need to authenticate your model details and HDR keys with npx nolita auth.

If you don't include information, we will prompt you for it at runtime.

Flags

  • --startUrl dictates where we start the session.
  • --objective specifies what we want our agent to accomplish for us.
  • --headless specifies whether you want the browser to run in headless mode or not. We default to true, but you can set it to false to see the browser run.
  • --config takes a JSON file with the previous flags, if you want to provide them. You can also specify an inventory of personal data to use for the objective, like usernames and passwords. If you want to set per-task agent credentials, you can do so here and they will take precedence over nolita auth.

Optional features

  • --record will return the ID of the completed task session for later replay of the same actions, which you can use with the Page API.
  • --replay takes in a string of the ID above and currently just confirms that the route can be successfully followed. When using replay, objective and start URL are discarded.

Example configuration

{
  "agentProvider": "openai", // or process.env.HDR_AGENT_PROVIDER
  "agentModel": "gpt-4", // or process.env.HDR_AGENT_MODEL
  "agentApiKey": "sk-*********", // or process.env.HDR_AGENT_API_KEY
  "inventory": [
        {  
            "value": "student", 
            "name": "Username", 
            "type": "string" 
        },
        { 
            "value": "Password123",
            "name": "Password",
            "type": "string" 
        }
    ]
}
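The precedence rule described under Flags (per-task credentials in the --config file win over values saved by nolita auth) can be sketched as a simple merge. This is an illustration under assumptions: the resolveCredentials helper is hypothetical, and how Nolita orders environment variables relative to either source is not covered here.

```typescript
// Field names mirror the example configuration above.
type AgentCredentials = {
  agentProvider?: string;
  agentModel?: string;
  agentApiKey?: string;
};

// Hypothetical helper: a value from the --config file wins; otherwise we
// fall back to whatever was stored by `npx nolita auth`.
function resolveCredentials(
  fromConfigFile: AgentCredentials,
  fromAuth: AgentCredentials,
): AgentCredentials {
  return {
    agentProvider: fromConfigFile.agentProvider ?? fromAuth.agentProvider,
    agentModel: fromConfigFile.agentModel ?? fromAuth.agentModel,
    agentApiKey: fromConfigFile.agentApiKey ?? fromAuth.agentApiKey,
  };
}
```

With this reading, a config file that only sets agentModel would still use the provider and key you saved during authentication.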

Using Nolita as a server

Since Nolita is written in TypeScript, it may not work for every pre-existing product. If you would like to use the same task-runner API provided by npx nolita for your own applications, you can run Nolita as a server.

npx nolita serve

This runs a local API for objective-first agentic navigation of a local Chrome instance.

After starting the server, you can see the /doc folder for the expected JSON payload.

Example payload

curl -X POST http://localhost:3000/browse \
      -H "Content-Type: application/json" \
      -d '{
        "browse_config": {
            "startUrl": "https://google.com",
            "objective": [
            "please tell me how many people edited wikipedia"
            ],
            "maxIterations": 10
        },
        "headless": true
        }
      '

Flags

  • --port will customize the port. By default, the server runs on port 3000.
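If you would rather call the server from code than from curl, the same request can be assembled in TypeScript. The sketch below only builds the URL and request options; buildBrowseRequest is a hypothetical helper, and the payload fields are copied from the curl example above rather than from a formal schema.

```typescript
// Payload shape copied from the curl example; treat it as an assumption,
// not an authoritative schema.
type BrowsePayload = {
  browse_config: { startUrl: string; objective: string[]; maxIterations: number };
  headless: boolean;
};

type BrowseRequest = {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
};

// Hypothetical helper: returns what you would hand to an HTTP client.
function buildBrowseRequest(startUrl: string, objective: string, port = 3000): BrowseRequest {
  const payload: BrowsePayload = {
    browse_config: { startUrl, objective: [objective], maxIterations: 10 },
    headless: true,
  };
  return {
    url: `http://localhost:${port}/browse`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(payload),
    },
  };
}
```

In Node.js 18 or later, the returned url and init values can be passed directly to the built-in fetch function once the server is running.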

Creating a new project

npx nolita create

Bootstraps a template application built on Express, React, TypeScript, and the core Nolita framework for making a user-facing, web-enabled, agentic product.

For reference, the template repository resides at hdresearch/create. Upon running the create command, the repository is cloned, its dependencies are installed and example files are instantiated.

Except one...

Please note that you will need to set up a .env file to use the application. You can copy and paste the example environment to modify it from there:

cp .env.example .env

What does it do?

The example application is extremely simple: it finds food from a specific location and outputs typed data, surfacing each step of the navigation in the console.

At the top of the App class in app/src/App.tsx you can start modifying the core objective:

  // You can change the URL to any website for the objective.
  const [url] = React.useState("https://www.google.com");


  const [objective] = React.useState(
    "where to find food in",
  );
  const [location, setLocation] = React.useState(
    "West Village"
  )

Tweaking these default values will then tweak what is requested from the server in app/src/lib/events.ts:

eventSource = new EventSource(
    `http://localhost:3040/api/browse?url=${encodeURIComponent(url)}&objective=${encodeURIComponent(objective)}%20${encodeURIComponent(location)}&maxIterations=10`,
  );
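To see exactly what that template string produces, here is a small stand-alone sketch that builds the same query string. The buildBrowseUrl name is hypothetical; the host, port, and parameter names are taken from the snippet above.

```typescript
// Hypothetical helper that mirrors the EventSource URL in the template app.
// The objective and location are joined with a space, which
// encodeURIComponent turns into %20, matching the original template.
function buildBrowseUrl(
  url: string,
  objective: string,
  location: string,
  maxIterations = 10,
): string {
  const fullObjective = `${objective} ${location}`;
  return (
    "http://localhost:3040/api/browse" +
    `?url=${encodeURIComponent(url)}` +
    `&objective=${encodeURIComponent(fullObjective)}` +
    `&maxIterations=${maxIterations}`
  );
}
```

Calling it with the default state values yields an objective of "where to find food in West Village" once the server decodes the query.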

Which itself hits the server at /server/index.ts, running at port 3040 (in the dev environment) and taking in these parameters:

  const answer = await agentBrowser.browse(
    {
      startUrl: req.query.url as string,
      objective: [req.query.objective as string],
      maxIterations: parseInt(req.query.maxIterations as string) || 10,
    },
    CustomSchema,
  );

And returning each step back to the EventSource as a callback to the browser's Logger:

  const logger = new Logger(["info"], (msg) => {
    return res.write(`data: ${msg}\n\n`);
  });
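The string written by that callback is a server-sent event frame: a data: line followed by a blank line. As a rough sketch of the framing, assuming each message is a single line of text, the two helpers below (hypothetical names) format a frame and split a buffered stream back into messages.

```typescript
// Format one message the same way the Logger callback above does.
function toSseFrame(msg: string): string {
  return `data: ${msg}\n\n`;
}

// Split buffered stream text back into the messages it carries.
// Assumes single-line messages, so a blank line always ends a frame.
function parseSseFrames(buffer: string): string[] {
  const prefix = "data: ";
  return buffer
    .split("\n\n")
    .filter((frame) => frame.indexOf(prefix) === 0)
    .map((frame) => frame.slice(prefix.length));
}
```

In the browser you normally do not need the parsing half, since EventSource does it for you; it is shown here only to make the wire format visible.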

For more information, continue reading about the folder structure.

Using local models

Nolita can take both local and ollama providers for autonomous tasks. However, at present functionality is quite limited; we mark usage of local models as experimental, but available. We encourage you to experiment with your own workflows and see where delegating to local models makes sense.

Using Ollama

Ollama is supported as a provider and will connect to the default Ollama port when used. When you set ollama as a provider, set the model name to one that Ollama recognizes. For more information on available models for Ollama, see its documentation.

This means that we expect Ollama to be running when using Nolita with ollama as provider. You'll want to run Ollama's GUI or just ollama serve in a terminal window.

Using local model files

We use node-llama-cpp under the hood to run a model file. For information on node-llama-cpp and getting a model file for use, see its documentation.

Once a model file is downloaded, Nolita accepts the full path to the file as a model name when using a local provider.

You can set this, as usual, using npx nolita auth.

Usage and limitations

Smaller models (7B and below) are far less capable of holding enough context to navigate the web alone. They can, however, still process data well. In our current usage, we find that page.get() calls function consistently. Other calls, like autonomous browsing and navigation, can stumble on navigating the Nolita state machine. It is not uncommon to see errors about incorrect JSON from our generators.

You may, however, find better results with bigger local models; what models you can experiment with is a limitation of hardware.

Use Nolita as a scraper

If you're trying to just quickly gather data from the web, Nolita enables you to both dictate how much you want a model to autonomously search for data and what shape the data needs to come back in. These two features in tandem enable very powerful automation scripting. In this guide, which elaborates upon our findEmail example file, we'll walk you through what you need to do to get started.

Installation

Assuming you've installed Node.js (preferably the LTS), let's create a new folder for our script and put Nolita in it.

mkdir email-gathering && cd email-gathering
npm i nolita zod

NPM will make a package.json file and a node_modules folder for everything we need. Now let's just make our file and open it in your preferred code editor.

touch index.js

Necessary prerequisites

Nolita will want an applicable model set up before we start.

In your terminal, you may want to run npx nolita auth and input your model provider (and any applicable keys). If you prefer writing the configuration yourself, create .nolitarc in your home folder and fill in the blanks here:

{
    "agentModel": "", // use a model name from your provider
    "agentApiKey":"", // api key for anthropic / openai
    "agentProvider": "", //anthropic / openai
    "hdrApiKey":"" // optional
}

Imports and setup

We're going to need to import the Browser class and the makeAgent utility. The Browser class is the main navigation engine Nolita provides us; makeAgent is just a helper for making chat completion classes for our model without us having to think about it.

const { Browser, makeAgent } = require("nolita");
const { z } = require("zod");

Now we just write our main function, using those imports.

async function main() {
  const agent = makeAgent();
  const browser = await Browser.launch(true, agent);

Browser.launch() has two inputs: whether or not to run the browser headless, and the agent to attach to the Browser instance. If you want to see the browser open and autonomously navigate the page, change true to false.

Page actions

Now let's break down the next section. We'll use the browser constant here to make a Page class, corresponding to a browser tab, that we can manipulate and pilot:

  const page = await browser.newPage();
  await page.goto("https://hdr.is");
  await page.do("click on the company link");
  const answer = await page.get(
    "Find all the email addresses on the page",
    z.object({
      emails: z
        .array(z.string())
        .describe("The email addresses found on the page"),
    })
  );

We use .goto() to start us off at a URL, which instantiates our session. .do() gives a natural language directive to our model to navigate the page and issue a command to the browser. Please note that it's a single action, specific to the current page -- if you want to give a broader objective to the model, you'll want .browse() instead.

.get(), likewise, is specific to the current page. We provide it an objective and a schema. Our model will parse the page content and return the data in the shape we provide in the schema. This shape corresponds to a basic Zod object schema. You'll want to ensure you chain .describe() to let our model know exactly what you want.

In this case, we ask for an array of emails as strings inside a JSON object.

  console.log(answer);
  await browser.close();
}

main();

Adding it all together

Once it has the answer, it will write it to the console (making it pipeable to an arbitrary text file) and clean up the process.

In sum, our script looks like this:

const { Browser, makeAgent } = require("nolita");
const { z } = require("zod");

async function main() {
  const agent = makeAgent();
  const browser = await Browser.launch(true, agent);
  const page = await browser.newPage();
  await page.goto("https://hdr.is");
  await page.do("click on the company link");
  const answer = await page.get(
    "Find all the email addresses on the page",
    z.object({
      emails: z
        .array(z.string())
        .describe("The email addresses found on the page"),
    })
  );
  console.log(answer);
  await browser.close();
}

main();

We can run it by just running node index.js and even write the result with node index.js > emails.txt. If we try running the latter command, we'll see in our .txt file:

{
  emails: [
    'tynan.daly@hdr.is',
    'matilde.park@hdr.is',
    'SUPPORT@HDR.IS'
  ]
}
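Because the result is plain structured data, it is easy to tidy up before chaining it into something else. As one possible follow-up step, the sketch below lowercases and de-duplicates the addresses; normalizeEmails is a hypothetical helper and is not part of Nolita.

```typescript
// Shape of the answer returned by the schema used in this guide.
type EmailAnswer = { emails: string[] };

// Hypothetical post-processing step: trim, lowercase, drop duplicates, sort.
function normalizeEmails(answer: EmailAnswer): string[] {
  const lowered = answer.emails.map((email) => email.trim().toLowerCase());
  return lowered
    .filter((email, index) => lowered.indexOf(email) === index)
    .sort();
}
```

Applied to the output above, SUPPORT@HDR.IS would come back as support@hdr.is alongside the other two addresses.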

The folder structure

The default Nolita project uses the following folders to delineate each layer of an agentic application:

  • /agent includes the creation of a chat completion API, which is then passed to the Nolita Agent class for integration with the browser state machine.
  • /app includes all front-end code for the application.
  • /extensions includes the integration of inventories as well as defining custom types for your responses from the agent.
  • /server includes all back-end code for running the browse loop.

Inspiration

While working with other agentic companies, our research found the following separation of roles, which inspired the structure of the Nolita project.

LLM

  • What model is used on the 'bottom layer' of the stack?
  • Do you allow users to swap out underlying models?
  • System prompt is included here.

Agentic logic

  • What is the agent’s prompt on top of the underlying system prompt?
  • What’s the structure of the event loop we place an LLM in?
  • Does it self-iterate and improve? Is it set?
  • Is it a formal state machine, or is it entirely prompt driven?

Toolchain

  • How do we define what actions the LLM can perform?
  • Is the toolchain defined as part of prompt manipulation, or is it formally constrained (as in, proceeding along a graph of possible actions)?
    • By going entirely prompt-driven, one can fall into “phantom actions” (reporting an action is undertaken, as opposed to making it such that saying an action is the same thing as the action itself.)
    • Let me expound a little here: there’s a situation where by simply saying “I’ll invite this person” is itself part of a command to the toolchain, as opposed to a separate report to an observer. When writing entirely prompt-driven applications, one can hallucinate the structure of the action.
  • Do we write an underlying API to formalise all commands without directly hitting external APIs?

Event loop manipulation

  • Does the agent then “double check its work” before proceeding to presentation layer?
  • Do we GOTO 1 so to speak, if it performs incorrectly?

Presentation

  • How transparent is the stack to the user?
  • Is the agent abstracted as a “product” (with an agent’s artificial monologue puppeteering a conventional software stack), or anthropomorphized in its own right?
  • Is the user configuring tasks, agents, events?

/agent

The /agent folder of a Nolita project concerns itself with setting up a chat completion API and passing it to the Agent class.

Supported models

All OpenAI and Anthropic text-mode models are supported.

While we currently predominantly support OpenAI and Anthropic models, you can also write your own chat completion API for any model. For more information, see the Agent reference document.

/app

The /app folder bootstraps a starter front-end application built on TypeScript and React with Vite as its development server and build pipeline, and Tailwind as its CSS framework.

The sample application

  • searches a query on the internet ("where to get food in...");
  • returns each step of its navigation for your reference;
  • and finally returns with typed data, as defined in the /extensions folder.

Removing the application

If you want to bootstrap a Nolita application without a front-end, you can remove the app folder and the remaining calls to its build process.

In package.json, amend the scripts to

"scripts": {
    "server": "npx tsx ./server/index.ts",
    "start": "NODE_ENV=production npm run server"
  },

and in /server/index.ts, remove this remaining call to front-end code:

if (process.env.NODE_ENV === "production") {
   app.use("/", express.static(path.join(__dirname, "../app/dist")));
 }

/extensions

The extensions folder is intended for additions to the core agentic stack:

  • inventories, i.e. personal data outside the prompt context;
  • custom types, for specifying structured responses for the workflow

If you need to derive external information to pipe into either, it is also best done in this folder.

You could, for example, gather user session data to then derive which custom type to pass to the agent browser for the session. You could also, in addition, import user data from a database for use as an inventory while keeping all user data outside the prompt context itself.

For more information on inventories and how they work, see Inventory.
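As a sketch of the database idea above, the helper below turns a user record into inventory entries. The entry shape (value, name, type) is copied from the example configuration file shown for the --config flag; the UserRecord type and toInventoryEntries function are hypothetical, and how you hand the entries to Nolita's Inventory class is left to the Inventory page.

```typescript
// Entry shape copied from the "inventory" array in the example configuration.
type InventoryEntry = { value: string; name: string; type: string };

// Hypothetical user record, e.g. a row loaded from your own database.
type UserRecord = { username: string; password: string };

// Map the record to entries without ever placing it in a prompt string.
function toInventoryEntries(user: UserRecord): InventoryEntry[] {
  return [
    { value: user.username, name: "Username", type: "string" },
    { value: user.password, name: "Password", type: "string" },
  ];
}
```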

/server

The /server folder holds all back-end logic for the application.

We spin up all applicable classes in the Nolita framework, take the additions from the /extensions folder, combine them with user input from the front-end application, and pass an objective to the agent to perform on a local, sandboxed Chrome instance.

This section of the application is best-suited for including any additional API calls you need -- whether sending emails based upon the response from the agent, recording user objectives in your own database, or pre-processing input before incorporating the agent browse session.

Specifying types

We use Zod under the hood to enforce typed responses from the agent. You can use this to enforce predictable output for your application.

For example, in the example repository we include the following in extensions/schema.ts:

export const CustomSchema = z.object({
  restaurants: z.array(z.string().describe("The name of a restaurant")),
});

In this example, we inform the agent that we expect the response that reports a completed objective to also include an array of restaurants. When we finally get ObjectiveComplete back from the agent as an answer, we will also see restaurants as part of the response:

// inside the event response from the server
export type AgentEvent = {
  command?: string[];
  done?: boolean;
  progressAssessment?: string;
  description?: string;
  objectiveComplete?: {
    kind: "ObjectiveComplete";
    restaurants: string[];
    result: string;
  }
  objectiveFailed: {
    kind: "ObjectiveFailed";
    result: string;
  }
};
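On the receiving side, a handler can narrow on these fields to decide what to show the user. The sketch below assumes each event arrives as a JSON string and treats objectiveFailed as optional, which is an assumption on our part; summarizeEvent is a hypothetical helper.

```typescript
// Same fields as the AgentEvent type above; objectiveFailed is marked
// optional here as an assumption, since most events will not carry it.
type AgentEvent = {
  command?: string[];
  done?: boolean;
  progressAssessment?: string;
  description?: string;
  objectiveComplete?: { kind: "ObjectiveComplete"; restaurants: string[]; result: string };
  objectiveFailed?: { kind: "ObjectiveFailed"; result: string };
};

// Hypothetical helper: turn one serialized event into a line of UI text.
function summarizeEvent(data: string): string {
  const event = JSON.parse(data) as AgentEvent;
  if (event.objectiveComplete) {
    return `Done: ${event.objectiveComplete.restaurants.join(", ")}`;
  }
  if (event.objectiveFailed) {
    return `Failed: ${event.objectiveFailed.result}`;
  }
  return event.description ?? "(no description)";
}
```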

Extending your application

You can most easily start tweaking your project by modifying the two pieces of state at the top of the App class in /app/src/App.tsx:

  // You can change the URL to any website for the objective.
  const [url] = React.useState("https://www.google.com");


  const [objective] = React.useState(
    "where to find food in",
  );
  const [location, setLocation] = React.useState(
    "West Village"
  )

and its appropriate code in the server, at ./server/index.ts:

// ...
app.get("/api/browse", async (req, res) => {
  res.setHeader("Content-Type", "text/event-stream");
  const browser = await Browser.create(true);
  const logger = new Logger(["info"], (msg) => {
    return res.write(`data: ${msg}\n\n`);
  });
  const agentBrowser = new AgentBrowser({ 
    agent, 
    browser, 
    // ...

You can then pass different objectives and start URLs to the backend, or pre-process incoming objectives with calls to APIs. You can pass steps from the Logger to external classes if necessary. You can also take the answer reported back from agentBrowser.browse and chain more actions onto it before returning to the user.

Extensions and inventories

Remember that you can always pass different pieces of personal data, i.e. inventories per user, whether stored in a database outside the template application or as another component of the application itself. By doing so, you can effectively customize agentic responses to the user as necessary.

Additional infrastructure

In production, you may find that one browser can't handle significant load. In such cases, an external API for spinning up multiple browsers on multiple machines may be necessary. Additional changes for Nolita are incoming for multi-tab browsing within one machine and for attaching white-label headless browsers in a future release.

Class: Agent

Constructors

new Agent()

new Agent(agentArgs: {objectgeneratorOptions: ModelConfig;providerConfig: ProviderConfig;systemPrompt: string; }): Agent

Parameters

Parameter                          Type
agentArgs                          object
agentArgs.objectgeneratorOptions?  ModelConfig
agentArgs.providerConfig           ProviderConfig
agentArgs.systemPrompt?            string

Returns

Agent

Defined in

src/agent/agent.ts:33

Properties

systemPrompt?

optional systemPrompt: string

Defined in

src/agent/agent.ts:31

Methods

actionCall()

actionCall<T>(prompt: CoreMessage[], commandSchema: T, opts: {autoSlice: true;maxDelay: 10000;numOfAttempts: 5;startingDelay: 1000;timeMultiple: 2; }): Promise<any>

Generate a command response from the model and return the parsed data

Type Parameters

Type Parameter
T extends ZodType<any, ZodTypeDef, any>

Parameters

Parameter           Type           Description
prompt              CoreMessage[]  The prompt to send to the model
commandSchema       T              The schema to validate the response
opts                object         Options for actionCall function
opts.autoSlice      boolean        Whether to automatically slice the response
opts.maxDelay       number         Maximum delay
opts.numOfAttempts  number         Maximum number of retries
opts.startingDelay  number         Initial delay in milliseconds
opts.timeMultiple   number         Multiplier for the delay

Returns

Promise<any>

The parsed response data, shaped by the command schema.

Defined in

src/agent/agent.ts:229
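The default opts values in the signature (numOfAttempts: 5, startingDelay: 1000, timeMultiple: 2, maxDelay: 10000) read like a conventional exponential backoff. The sketch below shows the retry schedule those numbers would produce under that reading. It is an assumption about the behaviour, not a description of Nolita's implementation, and retryDelays is a hypothetical helper.

```typescript
// Option names and meanings taken from the actionCall parameter table.
type RetryOptions = {
  numOfAttempts: number;
  startingDelay: number;
  timeMultiple: number;
  maxDelay: number;
};

// Delay in milliseconds before each retry, assuming the first attempt runs
// immediately and each later wait is multiplied, then capped at maxDelay.
function retryDelays(opts: RetryOptions): number[] {
  const delays: number[] = [];
  for (let retry = 0; retry < opts.numOfAttempts - 1; retry++) {
    const delay = opts.startingDelay * Math.pow(opts.timeMultiple, retry);
    delays.push(Math.min(delay, opts.maxDelay));
  }
  return delays;
}
```

Under these assumptions the defaults would wait 1, 2, 4, and 8 seconds between the five attempts, so a call that keeps failing gives up after roughly 15 seconds of waiting plus model time.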


askCommand()

askCommand<T>(prompt: CoreMessage[], schema: T, opts: {maxDelay: 10000;numOfAttempts: 5;startingDelay: 1000;timeMultiple: 2; }): Promise<any>

Type Parameters

Type Parameter
T extends ZodObject<any, UnknownKeysParam, ZodTypeAny, {}, {}>

Parameters

Parameter           Type
prompt              CoreMessage[]
schema              T
opts                object
opts.maxDelay       number
opts.numOfAttempts  number
opts.startingDelay  number
opts.timeMultiple   number

Returns

Promise<any>

Defined in

src/agent/agent.ts:182


call()

call<T>(prompt: CoreMessage[], responseSchema: T, opts?: {autoSlice: boolean; }): Promise<T>

Type Parameters

Type Parameter
T extends ZodType<any, ZodTypeDef, any>

Parameters

Parameter        Type
prompt           CoreMessage[]
responseSchema   T
opts?            object
opts.autoSlice?  boolean

Returns

Promise<T>

Defined in

src/agent/agent.ts:202


chat()

chat(prompt: string): Promise<any>

Chat with

Parameters

Parameter  Type    Description
prompt     string  -

Returns

Promise<any>

The response from the model

Defined in

src/agent/agent.ts:281


defaultObjectGeneratorOptions()

defaultObjectGeneratorOptions(model: string): ObjectGeneratorOptions

Parameters

Parameter  Type
model      string

Returns

ObjectGeneratorOptions

Defined in

src/agent/agent.ts:45


generateResponseType()

generateResponseType<T>(currentState: {ariaTree: string;kind: "ObjectiveState";objective: string;progress: string[];url: string; }, memories: {actionStep: {command: any;description: string;objectiveComplete: {kind: "ObjectiveComplete";result: string; } | {};progressAssessment: string; };objectiveState: ObjectiveState; }, responseSchema: T): Promise<TypeOf<T>>

Type Parameters

Type Parameter
T extends ZodObject<any, UnknownKeysParam, ZodTypeAny, {}, {}>

Parameters

Parameter                               Type
currentState                            object
currentState.ariaTree                   string
currentState.kind                       "ObjectiveState"
currentState.objective                  string
currentState.progress                   string[]
currentState.url                        string
memories                                object
memories.actionStep                     object
memories.actionStep.command?            any
memories.actionStep.description         string
memories.actionStep.objectiveComplete?  {kind: "ObjectiveComplete";result: string; } | {}
memories.actionStep.progressAssessment  string
memories.objectiveState                 object
memories.objectiveState.ariaTree        string
memories.objectiveState.kind            "ObjectiveState"
memories.objectiveState.objective       string
memories.objectiveState.progress        string[]
memories.objectiveState.url             string
responseSchema                          T

Returns

Promise<TypeOf<T>>

Defined in

src/agent/agent.ts:94


modifyActions()

modifyActions(currentState: {ariaTree: string;kind: "ObjectiveState";objective: string;progress: string[];url: string; }, memory: {actionStep: {command: any;description: string;objectiveComplete: {kind: "ObjectiveComplete";result: string; } | {};progressAssessment: string; };objectiveState: ObjectiveState; }, config?: {inventory: Inventory;maxAttempts: number;systemPrompt: string; }): Promise<undefined | {command: any;description: string;objectiveComplete: {kind: "ObjectiveComplete";result: string; } | {};progressAssessment: string; }>

Parameters

Parameter                              Type
currentState                           object
currentState.ariaTree                  string
currentState.kind                      "ObjectiveState"
currentState.objective?                string
currentState.progress?                 string[]
currentState.url?                      string
memory?                                object
memory.actionStep?                     object
memory.actionStep.command?             any
memory.actionStep.description?         string
memory.actionStep.objectiveComplete?   {kind: "ObjectiveComplete";result: string; } | {}
memory.actionStep.progressAssessment?  string
memory.objectiveState?                 object
memory.objectiveState.ariaTree?        string
memory.objectiveState.kind?            "ObjectiveState"
memory.objectiveState.objective?       string
memory.objectiveState.progress?        string[]
memory.objectiveState.url?             string
config?                                object
config.inventory?                      Inventory
config.maxAttempts?                    number
config.systemPrompt?                   string

Returns

Promise<undefined | {command: any;description: string;objectiveComplete: {kind: "ObjectiveComplete";result: string; } | {};progressAssessment: string; }>

Defined in

src/agent/agent.ts:121


prompt()

prompt(currentState: {ariaTree: string;kind: "ObjectiveState";objective: string;progress: string[];url: string; }, memories: {actionStep: {command: any;description: string;objectiveComplete: {kind: "ObjectiveComplete";result: string; } | {};progressAssessment: string; };objectiveState: ObjectiveState; }[], config?: {inventory: Inventory;systemPrompt: string; }): CoreMessage[]

Generate a prompt for the user to complete an objective

Parameters

Parameter                Type              Description
currentState             object            The current state of the objective
currentState.ariaTree    string            -
currentState.kind        "ObjectiveState"  -
currentState.objective?  string            -
currentState.progress?   string[]          -
currentState.url?        string            -
memories?                {actionStep: {command: any;description: string;objectiveComplete: {kind: "ObjectiveComplete";result: string; } | {};progressAssessment: string; };objectiveState: ObjectiveState; }[]  The memories to use as examples
config?                  object            Configuration options for the prompt
config.inventory?        Inventory         The inventory to use for the prompt
config.systemPrompt?     string            The system prompt to use for the prompt

Returns

CoreMessage[]

The prompt messages for the user to complete the objective

Defined in

src/agent/agent.ts:64


returnCall()

returnCall<T>(prompt: CoreMessage[], responseSchema: T, opts: {autoSlice: true;maxDelay: 10000;numOfAttempts: 5;startingDelay: 1000;timeMultiple: 2; }): Promise<TypeOf<T>>

Get information from the model and return the parsed data

Type Parameters

Type Parameter
T extends ZodType<any, ZodTypeDef, any>

Parameters

Parameter           Type           Description
prompt              CoreMessage[]  The prompt to send to the model
responseSchema      T              The schema to validate the response
opts                object         Options for returnCall function
opts.autoSlice      boolean        Whether to automatically slice the response
opts.maxDelay       number         Maximum delay
opts.numOfAttempts  number         Maximum number of retries
opts.startingDelay  number         Initial delay in milliseconds
opts.timeMultiple   number         Multiplier for the delay

Returns

Promise<TypeOf<T>>

The parsed response data, shaped by the response schema.

Defined in

src/agent/agent.ts:261

Class: AgentBrowser

Constructors

new AgentBrowser()

new AgentBrowser(agentBrowserArgs: {agent: Agent;behaviorConfig: {actionDelay: number;goToDelay: number;telemetry: boolean; };browser: Browser;collectiveMemoryConfig: {apiKey: string;endpoint: string; };inventory: Inventory;logger: Logger; }): AgentBrowser

Parameters

Parameter                                         Type
agentBrowserArgs                                  object
agentBrowserArgs.agent                            Agent
agentBrowserArgs.behaviorConfig?                  object
agentBrowserArgs.behaviorConfig.actionDelay       number
agentBrowserArgs.behaviorConfig.goToDelay         number
agentBrowserArgs.behaviorConfig.telemetry         boolean
agentBrowserArgs.browser                          Browser
agentBrowserArgs.collectiveMemoryConfig?          object
agentBrowserArgs.collectiveMemoryConfig.apiKey    string
agentBrowserArgs.collectiveMemoryConfig.endpoint  string
agentBrowserArgs.inventory?                       Inventory
agentBrowserArgs.logger?                          Logger

Returns

AgentBrowser

Defined in

src/agentBrowser.ts:42

Properties

agent

agent: Agent

Defined in

src/agentBrowser.ts:29


browser

browser: Browser

Defined in

src/agentBrowser.ts:30


config

config: {actionDelay: number;goToDelay: number;telemetry: boolean; }

actionDelay

actionDelay: number

goToDelay

goToDelay: number

telemetry

telemetry: boolean

Defined in

src/agentBrowser.ts:32


hdrConfig

hdrConfig: {apiKey: string;endpoint: string; }

apiKey

apiKey: string

endpoint

endpoint: string

Defined in

src/agentBrowser.ts:34


inventory?

optional inventory: Inventory

Defined in

src/agentBrowser.ts:33


logger

logger: Logger

Defined in

src/agentBrowser.ts:31


page

page: undefined | Page

Defined in

src/agentBrowser.ts:35

Methods

browse()

browse<TObjectiveComplete>(browserObjective: {maxIterations: number;objective: string[];startUrl: string; }, responseType: ZodObject<{command: ZodOptional<ZodType<any, ZodTypeDef, any>>;description: ZodString;objectiveComplete: ZodOptional<ZodObject<{kind: ZodLiteral<"ObjectiveComplete">;result: ZodString; }, "strip", ZodTypeAny, {kind: "ObjectiveComplete";result: string; }, {kind: "ObjectiveComplete";result: string; }>> | ZodOptional<ZodObject<extendShape<{kind: ZodLiteral<"ObjectiveComplete">;result: ZodString; }, TObjectiveComplete["shape"]>, TObjectiveComplete["_def"]["unknownKeys"], TObjectiveComplete["_def"]["catchall"], objectOutputType<extendShape<{kind: ZodLiteral<"ObjectiveComplete">;result: ZodString; }, TObjectiveComplete["shape"]>, TObjectiveComplete["_def"]["catchall"], TObjectiveComplete["_def"]["unknownKeys"]>, objectInputType<extendShape<{kind: ZodLiteral<"ObjectiveComplete">;result: ZodString; }, TObjectiveComplete["shape"]>, TObjectiveComplete["_def"]["catchall"], TObjectiveComplete["_def"]["unknownKeys"]>>>;progressAssessment: ZodString; }, "strip", ZodTypeAny, { [k in "description" | "progressAssessment" | "command" | "objectiveComplete"]: addQuestionMarks<baseObjectOutputType<Object>, any>[k] }, { [k_1 in "description" | "progressAssessment" | "command" | "objectiveComplete"]: baseObjectInputType<Object>[k_1] }>): Promise<undefined | {content: Promise<string>;result: {kind: "ObjectiveFailed";result: failureReason; };url: string; } | {content: string;kind: "ObjectiveComplete";result: stepResponse;url: string; }>

Type Parameters

Type Parameter                           Default type
TObjectiveComplete extends AnyZodObject  ZodObject<{kind: ZodLiteral<"ObjectiveComplete">;result: ZodString; }, "strip", ZodTypeAny, {kind: "ObjectiveComplete";result: string; }, {kind: "ObjectiveComplete";result: string; }>

Parameters

Parameter                       Type
browserObjective                object
browserObjective.maxIterations  number
browserObjective.objective      string[]
browserObjective.startUrl       string
responseType                    ZodObject<{command: ZodOptional<ZodType<any, ZodTypeDef, any>>;description: ZodString;objectiveComplete: ZodOptional<ZodObject<{kind: ZodLiteral<"ObjectiveComplete">;result: ZodString; }, "strip", ZodTypeAny, {kind: "ObjectiveComplete";result: string; }, {kind: "ObjectiveComplete";result: string; }>> | ZodOptional<ZodObject<extendShape<{kind: ZodLiteral<"ObjectiveComplete">;result: ZodString; }, TObjectiveComplete["shape"]>, TObjectiveComplete["_def"]["unknownKeys"], TObjectiveComplete["_def"]["catchall"], objectOutputType<extendShape<{kind: ZodLiteral<"ObjectiveComplete">;result: ZodString; }, TObjectiveComplete["shape"]>, TObjectiveComplete["_def"]["catchall"], TObjectiveComplete["_def"]["unknownKeys"]>, objectInputType<extendShape<{kind: ZodLiteral<"ObjectiveComplete">;result: ZodString; }, TObjectiveComplete["shape"]>, TObjectiveComplete["_def"]["catchall"], TObjectiveComplete["_def"]["unknownKeys"]>>>;progressAssessment: ZodString; }, "strip", ZodTypeAny, { [k in "description" | "progressAssessment" | "command" | "objectiveComplete"]: addQuestionMarks<baseObjectOutputType<Object>, any>[k] }, { [k_1 in "description" | "progressAssessment" | "command" | "objectiveComplete"]: baseObjectInputType<Object>[k_1] }>

Returns

Promise<undefined | {content: Promise<string>;result: {kind: "ObjectiveFailed";result: failureReason; };url: string; } | {content: string;kind: "ObjectiveComplete";result: stepResponse;url: string; }>

Defined in

src/agentBrowser.ts:221
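
The return type of browse() is a three-way union, so callers have to tell the outcomes apart before using them. The sketch below shows one way to do that with local types that mirror the documented shape. It is not taken from the Nolita source: BrowseFailed, BrowseComplete and summarizeBrowseResult are hypothetical names, and `unknown` stands in for stepResponse, which this reference does not expand.

```typescript
// Local mirrors of the documented return variants of browse().
// `failureReason` is documented as a string; `stepResponse` is not
// expanded in this reference, so it is typed as unknown (assumption).
type BrowseFailed = {
  content: Promise<string>;
  result: { kind: "ObjectiveFailed"; result: string };
  url: string;
};

type BrowseComplete = {
  content: string;
  kind: "ObjectiveComplete";
  result: unknown;
  url: string;
};

type BrowseResult = undefined | BrowseFailed | BrowseComplete;

// Hypothetical helper: reduce any of the three outcomes to one line of text.
function summarizeBrowseResult(outcome: BrowseResult): string {
  if (outcome === undefined) {
    return "no result";
  }
  // Only the completed variant carries a top-level `kind` property.
  if ("kind" in outcome) {
    return `completed at ${outcome.url}: ${JSON.stringify(outcome.result)}`;
  }
  return `failed at ${outcome.url}: ${outcome.result.result}`;
}
```

Note that `content` is a Promise in the failed variant and a plain string in the completed one, so it needs an `await` only on the failure path.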


close()

close(): Promise<void>

Returns

Promise<void>

Defined in

src/agentBrowser.ts:327


followPath()

followPath<TObjectiveComplete>(memorySequenceId: string, page: Page, browserObjective: {maxIterations: number;objective: string[];startUrl: string; }, responseSchema: ZodObject<{description: ZodString;objectiveComplete: ZodObject<{kind: ZodLiteral<"ObjectiveComplete">;result: ZodString; }, "strip", ZodTypeAny, {kind: "ObjectiveComplete";result: string; }, {kind: "ObjectiveComplete";result: string; }> | TObjectiveComplete;progressAssessment: ZodString; }, "strip", ZodTypeAny, { [k in "description" | "progressAssessment" | "objectiveComplete"]: addQuestionMarks<baseObjectOutputType<Object>, any>[k] }, { [k_1 in "description" | "progressAssessment" | "objectiveComplete"]: baseObjectInputType<Object>[k_1] }>): Promise<undefined | {content: string;kind: "ObjectiveComplete";result: stepResponse;url: string; }>

Type Parameters

Type Parameter                           Default type
TObjectiveComplete extends AnyZodObject  ZodObject<{kind: ZodLiteral<"ObjectiveComplete">;result: ZodString; }, "strip", ZodTypeAny, {kind: "ObjectiveComplete";result: string; }, {kind: "ObjectiveComplete";result: string; }>

Parameters

Parameter                       Type
memorySequenceId                string
page                            Page
browserObjective                object
browserObjective.maxIterations  number
browserObjective.objective      string[]
browserObjective.startUrl       string
responseSchema                  ZodObject<{description: ZodString;objectiveComplete: ZodObject<{kind: ZodLiteral<"ObjectiveComplete">;result: ZodString; }, "strip", ZodTypeAny, {kind: "ObjectiveComplete";result: string; }, {kind: "ObjectiveComplete";result: string; }> | TObjectiveComplete;progressAssessment: ZodString; }, "strip", ZodTypeAny, { [k in "description" | "progressAssessment" | "objectiveComplete"]: addQuestionMarks<baseObjectOutputType<Object>, any>[k] }, { [k_1 in "description" | "progressAssessment" | "objectiveComplete"]: baseObjectInputType<Object>[k_1] }>

Returns

Promise<undefined | {content: string;kind: "ObjectiveComplete";result: stepResponse;url: string; }>

Defined in

src/agentBrowser.ts:110


followRoute()

followRoute(page: Page, memories: {actionStep: {command: any;description: string;objectiveComplete: {kind: "ObjectiveComplete";result: string; } | {};progressAssessment: string; };objectiveState: ObjectiveState; }[]): Promise<void>

Parameters

Parameter  Type
page       Page
memories   {actionStep: {command: any;description: string;objectiveComplete: {kind: "ObjectiveComplete";result: string; } | {};progressAssessment: string; };objectiveState: ObjectiveState; }[]

Returns

Promise<void>

Defined in

src/agentBrowser.ts:169


memorize()

memorize(state: {ariaTree: string;kind: "ObjectiveState";objective: string;progress: string[];url: string; }, action: ModelResponseType): Promise<void>

Parameters

Parameter        Type
state            object
state.ariaTree   string
state.kind       "ObjectiveState"
state.objective  string
state.progress   string[]
state.url        string
action           ModelResponseType

Returns

Promise<void>

Defined in

src/agentBrowser.ts:301


performMemory()

performMemory(page: Page, memory: {actionStep: {command: any;description: string;objectiveComplete: {kind: "ObjectiveComplete";result: string; } | {};progressAssessment: string; };objectiveState: ObjectiveState; }): Promise<undefined | {content: Promise<string>;result: {kind: "ObjectiveFailed";result: failureReason; };url: string; }>

Parameters

Parameter                             Type
page                                  Page
memory                                object
memory.actionStep                     object
memory.actionStep.command?            any
memory.actionStep.description         string
memory.actionStep.objectiveComplete?  {kind: "ObjectiveComplete";result: string; } | {}
memory.actionStep.progressAssessment  string
memory.objectiveState                 object
memory.objectiveState.ariaTree        string
memory.objectiveState.kind            "ObjectiveState"
memory.objectiveState.objective       string
memory.objectiveState.progress        string[]
memory.objectiveState.url             string

Returns

Promise<undefined | {content: Promise<string>;result: {kind: "ObjectiveFailed";result: failureReason; };url: string; }>

Defined in

src/agentBrowser.ts:71


remember()

remember(state: {ariaTree: string;kind: "ObjectiveState";objective: string;progress: string[];url: string; }): Promise<{actionStep: {command: any;description: string;objectiveComplete: {kind: "ObjectiveComplete";result: string; } | {};progressAssessment: string; };objectiveState: ObjectiveState; }[]>

Parameters

Parameter        Type
state            object
state.ariaTree   string
state.kind       "ObjectiveState"
state.objective  string
state.progress   string[]
state.url        string

Returns

Promise<{actionStep: {command: any;description: string;objectiveComplete: {kind: "ObjectiveComplete";result: string; } | {};progressAssessment: string; };objectiveState: ObjectiveState; }[]>

Defined in

src/agentBrowser.ts:186


reset()

reset(): void

Returns

void

Defined in

src/agentBrowser.ts:296


returnErrorState()

returnErrorState(page: Page, failureReason: string): Promise<{content: Promise<string>;result: {kind: "ObjectiveFailed";result: failureReason; };url: string; }>

Parameters

Parameter      Type
page           Page
failureReason  string

Returns

Promise<{content: Promise<string>;result: {kind: "ObjectiveFailed";result: failureReason; };url: string; }>

content

content: Promise<string>

result

result: {kind: "ObjectiveFailed";result: failureReason; }

result.kind

kind: string = "ObjectiveFailed"

result.result

result: string = failureReason

url

url: string

Defined in

src/agentBrowser.ts:315


step()

step<TObjectiveComplete>(page: Page, currentObjective: string, responseType: ZodObject<{command: ZodOptional<ZodType<any, ZodTypeDef, any>>;description: ZodString;objectiveComplete: ZodOptional<ZodObject<{kind: ZodLiteral<"ObjectiveComplete">;result: ZodString; }, "strip", ZodTypeAny, {kind: "ObjectiveComplete";result: string; }, {kind: "ObjectiveComplete";result: string; }>> | ZodOptional<ZodObject<extendShape<{kind: ZodLiteral<"ObjectiveComplete">;result: ZodString; }, TObjectiveComplete["shape"]>, TObjectiveComplete["_def"]["unknownKeys"], TObjectiveComplete["_def"]["catchall"], objectOutputType<extendShape<{kind: ZodLiteral<"ObjectiveComplete">;result: ZodString; }, TObjectiveComplete["shape"]>, TObjectiveComplete["_def"]["catchall"], TObjectiveComplete["_def"]["unknownKeys"]>, objectInputType<extendShape<{kind: ZodLiteral<"ObjectiveComplete">;result: ZodString; }, TObjectiveComplete["shape"]>, TObjectiveComplete["_def"]["catchall"], TObjectiveComplete["_def"]["unknownKeys"]>>>;progressAssessment: ZodString; }, "strip", ZodTypeAny, { [k in "description" | "progressAssessment" | "command" | "objectiveComplete"]: addQuestionMarks<baseObjectOutputType<Object>, any>[k] }, { [k_1 in "description" | "progressAssessment" | "command" | "objectiveComplete"]: baseObjectInputType<Object>[k_1] }>): Promise<any>

Type Parameters

Type Parameter                           Default type
TObjectiveComplete extends AnyZodObject  ZodObject<{kind: ZodLiteral<"ObjectiveComplete">;result: ZodString; }, "strip", ZodTypeAny, {kind: "ObjectiveComplete";result: string; }, {kind: "ObjectiveComplete";result: string; }>

Parameters

Parameter         Type
page              Page
currentObjective  string
responseType      ZodObject<{command: ZodOptional<ZodType<any, ZodTypeDef, any>>;description: ZodString;objectiveComplete: ZodOptional<ZodObject<{kind: ZodLiteral<"ObjectiveComplete">;result: ZodString; }, "strip", ZodTypeAny, {kind: "ObjectiveComplete";result: string; }, {kind: "ObjectiveComplete";result: string; }>> | ZodOptional<ZodObject<extendShape<{kind: ZodLiteral<"ObjectiveComplete">;result: ZodString; }, TObjectiveComplete["shape"]>, TObjectiveComplete["_def"]["unknownKeys"], TObjectiveComplete["_def"]["catchall"], objectOutputType<extendShape<{kind: ZodLiteral<"ObjectiveComplete">;result: ZodString; }, TObjectiveComplete["shape"]>, TObjectiveComplete["_def"]["catchall"], TObjectiveComplete["_def"]["unknownKeys"]>, objectInputType<extendShape<{kind: ZodLiteral<"ObjectiveComplete">;result: ZodString; }, TObjectiveComplete["shape"]>, TObjectiveComplete["_def"]["catchall"], TObjectiveComplete["_def"]["unknownKeys"]>>>;progressAssessment: ZodString; }, "strip", ZodTypeAny, { [k in "description" | "progressAssessment" | "command" | "objectiveComplete"]: addQuestionMarks<baseObjectOutputType<Object>, any>[k] }, { [k_1 in "description" | "progressAssessment" | "command" | "objectiveComplete"]: baseObjectInputType<Object>[k_1] }>

Returns

Promise<any>

Defined in

src/agentBrowser.ts:190

Class: Browser

Represents a browser session using Puppeteer. Manages the creation of new browser pages and handles the browser instance.

Constructors

new Browser()

new Browser(browser: Browser, agent: Agent, logger?: Logger, opts?: {apiKey: string;disableMemory: boolean;endpoint: string;inventory: Inventory;mode: BrowserMode; }): Browser

Initializes a new instance of the Browser class.

Parameters

Parameter            Type         Description
browser              Browser      The Puppeteer browser instance.
agent                Agent        The agent instance that interacts with the browser.
logger?              Logger       Optional logger for logging browser activities.
opts?                object       Optional configuration options for the browser.
opts.apiKey?         string       The HDR API key used for collective memory.
opts.disableMemory?  boolean      Specifies whether the browser should disable memory.
opts.endpoint?       string       The HDR collective memory endpoint.
opts.inventory?      Inventory    The inventory to use for the browser.
opts.mode?           BrowserMode  The mode of the browser (e.g., text).

Returns

Browser

Defined in

src/browser/browser.ts:48

Properties

agent

agent: Agent

The agent instance that interacts with the browser.

Defined in

src/browser/browser.ts:31


browser

browser: Browser

The Puppeteer browser instance.

Defined in

src/browser/browser.ts:29


logger?

optional logger: Logger

Optional logger for logging browser activities.

Defined in

src/browser/browser.ts:32


mode

mode: BrowserMode

The mode of the browser (e.g., text). Headless operation is controlled separately by the headless argument of launch().

Defined in

src/browser/browser.ts:30


pages

pages: Map<string, Page>

Defined in

src/browser/browser.ts:25

Methods

close()

close(): Promise<void>

Closes all pages and the browser instance.

Returns

Promise<void>

A promise that resolves when all pages and the browser have been closed.

Defined in

src/browser/browser.ts:165


newPage()

newPage(opts?: {agent: Agent;device: Device;disableMemory: boolean;inventory: Inventory;pageId: string; }): Promise<Page>

Asynchronously creates and returns a new Page instance, potentially emulating a specific device.

Parameters

Parameter            Type
opts?                object
opts.agent?          Agent
opts.device?         Device
opts.disableMemory?  boolean
opts.inventory?      Inventory
opts.pageId?         string

Returns

Promise<Page>

A promise that resolves to the newly created Page instance.

Defined in

src/browser/browser.ts:124


launch()

static launch(headless: boolean, agent: Agent, logger?: Logger, opts?: {apiKey: string;browserLaunchArgs: string[];browserWSEndpoint: string;disableMemory: boolean;endpoint: string;inventory: Inventory;mode: BrowserMode; }): Promise<Browser>

Asynchronously launches a new Browser instance with the given configuration.

Parameters

Parameter                Type         Description
headless                 boolean      Specifies if the browser should be launched in headless mode.
agent                    Agent        The agent that will interact with the browser.
logger?                  Logger       Optional logger to pass for browser operation logs.
opts?                    object       Optional configuration options for launching the browser.
opts.apiKey?             string       The HDR API key used for collective memory.
opts.browserLaunchArgs?  string[]     Additional arguments for launching the browser.
opts.browserWSEndpoint?  string       The WebSocket endpoint to connect to a browser instance.
opts.disableMemory?      boolean      Specifies whether the browser should disable memory.
opts.endpoint?           string       The HDR collective memory endpoint.
opts.inventory?          Inventory    The inventory to use for the browser.
opts.mode?               BrowserMode  The mode of the browser. Defaults to text.

Returns

Promise<Browser>

A promise that resolves to an instance of Browser.

Defined in

src/browser/browser.ts:89
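
The Browser methods above suggest a simple lifecycle: launch, open a page, work with it, then close. A minimal sketch of that lifecycle follows. To stay runnable without the package installed, it declares local interfaces (BrowserLike, PageLike) that mirror only the documented members it uses; both interfaces and the readPageText helper are hypothetical names, not part of Nolita.

```typescript
// Structural stand-ins for the documented Page and Browser classes.
// Only the members used in this sketch are declared.
interface PageLike {
  goto(url: string, opts?: { delay: number }): Promise<void>;
  content(): Promise<string>;
}

interface BrowserLike {
  newPage(): Promise<PageLike>;
  close(): Promise<void>;
}

// Hypothetical helper: open a page, read its text content, and make sure
// the browser is closed even when navigation or reading fails.
async function readPageText(browser: BrowserLike, url: string): Promise<string> {
  try {
    const page = await browser.newPage();
    await page.goto(url);
    return await page.content();
  } finally {
    // close() is documented to close all pages and the browser instance.
    await browser.close();
  }
}
```

With the real classes, the `browser` argument would presumably come from the static launch() method documented above, for example `await Browser.launch(true, agent)`.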

Class: Inventory

Constructors

new Inventory()

new Inventory(inventory: InventoryValue[]): Inventory

Parameters

Parameter  Type
inventory  InventoryValue[]

Returns

Inventory

Defined in

src/inventory/inventory.ts:11

Properties

maskedInventory

maskedInventory: InventoryValue[] = []

Defined in

src/inventory/inventory.ts:9

Methods

censor()

censor(str: string): string

Parameters

Parameter  Type
str        string

Returns

string

Defined in

src/inventory/inventory.ts:43


replaceMask()

replaceMask(value: string): string

Parameters

Parameter  Type
value      string

Returns

string

Defined in

src/inventory/inventory.ts:30


toString()

toString(): string

Returns

string

Defined in

src/inventory/inventory.ts:22
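
This reference lists the Inventory methods without describing them. Judging only by the names censor() and replaceMask(), the class appears to swap sensitive values for placeholders and back again. The sketch below illustrates that round trip under that assumption. It is not Nolita's implementation: MaskEntry, MaskingSketch and the `mask` field are hypothetical, and the real InventoryValue type is not shown in this section.

```typescript
// Hypothetical entry shape; the real InventoryValue type may differ.
type MaskEntry = { name: string; value: string; mask: string };

// Illustrative stand-in for the masking idea, not the Inventory class itself.
class MaskingSketch {
  private readonly entries: MaskEntry[];

  constructor(entries: MaskEntry[]) {
    this.entries = entries;
  }

  // Replace every occurrence of a real value in `text` with its mask.
  censor(text: string): string {
    return this.entries.reduce(
      (acc, entry) => acc.split(entry.value).join(entry.mask),
      text
    );
  }

  // Swap a mask back for the real value; any other input is returned unchanged.
  replaceMask(value: string): string {
    const hit = this.entries.find((entry) => entry.mask === value);
    return hit ? hit.value : value;
  }
}
```

In this reading, text sent to the model would pass through censor(), and a value the model asks to type would pass through replaceMask() just before it reaches the page.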

Class: Logger

Constructors

new Logger()

new Logger(logLevels?: string[], callback?: (input: string) => any): Logger

Parameters

Parameter   Type
logLevels?  string[]
callback?   (input: string) => any

Returns

Logger

Defined in

src/utils/debug.ts:25

Properties

callback

callback: undefined | (input: string) => any

Defined in

src/utils/debug.ts:23


events

events: EventEmitter<DefaultEventMap>

Defined in

src/utils/debug.ts:22


logLevels

logLevels: string[]

Defined in

src/utils/debug.ts:21


logStream

logStream: string[]

Defined in

src/utils/debug.ts:20

Methods

info()

info(input: string): void

Parameters

Parameter  Type
input      string

Returns

void

Defined in

src/utils/debug.ts:43


log()

log(input: string): void

Parameters

Parameter  Type
input      string

Returns

void

Defined in

src/utils/debug.ts:36


streamHandler()

streamHandler(): void

Returns

void

Defined in

src/utils/debug.ts:47

Class: Nolita

High level wrapper for the Nolita API.

Constructors

new Nolita()

new Nolita(hdrApiKey: string, providerApiKey: string, opts?: {endpoint: string;model: string;provider: string;systemPrompt: string;temperature: number; }): Nolita

Initializes a new instance of the Nolita class.

Parameters

Parameter           Type    Description
hdrApiKey           string  The HDR API key.
providerApiKey      string  The API key for the model provider.
opts?               object  Optional configuration options.
opts.endpoint?      string  The collective memory endpoint. Defaults to "https://api.hdr.is".
opts.model?         string  The model to use. Defaults to "gpt-4".
opts.provider?      string  The provider to use. Defaults to "openai".
opts.systemPrompt?  string  The system prompt to use.
opts.temperature?   number  The temperature to use. Defaults to 0.

Returns

Nolita

Defined in

src/nolita.ts:34

Properties

agent

agent: Agent

The agent instance that interacts with the browser. Defaults to OpenAI's gpt-4.

Defined in

src/nolita.ts:21


hdrApiKey

hdrApiKey: string

The HDR API key.

Defined in

src/nolita.ts:18


hdrEndpoint

hdrEndpoint: string = "https://api.hdr.is"

The collective memory endpoint. Defaults to "https://api.hdr.is".

Defined in

src/nolita.ts:19

Methods

task()

task(task: {objective: string;returnSchema: ZodObject<any, UnknownKeysParam, ZodTypeAny, {}, {}>;startUrl: string; }, opts?: {headless: boolean;inventory: Inventory;maxTurns: number; }): Promise<{command: ({index: number;kind: "Click"; } | {index: number;kind: "Type";text: string; } | {kind: "Back"; } | {kind: "Wait"; })[];description: string;objectiveComplete: updatedObjectiveComplete;progressAssessment: string; } | {objectiveFailed: {kind: "ObjectiveFailed";result: failureReason;url: string; }; }>

Executes a task using the Nolita API.

Parameters

Parameter           Type       Description
task                object     The task to perform.
task.objective      string     The objective of the task, e.g. "Tell me the email addresses on the contact page".
task.returnSchema?  ZodObject<any, UnknownKeysParam, ZodTypeAny, {}, {}>  The schema to return.
task.startUrl?      string     The URL the task will begin at, e.g. "https://hdr.is".
opts?               object     -
opts.headless?      boolean    Whether to run the browser in headless mode. Defaults to false.
opts.inventory?     Inventory  The inventory to use when doing tasks.
opts.maxTurns?      number     The maximum number of turns to allow. Defaults to 10.

Returns

Promise<{command: ({index: number;kind: "Click"; } | {index: number;kind: "Type";text: string; } | {kind: "Back"; } | {kind: "Wait"; })[];description: string;objectiveComplete: updatedObjectiveComplete;progressAssessment: string; } | {objectiveFailed: {kind: "ObjectiveFailed";result: failureReason;url: string; }; }>

Defined in

src/nolita.ts:69
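
As a rough illustration of calling task() and handling its two result shapes, here is a sketch. It does not import the package; instead it declares a local TaskRunner interface that mirrors a subset of the documented signature, so the names TaskRunner, TaskResult and findContactEmails are hypothetical. The schema and the objectiveComplete payload are typed as `unknown` because their concrete types depend on the Zod schema you pass.

```typescript
// Subset of the documented task() result union.
type TaskResult =
  | { description: string; progressAssessment: string; objectiveComplete: unknown }
  | { objectiveFailed: { kind: "ObjectiveFailed"; result: string; url: string } };

// Local mirror of the documented task() signature (assumption: trimmed to
// the fields this sketch uses).
interface TaskRunner {
  task(
    task: { objective: string; startUrl?: string; returnSchema?: unknown },
    opts?: { headless?: boolean; maxTurns?: number }
  ): Promise<TaskResult>;
}

// Hypothetical helper: run one task and reduce the outcome to a message.
async function findContactEmails(runner: TaskRunner): Promise<string> {
  const outcome = await runner.task(
    {
      objective: "Tell me the email addresses on the contact page",
      startUrl: "https://hdr.is",
    },
    { headless: true, maxTurns: 10 }
  );
  // The failure variant is the only one with an `objectiveFailed` key.
  if ("objectiveFailed" in outcome) {
    return `failed: ${outcome.objectiveFailed.result}`;
  }
  return `done: ${JSON.stringify(outcome.objectiveComplete)}`;
}
```

With the real class, `runner` would presumably be an instance created with `new Nolita(hdrApiKey, providerApiKey)` as documented in the constructor above.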

Class: Page

Represents a web page and provides methods to interact with it.

Constructors

new Page()

new Page(page: Page, agent: Agent, opts?: {apiKey: string;disableMemory: boolean;endpoint: string;inventory: Inventory;logger: Logger;pageId: string; }): Page

Creates a new Page instance.

Parameters

Parameter            Type       Description
page                 Page       The PuppeteerPage object representing the browser page.
agent                Agent      The Agent object representing the user agent interacting with the page.
opts?                object     Optional parameters for additional configuration.
opts.apiKey?         string     An optional API key for accessing collective memory.
opts.disableMemory?  boolean    -
opts.endpoint?       string     An optional endpoint for collective memory.
opts.inventory?      Inventory  An optional inventory object for storing and retrieving user data.
opts.logger?         Logger     An optional logger for logging events; if not provided, logging may be absent.
opts.pageId?         string     An optional unique identifier for the page; if not provided, a UUID will be generated.

Returns

Page

Defined in

src/browser/page.ts:68

Properties

agent

agent: Agent

Defined in

src/browser/page.ts:51


error

error: undefined | string

Defined in

src/browser/page.ts:55


logger?

optional logger: Logger

Defined in

src/browser/page.ts:52


page

page: Page

Defined in

src/browser/page.ts:42


pageId

pageId: string

Defined in

src/browser/page.ts:50


progress

progress: string[] = []

Defined in

src/browser/page.ts:53

Methods

browse()

browse(objective: string, opts: {agent: Agent;inventory: Inventory;maxTurns: number;progress: string[];schema: ZodObject<any, UnknownKeysParam, ZodTypeAny, {}, {}>; }): Promise<{command: ({index: number;kind: "Click"; } | {index: number;kind: "Type";text: string; } | {kind: "Back"; } | {kind: "Wait"; })[];description: string;objectiveComplete: updatedObjectiveComplete;progressAssessment: string; } | {objectiveFailed: {kind: "ObjectiveFailed";result: failureReason;url: string; }; }>

Browses the page based on the request and return type.

Parameters

Parameter        Type       Description
objective        string     -
opts             object     Additional options.
opts.agent?      Agent      The agent to use (optional).
opts.inventory?  Inventory  The inventory object (optional).
opts.maxTurns    number     The maximum number of turns to browse.
opts.progress?   string[]   The progress towards the objective (optional).
opts.schema?     ZodObject<any, UnknownKeysParam, ZodTypeAny, {}, {}>  The Zod schema for the return type.

Returns

Promise<{command: ({index: number;kind: "Click"; } | {index: number;kind: "Type";text: string; } | {kind: "Back"; } | {kind: "Wait"; })[];description: string;objectiveComplete: updatedObjectiveComplete;progressAssessment: string; } | {objectiveFailed: {kind: "ObjectiveFailed";result: failureReason;url: string; }; }>

A promise that resolves to the retrieved data.

Defined in

src/browser/page.ts:634


close()

close(): Promise<void>

Closes the page.

Returns

Promise<void>

Defined in

src/browser/page.ts:154


content()

content(): Promise<string>

Returns the text content of the page.

Returns

Promise<string>

A promise that resolves to the text content of the page.

Defined in

src/browser/page.ts:102


do()

do(request: string, opts?: {agent: Agent;delay: number;inventory: Inventory;progress: string[];schema: ZodType<any, ZodTypeDef, any>; }): Promise<void>

Performs a request on the page.

Parameters

Parameter        Type       Description
request          string     The request or objective.
opts?            object     Additional options.
opts.agent?      Agent      The agent to use (optional).
opts.delay?      number     The delay in milliseconds after performing the action (default: 100).
opts.inventory?  Inventory  The inventory object (optional).
opts.progress?   string[]   The progress towards the objective (optional).
opts.schema?     ZodType<any, ZodTypeDef, any>  The Zod schema for the return type.

Returns

Promise<void>

Defined in

src/browser/page.ts:511


followRoute()

followRoute(memoryId: string, opts?: {delay: number;inventory: Inventory;maxTurns: number;schema: ZodObject<any, UnknownKeysParam, ZodTypeAny, {}, {}>; }): Promise<any>

Follows a route based on a memory sequence.

Parameters

Parameter        Type       Description
memoryId         string     The memory sequence ID.
opts?            object     Additional options.
opts.delay?      number     The delay in milliseconds after performing the action (default: 100).
opts.inventory?  Inventory  The inventory object (optional).
opts.maxTurns?   number     The maximum number of turns to follow the route. Currently not used.
opts.schema?     ZodObject<any, UnknownKeysParam, ZodTypeAny, {}, {}>  -

Returns

Promise<any>

A promise that resolves to the retrieved data.

Throws

An error is thrown if no memories are found for the memory sequence ID.

Defined in

src/browser/page.ts:710
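
Because followRoute() throws when no memories exist for the given ID, a caller may want a fallback. One possible pattern, sketched here under assumptions, is to try the stored route first and fall back to browse() with the same objective. RoutePage and followOrBrowse are hypothetical names; the interface mirrors only the two documented methods involved.

```typescript
// Structural stand-in for the two documented Page methods used here.
interface RoutePage {
  followRoute(memoryId: string): Promise<unknown>;
  browse(objective: string, opts: { maxTurns: number }): Promise<unknown>;
}

// Hypothetical helper: replay a stored route if possible, otherwise browse.
async function followOrBrowse(
  page: RoutePage,
  memoryId: string,
  objective: string
): Promise<unknown> {
  try {
    return await page.followRoute(memoryId);
  } catch {
    // followRoute() is documented to throw when the sequence has no memories.
    // This sketch treats every error the same way, which a real caller may
    // want to narrow.
    return page.browse(objective, { maxTurns: 10 });
  }
}
```

The `maxTurns` value of 10 is an arbitrary choice for the sketch.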


generateCommand()

generateCommand<T>(request: string, opts: {agent: Agent;inventory: Inventory;progress: string[];schema: T; }): Promise<TypeOf<T>>

Generates a command based on the request and current state.

Type Parameters

Type Parameter
T extends ZodType<any, ZodTypeDef, any>

Parameters

Parameter        Type       Description
request          string     The request or objective.
opts             object     Additional options.
opts.agent?      Agent      The agent to use (optional).
opts.inventory?  Inventory  The inventory object (optional).
opts.progress?   string[]   The progress towards the objective (optional).
opts.schema      T          The Zod schema for the return type.

Returns

Promise<TypeOf<T>>

Defined in

src/browser/page.ts:474


get()

get<T>(request: string, outputSchema: ZodType<any, ZodTypeDef, any>, opts?: {agent: Agent;mode: "text" | "html" | "aria" | "markdown" | "image";progress: string[]; }): Promise<TypeOf<T>>

Retrieves data from the page based on the request and return type.

Type Parameters

Type Parameter                           Description
T extends ZodType<any, ZodTypeDef, any>  The type of the return value.

Parameters

Parameter       Type                                             Default value      Description
request         string                                           undefined          The request or objective.
outputSchema    ZodType<any, ZodTypeDef, any>                    ObjectiveComplete  The Zod schema for the return type.
opts?           object                                           undefined          Additional options.
opts.agent?     Agent                                            undefined          The agent to use (optional).
opts.mode?      "text" | "html" | "aria" | "markdown" | "image"  undefined          -
opts.progress?  string[]                                         undefined          The progress towards the objective (optional).

Returns

Promise<TypeOf<T>>

A promise that resolves to the retrieved data.

Defined in

src/browser/page.ts:546


getState()

getState(): undefined | {ariaTree: string;kind: "ObjectiveState";objective: string;progress: string[];url: string; }

Returns the current state of the page.

Returns

undefined | {ariaTree: string;kind: "ObjectiveState";objective: string;progress: string[];url: string; }

The objective state of the page.

Defined in

src/browser/page.ts:347


goto()

goto(url: string, opts?: {delay: number; }): Promise<void>

Navigates to a URL.

Parameters

Parameter    Type    Description
url          string  The URL to navigate to.
opts?        object  The navigation options.
opts.delay?  number  The delay in milliseconds after navigating to the URL.

Returns

Promise<void>

Defined in

src/browser/page.ts:189


html()

html(): Promise<string>

Returns the HTML content of the page.

Returns

Promise<string>

A promise that resolves to the HTML content of the page.

Defined in

src/browser/page.ts:132


injectBoundingBoxes()

injectBoundingBoxes(): Promise<void>

Injects bounding boxes around clickable elements on the page.

Returns

Promise<void>

Defined in

src/browser/page.ts:742


makePrompt()

makePrompt(request: string, opts?: {agent: Agent;inventory: Inventory;progress: string[]; }): Promise<{prompt: CoreMessage[];state: {ariaTree: string;kind: "ObjectiveState";objective: string;progress: string[];url: string; }; }>

Creates a prompt for the agent based on the request and current state.

Parameters

Parameter        Type       Description
request          string     The request or objective.
opts?            object     Additional options.
opts.agent?      Agent      The agent to create the prompt for.
opts.inventory?  Inventory  -
opts.progress?   string[]   The progress of the objective (optional).

Returns

Promise<{prompt: CoreMessage[];state: {ariaTree: string;kind: "ObjectiveState";objective: string;progress: string[];url: string; }; }>

A promise that resolves to the created prompt.

prompt

prompt: CoreMessage[]

state

state: {ariaTree: string;kind: "ObjectiveState";objective: string;progress: string[];url: string; }

state.ariaTree

ariaTree: string

state.kind

kind: "ObjectiveState"

state.objective

objective: string

state.progress

progress: string[]

state.url

url: string

Defined in

src/browser/page.ts:442


markdown()

markdown(): Promise<string>

Returns the Markdown representation of the page's HTML content.

Returns

Promise<string>

A promise that resolves to the Markdown content of the page.

Defined in

src/browser/page.ts:139


parseContent()

parseContent(): Promise<string>

Parses the content of the page and returns a simplified accessibility tree.

Returns

Promise<string>

A promise that resolves to the simplified accessibility tree as a JSON string.

Defined in

src/browser/page.ts:310


performAction()

performAction(command: {index: number;kind: "Click"; } | {index: number;kind: "Type";text: string; } | {index: number;kind: "Enter"; } | {kind: "Back"; } | {kind: "Wait"; } | {index: number;kind: "Hover"; } | {direction: "up" | "down";kind: "Scroll"; } | {kind: "GoTo";url: string; } | {kind: "Get";request: string;type: "text" | "html" | "aria" | "markdown" | "image"; }, opts?: {delay: number;inventory: Inventory; }): Promise<void>

Performs a browser action on the page.

Parameters

Parameter        Type       Description
command          {index: number;kind: "Click"; } | {index: number;kind: "Type";text: string; } | {index: number;kind: "Enter"; } | {kind: "Back"; } | {kind: "Wait"; } | {index: number;kind: "Hover"; } | {direction: "up" | "down";kind: "Scroll"; } | {kind: "GoTo";url: string; } | {kind: "Get";request: string;type: "text" | "html" | "aria" | "markdown" | "image"; }  The browser action to perform.
opts?            object     Additional options.
opts.delay?      number     The delay in milliseconds after performing the action (default: 100).
opts.inventory?  Inventory  The inventory object (optional).

Returns

Promise<void>

Defined in

src/browser/page.ts:358
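
The `command` parameter is a union discriminated by its `kind` field, which makes it convenient to handle with an exhaustive switch. The sketch below restates the documented union as a local type and maps each variant to a short description. BrowserAction, ContentType and describeAction are hypothetical names, and reading `index` as "the element's index on the page" is an assumption.

```typescript
type ContentType = "text" | "html" | "aria" | "markdown" | "image";

// Local restatement of the documented command union.
type BrowserAction =
  | { kind: "Click"; index: number }
  | { kind: "Type"; index: number; text: string }
  | { kind: "Enter"; index: number }
  | { kind: "Back" }
  | { kind: "Wait" }
  | { kind: "Hover"; index: number }
  | { kind: "Scroll"; direction: "up" | "down" }
  | { kind: "GoTo"; url: string }
  | { kind: "Get"; request: string; type: ContentType };

// Hypothetical helper: a human-readable line for each action variant.
function describeAction(action: BrowserAction): string {
  switch (action.kind) {
    case "Click":
      return `click element ${action.index}`;
    case "Type":
      return `type "${action.text}" into element ${action.index}`;
    case "Enter":
      return `press Enter on element ${action.index}`;
    case "Back":
      return "go back";
    case "Wait":
      return "wait";
    case "Hover":
      return `hover over element ${action.index}`;
    case "Scroll":
      return `scroll ${action.direction}`;
    case "GoTo":
      return `go to ${action.url}`;
    case "Get":
      return `get ${action.type} content for: ${action.request}`;
    default: {
      // Compile-time check that every variant is handled above.
      const unreachable: never = action;
      return unreachable;
    }
  }
}
```

If a new action kind is added to the union, the `never` assignment stops compiling until the switch handles it.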


performManyActions()

performManyActions(commands: ({index: number;kind: "Click"; } | {index: number;kind: "Type";text: string; } | {index: number;kind: "Enter"; } | {kind: "Back"; } | {kind: "Wait"; } | {index: number;kind: "Hover"; } | {direction: "up" | "down";kind: "Scroll"; } | {kind: "GoTo";url: string; } | {kind: "Get";request: string;type: "text" | "html" | "aria" | "markdown" | "image"; })[], opts?: {delay: number;inventory: Inventory; }): Promise<void>

Performs multiple browser actions on the page.

Parameters

Parameter        Type       Description
commands         ({index: number;kind: "Click"; } | {index: number;kind: "Type";text: string; } | {index: number;kind: "Enter"; } | {kind: "Back"; } | {kind: "Wait"; } | {index: number;kind: "Hover"; } | {direction: "up" | "down";kind: "Scroll"; } | {kind: "GoTo";url: string; } | {kind: "Get";request: string;type: "text" | "html" | "aria" | "markdown" | "image"; })[]  An array of browser actions to perform.
opts?            object     Additional options.
opts.delay?      number     The delay in milliseconds after performing the action (default: 100).
opts.inventory?  Inventory  The inventory object (optional).

Returns

Promise<void>

Defined in

src/browser/page.ts:425
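
When you already know the steps, a command array can be built by hand instead of asking the model for it. Here is a small sketch that assembles a "type a query, press Enter, wait" sequence from three of the documented action kinds. FormAction and searchCommands are hypothetical names, and the meaning of `index` as the target element's index is an assumption.

```typescript
// The three documented action kinds this sketch uses.
type FormAction =
  | { kind: "Type"; index: number; text: string }
  | { kind: "Enter"; index: number }
  | { kind: "Wait" };

// Hypothetical helper: commands for submitting a query through one input.
function searchCommands(inputIndex: number, query: string): FormAction[] {
  return [
    { kind: "Type", index: inputIndex, text: query },
    { kind: "Enter", index: inputIndex },
    { kind: "Wait" },
  ];
}
```

The resulting array matches the shape that performManyActions() accepts for its `commands` parameter, so it could presumably be passed along with an optional `delay`.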


performMemory()

performMemory(memory: {actionStep: {command: any;description: string;objectiveComplete: {kind: "ObjectiveComplete";result: string; } | {};progressAssessment: string; };objectiveState: ObjectiveState; }, opts?: {agent: Agent;delay: number;inventory: Inventory;memoryDelay: number;schema: ZodObject<any, UnknownKeysParam, ZodTypeAny, {}, {}>; }): Promise<any>

Performs a stored memory (a recorded action step and its objective state) on the page.

Parameters

Parameter                              Type       Description
memory                                 object     The memory to perform.
memory.actionStep                      object     -
memory.actionStep.command?             any        -
memory.actionStep.description?         string     -
memory.actionStep.objectiveComplete?   {kind: "ObjectiveComplete";result: string; } | {}  -
memory.actionStep.progressAssessment?  string     -
memory.objectiveState?                 object     -
memory.objectiveState.ariaTree?        string     -
memory.objectiveState.kind?            "ObjectiveState"  -
memory.objectiveState.objective?       string     -
memory.objectiveState.progress?        string[]   -
memory.objectiveState.url?             string     -
opts?                                  object     Additional options.
opts.agent?                            Agent      The agent to use (optional). Defaults to the page agent.
opts.delay?                            number     The delay in milliseconds after performing the action (default: 100).
opts.inventory?                        Inventory  The inventory object (optional).
opts.memoryDelay?                      number     The delay in milliseconds after performing the memory (optional).
opts.schema?                           ZodObject<any, UnknownKeysParam, ZodTypeAny, {}, {}>  The Zod schema for the return type (optional).

Returns

Promise<any>

Defined in

src/browser/page.ts:674


returnErrorState()

returnErrorState(failureReason: string): Promise<{objectiveFailed: {kind: "ObjectiveFailed";result: failureReason;url: string; }; }>

Parameters

Parameter      Type
failureReason  string

Returns

Promise<{objectiveFailed: {kind: "ObjectiveFailed";result: failureReason;url: string; }; }>

objectiveFailed

objectiveFailed: {kind: "ObjectiveFailed";result: failureReason;url: string; }

objectiveFailed.kind

kind: string = "ObjectiveFailed"

objectiveFailed.result

result: string = failureReason

objectiveFailed.url

url: string

Defined in

src/browser/page.ts:609
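
Several Page methods resolve either to a success payload or to the `objectiveFailed` wrapper shown above. A type guard is one way to separate the two at runtime. The sketch below is a suggestion rather than part of the library: ObjectiveFailed and isObjectiveFailed are hypothetical names, and the guard checks only the `kind` marker.

```typescript
// Local mirror of the documented failure wrapper.
type ObjectiveFailed = {
  objectiveFailed: { kind: "ObjectiveFailed"; result: string; url: string };
};

// Hypothetical type guard: true when `value` looks like the failure wrapper.
function isObjectiveFailed(value: unknown): value is ObjectiveFailed {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const inner = (value as { objectiveFailed?: unknown }).objectiveFailed;
  return (
    typeof inner === "object" &&
    inner !== null &&
    (inner as { kind?: unknown }).kind === "ObjectiveFailed"
  );
}
```

After a positive check, TypeScript narrows the value so that `value.objectiveFailed.result` and `value.objectiveFailed.url` are available without casts.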


screenshot()

screenshot(): Promise<Buffer>

Takes a screenshot of the page.

Returns

Promise<Buffer>

A promise that resolves to the screenshot buffer.

Defined in

src/browser/page.ts:124


setViewport()

setViewport(width: number, height: number, deviceScaleFactor: number): Promise<void>

Sets the viewport size of the page.

Parameters

Parameter          Type    Default value  Description
width              number  undefined      The width of the viewport.
height             number  undefined      The height of the viewport.
deviceScaleFactor  number  1              The device scale factor (default: 1).

Returns

Promise<void>

Defined in

src/browser/page.ts:112


state()

state(objective: string, objectiveProgress?: string[]): Promise<{ariaTree: string;kind: "ObjectiveState";objective: string;progress: string[];url: string; }>

Retrieves the current state of the page based on the objective and progress.

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| objective | string | The objective of the page. |
| objectiveProgress? | string[] | The progress of the objective. |

Returns

Promise<{ariaTree: string;kind: "ObjectiveState";objective: string;progress: string[];url: string; }>

A promise that resolves to the objective state of the page.

ariaTree

ariaTree: string

kind

kind: "ObjectiveState"

objective

objective: string

progress

progress: string[]

url

url: string

Defined in

src/browser/page.ts:326
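
Because `ariaTree` can be large, it is often more practical to log a short summary of the state than the whole object. The following is a sketch under that assumption: the `ObjectiveState` type is re-declared locally from the shape documented above, and `summarizeState` and the example values are hypothetical.

```typescript
// Local mirror of the documented return shape of state().
// Assumption: declared here for illustration, not imported from Nolita.
type ObjectiveState = {
  ariaTree: string;
  kind: "ObjectiveState";
  objective: string;
  progress: string[];
  url: string;
};

// Hypothetical helper: one-line summary for logs that omits the ARIA tree.
function summarizeState(state: ObjectiveState): string {
  const lastStep =
    state.progress.length > 0 ? state.progress[state.progress.length - 1] : "none";
  return `[${state.url}] ${state.objective} (steps: ${state.progress.length}, last: ${lastStep})`;
}

// Made-up example value with the documented fields.
const exampleState: ObjectiveState = {
  ariaTree: "[0] link Pricing",
  kind: "ObjectiveState",
  objective: "find the monthly price",
  progress: ["opened home page", "clicked Pricing"],
  url: "https://example.com/pricing",
};

console.log(summarizeState(exampleState));
```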


step()

step(objective: string, outputSchema?: ZodObject<any, UnknownKeysParam, ZodTypeAny, {}, {}>, opts?: {agent: Agent;delay: number;inventory: Inventory;progress: string[]; }): Promise<{command: ({index: number;kind: "Click"; } | {index: number;kind: "Type";text: string; } | {kind: "Back"; } | {kind: "Wait"; })[];description: string;objectiveComplete: updatedObjectiveComplete;progressAssessment: string; }>

Take the next step towards the objective.

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| objective | string | - |
| outputSchema? | ZodObject<any, UnknownKeysParam, ZodTypeAny, {}, {}> | - |
| opts? | object | Additional options. |
| opts.agent? | Agent | The agent to use (optional). |
| opts.delay? | number | - |
| opts.inventory? | Inventory | The inventory object (optional). |
| opts.progress? | string[] | The progress towards the objective (optional). |

Returns

Promise<{command: ({index: number;kind: "Click"; } | {index: number;kind: "Type";text: string; } | {kind: "Back"; } | {kind: "Wait"; })[];description: string;objectiveComplete: updatedObjectiveComplete;progressAssessment: string; }>

A promise that resolves to the retrieved data.

command?

optional command: ({index: number;kind: "Click"; } | {index: number;kind: "Type";text: string; } | {kind: "Back"; } | {kind: "Wait"; })[]

description

description: string

objectiveComplete?

optional objectiveComplete: {} = updatedObjectiveComplete

progressAssessment

progressAssessment: string

Defined in

src/browser/page.ts:582
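
Since `step()` takes a single step, a caller that wants to reach an objective typically calls it in a loop and feeds the accumulated progress back in. The sketch below illustrates one way to do that with a step budget. It does not import Nolita: `StepResult` mirrors the documented return shape, `StepFn` is a hypothetical stand-in for a function that wraps `page.step()`, and the stub simulates a page that completes on the third call. It also assumes that a completed step carries `kind: "ObjectiveComplete"`, as in the `ObjectiveComplete` type.

```typescript
// Local mirrors of the documented step() result; not imported from Nolita.
type BrowserAction =
  | { kind: "Click"; index: number }
  | { kind: "Type"; index: number; text: string }
  | { kind: "Back" }
  | { kind: "Wait" };

type StepResult = {
  command?: BrowserAction[];
  description: string;
  objectiveComplete?: { kind?: "ObjectiveComplete"; result?: string };
  progressAssessment: string;
};

// Hypothetical signature: a function that behaves like a wrapper around page.step().
type StepFn = (objective: string, progress: string[]) => Promise<StepResult>;

// True once the step reports the objective as complete.
function isComplete(step: StepResult): boolean {
  return step.objectiveComplete?.kind === "ObjectiveComplete";
}

// Calls `step` until it reports completion or `maxSteps` is reached,
// recording each step's description as progress.
async function runUntilComplete(
  step: StepFn,
  objective: string,
  maxSteps = 10,
): Promise<{ completed: boolean; progress: string[] }> {
  const progress: string[] = [];
  for (let i = 0; i < maxSteps; i++) {
    const result = await step(objective, progress);
    progress.push(result.description);
    if (isComplete(result)) return { completed: true, progress };
  }
  return { completed: false, progress };
}

// Stub that reports completion on the third call, standing in for a real page.
const stubStep: StepFn = async (_objective, progress) => {
  const result: StepResult = {
    description: `step ${progress.length + 1}`,
    progressAssessment: "in progress",
  };
  if (progress.length >= 2) {
    result.objectiveComplete = { kind: "ObjectiveComplete", result: "done" };
  }
  return result;
};

runUntilComplete(stubStep, "find the pricing page").then((outcome) => {
  console.log(outcome.completed, outcome.progress);
});
```

The step budget guards against an agent that never reports completion; the right limit depends on the task.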


title()

title(): Promise<string>

Returns the title of the page.

Returns

Promise<string>

The title of the page.

Defined in

src/browser/page.ts:162


url()

url(): string

Returns the URL of the page.

Returns

string

The URL of the page.

Defined in

src/browser/page.ts:94

Enumeration: BrowserMode

Enumeration Members

text

text: "text"

Defined in

src/types/browser/browser.types.ts:5


vision

vision: "vision"

Defined in

src/types/browser/browser.types.ts:4

Function: makeAgent()

makeAgent(prodiverOpts?: {apiKey: string;provider: string; }, modelConfig?: Partial<ModelConfig>, opts?: {systemPrompt: string; }): Agent

Parameters

| Parameter | Type |
| --- | --- |
| prodiverOpts? | object |
| prodiverOpts.apiKey? | string |
| prodiverOpts.provider? | string |
| modelConfig? | Partial<ModelConfig> |
| opts? | object |
| opts.systemPrompt? | string |

Returns

Agent

Defined in

src/agent/agent.ts:297

Function: ModelResponseSchema()

ModelResponseSchema<TObjectiveComplete>(objectiveCompleteExtension?: TObjectiveComplete, commandSchema?: ZodType<any, ZodTypeDef, any>): ZodObject<{command: ZodOptional<ZodType<any, ZodTypeDef, any>>;description: ZodString;objectiveComplete: ZodOptional<ZodObject<{kind: ZodLiteral<"ObjectiveComplete">;result: ZodString; }, "strip", ZodTypeAny, {kind: "ObjectiveComplete";result: string; }, {kind: "ObjectiveComplete";result: string; }>> | ZodOptional<ZodObject<extendShape<{kind: ZodLiteral<"ObjectiveComplete">;result: ZodString; }, TObjectiveComplete["shape"]>, TObjectiveComplete["_def"]["unknownKeys"], TObjectiveComplete["_def"]["catchall"], objectOutputType<extendShape<{kind: ZodLiteral<"ObjectiveComplete">;result: ZodString; }, TObjectiveComplete["shape"]>, TObjectiveComplete["_def"]["catchall"], TObjectiveComplete["_def"]["unknownKeys"]>, objectInputType<extendShape<{kind: ZodLiteral<"ObjectiveComplete">;result: ZodString; }, TObjectiveComplete["shape"]>, TObjectiveComplete["_def"]["catchall"], TObjectiveComplete["_def"]["unknownKeys"]>>>;progressAssessment: ZodString; }, "strip", ZodTypeAny, { [k in "description" | "progressAssessment" | "command" | "objectiveComplete"]: addQuestionMarks<baseObjectOutputType<Object>, any>[k] }, { [k_1 in "description" | "progressAssessment" | "command" | "objectiveComplete"]: baseObjectInputType<Object>[k_1] }>

Type Parameters

Type Parameter
TObjectiveComplete extends AnyZodObject

Parameters

| Parameter | Type | Default value |
| --- | --- | --- |
| objectiveCompleteExtension? | TObjectiveComplete | undefined |
| commandSchema? | ZodType<any, ZodTypeDef, any> | BrowserActionSchemaArray |

Returns

ZodObject<{command: ZodOptional<ZodType<any, ZodTypeDef, any>>;description: ZodString;objectiveComplete: ZodOptional<ZodObject<{kind: ZodLiteral<"ObjectiveComplete">;result: ZodString; }, "strip", ZodTypeAny, {kind: "ObjectiveComplete";result: string; }, {kind: "ObjectiveComplete";result: string; }>> | ZodOptional<ZodObject<extendShape<{kind: ZodLiteral<"ObjectiveComplete">;result: ZodString; }, TObjectiveComplete["shape"]>, TObjectiveComplete["_def"]["unknownKeys"], TObjectiveComplete["_def"]["catchall"], objectOutputType<extendShape<{kind: ZodLiteral<"ObjectiveComplete">;result: ZodString; }, TObjectiveComplete["shape"]>, TObjectiveComplete["_def"]["catchall"], TObjectiveComplete["_def"]["unknownKeys"]>, objectInputType<extendShape<{kind: ZodLiteral<"ObjectiveComplete">;result: ZodString; }, TObjectiveComplete["shape"]>, TObjectiveComplete["_def"]["catchall"], TObjectiveComplete["_def"]["unknownKeys"]>>>;progressAssessment: ZodString; }, "strip", ZodTypeAny, { [k in "description" | "progressAssessment" | "command" | "objectiveComplete"]: addQuestionMarks<baseObjectOutputType<Object>, any>[k] }, { [k_1 in "description" | "progressAssessment" | "command" | "objectiveComplete"]: baseObjectInputType<Object>[k_1] }>

command

command: ZodOptional<ZodType<any, ZodTypeDef, any>>

description

description: ZodString

objectiveComplete

objectiveComplete: ZodOptional<ZodObject<{kind: ZodLiteral<"ObjectiveComplete">;result: ZodString; }, "strip", ZodTypeAny, {kind: "ObjectiveComplete";result: string; }, {kind: "ObjectiveComplete";result: string; }>> | ZodOptional<ZodObject<extendShape<{kind: ZodLiteral<"ObjectiveComplete">;result: ZodString; }, TObjectiveComplete["shape"]>, TObjectiveComplete["_def"]["unknownKeys"], TObjectiveComplete["_def"]["catchall"], objectOutputType<extendShape<{kind: ZodLiteral<"ObjectiveComplete">;result: ZodString; }, TObjectiveComplete["shape"]>, TObjectiveComplete["_def"]["catchall"], TObjectiveComplete["_def"]["unknownKeys"]>, objectInputType<extendShape<{kind: ZodLiteral<"ObjectiveComplete">;result: ZodString; }, TObjectiveComplete["shape"]>, TObjectiveComplete["_def"]["catchall"], TObjectiveComplete["_def"]["unknownKeys"]>>>

progressAssessment

progressAssessment: ZodString

Defined in

src/types/browser/actionStep.types.ts:63

Function: setupServer()

setupServer(): OpenAPIHono<Env, {}, "/">

Returns

OpenAPIHono<Env, {}, "/">

Defined in

src/server/index.ts:12

Type Alias: BrowserActionSchemaArray

BrowserActionSchemaArray: ({index: number;kind: "Click"; } | {index: number;kind: "Type";text: string; } | {kind: "Back"; } | {kind: "Wait"; })[]

Defined in

src/types/browser/actionStep.types.ts:29
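
The four action kinds form a discriminated union on `kind`, so a `switch` can handle each case with full type narrowing. The following sketch re-declares the union locally from the type shown above; `BrowserAction`, `describeAction`, and the sample commands are hypothetical names and data used for illustration.

```typescript
// Local mirror of one element of BrowserActionSchemaArray; not imported from Nolita.
type BrowserAction =
  | { kind: "Click"; index: number }
  | { kind: "Type"; index: number; text: string }
  | { kind: "Back" }
  | { kind: "Wait" };

// Hypothetical helper: turns an action into a human-readable log line.
function describeAction(action: BrowserAction): string {
  switch (action.kind) {
    case "Click":
      return `click element #${action.index}`;
    case "Type":
      return `type "${action.text}" into element #${action.index}`;
    case "Back":
      return "go back";
    case "Wait":
      return "wait";
    default: {
      // Exhaustiveness check: this stops type-checking if a new kind is left unhandled.
      const unreachable: never = action;
      return unreachable;
    }
  }
}

// Made-up command list in the documented shape.
const sampleCommands: BrowserAction[] = [
  { kind: "Click", index: 4 },
  { kind: "Type", index: 7, text: "hello" },
  { kind: "Wait" },
];

console.log(sampleCommands.map(describeAction).join("; "));
```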

Type Alias: BrowserArgs

BrowserArgs: {browserWSEndpoint: string;headless: boolean;mode: BrowserMode;userAgent: string; }

Type declaration

browserWSEndpoint?

optional browserWSEndpoint: string

headless

headless: boolean

mode

mode: BrowserMode

userAgent?

optional userAgent: string

Defined in

src/types/browser/browser.types.ts:8
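
Only `headless` and `mode` are required in `BrowserArgs`, so a small helper can fill them in and let callers override the rest. This is a sketch with locally declared types; `withDefaults` is hypothetical, and the default values chosen here are illustrative rather than Nolita's own defaults.

```typescript
// Local mirrors of the documented types; not imported from Nolita.
type BrowserMode = "text" | "vision";

type BrowserArgs = {
  browserWSEndpoint?: string;
  headless: boolean;
  mode: BrowserMode;
  userAgent?: string;
};

// Hypothetical helper: fills the required fields, then applies any overrides.
// The defaults below are assumptions made for this example.
function withDefaults(overrides: Partial<BrowserArgs> = {}): BrowserArgs {
  return { headless: true, mode: "text", ...overrides };
}

const visionArgs = withDefaults({ mode: "vision", userAgent: "example-agent/1.0" });
console.log(visionArgs);
```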

Type Alias: CollectiveMemoryConfig

CollectiveMemoryConfig: {apiKey: string;endpoint: string; }

Type declaration

apiKey

apiKey: string

endpoint

endpoint: string

Defined in

src/types/collectiveMemory/config.types.ts:3

Type Alias: ModelResponseType<TObjectiveComplete>

ModelResponseType<TObjectiveComplete>: { [k in "description" | "progressAssessment" | "command" | "objectiveComplete"]: addQuestionMarks<baseObjectOutputType<Object>, any>[k] }

Type Parameters

| Type Parameter | Default type |
| --- | --- |
| TObjectiveComplete extends z.AnyZodObject | typeof ObjectiveComplete |

Defined in

src/types/browser/actionStep.types.ts:82

Type Alias: ObjectiveComplete

ObjectiveComplete: {kind: "ObjectiveComplete";result: string; }

Type declaration

kind

kind: "ObjectiveComplete"

result

result: string

Defined in

src/types/browser/actionStep.types.ts:32

Type Alias: ObjectiveFailed

ObjectiveFailed: {kind: "ObjectiveFailed";reason: string; }

Type declaration

kind

kind: "ObjectiveFailed"

reason

reason: string

Defined in

src/types/browser/actionStep.types.ts:39
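
`ObjectiveComplete` and `ObjectiveFailed` share a `kind` field but differ in their payload (`result` versus `reason`), so code that handles both can narrow on `kind`. A minimal sketch follows, with both types re-declared locally; the `Outcome` union and `outcomeMessage` are hypothetical and not part of Nolita.

```typescript
// Local mirrors of the documented type aliases; not imported from Nolita.
type ObjectiveComplete = { kind: "ObjectiveComplete"; result: string };
type ObjectiveFailed = { kind: "ObjectiveFailed"; reason: string };

// Hypothetical union of the two terminal outcomes.
type Outcome = ObjectiveComplete | ObjectiveFailed;

// Narrowing on `kind` gives typed access to `result` or `reason`.
function outcomeMessage(outcome: Outcome): string {
  return outcome.kind === "ObjectiveComplete"
    ? `Completed: ${outcome.result}`
    : `Failed: ${outcome.reason}`;
}

console.log(outcomeMessage({ kind: "ObjectiveComplete", result: "Price is $10" }));
console.log(outcomeMessage({ kind: "ObjectiveFailed", reason: "Page timed out" }));
```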

Variable: BrowserActionSchemaArray

const BrowserActionSchemaArray: ZodArray<BrowserActionSchemaArray>

Defined in

src/types/browser/actionStep.types.ts:29

Variable: BrowserArgs

const BrowserArgs: ZodObject<BrowserArgs>

Defined in

src/types/browser/browser.types.ts:8

Variable: CollectiveMemoryConfig

const CollectiveMemoryConfig: ZodObject<CollectiveMemoryConfig>

Defined in

src/types/collectiveMemory/config.types.ts:3

Variable: ObjectiveComplete

const ObjectiveComplete: ZodObject<ObjectiveComplete>

Defined in

src/types/browser/actionStep.types.ts:32

Variable: ObjectiveFailed

const ObjectiveFailed: ZodObject<ObjectiveFailed>

Defined in

src/types/browser/actionStep.types.ts:39