#💬│general


sleek vortex
#

no but i could have a pipeline that runs in the background where it

#
  • fetches and scrapes pages, could even use embedded chrome to be able to fetch js pages and pdfs
#
  • maybe uses llama to consider how good the source is?
#

based on some form of threshold we could store the source snippets into embeddings

#

obviously there's the problem that i wouldn't have a pagerank algorithm

#

and my web index would be purely searching based on text similarity
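A minimal sketch of the pipeline described above, under loud assumptions: `embed` is a toy bag-of-words stand-in for a real embedding model, the `quality` score would come from something like the llama check mentioned earlier, and `SnippetIndex` is a hypothetical name. It shows the threshold gate on storage and a search that ranks purely by text similarity, with no PageRank-style signal.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real pipeline would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SnippetIndex:
    """Stores only snippets whose source-quality score clears a threshold."""

    def __init__(self, quality_threshold: float = 0.5):
        self.quality_threshold = quality_threshold
        self.items: list[tuple[str, Counter]] = []

    def add(self, snippet: str, quality: float) -> bool:
        # quality is assumed to come from a separate scoring step (e.g. an LLM)
        if quality < self.quality_threshold:
            return False
        self.items.append((snippet, embed(snippet)))
        return True

    def search(self, query: str, k: int = 3) -> list[str]:
        # Ranking is similarity-only: there is no link-based authority signal.
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [snippet for snippet, _ in ranked[:k]]
```

Because ranking is similarity-only, a low-quality page that happens to match the query wording would still rank highly unless the quality gate filtered it out at ingestion time.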

agile jay
#

Yep, I think refining the input using embeddings, and then using a fast and cheap model like llama 3 would be the best option.

onyx echo
#

What decides whether my question (in a non-English language) gets translated into English for Pro Search? I notice my questions are sometimes translated, but not always, which is kind of annoying to be honest. Is there some kind of switch?

agile jay
devout cargo
inland bear
zenith atlas
#

Has anyone used both Kagi and Perplexity and compared their AI features? I'm thinking of trying Kagi but not sure if it's worth the time and money

warm cave
halcyon sequoia
#

Quick question. I am new to Perplexity. I just asked it what the latest version of ChatGPT is, and it said the latest version is GPT4 Turbo, which was released in November 2023. Is this out-of-date information because I am using the free version, or is this typical of Perplexity?

sleek vortex
# warm cave

haiku's response to
"plan me a trip to japan somewhere in the month of may 2024"

{
    "steps": [
        {
            "id": 0,
            "worker_type": "web_worker",
            "inputs": {
                "goal": "best places to visit in japan in may 2024"
            },
            "dependencies": []
        },
        {
            "id": 1,
            "worker_type": "web_worker",
            "inputs": {
                "goal": "flight prices from major us airports to japan in may 2024"
            },
            "dependencies": [
                0
            ]
        },
        {
            "id": 2,
            "worker_type": "web_worker",
            "inputs": {
                "goal": "top rated hotels in popular japan destinations for may 2024"
            },
            "dependencies": [
                0
            ]
        },
        {
            "id": 3,
            "worker_type": "web_worker",
            "inputs": {
                "goal": "japan rail pass options and pricing for may 2024"
            },
            "dependencies": [
                0
            ]
        },
        {
            "id": 4,
            "worker_type": "final_output",
            "dependencies": [
                0, 1, 2, 3
            ]
        }
    ]
}
warm cave
# halcyon sequoia Quick question. I am new to Perplexity. I just asked it what the latest version...

Hey, Are you on web search or writing mode, here is what I get when I ask the same question in writing mode: https://www.perplexity.ai/search/What-is-the-naaH6lGFTmipttnJeN1xMw

Perplexity AI

The latest version of ChatGPT is GPT-4o, which was introduced on May 13, 2024. This new model offers several enhancements over previous versions, including improved speed and advanced capabilities in text, voice, and vision processing. GPT-4o is designed to provide GPT-4-level intelligence but is much faster and more efficient. It also includes ...

sleek vortex
#

seems to not pick up on the weather thing

sleek vortex
#

and this is sonnet

#

way better but still idk

#
```json
{
    "steps": [
        {
            "id": 0,
            "worker_type": "web_worker",
            "inputs": {
                "goal": "find popular tourist destinations in Japan"
            },
            "dependencies": []
        },
        {
            "id": 1,
            "worker_type": "web_worker",
            "inputs": {
                "goal": "find typical weather conditions in Japan during May"
            },
            "dependencies": []
        },
        {
            "id": 2,
            "worker_type": "airport_search",
            "inputs": {
                "location": "my current location"
            },
            "dependencies": []
        },
        {
            "id": 3,
            "worker_type": "airport_search",
            "inputs": {
                "location": "major cities in Japan"
            },
            "dependencies": [0]
        },
        {
            "id": "4*",
            "worker_type": "map:flight_price",
            "inputs": {
                "per:takeoff_location": "ref:2.airport_code",
                "per:landing_location": "ref:3.airport_code",
                "per:day": "dates in May 2024"
            },
            "dependencies": [2, 3]
        },
        {
            "id": 5,
            "worker_type": "web_worker",
            "inputs": {
                "goal": "find highly rated hotels and accommodation options in the Japanese cities from step 0"
            },
            "dependencies": [0]
        },
        {
            "id": 6,
            "worker_type": "final_output",
            "dependencies": [0, 1, 4, 5]
        }
    ]
}
```
#

havent fully figured out the syntax for map steps but

#

this is already promising i think

warm cave
sleek vortex
#

i need to figure out map steps

#

or how to syntax it

#

aka like when the following step is run on each output from a previous step

#

so if i asked for an ipad lineup comparison
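One way the map-step idea could be sketched: take a single `map:` step and fan it out into one concrete step per item returned by the upstream step. This is only an illustration. The `<ref>` placeholder, the `"1*[0]"` style of expanded ids, and the function name `expand_map_step` are assumptions made here, not a settled syntax.

```python
def expand_map_step(step: dict, upstream_items: list[str]) -> list[dict]:
    """Turn one 'map:' step into one concrete step per upstream item.

    Assumed conventions (hypothetical): worker_type is 'map:<worker>',
    keys prefixed with 'per:' are filled per item, and '<ref>' inside a
    value marks where the upstream item is substituted.
    """
    prefix, _, worker = step["worker_type"].partition(":")
    if prefix != "map" or not worker:
        raise ValueError("not a map step")

    expanded = []
    for i, item in enumerate(upstream_items):
        inputs = {}
        for key, value in step["inputs"].items():
            if key.startswith("per:"):
                inputs[key.removeprefix("per:")] = value.replace("<ref>", item)
            else:
                inputs[key] = value  # shared input, copied unchanged
        expanded.append(
            {"id": f"{step['id']}[{i}]", "worker_type": worker, "inputs": inputs}
        )
    return expanded
```

For an iPad lineup comparison, the first step would return the list of models and this expansion would produce one search step per model, all of which can then run in parallel.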

warm cave
#

What model do you plan to use on the final product?

sleek vortex
#

ideally first step would get a list

sleek vortex
#

im too broke to serve opus atm

#

all the intermediary models would be llama 8b or something?

#

and the starting model im debating between haiku, sonnet or a finetune 8b

#

let me ask this same query to groq l3- 70b actually

#

and 4o

warm cave
#

Yeah, let's see

sleek vortex
#

gpt4o input $5/mtok, output $15/mtok

#

time to sign into my $5 funded api account on gpt platform lol

warm cave
#

I'm glad groq is free rn, makes it easy to test things out

sleek vortex
#

ok so heres gpt4o on temp 1

#
```json
{
  "steps": [
    {
      "id": 0,
      "worker_type": "web_worker",
      "inputs": {
        "goal": "best cities to visit in Japan in May 2024"
      },
      "dependencies": []
    },
    {
      "id": "1*",
      "worker_type": "map:airport_search",
      "inputs": {
        "per:location": "<ref>"
      },
      "dependencies": [
        0
      ]
    },
    {
      "id": 2,
      "worker_type": "flight_price",
      "inputs": {
        "takeoff_location": "origin_airport_code",
        "landing_location": "destination_airport_code",
        "day": "May 2024"
      },
      "dependencies": [
        1
      ]
    },
    {
      "id": 3,
      "worker_type": "weather_worker",
      "inputs": {
        "location": "Japan"
      },
      "dependencies": []
    },
    {
      "id": 4,
      "worker_type": "final_output",
      "dependencies": [
        0,
        "1*",
        2,
        3
      ]
    }
  ]
}
```
#

ok ideally what do we actually want it to do

#
  • search up best tourist places
  • look at weather in that area for the month
  • find the airport codes for those areas
  • find flight price from user's origin airport
  • build an output itinerary
#

well we're definitely very close

#

i think maybe we have to first expose a better schema for the weather function

#

and also make it more clear how to specify inputs from other functions

like "landing_location"

#

right now its a large system prompt + 2 shot of example queries

#

around 2k tokens (given that the json is formatted)

warm cave
#

Sounds good, do what the rabbit team promised, haha

warm cave
sleek vortex
#

the system prompt is

#

1.17k

#

with the "tools":

web_worker, spotify_worker, weather_worker, airport_search, flight_price, final_output

#

now that doesnt help that this might grow if we added more things

warm cave
#

I have wondered if there is a way to dynamically load is context for relaxant tools, but idk

sleek vortex
#

relaxant?

warm cave
#

lol, “relevant”

sleek vortex
#

oh right

warm cave
#

On my phone

sleek vortex
#

i mean we could have a super small model look at only the query and decide which tools might be required

#

then use a large model after to decide the specifics and dependencies

#

but then this chain is going to involve like 25+ model requests at this rate

#

also small model might neglect quality?

#

ooh i have an idea

warm cave
#

Yeah, you could have a list of tools with a brief description of each, and a small model would select the tools

sleek vortex
#

what if we get the model to write sort of pseudo js code or something

warm cave
#

And those would load into the big models context
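A rough sketch of the two-stage idea being discussed: a cheap selector looks only at the query and a list of one-line tool descriptions, and only the selected tools are loaded into the large model's prompt. The tool names come from the list above, but the descriptions, `keyword_pick`, `select_tools`, and `tools_prompt` are hypothetical. The keyword matcher is only a stand-in for the small model.

```python
# One-line descriptions are invented for illustration.
TOOLS = {
    "web_worker": "search the web for a goal",
    "spotify_worker": "play a song on the user's spotify account",
    "weather_worker": "weather forecast for a location",
    "airport_search": "find the airport code for a location",
    "flight_price": "price of a flight between two airports",
    "final_output": "return the final answer to the user",
}


def keyword_pick(query: str, tools: dict) -> list[str]:
    """Stand-in for the small model: crude keyword overlap on longer words."""
    words = {w for w in query.lower().split() if len(w) > 3}
    return [name for name, desc in tools.items() if words & set(desc.split())]


def select_tools(query: str, pick=keyword_pick) -> list[str]:
    """Ask the selector for tools, drop unknown names, always keep final_output."""
    chosen = [name for name in pick(query, TOOLS) if name in TOOLS]
    if "final_output" not in chosen:
        chosen.append("final_output")
    return chosen


def tools_prompt(names: list[str]) -> str:
    """Render only the selected tools for the large model's context."""
    return "\n".join(f"- {name}: {TOOLS[name]}" for name in names)
```

With this naive selector, "what is the weather forecast in tokyo" picks `weather_worker`, while "plan me a trip to japan somewhere in the month of may 2024" picks nothing beyond `final_output`. That is the failure mode raised in the chat: a weak selector can miss tools the query only implies.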

sleek vortex
#

the small model given something like japan query

#

might not pick up on needing to figure out the weather

#

or something

#

i mean even haiku didnt pick up on flight price like 60% of the time

warm cave
#

Hmmmm

sleek vortex
#

let me try this

#

instead of huge json schemas

#

we could maybe make it write something like this

warm cave
#

You could have broad groups of tools, like many travel tools would be a category

agile jay
#

Why not just write a simple interface for the tools?

sleek vortex
#

this is an example i wrote

#

make a table comparing the whole current 2024 ipad lineup

#

->

#
```json
{
  "steps": [
    {
      "id": 0,
      "worker_type": "web_worker",
      "inputs": {
        "goal": "find out the names of the ipads in the 2024 lineup"
      },
      "dependencies": []
    },
    {
      "id": "1*",
      "worker_type": "map:web_worker",
      "inputs": {
        "per:goal": "find out the price and specs for the device <ref>"
      },
      "dependencies": [0]
    },
    {
      "id": 2,
      "worker_type": "final_output",
      "dependencies": [0, "1*"]
    }
  ]
}
```
#

as part of a shot for the model

#

so maybe instead we could represent it like

#
/**
 * Website explorer - if you have more than one goal, use separate queries
 * @param {string} goal - Specifically what to find out from the web.
 * @returns {Promise<T>} - The findings, shaped as T
 */
type SearchWorker<T> = (goal: string) => Promise<T>;

/**
 * Plays a song on the user's Spotify account
 */
type SpotifyWorker = (song_name: string, artist: string) => void;


type WeatherForecast = {
    temperature: number;
    weather: string;
    date: string;
};

/**
 * Get the seven day forecast for a location
 * @param {string} location - The location to get the forecast for
 */
type SevenDayForecastWorker = (location: string) => Promise<WeatherForecast[]>;


/**
 * Get the current 3 character airport code for a location
 * @param {string} rough_name_or_location - The location to get the airport code for
 * @returns {string} - The airport code (e.g. "LAX" or "LHR")
 */
type AirportSearchWorker = (rough_name_or_location: string) => Promise<string>;


type Flight = {
    origin: string;
    destination: string;
    date: Date;
    price: number;
    airline: string;
};

/**
 * Get the price of a flight
 * @param {string} origin_code - The 3 character airport code of the origin airport
 * @param {string} destination_code - The 3 character airport code of the destination airport
 * @param {Date} dateStart - The start date of the flight
 * @param {Date} dateEnd - The end date of the flight
 * @returns {Flight[]} - An array of possible flights
 */
type FlightPriceWorker = (origin_code: string, destination_code: string, dateStart: Date, dateEnd: Date) => Promise<Flight[]>;

/**
 * This returns the final output to the user in a message
 * You must call this function at the end of your code
 * @param {any[]} inputs - Any inputs from previous functions
 * @param {string} [extra_instructions] - Any extra instructions or information for the worker (optional)
 */
type FinalOutput = (inputs: any[], extra_instructions?: string) => void;
#

thinking

agile jay
#

Wouldn't it make more sense to reorganize the response?

{
    query: "Find out the names of the iPads in the 2024 lineup",
    steps: [
        { /* step 1 */ },
        { /* step 2 */ }
    ]
}

And have them be a tree of nodes?

sleek vortex
#

this is 464 tokens
big json was 780 ish

sleek vortex
#

The model takes in the query and has to return an organised list of steps

agile jay
#

Instead of just a tree.

sleek vortex
#

Yeah but we want steps to be able to run in parallel too

#

otherwise some queries are gonna take like 5 minutes
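To make the parallelism point concrete, here is a small sketch that groups steps into "waves" based on their `dependencies` lists, using the `id` and `dependencies` fields from the JSON plans earlier in the chat. Every step in a wave has all of its dependencies satisfied, so the steps in one wave could run concurrently. The function name `parallel_waves` is an assumption, and nothing here is the final executor.

```python
def parallel_waves(steps: list[dict]) -> list[list]:
    """Group step ids into waves; each wave only depends on earlier waves."""
    remaining = {s["id"]: set(s["dependencies"]) for s in steps}
    done: set = set()
    waves: list[list] = []
    while remaining:
        # A step is ready once every dependency has already completed.
        ready = [sid for sid, deps in remaining.items() if deps <= done]
        if not ready:
            raise ValueError("cycle or missing dependency in plan")
        waves.append(ready)
        done.update(ready)
        for sid in ready:
            del remaining[sid]
    return waves
```

For the haiku plan above (step 0, then steps 1 to 3 depending on 0, then the final output), this yields three waves, so total latency is roughly three sequential calls rather than five.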

agile jay
#

Yep, they do run in parallel.

#

You just loop through steps, and steps can have steps etc.

sleek vortex
#

Yeah but say a query like

#

make a table comparing the whole current 2024 ipad lineup in price and specs

#

how would you approach it (asking)

agile jay
#

Step 1 produces a task and adds it to the message queue. In this case it could be to make a search and you use the sources in your output.
Then when the task is done, the next step is triggered. In this case, it would pass the data into a table maker, to make the table you wanted.

#

But it's a simple producer consumer model, where the tools consume the tasks as they come in, for maximum load efficiency.
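A minimal illustration of the producer/consumer model described here, using `asyncio.Queue` from the Python standard library. The worker body is a placeholder (`asyncio.sleep(0)`) where a real tool call would go, and the names `worker` and `run` are hypothetical.

```python
import asyncio


async def worker(name: str, queue: asyncio.Queue, results: list) -> None:
    """Consume tasks from the queue as they arrive."""
    while True:
        task = await queue.get()
        try:
            await asyncio.sleep(0)  # placeholder for the real tool call
            results.append((name, task))
        finally:
            queue.task_done()


async def run(tasks: list[str], n_workers: int = 3) -> list:
    """Produce tasks onto a queue and let a pool of workers drain it."""
    queue: asyncio.Queue = asyncio.Queue()
    results: list = []
    workers = [
        asyncio.create_task(worker(f"w{i}", queue, results))
        for i in range(n_workers)
    ]
    for task in tasks:
        queue.put_nowait(task)
    await queue.join()  # wait until every queued task is processed
    for w in workers:
        w.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return results
```

A step that finishes can put follow-up tasks onto the same queue, which is how "the next step is triggered" in this model.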

sleek vortex
#

But how would you deal with fetching each ipads own data

#

We would have to first teach the decision model in prompting or shotting or finetuning to break down something like that into multiple queries

#

where it first finds out the ipads then researches each individually

#

you also wouldnt search for all the ipads' specs at once right?

agile jay
sleek vortex
#

I mean ideally youd want it to go to apple.com for each

agile jay
#

If you start splitting your tasks up too much, it's gonna take longer and cost more too.

sleek vortex
#

hmm, true

#
```typescript
let search_worker: SearchWorker<string[]>;
const ipad_models = await search_worker("list of 2024 ipad models");

const specs_worker: SearchWorker<{[key: string]: any}[]> = async (model: string) => {
  const specs = await search_worker(`specs of ${model} ipad`);
  const price = await search_worker(`price of ${model} ipad`);
  return [{model, ...specs}, {model, price}];
}

const ipad_specs = await Promise.all(ipad_models.map(specs_worker));
const table = ipad_specs.reduce((acc, curr) => acc.concat(curr), []);

final_output([table], "The table contains the model name as keys, with specs and price as values");
```
#

sonnet turned that query into this

#

which works but yeah i see your point, this would use like 10 google queries...

#

which might be a bit stupid when perplexity can already do it from existing sources
https://www.perplexity.ai/search/make-a-table-sMQdUCkEThWdpBftzut4Cw#0

Perplexity AI

Here is a comprehensive table comparing the 2024 iPad lineup in terms of price and specifications:

Model Display Size Display Type Resolution Processor Storage Options RAM Front Camera Rear Camera Connectivity Battery Life Price (Wi-Fi) Price (Wi-Fi + Cellular)

...

#

though i had this query

"price comparison - claude 3 models, gpt 4 turbo, gpt4o, gemini 1.5 pro and flash - use unit million tokens and give it in a table"

#

idk pplx like 90% gets it

#

hmm

warm cave
#

Here is the table I put together

#

It's per 1M

agile jay
agile jay
sleek vortex
#

that we have to fallback to searching individually

agile jay
#

Make the model ask if it needs to...

#

After doing the first search.

sleek vortex
#

bruh what

#

now im feeling 25% demotivated lmao

agile jay
#

Is this enough data to solve the task? If not, create the queries needed to get the information you need.

sleek vortex
#

need a good example of a complex query that pplx cant do

agile jay
#

You probably also want to add a context timeout, to maximise the time it has to solve any task. Otherwise it could carry on for ever...
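The "search once, ask whether that is enough, then search more, but with a timeout" loop could look roughly like the sketch below. The callables `search`, `is_enough`, and `follow_ups` are hypothetical stand-ins for the web worker and two model calls, and `budget_s` is an assumed name for the time limit.

```python
import time


def research(goal, search, is_enough, follow_ups, budget_s=20.0, clock=time.monotonic):
    """Search once, then keep filling gaps until satisfied or out of time."""
    deadline = clock() + budget_s
    notes = [search(goal)]  # always do the first, broad search
    while not is_enough(goal, notes) and clock() < deadline:
        queries = follow_ups(goal, notes)  # model proposes what is still missing
        if not queries:
            break  # nothing more to ask, stop rather than loop forever
        for query in queries:
            if clock() >= deadline:
                break
            notes.append(search(query))
    return notes
```

The deadline bounds the worst case: even if the sufficiency check never says yes, the loop stops issuing searches once the budget is spent.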

sleek vortex
#

then the worker on task "search thing a" can also check if there's anything missing, and go search that independently if needed

#

because yeah i think my approach would go through the roof in model costs, search costs and query time. i might get to the same end goal with 100% of the information, but it would be too slow

agile jay
#

Yep, make it simple and reduce the number of steps, if possible.

#

Also save the output and embed it, so you can use it as a source for people who ask the same question...

sleek vortex
#

you mentioned subtasks right

agile jay
#

Yep, basically a node tree of tasks.

sleek vortex
#

im thinking

maybe a task, like say "find out price of iphone 15 lineup" could be the task goal

#

and then thats a container for maybe a websearch (and any further if missing info)

#

so then later on, we could use embedding similarity to compare the goals
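A sketch of reusing stored results by comparing goals, under stated assumptions: bag-of-words cosine similarity stands in for real embedding similarity, and `GoalCache` with its `min_similarity` cutoff is a hypothetical design, not something already built.

```python
import math
from collections import Counter


def _vec(text: str) -> Counter:
    """Toy stand-in for an embedding: lowercase term counts."""
    return Counter(text.lower().split())


def _cos(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class GoalCache:
    """Returns a stored result when a new goal is close enough to an old one."""

    def __init__(self, min_similarity: float = 0.8):
        self.min_similarity = min_similarity
        self.entries: list[tuple[Counter, str]] = []

    def put(self, goal: str, result: str) -> None:
        self.entries.append((_vec(goal), result))

    def get(self, goal: str):
        q = _vec(goal)
        best = max(self.entries, key=lambda e: _cos(q, e[0]), default=None)
        if best is not None and _cos(q, best[0]) >= self.min_similarity:
            return best[1]
        return None  # cache miss: run the search for real
```

The threshold is the main tuning knob: too low and users get answers to a different question, too high and the cache rarely hits.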

agile jay
#

Yep, more abstract goals.

sleek vortex
agile jay
#

Yep, or give them that result, and they can click a button to do a fresh search if needed.

sleek vortex
#

so how to prompt the initial model now is the question

#

asking for a list of goals, that can have tasks as children?

agile jay
#

Yep, but you probably wanna make a few manual examples.

sleek vortex
#

yeah

#

shot prompting

#

and i might look into finetuning an 8b down the line

agile jay
#

Yep, improves it a lot.

sleek vortex
#

right now just doing haiku/sonnet

#

$4.86 left on my account lmao

#

not bad

agile jay
#

My infrastructure works slightly differently, but that's because I'm treating the streamed model messages like websockets.

sleek vortex
#

i mean right now i dont have any infrastructure

#

i have a tech demo for a mini perplexity in python

#

but im going to completely rewrite it to use a socketio based backend on modal cloud, which connects to embeddings gpu, and models and then frontend in something

agile jay
#

You could also try moving some of the load to the client.

#

Such as the web searching/requests.

sleek vortex
#

that wouldnt work at scale

#

+huge cors issue

#

my mindset building this is making it like a perplexity-type product/site, not a product just for me

#

obviously it might never succeed but

#

fun to try

agile jay
#

So server side everything?

sleek vortex
#

Yeah basically

agile jay
#

You probably wanna calculate your cost per query then.

#

To see what parts are best self hosted, or using an API.

candid violet
#

Hello Perplexity (World)

agile jay
#

Have you seen the Muffin Man?

sleek vortex
#

My little CLI demo per query is around

$0.005 for running the cloud gpu for embeddings
+
$0.00075 input haiku + $0.000625 output haiku

halcyon sequoia
sleek vortex
#

$0.006375 per query
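The arithmetic behind that figure can be written out as a small helper. The token counts (3,000 input, 500 output) are not stated in the chat. They are back-calculated here from the quoted dollar amounts, assuming Claude 3 Haiku list prices of $0.25 and $1.25 per million tokens. The GPT-4o line reuses the $5 / $15 per million figures quoted earlier with the same assumed token counts.

```python
def model_cost(input_tokens: int, output_tokens: int,
               in_per_mtok: float, out_per_mtok: float) -> float:
    """Dollar cost of one model call given per-million-token prices."""
    return input_tokens / 1e6 * in_per_mtok + output_tokens / 1e6 * out_per_mtok


GPU_EMBEDDING_COST = 0.005  # per query, from the chat

# Assumed token counts, inferred from $0.00075 input and $0.000625 output.
haiku = model_cost(3_000, 500, in_per_mtok=0.25, out_per_mtok=1.25)
total_with_haiku = GPU_EMBEDDING_COST + haiku

# Same assumed token counts at the quoted GPT-4o prices, for comparison.
gpt4o = model_cost(3_000, 500, in_per_mtok=5.0, out_per_mtok=15.0)

print(f"haiku call: ${haiku:.6f}, total per query: ${total_with_haiku:.6f}")
print(f"gpt-4o call at the same token counts: ${gpt4o:.6f}")
```

At these numbers the embedding GPU time, not the Haiku call, dominates the per-query cost.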

sleek vortex
agile jay
sleek vortex
sleek vortex
agile jay
#

Oh, you are using the gemma model?

sleek vortex
#

that was for the cli thing

#

gemma turned the query into 3 searches

candid violet
#

Trying to find a little help with using Perplexity for financial portfolio management. I want to input my own data rather than search the web. I was hoping I could find like-minded Pro users.

sleek vortex
#

for my new thing im probably going to use haiku as the first model too

#

well lets just put my estimates at what 3x higher cost

halcyon sequoia
# sleek vortex

I tried the writing mode - the answer is now worse. Before, at least, it suggested the last but one release. Now it doesn't know at all.

sleek vortex
#

$0.019125 per query?

agile jay
halcyon sequoia
#

But it didn't know when I used search either

sleek vortex
#

well tldr, you got a query where the model failed to find any reliable sources

halcyon sequoia
#

but someone else asked same question - and they got the right answer?

sleek vortex
#

ok um idk

agile jay
sleek vortex
#

wait let me try with a model that itself isnt the latest version of chatgpt

halcyon sequoia
#

I am debating whether to move to Perplexity, so I'm just trying to work out if it is worth it. So far, it seems to struggle with the very first question I asked. My question is: is the paid version better?

sleek vortex
# agile jay

They definitely have their own index
I cant seem to index openai.com since its a react app or something

agile jay
sleek vortex
finite topaz
#

Is it planned in the roadmap to have a more dedicated approach to files?? Like uploading some files (not only PDFs) and have several threads be on that + Threads concerning those files???

agile jay
sleek vortex
#

wait what this isnt even react

#

why did it fail to scrape

finite topaz
agile jay
sleek vortex
#

(Or at least ive experienced)

finite topaz
sleek vortex
#

i dont think so, no

#

not sure

agile jay
#

You would have to ask the pp team

finite topaz
#

which channel?

agile jay
finite topaz
#

ty!

sleek vortex
#

should i remove the details on the specific output formats

#

since i think we could just run a small model in the middle anyway to handle it

#

i think it will always output a string of some sort so like?

#

idk

agile jay
#

Yep, it uses a lot of tokens

#

Better to write a schema at the start, to use less tokens.

sleek vortex
#

hate discord ux sometimes man

#

anyway

agile jay
#

Is that groq UI?

sleek vortex
#

yeah

#

trying on 8b/70b now

#

User Query: "plan me a trip to japan somewhere in the month of may 2024 from london"
llama3-70b

```json
{
    "query": "plan me a trip to japan somewhere in the month of may 2024 from london",
    "goals": {
        "Find flights from London to Japan in May 2024": [
            {
                "worker": "flight_price",
                "inputs": {
                    "takeoff_location": "LON",
                    "landing_location": "NRT or HND", // Tokyo airports
                    "day": "May 2024"
                }
            }
        ],
        "Find airport codes for Japan": [
            {
                "worker": "airport_search",
                "inputs": {
                    "location": "Japan"
                }
            }
        ],
        "Find weather forecast for Japan in May 2024": [
            {
                "worker": "weather_worker",
                "inputs": {
                    "location": "Japan"
                }
            }
        ],
        "Plan itinerary for Japan trip": [
            {
                "worker": "web_worker",
                "inputs": {
                    "queries": [ "Japan travel guide", "things to do in Japan in May" ]
                }
            }
        ]
    }
}
```
#

i think this might be possible but we would need like 50 queries of data to finetune it with

#

we're trying to get it to connect the dots when it has no clue how

#

whats the ideal response

#

maybe like this

#
```json
{
    "query": "plan me a trip to japan somewhere in the month of may 2024 from london",
    "goals": {
        "Plan itinerary for Japan trip": [
            {
                "worker": "web_worker",
                "inputs": {
                    "queries": [
                        "japan travel may 2024",
                        "best places to visit in japan may",
                        "japan travel packages may 2024",
                        "japan festivals events may"
                    ]
                }
            }
        ],
        "Find flights from London to Japan in May 2024": [
            {
                "worker": "airport_search",
                "inputs": {
                    "location": "Japan"
                }
            },
            {
                "worker": "flight_price",
                "inputs": {
                    "takeoff_location": "LON",
                    "landing_location": "<found airports>",
                    "day": "May 2024"
                }
            }
        ],
        "Find weather forecast for Japan in May 2024": [
            {
                "worker": "weather_worker",
                "inputs": {
                    "location": "Japan"
                }
            }
        ]
    }
}
```
#

ok here's an instant problem...

#
```json
{
    "query": "find the weights of the iphone 15 pro, and the ip15 pro leather case, and then add them up",
    "goals": {
        "Find the weight of iPhone 15 Pro": [
            {
                "worker": "web_worker",
                "inputs": {
                    "queries": [ "iPhone 15 Pro weight", "iPhone 15 Pro specs" ]
                }
            }
        ],
        "Find the weight of iPhone 15 Pro Leather Case": [
            {
                "worker": "web_worker",
                "inputs": {
                    "queries": [ "iPhone 15 Pro Leather Case weight", "iPhone 15 Pro case specs" ]
                }
            }
        ],
        "Add the weights together": [
            {
                "worker": "calculator_worker",
                "inputs": {
                    "num1": "<weight of iPhone 15 Pro>",
                    "num2": "<weight of iPhone 15 Pro Leather Case>",
                    "operation": "add"
                }
            }
        ]
    }
}
```
#

all cool but calculator_worker doesnt exist (yet?)
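A cheap guard against invented worker names is to validate the plan against the known tool list before running anything. The sketch below uses the tool names listed earlier in the chat and the `goals` layout from the plan above. The function name `unknown_workers` is hypothetical.

```python
KNOWN_WORKERS = {
    "web_worker",
    "spotify_worker",
    "weather_worker",
    "airport_search",
    "flight_price",
    "final_output",
}


def unknown_workers(plan: dict) -> list[str]:
    """Return worker names in the plan that are not registered tools."""
    names = [
        step["worker"]
        for steps in plan["goals"].values()
        for step in steps
    ]
    return sorted({name for name in names if name not in KNOWN_WORKERS})
```

A non-empty result could trigger a retry with the error fed back to the model, or a fallback that hands the step to the final model instead of a dedicated worker.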

sleek vortex
agile jay
#

Yep, it just waits for the other ones to return their outputs.

sleek vortex
#

Okay but how do we handle hallucinations

agile jay
#

Hallucinations will always happen, so your best bet is to put things in context, to reduce the entropy to valid answers.

sleek vortex
#

this is honestly not bad

#

for a 7b

#

output from haiku

#
```json
{
    "query": "is the ipad pro m4 worth it? whats new and how much is it? should i upgrade from my 2018 ipad",
    "goals": {
        "Summarize the new iPad Pro M4": [
            {
                "worker": "web_worker",
                "inputs": {
                    "queries": [
                        "ipad pro m4 new features",
                        "ipad pro m4 specs",
                        "ipad pro m4 price"
                    ]
                }
            }
        ],
        "Evaluate whether to upgrade from 2018 iPad": [
            {
                "worker": "web_worker",
                "inputs": {
                    "queries": [
                        "should i upgrade from 2018 ipad to ipad pro m4",
                        "is ipad pro m4 worth it over 2018 ipad"
                    ]
                }
            }
        ]
    }
}
```
#

very very good i think

#
```json
{
    "query": "how many rocks we should eat for a healthy diet",
    "goals": {
        "Determine if eating rocks is healthy": [
            {
                "worker": "web_worker",
                "inputs": {
                    "queries": [
                        "is eating rocks healthy",
                        "can humans eat rocks",
                        "nutritional value of rocks"
                    ]
                }
            }
        ],
        "Provide a recommendation on the number of rocks to eat for a healthy diet": [
            {
                "worker": "web_worker",
                "inputs": {
                    "queries": [
                        "how many rocks should you eat per day",
                        "recommended amount of rocks to eat for health"
                    ]
                }
            }
        ]
    }
}
```
agile jay
#

Yep, now just need to parse it reliably, and then run them.
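For the "parse it reliably" part, one hedged approach is to scan the model's reply for the first position where a JSON object decodes cleanly, which tolerates chatty text before or after the plan. This uses `json.JSONDecoder.raw_decode` from the standard library. The wrapper name `extract_json` is an assumption.

```python
import json


def extract_json(text: str) -> dict:
    """Return the first decodable JSON object found anywhere in the text."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            obj, _end = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next one.
            start = text.find("{", start + 1)
    raise ValueError("no JSON object found in model output")
```

This does not handle output that is almost JSON, such as the `// Tokyo airports` comment in the llama3-70b plan above. That case would need comment stripping or a retry.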

sleek vortex
#

even claude is hallucinating sometimes though

#

mainly when theres some form of processing involved

#

We could make it so the "output" to the user is like a step too

#

so that the model is discouraged from hallucinating its own worker names

agile jay
#

Yep, smaller models are more likely to hallucinate.

sleek vortex
#

this is haiku

agile jay
#

Haiku is small

sleek vortex
#

both haiku and 7b,8b,70b show the same things

sleek vortex
#

right*

#

Alan's estimate for Claude 3 Opus: 2T parameters trained on 40T tokens. 3 models sizes: Haiku (~20B), Sonnet (~70B), and Opus (~2T). 18 Mar 2024

#

says google

#

ok wow haiku is a 20b?

agile jay
#

No, that's an estimate.

sleek vortex
#

well yeah

#

but even if thats an estimate

#

haiku is such a good model for it being so small lmao

agile jay
#

Not really, it's a pretty dumb model.

#

I've asked it to do stuff with large files, and it fails quite a lot on basic tasks.

#

I think a llama 70B finetune is probably the best bet with speed to performance ratio.

sleek vortex
#

Not bad

#

I expect the response to maybe be better than perplexity

#

since i doubt perplexity would be providing "detailed" weather

#

should i start implementing the actual running of this?

agile jay
#

Yep, but you need to pass in location info if you want a decent answer.

#

Yep, so you can see how stable the generations are.

sleek vortex
# sleek vortex

given this response, the code wouldnt really know what to do with <found destinations> really

#

but we'll see

#

maybe can implement a little model call in between

agile jay
#

It's why you need to define a schema.

#

So you know how it should be implemented.

sleek vortex
#

Do you have any rough ideas for how i could lay it out?

#

I tried to before but it didnt go that well

#

i can send you this whole conversation (in exported code) if it helps

agile jay
#

First come up with multiple examples of different kinds of questions. This is for two things, one for multi shot learning, and two, so you can create a schema which works well for all of those cases.

#

In your case, you are just doing a list of goals. Really the goals should be more hierarchical, so that they solve the dependency problem automatically. I can show more examples after you've come up with the different styled questions/tasks.

agile jay
sleek vortex
#

Maybe for these queries (both complex and simple):

* find the weight of an iphone 15 and iphone 15 pro, and to each add the weight of their silicone cases
* is the ipad pro m4 worth it? whats new and how much is it? should i upgrade
* price of btc and eth in gbp
#

just outline it i guess, obviously not expecting a full json or whatever

torn sierra
#

Something new is about to come

agile jay
#

First, lay out the different kinds of tasks.

  • Retrieval: web search, document search, user info
  • Mutation: turn data into table, html into markdown, add two numbers, etc
  • Task groups, group up similar tasks and wrap them in a parent.

Find the weight of an iPhone 15 and 15 pro, and add them together.

{
    intent: "Get the weight of the iPhone 15 and 15 pro.",
    steps: [
         { tool: "search", query: "weight of iPhone 15"},
         { tool: "search", query: "weight of iPhone 15 pro"},
    ]
},
{
    intent: "Add their weights together",
    steps: [
         { tool: "calculator", query: "--add {0} {1}"}
    ]
},
#

And all the steps just happen at the same time. If you need to wait for something, put it as the next task.
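The scheme outlined above (intents run one after another, steps inside an intent run at the same time, and `{0}`, `{1}` refer to the previous intent's outputs) could be sketched like this. It is an interpretation of the outline, not the author's implementation: `run_plan` and the `tools` mapping are hypothetical, and placeholders are filled with `str.format`, so a query containing literal braces would need escaping.

```python
from concurrent.futures import ThreadPoolExecutor


def run_plan(groups: list[dict], tools: dict) -> list:
    """Run intent groups in order; steps inside a group run in parallel."""
    previous: list = []  # outputs of the previous group, in step order
    for group in groups:
        # Fill {0}, {1}, ... with the previous group's outputs.
        calls = [
            (tools[step["tool"]], step["query"].format(*previous))
            for step in group["steps"]
        ]
        with ThreadPoolExecutor() as pool:
            futures = [pool.submit(fn, query) for fn, query in calls]
            previous = [f.result() for f in futures]
    return previous
```

Putting a step in a later group is what makes it wait: the calculator step only receives its inputs once both searches in the first group have returned.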

sleek vortex
#

lmao

#

i could just turn up the amount of sources and get a similar thing surely

agile jay
sleek vortex
#

i mean the twitter thing

sleek vortex
#

but surely we want "turn data into a table" to just be handled by the model at the end (sonnet, opus, 4o)

sleek vortex
#

"based around 500 articles" idk about that

agile jay
#

Yep, that context would be pretty large.

#

And take ages to respond.

sleek vortex
#

https://im.fo goes to google

#

weird

#

does he work at google

#

???

sleek vortex
agile jay
sleek vortex
#

i mean im doing that

#

i mean in the previous stage

#

where does he even find

#

500 articles

#

i guess i could use a chromium browser to first hit all links possible

#

then i could go and hit every single sublink

torn sierra
agile jay
sleek vortex
#

ok ive turned up the

#

source limit

#

to like

#

75

agile jay
#

Wonder how long it takes to crawl a site.

sleek vortex
#

lets see what happens

#

half of them come back as "can't read" since this is http scraping only, no browser

#

```
Some key points I can gather from the context:

- There have been advancements in areas like natural language processing, machine learning for conservation efforts, AI in healthcare, and AI in cybersecurity.

- There are concerns around the risks and ethical considerations of AI, leading to discussions around AI regulation and governance.

- AI is rapidly evolving and transforming many industries, with new applications and use cases emerging regularly.

- Keeping up with the latest AI news and developments is important for professionals and the general public to understand the implications of this transformative technology.

However, without more specific details on the most current AI news and stories, I don't want to make any definitive claims. The context provided covers a broad landscape, but doesn't allow me to summarize the latest AI news in a comprehensive way. I'd suggest checking reputable technology news sources for the most up-to-date information on the latest AI developments and news.
```

well thats useless
#

a minute ago it gave me a way better response wth

agile jay
#

Yep, you need to work around cloudflare and other stuff.

sleek vortex
#

```
The key AI news and topics from the context include:

- Efforts to regulate and set safety standards for advanced AI systems, including a summit in Seoul and companies signing up to AI safety standards.

- Concerns raised by high-profile figures like Elon Musk and Scarlett Johansson about the risks and challenges posed by the rapid development of AI.

- The use of AI in elections, including allegations of AI-generated impersonations being used to spread misinformation.

- The continued growth and investment in major AI companies and startups like OpenAI, xAI, and Microsoft's AI-powered PCs.

- The impact of AI on sectors like cybersecurity, loneliness, and productivity.

- Debates around the ethics and responsible development of AI, including concerns about AI-generated content and the potential existential risks of advanced AI systems.

Overall, the news highlights the accelerating pace of AI innovation and the growing recognition of both the benefits and risks associated with this rapidly evolving technology. The context suggests there is an active dialogue and efforts underway to try to shape the responsible development and deployment of AI systems.
```
#

this is like 25% better

sleek vortex
#

thats the whole franchise

agile jay
#

How many items can the AI focus at once?

sleek vortex
#

ok one way i can think of it is that he indexes the stories himself

agile jay
#

After a certain point, more sources don't make it better.

sleek vortex
#

then he has llama or a model look at it and extract the main points from every source

#

then when queries are made he uses these summarised snippets of the world's current stories

#

but that would only work for news

cinder comet
sleek vortex
#

he would have to be scraping all news sites beforehand

#

like polling rss feeds and google news or something, and having a cloud function constantly ingesting and vectorising/summarising/whatever
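
A minimal sketch of the polling-and-ingesting idea, assuming plain RSS 2.0 feeds. The function names, the in-memory `store`, and the sample feed are hypothetical; scheduling the poll and the vectorising/summarising step are left out.

```python
import xml.etree.ElementTree as ET


def parse_rss_items(xml_text):
    """Return (title, link) pairs from an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        if link:
            items.append((title, link))
    return items


def ingest_new(xml_text, seen_links, store):
    """Append items not seen before to `store`; return how many were new."""
    added = 0
    for title, link in parse_rss_items(xml_text):
        if link not in seen_links:
            seen_links.add(link)
            store.append({"title": title, "link": link})
            added += 1
    return added


# Hypothetical feed used only for illustration.
SAMPLE_FEED = """<rss version="2.0"><channel>
<item><title>AI summit in Seoul</title><link>https://example.com/a</link></item>
<item><title>New model released</title><link>https://example.com/b</link></item>
</channel></rss>"""
```

Calling `ingest_new` repeatedly with the same feed only adds each link once, which is roughly what a background function polling on a timer would need.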

cinder comet
#

"each response is based on around ~500 articles and the more you ask the more articles are incorporated"

so basically its stored and then gets called if it fits the user query

sleek vortex
#

yeah...

cinder comet
#

if not it searches new articles and adds it to the polling

sleek vortex
#

why does it feel like theres a new competitor to whatever thing im building everyday 😭

#

such a funny industry

agile jay
#

At the end of the day the sources don't matter. What matters is that the user gets what they wanted...

sleek vortex
#

yeah, i guess

cinder comet
#

i mean the moment perplexity was out many people tried to get something similar

#

i dont think its that hard to do

sleek vortex
#

whats gpt4 going to do after being able to click things

agile jay
cinder comet
#

idk maybe do something like devin but locally

sleek vortex
#

its good but its kinda useless when you think about the big picture

cinder comet
#

goes back to chatgpt then to your ide, and so on until it reaches the wanted result

sleek vortex
#

yeah but thats gonna suffer the same fate as devin

#
  • doesnt do much, or takes months to do it
#
  • expenses out the roof
cinder comet
#

i will put a percentage of success

#

it doesnt have to be 100%

sleek vortex
#

At that point just give the thing access to your filesystem

cinder comet
#

a 50% and we call it a win

sleek vortex
#

and do it via that instead

agile jay
#

It can watch corn for you, so you can save time.

tender robin
#

Some fast advice needed. I am giving a talk tomorrow (just 45 minutes) on AI to warm folks up before the annual dinner of my non-tech professional association. Although I can obviously do without slides (this is a talk, not a webinar), I have decided to investigate the idea at the last moment of having a few graphic slides to explain some of the concepts I want to explain to these lay people (I am a lay person as well) and for a little relief for them from looking at me. Just a key word or two and an image on each slide – probably shouldn’t be a true picture – would do the trick. What is my fastest, most accessible and best choice given the lateness of the hour, as it were? I will be thankful for any responses.

tender robin
#

I should have made it plain I am looking for something I can make with Perplexity

sleek vortex
#

I mean you could ask perplexity, maybe generate some ai images

#

related visuals that could please their eyes

tender robin
agile jay
sleek vortex
#

Maybe first take the list of concepts youre talking about, and ask it to generate ideas for images for each

#

then on the top right as code says, you can use the image thing with a custom prompt, just putting in the idea for the image

tender robin
#

Thanks!

sleek vortex
#

but now i think its time to go back to the drawing board with these website embeddings for a second

south kindle
# tender robin Some fast advice needed. I am giving a talk tomorrow (just 45 minutes) on AI to...
Perplexity AI

Large Language Models (LLMs) such as GPT-4 and BERT represent the cutting edge of natural language processing, utilizing deep learning architectures based on transformers to predict and generate human-like text. Internally, these models operate through a complex interplay of neural network layers, where each layer processes input data sequential...

south kindle
sleek vortex
#

@agile jay @warm cave fixed the scraper fail rate from like 50% to like 10% lol

summer crescent
#

Are we still limited to like 20 tokens daily for Claude?

agile jay
summer crescent
#

What happens if you're in a thread that is running Claude? Will it simply error out or convert to a different llm?

#

Once you run out?

grand silo
warm cave
#

It switches to Sonnet

warm cave
grand silo
#

ah, that's good to know

warm cave
#

and btw haiku is "Default"

grand silo
#

yeah, I don't like haiku, but 600 sonnet feels like more than claude direct

#

so that's fine

tame current
jovial mantle
tame current
jovial mantle
#

I know, but it looks they are from before 2 May? Probably patched

tame current
jovial mantle
#

You had Mahoraga adapt to it

tame current
jovial mantle
#

😎

fading moth
#

My kingdom for like... an integration in Discord for perplexity that summarizes the last messages since I was idle in this channel in a private message.

#

I'd say like... /catchmeup or something and perplex would send it privately

cinder comet
fading moth
#

I know, but wouldn't it be neat!

warm cave
sage chasm
#

Has anyone noticed web searches are happening less?

#

I've been struggling to get it to show sources and stuff, it's mostly just using the offline chatgpt

#

Which is not useful at all

warm cave
warm cave
fading moth
#

MODS

#

@vapid onyx

distant aurora
sage chasm
#

It was fine before though

#

I don't like "pro" because it forces me to answer questions it should have context on

gentle dirge
#

Hi

cloud sigil
#

yo what are the limits for perplexity ai in normal mode

#

without signing in

halcyon coral
cloud sigil
#

thx. how accurate you think perplexity is when it comes to mcq?

grand silo
#

Not in the same realm as OpenAI/Anthropic, or they’d be screaming about it

cloud sigil
grand silo
quaint star
#

Have to ask here as well since the Perplexity status webpage Discord link has expired: I'm getting 500 Internal Server Error for all of my Perplexity API utilizing applications right now. API status page shows nothing off? What's going on? (The implementations worked without issues previously.)

warped marsh
#

@halcyon coral wassup with the opus 50 limit?

#

you know y'all can just say that it is going to be permanent, right?

#

we get that its expensive

grand silo
#

imo there’s nothing worse than unstated limits (or falling back to lesser models without notice)

warped marsh
#

does you.com provide claude 3 opus unlimited?

#

okay they do

grand silo
#

I… doubt that.

#

The wording also leaves a lot to the imagination.

cloud sigil
#

can anyone share some ai's that are becoming rlly popular and are very accurate at doing their jobs?

warped marsh
cloud sigil
#

what are some good ai's that are becoming more popular and accurate?

sleek vortex
#

ive been working on my own agent-like research thing in this server

cinder comet
#

how is it going so far

#

https://openai.com/index/openai-board-forms-safety-and-security-committee/

OpenAI has recently begun training its next frontier model and we anticipate the resulting systems to bring us to the next level of capabilities on our path to AGI. While we are proud to build and release models that are industry-leading on both capabilities and safety, we welcome a robust debate at this important moment.

A first task of the Safety and Security Committee will be to evaluate and further develop OpenAI’s processes and safeguards over the next 90 days. At the conclusion of the 90 days, the Safety and Security Committee will share their recommendations with the full Board. Following the full Board’s review, OpenAI will publicly share an update on adopted recommendations in a manner that is consistent with safety and security.

#

gpt5 is near

sleek vortex
#

bruh

#

going to lose my job as a programmer by the time gpt6 drops

sleek vortex
#

do some actual work irl that isnt this project

#

then if i have time start work on a new backend with a ui

#

what idk how i'm going to handle is storing embeddings

#

ok i mean realistically do i actually need embeddings?

#

if i went onto the smaller goal architecture

#

could i not just limit each mini model to like 4 sources

#

and summarise?

#

idk

austere kestrel
tame current
#

How long does perplexity store uploaded files for

cloud sigil
#

that's free

sleek vortex
#

is decent

#

chatgpts own website is ok-ish

#

pplx labs for not online models

sleek vortex
tame current
#

What if retention is turned off

#

And where does it say this if you don't mind me asking

sleek vortex
#

Not 100% sure - test it by checking the AWS url when you copy citations

cloud sigil
cloud sigil
#

but i'll try the others

#

o

#

ty

sleek vortex
#

well its limited

cloud sigil
sleek vortex
#

to idk per hour

cloud sigil
#

for like daily purposes/school

sleek vortex
#

you could go and install gpt4free on your machine

#

with open webui or something

#

i had that setup till i got pplx pro

cloud sigil
#

i don't LOL

#

i can try

sleek vortex
cloud sigil
sleek vortex
sleek vortex
sleek vortex
#

not sure tho

#

or 10 per 24

#

no clue

cloud sigil
#

damn

#

i ain't paying a lot

#

20 per month or smth

#

iirc

#

i'll see ty

sleek vortex
#

yeah its quite expensive

#

best value one is probably pplx

#

idk

cloud sigil
#

thx

#

been hearing about you.com for a while

sleek vortex
#

its like ok ish

cloud sigil
#

what about the gemini thing?

sleek vortex
#

oh yeah

cloud sigil
#

gemini ai

#

idk

sleek vortex
#

you can use gemini aistudio for 1.5 pro free

cloud sigil
sleek vortex
#

but its not the "best" according to some

cloud sigil
#

better than nothing ig

sleek vortex
#

if youre doing sm like

#

finding quotes in a text

cloud sigil
#

i ain't doing that

sleek vortex
#

to upload the whole text

cloud sigil
#

im more on statistics

#

econ

sleek vortex
#

oh fairs

#

yeah idk just try a mix

cloud sigil
#

ya i'll try em all

sleek vortex
#

and avoid image uploads when you need accuracy id say

#

not all of them are the best at that

cloud sigil
#

ic

#

i thought ppx could upload images

#

sadly not for free

sleek vortex
#

again, hourly limited though yeah

#

could make multiple accounts

cloud sigil
#

o that's true

#

free trials on chatgpt

#

4o

sleek vortex
#

this is looking promising already

#

well we've got more grounding than google....

#

wow...

#

that would be around
gpt4o $0.0075
groq llama3-70b $0.000689
sonnet $0.0057
opus $0.0285

(5/1000000)*900 + (15/1000000)*200
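
The formula above (900 input tokens, 200 output tokens) can be sketched as a small helper. Only the gpt-4o price pair appears in the message; the other per-million prices in `PRICES` are assumptions, chosen because they reproduce the quoted totals.

```python
# USD per million tokens as (input, output).
# Only the gpt4o pair is stated in the chat; the rest are assumptions that
# happen to reproduce the figures quoted above.
PRICES = {
    "gpt4o": (5.00, 15.00),
    "groq llama3-70b": (0.59, 0.79),
    "sonnet": (3.00, 15.00),
    "opus": (15.00, 75.00),
}


def query_cost(model, input_tokens=900, output_tokens=200):
    """Cost in USD of one call with the given token counts."""
    price_in, price_out = PRICES[model]
    return (price_in * input_tokens + price_out * output_tokens) / 1_000_000


if __name__ == "__main__":
    for model in PRICES:
        print(f"{model}: ${query_cost(model):.6f}")
```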

#

now i really dont think we need opus

orchid dust
#

Hello! I'm using the Perplexity AI Companion for Chrome. I can't seem to use it on webpages, for example to summarize or ask a question related to the page's content. I get an error saying it cannot access the page content. Any tips on how to fix this? Thanks

tame current
#

I used it. Thanks.

halcyon coral
stray fox
#

How will perplexity be protected from the influence of disinformation that news corp or Rupert Murdoch will create?

warped marsh
#

I confirmed in you.com's discord channel

#

if you have their pro subscription then you get unlimited opus

#

I may just get that

sleek vortex
#

how are they not broke lmao

#

so confused

#

they must have like unlimited vc money or something what

warped marsh
#

idk

sleek vortex
#

even with a 50% discount it isnt possible to make that profitable

warped marsh
#

I just know they do provide it for free

#

unlimited

#

not for free but yeah

sleek vortex
#

and opus doesnt have PTU anywhere

#

idk

warped marsh
#

also

#

they have student discount

sleek vortex
#

yeah

#

how is that profitable

warped marsh
#

even idk

#

they are losing money then?

sleek vortex
#

idk

warped marsh
#

to keep more people with them... i think?

warped marsh
sleek vortex
#

yeah

warped marsh
#

plus that article is old?

sleek vortex
#

yeah idk

#

but at some point theyre gonna have an opus limit too

warped marsh
#

that's true

sleek vortex
#

or until opus gets cheaper or releases on aws

#

because most ai companies (like perplexity) id assume dont pay for api

#

they pay for provisioned throughput units on aws/azure

warped marsh
#

yeah fair

sleek vortex
#
Amazon Web Services

We are living in the generative artificial intelligence (AI) era; a time of rapid innovation. When Anthropic announced its Claude 3 foundation models (FMs) on March 4, we made Claude 3 Sonnet, a model balanced between skills and speed, available on Amazon Bedrock the same day. On March 13, we launched the Claude 3 Haiku […]

#

idk??

tame current
# sleek vortex but at some point theyre gonna have an opus limit too

The way it works is that they let u use it but then there are cool down periods that can last usually 10-15 mins. That's what happened to me when I used it. Also they definitely limit the output to make it shorter so it's not that detailed. Imo it's better than perplexitys limits for opus but not the best

livid mantle
sweet jasper
#

How is gpt4 30 points above

#

Sounds like bull

orchid dust
livid mantle
tacit nimbus
#

gm gm

halcyon coral
sleek vortex
#

@agile jay @warm cave table made by intent/agent format

#

seems to have got a few things wrong

#

there's no embeddings here, back to source feeding

sleek vortex
#

no i put it into the excel

#

so its readable

#

it was markdown in my terminal

#

🎯 Get the prices and specs of the iPhone 15 lineup

and then it had 2 searches

agile jay
#

Why not just use a markdown editor then?

sleek vortex
#

so the issue was it didnt really search the prices individually

agile jay
#

Like obsidian

sleek vortex
#

Eh dont have one installed

#

excel was already open

#

¯\_(ツ)_/¯

agile jay
sleek vortex
#

it searched for iphone 15 and not 15 pro, so it sort of guessed the prices for the 15 pros?

agile jay
#

What steps did it generate?

sleek vortex
#

also it guessed the cpu for the pro models

#

maybe because the main source from apple.com got cut off...

#

let me see if i can first swap to searxng

agile jay
#

Possibly, you could also make it use other sources, since they can likely be found on stuff like GSMArena.

sleek vortex
#

I mean yeah but i cant sit and finetune each query

#

Who knows what an actual user would ask, right?

#

Currently i dont have any form of pageranking

#

my guess is theres a chance its being shuffled somewhere due to the random search api im using

sleek vortex
agile jay
#

Yep, and see how the sources change

#

Also the apple site is pretty sh*t...

#

The html is such a mess.

elfin geyser
#

where can i find promo codes

rancid dome
#

How do I see the history of all my threads? The place called "Library" does not show all my threads. Thanks!

sleek vortex
#

if it's empty, try and reload the page once

sleek vortex
# agile jay Yep, and see how the sources change
```
Based on the search results, here are the prices of the iPhone 15:

* iPhone 15: starts at $799 (128GB), goes up to $1,549 (1TB)
* iPhone 15 Plus: starts at $899 (128GB), goes up to $1,549 (1TB)
* iPhone 15 Pro: starts at $999 (128GB), goes up to $1,749 (1TB)
* iPhone 15 Pro Max: starts at $1,199 (256GB), goes up to $1,799 (1TB)

Note that these prices may vary depending on the country, taxes, and availability. Additionally, prices may change over time due to promotions, sales, and other factors.

It's also important to note that these prices do not include any trade-in credits or promotions that may be available. If you're planning to purchase an iPhone 15, I recommend checking with your carrier or retailer for any available promotions and prices.
```
#

price issue fixed literally by changing to searxng

#

however i think this misinformation on the device's chip is due to the sources being cut off?

agile jay
sleek vortex
#

no lol

#

i have the whole complex parser setup

#

i mean cut off because i was dividing the 8k context into 7 sources

#

which isnt exactly good

#

let me lower it to like 3
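
A rough sketch of why fewer sources helps with an 8k window. The 1024-token reserve for the prompt and answer and the 4-characters-per-token estimate are assumptions, not values from the actual parser.

```python
def per_source_budget(context_tokens, n_sources, reserved_tokens=1024):
    """Tokens each source may use after reserving room for prompt and answer."""
    return (context_tokens - reserved_tokens) // n_sources


def truncate_to_budget(text, budget_tokens, chars_per_token=4):
    """Crude cut by characters; a real tokenizer would be more accurate."""
    return text[: budget_tokens * chars_per_token]


if __name__ == "__main__":
    for n in (7, 3):
        print(n, "sources ->", per_source_budget(8192, n), "tokens each")
```

Under these assumptions, going from 7 sources to 3 more than doubles the room each page gets, so a long page such as a spec sheet is less likely to be cut off before the relevant part.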

sleek vortex
#

@agile jay

#

only thing wrong is "A17 Bionic" instead of "A17 Pro"

agile jay
#

Almost there, then.

sleek vortex
#

i first

  • made it so i treat same links with regional duplication like en-us (like apple does) as the same
#
  • then resorts them back to og search engine order
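
The two steps above could look roughly like this. The locale pattern is a hypothetical heuristic that only handles segments shaped like `/en-us/` (it would not catch a prefix such as `/lae/`), and the function names are made up for illustration.

```python
import re
from urllib.parse import urlsplit

# Hypothetical heuristic: a leading path segment shaped like /en-us or /en_gb.
LOCALE_SEGMENT = re.compile(r"^/[a-z]{2}[-_][a-z]{2}(?=/|$)", re.IGNORECASE)


def canonical_key(url):
    """Key that is identical for regional variants of the same page."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = LOCALE_SEGMENT.sub("", parts.path).rstrip("/")
    return host + path


def dedupe_keep_order(urls):
    """Drop regional duplicates, keeping the first hit in search-engine order."""
    seen = set()
    unique = []
    for url in urls:  # assumed to already be in search-engine rank order
        key = canonical_key(url)
        if key not in seen:
            seen.add(key)
            unique.append(url)
    return unique
```

Iterating in rank order and keeping the first occurrence means no separate re-sort is needed afterwards.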
#

price agent:

```
🔎 [1.03s] Searched for "iPhone 15 lineup prices"@SearXNG - got 10 links, 0 snippets
🔗 [SearchAgent] [0.00s] Found 3 links total
🌐 [0.35s] [17609 chars / 4027 tokens] Read source https://www.apple.com/iphone-15/
🌐 [0.35s] [20413 chars / 5026 tokens] Read source https://www.phonearena.com/iphone-15-release-date-price-features
🌐 [1.81s] [437 chars / 112 tokens] Read source https://mobilekishop.net/blog/iphone-15-series-global-pricing-unveiled/
```
#

specs agent:

```
🔗 [SearchAgent] [0.00s] Found 3 links total
🌐 [0.37s] [22946 chars / 6085 tokens] Read source https://www.apple.com/lae/iphone-15-pro/specs/
🌐 [0.36s] [14005 chars / 3325 tokens] Read source https://www.tomsguide.com/news/iphone-15
🌐 [0.44s] [7354 chars / 2101 tokens] Read source https://www.cnet.com/tech/mobile/iphone-15-series-compared-which-model-suits-you/
```
#

ok idfk why it did it again there...

#

realistically i could do what you said and not rely on apple.com

sleek vortex
#

pplx haiku pro gets it wrong though...

agile jay
#

Yep, haiku hallucinates a lot more.

sleek vortex
#

no

#

its a source issue

#

4o also said a17 bionic

#

so did opus

agile jay
#

What sources are you using?

sleek vortex
#

opus

#

"a17 bionic"

#

i mean i could just tell it to find the details on gsmarena

#

but again isnt that overfitting my solution to this one query

#

well lets see what happens if i add that as one example in the n-shot of the initial intent-router prompt

#

and this time it decided to work

#

since it made more searches

#
```
🔎 [1.25s] Searched for "iPhone 15 Pro price"@SearXNG - got 10 links, 0 snippets
🔎 [1.26s] Searched for "iPhone 15 Pro Max price"@SearXNG - got 19 links, 0 snippets
🔎 [0.00s] Searched for "iPhone 15 specs"@SearXNG - got 10 links, 0 snippets
🔎 [3.35s] Searched for "iPhone 15 Pro specs"@SearXNG - got 10 links, 0 snippets
🔎 [3.34s] Searched for "iPhone 15 Pro Max specs"@SearXNG - got 10 links, 0 snippets
⏰ [FinalAnswerAgent] [4.10s] Model response time (claude-3-haiku)
```
#

so it's only done that once

#

idk honestly...

#

ok no way...

#

is that a response BETTER than perplexity???

#

@agile jay remember this yesterday

#

oh my god

#

Day 1: Arrive in Tokyo
- Arrive at Haneda Airport and take the train to your hotel in the Asakusa district
- Explore the Senso-ji Temple and the surrounding historic streets

Day 2: Tokyo 
- Visit the Tsukiji Fish Market for a sushi-making experience
- Explore the Imperial Palace East Garden
- Enjoy a traditional Japanese dinner and drinks in Asakusa

Day 3: Tokyo
- Go up the Tokyo Skytree for panoramic city views
- Stroll through the beautiful Hamarikyu Gardens
- Spend time in the trendy Harajuku district

Day 4: Tokyo to Kanazawa
- Take the bullet train to Kanazawa (approx 2.5 hours)
- Visit the stunning Kenrokuen Garden
- Explore the historic Higashi Chaya district

Day 5: Kanazawa
- Tour Kanazawa Castle
- Experience a traditional tea ceremony
- Wander through the charming Nagamachi samurai district

Day 6: Kanazawa to Kyoto
- Take the bullet train to Kyoto (approx 2 hours)
- See the iconic Fushimi Inari Shrine
- Visit the beautiful Kiyomizu-dera Temple

Day 7: Kyoto
- Admire the stunning Kinkaku-ji (Golden Pavilion)
- Explore the Gion geisha district
- Enjoy a traditional Kyoto-style dinner

For flights, I found round-trip options from London to Tokyo in May 2024 starting around £955. Some top airline choices include British Airways, Japan Airlines, and KLM. 

As for accommodations, there are many great hotel options in Tokyo, Kanazawa, and Kyoto to fit various budgets. Some highly rated picks include Andaz Tokyo Toranomon Hills, Park Hyatt Kyoto, and Kanazawa Tokyu Hotel.

Let me know if you need any other details or have additional requests as you plan your Japan trip!
#

@agile jay @warm cave @cinder comet look!!!!!! this is actually crazy good what

agile jay
warm cave
#

Which model was used, sonnet?

sleek vortex
#

70b = router
8b = search source summary agents
haiku = final response

#

let me send the full log

#

for some reason this python parallelism isnt working so right now it took like 100 seconds

#

other requests take like 12 though

#

theres no embeddings involved

#

i went for a method where it tries to hit 6 links, but then cuts it to 3, since sometimes there are like 3 dead links

#

sometimes there are 6 alive
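
One way the "try 6, keep 3" idea could be sketched, with the fetches running in parallel threads. The `fetch` callable is a stand-in for the real scraper and is assumed to return page text or raise on a dead link; this is not the project's actual code.

```python
from concurrent.futures import ThreadPoolExecutor


def read_sources(urls, fetch, attempt=6, keep=3):
    """Fetch up to `attempt` links in parallel and return the first `keep`
    that succeeded, as (url, text) pairs in search-rank order."""
    candidates = urls[:attempt]

    def safe_fetch(url):
        try:
            return fetch(url)
        except Exception:
            return None  # treat any failure as a dead link

    with ThreadPoolExecutor(max_workers=attempt) as pool:
        # pool.map yields results in the same order as `candidates`
        pages = list(pool.map(safe_fetch, candidates))

    alive = [(url, page) for url, page in zip(candidates, pages) if page]
    return alive[:keep]
```

Threads suit this step because the work is network-bound, so the total time is roughly that of the slowest link rather than the sum of all of them.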

agile jay
warm cave
#

Nice 👍

agile jay
#

Because that's what I generally use.

sleek vortex
#

8b summarises the 3 sources into the main info of that intent

#

and then haiku takes that in for the main response
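
The three-stage flow described here (70b router, 8b summarisers, haiku for the final answer) can be sketched with the model calls passed in as plain functions. All four callables are hypothetical stand-ins, and the stages run sequentially for simplicity.

```python
def answer(query, route, search, summarise, finalise):
    """Router -> per-intent search and summary -> final response."""
    intents = route(query)  # e.g. llama3-70b turns the query into search goals
    notes = []
    for intent in intents:
        sources = search(intent)  # page texts for this intent (3 in the chat)
        notes.append(summarise(intent, sources))  # e.g. llama3-8b condenses them
    return finalise(query, notes)  # e.g. claude-3-haiku writes the answer
```

Because the final model only sees the short notes rather than the raw pages, its input stays small, which fits the roughly 5k tokens mentioned for the Japan trip query.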

agile jay
#

Yep, 8B is great for managing context.

sleek vortex
#

70b, 8b on groq, and haiku on ...haiku

agile jay
#

Yep, well on anthropic.

sleek vortex
#

yeah

#

lol

#

quite the unlucky search with 5 links dead

#

but the answer pulled through still

#

all in 9 seconds...

agile jay
#

Now you just gotta fix the async code in Python.

sleek vortex
# sleek vortex

actually this might be due to searxng putting news onto a separate search type

#

pplx had 19 sources for it so hmm

#

ok but the whole japan trip one only sent 5k tokens to haiku

#

thats crazy good

warm cave
#

5k, yeah that is

sleek vortex
#

5k for all that info

#

pplx pro on 4o/opus level

#

and it even gave average flight price which i dont think ive seen pplx do (not sure though)

warm cave
#

Yeah, great quality with lower cost models

sleek vortex
agile jay
sleek vortex
#

well sockets would be the backend, no?

#

i need to refactor a decent amount of this to work as an actual app rather than a cli tool

warm cave
#

How long is your system prompt now?

sleek vortex
#

well

#

the system prompt is like 500

#

and then the examples 5-shot is like 1k

#

that's to llama3-70b

warm cave
#

That’s pretty good, and that should be cheap with Llama

sleek vortex
#

well the prompt to the 8b models is basically full context

#

so lets say an average of like

#

3 searches? per query

#

some maybe none, some maybe like 6

warm cave
#

Yeah, that ends up being good, but now imagine if you were running Opus 😧

#

Glad Llama 3 is so good, llama 2 would not cut it

sleek vortex
#
(0.59/1000000)*1500 + (0.05/1000000)*350 # groq/llama-3-70b router
+ ((0.05/1000000)*6656 + (0.05/1000000)*768)*4 # ~4x groq/llama-3-8b search summarisers
+ (0.25/1000000)*1750 + (1.25/1000000)*400 # claude-3-haiku final
#

let me add that up

#

$0.0033248 per query?
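
The same sum as a small script, so the figure can be re-checked when token counts change. Prices and token counts are copied from the expression as written, including the 0.05 output rate for the 70b router, which looks low next to the earlier llama3-70b figure of $0.000689; the $20 budget, the rounded average and the 30-day month are assumptions taken from the surrounding chat.

```python
def cost(price_in, tokens_in, price_out, tokens_out):
    """USD cost of one call; prices are per million tokens."""
    return (price_in * tokens_in + price_out * tokens_out) / 1_000_000


router = cost(0.59, 1500, 0.05, 350)      # groq/llama-3-70b router
summariser = cost(0.05, 6656, 0.05, 768)  # one groq/llama-3-8b summariser
final = cost(0.25, 1750, 1.25, 400)       # claude-3-haiku final answer

per_query = router + 4 * summariser + final

# Assumed budget and rounded average from the chat.
queries_per_20_usd = int(20 / 0.0035)
queries_per_day = queries_per_20_usd // 30

if __name__ == "__main__":
    print(f"${per_query:.7f} per query")
    print(queries_per_20_usd, "queries per $20,", queries_per_day, "per day")
```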

warm cave
#

And with pplx I would likely do 6 or more searches planning that same trip

sleek vortex
#

so more like ~$0.0035 average

warm cave
sleek vortex
#

could be a bit more

#

could be way less

#

thats for like a decently beefy query

#

because not every query does 4 searches

#

one time it had the iphone query do 6 searches

#

but on avg it did like 2

sleek vortex
#

thats crazy...

warm cave
#

5,700 queries per $20

sleek vortex
#

190 per day

#

assuming each user somehow uses every single query

#

realistically not happening

warm cave
sleek vortex
#

ok since we dont have opus

#

say we offered sonnet and 4o as best models

#

sadly no vc money to throw for opus

#

say maybe

#

$15/mo

warm cave
sleek vortex
#

gemini 1.5 pro is cheaper than 4o

#

stick that in there too

#

and it wouldnt increase prices much anyway since that model is only being used for the end query

warm cave
#

Yeah true

sleek vortex
#

we're feeding like max 5k tokens

#

and then if we brought document support in then idk we'd have to see how that works

#

maybe use sonnet as the model in the chain for documents

warm cave
#

Gemini flash is also a good option

#

1M

#

For doc upload

sleek vortex
#

True

#

well one thing to consider is that im on residential ip right now

#

some sources might be blocked when i move scraping to a cloud server

warm cave
#

Also if you self host any models, there is llama 3 8b 100k and 1M, same thing with 70b. But again the goal is not max context, but maximizing the quality and cost

sleek vortex
#

yeah but where would i host

#

i can put it on modal

#

but cold starts is like so slow

#

20-30 seconds? for llama3 8b

#

and it works out to more than groq

warm cave
#

Yeah I guess that would not be optimal

sleek vortex
#

i cant selfhost locally i have no hardware lol

#

why is mistral 7b so expensive

warm cave
#

I may have made a typo, let me check lol

sleek vortex
#

whos using mistral api realistically

#

idk

#

i wouldnt be using that

#

mistral large sounded like an eh model from the start to me

#

since its gonna be superseded anyway

#

wait so flash is 0.70 until you go above 128k tokens

#

?

warm cave
#

Only reason I use it, is to test out 8x22b

sleek vortex
#

also ideally we would maintain some form of cache of websites and intent queries right

#

idk where i'd store that

#

s3?

#

5gb free so i mean?

#

As part of the AWS Free Tier, you can get started with Amazon S3 for free. Upon sign-up, new AWS customers receive 5GB of Amazon S3 storage in the S3 Standard storage class; 20,000 GET Requests; 2,000 PUT, COPY, POST, or LIST Requests; and 100 GB of Data Transfer Out each month.

#

having a disk version of the scraped pages also lets me see the broken pages lol
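
A minimal sketch of a disk cache for scraped pages, assuming one text file per URL named by a hash. The class name, file layout and one-hour expiry are hypothetical; the same interface could sit in front of S3 or Postgres instead.

```python
import hashlib
import time
from pathlib import Path


class PageCache:
    """Stores scraped page text on disk, one file per URL."""

    def __init__(self, root, max_age_seconds=3600):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)
        self.max_age_seconds = max_age_seconds

    def _path(self, url):
        digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
        return self.root / (digest + ".txt")

    def get(self, url):
        """Return cached text, or None if missing or older than max_age_seconds."""
        path = self._path(url)
        if not path.exists():
            return None
        if time.time() - path.stat().st_mtime > self.max_age_seconds:
            return None
        return path.read_text(encoding="utf-8")

    def put(self, url, text):
        self._path(url).write_text(text, encoding="utf-8")
```

Keeping the files as plain text also makes it easy to open a cached page and see where the parser went wrong.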

#

what page is actually hitting >24k chars in meaningful content

#

s3 would be like decent latencies

warm cave
#

Yeah, no thats the price

sleek vortex
#

at that point

#

would rather just host postgres

#

to store these pages

warm cave
#

too big to fit in one screenshot now

sleek vortex
#

ive been manually fixing a few pages

#

its like whack-a-mole more or less

#

wikipedia seems to be huge due to it having references included in the main article

#

...ought to go and fix that lmao

#

im probably going to take a break for today

#

made some immense progress i think

warm cave
#

Yeah, huge progress

tame current
#

I'm so happy that I managed to use the code and paid considerably less. I didn't know that this code cut the Perplexity subscription price in half. I should have done this much earlier. I was considering paying for ChatGPT, but in terms of cost-benefit, there's no competition with Perplexity, which only costs half the price. So, here I stay.

kind hamlet
#

🗿

rancid dome
tame current
#

Here.

agile jay
#

Oh, it's a one month half price referral code

#

Like this one, which is mine:

warm cave
#

Yep ^^

rancid dome
#

What happened to Claude 3? I asked it to improve my prompts, and it used to follow my instructions precisely and in detail, but not anymore. Now it answers in general terms and doesn't even follow my instructions.

rancid dome
#

Yes, I tried several times

agile jay
#

Sonnet or Opus

#

Also give an example of your prompt.

rancid dome
warm cave
#

Can you share a thread or prompt that you used and had bad results with

rancid dome
#

My prompt is quite long to post (1600 words)

fading moth
#

Put it in a txt

devout geyser
#

What was recently discussed here sounds a bit like a personal project I did recently for myself, primarily for curiosity. Takes a lot longer than pplx at doing an internet type of query though 😂 - about 30 seconds. Uses a combination of Gemini 1.5 Flash (e.g. for extracting relevant content from webpages) and GPT-4o (e.g. to answer the user's message)

terse drum
devout geyser
# terse drum How you extract content from webpage?

I use Gemini 1.5 Flash with a suitable prompt, feeding it the text content of the webpage, the query (e.g. the search query) and the URL of the webpage, instructing it to do the extraction of relevant content blah blah. Temperature and top_p were changed too, until I found I had a consistent result. Ironically the above query in the screenshot proved to be a good example for testing it, plenty of text on that page and some models easily got mixed up

#

When I say that page, I mean the Docker Compose release notes page that is, on Docker's website

terse drum
#

Ok but how you scrape content from website? Example I use jina reader for this

devout geyser
#

Currently I simply use requests (Python) with some additional headers, on my own network (so I appear as a residential IP of course)

#

As well as a custom user agent

#

Some websites block this or don't work well with this because no JS for example
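
A sketch of that kind of plain HTTP fetch. It uses the standard library's `urllib.request` rather than `requests` so it runs without extra packages; the header values and the user agent string are hypothetical, not the ones actually used.

```python
from urllib.request import Request, urlopen

# Hypothetical header values for illustration only.
DEFAULT_HEADERS = {
    "User-Agent": "my-research-bot/0.1",
    "Accept": "text/html,application/xhtml+xml",
    "Accept-Language": "en-US,en;q=0.9",
}


def build_request(url, extra_headers=None):
    """Build a GET request carrying the default headers plus any overrides."""
    headers = dict(DEFAULT_HEADERS)
    headers.update(extra_headers or {})
    return Request(url, headers=headers)


def fetch_text(url, timeout=10):
    """Fetch a page and decode it; no JavaScript is executed."""
    with urlopen(build_request(url), timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

As the message says, pages that render client-side will come back mostly empty with this approach, which is where a headless browser comes in.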

sleek vortex
devout geyser
sleek vortex
#

Ive got response time down below 30s though :d

#

well mostly

#

in theory all the time once i fix my parallel code

devout geyser
#

Yeah I haven't managed that yet, I don't cache various URLs

sleek vortex
#

```
Based on my search results, the key changes and improvements made in Docker Compose v1.24.0 include:

1. Improved navigation menu in the `docker compose up` command, allowing you to manage services, stop and remove containers, and view logs.
2. Displaying the reason for image pull failures.
3. Fixes for crashes when running `up` with `--no-build` and `--watch`, and when no TTY is available and the menu is enabled.
4. Legibility improvements to the menu action text.
5. Added support for annotations in the Compose file.
6. Improvements to the `config` command, including options to list Compose model variables and specify the output format.
7. Integration of Mutagen to provide synchronized file shares between containers.

However, I do not have access to the full official release notes for this specific version of Docker Compose. If you need more detailed information, I recommend checking the official Docker documentation or website for the latest release notes.
```
#

ok that was pretty mid

#

🌐 [1.34s] [99000 chars / 21946 tokens] Read source https://docs.docker.com/compose/release-notes/

#

why is it 21k tokens...

#

since im not using large models for source processing

#

whenever i run into a large or broken-parsed source its a bit of a problem

devout geyser
#

That page is huge sadly, but also a good test bench imo

sleek vortex
#

ok well this is great they just have a whole page on every docker release ever..

#

not great for my 8k llm

#

thats made me think, i could go for a hybrid approach actually

terse drum
sleek vortex
#

I havent added in the feedback loop

devout geyser
#

Yeah Playwright is a good choice but it's slow 😦

sleek vortex
#

i got around 4 seconds to scrape one page when warm on cloud worker

#

though could be improved by keeping the browser itself open
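
A sketch of the "keep the browser open" idea, assuming Playwright's sync API. `WarmBrowser` is a hypothetical wrapper, and the `launch` parameter exists only so the reuse logic can be exercised without Playwright installed.

```python
class WarmBrowser:
    """Launch the browser once and reuse it for every page."""

    def __init__(self, launch=None):
        # `launch` is injectable; by default Chromium is started via Playwright.
        self._launch = launch or self._launch_chromium
        self._playwright = None
        self._browser = None

    def _launch_chromium(self):
        from playwright.sync_api import sync_playwright  # third-party, lazy import

        self._playwright = sync_playwright().start()
        return self._playwright.chromium.launch(headless=True)

    def render(self, url: str) -> str:
        if self._browser is None:          # cold start: pay the launch cost once
            self._browser = self._launch()
        page = self._browser.new_page()    # a fresh tab per URL
        try:
            page.goto(url, wait_until="networkidle")
            return page.content()          # HTML after client-side rendering
        finally:
            page.close()

    def close(self) -> None:
        if self._browser is not None:
            self._browser.close()
            self._browser = None
        if self._playwright is not None:
            self._playwright.stop()
            self._playwright = None
```

Only the first `render` call pays for the browser launch; later calls pay for a new tab and the page load.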

terse drum
#

Please give me an example URL with heavy JS

sleek vortex
#

perplexitys own website

#

quora

terse drum
#

I'll test the load time

sleek vortex
#

are both clientside react

devout geyser
#

As for the context, I use Gemini 1.5 Flash to extract the content that's most likely relevant from the webpage text (HTML stripped that is), although this process slows things down of course
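
A sketch of the pre-processing step mentioned here: strip the HTML down to visible text, then wrap it in an extraction prompt. The standard-library parser is swapped in for whatever stripping is actually used, and the prompt wording is hypothetical. The model call itself is left out.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def strip_html(html: str) -> str:
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


def build_extraction_prompt(question: str, html: str) -> str:
    # Hypothetical prompt wording; the actual prompt is not shown in the chat.
    return (
        "Extract only the passages relevant to the question.\n"
        f"Question: {question}\n"
        f"Page text:\n{strip_html(html)}"
    )
```

Stripping markup first keeps the token count down, which matters less for a 1M-context model but still affects latency and cost.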

sleek vortex
#

your gemini 1.5 flash has given me a little idea

#

@terse drum 2.9 seconds to screenshot on playwright, incl launching browser

sleek vortex
#

if the smaller model doesnt find enough information from the search, then i can fall back to 6 searches with a larger llm (haiku or flash)

#

yeah thats a great idea actually
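
The hybrid approach could be sketched roughly as below. The two pipelines are passed in as plain callables, the "not enough information" signal is modeled as `None`, and the search count for the small model is an assumption (only the 6 for the larger model comes from the chat).

```python
from typing import Callable, Optional, Tuple

# (query, number_of_searches) -> answer text, or None when sources were insufficient
Pipeline = Callable[[str, int], Optional[str]]


def answer_with_fallback(
    query: str,
    small: Pipeline,
    large: Pipeline,
    small_searches: int = 2,   # assumed value, not stated in the chat
    large_searches: int = 6,
) -> Tuple[Optional[str], str]:
    """Try the cheap model first; escalate only when it comes back empty."""
    answer = small(query, small_searches)
    if answer is not None:
        return answer, "small"
    return large(query, large_searches), "large"
```

The second element of the return value records which path answered, which makes it easy to log how often the fallback fires.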

devout geyser
#

Nice, yeah mine generally aims for 2 search queries, it can do more if it feels it might be useful though (e.g. a complex query)

wicked spoke
#

Perplexity down for anyone else?

fading moth
#

With the sickness?

sleek vortex
#

nah

#

working fine

supple vine
#

all good here

wicked spoke
#

Weird. Nothing in my library and nothing pops up

supple vine
#

try a page refresh

sleek vortex
#

reload/refresh the page

supple vine
#

I get that sometimes

terse drum
sleek vortex
#

he probably doesnt

wicked spoke
#

Oh hey it worked 🙄 I thought I tried that already. Thanks!

sleek vortex
#

does similar/same thing as me im guessing, parsing the pages

devout geyser
sleek vortex
#

how do you fare against

#

painful site

#

especially the "compare devices" page

#

its pretty horrible

fading moth
#

So... I know a lot of people say refresh and most people hit F5 or the circle arrow, but the best refresh is Ctrl+F5

terse drum
sleek vortex
#

60k tokens of useless i think

devout geyser
#

Hmm good question, I'll try something that accesses Apple's site

sleek vortex
#

ive filtered out a few apple pages

#

and also dealt with various other pages

#

with custom selectors

#

as ive said its like whack-a-mole lol

devout geyser
#

Yeah I've excluded some domains from being scraped, such as Reddit
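
A sketch of how per-domain selectors and an exclusion list might be combined. All entries in the two tables are hypothetical placeholders; the real lists are not shown in the chat.

```python
from typing import Optional
from urllib.parse import urlparse

# Hypothetical entries for illustration only.
BLOCKED_DOMAINS = {"reddit.com"}
CONTENT_SELECTORS = {"apple.com": "main", "docs.docker.com": "article"}
DEFAULT_SELECTOR = "body"


def _matches(host: str, domain: str) -> bool:
    # Match the domain itself and any subdomain, but not lookalikes.
    return host == domain or host.endswith("." + domain)


def selector_for(url: str) -> Optional[str]:
    """Return the CSS selector to scrape, or None if the domain is excluded."""
    host = (urlparse(url).hostname or "").lower()
    if any(_matches(host, d) for d in BLOCKED_DOMAINS):
        return None
    for domain, selector in CONTENT_SELECTORS.items():
        if _matches(host, domain):
            return selector
    return DEFAULT_SELECTOR
```

Keeping the rules in data rather than code means each new problem site is a one-line addition, which suits the whack-a-mole nature of the work.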

sleek vortex
#

reddit i want to have eventually

#

since it is a decent/good source if i can have grounding

#

well youre thinking personal use, im thinking if i can build a product lol

terse drum
devout geyser
#

Interesting

terse drum
#

I haven't tried Gemini 1.5 Flash yet - I need to run some tests

sleek vortex
#

why 1.0?

#

1.5 is better

#

cheaper? 1M ctx window

devout geyser
#

I tried Haiku with the webpage extraction part of the process, but it didn't feel as accurate or as detailed at the task generally compared to Gemini 1.5 Flash, however it could also be that I need to play with my prompt more, or the temp/top_p

terse drum
#

Cheap and fast. I use it through OpenRouter. The results are good, so...

devout geyser
#

There is a downside to Flash though, its moderation layer can be triggered by some webpages

sleek vortex
#

turn it down i think

terse drum
devout geyser
#

Yeah same, I've also used GPT Prompt Engineer (I think it's called that?) on GitHub as well as the recent Prompt Generator tool on Claude

terse drum
devout geyser
#

Oh nice, I'll check it out 🙂

#

Thanks

sleek vortex
wicked spoke
terse drum
#

Glad to hear

rancid dome
terse drum
#

Once Claude has built the prompt for a specific task, I switch to my preferred model (for example Haiku) and run tests (I have a database of websites for this, covering different use cases). If the tests pass, I put the new prompt to work
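
The workflow above amounts to a small regression test for prompts. A sketch under stated assumptions: `PromptCase`, `prompt_passes`, and the substring check are all hypothetical, and `run_model` stands in for the call to the preferred model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PromptCase:
    page_text: str            # stored website text for one use case
    must_contain: List[str]   # facts a correct output has to mention


def prompt_passes(
    prompt: str,
    cases: List[PromptCase],
    run_model: Callable[[str, str], str],
) -> bool:
    """Return True only if every stored case yields all expected strings."""
    for case in cases:
        output = run_model(prompt, case.page_text).lower()
        if not all(expected.lower() in output for expected in case.must_contain):
            return False
    return True
```

A substring check is crude, but it is enough to catch a new prompt that silently drops information an older prompt used to extract.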

rancid dome
terse drum
rancid dome
#

Playground on Claude website?

sleek vortex
#

some websites seem to not send a captcha challenge

#

if you dont include a useragent

terse drum
sleek vortex
#

<script>!function(){var e=document.createElement("iframe");function n(){var n=e.contentDocument||e.contentWindow.document;if(n){var t=n.createElement("script");t.nonce="",t.innerHTML="window['__CF$cv$params']={r:'792f8224776acf9f',m:'hMcSCCrnIkr7c8Pec6Na6boaaFAnQ6S0ypG2GKRbKgc-1675305063-0-AaJn0SqKZQnadmRQ5O1dM9xMkXWyP+ll7gpl2NHeoNbZTEXMjlB10KkwnEU3hf0/gMODfKqcBGLVecql6U04GGs+iJ/kNrNqj1FgfAOlQV+T2koMQMvUy1zr9tegBBX6BikfccHZhwoJhnXc0eTcg58=',s:[0x60b082f691,0xee65a67e11],u:'/cdn-cgi/challenge-platform/h/b'};var now=Date.now()/1000,offset=14400,ts=''+(Math.floor(now)-Math.floor(now%offset)),_cpo=document.createElement('script');_cpo.nonce='',_cpo.src='/cdn-cgi/challenge-platform/h/b/scripts/alpha/invisible.js?ts='+ts,document.getElementsByTagName('head')[0].appendChild(_cpo);",n.getElementsByTagName("head")[0].appendChild(t)}}if(e.height=1,e.width=1,e.style.position="absolute",e.style.top=0,e.style.left=0,e.style.border="none",e.style.visibility="hidden",document.body.appendChild(e),"loading"!==document.readyState)n();else if(window.addEventListener)document.addEventListener("DOMContentLoaded",n);else{var t=document.onreadystatechange||function(){};document.onreadystatechange=function(e){t(e),"loading"!==document.readyState&&(document.onreadystatechange=t,n())}}}();</script></body>

#

this looks like cloudflare right?

#
```
<div class="data"><form id="challenge-form" class="challenge-form"><div id="cf-please-wait"><div id="spinner"><div id="cf-bubbles"><div class="bubbles"></div><div class="bubbles"></div><div class="bubbles"></div></div></div><p id="cf-spinner-please-wait"></p><p id="cf-spinner-redirecting" style="display:none"></p></div><noscript id="cf-captcha-bookmark" class="cf-captcha-info"><h1 style="color:#bd2426;">Please turn JavaScript on and reload the page.</h1></noscript><div id="no-cookie-warning" class="cookie-warning" style="display:none">
```
#

yeah

#

cf
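
A heuristic sketch for spotting this kind of response in a scraper, so the page is not passed to the model as if it were content. The markers are taken from the challenge form pasted above. The `/cdn-cgi/challenge-platform/` script from the first snippet is left out on the assumption that it is a weaker signal on its own.

```python
# Substrings copied from the challenge form markup pasted above.
CLOUDFLARE_CHALLENGE_MARKERS = (
    'id="challenge-form"',
    "cf-captcha-info",
    "Please turn JavaScript on and reload the page.",
)


def looks_like_cloudflare_challenge(html: str) -> bool:
    """Heuristic: True if the body contains any known challenge marker."""
    return any(marker in html for marker in CLOUDFLARE_CHALLENGE_MARKERS)
```

Markup like this changes over time, so a check of this kind needs occasional updating and should be treated as a hint, not proof.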

#

hmm

terse drum
sleek vortex
#

do you not run into captcha often though

terse drum
#

Use stealth mode to bypass captcha

#

It works for most use cases

sleek vortex
#

im so confused

#

this code in ts

#
```ts
// Fetch the pricing page with browser-like headers.
// Note that no "user-agent" header is set explicitly here.
const res = await fetch("https://openai.com/api/pricing/", {
    "headers": {
        "accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7",
        "accept-language": "en,en-US;q=0.9",
        "cache-control": "no-cache",
        "pragma": "no-cache",
        "priority": "u=0, i",
        "sec-ch-ua": "\"Chromium\";v=\"125\", \"Not.A/Brand\";v=\"24\"",
        "sec-ch-ua-mobile": "?0",
        "sec-ch-ua-platform": "\"macOS\"",
        "sec-ch-ua-arch": "\"x86\"",
        "sec-ch-ua-bitness": "\"64\"",
        "sec-fetch-dest": "document",
        "sec-fetch-mode": "navigate",
        "sec-fetch-site": "none",
        "sec-fetch-user": "?1",
        "upgrade-insecure-requests": "1",
        "DNT": "1",
    },
    // "referrerPolicy": "strict-origin-when-cross-origin",
    "body": null,
    "method": "GET"
});
// Print the raw HTML so a captcha page is easy to spot.
const text = await res.text();
console.log(text);
```
#

works fine

#

but in httpx the exact same headers dont work, i get sent a cf captcha

#

...what

sleek vortex
#

omg i hate myself

#

why do i not name folders properly

#

i cant even find where i wrote that code

modest token
#

So does anyone know how I might be able to save a Perplexity thread to my account since I created it before I made my account?

sleek vortex
#

you could just bookmark the link if it's extra important

modest token
#

So is Perplexity wrong about its own functionality?
```
To save the thread [link] to your account library for use anywhere:

Sign in or create a Perplexity account if you haven't already.

Once signed in, open the thread you want to save.
Click the "Save" button at the top right of the thread.
The thread will now be saved to your personal library, accessible from any device by signing into your account.

Saving threads to your library allows you to access them privately from anywhere, even if you started the conversation before creating an account. Saved threads are not locked or shareable unless you explicitly share the URL.
```

https://www.perplexity.ai/hub/getting-started was the main source it used

sleek vortex
#

wait i have no clue if thats a thing or not

modest token
#

Well I couldn't see that button so...

sleek vortex
#

yeah neither do i

#

maybe its an old feature

#

or yeah pplx lied about itself

terse drum
#

I don't have pplx premium - is it worth it?

#

I didn't renew my OpenAI subscription, so I'm looking for something interesting

#

Or any other tool?

shut plaza
#

I use it a ton to help me understand content from school

#

I also use it as a secondary search engine

sleek vortex
#

turns out bun ts library adds a useragent

#

cf proxy was looking for a useragent

#

but if it can parse chrome it sends you the captcha
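
Differences like this are easier to debug by looking at what a client really sends. Below is a sketch of a local echo server that returns the request headers it received. The standard-library `urllib` is swapped in as the client for illustration; the same URL could be requested from httpx or Bun to compare. All function names are hypothetical.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class _EchoHeaders(BaseHTTPRequestHandler):
    def do_GET(self):
        # Reply with the request headers exactly as they arrived.
        body = json.dumps(dict(self.headers.items())).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the console quiet
        pass


def start_echo_server() -> HTTPServer:
    server = HTTPServer(("127.0.0.1", 0), _EchoHeaders)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def headers_seen_by_server(server: HTTPServer, client_headers: dict) -> dict:
    url = f"http://127.0.0.1:{server.server_port}/"
    request = urllib.request.Request(url, headers=client_headers)
    # An empty ProxyHandler keeps the request local even if proxy env vars are set.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=5) as response:
        return json.loads(response.read())
```

Calling `headers_seen_by_server(server, {"Accept": "text/html"})` shows a `User-Agent` entry even though none was passed in, which is the same kind of silent default discussed above.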

#

hmm...

#
  • it failed to find gpt4o
  • failed to get ANY context windows
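
Failures like these can be caught automatically before the answer is shown. A sketch, assuming the required terms (model names, or a phrase such as "context window") are known up front. `missing_terms` is a hypothetical helper.

```python
import re
from typing import List


def _normalize(text: str) -> str:
    # "GPT-4o", "gpt 4o" and "gpt4o" all become "gpt4o".
    return re.sub(r"[^a-z0-9]", "", text.lower())


def missing_terms(answer: str, required: List[str]) -> List[str]:
    """Return the required terms that never appear in the answer."""
    flat = _normalize(answer)
    return [term for term in required if _normalize(term) not in flat]
```

Because punctuation and spaces are removed before matching, very short terms can match by accident across word boundaries, so this works best with distinctive names. A non-empty result could be used as the trigger for another search round.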
#

bruh

#

🌐 [0.42s] [1755 chars / 318 tokens] Read source https://www.binance.com/en/price/gpt-4o

#

what is this... this page does not exist

terse drum
#

😂

sleek vortex
#

exactly

#

most stupid page lmao

chilly helm
#

Is Perplexity down right now? I can't get any prompts to go through from the website

agile jay
#

For some people

agile jay
sleek vortex
#

so close yet so far

agile jay
sleek vortex
#

ā“ Query? > ai model price comparison - claude 3 models, gpt 4 turbo, gpt4o, gemini 1.5 pro+flash - use unit million tokens and give it in a table