#general
- fetches and scrapes pages, could even use embedded chrome to be able to fetch js pages and pdfs
- maybe uses llama to consider how good the source is?
based on some form of threshold we could store the source snippets into embeddings
obviously there's the problem that i wouldn't have a pagerank algorithm
and my web index would be purely searching based on text similarity
Yep, I think refining the input using embeddings, and then using a fast and cheap model like llama 3 would be the best option.
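A minimal sketch of the idea above: keep only sources that pass a quality threshold, then rank purely by embedding similarity (no pagerank). The vectors, quality scores and the `top_snippets` helper are made up for illustration; a real setup would get embeddings from a model and quality scores from something like the llama check mentioned earlier.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_snippets(query_vec, index, min_quality=0.5, k=3):
    """index holds (snippet, embedding, quality) tuples (hypothetical layout).
    Low-quality sources are dropped, the rest are ranked by similarity only."""
    kept = [(cosine(query_vec, emb), text) for text, emb, q in index if q >= min_quality]
    kept.sort(reverse=True)
    return [text for _, text in kept[:k]]

# Toy index with hand-written 3-d "embeddings" (assumption, not real model output)
index = [
    ("llama 3 release notes", [0.9, 0.1, 0.0], 0.8),
    ("spam page", [0.9, 0.1, 0.0], 0.1),
    ("cooking blog", [0.0, 0.2, 0.9], 0.9),
]
results = top_snippets([1.0, 0.0, 0.0], index)
```

The spam page matches the query just as well as the release notes, so the quality threshold is the only thing keeping it out.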
What decides whether my question (in a non-English language) gets translated into English for pro search? I notice my questions are sometimes translated, but not always, which is kind of annoying to be honest. Is there some kind of switch?
Could be related to the sources. Pretty sure the default is for it to guess. You can go to profile in settings to tell it a specific language.
https://hipertextual.com/2024/05/perplexity-la-ia-que-vence-a-chatgpt-y-gemini-de-la-que-nadie-habla
I didn't understand it. Perplexity can win Claude 3 using Haiku as default?
Was someone using both kagi and perplexity to compare their ai features? Im thinking of trying kagi but not sure if its worth the time and money
Quick question. I am new to Perplexity. I just asked it what the latest version of ChatGPT is, and it said the latest version is GPT4 Turbo, which was released in November 2023. Is this out-of-date information because I am using the free version, or is this typical of Perplexity?
haiku's response to
"plan me a trip to japan somewhere in the month of may 2024"
{
"steps": [
{
"id": 0,
"worker_type": "web_worker",
"inputs": {
"goal": "best places to visit in japan in may 2024"
},
"dependencies": []
},
{
"id": 1,
"worker_type": "web_worker",
"inputs": {
"goal": "flight prices from major us airports to japan in may 2024"
},
"dependencies": [
0
]
},
{
"id": 2,
"worker_type": "web_worker",
"inputs": {
"goal": "top rated hotels in popular japan destinations for may 2024"
},
"dependencies": [
0
]
},
{
"id": 3,
"worker_type": "web_worker",
"inputs": {
"goal": "japan rail pass options and pricing for may 2024"
},
"dependencies": [
0
]
},
{
"id": 4,
"worker_type": "final_output",
"dependencies": [
0, 1, 2, 3
]
}
]
}
Hey, Are you on web search or writing mode, here is what I get when I ask the same question in writing mode: https://www.perplexity.ai/search/What-is-the-naaH6lGFTmipttnJeN1xMw
The latest version of ChatGPT is GPT-4o, which was introduced on May 13, 2024. This new model offers several enhancements over previous versions, including improved speed and advanced capabilities in text, voice, and vision processing. GPT-4o is designed to provide GPT-4-level intelligence but is much faster and more efficient. It also includes ...
seems to not pick up on the weather thing
Oh, let me take a look
and this is sonnet
way better but still idk
```json
{
  "steps": [
    {"id": 0, "worker_type": "web_worker", "inputs": {"goal": "find popular tourist destinations in Japan"}, "dependencies": []},
    {"id": 1, "worker_type": "web_worker", "inputs": {"goal": "find typical weather conditions in Japan during May"}, "dependencies": []},
    {"id": 2, "worker_type": "airport_search", "inputs": {"location": "my current location"}, "dependencies": []},
    {"id": 3, "worker_type": "airport_search", "inputs": {"location": "major cities in Japan"}, "dependencies": [0]},
    {
      "id": "4*",
      "worker_type": "map:flight_price",
      "inputs": {
        "per:takeoff_location": "ref:2.airport_code",
        "per:landing_location": "ref:3.airport_code",
        "per:day": "dates in May 2024"
      },
      "dependencies": [2, 3]
    },
    {"id": 5, "worker_type": "web_worker", "inputs": {"goal": "find highly rated hotels and accommodation options in the Japanese cities from step 0"}, "dependencies": [0]},
    {"id": 6, "worker_type": "final_output", "dependencies": [0, 1, 4, 5]}
  ]
}
```
havent fully figured out the syntax for map steps but
this is already promising i think
Yeah it is, looks like you are getting close
i need to figure out map steps
or how to syntax it
aka like when the following step is run on each output from a previous step
so if i asked for an ipad lineup comparison
What model do you plan to use on the final product?
ideally first step would get a list
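One way the map step could work, as a rough sketch: a step whose `worker_type` starts with `map:` gets expanded into one concrete step per item from the step it depends on. The `map:` and `per:` prefixes and the `<ref>` placeholder follow the plans in this chat; the `expand_map_step` helper, the expanded id format and the sample iPad names are assumptions.

```python
def expand_map_step(step, outputs):
    """Turn one 'map:' step into one concrete step per item of its first dependency."""
    worker = step["worker_type"].removeprefix("map:")
    source = outputs[step["dependencies"][0]]  # list produced by the earlier step
    expanded = []
    for i, item in enumerate(source):
        inputs = {
            key.removeprefix("per:"): value.replace("<ref>", str(item))
            for key, value in step["inputs"].items()
        }
        expanded.append({"id": f"{step['id']}[{i}]", "worker_type": worker, "inputs": inputs})
    return expanded

map_step = {
    "id": "1*",
    "worker_type": "map:web_worker",
    "inputs": {"per:goal": "find out the price and specs for the device <ref>"},
    "dependencies": [0],
}
# Hypothetical output of step 0 (the list of devices)
outputs = {0: ["iPad Air", "iPad Pro"]}
expanded = expand_map_step(map_step, outputs)
```

Each expanded step is an ordinary step, so the rest of the runner would not need to know about map syntax at all.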
final output maybe like sonnet or haiku or 4o
im too broke to serve opus atm
all the intermediary models would be llama 8b or something?
and the starting model im debating between haiku, sonnet or a finetune 8b
let me ask this same query to groq l3- 70b actually
and 4o
Yeah, let's see
gpt4o input $5/mtok, output $15/mtok
time to sign into my $5 funded api account on gpt platform lol
I'm glad groq is free rn, makes it easy to test things out
ok so heres gpt4o on temp 1
```json
{
  "steps": [
    {"id": 0, "worker_type": "web_worker", "inputs": {"goal": "best cities to visit in Japan in May 2024"}, "dependencies": []},
    {"id": "1*", "worker_type": "map:airport_search", "inputs": {"per:location": "<ref>"}, "dependencies": [0]},
    {
      "id": 2,
      "worker_type": "flight_price",
      "inputs": {
        "takeoff_location": "origin_airport_code",
        "landing_location": "destination_airport_code",
        "day": "May 2024"
      },
      "dependencies": [1]
    },
    {"id": 3, "worker_type": "weather_worker", "inputs": {"location": "Japan"}, "dependencies": []},
    {"id": 4, "worker_type": "final_output", "dependencies": [0, "1*", 2, 3]}
  ]
}
```
ok ideally what do we actually want it to do
- search up best tourist places
- look at weather in that area for the month
- find the airport codes for those areas
- find flight price from user's origin airport
- build an output itinerary
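A sketch of how the ideal flow above could be scheduled: group steps into "waves" where everything in a wave has its dependencies met, so each wave can run in parallel. The step ids and the `parallel_waves` helper are invented for illustration; only the `dependencies` field comes from the plans in this chat.

```python
def parallel_waves(steps):
    """Group step ids into waves; every step in a wave has all dependencies done."""
    remaining = {s["id"]: set(s["dependencies"]) for s in steps}
    done, waves = set(), []
    while remaining:
        ready = [sid for sid, deps in remaining.items() if deps <= done]
        if not ready:
            raise ValueError("cycle or missing dependency in plan")
        waves.append(ready)
        done.update(ready)
        for sid in ready:
            del remaining[sid]
    return waves

# Hypothetical plan mirroring the bullet list above
plan = [
    {"id": "places", "dependencies": []},
    {"id": "weather", "dependencies": ["places"]},
    {"id": "airports", "dependencies": ["places"]},
    {"id": "flights", "dependencies": ["airports"]},
    {"id": "itinerary", "dependencies": ["places", "weather", "flights"]},
]
waves = parallel_waves(plan)
```

Weather and airport lookups land in the same wave, so they would run together instead of one after the other.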
well we're definitely very close
i think maybe we have to first expose a better schema for the weather function
and also make it more clear how to specify inputs from other functions
like "landing_location"
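For specifying inputs from other functions, one option is the `ref:<step id>.<field>` form that sonnet used in its plan. A small sketch of resolving those before a worker runs; `resolve_inputs`, the numeric-id assumption and the sample airport codes are illustrative only.

```python
def resolve_inputs(inputs, outputs):
    """Replace 'ref:<step id>.<field>' strings with values from earlier step outputs.
    Assumes numeric step ids and dict-shaped outputs (both are assumptions)."""
    resolved = {}
    for key, value in inputs.items():
        if isinstance(value, str) and value.startswith("ref:"):
            step_id, field = value[len("ref:"):].split(".", 1)
            resolved[key] = outputs[int(step_id)][field]
        else:
            resolved[key] = value
    return resolved

# Hypothetical outputs of two airport_search steps
outputs = {2: {"airport_code": "LHR"}, 3: {"airport_code": "HND"}}
flight_inputs = resolve_inputs(
    {
        "takeoff_location": "ref:2.airport_code",
        "landing_location": "ref:3.airport_code",
        "day": "May 2024",
    },
    outputs,
)
```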
right now its a large system prompt + 2 shot of example queries
around 2k tokens (given that the json is formatted)
Sounds good, do what the rabbit team promised, haha
Yeah that is getting a bit large
lmao fr
the system prompt is
1.17k
with the "tools":
web_worker, spotify_worker, weather_worker, airport_search, flight_price, final_output
now that doesnt help that this might grow if we added more things
I have wondered if there is a way to dynamically load in context for relaxant tools, but idk
relaxant?
lol, "relevant"
oh right
On my phone
i mean we could have a super small model look at only the query and decide which tools might be required
then use a large model after to decide the specifics and dependencies
but then this chain is going to involve like 25+ model requests at this rate
also small model might neglect quality?
ooh i have an idea
Yeah, you could have a list of tools and a brief description, and a small model would select the tools
what if we get the model to write sort of pseudo js code or something
And those would load into the big models context
but
the small model given something like japan query
might not pick up on needing to figure out the weather
or something
i mean even haiku didnt pick up on flight price like 60% of the time
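The two-stage idea above could look roughly like this: the small model only sees a one-line menu, and the big model's context is built from the full schemas of whatever was picked. Tool names are the ones listed earlier; the descriptions, schema placeholders and helper names are assumptions. Unknown names from the small model are simply dropped, and `final_output` is always kept.

```python
# One-line descriptions are placeholders, not the real prompt text
TOOLS = {
    "web_worker": "Search the web for a goal.",
    "spotify_worker": "Play a song on Spotify.",
    "weather_worker": "Get a weather forecast for a location.",
    "airport_search": "Find the airport code for a location.",
    "flight_price": "Get flight prices between two airports.",
    "final_output": "Write the final answer for the user.",
}

def tool_menu():
    """Short menu shown to the small model: one line per tool."""
    return "\n".join(f"- {name}: {desc}" for name, desc in TOOLS.items())

def build_context(selected, full_schemas):
    """Keep only known tools the small model picked; always include final_output."""
    wanted = set(selected) | {"final_output"}
    names = [name for name in TOOLS if name in wanted]
    return "\n\n".join(full_schemas[name] for name in names)

schemas = {name: f"<full schema for {name}>" for name in TOOLS}
context = build_context(["weather_worker", "made_up_tool"], schemas)
```

This does not fix the worry that the small model forgets a tool (like weather for the japan query); it only keeps the big prompt from growing with every new tool.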
Hmmmm
this is an interesting idea
let me try this
instead of huge json schemas
we could maybe make it write something like this
You could have broad groups of tools, like many travel tools would be a category
Why not just write a simple interface for the tools?
this is an example i wrote
make a table comparing the whole current 2024 ipad lineup
->
```json
{
  "steps": [
    {
      "id": 0,
      "worker_type": "web_worker",
      "inputs": {"goal": "find out the names of the ipads in the 2024 lineup"},
      "dependencies": []
    },
    {
      "id": "1*",
      "worker_type": "map:web_worker",
      "inputs": {"per:goal": "find out the price and specs for the device <ref>"},
      "dependencies": [0]
    },
    {
      "id": 2,
      "worker_type": "final_output",
      "dependencies": [0, "1*"]
    }
  ]
}
```
as part of a shot for the model
so maybe instead we could represent it like
/**
* Website explorer - if you have more than one goal, use separate queries
* @constructor
* @param {string} goal - Specifically what to find out from the web.
*/
type SearchWorker<T> = (goal: string) => T;
/**
* Plays a song on the user's Spotify account
*/
type SpotifyWorker = (song_name: string, artist: string) => void;
type WeatherForecast = {
temperature: number;
weather: string;
date: string;
};
/**
* Get the seven day forecast for a location
* @param {string} location - The location to get the forecast for
*/
type SevenDayForecastWorker = (location: string) => Promise<WeatherForecast[]>;
/**
* Get the current 3 character airport code for a location
* @param {string} rough_name_or_location - The location to get the airport code for
* @returns {string} - The airport code (e.g. "LAX" or "LHR")
*/
type AirportSearchWorker = (rough_name_or_location: string) => Promise<string>;
type Flight = {
origin: string;
destination: string;
date: Date;
price: number;
airline: string;
};
/**
* Get the price of a flight
* @param {string} origin_code - The 3 character airport code of the origin airport
* @param {string} destination_code - The 3 character airport code of the destination airport
* @param {Date} dateStart - The start date of the flight
* @param {Date} dateEnd - The end date of the flight
* @returns {Flight[]} - An array of possible flights
*/
type FlightPriceWorker = (origin_code: string, destination_code: string, dateStart: Date, dateEnd: Date) => Promise<Flight[]>;
/**
* This returns the final output to the user in a message
* You must call this function at the end of your code
* @param {any[]} inputs - Any inputs from previous functions
* @param {string} [extra_instructions] - Any extra instructions or information for the worker (optional)
*/
type FinalOutput = (inputs: any[], extra_instructions?: string) => void;
thinking
Wouldn't it make more sense to reorganize the response?
{
query: "Find out the names of the iPad in the 2024 lineup",
steps: [
{ //step 1 },
{ //step 2 }
]
}
And have them be a tree of nodes?
this is 464 tokens
big json was 780 ish
I mean is that not what this is already
The model takes in the query and has to return an organised list of steps
You have a list of steps, and dependencies to group them.
Instead of just a tree.
Yeah but we want steps to be able to run in parallel too
otherwise some queries are gonna take like 5 minutes
Yep, they do run in parallel.
You just loop through steps, and steps can have steps etc.
Yeah but say a query like
make a table comparing the whole current 2024 ipad lineup in price and specs
how would you approach it (asking)
Step 1 produces a task and adds it to the message queue. In this case it could be to make a search and you use the sources in your output.
Then when the task is done, the next step is triggered. In this case, it would pass the data into a table maker, to make the table you wanted.
But it's a simple producer consumer model, where the tools consume the tasks as they come in, for maximum load efficiency.
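A bare-bones sketch of that producer/consumer setup using `asyncio.Queue`. The task tuples, worker names and the `asyncio.sleep(0)` stand-in for a real tool call are assumptions; a real system would likely use a proper message queue.

```python
import asyncio

async def worker(name, queue, results):
    """Consume tasks as they arrive; each task is a (tool, query) pair."""
    while True:
        tool, query = await queue.get()
        await asyncio.sleep(0)  # stand-in for the real tool call
        results.append((name, tool, query))
        queue.task_done()

async def run(tasks, n_workers=2):
    queue, results = asyncio.Queue(), []
    workers = [asyncio.create_task(worker(f"w{i}", queue, results)) for i in range(n_workers)]
    for task in tasks:  # producer side: push tasks onto the queue
        queue.put_nowait(task)
    await queue.join()  # wait until every task has been processed
    for w in workers:
        w.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return results

results = asyncio.run(run([("search", "2024 ipad lineup"), ("table", "compare ipads")]))
```

Note this sketch does not enforce ordering between the search and the table step; triggering a follow-up task when an earlier one finishes would sit on top of it.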
But how would you deal with fetching each ipads own data
We would have to first teach the decision model in prompting or shotting or finetuning to break down something like that into multiple queries
where it first finds out the ipads then researches each individually
you also wouldnt search for all the ipads' specs at once right?
That would depend if you can find them all in one place.
If you start splitting your tasks up too much, it's gonna take longer and cost more too.
hmm, true
```ts
let search_worker: SearchWorker<string[]>;
const ipad_models = await search_worker("list of 2024 ipad models");
const specs_worker: SearchWorker<{[key: string]: any}[]> = async (model: string) => {
  const specs = await search_worker(`specs of ${model} ipad`);
  const price = await search_worker(`price of ${model} ipad`);
  return [{model, ...specs}, {model, price}];
}
const ipad_specs = await Promise.all(ipad_models.map(specs_worker));
const table = ipad_specs.reduce((acc, curr) => acc.concat(curr), []);
final_output([table], "The table contains the model name as keys, with specs and price as values");
```
sonnet turned that query into this
which works but yeah i see your point, this would use like 10 google queries...
which might be a bit stupid when perplexity can already do it from existing sources
https://www.perplexity.ai/search/make-a-table-sMQdUCkEThWdpBftzut4Cw#0
though i had this query
"price comparison - claude 3 models, gpt 4 turbo, gpt4o, gemini 1.5 pro and flash - use unit million tokens and give it in a table"
idk pplx like 90% gets it
hmm
Yep, you want to create the smallest amount of steps, to get the answer you want.
well the problem is how do we know when the initial sources arent enough
that we have to fallback to searching individually
Is this enough data to solve the task? If not, create the queries needed to get the information you need.
need a good example of a complex query that pplx cant do
hm
You probably also want to add a context timeout, to maximise the time it has to solve any task. Otherwise it could carry on for ever...
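A minimal sketch of a per-task time budget with `asyncio.wait_for`. The `solve` coroutine, the delays and the fallback message are placeholders.

```python
import asyncio

async def solve(query, delay):
    await asyncio.sleep(delay)  # stand-in for searches and model calls
    return f"answer to {query}"

async def solve_with_budget(query, delay, budget_s):
    """Give the whole task a time budget and fall back instead of running forever."""
    try:
        return await asyncio.wait_for(solve(query, delay), timeout=budget_s)
    except asyncio.TimeoutError:
        return "partial answer: time budget exceeded"

fast = asyncio.run(solve_with_budget("btc price", 0.0, 0.5))
slow = asyncio.run(solve_with_budget("huge research task", 1.0, 0.05))
```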
so rethinking based on your ideas
adopt something like the perplexity model, where im thinking the first model outlines a list of searches and maybe steps
so it could be something like:
- search thing a
- search thing b
- use code interpreter
but not like some complex thing like im thinking
then the worker on task "search thing a" can also check if there's anything missing, and go search that independently if needed
because yeah, i think my approach would go through the roof in model costs, search costs and query time: i might reach the same end goal with 100% of the information, but it would be too slow
Yep, make it simple and reduce the number of steps, if possible.
Also save the output and embed it, so you can use it as a source for people who ask the same question...
you mentioned subtasks right
Yep, basically a node tree of tasks.
im thinking
maybe a task, like say "find out price of iphone 15 lineup" could be the task goal
and then thats a container for maybe a websearch (and any further if missing info)
so then later on, we could use embedding similarity to compare the goals
Yep, more abstract goals.
maybe if theyre like 95% similar then we can use the same cached query
Yep, or give them that result, and they can click a button to do a fresh search if needed.
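The goal cache could be sketched like this: store each finished goal's embedding with its result, and reuse the result when a new goal is at least 95% similar. The `GoalCache` class and the toy 2-d vectors are assumptions; real goal embeddings would come from the embedding model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class GoalCache:
    def __init__(self, threshold=0.95):
        self.threshold = threshold
        self.entries = []  # (goal embedding, cached result)

    def put(self, embedding, result):
        self.entries.append((embedding, result))

    def get(self, embedding):
        """Return the cached result of the most similar goal, or None on a miss."""
        best = max(self.entries, key=lambda e: cosine(embedding, e[0]), default=None)
        if best is not None and cosine(embedding, best[0]) >= self.threshold:
            return best[1]
        return None

cache = GoalCache()
cache.put([1.0, 0.0], "cached: iphone 15 lineup prices")
hit = cache.get([0.99, 0.05])   # nearly the same goal
miss = cache.get([0.0, 1.0])    # unrelated goal
```

A hit could still be shown with a "fresh search" button, as suggested above, since the cached answer may be stale.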
so how to prompt the initial model now is the question
asking for a list of goals, that can have tasks as children?
Yep, but you probably wanna make a few manual examples.
Yep, improves it a lot.
My infrastructure works slightly differently, but that's because I'm treating the streamed model messages like websockets.
i mean right now i dont have any infrastructure
i have a tech demo for a mini perplexity in python
but im going to completely rewrite it to use a socketio based backend on modal cloud, which connects to embeddings gpu, and models and then frontend in something
You could also try moving some of the load to the client.
Such as the web searching/requests.
that wouldnt work at scale
+huge cors issue
my mindset building this is making it like a perplexity-type product/site, not a product just for me
obviously it might never succeed but
fun to try
So server side everything?
Yeah basically
You probably wanna calculate your cost per query then.
To see what parts are best self hosted, or using an API.
Hello Perplexity (World)
Have you seen the Muffin Man?
My little CLI demo per query is around
$0.005 for running the cloud gpu for embeddings
+
$0.00075 input haiku + $0.000625 output haiku
I am brand new, and I have no idea what the difference is between the two modes. I can't even see where to switch modes?
$0.006375 per query
its this button called focus when you're making the query
You change it in the pre search page.
ok with gemma added on
(0.10/1000000)*600 input+output gemma
+
$0.005 ish for running the cloud gpu for embeddings
+
(0.25/1000000)*3000 input haiku + (1.25/1000000)*500 output haiku
= total 0.00006 + 0.005 + 0.00075 + 0.000625
= ~$0.006435/query (155 queries per $1)
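The same arithmetic as a quick script, using the per-million-token prices and token counts quoted above (the $0.005 GPU figure is the rough estimate from the chat, not a measured value).

```python
def token_cost(price_per_mtok, tokens):
    """Cost in dollars for a number of tokens at a price per million tokens."""
    return price_per_mtok / 1_000_000 * tokens

gemma = token_cost(0.10, 600)        # input + output gemma
embeddings_gpu = 0.005               # rough cloud GPU cost per query
haiku_in = token_cost(0.25, 3000)    # haiku input
haiku_out = token_cost(1.25, 500)    # haiku output

per_query = gemma + embeddings_gpu + haiku_in + haiku_out
queries_per_dollar = int(1 / per_query)
print(f"${per_query:.6f}/query, about {queries_per_dollar} queries per $1")
```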
Oh, you are using the gemma model?
Trying to find a little help in using perplexity with financial portfolio management. I wish to input my own data not really for search the web so much. I was hoping I could find like minded pro users.
for my new thing im probably going to use haiku as the first model too
well lets just put my estimates at what 3x higher cost
I tried the writing mode - the answer is now worse. Before, at least, it suggested the last but one release. Now it doesn't know at all.
$0.019305 per query?
Yes, because the model isn't using search, so its knowledge only goes up to its last cut-off date.
But it didn't know when i used search either
well tldr, you got a query where the model failed to find any reliable sources
but someone else asked same question - and they got the right answer?
wait let me try with a model that itself isnt the latest version of chatgpt
I am debating whether to move to perplexity, so just trying to work out if it is worth it. So far it seems to struggle with the first question I asked. My question is, is the paid version better?
They definitely have their own index
I cant seem to index openai.com since its a react app or something
wow
Use a headless browser, or cached version...
Headless browser is sending the speeds out the window
Is it planned in the roadmap to have a more dedicated approach to files?? Like uploading some files (not only PDFs) and have several threads be on that + Threads concerning those files???
Yep, but if you are crawling a site, it's probably the best option.
I find uploading to be the most uncared for feature in perplexity
Maybe your headers
Yeah they kinda just stuck a summariser on the documents most of the time and called it a day
(Or at least ive experienced)
has there been mention of more work on that?
You would have to ask the pp team
which channel?
#ideas or something
ty!
should i remove the details on the specific output formats
since i think we could just run a small model in the middle anyway to handle it
i think it will always output a string of some sort so like?
idk
Yep, it uses a lot of tokens
Better to write a schema at the start, to use less tokens.
Is that groq UI?
yeah
trying on 8b/70b now
User Query: "plan me a trip to japan somewhere in the month of may 2024 from london"
llama3-70b
```json
{
  "query": "plan me a trip to japan somewhere in the month of may 2024 from london",
  "goals": {
    "Find flights from London to Japan in May 2024": [
      {
        "worker": "flight_price",
        "inputs": {
          "takeoff_location": "LON",
          "landing_location": "NRT or HND", // Tokyo airports
          "day": "May 2024"
        }
      }
    ],
    "Find airport codes for Japan": [
      {"worker": "airport_search", "inputs": {"location": "Japan"}}
    ],
    "Find weather forecast for Japan in May 2024": [
      {"worker": "weather_worker", "inputs": {"location": "Japan"}}
    ],
    "Plan itinerary for Japan trip": [
      {
        "worker": "web_worker",
        "inputs": {"queries": ["Japan travel guide", "things to do in Japan in May"]}
      }
    ]
  }
}
```
i think this might be possible but we would need like 50 queries of data to finetune it with
we're trying to get it to connect the dots when it has no clue how
whats the ideal response
maybe like this
```json
{
  "query": "plan me a trip to japan somewhere in the month of may 2024 from london",
  "goals": {
    "Plan itinerary for Japan trip": [
      {
        "worker": "web_worker",
        "inputs": {
          "queries": [
            "japan travel may 2024",
            "best places to visit in japan may",
            "japan travel packages may 2024",
            "japan festivals events may"
          ]
        }
      }
    ],
    "Find flights from London to Japan in May 2024": [
      {"worker": "airport_search", "inputs": {"location": "Japan"}},
      {
        "worker": "flight_price",
        "inputs": {
          "takeoff_location": "LON",
          "landing_location": "<found airports>",
          "day": "May 2024"
        }
      }
    ],
    "Find weather forecast for Japan in May 2024": [
      {"worker": "weather_worker", "inputs": {"location": "Japan"}}
    ]
  }
}
```
ok here's an instant problem...
```json
{
  "query": "find the weights of the iphone 15 pro, and the ip15 pro leather case, and then add them up",
  "goals": {
    "Find the weight of iPhone 15 Pro": [
      {
        "worker": "web_worker",
        "inputs": {"queries": ["iPhone 15 Pro weight", "iPhone 15 Pro specs"]}
      }
    ],
    "Find the weight of iPhone 15 Pro Leather Case": [
      {
        "worker": "web_worker",
        "inputs": {"queries": ["iPhone 15 Pro Leather Case weight", "iPhone 15 Pro case specs"]}
      }
    ],
    "Add the weights together": [
      {
        "worker": "calculator_worker",
        "inputs": {
          "num1": "<weight of iPhone 15 Pro>",
          "num2": "<weight of iPhone 15 Pro Leather Case>",
          "operation": "add"
        }
      }
    ]
  }
}
```
all cool but calculator_worker doesnt exist (yet?)
also surely this last step has to be dependent? @agile jay
Yep, it just waits for the other ones to return their outputs.
Okay but how do we handle hallucinations
Hallucinations will always happen, so your best bet is to put things in context, to reduce the entropy to valid answers.
this is honestly not bad
for a 7b
output from haiku
```json
{
  "query": "is the ipad pro m4 worth it? whats new and how much is it? should i upgrade from my 2018 ipad",
  "goals": {
    "Summarize the new iPad Pro M4": [
      {
        "worker": "web_worker",
        "inputs": {
          "queries": [
            "ipad pro m4 new features",
            "ipad pro m4 specs",
            "ipad pro m4 price"
          ]
        }
      }
    ],
    "Evaluate whether to upgrade from 2018 iPad": [
      {
        "worker": "web_worker",
        "inputs": {
          "queries": [
            "should i upgrade from 2018 ipad to ipad pro m4",
            "is ipad pro m4 worth it over 2018 ipad"
          ]
        }
      }
    ]
  }
}
```
very very good i think
```json
{
  "query": "how many rocks we should eat for a healthy diet",
  "goals": {
    "Determine if eating rocks is healthy": [
      {
        "worker": "web_worker",
        "inputs": {
          "queries": [
            "is eating rocks healthy",
            "can humans eat rocks",
            "nutritional value of rocks"
          ]
        }
      }
    ],
    "Provide a recommendation on the number of rocks to eat for a healthy diet": [
      {
        "worker": "web_worker",
        "inputs": {
          "queries": [
            "how many rocks should you eat per day",
            "recommended amount of rocks to eat for health"
          ]
        }
      }
    ]
  }
}
```
Yep, now just need to parse it reliably, and then run them.
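A rough sketch of the parsing step: pull the JSON object out of whatever the model returned (chatter before it, stray fences after it) and flag worker names that do not exist, such as the hallucinated `calculator_worker`. The `parse_plan` helper is an assumption, and the known-worker set is the tool list from earlier in the chat. Outputs containing `//` comments would still fail `json.loads` and need extra cleanup.

```python
import json

KNOWN_WORKERS = {
    "web_worker", "spotify_worker", "weather_worker",
    "airport_search", "flight_price", "final_output",
}

def parse_plan(raw):
    """Extract the JSON object from a model reply and list unknown worker names."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object in model output")
    plan = json.loads(raw[start:end + 1])
    unknown = sorted({
        task["worker"]
        for tasks in plan["goals"].values()
        for task in tasks
        if task["worker"] not in KNOWN_WORKERS
    })
    return plan, unknown

# Made-up model reply with leading chatter and a trailing fence
raw = 'Sure! {"query": "add weights", "goals": {"Add": [{"worker": "calculator_worker", "inputs": {}}]}}```'
plan, unknown = parse_plan(raw)
```

When `unknown` is non-empty the runner could retry the model call or drop those tasks, rather than crash halfway through a plan.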
even claude is hallucinating sometimes though
mainly when theres some form of processing involved
We could make it so the "output" to the user is like a step too
so that the model is re-ensured not to start hallucinating its own worker names
Yep, smaller models are more likely to hallucinate.
this is haiku
Haiku is small
both haiku and 7b,8b,70b show the same things
oh yeah
right*
Alan's estimate for Claude 3 Opus: 2T parameters trained on 40T tokens. 3 models sizes: Haiku (~20B), Sonnet (~70B), and Opus (~2T). 18 Mar 2024
says google
ok wow haiku is a 20b?
No, that's an estimate.
well yeah
but even if thats an estimate
haiku is such a good model for it being so small lmao
Not really, it's a pretty dumb model.
I've asked it to do stuff with large files, and it fails quite a lot on basic tasks.
I think a llama 70B finetune is probably the best bet with speed to performance ratio.
Not bad
I expect the response to maybe be better than perplexity
since i doubt perplexity would be providing "detailed" weather
should i start implementing the actual running of this?
Yep, only you need to pass in location info though, if you want a decent answer.
Yep, so you can see how stable the generations are.
given this response, the code wouldnt really know what to do with <found destinations> really
but we'll see
maybe can implement a little model call in between
Do you have any rough ideas for how i could lay it out?
I tried to before but it didnt go that well
i can send you this whole conversation (in exported code) if it helps
First come up with multiple examples of different kinds of questions. This is for two things, one for multi shot learning, and two, so you can create a schema which works well for all of those cases.
In your case, you are just doing a list of goals. Really the goals should be more hierarchical, so that they solve the dependency problem automatically. I can show more examples after you've come up with the different styled questions/tasks.
Hello
Hi magnus
Can you give me some examples of your thinking?
Maybe for these queries (both complex and simple):
* find the weight of an iphone 15 and iphone 15 pro, and to each add the weight of their silicone cases
* is the ipad pro m4 worth it? whats new and how much is it? should i upgrade
* price of btc and eth in gbp
just outline it i guess, obviously not expecting a full json or whatever
Something new is about to come
First, lay out the different kinds of tasks.
- Retrieval: web search, document search, user info
- Mutation: turn data into table, html into markdown, add two numbers, etc
- Task groups, group up similar tasks and wrap them in a parent.
Find the weight of an iPhone 15 and 15 pro, and add them together.
{
intent: "Get the weight of the iPhone 15 and 15 pro.",
steps: [
{ tool: "search", query: "weight of iPhone 15"},
{ tool: "search", query: "weight of iPhone 15 pro"},
]
},
{
intent: "Add their weights together",
steps: [
{ tool: "calculator", query: "--add {0} {1}"}
]
},
And all the steps just happen at the same time. If you need to wait for something, put it as the next task.
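A small sketch of running that layout: groups run one after another, steps inside a group run together, and `{0}`, `{1}` get filled from the previous group's results. The `run_tool` stand-in and the weights in it are placeholder values, not real specs.

```python
import asyncio

# Placeholder search results (made-up numbers, in grams)
FAKE_SEARCH = {"weight of iPhone 15": 100.0, "weight of iPhone 15 pro": 200.0}

async def run_tool(tool, query):
    """Stand-in for real tools; the calculator only understands '--add a b'."""
    if tool == "calculator":
        _, a, b = query.split()
        return float(a) + float(b)
    return FAKE_SEARCH[query]

async def run_groups(groups):
    previous = []
    for group in groups:  # groups run one after another
        calls = [run_tool(s["tool"], s["query"].format(*previous)) for s in group["steps"]]
        previous = await asyncio.gather(*calls)  # steps inside a group run together
    return previous

groups = [
    {"intent": "Get the weight of the iPhone 15 and 15 pro.", "steps": [
        {"tool": "search", "query": "weight of iPhone 15"},
        {"tool": "search", "query": "weight of iPhone 15 pro"},
    ]},
    {"intent": "Add their weights together", "steps": [
        {"tool": "calculator", "query": "--add {0} {1}"},
    ]},
]
result = asyncio.run(run_groups(groups))
```

The dependency problem is handled by ordering alone here: anything that needs earlier output just goes in a later group.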
this is so ironically similar to my tool @agile jay
lmao
i could just turn up the amount of sources and get a similar thing surely
Yep, the difference would be in the parser implementation, can't write too much out on my phone.
i mean the twitter thing
hm, okay
but surely we want "turn data into a table" to just be handled by the model at the end (sonnet, opus, 4o)
ok but how is he accurately pulling 500 articles
Embeddings maybe?
i mean im doing that
i mean in the previous stage
where does he even find
500 articles
i guess i could use a chromium browser to first hit all links possible
then i could go and hit every single sublink
I don't know the details, but it indexes around 200k articles per day from all over the world
Web Crawling
Wonder how long it takes to crawl a site.
lets see what happens
half of it is cant reads since this is http scraping only, no browser
```
Some key points I can gather from the context:
- There have been advancements in areas like natural language processing, machine learning for conservation efforts, AI in healthcare, and AI in cybersecurity.
- There are concerns around the risks and ethical considerations of AI, leading to discussions around AI regulation and governance.
- AI is rapidly evolving and transforming many industries, with new applications and use cases emerging regularly.
- Keeping up with the latest AI news and developments is important for professionals and the general public to understand the implications of this transformative technology.
However, without more specific details on the most current AI news and stories, I don't want to make any definitive claims. The context provided covers a broad landscape, but doesn't allow me to summarize the latest AI news in a comprehensive way. I'd suggest checking reputable technology news sources for the most up-to-date information on the latest AI developments and news.
```
well thats useless
a minute ago it gave me a way better response wth
Yep, you need to work around cloudflare and other stuff.
```
The key AI news and topics from the context include:
- Efforts to regulate and set safety standards for advanced AI systems, including a summit in Seoul and companies signing up to AI safety standards.
- Concerns raised by high-profile figures like Elon Musk and Scarlett Johansson about the risks and challenges posed by the rapid development of AI.
- The use of AI in elections, including allegations of AI-generated impersonations being used to spread misinformation.
- The continued growth and investment in major AI companies and startups like OpenAI, xAI, and Microsoft's AI-powered PCs.
- The impact of AI on sectors like cybersecurity, loneliness, and productivity.
- Debates around the ethics and responsible development of AI, including concerns about AI-generated content and the potential existential risks of advanced AI systems.
Overall, the news highlights the accelerating pace of AI innovation and the growing recognition of both the benefits and risks associated with this rapidly evolving technology. The context suggests there is an active dialogue and efforts underway to try to shape the responsible development and deployment of AI systems.
```
this is like 25% better
Too many chefs spoil the broth.
yeah but then what the hell is this
thats the whole franchise
Fake news.
How many items can the AI focus at once?
ok one way i can think of it is that he indexes the stories himself
After a certain point more sources doesn't make it better.
then he has llama or a model look at it and extract the main points from every source
then when queries are made he uses these summarised snippets of the world's current stories
but that would only work for news
uh isnt that alot? is this like something happening in real-time or just one-time func, to save articles or smth like that
he would have to be scraping all news sites beforehand
like polling rss feeds and google news or something, and having a cloud function constantly ingesting and vectorising/summarising/whatever
"each reponse is based on around ~500 articles and the more you ask the more articles are incorporated"
so basically its stored and then gets called if it fits the user query
yeah...
if not it searches new articles and adds it to the polling
why does it feel like theres a new competitor to whatever thing im building every day
such a funny industry
That's life
At the end of the day the sources don't matter. What matters is that the user gets what they wanted...
yeah, i guess
i mean the moment perplexity was out many people tried to get something similar
tbh this seems like a fun project to do : https://x.com/Charles12509909/status/1794630406064795909
i dont think its that hard to do
yeah but then what will you really do with this
whats gpt4 going to do after being able to click things
Be me on discord....
idk
idk maybe do something like devin but locally
its good but its kinda useless when you think about the big picture
goes back to chatgpt then to your ide, and so on until it reaches the wanted result
yeah but thats gonna suffer the same fate as devin
- doesnt do much, or takes months to do it
- expenses out the roof
At that point just give the thing access to your filesystem
a 50% and we call it a win
and do it via that instead
It can watch corn for you, so you can save time.
Some fast advice needed. I am giving a talk tomorrow (just 45 minutes) on AI to warm folks up before the annual dinner of my non-tech professional association. Although I can obviously do without slides (this is a talk, not a webinar), I have decided to investigate the idea at the last moment of having a few graphic slides to explain some of the concepts I want to explain to these lay people (I am a lay person as well) and for a little relief for them from looking at me. Just a key word or two and an image on each slide (it probably shouldn't be a true picture) would do the trick. What is my fastest, most accessible and best choice given the lateness of the hour, as it were? I will be thankful for any responses.
I should have made it plain I am looking for something I can make with Perplexity
I mean you could ask perplexity, maybe generate some ai images
related visuals that could please their eyes
I have not done it before. Can you point me to some instructions?
Tell it to make a description of the image you want, and an image button will appear at the top right of the response.
Maybe first take the list of concepts youre talking about, and ask it to generate ideas for images for each
then on the top right as code says, you can use the image thing with a custom prompt, just putting in the idea for the image
Thanks!
ok im working on implementing this
but now i think its time to go back to the drawing board with these website embeddings for a second
https://www.perplexity.ai/page/Internal-Mechanics-of-h515ZojmQFiTCxRHB3OEyw > this now has a purpose for being made. nice.
Large Language Models (LLMs) such as GPT-4 and BERT represent the cutting edge of natural language processing, utilizing deep learning architectures based on transformers to predict and generate human-like text. Internally, these models operate through a complex interplay of neural network layers, where each layer processes input data sequential...
as a "demo" load the AI Profile with a specific prompt and show the output with and without it, to show the power of System Prompting.
@agile jay @warm cave fixed the scraper fail rate from like 50% to like 10% lol
What was the main cause?
Are we still limited to like 20 tokens daily for Claude?
Currently limit is 50/day
What happens if you're in a thread that is running Claude? Will it simply error out or switch to a different LLM?
Once you run out?
is that opus, or all claude?
It switches to Sonnet
Claude 3 Opus is 50 per day, Claude 3 Sonnet is 600 per day, and Claude 3 Haiku is ∞ per day
ah, that's good to know
and btw haiku is "Default"
yeah, I don't like haiku, but 600 sonnet feels like more than claude direct
so that's fine
Jailbreaks
I know, but it looks they are from before 2 May? Probably patched
I have a jailbreak that works for 3.5 from a year ago
You had Mahoraga adapt to it
Danis I think I don't remember
My kingdom for like... an integration in Discord for perplexity that summarizes the last messages since I was idle in this channel in a private message.
I'd say like... /catchmeup or something and perplex would send it privately
Pretty much nothing happened recently
I know, but wouldn't it be neat!
Yeah, would be cool
Has anyone noticed web searches are happening less?
I've been struggling to get it to show sources and stuff, it's mostly just using the offline chatgpt
Which is not useful at all
I have not, are you using "All" mode with "Pro" turned on?
Wait until it gains consciousness and starts demanding for ai rights, lol.
It's set to all
It was fine before though
I don't like "pro" because it forces me to answer questions it should have context on
Hi
You won't reach any limits, only for Pro Search
thx. how accurate you think perplexity is when it comes to mcq?
Not in the same realm as OpenAI/Anthropic, or they'd be screaming about it
wait is open ai the big name over chatgpt?
Yeah, OpenAI = ChatGPT.
Have to ask here as well since the Perplexity status webpage Discord link has expired: I'm getting 500 Internal Server Error for all of my Perplexity API utilizing applications right now. API status page shows nothing off? What's going on? (The implementations worked without issues previously.)
@halcyon coral wassup with the opus 50 limit?
you know y'all can just say that it is going to be permanent, right?
we get that its expensive
imo there's nothing worse than unstated limits (or falling back to lesser models without notice)
can anyone share some ai's that are becoming rlly popular and are very accurate at doing their jobs?
thats true
what are some good ai's that are becoming more popular and accurate?
idk perplexity is the really only big one
ive been working on my own agent-like research thing in this server
yea
how is it going so far
https://openai.com/index/openai-board-forms-safety-and-security-committee/
OpenAI has recently begun training its next frontier model and we anticipate the resulting systems to bring us to the next level of capabilities on our path to AGI. While we are proud to build and release models that are industry-leading on both capabilities and safety, we welcome a robust debate at this important moment.
A first task of the Safety and Security Committee will be to evaluate and further develop OpenAI's processes and safeguards over the next 90 days. At the conclusion of the 90 days, the Safety and Security Committee will share their recommendations with the full Board. Following the full Board's review, OpenAI will publicly share an update on adopted recommendations in a manner that is consistent with safety and security.
gpt5 is near
idk today im going to try to implement the agent like thing
do some actual work irl that isnt this project
then if i have time start work on a new backend with a ui
what idk how i'm going to handle is storing embeddings
ok i mean realistically do i actually need embeddings?
if i went onto the smaller goal architecture
could i not just limit each mini model to like 4 sources
and summarise?
idk
i don't think so
How long does perplexity store uploaded files for
i mean theres gotta be more no? i mean im using it for school a bit idk what else to use
that's free
is decent
chatgpts own website is ok-ish
pplx labs for not online models
30 days non-enterprise (can be used to improve pplx)
7 days on enterprise (not used to improve pplx)
What if retention is turned off
And where does it say this if you don't mind me asking
wdym own website? it just chatgpt 3 for me cuz free
well its limited
are there other free ones
to idk per hour
for like daily purposes/school
if you have like the smallest amount of code experience
you could go and install gpt4free on your machine
with open webui or something
i had that setup till i got pplx pro
pplx = perplexity?
yeah
its like ok ish
what about the gemini thing?
oh yeah
you can use gemini aistudio for 1.5 pro free
and gemini.google.com for free gemini generally
ok for school?
but its not the "best" according to some
k thx
better than nothing ig
yeah its prob fine
if youre doing sm like
finding quotes in a text
i ain't doing that
then this is pretty good
to upload the whole text
ya i'l try em all
and avoid image uploads when you need accuracy id say
not all of them are the best at that
you can get images on chatgpt.com
again, hourly limited though yeah
could make multiple accounts
time to work on this
this is looking promising already
well we've got more grounding than google....
wow...
that would be around
gpt4o $0.0075
groq llama3-70b $0.000689
sonnet $0.0057
opus $0.0285
(5/1000000)*900 + (15/1000000)*200
now i really dont think we need opus
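The four figures above can be reproduced with a small calculation. This is a sketch, not the author's script: the 900 input and 200 output tokens and the GPT-4o prices (5 and 15 USD per million tokens) come from the formula in the chat, while the other price pairs are assumptions chosen because they reproduce the quoted totals.

```python
# Prices in USD per million tokens as (input, output).
# Only the gpt4o pair appears in the chat; the other pairs are assumptions
# that happen to reproduce the totals quoted there.
PRICES = {
    "gpt4o": (5.00, 15.00),
    "groq llama3-70b": (0.59, 0.79),
    "sonnet": (3.00, 15.00),
    "opus": (15.00, 75.00),
}


def request_cost(model, input_tokens=900, output_tokens=200):
    """Cost in USD of one request at per-million-token prices."""
    price_in, price_out = PRICES[model]
    return (price_in * input_tokens + price_out * output_tokens) / 1_000_000


for name in PRICES:
    print(f"{name}: ${request_cost(name):.6f}")
```

On these numbers Opus costs roughly five times as much as Sonnet for the same request, which is the gap the comment about not needing Opus refers to.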
Hello! I'm using the Perplexity AI Companion for Chrome. I can't seem to use it on webpages, for example to summarize or ask a question related to the page's content. I get an error saying it cannot access the page content. Any tips on how to fix this? Thanks
I used it. Thanks.
Hey @orchid dust!
On which website does it not work?
How will perplexity be protected from the influence of disinformation that news corp or Rupert Murdoch will create?
I confirmed in you.com's discord channel
if you have their pro subscription then you get unlimited opus
I may just get that
how are they not broke lmao
so confused
they must have like unlimited vc money or something what
idk
even with a 50% discount it isnt possible to make that profitable
You.com, an AI-powered search engine founded by ex-Salesforce chief scientist Richard Socher, has closed a $25M funding round -- all equity.
idk
to keep more people with them... i think?
I mean that's like every ai company rn
yeah
plus that article is old?
that's true
or until opus gets cheaper or releases on aws
because most ai companies (like perplexity) id assume dont pay for api
they pay for provisioned throughput units on aws/azure
yeah fair
We are living in the generative artificial intelligence (AI) era; a time of rapid innovation. When Anthropic announced its Claude 3 foundation models (FMs) on March 4, we made Claude 3 Sonnet, a model balanced between skills and speed, available on Amazon Bedrock the same day. On March 13, we launched the Claude 3 Haiku […]
idk??
The way it works is that they let u use it but then there are cool down periods that can last usually 10-15 mins. That's what happened to me when I used it. Also they definitely limit the output to make it shorter so it's not that detailed. Imo it's better than perplexitys limits for opus but not the best
All sites I try. Also doesn't work on youtube. Would be easy to access transcripts and use that text to analyse.
gm gm
It doesn't work on paid websites/websites with login.
@agile jay @warm cave table made by intent/agent format
seems to have got a few things wrong
there's no embeddings here, back to source feeding
Is that just making a csv?
no i put it into the excel
so its readable
it was markdown in my terminal
Get the prices and specs of the iPhone 15 lineup
and then it had 2 searches
Why not just use a markdown editor then?
so the issue was it didnt really search the prices individually
Like obsidian
You could make it first come up with the headers of the table.
it searched for iphone 15 and not 15 pro, so it sort of guessed the prices for the 15 pros?
Likely
What steps did it generate?
also it guessed the cpu for the pro models
maybe because the main source from apple.com got cut off...
let me see if i can first swap to searxng
Possibly, you could also make it use other sources, since they can likely be found on stuff like GSMArena.
I mean yeah but i cant sit and finetune each query
Who knows what an actual user would ask, right?
Currently i dont have any form of pageranking
my guess is theres a chance its being shuffled somewhere due to the random search api im using
so first im going to try this
Yep, and see how the sources change
Also the apple site is pretty sh*t...
The html is such a mess.
where can i find promo codes
How do I see the history of all my threads? The place called "Library" does not show all my threads. Thanks!
it should!
if it's empty, try and reload the page once
Based on the search results, here are the prices of the iPhone 15:
* iPhone 15: starts at $799 (128GB), goes up to $1,549 (1TB)
* iPhone 15 Plus: starts at $899 (128GB), goes up to $1,549 (1TB)
* iPhone 15 Pro: starts at $999 (128GB), goes up to $1,749 (1TB)
* iPhone 15 Pro Max: starts at $1,199 (256GB), goes up to $1,799 (1TB)
Note that these prices may vary depending on the country, taxes, and availability. Additionally, prices may change over time due to promotions, sales, and other factors.
It's also important to note that these prices do not include any trade-in credits or promotions that may be available. If you're planning to purchase an iPhone 15, I recommend checking with your carrier or retailer for any available promotions and prices.
price issue fixed literally by changing to searxng
however i think this misinformation on the device's chip is due to the sources being cut off?
Are you just passing in the raw html?
no lol
i have the whole complex parser setup
i mean cut off because i was dividing the 8k context into 7 sources
which isnt exactly good
let me lower it to like 3
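A rough sketch of why 7 sources in an 8k context cuts pages off, and what dropping to 3 buys. The 8192-token window and the 4-characters-per-token estimate are assumptions for illustration (the logs further down show roughly 4.4 characters per token), and `per_source_budget` and `truncate_to_budget` are hypothetical helper names.

```python
def per_source_budget(context_tokens, num_sources, reserved_tokens=0):
    """Tokens each source may use when the context is split evenly."""
    return (context_tokens - reserved_tokens) // num_sources


def truncate_to_budget(text, budget_tokens, chars_per_token=4):
    """Cut text to an approximate token budget using a character estimate."""
    max_chars = budget_tokens * chars_per_token
    return text if len(text) <= max_chars else text[:max_chars]


print(per_source_budget(8192, 7))  # 1170 tokens per source
print(per_source_budget(8192, 3))  # 2730 tokens per source
```

With 7 sources, a page of about 4,000 tokens loses most of its content; with 3 sources, more than half of it fits.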
Almost there, then.
i first
- made it so i treat same links with regional duplication like en-us (like apple does) as the same
- then resorts them back to og search engine order
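The two steps above might look something like this. It is a minimal sketch under stated assumptions: the locale pattern only covers `xx-yy` path segments such as `en-us`, and `canonical_key` and `dedupe_regional` are hypothetical names, not the author's code.

```python
import re
from urllib.parse import urlsplit

# Assumption: a regional copy differs only by a leading locale segment like /en-us/.
LOCALE_SEGMENT = re.compile(r"^/[a-z]{2}-[a-z]{2}(?=/|$)", re.IGNORECASE)


def canonical_key(url):
    """Key that is identical for regional copies of the same page."""
    parts = urlsplit(url)
    path = LOCALE_SEGMENT.sub("", parts.path) or "/"
    host = parts.netloc.lower().removeprefix("www.")
    return (host, path.rstrip("/") or "/")


def dedupe_regional(urls):
    """Drop regional duplicates while keeping the search engine's order."""
    seen = set()
    unique = []
    for url in urls:  # iterate in the original ranking order
        key = canonical_key(url)
        if key not in seen:
            seen.add(key)
            unique.append(url)
    return unique


links = [
    "https://www.apple.com/iphone-15/",
    "https://www.apple.com/en-us/iphone-15/",
    "https://www.phonearena.com/iphone-15-release-date-price-features",
]
print(dedupe_regional(links))
```

Because the loop walks the list in ranking order and keeps the first copy it sees, no separate re-sort step is needed in this version.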
price agent:
[1.03s] Searched for "iPhone 15 lineup prices"@SearXNG - got 10 links, 0 snippets
[SearchAgent] [0.00s] Found 3 links total
[0.35s] [17609 chars / 4027 tokens] Read source https://www.apple.com/iphone-15/
[0.35s] [20413 chars / 5026 tokens] Read source https://www.phonearena.com/iphone-15-release-date-price-features
[1.81s] [437 chars / 112 tokens] Read source https://mobilekishop.net/blog/iphone-15-series-global-pricing-unveiled/
specs agent:
[SearchAgent] [0.00s] Found 3 links total
[0.37s] [22946 chars / 6085 tokens] Read source https://www.apple.com/lae/iphone-15-pro/specs/
[0.36s] [14005 chars / 3325 tokens] Read source https://www.tomsguide.com/news/iphone-15
[0.44s] [7354 chars / 2101 tokens] Read source https://www.cnet.com/tech/mobile/iphone-15-series-compared-which-model-suits-you/
ok idfk why it did it again there...
realistically i could do what you said and not rely on apple.com
oh it did "lineup specs" as the search instead of just specs??
pplx haiku pro gets it wrong though...
Yep, haiku hallucinates a lot more.
What sources are you using?
Here is a table summarizing the price and key specs of the iPhone 15 lineup based on the provided search results:
| Model | iPhone 15 | iPhone 15 Plus | iPhone 15 Pro | iPhone 15 Pro Max |
| --- | --- | --- | --- | --- |
| Price | From $799 | From $899 | From $999 | From $1199 |
| Storage | 128GB, 256GB, 512GB | 128GB, 256GB, 512GB | 128GB, 256GB, 512GB, 1TB | 256GB, 512GB, 1TB |
| Display | 6.1" OLED, 60Hz | 6.7" OLED,... | | |
opus
"a17 bionic"
i mean i could just tell it to find the details on gsmarena
but again isnt that overfitting my solution to this one query
well lets see what happens if i add that as one example in the n-shot of the initial intent-router prompt
and this time it decided to work
since it made more searches
[1.25s] Searched for "iPhone 15 Pro price"@SearXNG - got 10 links, 0 snippets
[1.26s] Searched for "iPhone 15 Pro Max price"@SearXNG - got 19 links, 0 snippets
[0.00s] Searched for "iPhone 15 specs"@SearXNG - got 10 links, 0 snippets
[3.35s] Searched for "iPhone 15 Pro specs"@SearXNG - got 10 links, 0 snippets
[3.34s] Searched for "iPhone 15 Pro Max specs"@SearXNG - got 10 links, 0 snippets
[FinalAnswerAgent] [4.10s] Model response time (claude-3-haiku)
so it's only done that once
idk honestly...
ok no way...
is that a response BETTER than perplexity???
@agile jay remember this yesterday
oh my god
Day 1: Arrive in Tokyo
- Arrive at Haneda Airport and take the train to your hotel in the Asakusa district
- Explore the Senso-ji Temple and the surrounding historic streets
Day 2: Tokyo
- Visit the Tsukiji Fish Market for a sushi-making experience
- Explore the Imperial Palace East Garden
- Enjoy a traditional Japanese dinner and drinks in Asakusa
Day 3: Tokyo
- Go up the Tokyo Skytree for panoramic city views
- Stroll through the beautiful Hamarikyu Gardens
- Spend time in the trendy Harajuku district
Day 4: Tokyo to Kanazawa
- Take the bullet train to Kanazawa (approx 2.5 hours)
- Visit the stunning Kenrokuen Garden
- Explore the historic Higashi Chaya district
Day 5: Kanazawa
- Tour Kanazawa Castle
- Experience a traditional tea ceremony
- Wander through the charming Nagamachi samurai district
Day 6: Kanazawa to Kyoto
- Take the bullet train to Kyoto (approx 2 hours)
- See the iconic Fushimi Inari Shrine
- Visit the beautiful Kiyomizu-dera Temple
Day 7: Kyoto
- Admire the stunning Kinkaku-ji (Golden Pavilion)
- Explore the Gion geisha district
- Enjoy a traditional Kyoto-style dinner
For flights, I found round-trip options from London to Tokyo in May 2024 starting around £955. Some top airline choices include British Airways, Japan Airlines, and KLM.
As for accommodations, there are many great hotel options in Tokyo, Kanazawa, and Kyoto to fit various budgets. Some highly rated picks include Andaz Tokyo Toranomon Hills, Park Hyatt Kyoto, and Kanazawa Tokyu Hotel.
Let me know if you need any other details or have additional requests as you plan your Japan trip!
@agile jay @warm cave @cinder comet look!!!!!! this is actually crazy good what
I'll take a look
Yeah, that looks amazing!
Nice, just from the change in the step system?
Which model was used, sonnet?
70b = router
8b = search source summary agents
haiku = final response
let me send the full log
for some reason this python parallelism isnt working so right now it took like 100 seconds
other requests take like 12 though
theres no embeddings involved
i went for a method where it tries to hit 6 links, but then cuts it to 3, since sometimes there are like 3 dead links
sometimes there are 6 alive
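One way to express "try 6, keep 3" is to walk the candidates in order and stop once enough have loaded. This is a sketch only: `pick_alive` and the injected `fetch` callable are hypothetical, and `fake_fetch` stands in for a real scraper.

```python
def pick_alive(links, fetch, want=3, max_tries=6):
    """Try up to max_tries links in order and keep the first `want` that load."""
    pages = []
    for url in links[:max_tries]:
        try:
            text = fetch(url)
        except Exception:
            continue  # dead link: move on to the next candidate
        if text:
            pages.append((url, text))
        if len(pages) == want:
            break
    return pages


def fake_fetch(url):
    """Stand-in scraper: any URL containing 'dead' fails."""
    if "dead" in url:
        raise IOError("unreachable")
    return "content of " + url


candidates = ["a", "dead-1", "b", "dead-2", "c", "d"]
alive = pick_alive(candidates, fake_fetch)
print([url for url, _ in alive])
```

Stopping early means a run where all six links are alive still only reads three pages, which keeps the context size predictable.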
Yep, just the 8B checking the sources, right?
Nice
Because that's what I generally use.
8b summarises the 3 sources into the main info of that intent
and then haiku takes that in for the main response
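The three-stage flow described here (70b router, 8b source summarisers, haiku final response) can be sketched with injected callables. Every function name below is hypothetical and the lambdas are stubs, so this shows the shape of the chain rather than the author's implementation.

```python
def answer(query, route, search, summarise, finalise, max_sources=3):
    """Router -> per-intent source summaries -> final response."""
    intents = route(query)  # large model (llama3-70b in the chat) splits the query
    summaries = []
    for intent in intents:
        sources = search(intent)[:max_sources]  # keep only a few sources per intent
        summaries.append(summarise(intent, sources))  # small model (llama3-8b in the chat)
    return finalise(query, summaries)  # final model (claude-3-haiku in the chat)


# Stub callables so the sketch runs without any API keys.
demo = answer(
    "iphone 15 prices and specs",
    route=lambda q: ["iPhone 15 price", "iPhone 15 specs"],
    search=lambda intent: [f"{intent} source {i}" for i in range(6)],
    summarise=lambda intent, sources: f"{intent}: {len(sources)} sources",
    finalise=lambda q, summaries: " | ".join(summaries),
)
print(demo)
```

The final model only ever sees the short summaries, which is why the whole trip query could reach it in a few thousand tokens.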
Yep, 8B is great for managing context.
70b, 8b on groq, and haiku on ...haiku
Yep, well on anthropic.
yeah
lol
quite the unlucky search with 5 links dead
but the answer pulled through still
all in 9 seconds...
Now you just gotta fix the async code in Python.
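For reference, a minimal pattern for running the per-intent work concurrently with `asyncio.gather`. The `summarise_intent` coroutine is a hypothetical stand-in that just sleeps; it is not the code under discussion.

```python
import asyncio


async def summarise_intent(intent):
    """Stand-in for search + scrape + model call for one intent."""
    await asyncio.sleep(0.1)
    return f"summary of {intent}"


async def run_all(intents):
    # gather starts every coroutine at once and returns results in input order
    return await asyncio.gather(*(summarise_intent(i) for i in intents))


results = asyncio.run(run_all(["price", "specs", "flights"]))
print(results)
```

One common cause of async code that still runs serially (not confirmed to be the problem here) is a blocking call inside an `async def`, such as a synchronous HTTP client; wrapping that call in `asyncio.to_thread` lets the other tasks proceed.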
actually this might be due to searxng putting news onto a separate search type
pplx had 19 sources for it so hmm
ok but the whole japan trip one only sent 5k tokens to haiku
thats crazy good
5k, yeah that is
5k for all that info
pplx pro on 4o/opus level
and it even gave average flight price which i dont think ive seen pplx do (not sure though)
Yeah, great quality with lower cost models
Well... and implement websockets and a UI
Get the backend done first, lol.
well sockets would be the backend, no?
i need to refactor a decent amount of this to work as an actual app rather than a cli tool
How long is your system prompt now?
i think its shy of 1.5k
well
the system prompt is like 500
and then the examples 5-shot is like 1k
that's to llama3-70b
That's pretty good, and that should be cheap with Llama
well the prompt to the 8b models is basically full context
so lets say an average of like
3 searches? per query
some maybe none, some maybe like 6
Yeah, that ends up being good, but now imagine if you were running Opus
Glad Llama 3 is so good, llama 2 would not cut it
(0.59/1000000)*1500 + (0.05/1000000)*350 # groq/llama-3-70b router
+ ((0.05/1000000)*6656 + (0.05/1000000)*768)*4 # ~4x groq/llama-3-8b search summarisers
+ (0.25/1000000)*1750 + (1.25/1000000)*400 # claude-3-haiku final
let me add that up
$0.0033248 per query?
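The sum checks out. Here is the same arithmetic as a runnable snippet, with the rates and token counts copied from the chat as-is; the variable names are only for illustration.

```python
M = 1_000_000  # prices are quoted per million tokens

router = (0.59 * 1500 + 0.05 * 350) / M             # groq/llama-3-70b router
summarisers = 4 * (0.05 * 6656 + 0.05 * 768) / M    # ~4x groq/llama-3-8b summarisers
final = (0.25 * 1750 + 1.25 * 400) / M              # claude-3-haiku final response

total = router + summarisers + final
print(f"router      ${router:.7f}")
print(f"summarisers ${summarisers:.7f}")
print(f"final       ${final:.7f}")
print(f"total       ${total:.7f}")
```

The four 8b summariser calls make up a little under half of the total, so the number of searches per query is the main thing that moves the cost.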
And with pplx I would likely do 6 or more searches planning that same trip
so more like ~$0.0035 average
That's better than I expected
could be a bit more
could be way less
thats for like a decently beefy query
because not every query does 4 searches
one time it had the iphone query do 6 searches
but on avg it did like 2
285 queries per dollar
thats crazy...
5,700 queries per $20
190 per day
assuming each user somehow uses every single query
realistically not happening
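The budget figures follow from the ~$0.0035 average. A quick check, assuming a $20 budget per month and a 30-day month (the month length is an assumption, not stated in the chat):

```python
cost_per_query = 0.0035  # USD, the rough average from the chat

per_dollar = int(1 / cost_per_query)    # whole queries per dollar
per_twenty = per_dollar * 20            # queries for $20
per_day = per_twenty // 30              # spread over an assumed 30-day month

print(per_dollar, per_twenty, per_day)  # 285 5700 190
```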
That's great, sounds like you could easily provide high usage with good cost margins
ok since we dont have opus
say we offered sonnet and 4o as best models
sadly no vc money to throw for opus
say maybe
$15/mo
lol no
gemini 1.5 pro is cheaper than 4o
stick that in there too
and it wouldnt increase prices much anyway since that model is only being used for the end query
Yeah true
we're feeding like max 5k tokens
and then if we brought document support in then idk we'd have to see how that works
maybe use sonnet as the model in the chain for documents
True
well one thing to consider is that im on residential ip right now
some sources might be blocked when i move scraping to a cloud server
Also if you self host any models, there is llama 3 8b 100k and 1M, same thing with 70b. But again the goal is not max context, but maximizing the quality and cost
yeah but where would i host
i can put it on modal
but cold starts is like so slow
20-30 seconds? for llama3 8b
and it works out to more than groq
Yeah I guess that would not be optimal
I may have made a typo, let me check lol
whos using mistral api realistically
idk
i wouldnt be using that
mistral large sounded like an eh model from the start to me
since its gonna be superseded anyway
wait so flash is 0.70 until you go above 128k tokens
?
Only reason I use it, is to test out 8x22b
also ideally we would maintain some form of cache of websites and intent queries right
idk where i'd store that
s3?
5gb free so i mean?
As part of the AWS Free Tier, you can get started with Amazon S3 for free. Upon sign-up, new AWS customers receive 5GB of Amazon S3 storage in the S3 Standard storage class; 20,000 GET Requests; 2,000 PUT, COPY, POST, or LIST Requests; and 100 GB of Data Transfer Out each month.
wtf is this...
having a disk version of the scraped pages also lets me see the broken pages lol
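A disk cache of scraped pages can be very small. The sketch below swaps in a local directory for S3, keys entries by a hash of the URL, and treats entries older than a day as stale; `PageCache`, the JSON layout, and the one-day expiry are all assumptions for illustration.

```python
import hashlib
import json
import tempfile
import time
from pathlib import Path


class PageCache:
    """Tiny on-disk cache of scraped page text, keyed by a hash of the URL."""

    def __init__(self, root, max_age_seconds=24 * 3600):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)
        self.max_age_seconds = max_age_seconds

    def _path(self, url):
        digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
        return self.root / (digest + ".json")

    def get(self, url):
        path = self._path(url)
        if not path.exists():
            return None
        entry = json.loads(path.read_text(encoding="utf-8"))
        if time.time() - entry["fetched_at"] > self.max_age_seconds:
            return None  # stale entry: the caller should re-scrape
        return entry["text"]

    def put(self, url, text):
        entry = {"url": url, "fetched_at": time.time(), "text": text}
        self._path(url).write_text(json.dumps(entry), encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    cache = PageCache(tmp)
    miss = cache.get("https://example.com/")
    cache.put("https://example.com/", "hello")
    hit = cache.get("https://example.com/")
    print(miss, hit)
```

Storing the URL inside each entry keeps the files inspectable by hand, which helps when hunting for pages that parsed badly.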
what page is actually hitting >24k chars in meaningful content
...
s3 would be like decent latencies
Yeah, no thats the price
too big to fit in one screenshot now
That is weird
ive been manually fixing a few pages
its like whack-a-mole more or less
wikipedia seems to be huge due to it having references included in the main article
...ought to go and fix that lmao
im probably going to take a break for today
made some immense progress i think
Yeah, huge progress
I'm so happy that I managed to use the code and paid considerably less. I didn't know that this code cut the Perplexity subscription price in half. I should have done this much earlier. I was considering paying for ChatGPT, but in terms of cost-benefit, there's no competition with Perplexity, which only costs half the price. So, here I stay.
What's the code? I went back through previous posts but did not see it. Thanks!
Here.
.
Oh, it's a one month half price referral code
Like this one, which is mine:
Yep ^^
What happened to Claude 3? I asked it to improve my prompts, and it used to follow my instructions specifically and correctly in detail, but not anymore. Now it answers in general terms and doesn't even follow my instructions?
Straight up refusal?
Which Claude model?
Yes, I tried several times
Both
Can you share a thread or prompt that you used and had bad results with
My prompt is quite long to post (1600 words)
Put it in a txt
What was recently discussed here sounds a bit like a personal project I did recently for myself, primarily for curiosity. Takes a lot longer than pplx at doing an internet type of query though, about 30 seconds. Uses a combination of Gemini 1.5 Flash (e.g. for extracting relevant content from webpages) and GPT-4o (e.g. to answer the user's message)
How do you extract content from the webpage?
I use Gemini 1.5 Flash with a suitable prompt, feeding it the text content of the webpage, the query (e.g. the search query) and the URL of the webpage, instructing it to do the extraction of relevant content blah blah. Temperature and top_p were changed too, until I found I had a consistent result. Ironically the above query in the screenshot proved to be a good example for testing it, plenty of text on that page and some models easily got mixed up
When I say that page, I mean the Docker Compose release notes page that is, on Docker's website
Ok, but how do you scrape content from the website? For example, I use jina reader for this
Currently I simply use requests (Python) with some additional headers, on my own network (so I appear as a residential IP of course)
As well as a custom user agent
Some websites block this or don't work well with this because no JS for example
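A plain HTTP fetch with extra headers and a custom user agent, roughly as described above. The chat uses `requests`; this sketch swaps in the standard library's `urllib.request` so it runs without installing anything, and the header values and user agent string are hypothetical.

```python
import urllib.request

# Hypothetical header set; the chat does not list the actual values used.
DEFAULT_HEADERS = {
    "User-Agent": "my-research-bot/0.1",
    "Accept": "text/html,application/xhtml+xml",
    "Accept-Language": "en-US,en;q=0.9",
}


def build_request(url, extra_headers=None):
    """Create a GET request carrying the default headers plus any overrides."""
    headers = {**DEFAULT_HEADERS, **(extra_headers or {})}
    return urllib.request.Request(url, headers=headers)


def fetch_text(url, timeout=10):
    """Download a page and decode it using the charset the server declares."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


request = build_request("https://example.com/")
print(request.get_header("User-agent"))
```

As noted in the chat, a fetch like this sees no JavaScript-rendered content, so client-side rendered pages come back mostly empty.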
lol if you search messages from me ive been doing similar work
Yeah I was referring to your work, which is nice
Ive got response time down below 30s though :d
well mostly
in theory all the time once i fix my parallel code
Yeah I haven't managed that yet, I don't cache various URLs
Based on my search results, the key changes and improvements made in Docker Compose v1.24.0 include:
1. Improved navigation menu in the `docker compose up` command, allowing you to manage services, stop and remove containers, and view logs.
2. Displaying the reason for image pull failures.
3. Fixes for crashes when running `up` with `--no-build` and `--watch`, and when no TTY is available and the menu is enabled.
4. Legibility improvements to the menu action text.
5. Added support for annotations in the Compose file.
6. Improvements to the `config` command, including options to list Compose model variables and specify the output format.
7. Integration of Mutagen to provide synchronized file shares between containers.
However, I do not have access to the full official release notes for this specific version of Docker Compose. If you need more detailed information, I recommend checking the official Docker documentation or website for the latest release notes.
ok that was pretty mid
[1.34s] [99000 chars / 21946 tokens] Read source https://docs.docker.com/compose/release-notes/
why is it 21k tokens...
since im not using large models for source processing
whenever i run into a large or broken-parsed source its a bit of a problem
That page is huge sadly, but also a good test bench imo
ok well this is great they just have a whole page on every docker release ever..
not great for my 8k llm
thats made me think, i could go for a hybrid approach actually
For website with heavy js I recommend using playwright in headless mode
I havent added in the feedback loop
pretty slow though
Yeah Playwright is a good choice but it's slow
i got around 4 seconds to scrape one page when warm on cloud worker
though could be improved by keeping the browser itself open
Please give me example URL with heavy js
I will test time to load
are both clientside react
As for the context, I use Gemini 1.5 Flash to extract the content that's most likely relevant from the webpage text (HTML stripped that is), although this process slows things down of course
your gemini 1.5 flash has given me a little idea
@terse drum 2.9seconds to screenshot on playwright, incl launching browser
what im yet to add is a feedback loop
if the smaller model doesnt find enough information from the search, then i can fallback to 6 searches with a larger llm (haiku or flash)
yeah thats a great idea actually
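The fallback idea could be structured like this. It is a sketch under assumptions: "6 searches" is read here as widening to 6 results, the `is_enough` check is a placeholder for whatever signal decides the small model came up short, and all names are hypothetical.

```python
def research(intent, search, small_model, large_model, is_enough,
             small_n=3, large_n=6):
    """Try the cheap path first; widen the search and use a larger model if needed."""
    summary = small_model(intent, search(intent, small_n))
    if is_enough(summary):
        return summary, "small"
    # Fallback: more sources, handled by the larger-context model
    return large_model(intent, search(intent, large_n)), "large"


demo_summary, demo_tier = research(
    "gpt-4o context window",
    search=lambda intent, n: [f"source {i}" for i in range(n)],
    small_model=lambda intent, sources: "",  # pretend the small model found nothing
    large_model=lambda intent, sources: f"answer from {len(sources)} sources",
    is_enough=lambda summary: bool(summary.strip()),
)
print(demo_tier, demo_summary)
```

Only queries that fail the cheap path pay for the larger model, so the average cost stays close to the small-model figure.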
Nice, yeah mine generally aims for 2 search queries, it can do more if it feels it might be useful though (e.g. a complex query)
Perplexity down for anyone else?
With the sickness?
all good here
Weird. Nothing in my library and nothing pops up
try a page refresh
reload/refresh the page
I get that sometimes
Do you just send all the data to Gemini? Why not clean it first using, for example, Python? Convert it to markdown, etc.?
he probably doesnt
Oh hey it worked. I thought I tried that already. Thanks!
does similar/same thing as me im guessing, parsing the pages
Sorry I should've clarified, I strip the HTML tags, I don't include script or style stuff, and only what's within the <body></body>
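That kind of stripping can be done with the standard library alone. A minimal sketch, assuming well-formed HTML with an explicit `<body>` tag; `BodyTextExtractor` and `body_text` are hypothetical names, not the poster's code.

```python
from html.parser import HTMLParser


class BodyTextExtractor(HTMLParser):
    """Collect visible text inside <body>, skipping <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False
        elif tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when inside <body> and outside script/style
        if self.in_body and not self.skip_depth and data.strip():
            self.chunks.append(data.strip())


def body_text(html):
    parser = BodyTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


sample = (
    "<html><head><title>T</title><style>a{}</style></head>"
    "<body><h1>Hi</h1><script>var x=1;</script><p>there</p></body></html>"
)
print(body_text(sample))  # Hi there
```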
how do you fare against
painful site
especially the "compare devices" page
its pretty horrible
So... I know a lot of people say refresh and most people hit F5 or the circle arrow, but the best refresh is Ctrl+F5
Ok, I do the same at work
60k tokens of useless i think
Hmm good question, I'll try something that accesses Apple's site
ive filtered out a few apple pages
and also dealt with various other pages
with custom selectors
as ive said its like whack-a-mole lol
Yeah I've excluded some domains from being scraped, such as Reddit
reddit i want to have eventually
since it is a decent/good source if i can have grounding
well youre thinking personal use, im thinking if i can build a product lol
For 1 task at work I'm using Gemini 1.0; for 2 other tasks Haiku gives better results
Interesting
I haven't tried Gemini 1.5 Flash yet, I need to run some tests
I tried Haiku with the webpage extraction part of the process, but it didn't feel as accurate or as detailed at the task generally compared to Gemini 1.5 Flash, however it could also be that I need to play with my prompt more, or the temp/top_p
Cheap and fast. I use it via OpenRouter. The results are good so...
There is a downside to Flash though, its moderation layer can be triggered by some webpages
turn it down i think
I build prompts for haiku using Claude opus
Yeah same, I've also used GPT Prompt Engineer (I think it's called that?) on GitHub as well as the recent Prompt Generator tool on Claude
For checking result I recommend https://promptfoo.dev/
interesting
Definitely looks nicer than a bunch of Jupyter Notebooks. Maybe I'll try it out too
Glad to hear
How do you do that exactly (build prompt)? Do you use the model Claude 3 Opus in Perplexity and ask it to write a prompt?
I use Opus directly in the playground on the Claude website. I have a large system message (like a prompt builder)
When Claude has built the prompt for a specific task, I switch to the preferred model (for example Haiku) and run tests (I have a database of websites for this, with different use cases). If the tests pass, I put the new prompt to work
Interested. What are the addresses for this?
Are you asking for the websites from my test base? Nothing special: raw websites (without JS), pages with heavy JS, websites built from images, specific data in a PDF file (on a website), etc.
Playground on Claude website?
Build with the Claude API, an AI assistant from Anthropic
<script>!function(){var e=document.createElement("iframe");function n(){var n=e.contentDocument||e.contentWindow.document;if(n){var t=n.createElement("script");t.nonce="",t.innerHTML="window['__CF$cv$params']={r:'792f8224776acf9f',m:'hMcSCCrnIkr7c8Pec6Na6boaaFAnQ6S0ypG2GKRbKgc-1675305063-0-AaJn0SqKZQnadmRQ5O1dM9xMkXWyP+ll7gpl2NHeoNbZTEXMjlB10KkwnEU3hf0/gMODfKqcBGLVecql6U04GGs+iJ/kNrNqj1FgfAOlQV+T2koMQMvUy1zr9tegBBX6BikfccHZhwoJhnXc0eTcg58=',s:[0x60b082f691,0xee65a67e11],u:'/cdn-cgi/challenge-platform/h/b'};var now=Date.now()/1000,offset=14400,ts=''+(Math.floor(now)-Math.floor(now%offset)),_cpo=document.createElement('script');_cpo.nonce='',_cpo.src='/cdn-cgi/challenge-platform/h/b/scripts/alpha/invisible.js?ts='+ts,document.getElementsByTagName('head')[0].appendChild(_cpo);",n.getElementsByTagName("head")[0].appendChild(t)}}if(e.height=1,e.width=1,e.style.position="absolute",e.style.top=0,e.style.left=0,e.style.border="none",e.style.visibility="hidden",document.body.appendChild(e),"loading"!==document.readyState)n();else if(window.addEventListener)document.addEventListener("DOMContentLoaded",n);else{var t=document.onreadystatechange||function(){};document.onreadystatechange=function(e){t(e),"loading"!==document.readyState&&(document.onreadystatechange=t,n())}}}();</script></body>
this looks like cloudflare right?
<div class="data"><form id="challenge-form" class="challenge-form"><div id="cf-please-wait"><div id="spinner"><div id="cf-bubbles"><div class="bubbles"></div><div class="bubbles"></div><div class="bubbles"></div></div></div><p id="cf-spinner-please-wait"></p><p id="cf-spinner-redirecting" style="display:none"></p></div><noscript id="cf-captcha-bookmark" class="cf-captcha-info"><h1 style="color:#bd2426;">Please turn JavaScript on and reload the page.</h1></noscript><div id="no-cookie-warning" class="cookie-warning" style="display:none">
yeah
cf
hmm
I prefer to render every website with JS (Playwright, a headless browser running on the server). I'm using stream mode to grab all the context for RAG.
do you not run into captcha often though
im so confused
this code in ts
const res = await fetch("https://openai.com/api/pricing/", {
"headers": {
"accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7",
"accept-language": "en,en-US;q=0.9",
"cache-control": "no-cache",
"pragma": "no-cache",
"priority": "u=0, i",
"sec-ch-ua": "\"Chromium\";v=\"125\", \"Not.A/Brand\";v=\"24\"",
"sec-ch-ua-mobile": "?0",
"sec-ch-ua-platform": "\"macOS\"",
"sec-ch-ua-arch": "\"x86\"",
"sec-ch-ua-bitness": "\"64\"",
"sec-fetch-dest": "document",
"sec-fetch-mode": "navigate",
"sec-fetch-site": "none",
"sec-fetch-user": "?1",
"upgrade-insecure-requests": "1",
"DNT": "1",
},
// "referrerPolicy": "strict-origin-when-cross-origin",
"body": null,
"method": "GET"
});
const text = await res.text();
console.log(text);
works fine
but in httpx the exact same headers is not working im sent a cf captcha
...what
Actually yeah i did make a unfingerprintable browser a whileback for some tiktok project i was working on
it got a better trust score than my actual browser on https://abrahamjuliot.github.io/creepjs/
omg i hate myself
why do i not name folders properly
i cant even find where i wrote that code
So does anyone know how I might be able to save a Perplexity thread to my account since I created it before I made my account?
Don't think it's possible really
you could just bookmark the link if it's extra important
So is perplexity wrong about its own functionality?
To save the thread [link] to your account library for use anywhere:
Sign in or create a Perplexity account if you haven't already.
Once signed in, open the thread you want to save.
Click the "Save" button at the top right of the thread.
The thread will now be saved to your personal library, accessible from any device by signing into your account.
Saving threads to your library allows you to access them privately from anywhere, even if you started the conversation before creating an account. Saved threads are not locked or shareable unless you explicitly share the URL.
https://www.perplexity.ai/hub/getting-started was the main source it used
oh what
wait i have no clue if thats a thing or not
Well I couldn't see that button so...
I don't have pplx premium - worth?
I didn't renew my OpenAI subscription so I'm looking for something interesting
Or any other tool?
I would say so
I use it a ton to help me understand content from school
I also use it as a secondary search engine
i got it to work
turns out bun ts library adds a useragent
cf proxy was looking for a useragent
but if it can parse chrome it sends you the captcha
hmm...
- it failed to find gpt4o
- failed to get ANY context windows
bruh
[0.42s] [1755 chars / 318 tokens] Read source https://www.binance.com/en/price/gpt-4o
what is this... this page does not exist
Is Perplexity down right now? I can't get any prompts to go through from the website
For some people
I have the same problem.
so close yet so far
What? the price?
Query? > ai model price comparison - claude 3 models, gpt 4 turbo, gpt4o, gemini 1.5 pro+flash - use unit million tokens and give it in a table