#general
this is a good test query to test lots of parts of the system
ive changed my searxng instance to have way better and updated source rankings
now i need to see why its reporting the wrong info for 1.5 pro
```
[SearchAgent] [0.00s] Picked 6 links total
[SearchAgent] [1.98s] Finished pulling 3 sources - 8874 max chars/source
* https://ai.google.dev/pricing - 3699 chars
* https://www.cnet.com/tech/services-and-software/google-gemini-pricing-1-5-pro-and-1-5-flash-compared/ - 2526 chars
* https://beebom.com/how-use-gemini-1-5-flash/ - 7446 chars
```
hmm
this has the info, just not organised well?
you have got me curious, so I'm trying a similar query with my own project just to see how it handles it. I don't expect it to work as I never tested this aspect
well mine missed some models unfortunately (such as GPT-4o). it did a lot of search queries though, what's in the table is somewhat correct, a few anomalies from what I can see
I'll try your exact query instead and see what that results in
if you're wondering what my current result is with my personal project, as you can see, it's got a few problems in the result
can you try gemini 1.5 pro price per mtok
im getting really annoyed at this single query
just that alone?
since my one can never seem to get it perplexity-like correct
yeah just that
```
* gemini 1.5 pro price in million tokens
* gemini 1.5 pro price per mtok
```
trying both variants now
when i did mtok
it thought mtok was a crypto
....
and when i did the first one half the time it doesnt take in the information from the main site
🤣
ai.google.dev/pricing
half the time it gives me vertex ai pricing
this is so infuriating
should know what mtok says in a moment
for mtok it said:
The pricing for Gemini 1.5 Pro is as follows:
- For input prompts up to 128K tokens, the cost is $3.50 per 1 million tokens.
- For input prompts longer than 128K tokens, the cost is $7.00 per 1 million tokens.
- For output, the cost is $10.50 per 1 million tokens for prompts up to 128K tokens and $21.00 per 1 million tokens for prompts longer than 128K tokens.
asked it for the comparison and i got this
come on omfg
why has it given so many variants
lol
trying once more...
trying the million one now
so close yet SO FAR
why is it only this stupid gemini one thats only right like 10% of the time
well million said this:
The pricing for Gemini 1.5 Pro in terms of million tokens is as follows:
- **Input Tokens:**
- $3.50 per million tokens for prompts up to 128K tokens.
- $7.00 per million tokens for prompts longer than 128K tokens.
- **Output Tokens:**
- $10.50 per million tokens for prompts up to 128K tokens.
- $21.00 per million tokens for prompts longer than 128K tokens.
This means the cost will vary depending on the length and complexity of the input and output text.
yeah spot on
but again im budget limited on context
which might be the main issue
see the issue is it also doesnt know when it's wrong
so my feedback loop idea from before didnt really work...
yeah probably, I'm using Gemini 1.5 Flash to extract relevant content from the webpages scraped, to reduce wasted input context on GPT-4o. This however does slow things down a bit, although I get an accurate answer at the end of the day so it's worth waiting about half a minute
I see
yeah I had a similar problem with some models, where it would either make something up or mix up details on the webpage
its mainly because it keeps thinking this vertex page is the main source
```
* https://ai.google.dev/pricing - 3699 chars
* https://cloud.google.com/vertex-ai/generative-ai/pricing - 11353 chars
* https://blog.google/technology/ai/google-gemini-next-generation-model-february-2024/ - 11572 chars
```
I see
Here is a detailed summary of the pricing information:
**Gemini Models Pricing**
* Model | Price (input) | Price (output) | Notes
* Gemini 1.5 Flash | $0.0001315 / image | $0.000125 / 1k characters |
* Gemini 1.5 Pro | $0.001315 / image | $0.00125 / 1k characters |
* Gemini 1.0 Pro | $0.0025 / image | $0.000375 / 1k characters |
**Context Caching for Gemini**
* Model | Price (input) | Price (output) |
* Gemini 1.5 Pro | $0.0006575 / image | $0.000625 / 1k characters |
**Imagen Pricing**
* Model | Price (input) | Price (output) |
* Imagen | $0.020 per image | $0.003 per image |
**Multimodal Embeddings Pricing**
* Model | Price (input) | Price (output) |
* Multimodal Embeddings | $0.0002 / 1k characters | No charge for output |
**PaLM 2 for Text Pricing**
* Model | Price (input) | Price (output) |
* PaLM 2 for Text | $0.00025 per 1,000 characters | $0.0005 per 1,000 characters |
**Partner Models Pricing**
* Model | Pricing |
* Claude 3 Opus | Input: $15 / million tokens Output: $75 / million tokens |
* Claude 3 Sonnet | Input: $3 / million tokens Output: $15 / million tokens |
* Claude 3 Haiku | Input: $0.25 / million tokens Output: $1.25 / million tokens |
**Gemini 1.5 Pricing**
* Model | Price (input) | Price (output) |
* Gemini 1.5 Pro | varies depending on context window size (starts at $0.0006575 / image) | varies depending on context window size |
It's important to note that the prices listed are subject to change and may vary depending on the specific use case and context window size. Additionally, there may be additional costs associated with using these models, such as latency and computational requirements.
It's also important to note that the prices listed are in USD and that prices may vary depending on the user's location.
so thats what this "worker" kinda outputted
which might be right but its in terms of images??? and idfk what
well just not what im looking for
yeah understandable
now theres only so many options i have here
i could use haiku instead of 8b
but that would increase the price per query quite a bit
lemme run that calculation
right now average
(0.59/1000000)*1500 + (0.05/1000000)*350 # groq/llama-3-70b router
+ ((0.05/1000000)*6656 + (0.05/1000000)*768)*4 # ~4x groq/llama-3-8b search summarisers
+ (0.25/1000000)*1750 + (1.25/1000000)*400 # claude-3-haiku final
and with haiku
(0.59/1000000)*1500 + (0.05/1000000)*350 # groq/llama-3-70b router
+ ((0.25/1000000)*6656 + (1.25/1000000)*768)*4 # ~4x claude-3-haiku search summarisers
+ (0.25/1000000)*1750 + (1.25/1000000)*400 # claude-3-haiku final
$0.012336 per query
which is more than double
😦
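For anyone wanting to re-run the two calculations above, here is a minimal Python sketch. The prices (USD per million tokens) and token counts are copied from the messages above; the helper names `call_cost` and `query_cost` are made up for illustration.

```python
PER_MILLION = 1_000_000

def call_cost(in_price, out_price, in_tokens, out_tokens):
    """USD cost of one model call, given prices per million tokens."""
    return (in_price * in_tokens + out_price * out_tokens) / PER_MILLION

def query_cost(summariser_in_price, summariser_out_price, n_summarisers=4):
    """Average cost of one query: router + ~4 search summarisers + final answer."""
    router = call_cost(0.59, 0.05, 1500, 350)      # groq/llama-3-70b router
    summarisers = n_summarisers * call_cost(
        summariser_in_price, summariser_out_price, 6656, 768
    )
    final = call_cost(0.25, 1.25, 1750, 400)       # claude-3-haiku final
    return router + summarisers + final

llama_8b = query_cost(0.05, 0.05)   # groq/llama-3-8b summarisers, prices as quoted
haiku = query_cost(0.25, 1.25)      # claude-3-haiku summarisers

print(f"llama-3-8b summarisers:     ${llama_8b:.6f} per query")
print(f"claude-3-haiku summarisers: ${haiku:.6f} per query")
```

The haiku variant comes out at the $0.012336 quoted below; swapping in other per-million prices only needs the two arguments to `query_cost` changed.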
ah so I see. I'd like to drop gemini for something else primarily because of the moderation sometimes falsely flagging, but I've yet to find a viable alternative. I'll have to try tweaking the prompt I use for extracting the relevant content with another model and see if I can get it to eventually give me similar results on some specific tests
ideally the model needs to be fast
i mean i could use gemini flash but again im trying to minmax costs
so that i could maybe turn this into a real service lol
yeah of course, that's understandable
well the question here might be
whats actually at fault
whats wrong in searchworker?
- websearch rankings
- the scraping output
- the 8b model
a mystery to solve
```
* https://ai.google.dev/pricing - 3699 chars
* https://cloud.google.com/vertex-ai/generative-ai/pricing - 11353 chars
* https://blog.google/technology/ai/google-gemini-next-generation-model-february-2024/ - 11572 chars
⏰ [SearchAgent] [2.60s] Model response time (llama8b)
```
so its shipped the context of the main source which is ai.google.dev/pricing
but the thing is my extraction for that page also isnt the best due to google's rubbish layout
yeah that won't help for sure
sure let me see
what im trying to do is like a needle in my head
few accurate sources per element of the query
what you and pplx are doing is more like search and verification by the masses?
and the ai.google.dev is the /pricing page
at least on the first one
hmm
do you have a debugger to see how it was extracted?
curious to see the differences in our impl
yeah I do have such output, let me see what it did for that specific page
<webpage url="https://ai.google.dev/pricing" search_query="gemini 1.5 pro price in tokens" publication_date="">
<result>
The Gemini 1.5 Pro model is priced at $3.50 per million tokens for input prompts up to 128K tokens and $7.00 per million tokens for input prompts longer than 128K tokens. For output, the price is $10.50 per million tokens for prompts up to 128K tokens and $21.00 per million tokens for prompts longer than 128K tokens.
</result>
</webpage>
<webpage url="https://ai.google.dev/pricing" search_query="gemini 1.5 pro price in millions" publication_date="">
<result>
Gemini 1.5 Pro is currently in preview and is free of charge. Starting May 30, 2024, it will be priced on a pay-as-you-go basis. The pricing for Gemini 1.5 Pro is as follows:
* **Input:** $3.50 / 1 million tokens (for prompts up to 128K tokens) and $7.00 / 1 million tokens (for prompts longer than 128K tokens).
* **Output:** $10.50 / 1 million tokens (for prompts up to 128K tokens) and $21.00 / 1 million tokens (for prompts longer than 128K tokens).
* **Context Caching:** $1.75 / 1 million tokens (for prompts up to 128K tokens) and $3.50 / 1 million tokens (for prompts longer than 128K tokens).
* **Storage:** $4.50 / 1 million tokens per hour.
The pricing is in USD.
</result>
</webpage>
There are a few more but I think two of them are adequate examples. What it extracts as 'relevant content' depends solely on the search query used basically.
are you using scraping like me or a headless web browser?
like do you know what text content came extracted out of that page
requests in Python, so not a headless browser
thats what im curious about
to answer that question, I don't explicitly output that in the console but I can certainly add that output
yeah if you can
just would like to see :d
sure, one moment
I think I'll save the output to file instead, as it will be messy to find it in the console
avg ~$0.00642752 per query with deepseek-v2-32k (in theory)
vs
avg ~$0.0034784 per query with llama3-8b-8192 (in theory)
interesting
How about yi-large
Is it available yet through api?
It seems slow on lmsys
Here you go
ok well in the end nobody is beating the price of groq
whar
Hmm, your parsing is very similar to mine
That's what I send to Gemini 1.5 Flash to extract relevant content from that
so it must be down to a case of the model just being confused due to the extra sources and not being able to pickup on the first source
man
idek how to fix this without changing model
changing model isn’t really an option i’ll be entirely honest
possibly a combination of a good prompt and right temperature/top_p, but if the model just isn't good enough to do it then rip 😦
so far the only model I've had actual success with is flash, haiku worked but it wasn't as detailed with the docker compose release notes
I found temperature alone didn't work for me, I had to also adjust top_p to stop Flash randomly getting the docker compose release notes "test" wrong every so often
I’ve never really played with top_p
yeah… i don’t think i’ve set a temperature
what’s the default
1.0???
nope, mine is 0.3 actually
but what’s the default temperature for groq api
not sure on that one
Yep, or the speed.
I tried asking about that in my personal project, it didn't find the answer
what’s your top_p set to?
yeah the answer doesn’t exist i think
0.9
nowhere on the groq docs
What is?
.
yeah I'd probably say it's 1 by default, most APIs usually go with 1 by default
i know what this means due to that 3blue1brown video
lol
have you seen it?
the ones on transformers
Yep, good for removing unlikely tokens.
Looks like the default is 1, so all probabilities are considered.
Nice, what change did you make?
just the temperature
i realised all of them were at default (1) in my code
Lol, to make it more deterministic?
setting the right parameters helps
silliest oversight from me but
Lowering it should reduce hallucinations.
And lowering the Top P should help too.
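A rough sketch of what the two knobs do to the next-token distribution, in plain Python. This is the textbook formulation (temperature-scaled softmax, then nucleus/top_p filtering), not Groq's or anyone's actual implementation; the function names and the example logits are made up.

```python
import math

def apply_temperature(logits, temperature):
    """Softmax over logits divided by temperature; lower values sharpen the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)                       # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p):
    """Keep the smallest set of most-likely tokens whose cumulative
    probability reaches top_p, then renormalise over that set."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}

logits = [2.0, 1.0, 0.1, -1.0]               # hypothetical scores for four tokens
default = apply_temperature(logits, 1.0)
sharp = apply_temperature(logits, 0.2)

print("candidates at temperature 1.0, top_p 0.9:", sorted(top_p_filter(default, 0.9)))
print("candidates at temperature 0.2, top_p 0.9:", sorted(top_p_filter(sharp, 0.9)))
```

With these example logits the low-temperature run leaves a single candidate token, which is the "more deterministic" behaviour being discussed; with top_p at 1 nothing is trimmed.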
yeah
```
llama70b = Groq(model="llama3-70b-8192", temperature=0.6, top_p=0.9)
llama8b = Groq(model="llama3-8b-8192", temperature=0.2, top_p=0.9)
claude = Claude(
    model="claude-3-haiku-20240307",
    api_key="",
    temperature=0.3,
    top_p=0.9,
)
```
ive gone with this configuration
It's probably set like that since the default is best for general tasks.
70b is the initial model, 8b is the source summarisers, and claude is the final response
llama70b is for what? generating the search queries?
(api key redacted)
Such as writing etc, where randomness is seen as a feature at times.
i've adopted an intent/destructing model (if youve been up-to-date)
I am guessing it's for writing the intent and steps.
Yep, is it the style I recommended?
yeah
```
[
  {
    "intent": "Get the prices of the AI models",
    "steps": [
      {"tool": "search", "query": "Claude 3 models pricing"},
      {"tool": "search", "query": "GPT-4 Turbo pricing"},
      {"tool": "search", "query": "GPT4O pricing"},
      {"tool": "search", "query": "Gemini 1.5 Pro+Flash pricing"}
    ]
  },
  {
    "intent": "Convert prices to million tokens",
    "steps": [
      {"tool": "calculator", "query": "convert prices to million tokens"}
    ]
  },
  {
    "intent": "Create a price comparison table",
    "steps": [
      {"tool": "write_answer", "query": "create table with prices in million tokens"}
    ]
  }
]
```
Intent is what I use, since it's the models guess of what you wanted.
maybe i ought to lower temp more
right now it ignored calculator since i havent impl'd it
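A minimal sketch of how a plan in that intent/steps shape could be walked, skipping any tool that isn't implemented yet (like the calculator here). `run_plan`, `fake_search` and the `tools` registry are hypothetical names for illustration, not the actual code behind this project.

```python
import json

def run_plan(plan_json, tools):
    """Run every step whose tool is registered; collect the steps that were skipped."""
    results, skipped = [], []
    for stage in json.loads(plan_json):
        for step in stage["steps"]:
            handler = tools.get(step["tool"])
            if handler is None:              # tool appears in the plan but has no implementation
                skipped.append(step)
                continue
            # "query" is optional: a write_answer step may not carry one
            results.append(handler(step.get("query", "")))
    return results, skipped

def fake_search(query):
    """Stand-in for a real search tool."""
    return f"results for {query!r}"

plan = json.dumps([
    {"intent": "Get the prices of the AI models",
     "steps": [{"tool": "search", "query": "Claude 3 models pricing"},
               {"tool": "search", "query": "Gemini 1.5 Pro+Flash pricing"}]},
    {"intent": "Convert prices to million tokens",
     "steps": [{"tool": "calculator", "query": "convert prices to million tokens"}]},
    {"intent": "Create a price comparison table",
     "steps": [{"tool": "write_answer"}]},
])

tools = {"search": fake_search, "write_answer": lambda query: "final answer"}
results, skipped = run_plan(plan, tools)
print(len(results), "steps run, skipped tools:", [step["tool"] for step in skipped])
```

Keeping the skipped steps around (rather than silently dropping them) makes it easy to log which mocked-up tools the router keeps asking for.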
yeah but then im thinking
yeah play around with the temperature and see what results you get basically
will it just go really bad on the queries like the japan one
No, it shouldn't
so when i did it before and got a crazy good response, it was on temp 1
so idk
it did like 7 searches on different aspects
it was really good lol
for curiosity I'll try that japan query with mine
yeah ok
here it is
plan me a trip to japan somewhere in the month of may 2024 from london
thanks
Also, you can change the temperature for the different steps, if needed.
Yep
for now ive hidden the nonexistent tools that we mocked up though
calculator doesnt actually work rn, but the thing is i have it in the shot prompts so to avoid confusion/keep consistency its there
Yep, my infrastructure is an intent based model, and I also think intent just sounds cool too.
Could plug it into wolfram alpha?
yeah
For the calc
searxng has wolfram
Nice
searxng is pretty nice.
Yep, also better for latency long term.
I am hooking up tools using rpc, so they are pretty seamless.
And not coupled.
ok I have no idea how much of this is correct but this is what the japan query said for me on my personal project. Mine isn't as comprehensive in that it won't search for plane ticket availability or such
gpt-4o yes
gemini 1.5 pro price in million tokens
back to this query
sometimes it works
sometimes it doesnt
for the sake of curiosity, I could try a different end model if you want
He's using Haiku atm
Yep, and the speed too...
yeah understandable, I'll try Haiku with the same query on my setup
```
[0.00s] Searched for "Gemini 1.5 Pro price"@SearXNG - got 5 links, 0 snippets
[SearchAgent] [0.00s] Picked 4 links total
[SearchAgent] [7.63s] Finished pulling 3 sources
* https://ai.google.dev/pricing - 3699 chars
* https://cloud.google.com/vertex-ai/generative-ai/pricing - 11353 chars
* https://artificialanalysis.ai/models/gemini-1-5-pro - 11572 chars
⏰ [SearchAgent] [2.41s] Model response time (llama8b)
Based on the provided context, the Gemini 1.5 Pro model is a multimodal model that can be used for various tasks such as text generation, image generation, and multimodal fusion. The pricing for the Gemini 1.5 Pro model is as follows:
* Input token price: $0.001315 per image, $0.001315 per second, $0.00125 per 1k characters
* Output token price: $0.00375 per 1k characters
* Context caching: 0.0006575 per image, 0.0006575 per second, 0.000625 per 1k characters
* Context cache storage: 0.0011835 per image per hour, 0.0011835 per second per hour, 0.001125 per 1k characters per hour
The pricing for the Gemini 1.5 Pro model is based on the number of input and output tokens, as well as the context caching and storage. The prices are listed in US Dollars (USD) and are subject to change.
It's worth noting that the pricing for the Gemini 1.5 Pro model is different from the pricing for other models, such as the Gemini 1.5 Flash model, which has a different pricing structure.
```
really...
come on
back to square one 😭
ah crud, I can't actually do that easily as I forgot I'm using OpenAI directly, not using OR for the final model. Well I'll try it tomorrow when I'm more awake
yeah but mine takes it too seriously
how do i stop it from thinking million tokens is a crypto
ok for the sake of testing
let me try sonnet's output
it wasn't as difficult as I thought it would be to change it to haiku, here's what Haiku said, which isn't as detailed:
Here is a suggested 10-day Japan itinerary for a trip in May 2024 from London:
Day 1: Arrive in Tokyo
- Check into your hotel
- Explore the Asakusa district, including the Sensoji Temple and Nakamise shopping street
- Visit the Imperial Palace East Gardens
Day 2: Tokyo
- Visit the Meiji Shrine and Yoyogi Park
- Explore the Shibuya Crossing and Harajuku district
- Attend the Sanja Matsuri festival in Asakusa (mid-May)
Day 3: Tokyo to Kyoto
- Take the Shinkansen bullet train to Kyoto (approx. 2.5 hours)
- Visit the Kinkakuji (Golden Pavilion) and Arashiyama Bamboo Grove
- Explore the Gion district and watch a traditional geisha performance
Day 4: Kyoto
- Visit the Kiyomizudera Temple and Fushimi Inari Shrine
- Explore the Nijo Castle and Nishiki Market
Day 5: Kyoto to Nara
- Take a day trip to Nara
- See the friendly deer in Nara Park
- Visit the Todaiji Temple and Kasuga Taisha Shrine
Day 6: Kyoto to Hiroshima
- Take the Shinkansen to Hiroshima (approx. 2 hours)
- Visit the Hiroshima Peace Memorial Park and Museum
- See the iconic Itsukushima Shrine on Miyajima Island
Day 7: Hiroshima to Osaka
- Travel to Osaka (approx. 1.5 hours)
- Explore the Dotonbori district and try the local cuisine
- Visit the Osaka Castle
Day 8: Osaka
- Take a day trip to Himeji Castle
- Explore the Kobe Harborland and Kitano Ijinkan district
Day 9: Osaka to Hakone
- Travel to Hakone (approx. 2 hours)
- Ride the Hakone Ropeway and enjoy the views of Mount Fuji
- Relax in an onsen (hot spring)
Day 10: Hakone to Tokyo, depart
- Return to Tokyo (approx. 1.5 hours)
- Explore any remaining sights in Tokyo
- Depart for London
This itinerary allows you to experience the highlights of Tokyo, Kyoto, Nara, Hiroshima, Osaka, and Hakone, with a focus on cultural attractions, festivals, and natural scenery. Let me know if you would like me to modify or expand on this suggested Japan trip plan for May 2024.
So just a simpler and less accurate response.
yeah
yeah
It's probably still affordable, as long as there is more filtering before passing into 4o.
Some finetuning of llama 70B could also do the trick.
ok im moving on from this one gemini query
Can't wait for groq to add it.
Yep, asking for the cost of all the different models has always been hard.
no
although slower to reply, here's what wizardlm-2-8x22b also said for anyone curious
this is different
i got that query down
this is gemini 1.5 pro price in million tokens
problem 1: keeps quoting https://cloud.google.com/vertex-ai/generative-ai/pricing (this is completely unrelated and has prices in per character)
problem 2: thinks million tokens is some crypto
I think making a knowledge graph would probably be useful.
And then injecting the knowledge if it's relevant to the query.
```
For prompts up to 128K tokens:
- $0.35 per 1 million input tokens
- $1.05 per 1 million output tokens
For prompts longer than 128K tokens:
- $0.70 per 1 million input tokens
- $2.10 per 1 million output tokens
```
?
well its like 50% right...
it didnt do the crypto thing
but thats the flash prices, not pro
Maybe ask it to repeat what it found before giving the answer?
But as a summary, not the whole number of sources.
```
[0.91s] Searched for "Gemini 1.5 Pro price in Million Tokens"@SearXNG - got 22 links, 0 snippets
[SearchAgent] [0.01s] Picked 6 links total
[SearchAgent] [7.17s] Finished pulling 4 sources
* https://artificialanalysis.ai/models/gemini-1-5-pro - 11600 chars
* https://indianexpress.com/article/explained/explained-sci-tech/google-gemini-pro-1-5-1-million-tokens-9166398/ - 11600 chars
* https://ai.google.dev/pricing - 3699 chars
* https://www.cnet.com/tech/services-and-software/googles-gemini-1-5-pro-will-have-2-million-tokens-heres-what-that-means/ - 2797 chars
⏰ [SearchAgent] [2.83s] Model response time (llama8b)
"""Based on the provided context, I understand that you are looking for the price of Gemini 1.5 Pro in Million Tokens. According to the text, the pricing for Gemini 1.5 Pro is as follows:
* For prompts up to 128K tokens: $0.35 / 1 million tokens (input) and $1.05 / 1 million tokens (output)
* For prompts longer than 128K tokens: $0.70 / 1 million tokens (input) and $2.10 / 1 million tokens (output)
Please note that these prices are subject to change and may vary depending on the specific use case and requirements."""
```
time to look into these sources
well first, both these sources have some parsing issue...
Yep, must be really confusing for the model.
What about using 70B and asking it to remove any errors from the scraping? Maybe it's already knowledgeable enough to do it.
it would slow the query too much tbf
Just as a test.
if this was to become a site where i can get active like rlhf feedback from users then maybe i could see when these sites make certain issues
Since if 8B can do it too, then it should be pretty fast with groq.
```
$10.50 per 1M Tokens for prompts up to 128K tokens
$21.00 per 1M Tokens for prompts longer than 128K tokens
The key details are:
- Gemini 1.5 Pro has a pay-as-you-go pricing model
- For prompts up to 128K tokens, the price is $0.35 per 1M Tokens
- For prompts longer than 128K tokens, the price is $0.70 per 1M Tokens
- There are also additional charges for output prompts
- Billing for Gemini 1.5 Pro starts on May 30, 2024
Please note that these prices are subject to change and may vary based on the specific use case and requirements.
⏱️ 12.199028968811035 seconds
```
oh my god FINALLY it works
ok well again its only 75% correct
yo sneakyfishy
Yep, I'm just working on an easier way to clean up the input data.
yo
you mentioned ages ago
markdown
im trying that rn
i dont want to sound like dumb or anything but once you use perplexity what like website do you use to bypass like the ai detection
perplexity since it cites the web usually doesnt sound like ai much anyway
if you want you can click copy, remove the sources at the end and ask ai to rephrase it, or if youre super paranoid something like quillbot
its for presentation
but if its for important research, or maybe an assignment youre turning in, id recommend you rephrase anything manually really
and i have to include all like the sources and everything
you could ask it to give the main points for each slide/etc
you can copy it over and change a few words
then add sources as seems fit
Or you can just use perplexity to quickly find sources you can use.
yeah
Can't you summarize YouTube videos?
Yep, also thinking how to handle graphs and svg's.
I'm guessing removing the actual svg content and only leaving the class and stroke/fill will be enough for those.
For graphs, probably just leaving them as plain html would work.
Think I'm gonna make a quick interface to measure how many tokens are saved from each method, compared to the average end result of the query output.
i just stripped images and svg entirely
```
$10.50 per 1M Tokens for prompts up to 128K tokens
$21.00 per 1M Tokens for prompts longer than 128K tokens
The pricing details are:
- Input: $7.00 per 1M Tokens (for prompts up to 128K tokens) and $21.00 per 1M Tokens (for prompts longer than 128K tokens)
- Output: $10.50 per 1M Tokens (for prompts up to 128K tokens) and $21.00 per 1M Tokens (for prompts longer than 128K tokens)
Please note that these prices are subject to change and may vary depending on the context window and other factors.
```
i will take that
```
[
  {
    "intent": "Plan a trip to Japan in May 2024 from London",
    "steps": [
      {"tool": "search", "query": "flights from London to Japan in May 2024"},
      {"tool": "search", "query": "best places to visit in Japan in May"},
      {"tool": "search", "query": "Japan weather in May"},
      {"tool": "search", "query": "Japan travel guide"}
    ]
  },
  {
    "intent": "Create an itinerary",
    "steps": [
      {"tool": "search", "query": "7-day Japan itinerary"},
      {"tool": "search", "query": "things to do in Tokyo in May"},
      {"tool": "search", "query": "things to do in Kyoto in May"}
    ]
  },
  {
    "intent": "Book accommodations and flights",
    "steps": [
      {"tool": "search", "query": "book flights from London to Japan in May 2024"},
      {"tool": "search", "query": "book hotel in Tokyo"},
      {"tool": "search", "query": "book hotel in Kyoto"}
    ]
  },
  {
    "intent": "Return the final answer",
    "steps": [
      {"tool": "write_answer"}
    ]
  }
]
```
interestingly does a huge levelled response for japan query
like before
even on low temp
Yep, because lowering the temp just reduces the randomness
Which is good when using sources.
```
[
  {
    "intent": "Plan a trip to Japan in May 2024 from London",
    "steps": [
      {"tool": "search", "query": "flights from London to Japan in May 2024"},
      {"tool": "search", "query": "best places to visit in Japan in May"},
      {"tool": "search", "query": "Japan weather in May"},
      {"tool": "search", "query": "Japan itinerary for 7-10 days"}
    ]
  },
  {
    "intent": "Get accommodation options",
    "steps": [
      {"tool": "search", "query": "hotels in Tokyo"},
      {"tool": "search", "query": "best areas to stay in Japan"}
    ]
  },
  {
    "intent": "Plan transportation and activities",
    "steps": [
      {"tool": "search", "query": "Japan train tickets"},
      {"tool": "search", "query": "things to do in Tokyo in May"}
    ]
  },
  {
    "intent": "Return the final trip plan",
    "steps": [
      {"tool": "write_answer"}
    ]
  }
]
```
ran it again
even better???
groq.BadRequestError: Error code: 400 - {'error': {'message': 'Please reduce the length of the messages or completion.', 'type': 'invalid_request_error', 'param': 'messages', 'code': 'context_length_exceeded'}}
uhh
```
* https://www.selectiveasia.com/japan-holidays/weather/may - 366 chars
* https://top.his-usa.com/destination-japan/blog/a_guide_to_japan_-_may_and_june.html - 9822 chars
* https://www.japan-guide.com/e/e2273.html - 10000 chars
* https://www.holiday-weather.com/tokyo/averages/may/ - 7460 chars
```
Rip, too long
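One way to avoid `context_length_exceeded` is to cap the scraped sources to a shared budget before they reach the model. A minimal sketch, assuming a character budget (tokens would need a real tokenizer); `fit_sources` and the 20000 figure are made up, and the source lengths are the four from the log above.

```python
def fit_sources(sources, total_budget):
    """Truncate source texts so their combined length stays within total_budget.

    Shortest sources are handled first, so whatever budget they don't use
    is passed on to the longer ones instead of being wasted.
    """
    remaining = total_budget
    fitted = {}
    by_length = sorted(sources.items(), key=lambda item: len(item[1]))
    for index, (url, text) in enumerate(by_length):
        share = remaining // (len(by_length) - index)   # equal split of what is left
        fitted[url] = text[:share]
        remaining -= len(fitted[url])
    return fitted

# Placeholder texts with the same lengths as the sources in the log above.
sources = {
    "selectiveasia": "a" * 366,
    "his-usa": "b" * 9822,
    "japan-guide": "c" * 10000,
    "holiday-weather": "d" * 7460,
}
fitted = fit_sources(sources, total_budget=20000)
for url, text in fitted.items():
    print(url, len(text))
```

A plain `text[:budget // n]` per source would also work, but it throws away budget whenever one page (like the 366-char one) is tiny.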
```
Flights:
- Depart London on May 1, 2024 and return on May 15, 2024.
- Based on the search results, the cheapest flights from London to Tokyo during this time period are around £402 roundtrip. The flights will take approximately 13 hours and 49 minutes each way.
- The most popular airline for this route is Iberia.
Accommodations:
- For your 2-week trip, I would recommend staying in a mix of hotels and traditional ryokans (Japanese inns) to experience both modern and cultural aspects of Japan.
- In Tokyo, consider staying at the Hotel Gracery Shinjuku, which has an 8.3 rating and rates starting around £128 per night.
- In Kyoto, you could stay at a ryokan like Yoshida-sanso, which offers a more authentic Japanese experience.
- In Hakone, the Park Hotel Tokyo is a luxury option with stunning views of Mount Fuji, starting around £165 per night.
Itinerary Suggestions:
- Spend 4-5 days in Tokyo to see the top sights like the Imperial Palace, Sensoji Temple, and explore the diverse neighborhoods.
- Take a day trip to Kamakura to visit the famous Daibutsu (Great Buddha) statue and historic temples.
- Spend 3-4 days in Kyoto to see the Kinkakuji, Kiyomizudera, and Arashiyama bamboo forest.
- Visit Hakone for 2-3 days to ride the Hakone Ropeway, see Lake Ashi, and try to catch a glimpse of Mount Fuji.
- Consider a day trip to Nara to see the friendly deer and historic temples.
- Spend the remaining days exploring other areas of interest, such as Hiroshima, Miyajima, or Kanazawa.
Let me know if you need any other details or have additional requests for your Japan trip planning!
⏱️ 42.565316915512085 seconds
```
the hotel pricing is decently accurate
the flight not so much?
well its literally may 29
how is it showing info for a flight in the past
Looks like they can save a lot. Now I wanna find out which filters have the most gain, and for what sites. So I can create embeddings to choose which filters to apply.
whys your cleaned html so high
my cleaned text was like only 1.2x markdown size
markdown just added like the bolding and title separation
Your cleaned html is the text content right?
Mine is actual html
oh lol
And tailwind css adds a lot to the amount of characters lol.
The initial cleanup is stuff like removing tags like script, and element with classes which include nav, sidebar, header, footer etc.
But I'm gonna make more advanced ones to remove stuff like all the tailwindcss classes.
But the aim is to remove any extra clutter which isn't part of the main content.
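A rough sketch of that first cleanup pass using only Python's standard library: drop `script`/`style` and any element whose class mentions nav, sidebar, header or footer, and keep the text of everything else. Note the assumptions: this version returns plain text rather than cleaned HTML, it expects reasonably well-formed markup, and the class and function names are made up.

```python
from html.parser import HTMLParser

SKIP_TAGS = {"script", "style"}
SKIP_CLASS_HINTS = ("nav", "sidebar", "header", "footer")
# Void elements never get a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}

class MainTextExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.skip_depth = 0     # > 0 while inside an element being dropped
        self.chunks = []

    def _is_clutter(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").lower()
        return tag in SKIP_TAGS or any(hint in classes for hint in SKIP_CLASS_HINTS)

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.skip_depth or self._is_clutter(tag, attrs):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth and data.strip():
            self.chunks.append(data.strip())

def clean_html(html):
    parser = MainTextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)

page = ('<html><body><div class="site-nav"><a href="/">Home</a></div>'
        '<script>var x = 1;</script>'
        '<main><h1>Pricing</h1><p>$3.50 / 1 million tokens<br>input</p></main>'
        '<div class="footer">c</div></body></html>')
print(clean_html(page))
```

The substring match on class names is crude (a class like `canvas` also contains `nav`), so a real filter would probably match whole class tokens or use per-site rules.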
So stuff like the apple webpage would be doable after cleanup.
There, the cleaned html is getting smaller.
After the cleanup, and then passed back to llama 3 70B with the initial prompt, I get this as the source:
### Gemini 1.5 Pro: Quality, Performance & Price Analysis
**Quality:** Gemini 1.5 Pro is of higher quality compared to average, with a MMLU score of 0.819 and a Quality Index across evaluations of 88.
**Price:** Gemini 1.5 Pro is more expensive compared to average with a price of **$10.50 per 1M Tokens** (blended 3:1). Gemini 1.5 Pro Input token price: $7.00, Output token price: $21.00 per 1M Tokens.
**Speed:** Gemini 1.5 Pro is slower compared to average, with a throughput of 56.2 tokens per second.
**Latency:** Gemini 1.5 Pro has a higher latency compared to average, taking 0.95 s to receive the first token (TTFT).
**Context Window:** Gemini 1.5 Pro has a larger context window than average, with a context window of 1.0M tokens.
I think making the input super small like this is probably the way to go for price.
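Worth noting where the $10.50 in that summary comes from. Assuming "blended 3:1" means a weighted average of three parts input price to one part output price (which matches the numbers quoted), it can be reproduced like this; `blended_price` is a made-up helper name.

```python
def blended_price(input_price, output_price, input_parts=3, output_parts=1):
    """Weighted average price per 1M tokens for a given input:output token ratio."""
    total_parts = input_parts + output_parts
    return (input_parts * input_price + output_parts * output_price) / total_parts

# Input/output prices quoted in the summary above ($7.00 in, $21.00 out)
over_128k = blended_price(7.00, 21.00)
# Same formula with the up-to-128K prices from ai.google.dev/pricing ($3.50 in, $10.50 out)
up_to_128k = blended_price(3.50, 10.50)

print(f"blended, >128K prices:  ${over_128k:.2f} per 1M tokens")
print(f"blended, <=128K prices: ${up_to_128k:.2f} per 1M tokens")
```

So the blended $10.50 happens to equal the up-to-128K output price, which is an easy coincidence for a small summariser model to trip over when both pages are in its context.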
True
Really only writing mode doesn't have "search"
I've had an issue for the last two days: Perplexity keeps repeating the previous answer when I give it a different prompt
yeah I think this is part of the issue.. and for all the models involved - like if one is told to find links relevant to X, and another is tasked with producing a response about X, the smaller models are prone to just making it up if the information (about X) isn't actually there (or if it's there but is embedded within a bunch of unstructured and non-relevant information)
lowering temp would presumably help
but it's also kinda just a limitation with the smaller models imo.. they just can't parse lots of information and stay focussed on multiple requirements effectively.. they get lost in a way that models like GPT-4 and Opus don't (or at least are less likely to)
btw came across this yesterday. haven't tried it out, but looks interesting (/potentially relevant @sleek vortex ) https://www.firecrawl.dev/
Firecrawl looks very interesting! Thanks for sharing here
I recall seeing that website, not cheap
yeah, looks expensive
i mean i think its open source
well ive done similar things
just isnt organised lol
yeah looks like you can self host it for free
there was this very fast scraper
it doesnt do any of the parsing
this is just a crawler
but its pretty fast
written in go
I see
so what if we kept a function warm on like aws lambda
with a chromium browser backed by s3 scraping cache
then maybe when one query hits a site, we can go in the background and scrape the whole site into s3 cache
idk
but is there really much benefit from that
I subscribed to Perplexity a day ago, but I'm having a problem that I haven't had with others. It responds quite slowly and, worse, it repeats itself. For example, if I have an error in my code, it gives me a solution. I then ask something else, and it repeats the same thing I asked in previous questions. Is there any solution? Am I using the chat incorrectly? I don't know. These things never happen to me with the official GPT page or even with Phind.
I've noticed this with GPT-4o recently. Try to have it re-write using OPUS.
I subscribed to a pro plan today, and it seemed like we had a limit of 600 requests per day. Where can I see the remaining credits? It's not displayed anywhere in my account?
hover on the pro button
if youre enterprise yeah it just doesnt say
if you really want to check then open this link https://www.perplexity.ai/_next/data/hoDUGcZA5-ruYK-5Lc8a9/en-US/settings/org.json and look for this section
Over Pro button i just see CTRL .
yeah then youll have to do this if you want to really check
they hide it for enterprise pro, idk why
yes the bad thing is that I ask 5 things and they make me wait a whole day to use it again. I have a 20 dollar subscription.
well i dont think ive ever hit 600 anyway
ok, it was just to check, but i'm not enterprise
yes just for curiosity
Because i played so much today haha
pplx has bad followup context management i think
idk
yeah the counter is now hidden until you get 'low'
in Opus's case it's a bit misleading as you don't get the same amount, you get nearer to 1/10th compared to other models such as Sonnet or GPT-4o
My scrapers are already written in Go, so likely won't see a large difference.
8.33%
Looks like the advantage of katana is being able to easily switch between http and headless requests.
I just don't understand something:
- I upload a PDF into Perplexity,
- I ask a question or two about the contents of the pdf,
- I create a Collection and in the "AI Prompt (optional)" section I indicate that I don't want Perplexity to use any outside sources when it responds to questions,
- I come back the next day, open this collection and ask a question and it consults outside sources! I ask it in the prompt NOT to do this and it still does it!
What am I doing wrong? I appreciate any help someone can provide.
turn off pro mode
if you have pplx pro then select a model in settings like gpt4o or opus
I do have the Pro slider set to off....
And I am using GPT-4o: https://app.screencast.com/8BgvwUhLJLOYT
it doesn't have any kind of persistent memory. each thread starts fresh. your previous messages and uploads don't carry over
yeah but i think it's for a collection. as in you need to reupload the files each time. The prompt for the Collection is the only part that is "stored"
though i may have misunderstood
I didn't know that. I really thought it had this capability. Surprising.
can you explain what you mean by "upload carry over in follow up"?
be wary of what these large language models will tell you!
they're very convincing, always obliging - but not always accurate ha
The problem is that it told me that certain concepts were contained in the uploaded document, but when I searched for those terms in Adobe Acrobat it - correctly - told me that those terms were NOT mentioned in the pdf!
#ask-community message it all stems from the 'hallucination' problem with the technology
I guess I should just use Adobe Acrobat's AI Chat feature.... Or is there another AI tool I should use when I only want to search within a pdf?
I could use: https://pdf.ai/ but I just don't want to subscribe to yet another AI tool.
you might want to try using the "Writing" mode on Perplexity; it disables the web/online part of perplexity, which will help keep the llm focused on the document
yo guys whats the best way to scrape web for giving llm latest context? im doing brave search api + cheerio currently
but fwiw for what you were describing earlier - wanting to upload a repository of files - you might want to look into chatgpt's Custom gpts
How do you turn on "writing mode"?
Press the 'focus' tab then select it from there
Excellent idea. Thank you!
google's Notebook LM is also good for working with documents
Perplexity is going to compete with open ai lol
Well, they've got more funding, but there is such a thing as wearing too many hats
I'd rather have 4.5/5 fix 4/4o's weak points than stretch its capabilities to other things that might suffer from those weak points
Hello!
Can anyone tell me what the limit for Claude 3 Opus is and when the attempts renew?
Given that Google's main game is data and being synonymous with Internet searching, and how they have been struggling with Gemini that has 10x the context windows of GPT while having Google's 80% of all user data; plus, experience as the world's biggest search engine...
It shows that OpenAI would need to do a lot of work in areas that currently have better solutions, even if that search functionality was added.
It would be better for OpenAI to be innovative, improve their weak links, and make their foundation unshakable before trying to branch out into areas GPT wasn't built to handle optimally without fixing the weak spots.
Gemini was far less known compared to GPT to the general public and GPT still hallucinates without pulling fringe/troll search data from sites like reddit.
It would be a large blow to their credibility; just as Google's AI currently is losing credibility by suggesting people eat rocks or jump off bridges.
50 and I can't remember, sorry.
I have no clue how google made it such a joke honestly
I was able to make a better search summary than google with like a few days of work
@agile jay what do you think i should make the backend in
I would have probably made it with Go.
true but i know 0 go
Luckily it's probably one of the easiest languages to learn...
true
And the simple binaries as output is nice.
Yeah I find that super nice
And the build times too.
I just switched back to Android from iOS, and perplexity app seems to not support the same 'voice conversation' features on Android as on iOS? is that correct?
Yep, android users are second class citizens on perplexity...
Yep, that's what happens when the devs are from SF...
especially since Apple and iPhone are literally the worst platform for AI and will see a significant dropoff in the next few years because of what they lack, which they won't be able to make up for anytime soon.
sad.
Yep, in the US it's like 70% apple users, and since SF is likely richer, their percentage is likely even higher.
Yeh. The GPT assistant is macOS only too.
yeah, but that's OpenAI, but indeed, I was surprised considering they literally have microsoft as their top investor.
no idea why anyone wants to be inside the digital prison that is AppleVerse.
I tried an iPhone for 2 years, and yes it has some nice features, but it was a prison.
They like spending money to feel relevant.
It's also weird since Apple is so restrictive with what they approve of and disapprove of in hardware/software.
So much more red tape.
Yep, and no side loading.
But in the EU there is supposed to be sideloading this year.
EU is doing a lot to help everyone fight Apple's "isolationism" tech
In the US we wouldn't be able to hold any ground against Apple
Yep, otherwise you have to pay $99/year as a dev, just to sign your apps...
So, it makes no sense. OpenAI putting out an Apple only thing that I would love to try.
Given Microsoft backing.
The assistant was like the thing I was most interested in during their announcement.
I wonder how windows 12 will work with all this ai hallucination. Will that thing break their system?
Well, MS has said W11 is the "last" iteration and it will instead just be patched and improved as an ongoing product
Chatgpt alone uses 6GB of my ram space currently lol
Chrome and Firefox eating up mah RAM too
MS says everything
Ye, that's why I said "said" :p
Instead of "W11 is the last"
The only people who know are above my paygrade.
"just buy a copilot+pc instead"
they said that for 10
then the ceo changed
Copilot+ spooks me
I remember once when microsoft said windows 10 would be the last windows version lol
And I would need to buy an entire laptop to use it
recall is a cool rag idea until they start selling my data
why does it take screenshots
surely better way exists
Ye
Maybe the images are for the user's benefit, and what is stored is the output from Phi 3 vision
GPT assist doesn't do that stuff like copilot+ ... The memory for the assistant may be less "accurate" in the long term but I really just need it to help with my current questions instead of asking if it remembers what I was doing three months ago.
Data is taken either way, but it's the amount that is so vastly different
If ai will be integrated in windows 12, you would need at least 32GB ram to operate
Yep, the lowest you can get on a windows 12 device is 32GB
why is the api so bad it doesnt even seem like its online
which model are you using?
the api isnt using the same model as the webchat clearly
Ooh my! That will cost a lot
i want my money back
Not really, 32GB ram has been cheap for a long time.
Not Apple where adding 8GB more ram increases the price by a few hundred dollars...
LLaMa 3 i believe
anyone else here using the api?
the sonar one? with -online suffix?
it's unusable, i've built my own api with other model
i can provide you source code
its literally unusable
Are you also on windows?
im interested
unrelated but can i ask you a go question
32GB of ram is pretty standard for even prebuilt these days. The RAM speed might be crap, but it's still 32.
i'm using it on openrouter currently, but i will later adjust it to be able to run locally
i just dont understand why it gives such bad responses compared to the web app? it doesnt say anywhere when signing up for the api that its using an unusable model
it's work-in-progress but searches well and is better than perplexity api already
Make sure you have a spicy GPU if you want to run locally
Sure
So that means, 5 years from now, people will go up to 128GB ram?
Yep, I use windows and linux.
does it give similar responses to the web app?
the thing im confused about is i was under the impression perplexity was just a wrapper, so how come the api doesnt give the same responses
Nope, they are not comparable at all.
the api model doesnt even seem online...
If we think back to the nineties, people thought 1 GB was impossible to fill up for a non-commercial user... CoD is 200 GB. What we think is excessive now just becomes standard later.
yeah but they dont state that anywhere @agile jay
i want my money back
@signal hamlet
Hey sorry if this is wrong channel but who can I contact to get higher api limits?
There is sort of a soft cap with speed at a universal level when we start dealing with quantum computation at a consumer level.
Maybe try in #api-general
i could try this instead of llama 8b for sources...hmm
Nobody gets higher limits. :s
I know some that did :((
perplexity app uses different model than the one on api, and it's half-baked after some change
I want to be part of the happy few
Businesses undoubtedly
there is something shady going on here ngl
It's for my business too, a 100k+ users
actually 500k+
It only has one endpoint...
yeah its nothing like the pplx frontend
Then you might be in luck. There is contact info on Perplexity's site somewhere about enterprise stuff.
at most it might be the same as what free search gets...?
im going to bring some friends here and dig into their business model
/chat/completions
something seems wrong
alright will try to find it
You can say that again
will send you in 20min on dm, i fked up something in code
i was under the impression perp was a wrapper for other models, yet they cant make that model api accessible?
seems like they are stealing and using something they dont want us to find
so confusing man
i dont think they have the infra to actually host api scale other models
whats the best api you guys are using
i think they just have some good code and a few cloud gpus on it
groq is good for small models
idk
about it really
really
i never used grog
fastest and cheapest out there
groq
they dont use gpu
openrouter is the best, has the most models for cheap
Groq for speed and price.
they use their own type of chip (LPU)
Surge limit: By default, all users are subject to a maximum rate limit of 200 requests per second to defend against denial-of-service attacks. Contact us in Discord or using our support@ email address if you need a higher limit.
Free limit: If you are using a free model variant (with an ID ending in :free), then you will be limited to 20 requests per minute and 200 requests per day.
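For anyone calling an API under limits like the ones quoted above, here is a minimal client-side sketch of a rolling-window limiter. The class name `SlidingWindowLimiter` is made up for illustration; only the 20-per-minute figure comes from the quoted text, and nothing here is part of any provider's SDK.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` within any rolling `window_seconds` span."""

    def __init__(self, max_calls: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max_calls = max_calls
        self._window = window_seconds
        self._clock = clock
        self._stamps: Deque[float] = deque()  # timestamps of recently allowed calls

    def allow(self) -> bool:
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self._window:
            self._stamps.popleft()
        if len(self._stamps) < self._max_calls:
            self._stamps.append(now)
            return True
        return False


# Free-tier per-minute figure from the quoted limits (20 requests per minute).
per_minute = SlidingWindowLimiter(max_calls=20, window_seconds=60)
```

Call `per_minute.allow()` before each request and back off when it returns `False`; a second instance with a 24-hour window would cover the daily cap.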
hmm
Oh. Speaking of Groq, what y'all think of the xAI thing musky was talking about?
Open router and vercel for other stuff.
Different groq
thats grok right?
i need accurate, up-to-date responses for my app, thats why i was planning to use perp till i realised how dogshit the api is
got it
do you think this sort of api is in demand?
Always
Then what was the big announcement I was reading about this week?
im guessing yes
Since people always want up-to-date info
that was grok x.ai (twitter)
Probably they are rolling out the next version of Grok
OH
not groq LPU inference
Gotcha
perp was giving perfect responses in the web app, im so heartbroken the api isnt the same
Well, Groq Grok was crap on Twitter, I don't see why anyone would want to run it locally
ive been working on something similar
nothing prod ready tho
idk how i would scale it to a full api
yeah man keep me updated 100%
grok*
My bad
right now i have working web search kinda model as decent as perplexity frontend
i must be missing something
how come we cant achieve the same results as them if they aren't using their own model? i get it has hardcoded prompts, but how is it getting accurate information via web search? couldn't we just build the same thing?
or is that what you are doing @sleek vortex
A secret sauce that perplexity has on their web app?
well ive built that in less than a week
something shady man
something equivalent to pplx web app
yes please
Yep, it's not hard to make a perplexity like app. The harder part is getting VC money.
yeah lmao
To scale to the moon.
So do I, but most of the competition will likely get nuked by openai when they release their search.
The path to AGI is full of dead startups...
Wasn't Gemini 1.5 Pro with the 1M window also supposed to be a nuke?
Who actually needs 1M context?
my theory is something like this:
- they take the user's query
- in copilot mode, they send this to a small finetuned llm which returns a bunch of searches to make on google/bing/their own indexer
- in non-copilot mode, they just use keyword extraction or search your query as is on google/bing/their own indexer
they may have layers in the backend that summarises the sources or uses embeddings, but im not entirely sure - if they have their own indexer then they may be running this in the background but i really doubt pplx is doing this
- they then take the top N results and fit as much as they can into the LLM's context and make a response
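The theory above can be sketched roughly as follows. This is a guess at the shape of such a pipeline, not Perplexity's actual implementation: `plan_searches`, `search`, and `llm` are hypothetical callables the caller supplies, the 8874-character per-source cap is borrowed from the SearchAgent log earlier in the chat, and the 24000-character total budget is an arbitrary assumption.

```python
from typing import Callable, List, Tuple

Source = Tuple[str, str]  # (url, page_text)


def pack_sources(sources: List[Source], max_chars_per_source: int,
                 total_budget: int) -> str:
    """Trim each source and add them in rank order until the budget is spent."""
    parts: List[str] = []
    used = 0
    for index, (url, text) in enumerate(sources, start=1):
        block = f"[{index}] {url}\n{text[:max_chars_per_source]}\n"
        if used + len(block) > total_budget:
            break  # the next source would overflow the context budget
        parts.append(block)
        used += len(block)
    return "".join(parts)


def answer(query: str,
           plan_searches: Callable[[str], List[str]],
           search: Callable[[str], List[Source]],
           llm: Callable[[str], str],
           top_n: int = 3) -> str:
    """Query -> search queries -> top-N sources -> one packed prompt -> response."""
    sources: List[Source] = []
    seen = set()
    for search_query in plan_searches(query):
        for url, text in search(search_query)[:top_n]:
            if url not in seen:          # skip pages returned by several searches
                seen.add(url)
                sources.append((url, text))
    context = pack_sources(sources, max_chars_per_source=8874, total_budget=24000)
    prompt = f"Answer using only these sources.\n\n{context}\nQuestion: {query}"
    return llm(prompt)
```

In this sketch the "non-copilot" path is just `plan_searches=lambda q: [q]`, while the "copilot" path would plug in a small model that returns several search queries.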
i have users without web search, web search will only make the responses in my app 100x better which should = more growth
ive been trying to make something similar that can also do multistep reasoning
maybe im missing something
The longer the context the slower the response, and the more entropy to the output.
this makes a lot of sense
well they need that 2.8million context to show their investors how you can upload a 2hr movie and ask it about a random frame
do you want me to give you details of what ive been doing?
Yep, the more difficult part is making the model answer the way the user wants.
Which is what multi step reasoning is useful for.
yeah
but right now my issues/todos are
- first moving the codebase out of python local
- somehow scaling it
- and then i need to maybe build my own indexer/cache layer backed on s3 or some form of cheap storage
- and then frontend, ofc
honestly its going to sound stupid but I'm just making wrappers for different llms that provide specific responses for a targeted group, then giving them a user-friendly UI/UX to interact with and charging a subscription for the amount of tokens they use
perp skipped half of this for me
Yep, I think the best options are to make the backend in Go.
To change the infra to use a producer/consumer model for easy scaling.
To pre-cache a lot of popular sites using katana.
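A minimal sketch of the producer/consumer shape suggested above, kept in Python since that is what the current codebase is described as using. The in-process `queue.Queue` is a stand-in for whatever broker a real deployment would use, and `run_workers` and `handle` are hypothetical names.

```python
import queue
import threading
from typing import Callable, Dict, List


def run_workers(urls: List[str], handle: Callable[[str], str],
                worker_count: int = 4) -> Dict[str, str]:
    """Producer puts URLs on a queue; `worker_count` consumers process them."""
    jobs: queue.Queue = queue.Queue()
    results: Dict[str, str] = {}
    lock = threading.Lock()

    def consumer() -> None:
        while True:
            url = jobs.get()
            if url is None:          # sentinel: no more work for this worker
                break
            outcome = handle(url)
            with lock:               # results dict is shared between threads
                results[url] = outcome

    workers = [threading.Thread(target=consumer) for _ in range(worker_count)]
    for worker in workers:
        worker.start()
    for url in urls:                 # the producer side
        jobs.put(url)
    for _ in workers:                # one sentinel per consumer
        jobs.put(None)
    for worker in workers:
        worker.join()
    return results
```

Scaling then mostly means raising `worker_count` or moving the consumers to separate machines, which is the appeal of the pattern.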
i mean yeah half of ai is just making it accessible and useful to each user's own circumstances
providing up to date responses
but the api is useless
so i cant use it inside my apps
Try sending as txt
No idea, but they can't hide it from me...
The ipad pro m4 pricing one
this the sort of thing ive been building so far
ooo
accurate up-to date information using multiple llms and custom searching pipeline
for e-com?
Yep, python can be janky when trying to make it concurrent.
no just something like pplx
you should try and self fund this, i would be interested
You don't need much funding for it.
using what i can get
dm me details
for free
Go fund me, lol
nobodys going onto gofundme for an ai project
if i built out the whole project and platform i could probably get users from just promoting it
Or start a patreon for people to support you if they want.
then make a consumer subscription
like pplx
cheaper since i dont have opus
have every model except opus
Yep, there's a lot you can improve on with the current model.
i plan to add code interpreter and other tools too idk
Such as making it more agentic, since you have steps and intent prediction.
yeah
right now dont have dependencies really working tbh
Yep, since you don't have a clear schema for it.
are you working on it alone?
In my case, I just use gRPC to combine them, and have them as separate services.
Yep, and by sharing progress so I and a few others can help with dev suggestions.
You could probably make more by just focusing on making an API for other devs to use, lol.
yeah ive been working on it here in this server for like a week or two
And then increasing the margins.
from scratch
lmao the fact we are talking about building a competitor in their discord cracks me up
yeah
yeah lmao ive had that thought at the back of my mind
fixed the code, accept invitation
yeah like even the fire scraping forgot-the-name that somebody else sent
like thats just a project turned into a nice credits-based api
and then people use it
because convenience, right?
Yep, probably the way to go, for the dev facing side.
my 50 opus queries can come up with a better name
?
i wonder if the vc's know how much money they are missing out on bc the api doesnt work
might have to let them know
But probably pplx also doesn't want an easy API, since then someone can easily just make a mirror site that just uses the API...
why do you think theyve got so much funding from telecoms
And make it cheaper than the subscription...
Yep, SKT and Softbank partnerships.
ill wait for the vodafone partnership so i dont have to pay for pplx pro...
¯\_(ツ)_/¯
until then!
Yep, the question now is how to make citations a lot better in search.
citations...hm
Since currently just adding a super long list of source numbers is probably not the way to go...
we would need the mini models to push forward the used sources
What is going on here??
Just pages.
They are telling me "please react to the channel" with that and then they showed me I have to then access this channel
But I have no idea what channel it is
Ok my guy, I'm not god, I don't know everything, neither am I up to date on everything...
It is not my fault
nono, i dont mean it rudely :d
but yeah its a new feature they were testing
Maybe... Ask perplexity?
it isnt the best - i think theres still a decent amount of issues with it
Did they initiate the feature? Could I access it?
but its not bad either
Could I see for myself?
I love experiments
@sleek vortex katana is pretty good for crawling sites btw.
yeah it is
Many times they don't respond to me at all
i got access by doing some form ages ago
not sure if theyre still checking it though
But it is not like a new update or a feature everyone can access
i think they were beta testing it
so they made it like gated
but then did they give up or something
as there hasnt been feedback for a few weeks
if you want to see it right now, i could put in a query for you
yeah, but we'd have to roll our own parser on top
what i have right now is pretty good
but yeah as you said id like to be able to deal with sites like apple too
It's also easily accessible in Go, since it's a go package...
I was thinking of a technique to easily remove a lot of junk data.
Yes Sir
Basically compare all the pages of a site and remove the duplicate elements.
what on earth is this price
There are multiple layers in place.
@sleek vortex I clicked on the link you sent and it told me to insert an email and it said "we will send you a message shortly"
interesting
Which is probably why.
So alright
Yeah i got accepted after like a day or two
I would imagine that would remove a huge amount of junk.
And work for nearly every site.
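The dedup idea above could look something like this. It is only a sketch under assumptions: each page is assumed to be already split into text blocks (by whatever parser sits on top of the crawler), and `strip_shared_blocks` and the 0.8 threshold are made up for illustration.

```python
from collections import Counter
from typing import Dict, List


def strip_shared_blocks(pages: Dict[str, List[str]],
                        min_share: float = 0.8) -> Dict[str, List[str]]:
    """Drop any text block that appears on at least `min_share` of the pages."""
    page_count = len(pages)
    seen_on: Counter = Counter()
    for blocks in pages.values():
        for block in set(blocks):        # count each block once per page
            seen_on[block] += 1
    boilerplate = {block for block, count in seen_on.items()
                   if page_count > 1 and count / page_count >= min_share}
    return {url: [b for b in blocks if b not in boilerplate]
            for url, blocks in pages.items()}
```

Navigation bars, footers, and cookie banners repeat on nearly every page of a site, so they cross the threshold, while article bodies do not. Exact string matching is the simplest possible comparison; near-duplicate blocks would need fuzzier matching.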
yeah then we could combine with convert to markdown or something
markdown is quite good because it preserves title weights/significance from articles
Yep, makes life so much easier.
where to start with go
I use markdown a lot, so i know why it is good lol...
Basically you can think of slices as the default arrays. Since it's pretty uncommon to use an array, which has a fixed length.
But they are using a slice in that one, since they didn't specify the length of it.
@sleek vortex what can it do anyways???
What is it even good at doing?
Pages
right after they extend it?
I have experimented with various language models and some AI-powered search engines
so is the slice like an underlying reference to the array
which is why its able to be expanded all of a sudden and re-adapt the elements
More or less. and the [:0] is just a slice, which python also has.
basically it asks your query to some small llm which splits it into 3 titles, then its the same as going and asking perplexity free to answer each of those title queries
its really not the best imo
So I already have a lot of knowledge and understanding in these models
hm, okay
Slicing is pretty useful, do you not use it when limiting the model input?
no i do but
go is like a bit different in how you can re-expand the slice
thats my main question
Huh? So you ask a question and then it answers them?
Not sure what you mean
they do s = s[:0] but then it's re-expanded with the same s?
let me screen rec a demo
Oh, yes, you can change the length of the slice, but you normally don't;
K
yeah like in python you dont do that right
once youve sliced it you lose the rest
same with like js
It does do it, but under the hood.
some_list = [1, 2, 3, 4, 5]
some_list = some_list[:0] # new empty list, the old items are gone
some_list.append(25) # grows again; some_list[25] = 25 would raise IndexError here
So it's something you can do, but you rarely ever see it in code.
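To pin down the difference being discussed: in Go, `s = s[:0]` keeps the slice's backing array and capacity, so later `append` calls can reuse that storage. Python behaves differently, which the small example below shows. The variable names are just for illustration.

```python
original = [1, 2, 3, 4, 5]

# Slicing builds a new list; the original keeps all of its items.
empty_copy = original[:0]
empty_copy.append(25)

# Clearing in place is the closest Python analogue of reusing the same list.
reused = [1, 2, 3, 4, 5]
alias = reused            # second name for the same list object
reused.clear()
reused.append(25)

# Assigning past the end is an error rather than an implicit resize.
try:
    empty = []
    empty[25] = 25
    resized = True
except IndexError:
    resized = False
```

So in Python the truncate-then-refill pattern needs `clear()` (or `del lst[:]`) plus `append`, and any growth of the underlying storage is handled by the list itself.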
Ok..
compressing it
im not lying
macos screen recorder outputs huge files
so im running ffmpeg on it (slowly)
frame= 3033 fps= 73 q=31.0 size= 8960kB time=00:00:50.51 bitrate=1453.0kbits/s dup=27 drop=0 speed=1.21x
nearly done
slightly long but yeah there you go!
Yep, ffmpeg can take a while, if you're doing CPU encoding.
It looks really professional
But can you be sure it didn't hallucinate?
Looks like something you can use to get 100% on a whole assignment
Not really. It's more like a overview of a topic, rather than an answer to an essay.
Well if you verify the information that it is giving you by means of sources, then there is a very low chance it hallucinated
You would be surprised...
how do you have the exact same UI as perp?
did you rebuild it
or get access to the source code
pages is perp
The "Pages" feature is currently in closed beta.
check dm
Nvidia single handedly is doing the heavy lifting for the entire US economy at this point
Yeah
Went from the 94th position
To 2nd
Within a year
When this bubble pops
Oh boi
i mean
will it pop
i didnt see any insane company growth like this in failed hypetrains like crypto/web3
gpt5
Yeah maybe that
But more like
Massive rollout worldwide free gpt 4 voice for the normies
did they go up after copilot+pcs
yeah
most people use that
some might be touching google gemini and others
And now gpt 4o is free including vision browsing etc as a small limit
but id assume the paid model population is really low in actual consumer adoption
Yep, but now they all have access to 4o with all its features.
Definitely going to hype the markets
the average consumer, doesnt know what they could do, or at least thats what i think
But it also means that the next model should come out soon, for the plus users.
Otherwise, what's the point.
Now the revolutionary thing will be if.....
They roll out voice mode
For free
As well
Yep
If some boomer in middle of nowhere
Gets to use the voice
I bet he will take this ai stuff more seriously
Yep, all those retired boomers with no social life will likely use it a lot.
i wonder how google would change
if they released project astra tomorrow
but after their already bad situation...
no clue
Guess AGI for president will be more realistic since it will have the retired voters votes.
Yeah definitely
Google has no quality control...
And by far the most AI devs
Didn't google raise the price for gemini flash
bruh their teams literally invented transformers
Right after bragging that it's cheap
Yep, doubled more or less
they have their own insane tpus
Ugh
i honestly dont know why they arent first...
so stupid
they have the whole internet
they have all the compute ever
what are they missing?????
Because they are bad at making new products.
You know you could ask the same about American government
Or any organization
The answer is
They are only good at going into a current field and improving it.
Can't think of a field made by google.
They're bureaucratically stuck in limbo
true
they didnt invent search
but they won in the end
or they have at least
the future, maybe not
They didn't invent YouTube
Yep, it's probably the reason why they have such a large graveyard compared to other companies.
yeah...
they have the whole of the internet on google images
why is their latest model not dalle 9 level
They are just being a wuss
Because they have too many devs.
LMAO
no wonder youre being flamed about eating rocks
Doesn't matter if you have the most compute, if it's shared with a large dev team.
The issue is
Search is so profitable for them
And so cheap
They want genai to be just as cheap
But it's not
Maybe they are making AI summary sh*t on purpose.
To make people less likely to use it.
release ai summary 2.0
Even a 1 billion parameter LLM is actually very expensive when deployed at a scale of billions
but then why roll it out to the whole world...
To make people not trust it, before a good version even comes out.
Also it's been almost 3 weeks
No voice
And they are treating chatgpt free users better
Yep, likely because of sky drama
And they haven't even mentioned us Plus users