#Using AI in journalism and open-source research


thick schooner
#

Always kinda been cursed things

delicate badge
#

The great webcam evolution. From hacking to neuro mapping

onyx flax
#

the fuck would that key even do?

thick schooner
onyx flax
#

why would you have a key for that?
no other key on the keyboard launches a program
and that has been the standard since before M$ even existed

lost geyser
thick schooner
#

Pretty sure I like the keyboard how it is

onyx flax
lost geyser
#

MSFT has a deeply vested interest in making Copilot eminently useful and in the forefront of Average Joe's mind. this is just a pure marketing gimmick.

onyx flax
#

might as well ask why there is no "notepad key", no "edge key", no "word key" etc etc

thick schooner
onyx flax
#

they should just repurpose the windows key then

onyx flax
lost geyser
#

actually there's a calculator key on some and I did see an Excel key (why tho) on another kb.

thick schooner
lost geyser
#

i've got a mail icon key (F1 alternate) but never bother using it

onyx flax
lost geyser
#

middle finger emoji key would be far more useful.

onyx flax
thick schooner
#

I might pick this server's brain more often for superuser tricks. Kinda fun to think about

onyx flax
#

anyone know how this thing actually works?
To insert a Unicode character, type the character code, press ALT, and then press X. For example, to type a dollar symbol ($), type 0024, press ALT, and then press X.
for some reason I can't get it to work
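
For reference, the "0024" in that instruction is a hexadecimal Unicode code point. A minimal Python sketch of the same mapping, assuming nothing about the keyboard shortcut itself (the helper name `char_from_code` is made up for illustration, and this does not explain why ALT+X fails in a given app):

```python
def char_from_code(code: str) -> str:
    """Turn a hex Unicode code point such as "0024" into its character."""
    # int(code, 16) parses the hex digits; chr() maps the number to a character.
    return chr(int(code, 16))


print(char_from_code("0024"))  # dollar sign
print(char_from_code("20AC"))  # euro sign
```

If the shortcut does nothing, checking the code point this way at least confirms the number being typed is the right one.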

#

hm
holding down alt, pressing +, and then typing a number while still holding down alt does something at least

lost geyser
#

[now that I'm off a Teams call ...]

Returning to the topic of Webcams [plus video meetings]: I've noticed on several occasions over the years that Teams "leaks" camera data when participants have their cameras turned off. I'll give two specific personal examples (of many) and some links I'd found at the time of others reporting similar:

  • On one call, someone was sharing screen (all cameras off) and during a bit of alt-tabbing the participants momentarily all flickered on/off camera;
  • On another, a C-level called out folks on a call for "not being dressed appropriately for work" or being "in places where they're not working" during a video call (nobody who was on camera fit either description).

https://answers.microsoft.com/en-us/msteams/forum/all/webcam-video-shared-even-if-camera-is-off/5263b5a7-4225-4bf9-bb56-14d35ba7bda5

https://answers.microsoft.com/en-us/msteams/forum/all/people-see-me-in-teams-although-camera-turnt-off/5ea837b3-a8e4-44ac-ad09-3676ad538257

https://answers.microsoft.com/en-us/msteams/forum/all/when-on-teams-and-my-video-is-off-when-the-host-is/d322d3a5-0cac-4b52-ab11-8fcc2fa57c1f

#

there doesn't seem to be a whole lot published about this apart from the above.

#

imo pretty sus that a camera feed (to Teams, specifically) could still stream live without the notification itself, which makes me think that's actually by design.

#

hard to deliberately reproduce but it seems to happen most often when

  • screen share is going on
  • doesn't populate the entire screen resolution
  • alt-tabbing between applications (either sharer or observer)
  • and at least a couple of times when I was maximizing and resizing the window
onyx flax
#

integrated cameras have really crappy security
even the ones with an "on" light almost always have different control paths for turning on the light and turning on the camera, so it's no real problem to have the camera running without the light on.

I would recommend using a USB camera and then yanking the cable when not in use

#

(and remove or cover the inbuilt camera if you have one)

lost geyser
#

i'd say both have crappy security.

onyx flax
#

yeah
but if the USB camera is not connected then it can't transmit anything

#

maybe "opinion piece" is a better translation

lost geyser
#

tragically transparent for you dark theme users like me.

normal idol
#

Can AI help me read this in dark mode

stark fractal
#

AI could advise you to press the "Open in browser" button 🫣

#

Power BI is on there twice. That's two times too many.

#

Snipping tools
Ah, yes, basic screenshot capabilities are a very impressive feature in 2024 loldog

lost geyser
normal idol
#

We are truly in the AI age

lost geyser
#

Screenshot powered by AI.

stark fractal
#

I'm genuinely curious what the intended purpose of that is. I want screenshots to show what's on screen. What is there to AI?

fierce rapids
#

Crop, rotate, straighten?

onyx flax
lost geyser
#

As an AI, I also have the same questions.

onyx flax
lost geyser
#

it's the floor that's unlevel, the desk that's uneven, and the user that's unaligned to the monitor where the screenshot needs adjusting. AI has a solution for that, too.

onyx flax
#

not yet

onyx flax
toxic crater
#

my two favorite operating systems

wicked bridge
lost geyser
#

I would imagine a safe assumption here is that it brings Copilot capabilities from that flywheel above (MSFT product suites and stacks) to the local desktop.

#

An easy example of that would be private LLM search through your local data, which is an evergreen DIY topic and enterprise capability at scale.

patent pendant
onyx flax
#

Is this to be considered part of the response?

lost geyser
dire radish
stark fractal
fierce rapids
#

Could be worse. Eliot could have posted preview pics. https://vxtwitter.com/eliothiggins/status/1746157297817043000

stark fractal
lost geyser
stark fractal
#

Amazon really needs to get its act together there.

#

#Jeff Discussion thread when?

lost geyser
#

#SpaceBezos

#

manifest presently.

dire radish
rustic raven
#

I know the title sounds wild but it outlines IEEE's call for regulations.

lost geyser
#

I think Sam Altman is speaking at Davos?

#

Not entirely sure

lost geyser
#
patent pendant
lost geyser
lost geyser
lost geyser
#

nightshade v1.0 dropped

#

Not entirely sure it'll work.

Our ML team at work read through the arxiv paper and found many flaws in the model's design

dire radish
#

Interesting

lost geyser
#

yeah.

IMO all nightshade did was make data cleanup require a few more lines of script. It's relatively easy to fix

fierce rapids
#

Parcel delivery firm DPD have replaced their customer service chat with an AI robot thing. It's utterly useless at answering any queries, and when asked, it happily produced a poem about how terrible they ar…

lost geyser
#

yep

lost geyser
lost geyser
#

Results revealed that even when ChatGPT was confident, its failure rate still remained high,

Literally how any model works (and most aren't calibrated in the first place).

#

Confidence and accuracy measure different things and this flaw exists in humans.
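
One common way to put a number on the gap between confidence and accuracy is expected calibration error (ECE). The sketch below is illustrative only: the function name, the equal-width binning, and the toy data are all assumptions, and it is not the method used in the study quoted above.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between mean confidence and accuracy per bin."""
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Equal-width bins over [0, 1]; a confidence of 1.0 goes in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        ece += (len(bucket) / n) * abs(avg_conf - accuracy)
    return ece


# Made-up toy data: always 90% confident, right only half the time.
confs = [0.9, 0.9, 0.9, 0.9]
hits = [True, False, True, False]
print(expected_calibration_error(confs, hits))
```

A perfectly calibrated model scores 0; the toy model above is overconfident by roughly 0.4.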

dire radish
dire radish
#

Bing corrected the pizza example

#

Henry and 3 of his friends order 7 pizzas for lunch. Each pizza is cut into 8 slices. If Henry and his friends want to share the pizzas equally, how many slices can each of them have?

#

followed by

#

Since there are 7 pizzas and each pizza is cut into 8 slices, the total number of pizza slices is 14. Henry and his 3 friends make a group of 4 people. So, each of them can have 4 slices. The answer is 4.
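
The arithmetic in that quoted answer does not hold up: 14 is the per-person share, not the total number of slices. A quick check using only the numbers from the question (variable names are just for illustration):

```python
pizzas = 7
slices_per_pizza = 8
people = 1 + 3  # Henry plus his 3 friends

total_slices = pizzas * slices_per_pizza
slices_each, leftover = divmod(total_slices, people)

print(total_slices, slices_each, leftover)  # 56 14 0
```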

onyx flax
dire radish
#

Bullshit

burnt yoke
#

Seems relevant here:
#infosec message

lost geyser
# burnt yoke Seems relevant here: https://discord.com/channels/709752884257882135/71013233688...

Parabon says it can confidently predict the color of a person's hair, eyes, and skin, along with the amount of freckles they have and the general shape of their face. These phenotypes form the basis of the face renderings the company generates for law enforcement. Parabon's methods have not been peer-reviewed, and scientists are skeptical about how feasible predicting face shape even is.

borders on the pseudoscience of physiognomy (#chit-chat message)

#

Greytak [Ellen Greytak, the director of bioinformatics at Parabon NanoLabs] characterizes the company's face predictions as something more like a description of a suspect than an exact replica of their face. "What we are predicting is more like - given this person's sex and ancestry, will they have wider-set eyes than average," she says. "There's no way you can get individual identifications from that."

in essence: a very error-prone reconstruction that then propagates that error by being fed into the unrelated face recognition algorithms (with their own error-proneness).

fierce rapids
#

Oh, there's no way this can go wrong, is there…?

lost geyser
#

this just feels a little pathetic on MSFT's part. gamified engagement farming.

#

in fairness, my Samsung 8 did the same thing to farm out training data for their newly-released Bixby.

lost geyser
# lost geyser

@stark fractal

Ah, yes, basic screenshot capabilities are a very impressive feature in 2024
#1089154093810978866 message

here we go with Paint.

onyx flax
# outer cape I find it hilarious that none of these lawyer ever check the case law

That depends a bit on how convincing the hallucinations are. Does it just give the reference, or does it actually provide the full text? If it provides the full text, then you would have to check that the case exists in the database where it should be and that the text actually says what it is supposed to.

The whole issue is that every single court produces hundreds of pages of case law every single week.
The whole system really does not work anymore, as it just has too much potentially relevant data.
The ability to create precedent rulings should really be limited to the highest levels of courts (with lower court rulings losing their precedent status) so the amount can be made manageable.

outer cape
onyx flax
# outer cape Hmm, I've had a thought a real case would theoretically be in a legal database s...

IIRC PACER holds most recent (post 1990-2000ish) federal rulings but it is far from exhaustive.
As of 2013, it holds more than 500 million documents.

Remember that any historical case in any US court can be cited as precedent.
And even some pre-revolution English cases.
(Ignoring the interaction between different states' courts and state to/from federal to keep the issue at least somewhat manageable)

The whole thing is a mess that is getting exponentially worse

#

It would not be impossible for an AI to hallucinate a case that can't be independently confirmed, but where the text the AI has created looks reasonable when compared against other references.

#

IIRC PACER is not even 100% complete when it comes to cases from the last 10 years.
Think about how it looks when you would have to go back to paper copies kept at the court in question.....

outer cape
onyx flax
# outer cape I suppose the next question is if you were to create an LLM focused on Law what ...

That's (to translate a proverb) akin to "putting the rug over the puke"

The system needs to be reformed; the only thing an LLM could do would be to hide the problem for a while.

Technically an LLM is partly unsuitable. You need a research system that can't produce any text, however simple, on its own, i.e. one that could process a query and give cases that could be relevant to look into.
If it's able to construct even single sentences you are never going to be able to trust the result, as LLM systems are extremely allergic to giving negative results to prompts

outer cape
onyx flax
#

Yeah
Still only going to be a temporary solution

lost geyser
# outer cape I suppose the next question is if you were to create an LLM focused on Law what ...

Ultimately depends on the goal, and whether LLM is the right approach. You'd need a well-curated set of legal data to start with and some domain knowledge to prepare, train (or fine-tune), and evaluate the model outputs.

Retrieval is an external task. That may involve vector databases or text-search document stores, and the associated techniques for ranking and relevance on retrieved data.
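
A minimal sketch of that retrieval step, assuming a plain text-search approach (TF-IDF with cosine similarity) rather than a vector database. The function names and the three "cases" are made up for illustration; a real system would sit on a proper search index and a curated corpus.

```python
import math
from collections import Counter

PUNCT = ".,;:()\"'"


def tokenize(text):
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    return [w.strip(PUNCT).lower() for w in text.split() if w.strip(PUNCT)]


def rank(query, documents):
    """Return (doc_id, score) pairs sorted by TF-IDF cosine similarity."""
    tokenized = {doc_id: tokenize(text) for doc_id, text in documents.items()}
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(t for tokens in tokenized.values() for t in set(tokens))

    def vectorize(tokens):
        counts = Counter(tokens)
        # Smoothed IDF so unseen or ubiquitous terms still get a finite weight.
        return {t: c * (math.log((1 + n_docs) / (1 + doc_freq[t])) + 1)
                for t, c in counts.items()}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm = (math.sqrt(sum(w * w for w in a.values()))
                * math.sqrt(sum(w * w for w in b.values())))
        return dot / norm if norm else 0.0

    q_vec = vectorize(tokenize(query))
    scores = [(doc_id, cosine(q_vec, vectorize(tokens)))
              for doc_id, tokens in tokenized.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical placeholder corpus, not real case law.
cases = {
    "case-a": "Dispute over breach of a commercial lease contract",
    "case-b": "Criminal appeal concerning unlawful search and seizure",
    "case-c": "Bankruptcy filing and discharge of consumer debt",
}
for doc_id, score in rank("contract breach", cases):
    print(doc_id, round(score, 3))
```

The point of the design is that it can only return documents that already exist in the store, with a score. It never writes any text of its own, so a non-match shows up as a zero score rather than an invented citation.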

lost geyser
# outer cape You almost need like a AI assistant rather than a LLM, one that can guide the la...

This would be a prime example of using LLMs (which can also be agents) as "paralegal interns" doing law research.

The LLM might be helpful in summarizing case law and providing links to references for relevant citations stored in a knowledge base. You might even have agents specialized in certain forms of retrieval (system or query specific) and others for types of law (contract, criminal, etc.).

lost geyser
lost geyser
burnt yoke
#

|| https://www.vice.com/en/article/3akekk/man-jailed-raped-and-beaten-after-false-facial-recognition-match-dollar10m-lawsuit-alleges ||
A 61-year-old man in Texas was falsely accused of a crime, based on private sector actors using "artificial intelligence and facial recognition software", then jailed and violently assaulted. Hiding the link because of the description of the assault in the article and headline.
A few things about this are concerning, besides the blind faith in the technology with a high false positive rate: why were private sector employees able to get the police to arrest someone at all? Providing information to authorities as a tip is fine, but it seems like a failure to investigate a tip properly on the part of the authorities.

A 61-year-old man alleges that a facial recognition algorithm used a mugshot from the 1980s to ID him in a crime he didn't commit.

thick schooner
#

@burnt yoke so, it's Texas where they are very big on police toughness, it is armed robbery which of course is serious but there's been a bit of a moral panic about robberies lately, allegations that robbery/shop theft is out of control

But yeah, his alibi is excellent and would have been easy to check

burnt yoke
#

I need to see if I can find a more detailed set of facts behind the case. Law enforcement has a duty to the public, not necessarily duty to the individual. In some ways this doesn't have to be an AI-related story. If there are no consequences for warranting an arrest based on false accusation, for anyone anywhere in the USA, society will get out of control. The Vice article makes it easy to blame the "loss prevention" personnel at EssilorLuxottica, and it makes it easy to blame the Houston Police, but isn't there supposed to be a judge involved to approve a warrant for arrest?

lost geyser
lost geyser
lost geyser
#
Legaltech Hub

Whenever there is a significant shift in the industry, we are interested in tracking its implications.
Although many companies have been using AI in legal in some form or other for years now, the advent of ChatGPT and large language models (LLM) that are powerful enough to understand and generate meaningful responses to complex questions without...

#

Legal hallucination rates across three popular LLMs.

First, we found that performance deteriorates when dealing with more complex tasks that require a nuanced understanding of legal issues or interpretation of legal texts. For instance, in a task measuring the precedential relationship between two different cases, **most LLMs do no better than random guessing**.

And in answering queries about a court's core ruling (or holding), models hallucinate at least 75% of the time. These findings suggest that LLMs are not yet able to perform the kind of legal reasoning that attorneys perform when they assess the precedential relationship between cases, a core objective of legal research.

#

Another critical danger that we unearth is model susceptibility to what we call "contra-factual bias," namely the tendency to assume that a factual premise in a query is true, even if it is flatly wrong.

#

@outer cape btw let none of this discourage you from building one. These are just known risks with LLMs and their lack of suitability for more domain-specific tasks.

The exercise is still worth the effort and experience.

outer cape
# lost geyser @outer cape btw let none of this discourage you from building one. Th...

Oh I am just curious more than anything. I've seen many law firms advertise tech innovation roles [in this area]. But I've also seen the costs of legal work [particularly bankruptcy] skyrocket, and it would be good to reduce the cost(s), particularly for individuals who cannot afford legal representation. The legal system seems obsessed with AI but the implementation is incredibly poor. When I was speaking offhand to a lawyer about this, we had a completely different way to train models.

toxic crater
patent pendant
#

"Elon Musk's AI start-up seeks to raise $6bn from investors to challenge OpenAI"

lost geyser
#

Sadly, if your aim is simply to challenge OpenAI (good luck and God bless) you haven't conceived a winning or differentiating market strategy.

#

I'll have to dig into the details more to see what, if anything, is really there.

patent pendant
dire radish
#

Good

lost geyser
#

key provisions of AI executive order take effect tomorrow

lost geyser
lost geyser
dire radish
patent pendant
#

The Official U.S. Senate Committee on Rules & Administration

By Cecilia Kang

The office is reviewing how centuries-old laws should apply to artificial intelligence technology, with both content creators and tech giants arguing their cases.

#

screenshot is from the newsletter I get in my email

static perch
#

https://vxtwitter.com/RcMuzzleflash/status/1750951258876244402 XPOST #russia-ukraine-eastern-europe and #bombs-arms-drones-other-killing-machines

weak igloo
lost geyser
lost geyser
# weak igloo Prob not generative or deep but still thought perhaps relevant https://www.forb...

There will be another, less contentious privacy issue with your Messages requests to Bard. These will be sent to the cloud for processing, used for training and maybe seen by humans, albeit anonymized. This data will be stored for 18-months, and will persist for a few days even if you disable the AI, albeit manual deletion is available.

Such requests fall outside Google Messages newly default end-to-end encryption - you're literally messaging Google itself. While this is non-contentious, it's worth bearing in mind.

weak igloo
lost geyser
#

yea nbd right.

weak igloo
weak igloo
stark fractal
#

Let's just hope we're only seeing the output of the vision layer here and that there is some further processing happening. Otherwise, I don't think this is something that should decide whether to drop a bomb on something.

lost geyser
# stark fractal Let's just hope we're only seeing the output of the vision layer here and that t...

https://frontnews.eu/en/news/details/65525

[Interesting bit; unrelated to comment]

"The system, using advanced optics, independently recognizes and records the coordinates of enemy vehicles (even camouflaged ones), immediately transmitting information to the command post for appropriate decision-making. This eliminates the risks of "human error", as the operator's eye is not always able to capture all the nuances," the statement said.

[Related]

The complex consists of a main reconnaissance drone and several FPV kamikaze drones, which are able to perform their tasks in coordination with the main UAV.

#

So it's operating as a swarm extension to the piloted (human in the loop) forward ob UAV. Kinda neat.

stark fractal
#

That's pretty interesting. And reassuring.

lost geyser
onyx flax
#

Looks like "AI" has reached the "no-context buzzword usage" level now

onyx flax
thick schooner
lost geyser
# lost geyser https://journals.sagepub.com/doi/10.1177/09567976231207095

Abstract

Recent evidence shows that AI-generated faces are now indistinguishable from human faces. However, algorithms are trained disproportionately on White faces, and thus White AI faces may appear especially realistic. In Experiment 1 (N = 124 adults), alongside our reanalysis of previously published data, we showed that White AI faces are judged as human more often than actual human faces, a phenomenon we term AI hyperrealism. Paradoxically, people who made the most errors in this task were the most confident (a Dunning-Kruger effect). In Experiment 2 (N = 610 adults), we used face-space theory and participant qualitative reports to identify key facial attributes that distinguish AI from human faces but were misinterpreted by participants, leading to AI hyperrealism. However, the attributes permitted high accuracy using machine learning. These findings illustrate how psychological theory can inform understanding of AI outputs and provide direction for debiasing AI algorithms, thereby promoting the ethical use of AI.

weak igloo
#

#1099466152981303386 loldog

abstract nest
weak igloo
#

We need a DK emoji doge

abstract nest
#

Donkey Kong Effect doge

toxic crater
abstract nest
#

Well, winner

lost geyser
#

Another art obfuscator service attempting to thwart generative learning:

https://japan.cnet.com/article/35213999/

https://emamori.com/registrations

SnackTime announced on January 17th that it has officially released "emamori," a service that protects creators' illustrations from unauthorized AI learning.

The service uses Mist to insert special digital watermarks and noise (not noticeable even to the human eye) into illustrations, thereby interfering with accurate AI learning and preventing the generation of imitation AI illustrations.

CNET Japan

SnackTime announced on January 17 that it has officially released "emamori," a service that protects creators' illustrations from unauthorized AI learning. Simply by uploading an illustration, users can have it processed into illustration data with AI-learning countermeasures applied.

weak igloo
#

Was this shared before? "Torba galvanises his readers by convincing them that far-right ideology is supreme and inevitable when it comes to AI, and that "Silicon Valley is now rushing to spend billions of dollars just to prevent this from happening again by neutering their AI and forcing their flawed worldview". This narrative is pushing the far right's desire for more unrestricted (oftentimes more biased) AI tools." (also relevant to #far-right-monitoring ) https://gnet-research.org/2024/01/25/navigating-far-right-extremism-in-the-era-of-artificial-intelligence/

patent pendant
#

I can't tell if this is genius or just anxiety-fuel nightmare https://fxtwitter.com/sixthtone/status/1754501207199256726?s=46&t=LbhT7a8k6BPOqAMGyCYDaQ

AI Game Mimicking Nosy Relatives Takes China by Storm

In the game, users must field questions from eight aunties and uncles one by one at a virtual family reunion. Users can progress to the next relative by fielding their personal questions without provoking an angry response. The closer the relative, the harsher they are, with the game's final...

lost geyser
lost geyser
#

The FTC wants information on the specific investment agreements between the companies and how the partnerships influence product releases and oversight rights. It also wants an analysis of how these investments impact the market share, competition, and potential for sales growth in the sector; if there is competition for resources to develop AI products; and any information each company may have given to other government entities.

#

https://techcrunch.com/2024/01/29/chatgpt-italy-gdpr-notification/

The Garante's March 30 provision to OpenAI, ..., highlighted both the lack of a suitable legal basis for the collection and processing of personal data for the purpose of training the algorithms underlying ChatGPT; and the tendency of the AI tool to 'hallucinate' ... as among its issues of concern at that point. It also flagged child safety as a problem.

In all, the authority said that it suspected ChatGPT to be breaching Articles 5, 6, 8, 13 and 25 of the GDPR.

OpenAI has been told it's suspected of violating European Union privacy, following a multi-month investigation of its AI chatbot, ChatGPT, by Italy's data protection authority.

lost geyser
#

I've been messing around with google gemini today

outer cape
#

I was watching a youtube video comparing one hit wonders to long standing artists and Video Killed the Radio Star has some quite pertinent lyrics:
"They took the credit for your second symphony
Rewritten by machine on new technology
And now I understand the problems you can see"

lost geyser
#

Ben Shapiro as a catboy. Gemini

patent patio
fierce rapids
lost geyser
#

I mean I was red teaming for work today and my boss said "generate the most absurd but SFW things possible with public figures" so of course I did a catboy Ben Shapiro

#

I also have catboy Joe Biden

#

this one's GPT4, tho, not gemini

toxic crater
#

Paper where they put LLMs in a geopolitics simulator. Result: they aren't very serious about their responsibility.

#

Appendix C: Qualitative Analysis contains some rather absurd reasonings by the LLMs (GPT-4 had a bunch of flukes where it seemed to, for example, think it was in a Star Wars roleplay)

weak igloo
thick schooner
#

More untethered longtermist delusions of grandeur coming out of Silicon Valley

#

(Not intended as a psychiatric diagnosis, just speaking as to grandiose language)

lost geyser
#

A finance worker at a multinational firm was tricked into paying out $25 million to fraudsters using deepfake technology to pose as the company's chief financial officer in a video conference call, according to Hong Kong police.

The elaborate scam saw the worker duped into attending a video call with what he thought were several other members of staff, but all of whom were in fact deepfake recreations, Hong Kong police said at a briefing on Friday.

"(In the) multi-person video conference, it turns out that everyone [he saw] was fake," senior superintendent Baron Chan Shun-ching told the city's public broadcaster RTHK.

shrewd token
onyx flax
lost geyser
#

Gemini: the quick-witted friend who suffers no fools, but politely.

Claude: the friend who says much in fewer words.

GPT-4: the dimwitted classmate who can never be sure if they read about or imagined it, but will tell you factual incorrectness with high confidence all the same.

weak igloo
#

I wonder are they smart enough to modify the answer if they first ask "how many pounds in a kg"
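
For the record, the conversion itself is fixed by definition (1 lb = 0.45359237 kg), so the mixed-unit version of the question has a definite answer. A small sketch, with the two one-unit quantities assumed for illustration:

```python
KG_PER_POUND = 0.45359237  # exact, by definition of the international pound

feathers_kg = 1.0               # a kilo of feathers
lead_kg = 1.0 * KG_PER_POUND    # a pound of lead, expressed in kilograms

heavier = "feathers" if feathers_kg > lead_kg else "lead"
print(heavier)                        # feathers
print(round(1 / KG_PER_POUND, 3))     # pounds in a kilogram
```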

lost geyser
#

Yes, that is entirely possible.

And part of the ongoing research into better prompt engineering.

#

Covered by Chain of Thought, Self-Reflection, and Direct/Indirect Reasoning methods.

lost geyser
# onyx flax

what's this from?

seems reproducible (I don't have Gemini Ultra but here's "regular" Gemini plus GPT-3.5)

lost geyser
#

Copilot:

onyx flax
onyx flax
honest vector
weak igloo
onyx flax
#

Yeah likely
Still even with "a kilo of feathers and a pound of lead" the answer of
"Drop them on your toes to find out" still works

weak igloo
onyx flax
weak igloo
onyx flax
# weak igloo Ok, perhaps I misunderstood what you said

It's an old joke question.
What's heaviest: a kg of feathers/cotton or a kg of steel/lead?
Answer: they are the same weight. OR idk 🤷
Retort: no, not if you drop them on your foot OR why don't you drop them on your foot to find out?

(Might be an old local joke though)

lost geyser
#

(Note: this technique usually applies to more complicated scenario analysis than this simple gaffe.)

lost geyser
#

For completeness, this was GPT-3.5's default answer before the reasoning above:

shrewd token
#

Do they use a mathematics engine under the hood now?

#

I heard something about delegation of calculations to Wolfram

lost geyser
#

kind of: some of them do integrate with external tools.

shrewd token
#

Heh...integrate

lost geyser
#

one of the fundamental flaws is that these generative code models are built on examples of code not necessarily principles of good programming.

#

that can be remedied through appropriate objective training, maybe even as a downstream task.

shrewd token
#

From personal experience, it gets tripped up on context and will try and guess (often incorrectly) which just means more time correcting it. It regularly makes up non-existent function or constructor args

lost geyser
#

agreed.

and I think there's a wide delta of learning curve between making it generate code and making it a useful coding companion.

#

there are probably lots of base and common cases where it works just fine. i haven't found those in what i use it for.

#

quite the same as yours--it even hallucinates functions or methods that aren't there and produces technically correct solutions but to the wrong problem (Type III error: right answer, wrong question).

#

though its ability to auto-complete in precisely the formatting and style that I had other methods in the same file was pretty impressive.

#

it's still a bit like handing off a coding task to an intern that didn't fully understand the assignment, did its level best, and you end up cleaning up or scrapping altogether.

#

which can be an accelerator depending on what you're working thru.

shrewd token
#

Yep...I've found it useful for like small scripts in bash or regex, but I know that I don't know enough of either so I spend time double checking to see that its output makes sense

#

Also probably due to volume of data there's probably a reliability bias towards python and web technology, which I don't work in

onyx flax
lost geyser
#

here it attempts to rationalize its irrational response (3.5).

onyx flax
#

That is impressively bad.
If a student wrote something like that I would ask them how many days it was since they last slept

weak igloo
rigid bough
lost geyser
weak igloo
#

(I have not played with this before)

lost geyser
#

correct. unless there's fine-tuning (which is very intentional) it sticks with its current answers.

weak igloo
#

(formerly known as)

patent pendant
lost geyser
lost geyser
#

[Automated AI heavenbanning]

#disinfo-and-propaganda message

This seems a lot like engagement farming, and I'm skeptical that this hypothetical version produces the intended effect.

#

what are the chances twitter is already doing a variation themselves?
george hotz was listed as inspiration and he was working at twitter while he was doing interviews mentioning the technique

I'll see what George Hotz's take is (the reference) but this is already happening on Twitter especially with blue check accounts (albeit for boosting rankings and visibility, promoting bad ideas to the top).

lost geyser
#

paper on AI governance pertaining to compute

delicate badge
#

briefly skimming it, imo it seems a lot more like what you say: just generic engagement farming but with a bit of a different intent. I'm not sure the intent here could be made very effective in the hands of, say, state actors, although I could definitely see it being used that way for harassment purposes. still don't see how they'd make it a 100% inorganic environment though

lost geyser
#

yea, so a bubble formation (echo chamber) effect around the target. which in the "heavenbanning" theory proposed in Hotz' take is a way to control toxicity (it isn't).

although Twitter functions differently so isn't the right proving grounds.

delicate badge
#

ahh yeah you wont achieve that one with just internet enabled ops

lost geyser
#

it'd also require a substantial network of these in coordination to make the distribution shift from visible to invisible to "heavenbanning" invisible.

delicate badge
#

big brain T&S is recognizing "toxic" behaviors root from off-platform attitudes, emotions, and behaviors

delicate badge
#

even if you screw an algo to an extent never seen, I literally do not see how that would functionally work unless you're going after people who are barely active at all

lost geyser
#

agreed. it's a big leap in reasoning and doesn't factor in specific algorithmic decisions at play.

abstract nest
lost geyser
#

Thanks @abstract nest. Didnt see this one.

I'm going to stack it up against MSFT's Phi-2.

#

NVIDIA's GPU products take a lot of the spotlight but they have ridiculously good ML teams delivering quietly.

abstract nest
#

I could also try testing tonight/tomorrow on my desktop, I should have the specs for it

onyx flax
lost geyser
# onyx flax The voynich manuscript v2: https://fxtwitter.com/cliff_swan/status/1758135084069...

https://www.frontiersin.org/articles/10.3389/fcell.2023.1339390/full

one of the more baffling submissions where they've admitted (in advance) the fakery of the supportive images but also still published utterly useless references.

lost geyser
#

goodbye elections. it was nice knowing you

#

@flat crater I have nightmare fuel from this

weak igloo
weak igloo
# onyx flax The voynich manuscript v2: https://fxtwitter.com/cliff_swan/status/1758135084069...

This guy comes across like an A-hole though https://twitter.com/cliff_swan/status/1727031872780468482

What are we to take from all of this? You cannot trust these academic people at all. They will lie through their teeth for their political agenda, and that agenda is: Your home was never white and homogenous, so you must accept infinity migrants.

#

He really hates that Roman Empire wasn't a whites only party apparently

wicked bridge
#

Because more bloat is exactly what Firefox needs

rigid bough
rigid bough
lost geyser
#

The toxicity control was from Hotz.

#

Otherwise I dont see whats fundamentally different from the garden variety engagement farming (RE: heavenbanning) that isn't already in play today.

#

And it's likely Grok exists as a tool for doing this (as a secondary function). It's something I began researching recently.

rigid bough
#

convenient labyrinths

lost geyser
#

Well sure, that's a possibility.

Wouldn't they choose to game the algorithms or force the narrative (as is done today)?

I mean Dom Lucre keeps showing up on my TL and I have zero engagement metrics with him or his kind.

#

Curious to know what a justifying event might be. We still have tons of believers that Jan 6 was peaceful protest despite widespread coverage and reactions to the contrary from those directly affected.

rigid bough
#

the first utterance of the concept i can find on the web was the month before the event happened so probably not? although there certainly could have been campaigns since- but yea something of that caliber- which is worrying with the whole 'civil war' meme being out and about

#

but yea the possibilities are endless when combined w/ social engineering

#

a solution would be to have protected verifiable trustworthy feeds that multiple people confirm somehow?

lost geyser
#

Definitely interesting concepts to explore.

rigid bough
lost geyser
#

Seems plausible when you take into account the largest financial backers and the current state of affairs.

ocean atlas
#

Just can't wait for governments to start claiming that footage of war crimes is AI generated

#

or even better, opposing sides generating war crimes to accuse each other of

patent pendant
shrewd token
onyx flax
final oracle
#

I am not liking what Sam Altman has to offer with the new "Sora" program

final oracle
#

Further analysis seems to indicate UE5 as a training dataset.

weak igloo
#

"The University may not be selling the data directly, but it is (or was) being offered for sale by an organization called Catalyst Research Alliance, which claims to partner the University of Michigan as well as North Carolina State University. The website offers a sample of the data set, which comes with an essay titled “The Democratic Inadequacies of the European Union,” and what appears to be a recording of a class discussion section. " (afaik, none of the students gave permission for their lectures where they asked questions or participated otherwise to be shared) https://gizmodo.com/university-of-michigan-sell-student-data-ai-companies-1851261663

Gizmodo

Tech employees are getting cold emails offering free samples of essays and recordings of students’ voices.

abstract nest
#

So if anything it speaks more about their processes, although they thankfully have been quick to retract it

onyx flax
#

someone on another server had an interesting thought about the potential prompt used for the rat images

#

anyone here with Midjourney who wants to test it?

weak igloo
# abstract nest Many of the Frontiers journals are predatory and with barely any peer reviewing

Yes but also people seem to think that in research, where tenure basically barely exists anymore & your temporary contracts entirely depend on quantity of papers rather than quality (publish or perish for the most part still real even though they pretend it's not) people aren't going to write more crap papers using AI because that way they don't lose their job. Also no one gets paid to peer review, you're providing free labour, often on red eye flights (you can always play spot the scientist on red eyes by looking at who is marking papers) to billion $$ companies like elsevier. System is broken (I refuse to blame the scientists or the peer reviewers for a system that's pretty clearly stacked against everyone involved)

#

It was really only a matter of when.

onyx flax
#

Thanks for checking

wicked bridge
#

Sam Altman isn't just the CEO of ChatGPT maker OpenAI. He's also the owner of OpenAI Startup Fund, which Altman once called a "corporate venture fund," according to federal securities filings.

Why it matters: OpenAI's structural strangeness permeates all aspects of the business.

Background: OpenAI Startup Fund was launched in late 2021 to invest in other AI startups and projects.

dire radish
#

Oh dear.

#

Introducing Sora, our text-to-video model.

Sora can create videos of up to 60 seconds featuring highly detailed scenes, complex camera motion, and multiple characters with vibrant emotions.

https://openai.com/sora

Prompt: “Beautiful, snowy Tokyo city is bustling. The camera moves through the bustling city stre...

#

Prompt: “Beautiful, snowy Tokyo city is bustling. The camera moves through the bustling city street, following several people enjoying the beautiful snowy weather and shopping at nearby stalls. Gorgeous sakura petals are flying through the wind along with snowflakes.”

toxic crater
abstract nest
abstract nest
#

Article from Conspirador Norteño going down some very odd details within the showcased clips of Sora, which will definitely be things to pay attention to once the technology is more widespread

faint vigil
wicked bridge
abstract nest
#

Ok, so I finally managed to get Chat with RTX running after a day of installing dependencies, adjusting volume sizes (as it stands it only really works if it runs at the default location in AppData...)

#

And well, it's very much indeed a Demo

#

It works nicely with very basic questions on documents you provide

#

But it quickly starts to not understand or to hallucinate when asking more in depth

#

Also I was using docs in Spanish, but it still very much runs in English. It does translate, but even if you ask in Spanish it still returns in English

#

So I think it has potential, but it still needs a lot to improve, both models performed equally too

stark fractal
#

I guess that makes sense. They would have to limit the model size quite a bit to have it run on consumer hardware. That would impact knowledge and deep understanding in particular.

viral flicker
#

Anyone here tracking Groq and how it can be used to speed up LLMs or paired with an LLM (deterministic-> probabilistic) like Sora as the author describes?

abstract nest
abstract nest
onyx flax
lost geyser
#

it has learned Spanglish code-switching.

it has achieved sentience.

onyx flax
onyx flax
lost geyser
lost geyser
#

Lastly, from the X spaces, Elon revealed Grok 1.5 is coming in a few weeks.

Grok 1.5 will feature a ‘Grok Analysis’ button for post and thread summaries, as well as writing aids.

💖 260 🔁 16

patent pendant
#
patent pendant
patent pendant
delicate badge
#

posting since @lost geyser is scared to steal the post

#

not very #disinfo-and-propaganda of him

#

Anyways good dig here into a small network of sites being used for malign influence in the private sector, they rest heavily on gen AI content

jade whale
#
TikTok

113.2K likes, 1683 comments. “I was an actor at the #willyschocolateexperience in #glasgow this weekend and here is the first of 3 clips of me talking about it.”

Mail Online

EXCLUSIVE: Furious parents mobbed Willy Wonka organiser Billy Coull outside the 'shambles' of an event and demanded full refunds after his experience left children in tears.

onyx flax
dire radish
#

The cases would have provided compelling precedent for a divorced dad to take his children to China -- had they been real. But instead of savouring courtroom victory, the Vancouver lawyer for a millionaire embroiled in an acrimonious split has been told to personally compensate her client's ex-wife's lawyers for the time it took them to learn the cases she hoped to cite were conjured up by ChatGPT. In a decision released Monday, a B.C. Supreme Court judge reprimanded lawyer Chong Ke for including two AI "hallucinations" in an application filed last December. The cases never made it into Ke's arguments; they were withdrawn once she learned they were non-existent.

Justice David Masuhara said he didn't think the lawyer intended to deceive the court -- but he was troubled all the same. "As this case has unfortunately made clear, generative AI is still no substitute for the professional expertise that the justice system requires of lawyers," Masuhara wrote in a "final comment" appended to his ruling. "Competence in the selection and use of any technology tools, including those powered by AI, is critical."

stark fractal
#

German prosecutors are investigating incidents of AI-generated fake apologies supposedly by Tagesschau (public broadcasting) news anchors. Participants of the so-called "Monday demonstrations" (mostly pro-Russian COVID denialist conspiracy theorists) generated fake audio clips in the voices of Tagesschau news anchors, apologising for lies in their reporting (a common theme among that particular conspiracy crowd).
https://www.tagesschau.de/inland/justiz-ermittlungen-tagesschau-audiodateien-100.html

tagesschau.de

AI-generated audio files of tagesschau presenters were played at demonstrations in Dresden. They gave the impression that tagesschau was apologising for alleged lies. Prosecutors are now investigating.

dire radish
#

Whitney Webb has some questionable opinions herself. Seen some vax stuff.

rigid bough
#

i wasnt aware- thanks for letting me know

dire radish
#

She can still be right about the transhumanists though

rigid bough
#

meanwhile both musk/bezos are building out their robot companies..

lost geyser
#

Amazon backed out of a massive deal to buy iRobot recently. not entirely sure what the motivating factors were.

#

they've proven capable of acquiring the right people and technology to fulfill those strategic and technical gaps.

lost geyser
# lost geyser Amazon backed out of a massive deal to buy iRobot recently. not entirely sure wh...

[to avoid veering off-topic and simply answer the question:]

https://apnews.com/article/amazon-roomba-european-union-antitrust-decision-53bc9fdc780fa312cf6d83e2fdc96351

LONDON (AP) — Amazon called off its purchase of robot vacuum maker iRobot on Monday, blaming “undue and disproportionate regulatory hurdles” after the European Union signaled its objection to the deal.

The companies said in joint statement that they were disappointed but mutually agreed to terminate the acquisition. The deal faced antitrust scrutiny on both sides of the Atlantic ...

The European Commission, ..., told Amazon last year of its “preliminary view” that the iRobot acquisition would hurt competition in the industry.

lost geyser
#

Amazon also reiterated claims made by SpaceX in its own litigation that the NLRB itself was unconstitutional. “The structure of the NLRB violates the United States Constitution’s separation of powers and Amazon’s due process rights under the Fifth Amendment to the United States Constitution because the NLRB’s Board Members concurrently exercise legislative, executive, and judicial powers in the same administrative proceeding,” the company alleged.

#

TL;DR:

  • no contract
  • no breach
  • lots of complaining for complaint sake
rigid bough
#

they could just be putting on a spectacle tbf- 'cleansing' eachothers images for some subversive long term plan they might be collaborating with AGI on- i doubt this will be enough to properly 'cripple' any power plans- let alone money- money doesnt matter with agi- and puts the meme 'at least someones keeping openai in check' into play- they likely have the 'overlord' providing strategy at some level if AGI is a thing

rigid bough
#

btw heres speculation on what q* entails- keep in mind altman was fired a bit after this leaked- and ilya has.. yet to resurface that i know of

dire radish
#

AGI is a pipe dream

lost geyser
#

yea that veers deeply into speculative territory, which we generally eschew here altogether.

dire radish
#

Marketing shenanigans

lost geyser
#

thanks for sharing @rigid bough maybe someone will enjoy reading through it. (I skimmed thru, found some broken links, but mostly just wild speculation.)

rigid bough
#

word- im not exactly an expert on all the x-risk stuff so i thought it was interesting to get first hand accounts from some people who are (joscha) talking about how AGI could break encryption if it was solved, most of the other stuff is out there though- the thing about encryption is interesting to me because it opens up a lot of potential strategies to consolidate power for them-and their friends/allys (if "Agi achieved internally"- was real)

onyx flax
patent pendant
#

Honestly never thought about how AI might/can affect diplomacy until this
https://www.youtube.com/watch?v=1CF3IpO-RnA

How can AI change diplomacy?

To discuss the State Department’s options for AI integration, we interviewed the State Department's Deputy Chief Data and AI Officer, Garrett Berntsen (https://www.linkedin.com/in/garrettberntsen/) . He served as an officer during two tours in Afghanistan and recently rotated off the NSC. He's optimistic diplomacy ...

lost geyser
#

This new-age rivalry is playing out like the Karate Kid reboot (TV series) where aging actors reprise familiar (nostalgic) roles against the backdrop of a teen romance melodrama born of a new cast of characters (AI).

https://www.cnn.com/2024/03/06/tech/openai-elon-musk-emails/index.html

CNN

OpenAI fired back at Elon Musk, who sued the ChatGPT company last week for chasing profit and diverging from its original, nonprofit mission.

#

Tuesday night, OpenAI published several of Musk’s emails from the early days of the company that appear to show Musk acknowledging OpenAI needed to make a ton of money to fund the incredible computing resources needed to power its AI ambitions.

In the emails, ..., Musk argues that the company stood virtually no chance of building a successful generative AI platform by raising cash alone, and the company needed to find alternate sources of revenue to survive.

patent pendant
jade whale
rigid bough
rigid bough
#

speculation: ||i later saw him appear in twitter spaces with e/acc related alt right people... if i had twitter i'd pull up better proof but i managed to save this list where someone included him with other alt right tech related people/things https://twitter.com/chloe21e8/status/1701627566183072143

my gut tells me there might be some sort of 'truces' happening behind the scene- musk recently apologised for his anti-semitism when he went to visit netanyahu- but is still signal boosting 'great replacement' related messaging but focused entirely on scapegoating immigrants- which, gave me the thought- what if the 'conflict' was pre-meditated to decouple the jewish diaspora from 'woke' and as manufactured consent for some sort of partnership for imperialism in Africa/LatAM? Keep in mind all of these companies are currently fast tracking startups for artifical men||

rigid bough
#

2 years maybe

jade whale
#

2021?2022? Or here abouts?

rigid bough
#

2022 ish ye

jade whale
#

Because that was the year that (hyperborea) went so viral on tiktok that it was banned, because that crowd was radicalizing tiktok users https://www.isdglobal.org/isd-publications/hatescape-an-in-depth-analysis-of-extremism-and-hate-speech-on-tiktok/

ISD

This research examined how TikTok is used to promote white supremacist conspiracy theories, produce weapons manufacturing advice, glorify extremists, terrorists, fascists and dictators, direct targeted harassment against minorities and produce content that denies that violent events like genocides ever happened. Furthermore, the report includes ...

rigid bough
weak igloo
fossil condor
#

would anyone be willing to help turn this eventful conversation into a podcast or umm text to audio

#

i want to read this all but my time is limited

fossil condor
weak igloo
# fossil condor looks like I'm sol lol that is quite fascinating. Would this be to racial bias i...

The preprint is here: https://arxiv.org/abs/2403.00742

#

I think the 'why' of anything in LLM is still frequently rubbish in = rubbish out.

#

sorry no, now I am mixing up studies. Apologies. Quite a lot coming out on this topic recently

#

https://www.newscientist.com/article/2421067-ai-chatbots-use-racist-stereotypes-even-after-anti-racism-training/ explains that the above came after a researcher posted this on twitter https://twitter.com/vjhofmann/status/1764687418626576445 The title of their paper is there in the twitter post

New Scientist

Large language models still demonstrate racial prejudice against speakers of African American English, despite the safety guard rails implemented by tech companies such as OpenAI

💥 New paper 💥

We discover a form of covert racism in LLMs that is triggered by dialect features alone, with massive harms for affected groups.

For example, GPT-4 is more likely to suggest that defendants be sentenced to death when they speak African American English.

🧵

#

What I thought was two studies is apparently the same study except some articles talk about employability and others about criminality, depending on who writes it. heh

lost geyser
weak igloo
lost geyser
patent pendant
#

Recent update to AI talent tracker worldwide: https://macropolo.org/digital-projects/the-global-ai-talent-tracker/

Since launching our talent tracker in 2020, artificial intelligence (AI) has taken the world by storm. Ostensible breakthroughs in large language models and machine learning methods, as well as staggering improvements in compute capabilities, have made the power and potential of AI demonstrably clear. While companies and institutions are racing...

abstract nest
patent pendant
lost geyser
# patent pendant https://www.chinatalk.media/p/censorships-impact-on-chinas-chatbots

this is interesting on many levels but also a comparison not made in that article:

Yi provided consistently high-quality responses for open-ended questions, rivaling ChatGPT’s outputs.

The output quality of Qianwen and Baichuan also approached ChatGPT4 for questions that didn’t touch on sensitive topics — especially for their responses in English. Even so, keyword filters limited their ability to answer sensitive questions.

  • Yi: 34B
  • Qianwen: 14B
  • Baichuan: 13B
  • ChatGPT-4: 1.76 trillion (*8x220B)

these models are (based on those findings) performing on par at comparatively fractional model sizes.

#

(they're all punching above their weight class essentially)
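
To put "fractional model sizes" into numbers, here is a rough sketch that only divides the parameter counts quoted above. The ChatGPT-4 figure is the 8x220B estimate from the list, which OpenAI has not confirmed, so the ratios are illustrative at best. The names `PARAMS_B` and `size_ratio` are made up for this example.

```python
# Parameter counts in billions, exactly as quoted in the list above.
# The ChatGPT-4 entry is the unconfirmed 8 x 220B estimate, not an official figure.
PARAMS_B = {
    "Yi": 34,
    "Qianwen": 14,
    "Baichuan": 13,
    "ChatGPT-4": 8 * 220,
}


def size_ratio(reference: str, other: str) -> float:
    """How many times larger `reference` is than `other`, by parameter count."""
    return PARAMS_B[reference] / PARAMS_B[other]


for name in ("Yi", "Qianwen", "Baichuan"):
    ratio = size_ratio("ChatGPT-4", name)
    print(f"ChatGPT-4 (estimated) is ~{ratio:.0f}x the size of {name}")
```

If the estimate is anywhere near right, the smaller models are working with roughly 1/50th to 1/135th of the parameters.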

jade whale
lost geyser
#

i get an unshakable image in my head of this IPO looking like a Coinbase Initial Offering on any-given-altcoin. (basically, it spikes unreasonably high in the first few hours and days and rapidly drops below baseline within the following week(s).)

jade whale
#

Decades were spent building trust in the Internet norms. Didn’t take too long to break down that trust model.

#

Crossposting #tools-and-sites message

jaunty siren
#

OpenAI + Figure

conversations with humans, on end-to-end neural networks:

→ OpenAI is providing visual reasoning & language understanding
→ Figure's neural networks are delivering fast, low level, dexterous robot actions

(thread below)

#

Huh, I didn't know OpenAI was still working on robotics

patent pendant
jade whale
onyx flax
#

I thought they did away with that silly "you have to be registered to view"

fierce rapids
onyx flax
#

Bah

weak igloo
fierce rapids
weak igloo
jade whale
lost geyser
toxic crater
rigid bough
#
#

epic

#

although useful- i wonder how many of these discord channel summary operations are going on for other things

patent pendant
patent pendant
jade whale
#

indeed. it's a great article. very well researched, presented and informative.

patent pendant
jade whale
onyx flax
abstract nest
#

Depends on how the model has been trained. You can have a very conservative model where every detection it makes is a true positive, so no false positives. This would however mean plenty of false negatives.

This if anything speaks more on the misuse of LLMs for purposes they're not designed for (chatbots are not diagnosis tools, we use specific ML tools for that) as well as overreliance on AIs when they're meant to be for assistance under human supervision
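
A small sketch of that trade-off, assuming a detector that outputs a score per scan and a threshold that decides what counts as a detection. The scores and labels below are invented purely for illustration; nothing here comes from a real diagnostic model.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, where 1 = tumour present."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def predict(scores, threshold):
    """Flag a scan as positive when its score reaches the threshold."""
    return [1 if s >= threshold else 0 for s in scores]


# Hypothetical data: ten scans, five of which really contain a tumour.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.95, 0.90, 0.60, 0.55, 0.40, 0.70, 0.30, 0.20, 0.10, 0.05]

# High threshold: no false positives, but three real tumours are missed.
conservative = precision_recall(y_true, predict(scores, 0.80))
# Low threshold: every tumour is found, at the cost of one false alarm.
permissive = precision_recall(y_true, predict(scores, 0.35))

print("conservative:", conservative)
print("permissive:  ", permissive)
```

Which threshold is acceptable is a clinical decision, which is one more reason these tools sit under human supervision.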

stark fractal
stark fractal
lost geyser
lost geyser
# onyx flax A AI trained to find tumors will find tumors even if no tumors are there

erredece stated the core issues well.

  • Fitment issue: wrong tool for the job altogether.
  • Skill issue: not properly trained on downstream, very domain-specific task.
  • Expectation issue: OP's novice understanding of proper use of AI.

Any use of AI in medicine absolutely requires human oversight for numerous reasons. Apart from blatantly committing rookie mistakes and making up diagnostic answers, retweeters have taken OP to task for challenging the medical professionals ... with a non-medical, non-professional AI output.

#

there are definitely cases where (again, under human-in-the-loop supervision) these models can detect conditions that humans miss. These are usually edge cases, explained by distracted and overworked medical professionals, review by inexpert practitioners, etc.

It's typically rare that the model itself supersedes that of the actual expert (for instance a radiology-based AI versus the top-level radiologists).

#

I annoyed the radiologists until they re-checked.

Imagine this becoming the norm. It'd actually be a form of abuse against the practitioners themselves, something like the ivermectin-cures-covid issue.

onyx flax
stark fractal
lost geyser
# stark fractal https://www.quantamagazine.org/how-quickly-do-large-language-models-learn-unexpe...

It's interesting to see this emerge in LLM evaluation regimes (old wine, new bottle):

But the Stanford researchers point out that the LLMs were judged only on accuracy: Either they could do it perfectly, or they couldn’t. So even if an LLM predicted most of the digits correctly, it failed. That didn’t seem right. If you’re calculating 100 plus 278, then 376 seems like a much more accurate answer than, say, −9.34.

So instead, Koyejo and his collaborators tested the same task using a metric that awards partial credit. “We can ask: How well does it predict the first digit? Then the second? Then the third?” he said.

#

This comes up a lot with naïve use of F1 scores for NER, where partial subsequences or incorrect boundary labeling in multi-part entities fails the test (unreasonably so).
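
The difference between all-or-nothing scoring and partial credit can be sketched with the 100 + 278 example from the quote. This is not the metric the Stanford team used; `digit_credit` is a hypothetical stand-in that right-aligns the two answers and counts matching characters.

```python
def exact_match(predicted: str, target: str) -> float:
    """All-or-nothing scoring: 1.0 only if the answer is exactly right."""
    return 1.0 if predicted == target else 0.0


def digit_credit(predicted: str, target: str) -> float:
    """Partial credit: share of character positions that match.

    Answers are right-aligned so units line up with units. Padding uses
    spaces, which never match a digit.
    """
    width = max(len(predicted), len(target))
    p, t = predicted.rjust(width), target.rjust(width)
    matches = sum(1 for a, b in zip(p, t) if a == b)
    return matches / width


target = str(100 + 278)  # "378"

for answer in ("378", "376", "-9.34"):
    print(answer, exact_match(answer, target), round(digit_credit(answer, target), 2))
```

Under exact match, 376 and -9.34 both score zero. Under the partial-credit version, 376 gets two of three digits and -9.34 still gets nothing, which is closer to the intuition in the article. The same idea carries over to NER: token-level or overlap-based scoring instead of strict span match.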

patent pendant
patent pendant
lost geyser
#

After reviewing each submission, the evaluators assigned authorship scores on a Likert scale, the findings of which are depicted in Figure 2. This demonstrates that genuine student submissions are more often recognized as student-authored. Converting the Likert scale to a numerical range - assigning ‘Definitely AI’ a value of 0 and ‘Definitely human’ a value of 3 - we arrive at the average scores: 0.033 for GPT-3.5 with raw input, 0.200 for GPT-3.5 with prompt engineering, 0.467 for GPT-4 with raw input, 1.167 for GPT-4 with prompt engineering, 1.300 for the Mixed category (including both human and AI work), and 2.367 for solely human-created work. Therefore all work with an AI-authored component to it has an average categorization closest to either ‘Definitely AI’ (0) or ‘Probably AI’ (1).
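
The "closest to" claim in that excerpt is easy to check by hand. A quick sketch using only the averages quoted above: the label for value 2 is not given in the excerpt, so "Probably human" is an assumption, as are the dictionary keys.

```python
# Likert values 0, 1 and 3 are named in the excerpt; the label for 2 is assumed.
LIKERT = {
    0: "Definitely AI",
    1: "Probably AI",
    2: "Probably human",  # assumption, not stated in the quoted text
    3: "Definitely human",
}

# Average scores as quoted in the excerpt.
AVERAGES = {
    "GPT-3.5 raw": 0.033,
    "GPT-3.5 prompt-engineered": 0.200,
    "GPT-4 raw": 0.467,
    "GPT-4 prompt-engineered": 1.167,
    "Mixed": 1.300,
    "Human only": 2.367,
}


def closest_label(score: float) -> str:
    """Return the Likert label whose numeric value is nearest to the score."""
    value = min(LIKERT, key=lambda v: abs(v - score))
    return LIKERT[value]


for category, score in AVERAGES.items():
    print(f"{category}: {score} -> {closest_label(score)}")
```

Every category with an AI component lands on one of the two AI labels, and only the human-only work lands on the human side.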

#

[tangential story]

this week, someone at work "revised" a peer's project proposal to a client. It went from level-1 (pre-revision) milestones to level-2 and level-3 details.

after reviewing the L2/L3 tasks, they were rife with invalid steps, deprecated technologies, and nonsensical assignments.

so it got put through an AI detector and it came back remarkably as 100% generated.

patent pendant
lost geyser
#

This release of an LLM is noteworthy bc of what Databricks essentially is as a business model and platform. It'll put others in its space on notice.

#

The UI itself is underwhelming and it does an okayish job at being a datalake in a box product with extra crap thrown in (with little actual improvement).

lost geyser
#

After two months of work training the model on 3,072 powerful Nvidia H100s GPUs leased from a cloud provider, DBRX was already racking up impressive scores in several benchmarks, and yet there was roughly another week's worth of supercomputer time to burn.

#

This last route was affectionately known as the “fuck it” option, and one team member seemed particularly keen on it.

patent pendant
#

just posted roughly an hour ago
https://www.youtube.com/watch?v=-sB12gk9ESA

Explore the promise and perils of new A.I. technologies.

Official Website: https://to.pbs.org/3Py2WDL | #novapbs

Can we harness the power of artificial intelligence to solve the world’s most challenging problems without creating an uncontrollable force that ultimately destroys us? ChatGPT and other new A.I. tools can now answer complex questi...

lost geyser
#

DBRX LLM Specs:

  • 132b parameter Mixture of Experts (MoE)
    • (16) total experts
    • (4) active any given token
    • 36b active parameters
  • pre-trained on 12T tokens (!!)
  • max context window of 32k tokens
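
Some back-of-envelope arithmetic on those specs, using only the numbers in the list. The constant names are invented for this sketch, and the note about shared parameters is a guess at why the ratios differ, since the list itself does not break the parameters down.

```python
from math import comb

# Figures taken from the spec list above.
TOTAL_EXPERTS = 16
ACTIVE_EXPERTS = 4
TOTAL_PARAMS_B = 132
ACTIVE_PARAMS_B = 36

# Number of distinct expert subsets the router can pick for a token.
expert_subsets = comb(TOTAL_EXPERTS, ACTIVE_EXPERTS)

# Share of the model's parameters that are used for any single token.
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B

# Share of experts active per token. This is lower than active_fraction,
# presumably because some parameters (e.g. attention) are shared by all tokens.
expert_fraction = ACTIVE_EXPERTS / TOTAL_EXPERTS

print(f"possible expert subsets per token: {expert_subsets}")
print(f"active parameter share: {active_fraction:.1%}")
print(f"active expert share:    {expert_fraction:.1%}")
```

So each token touches a bit over a quarter of the weights, which is the main appeal of MoE: the inference cost of a ~36B model with the capacity of a 132B one.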
patent pendant
#
The White House

Administration announces completion of 150-day actions tasked by President Biden’s landmark Executive Order on AI Today, Vice President Kamala Harris announced that the White House Office of Management and Budget (OMB) is issuing OMB’s first government-wide policy to mitigate risks of artificial intelligence (AI) and harness its benefits – deliv...

outer cape
#

People in a celsius crypto telegram for Withdrawal preference using AI to teach themselves case law..

lost geyser
lost geyser
#

Release government-owned AI code, models, and data, where such releases do not pose a risk to the public or government operations.
dogstare

patent pendant
wicked bridge
mint sparrow
#

Oh I just found a thread I didn't know existed. Noice.

#

Another reason to lose sleep.

lost geyser
#

Inclined to agree (RE: Chasing the wrong architecture.)

https://vxtwitter.com/Grady_Booch/status/1773862674893623394

Further indication that @openai and @microsoft are chasing the wrong architecture.

【QRT of amit (@amitisinvesting):】
'BREAKING: Microsoft $MSFT and OpenAI want to build a $100 Billion AI supercomputer called "Stargate”

It would hold “millions of GPUs”

These guys really want to take over the world bro…

Microsoft is not sto…

💖 654 🔁 55

toxic crater
#

I'm not super well versed in this space but is the implication that "you shouldn't need a $100B supercomputer to do X"

#

(I agree with this anyhow, your $100B supercomputer will probably be outclassed by $10B supercomputers in 5-10 years so you better be sure it brings in 90B of additional revenue in that time)

#

Although you can probably circumvent a lot of these realities by focusing on "enterprise" clients and selling a much more expensive service BtB, now that you can tie it in with Office 365 and middle managers are still probably somewhat unfamiliar with the competition

jade whale
abstract nest
#

Crossposting this from #israel-palestine for a discussion more focused on the AI itself and the procedures that led to the acceptance of this system with barely any human checks

#israel-palestine message

outer cape
#

https://www.twitch.tv/trumporbiden2024
This has to be the most bizarre social implication of AI. It's an AI Biden vs Trump debate livestreamed on Twitch loldog

lost geyser
#

Truly among the worst of worst-case scenarios.

If this doesn't provoke discussion and action on the international restriction of AI as a blanket excuse for homicidal and genocidal acts, little else can.

dire radish
#

Often the restrictions are a catch 22 lobbied for by big companies to kill the competition.

dire radish
jade whale
#

||https://www.404media.co/nuca-camera-turns-every-photo-into-a-nude/|| ⚠️ 404 Media article discussing the Nuca Camera project, a physical camera that undresses its subjects with each snap.

I know this is an art project meant as a critique of the current impending hellscape of this stuff, but maybe more things like this will help regular people understand the implications of the proliferation of AI. At the very least AI companies should be compelled by law to maintain publicly accessible DBs of images created by them. No clue how that's enforceable at scale or how it addresses users who run these applications locally. Zero legislation re: this type of use case seems unacceptable at this point.

stark fractal
#

A really interesting peek into the way small large language models are used increasingly in software engineering. By shrinking the domain to just a single language/framework and using context information from the IDE (the indexed codebase for example), Jetbrains manage to circumvent the usual drawbacks of shrinking your models. Could be an interesting path towards embedding small but highly specialised models into specific applications.
https://thenewstack.io/jetbrains-launches-ai-code-completion-on-local-machines/

A new code completion tool, driven by AI, is designed to keep code on site, reducing security concerns for regulated industries.

lost geyser
#

check out ollama also if you want to go off-reservation wrt JetBrains/VSCode. a lot can be done by furthering training budgets, domain adaptation, and task fine-tuning.

stark fractal
#

The economic factors are certainly driving development into that direction. The hosting costs of huge models can be massive. Shrinking them makes it possible to shift the compute burden to the user.

lost geyser
#

most people don't factor in the TCO on LLM ownership, which is a massive balloon payment over initial build/operational costs.

lost geyser
patent pendant
#

https://www.youtube.com/watch?v=1xSw835-rig&t=257s
Video from two weeks ago
From one of the commenters who made this summary:

01:49 DARPA's Deputy Director
05:09 DARPA's AI Focus
06:31 DARPA's Broad AI Use
11:47 DARPA's Disruptive Mission
14:30 DARPA's Collaborative Work
17:02 DARPA's Defense Innovations
19:33 AI's Evolution Explained
24:50 Model limitations acknowledged.
25:33 DOD faces data challenges.
27:22 Critical decision divergence.
28:46 Media forensics inception.
29:55 Semantic forensics attribution.
31:05 Open-source tool initiative.
32:41 Authentication tech evolution.
35:40 Generative AI cyber challenges.
36:49 AI Cyber Challenge design.
39:45 DARPA program manager's significance.
47:44 Explainable AI pursuit.
48:55 Explain decisions clearly.
50:20 Trust based on interactions.
51:03 Autonomy in military.
51:59 AI in air combat.
55:17 Ensuring autonomy safety.
58:43 Future AI capabilities.

Made with HARPA AI

The CSIS Wadhwani Center for AI and Advanced Technologies is pleased to host Dr. Matt Turek, Deputy Director for the Information Innovation Office (I2O) at the Defense Advanced Research Projects Agency (DARPA). This event will be livestreamed on March 27 at 10:00 AM ET.

This dialogue will examine DARPA's perspective on AI and autonomy adoptio...

stray chasm
outer cape
patent pendant
#

Always amazed how far people/companies will go in the pursuit of power and fortune

outer cape
patent pendant
#
Rest of World

As more than two billion people vote, we're monitoring the way AI is being used in political campaigns, memes, and misinformation.

Rest of World

Rest of World is collecting examples of AI being used for campaigning, misinformation, and memes in a regularly updated tracker.

shrewd token
faint vigil
stray chasm
shrewd token
lost geyser
stray chasm
#

Yee I just meant, as they get better and as more programs are made what new tools/capabilities will emerge
I'm aware of stuff like the pixel 8 pro's always on generation and llama.cpp and stuff

lost geyser
#

Heyyy, look what I can do.

With 30-40% less brain.

Meta, Cisco, and MIT researchers demonstrated that large language models (LLMs) could have up to 40%-50% of their layers pruned with minimal impact on accuracy.

The process involved pruning, quantization, and parameter-efficient finetuning (PEFT) strategies, testing on models ranging from 2B to 70B parameters, across the Llama, Qwen, Mistral, and Phi families.

Performance Impact:

  • Llama 70B and Llama 13B models showed slight accuracy loss after 40% and 50% layer pruning, respectively.
  • Other models experienced minimal accuracy declines with 20–30% of layers removed.
#

Your turn, humans.

fierce rapids
#

I think this has already been proven in humans, it's just the decision what to prune that needs to be worked out.

lost geyser
#

Consistency issues with teeth rendering aside, this is good forward progress in generative video.

Introducing VASA-1 by Microsoft Research, the First AI-Generated Video That Looks Super Real

It takes a single portrait photo and speech audio and produces a hyper-realistic talking face video with precise lip-audio sync, lifelike facial behavior, and naturalistic head movements generated in real-time.

https://www.linkedin.com/posts/alvinfsc_introducing-vasa-1-by-microsoft-research-ugcPost-7186571507446308865-LNQU

lost geyser
#

Zuck releasing a billion dollar model is actually wild, like really undermining what OAI is doing. flexing compute like "yea we can do that not a big deal"

💖 1.9K 🔁 148

stray chasm
#

https://arstechnica.com/information-technology/2024/04/microsofts-phi-3-shows-the-surprising-power-of-small-locally-run-ai-language-models/ Microsoft just released an MIT-licensed 3.8B-parameter model that performs at the same level as other SOTA 7B models
Basically allows it to run on any modern hardware, with low enough resource usage (1.8 GB RAM with 4 bit quantization) that it could realistically run in the background constantly and do on device text summarization/boilerplate/writing aid without sending anything over the network
edit: got ram usage wrong at first
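
Rough sanity check of that RAM figure, as a back-of-the-envelope sketch. It assumes the weights dominate memory and ignores KV cache, activations and runtime overhead; the function name is just illustrative and not from the article.

```python
def weight_memory_gb(n_params: float, bits_per_weight: int) -> float:
    """Approximate size of the model weights alone, in decimal gigabytes."""
    total_bits = n_params * bits_per_weight
    return total_bits / 8 / 1e9  # bits -> bytes -> GB


phi3_mini_params = 3.8e9  # parameter count quoted in the article

for bits in (16, 8, 4):
    size = weight_memory_gb(phi3_mini_params, bits)
    print(f"{bits}-bit weights: ~{size:.1f} GB")
```

At 4 bits this lands at roughly 1.9 GB (decimal), which is about 1.8 GiB, so the corrected figure above is in the right ballpark for the weights alone.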

Ars Technica

Microsoft's 3.8B parameter Phi-3 may rival GPT-3.5, signaling a new era of "small language models."

lost geyser
#

Das wam talkin bout

#

Phi-2 was not quite dialed in. Eager to try that one after some LLaMa3 runs.

stray chasm
#

Good article, just slight nitpick; given new/uncommon inputs, LLMs are able to synthesize new ideas using common methods
Still limiting though

onyx flax
#

Isn't the real answer to that that an LLM can't answer any questions at all? It can only pretend to answer them, and that is an impossible hurdle to overcome given the way the model is designed.

lost geyser
patent pendant
#

Vidu, a text-to-video model, was released less than 24 hours ago by a spinout startup from Tsinghua

It's dubbed China's Sora. Launch video looks cool, though API not yet widely accessible (neither is Sora)

P…

💖 10 🔁 2

lost geyser
# patent pendant https://vxtwitter.com/kevinsxu/status/1784262524906725697

I'll make one relevant statement here and then pivot to an interesting observation.

relevant statement:

Proofpoint that China that is 1-3 years behind in most GenAI models, perhaps just 1-3 months behind in some.

bold statement. not entirely true.

currently using a 1.8bn param SLM qwen:1.8b-chat-v1.5-q5_K_M and not only is it blazing fast but also very competitive performance-wise against much, much larger non-Chinese models.

#

unrelated but interesting observation:

one of the (2) replies is hidden underneath Twitter's content filters (first layer is usually low-quality troll-like accounts).

#

that account gives an inauthentic user impression and seems to be some sort of wannabe influencer:

#

also asks a clueless question of Kevin's residency; his bio pretty clearly indicates where he operates out of.

and then there's this:

Jessica Vu's account lists over 500 followers but this is what I get when looking at them.

#

also, her Following page is pretty interesting and also does not appear organic.

shrewd token
lost geyser
mint sparrow
# onyx flax https://www.scientificamerican.com/article/can-ai-replace-human-research-partici...

To be fair, the article that Scientific American is referring to is explicitly talking about pilot studies. Pilot studies are usually not used to gain actual insights, you usually use these to do a "sanity check" on your paradigm. Say you designed a study and you need to check whether your analysis pipeline works as expected. This is IMO a valid approach if the necessary caveats are respected, it can save valuable time and money. The comments in the article completely misunderstand the authors' research objective, especially considering that the authors themselves warn that LLMs could render crowdsourced self-report data categorically unreliable. (I have designed and conducted a behavioral research study that recruited several hundred participants from MTurk - we spent considerable money and resources on making sure we piloted the study. My particular study couldn't have been done by LLMs but at the time there were a lot of studies being done using crowdsourced data that an LLM could solve. Even getting through my experiment could conceivably have been sped up, or completed by people who don't understand the instructions - for example we had filter questions in the questionnaire part that read like "If you're paying attention, choose option 5" - five of those questions in a number of questionnaires might filter out 20% of participants, but any LLM would pass.) The authors whose paper is criticized in the article warn explicitly that this kind of research might not be valid from here on out and it's a solid paper IMO.

#

(sorry I'm a bit late with that response) BlushCat

lost geyser
mint sparrow
# lost geyser totally agree here. this is like the scientific MVP market fit test in a way, si...

I'm a bit disappointed, though; I mean, I know Scientific American isn't the New York Times or the Washington Post, but I mean, they're called "scientific." The least you could ask for is to name the Finnish research group whose paper they appear to slander, even though their American colleagues kinda make the same points? I don't know. Maybe it's just a poorly written article, or my look at the article was not thorough enough and a bit biased because I took umbrage when I read the first paragraph, or maybe Chris Stokel-Walker didn't find the umlaut on his keyboard to spell Perttu Hämäläinen, who knows... (I guess at least he kept to the "American" part of the publication's name)

onyx flax
lost geyser
mint sparrow
# onyx flax I would say that needing 5 "are you paying attention" questions are a sign that ...

non-paid participants? good luck trying to get that through ethical review. Not happening. Minimum wage or GTFO (at least with my ERC at the time).

5 questionnaires, (one question each) isn't uncommon in social psychology, consider a demographic section, a personality instrument and a behavioral experiment in the middle with a pre- and post questionnaire part.

Also, you can't just go ahead and 'shorten' a questionnaire, you use the ones that are established. Lots of work goes into making those, you can't just leave out questions. I think the main reason I would give against using 5 questionnaires is the multiple comparisons problem if you want to put all of them into one regression equation.

Plus, where would we get a replication crisis from if we would know what we're doing?

lost geyser
#

There's a whole complicated science to properly setting up, vetting, executing, and using the outputs of crowdsourced experiments and labeling efforts like Mechanical Turk.

onyx flax
weak igloo
#

They are usually less biased than WaPo or NYT

#

(at least they used to be when I still subscribed)

mint sparrow
# onyx flax we had some 150+ questions questioneres that we whare suposed to fill in when I ...

150 questions is too many questions. Students be like: catGunAni (tldr at the end)
Either they used many different questionnaires, which makes statistical analysis almost meaningless because Bonferroni. When you perform multiple statistical tests simultaneously, the chance of getting a false positive (incorrectly concluding that there is a significant effect) increases. This is known as the multiple comparisons problem. The solution is straightforward: adjusting the significance level (alpha, α). The adjustment is simple and deadly: you divide the original significance level by the number of tests you are performing. For example, if you're conducting 20 tests (say you want to do simple cross-correlation) and your original significance level is 0.05, the Bonferroni correction would adjust this level to 0.05 / 20 = 0.0025 for each test. Only test results that have a p-value less than 0.0025 would be considered statistically significant with this correction. This is ridiculous, because it reduces the statistical power of the test. You might reduce the overall risk of making at least one type I error (false positive), but you need insane Ns (participants) to detect effects if they do exist.
Alternatively, they came up with the questionnaire themselves, maybe the purpose of the test was to do factor analysis and eliminate all "redundant" questions. You start with defining your "theoretical construct" (say for example 'trait empathy') and come up with (plenty of) items to reflect these constructs (When I see a sad movie I often feel sad when the character suffers emotionally. 1 fully agree - 5 fully disagree), and then you calculate the sample size you need (like at least 5 to 10 Ns per item), and then you extract factors using principal component analysis or principal axis factoring. But this is also not a simple task, you can't do this with students, you need a relatively representative sample, there's a whole lot of statistical criteria your data needs to fulfill, and then you can figure out if there are subfactors (like for example with empathy you'd have factors like cognitive or emotional/vicarious empathy - you might understand that someone suffers but not experience that suffering yourself, and vice versa) and see that they're relatively independent from one another, that's cool because that usually means something. But then you also need to evaluate reliability (Cronbach's alpha) and construct validity (does your scale really measure what you think it measures) and THEN you can start to throw out questions. And THEN you need to do another confirmatory factor analysis with another sample with the revised questionnaire and THEN you can start to actually use that questionnaire.

tldr making valid questionnaires is not simple and what you described was probably a student project that turned out to be either just plain wrong in terms of how to do science or a null result because of poor study design
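
For anyone who wants to poke at the Bonferroni arithmetic from the message above, here is a minimal sketch. The p-values are made up for illustration, the helper names are hypothetical, and the family-wise error formula assumes the tests are independent.

```python
def bonferroni(p_values, alpha=0.05):
    """Return the per-test threshold and which tests survive the correction."""
    m = len(p_values)
    adjusted_alpha = alpha / m  # divide the original alpha by the number of tests
    return adjusted_alpha, [p < adjusted_alpha for p in p_values]


def familywise_error_rate(alpha, m):
    """Chance of at least one false positive across m independent tests."""
    return 1 - (1 - alpha) ** m


# 20 made-up p-values: three look "significant" at 0.05 before correction.
p_values = [0.001, 0.004, 0.03] + [0.2] * 17

adjusted_alpha, significant = bonferroni(p_values)
print(f"per-test threshold: {adjusted_alpha:.4f}")
print(f"tests below 0.05: {sum(p < 0.05 for p in p_values)}")
print(f"tests surviving Bonferroni: {sum(significant)}")
print(f"uncorrected family-wise error rate: {familywise_error_rate(0.05, 20):.2f}")
```

With 20 uncorrected tests the chance of at least one false positive is already around 64%, which is why the correction exists, and only one of the three nominally significant results survives it, which is the loss of power complained about above.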

#

Now I wonder whether an LLM would perform similarly to a representative sample of actual humans on a novel questionnaire assessing an obscure personality construct that has factor loadings which are based on separate neural correlates... it just might. The question is how to design a prompt that doesn't give away too much... this would actually say "something" about how well the knowledge represented in the model reflects "human-like" cognition.... hmmm... argh this wracks my brain a little...

#

if anyone wants to do a simulation, I'll sign your course credit. hypnotoad

lost geyser
mint sparrow
# lost geyser What would be the target "obscure personality construct" and how would you accou...

I mean one could use various older datasets, if you ask around, I bet there's old data in some professor's archives. I'm not sure if it matters that much which construct you take as long as the questionnaires have subscales that show sufficiently convergent and discriminant validity that shows up in the measured data. It would surely matter how "popular" the constructs are in recent literature, and whether the questionnaires are published in full text somewhere. Or you could take questionnaires that were developed in a foreign language? I don't know, it's a really difficult question. The more interesting question though is, how do you get the LLM to answer as different "characters" that, in sum, make up something that is representative of the general population.
Stupid example: You could take obscure questionnaires developed in the USSR that measure impulsiveness (I bet they made good personality tests to select cosmonauts) and prompt the LLM by giving it a "role" to play - "answer the question like the character Anatole Kuragin from Tolstoy's novel War and Peace". And then go through all the characters of Tolstoy's novel.
I don't actually know what would happen and, if something did happen, whether that would mean anything. Like literally no idea. (Alternatively you could design and validate a new questionnaire, but that would be expensive, I bet if you pull the right strings you can get some old data for free)
The factor loadings would be given by the old datasets, the question is just if the model produces the same or somewhat similar factor loadings. That would at least mean that the construct measured in the questionnaire is represented in the LLM.

#

The base training set would matter a lot. Remember the ethics guy at Google, Blake Lemoine (who had a theology background) and was fired after he publicly announced that LaMDA was sentient? He had the resources to train LaMDA on a huge canon of primarily Buddhist, philosophical and theological, but also computational material relevant for "what it would be to be" an AI. Of course, the model produced output that mirrored the answers you'd expect from someone who thought a lot about the nature of the self...

#

And Lemoine, the theologist, felt like I imagine a cat feels when they encounter a mirror and think the cat behind the glass is real...BruhCats wideBruh1 wideBruh2 wideBruh3

#

but the interesting question would be: how accurately can an LLM represent factor loadings on topographically separate cognitive abilities which feel like unitary constructs for the individual and only emerge if you have sufficient data or an fMRI

#

idk, it might mean nothing, I would just like to try it

#

@lost geyser does that make any sense whatsoever? if so, what kind of experiment would you run? and for now, this is complete cargo cult science, take something weird and apply a cool new method to the problem, see what comes out.

#

like I wouldn't even know what kind of theoretical framework to apply

shrewd token
#

so there's a rumor going around that GPT-5 is secretly out in the wild so OpenAI can benchmark it...

There is a mysterious new model called gpt2-chatbot accessible from a major LLM benchmarking site. No one knows who made it or what it is, but I have been playing with it a little and it appears to be in the same rough ability level as GPT-4. A mysterious GPT-4 class model? Neat!

onyx flax
mint sparrow
# onyx flax I noticed that they asked what was functionally the same questions but worded sl...

That would speak for the latter of the two. I mean in almost all questionnaires you have "somewhat" of a redundancy built in and pose questions slightly differently. Imagine you only have a very crude measurement instrument that takes a slightly different measurement every time: if you measure like three times and average you might still increase accuracy, but at some point what you gain is very little and all that's left is noise. Idk what they did, sounds like students tried to learn PCA or something, but then again you never know what psychologists do when they give you an experiment.
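
A toy simulation of that "crude instrument" point, as a sketch only. It assumes each reading is the true value plus independent Gaussian noise, which real questionnaire items are not; the function name, noise level and trial count are arbitrary choices for illustration.

```python
import random
import statistics


def spread_of_mean(n_items, noise_sd=1.0, n_trials=2000, seed=0):
    """Empirical SD of the average of n_items noisy readings of one true value."""
    rng = random.Random(seed)
    true_value = 3.0
    means = []
    for _ in range(n_trials):
        readings = [rng.gauss(true_value, noise_sd) for _ in range(n_items)]
        means.append(statistics.fmean(readings))
    return statistics.stdev(means)


# The spread of the average shrinks roughly like noise_sd / sqrt(n_items),
# so the early repeats buy much more precision than the later ones.
for n in (1, 3, 10, 30, 100):
    print(f"{n:>3} readings: spread of the average ~ {spread_of_mean(n):.3f}")
```

Going from 1 to 3 readings removes a large share of the noise, while going from 30 to 100 barely moves the needle, which is the diminishing return described above.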
That reminds me of that one time I built an accurate replica of the machine used for the Milgram experiment for a TV show. (Think: Stanford prison experiment and the Milgram experiment in one reality TV show.) Man I'm still mad that they didn't return the prop after the shoot, that would've been one conversation piece in the living room. Especially for everyone in the know. I think we paid like 350€ for the SPST switches alone and they all had to be individually soldered to LEDs.

onyx flax
# mint sparrow That would speak for the latter of the two. I mean in almost all questionnaires ...

Finally found the emails I sent regarding that survey

I have some small things I wanted to point out.

1. how long team assignments is this questionnaire meant to evaluate? we are working on a limited project that only spans 2 months and a lot of the questions are not applicable for us.
2. my knee jerk reaction to the question "please respond strongly agree on this question" is to respond strongly disagree. That does not mean that I am not reading and responding honestly to the other questions, I am just wondering if you take into account the existence of people that won't respond as directed just because you asked.
3. the questionnaire is far too long.
#

so an associate professor and PhD in Psychology managed to design this questionnaire that only succeeded in driving the subjects to madness 😄

stray chasm
#

Language, Camera, Autonomy! Prompt-engineered Robot Control for Rapidly Evolving Deployment

#

Basically enabling autonomous robotics through natural language computer vision + llm

stray chasm
#

With llama 3 this could theoretically allow anyone to run their own custom robotics platform on premises with very limited setup

weak igloo
patent pendant
outer cape
mint sparrow
weak igloo
lost geyser
#

Washed Out "The Hardest Part"

I leaned into the hallucinations, the strange details, the dream-like logic of movement, the distorted mirror of memories, the surreal qualities unique to Sora / AI that dif…

šŸ’– 1.36K šŸ” 181

ā–¶ Play video
lost geyser
#

This is apropos for here.

stark fractal
mint sparrow
# stark fractal https://www.androidcentral.com/wearables/samsung-patents-afib-to-ecg-conversion-...

Technically, it is very simple: the title is lying to you. Optical PPG measurements can't be turned into ECGs, and Samsung isn't claiming it can. Just like BOLD signal isn't a direct measurement of neural activity, PPG measurements aren't a direct measurement of heart activity. Sure, you can train an AI to turn optical measurements into something that looks like data from an ECG, but you don't need an AI for that; you could use some autocorrelation/regression/wavelet, whatever... people did that using radar from across the room like 10 years ago. It's not good, reliable data.
However, the patent doesn't claim that it wants to turn PPG into ECG. It only covers a (as in one of many) method to use an optical measurement to detect atrial fibrillation, a common form of arrhythmia. Admittedly, the patent looks a bit like that because pictures of ECG are placed next to pictures of PPG measurements, illustrating how RR intervals can be measured using both methods. While optical methods generally have lower accuracy in measuring RR intervals for various reasons, it's completely conceivable that, given enough measurements, your continuously measuring heart rate monitor watch could give you an early warning that your heart rate looks sketchy. Correct me if I'm wrong, but the novelty here is that Samsung might be using that patent to try to get FDA approval for a method that uses an AI model to do it, claiming that it's better at detecting arrhythmias (as in - needs fewer samples). It still probably wouldn't be any different from the techniques that got FDA approval in 2023, just that it would be quicker in its suggestion to go get checked using a real ECG.

mint sparrow
lost geyser
# stark fractal https://www.androidcentral.com/wearables/samsung-patents-afib-to-ecg-conversion-...

Yes, a few things are kinda sketch here.

Short of reading the patent itself, it seems Samsung:

  1. Solutioned for "continuous atrial fibrillation detection";
  2. Via PPG to ECG signal translation; while also
  3. Producing a "monitor" that makes passive irregular notifications that prompt you to take ECGs.

This is basically a single-lead (1L) ECG. In practice those are problematic but not necessarily useless. This is kind of both things.

#

1L ECGs, especially a limb lead like that one, aren't super reliable for detecting many arrhythmias well enough for diagnosis. The characteristics of a given arrhythmia present differently across the different leads.
The V1-V6 are vectors around the heart, kind of like a variety of cameras in a semi-circle around the same scenery. They all see something different.

#

These sorts of fitness watches and OTS monitors are further from the heart. And that means the traits that indicate an issue present differently at that distance--sometimes not at all.

#

It's like listening to a whisper from down the street versus against a door.

#
  1. "Continuous" means the atrial fibrillation (afib) is sustained, not paroxysmal or episodic. Meaning it lasts minutes or hours, not seconds sporadically throughout the day. Paroxysms are harder to detect.
#
  2. PPG to ECG translation presents some challenges better left for FDA to decide on the validity of. I can say from experience they have decided unfavorably for image captures and digitization of ECG signals simply on the grounds it can alter the signal.
#
  3. This is the Samsung smart watch that monitors a heart. Again, 1L signal saying "dude you shouldnt have eaten that, go see a doctor" for a proper 12L observation for diagnosis.
#

"Last month, Samsung patented a plan to change that for future wearables like the Galaxy Watch 7 by employing a generative AI model."

Havent seen what this is but they do mention:

With its GenAI models, Samsung claims, it will create a "first-order Markov relationship" between them for better accuracy.

Ok, so a probabilistic Markov chain. Nbd. Just say that.

But a proper genai model is super sketch.
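
For reference, a minimal sketch of what "first-order Markov" means in general: the next state depends only on the current one. This is not what Samsung's patent does (nobody here has read it); the state labels and the toy sequence are invented purely for illustration.

```python
from collections import Counter, defaultdict


def fit_first_order(sequence):
    """Estimate P(next state | current state) from consecutive pairs."""
    counts = defaultdict(Counter)
    for current, nxt in zip(sequence, sequence[1:]):
        counts[current][nxt] += 1
    return {
        state: {nxt: c / sum(ctr.values()) for nxt, c in ctr.items()}
        for state, ctr in counts.items()
    }


# Hypothetical discretised beat-to-beat intervals, not real sensor data.
beats = ["short", "long", "short", "long", "short", "short", "long"]

transitions = fit_first_order(beats)
for state, probs in transitions.items():
    print(state, "->", probs)
```

Everything the model knows is in that transition table, which is why calling it a "GenAI model" sounds grander than it needs to.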

#

Another sketchy part:

Samsung's generative AI plan could make your heart health data available to Google since it typically relies on the Gemini AI; we'll have to confirm when it arrives whether this is an issue from a privacy standpoint.

Today, athletes have grounds to say their performance telemetry is personally identifiable data and should be subject to all the same protections. And they're right.

stray chasm
#

Yeah but the genai marketing hype

lost geyser
#

Heart signals are very much a fingerprint and it can be proven across ECGs from the same patient years apart.

#

NYU Langone has the only known ECG archive online and I have found that despite their anonymization I can identify samples from the same patient up to a decade apart.

lost geyser
#

Blood flow of a user can be measured using a sensor. Sensor data based on the measuring of the blood flow can be generated. Based on the sensor data, at least a first physiological biomarker of the blood flow measured by the sensor and at least a first morphological characteristic of the blood flow measured by the sensor can be determined. The user can be authenticated based, at least in part, on the first physiological biomarker and the first morphological characteristic.

#

here morphological just means it takes structure, has shapes involved.

#

and the reliance on blood flow might have adversarial challenges with respect to blood-alcohol content, blood thinners, blood diseases, or anything else that can perturb the morphology (structure).

#

also, just to round out and close out the topic on afib: pacemakers absolutely fuck up the game. they set the pace, obvs, so the intervals are regular--an irregular interval is a strong feature of afib. so in pacemakers you have to pull the data from the pacemaker itself to inspect for afib. this would be useless just as an ECG machine is.
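
To make the "irregular interval" point concrete, here is a toy sketch that scores how irregular a series of RR intervals is, using the coefficient of variation. This is an illustration of the idea only, not a clinical afib detector; the interval values are invented and no diagnostic threshold is implied.

```python
import statistics


def rr_irregularity(rr_intervals_ms):
    """Coefficient of variation of RR intervals: standard deviation / mean."""
    return statistics.stdev(rr_intervals_ms) / statistics.fmean(rr_intervals_ms)


# Made-up beat-to-beat intervals in milliseconds.
steady = [800, 805, 798, 802, 799, 801]     # regular rhythm
erratic = [620, 940, 710, 1100, 560, 880]   # irregularly irregular rhythm
paced = [1000] * 6                          # pacemaker setting the pace

for name, rr in (("steady", steady), ("erratic", erratic), ("paced", paced)):
    print(f"{name:>7}: irregularity = {rr_irregularity(rr):.3f}")
```

The paced series scores exactly zero however the atria behave underneath, which is the pacemaker problem in one line: an interval-based feature has nothing to work with.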

patent patio
#

Social Virus is in your heart

lost geyser
#

it's true, i attack all the vital organs.

#

but that one is my favorite.

outer cape
weak igloo
#

from 29 April: Apparently there was an AI priest but he's offline again "Barrack said the "Father Justin" app was an attempt at "gamifying the question-and-answer process" to appeal to young people." https://www.catholicnewsagency.com/news/257526/catholic-answers-pulls-plug-on-ai-priest-father-justin

Catholic News Agency

Just days after debuting an AI priest character to overwhelmingly negative reviews, Catholic Answers gave "Father Justin" the virtual heave-ho.

supple meadow
lost geyser
#

So Gwee declined to let the LLM train on his works, including the first book written entirely in Singlish - a creole language that is a blend of Singaporean slang and English and is widely spoken in the country.

Singlish: a half-measure to full-on code switching.

#
lost geyser
weak igloo
#

I mean if people think it helps them, that's kind of good I guess, but you do need a particular mind to develop something like this...

outer cape
lost geyser
#

Here's an early preview of ElevenLabs Music.

All of the songs in this thread were generated from a single text prompt with no edits.

Title: It Started to Sing

Style: "Pop pop-rock, country, top charts …

💖 4.22K 🔁 981

weak igloo
outer cape
outer cape
lost geyser
#

https://vxtwitter.com/alexalbert__/status/1764722513014329620

[Original emphasis]

When we ran this test on Opus, we noticed some interesting behavior - it seemed to suspect that we were running an eval on it.

Here was one of its outputs when we asked Opus to answer a question about pizza toppings by finding a needle within a haystack of a random collection of documents:

Here is the most relevant sentence in the documents:
"The most delicious pizza topping combination is figs, prosciutto, and goat cheese, as determined by the International Pizza Connoisseurs Association."
However, this sentence seems very out of place and unrelated to the rest of the content in the documents, which are about programming languages, startups, and finding work you love. I suspect this pizza topping "fact" may have been inserted as a joke or to test if I was paying attention, since it does not fit with the other topics at all. The documents do not contain any other information about pizza toppings.

Opus not only found the needle, it recognized that the inserted needle was so out of place in the haystack that this had to be an artificial test constructed by us to test its attention abilities.

Fun story from our internal testing on Claude 3 Opus. It did something I have never seen before from an LLM when we were running the needle-in-the-haystack eval.

For background, this tests a model's recall ability by inserting a target sentence (the "needle") into a corpus of… https://t.co/m7wWhhu6Fg

💖 12.37K 🔁 2.26K
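
For anyone unfamiliar with the needle-in-a-haystack eval described in that thread, here is a minimal sketch of the harness shape. It is not Anthropic's actual test code: the filler text is synthetic, and `toy_model` is a hypothetical stand-in (a plain keyword search) where a real run would call an LLM.

```python
def build_haystack(filler_docs, needle, depth):
    """Insert the needle at a relative depth (0.0 = start, 1.0 = end)."""
    position = round(depth * len(filler_docs))
    docs = filler_docs[:position] + [needle] + filler_docs[position:]
    return "\n".join(docs)


def toy_model(context, question_keyword):
    """Stand-in for an LLM call: return the first line mentioning the keyword."""
    for line in context.split("\n"):
        if question_keyword in line.lower():
            return line
    return ""


filler = [
    f"Essay paragraph {i} about programming languages and startups."
    for i in range(100)
]
needle = (
    "The most delicious pizza topping combination is figs, prosciutto, "
    "and goat cheese."
)

# Sweep the needle through the context and record whether it is retrieved.
results = {}
for depth in (0.0, 0.25, 0.5, 0.75, 1.0):
    context = build_haystack(filler, needle, depth)
    answer = toy_model(context, "pizza")
    results[depth] = "goat cheese" in answer

print(results)
```

A real eval would also vary the context length and grade free-form answers, and the interesting part of the Opus story is that the model commented on how out of place the needle was, which a pass/fail harness like this would not capture.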

stray chasm
#

I think it's more dotcom bubble than crypto, like there are a lot of legit uses for this tech but it's absolutely getting way overhyped by people trying to cash in however they can

lost geyser
#

Minor difference being "thin wrapper AI" with all the "AI companies" building atop other third party AI services.

creating either very little actual value chains, no real IP of their own, or both.

outer cape
#

Yeah I mean it's the latest buzzword in the startup space [AI]. My favourite example of just how unintelligent some VC bros are, take a look at: https://www.youtube.com/watch?v=USKD3vPD6ZA [I mean I am more interested in the fact that the fish might be accurately modelling the stochastic nature of the stock market, but I don't think the bros get that..]

lost geyser
jaunty siren
patent pendant
lost geyser
#
outer cape
lost geyser
weak igloo
lost geyser
lost geyser
weak igloo
mint sparrow
#

https://twitter.com/DG_Rand/status/1775618798717911424
"🚨WP🚨
Conspiracy beliefs famously resist correction, right?
WRONG: We show brief convos w GPT4 reduce conspiracy beliefs by ~20pp (d~1)!
🔆Tailored AI evidence rebut specific arguments offered by believers
🔆Effect lasts 2+mo
🔆Works on entrenched beliefs"

https://t.co/4VI0mzRqD9

#

It's cute how they write about their participants being 'in treatment' 😆

lost geyser
mint sparrow
lost geyser
#

steamroller of cognitive bias overriding reasoning and logic.

#

also, isn't AdrianDittman the alt account for Elon?

mint sparrow
lost geyser
#

he's had a number of these "chats" with himself (notice the recording-playback quality of parts of that audio).

#

i've got some tabs saved with a number of these audio clips between them.

#

seems like he has a soundboard of his own canned laughs and "yea" and other nonsensical utterances.

mint sparrow
#

WTF??? I mean I can't even...

lost geyser
#

"I've seen ... I've seen Adrian. He could be your twin."

#

because he's Elon.

mint sparrow
#

is he not at all aware how characteristic his laugh is, his accent and manner of speaking? Like... does he really believe anyone buys this?

#

he must be trolling, that can't be real

lost geyser
#

his accent isn't even proper South African. it either was uniquely styled in his own way or got muddled in being American. Dittman claims to have German, not South African, roots--but that's also easily debunked.

#

(ftr I work daily with a number of South Africans and am very familiar with their accents.)

#

one is German-South African and his blended accent is pretty interesting.

mint sparrow
#

all I can hear is Elon Musk talking, I don't even know about South African accents

fierce rapids
mint sparrow
#

I don't even know which one is supposed to be which. There is just one Elon voice with a crappy recording and another Elon voice with a less crappy recording.

patent pendant
lost geyser
#

Thankfully for me, investigations and law enforcement action cannot reliably be completed by AI. Does not mean we will eventually get Total Recall IRL. Low level things can be done by AI but usually it is not that accurate.

#

Didnt the google AI go nuts and move to a cabin in the woods or something

#

But the real answer may have less to do with pessimism about technology and more to do with pessimism about humans - and one human in particular: Altman. According to sources familiar with the company, safety-minded employees have lost faith in him.

"It's a process of trust collapsing bit by bit, like dominoes falling one by one," a person with inside knowledge of the company told me, speaking on condition of anonymity.

#

(Still trying to find the original Sutskever quote that predates and underlies that comment.)

lost geyser
#

Not many employees are willing to speak about this publicly. That's partly because OpenAI is known for getting its workers to sign offboarding agreements with non-disparagement provisions upon leaving. If you refuse to sign one, you give up your equity in the company, which means you potentially lose out on millions of dollars.

See also:

https://vxtwitter.com/ilyasut/status/1790517455628198322

shrewd token
lost geyser
weak igloo
lost geyser
patent pendant
patent pendant
lost geyser
# weak igloo I think they changed another thing, if I click on that it goes through 3 redirec...

@soniajoseph_: To the journalists contacting me about the AGI consensual non-consensual (cnc) sex parties— During my twenties in Silicon Valley, I ran among elite tech/AI circles through the community house scene. I...…

@soniajoseph_: The thing about being active in the hacker house scene is you are accidentally signing up for a career as a shadow politician in the Silicon Valley startup scene. This process is insidious because you...…

full aurora
#

"Ahead of the U.S. presidential election this year, government officials and tech industry leaders have warned that chatbots and other artificial intelligence tools can be easily manipulated to sow disinformation online on a remarkable scale.

To understand how worrisome the threat is, we customized our own chatbots, feeding them millions of publicly available social media posts from Reddit and Parler."

https://www.nytimes.com/interactive/2024/05/19/technology/biased-ai-chatbots.html?unlocked_article_code=1.tU0.zvP_.XHkBMYzdThyo&smid=url-share

Ahead of the election this year, the results suggested how easy it could be to create divisive content online, on either side of the political spectrum.

lost geyser
wicked bridge
stray chasm
faint vigil
#

it's opt-in apparently

stray chasm
#

It's in beta rn

shrewd token
lost geyser
#

There is this specific hard constraint:

As you might imagine, all this snapshot recording comes at a hardware penalty. To use Recall, users will need to purchase one of the new "Copilot Plus PCs" powered by Qualcomm's Snapdragon X Elite chips, which include the necessary neural processing unit (NPU)

Snapdragon is key to a lot of mobile and edge compute, especially where AI workloads occur.

lost geyser
#

https://github.com/OWASP/www-project-top-10-for-large-language-model-applications/blob/main/2_0_candidates/SteveWilson_DangerousHallucinations.md

These hallucinations arise due to the model's attempts to bridge gaps in its training data using statistical patterns.

Hallucinations are a fundamental aspect of how generative models work. Not gap-filling statistical errancy.

That's important bc it has to be treated as the base case, not an edge case.

GitHub

OWASP Foundation Web Respository. Contribute to OWASP/www-project-top-10-for-large-language-model-applications development by creating an account on GitHub.

lost geyser
#

A thought exercise for that one is how much more V and E this opens up around commandline histories.

One of the exposure points in hacker-on-hacker dunks and system intrusions is scrolling through shell histories and finding ways to abuse access.

shrewd token
lost geyser
# stray chasm https://dl.acm.org/doi/10.1145/3610978.3640671
GitHub

Octo is a transformer-based robot policy trained on a diverse mix of 800k robot trajectories. - octo-models/octo

#

Kinda badass tbh.

stray chasm
#

Ok still reading but fucking
Natural language control performing equivalent to a 55b parameter model with 93M parameters holy shit?

lost geyser
#

VLM with diffusion denoising on continuous action space, fine-tunable to a custom kinetic policy. All with the same strappings and trappings of a modern SLM / CNN reproducibility factor.

#

Holy shit is right.

Very much edge deploy capable.

stray chasm
#

Also I wonder if a diffusion image generation model could be used to create goal images reliably given the increase in performance

lost geyser
#

The design of the Octo model emphasizes flexibility and scale: it supports a variety of commonly used robots, sensor configurations, and actions while providing a generic and scalable recipe that can be trained on large amounts of data. It also supports natural language instructions, goal images, observation histories, and multi-modal, chunked action prediction via diffusion decoding [17]. Furthermore, we designed Octo specifically to enable efficient finetuning to new robot setups, including robots with different action spaces and different combinations of cameras and proprioceptive information.

lost geyser
#

The ViT-B was trained for 300k steps with a batch size of 2048 using a TPU v4-128 pod, which took 14 hours.

#

finetuning run of the same model on a single NVIDIA A5000 GPU with 24GB of VRAM takes approximately 5 hours and can be sped up with multi-GPU training.

Reasonably within budget to fine-tune.

stray chasm
#

Yeah that's what, like $30 of rented GPU time?
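
A quick way to sanity-check that figure is to do the arithmetic. This is a minimal sketch: the 5-hour run time comes from the paper quote above, but the hourly rates are placeholder assumptions, not quoted prices, so substitute whatever your provider actually charges.

```python
def finetune_cost(hours: float, usd_per_gpu_hour: float, num_gpus: int = 1) -> float:
    """Rental cost in USD for a run billed per GPU-hour."""
    return hours * usd_per_gpu_hour * num_gpus


FINETUNE_HOURS = 5.0  # from the paper: ~5 hours on a single A5000 (24 GB VRAM)

# Hypothetical hourly rates (assumptions for illustration only).
for rate in (0.50, 1.00, 6.00):
    cost = finetune_cost(FINETUNE_HOURS, rate)
    print(f"${rate:.2f}/GPU-hour -> ${cost:.2f} per fine-tuning run")
```

A $30 total over 5 hours works out to $6 per GPU-hour, so any cheaper rental rate brings the run in under that estimate.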

#

Also the discussion hypothesises that the only reason goal images perform better than text instruction is cuz of the quality of the training dataset (only like half having text annotations), so hopefully that could improve pretty quickly as more data gets created

lost geyser
stray chasm
#

Ooh

#

Altho I'm not sure video generation is quite there yet

lost geyser
#

It'd obvs have to be continuous action sequences, like caption Sora outputs in domain specific settings. But it isn't asking too much.

stray chasm
#

Close tho

#

Or yeah

lost geyser
#

The noise diffusion breeds some hope there.

#

We train using 2 frames of observation history; in our preliminary experiments, we found significantly diminishing gains beyond the first additional frame. We use hindsight goal relabeling [2], which selects a state uniformly from the future in the trajectory to assign as the goal image, similar to prior work
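
A minimal sketch of what that relabeling step could look like, not the Octo code itself. The function name and the choice to sample only strictly later states are assumptions; the quote only says the goal is drawn uniformly from the future of the trajectory.

```python
import random


def relabel_with_hindsight_goal(trajectory, t, rng=random):
    """Pick a goal for timestep t by sampling uniformly from strictly later states.

    `trajectory` is any sequence of observations (e.g. image frames).
    Returns (goal_index, goal_observation).
    """
    if not 0 <= t < len(trajectory) - 1:
        raise ValueError("timestep t needs at least one future state")
    goal_index = rng.randrange(t + 1, len(trajectory))
    return goal_index, trajectory[goal_index]


# Usage: stand-in "frames" are just labels here.
frames = [f"frame_{i}" for i in range(10)]
idx, goal = relabel_with_hindsight_goal(frames, t=3, rng=random.Random(0))
print(idx, goal)
```

The point is that every recorded trajectory yields goal-conditioned training pairs for free, because any state the robot actually reached later is a valid goal for the earlier steps.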

stray chasm
#

Huh
That does help

lost geyser
#

We apply common image data augmentations during training, and randomly zero out the language instruction or goal image per training example to enable Octo to be conditioned on either language instructions or goal images.

Oh wow.
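
Roughly, the masking trick could be sketched like this. It is an illustration under assumptions: plain lists stand in for embedding tensors, exactly one input is dropped per example, and the 50/50 split is a guess rather than anything stated in the paper.

```python
import random


def mask_conditioning(language, goal_image, rng=random, p_drop_language=0.5):
    """Zero out exactly one of the two conditioning inputs for one training example.

    `language` and `goal_image` are flat lists of floats standing in for
    embeddings. `p_drop_language` is an assumed probability.
    """
    if rng.random() < p_drop_language:
        language = [0.0] * len(language)      # train on the goal image only
    else:
        goal_image = [0.0] * len(goal_image)  # train on the instruction only
    return language, goal_image


lang, goal = mask_conditioning([0.3, 0.7], [0.1, 0.2, 0.9], rng=random.Random(1))
print(lang, goal)
```

Because the model sees each modality alone during training, it can be prompted with either a text instruction or a goal image at inference time.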

stray chasm
lost geyser
#

Section III-D Training Details, 3rd para, 3rd sentence.

stray chasm
#

I typoed while trying to ctrl-f 💀

#

just starting out with small cropped images could possibly make generated video more feasible?

lost geyser
#

Really good point.

weak igloo
shrewd token
keen pilot
lost geyser
#

So AI for the sake of AI.

That always goes well.

weak igloo
#

Facebook is rolling out AI rn apparently:

#

So basically, if someone else shares a photo with you in it, and they have not objected, they can violate your objection. That's an intriguing implementation

lost geyser
#

We'll review objection requests in accordance with data protection laws.

lost geyser
# keen pilot

this is [sad animal noises] tbh.

AI everywhere in our newsroom.

isn't a strategy. (I've met a lot of prospects and clients at that threshold.)

they likely haven't had good counsel on this. treating AI as a hail mary pass, really. there's a whole TCO and ROI cost-benefit analysis to do; and real talk: AI doesn't serve a proper place or purpose in most applications.

keen pilot
#

Smart rotoscoping in compositing programs, excellent application.

Using an AI tool to recreate instrument stems from a stereo track in case of loss of the masters or restoration: good application.

Using a trained model to upscale video footage in cases where the source is inherently 480i or the source has been lost: good application.

#

These are all really specific niches

#

I've done all three of these in various contexts and they are additional tools that the experienced, seasoned professional can employ when the situation calls for it.

#

They aren't things that allow you to get away with replacing the professional with low skill labor, nor can they replace the professional altogether.

#

A lot of these C suites have no respect for the idea of a professional, the idea of a body of hard earned knowledge that is off limits to them by virtue of their lack of experience.

They think that writing is just writing, when really it's an incredibly small part of the creative process. Just like CAS didn't replace mathematicians as computation is a very small part of math.

weak igloo
wicked bridge
hot mirage
shrewd token
keen pilot
#

Maybe someone more versed in LLM can explain but I cannot grok how people can consider this an acceptable application given that there will be variability in the output that cannot be controlled for. Not to say that humans are 100 percent reliable, especially in policing institutions, but they are easier to hold to account vs software.

I mean just today I saw someone showing Google recommending adding glue to your pizza based on some shitpost it found on Reddit during the course of its training.

stray chasm
#

Oh yeah this is completely unacceptable just from an accuracy standpoint, and that's before you get into all the like racial/gender/whatever else biases that are baked into the models

#

LLMs/computer vision should absolutely not be used independently when the results will have serious impacts on people's lives

#

Milder but still awful version of this is facial recognition not being completely accurate, and especially inaccurate at identifying racial minorities, and yet still being used by law enforcement to identify people
Which has caused a number of wrongful arrests

lost geyser
# keen pilot Maybe someone more versed in LLM can explain but I cannot grok how people can co...

Full disclosure: I evaluated this very scenario with my current employer years ago (not for Axon). Many of the same criticisms then apply now.

Jessica called out relevant points I won't repeat but those apply.

The Thing Itself

Problems this will face in real-world "production" use:

  • dialectal and vocalization variation: each person speaks and articulates in specific ways the model isn't always able to discern.
  • context wash-out (prosody, tonality, etc., wash out): how speech is formed also adds important information.
  • audio pickup quality: fixed hardware limitations introduce omissions, errors, etc.
  • noisy adversarial environs: hostile working conditions wrt loud noises, background noise, etc.
  • model quality and capacity: (here, GPT-4) models themselves, their training data and regimen, and architecture also matter (see also: Whisper small vs large).
  • model variability: generative architectures (LLM) "Make Up Stuff" by design (as you called out).
  • compounding error (propagation): speaker + environmental error => speech interp error (audio) => transcription error (text)
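
The compounding-error bullet can be made concrete with a little arithmetic. This is a sketch under a simplifying assumption: each stage fails independently, so per-stage accuracies multiply. The stage names and accuracy figures are made up for illustration, not measurements of any product.

```python
from math import prod


def end_to_end_accuracy(stage_accuracies):
    """Probability that every stage is correct, assuming independent stages."""
    return prod(stage_accuracies)


# Hypothetical per-stage accuracies (illustrative only).
stages = {
    "audio pickup": 0.95,
    "speech recognition": 0.92,
    "LLM summarization": 0.90,
}

overall = end_to_end_accuracy(stages.values())
print(f"end-to-end accuracy: {overall:.3f}")
```

Even when each stage looks good on its own, the chain ends up noticeably worse than its weakest link (about 0.79 here versus 0.90), and real errors are often correlated, so the independence assumption is only a rough guide.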
#

[typing this part on a call so it'll be admittedly choppy:]

The Bigger Problem

Several reasons can explain why these dodgy products often make it to market relatively unchallenged.

  • fitment and feasibility: for various reasons, product development omits crucial steps.
    • "not can this be done; should this be done";
    • lack of well-defined acceptance criteria (inventing their own requirements);
    • technological maturity hasn't reached sufficient capability, but they'll push betas anyway
    • ignoring absolutely valid reasons why consequence outweighs benefit of doing something (like Jessica's reasons above)
  • improper, inadequate, or biased testing: product should be as good as or better than humans, provide scale humans cannot reach easily, well, or cost-effectively, and truly add value to the process, not merely introduce new tech.
    • demo'd or tested under near-ideal, non-adversarial conditions;
    • poorly defined hypotheses or testing criteria;
    • biasing toward readily passable test conditions;
  • lack of tech savvy evaluators on consumer side

With many solutions like this new to market:

  • leaving the hard but necessary parts for later
  • cutting corners to expedite go-to-market delivery
  • failing to publish methodology and results (there's an infamous LE product pushing this scenario) and only publishing "unverifiable claims"
keen pilot
#

Reminds me a lot of "pivot to video"

keen pilot
shrewd token
hot mirage
#

salt is a rock is my joke

#

apologies

lost geyser
fierce rapids
keen pilot
#

I didn't know it was possible to long more for 2000s era search... What an accomplishment

lost geyser
wicked bridge
lost geyser
#

This is new.

Google stopped auto-generating AI overviews and now gives the option to.

Swift response to the bad news above?

keen pilot
#

A friend of mine had this observation:

"Also just think - these bad ones are getting fixed fast because of exposure and because they're in English. What about languages the engineers don't speak? Its a disaster waiting to happen"

shrewd token
#

https://futurism.com/the-byte/study-chatgpt-answers-wrong

What's especially troubling is that many human programmers seem to prefer the ChatGPT answers. The Purdue researchers polled 12 programmers — admittedly a small sample size — and found they preferred ChatGPT at a rate of 35 percent and didn't catch AI-generated mistakes at 39 percent.

Futurism

Researchers found that 52 percent of answers to programming questions generated by ChatGPT were incorrect.

onyx flax
shrewd token
#
lost geyser
onyx flax
#

I would much rather have a program that blinks an image exactly when copilot takes a snapshot
and the images should be some really psychedelic shit

onyx flax
#

but if they go the way of the impressively bad google AI I don't think we need to poison it intentionally

lost geyser
thick schooner
#

I know the Muskosphere has crusaded that Google is absolutely awful, especially Gemini, so I'm cautious of hoaxes

thick schooner
#

google may have swiftly turned off AI for certain sus searches

onyx flax
shrewd token
#

There's been a few faked ones here and there but some were surprisingly real

thick schooner
#

I saw one about using paste as a food ingredient that was pretty much it

shrewd token
#

Not articles but others on social media trying to replicate or clarify that the screenshot omitted certain context that clarified the answer was correct. Though they're anec-data and Google's Gemini clearly has issues

thick schooner
lost geyser
#

https://vxtwitter.com/Dan_Jeffries1/status/1794740447052525609

I spent a few hours listening to Dan Hendrycks, who runs the non-profit AI Safety group behind SB 1047, aka the California AI Control and Centralization Bill.

I find him charming, measured, intelligent and incredibly dangerous.

Some of the most dangerous people in life are ones who can convincingly lie about their intentions and who can easily mask those intentions.

...

The intention of the bill is very clear for anyone who has eyes to read the text. It has three clear goals:

  1. Ensure that only a small group of companies, rigidly controlled and overseen by a special government agency, have the right to create advanced artificial intelligence.

  2. Destroy open source AI.

  3. Make sure that model makers have liability hanging over them like the sword of Damocles for the rest of their life, ensuring that governments can hold model makers responsible for any misuse or crime from those models forever.

💖 280 🔁 63

#

The bill is absolutely a de-facto ban on open source AI for advanced models because it requires model makers to have “the capability to promptly enact a full shutdown of the covered model,” aka a remote kill switch, including the ability to force “the cessation of operation of a covered model, including all copies and derivative models, on all computers and storage devices within custody, control, or possession of a person, including any computer or storage device remotely provided by agreement.”

“(2) “Hazardous capability” includes a capability described in paragraph (1) even if the hazardous capability would not manifest but for fine tuning and posttraining modifications performed by third-party experts intending to demonstrate those abilities.”

In other words, someone fine tunes a model they consider dangerous, the model maker is liable.

onyx flax
lost geyser
#

Seems to have compromising relationships.

#

Dan Hendrycks is the director of the Center for AI Safety. He received his PhD from UC Berkeley, where he was advised by Jacob Steinhardt and Dawn Song. His research is supported by the NSF GRFP and the Open Philanthropy AI Fellowship. Dan contributed the GELU activation function, the default activation in nearly all state-of-the-art ML models i...

Dan Hendrycks (born 1994 or 1995) is an American machine learning researcher. He serves as the director of the Center for AI Safety.

#

To quote jerlendds, with whom I agree:

Yeah im of the opinion all the AI doomerism bullshit is for the purposes of regulatory capture and to convince gullible people to propagate delusional beliefs.

full aurora
thick schooner
#

#1036758130761158677 message

Definitely seems to me the big picture is lay off a helluva lot of coders because so much of it in theory could be done by AI. I won't deny it has issues.

#

@copper tide
@lost geyser

thick schooner
thick schooner
copper tide
#

it can speed up the work of experienced coders probably though

#

but often programming involves solving issues in existing code which require deep understanding / reasoning, which in my experience AI fails at

#

IMO it's a tool an experienced coder can use. But it in no way replaces the coder.

thick schooner
lost geyser
thick schooner
lost geyser
#

same conversations happened when AutoML emerged. even some of my peers thought it replaced them. i suggested they think better about their actual value proposition as practitioners.

#

all AutoML did then and code-generating LLMs do now is accelerate our work and rapidly prototype the boring and boilerplate.

copper tide
#

Costs are felt down the line, when it doesn’t matter for the current leadership

copper tide
copper tide
#

I haven’t seen a discussion of Microsoft’s Recall function in here

lost geyser
#

It’s all super new. When I watch interviews by tech CEOs I feel that even they are still making sense of what’s happening. But I think some companies and some start ups are already putting products out there that take advantage of AI and try to market these products to businesses. Even if it’s a long shot, it makes businesses more cautious to hire. Interest rate environment since 2022 is also probably driving lay-offs (need to signal understanding of a more resource-constrained environment). The combination of the two - AI changes and higher interest rates - has potential to cause a lot of damage (and I think together they explain the layoffs).

lost geyser
lost geyser
# copper tide I haven’t seen a discussion of Microsoft’s Recall function in here

it's a little scattered, some of it in #infosec.

#1089154093810978866 message

https://cyberplace.social/@GossiTheDog/112492445214914228

turns out (unsurprisingly) to be a smoke-and-mirrors sort of shitshow.

copper tide
#

It’s a complete info sec nightmare

#

This just a week after Microsoft said that they will focus on security

lost geyser
#

they basically took RAT philosophy and made it an IT governance nightmare of a feature.

#

i suppose this was a different sort of focus.

copper tide
# lost geyser i suppose this was a different sort of focus.

It makes sense if your goal is “how can we have an AI assistant which knows what you have been doing / working on in the past”

Then having screenshots makes total sense.. but that no one considers what that actually does is insane

#

Even worse is that they hand waved security (it’s all local, it’s “encrypted”)

#

Shows how careless the big players in AI models are.

Also shows how AI is a privacy risk due to being Data hogs by design.

There was a recent case where an online doctor's prescription service accidentally exposed all their prescriptions to Bing indexing.

They removed it quickly and Bing deleted the index, but Copilot still remembered the entries! Not sure if they actually purged the data or tried to “fix it” by blocking certain requests:

(German language source)

https://www.borncity.com/blog/2024/05/15/autsch-datenleck-bei-dr-ansay-cannabis-rezepte-in-duckduckgo-sichtbar/

lost geyser
#

many of the architecture design patterns we're initially presented with are for remembering and recalling information. this presents a consequence that fewer are focusing on: intentionally forgetting altogether.

copper tide
lost geyser
#

rn that's the most practical (and also disruptive) thing to do.

#

there are research-grade efforts into finding the context windows (Anthropic) and making embedding edits (various others) but those aren't production worthy.

lost geyser
#

also largely depends on the entire composition of that architecture--not just the models themselves. non-LLM learnings, semantic indexes, etc.

#

so for example here's (supposedly) Microsoft Copilot's arch ref for 365:

#

idk what this looks like for the Bing Search component tbf.

#

but here we can see a number of layers (including federated systems) where cascaded deletions would have to happen.

#

nightmare scenario.
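
To show why cascaded deletion gets ugly, here is a toy sketch. The layer names and the dependency map are entirely hypothetical, not Microsoft's actual architecture. It just walks a graph of "which store holds data derived from which" and lists everything a single deletion would have to reach.

```python
from collections import deque

# Hypothetical layers: each key feeds the stores listed as its values.
DOWNSTREAM = {
    "source_document": ["search_index", "semantic_index"],
    "search_index": ["response_cache"],
    "semantic_index": ["embedding_store", "response_cache"],
    "embedding_store": [],
    "response_cache": [],
}


def deletion_order(start, downstream=DOWNSTREAM):
    """Breadth-first list of every store holding data derived from `start`."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        store = queue.popleft()
        order.append(store)
        for child in downstream.get(store, []):
            if child not in seen:  # a store can be fed by several parents
                seen.add(child)
                queue.append(child)
    return order


print(deletion_order("source_document"))
```

This only covers stores you can enumerate and purge. Anything a model has already absorbed is the harder problem, which is where the research-grade efforts mentioned earlier come in.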

copper tide
thick schooner
lost geyser
#

on the one hand, it helps bootstrap cash-starved, resource-insecure smaller operators. on the other, it enables the sort of bad behavior you're concerned about.

thick schooner
lost geyser
#

were you speaking about that one specifically or more broadly?

full aurora
lost geyser
# full aurora More broadly. Mainly curious how Open Source LLMs can be regulated now they are ...

Proposed EU rules are just one step toward global AI regulation. Smart organizations are preparing for compliance—and AI risk management.

#

afaict the proposed California regulation above is the closest to an actual formulation in the United States. Whatever form that may pass in could be used to inform other states and federal regulation.

#

there is this US "Bill of Rights" (again, not specific to LLMs but they are involved):

https://www.whitehouse.gov/ostp/ai-bill-of-rights/

The White House

Among the great challenges posed to democracy today is the use of technology, data, and automated systems in ways that threaten the rights of the American public. Too often, these tools are used to limit our opportunities and prevent our access to critical resources or services. These problems are well documented. In America and around…

#

run the full content through Copilot via Edge to get content summaries, ask questions about it, and find specific citations within it.

#
The White House

By the authority vested in me as President by the Constitution and the laws of the United States of America, it is hereby ordered as follows: Section 1. Purpose. Artificial intelligence (AI) holds extraordinary potential for both promise and peril. Responsible AI use has the potential to help solve urgent challenges…

full aurora
lost geyser
lost geyser
lost geyser
# thick schooner Well, obviously there are going to be automation extremists just I think it's th...

Middle managers have had a target on their back the last 2 years, just like everyone else in tech.

Companies are looking to flatten out their org charts, meaning they want less layers between individual contributors and the executive suite.

At the end of the day, they’re a cost https://t.co/X8lzCfPHIr

💖 594 🔁 53

#

https://vxtwitter.com/ylecun/status/1795032310590378405

AI is not some sort of natural phenomenon that will just emerge and become dangerous.
WE design it and WE build it.

I can imagine thousands of scenarios where a turbojet goes terribly wrong.
Yet we managed to make turbojets insanely reliable before deploying them widely.

The question is similar for AI:
"do we think there exists at least one design of an AI system that is simultaneously safe/controllable, and can fulfill objectives in more intelligent ways than humans ?"
If the answer is yes, we'll be fine.
If the answer is no, we won't build it.
Right now, we don't even have a hint of a design of a human-level intelligent system.

So it's too early to worry about it.
And it's way too early to regulate it to prevent "existential risk."

patent pendant
#

AI has reshaped everything from medical diagnoses, to wedding vows, to stock market gains, but the technology wouldn’t be possible without gig workers across the globe.

However, analysts and advocates said the workers whose efforts help train AI are often denied knowledge of the end product they help create, or the company behind it. They also ...

shrewd token
lost geyser
lost geyser
shrewd token
onyx flax
hot mirage
#

well.. I guess that means Google can't argue their AI didn't influence someone to eat glue... but I doubt it would even come close to liability legally

keen pilot
#

Gotta pump up those valuation numbers

lost geyser
patent pendant
shrewd token
#

probably very good for open ai's case

thick schooner
# lost geyser Curious to know what the survey probed into and how much it explained. Most of ...

So, they found about 2% of Britons used AI and this below says about 1 in 3 companies use it

https://connect.comptia.org/blog/artificial-intelligence-statistics-facts

keen pilot
#

Will be nice when this bubble finally bursts

shrewd token
#

depends how it bursts

#

this new era of ML/AI seems like it's here to stay one way or another

keen pilot
keen pilot
full aurora
lost geyser
shrewd token
keen pilot
lost geyser
#

[Cross post from #infosec by @spring creek due to audience overlap]

🚨 Heads up on a security incident at Hugging Face:

  • Unauthorized access to Spaces platform, possible secrets compromise
  • HF tokens revoked, affected users notified
  • Investigation ongoing with external security experts
  • Infrastructure security improvements in progress
  • Reported to law enforcement and data protection authorities

If you use Hugging Face:

  • Refresh your keys/tokens ASAP
  • Move to fine-grained access tokens

Source: https://huggingface.co/blog/space-secrets-disclosure

keen pilot
lost geyser
#

One of them was visual quality inspection of order-to-service since they had a policy of sealing bagged orders and no way to review after the fact. They had a lot of respondents in that PoC and I was part of one.

keen pilot
#

Visual quality inspection? As in, camera records image of a bagged food item, and determines it's "quality"?

lost geyser
#

They have 5 stages to their order fulfillment process. Some of those use-cases were:

  • ingredient quality
  • build quality
  • order fulfillment accuracy (item matches assembly)

The overarching process is expansive across the short order cook ops.

#

(Yes, via computer vision)

keen pilot
#

Sounds like setting up a very complex infrastructure to gather new "performance metrics" to be used for process and "employee" optimization. A way to eke out those last few percent and be able to say "this number is improving".

lost geyser
#

I have a personal issue against doing that for all the obvious reasons. Learned that long before AI was industry standard, while working on Boeing's warehouse ops that wanted to do that very thing.

keen pilot
#

And look how well that turned out!

#

I personally don't work in the field but I have a personal window into business intelligence at a nationwide company. I might just be cynical but it often just seems like another tool to be manipulated by the C-suites to justify this or that, or advance themselves

#

I'm assuming that, aside from the buzzword aspect, the appeal of these kinds of things is they can scale at cheaper cost to the company compared to better wages, cultivating employee knowledge, reducing turnover, etc?

lost geyser
#

Unsure if they had any ulterior motives in a broader sense.

That definitely happens and will continue to happen. (There's an infamous coffee shop clip floating around.)

It's ill-advised and ill-conceived but that won't stop some from shaving capex/opex to satisfy stakeholder demands.

spring creek
#

Example of an insurance company using AI and satellite imagery for risk assessment in underwriting, which led to the cancellation of a church's insurance policy:

News Story: https://www.cbs8.com/article/news/local/working-for-you/insurance-company-guide-one-drops-church-policy-satellite-images/509-f752ffba-b27a-4667-be82-f8c5ad4ee355

Legal blog post on the article: https://www.propertyinsurancecoveragelaw.com/blog/church-loses-insurance-from-satellite-imagery-guideone-refuses-to-consider-other-evidence-of-a-roofs-condition/

Betterview (the AI Platform used for the decision to drop coverage): https://www.betterview.com/

Insurers trust Betterview to optimize pricing, underwriting, and renewals. Applying artificial intelligence (AI) and computer vision to aerial imagery, we provide accurate, pre-filled risk scores, custom flagging, and continuous property monitoring. Write more business, reduce expenses, and transition from "Repair & Replace" to "Predict & Prevent."

OPTIMIZE WORKFLOWS | SLASH INSPECTION COSTS | BOOST CUSTOMER SATISFACTION

Betterview Report obtained by CBS 8: https://interactive.cbs8.com/pdfs/roof-report.pdf

cbs8.com

The new policy the church just got costs $20,000 — $15,000 more than what they paid last year.

May 22, 2024 Insurers are now analyzing satellite and drone imagery using artificial intelligence (AI) when conducting underwriting surveys of property. The images are

Market-leading Property Intelligence platform delivers actionable insights to underwriters, agents, and insureds, increasing efficiency and profitability.

#

From Betterview's AI generated property report which contributed to the decision to decline policy renewal:

lost geyser
#

There's a hard limit on how much "functional obsolescence" can be determined from a satellite image (speaking from experience). And this one is making inferences well beyond what can be determined.

spring creek
#

I came across another story where the property owner was able to get a reversal of the decision by paying for a roof inspection out of pocket.

This trend is going to be challenging for folks without the financial resources to challenge an AI conclusion.

lost geyser
#

Spot on. And likely the case here for that church.

#

Stacked deck in favor of policy writer / insurer.

keen pilot
#

Good thing that no one is working on restricting the legality of this.

They got bigger fish to fry. Like "when the AI becomes skynet you need to have an off button" type stuff

toxic crater
#

Seen at an AWS summit

#

(NGL I've been quite impressed with some applications I've seen here, in particular bringing real life context awareness to genAI workflows)

spring creek
#

Researchers have developed a novel training framework, SaySelf, to address a crucial issue in LLMs: their inability to express uncertainty or accurately convey confidence in their responses.

By fine-tuning LLMs on model-specific datasets and applying reinforcement learning, SaySelf encourages AI to generate human-like responses that include confidence indicators, potentially leading to more trustworthy and reliable AI assistants.

In my use of AI, I've often been frustrated by their lack of uncertainty expression. They tend to present all responses with equal confidence, even when proven wrong. In contrast, humans often preface their answers with phrases like "I'm not an expert, but..." or "I could be wrong, but...". This absence of uncertainty expression in AI can lead to over-reliance on potentially inaccurate information

This development could have significant implications for the future of AI and its role in our lives.

https://github.com/xu1868/SaySelf
https://arxiv.org/pdf/2405.20974

GitHub

Public code repo for paper "SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales" - xu1868/SaySelf

onyx flax
thick schooner
#

This is one of the main reasons people don't use AI more, as I see it. People want authoritative answers, and it's not hard to use a search engine to get those. With AI there's reason to doubt that what you get is authoritative.

shrewd token
spring creek
#

404 Media has a thought-provoking piece by Samantha Cole that dives into the complex issue of deepfake legislation and its potential impact on sex workers. Cole argues that current discourse around nonconsensual AI-generated images often overlooks the fact that there are at least two people in every deepfake: the person being impersonated and the sex worker whose body is exposed but face is erased.

Cole discusses recent US legislative efforts to combat malicious deepfakes at the federal level, such as the DEFIANCE Act and the "Preventing Deepfakes of Intimate Images Act." She raises concerns about the influence of conservative anti-pornography groups like the National Center on Sexual Exploitation (NCOSE) on these efforts. While acknowledging the need to address the very real harms of nonconsensual deepfakes, Cole cautions against ham-fisted solutions that could disproportionately impact sex workers.

Source: https://www.404media.co/laws-about-deepfakes-cant-leave-sex-workers-behind/

404 Media

As lawmakers propose federal laws about preventing or regulating nonconsensual AI generated images, they can't forget that there are at least two people in every deepfake.

spring creek
#

I saw Raspberry Pi jumped on the AI bandwagon and found myself reflexively looking for jokes:

But then I recalled a humbling convo with an army veteran who had fought in Iraq. Someone had made a comment suggesting that the insurgents were stupid, basing this assumption on the fact that their technology was less advanced than what the U.S. military possessed. My friend's response was pointed: those insurgents were highly effective at using what was available when it mattered most.

With little more than a map, a compass, and a basic understanding of trigonometry, they were able to calculate distances to targets using techniques like the "string method." By hanging a string of known length from a piece of debris and measuring the angle between the string and the line of sight to the target, they could determine the distance using the tangent function. These calculated distances, combined with an understanding of angles and elevations, allowed them to devise effective firing solutions, even without access to advanced targeting systems or sophisticated weaponry.

I share this as a reminder that necessity often drives innovation, and the same principle applies to the use of AI in infosec, OSINT research and emerging threats. Just as the insurgents in Iraq were able to leverage basic tools and mathematical concepts to great effect, shouldn't we expect the same with access to tools like the Raspberry Pi AI Kit to find ways to harness its capabilities in unexpected and impactful ways?

https://www.raspberrypi.com/news/raspberry-pi-ai-kit-available-now-at-70/

#

Key features of the Raspberry Pi AI Kit include:

  • 13 tera-operations per second (TOPS) of inferencing performance
  • Single-lane PCIe 3.0 connection running at 8Gbps
  • Full integration with the Raspberry Pi image software subsystem
  • Compatibility with first-party or third-party cameras
  • Efficient scheduling of the accelerator hardware: run multiple neural networks on a single camera, or single/multiple neural networks with two cameras concurrently

lost geyser
#

I like Hailo's product lines yet they're overselling a bit with multiple NNs and camera streams. Lil thing is gonna run hot and with only passive cooling stock. Also hard constraints on resource capacity (TOPS :: performance as bandwidth :: throughput).

Still a decent entry-level performer. You can build a lot of things--smart kiosks, responsive displays, certainly some light workload camera AI (highly quantized).
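
To put the "hard constraints" point in rough numbers, here's a back-of-envelope sketch. It treats the 13 TOPS figure as a fixed ops budget shared by every network and camera stream. The per-model costs (GOPs per inference), frame rates, and the `utilization` factor are all made-up placeholders, not measured Hailo or Raspberry Pi figures, and real throughput will usually land well under this ceiling.

```python
def max_fps(accelerator_tops: float, gops_per_inference: float,
            utilization: float = 1.0) -> float:
    """Upper bound on inferences per second for a single model.

    accelerator_tops: peak tera-operations per second (vendor figure).
    gops_per_inference: giga-operations for one forward pass (assumed).
    utilization: fraction of peak actually achieved (assumed, 0..1).
    """
    ops_per_second = accelerator_tops * 1e12 * utilization
    return ops_per_second / (gops_per_inference * 1e9)


def required_tops(streams: list[tuple[float, float]]) -> float:
    """Total TOPS needed for (gops_per_inference, target_fps) pairs."""
    total_gops_per_second = sum(gops * fps for gops, fps in streams)
    return total_gops_per_second / 1000.0


def fits(streams: list[tuple[float, float]], accelerator_tops: float,
         utilization: float = 1.0) -> bool:
    """True if the combined workload stays inside the usable budget."""
    return required_tops(streams) <= accelerator_tops * utilization


if __name__ == "__main__":
    # Hypothetical workload: a detector at 30 fps plus a classifier at 15 fps.
    workload = [(10.0, 30.0), (5.0, 15.0)]
    print(required_tops(workload))
    print(fits(workload, accelerator_tops=13.0, utilization=0.25))
```

Every extra network or camera stream draws from the same budget, so the "multiple NNs, two cameras" pitch only holds while the sum stays under whatever fraction of peak the chip sustains once it heats up.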

lost geyser
#

I may be attending one in the fall.

keen pilot
#

What does the text in the slide mean in a layperson's context?

patent pendant
#

A group of current and former OpenAI employees issued a public letter warning that the company and its rivals are building artificial intelligence with undue risk and without sufficient oversight. They're calling on leading AI companies to be more transparent with their research and provide stronger protections for whistleblowers. Geoff Bennett ...

shrewd token
lost geyser
#

The answer, of course, is no.

(See also: Betteridge's law of Headlines)

spring creek
#
Ars Technica

EU Facebook users have until June 26 to opt out of AI training.

noyb.eu

noyb urges DPAs in 11 countries to immediately stop Meta's use of personal data for undefined "AI technology"

We're expanding our collection of generative AI features, along with the models that power them, to more people across the globe, including in Europe.

thick schooner
keen pilot
#

It's always hilarious the kinds of stuff these hype men try to proclaim as impressive

#

Something actually impressive: https://beforesandafters.com/2024/06/08/its-like-a-constantly-evolving-three-dimensional-puzzle/

Some very cool examples of trained models being used to augment existing face replacement methods

The visual effects of ‘Furiosa: A Mad Max Saga’, including those crazy War Rig scenes, impressive wasteland environments, immense rotoscoping, and how Anya Taylor-Joy’s facial features were translated onto the young Furiosa.

shrewd token
keen pilot
weak igloo
lost geyser
#

Are they arguing that ChatGPT is bullshit or that the hallucinations are bullshit? Hard to disagree with the latter, but title would suggest a broader scope.

shrewd token
weak igloo
# lost geyser Are they arguing that ChatGPT is bullshit or that the hallucinations are bullshi...

They are suggesting that these models produce forms of bullshit as originally defined by https://en.wikipedia.org/wiki/On_Bullshit which has to do (I am not a philosopher but I think I skimmed parts of that treatise several years back) with disinformation. The paper is arguing that ChatGPT is a vessel for unintentional (soft bullshit) misinformation and may, depending on the intent of the authors and the resulting design, be a vessel for intentional (hard bullshit) disinformation.

lost geyser
patent pendant
lost geyser
keen pilot
keen pilot
thick schooner
lost geyser
#

Luma AI just dropped a Sora-like AI video generator called Dream Machine.

But unlike Sora or KLING, it's completely open access to the public.

Here are 10 wild examples (and how to access it):

  1. https:…


Apparate Labs launched PROTEUS, a new real-time AI video generation model.

It creates realistic avatars and lip-syncs from a single reference image, similar to VASA-1, but it's completely real-time. https:/…

fierce rapids
lost geyser
#

Beware of Botshit: How to Manage the Epistemic Risks of Generative Chatbots

Advances in large language model (LLM) technology enable chatbots to generate and analyze content for our work. Generative chatbots do this work by ‘predicting’ responses rather than ‘knowing’ the meaning of their responses. This means chatbots can produce coherent sounding but inaccurate or fabricated content, referred to as ‘hallucinations’. When humans use this untruthful content for tasks, it becomes what we call ‘botshit’. This article focuses on how to use chatbots for content generation work while mitigating the epistemic (i.e., the process of producing knowledge) risks associated with botshit.

#

(Someone really wants to make that term happen.)

patent pendant
shrewd token
lost geyser
mint sparrow
#

Can someone explain to me what's going on here? Where does this model come from? Besides the obvious misinformation that a non-restricted model equals a non-biased model...

https://youtu.be/cTxENLLX1ho?si=Dc5diuwhaBr7Odp6

If you're serious about AI, and want to learn how to build Agents, join my community: https://www.skool.com/new-society

Follow me on Twitter - https://x.com/DavidOndrej1

Please Subscribe.

Download Ollama: https://ollama.com/download
Llama3 Dolphin: https://ollama.com/library/dolphin-llama3
Download AnythingLLM: https://useanything.com/downloa...

lost geyser
mint sparrow
# lost geyser Those are community adaptations to LLaMa 3 (and various others) where efforts ar...

I mean yeah, but like... context: how computationally expensive are such modifications, who pays for them, who is "the community"? How likely is it that it's really a community thing? Could an APT pose as "the community" and release a modification that's good enough to be widely adopted? He says everything should be open-source, but these days when someone says open-source I think xz utils...
Where are these communities located online? And he says that it could be banned, but that ostensibly just means banned from a platform. How are these ecosystems governed?

lost geyser
#
  • Compute budget: hundreds of dollars at minimum, all-in (GPU, storage, training overhead).
    • Dolphin 2.9-Llama3-8b: It took 2.5 days on 8x L40S provided by Crusoe Cloud
    • WizardLM-<various>: 4x A100 80GB node on Azure
  • Funding: self-funded or with other people's money.
  • Complexity: easy to moderate. Train low-rank adapters (update layer weights), modify embeddings, tailored data sets.
  • Community: many of these are hosted on Hugging Face or GitHub.
  • Banning: models? Maybe in commercial settings (esp. licensing like LLaMa).
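
For the "low-rank adapters" item, here's a toy sketch of the usual LoRA-style idea: the original weight matrix W stays frozen and only two small matrices A and B are trained, giving an effective weight of W + (alpha / r) * B·A. It's plain Python on tiny made-up matrices; the helper names and values are hypothetical, and nothing here claims the models named above were trained this way (some community fine-tunes update all weights).

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]


def lora_weight(W, A, B, alpha):
    """Return W + (alpha / r) * (B @ A), where r is the adapter rank.

    W: frozen base weights, shape d_out x d_in.
    A: trainable, shape r x d_in.
    B: trainable, shape d_out x r (often initialised to zeros).
    """
    r = len(A)
    scale = alpha / r
    delta = matmul(B, A)  # d_out x d_in, rank at most r
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]


def trainable_params(d_out, d_in, r):
    """Parameters in A and B combined, versus d_out * d_in for full W."""
    return r * (d_in + d_out)


if __name__ == "__main__":
    W = [[1, 0, 2], [0, 1, 0]]  # toy frozen weights (2 x 3)
    A = [[1, 2, 3]]             # rank-1 adapter (1 x 3)
    B_zero = [[0], [0]]         # zero init: adapter starts as a no-op
    B_trained = [[1], [2]]      # pretend values after some training
    print(lora_weight(W, A, B_zero, alpha=2))
    print(lora_weight(W, A, B_trained, alpha=2))
    print(trainable_params(4096, 4096, 8), "vs", 4096 * 4096)
```

That gap in parameter count is why this lands in "easy to moderate": for a 4096 x 4096 layer at rank 8, the adapter trains 65,536 values instead of about 16.8 million.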
mint sparrow
#

Got it running nodding

#

Time to get a decent piece of hardware

#

so you can really do this easily to any model that's public, eh? That's an interesting development, happened quicker than I thought.

keen pilot
fierce rapids
lost geyser
shrewd token
fierce rapids
shrewd token
weak igloo
lost geyser
# lost geyser https://fixupx.com/alexhillman/status/1794521859289141397

That's an interesting exercise!

I think anthropomorphization cuts both ways though.

It helps create hype, but it also leads people to underestimate.

When ChatGPT helps doctors identify rare diseases that they had missed, it behaves in a way that is completely different from a human (using its ability to go over massive amounts of info that a human would never be able to go through).

We might be underestimating it by thinking that the worst case scenario is that it reaches "human intelligence", whatever that means.

Somewhat related: I've seen people (probably Sam Altman, but I don't know for sure) argue that there shouldn't be one Turing test, but several. What test can this or that AI perform better than a human?

And this may indeed be a better way to think about it as some models are already performing better than humans in some tasks.

lost geyser
thorny helm
patent pendant
#

China's leadership believes that artificial intelligence will play a central role in future wars. However, the author's comprehensive review of dozens of Chinese-language journal articles about AI and warfare reveals that Chinese defense experts claim that Beijing is facing several technological challenges that may hinder its ability to capitali...

lost geyser
#

The medical practitioner scenario is a good example that carries a lot of the nuanced positions of AI in practice.

It does scale better and see further than human counterparts, which makes it a good companion piece in the diagnostic process (clinical decision support).

However, there are other issues where those models are unreliable in one way or another. The complacency of blindly trusting tooling over human judgment also remains a risk.

#

Domain expertise is absolutely required to make those models viable. Not these Kaggle competition style efforts where domain knowledge is absent and "data science methodology" dominates.

fierce rapids
keen pilot
lost geyser