#folding-and-boinc

stoic pecan
#

before throttling

#

whenever i run my gpu on max it hovers around 65c

raw violet
#

nice

#

think im gonna just use my gpu

stoic pecan
#

Yeah good idea

#

Lol to keep temp down I propped my laptop on a 3D printed stand and took off the bottom panel

rich depot
#

Gpu puts in more work anyway

#

@raw violet apply new thermal paste and your temps will be grand. i got a 290x and tbh it does really well now that i got my new thermal paste. it's gone from 94 degrees on half usage to 65 degrees max with the side cover on

raw violet
#

any brands you recommend? my pc is just a prebuilt

rich depot
#

Thermal paste, i recommend this one, let me take a look

#

Got it off Amazon for about 7 quid

#

But its really good

raw violet
#

anyway i can confirm its my thermal paste thats the issue

#

the cpu is cooled by an aio

rich depot
#

I had taken the thermal paste off my gpu and it was like not good at all, all dried out etc and wasnt applied properly, so i used the plastic thing to spread it across and its never gone above 65 degrees

raw violet
#

oh you mean gpu i was thinking cpu

rich depot
#

Apart from when i stress test but then it only gets to 69 70

#

I mean im a do it for my cpu as well

#

Because it honestly has helped a lot and my pc has now stopped crashing

raw violet
#

is 74-75c high for a 1080

rich depot
#

Depends is it a hot day

#

Or a cool day when u tested it

raw violet
#

those temps are from running the folding on high for my gpu

#

today

rich depot
#

75 c will not hurt it and if u have been doing it all day its ight

#

I was running this gpu at 80 degrees with toothpaste tho

#

Cuz i didnt have thermal paste at the time

raw violet
#

nice

rich depot
#

But when the toothpaste dried out it was at 95 degrees all the time

#

But since my new application of thermal paste its sitting pretty at 65 while gaming and never really goes any higher

#

Ill brb

rich turtle
trail night
#

Who is Den-Fi?

main mist
#

A person on this discord.

#

He takes amazing photos.

#

Has very beautiful hardware also

brittle coral
#

@trail night if you look, den-fi is the person who posted that image xD

#

ask him yourself about it if you haven't seen his stuff

#

(sidenote i haven't either, @rich turtle whats good)

fickle needle
#

Folding is weird, i'm on a GTS 450 and still on pace to get the quick return bonus

brittle coral
#

ian

#

QRB is relative

fickle needle
#

hmm?

#

I guess the more pertinent question is relative to what?

brittle coral
#

hardware

#

this is a 2080ti with QRB

#

you can see the base points on the right and the bonus after that

#

let me see if i have the same WU on something slower

mystic hollow
brittle coral
#

ayeee paranoid keep it up

#

also ugh my friend's PC keeps shutting down and not rebooting

#

i just had to start it myself again

#

wtf and now on a reboot it isn't starting F@H

#

nvm, it took FOREVER Y_Y his PC may need a fresh windows install xD

fickle needle
#

I guess the depressing thing is previously my GPU managed to complete a previous work unit slower than a i5 750 would have XD

brittle coral
#

yea lol

#

it is not a very fast card

#

it made my 2080ti comparison a perfect contrast though of what QRB really means for fast hardware

#

here's a great example

#

that system i just rebooted must have died early in the night after i went to sleep

#

it was at 14% on a WU and lost most of its QRB

#

this may go faster as it progresses, i doubt i totally lost my QRB, but because it was offline so long the math on it has been adjusted till it has time to process more data and that number slides back toward some level of QRB

#

a 5700xt should not lose all its QRB in a few hours lol
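
For reference, the quick return bonus discussed above can be sketched with the formula given in the Folding@home points FAQ, as far as I understand it: final points = base points × max(1, sqrt(k × deadline / elapsed time)). The function name and the sample values of `k`, the deadline, and the elapsed times below are made up for illustration. They only show why fast hardware earns a large multiplier and why hours spent offline shrink it.

```python
import math


def qrb_points(base_points, k, deadline_days, elapsed_days):
    """Estimate credit for one work unit under the quick return bonus.

    Uses the form base * max(1, sqrt(k * deadline / elapsed)).
    k and the deadline are set per project; the values used below are placeholders.
    """
    if elapsed_days <= 0:
        raise ValueError("elapsed_days must be positive")
    bonus_factor = max(1.0, math.sqrt(k * deadline_days / elapsed_days))
    return base_points * bonus_factor


# Hypothetical work unit: 10,000 base points, k = 0.75, 3-day deadline.
for label, elapsed in [("fast GPU, about an hour", 0.05),
                       ("same WU after a night offline", 0.5),
                       ("slow card, 2 days", 2.0)]:
    print(f"{label}: {qrb_points(10_000, 0.75, 3.0, elapsed):,.0f} points")
```

Elapsed time counts from when the work unit was assigned, so a rig that sits powered off overnight keeps losing bonus even though it did no work.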

#

also, make sure your system clock is set correctly

#

i reinstalled and deleted 2 rounds of wu thinking my client was bugged when it was my system clock that needed fixing

#

windows broke for DST settings on all my installs with the last DST change =/

#

the WUs are not identical but they're similarly performing

#

also i get about half the QRB on my 5700xt

rich turtle
#

@trail night Me! #1 on the team, #90 overall.

#

@brittle coral awaveboy

brittle coral
#

i'd never heard of you till today but im new around here lol

#

the board seems to think i deserve #11 but that's in recent points per day lol

#

people keep telling me every time i move up 100 on the all time and i'm like ooh @_@ that happened

#

no time for that when i have bigger problems like streaming just cause 4 and screaming at motherboards to not shut down without rebooting overnight

#

speak of the devil, it just did it again not 20m after being rebooted. i think this guy might just be running out of time on the clock lol

trail night
#

@trail night Me! #1 on the team, #90 overall.
@rich turtle What specs u using to fold?

rich turtle
#

Right now 3x 2080 Tis and 2x 1080 Tis.

#

I have 3 more 2080 Tis and 2x 2080s when we have an event.

silver wing
#

Well I have..... CPUs so hah!..... 😒

brittle coral
#

word

#

what are the other 5 cards doing when we're not having an event

silver wing
#

Taunting me with their sexy shrouds and CUDA cores

shut apex
#

lol.. all this time i didnt know there was a discord server!

brittle coral
#

yup!

#

i just started using discord because of this server

#

linus made the video about folding and i'd done distributed computing as one of the main contributors for team anandtech starting a decade ago

#

so i was like "ooh yea i have lots of power" and then i went out and bought 3x as much power as i had, some is on loan as well

#

so im at like 7x now xD

#

i rapidly scaled from 1.5mppd up to around 7ish now with downtime to game

shut apex
#

lol! nice! I just have the one machine.. but i did find an old Xeon CPU.. maybe i should build another machine 🙂

brittle coral
#

nice

#

im dealing with a board i think is dying

#

and the need to replace it for a friend but keep a 4th rig online for myself in the process

#

annoyed too because i just bought a cpu to replace the one in his board

#

and another for the matching board as well as RAM

#

and now im prob gonna have to return the cpu and ram and give him the old board

#

but idk if i can get a 1600AF now cause theyre selling out everywhere

shut apex
#

thats too bad!

brittle coral
#

and that's the CPU i'd want for a cheap folding rig

#

6 cores decent clocks

#

ryzen+

rich turtle
#

The other GPUs are in systems that have other uses. I just commandeer them when it's folding time.

shut apex
#

Just a few more seconds... or minutes

#

nice! and next Javelin is in 3 days! perfect!

stark bough
#

looks like my folding/console game capture pc build is dashed for now. markup on capture card is insane

brittle coral
#

@rich turtle thats really chill that you can just commandeer that processing power tho when needed. Im guessing it's either other projects or you use it for medical renders or something?

#

I have a friend who runs a dental practice and he just got a quad 2080ti rig with a 3970x threadripper to do renders for his 3D printer to print medical implants

#

but he's running F@H on it whenever there isn't customer workloads to process

#

then here i am with 4 whole rigs with less power xD

main mist
#

quad 2080 tis...

brittle coral
#

yea

#

watercooled

main mist
#

Seems way overkill

brittle coral
#

its not

main mist
#

whats the software used?

brittle coral
#

he is considering methods to build a custom case and find a workstation or server board with dual socket XATX form factor and minimum of 8 slots

#

IDK some 3D rendering engine

#

that has to be realtime level detail accurate

#

true to life 3D modeling

#

again, for dental implants

#

i can ask him what software it is if you really care

main mist
#

...

brittle coral
#

xD

#

im laughing my ass off

#

he's right tho

main mist
#

what software though

brittle coral
#

i asked him to get specific

main mist
#

hes not right at all...

brittle coral
#

he says he's using a bunch

#

whatever gets you the most cuda cores

main mist
#

well what are they?

brittle coral
#

he has something using tensor cores as well

#

so i would look into what the latest 3D rendering engines are

#

but it sounds like he's literally using all of the high end software

main mist
#

or just ask him...

brittle coral
#

i've heard him talk about at least a dozen different engines

main mist
#

name a few

brittle coral
#

but i cant remember what they were because i didn't care enough to memorize it (learning disabled)

#

but yea like

#

idk why you think you need less power

#

high end movie renders for say transformers take 3 days per frame on current gear for the quality they want

#

they will take literally everything they can get XD

#

since the software runs on a single work station currently

main mist
#

thats different

brittle coral
#

yea but rendering is rendering

main mist
#

... a fucking movie render is way different

brittle coral
#

true to life renders

#

thats what i said from the get go

#

he does lots of things not just medical lol

#

some of it is customer time, some of it is hobby

#

it's mostly all for work he says as well

#

no time for hobby now unless he gets another PC

fickle needle
#

and also, time is money, if it speeds up delivery time spending a couple grand on cards isn't that ridiculous

fluid leaf
#

video cards as a tax deductible business expense... i'm in the wrong line of work!

brittle coral
#

right lol

brittle coral
#

so my AMD rig has been down all day and im sad lol i gotta wait till another SSD gets here tmrw to copy data from the drive that the OS was on cause i need to back it up before wiping it rip

jade flicker
#

anyone familiar with this issue?

There is no domain decomposition for 16 ranks that is compatible with the given box and a minimum cell size of 1.46925 nm
I've split my 3950X into 2*16 thread slots but one of them keeps erroring with this message

#

For now 16 + 2*8 threaded cpu slots work
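
That message looks like the GROMACS domain decomposition check (the F@H CPU core is GROMACS-based): the simulation box has to be cut into one cell per rank, and no cell may be smaller than the minimum size. A simplified sketch of that constraint follows. The function name and the 4.5 nm cubic box are assumptions, since the real box size is not in the error text, and the real check also accounts for things like separate PME ranks and non-rectangular boxes.

```python
def feasible_grids(n_ranks, box_nm, min_cell_nm):
    """List (nx, ny, nz) grids with nx*ny*nz == n_ranks whose cells are
    at least min_cell_nm wide along every axis of a rectangular box."""
    # Most cells that fit along each axis without going under the minimum size.
    max_cells = [int(side // min_cell_nm) for side in box_nm]
    grids = []
    for nx in range(1, max_cells[0] + 1):
        for ny in range(1, max_cells[1] + 1):
            if n_ranks % (nx * ny) != 0:
                continue
            nz = n_ranks // (nx * ny)
            if nz <= max_cells[2]:
                grids.append((nx, ny, nz))
    return grids


# Hypothetical 4.5 nm cubic box with the minimum cell size from the error message.
box = (4.5, 4.5, 4.5)
for ranks in (16, 8):
    print(ranks, "ranks ->", feasible_grids(ranks, box, 1.46925))
```

With these assumed numbers at most 3 cells fit per axis, so 16 ranks cannot be laid out while 8 can, which would be consistent with the 8-thread slots working. Whether a given work unit fails depends on its actual box.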

#

not sure if that "1130" attempts number is accurate

brittle coral
#

word i've never seen that before, i don't have enough threads to test this issue

raw violet
#

anyone know why the protein wont show on gpu but does on cpu

#

in the viewer

brittle coral
#

they just fixed the viewer in the latest version (7.6.9) but it may still be buggy

onyx canopy
#

100th wu done 😎

#

top 5k LTT soon 👀

shut apex
#

What is the best $/PPD setup? One of those new Ryzen 3300 and some cheap ram, etc?? more cores must be better, but at what cost?

shut halo
#

buy couple P106-100 mining cards, they are cheapos and do decent PPD

fluid leaf
#

Depends on how you approach it. Are you building a folding rig or a primary use rig? If a folding rig, you have to consider the other fixed costs such as mobo, ram, etc. However, spending on a GPU will probably yield more PPD per dollar than spending on a CPU.
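
A rough way to compare builds on that basis is to divide the total upfront cost, including the shared parts, by the expected PPD. The sketch below does only that. The function name and every price and PPD figure in it are made-up placeholders, not measurements.

```python
def dollars_per_ppd(part_cost, shared_cost, ppd):
    """Upfront dollars spent per point-per-day of output (lower is better).

    part_cost   -- the CPU or GPU doing the folding
    shared_cost -- motherboard, RAM, PSU, etc. (0 if the PC already exists)
    """
    if ppd <= 0:
        raise ValueError("ppd must be positive")
    return (part_cost + shared_cost) / ppd


# All prices and PPD figures are hypothetical.
builds = {
    "used GPU in a new budget box": dollars_per_ppd(150, 250, 400_000),
    "CPU-only budget box": dollars_per_ppd(120, 250, 50_000),
    "used GPU added to an existing PC": dollars_per_ppd(150, 0, 400_000),
}
for name, cost in sorted(builds.items(), key=lambda item: item[1]):
    print(f"{name}: ${cost * 1_000_000:,.0f} per million PPD")
```

The fixed costs matter most for a dedicated rig; dropping a card into a machine that already exists removes them entirely.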

shut apex
#

yeah, GPU would be more efficient.. dont know why im even thinking about it, must be the Pent, and the lowly numbers I make compared to some other people 🙂

mystic hollow
#

i personally dropped the CPU slots, as the cost/kWh simply makes them too expensive

#

and i can drop my GTX 1060 6GB down to 60W power limit, which also makes it near silent
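
The running-cost side can be sketched the same way. The 60 W figure comes from the message above; the function names, the electricity price, the uncapped wattage, and both PPD values are assumptions for illustration only.

```python
def daily_energy_cost(watts, price_per_kwh, hours=24.0):
    """Cost of running a steady load for `hours` at the given price."""
    return watts / 1000.0 * hours * price_per_kwh


def points_per_kwh(ppd, watts):
    """Points earned per kWh for a slot that folds around the clock."""
    return ppd / (watts / 1000.0 * 24.0)


PRICE = 0.30  # assumed price per kWh, in whatever currency applies

# Hypothetical card: uncapped vs. limited to 60 W (PPD values are placeholders).
for label, watts, ppd in [("uncapped", 120, 330_000), ("60 W limit", 60, 288_000)]:
    print(f"{label}: {daily_energy_cost(watts, PRICE):.2f}/day, "
          f"{points_per_kwh(ppd, watts):,.0f} points per kWh")
```

If PPD drops less than power does, a lower limit yields more points per kWh, which is the usual argument for capping a folding card.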

#

while i have a 3900x in my server, i just can't find a good reason to use it 😛

#

(plus got in the TOP 5000 in LMG folding group)

fluid leaf
#

3000 series Ryzen CPUs can do CPU projects fairly efficiently. You can run them in eco-mode which drops the TDP down one notch. I also have a 3900X and it goes from 105W TDP to 65W TDP. In normal mode, I'm usually a hair above 4 GHz with all cores loaded. In eco-mode, it's still at 3.8 GHz. Got quite a bit more energy efficiency out of it.
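
As a back-of-the-envelope check on those eco-mode numbers, the sketch below assumes throughput scales with all-core clock and that power scales with the TDP setting. Both are simplifications: TDP is a rating rather than measured draw, and the function name is made up.

```python
def relative_efficiency(clock_a, power_a, clock_b, power_b):
    """Work-per-watt of configuration B relative to configuration A,
    assuming work done is proportional to clock speed."""
    return (clock_b / power_b) / (clock_a / power_a)


# Normal mode: ~4.0 GHz at 105 W TDP. Eco-mode: ~3.8 GHz at 65 W TDP.
gain = relative_efficiency(4.0, 105.0, 3.8, 65.0)
print(f"eco-mode does about {gain:.2f}x the work per watt under these assumptions")
```

A wall-meter reading for each mode would give a better answer than TDP, but the direction of the result should hold.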

#

If you have extra CPU cycles, consider putting them to work at rosetta@home. It's also doing covid-19 research and is CPU only. LTT is also participating in the BOINC Pentathlon competition right now so rosetta work would help out the team.

mystic hollow
#

@fluid leaf it is my server, i do not want to drop its speed if it does other jobs 😛

rocky otter
#

Has anyone else gotten a GPU wu with a super low amount of points today?

fluid leaf
#

BOINC has settings where you can dedicate specific cores and/or also suspend if other CPU usage goes above a certain configurable threshold

#

you can also set a scheduler so it only runs during off-peak times, if applicable
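
Those settings live in the BOINC Manager computing preferences, and the client can also read them from a `global_prefs_override.xml` file in its data directory. The sketch below builds such a file. The element names (`max_ncpus_pct`, `suspend_cpu_usage`, `start_hour`, `end_hour`) are as I recall them from the BOINC client configuration docs, so check them against your client version. The function name and the values are placeholders.

```python
import xml.etree.ElementTree as ET


def build_prefs_override(max_cpu_pct, suspend_above_pct, start_hour, end_hour):
    """Return the text of a BOINC preferences override as an XML string.

    max_cpu_pct       -- share of logical CPUs BOINC may use
    suspend_above_pct -- suspend when other programs use more CPU than this
    start_hour/end_hour -- daily window in which computing is allowed
    """
    root = ET.Element("global_preferences")
    settings = {
        "max_ncpus_pct": max_cpu_pct,
        "suspend_cpu_usage": suspend_above_pct,
        "start_hour": start_hour,
        "end_hour": end_hour,
    }
    for tag, value in settings.items():
        ET.SubElement(root, tag).text = f"{float(value):.1f}"
    return ET.tostring(root, encoding="unicode")


# Example: half the threads, back off above 25% other load, run 22:00 to 06:00.
print(build_prefs_override(50, 25, 22, 6))
```

After saving the output, the client has to re-read its preferences (the Manager has an option for this) before the change takes effect.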

brittle coral
#

@shut apex i have someone selling a pile of p106-100 cards and 1070s if you need hardware on the cheap

shut halo
#

Is anyone here folding with an EVGA RTX 2060 KO? Or does anyone know how much PPD they do?

brittle coral
#

same or worse than 2060

#

they overheat pretty bad in F@H is what ive read

#

because they are hotter

#

so while some apps benefit from the design, this one does not from what i've read

#

and with the heat, some ppl ran into throttling issues

#

man

#

summer is here

#

at least for me, its like 95f today

#

and temps in my garage folding room are balls but the GPU temps are cozy because i put the 1200cfm fans on them back when it was friggin cold still xD

#

hottest card is 70c despite being right up against another cause its getting pressure blasted with air 😄

stark bough
#

Dropping out of folding for now again. Wish I could schedule it so I only run 1 WU every 12 hours

west fog
#

Apparently the folding stuff loves more cores

#

Just got a 3900x and all cores go to 100% when folding lol

mystic hollow
#

@west fog wasn't a thing a month ago

jade slate
#

hey, broke 10 million

bleak frigate
jade flicker
#

anyone know if it's normal that F@H doesn't detect OpenCL but only CUDA for nvidia cards? cause I'm getting "Failed to start core: OpenCL device matching slot 4 not found, make sure the OpenCL driver is installed or try setting 'opencl-index' manually" with 'opencl-nvidia-440xx' installed https://puu.sh/FHSZ0/e980a49098.png

small current
west fog
#

Getting close to breaking 9 million :)

verbal zephyr
#

live and monitoring

mystic hollow
verbal zephyr
gaunt wasp
brittle coral
#

ayeeeee so many pics i hope y'all are doing well

#

my 2nd 860 evo just got here finally for the 4th mining rig 😄

#

took em an extra day, and it was sitting on top of my new monitor when i woke up xD family brought it all inside

#

and the water block for my 2080ti, i wont even have time today to get to that probably

#

but yea, i can get the last 2mppd of my farm back online finally ugh

shut apex
#

@shut apex i have someone selling a pile of p106-100 cards and 1070s if you need hardware on the cheap
@brittle coral Thanks! I am not really in the market for more folding machines! The one I have is causing me enough setup time. I keep reading about new ways to make folding more efficient. Latest change.. changed NumberFields jobs to use half my gpu instead of the whole one.. made it produce almost twice the output! My 2070S can handle it! WINNING!!

brittle coral
#

hahaha nice

#

well if you change your mind lmk, i know international shipping can be a bitch if you're really in sweden but used prices in the US may be low enough to make it worthwhile

shut apex
#

im in cali! 🙂

brittle coral
#

oh thats perf so am i

shut apex
#

🌴

brittle coral
#

well let me know im in the bay

#

if you are local you could do pickup

shut apex
#

orange county

brittle coral
#

and if you need any parts help ever

#

testing etc

#

oof XD

shut apex
#

heh

brittle coral
#

i love orange its like the only part of socal i enjoyed

#

LA was gross

#

never made it south to SD

shut apex
#

lol.. yeah oc is nice

brittle coral
#

whats that other park besides disney, knotts berry farm?

#

have a family by marriage relative who works/worked there (idk its been 10 years) and we went there for a free show the day after the guy's wedding

shut apex
#

..thinking..

brittle coral
#

it was wild he was full blooded chinese but dark enough to be mistaken for african

#

from doing shows in the sun

#

but yea, all the fun stuff to do is down that way lol

#

hmu if you're ever in the bay

shut apex
#

i used to go there for work every now and then, before C-19

brittle coral
#

yea i know my way around nightlife here

#

went afk but yea, I work at the club JWZ (one of the old netscape devs who helped build the original firefox core) owns now, it's a great intersection between the tech community and the music scene since I also DJ and am trying to build a rentals company for gear for shows at venues without sound and lights

shut apex
#

Cool! if i ever get to go back to SF!

brittle coral
#

wow

#

im hella salty

#

so my folding rig was down for 2 days because of a poorly seated DIMM

#

.......

#

it wasnt a corrupted OS drive, i tried to install windows on the new one and the bootable USB i just rebuilt kernel errored on boot XD

#

so i tore that bitches bones out and put them back in after testing them one by one

brittle coral
#

ugh still no luck haha

fast rampart
#

I have over 4 million points for Folding@Home.

stark bough
#

LA was gross
@brittle coral
LA is an endless sprawl and the wind patterns hold in the smog. SD is nice

brittle coral
#

yup i dont just mean the air tho the people were too xD. I do also have asthma though and that didn't help. the main speculated cause for all the people being that way historically was the leaded gasoline fumes making everyone a bit nuts but then the young people grew up around it and just learned to be that way too lol. my sister went to LMU till yesterday, she just finished her last final, and she said she generally never socialized outside of her close friend group for similar reasons. many people in the bay share the sentiment. so many people have moved to LA to focus on themselves at the expense of others, i have many friends from there none the less and I know it's not all bad but it really shows lol

#

i'm so glad they're getting some fresh air right now tho

#

the photos are absolutely beautiful

#

it's like a friggin render it's so clean now

#

im curious to see what kind of long term positive environmental effects the lockdown may have in future studies

#

they're already predicting fewer deaths in many areas long term because of the length of the exposure downtime

#

anything cardiovascular plus other cancers all decreasing in probability even if the numbers go back up after

spiral coral
#

So, is it better to contribute GPU time or CPU time?

main mist
#

Both

spiral coral
#

Do most of you contribute from multiple machines or just one?

main mist
#

Many people have multiple machines.

spiral coral
#

My first step was to donate a bunch of unused cores on my VMWare box.

#

Sadly I now know the limits of free ESX

#

though I suppose I could spin up another VM

plain geyser
#

One machine or many, anything helps 😛

spiral coral
#

We just had a cool spell here, so I'm thinking of bringing up my backup VMWare box to give another VM or two of compute, but it's all CPU

plain geyser
#

but yeah there isnt enough WU to go around right now anyways

#

my main system only got a total of 7 WU today

#

for GPU

spiral coral
#

I got 18 for CPU

plain geyser
#

CPU never runs out 😄

#

the GPU WU pool is very very sparse

#

i don't think between the 3 system i have folding i have ever seen one CPU sit idle for more than a second in-between WUs

spiral coral
#

hmm... Is there just more GPU capacity so it's processed faster, or is it just that the workloads are more-often best suited to CPU... or is it that not as many researchers design GPU-based works?

plain geyser
#

i think there are just more powerful GPUs lol

#

for crazy people like me that have more than 1 GPU per system

#

but i dont know really

#

i am just trying to get to 2 digits in rankings ;D

#

and the lack of WU is making that goal really hard to reach 😄

spiral coral
#

Ahh. more CPUs?

plain geyser
#

CPU WU dont give as many points

#

GPU WU are much harder and intensive

#

CPU average 50k a day?

#

my GOU can do 2-4 mil a day if they dont run out of WUs.

#

GPU*

spiral coral
#

50k/day for how many CPUs/how fast?

plain geyser
#

sorry CPU is about 50k ish per day and GPU is about 1.2-1.8 million a day

#

at least for me

formal galleon
#

huh

plain geyser
#

i have 8600k 8700k and 3700x folding

spiral coral
#

hmm...

plain geyser
#

but yeah keep running out of WUs so doing about 1.3-2.2 mil a day

#

total between my systems combined

spiral coral
#

Hmmm... I wonder if I can get my roommates at my other place to boot up my systems there.

#

could have a GPU or two there contributing

plain geyser
#

careful to make sure your room is cooled well

#

got home today to my room feeling like an oven cause my cat turned off the AC

spiral coral
#

It was snowing outside earlier today.

plain geyser
#

oh you have no problems then 😄

#

i swear my cat is trying to murder my main system, this is the 3rd time she did this

#

i think the GPU was sitting at 87 and cpu at 92 lol

spiral coral
#

she's a cat, of course she wants to make your place warm! Just make sure there's always a sunbeam somewhere and she won't always be turning off the AC.

plain geyser
#

meh i got IPR on everything so if they fry it's aite i get new stuff lol

spiral coral
#

Some of my network is running legacy but good gear, so need to take care of it.

shut apex
#

If you are talking about F@H. If you have the time and inclination, join the BOINC Pentathlon.. Just run Rosetta@Home for LinusTechTips_Team!! The more the merrier!

thick holly
#

Hey guys!!! Going to dl Rosetta at home in a bit. Do I need to take off CPU on my FAH?

small current
#

also, is there any way to pause just cpu folding as its going 100c+

#

almost throttling

#

my gpu is ok at 71c

jade slate
#

put the folding power on half medium

#

also OCing a GPU causes a WU to fail easier

#

and you probably got a big WU

#

wait, is it a laptop with an i7 7700HQ and a GTX 1060 max-q

thick holly
#

Did Spectrum change his Discord name?

small current
#

yes it is

#

wait, is it a laptop with with an I7 7700HQ and a GTX1060 max-q

jade slate
#

yeah run at medium power

spiral coral
#

How does Rosetta/BOINC deal with hyperthreading?

#

Or does it... not... which is what it looks like to me?

shut apex
#

@thick holly for F@H CPU, I have disabled my F@H cpu for the duration of the BOINC event. All threads are working on Rosetta

#

My 12 core CPU with hyperthreading is doing 24 tasks at a time

spiral coral
#

Hmm

#

I'll admit I just threw BOINC on my main PC last night and let it rip overnight. this morning I found that many of the work units failed

#

when looking at the logs, the errors all related to timeouts.

#

so this morning I switched to only using half the CPUs (ie only the number of cores I have) and I have managed to successfully finish more work units in the 3 hours since I woke up than the computer did during the 6 hours I was asleep.

#

which is what leads me to believe that Rosetta doesn't understand that a thread doesn't equal a CPU... besides, logic and project management tells me that BOINC should shape the task to the number of cores and not the number of threads.

#

because, 12 hours into it, my 10 year old second hand machine has earned 5 times as many credits as my 6 month old daily workhorse

#

which doesn't seem right

shut apex
#

thats strange!

spiral coral
#

Now I am benchmarking an 8-CPU dedicated VM running on a 12-core/12-thread host with 77% overall utilization against a daily use 8-core/16-thread system.

#

but I killed most stuff on the daily use machine... except Discord, email app, Slack, and VMRC

weary epoch
#

what kind of cpu was running into timeouts with all threads used? even an old sandy bridge is able to keep up in rosetta at home

spiral coral
#

Ryzen 7

#

16 threads/8 cores means the computer can only do 8 things at once. It likes to pretend it can do 16 things at once because when it's switching what it's doing, it can switch faster

#

but, if you are doing a single long-running thing, all hyperthreading will do is interrupt the long-running task to do other tasks

weary epoch
#

hyperthreading is more granular than that - what you're describing is more like having 8C/8T and running 16 tasks
with hyperthreading/smt, core resources are statically partitioned or competitively shared so that usually, you get better utilization of core resources (and thus more throughput)

#

for example if one thread stalls on an instruction cache miss, the core can keep fetching instructions for the other thread (assuming it isn't also stuck on an instruction cache miss)

#

so to put it simply, yes, it is doing all 16 tasks at once
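
Both observations can be true at once: SMT raises total throughput while making each individual task slower, because two tasks share one core. A toy model of that tradeoff follows. The function name and the `smt_gain` value (extra throughput per core when both hardware threads are busy) are assumptions, not measured Rosetta numbers.

```python
def smt_tradeoff(cores, hours_per_task_alone, smt_gain):
    """Compare one task per core with two tasks per core under SMT.

    smt_gain is the assumed fractional throughput gain of a core running
    two threads (0.25 means the core gets 25% more total work done).
    """
    tasks_per_hour_no_smt = cores / hours_per_task_alone
    # Two tasks share a core that is (1 + smt_gain) times as productive.
    hours_per_task_smt = hours_per_task_alone * 2.0 / (1.0 + smt_gain)
    tasks_per_hour_smt = 2.0 * cores / hours_per_task_smt
    return {
        "tasks_per_hour_no_smt": tasks_per_hour_no_smt,
        "hours_per_task_smt": hours_per_task_smt,
        "tasks_per_hour_smt": tasks_per_hour_smt,
    }


# Hypothetical 8-core CPU, 8-hour work units, 25% SMT gain.
for key, value in smt_tradeoff(8, 8.0, 0.25).items():
    print(f"{key}: {value:.2f}")
```

In this model a longer per-task runtime only hurts if something else, such as a timeout or memory pressure, penalizes slow tasks, which is one possible reading of the failures reported above.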

tight urchin
#

Clam is correct

spiral coral
#

But from my overnight run with not much else going on on the system, there were 16 tasks on my 8C/16T machine, and many of the WUs timed out... a few succeeded, but those that timed out were all either almost done or barely started. which makes me think that the threads don't stall all that often

#

now, I will admit that this is the computer that I'm currently chatting with and has a variety of devices installed that I'm not using... but since I limited BOINC to no more than 44% CPU utilization, it hasn't failed any new WUs and hasn't suspended due to thinking my system is in use.

#

and while task manager is sitting pegged at 50% CPU utilization, I've managed to complete many more WUs and my <1 year old system is now quickly catching up to my 10 year old system... which is more along the lines of what I expected.

#

I'm finding this to be an interesting experiment in system/workload optimization

rich depot
#

Just a reminder that you can Boinc on Android so if you have an old phone or any other Android device you arent using at the moment you can Boinc

weary epoch
#

rosetta at home work units tend to need a lot of RAM, and also hit the memory hierarchy (caches and dram) pretty hard
on zen 2, running as many rosetta WUs as you have threads for seems to be a net win. it's possible that on zen 1, the weaker cache/memory subsystem holds it back and it's not able to scale with SMT

but I think that's quite unlikely

rich depot
#

My phone has 6 GB ddr4x ram, 128 GB storage plus a 256GB micro SD card. It also has an 8 core CPU with 4 cores at 2.8 GHz and 4 cores at 2.4 GHz @weary epoch

weary epoch
#

oh I was talking to mmz who has a gen 1 ryzen system

rich depot
#

Ah

weary epoch
#

not sure how it's set up for phones, but phone CPUs tend to be quite weak

rich depot
#

The Boinc app has been out for years

weary epoch
#

yeah, I've seen it. but I haven't tried measuring things when running on phone

rich depot
#

Arm is pretty powerful but sadly it's not as well supported

#

Arm CPUs can outperform x86

weary epoch
#

not really

#

I mean theoretically yes, but there are no ARM cores out there that can match zen 2 / skylake performance right now

rich depot
#

All Amazon data centers run off of arm CPUs made by Amazon @weary epoch

weary epoch
#

that's not true

rich depot
#

Lots of datacenters are fabricating their own arm chips

weary epoch
#

they have a limited set of graviton instances out there, but the majority of their cloud offerings are still x86

rich depot
#

Apple is moving towards arm chips

weary epoch
#

phones and tablets have been on ARM for a while, because ARM CPU makers have excelled at low power/performance targets, along with integrating stuff like modems/dsps/isps on-chip

rich depot
#

Only reason why arm isn't as big is because it doesn't have the same offering of applications

#

When apple moves to arm however it will force changes towards ARM

weary epoch
#

and they literally can't match zen2/skylake in terms of desktop performance. that's the other reason

rich depot
#

They could match it if desktop arm chips were being manufactured

weary epoch
#

yeah we'll see when apple moves towards arm. mostly it's unfounded rumors, and it could make sense in macbook air (which is a low power / low performance target)

rich depot
#

Pretty sure if AMD got into making ARM chips they could be powerful

#

I do think x86 will be abandoned at some point

weary epoch
#

In theory yes. If you bolted ARM decoders onto Zen 2, well it would perform like zen 2

#

I disagree, I think they'll continue to coexist

#

there's really not much to instruction set, it's down to what each CPU maker can do

#

right now, no one making ARM CPUs can go >4 GHz on a reasonably wide core

rich depot
#

I think x86 will be here until Chinese manufacturers make knockoff x86 CPUs that outperform AMD and Intel and AMD, Intel will team up to replace x86 HYPERPOGGER @weary epoch

weary epoch
#

will wait to see that happen. right now Chinese x86 chips are just copied Taiwanese VIA Nano chips. They claim they tweaked them, but apparently not much because they still underperform Excavator

rich depot
#

XD

weary epoch
#

AMD and Intel have done a lot of work to hit high performance targets, and it's not easy to catch up

rich depot
#

Yeah

#

If China uses leaked internal data from hackers then they can catch up rather fast @weary epoch

weary epoch
#

getting leaks is one thing, implementing it is another

rich depot
#

The leaked AMD GPU stuff could be used by a Chinese company to copy it and since China doesn't care about copyright or really any western policies then it will be released

weary epoch
#

nah, I mean even with some circuit diagrams, implementing it is another thing

#

like they have a copy of zen 1....not like that helped them do much

rich depot
#

Yeah

#

The Chinese could kidnap Taiwanese chip manufacturers which would change things drastically, I picture the Chinese government doing this at some point @weary epoch

weary epoch
#

they already bought via

rich depot
#

XD

#

According to China and the WHO Taiwan doesn't exist

weary epoch
#

that's sorta immaterial to processor design

#

anyway they have all the stuff from VIA, but still haven't managed to push the decade old Nano design much

rich depot
#

Ah

weary epoch
#

high performance CPU design is hard 😛

brittle coral
#

yea right didnt china buy VIA ages ago?

#

i remember when their CPUs were fast enough to "compete" (LOL) with intel's atom stuff early on, but yea they never kept up and now it's probably too late

#

One does not simply hop into the x86 CPU game at its current stage

weary epoch
#

well VIA Nano by itself was already a competitor to Atoms

#

so they just haven't moved from where VIA left off

brittle coral
#

yea and thats my point

#

nvidia stands a better chance of creating an x86 cpu than anyone else right now simply because they at least have engineers who know how to build high performance transistors in a similar fashion, they'd just need to hire some old x86 engineers to help apply that knowledge and invest probably 10-15 billion dollars minimum, no big deal right?

#

just a few years of profits to try and play catch up, and that's if they can get an x86 license 🙂

weary epoch
#

nvidia just doesn't have CPU expertise and don't seem to care though...

brittle coral
#

This convo has been going on since the early 2000s lol

#

exactly

#

nobody does

weary epoch
#

their strength is in GPUs and they're pushing that as hard as they can. not much point in going after Intel or AMD's CPU lead

brittle coral
#

its just AMD and intel and if china could compete they would have tried already, they keep saying they're gonna and then it ends up being total garbage

weary epoch
#

yeah

brittle coral
#

dont get me wrong i'd root for whoever tries to do it that isn't under the control of the CCP

#

but china is never going to figure this game out xD

weary epoch
#

china's targeting the low power/low perf and "you can browse the web and type emails on it" segment

#

which is also mostly where ARM is, and that segment is easier to tackle than all out 'how fast can we make a core go' segment

brittle coral
#

last i heard they were targeting HPC

#

xD

#

but that was in 2008

weary epoch
#

HPC's just about how many cores you can make

brittle coral
#

they didnt want their high end systems that spy on the world to have spying microcode potentially

weary epoch
#

very different from desktop segment

brittle coral
#

yea but in 2008 it wasnt as much

#

they were still talking about 10ghz cpus

weary epoch
#

like FAH would do decently well on a 100 core atom chip, but that chip would suuuck on desktop

brittle coral
#

i mean yes and no, if the instructions per clock is subpar and it can only run at low clocks (sub 2ghz) without burning out the transistors then you're still gonna suffer when you run into density problems eventually

weary epoch
#

what do you mean?

#

if you have low IPC and low clocks, you better be drawing very little power. and that's what China's chips do right now

brittle coral
#

well last i heard they were like a decade behind on process tech

weary epoch
#

so if they wanted to do HPC, those are usually optimized very well to scale across tons of cores

brittle coral
#

so unless they're building it at TSMC or something that's a factor im thinking about

weary epoch
#

no they're just buying TSMC wafers so they're not too behind

brittle coral
#

okay

#

see

#

when i was looking into this intel was at 65nm

#

and china was still using 300

#

xD

#

in private communist owned fabs and the like all internal

#

they were obsessed with the west not having any access to their hardware designs EVER

#

and it shot them in the foot on performance for over a decade

weary epoch
#

not that the west cares

#

when they have intel and amd

#

yeah zhaoxin is now on tsmc 16nm

#

certainly the gap now isn't 300nm vs 65nm. but as for whether they're gonna catch up, we'll wait and see

brittle coral
#

the west's best hope for spying on china via hardware insertion of monitoring tools is getting a spy in the design program now IMO, it's not worth trying to infiltrate TSMC at that stage of design lol

weary epoch
#

idk why the west would care though

#

they already have way better designs at every power/perf target

brittle coral
#

we were saying the same thing about russia back in 2008

#

in terms of political interference

#

but the US govt had stayed on high watch still

#

im factoring based on government interests not consumer market cares

#

there's a lot of factors in tech that the consumer industry never thinks about

weary epoch
#

oh if you're talking about the bigger political picture (not cpu microarchitecture) then sure. but that's a whole different topic

brittle coral
#

but one of my family works in crypto and this is a common subject

#

yea but they're intertwined

#

thats the main reason china didnt want to use western fabs ever

#

so thats a big part of why they're behind

#

it all plays together

weary epoch
#

but about the exact microarchitecture china's designing, I doubt anyone else cares

brittle coral
#

obviously

#

im implying that those dont matter and the goal would be to inject hardware designs into whatever they do make, but the arch itself is irrelevant to those ends

weary epoch
#

oh, if there's any monitoring, the CPU wouldn't be the right place to put it. networking devices would be better

brittle coral
#

its just that now with process tech so far along, you can't just insert stuff at the manufacturing phase like you could decades ago, it has to be designed in kind more often

#

yea i mean you're right

#

but this is still a common topic of discussion in cryptography circles

#

as the best way to tap data unaltered is in cache

#

it has to be decrypted at that stage to operate on

weary epoch
#

and where exactly would you send it then 😛

brittle coral
#

ayeee

#

its a minimum 2 component system

#

you tap the cache and then hack the network stack to accept that low level output. i am not a professional so i don't know how to describe it better

#

i just know my step-dad spent several years on such a case working on litigation relating to it because someone was suing the US govt for putting a spy in their company

weary epoch
#

ok yeah you'd need those two parts

brittle coral
#

but he couldn't talk about what company it was and only recently let slip about what the case itself was because its a decade past and the topic became public knowledge at some point, but he's STILL not allowed to name the company and there were over a dozen involved in the case (in private)

weary epoch
#

eh, guess we'll never know then

brittle coral
#

yea lol

#

the one thing i can confirm is that the conspiracies about the US govt trying to inject microcode into transistor architectures to spy on stuff is definitely real

#

but we all knew that already

weary epoch
#

idk, it's hard to verify any of that anyway

brittle coral
#

yea lol

#

let the guys at DEFCON worry about it 😉

weary epoch
#

anyway on the subject of boinc/rosetta at home, I'm running 8 WUs on one CCX

brittle coral
#

there's so many of them already we can save our time and nerd out about who has the smallest chips 🙂

#

word

#

i wish i had all my systems up

#

my AMD GPU system has been down for most of this week

#

i keep trying stuff to troubleshoot it but when it's in windows it keeps just powering off

#

the kicker?

#

it was doing this with

#

a different mobo

#

then a different PSU

#

different GPU

weary epoch
#

fluctuates a lot, but it's averaging about 1 IPC per thread (two instructions per core clock)

brittle coral
#

the only thing similar at this point is the RAM

#

and that's tested good

#

and the SSD + 4x pcie riser for m2

#

and its fine lol

weary epoch
#

and some crazy number of hits to L2/L3 cache

brittle coral
#

nice

#

thats fat

#

thats legit perfect SMT performance basically

#

fast clap

weary epoch
#

it's not much SMT scaling, iirc it got pretty close to 2 IPC just running one WU per core

#

but cache bandwidth and latency is definitely a big thing with rosetta

brittle coral
#

for sure and yea that's what i was looking at even more closely

#

especially on systems with lots more cores i was reading cache latency can become really tricky on some designs

#

intel did better with it on average but ended up having average higher latency across the whole chip because of it, at least in their latest mobile designs

weary epoch
#

yeah. AMD just doesn't try though, it's 4 cores per 16 MB L3 cache

#

they don't try to scale past 4 cores/4 L3 slices

brittle coral
#

yea

#

didnt hurt them too badly in most workloads either

weary epoch
#

yeah

brittle coral
#

so my friend just added his 6 rx580 system that he had been folding on for a month now to LTT because he just hasn't had any time to do tuning till today

#

so thats a thing ^_^

#

in the voice of tim allen "MOAR POWER!!! AUHG AUHG AUGH AUGH"

#

until a few days ago his printer was running 24/7 printing valves while his wife spent her off hours sewing masks for the hospital she works at

spiral coral
#

rosetta at home work units tend to need a lot of RAM, and also hit the memory hierarchy (caches and dram) pretty hard
on zen 2, running as many rosetta WUs as you have threads for seems to be a net win. it's possible that on zen 1, the weaker cache/memory subsystem holds it back and it's not able to scale with SMT

but I think that's quite unlikely
@weary epoch
But what I'm observing is the opposite. If there was a lot of idle CPU time waiting for cache and RAM, then two threads running on the same core would work well. Sadly Windows doesn't differentiate between wait and other CPU states. On my Linux box, I noticed that the system had some largish % CPU that was waiting... and it even paused some WUs with the message 'waiting for memory'. I recently rebooted the VM with 2 gigs allocated per core and I'm now at 98% CPU being utilized by rosetta

brittle coral
#

oof i was delirious before bed apparently

#

i checked my rigs that were on with VNC

#

they were both up

#

but now i check today, one was off

#

for 26 hours.....

#

idk if it's related but it crashed after (system was on but unresponsive, idk if anything was on screen because monitor wasnt getting signal)

weary epoch
#

windows can't see what's happening inside a core. actually no OS can, none of them read those counters afaik (and they're not easy to interpret). Windows does schedule tasks so they don't share cores, as long as you don't have more tasks than cores

#

I don't know why rosetta at home isn't scaling with SMT on Zen 1. I don't have a Zen 1 chip to test with, but not scaling with SMT is extremely rare

#

if it says 'waiting for memory', you might be running out of RAM capacity
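
A rough way to sanity-check the RAM-capacity point is to budget memory per work unit before picking a thread count. This is only a sketch: the 2 GB per WU figure is borrowed from the "2 gigs allocated per core" mentioned earlier and is an assumption, not a Rosetta requirement, and `max_concurrent_wus` and `reserved_gb` are hypothetical names made up for illustration.

```python
def max_concurrent_wus(ram_gb: float, gb_per_wu: float, hw_threads: int,
                       reserved_gb: float = 0.0) -> int:
    """Estimate how many WUs fit at once, limited by both RAM and threads.

    gb_per_wu is an assumed average footprint; real WUs vary a lot.
    reserved_gb is RAM held back for the OS and other programs.
    """
    usable_gb = max(ram_gb - reserved_gb, 0.0)
    fits_in_ram = int(usable_gb // gb_per_wu)
    return min(hw_threads, fits_in_ram)


# 32 GB box with 16 threads, assuming ~2 GB per WU: threads are the limit.
print(max_concurrent_wus(32, 2, 16))
# 8 GB VM with 8 threads, 2 GB held back: RAM is the limit.
print(max_concurrent_wus(8, 2, 8, reserved_gb=2))
```

If the RAM-limited number is below the thread count, that would be consistent with WUs getting paused for memory.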

#

^ that's how much memory it takes to run 32 rosetta WUs at once

spiral coral
#

Yeah, I gave the 8-core system a bunch more RAM and it's still not using all of it. It's a VM, so I was kind of tweaking it to see how it did. The Ryzen system has 32 gigs of RAM and is running on metal so no issues on that one.

weary epoch
#

I guess it's one of those extremely rare cases where something doesn't scale with SMT?

spiral coral
#

I guess

brittle coral
#

@weary epoch the reasons i was given was that there's not enough cache or some other element of the pipeline in those designs?

#

it affects all multithreaded workloads

#

and in Zen2 they doubled up on those components

weary epoch
#

so I set rosetta to use 50% of CPU threads, and I get around 1.5 IPC/thread. it was around 1 IPC/thread using all CPU threads, so on my 3950X, it's scaling pretty well
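
The comparison above can be turned into a per-core SMT gain. A minimal sketch, assuming two threads per core and using the 1.5 and 1.0 IPC/thread figures from the message; `smt_scaling` is just an illustrative helper name.

```python
def smt_scaling(ipc_one_thread_per_core: float,
                ipc_per_thread_with_smt: float,
                threads_per_core: int = 2) -> float:
    """Per-core throughput gain from SMT as a fraction (0.33 means +33%)."""
    core_ipc_with_smt = ipc_per_thread_with_smt * threads_per_core
    return core_ipc_with_smt / ipc_one_thread_per_core - 1.0


# 1.5 IPC with one WU per core vs. 1.0 IPC per thread with two WUs per core.
gain = smt_scaling(1.5, 1.0)
print(f"SMT gain per core: {gain:.0%}")
```

Each thread runs slower with SMT on, but the core as a whole retires about a third more instructions per clock with these numbers.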

spiral coral
#

but I don't see how having more cache can make you run two things at once. There's one core, so if the thread can chug for a long time without having to wait for cache, RAM, disk IO, or anything else, it is most efficient to let it go through that whole process... it's only if that thread can't run constantly on that core that you get the otherwise wasted cycles which a second thread could use.... which is what hyperthreading makes way more efficient by having the second thread ready to hop and do its stuff

weary epoch
#

on any modern CPU, it's extremely rare for one thread to utilize a core's resources fully, because there's not enough independent instructions, you have some instructions waiting on a cache miss, etc.

#

so two threads gives the CPU core more independent instructions

#

as for how much additional cache helps, well that's a complex topic

spiral coral
#

I must admit that this Ryzen is the first AMD CPU I have bought since... 2001, if I recall, so I don't know (and couldn't easily find it online) how deep is the pipeline on these CPUs?

brittle coral
#

im looking at a zen1 vs zen2 vid now

#

if they tell me ill lyk

spiral coral
#

thank you

brittle coral
#

also you skipped the athlon 64?

#

how xD

#

respectfully of course

weary epoch
#

by how deep, you mean branch mispredict penalty?

brittle coral
#

probably

spiral coral
#

yes

brittle coral
#

but also it'll affect clock speed

spiral coral
#

And I skipped the Athlon 64 as I had no interest in going to 64-bit in those days as it just meant the rest of my hardware and drivers wouldn't work.

brittle coral
#

hence why the slower clocked athlon 64 CPUs were faster per clock than intel back then

weary epoch
spiral coral
#

After I went through 3 K6-2 CPUs in an 18-month period I went Intel and didn't look back... until recently.

brittle coral
#

i never used my a64 on a 64bit system

weary epoch
#

those figures are like for an op cache hit. it'll be worse if the branch target is in the instruction cache

brittle coral
#

amd had a huge performance bonus till core2

#

er till core i guess

spiral coral
#

yeah, but in those days I wanted something that would last a while, so I stuck with Intel through those years.

brittle coral
#

intel did worse at that tho imo

#

those old dual core a64s were in rigs forever after that i had to support at my job (consumer pc repair)

weary epoch
#

a64 would still work today for web browsing and email

brittle coral
#

while the intel systems all disappeared fast because they were hot and slow and all single core, while even some of the a64 single cores lasted longer

#

yea

#

a dual core a64 system would

spiral coral
#

If it wasn't for moving at various points in time, my old old P-IIIs and P4s would still work. Actually, I have one of each still around.

brittle coral
#

intel competed with hyperthreading which was meh back then and many disabled it

weary epoch
#

also for details like pipeline length, good luck getting that from youtubers. they usually have no clue what they're talking about

brittle coral
#

yea lol

#

this is a more in depth video tho

weary epoch
brittle coral
#

btw it looks like the big difference on zen2 was the doubling of L3

spiral coral
#

and I agree with hyperthreading... though I did go multi-CPU way ahead of the curve... well... proof is in that statement, I was multi CPU before CPUs were multi-core

brittle coral
#

in the past only a few cores actually had direct access to ram without going through a bunch of chiplets

#

it's an issue of bandwidth to RAM is what im seeing, which is why zen2 has less cache misses since double L3 helps lol

weary epoch
#

yeah northbridge architecture. and AMD was first to bring the northbridge/memory controller on-die

#

and now it's off-die again with chiplets 😛

brittle coral
#

hahaha right

#

i was amused when i saw that

weary epoch
#

I'm not seeing a memory bandwidth bottleneck at least with zen 2 and rosetta

brittle coral
#

i bet you they eventually go back

#

or go to a multi controller design

#

idk how that'd work out tho im no engineer

weary epoch
#

actually it could be

#

it's on and off, but I'm also using counters that I'm not sure are accurate

#

for l3 hit bandwidth I'm assuming each hit fetches a 64-byte cache line
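
That assumption gives a simple conversion from a hit counter to bandwidth. A sketch only: the hit count below is a made-up number, and whether the counter really counts one full line per hit is exactly the uncertainty mentioned above.

```python
CACHE_LINE_BYTES = 64  # assumed: every L3 hit moves one full cache line


def l3_hit_bandwidth_gb_s(l3_hits: int, interval_s: float) -> float:
    """Convert an L3 hit count over a sampling interval into GB/s."""
    return l3_hits * CACHE_LINE_BYTES / interval_s / 1e9


# Hypothetical sample: 500 million L3 hits counted over one second.
example = l3_hit_bandwidth_gb_s(500_000_000, 1.0)
print(f"{example:.1f} GB/s")
```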

brittle coral
#

also

#

it has better branch prediction

weary epoch
#

yeah zen 2 branch prediction is pretty good

brittle coral
#

they are using a totally redesigned branch prediction pipeline

#

even if it only helped by a couple % lol

weary epoch
#

CPU performance uplifts are measured in a couple % these days

brittle coral
#

yea lol

#

i got into a really funny argument the other day with a friend because he was going on about intel and nvidia's market control dominance

#

he wasn't thinking about the fact that if amd didn't exist those companies wouldn't do nearly as much as fast and prices would be insane

weary epoch
#

also in that case, where the branch predictor is 98.5% accurate and I'm getting 2.34 mispredicts per 1K instructions....let's assume a branch mispredict costs 18 cycles
that's 42 wasted cycles. Zen 2 can in theory feed 5 ops/cycle into the backend, so that's potentially 210 wasted instructions per 1k instructions
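
Spelled out, the estimate above is two multiplications. A small sketch using the numbers from the message (2.34 mispredicts per 1K instructions, an assumed 18-cycle penalty, 5 ops/cycle into the backend); the function name is only for illustration.

```python
def mispredict_waste(mispredicts_per_kilo_instr: float,
                     penalty_cycles: float,
                     ops_per_cycle: float) -> tuple[float, float]:
    """Return (wasted cycles, wasted issue slots) per 1K instructions.

    Treats every mispredict as costing the full penalty with the frontend
    otherwise running at full width, so it is an upper-bound style estimate.
    """
    wasted_cycles = mispredicts_per_kilo_instr * penalty_cycles
    wasted_slots = wasted_cycles * ops_per_cycle
    return wasted_cycles, wasted_slots


cycles, slots = mispredict_waste(2.34, 18, 5)
print(f"{cycles:.0f} cycles, {slots:.0f} issue slots per 1K instructions")
```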

brittle coral
#

AMD absolutely has a stranglehold on control of market price and speed of development and im happy for it because now that we are only seeing a few % competition is the only thing driving those numbers higher

#

and HPC demands

weary epoch
#

sort of a worst case estimate but still

brittle coral
#

yea thats a 20% impact though

#

thats not small time

weary epoch
#

actual impact is closer to 6% wasted work (because it's not feeding 5 instructions in per cycle for this workload)

#

just measuring ops delivered by op cache/decoder, versus ops that were actually committed at the end of the pipeline
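
The measurement described above boils down to a ratio of two counters. A sketch with invented counter values, assuming the fraction is taken relative to ops delivered; the exact counters and denominator used in the chat are not stated.

```python
def wasted_work_fraction(ops_delivered: int, ops_committed: int) -> float:
    """Fraction of ops sent into the pipeline that never committed."""
    if ops_delivered <= 0:
        raise ValueError("ops_delivered must be positive")
    return (ops_delivered - ops_committed) / ops_delivered


# Hypothetical counter readings chosen to land on the ~6% figure.
frac = wasted_work_fraction(1_000_000, 940_000)
print(f"wasted work: {frac:.1%}")
```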

brittle coral
#

yea but you know there's someone out there running an HPC datacenter for their company and that benchmark is how they calculate what hardware to buy

#

there's always someone who cares about the outlier %

weary epoch
#

but yeah going from 98.5% accuracy to 99.4% would be a 3% improvement 😛
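
One way to arrive at a number in that range is to scale the measured waste by the change in mispredict rate. This is a sketch under a strong assumption (wasted work is proportional to the mispredict rate); the 6% starting point is the measured figure quoted above, and the function name is made up.

```python
def waste_after_predictor_change(current_waste: float,
                                 old_accuracy: float,
                                 new_accuracy: float) -> float:
    """Scale wasted work by the ratio of new to old mispredict rates."""
    old_miss_rate = 1.0 - old_accuracy
    new_miss_rate = 1.0 - new_accuracy
    return current_waste * new_miss_rate / old_miss_rate


new_waste = waste_after_predictor_change(0.06, 0.985, 0.994)
saved = 0.06 - new_waste
print(f"waste drops to {new_waste:.1%}, saving about {saved:.1%}")
```

With these inputs the saving lands in the same few-percent range as the figure in the message.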

brittle coral
#

yup

#

sounds about right

#

or if you look at it another way

#

thats a near 50% improvement if you use the low number as a zero baseline

#

min-max math for MMOs works that way lol. you take a number for an incoming value and calculate the % difference using the low number as a zero baseline and that's your improvement

weary epoch
#

well that's kinda a sketchy way of looking at it

brittle coral
#

you calculate on the margin of difference. if someone is hitting you for 10k damage and you increase your resistance from 97 to 99% and end up taking 5k damage now then thats half damage
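
The "margin" argument is easiest to see with a purely linear resistance model, where damage taken is proportional to (1 - resistance). That model is an assumption for illustration; real games often use other formulas, so the exact figures will differ.

```python
def relative_damage(old_resistance: float, new_resistance: float) -> float:
    """Damage taken after the change, as a fraction of damage taken before.

    Assumes damage taken scales with (1 - resistance), nothing else.
    """
    return (1.0 - new_resistance) / (1.0 - old_resistance)


# Under the linear model, 98% -> 99% halves incoming damage,
# and 97% -> 99% cuts it to about a third.
print(f"{relative_damage(0.98, 0.99):.2f}")
print(f"{relative_damage(0.97, 0.99):.2f}")
```

The same arithmetic applies to branch prediction: what matters is the change in the miss rate, not the change in the headline accuracy.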

#

it matters in that context

weary epoch
#

depends on how resistance is calculated for that particular game

brittle coral
#

when you're at the upper end of % and you always operate in that space it matters

#

yup

#

im thinking of everquest 2

weary epoch
#

every MMO I've played has very weird mechanics for resistance calculations

brittle coral
#

thats how they did the math

weary epoch
#

I played ff14 (where it didn't matter) and ESO (where increasing resistance gave very diminishing returns)

brittle coral
#

it enabled people to think about small % differences at the correct order of magnitude

#

because it was indeed half the damage going up a couple % at the high end

#

and it was shown to me originally at the time using a compute analog

#

because they were doing a type of HPC where that improvement mattered

#

the longer it took to process the more obsolete the data became

weary epoch
#

yeah HPC is a different animal (compared to desktop stuff)

#

hah

brittle coral
#

everything had to be in sync

#

if one section of the pipeline slowed down it meant that 98% of the server slowed down to keep up with that 2%

weary epoch
#

oh yeah for HPC workloads where you need every thread to finish something before everyone can move on, stuff like turbo and hyperthreading can really throw a wrench in the works

#

so sometimes they turn that stuff off
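
A toy model of why that hurts: with a barrier after every step, the step takes as long as the slowest thread. The thread times below are invented for illustration.

```python
def step_time(per_thread_times: list[float]) -> float:
    """With a barrier, everyone waits for the slowest thread."""
    return max(per_thread_times)


def idle_fraction(per_thread_times: list[float]) -> float:
    """Share of total thread-time spent waiting at the barrier."""
    slowest = max(per_thread_times)
    capacity = slowest * len(per_thread_times)
    return 1.0 - sum(per_thread_times) / capacity


# Hypothetical: 49 threads finish in 1.0 s, one straggler takes 1.3 s
# (e.g. it lost turbo headroom or shared a core with an SMT sibling).
times = [1.0] * 49 + [1.3]
print(f"step takes {step_time(times)} s, {idle_fraction(times):.0%} of thread-time idle")
```

Making every thread uniformly a bit slower can beat having a few unpredictable stragglers, which is the usual argument for switching turbo and SMT off on such clusters.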

brittle coral
#

Yup

#

Even for games it mattered

#

Early on people with multi gpu setups would turn off HT and dynamic core clocking because it led to more consistent FPS output from both GPUs in SLI/crossfire

weary epoch
#

I haven't heard of that

brittle coral
#

I remember having to on my 2600k with a pair of 7950s cause it was causing frame tearing

weary epoch
#

though in early days, the Pentium 4 was the only hyperthreading thing in town, and its particular HT implementation had some issues

brittle coral
#

It wasn't as well known. Most overclockers already turned off HT on the early i7 chips to hit higher OCs

#

They totally skipped it for core so when multi gpu got big it didn't matter for like 5 years

stark bough
brittle coral
#

@stark bough wont they flatten out fine tho once you stick them on stuff?

#

i've had plenty of stickers come this way turn out fine on the end surface they go to

silver wing
#

You can also flatten them before sticking them to things by placing it on a very flat surface, hard cover book, and then placing a hard cover book on top and then a lot of extra weight on that

#

works very well, that's how you crisp up bank notes to looking like new

stark bough
#

Just glad thigg bt s weren't peeed off but thanks I'll give that a try. Also yay my shirts came in

#

I just find it funny how they shipped from Blaine

#

Be more Vancouver plz

stoic pecan
#

wait how many ppd r yall getting on one 2080ti?

silver wing
#

@weary epoch SMT doesn't scale when the execution units are fully utilized or the load store/front end is fully utilized. This is common on Zen1/Zen+ when running AVX2 workloads as the two 128bit FMAs combine to do one 256bit operation per cycle. Each FP pipeline is completely independent from load/dispatch to execution so when an AVX2 operation comes in and gets split the independent pipelines become dependent on each other (both have to complete before new work). That means with SMT enabled, if thread 1 sends in an AVX2 operation and thread 2 wants to do any floating point operation (SSE or AVX), it has to wait for thread 1 to finish, so zero gain from SMT.

#

Zen 2 increased the FMAs to 256bit each, 2 per core, so you can do two AVX2 operations per cycle unlike Zen1/Zen+. AMD did not implement AVX-512 by combining those two 256bit FMA's as it would require a lot of other supporting changes
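
The width difference translates directly into peak FMA throughput. A back-of-the-envelope sketch using the pipe counts and widths from the two messages above, counting an FMA as two floating point operations per lane; the helper names are invented.

```python
def fma_256_per_cycle(num_fma_pipes: int, pipe_width_bits: int) -> float:
    """How many 256-bit FMA operations a core can start per cycle."""
    return num_fma_pipes * pipe_width_bits / 256


def peak_dp_flops_per_cycle(num_fma_pipes: int, pipe_width_bits: int) -> int:
    """Peak double-precision FLOPs per cycle (an FMA counts as 2 per lane)."""
    lanes = pipe_width_bits // 64
    return num_fma_pipes * lanes * 2


for name, pipes, width in [("Zen 1/Zen+", 2, 128), ("Zen 2", 2, 256)]:
    print(name, fma_256_per_cycle(pipes, width), "x 256-bit FMA/cycle,",
          peak_dp_flops_per_cycle(pipes, width), "DP FLOPs/cycle")
```

These are paper peaks; whether a given workload gets anywhere near them is a separate question.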

proven marlin
#

Also gaining on Plant3DNow in the Marathon, though looks like the L'Alliance is on our rear end.

weary epoch
#

@silver wing in rosetta I get 2-2.4 instr/core clock with SMT (depends on WUs), while Zen 2's pipeline can do 5 IPC. And rosetta's light on FP ops compared to FAH, averaging 0.12 flops per instruction. On top of that, lots of the flops aren't FMA, and Zen 1 can in theory do 2x256 bit operations per cycle because it has 2x128 bit fadd and 2x128 bit fmul (matching Sandy Bridge's AVX throughput)

#

so I don't think frontend throughput or the FPU are limiting factors for rosetta and SMT scaling.

silver wing
#

Rosetta is AVX2. Zen has 4 FP units 2 Add and 2 Multiply Add and you can't do both at the same time for a lot of things, AVX2 being one of those

weary epoch
#

another thing - the FP execution units are fully pipelined. so while a FP add takes 3 cycles to produce a result on Zen, you can feed the FP add unit an independent operation every cycle. So it's extremely hard to get FP throughput bound except in very optimized code like linpack
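
The latency vs. throughput distinction can be shown with a tiny cycle count. This is an idealized sketch of a single fully pipelined unit with the 3-cycle add latency mentioned above; it ignores everything else in the core.

```python
def cycles_dependent_chain(n_ops: int, latency: int = 3) -> int:
    """Each op needs the previous result, so latencies add up."""
    return n_ops * latency


def cycles_independent(n_ops: int, latency: int = 3) -> int:
    """Independent ops enter one per cycle; the last finishes latency later."""
    if n_ops == 0:
        return 0
    return latency + (n_ops - 1)


print(cycles_dependent_chain(100))  # a long chain of dependent adds
print(cycles_independent(100))      # the same number of independent adds
```

So the unit only becomes the bottleneck when there are enough independent FP ops to keep it fed every cycle.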

#

if you do avx-256 operations, Zen 1 breaks that into two 128-bit ops. you could do one 256-bit add and one 256-bit multiply per cycle on zen 1

silver wing
#

Anything AVX2 going through Zen/Zen+ is going to hit a massive traffic jam

weary epoch
#

only for FMA though, it's not that weak for regular AVX2. in any case, rosetta doesn't have enough FP operations for the FPU to become a bottleneck
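
To put a rough number on "not enough FP operations": multiply the measured instruction rate by the measured flops per instruction and compare against FP pipe capacity. A sketch only, using the 2 to 2.4 instructions/clock and 0.12 flops/instruction figures quoted earlier, and pessimistically treating each flop as occupying a whole slot on one of four FP pipes (the pipe count is taken from the discussion above).

```python
def fp_pipe_utilization(instr_per_cycle: float, flops_per_instr: float,
                        fp_pipes: int = 4) -> float:
    """Approximate share of FP pipe slots in use per cycle."""
    fp_ops_per_cycle = instr_per_cycle * flops_per_instr
    return fp_ops_per_cycle / fp_pipes


low = fp_pipe_utilization(2.0, 0.12)
high = fp_pipe_utilization(2.4, 0.12)
print(f"FP pipes busy roughly {low:.0%} to {high:.0%} of the time")
```

Vector instructions do several flops per slot, so real utilization would be lower still.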

silver wing
#

Yes but those two 128bit ops are co-dependent and since you can't do an Add op and a Multiply-Add op at the same time that's where Zen2 picks up the most

weary epoch
#

for sure Zen 2 has a huge lead when you have 256-bit ops. I just don't think rosetta has enough of them to give Zen 1 a hard time

silver wing
#

So you have 4 FPU per core but you can't always actually have 4 ops going, AVX2 or not. I'd have to read up again why but that's where Zen's weakness is, but it's really good at 128 bit ops though

weary epoch
#

FAH on the other hand might. FAH has a ton of FP operations

#

I believe you can have 4 ops going in Zen 1's FPU, as long as they're not FMA

silver wing
#

Zen 2 made it way less of a problem

#

A Zen core is 2 FPUs of each type

weary epoch
#

both zen/zen 2's FPUs look really similar, except Zen 2 doubles the vector width

silver wing
weary epoch
#

oh, my screenshot is from the zen 1 optimization manual

#

basically you can do 4 FP ops/clock, unless you need three input operands (aka, FMA)

silver wing
#

The main problem when it comes to Zen/Zen+ is when you do a 256bit op and it's an FMA you can't do an FADD op so both get wasted and SMT literally ends up net 0 gain

weary epoch
#

assuming each thread can get to 1x256-bit FMA per cycle, yeah. but rosetta's nowhere near that

#

the FP unit has a lot of idle time

silver wing
#

no idea, I haven't really measured it or have the tools to atm anyway. All I know is it's AVX2 compiled at least

weary epoch
#

I'm reading Zen 2 performance counters. I don't have a Zen 1 chip unfortunately

silver wing
#

those will do, the workload is the same so you'll get the same information I'd expect

weary epoch
silver wing
#

"Zen 2 handles transitions between the SSE and AVX mode by microcode which takes approximately 100 cycles in either direction. Zeroing the upper half of all YMM registers with the VZEROUPPER or VZEROALL instruction before executing SSE instructions prevents the transition." Didn't see that before, interesting

weary epoch
#

yeah, because Zen 2 uses full 256-bit registers, but still has to deal with 128-bit ops while preserving the upper half of the register

#

you'll see a similar warning in Intel optimization manuals, though I don't think it's 100 cycles there (have to check)
zen 1 has no such penalty because it only tracks 128-bit registers

silver wing
#

100 cycles is a lot though, that sounds like a good way to cripple performance if you aren't doing it properly lol

weary epoch
#

it's on transition, not on every op

#

as long as you're not mixing 128-bit and 256-bit stuff in the same loop, you're mostly ok
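
How much the penalty matters depends on how often the transition happens relative to useful work. A quick sketch using the roughly 100-cycle figure from the quoted manual text; the loop sizes are hypothetical.

```python
def transition_overhead(transitions: int, penalty_cycles: int,
                        useful_cycles: int) -> float:
    """Fraction of total time lost to SSE/AVX state transitions."""
    lost = transitions * penalty_cycles
    return lost / (lost + useful_cycles)


# Mixing modes inside a tight loop: two transitions per 50 useful cycles.
print(f"{transition_overhead(2, 100, 50):.0%}")
# Switching once in and once out around a million cycles of work.
print(f"{transition_overhead(2, 100, 1_000_000):.3%}")
```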

silver wing
#

yea I just mean if you are transitioning, kinda frequently, and not following that it'll be painful

weary epoch
#

oh yeah, hence the advice

#

but people should have been optimizing for that. just checked agner's manual, Sandy Bridge has a 70-cycle penalty for that transition

silver wing
#

only place I can think of off hand where that would be a problem is emulators and plugins

weary epoch
#

oh, what's with emulators and plugins?

silver wing
#

I mean game console emulators where there is like 3-5 different graphics plugins to choose from and they all have different SSE and AVX modes you can pick from

#

I haven't run up my ps2 emulator in ages though

weary epoch
#

then each function that does 256-bit stuff should do VZEROUPPER

silver wing
#

you'd hope so, probably do by now, but those are a mess with weird tweaks and bug fixes just to make them work

weary epoch
#

well, good news is Skylake has no such transition penalty

silver wing
#

But I'm still on Ivy-E

weary epoch
#

welp, 70-cycle transition penalty then

silver wing
#

back in my day

weary epoch
#

eh most people used ivy bridge when I was in college

#

haswell was just coming out

silver wing
#

4930k + 2x 290X still

weary epoch
#

I do have a salvaged sandy bridge system...

#

trying to see if the CPU can tell me when it eats a transition penalty

silver wing
#

I've still got working Nehalem EP and Westmere-EP systems

weary epoch
#

I used to have a 2xE5645 westmere system but got rid of it

#

per-core perf was too miserably slow

silver wing
#

6x X5650 burning through boinc atm

weary epoch
#

nice

#

bet it's eating a lot of memory too, if you're using all 24 threads

silver wing
#

At least for Universe@Home not much slower than the Sandy-EP, Ivy-EP and Broadwell-EX systems

weary epoch
#

hmm, I never tried universe@home

#

just doing rosetta now

silver wing
#

It's the Javelin event

#

err no City Run

#

ends in the next 30 something hours anyway

weary epoch
#

ah

silver wing
#

Xeon 5118 are disappointingly slow too, 8890v4 are eating them alive

weary epoch
#

I don't know about most distributed computing projects. Just FAH and Rosetta for now

#

oh broadwell better lol

#

wait 5118 is Skylake?

#

Xeon Gold 5118?

silver wing
#

yea and the single AVX-512 ones

weary epoch
#

at first I thought that was a nehalem or core 2 xeon

silver wing
#

not that any of this is AVX-512 anyway

weary epoch
#

the broadwell chip does have more cores...

silver wing
#

yea I'm just comparing task completion times

weary epoch
#

oh hmm, weird

silver wing
#

8890v4 is sitting at around 2.6Ghz all core

weary epoch
#

is skx just clocking lower?

silver wing
#

5118 2.7Ghz

weary epoch
#

could also be SKX's higher uncore latency (for core to core, L3 access, and memory access), if it's running at a low mesh clock

#

also SKX's L3 is a victim cache, so it can't absorb L2 writebacks

silver wing
#

and effective memory bandwidth can be a bit higher on broadwell too

weary epoch
#

broadwell needs less memory bandwidth because its L3 can absorb stores

silver wing
#

"The new Skylake-SP offers mediocre bandwidth to a single thread: only 12 GB/s is available despite the use of fast DDR-4 2666. The Broadwell-EP delivers 50% more bandwidth with slower DDR4-2400. It is clear that Skylake-SP needs more threads to get the most of its available memory bandwidth."

weary epoch
#

that's interesting

silver wing
#

Total bandwidth is up, per core way down

weary epoch
#

just wondering why that's the case

#

latency maybe, if mesh clock is low. but at high core counts, ring suffers from more hops anyway

silver wing
#

Rosetta and Universe are all many single thread tasks

weary epoch
#

which is actually really good for Zen because you don't have much cross-ccx communication 🙂

#

but still good for other architectures because core to core coherency is expensive anyway

silver wing
#

and single thread memory bandwidth is more than double that of Skylake-SP

weary epoch
#

that's so weird

#

on paper skylake should match broadwell in that respect. something else must be going on

silver wing
#

speaking about Zen 1 EPYC btw, I don't think I've actually checked Zen 2 EPYC

#

argh, that image broken for you?

#

there we go showing now

weary epoch
#

yup it's showing

#

hmmmm

silver wing
#

damn it, anandtech didn't do the same test for EPYC 2

weary epoch
#

one thing is, each Zen core can track 22 outstanding L1D misses. Skylake and broadwell can only track 10

silver wing
weary epoch
#

did they lock core clocks for the test?

silver wing
weary epoch
#

ok a calculation isn't as straightforward as I wanted

#

was trying to calculate how many L1D misses each core has to track, but clearly prefetch has a role to play

#

so epyc's doing 0.134 cache lines per core clock at 3.2 GHz. with 371 cycles of memory latency (116 ns * 3.2 cycles per ns), it's tracking 47 outstanding memory requests

with skylake, the same math gives 17 outstanding memory requests
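
The math here is Little's law: requests in flight = request rate x latency. A sketch with the EPYC inputs from the message (0.134 cache lines per core clock, 116 ns latency, 3.2 GHz); the function name is illustrative, and the Skylake inputs are left out because they are not given in the chat.

```python
def outstanding_requests(lines_per_cycle: float, latency_ns: float,
                         clock_ghz: float) -> float:
    """Little's law: average requests in flight = rate * latency."""
    latency_cycles = latency_ns * clock_ghz  # ns * cycles/ns
    return lines_per_cycle * latency_cycles


epyc = outstanding_requests(0.134, 116, 3.2)
print(f"about {epyc:.0f} cache lines in flight")
```

With these rounded inputs the product comes out a little under 50, the same ballpark as the figure quoted above. As already noted, prefetching means these are not all demand misses tracked by the core itself.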

#

I bet it's the queue between the core and uncore

#

and AMD has bigger queues

#

I can't find hard numbers on the actual queue size though

silver wing
#

wikichip is the best place I find for that type of info

weary epoch
#

wikichip doesn't say 😦

#

if AMD/Intel don't publish it, it's really hard to measure

hushed bobcat
#

Is the LTT Team BOINC Event over?

silver wing
#

no

#

marathon still has 8 days to go

hushed bobcat
#

BOINC is hammering my CPU

#

maybe if i set it at 50% it would use 60%....

silver wing
#

Everyone that can could you for the next 24 hours run Universe@Home instead of Rosetta@Home, then move back to Rosetta after that. Would be a big help to gaining a rank in the competition

hushed bobcat
#

how do i do that?

#

nevermind

#

or maybe not

#

I write my passwords down and signed in a few days ago already

#

so im pretty sure my email and pass are correct

rich depot
#

yo so ive been folding for a long time and my Points earned is still 0?

#

do i have to wait for this "Work Unit" to get to 0?

#

@quiet wedge thanks

#

sorry me

hushed bobcat
#

you will get the points several hours after you successfully complete a WU

rich depot
#

ok so when the cpu gets to 100%?

hushed bobcat
#

not quite

#

it has to complete then upload to a collection server

#

once the collection server processes it you should get points

rich depot
#

ok makes sense thanks!! and go team ltt

hushed bobcat
#

👍

silver wing
#

@hushed bobcat Do you know each project requires its own signup and account? Accounts aren't global

hushed bobcat
#

i did not know that

silver wing
#

one of the quirks with boinc

#

also means you need to remember the LinusTechTips_Team for each project when you create an account

spiral coral
#

Does anyone know what it means when a WU seems to take a lot longer than all the others? I have a WU which has been running for over 20 hours, the progress is still going up but very slowly... now at 99.161%. Is it just a really weird WorkUnit that has a weird ending, or is it some type of bug?

marble quartz
#

is the boinc pentathlon still open to join?

#

how is boinc different to f@h

stoic pecan
#

^

silver wing
#

boinc has many projects, boinc is more just the framework and also the software you use

#

there are projects like Rosetta that are doing medical research and others doing other scientific areas, like Universe@Home which is looking at black holes etc

#

So Rosetta (CPU) + F@H (GPU) is a nice combo right now

brittle coral
#

Milkyway@home as well doing data processing for the Sloan Digital Sky Survey 🙂

#

that's a good GPU project to run

#

now that seti@home is gone as well

#

also, thoughts on the 7.6.13 client?

#

does it still have the slots not showing bug or did they fix that

#

nvm it seems like they did 🙂

#

yaaaaaay, my 3600x system is up with my AMD GPUs and my 5700xt just got a WU 😄 this is a good day

#

everything went smoothly no troubleshooting just wipe the drive and install windows

#

man that WU just sucked down a massive amount of CPU to get started, it went from 1% to 48% in a few seconds then back to 1% once it loaded everything into RAM, ive never watched that go down in task manager before lol

#

my farm should be a lot more stable now tho since im not using an unstable bench build that im still finishing troubleshooting. i got it stable on a fresh windows install for now on a different SSD but still, since when does windows cause a PC to randomly power off though even idle without overheating?

limber onyx
#

@spiral coral yes, some projects take longer than others

spiral coral
#

Thank you for the note - I understand that... I also see greatly varied RAM usage by WU. What surprises me about this one is that it went from 0-99% more or less in line with the other WUs that started at the same time; it's the 99-100 that has already taken double what 0-99 took, and though it keeps insisting that there are 10 minutes and 15 seconds remaining, my estimates put it at 14 more hours.
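
An estimate like the one above can be sanity-checked with a simple linear extrapolation from two progress readings. This is only a rough sketch: the function name and the sample numbers are hypothetical, and it assumes the WU keeps progressing at the same rate it showed between the two readings, which may not hold for the tail end of a task.

```python
def estimate_remaining_hours(progress_then: float, progress_now: float,
                             hours_between: float) -> float:
    """Extrapolate time to 100% from two progress readings (in percent).

    Assumes a constant progress rate between now and completion.
    """
    if hours_between <= 0:
        raise ValueError("hours_between must be positive")
    rate_per_hour = (progress_now - progress_then) / hours_between
    if rate_per_hour <= 0:
        raise ValueError("progress did not advance between the two readings")
    return (100.0 - progress_now) / rate_per_hour


# Hypothetical readings: 99.0% five hours ago, 99.5% now.
remaining = estimate_remaining_hours(99.0, 99.5, 5.0)
print(f"roughly {remaining:.1f} more hours at the current rate")
```

If the number this gives is wildly different from the client's "time remaining", the task is probably progressing non-linearly, and the project's own forum is the better place to ask whether that is expected.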

brittle coral
#

so uh, i been busy dealing with stuff for my farm

#

forgot to note that i made it past 100m a few days ago

#

yay me? dances

#

yes im almost a third of the way to my next

west fog
#

Gonna break 11m tonight partyblob

brittle coral
#

get on dat

#

😄

#

everyone crank all the points on all the things forever

gaunt wasp
#

Wow. Just got a fresh work unit, project 16440. Look at this helicase protein! They're comparing MERS to SARS-CoV-2 helicase to understand why SARS-CoV-2 is more infectious...

brittle coral
#

thats dope

#

i havent tried opening the renderer yet

#

also 7.6.13 seems to have fixed the bug with people having to use an older version to get slots so thats cool

vapid jungle
fossil bay
#

uwu

fickle needle
#

when i'm so bored of waiting for my folding client to pick up a new WU I just let BOINC run instead

fickle needle
#

so fahbench covers the 21 GPU core, is there any way to test for the 22 core?

#

and does the mm_22 core require some higher level of compute support?

rich depot
#

Can you guys help?

rich shell
#

Maybe your gpu isn't supported

rich depot
#

I've been folding for over 300+ hours

rich shell
#

OK then, i have no idea

#

Lol

hushed bobcat
#

log?

rich depot
#

ha ha ha thanks

fickle needle
#

RIP, my GTS 450 and GTX 550 Ti have been obsoleted for folding

#

hur dur lets pull out support for old hardware in the middle of the pandemic

brittle coral
#

to be fair

#

those GPUs produce less PPD than a modern CPU

#

and use more power in the instance of the GTX550ti

fickle needle
#

then with that logic, my GPU is more PPD than an old CPU, why is there still support for CPUs from like 2007?

#

I guess i'm kinda more annoyed that they thanos snapped two whole gens of Nvidia GPUs out of existence all at once, no announcement, no patch notes

brittle coral
#

lol i mean also nvidia has been dropping support for older stuff bit by bit so it might not be their fault

#

remember the RTX gen already has a cuda implementation that isn't compatible with older hardware as well

#

it sort of is with hacks but performance is severely affected

green nymph
brittle coral
#

@green nymph any idea besides making individual per PC accounts how to get it to properly recognize my per PC core/thread % settings?

#

i haven't been running my 3 ryzen rigs with a combined 26 threads available on the BOINC stuff because they kept insisting on defaulting to whatever the highest amount was

#

i could i guess standardize it to my weakest 6 core but that leaves a bunch of random threads unused or running very slow F@H work 😦

#

i have 2 6 cores but one's running 2 more threads for CPU cause it's running AMD cards and they use way less CPU than the 3 nvidia card system

#

and then my 3800x is running 4 more threads than the 3600x and 6 more than the 2600x xD

fickle needle
#

@brittle coral putting it on Nvidia is nonsense, plenty of other programs work on old cards perfectly fine,

brittle coral
#

i guess im mostly just speculating why tbh

#

also it depends on what programs specifically, not everything needs the performance levels that folding does

#

performance requirements for many things are fairly static by design while folding wants to use the fastest stuff they can get their hands on, it's possible the newest core uses something that isn't supported on those older generations. you'd have to ask on the official folding forums for a concrete answer though

#

and of course, running and being supported currently are different things, not that i disagree that it sucks if they removed it while you were actively using them

fickle needle
#

haha, nothing says 'concrete answer' like some random forum XD

brittle coral
#

its not a random forum

#

it's the official dev forum

fickle needle
#

?

brittle coral
#

that is where they discuss and work with the community on these things

#

it's THEIR forum xD

#

every time i've posted asking a technical question one of the engineers or people who help manage the beta team has had the answer in technical detail

#

this channel is just some random forum by comparison, hence my pointing you there

fickle needle
#

they're pretty sketchy for an 'official forum' XD

brittle coral
#

I mean isn't that how most forums usually are to a degree? unless the right person responds it's all just hearsay

#

they're there so that those at the top with the tech can have access to the community to let them know what they're doing and why, and they post update notes which may explain your issue as well

#

and of course so the community can give feedback and help them test their code

fickle needle
#

yeah, already checked the update notes, nowhere does it say 'fuck you, your cards no good'

brittle coral
#

you should let them know then, there may be a bug with their latest release

fickle needle
#

good point

brittle coral
#

they may have not intended to cut them both out or it could be some other issue entirely. for a while i was getting true beta WUs that werent meant for advanced flag users

#

and they were able to help me sort out why that was happening and there were tons of others all reporting it and then they fixed the bug on their end and instructed us on how to delete the WUs

#

super helpful and replied within 2 hours

#

plus the info i needed was already in the thread

#

the beta forum at least is extremely active

#

in other news, just hit a new all time firefox high xD. and people were saying i didn't need to upgrade to 32gb for any reason when i told them i was doing it

#

i was getting BSODs before i upgraded from memory management errors cause the windows memory handler couldn't keep up

#

that's with just one F@H GPU as well, there were 2 in this rig before i did it and at the time the WUs were using 1-2gb of ram

plain geyser
#

Woot i am finally 3 digits in rankings!

brittle hull
#

why did i read the channel name as "folding and boing" lol

feral kite
#

I read it folding and bionic xd

brittle hull
#

lol

brittle coral
#

ugh i may finally get this 290x up tonight again in the rig it's going out in if it doesn't start randomly shutting down again once I image the drive to another one and reinstall fresh lol. I just got a 256gb USB3.1 gen2 samsung microSSD to image it to, should be pretty fast :). drive is also 256gb and it's full so it should be perfect lol

wintry spire
#

@plain geyser what?

brittle coral
#

@wintry spire he means he's in the top 1000 on the team (or all time) for folding lol

brittle coral
#

guys am I insane for putting 4000+ RPM 120x120x38mm server fans on my ryzen coolers? the 2600x was doing so well that i thought the fan wasn't performing as hard as i thought, and so i didn't order another fan ahead of time (i used up the others already) for the 3600x when it came and so the 212 evo black thats on there is still not keeping the temp peaks out of the mid 80c range, which is acceptable but it's also way way hotter than the 45-55c i was getting on the 2600x xD

#

what are you guys doing for dedicated folding rigs that are out of earshot to keep temps in check?

shut halo
#

dude, that 80+°c in folding is pretty normal on zen2. Or at least that's what I get when folding with my 3800x (at stock). Now I have enabled 65W eco mode and the cpu stays in mid 60's

#

so. I recommend you enable some eco mode, or lower the power limit on the cpu

plain linden
#

my 3800x with a 280mm aio is sitting 74-76c while folding. 21c room.

fickle needle
#

and I mean, if you're worried about your 3800x running you can drop a few threads

fluid leaf
#

Agreed on the eco-mode. I only lost ~200mhz with all cores loaded. dropped my temps 12C and got a healthy bit more energy efficiency

plain linden
#

i have pbo enabled (should be enabled, asus motherboard has whatever cpu enhancements that default on with docp memory enabled) and my 3800x seems to like sitting 4.1 on all the folding cores. folding client is set to medium power so only 6 cores are folding? might be 7, ryzen master is showing 7 cores at 4.1ghz or so.

weary epoch
#

you could also use PBO and set PPT (package power tracking, the socket power limit) to a lower value. that's another way of increasing power efficiency and bringing temps down

#

with more flexibility than eco mode

fluid leaf
#

@weary epoch nice tip. i'll have to play around with that once the pentathlon is over

brittle coral
#

@shut halo is it normal though on a zen+ as well? im trying to get an apples to apples comparison point here. the rigs are in different enough parts of the garage that one is working with about 3-4c higher ambient temps at night unless i vent out the whole garage by opening the door at some point for a bit, while the other one is in the wire closet where its a bit tight and there's a box fan blowing air directly into the room while the whole garage is supplied by a tiny blower fan that moves about the same amount of air as the closet is getting lol. but even then, one fan is moving like 40-70cfm while the other is moving over 180, so i got a 240~ cfm fan to compensate and we're gonna see how it performs xD. even if zen2 runs hotter under more stringent cooling it should still help with that 30-40 degree temp delta im seeing @_@

#

like that's a massive friggin delta. my 3800x is that hot under water in a much warmer room with a restricted (clogging) block im gonna clean soon because of it. I'm fine with it being hotter cause of where it is and the 2 extra cores but it just seemed higher than i figured i'd expect given the ambient temps. who knows lol maybe that fan will make a big difference but even if it only drops me 5-7c thats a big enough delta to keep me from throttling during the summer months when its going to get blazingly hot in there (in excess of 40c probably most warm days and 50c on the hottest). Also I would never lower the power limit for a CPU before trying to cool it further first, im trying to get every bit out of my hardware personally cause i have the power budget for it.

#

@silver wing i think i figured out a fix (a bit too late) for my BOINC config problem, wanted some input. i figure if i run BOINC in a VM i can tell it to use 100% of resources and just give it what i want right? IDK why i didn't think of this sooner, i've been so busy just getting my 4th rig working fully and fixing my friend's PC (technically my 5th folding rig now with 1 gpu in it) to give it back. Is it worth setting this up now still?

silver wing
#

You can do a VM yes but you can configure both boinc and FAH maximum number of cores to use so you might not have to

brittle coral
#

it wont accept the values though per PC

#

that's the issue im having that requires this kind of solution

#

It always accepts whatever the most recent value I enter as the default in the web config

#

i remember this always being a thing before as well but at the time my rigs were all identical core/thread config so it didn't matter lol

silver wing
#

you can create configuration files in the project directories and set maximums

#

So on Windows C:\ProgramData\BOINC\projects

#

Create app_config.xml file in the project folder, it won't exist

#

slap this in the file

#

<app_config>
<project_max_concurrent>12</project_max_concurrent>
</app_config>

#

set number to how many cores you want used if it's a 1 task per core project

#

also if you make changes in boinc manager it'll override web config just don't click the button to resync that

#

so you can set max % in that per system and it can be different

#

app_config.xml is a very reliable way though
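
For anyone setting this up on several rigs, here is a minimal sketch of generating that same app_config.xml from a script instead of typing it by hand. The function names are made up for illustration, and the project folder name in the usage comment is a placeholder: use whatever folder actually exists under C:\ProgramData\BOINC\projects on your machine. The client will most likely need to reread its config files or be restarted before the limit applies.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def build_app_config(max_concurrent: int) -> str:
    """Return app_config.xml content limiting concurrent tasks for one project."""
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    root = ET.Element("app_config")
    ET.SubElement(root, "project_max_concurrent").text = str(max_concurrent)
    return ET.tostring(root, encoding="unicode")


def write_app_config(project_dir: Path, max_concurrent: int) -> Path:
    """Write app_config.xml into the given project folder and return its path."""
    target = project_dir / "app_config.xml"
    target.write_text(build_app_config(max_concurrent), encoding="utf-8")
    return target


# Example usage (placeholder folder name, adjust to your own project folder):
# write_app_config(Path(r"C:\ProgramData\BOINC\projects") / "<project folder>", 12)
print(build_app_config(12))
```

Running it per machine with a different number gives each rig its own cap without touching the shared web preferences.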

umbral kernel
#

dumb question but why does FAH not support tesla GPUs

rich depot
#

because of the GHX CL21

lusty trellis
#

Is there a way in f@h config xML to disable opencl and just set it to CUDA. I think the client is confused in mine

brittle coral
pseudo abyss
#

I have fallen below top 500 on LTT team 😦

brittle coral
#

@pseudo abyss oof :(. did more processing power come online or did you have to downgrade?

rich depot
#

What is a good keyboard I can get for £50

brittle coral
#

@rich depot that is a better question for the tech-chat channels, you'll get more activity there for it

#

idk how to answer that off the top of my head personally, my G910 keyboard was more

gaunt wasp
pseudo abyss
#

@brittle coral it's partly that I haven't folded as much the last while, but it's mostly that it was last year I got above rank 440 or whatever it was, and this year so many have joined who have more powerful gear than me or are folding much more of the time than me.

wintry halo
#

I'm new here, gonna stay of course, but I wanted to ask this question about folding, just found out about it like 2 hours ago, thought I might as well help and run my system while I sleep. Downloaded client, ran at full, because I thought I would max at 80C on my cpu (running a 3700x OCed to 4.2Ghz with a kraken x72 360mm AIO), boy was I wrong. My cpu hit 97C before I could stop it, and even running at light usage I hit 82C. I have 6 fans in my case, 3 intake 3 outtake, both sets running at 1500-1800 rpm when at full speed, and I've never had a problem with cpu overheating with this system before

#

Is there a way I can reduce my temps?

#

Oh and I also idle at 60C instead of my normal 40C even though I still idle around 2% usage

#

My room temps are generally around 25-26C as I have no air conditioning

#

My airflow into or out of my system isn't blocked by any object on the outside

brittle coral
#

@wintry halo you can power or voltage limit the CPU in ryzen master, effectively underclocking it while it's on auto

#

set the TDP limit lower and it'll aim to stay within that window

wintry halo
#

Alr

brittle coral
#

doing it in software will make it easier to set it back for while you're gaming. I'm getting lower temps than you on my 3800x now on water though by a fair bit

#

97c sounds like a bad mount

#

but my loop is also custom not AIO

#

i have a clogged block tho so idfk

wintry halo
#

It's never hit over 65C before

#

So i was surprised when it jolted to 97C

brittle coral
#

my 3800x is pinned at 4350 in ryzen master at 1.325v maximum (1.26-1.27 with vdroop)

wintry halo
#

Ah

brittle coral
#

thats p normal F@H uses the full FPU on your CPU which heats it up real fast

#

and cores on ryzen are tiny

#

they're like less than 10% of the die package xD

wintry halo
#

I'm not 100% sure as I did it a while ago but I believe I'm running 1.2574 volts on my cpu

brittle coral
#

all that cache takes up a lot of room

#

dang yea that doesnt make sense

wintry halo
#

Yea

brittle coral
#

are you sure your cooler is running properly?

wintry halo
#

Yea

brittle coral
#

i know those kraken AIOs are rated pretty well

wintry halo
#

Unless that just killed it

brittle coral
#

i would never use one but idk

wintry halo
#

Maybe the loop was a bit hot

brittle coral
#

even on just air cooling my 3600x does better

#

my 3800x was hitting those temps with the stock cooler

wintry halo
#

Ah

brittle coral
#

2600x is actually a joke, its cool as balls with this 4000rpm 38mm thick delta fan i have

wintry halo
#

I maxed the speed on my loop so maybe that damaged the pump

#

Lol

brittle coral
#

i ordered a 5200rpm fan for the 3600x to try and match the performance im getting on my 2600x temps wise but its still doing okay lol

#

i mean

#

those tiny pumps are kind of shit

wintry halo
#

True

brittle coral
#

but they should be able to handle the pump running at maximum

wintry halo
#

I didn't have the money at the time for a full custom loop

#

That gets expensive real quick

brittle coral
#

if you're lucky that radiator isn't proprietary and it will have some kind of barb fitting on it so you could mod it into a basic ass custom loop, but honestly a proper radiator may perform better, idk how thick that radiator is. either way it shouldnt be hitting 97c even folding

#

and yea it does

#

i don't even want to estimate what my loop costs at this point, i just spent like $800 on new parts to add to what i already had

wintry halo
#

It's like maybe an inch thick?

#

My bad

#

That's with the fans on it

brittle coral
#

i just got 12 compression fittings, 4 Y adapters, 5 pairs of quick disconnects

wintry halo
#

More like 4/10 of an inch

pseudo abyss
#

Most of the people that go past me are doing 1.5 mill points a day or more

brittle coral
#

omg that is tiny

wintry halo
#

Maybe a bit bigger

#

I'll look tomorrow when I wake up

#

It's 1:20 am and I got school

#

I'll mention you when I figure out how big it is

brittle coral
#

yea like my smallest radiator is 30mm thick but even then a 20mm rad shouldnt perform that poorly if its a 3x

wintry halo
#

Yea I'm a complete dumbass

#

I just looked it up