If you are advertising something, make sure its accurate and not full of marketing bullshit as people will see through it.
AMD has agreed to pay purchasers of its FX Bulldozer processors a total of $12.1m to settle a four-year false advertising lawsuit. Considering the number of processors sold and assuming a 20 per cent take-up by eligible purchasers, that works out to $35 a chip, the preliminary agreement argues: a figure that is “significantly …
"If you are advertising something, make sure its accurate and not full of marketing bullshit as people will see through it"
Some, but not all, people will.
I quote Apple as an example of successful marketing bullshit. Maybe not technically inaccurate in the same way as this but marketing bullshit nonetheless.
My first "Desktop" CPU did not have an FPU or any cache at all. Whether these are required to call something a core is a matter of opinion. Back in the day, reviews were clear about the FPUs being shared between two "cores". Anyone who made a trivial effort to find out would have known. There were plenty of benchmark graphs around showing Bulldozer's limitations with multithreaded FPU tasks.
The most unbelievable part of this case is that there are still people around who think anyone but lawyers benefits from a class action lawsuit.
Yep. I'm kinda in AMD's court here for where the ball should land. An engine is an engine, whether it is fast or slow. A wheel is a wheel, whether it is small or large.
AMD did not lie about the size of the engine, or the number of tires... anywhere. The consumers assumed more tires made a fast car, and did not check the size of the engine.
IF AMD had been more whimsical with their wording and advertising (they do, and many other companies do at other times), I could totally understand.
Seems more proof that everything, everywhere is broken. That the fake ads get a pass, but the totally open and legit ones get sued into oblivion.
No, this is more like buying an 8 cylinder engine only to find out that it's really a four cylinder with two adjacent cylinders glued together and sharing a single spark plug and manifold -- 8 pistons, four combustion chambers. The state of the art at the time was independent FPUs; I remember very well the pain of moving from K8 (independent FPUs) to Bulldozer (needing to order double the cores for the same FP throughput).
Oddly enough, I still have my Bulldozer since it can use coreboot. I never ever thought I was getting 8 independent cores, but I can see how others, especially the less tech savvy, would not have got the performance they were expecting for multimedia apps based on core count alone.
That engine exists, Honda produced a version for a racing motorbike in the early 80s and then put it in a road bike in the early 2000s at $50k each
Oval piston with two conrods, and 8 valves per cylinder.
Just a side note for you....
Or more successfully, the venerable Deltic engine.
The thing about any innovation is that by definition the existing language is less than adequate in describing it. That's why it's usually better to just make something up. You might be accused of bullshitting but at least you can't be accused of lying.
>That engine exists, Honda produced a version for a racing motorbike in the early 80s and then put it is a road bike in the early 2000s at $50k each
That's one expensive coffin but those on the transplant list will be grateful. I live in the Peak District and if you saw the number of lunatic motorcyclists descending on us from afar, as I do, then you'd be as dismissive of motorcyclists' self-inflicted folly as I am. However, sensible motorcyclists are always welcome.
Uhh... Yeah, if AMD sold 6502's with a "Bulldozer" label on them, they would have gotten sued for much the same reason.
Umm:- I8080, I286, I386, M68000, M68010, M68020.
FPUs were separate components from CPUs for a long time.
Note:- Hyper-threading, as developed by Sun and used by Intel, appears to the OS as multiple CPUs even though only a single ALU and FPU per core exist.
The problem with this case is that AMD were completely accurate but despite this were sued and had to make a pay out.
A CPU has never needed to include an FPU; for the first decade or so of my career FPUs were always additional accelerator devices or even boards. It is still the case that many CPUs do not have FPUs. I recently designed a twin core fixed point DSP system, with (fixed point is the clue) no FPU.
AMD were completely accurate and precise using the normal meaning of the word core. I am sure their detailed literature and any technical reviews would also have been accurate. That total accuracy does not protect a company from a lawsuit by ignorant and negligent purchasers, or perhaps just opportunists who see a chance to get a little money, does not paint the US legal system in a good light.
Every sympathy for AMD in this completely unjust situation. Ignoring the money, what about the unfair reputational damage?
"AMD were completely accurate and precise using the normal meaning of the word core."
I disagree. Not because of the shared FPU but because of the shared instruction decoder. That is what is turning a Bulldozer's module implementation into a glorified HyperThreading core. Later designs with separate instruction decoders could reasonably be considered separate cores. Not Bulldozer.
I am not sure I agree. I worked on bit slice CPUs that did not even have instruction decode, and decode is not what I consider part of a 'core', but I can understand someone disagreeing. What I don't understand is why anyone who cared about performance would not have looked at benchmarks and the detailed technical information. I still think this is an abusive lawsuit.
"The problem with this case is that AMD were completely accurate but despite this were sued and had to make a pay out."
I agree. They were accurate in the description of the CPU as it did have 8 integer cores. The fact that they shared pretty much everything else between 2 was explicitly stated all over the place: In their marketing material, their technical documentation, all the reviews etc.
It is likely that the people who should be sued are the sales bods who sold the computers to non-technical people. They'll have pushed the "it's got 8 thingys, whereas that one's only got 4" line.
TBH in this case 8 core was fair - it did have 8 integer cores. It just had the module design where two of these were paired with a lot of shared front-end and back-end, and shared FPU. The multi-thread scaling (let's ignore that single threaded performance was, let's say, 'poor') was pretty good, at a time when Intel barely got anything from SMT, AMD was getting 80%.
This is just a shut-up-and-go-away payment. But AMD did leave themselves open with this design and how they marketed it.
"If you are advertising something, make sure its (sic) accurate"
Except that it was accurate. The FX series did have 4, 6 or 8 physical cores, it's just that each pair of cores shared some resources. The only reason AMD agreed to the settlement was to get the lawsuit out of the court system and be done with it.
It's elementary, my dear Dwarf.
It did not have 8 physical cores. It had four cores, each of which had two integer decoding units stuck in it. An integer processing unit is not a core. This is like having a four-room house, sticking 8 beds in those four rooms, and calling it an 8-bedroom house. A bed is not equal to a bedroom.
Only if you ignore absolutely everything AMD ever said about Bulldozer. They were quite explicit that the design was to have two integer cores sharing a single FPU since floating point work was in the minority for the vast majority of use cases. If we go with a house analogy, what AMD did was sell an eight bed/four bath; the complaint is essentially people completely ignoring the realtor and whining when things don't meet their preconceptions once the deal closes.
Normally, in a world that is just, a lawyer should be paid for his work, not get a share of the money that is supposed to go to the people who have been wronged.
But, this is the US of A, a country which has defined the term lobbying and set the example of how not to manage corporations or professional interests, so what else can you expect ?
"But, this is the US of A, a country which has re-defined the term lobbying and set the example of how not to manage corporations or professional interests, so what else can you expect ?"
FTFY (The term "lobbying" is actually defined by the physical layout of the Palace of Westminster, and the lobby is where constituents and others can meet their Member of Parliament.)
The agreement says the lawyers haven’t even discussed how much they are going to pay one another at this point but have kindly offered to “limit” their fees to no more than 30 per cent of the settlement fund - so $3.63m.
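For anyone who wants to check that figure, here is a minimal sketch of the arithmetic. The two inputs are the numbers quoted in the article; the function name is made up for illustration, and the "left over" amount assumes nothing else (administration costs, for example) comes out of the fund, which the article does not say.

```python
# Figures quoted in the article. The helper name is hypothetical.
SETTLEMENT_FUND_USD = 12_100_000   # "$12.1m"
FEE_CAP_PERCENT = 30               # "no more than 30 per cent"


def fee_cap(fund_usd: int, cap_percent: int) -> int:
    """Largest fee allowed under the stated cap, in whole dollars."""
    # Integer arithmetic avoids floating-point rounding noise.
    return fund_usd * cap_percent // 100


max_fees = fee_cap(SETTLEMENT_FUND_USD, FEE_CAP_PERCENT)

# Assumption: fees are the only deduction from the fund.
left_over = SETTLEMENT_FUND_USD - max_fees

print(f"Maximum fees: ${max_fees:,}")
print(f"Left in the fund if the cap is reached: ${left_over:,}")
```

So the $3.63m in the article is simply 30 per cent of $12.1m, leaving at most $8.47m for purchasers under that assumption.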
I did hit tips and corrections but a @mailto has nothing to launch on my device.
In this case, AMD had done nothing wrong. The chips did have eight cores, and each core was able to do floating-point arithmetic at full speed. Only vector processing was limited, for full-length vectors of the latest kind, by the shared floating-point unit. So are the cores of previous generations no longer "cores", because their vectors were only half the size?
The lawsuit was spurious, and it's unfortunate AMD had to pay some money to settle it, even though the poor performance of the Bulldozer chips for other reasons no doubt erodes some sympathy for AMD.
"The chips did have eight cores, and each core was able to do floating-point arithmetic at full speed."
This is patently not true. Full speed would mean using both FMAC units. When two "cores" use FP operations they effectively (i.e. on a rough average) use only one each. Sure, in most cases you won't be hammering both cores with FP workloads, but if your workloads are indeed FP heavy, then it performs as if every module were a single core.
BTW, I personally wouldn't get too worked up about the FPUs. In a comment above I clarify that in my opinion what turns BD into a 4-core architecture is the shared instruction decoder. It makes BD modules glorified HyperThreading cores. Only in later designs was this rectified, and those can reasonably (but not completely) be considered x2 core.
And if you go through CPU operations manuals, how many times do you find instruction combinations that limit instruction throughput due to design choices?
AMD examined code during the Bulldozer design process and concluded that the choices they made around combining some units made sense, based on code analysis and the increasing use of multithreading.
The reality was that for most multithreaded tests, Bulldozer performed as expected BUT it vastly underperformed versus Intel and previous generation chips in single-threaded workloads - not what you would expect if the issue was due to two cores actually being one.
The problem was as much down to a long pipeline design to allow higher clock speeds combined with poor cache performance - the arguments about "what is a core" and sharing of decode/FPU units likely did not affect performance for most users when compared to the pipeline/cache issues.
Bulldozer's real problem was the lack of performance versus the competition, and AMD are paying because it is a cheaper option than trying to fight the case on merit. $12m is probably less than a year-long legal fight.
FX 8350s are still widely used because they are overclockable to 4.3 GHz and higher without special cooling.
They did not underperform vs Intel except for single thread applications like games that were optimized for the security-vulnerable speculative execution of Intel. The AMD chips, from a consumer point of view, were a better value and better designed in retrospect. I still use 2 FX 8350s, one on a dev Windows 2012 R2 server and the other on a Win7 workstation used for programming streaming software, including OBS studio and Streamlabs OBS. It was used to develop the free, popular OBS-shaderfilter plugin. That means the system runs multiple browser sources, graphics heavy software with shaders, web browsers, multiple web cams, video encoding/NDI, chatbot (python based), game controller scripting software and plays a modern video game (60fps) at the same time. Intel chips from 2013-2014 can't do it.
This lawsuit is insane.
The 83xx CPUs are second generation (effectively third, as there were fixes deployed to gen 1) and address the cache issues and process issues affecting frequency scaling that were present in the first two generations.
It's what Bulldozer could have been in generation 1 if mistakes hadn't been made and Global Foundries had a working 32nm process.
Imagine you have a multithreaded FP-heavy workload. With Bulldozer, if you run it with 4 threads or with 8 threads you'd get roughly similar performance. The 8-thread run would probably be slightly faster because it would allow better utilization of the FPUs, but OTOH it could thrash the shared L2 cache. Still, going from 4 threads to 8 threads will not result in anything akin to 2x the performance (disregarding Amdahl's law for the time being).
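To put rough numbers on that argument, here is a toy model, not a benchmark. It assumes 4 modules, each with two integer cores sharing two FMAC units (as described elsewhere in this thread), that threads are spread one per module first, and that a single FP-heavy thread can keep both of its module's FMAC units busy. All names and the idealised scaling are assumptions made for illustration; cache effects are ignored entirely.

```python
# Toy model only: idealised, no caches, no scheduler quirks, no measurements.
MODULES = 4                 # an "8-core" FX part is built from 4 modules
FMACS_PER_MODULE = 2        # FP units shared by the module's two integer cores
INT_CORES_PER_MODULE = 2    # each module has two integer cores


def threads_per_module(threads):
    """Spread threads as evenly as possible, one per module first."""
    base, extra = divmod(threads, MODULES)
    return [base + (1 if m < extra else 0) for m in range(MODULES)]


def fp_throughput(threads):
    """FP-bound work: a busy module is capped by its shared FMAC units."""
    return sum(FMACS_PER_MODULE if n > 0 else 0
               for n in threads_per_module(threads))


def int_throughput(threads):
    """Integer-bound work: each thread gets its own integer core, up to two per module."""
    return sum(min(n, INT_CORES_PER_MODULE)
               for n in threads_per_module(threads))


fp_gain = fp_throughput(8) / fp_throughput(4)
int_gain = int_throughput(8) / int_throughput(4)

print(f"FP-heavy, 4 -> 8 threads: {fp_gain:.1f}x")
print(f"Integer,  4 -> 8 threads: {int_gain:.1f}x")
```

In this idealised picture the FP-bound workload gains nothing from the extra four threads while the integer-bound one doubles, which is the gap the two sides of this thread are arguing over.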
Now, there's always the possibility that for the 4-thread scenario a shitty OS would schedule the threads on every "core" of 2 modules, rather than on a single "core" in each of 4 modules. That would suck on its own, although it would provide the 2x boost you'd expect - but it has nothing to do with the point that I'm trying to make. Bulldozer was not an 8 core processor for any meaningful definition of the term. And I say this as an AMD fanboy. I hated them because they made me buy Intel CPUs. Only in the last 2 years have I returned to considering AMD for CPUs and I have advised the purchase of several Ryzen systems and built a ThreadRipper for myself. AMD's comeback is even more impressive because of the Bulldozer (and to a large extent the later iteration) fiasco. Part of which is the fake core-ness.
You are trying to fit the design to your evidence - while I don't dispute that FP workloads would exhibit terrible performance with 256-bit calculations, this was an AVX (128-bit) capable processor. The 256-bit issues came largely from competing with Intel's AVX2 CPUs.
In single-core operations, gen-1 Bulldozers were slower than their predecessors (Phenom II) and Intel 2XXX CPUs but performed about where they were expected to perform in multicore workloads. If they really didn't have 2 cores per module would you expect this?
Looking purely at FPU, the Phenom-II design was significantly less capable than the Bulldozer design but outperformed it? Because the long pipeline and poor cache design meant that the FPU units weren't being fed - they were starved, not overworked.
Effectively AMD had multiple issues:
- they designed the CPU for clock scaling (long pipeline vs Phenom/Intel short pipelines) which meant they needed ~50% more clockspeed to achieve the same instruction throughput as previous generations. (i.e. they tried a long Pentium4-type pipeline to scale clock speed higher when that approach had allowed AMD to badly hurt Intel previously...)
- Global Foundries process sucked. It was hot and didn't scale to the frequencies promised.
- AMD designed the CPU for a move to heavy multithreading when PC performance is still largely classified by single-core performance
- AMD had to reduce cache sizes to allow for high frequency, only they broke the cache efficiency due to design issues, resulting in poor cache hit rates and high latencies for cache and memory access. These were tweaked in some parts and fixed in gen-2 (83xx) to the point where the 83xx chips are pretty reasonable.
Combine those underlying issues with poor scheduling on Windows/Linux and a number of other minor bugs that needed fixed and it meant Bulldozer sucked compared to Intel. But it was definitely a 2 core/compute block design that scaled to 8 cores/4 compute blocks for all reasonable definitions of "cores" used across the industry.
The one positive from Bulldozer is it made AMD fundamentally reassess its direction and allowed the technical people to produce Zen with little interference from other parts of the organisation, so we now have EPYC/Ryzen.
FPU wasn't the only thing shared between cores; the penny pinching design crippled the stupid things in many real life applications.
After finding my shiny new FX needed to be O/C'ed to 5GHz to achieve what my old Phenom II could do at 3.6GHz, I went back to the Phenom; and only replaced it last month, with a new Ryzen system.
"FPU wasnt the only thing shared between cores; the penny pinching design crippled the stupid things in many real life applications."
You're correct, but sharing components between cores wasn't the major issue.
As you note, gen-1 FX's needed around 50% more clock speed to achieve the same performance as Phenom II chips for many workloads. This was because AMD focussed on frequency scaling rather than IPC. The design scaled to ~8GHz (with liquid N cooling...) far exceeding the scaling of other processor designs, but IPC was much lower.
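As a back-of-the-envelope check on those clock figures, here is a sketch using the usual rule of thumb that throughput is roughly IPC multiplied by clock speed. That rule is a simplification assumed for illustration (it ignores memory and cache effects), and the function name is hypothetical; the 5GHz, 3.6GHz and 50% inputs are the ones quoted in this thread.

```python
def relative_ipc(old_clock_ghz, new_clock_ghz):
    """IPC of the new chip relative to the old one, assuming both deliver
    the same throughput and throughput ~ IPC * clock (a simplification)."""
    return old_clock_ghz / new_clock_ghz


# "around 50% more clock speed to achieve the same performance"
ipc_from_50_percent = relative_ipc(1.0, 1.5)

# "FX needed to be O/C'ed to 5GHz to achieve what my old Phenom II could do at 3.6GHz"
ipc_from_overclock = relative_ipc(3.6, 5.0)
extra_clock_needed = 5.0 / 3.6 - 1.0

print(f"IPC implied by +50% clock: {ipc_from_50_percent:.2f} of Phenom II")
print(f"IPC implied by 5.0 vs 3.6 GHz: {ipc_from_overclock:.2f} of Phenom II")
print(f"Extra clock in the 5.0 vs 3.6 GHz case: {extra_clock_needed:.0%}")
```

Under that simplification the two anecdotes land in the same region: Bulldozer doing somewhere around two thirds to three quarters of the work per clock that Phenom II did.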
AMD's assumption was that higher frequencies and more cores with more functionality would beat IPC and the less capable hyperthreading. The reality was that they made a lot of mistakes, effectively eliminating a lot of the things they had done well in the past (cache and front end efficiency) while ignoring the things they did badly (microop retirement, IPC, AVX/AVX2). While it was a learning experience that almost bankrupted AMD, it appears to be paying dividends with Zen - they have learnt from their mistakes.
Combine the design decisions/design errors with GlobalFoundries taking a long time to have decent 32nm process design rules and you have a lot of pain for AMD for the 4-year lifetime of the Bulldozer core and its variants.
With the P4 design, Intel tried to scale frequency for "easy" performance scaling but was unable to compete with AMD's K8 architecture that relied on a high instruction rate per clock (IPC) to outperform P4. Intel believed that the long-pipeline P4 architecture was still viable but required further process shrinks to allow them to add more features to compensate for pipeline stalls.
Intel came back with the Core/Core2 architecture that allowed Intel to outperform AMD due to superior manufacturing with similar designs. New process delays meant AMD were launching 45nm parts as Intel were transitioning to 32nm, which gave Intel a significantly larger transistor budget for core designs in addition to advantages in volume and yield (rumoured but very likely to be true given problems at AMD and the eventual spinout of GF).
AMD then chose to increase core numbers to try and improve performance when they couldn't compete using similar designs. The design wasn't terrible, but it was executed poorly in both the design phase (cache hit rates and memory latency issues) and manufacturing (initial design rules for SOI ran hot and didn't yield well), both of which crippled AMD's Bulldozer core.
Moving onto the current AMD offering (Zen), AMD have gone back to doing what they do well (short pipelines, lots of cache and high IPC, fast inter-core connects) and they have executed very well. Intel on the other hand look stuck with a broken 10nm process and have tried to use features to distinguish their product lines while they are stuck on 14nm.
AMD likely have 1-2 years (2020-2021) to reshape the x86 market before Intel can come back with a competitive manufacturing process. Assuming TSMC continues at their current pace on the manufacturing side and AMD make no major mistakes, it could be an interesting fight.
So back to the question: Is there another AMD sueball waiting to happen? Sure... this lawsuit was based on mistakes resulting in a lack of performance and it's possible there are further mistakes present with the Zen architecture BUT it depends on the impact of those - we haven't seen anything significant in ~3 years versus the Bulldozer issues showing up in initial reviews.
If any sueballs are lobbed around as a result of this case, I would expect Intel's Meltdown and Spectre variants to be the targets as they really do result in cores needing to be disabled to fix the issue (see MS/VMware/etc recommendations for addressing Spectre issues).
I often wonder: the "Quad Core" in my laptop APU does not seem to be that fast at all.
Yet a full fat Phenom 2 quad core of the same speed (S1G4) is loads faster according to my measurements.
Could this be due to heat? The APU is essentially a combined CPU and GPU on the same silicon, so though the GPU portion will emit heat it is typically a fraction of that from the CPU.
AMD's APUs have been very dependent on memory bandwidth to perform well as both the CPU and GPU use system memory rather than the GPU having its own dedicated memory.
Because a lot of AMD's APUs ended up in low end laptops, cheap RAM and poor chipset/motherboard designs meant that both the CPU and GPU struggled to get the bandwidth they required to perform well. Add in any thermal throttling and it gets even worse.
I think the problem here is that the word "Core" is actually a marketing term. What is a "Core"? - if someone says "It can process 2 threads simultaneously" then that's a specific, testable claim. But since the term "Core" doesn't have a specific technical meaning, it's a little harder to test the claim since the speaker is allowed to define what a "core" is in terms of the product under discussion.
Core as in ALU, so essentially integer math and logic.
But that would mess up the counting since cores have had multiple integer units going back as far as the original Pentium. You wouldn't call that a dual core processor. I prefer the simultaneous thread metric proposed earlier.
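The simultaneous-thread count is at least something software can ask for directly. A minimal sketch in Python: `os.cpu_count()` returns the number of logical CPUs the operating system exposes (or `None` if it cannot tell), and on Linux `os.sched_getaffinity(0)` returns the set this process is allowed to run on. Neither call says anything about what is shared underneath, so this only reports the OS's view and does not settle what a "core" is.

```python
import os

# Logical CPUs as exposed by the operating system; may be None if undetermined.
logical = os.cpu_count()

# On Linux, the CPUs this process may actually be scheduled on.
# sched_getaffinity is not available on every platform, hence the check.
if hasattr(os, "sched_getaffinity"):
    usable = len(os.sched_getaffinity(0))
else:
    usable = logical

print(f"Logical CPUs reported by the OS: {logical}")
print(f"CPUs usable by this process:     {usable}")
```

Whether each of those logical CPUs deserves to be called a core is exactly the part the OS cannot answer.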
What is a core? Go back to when we had single core CPUs - THAT's a core. Why is everyone pretending that "core" is some made-up word? You're not allowed to define what a core is. It means a whole CPU. A two-core CPU has two CPUs on the same package, not bits and pieces of a CPU.
All 8 worked for me. I can run 8 chess games simultaneously. And there are not 4 at a higher level than the others. Chess is integer not float. Still, wasn't that impressive as compared to the best Intel stuff...but I did not pay for the best Intel stuff either.
Most of the big talk about Bulldozer was way before launch. Everyone who wanted to look, could look at benchmarks. I honestly don't think the lawyers had a good case. But 12m probably does not hurt them too bad either. But then there are 49 more states. People generally don't buy CPUs for floating point. Float is better done by the GPU. That is likely why they had fewer float units.
I used my computer to play literally millions of games of chess for Stockfish development. Most of that independent testing of ideas, but some on their Framework.
I own AMD stock...so I am disclosing that. But everything I said was factual. http://tests.stockfishchess.org/users 549300 games run there. But like I said, most of the games I ran were not on the Framework.
From experience in the web hosting industry, I found the 16-core Bulldozer chips (Interlagos, 6200 series) marginally inferior to the 12-core K10-based (Magny Cours, 6100 series), especially when adjusted for clock speed. Piledriver (Abu Dhabi, 6300 series) did seem to provide a measurable improvement over Bulldozer, but the Magny Cours machines always seemed a bit smoother.
Biting the hand that feeds IT © 1998–2019