By Christian Prokopp on 2023-01-20
How Bold Data achieved an astonishing 2.3x improvement in throughput per dollar by switching from x86 to ARM.
Measure, measure, measure, and do the math. It pays.
Generally, AWS ARM EC2 instances perform similarly to their x86 equivalents. AWS prices them that way with a bit of favour towards ARM. But they are different architectures, and sometimes, one excels over the other in specific tasks.
I recently switched an EC2 fleet from x86 to ARM as a sense check, expecting similar performance. Occasional experiments allow you to (re)check assumptions if you always measure and have baselines.
To my surprise, there was a substantial performance difference of approximately 70% for the specific workload on this fleet, i.e. the ARM instances delivered about 170% of the throughput of their x86 equivalents, both Intel and AMD (see above image).
While an impressive improvement, it does not yet account for the price difference. Always watch spot prices, as they can spike and exceed on-demand costs. Sometimes ARM, while having a lower initial cost, can exceed x86 spot prices. However, in this case, the ARM instances cost 75% of the x86 equivalents I used before.
In summary, we achieved about 2.3x the throughput per dollar by switching from x86 to ARM when combining the higher throughput (1.7x) with the lower cost per instance (0.75x): 1.7 / 0.75 ≈ 2.3. The caveat from the post should be clear. Improvements depend entirely on your workload and instance (spot) prices and are not a general rule. It is a reminder that measuring, observing and experimenting with your performance and cost is essential.
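The arithmetic behind the combined figure can be sketched in a few lines of Python. This is only an illustration using the two ratios quoted in this post; the function name `throughput_per_dollar_gain` and its parameters are made up for the example, and you would substitute your own measured throughput and current prices.

```python
def throughput_per_dollar_gain(relative_throughput: float, relative_cost: float) -> float:
    """Return how much more work per dollar the new fleet does versus the baseline.

    relative_throughput: new throughput divided by baseline throughput (e.g. 1.7)
    relative_cost: new instance price divided by baseline instance price (e.g. 0.75)
    """
    if relative_cost <= 0:
        raise ValueError("relative_cost must be positive")
    return relative_throughput / relative_cost


# Ratios from this post: ARM at ~170% of x86 throughput and 75% of the x86 price.
gain = throughput_per_dollar_gain(relative_throughput=1.70, relative_cost=0.75)
print(f"{gain:.2f}x throughput per dollar")  # 2.27x, i.e. roughly 2.3x
```

Because spot prices move, the cost ratio is worth recomputing whenever you re-run the comparison; the same throughput gain at price parity would yield only 1.7x.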
Christian Prokopp, PhD, is an experienced data and AI advisor and founder who has worked with Cloud Computing, Data and AI for decades, from hands-on engineering in startups to senior executive positions in global corporations. You can contact him at christian@bolddata.biz for inquiries.