July 18, 2024
Will A.I. Go the Way of Bitcoin Mining?
Introduction
There was a robust conversation in a Twitter Spaces recently where participants went through the many facets of AI and the pros, cons, and unknowns of Bitcoin mining companies making the “pivot” into High-Performance Computing / Artificial Intelligence (A.I.). The discussion drew attention to which companies were pursuing legitimate advancements and which were making grandiose claims. With that in mind, a grandiose claim is circulating on Twitter today in the form of a post from “Etched”. Full Article
The post claimed the Etched Sohu chip was “the fastest AI chip of all time” (a fairly bold statement, to say the least), and given the nature of the claim, it is worth examining with some context. First, any claims made about a newly spec’d chip with no benchmarks other than the manufacturer’s should be treated with a healthy level of skepticism. Second, any claims of a new, unproven technology leading to the obsolescence of an “old,” proven technology are typically premature, for several reasons:
Evolving Ecosystem: The AI Ecosystem is rapidly changing, with new model architectures and hardware solutions constantly emerging, making long-term hardware predictions challenging.
Flexibility vs. Efficiency: GPUs offer flexibility for a wide range of AI tasks, while specialized hardware like LPUs and TPUs / ASICs provide higher efficiency for specific tasks.
Continued Relevance of GPUs: GPUs will remain relevant due to their versatility. Unless a fundamental shift in LLM design or a related step function of AI development takes place, this versatility will allow them to be used for emerging AI architectures and tasks that specialized hardware may not support.
This Doesn’t Sound Like ChatGPT
The press release from Etched stated “We’ve spent the past two years building Sohu, the world’s first specialized chip (ASIC) for transformers (the ‘T’ in ChatGPT).” This is a very important claim as it touches on the very foundation of how A.I. (in its current form) “works”.
GPT stands for “Generative Pretrained Transformer,” which is an architecture based on neural networks. This neural network structure is used for text generation, machine translation, and text classification, collectively known as “Natural Language Processing tasks.” OpenAI built GPT from the ground up into one of the largest language models, with over 200 billion parameters. Parameters are the numerical values learned from data during the training phase that dictate how a neural network handles input data and generates output data. As the number of parameters increases, a model becomes more intricate and capable, allowing it to process larger and more complex datasets.
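To make “parameters” concrete, here is a toy sketch (the layer sizes are made up for illustration and have nothing to do with GPT’s actual architecture) counting the learned weights and biases of a small fully connected network:

```python
# Toy illustration: counting the learned parameters (weights + biases)
# of a small fully connected network. Layer sizes are invented for
# this example; real LLMs have billions of such values.
layer_sizes = [512, 256, 128, 10]  # input -> hidden -> hidden -> output

total_params = 0
for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
    total_params += n_in * n_out  # one weight per input/output pair
    total_params += n_out         # one bias per output unit

print(total_params)  # 165514
```

Even this tiny network has over 165,000 parameters; scale the layer sizes up and the counts quickly reach the billions that models like GPT are known for.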
The neural network of GPT is a “transformer” model that adopts self-attention mechanisms to process input text and generate output text. A simpler way of thinking about it is that these attention mechanisms are meant to mimic human cognition, where we pay more attention to certain inputs while ignoring or diminishing the importance of other inputs based on the information we receive. Simply put, Etched claims the Sohu chip is designed to address one of the fundamental building blocks of A.I. more effectively than any device or system to date.
What Does All That Mean?
That can be a lot to digest. For readers unfamiliar with Artificial Intelligence, the distinctions between different technologies, and the hardware that drives them, a brief overview of the popular technologies and why they matter is beneficial. In very general terms:
GPUs (Graphics Processing Units): GPUs are extremely versatile across a wide range of applications due to their specialized architecture, which is well-suited for parallel matrix processing of mathematical operations across many cores. They are excellent for general-purpose AI workloads, including training and inference. The best-known GPUs are made by Nvidia.
LPUs (Language Processing Units): LPUs like those developed by Groq are fundamentally different from GPUs in their architecture and optimization for specific tasks. LPUs are purpose-built for sequential processing, which is crucial for natural language processing (NLP) tasks. The Groq LPU uses a Tensor Streaming Processor (TSP) architecture, which processes data tasks in a sequential, organized manner, similar to an assembly line. LPUs can offer higher performance and energy efficiency for specific language tasks but are less flexible than GPUs for other types of A.I. workloads.
TPUs (Tensor Processing Units): TPUs are a specialized form of application-specific integrated circuit (ASIC) created by Google to enhance the performance of machine learning tasks. While GPUs were initially engineered for rendering graphics and subsequently repurposed for artificial intelligence applications, TPUs were designed explicitly to meet the unique requirements of machine learning from the outset. This targeted design strategy has produced a hardware platform that is particularly proficient at performing tensor operations, which are essential components of numerous A.I. algorithms. The best-known example of a TPU is Google’s Cloud TPU v4. TPUs have a more specialized architecture, with systolic arrays optimized for matrix operations, which are fundamental to many machine learning algorithms.
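The “tensor operations” all three chip families accelerate largely boil down to big matrix multiplications. A minimal NumPy sketch (sizes are toy values chosen for illustration):

```python
import numpy as np

# The core workload these accelerators target: multiplying large matrices.
# One layer of a neural network is essentially one such multiplication.
batch, n_in, n_out = 32, 1024, 1024   # toy sizes, purely illustrative
activations = np.random.rand(batch, n_in).astype(np.float32)
weights = np.random.rand(n_in, n_out).astype(np.float32)

# On a GPU or TPU, this one line is spread across thousands of parallel
# compute units; a systolic array streams the operands through a grid of
# multiply-accumulate cells instead.
outputs = activations @ weights
print(outputs.shape)  # (32, 1024)
```

The hardware differences described above are, at bottom, different strategies for executing that single line as fast and as efficiently as possible.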
What Makes Sohu By Etched Different
Sohu is an Application-Specific Integrated Circuit (ASIC) designed exclusively for transformer models. Unlike GPUs, TPUs, or LPUs, which are more general-purpose A.I. “accelerators,” Sohu cannot run other types of A.I. models, and this extreme specialization allows for much higher performance on transformer-based models. Without digging into the technicals of the chip in a manner far beyond the scope of this article, Sohu’s specialization for transformer models allows for higher compute density and simplified software optimization compared to those more general-purpose accelerators. However, this specialization comes at the cost of flexibility.
Bitcoin’s Specialized Model Inspired A.I. Innovation
Detailing the differences between GPUs, LPUs, and TPUs/ASIC Transformers, and exploring their roles in shaping the future of AI processing highlights intriguing parallels between Large Language Models (LLMs) and Bitcoin. For comparison purposes, one can think of LLMs as the “Layer 1” of the AGI Cognitive “Chain” (CogChain). The GPUs, LPUs, TPUs, and specialized Transformer ASICs can be likened to Bitcoin Mining Rigs. Future machine learning models, whether they are LLMs or otherwise, will serve as “Layer 2”, providing increased functionality, speed, and applications on top of “Layer 1”. This analogy may seem confusing at first, but breaking it down can help with understanding:
Foundational vs. Specialized: LLMs can serve as a foundational layer for many AI applications, much like Bitcoin’s blockchain serves as the base layer for the Bitcoin network. This foundational layer underpins the overall structure and functionality, providing a solid base upon which additional layers can build.
Hardware Specialization: Just as specialized hardware (ASICs) has emerged for Bitcoin mining to optimize efficiency and performance, specialized hardware for AI tasks is also beginning to appear. GPUs (Graphics Processing Units), LPUs (Language Processing Units), and TPUs (Tensor Processing Units) are examples of this trend. These specialized units are designed to handle specific tasks more efficiently than general-purpose hardware.
Evolving Ecosystem: The AI ecosystem is rapidly evolving, with new model architectures and hardware solutions constantly emerging. This rapid change makes long-term hardware predictions challenging, similar to the ever-growing and changing Bitcoin ecosystem. The pace of innovation in AI mirrors the dynamic nature of the cryptocurrency space.
GPUs and Bitcoin Aren’t Going Anywhere
In conclusion, while specialized AI hardware might offer significant performance improvements for specific tasks, GPUs will remain important due to their flexibility and ability to adapt to new AI models and architectures. As these models continue to evolve and develop, the AI hardware landscape will likely diversify further. Different hardware solutions will coexist to meet various needs, much like the emergence of different layer 2 solutions alongside Bitcoin. This diversified approach ensures that a range of tasks and applications can be effectively supported, reflecting the multifaceted nature of both AI and Bitcoin.
For more information, please visit swan.com.
Adrian Morris has been lurking in the shadows of the Bitcoin space for over a decade. Writing has always been a passion of his, as he writes engaging and thought-provoking pieces.
Adrian was raised in the streets of Brooklyn, New York and now resides in Arizona. His life mission is to raise awareness of Bitcoin and how it can bring about a bright-orange future.