As stated on the ProofofResearch Twitter account a few weeks ago, my position on Deepbrain Chain is that the idea they are proposing is implausible and cannot be realized. A wealth of research attests to this as well.
Below is all of the data and research that I have aggregated on this topic to date, presented to support my claim that Deepbrain Chain’s idea is bunk.
Research Paper Introduction
Peer-reviewed research article on the topic of neural networks + parallel computing:
“This paper presents a theoretical analysis and practical evaluation of the main bottlenecks towards a scalable distributed solution for the training of Deep Neural Networks (DNNs). The presented results show, that the current state of the art approach, using data-parallelized Stochastic Gradient Descent (SGD), is quickly turning into a vastly communication bound problem. In addition, we present simple but fixed theoretic constraints, preventing effective scaling of DNN training beyond only a few dozen nodes. This leads to poor scalability of DNN training in most practical scenarios.”
The first important takeaway from this article is that:
“Current models take several ExaFLOP to compute, while processing hundreds of petabyte of data.”
This is the main constraint on the idea that $DBC proposes. The issue with parallel computing on a blockchain is the amount of data that the chain would need to handle in order to facilitate a global neural network (which is what $DBC proposes). There is simply no innovation in blockchain technology that allows for the storage of multiple petabytes of data per block.
In a decentralized environment, not only would the blockchain need to be capable of handling petabytes of data, but so would all of the other nodes on the network. The latency of such a system would make it utterly impossible for $DBC to do what it proposes.
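To put the mismatch in rough numbers, here is a back-of-envelope sketch. The block size and interval are Bitcoin’s classic parameters; the 1 PB figure is a deliberately conservative lower bound on the “hundreds of petabytes” cited in the paper above:

```python
# Back-of-envelope: how long would a Bitcoin-like chain take to absorb 1 PB?
BLOCK_SIZE_MB = 1            # Bitcoin's classic block size limit
BLOCK_INTERVAL_MIN = 10      # Bitcoin's target block interval
DATASET_PB = 1               # conservative lower bound from the quoted paper

dataset_mb = DATASET_PB * 1_000_000_000   # 1 PB in MB (decimal units)
blocks_needed = dataset_mb / BLOCK_SIZE_MB
years_needed = blocks_needed * BLOCK_INTERVAL_MIN / (60 * 24 * 365)

print(f"Blocks needed: {blocks_needed:,.0f}")
print(f"Years at one block per 10 minutes: {years_needed:,.0f}")
```

Even this lower-bound scenario works out to roughly a billion blocks and on the order of nineteen thousand years of chain time, before any training computation has happened at all.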
Evaluating the Scalability of Decentralized Deep Neural Network Systems
Let’s take a look at this figure from the aforementioned research paper:
As we can see, each network depicted above suffers greatly in performance as more nodes are added.
Essentially, what the graph above shows is an increase in the latency of the neural network.
Latency above a certain point (and this threshold is rather unforgiving) will compromise a neural network’s training.
Latency and Neural Networks
In order to understand this portion of the research, it is important that we first define latency and neural networks.
Defining and Understanding Latency in Blockchain
Latency, in this sense, refers to the delay in relaying information from one node to another. It is a factor in blockchains (especially decentralized ones) that delays the propagation of blocks. This is why certain transactions may be received by some nodes a bit quicker than by others on the Bitcoin network. Latency is also why there is sometimes a delay in the transmission of an acceptable block on the Bitcoin protocol (i.e., one that has the appropriate nonce value/PoW and conforms to the protocol’s standards).
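A rough way to see why this delay compounds in a decentralized network: in a gossip network, each hop multiplies the number of nodes reached, so a block needs on the order of log(N) hops to reach everyone, and per-link latency stacks up accordingly. The fanout and per-link latency below are invented for illustration, not measurements of Bitcoin:

```python
import math

def propagation_estimate(num_nodes, fanout, link_latency_ms):
    """Rough gossip estimate: each hop multiplies the reach by `fanout`,
    so a block needs about log_fanout(num_nodes) hops to reach all nodes."""
    hops = math.ceil(math.log(num_nodes, fanout))
    return hops * link_latency_ms

# Illustrative numbers only: 10,000 nodes, fanout of 8, 100 ms per link
print(propagation_estimate(10_000, 8, 100), "ms")  # → 500 ms
```

Half a second to reach the whole network is fine for payments; it is orders of magnitude too slow for the millisecond-scale exchanges deep learning requires, as we will see below.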
This is the primary reason for orphaned blocks, which occur when two blocks are submitted at the “same time”.
Below is a great article from Hackernoon explaining latency in blockchains:
How long does it take for a transaction to be confirmed for the first time and does it ever reach an irreversible state? During the past 9 years we’ve seen several approaches to solving the general problem of distributed ledgers.
One of the primary contributing factors to increased latency in blockchain is the size of the blocks.
This is corroborated by research provided in a paper titled, ‘On Scaling Decentralized Blockchains’:
There are a few quotes from this piece specifically that we’re going to isolate in order to make this point clear.
“A critical reference point for Bitcoin’s performance is Decker and Wattenhofer’s 2012 measurement study of the Bitcoin network’s block propagation. At the time, the median and 90-percentile time for Bitcoin nodes to receive a block was 6.5 seconds and 26 seconds respectively. This study also showed that for small blocks, less than roughly 20KB in size, latency was a significant factor in block propagation times. Beyond this size, throughput was the dominating factor, and was invariant in block size; thus they found that for large enough blocks the block propagation times grew linearly with respect to the block size.”
Let’s also observe the Network Propagation Rate chart provided by the researchers in this piece, as it gives a better idea of the impact of block size on the latency of a decentralized blockchain.
This quote from the researchers, however, serves as the most important:
“To improve the system’s latency, we can in principle simply reduce the block interval. To do so while retaining high effective throughput, however, would also require a reduction in the block size.”
These researchers, like many others, observed through direct experimentation in testnet environments that it is virtually impossible to reduce latency on the network without either:
a) reducing both the block interval and the block size, or
b) sacrificing the network’s effective throughput
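The relationships at play can be sketched numerically. Effective throughput is block size divided by block interval, and (per the Decker and Wattenhofer measurements quoted above) propagation time grows roughly linearly with block size once blocks are large. The constants below are invented for illustration, not measured values:

```python
def throughput_kb_per_s(block_size_kb, block_interval_s):
    """Effective throughput: data confirmed per second of chain time."""
    return block_size_kb / block_interval_s

def propagation_time_s(block_size_kb, base_latency_s=2.0, s_per_kb=0.02):
    """Linear model in the spirit of Decker/Wattenhofer: a fixed latency
    term plus a per-KB term that dominates for large blocks.
    The constants here are illustrative, not measured."""
    return base_latency_s + s_per_kb * block_size_kb

# A 1 MB block every 10 minutes confirms under 2 KB of data per second:
print(throughput_kb_per_s(1000, 600))
# Small blocks are latency-dominated; large blocks are size-dominated:
print(propagation_time_s(20))     # 20 KB block
print(propagation_time_s(1000))   # 1 MB block
```

The point of the sketch: pushing any one of the three knobs (latency, throughput, block size) in a favorable direction pushes at least one of the others in an unfavorable one.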
Therefore, the challenge that $DBC is attempting to conquer in facilitating such a neural network is a substantial one, if not an impossible one. Again, we have never witnessed any example in which a decentralized blockchain has been able to handle more than a few MB of data, let alone anything in excess of a gigabyte, terabyte, or petabyte. Doing so while maintaining a network with under 10 ms of latency is absolutely impossible at this point in time, and there is no conceivable model for how such a feat would be accomplished.
Defining and Understanding Neural Networks
“A neural network is a type of machine learning which models itself after the human brain. This creates an artificial neural network that via an algorithm allows the computer to learn by incorporating new data.”
So, essentially, it is a network of computers all working together to learn (via machine learning) how to complete one specific task.
Another helpful quote from the Techradar piece to understand neural networks:
“Unlike other algorithms, neural networks with their deep learning cannot be programmed directly for the task. Rather, they have the requirement, just like a child’s developing brain, that they need to learn the information.”
The article then isolates the various methods by which computers (GPUs in this case) are able to ‘learn’ in these neural networks.
We won’t dive too deeply into neural computing and how that works but what has been written thus far should give an accurate, general idea of what it is.
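To make “a network that learns a task” concrete, here is a minimal sketch of a single artificial neuron, the smallest possible stand-in for a deep network, learning the logical AND function in pure Python. It is illustrative only; real deep networks stack millions of such units:

```python
# Minimal artificial neuron (perceptron) learning logical AND.
def step(x):
    return 1 if x > 0 else 0

weights = [0.0, 0.0]
bias = 0.0
lr = 0.1  # learning rate

data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

for _ in range(20):  # repeated passes over the data: the "learning"
    for (x1, x2), target in data:
        out = step(weights[0] * x1 + weights[1] * x2 + bias)
        err = target - out
        weights[0] += lr * err * x1   # nudge weights toward the target
        weights[1] += lr * err * x2
        bias += lr * err

print([step(weights[0] * x1 + weights[1] * x2 + bias) for (x1, x2), _ in data])
# learned outputs match the AND targets: [0, 0, 0, 1]
```

Nothing here was programmed with the rule for AND; the weights were adjusted from examples, which is exactly the “learning like a child’s developing brain” described in the Techradar quote.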
Stochastic Gradient Descent
Let’s head back to the first peer-reviewed research piece that we posted about decentralized, deep neural networks (DNNs).
Specifically, we want to look at this quote:
“Deep Neural Networks are trained using the Backpropagation Algorithm. Numerically, this is formulated as a highly non-convex optimization problem in a very high dimensional space, which is typically solved via Stochastic Gradient Descent (SGD).”
This is relevant for a number of reasons:
- Each ‘step’ that you see in the figures above represents another ‘transaction’ that would need to be processed on the chain.
- These ‘transactions’ must happen within milliseconds of one another and there must be little to no latency between each.
- These ‘transactions’ are accompanied by a LOT of data (remember, this is deep learning; information is being processed back and forth continuously).
- This is feasible in a synchronous environment; but this is borderline impossible in an asynchronous environment.
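The step/“transaction” analogy above can be made concrete with a minimal, non-distributed SGD sketch fitting the toy function y = 2x. Every weight update below is one ‘step’; in a data-parallel setting, each such step would additionally require a round of gradient exchange between nodes:

```python
import random

random.seed(0)
w = 0.0    # single model parameter; the true value is 2.0
lr = 0.001 # learning rate
data = [(x, 2.0 * x) for x in range(1, 11)]  # samples of y = 2x

steps = 0
for _ in range(100):                 # epochs
    random.shuffle(data)             # the "stochastic" part of SGD
    for x, y in data:
        grad = 2 * (w * x - y) * x   # gradient of the loss (w*x - y)^2
        w -= lr * grad               # one update step...
        steps += 1                   # ...= one communication round if distributed

print(round(w, 3), steps)  # → 2.0 1000
```

Even this toy problem takes 1,000 sequential update steps. A real DNN takes millions, and on a blockchain each one would be a data-heavy, latency-sensitive ‘transaction’.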
Synchronous vs. Asynchronous Environments
For those that do not know, blockchains are either ‘synchronous’ or ‘asynchronous’. While there has been much debate about the difference between the two in modern blockchain development, most scholars/researchers would agree that a truly decentralized blockchain must be asynchronous.
If you’re wondering what the difference is between ‘synchronous’ and ‘asynchronous’ in computing, one answer given on StackExchange’s site sums it up perfectly.
As an answer to the question of ‘What’s the difference between synchronous and asynchronous?’, the answerer stated:
“When you execute something synchronously, you wait for it to finish before moving on to another task. When you execute something asynchronously, you can move on to another task before it finishes.”
The definition is simple, but coherent and it fits for what we’re going to explain in terms of blockchain.
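That StackExchange definition maps directly onto code. Here is a minimal Python sketch of the difference (illustrative only; the 0.1-second “tasks” are just sleeps):

```python
import asyncio
import time

def sync_tasks():
    # Synchronous: each task must finish before the next one starts.
    time.sleep(0.1)   # task A
    time.sleep(0.1)   # task B

async def async_tasks():
    # Asynchronous: both tasks run concurrently; we move on before either finishes.
    await asyncio.gather(asyncio.sleep(0.1), asyncio.sleep(0.1))

start = time.time(); sync_tasks(); sync_time = time.time() - start
start = time.time(); asyncio.run(async_tasks()); async_time = time.time() - start
print(f"sync: {sync_time:.2f}s, async: {async_time:.2f}s")  # ~0.20s vs ~0.10s
```

The synchronous version waits on each task in turn; the asynchronous version overlaps them, which is exactly the property we will apply to blockchains below.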
Synchronous vs. Asynchronous in Blockchain
Bitcoin is a perfect example of an asynchronous system.
This is because transactions on Bitcoin are never in a state of ‘settlement finality’.
Blockchains such as $NEO, which rely on dBFT (Delegated Byzantine Fault Tolerance), are synchronous.
Below, is a figure from a research paper titled, ‘The Quest for Scalable Blockchain Fabric: Proof-of-Work vs. BFT’.
As we can see in the chart above, ‘synchrony’ is needed for ‘liveness’ on BFT consensus-based chains (i.e., ones that utilize PoS).
This is enumerated further in the research piece when it states:
“The BFT approach to consensus typically requires every node to know the entire set of its peer nodes participating in consensus. This in turn calls for a (logically) centralized identity management in which a trusted party issues identities and cryptographic certificates to nodes.”
This statement of course plays more into the argument for why PoS blockchains are not truly decentralized and cannot be by nature, but that’s another argument for another day.
Currently, we’re looking at the $DBC team’s proposal that they will be able to erect a decentralized blockchain protocol capable of facilitating a deep neural network. Thus, we are evaluating this under the assumption that they intend a truly decentralized network with the capacity to bring a deep neural network to fruition.
Let’s take a brief look at ‘consensus finality’ now. The research piece defines it as a “property that mandates that a valid block, appended to the blockchain at some point in time, be never removed from the blockchain.”
Bitcoin, obviously, does not possess this property because, no matter how improbable, all transactions can technically be reversed at some point if one were to create another chain with more Proof of Work than the longest accepted chain.
This is not the case with chains that operate via a BFT consensus mechanism, such as $NEO, which was mentioned earlier. This is corroborated by the research report, which explicitly states, “Consensus finality is not satisfied by PoW-based blockchains.”
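Just how improbable a Bitcoin reversal is can actually be quantified: section 11 of Satoshi’s whitepaper gives the probability that an attacker controlling a fraction q of the hash power ever overtakes the honest chain from z blocks behind. A sketch of that calculation:

```python
import math

def attacker_success(q, z):
    """Probability an attacker with hash-power fraction q ever overtakes
    the honest chain from z blocks behind (Bitcoin whitepaper, section 11)."""
    p = 1.0 - q
    lam = z * (q / p)
    s = 1.0
    for k in range(z + 1):
        poisson = math.exp(-lam) * lam**k / math.factorial(k)
        s -= poisson * (1 - (q / p) ** (z - k))
    return s

# A 10% attacker facing the customary 6 confirmations:
print(f"{attacker_success(0.1, 6):.7f}")  # → 0.0002428 (matches the paper's table)
```

So Bitcoin’s finality is probabilistic, never absolute, which is precisely why it fails the ‘consensus finality’ property that BFT chains satisfy.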
Back to Stochastic Gradient Descent
Now that we have sufficiently defined Stochastic Gradient Descent in relation to deep learning neural networks, as well as asynchronous vs. synchronous blockchains, we can evaluate the conclusions formed in a recently published piece of research by Xiangru Lian, Wei Zhang, Ce Zhang, and Ji Liu, titled, ‘Asynchronous Decentralized Parallel Stochastic Gradient Descent’.
The research outlines the main problem, which is that:
“Most existing systems such as TensorFlow, MXNet, and CNTK support two communication modes: (1) synchronous communication via parameter servers or AllReduce, or (2) asynchronous communication via parameter servers. When there are stragglers (i.e., slower workers) in the system, which is common especially at the scale of hundreds [of] devices, asynchronous approaches are more robust. However, most asynchronous implementations have a centralized design.”
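The straggler problem the authors describe is easy to see with a toy calculation. The per-step worker timings below are invented for illustration:

```python
# Toy straggler illustration: per-step times (seconds) for 5 workers,
# one of which is slow. Timings are invented, not measured.
worker_step_times = [1.0, 1.1, 0.9, 1.0, 5.0]  # last worker is a straggler

# Synchronous SGD: every step waits for the slowest worker.
sync_step = max(worker_step_times)
sync_steps_per_sec = len(worker_step_times) / sync_step

# Asynchronous SGD: each worker proceeds at its own pace, so the
# aggregate update rate is the sum of the individual rates.
async_steps_per_sec = sum(1.0 / t for t in worker_step_times)

print(f"sync:  {sync_steps_per_sec:.2f} updates/s")   # 1.00 updates/s
print(f"async: {async_steps_per_sec:.2f} updates/s")  # 4.22 updates/s
```

One straggler drags the whole synchronous system down to its speed, which is why asynchrony matters at scale; but as the quote notes, the asynchronous implementations that exist are centralized, which is exactly what $DBC cannot be.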
This underscores the main problem with $DBC’s proposal, which is that they will create a decentralized blockchain capable of facilitating useful and usable neural networks that can be utilized by computers/devices all around the globe.
As stated originally through the Twitter page, the type of innovation that Deepbrain Chain is proposing is something that is entirely unprecedented in the world of computing.
To insist that one will create a decentralized network that is capable of facilitating deep learning on blockchain, no less, is akin to stating that one will manifest a quantum computer and sufficiently crack all modern forms of encryption.
The technology and technical expertise needed to erect such a network in a practical, usable manner do not currently exist, and there is no research, researcher, computing expert, or system that proposes such an idea for a decentralized, asynchronous system with any concrete plan of actually enacting it.