Author: vasa

This article was first published at


Since the inception of the decentralization revolution in 2009, a lot of promising projects have come up and changed the way we see and live in this world.

One such effort is Protocol Labs, which has given birth to amazing projects like IPFS. However, IPFS lacks an incentivization layer that could drive its mass adoption and, with it, its ultimate goal of replacing HTTP.

This is where Filecoin comes in. Since its announcement, Filecoin has attracted a lot of interest in the community. But its economics (crowdsale and investment strategies) have cost it a number of supporters. Put plainly, a lot of people were upset when they saw its plans.

There is a lot of information on its tech and economics on the web, which can be confusing as well as overwhelming. So, here we have consolidated all the information available in ONE SINGLE SOURCE.

First, we will discuss the technical aspects of Filecoin, and then move on to its economics in our next post.

But before diving into the core technical stuff, let's analyze today's state of the file storage market. If you are only interested in the technical stuff, then jump to the next section.

State of Today's File Storage Market


Today, Amazon S3 is the juggernaut of file storage on the internet. There are a number of reasons for this:

In a world with such a capable cloud storage service, any competitor has to perform at least as well, if not better. At small scales, decentralized networks don't work well. But if IPFS is adopted at a large scale (wider adoption than BitTorrent), it may prove to be a better version of the Internet and open up an entirely new economy.


Technical Overview


We will divide this into 4 sections:


An Overview of How the Filecoin Network Works


There are 3 groups of users in Filecoin: clients, storage miners & retrieval miners.

Clients pay to store and retrieve data. They can choose from the available service providers. If they want to store private data, they need to encrypt it before submitting it to the providers.

Storage miners store clients' data for a reward. They decide how much space they are willing to reserve for storage. After the client and the storage miner agree on a deal, the miner is obliged to continuously provide proofs that they are storing the data. Anyone can inspect the proofs and make sure that the storage miner is reliable.

Retrieval miners serve data to clients on request. They can get the data either from clients or from storage miners. Retrieval miners and clients exchange data and coins using micropayments: the data is split into pieces, and clients pay a small number of coins for each piece. A retrieval miner can also work as a storage miner.

Finally, the network represents all full nodes that validate the actions of clients and miners. These nodes count the available storage, check the storage proofs, and repair data faults.

Some terms used in the paper:

Pieces: A piece is some part of the data that a client is storing in the decentralized storage network. For example, data (maybe a cat pic) can be deliberately divided into many pieces and each piece can be stored by a different set of Storage Miners.

Sectors: A sector is some disk space that a Storage Miner provides to the network (it can be thought of as a unique id that is associated with a specific part of disk space of a particular storage provider). Miners store pieces from clients in their sectors and earn tokens for their services. In order to store pieces, Storage Miners must pledge their sectors to the network.

Allocation Table: The AllocTable is a data structure that keeps track of pieces and their assigned sectors. The AllocTable is updated at every block in the ledger and its Merkle root is stored in the latest block. In practice, the table is used to keep the state of the DSN, allowing for quick look-ups during proof verification.
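To make the idea concrete, here is a minimal sketch of an allocation table whose contents are summarized by a Merkle root. The class name, the field layout (piece id mapped to a miner id and sector id) and the tree construction are assumptions made for illustration; the whitepaper does not prescribe this encoding.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Optional


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Root of a simple binary Merkle tree (last node duplicated on odd levels)."""
    if not leaves:
        return sha256(b"")
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


@dataclass
class AllocTable:
    # Hypothetical layout: piece id -> (miner id, sector id).
    assignments: dict[str, tuple[str, int]] = field(default_factory=dict)

    def assign(self, piece_id: str, miner_id: str, sector_id: int) -> None:
        self.assignments[piece_id] = (miner_id, sector_id)

    def lookup(self, piece_id: str) -> Optional[tuple[str, int]]:
        return self.assignments.get(piece_id)

    def root(self) -> bytes:
        # Sort entries so the root depends only on the table contents.
        leaves = [
            f"{piece}:{miner}:{sector}".encode()
            for piece, (miner, sector) in sorted(self.assignments.items())
        ]
        return merkle_root(leaves)
```

Storing only `root()` in a block is enough to commit to the whole table: any change to an assignment changes the root.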

Orders: An order is a statement of intent to request or offer a service. Clients submit bid orders to the markets to request a service (the Storage Market for storing data and the Retrieval Market for retrieving data), and Miners submit ask orders to offer a service.

Orderbook: Orderbooks are sets of orders. Filecoin maintains separate orderbooks for Storage Market and Retrieval market.

Pledge: A pledge is a commitment to offer storage (specifically, one or more sectors) to the network. Storage Miners must submit their pledge to the ledger (the Filecoin blockchain) in order to start accepting orders in the Storage Market. A pledge consists of the size of the pledged sector and the collateral deposited by the Storage Miner.

Users share their intentions by making orders. Clients submit bid orders, specifying the price they want to pay. Miners submit ask orders, specifying the price they want to receive. When bid and ask orders match, the client and miner both sign a deal order and submit it to the blockchain.
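The matching step can be sketched in a few lines. Everything here is illustrative: the `Bid`, `Ask` and `Deal` types, their fields, and the rule of picking the cheapest compatible ask are assumptions, not the matching logic defined by Filecoin.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Bid:
    client: str
    size: int        # bytes the client wants stored
    max_price: int   # most the client will pay (hypothetical unit: coins)


@dataclass
class Ask:
    miner: str
    space: int       # bytes the miner offers
    min_price: int   # least the miner will accept


@dataclass
class Deal:
    client: str
    miner: str
    size: int
    price: int


def match_order(bid: Bid, asks: list[Ask]) -> Optional[Deal]:
    """Pick the cheapest ask that has enough space and fits the bid's price."""
    candidates = [
        a for a in asks
        if a.space >= bid.size and a.min_price <= bid.max_price
    ]
    if not candidates:
        return None  # no compatible ask; the bid stays in the orderbook
    best = min(candidates, key=lambda a: a.min_price)
    return Deal(bid.client, best.miner, bid.size, best.min_price)
```

In the real protocol the resulting deal order is signed by both parties before it is submitted to the blockchain; signatures are left out of this sketch.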

Bid and ask orders together form Storage Market (a market for file storage) and Retrieval Market (market for file retrieval). Let's dive deep into these markets and see how they work.


Storage Market


It is a decentralized exchange run by the Network, in which all asks and bids for storing data on the Filecoin network are recorded in the blockchain.

A Client submits a bid order (using PUT protocol, explained in the next section) to the Storage Orderbook. Clients must deposit the coins specified in the order & specify the number of replicas they want to be stored. Clients can either submit multiple orders or specify a replication factor in an order. Higher redundancy (higher replication factor) results in a higher tolerance for storage faults (discussed below).

Storage Miners pledge their storage to the network by depositing collateral through a pledge transaction in the blockchain (via Manage.PledgeSector). The collateral (filecoins) is deposited for the time the miner intends to provide the service, and it is returned if the miner generates proofs of storage for the data they commit to storing. If some proofs of storage fail, a proportional amount of collateral is lost.
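As a worked example of the "proportional amount" rule, the sketch below refunds collateral in proportion to the share of proofs that were valid. The linear formula, the rounding and the function name are assumptions; the source only says that the loss is proportional.

```python
def collateral_returned(collateral: int, proofs_required: int, proofs_valid: int) -> int:
    """Return the collateral refunded, losing a share proportional to failed proofs."""
    if proofs_required <= 0:
        raise ValueError("proofs_required must be positive")
    if not 0 <= proofs_valid <= proofs_required:
        raise ValueError("proofs_valid must be between 0 and proofs_required")
    # Integer division rounds the refund down (an assumption, not from the source).
    return collateral * proofs_valid // proofs_required
```

Under these assumptions, a miner who deposited 1000 coins and produced 7 of 10 required proofs would get 700 back and lose 300.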

Once the pledge transaction appears in the blockchain (hence in the AllocTable), miners can offer their storage in the Storage Market: they set their price and add an ask order to the market's orderbook via Put.AddOrders.

When a matching ask & bid order is found (via Put.MatchOrders), the client sends the piece (data) to the miner.

When receiving the piece, miners run Put.ReceivePiece. When the data is received, both the miner and the client sign a deal order and submit it to the blockchain (in Storage Market orderbook).

Each Storage Miner's storage is divided into sectors, and each sector contains pieces assigned to the miner. The Network keeps track of each Storage Miner's sectors via the allocation table. At this point (when the deal order is signed), the Network assigns the data to the miner and makes a note of it in the allocation table.

When a Storage Miner sector is filled, the sector is sealed. Sealing is a slow, sequential operation that transforms the data in a sector into a replica, a unique physical copy of the data that is associated with the public key of the Storage Miner. Sealing is a necessary operation during the Proof-of-Replication (described in Consensus Section below).
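The two properties described above (sealing is sequential, and the replica is tied to the miner's key) can be illustrated with a toy encoding. This is not Filecoin's sealing construction: the block size, the hash-chain mask and the `rounds` parameter are all assumptions chosen to keep the example short.

```python
import hashlib

BLOCK = 32  # bytes per block, matching the SHA-256 digest size


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _mask(seed: bytes, rounds: int) -> bytes:
    # Iterated hashing: each round needs the previous one, so it cannot be parallelized.
    for _ in range(rounds):
        seed = hashlib.sha256(seed).digest()
    return seed


def seal(data: bytes, miner_pubkey: bytes, rounds: int = 1000) -> bytes:
    """Toy seal: each block's mask depends on the previous sealed block."""
    replica = bytearray()
    prev = hashlib.sha256(miner_pubkey).digest()
    for i in range(0, len(data), BLOCK):
        sealed = _xor(data[i:i + BLOCK], _mask(prev, rounds))
        replica += sealed
        prev = hashlib.sha256(miner_pubkey + sealed).digest()
    return bytes(replica)


def unseal(replica: bytes, miner_pubkey: bytes, rounds: int = 1000) -> bytes:
    """Invert seal() given the same public key and round count."""
    data = bytearray()
    prev = hashlib.sha256(miner_pubkey).digest()
    for i in range(0, len(replica), BLOCK):
        sealed = replica[i:i + BLOCK]
        data += _xor(sealed, _mask(prev, rounds))
        prev = hashlib.sha256(miner_pubkey + sealed).digest()
    return bytes(data)
```

Two miners sealing the same data end up with different replicas, which is the intuition behind storing physically distinct copies.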

When Storage Miners are assigned data, they must repeatedly generate proofs of replication to guarantee they are storing the data (we will talk in detail about these proofs below). Proofs are posted on the blockchain and the Network verifies them.

All the storage allocations are public to every participant in the network. At every block, the Network checks if the required proofs for each assignment are present, checks that they are valid, and acts accordingly:

Retrieval Market

It is an off-chain exchange where clients and retrieval miners discover each other in a peer-to-peer way. Once the client and miner agree on the price, they start to exchange data and coins piece by piece using micropayments.

Let's see how it works.

Retrieval Miners announce their pieces by gossiping their ask orders to the network: they set their price and add an ask order to the market's orderbook.

A client submits a bid order to the Retrieval Market orderbook. Retrieval Miners check if their orders are matched with a corresponding bid order from a client.

Once orders are matched, Retrieval Miners send the piece to the client (miner sends parts of data & client sends micro-payments). When the piece is received, both the miner and the client sign a deal order and submit it to the blockchain.
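The part-by-part exchange can be sketched as a simple loop. The function names, the fixed price per part and the stopping rule are assumptions for illustration; real micropayments would go through a payment channel rather than a counter.

```python
def split_into_parts(data: bytes, part_size: int) -> list[bytes]:
    """Split a piece into fixed-size parts (the last part may be shorter)."""
    return [data[i:i + part_size] for i in range(0, len(data), part_size)]


def retrieve(parts: list[bytes], price_per_part: int, client_balance: int) -> tuple[bytes, int]:
    """Exchange parts for payments one at a time; stop when funds run out."""
    received = []
    paid = 0
    for part in parts:
        if client_balance - paid < price_per_part:
            break  # the client cannot pay for the next part
        received.append(part)      # miner sends one part
        paid += price_per_part     # client answers with one micropayment
    return b"".join(received), paid
```

Because payment follows each part, neither side risks more than the value of a single part if the other one disappears.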

Summing up

Below is a diagram showing all the activities taking place in the network.

An In-depth Study on Filecoin Protocols


Filecoin introduces the concept of a decentralized storage network (DSN). A DSN is a scheme that describes a network of independent clients and storage providers. A DSN aggregates storage offered by multiple independent storage providers and self-coordinates to provide data storage and data retrieval to clients. Coordination is decentralized and does not require trusted parties: the secure operation of these systems is achieved through protocols that coordinate and verify operations carried out by individual parties. DSNs can employ different strategies for coordination, including Byzantine Agreement, gossip protocols, or CRDTs, depending on the requirements of the system.

DSN involves the implementation of 3 functions: put, get, and manage. Put allows clients to store data under unique identifiers. Get allows clients to retrieve data using the identifier. Manage orchestrates the network by measuring space available for rent, auditing providers, and repairing possible data faults. The Manage protocol is run by storage providers often in conjunction with clients or a network of auditors (this involves Byzantine Faults which are discussed below).
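The three functions can be sketched as an in-memory toy. The class `ToyDSN`, the use of a SHA-256 content hash as the identifier, and the repair strategy are all assumptions; a real DSN runs these steps across independent parties rather than inside one object.

```python
import hashlib


class ToyDSN:
    def __init__(self, num_providers: int, replication: int = 2):
        # Each provider is modelled as a plain dict: key -> stored bytes.
        self.providers = [dict() for _ in range(num_providers)]
        self.replication = replication

    @staticmethod
    def _valid(store: dict, key: str) -> bool:
        return key in store and hashlib.sha256(store[key]).hexdigest() == key

    def put(self, data: bytes) -> str:
        """Store data under a unique identifier (here: its content hash)."""
        key = hashlib.sha256(data).hexdigest()
        for store in self.providers[: self.replication]:
            store[key] = data
        return key

    def get(self, key: str) -> bytes:
        """Return the data from the first provider holding an intact copy."""
        for store in self.providers:
            if self._valid(store, key):
                return store[key]
        raise KeyError(key)

    def manage(self) -> int:
        """Audit every copy and re-replicate from healthy ones; return repairs made."""
        repairs = 0
        keys = {k for store in self.providers for k in store}
        for key in keys:
            good = [s for s in self.providers if self._valid(s, key)]
            if not good:
                continue  # unrecoverable: every copy is lost or corrupt
            data = good[0][key]
            for store in self.providers:
                if len(good) >= self.replication:
                    break
                if not self._valid(store, key):
                    store[key] = data
                    good.append(store)
                    repairs += 1
        return repairs
```

For example, after `key = dsn.put(b"hello")`, corrupting one provider's copy still lets `dsn.get(key)` succeed from the other replica, and `dsn.manage()` restores the replication factor.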

DSN has several properties. The first 2 are essentially required.

Optional Properties of a DSN:


Fault tolerance


There are 2 types of possible faults which a DSN should tolerate:


Consensus Algorithms


The Filecoin DSN protocol can be implemented on top of any consensus protocol that allows for verification of Filecoin's proofs. Proof-of-Work schemes often require solving puzzles whose solutions are not reusable, or require a substantial amount of wasteful computation to find.

Non-reusable Work: Most permissionless blockchains require miners to solve a hard computational puzzle, such as inverting a hash function. Often the solutions to these puzzles are useless and do not have any inherent value beyond securing the network. Some blockchains like Ethereum (executing smart contract logic) and Primecoin (finding new prime numbers) have tried to use some of the computational power to do useful work.

Wasteful Work: Solving hard puzzles can be really expensive in terms of the cost of machinery and energy consumed, especially if these puzzles solely rely on computational power. When the mining algorithm is embarrassingly parallel, then the prevalent factor to solve the puzzle is computational power.

Attempts to reduce waste: Ideally, the majority of a network's resources should be spent on useful work. Some efforts require miners to use more energy-efficient solutions. For example, Spacemint requires miners to dedicate disk space rather than computation; while more energy-efficient, these disks are still "wasted" since they are filled with random data. Other efforts replace hard to solve puzzles with a traditional byzantine agreement based on Proof-of-Stake, where stakeholders vote on the next block proportional to their share of currency in the system.

So, instead of wasteful Proof-of-Work computation, the work Filecoin miners do to generate Proofs-of-Spacetime is what allows them to participate in the consensus.

Useful work: We consider the work done by the miners in a consensus protocol to be useful, if the outcome of the computation is valuable to the network, beyond securing the blockchain.

Filecoin proposes a useful work consensus protocol, where the probability that the network elects a miner to create a new block (we refer to this as the voting power of the miner) is proportional to their storage currently in use in relation to the rest of the network. Filecoin protocol is designed such that miners would rather invest in storage than in computing power to parallelize the mining computation. Miners offer storage and re-use the computation for proof that data is being stored to participate in the consensus.
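The "probability proportional to storage in use" idea can be sketched with a weighted random draw. This is only an illustration of the proportionality: the function name and the central sampling are assumptions, and the actual protocol elects leaders through a verifiable, decentralized procedure described in the whitepaper.

```python
import random


def elect_leader(storage_in_use: dict[str, int], rng: random.Random) -> str:
    """Pick one miner with probability proportional to its storage in use."""
    miners = list(storage_in_use)
    weights = [storage_in_use[m] for m in miners]
    return rng.choices(miners, weights=weights, k=1)[0]
```

With `{"big": 75, "idle": 0, "small": 25}`, the miner `big` should win roughly three quarters of the elections over many draws, and `idle`, which stores nothing, should never win.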


Modelling Mining Power


Power Fault Tolerance: In this technical report, Power Fault Tolerance is presented as an abstraction that re-frames byzantine faults in terms of participants' influence over the outcome of the protocol. Every participant controls some power; n is the total power in the network, and f is the fraction of that power controlled by faulty or adversarial participants.

Power in Filecoin: In Filecoin, the power p of miner M at time t is the sum of M's storage assignments. The influence I of M is the fraction of M's power over the total power in the network. In Filecoin, power has the following properties:

To read more about how this power plays a role (mathematically) in the consensus algorithm refer to the whitepaper.
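As a quick reference, the definition of influence given above can be written as a formula. The subscript notation is an assumption made here for readability; the symbols p, I, M and t are the ones used in the text.

```latex
I_t(M) = \frac{p_t(M)}{\sum_{M'} p_t(M')}
```

For example, if three miners hold storage assignments of 10, 30 and 60 units at time t, the total power is 100 and their influences are 0.1, 0.3 and 0.6 respectively.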

We also need mechanisms to prevent three types of attacks that malicious miners could exploit to get rewarded for storage they are not providing: Sybil attack, Outsourcing attacks, Generation attacks.

Storage providers must convince their clients that they stored the data they were paid to store. In practice, storage providers will generate Proofs-of-Storage (PoS) that the blockchain network (or the clients themselves) verifies.

To make the act of storage publicly verifiable, Filecoin introduces two novel Proofs-of-Storage: Proof-of-Replication (PoRep) and Proof-of-Spacetime (PoSt).

Proof-of-Replication (PoRep) is a novel Proof-of-Storage that allows a server (i.e. the prover P) to convince a user (i.e. the verifier V) that some data D has been replicated to its own uniquely dedicated physical storage. Our scheme is an interactive protocol, where the prover P: (a) commits to store n distinct replicas (physically independent copies) of some data D, and then (b) convinces the verifier V that P is indeed storing each of the replicas via a challenge/response protocol. PoRep improves on PoR and PDP schemes, preventing Sybil Attacks, Outsourcing Attacks, and Generation Attacks.
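The commit and challenge/response steps can be illustrated with a Merkle tree over the blocks of a replica. This sketch only shows how a verifier holding a short commitment can check a randomly chosen block; it does not include sealing or zk-SNARKs, and the function names and the power-of-two block count are assumptions.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(blocks: list[bytes]) -> list[list[bytes]]:
    """Return all Merkle tree levels, leaves first (len(blocks) must be a power of 2)."""
    levels = [[h(b) for b in blocks]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def commit(blocks: list[bytes]) -> bytes:
    """Prover: the Merkle root is the commitment to the replica."""
    return build_tree(blocks)[-1][0]


def prove(blocks: list[bytes], index: int) -> tuple[bytes, list[bytes]]:
    """Prover: answer a challenge with the block and its Merkle path."""
    path = []
    i = index
    for level in build_tree(blocks)[:-1]:
        path.append(level[i ^ 1])  # sibling of the current node at this level
        i //= 2
    return blocks[index], path


def verify(root: bytes, index: int, block: bytes, path: list[bytes]) -> bool:
    """Verifier: recompute the root from the response and compare."""
    node = h(block)
    i = index
    for sibling in path:
        node = h(node + sibling) if i % 2 == 0 else h(sibling + node)
        i //= 2
    return node == root
```

The verifier would pick `index` at random for each challenge, so a prover that discarded part of the replica fails with a probability that grows with the amount of missing data.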

Proof-of-Spacetime: Proof-of-Storage schemes allow a user to check if a storage provider is storing the outsourced data at the time of the challenge. How can we use PoS schemes to prove that some data was being stored throughout a period of time? A natural answer to this question is to require the user to repeatedly (e.g. every minute) send challenges to the storage provider. However, the communication complexity required in each interaction can be the bottleneck in systems such as Filecoin, where storage providers are required to submit their proofs to the blockchain network.

To address this question, we introduce a new proof, Proof-of-Spacetime, where a verifier can check if a prover is storing her/his outsourced data for a range of time. The intuition is to require the prover to

The prover receives a random challenge (c) from the verifier and generates Proofs-of-Replication in sequence, using the output of one proof as the input of the next, for a specified number of iterations t. This ensures that all of the work done is reusable (as discussed above).
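The chaining of proofs can be sketched as follows. A plain hash of the challenge and the replica stands in for one Proof-of-Replication, which is a strong simplification; the function names are hypothetical, and a real verifier checks a succinct proof instead of recomputing the chain with the replica in hand.

```python
import hashlib


def post_chain(replica: bytes, challenge: bytes, t: int) -> list[bytes]:
    """Generate t proofs in sequence, each one seeding the next challenge."""
    proofs = []
    c = challenge
    for _ in range(t):
        proof = hashlib.sha256(c + replica).digest()  # stand-in for one PoRep
        proofs.append(proof)
        c = proof  # the next challenge is derived from the previous proof
    return proofs


def check_chain(replica: bytes, challenge: bytes, proofs: list[bytes]) -> bool:
    """Toy check: recompute the chain (only possible here because we hold the replica)."""
    return proofs == post_chain(replica, challenge, len(proofs))
```

Because each step needs the previous proof, the prover cannot compute the t proofs in parallel, which is what lets the chain stand in for elapsed time.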

PoSt and PoRep use zk-SNARKs, which makes the proofs very short and easy to verify.

To learn more about the practical implementation of PoSt, refer to the whitepaper.


Smart Contracts


Smart Contracts enable users of Filecoin to write stateful programs that can spend tokens, request storage/retrieval of data in the markets and validate storage proofs. Users can interact with the smart contracts by sending transactions to the ledger that trigger function calls in the contract. We extend the Smart Contract system to support Filecoin specific operations (e.g. market operations, proof verification).

Filecoin supports contracts specific to data storage, as well as more generic smart contracts:


Cross chain Interactions


Bridges are tools that aim at connecting different blockchains; while this is still a work in progress, we plan to support cross-chain interaction in order to bring Filecoin storage to other blockchain-based platforms, as well as to bring functionality from other platforms into Filecoin.


Some Other Issues


Here we list a few potential issues which are not well discussed in the whitepaper.


Possible Improvements in Filecoin Protocol


Here we list some possible improvements to the Filecoin protocol.

Ultimately, the encryption-key and some information to help find the right Storage nodes become part of the "capability string" (read more about the encoding process). The important point is that a capability string is both necessary and sufficient to retrieve a value from the Grid — the case where this will fail is when too many nodes have become unavailable (or gone offline) and you can no longer retrieve enough shares.

There are write-capabilities, read-capabilities and verify capabilities; one can be diminished into the "less authoritative" capabilities offline. That is, someone with a write-capability can turn it into a read-capability (without interacting with a server). A verify-capability can confirm the existence and integrity of value, but not decrypt the contents. It is possible to put both mutable and immutable values into the Grid; naturally, immutable values don't have a write-capability at all.