Blockchain & Data Science

By Antonio Hernández-Garduño

At its core, blockchain technology makes a distributed ledger possible. Data science, in turn, is the science of extracting valuable insights from both structured and unstructured data. Both disciplines are thus fundamentally concerned with data.

1. Interdisciplinary opportunities

Some interesting interdisciplinary opportunities involving both blockchain and data science are:

  • Enabling data integrity and transparency
  • Ensuring data authenticity
  • Providing access to clean data of significance to economic activity

1.1. Data integrity and transparency

Data integrity and transparency are among the major selling points of blockchain technology, and both are very valuable in data science. The immutability of the blockchain ensures that recorded data cannot be tampered with. Furthermore, it is usually the case that any node participating in the consensus mechanism of a particular blockchain has access to all the accumulated data, which ensures transparency. (Details vary depending on the blockchain.)

1.2. Data authenticity

Each block in a blockchain has an associated fingerprint, obtained by applying a hashing algorithm to its contents. Each successive block keeps a record of the hash of the previous block. This mechanism ensures data authenticity: tampering with the data in any block changes that block's hash, which then no longer matches the value recorded in the next block, so the change is inevitably revealed by a "fingerprint inconsistency".
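
The hash-linking idea can be illustrated with a minimal sketch. It is a toy model, not the block format of Bitcoin, Cardano, or any other real blockchain: the field names, the JSON serialization, the choice of SHA-256, and the helper names block_hash, build_chain and first_inconsistency are all assumptions made for illustration, and the sample records are invented.

```python
import hashlib
import json

GENESIS_PREV_HASH = "0" * 64  # assumed placeholder for the first block


def block_hash(index, data, prev_hash):
    """Return the SHA-256 fingerprint of a block's contents."""
    payload = json.dumps(
        {"index": index, "data": data, "prev_hash": prev_hash},
        sort_keys=True,  # fixed key order so the hash is reproducible
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(records):
    """Link each record to its predecessor through the predecessor's hash."""
    chain = []
    prev_hash = GENESIS_PREV_HASH
    for index, data in enumerate(records):
        current = block_hash(index, data, prev_hash)
        chain.append(
            {"index": index, "data": data, "prev_hash": prev_hash, "hash": current}
        )
        prev_hash = current
    return chain


def first_inconsistency(chain):
    """Return the index of the first block whose fingerprint fails, or None."""
    prev_hash = GENESIS_PREV_HASH
    for block in chain:
        recomputed = block_hash(block["index"], block["data"], block["prev_hash"])
        if block["prev_hash"] != prev_hash or block["hash"] != recomputed:
            return block["index"]
        prev_hash = block["hash"]
    return None


# Invented sample records, used only to exercise the functions above.
chain = build_chain(["a pays b 5", "b pays c 2", "c pays a 1"])
intact = first_inconsistency(chain)    # no inconsistency in the original chain

chain[1]["data"] = "b pays c 200"      # tamper with the second block
tampered = first_inconsistency(chain)  # the altered block is flagged
```

In this sketch, even if the tampered block's own hash were recomputed to match its new contents, the next block would still hold the old hash in prev_hash, so the check would fail one block later.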

1.3. Clean data of economic activity

The structured nature of the blockchain makes it possible to obtain data that is, for the most part, clean (i.e. the amount of further "structuring" needed is minimal). Moreover, many blockchain projects related to cryptocurrencies, such as Bitcoin or Cardano, provide access to data that is highly relevant to economic activity. And since the blockchain tracks every individual transaction, this opens the door to real-time data analysis with rich economic implications.

2. Cardano

Over the past couple of years I have been working with, and familiarizing myself with, the EUTxO model on which the Cardano blockchain is based, as well as the many tools that can be used to extract data from this blockchain. The EUTxO model makes it possible to implement smart contracts in a Turing-complete language. I am currently interested in exploring the growing economic activity being conducted on this blockchain, as well as its interconnections with other blockchains.