Deep Trouble – Neural Networks Easily Fooled

Deep Neural Networks (DNNs) always leave me with a vague uneasy feeling in my stomach.  They seem to work well at image recognition tasks, yet we cannot really explain how they work.  How does the computer know that image is a fish, or a piano, or whatever?  Neural networks were originally modeled after how biological neurons were thought to work, although they quickly became a research area of their own without regard to biological operation.

Well, it turns out that DNNs don’t work at all like human brains.  A paper by Nguyen et al. [1] explores how easy it is to create images that look like nothing (essentially white noise) to the human brain, and yet are categorized with 99%+ certainty as a cheetah, peacock, etc. by DNNs that perform at human levels on standard image classification benchmarks.  Here is a small sample of images from their paper:

The researchers created the false images through an evolutionary algorithm that seeks to modify existing images through mutation and combination in order to improve on a goal, in this case to find images that would score highly with a DNN classifier (“fooling images”).  They used a couple of methods, both of which worked.  The direct method, illustrated in the top two images, works by direct manipulation of the pixels in an image file.  The indirect method, illustrated in the bottom two images, works by using a series of formulas to generate the pixels; the formulas were then evolved.  The idea was to create images that looked more like images and less like random noise.  In both cases, the researchers found it easy to come up with images that fooled the DNN.
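
To make the direct method concrete, here is a minimal sketch of an evolutionary search over raw pixel values. It is not the authors' code: `toy_confidence` is a hypothetical stand-in for the confidence a trained DNN would report, and the population size, mutation rate, and image size are arbitrary choices for illustration.

```python
import random

def toy_confidence(pixels):
    """Hypothetical stand-in for a DNN's confidence in one class.

    It rewards bright pixels at even positions and dark pixels at odd ones.
    A real experiment would call a trained classifier here instead.
    """
    hits = sum(1 for i, p in enumerate(pixels) if (p > 127) == (i % 2 == 0))
    return hits / len(pixels)

def mutate(pixels, rng, rate=0.05):
    """Replace each pixel with a random value with probability `rate`."""
    return [rng.randrange(256) if rng.random() < rate else p for p in pixels]

def crossover(a, b, rng):
    """Combine two parents at a random cut point."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def evolve(score, n_pixels=64, pop_size=30, generations=200, seed=0):
    """Keep the best half each generation and refill with mutated offspring."""
    rng = random.Random(seed)
    population = [[rng.randrange(256) for _ in range(n_pixels)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        parents = population[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=score)

best = evolve(toy_confidence)
print(f"confidence of evolved image: {toy_confidence(best):.2f}")
```

The search never looks inside the scoring function; it only needs a number to climb. That is why the same loop could, in principle, be pointed at any classifier that reports a confidence.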

Their results also seemed robust.  They performed various trials with random starting points.  They even added their fooling images to the training sets, so as to teach the DNN that these were incorrect.  Even after doing that, they were still able to find other fooling images that were misclassified by the new “improved” DNN.  They repeated this process as many as 15 times, all to no avail.  The DNNs were still easily fooled.

As the authors point out, this ability of DNNs to be fooled has some serious implications for safety and security, for example in the area of self-driving cars.  For real world results, see the Self-driving car hack.

There is something going on here that we do not fully understand.  Researchers are starting to look into what features a DNN is really considering – which may help us to improve or alter the game for image recognition. Until then, pay attention to that pit in your stomach.

Notes:

  1. A. Nguyen, J. Yosinski, and J. Clune, “Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images,” Computer Vision and Pattern Recognition (CVPR ’15), IEEE, 2015.

Bitcoin Network Hash Rates. Where are you, Bitcoin cash?

The Bitcoin blockchain just split in two.  What is hash power, and where did it go?  With the split, people are discussing the term “Hash Power” and trying to decide how many miners are mining each blockchain.  We will present the basic equations for working out these issues, and estimate the relative hash powers mining the two chains.

What is the Hash Rate?

Mining is the process by which the bitcoin blockchain is kept secure.  Briefly, the blockchain is a public ledger that everyone “agrees” on that contains the entire history of all bitcoin transactions since time began.  Miners do work (think mining gold) in order to add blocks to the blockchain, for which they get rewarded.  Bitcoin uses what’s called a Proof-of-Work system for mining; other cryptocurrencies use other methods, such as Proof-of-Stake.

The basic work that the miner has to do is take a batch of current transactions, and find a number, called a nonce, so that the data (in a predetermined format) hashes to a number that is less than a target.

Let’s dissect these pieces one at a time:

  1. nonce: named because originally it was a “number used once”, but here it is just a number that the miner is free to pick.  Why the miner would want to pick different ones will be clear shortly.
  2. hash: A hash is a function that takes an arbitrary length piece of data and computes a digest, or shorter abbreviation for the data. Note: for the purists out there, we are just considering cryptographic hashes.  Bitcoin uses the hash SHA256, which is a particular hash that produces a 256 bit digest.  The idea of a good hash is that it is very hard to manipulate the data going into the hash to produce any particular hash – effectively it looks like a random number produced from the data (which is always the same for the same data).
  3. target: this is a 256 bit number.  Let’s call it T.  It gets calculated a special way on the Bitcoin network, as will be described later.

Now, to rehash things, so to speak:

To successfully add a block to the bitcoin blockchain, the miner must be the first person to find a nonce N, so that:
\[
\textrm{SHA256}(\textrm{SHA256}(\textrm{blockdata}, N)) < T
\]
where blockdata represents the transactions that the miner wants to include in the block. The nonce is in a certain place in that data, but here we call it out as a separate argument to make it clear that the result varies with N. Different miners might want to include different transactions for a variety of reasons, which we don’t need to go into here. Because of the properties of the SHA256 hash, this problem can only be solved (based on our current understanding of SHA256) by repeatedly trying different values of N until we find one that results in our target being hit. The miner has no idea which values of N are better than any others (because the hash is “random”), so they just try them in sequence, 0, 1, 2, 3, … until they find one that gives a hash value less than T. You might ask “why the double hash?”. Excellent question. No one knows for sure, but that’s just the way it is.
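
The nonce search can be sketched in a few lines of Python. This is a toy, not the real block format: the block data is an arbitrary byte string, the nonce is simply appended to it, the digest is read as a big-endian integer for simplicity, and `easy_target` is a hypothetical target chosen so that the loop finishes quickly.

```python
import hashlib

def double_sha256(data: bytes) -> int:
    """SHA256(SHA256(data)), returned as a 256-bit integer."""
    digest = hashlib.sha256(hashlib.sha256(data).digest()).digest()
    return int.from_bytes(digest, "big")

def mine(blockdata: bytes, target: int, max_nonce: int = 2**32):
    """Try nonces 0, 1, 2, ... until the double hash falls below the target."""
    for nonce in range(max_nonce):
        candidate = blockdata + nonce.to_bytes(4, "little")
        if double_sha256(candidate) < target:
            return nonce
    return None  # no nonce in the 32-bit range worked

easy_target = 2**244  # hypothetical: about 1 attempt in 4096 succeeds
nonce = mine(b"toy block data", easy_target)
print("winning nonce:", nonce)
```

Lowering `easy_target` by a factor of 2 doubles the expected number of attempts, which is exactly the lever the network uses to control block times.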

How long does it take for a miner to find an acceptable nonce?  Suppose that the miner can calculate H double-hashes per second.  Then, since the result of SHA256 is effectively a random number from 0 to \(2^{256}-1\), the probability of any given attempt satisfying the target is:
\[
prob=\frac{T}{2^{256}}
\]
and the expected time \(E(X)\) to find a new block would be
\[
E(X)=\frac{2^{256}}{T \cdot H}
\]
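
As a quick sanity check on these two formulas, the sketch below replaces the hash with uniform random 256-bit numbers (the same assumption made above) and counts attempts until one falls below a target. The target and hash rate are hypothetical toy values, far easier than anything on the real network.

```python
import random

HASH_SPACE = 2**256  # number of possible SHA-256 outputs

def attempts_until_hit(target: int, rng: random.Random) -> int:
    """Count uniform 256-bit draws until one falls below the target."""
    attempts = 1
    while rng.getrandbits(256) >= target:
        attempts += 1
    return attempts

def expected_seconds(target: int, hashes_per_second: int) -> float:
    """E(X) = 2^256 / (T * H)."""
    return HASH_SPACE / (target * hashes_per_second)

rng = random.Random(42)        # fixed seed so runs are repeatable
toy_target = HASH_SPACE // 50  # hypothetical: 1 attempt in 50 succeeds
trials = [attempts_until_hit(toy_target, rng) for _ in range(5000)]
mean_attempts = sum(trials) / len(trials)

print(f"mean attempts: {mean_attempts:.1f}, theory: {HASH_SPACE / toy_target:.1f}")
print(f"expected time at 10 hashes/sec: {expected_seconds(toy_target, 10):.1f} s")
```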

Difficulty

How is the target T calculated?  The target gets adjusted every 2016 blocks – the idea is that the average time for each block should be 10 minutes.  Thus, if there are so many miners that blocks are found much more frequently, say every 2 minutes, the target T would get adjusted (made smaller), so that the new block times become approximately 10 minutes.
Here are the details of the calculation. It uses something called the Difficulty (D), which is roughly the inverse of T. That is, the higher the difficulty, the lower the target T. You can find the current difficulty from a variety of websites; bitcoinwisdom.com has a nice graph of its history.

Remark: There is something called the packed difficulty (32 bits), which you may see. It is a compact, floating-point-style representation of a 256-bit target: the top byte is an exponent and the lower three bytes are a mantissa. For example, the packed value 0x1d00ffff becomes:
\[
D_1=\text{0x00ffff} \cdot 2^{8 (\text{0x1d} - 3)}=\text{0xffff}\cdot 2^{208}=\text{0x00000000ffff0000000000000000000000000000000000000000000000000000}
\]
where there should be 52 0’s after the ffff. You can check my counting if you want.
This particular value is the target that corresponds to a difficulty of 1, \(D_1\).  It is the highest possible target (that is, the lowest possible difficulty), and it is used to compute the target from the current difficulty D.
\[
T=\frac{D_1}{D}
\]
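The unpacking step and the relation \(T = D_1/D\) can be sketched as follows. This is a simplified illustration: it ignores the sign bit and assumes the exponent byte is at least 3, and `target_from_difficulty` uses integer division with an integer difficulty, which is an assumption made here for convenience.

```python
def unpack_compact(bits: int) -> int:
    """Expand a 32-bit packed value into a 256-bit integer.

    Top byte = exponent, low three bytes = mantissa (exponent >= 3 assumed).
    """
    exponent = bits >> 24
    mantissa = bits & 0x00FFFFFF
    return mantissa * 2 ** (8 * (exponent - 3))

D1 = unpack_compact(0x1D00FFFF)
print(f"0x{D1:064x}")  # 64 hex digits, so the zeros can be counted

def target_from_difficulty(difficulty: int) -> int:
    """T = D1 / D, rounded down to an integer."""
    return D1 // difficulty

# Difficulty figure quoted for BTC in the tables later in this post.
print(f"0x{target_from_difficulty(860_221_984_436):064x}")
```

Printing with a fixed width of 64 hex digits makes it easy to see how many leading zeros a winning hash needs at a given difficulty.
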
If the current difficulty were \(D_1\), the target would be 1, and the SHA double hash would have to be 0. With a probability of that happening on each try of \(1/2^{256}\) or \(p=8.636 \cdot 10^{-78}\), that’s pretty hard! The expected number of hashes to find a block for a difficulty D is:
\[
\frac{2^{256}}{T}=\frac{D\cdot 2^{256}}{(2^{16}-1)2^{208}}\approx D\cdot2^{32}
\]
Letting X be the time to find a block, we have the following relationships:
\begin{align}
T&\approx\frac{2^{256}}{D\cdot2^{32}}=\frac{2^{224}}{D}\\
\textrm{E}(X)&=\frac{2^{256}}{T \cdot H}\approx\frac{D \cdot 2^{32}}{H}
\end{align}
We can reverse the last equation to estimate the number of hashes per second that a particular time to find a block represents:
\[
\text{hashpower (estimate)}=\frac{D\cdot 2^{32}}{X}
\]
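Here is that estimate as a small function, applied to the difficulty and average block time figures quoted in the 6 Aug 2017 table later in this post. The function name is just a label chosen for this sketch.

```python
def estimated_hashpower(difficulty: float, seconds_per_block: float) -> float:
    """Hashes per second implied by a difficulty and an observed block time."""
    return difficulty * 2**32 / seconds_per_block

# Figures from the 6 Aug 2017 table (block times converted from minutes).
btc = estimated_hashpower(860_221_984_436, 9.95 * 60)
bcc = estimated_hashpower(144_323_701_657, 20.25 * 60)

print(f"BTC: {btc:.2e} double-hashes/sec")
print(f"BCC: {bcc:.2e} double-hashes/sec")
print(f"BCC relative to BTC: {bcc / btc:.3f}")
```
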
Now, when we see the word “estimate”, we should be thinking “what is its variance?” The block time that the estimate is built on has the right mean, but its variance is quite high. Block times are approximately exponentially distributed, so (I won’t prove it here) if we expect \(\lambda\) blocks per second, then, for the time per block X:
\begin{align}
\textrm{E}(X)&=\frac{1}{\lambda} \\
\textrm{Var}(X)&=\frac{1}{\lambda^2}
\end{align}
Now, if we average block times over M blocks, the variance can be reduced per the usual formula:
\begin{align}
\textrm{Var}_M(X)&=\frac{1}{\lambda^2 \cdot M} \\
\textrm{std-dev}_M(X)&=\frac{\textrm{avg}(X)}{\sqrt{M}}
\end{align}
For an example, the block times for bitcoin are generally around 10 minutes. If we average various numbers of past block times, we get the following standard deviations:

| Number of Blocks | Std Deviation of Average Block Time |
|---|---|
| 6 | 4.08 min |
| 10 | 3.16 min |
| 20 | 2.24 min |

Thus, we can see that even with looking at the last 20 blocks, we have a great uncertainty in the actual hash power that is operating to produce that block.  Let’s now look at the current block times for Bitcoin and Bitcoin cash and estimate the hash power operating on the chains.  Note that if mining hashpower is switching between Bitcoin and Bitcoin cash, it will take some time to determine that from the data – since we will want more than 20 blocks to average over.
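
The spread of an averaged block time can also be checked by simulation. The sketch below assumes block times are exponentially distributed with a 10 minute mean (a modeling assumption), draws many M-block averages, and compares their standard deviation with the textbook value for an exponential, mean divided by \(\sqrt{M}\). The trial count and seed are arbitrary.

```python
import math
import random
import statistics

def spread_of_average(mean_minutes: float, m_blocks: int,
                      trials: int = 5000, seed: int = 0) -> float:
    """Standard deviation of the average of m_blocks exponential block times."""
    rng = random.Random(seed)
    averages = []
    for _ in range(trials):
        times = [rng.expovariate(1 / mean_minutes) for _ in range(m_blocks)]
        averages.append(sum(times) / m_blocks)
    return statistics.stdev(averages)

for m in (6, 10, 20):
    simulated = spread_of_average(10.0, m)
    theory = 10.0 / math.sqrt(m)
    print(f"M={m:2d}  simulated {simulated:.2f} min  theory {theory:.2f} min")
```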

Update 6 Aug 2017 10:47 AM EST:  blocktimes and difficulty dropped significantly (and BTC closer to expected average):

| | BTC Avg Block Time (min) | BTC Std Dev | BCC Avg Block Time (min) | BCC Std Dev |
|---|---|---|---|---|
| Last 39 (BTC) / 20 (BCC) blocks as of 10:47 AM EST 6 Aug 2017 | 9.95 | 1.59 | 20.25 | 4.53 |
| Difficulty | 860 221 984 436 | | 144 323 701 657 | |
| Estimated Hash Power (double-hash/sec) | 6.19 * 10^18 | | 5.1 * 10^17 | |
| Relative Hash Power (BTC=1.0) | 1.0 | | 0.082 | |

 

Note: this is current as of 5 Aug 2017 12:42 AM EST.  I hope to have an online updating version up later.

| | BTC Avg Block Time (min) | BTC Std Dev | BCC Avg Block Time (min) | BCC Std Dev |
|---|---|---|---|---|
| Last 6 blocks | 6.8 | 2.8 | 74.3 | 30.3 |
| Last 20 blocks | 6.0 | 1.3 | 76.0 | 17.0 |
| Last 39 (BTC) / 29 (BCC) blocks | 5.7 | 0.9 | 55.6 | 10.3 |
| Difficulty | 860 221 984 436 | | 225 505 642 691 | |
| Estimated Hash Power (double-hash/sec) | 1.1 * 10^19 | | 2.9 * 10^17 | |
| Relative Hash Power (BTC=1.0) | 1.0 | | 0.026 | |

References

  1. Exponential Distribution
  2. Bitcoin Difficulty
  3. blockchair.com (for block times)
  4. A great detailed discussion of all aspects of bitcoin and other cryptocurrency concepts.
  5. Discussion geared for a less technical audience

An Investor’s View of ICOs

ICO? Initial Coin Offering.  Remind you of an IPO (Initial Public Offering)?  It is meant to.  Bancor raised $147 million in a few hours. Block.one sold $185 million of EOS tokens in 5 days.  Tezos has raised over $100 million in 2 days, with 10 more days to go and no cap. There are currently almost 20 ICOs planned for the month of July.  See the sites at the bottom of this post for lists of ICOs.

There are plenty of analyses to be found online about the pros and cons of the various ICOs (or token sales, or whatever they are actually called). What I want to do is give you the perspective of a private investor on these deals, and how they compare to more traditional private investments.

I have made dozens of private investments in startup companies, been involved in dozens more as a member of an investment committee, and seen probably hundreds, if not thousands, of other deals cross my monitor.

There are a number of dimensions along which a deal gets evaluated.  Some are:

  • Business plan: Is this a good business to be in?
  • Team: Are these the right people to execute the plan?
  • Market: Is the market for this product large enough or growing fast enough to be interesting?
  • Price: What price am I paying to get in on this opportunity?
  • Terms: What are the fine print details on the deal?

As I said above, there are plenty of other places debating the business plans and teams.  Today, I want to discuss the terms.  Generally, these do not get much attention in a private deal.  Not because they are not important, but because they are so important that standard terms have evolved over time, and people assume that these standard terms will be included.  Each one has a reason for being included.

A Private Investment Term Sheet

Let’s go through a standard private investment term sheet and see what kinds of things are in there. We will call our company the imaginative name NewCo.  We will call our new coin NewCoin.  I am leaving out a lot of the gory technical language here; if you want to see a full model term sheet, check out the National Venture Capital Association.

Price: For example, investors will invest $2 million in NewCo, at a $10 million fully-diluted post-money valuation, including an employee pool of 20% of the post-money capitalization. They will receive Series A Preferred Stock.  Explanation: Investors will get 20% of NewCo after the financing, counting all other convertible securities, such as options (this is the fully-diluted part).  NewCo is setting aside 20% of its shares for issuance to employees in the future.  The Series A Preferred Stock is really just a label; its properties are defined by the rest of the term sheet.
Dividends: How and when dividends might be paid.  Important to prevent the company from paying out all its cash in dividends to the common stockholders and not giving any to the investors.
Capitalization: The company’s capital structure before and after Closing is set forth in exhibit A.  So everyone knows who owns what.
Liquidation Preference: Describes how liquidation or acquisition proceeds are split with investors.  Generally there is some preference to investors.  Without one, in our example, if the company were sold a month after the investment for $5 million, the investors would receive 50% of their investment (losing $1 million), but the founders/employees would walk away with $4 million, without doing anything.  To prevent this, various preferences for the investors have evolved, called non-participating or participating Preferred Stock.  Generally, the first proceeds of a liquidation go back to the investors until they receive their investment back.
Conversion: This allows investors to convert to Common Stock when it becomes advantageous. This is how they can exit the deal in an IPO, for example.
Antidilution: This specifies how the investors’ purchase price can change if the company issues shares later at a much cheaper price.  Without this, there would be no way to stop a company from issuing lots of cheap shares after an investment and diluting the investor.
Information Rights: Details how information will be provided to investors over time, for example, monthly financial reports, annual audits, etc.  This forces the company to keep providing information in the future.
Voting Rights: Specifies how various votes will be taken.
Registration Rights: Very technical details on how shares should be registered, relative to public offerings.
Representations and Warranties: Statements about the company that the founders declare are true, e.g. here are all our debts, we have no pending lawsuits, etc.
Board of Directors: How many members will be on the board, who will they be, how can they be changed in the future.  They are generally elected by the stockholders.
Future Financing: How can future financings be conducted?  Will the current investors have the right to participate in a pro rata share?  Do they have the right to approve the financing?
Vesting: If a Founder leaves the company 2 days after financing, should they keep all their shares?  Since the investors presumably invested because of the specific founders, they would not want to see them leave.  Typical terms might be 25% of the stock vests after 12 months, with the rest vesting over another 3 years.  Also, Founders’ rights to sell shares are restricted (note that investors generally have restrictions on their sales as well).
Employee Matters: Each Founder should sign a non-compete agreement, a non-disclosure agreement (NDA), and a non-solicit agreement (that they will not hire away current employees if they leave).  Also, they agree that intellectual property developed by them for the company belongs to the company (IP Assignment).
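
The arithmetic behind the Price and Liquidation Preference terms can be sketched with the NewCo numbers from above. The 1x non-participating preference shown here is a hypothetical choice for illustration; real term sheets vary, and participating preferred is not modeled.

```python
def ownership(investment: float, post_money: float) -> float:
    """Fraction of the company the investors hold after the financing."""
    return investment / post_money

def split_proceeds(sale_price: float, investment: float,
                   investor_fraction: float, preference_multiple: float = 0.0):
    """Split sale proceeds under a non-participating preference.

    Investors take the larger of their preference (capped at the sale price)
    or their pro rata share; everyone else gets the remainder.
    """
    preference = min(sale_price, preference_multiple * investment)
    pro_rata = investor_fraction * sale_price
    investors = max(preference, pro_rata)
    return investors, sale_price - investors

stake = ownership(2_000_000, 10_000_000)
print(f"investor stake: {stake:.0%}")

# Sale for $5 million one month after the investment.
print("no preference:", split_proceeds(5_000_000, 2_000_000, stake))
print("1x preference:", split_proceeds(5_000_000, 2_000_000, stake, 1.0))
```

With no preference the investors get back half of what they put in; with the 1x preference they are made whole before the founders and employees share the rest.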

In a private investment, money goes into NewCo, which is managed by a Board of Directors, who hires the CEO.  If all the employees walk out, they don’t take the funds of the company with them, and they lose their shares according to the vesting schedule.  While obviously not what was intended, the money would still be there, and the BOD (elected by the shareholders) could hire a new CEO and team.  They could even give the funds back to the investors and call it a day.  The BOD has a fiduciary responsibility to the investors to act in their best interests.  They could not, for instance, divide the money up among themselves.
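
To see what a founder who walks out actually keeps, here is a sketch of the typical vesting terms mentioned above: 25% after 12 months, with the rest vesting over another 3 years. Monthly, straight-line vesting after the first year is an assumption made for this example; actual agreements differ.

```python
def vested_fraction(months_since_grant: int) -> float:
    """Fraction of founder stock vested under a 1-year cliff, 4-year schedule."""
    if months_since_grant < 12:
        return 0.0  # leaving before the cliff forfeits everything
    months_after_cliff = min(months_since_grant - 12, 36)
    return 0.25 + 0.75 * months_after_cliff / 36

for months in (0, 11, 12, 30, 48):
    print(f"{months:2d} months: {vested_fraction(months):.1%} vested")
```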

Compare versus ICO

Let’s now look at a typical ICO and see how these items relate.

ICOs generally specify the price you pay for each NewCoin, usually in bitcoin (BTC) or ethereum (ETH).  But each NewCoin does not get you any ownership in a business.  The business that NewCoin is in could do well, NewCoin could become very popular and fashionable, but NewCoin has to increase in value for you to see a return.  You also do not own any future IP or technology developments that the founders/developers create [compare with IP Assignment].   Since the code is almost always open-source, there is nothing proprietary about it.

In an ICO, investors have no oversight or control over what the Founder team is doing. In the case of a foundation being created, this might be up to the foundation’s directors, but again, they might not be elected or answerable to the investors.  There should be something comparable to an elected BOD.

What prevents founders and developers from walking away?  Where is the vesting?  The distribution of NewCoin to founders is generally opaque (no cap table is generally provided) so you don’t know who owns what.  If the developers want to develop a better, competing coin based on what they learned at NewCoin, can they? [compare to non-compete agreement].  To the extent founders hold NewCoin, there would be an incentive for them to help out NewCoin, but the distribution is unknown, and without vesting they might just sit back and see others do the work.

Coins kept by founders running NewCoin could be considered like the stock held by founders in a PI.  They have incentive to make the coin increase in value.  The issue is there is no lockup to prevent them from selling, no vesting terms, and no public knowledge of their ownership.  They could sell their coins and no one knows.

ICOs are somewhat more akin to a Limited Partnership, where the investors (the Limited Partners, or LPs) generally do not have control over the operations and details, except that in a Limited Partnership (1) there are provisions for removing General Partners (GPs) in extreme cases, and (2) the GP has a fiduciary responsibility to the LPs.  An ICO should have a mechanism for providing such a responsibility.

What if the NewCoin needs more money to finish development? In a private investment, there are ways to raise additional funds, but it is not as obvious how NewCoin does this.  They could sell some of the reserved coins (if there are any and they have value), but often the supply is fixed.  In an ICO you might be stuck.

Private companies generally stage their investment.  At each stage, money is raised to reduce risk and bring the company to a milestone that greatly changes its value.  This is more capital efficient and less risky for all concerned.  Rather than raise $100M on an idea, companies might raise $1-5M in a Series A to prove out an idea or concept, then raise $10-20M in a Series B, and so on.  Each phase should be reducing some risk.  ICOs are generally a one-shot deal.  There is no first proof-of-concept stage, with later money coming with results.

Lists of ICOs:

Lending Rates for Bitcoin and USD on Cryptoexchanges

We are going to take a look at the current interest rates in bitcoin lending.  Why is this important?  In addition to being a way to make extra income from your fiat (“real” currency) or bitcoin holdings without selling them, these interest rates indicate traders’ sentiment about the relative values of these currencies in the future.  We will look at bitcoin in this post, because it is the largest cryptocurrency, both in terms of market cap and trading volume.  As of the date of this writing, here are the top 4 cryptocurrencies by total market cap.  Information from http://coinmarketmap.com.

| Crypto-Currency | Market Cap ($USD, billion) | 24 Hr Trading Volume ($USD, billion) |
|---|---|---|
| Bitcoin (BTC) | 45 | 1.0 |
| Ethereum (ETH) | 32 | 0.78 |
| Ripple | 11 | 0.18 |
| Litecoin (LTC) | 2.5 | 0.37 |

The Bitfinex Exchange is one of the exchanges that allows margin trading, and also customer lending of fiat and cryptocurrencies.  They call the lending exchange funding and it works with an order book just like a regular trading exchange.  Customers submit funding offers and requests, and the exchange matches the orders.  So, for example, I could offer $800 USD for 4 days at 0.0329% per day (12% annually).  This gets placed on the order book.  Someone else could accept that offer, and then the loan happens.  As the lender, I get paid interest daily at the contract rate, paid by the borrower.  Now, what does the borrower do with the proceeds?  They can’t withdraw the money from Bitfinex; this is not a general personal loan.  They could use this money to buy bitcoins on margin.  There are specific rules on how much they can borrow at purchase, and how much margin they must maintain in the future.  We won’t go into the specifics of those rules in this post, but just be aware that the exchange can liquidate, or sell, a position to maintain margin requirements.  In fact, this is what caused the recent meltdown in the Ethereum market at one exchange (the GDAX Ethereum Flash Crash), as traders had their positions liquidated automatically for margin calls.
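
The rate arithmetic in the funding example works out as follows. This sketch treats the daily rate as simple interest and ignores any exchange fees, both of which are simplifying assumptions; the function names are made up for the example.

```python
def simple_apr(daily_rate_percent: float) -> float:
    """Annualize a daily rate without compounding."""
    return daily_rate_percent * 365

def compounded_apy(daily_rate_percent: float) -> float:
    """Annualize a daily rate assuming interest is re-lent every day."""
    return ((1 + daily_rate_percent / 100) ** 365 - 1) * 100

def interest_earned(principal: float, daily_rate_percent: float, days: int) -> float:
    """Simple interest paid over the life of the loan."""
    return principal * daily_rate_percent / 100 * days

print(f"APR: {simple_apr(0.0329):.2f}%")
print(f"APY if re-lent daily: {compounded_apy(0.0329):.2f}%")
print(f"interest on $800 for 4 days: ${interest_earned(800, 0.0329, 4):.2f}")
```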

So, show me the data!

interest rate chart

In this chart, fUSD is the funding rate as an APR for $USD, and fBTC is the funding rate for bitcoin (BTC). These are based on a small sample of 2 day loans actually traded on that day. The first thing to note is that the rates are quite volatile, reaching highs of over 100% (USD) and 50% (BTC) during the last year. The current rates (June 2017) are around 40% (USD) and 4% (BTC).

One more thing we might want to look at.  We might wonder about the difference between the two rates and what that means.  In economics, there is something called uncovered interest rate parity, which looks at the difference in interest rates between 2 currencies, say the USD and the Euro (EUR).  Under parity, this difference equals the change in the exchange rate that the market expects between the 2 currencies in the future: the currency with the higher interest rate is expected to lose value against the other.  In fact, futures contracts for the currencies should have a price related to this difference in such a way as to make arbitrage not possible (the covered version of the same idea).  Here is the difference graph.

Interest rate difference graph

Note that it switches around both sides of zero.  Between approximately 15 Mar 2017 and 1 May 2017, it was negative, meaning people were demanding higher rates to lend BTC than to lend USD.  Perhaps this is an indication that people wanted to short BTC at that time – although we don’t know which fiat currency they were shorting it against.
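
If uncovered interest rate parity held here (a big if in a market this young), the two funding rates would imply an expected move in the BTC price. With S the price of 1 BTC in USD, parity says \(\textrm{E}(S_{t+1})/S_t = (1+i_{USD})/(1+i_{BTC})\). The sketch below applies that to the approximate June 2017 rates quoted above, treating the APRs as annually compounded rates, which is a simplification.

```python
def implied_appreciation(rate_quote: float, rate_base: float,
                         years: float = 1.0) -> float:
    """Expected fractional change in the base currency's price under parity.

    rate_quote: interest rate of the pricing currency (here USD)
    rate_base:  interest rate of the currency being priced (here BTC)
    """
    return ((1 + rate_quote) / (1 + rate_base)) ** years - 1

usd_rate, btc_rate = 0.40, 0.04  # roughly 40% and 4% APR
print(f"implied BTC move over 1 year: {implied_appreciation(usd_rate, btc_rate):+.1%}")
print(f"implied BTC move over 2 days: {implied_appreciation(usd_rate, btc_rate, 2 / 365):+.3%}")
```

A negative rate difference, as seen in the March to May 2017 stretch, would flip the sign of the implied move.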

Another thing that we can pull from the data is the yield curve.  In the bond market, this indicates the interest rates for bonds at different maturities, for example, 30 days, 60 days, 1 year, etc.  In the case of Bitfinex, the loans can only be made for 2-30 days, so we have a more limited set of possibilities.

You can see the curves are relatively flat, but they only go out to 30 days, so we can’t say much about 1 year rates. Note that most of the volume is at the extremes, that is 2 days or 30 days, so the numbers in between are not that meaningful.

In a post to come, we will look at Bitcoin futures, to see how their pricing might be related to the uncovered interest rate parity. Stay tuned.