There are more details on my test setup in previous posts. For this test the clients and server ran on separate hosts, and ping between them takes ~250 usecs today. Eight sysbench processes ran on the client host and each process created between 1 and 16 connections to mysqld. The database is cached by InnoDB and the clients were divided evenly between the tables. Each table has 8M rows.
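The arithmetic behind the client counts is simple enough to sketch. The snippet below is only an illustration: it assumes the connections per process were stepped through powers of two (1, 2, 4, 8, 16), which reproduces the 8 to 128 concurrent clients reported for this test, and the table count of 8 in the usage line is a placeholder, not a number taken from this post.

```python
# Assumption: 8 sysbench processes (from the text), with connections per
# process stepped through powers of two from 1 to 16.
SYSBENCH_PROCESSES = 8
CONNECTIONS_PER_PROCESS = [1, 2, 4, 8, 16]


def concurrency_levels(processes, connections_per_process):
    """Total concurrent clients for each connections-per-process setting."""
    return [processes * c for c in connections_per_process]


def clients_per_table(total_clients, num_tables):
    """Clients assigned to each table when they are divided evenly."""
    if total_clients % num_tables != 0:
        raise ValueError("clients do not divide evenly between the tables")
    return total_clients // num_tables


levels = concurrency_levels(SYSBENCH_PROCESSES, CONNECTIONS_PER_PROCESS)
print(levels)

# num_tables=8 is a hypothetical value used only to show the call.
print(clients_per_table(levels[-1], num_tables=8))
```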
These are results in TPS for 8, 16, 32, 64 and 128 concurrent clients. Each transaction is a connect followed by a HANDLER fetch. The binaries orig572.psen and orig5612.psen use the performance schema (PS) with default options for MySQL 5.7.2 and 5.6.12. Their throughput is much worse than for the same code without the PS. All binary names are explained here.
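For readers who have not used HANDLER, here is a rough sketch of the statements one such transaction might send after connecting. The sysbench script used for the test is not shown here, so this is a guess at its shape: the table name `sbtest1`, the lookup on the primary key and the helper name `handler_fetch_statements` are all assumptions made for illustration.

```python
def handler_fetch_statements(table, row_id):
    """Build the SQL for one HANDLER point lookup on a fresh connection.

    Assumption: the fetch is a primary-key lookup. The statement forms
    (OPEN, READ `PRIMARY` = (...), CLOSE) follow the MySQL HANDLER syntax.
    """
    if not table.isidentifier():
        raise ValueError("unexpected table name: %r" % (table,))
    return [
        f"HANDLER {table} OPEN",
        f"HANDLER {table} READ `PRIMARY` = ({int(row_id)})",
        f"HANDLER {table} CLOSE",
    ]


# Hypothetical table name and row id, used only to show the output shape.
for stmt in handler_fetch_statements("sbtest1", 42):
    print(stmt)
```

Because each transaction starts with a new connection, the HANDLER has to be opened again every time, so connection setup is paid on every transaction in this workload.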
And this chart has data for some of the binaries.